Warum will ich das Ergebnis des Parsers im SQL Server verwenden? Dort ist mein Zuhause und mit TSQL kenne ich mich aus. Mit Python habe ich leider zu wenig Erfahrung, sodass es für mich wahrscheinlich einfacher wird, das Ergebnis des Parsers mit TSQL zu verarbeiten.
Der Parser liefert eine geschachtelte Token-Liste aus einzelnen Token, und die will ich in JSON convertieren, und zwar nicht flach (das funktioniert schon mit jsonpickle), sondern als geschachtelte Liste.

Code: Alles auswählen
# https://github.com/andialbrecht/sqlparse
import sqlparse
# # https://docs.python.org/3/library/json.html
import json
# import simplejson as json
# https://github.com/jsonpickle/jsonpickle
# https://jsonpickle.github.io/
# https://jsonpickle.readthedocs.io/en/latest/api.html
import jsonpickle
# Split a string containing two SQL statements:
# raw = 'select * from foo; select * from bar;'
# statements = sqlparse.split(raw)
# print(statements)
# # Get the first statement
# stmt = statements[0]
stmt = """
CREATE VIEW a.c AS
SELECT
[Date] = [T1].[Date],
aaa = /*Comment line 1
Comment line 2*/
bbb,
ccc as ccc1,
ddd as ddd1,
eee = /*Comment line 1
Comment line 2*/
eee,
fff = fff
FROM
a.b T1 (NOLOCK)
LEFT JOIN c.d T2
ON T1.jjj = T2.jjj
"""
stmt_format1 = sqlparse.format(stmt, strip_comments=False, use_space_around_operators=False, comma_first=True, reindent=True, keyword_case='upper')
## remove block comments in the first step, otherwise some unwanted indents are created
stmt_format2 = sqlparse.format(stmt_format1, strip_comments=True, use_space_around_operators=False, comma_first=True, reindent=True, keyword_case='upper')
## reformat the statement with removed comments to get results without unwanted indents
stmt_format3 = sqlparse.format(stmt_format2, strip_comments=True, use_space_around_operators=False, comma_first=True, reindent=True, keyword_case='upper')
print(stmt_format3)
# Parsing a SQL statement:
# the result is fine and nested, there is an IdentifierList which I would like to keep as one group
parsed = sqlparse.parse(stmt_format3)[0]
# # json.dumps doesn't work, the object is to complex
# jsonStr = json.dumps(dict(parsed.tokens))
# I found jsonpickle
# it can convert also complex objects into json
frozen = jsonpickle.encode(parsed.tokens,keys=True,indent=1)
print(frozen)
# # the object is not restored as the original because the json represenation lost information
# thawed = jsonpickle.decode(frozen)
# # I dont get nested tokens exported into json, instead I get only the token type, but not the content:
# {
# "py/object": "sqlparse.sql.IdentifierList"
# },
# {
# "py/object": "sqlparse.sql.Identifier"
# },
# to get all tokens with content, I can flatten the "parsed" and convert to json
flattened = list(parsed.flatten())
print (flattened)
frozen2 = jsonpickle.encode(flattened,keys=True,indent=1)
print(frozen2)
# But now they are in a flat list and I lost information about parents and nested structures
# print(parsed.tokens)