It still doesn't quite work, though:
import re

ascii_listing = """\
10 PRINT "A STRING"
20 ' ONLY A COMMENT
30 A$="ANOTHER STRING" ' INLINE COMMENT
40 REM COMMENT AND NEXT LINE IS CODE ONLY:
50 B=3
60 PRINT "STRING WIHOUT CLOSE!
70 GOTO 10 : REM COMMENT AS SECOND STATEMENT
"""

# One verbose pattern per BASIC line: the line number, then one of three alternatives.
regex = re.compile(
    r"""
    ^
    (?P<line_number>\d+)\s?
    (
        (   # alternative 1: optional code followed by a comment
            (?P<pre_comment>[^"\n]*)
            (?P<comment> ('|REM).* )
        )?
        |
        (   # alternative 2: code, a string (closing quote optional), more code
            (?P<pre_string>[^"\n]*)
            (?P<string>"[^"\n]*"?)
            (?P<post_string>[^"\n]*)
        )?
        |
        (?P<code>.*)    # alternative 3: plain code
    )$
    """,
    re.VERBOSE | re.MULTILINE
)


def debug_match(match):
    """Print the matched line and every named group that took part in the match."""
    print("Line: >>>%s<<<\n" % match.group(0))
    for name, text in match.groupdict().items():
        if text is not None:
            print("%15s: %r" % (name, text))
    print("-" * 79)


print("-" * 79)
for match in regex.finditer(ascii_listing):
    debug_match(match)
Output:
-------------------------------------------------------------------------------
Line: >>>10 PRINT "A STRING"<<<
    line_number: '10'
     pre_string: 'PRINT '
         string: '"A STRING"'
    post_string: ''
-------------------------------------------------------------------------------
Line: >>>20 ' ONLY A COMMENT<<<
        comment: "' ONLY A COMMENT"
    line_number: '20'
    pre_comment: ''
-------------------------------------------------------------------------------
Line: >>>30 A$="ANOTHER STRING" ' INLINE COMMENT<<<
    line_number: '30'
     pre_string: 'A$='
         string: '"ANOTHER STRING"'
    post_string: " ' INLINE COMMENT"
-------------------------------------------------------------------------------
Line: >>>40 REM COMMENT AND NEXT LINE IS CODE ONLY:<<<
        comment: 'REM COMMENT AND NEXT LINE IS CODE ONLY:'
    line_number: '40'
    pre_comment: ''
-------------------------------------------------------------------------------
Line: >>>50 B=3<<<
    line_number: '50'
           code: 'B=3'
-------------------------------------------------------------------------------
Line: >>>60 PRINT "STRING WIHOUT CLOSE!<<<
    line_number: '60'
     pre_string: 'PRINT '
         string: '"STRING WIHOUT CLOSE!'
    post_string: ''
-------------------------------------------------------------------------------
Line: >>>70 GOTO 10 : REM COMMENT AS SECOND STATEMENT<<<
        comment: 'REM COMMENT AS SECOND STATEMENT'
    line_number: '70'
    pre_comment: 'GOTO 10 : '
-------------------------------------------------------------------------------
You can actually live with this quite well already, but it isn't entirely correct. Line 30, for example, is wrong: post_string: " ' INLINE COMMENT" should really be recognized as "comment".
Is it even possible to handle everything with a single RE?
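One direction that might be worth trying, purely as a sketch and not a tested solution: instead of forcing the whole line through one big alternation, apply a small token pattern repeatedly to the rest of the line with finditer(). The names TOKEN_RE, LINE_RE and split_line are made up for this example. The order of the alternatives is what makes a ' after a closed string count as a comment. Like the pattern above, it assumes that REM always starts a comment, so a REM inside an identifier would be misread.

```python
import re

# Hypothetical token pattern (an assumption, not part of the code above).
# Alternatives are tried in order at each position: a quote always opens a
# string, and ' or REM can therefore only start a comment outside a string.
TOKEN_RE = re.compile(
    r"""
      (?P<string>"[^"\n]*"?)           # string literal, closing quote optional
    | (?P<comment>(?:'|REM).*)         # comment runs to the end of the line
    | (?P<code>(?:(?!REM)[^"'\n])+)    # anything else up to the next ", ' or REM
    """,
    re.VERBOSE,
)

# Splits off the line number; the rest of the line is tokenized separately.
LINE_RE = re.compile(r"(?P<line_number>\d+)\s?(?P<rest>.*)")


def split_line(line):
    """Return (line_number, [(kind, text), ...]) for one BASIC line."""
    match = LINE_RE.match(line)
    if match is None:
        raise ValueError("no line number found in %r" % line)
    tokens = [
        (token.lastgroup, token.group())
        for token in TOKEN_RE.finditer(match.group("rest"))
    ]
    return match.group("line_number"), tokens


if __name__ == "__main__":
    number, tokens = split_line('30 A$="ANOTHER STRING" \' INLINE COMMENT')
    print(number)
    for kind, text in tokens:
        print("%15s: %r" % (kind, text))
```

With this, line 30 should come out as code, string, code (a single space) and then comment. Whether the same can be squeezed into one match per line is a separate question; the sketch only shows that the quote/comment ambiguity goes away once the scanning is done token by token.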