I'm currently stuck on a rather nasty problem. For testing purposes I'd like to parse some XML in order to get to grips with Unicode handling.
To do that, I'm trying to read Google's weather API:
Code:
# -*- coding: utf-8 -*-
import urllib
import xml.dom.minidom as dom


def getWeather(village):
    try:
        params = urllib.urlencode({'weather': village})
        f = urllib.urlopen("http://www.google.com/ig/api?%s" % params)
        tree = dom.parse(f)
        retstr = "!OK!"
    except Exception, e:
        if "list index out of range" in str(e):
            retstr = "No information found"
        else:
            retstr = "!FAILED!"
            raise
    return retstr

if __name__ == "__main__":
    v = "Lübeck"
    print getWeather(v)
The following error occurs:
Code:
Traceback (most recent call last):
  File "weather.py", line 22, in <module>
    print getWeather(v)
  File "weather.py", line 10, in getWeather
    tree = dom.parse(f)
  File "/usr/lib/python2.5/xml/dom/minidom.py", line 1915, in parse
    return expatbuilder.parse(file)
  File "/usr/lib/python2.5/xml/dom/expatbuilder.py", line 928, in parse
    result = builder.parseFile(file)
  File "/usr/lib/python2.5/xml/dom/expatbuilder.py", line 207, in parseFile
    parser.Parse(buffer, 0)
xml.parsers.expat.ExpatError: not well-formed (invalid token): line 1, column 171
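An "invalid token" error from expat at a column where a non-ASCII character would sit is typically what you get when the bytes are not valid UTF-8 and the document carries no encoding declaration, so the parser falls back to UTF-8. The following is only a minimal sketch of that situation, written for Python 3 rather than the Python 2.5 used above. The sample bytes, the ISO-8859-1 charset and the helper names `fails_as_utf8` and `parse_assuming_charset` are assumptions made for illustration; none of them are taken from the real API response.

```python
# -*- coding: utf-8 -*-
import xml.dom.minidom as dom
from xml.parsers.expat import ExpatError

# Hypothetical stand-in for the response body: "Lübeck" encoded as
# ISO-8859-1 (ü is the single byte 0xFC), with no encoding declaration.
RAW = b'<?xml version="1.0"?><city data="L\xfcbeck"/>'


def fails_as_utf8(raw):
    """Return True if expat rejects the bytes when left to assume UTF-8."""
    try:
        dom.parseString(raw)
    except ExpatError:
        return True
    return False


def parse_assuming_charset(raw, charset):
    """Decode with the given charset, then hand UTF-8 bytes to minidom."""
    text = raw.decode(charset)
    return dom.parseString(text.encode("utf-8"))


print(fails_as_utf8(RAW))
doc = parse_assuming_charset(RAW, "iso-8859-1")
print(doc.documentElement.getAttribute("data") == u"Lübeck")
```

In a real request the charset would have to come from somewhere reliable, for example the `Content-Type` header of the response, instead of being hard-coded as it is here.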
Code:
# -*- coding: utf-8 -*-
import urllib
import xml.dom.minidom as dom


def getWeather(village):
    try:
        params = urllib.urlencode({'weather': village})
        f = urllib.urlopen("http://www.google.com/ig/api?%s" % params)
        fu = unicode(f.read(), "utf-8", errors="replace")
        tree = dom.parseString(fu)
        retstr = "!OK!"
    except Exception, e:
        if "list index out of range" in str(e):
            retstr = "No information found"
        else:
            retstr = "!FAILED!"
            raise
    return retstr

if __name__ == "__main__":
    v = "Lübeck"
    print getWeather(v)
Code:
Traceback (most recent call last):
  File "weather.py", line 23, in <module>
    print getWeather(v)
  File "weather.py", line 11, in getWeather
    tree = dom.parseString(fu)
  File "/usr/lib/python2.5/xml/dom/minidom.py", line 1925, in parseString
    return expatbuilder.parseString(string)
  File "/usr/lib/python2.5/xml/dom/expatbuilder.py", line 940, in parseString
    return builder.parseString(string)
  File "/usr/lib/python2.5/xml/dom/expatbuilder.py", line 223, in parseString
    parser.Parse(string, True)
UnicodeEncodeError: 'ascii' codec can't encode character u'\ufffd' in position 171: ordinal not in range(128)
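The traceback suggests two separate things: `errors="replace"` turned an undecodable byte into U+FFFD, and the unicode object passed to `parseString` was then implicitly encoded with the ASCII codec. A rough sketch of both effects follows, written for Python 3 (where the implicit ASCII step no longer happens, so it is mimicked by an explicit `encode("ascii")`). The sample bytes, the ISO-8859-1 guess and the helper names `ascii_encode_fails` and `parse_text` are assumptions for illustration only.

```python
# -*- coding: utf-8 -*-
import xml.dom.minidom as dom

# Hypothetical response fragment: "Lübeck" as ISO-8859-1 bytes.
RAW = b'<city data="L\xfcbeck"/>'

# errors="replace" swaps the undecodable byte for U+FFFD, so the "ü" is lost.
lossy = RAW.decode("utf-8", "replace")


def ascii_encode_fails(text):
    """Mimic the implicit ASCII encode seen in the Python 2 traceback."""
    try:
        text.encode("ascii")
    except UnicodeEncodeError:
        return True
    return False


def parse_text(text):
    """Encode explicitly to UTF-8 so the parser receives plain bytes."""
    return dom.parseString(text.encode("utf-8"))


print(ascii_encode_fails(lossy))
print(repr(parse_text(lossy).documentElement.getAttribute("data")))

# Decoding with the charset the data actually uses keeps the "ü" intact.
intact = RAW.decode("iso-8859-1")
print(repr(parse_text(intact).documentElement.getAttribute("data")))
```

Encoding explicitly before parsing avoids the `UnicodeEncodeError`, but only decoding with the correct charset recovers the original character; with `errors="replace"` the parsed value still contains U+FFFD.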
Could someone kindly give me a nudge in the right direction?
Thanks in advance,
sparrow