Stop trying to decode image tags to utf-8
author Brett Parker <iDunno@sommitrealweird.co.uk>
Mon, 11 Feb 2013 09:41:39 +0000 (09:41 +0000)
committer Brett Parker <iDunno@sommitrealweird.co.uk>
Mon, 11 Feb 2013 09:41:39 +0000 (09:41 +0000)
    - If the title/url contains a non-ASCII character and we try to decode it,
      the decode fails because that character is not in the ascii set. Feedparser
      has already made sure that everything is utf-8 before we get it.
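
The failure described above can be reproduced with a minimal sketch. It assumes the attribute values arrive as unicode strings (as the commit message implies) and emulates Python 2's behaviour, where calling `.decode('utf-8')` on a unicode object first encodes it implicitly with the ASCII codec. The helper names `py2_unicode_decode` and `image_attrs` and the sample attributes are hypothetical and are not part of rss2maildir.py.

```python
def py2_unicode_decode(text):
    """Emulate Python 2's unicode.decode('utf-8') (hypothetical helper).

    Python 2 implicitly encodes the unicode object to ASCII bytes first,
    which is the step that fails for non-ASCII characters.
    """
    return text.encode('ascii').decode('utf-8')


def image_attrs(attrs):
    """Pick alt and src from (name, value) pairs without decoding,
    mirroring the loop in the patched code (hypothetical helper)."""
    alt = u''
    url = u''
    for name, value in attrs:
        if name == 'alt':
            alt = value
        elif name == 'src':
            url = value
    return alt, url


# Sample attributes with a non-ASCII character (made up for illustration).
attrs = [('alt', u'caf\xe9'), ('src', u'http://example.com/caf\xe9.png')]

try:
    py2_unicode_decode(attrs[0][1])
    decode_failed = False
except UnicodeEncodeError:
    # The implicit ASCII encode rejects u'\xe9'.
    decode_failed = True

# Using the values as-is, as the patch does, avoids the error.
alt, url = image_attrs(attrs)
```

Pure-ASCII values pass through the emulated decode unchanged, which would explain why the original code only broke on feeds with non-ASCII titles or URLs.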

rss2maildir.py

index 8a59c8548761e87c06c1befe66a9f3550cf4b07b..dc0427a6d2e8ac9ee5a206a89ce5f0634364a3f8 100755 (executable)
@@ -307,9 +307,9 @@ class HTML2Text(HTMLParser):
         url = u''
         for attr in attrs:
             if attr[0] == 'alt':
-                alt = attr[1].decode('utf-8')
+                alt = attr[1]
             elif attr[0] == 'src':
-                url = attr[1].decode('utf-8')
+                url = attr[1]
         if url:
             if alt:
                 if self.images.has_key(alt):