My mail aggregator is working fine - thanks Sam for the 2 pointers. The big problem is of course the "standard" XML/RSS that is used. Just like in almost all other places where XML is used - the benefit of a standard syntax is countered by the completely random, obfuscated (and countless) schemas and variations.
First problem: many feeds don't include the date - I added a few regexps to extract common patterns (Updated: ..., Posted: ...) from the content. If I can't get it, I'll assume the time of collection - which would be wrong for older news, but it'll be close enough as I update.
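A minimal sketch of that fallback, in Python. The date format (YYYY-MM-DD) and the names DATE_PATTERN and entry_date are assumptions made up for illustration; real feeds will need more patterns and formats than this.

```python
import re
from datetime import datetime

# Hypothetical pattern: "Updated:" or "Posted:" followed by an ISO-style date.
# Real feeds use many other formats; this only illustrates the idea.
DATE_PATTERN = re.compile(r"(?:Updated|Posted):\s*(\d{4}-\d{2}-\d{2})")

def entry_date(content, collected_at):
    """Return the date found in the content, or the collection time."""
    match = DATE_PATTERN.search(content)
    if match:
        try:
            return datetime.strptime(match.group(1), "%Y-%m-%d")
        except ValueError:
            pass  # looked like a date but was not a valid one
    # No usable date in the entry: fall back to when it was collected.
    return collected_at
```

Called as entry_date(text, datetime.now()), it returns the parsed date when one of the markers is present and the collection time otherwise.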
Second problem: since most of the time you can't tell if a "permalink" has been updated, I have to cache the MD5 of the content, so I get new mail when the link content changes.
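The MD5 check could look roughly like this in Python. The cache here is just an in-memory dict keyed by link, and has_changed is a hypothetical name; a real aggregator would have to persist the digests between runs.

```python
import hashlib

def has_changed(cache, link, content):
    """Return True if the content for this link is new or differs from the cached MD5."""
    digest = hashlib.md5(content.encode("utf-8")).hexdigest()
    if cache.get(link) == digest:
        return False  # same content as last time, no new mail
    cache[link] = digest  # remember the new digest for the next run
    return True
```

The first time a link is seen it counts as changed, so the entry gets mailed once and then only again when its content differs.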
What's next? One thing I always liked is multipart MIME - with HTML and images in the same message. I don't remember the details (the links have a special syntax), but it shouldn't be difficult to fetch the images and save them with the entry, for full off-line reading. And the other side - using mail to update my own weblog.
I expect more tweaks as I read more weblogs - each weblog I add has its own (standard :-) XML style.
BTW, I (re)discovered the "trick" to include the images in the mail - the program will need to grab all the images, add the content as MIME parts, and use the "cid:" magic protocol with a Content-ID header in the parts. It shouldn't be very hard to code - but for now it's not a big priority: the mailer can get the images from the web when online, and only a few weblogs have images.
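For reference, a rough sketch of the "cid:" trick using Python's standard email package. It assumes the images were already fetched and the HTML already rewritten to point at cid: URLs; build_message and the shape of the images argument are made up for this example.

```python
from email.mime.image import MIMEImage
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

def build_message(subject, html, images):
    """Build a multipart/related mail with inline images.

    images maps a Content-ID (without angle brackets) to a
    (raw bytes, image subtype) pair, e.g. {"img1@example": (data, "png")}.
    The html is assumed to already use src="cid:img1@example".
    """
    msg = MIMEMultipart("related")
    msg["Subject"] = subject
    msg.attach(MIMEText(html, "html", "utf-8"))
    for cid, (data, subtype) in images.items():
        part = MIMEImage(data, _subtype=subtype)
        # The Content-ID header is what the cid: URL in the HTML refers to.
        part.add_header("Content-ID", "<%s>" % cid)
        part.add_header("Content-Disposition", "inline")
        msg.attach(part)
    return msg
```

The HTML part goes first and each image follows as its own part; a mailer that understands multipart/related resolves src="cid:img1@example" against the part whose Content-ID is <img1@example>, so nothing has to be fetched from the web.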
Technical stuff
Sunday, February 16, 2003
Mail aggregator