Showing posts with label Var. Show all posts

Friday, March 07, 2003

RSS for comments and ids

I got Sam's comment feed - and it immediately broke my aggregator. The problem is quite simple - I use the link of the item as a key, and in the comment feed all comments use the same key as the article. And the same title.

This changes the entire data model - for each link I'll have to store a list of MD5s, and treat it as a multi-value. The first time ( if no MD5 is found ) it'll be the "source" message, and all other occurrences - including edits of the original - can be treated as "Re: " - to implement the threading in the mail reader.

Probably I'll just go for the simplest solution - and keep a .db keyed by MD5 (that should be unique enough ) - the whole idea is to avoid sending the same item multiple times.
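A minimal sketch of that "simplest solution", assuming Python's standard dbm module stands in for the .db file. The function names, the fields fed into the hash and the example URLs are all made up for illustration, not taken from the actual aggregator.

```python
import dbm
import hashlib
import os
import tempfile


def item_digest(link, title, body):
    """Return an MD5 hex digest covering the link, title and body of one item."""
    h = hashlib.md5()
    for part in (link, title, body):
        h.update(part.encode("utf-8"))
        h.update(b"\0")  # separator, so ("ab", "c") and ("a", "bc") differ
    return h.hexdigest()


def is_new(db, link, title, body):
    """Record the item in db and return True only the first time it is seen."""
    key = item_digest(link, title, body).encode("ascii")
    if key in db:
        return False
    db[key] = link.encode("utf-8")
    return True


# Example run against a throwaway database file.
path = os.path.join(tempfile.mkdtemp(), "seen")
with dbm.open(path, "c") as db:
    first = is_new(db, "http://example.org/1247.html", "Post", "original text")
    again = is_new(db, "http://example.org/1247.html", "Post", "original text")
    comment = is_new(db, "http://example.org/1247.html", "Post", "a comment")
```

Because the body is part of the hash, a comment that shares the link and title of the article still gets its own key, and the same item fetched twice is sent only once.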

Once again - the RSS proves to be almost completely useless.

Update: Sam changed the links. There are a few problems. Some of his links are quite strange .../blog/('1247.html#c1047152121',), I suppose a small bug sneaked in.

Worse - I now lost the way to relate comments with postings. I'll have to parse the generated links and remove the ending to guess the original posting. Sam - please change back, I fixed my code to deal with the old problem and I think the workaround for the new problems is harder :-)
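One way the "remove the ending" workaround could look, assuming the comment id is always carried in the URL fragment (the part after "#"). The helper name and the example URL are invented for the sketch.

```python
from urllib.parse import urldefrag


def split_comment_link(link):
    """Split a comment link into (posting_url, comment_id or None)."""
    url, fragment = urldefrag(link)
    return url, (fragment or None)


posting, comment_id = split_comment_link(
    "http://example.org/blog/1247.html#c1047152121"
)
plain, no_id = split_comment_link("http://example.org/blog/1247.html")
```

A link without a fragment is treated as the posting itself, so the same function can be run over both the article feed and the comment feed.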

A better solution would be to just add a separate tag with the comment id. Of course, that would be in the RSS-2003-03-08 "standard", and nobody else will use the same tag name. ( we should use the date of the "standard du jour")

Outlook mail aggregator

It seems there are other people who prefer the mail reader - NewsGator is specific to Outlook, so I won't pay them $29.

mail aggregator code

I cleaned up a bit and uploaded my mail aggregator. Getting rid of the pickle was a great move - db seems good enough.

For now I use simple name-value mappings and several db files - it's easier to debug and get other tools to use the same data. I was thinking of a real database - but that would make it more complex and I doubt it'll scale or work better.

I doubt too many people will find it useful - when they have all the fine 3-panel graphical aggregators. I just use it to read weblogs in the train, using my old "pine" and sometimes mozilla/evolution/kmail ( when I feel a need for real nice GUI - and go back to pine when I want to read mail ).

Saturday, February 22, 2003

Sending mail to the blog

Armed with Sam's short intro and the MT manuals - I started the second piece of my mail toy - weblog posting via email. Posting via xmlrpc is much easier than I expected - it took me a while to figure what is the blogid, but after that everything was smooth.

I'm using the metaWeblog xml-rpc style - and I'll use the mt extensions to set the category.
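For reference, a rough sketch of what such a posting request could look like, assuming the metaWeblog.newPost call ( blogid, username, password, a struct with "title" and "description", and a publish flag ). The helper name and the sample values are made up. The sketch only builds the XML-RPC request body, so it can be inspected without a server.

```python
import xmlrpc.client


def new_post_request(blogid, username, password, title, html_body, publish=True):
    """Build the XML-RPC request body for a metaWeblog.newPost call."""
    content = {"title": title, "description": html_body}
    params = (blogid, username, password, content, publish)
    return xmlrpc.client.dumps(params, methodname="metaWeblog.newPost")


# Sample values only; a real blogid and credentials come from the weblog setup.
request_xml = new_post_request("1", "user", "secret", "Hello", "<p>First post</p>")
```

To actually send it, xmlrpc.client.ServerProxy(endpoint_url).metaWeblog.newPost(...) takes the same arguments; the endpoint URL depends on the installation and is not shown here. The category call is left out of the sketch.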



One comment on Sam's "REST" alternative - I think an even better approach would be WebDAV, with the item body as plain HTML, with META tags used for the category and the extra (meta) info. This way all HTML editors would be able to edit and publish weblog entries without any modifications. WebDAV also allows editing existing entries. The "core" of a weblog entry is a piece of HTML - with extra metadata and with extra processing done to generate the weblog-style pages ( templates to generate rss, html in a specific layout, etc ).



The next item to implement is the wiki filter, so I can use pine to send wiki-style plain email and have it converted to html that I can upload via xmlrpc. The real hard part will be implementing the special "headers" for meta-info, and implementing the "special" posting forms ( comments to other postings and about web sites ). I want to just use the "mail this page" button, add the comments on top and send it to "myBlog@myDOMAIN.com".
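A very small sketch of such a wiki filter, assuming one particular wiki dialect: '''bold''', ''italic'', lines starting with "* " as bullets, and every other non-empty line as a paragraph. Real wikis differ in the details, and the function name is made up.

```python
import html
import re


def wiki_to_html(text):
    """Convert a tiny wiki subset (bold, italic, '*' bullets, paragraphs) to HTML."""
    out = []
    in_list = False
    for line in text.splitlines():
        # Escape first; quote=False keeps the apostrophes used by the markup.
        line = html.escape(line.strip(), quote=False)
        line = re.sub(r"'''(.+?)'''", r"<b>\1</b>", line)
        line = re.sub(r"''(.+?)''", r"<i>\1</i>", line)
        if line.startswith("* "):
            if not in_list:
                out.append("<ul>")
                in_list = True
            out.append("<li>%s</li>" % line[2:])
            continue
        if in_list:
            out.append("</ul>")
            in_list = False
        if line:
            out.append("<p>%s</p>" % line)
    if in_list:
        out.append("</ul>")
    return "\n".join(out)


page = wiki_to_html("Some '''bold''' and ''italic'' text\n* one\n* two")
```

Nested lists and links are not handled; the point is only that the plain-text side can stay readable in pine while the HTML is generated on the way to the weblog.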

Back from traffic school

Boring - but I got 8 hours to think about other things. Like where to put the extra information that is required to weblog by email. The logical place would be in headers - but most mailers don't make this easy, and it would be confusing to some people. In addition - if you do "send page" in a web browser, and you want to comment on the page ( a very nice way to enter this kind of weblogs ) - you have an even less functional mailer.



Another option would be the recipient address - it can be myBlogPostinEmail+extra-infor@domain.com. Unfortunately - the "+" syntax is not supported by most domains, and it assumes you are in control of the mail system - which would work great for a "mail to weblog" provider, but not that well for individual use.
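For the cases where the "+" syntax is available, splitting the address is trivial. A sketch, with a made-up helper name and example address:

```python
def split_plus_address(address):
    """Split 'user+extra@domain' into (user, extra, domain); extra may be ''."""
    local, _, domain = address.rpartition("@")
    user, _, extra = local.partition("+")
    return user, extra, domain


parts = split_plus_address("myBlog+category-java@example.com")
no_extra = split_plus_address("myBlog@example.com")
```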



So - the remaining place is the top and bottom of the post. Each post will start ( and be parsed for ) a set of "magic" headers - headers inside the body and at the end, after some -- that indicate the signature. Some mailers allow to associate a signature with a particular address - that would work pretty well.



What meta-information is required ? First, a "key" - the preferred mechanism would be to sign the message, but again one of the targets for such a system is people who don't want all the technical complexity of a new application.

The other info is the "category", and also some address to send talkback to ( this can be extracted from Reply To: if a mail-based "aggregator" is used to read the weblogs ).
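A sketch of how the "magic" headers could be pulled out of the body, under a few assumptions: the header names ( Key, Category, Talkback ) are invented for the example, headers look like "Name: value", and the signature starts after a line containing only "--" ( with or without the trailing space ).

```python
import re

# Hypothetical header names; only these are treated as "magic".
KNOWN = {"key", "category", "talkback"}
HEADER_RE = re.compile(r"^([A-Za-z][A-Za-z-]*):\s*(.*)$")


def match_header(line):
    """Return (name, value) if the line is a known magic header, else None."""
    m = HEADER_RE.match(line)
    if m and m.group(1).lower() in KNOWN:
        return m.group(1).lower(), m.group(2).strip()
    return None


def parse_post(body):
    """Return (meta, text): headers from the top and the signature, plus the rest."""
    lines = body.splitlines()
    meta = {}

    # Signature: everything after the last "--" delimiter line.
    sig_start = None
    for i, line in enumerate(lines):
        if line.rstrip() == "--":
            sig_start = i
    if sig_start is not None:
        for line in lines[sig_start + 1:]:
            found = match_header(line)
            if found:
                meta[found[0]] = found[1]
        lines = lines[:sig_start]

    # Leading headers: consume matching lines until the first other line.
    pos = 0
    while pos < len(lines):
        found = match_header(lines[pos])
        if not found:
            break
        meta[found[0]] = found[1]
        pos += 1

    return meta, "\n".join(lines[pos:]).strip()


meta, text = parse_post(
    "Category: Mail\n\nTrying the mail gateway.\n-- \nKey: s3cret\nTalkback: me@example.com\n"
)
```

Restricting the parser to a fixed set of names keeps an ordinary first line such as "Note: ..." or a URL from being swallowed as a header.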



Is this too complicated ? I don't think so, people are used to start the mails with a "Hi" and with many kinds of "forms". I think there are also "mail templates" that could help.



HTML mail would be more difficult to parse for that - but it's not impossible ( just need to look for the magic keywords delimited by < and > ).

Thursday, February 20, 2003

Wiki, Weblog and Pine

It seems I'm not the only one - James Strachan wants to use a WYSIWYG editor for wiki.



The main point of wiki is the ease of authoring. Well, learning a new set of rules (slightly different for every wiki ) and editing it in a HTML form is not "easy" by my definition.



Some web browsers have a nice menu option - "Edit this page", and you are presented with a decent HTML editor. Of course, it generates crappy HTML, but it's easy to filter it out to the same level as wiki ( i.e. only simple tags with no style attributes ). And then - you can either "publish" ( again, widely available ) or just "mail" - so you can do your stuff offline.
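The filtering step could be sketched with the standard html.parser module: keep a short whitelist of simple tags, drop every attribute except href on links, and keep the text of everything else. The whitelist and the class and function names are assumptions made for the example.

```python
from html import escape
from html.parser import HTMLParser

# Assumed whitelist of "simple" tags; adjust to taste.
ALLOWED = {"p", "br", "b", "i", "em", "strong", "ul", "ol", "li", "a", "pre"}


class SimpleHtmlFilter(HTMLParser):
    """Re-emit only whitelisted tags, without style or other attributes."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag not in ALLOWED:
            return
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.out.append('<a href="%s">' % escape(href, quote=True))
                return
        self.out.append("<%s>" % tag)

    def handle_endtag(self, tag):
        if tag in ALLOWED and tag != "br":
            self.out.append("</%s>" % tag)

    def handle_data(self, data):
        self.out.append(escape(data, quote=False))


def simplify_html(markup):
    """Return markup reduced to the whitelisted tags."""
    parser = SimpleHtmlFilter()
    parser.feed(markup)
    parser.close()
    return "".join(parser.out)


clean = simplify_html(
    '<p style="color:red"><font face="x">Hello</font> <b class="k">world</b></p>'
)
```

This is not a security filter ( the text inside script or style elements would pass through as plain text ), only a way to bring editor output down to the same level as wiki markup.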



I see a lot of value in the wiki style - but _one_ style, chosen by the author and not by each site. And used when the author prefers to use a text editor ( a decent one - not a form ) - like pine or vi or emacs. Again - "mail this" is so trivial and available in so many editors.



As you have noticed - I am not happy with the current model for "aggregators" and "authoring" for weblogs. For exactly the same reason. Even if they create an intuitive and easy to use aggregator and authoring tool ( and I heard they're not ) - why should people have to learn and install another tool ?



The last few days I thought about this - as my "mail aggregator" is growing and I am moving to the publishing side, I'm prioritising the list of features. Composing a weblog using wiki style from pine is very high - since I use pine a lot. I also use Evolution or Mozilla from time to time - so authoring weblogs in one of those and publishing it via a simple "send email" is the second priority.



Back to wiki - the fundamental idea is to be VERY simple. Using a familiar tool is simpler. It is harder to implement the wiki ( you need to enable webdav or some mail filters, better locking, convert ugly html to wiki or simple html, support multiple editors, etc ).

Sunday, February 16, 2003

Mail aggregator

My mail aggregator is working fine - thanks Sam for the 2 pointers. The big problem is of course the "standard" XML/RSS that is used. Just like in almost all other places where XML is used - the benefit of a standard syntax is countered by the completely random and obfuscated ( and countless ) schemas and variations.



First problem: many feeds don't include the date - I added a few regexps to extract common patterns ( Updated: ..., Posted: ... ) from the content. If I can't get it - I'll assume the time of the collection - which would be wrong for older news, but it'll be close enough as I update.
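A sketch of that fallback chain. The two date layouts below are invented examples ( real feeds vary a lot ), and the function name is made up; only the "Updated:" / "Posted:" prefixes come from the description above.

```python
import re
from datetime import datetime

# (pattern, strptime format) pairs; the layouts are illustrative guesses.
DATE_PATTERNS = [
    (re.compile(r"(?:Updated|Posted):\s*(\d{4}-\d{2}-\d{2} \d{2}:\d{2})"),
     "%Y-%m-%d %H:%M"),
    (re.compile(r"(?:Updated|Posted):\s*(\d{1,2}/\d{1,2}/\d{4})"),
     "%m/%d/%Y"),
]


def guess_date(content, collected_at=None):
    """Return a date found in the content, else the collection time."""
    for pattern, fmt in DATE_PATTERNS:
        m = pattern.search(content)
        if m:
            try:
                return datetime.strptime(m.group(1), fmt)
            except ValueError:
                continue  # looked like a date but was not valid; try the next one
    return collected_at or datetime.now()


found = guess_date("Posted: 2003-02-16 09:30 by Sam")
fallback = guess_date("no date here", collected_at=datetime(2003, 2, 16, 12, 0))
```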



Second problem: Since most of the time you can't tell if a "permalink" has been updated - I have to cache the MD5, so I get new mail when the link content changes.
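The MD5 cache can be sketched with any dict-like store keyed by link ( a plain dict here; a db file would work the same way ). The function name and the three states are assumptions made for the example.

```python
import hashlib


def check_item(cache, link, content):
    """Return 'new', 'changed' or 'same', and remember the latest digest."""
    digest = hashlib.md5(content.encode("utf-8")).hexdigest()
    previous = cache.get(link)
    cache[link] = digest
    if previous is None:
        return "new"
    return "same" if previous == digest else "changed"


cache = {}
states = [check_item(cache, "http://example.org/a", text) for text in ("v1", "v1", "v2")]
```

Only "new" and "changed" would result in a mail being generated.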



What's next ? One thing I always liked is the multipart mime - with HTML and images in the same message. I don't remember the details ( the links have special syntax ), but it shouldn't be difficult to fetch the images and save them with the entry, for full off-line reading. And the other side - using mail to update my own weblog.



I expect more tweaks as I read more weblogs - each weblog I add has its own (standard :-) XML style.



BTW, I (re)discovered the "trick" to include the images in the mail - the program will need to grab all the images, add the content as mime parts and use the "cid:" magic protocol, with a Content-ID header in the parts. It shouldn't be very hard to code - but for now it's not a big priority, the mailer can get the images from the web when online and only a few weblogs have images.
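A sketch of the "cid:" trick with the standard email package, assuming the image bytes have already been fetched. The function name, the placeholder domain and the plain string replace on the image URL are simplifications made for the example.

```python
from email.mime.image import MIMEImage
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText
from email.utils import make_msgid


def build_message(subject, html_body, images):
    """images maps the original image URL to (raw_bytes, subtype)."""
    msg = MIMEMultipart("related")
    msg["Subject"] = subject
    image_parts = []
    for n, (url, (data, subtype)) in enumerate(images.items()):
        # make_msgid returns "<id@domain>"; the cid: link uses it without brackets.
        cid = make_msgid(idstring="img%d" % n, domain="aggregator.invalid")
        html_body = html_body.replace(url, "cid:" + cid[1:-1])
        part = MIMEImage(data, _subtype=subtype)
        part["Content-ID"] = cid
        part["Content-Disposition"] = "inline"
        image_parts.append(part)
    # The HTML part goes first, followed by the images it refers to.
    msg.attach(MIMEText(html_body, "html", "utf-8"))
    for part in image_parts:
        msg.attach(part)
    return msg


msg = build_message(
    "Post with image",
    '<p><img src="http://example.org/a.png"></p>',
    {"http://example.org/a.png": (b"\x89PNG fake bytes", "png")},
)
```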

Wednesday, February 12, 2003

mail and weblog (3)

Finally, reading weblogs starts to become easy. I modified blagg - actually rewrote it in python, I'm clearly out of touch with perl. I generate the files in maildir format, then I use KMail to read it. I usually use pine, but pine requires some patches to support maildir and it's not that nice with HTML.
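Writing into a maildir can be sketched with Python's standard mailbox module. This is not the actual script; the function name, the placeholder sender address and the sample item are made up.

```python
import mailbox
import os
import tempfile
from email.message import EmailMessage
from email.utils import formataddr


def store_item(maildir_path, feed_name, title, html_body):
    """Add one feed item to the maildir as an HTML message; return its key."""
    box = mailbox.Maildir(maildir_path, create=True)
    msg = EmailMessage()
    msg["From"] = formataddr((feed_name, "feeds@localhost"))  # placeholder address
    msg["Subject"] = title
    msg.set_content(html_body, subtype="html")
    key = box.add(msg)
    box.close()
    return key


# The maildir path must not exist yet, so that tmp/new/cur get created.
maildir_path = os.path.join(tempfile.mkdtemp(), "weblogs")
key = store_item(maildir_path, "Sam Ruby", "Mail comments", "<p>Hello</p>")
```

Any mail reader that understands maildir can then be pointed at the directory.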



There are many details to sort out - I can parse only 2-3 .rss formats so far, but the benefit is huge. I need to get the comments and fix the headers so I can see them as threads, and I'd like to do a simple scan on the content and get the images - and generate the multipart MIME message.



The big one is getting the category and sorting in folders - but that can be done in procmail. I'm trying to preserve all the RSS data as mail headers - so procmail can do its job.
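A sketch of carrying the RSS fields as mail headers, assuming the item has already been parsed into a dict. The "X-RSS-" prefix, the function name and the sender address are invented for the example.

```python
import re
from email.message import EmailMessage
from email.utils import formataddr


def item_to_mail(feed_name, item):
    """Turn a parsed RSS item (a dict) into a mail with X-RSS-* headers."""
    msg = EmailMessage()
    msg["From"] = formataddr((feed_name, "feeds@localhost"))  # placeholder address
    msg["Subject"] = " ".join(item.get("title", "(no title)").split())
    for name, value in item.items():
        if name in ("title", "description"):
            continue
        # Header names may not contain ':' or spaces (e.g. "dc:date").
        safe = re.sub(r"[^A-Za-z0-9-]", "-", name).capitalize()
        # Header values may not contain line breaks.
        msg["X-RSS-" + safe] = " ".join(str(value).split())
    msg.set_content(item.get("description", ""), subtype="html")
    return msg


mail = item_to_mail("Sam Ruby", {
    "title": "Mail comments",
    "link": "http://example.org/1247.html",
    "category": "Java",
    "dc:date": "2003-02-11",
    "description": "<p>Hello</p>",
})
```

A procmail rule matching on the X-RSS-Category header could then file the message into the right folder.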



In any case - even with the ugly hacky script I'm using, KNode ( or pine or Evolution or mozilla - after I get procmail I can just push the items in regular imap ) is so much better than any of the aggregators I've seen so far. Sorting, offline read, filtering, organizing data and moving interesting items in the same folders I use for interesting mails - and above all, the "familiar" feeling and having all the keys in my fingers...



The other direction is also interesting - editing HTML mail in any mailer ( offline or not ) and then having it published as a weblog entry.

Tuesday, February 11, 2003

mail blog (2)

Much better... I made some modifications to blagg to send the content as text/html, use the feed name as Sender and the feed title as subject. My perl is way too rusty, I may write something similar in python and then get more info into headers and play with procmail.

Mail blog

Sam Ruby added mail based comments to his weblog. This allows people to send comments using email. The other missing half is to allow people to read the weblog using email - a mini mailing list, and a small form allowing people to subscribe and get each entry and the comments by email.



So far the news-based aggregator seems the most intuitive and least painful for me. Since it's very unlikely people will start adding the "mail" publishing to their site - the only option is to modify one of the scripts ( blagg ? ) to get the RDF and send mails. Then procmail or mail filters could sort them - and finally I can read them without pain...



Apparently - there is already an email plugin. Let's see how this works...

Sunday, February 09, 2003

OS problems

At least I'm not the only one... Ovidiu seems to have OS X problems too. Last week I did think about getting a mac - where everything just works and you only have to click ( and pay some extra $ ). Then I realized that I learned a lot by just trying to fix my linux box - even if I didn't succeed in the end. And besides frustration, I did get some fun...

Blog reader distribution

Interesting link about blog power distribution. Found it on Cafe au Lait/Leche, one of my daily reads.



I'm interested in the relation and shifts between weblogs and mail lists/news groups. This confirms that for most people like me ( without much writing talent or more private ) the mail lists remain the best way to communicate their thoughts ( if they want to be heard ). The weblog is great for organizing ideas and for communication in small circles. I'll probably post more on this topic, I have a list ( that gets longer every day ) comparing the 2 mediums. My conclusion so far is that weblogs need more topic-related aggregation and subject-based mirroring into mail/news.



As I find more and more interesting people I find it harder and harder to read their postings - switching between so many subjects and mazes of links is becoming painful.

Tuesday, January 28, 2003

Kde3.1 on RedHat 8

I got it to work - and decided it's time to move to Gentoo. Compiling it from sources on my system was almost impossible - too many dependencies. Rawhide was painful and in the end would have replaced essential libraries.



What I did - just used Slackware binary packages. Since they use tar.gz, it was trivial to install - just a few changes in one of their postinst scripts. They use /opt/kde - so they don't interfere with anything else on the system. After RPM hell, that was an amazing experience.



KDE3.1 works great, konqueror is fast - tab browsing works. The only thing it lacks is remembering passwords. What I've "lost" - the startup menu. Kappfinder was able to create one - with _all_ the apps it found ( amazing how many apps were hidden ).

Saturday, January 25, 2003

Tahoe

I'll be in Tahoe this weekend. No programming or internet.

Thursday, January 23, 2003

Weblog and wiki

I wonder if anyone has combined Weblog/wiki. It would be pretty nice

* WikiNames for links - you need the name of the weblog and the title of the entry, no long URLs or numbers

* you can easily type lists, bold, etc

* wiki already has a lot of nice feature related to changes/updates to entries.



I find it very useful to update previous weblog entries - and pretty hard to navigate through the changes ( if I read someone's log, and he changed a few things - I'll have to search for them ).



Of course, wiki-style names (including category ) for the weblog entries would make the weblog easier to navigate.



That won't solve the biggest problem I have with weblogs - the complete mess of information. I can read a few high-traffic lists very fast - but I can hardly manage to read more than 5-6 weblogs. I'm just lost when I have to switch too many contexts - in a mail reader I just ignore threads I don't care about, I read related issues one after another ( thread view ). With weblogs - while I find it fascinating and very useful to hear all the personal and generic information, it costs me a lot of energy to click all the links and understand what's happening.



Update: Many thanks for the pointers. While navigating the links, I found a nice weblog search engine. That's by far my worst problem with reading weblogs - reading by subject instead of by author.







* Test wiki

** Test1 '''test'''









* Test2

** Test1 ''test''



Wednesday, January 22, 2003

It's already done - nntp/rss gateway

I was thinking about the effects of blogs on existing mailing lists and news. And how to bridge the two worlds - I'm used to getting information by specific subject, not by author.



Of course, someone else already did that on sourceforge - http://www.methodize.org/nntprss



Update: it works pretty well - not perfect, but far better than any other aggregator I've tried so far. Using a familiar interface ( the news reader ) is very nice, but the real potential is in bridging the blog and news/mail worlds. More on that later - I feel this is extremely important to explore - the relation between weblog and mail/news and the effect on communities.

Monday, January 13, 2003

Starting to "blog"

A few days ago I decided it's time to do what everyone else is doing. It seems most people I know have blogs - and it's getting pretty hard to know what they think without reading the weblogs. Mailing lists are very quiet lately, and the substance seems to move to dozens of weblogs.



I can see the benefits ( at least for the weblog authors ) - you get to organize the information, you don't have to google mail archives to find your own postings. There is a lot of noise in the mailing lists - and even if I've just seen the first inter-blog flame war, it seems less likely to involve more than a few (2?) people.



Ovidiu is the one who convinced me - and provided the space. As a new blogger, I have a lot of open questions about this.



First, how do people put so many links in their blogs ? Is it just cut&paste, or some magic ? What about the "talkback" feature ? I understand how it works, but not how to do it.



Another question is how to get content back into mailing lists ( and if ). A lot of people are using the lists to exchange info - the blog shouldn't take this away.



What is the correct way to update entries ? Just change the date ?



With mail lists, you can filter out or skip things. With blogs I need to find the relevant entries and people - and then read them. RSS seems the answer - I found blagg and I am able to get a feed, but that includes all the postings by a particular person. I need to sort them on categories - or I won't be able to track more than a few people.



Using blogs instead of mailing lists to exchange ideas seems possible - I can see how a proposal could be posted in a blog, feedback and votes accumulated - and development status updated. I made many proposals that got almost no feedback ( and never got implemented ) - and many that I implemented, sometimes in a very different way than I originally expected.



Update: For updating, it seems people just change the date - so it gets up in RSS.

For links - it seems cut&paste is the solution. I'm pretty worried about the relation between weblog and mail lists - and the information chaos, but that will be a separate entry.



Many thanks to all who read and commented on this.