Index of /code/caterpillar3

      Name                    Last modified      Size  Description
      Parent Directory                             -
      Caterpillar.app/        29-May-2007 19:40    -
      License.txt             29-May-2007 19:39   15K
      META-INF/               29-May-2007 20:02    -
      _darcs/                 29-May-2007 20:46    -
      build/                  29-May-2007 19:39    -
      data/                   29-May-2007 19:40    -
      graphs/                 29-May-2007 19:56    -
      lib/                    29-May-2007 20:02    -
      prefs/                  29-May-2007 20:02    -
      run_caterpillar.bat     29-May-2007 20:02  1.3K
      run_caterpillar.sh      29-May-2007 20:02  1.0K
      sql/                    29-May-2007 20:02    -
      src/                    29-May-2007 20:07    -
      testFeeds/              29-May-2007 20:07    -
      tests/                  29-May-2007 20:07    -
Caterpillar 3.0 README

Caterpillar 3.0 is a proof-of-concept app to demonstrate how incredibly useful Bayesian filtering can be when tasked with finding "interesting" articles instead of, or in addition to, spam. Download it here (10.7 MB). It works for me. I enjoy it, and it makes my life easier. Maybe it will make yours easier too.

How does it work?
Well, the simple explanation is that Caterpillar watches what you read and what you don't read. If you click on a title, Caterpillar figures it must have been at least somewhat "interesting" to you. If you click on a link in a post, it figures the post's contents were probably "interesting" too. If something sits around in your list for so long that it eventually gets purged, Caterpillar figures that its title probably wasn't very "interesting" to you.

In a surprisingly short amount of time Caterpillar is able to start picking out articles that you'll probably find interesting. These it will highlight in green, instead of the standard blue. That's it. The only time you should ever have to specifically train Caterpillar is when it gets something wrong, which isn't that often. So, if Caterpillar thinks something would be interesting to you but you really disagree, just select it and choose "Downvote" from the "Entries" menu. You can, of course, Upvote things too, but you really shouldn't need to.
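For the curious, here is a minimal sketch of the idea described above, written as a two-class naive Bayes filter over entry titles. It is an illustration only and not Caterpillar's actual source: the class name InterestFilter, the methods train and score, the word-splitting rule, and the add-one smoothing are all assumptions made for this example, and it targets a current JDK rather than Java 5.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of a two-class naive Bayes filter; not Caterpillar's actual source.
class InterestFilter {
    private final Map<String, Integer> interestingCounts = new HashMap<>();
    private final Map<String, Integer> boringCounts = new HashMap<>();
    private final Set<String> vocabulary = new HashSet<>();
    private int interestingTokens = 0;  // words seen in interesting titles
    private int boringTokens = 0;       // words seen in uninteresting titles
    private int interestingEntries = 0; // titles marked interesting
    private int boringEntries = 0;      // titles marked uninteresting

    // Lowercase the text and split it into alphanumeric words.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!token.isEmpty()) tokens.add(token);
        }
        return tokens;
    }

    // Record one observation: a click or Upvote (true), a purge or Downvote (false).
    void train(String title, boolean interesting) {
        Map<String, Integer> counts = interesting ? interestingCounts : boringCounts;
        for (String token : tokenize(title)) {
            counts.merge(token, 1, Integer::sum);
            vocabulary.add(token);
            if (interesting) interestingTokens++; else boringTokens++;
        }
        if (interesting) interestingEntries++; else boringEntries++;
    }

    // Estimated probability (0..1) that a title is interesting; 0.5 when nothing is known.
    double score(String title) {
        if (vocabulary.isEmpty()) return 0.5;
        int v = vocabulary.size();
        double total = interestingEntries + boringEntries;
        // Work in log space with add-one smoothing so unseen words never zero out a class.
        double logInteresting = Math.log((interestingEntries + 1) / (total + 2));
        double logBoring = Math.log((boringEntries + 1) / (total + 2));
        for (String token : tokenize(title)) {
            int ci = interestingCounts.getOrDefault(token, 0);
            int cb = boringCounts.getOrDefault(token, 0);
            logInteresting += Math.log((ci + 1.0) / (interestingTokens + v));
            logBoring += Math.log((cb + 1.0) / (boringTokens + v));
        }
        return 1.0 / (1.0 + Math.exp(logBoring - logInteresting));
    }

    public static void main(String[] args) {
        InterestFilter filter = new InterestFilter();
        filter.train("Java naive Bayes filtering tutorial", true); // clicked
        filter.train("Celebrity gossip roundup", false);           // purged unread
        System.out.println(filter.score("Bayes filtering in Java"));
    }
}
```

In this sketch a title scoring above 0.5 would be the kind of entry that gets highlighted in green, and a Downvote is just another call to train with false. How Caterpillar really tokenizes entries and weighs clicks against purges may well differ.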

Who should use it?
Caterpillar is good for two primary groups of people.
  1. People with way too many subscriptions and not enough time or energy to sift through them all for the posts that are actually worth reading.
  2. People who want a simple user interface for their feed reader.
And, of course, randomly curious geeks. Regardless of why you've chosen to try Caterpillar, it's important to remember that it is just a proof of concept and, as such, it may have some rough edges.

What about screenshots?
Visually it hasn't changed much since the 2.0 release and there are plenty of screenshots of that on the Caterpillar 2.0 site. There's now the highlighting of "interesting" items in green, a few more menu options, and a couple extra useful links and info when reading an article. See "Why Release it Now?" for why I don't have updated screenshots.

Requirements & Use
It requires Java 5 or higher, although if you have some pressing need, and you're a geek, you could compile it under Java 1.4.x and it should work.
To run it, just double click on the Caterpillar.jar file in your file manager and all should work. OS X users should be able to just double click on the pretty icon.
You can also use the command line: change to the Caterpillar directory and type: java -jar Caterpillar.jar

Give it a week.
If you read a crazy number of feeds like I do, you'll probably see Caterpillar start picking out new entries for you by the end of your first day. If you're like most people, it may take a bit longer. The more you use it, the faster it learns. It's that simple. Just don't click on random entries in hopes that it will help. It won't. Just read the entries you would normally read and let it learn what's really interesting to you.

How do I import feeds from my current aggregator?
Most feed aggregators will allow you to export your list of feeds as an OPML file. Rename that file exportedFeeds.opml and place it in the same directory as the Caterpillar.jar file. Then choose "Import Feeds" from the File menu and give it a while to go and download them. If you started it from the command line, you'll see it calling out the names of the feeds it's importing as it goes.
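If you want to check what's in an exported file before importing it, something like the following sketch will list the feed URLs it contains. It assumes the usual OPML subscription-list layout, where each feed is an outline element carrying an xmlUrl attribute. The class name OpmlFeedLister and the method feedUrls are made up for this example and are not part of Caterpillar; it uses only the XML parser that ships with the JDK.

```java
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper for inspecting an OPML export; not part of Caterpillar.
class OpmlFeedLister {
    // Returns the xmlUrl of every <outline> element that has one, in document order.
    static List<String> feedUrls(InputStream opml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(opml);
            NodeList outlines = doc.getElementsByTagName("outline");
            List<String> urls = new ArrayList<>();
            for (int i = 0; i < outlines.getLength(); i++) {
                Element outline = (Element) outlines.item(i);
                // Folder outlines have no xmlUrl; getAttribute returns "" for them.
                String url = outline.getAttribute("xmlUrl");
                if (!url.isEmpty()) urls.add(url);
            }
            return urls;
        } catch (Exception e) {
            throw new IllegalArgumentException("Not a readable OPML document", e);
        }
    }

    public static void main(String[] args) throws Exception {
        // Defaults to the file name Caterpillar expects in its own directory.
        String path = args.length > 0 ? args[0] : "exportedFeeds.opml";
        try (InputStream in = Files.newInputStream(Paths.get(path))) {
            for (String url : feedUrls(in)) {
                System.out.println(url);
            }
        }
    }
}
```

Run it from the directory holding exportedFeeds.opml, or pass a path as the first argument. An empty result most likely means the export isn't a subscription list in the layout assumed here.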

What about a manual?
The Caterpillar 2.0 site has the Caterpillar 2.0 docs which cover basically everything except the Bayesian learning stuff which I just covered in the "How does it work?" section above. See "Why Release it Now?" for why I don't have an updated manual.

So what do I mean by "proof of concept"?
Well, Caterpillar's a good app. I use it every day. But it's got some limitations, and right now I just don't have the time to fix them. So here they are in no particular order:

Why release it now?
Or, more to the point, why release it in an unfinished state? Well, I had intended to finish polishing it up and release it as a commercial product. But that was over two years ago, and there have just been too many other projects on my plate that are more important to me. I'd rather see people get some use out of it than have it continue to sit on my computer benefiting no one but me. I'm also hoping that some smart programmer at Google will see the value of positive Bayesian filtering and apply it to Google Reader and Gmail. Just imagine how awesome it would be if all those mailing lists you subscribe to had a filter looking for "interesting" posts for you, so that you didn't have to read everything or feel so overloaded that you end up reading nothing.

Wanna help?
If you're a Java geek feel free to download the source with Darcs from http://caterpillar.masukomi.org/code/caterpillar3
Tweak it however you want, add whatever feature you want, use the send feature of Darcs to send me a patch file (masukomi at masukomi dot org) and I'll probably add it in. I figure any forward motion in Caterpillar is good at this point. The only restriction is that it needs to come with a unit test. Yes, I know, it seems hypocritical in light of the utter lack of tests in Caterpillar's source, but since I wrote it I got the full-on testing religion so... deal :P . To build Caterpillar, just switch into the build directory and run ant.

Bugs & Feature requests

Report a Caterpillar bug
Request a Caterpillar feature

License & Copyright
The Caterpillar feed aggregator version 3.0 is copyright 2007 Kate Rhodes (masukomi at masukomi dot org) and is released under the GPL v2.0. Have fun. Don't blow anything up. Convince your rich company that they should buy the source from me so that they can sell it under any license they want. Or hire me and pay me a decent salary. Either / or...

    This program is free software; you can redistribute it and/or modify
    it under the terms of the GNU General Public License as published by
    the Free Software Foundation; either version 2 of the License, or
    (at your option) any later version.

    This program is distributed in the hope that it will be useful,
    but WITHOUT ANY WARRANTY; without even the implied warranty of
    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
    GNU General Public License for more details.

    You should have received a copy of the GNU General Public License along
    with this program; if not, write to the Free Software Foundation, Inc.,
    51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.