Witigonen

How to Actually Search Craigslist

Posted by James on October 29, 2007 at 12:39 p.m.
Yahoo's Pipes makes it possible to perform advanced combined searches in Craigslist

Craigslist is an amazingly useful site. I have bought many things from its listings and even got my current job through it. I hope to find my next job through it too, but my field goes by many nebulous names: library, catalog, information science, archive, museum. Also, I don't want to work just anywhere in the Bay Area, but wouldn't mind being in the South Bay or the Peninsula. In situations like this, which are likely quite frequent, Craigslist is not so helpful. There is no way to truncate searches, such as "librar*" to include librarian, library, libraries, etc. There is no way to perform Boolean AND, OR, NOT searches. There is no way to remove frequently occurring irrelevant items. There is no way to search two sub-regions at once. So, unless I want to perform 20 searches a day and receive MANY completely irrelevant hits, I basically have to browse.

Until Yahoo Pipes, that is. This site makes Craigslist searching powerful and incredibly useful. It allows you to combine any number of Craigslist searches via Craigslist's RSS feeds (see the link at the bottom of your search results) and then manipulate them using the built-in RSS field values, such as title, text, date, etc. I designed my search by including every variation I could think of for the relevant terms, filtering out non-unique links (you could do this to titles too, but I didn't want to risk losing a job with the same title), filtering out the frequent false positives I saw, sorting by date, and then dumping the result to an RSS feed.

The site works with a fairly simple module system. On the left-hand side of the Pipes workflow you have a variety of module categories and within them various modules, the titles of which are usually self-explanatory. There is, however, decent documentation on-site if you can't figure one out. For Craigslist, as I said, it's simplest to use RSS feeds, and for that you use the Fetch Feed module. You simply enter the URL for the feed you want and click the plus sign to add as many more as you want. By including several search feeds, you are in essence building your Boolean OR search.
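For readers who think in code, here is a minimal Python sketch of what combining feeds amounts to. It is not how Pipes works internally. The two feed strings are made-up sample data standing in for the RSS of two saved searches, and `parse_items` and `fetch_feeds` are hypothetical helper names.

```python
import xml.etree.ElementTree as ET

# Made-up sample feeds (assumption): stand-ins for two saved-search RSS feeds.
LIBRARY_FEED = """<rss version="2.0"><channel>
<item><title>Library assistant</title><link>http://example.org/post/1</link></item>
<item><title>Reference librarian</title><link>http://example.org/post/2</link></item>
</channel></rss>"""

ARCHIVE_FEED = """<rss version="2.0"><channel>
<item><title>Archive technician</title><link>http://example.org/post/3</link></item>
<item><title>Reference librarian</title><link>http://example.org/post/2</link></item>
</channel></rss>"""


def parse_items(rss_text):
    """Turn one RSS document into a list of {'title', 'link'} dicts."""
    root = ET.fromstring(rss_text)
    return [
        {
            "title": node.findtext("title", default=""),
            "link": node.findtext("link", default=""),
        }
        for node in root.iter("item")
    ]


def fetch_feeds(feed_texts):
    """Concatenate the items of every feed: the OR of the individual searches."""
    merged = []
    for text in feed_texts:
        merged.extend(parse_items(text))
    return merged


merged = fetch_feeds([LIBRARY_FEED, ARCHIVE_FEED])
for item in merged:
    print(item["title"], item["link"])
```

Note that a posting matched by more than one search shows up once per search in the combined list.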

Then you can manipulate the information gathered from the feed in a variety of ways. Most of these fall under the category "operations". One that basically any good Craigslist search will want to use is "Unique". The fields offered in its drop-down are derived from the fields of the RSS feed itself. There are a few generic fields as well, but clearly the power of Pipes comes from that integration. Note that this only happens once you connect the feed and Unique modules with the blue "pipe."
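As a rough illustration of what de-duplicating on a field does, and why I chose the link rather than the title, here is a small Python sketch. The items are invented sample data and `unique_by` is a hypothetical helper, not a Pipes API.

```python
def unique_by(items, field):
    """Keep only the first item seen for each distinct value of `field`."""
    seen = set()
    kept = []
    for item in items:
        key = item[field]
        if key not in seen:
            seen.add(key)
            kept.append(item)
    return kept


# Invented sample data (assumption): one posting matched twice, plus a
# different posting that happens to share its title.
items = [
    {"title": "Reference librarian", "link": "http://example.org/post/2"},
    {"title": "Archive technician", "link": "http://example.org/post/3"},
    {"title": "Reference librarian", "link": "http://example.org/post/2"},
    {"title": "Reference librarian", "link": "http://example.org/post/7"},
]

by_link = unique_by(items, "link")    # drops only the true duplicate
by_title = unique_by(items, "title")  # also drops the distinct job at post/7

print(len(by_link), len(by_title))
```

De-duplicating on the title is more aggressive: it would discard a separate job that merely shares a title with one already seen.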

Filter is obviously going to be important as well. It likewise bases its field drop-down on the RSS feed that went into it. You can set it to include or exclude items that match any or all of your rules. If you would like an include list and an exclude list, you simply add a second Filter module and connect the two. This is the NOT of your Boolean search.
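The include-then-exclude arrangement can be sketched in Python as two passes over the items. This is only an analogy under my own assumptions: `filter_items`, its `mode` and `match` parameters, and the sample titles are all hypothetical, and regular expressions stand in for the truncated search ("librar*") that Craigslist lacks.

```python
import re


def filter_items(items, patterns, field="title", mode="permit", match="any"):
    """Permit or block items whose `field` matches any/all of the regex patterns."""
    combine = any if match == "any" else all
    kept = []
    for item in items:
        hit = combine(re.search(p, item[field], re.IGNORECASE) for p in patterns)
        # In permit mode keep the hits; in block mode keep the non-hits.
        if hit == (mode == "permit"):
            kept.append(item)
    return kept


# Invented sample data (assumption).
items = [
    {"title": "Librarian, part time"},
    {"title": "Library shelving for sale"},
    {"title": "Museum registrar"},
    {"title": "Delivery driver"},
]

# First pass: include list. r"\blibrar\w*" plays the role of "librar*".
wanted = filter_items(items, [r"\blibrar\w*", r"\bmuseum\b"], mode="permit")
# Second pass: exclude list for a frequent false positive (the NOT).
results = filter_items(wanted, [r"\bfor sale\b"], mode="block")

print([item["title"] for item in results])
```

Chaining two calls mirrors connecting two Filter modules: the output of the include pass is the input of the exclude pass.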

Once you have performed all your selecting operations, you will likely want to sort the results. This is simple with the Sort module, which once again offers the fields in the RSS feed itself.
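Sorting by date is worth a small sketch, because dates compared as plain text can come out in the wrong order when feeds use different time zones. The Python below assumes each item carries an ISO 8601 timestamp with a UTC offset in a `date` key; that format, the sample items, and the `sort_by_date` helper are assumptions for illustration.

```python
from datetime import datetime


def sort_by_date(items, newest_first=True):
    """Sort on the parsed timestamp rather than on the raw date string."""
    return sorted(
        items,
        key=lambda item: datetime.fromisoformat(item["date"]),
        reverse=newest_first,
    )


# Invented sample data (assumption): postings from different time zones.
items = [
    {"title": "Cataloger", "date": "2007-10-28T18:30:00-07:00"},
    {"title": "Museum registrar", "date": "2007-10-29T11:00:00-04:00"},
    {"title": "Library assistant", "date": "2007-10-29T09:00:00-07:00"},
]

for item in sort_by_date(items):
    print(item["date"], item["title"])
```

Here the 09:00 Pacific posting is actually later than the 11:00 Eastern one, which a plain string sort would get backwards.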

Throughout the process, you can evaluate how your operations are affecting the results by using the debugger at the bottom of the screen. From it you can select any of the modules and see its output, which will show you whether you messed something up and have no results, or whether you are leaving out an important filter. You can also change the order in which the data passes through the modules. Once you've saved the pipe and are taken to the results, you can get a link to it in any of the various ways you might want the feed sent to you.

The end result: my RSS reader dumps several relevant hits a day, along with a few bad ones of course, but far fewer than before, and I spend no time performing the same searches over and over. To start with a completed pipe and modify it, see my job search pipe.


Comments

  • Anyone know what the various craigslist uniques actually are? for example, which one is "city" or "neighborhood" - I'm pulling feeds from 20 different cities, and I want to sort by city. Is that "item.dc:source" or one of the others?

    Posted by: Moe on November 02, 2007 at 12:30 p.m.
    • If you select an item in the debugger, it will drop down and show the names of the fields and their values. As for city, that's actually not a part of the RSS feed fields, because it is built into the URL; i.e. sfbay.craigslist.org/sfc is clearly pulling SF stuff, sfbay.craigslist.org/sby is South Bay, portland.craigslist.org is pulling Portland, etc.

      As far as I can tell, the easiest thing to do for your case would be to use the rename module and rename a field that's unimportant for your purposes, such as "item.dc:rights" to the name of the city. In this case, you'll need a separate fetch feed module for each city, pipe each to a rename, and then join them to sort. It would obviously be best to simply add a field for this, but I don't see a module that would do such a thing.
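For comparison, here is a rough Python analogue of that workaround. Outside Pipes a new field can simply be added, so no existing field has to be sacrificed. The `tag_city` helper, the `city` key, and the sample items are all hypothetical.

```python
def tag_city(items, city):
    """Return copies of the items with a 'city' field added."""
    return [dict(item, city=city) for item in items]


# Invented sample data (assumption): one list of items per city feed.
feeds_by_city = {
    "portland": [{"title": "Table saw"}, {"title": "Draft horse"}],
    "sfbay": [{"title": "Band saw"}],
}

# One "fetch" per city, tag each, then join and sort by city.
merged = []
for city, items in feeds_by_city.items():
    merged.extend(tag_city(items, city))
merged.sort(key=lambda item: (item["city"], item["title"]))

for item in merged:
    print(item["city"], item["title"])
```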

      Posted by James on November 02, 2007 at 7:23 p.m.
  • This is simply an amazing way to search craigslist. Thank you very much for creating this guide.

    I noticed something odd when I subscribed to my custom pipe's RSS. When I create an RSS feed for my custom pipe, it appears that it doesn't include the sort module. The sort by item.dc:date module works fine and displays correctly on my Yahoo Pipes list, but when I create an RSS feed of the custom pipe to use in another program like Firefox, the results are no longer sorted by date. They get grouped by each individual search feed in my pipe instead.

    Anyone else notice this?

    Posted by: Gen on November 03, 2007 at 2:50 a.m.
    • I'm glad you liked it. I haven't had the problem you describe. If you publish your pipe and put a link here, I could take a look at it and see what might be causing it.

      Posted by James on November 03, 2007 at 3:46 p.m.
  • Greetings from Portland :)

    This is really a fantastic tutorial. I wasn't even aware that Yahoo Pipes existed, but this is really going to make my life easier!

    Any other applications for Pipes that you've found noteworthy? And I notice you're running on Django, which I've investigated quite a bit recently (coming from a Rails background). What's your experience with it been?

    Posted by: Jeremy Wilkins on November 03, 2007 at 1:04 p.m.
    • Django is great - if you're doing what it's designed for. If you're using the admin interface, and have trusted users editing structured content then it's wonderful. Templating is also marvelous.

      If you have some specific questions or want to talk about it, feel free to email me.

      Posted by Michael on November 03, 2007 at 1:25 p.m.
    • I'm glad you liked the tutorial. This is the only use for Pipes I've had personally, but I know that there are tons of different pipes published with a wide variety of uses. It works well as an aggregator of just about anything as well as a way to search any feed with poor or non-existent search capabilities. Look around at the featured pipes on the site.

      Posted by James on November 03, 2007 at 3:51 p.m.
  • Great post. I have been using pipes with CL for a few months to combine search results from a few of the different city sites in my area. That way I can get a little larger area searched for items I am looking for (mainly tools and horses). There are some great tips in here to make those searches a little more exact.

    Posted by: chris on November 05, 2007 at 3:07 p.m.
  • FYI, you can do some simple NOT filtering on craigslist items by simply prepending a hyphen to a word. Alas, like you said, you cannot use word prefixes to catch variations (like libr*). Yahoo Pipes looks very promising if I need to do complex online searches. Thanks for the pointer.

    Posted by: Greg on November 18, 2007 at 11:44 p.m.
  • If you're searching craigslist often, you might find this site useful... it sends you an email or text message whenever something you're searching for gets posted:

    http://craigslistwatch.com

    Posted by: BW on December 17, 2007 at 2:22 a.m.
  • I'm using this search browser: http://www.craigspal.com

    It is made to search Craigslist....wild.

    Posted by: Peter on December 24, 2007 at 2:32 p.m.
