Irrsinn.net: taking joy in human unreason

Search-building: custom or Google

Until earlier this week, I had a lousy site search in place. It was one of Google’s Custom Search Engines, barely configured and only on its own page, due to its hefty (and blocking!) JavaScript. I’d long since disabled WordPress’s search since my stories aren’t being run in WordPress, and I didn’t feel like trying to chew on the internal search mechanisms to include the stories.

Last week, I started playing around with a project to create my own (Python) site search, including a crawler and Whoosh-based search. I’d seen the implementation of a Lucene search in Zend go fairly easy-peasy, and liked the idea of a self-hosted search.

The problem (well, one of the problems) is that the crawl time for a site with 1200 posts, most of which are low-priority, is a deal-breaker on a shared hosting provider. It takes far longer than 5 minutes just to collect the links, even with multiple threads. Add the parse time to get indexable content for 1200 pages, and I was stuck contemplating how to crawl and index the site in parts.
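
For the curious, crawling "in parts" could look something like the sketch below. It is not the code from this project: `crawl_batch`, `LinkParser`, the `state` dictionary, and the 240-second budget are all names and numbers made up for illustration, and `fetch` is assumed to be any function that takes a URL and returns that page's HTML. The idea is that each run works through the queue until a time budget runs out, then hands back the leftover queue so the next run can pick up where this one stopped.

```python
import time
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class LinkParser(HTMLParser):
    """Collect the href values of <a> tags (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl_batch(state, fetch, budget_seconds=240, clock=time.monotonic):
    """Crawl until the queue is empty or the time budget is used up.

    state: {"queue": [urls still to visit], "seen": [urls already queued]}
    fetch: assumed callable, url -> HTML string
    Returns (pages, new_state); new_state is plain lists, so it can be
    saved (as JSON, say) and passed to the next run.
    """
    queue = deque(state["queue"])
    seen = set(state["seen"])
    pages = {}
    deadline = clock() + budget_seconds

    while queue and clock() < deadline:
        url = queue.popleft()
        html = fetch(url)
        pages[url] = html  # hand these to whatever does the indexing

        parser = LinkParser()
        parser.feed(html)
        host = urlparse(url).netloc
        for href in parser.links:
            # Resolve relative links and drop any #fragment.
            link, _fragment = urldefrag(urljoin(url, href))
            # Stay on the same host and never queue a URL twice.
            if urlparse(link).netloc == host and link not in seen:
                seen.add(link)
                queue.append(link)

    return pages, {"queue": list(queue), "seen": sorted(seen)}
```

A first run would start from something like `{"queue": [home_url], "seen": [home_url]}`, and a scheduled job would keep calling `crawl_batch` with the saved state until the queue comes back empty. The budget is only checked between pages, so one slow fetch can still overrun it.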

This sounds like a great, fun project. ...Except that it’s already been done and I have other things I’d rather be doing. Google did it; their index for my site updates surprisingly quickly and doesn’t make me afraid that Dreamhost will smite me. (I’ve been with Dreamhost for several years now, and while I’ve learned how to properly deploy a site since moving here (from… Brinkster, was it?), I don’t relish the idea of learning a new environment for all the stuff I run here.)

So instead of the 4-5 hours I’d spent screwing with the Ikea-esque assembly of a site crawler and search, I spent two this week really making Google’s Custom Search Engine (CSE) work for me. Yes, there are ads. Yes, it’s not a solution that I own. (Then again, neither is my email, in that sense.)

August 24th 2010
Tags: Linkage

No Comments

Weekend linkage

Just a few, since I'm so far behind on my reading (down to 425 unread items!) and I just posted a set on Friday. Enjoy. How to Help Your Kids Build $25,000 Stock Portfolios - This is a tempting idea even with regards to something like weddings. "Help us ...
August 22nd 2010
Tags: Writing

No Comments

Chapter 10 of Witches posted!

Chapter 10, "Opening Salvo" is posted: "What brings you to my world, little bug?" the colorful woman asked Hardi and Robert. Her glance flitted between both of them before settling on Hardi. "Well?" she cooed, head tilting. Her accent was much lighter than Lucia's. Hardi straightened in her seat. "We're looking for ...
August 20th 2010
Tags: Linkage

2 Comments

Weekly linkage

This week's internet cruising: A Beginner’s Guide to Website Feedback - If I can wrap up and launch this damn character sheet app, stuff in this post will be handy for when it betas, especially the surveying. I suspect the LARPing audience will be sufficiently... opinionated to speak on it. Six Useful ...
August 10th 2010
Tags: Techiness, Writing

No Comments

“Pursuit” posted and character sheet news

Chapter 9 of Witches, "Pursuit," is live. Actually, it went live yesterday; I just neglected to post here for it. I was busy having my ass handed to me by some type of hydra. Damn D & D 4e hydras. It had seven heads by the end. "But ...
August 5th 2010
Tags: Techiness

3 Comments

Tools, Consumption, and Sin-Eaters

In working on my Character Sheet Manager for Geist characters, I'm finally building something that I've wanted for a couple of years now in projects at work -- an Exception logger. Just a piece of middleware that grabs exceptions and logs them somewhere. In my case that's Redmine, ...