Random: the internet is funny

Technical Information

General: How it works
Random interacts with the server-side software in several ways. It checks whether it has enough searches left to continue; it retrieves the designated "Comic of the Week"; it retrieves and renders comics from the archive (coming soon!); and it interfaces with the search software to render comics on the fly.
Flash: Front-end
The Flash document is a single Flash movie of approximately 350KB. I have experimented with multiple movies, but a single movie seemed to work best and was the easiest to implement. In the future, I may choose to use multiple movies that load on the fly. (Suggestions about implementation are welcome.)

The movie interacts with the server-side software using the loadVariables movie clip function. The server-side software returns variables as needed, and all the comic rendering is done on the Flash side. The movie uses the returned values to choose a situation and fill in the word bubbles.
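For readers who have not used loadVariables: it reads the response body as URL-encoded name=value pairs joined with ampersands. Below is a minimal sketch of building such a response. It is written in Python for illustration, not the Perl the site actually uses, and the variable names (`situation`, `bubble1`, `bubble2`) are hypothetical rather than taken from Random's code.

```python
from urllib.parse import parse_qs, quote, urlencode


def build_flash_response(variables):
    """Encode a dict as name=value pairs joined by '&'.

    quote_via=quote percent-encodes spaces as %20 instead of '+'.
    """
    return urlencode(variables, quote_via=quote)


# Hypothetical variable names, for illustration only.
body = build_flash_response({
    "situation": 3,
    "bubble1": "Is this thing on?",
    "bubble2": "Fish & chips, 50% off",
})
print(body)

# Round trip: decoding the body gives back the original text.
print(parse_qs(body)["bubble2"][0])
```

Percent-encoding matters here because quotes pulled from web pages often contain ampersands and percent signs, which would otherwise be read as separators or escape sequences.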

When users save their comic, the movie uses loadVariables again to interact with a different Perl script that records the strip as a set of variables.
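As a rough illustration of the save step, here is a Python sketch that parses a URL-encoded body of variables and appends it to an archive file. The storage format (one JSON object per line), the file name, and the variable names are all assumptions made for the example; the actual Perl script may record strips quite differently.

```python
import json
import os
import tempfile
from urllib.parse import parse_qs


def record_strip(post_body, archive_path):
    """Parse a URL-encoded body and append it to the archive as one JSON line."""
    parsed = parse_qs(post_body, keep_blank_values=True)
    # Keep the first value for each variable name.
    fields = {name: values[0] for name, values in parsed.items()}
    with open(archive_path, "a", encoding="utf-8") as archive:
        archive.write(json.dumps(fields, sort_keys=True) + "\n")
    return fields


# Example: record one hypothetical strip in a temporary archive file.
archive_path = os.path.join(tempfile.mkdtemp(), "strips.jsonl")
saved = record_strip("situation=3&bubble1=Is%20this%20thing%20on%3F", archive_path)
print(saved)
```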
Perl: server-side software
The server-side software is written in Perl, the "Pathologically Eclectic Rubbish Lister." Perl is a relatively loose language that handles data nicely. There may be other languages more suited to this task, but I'm most fluent in Perl. I use the CGI module and the Time::Local module, but not much else. I've released the code under the GNU GPL, so programmers interested in the code should email me to get a copy.
Google API
Google has a developer kit for people interested in using Google's search in their programs. As with nearly all search engines, it is against the terms of service to simply re-format the search results of the web page; the API is the appropriate way to use the service.

Unfortunately, the API limits developers to 1,000 searches per day. Thus, if users do too many searches on a single day, Random's "make your own comic" feature will be deactivated until the next day. My hope is that by the time the site gets popular enough to run out of searches (I'm making assumptions here!), I will have the archive up and running, giving users access to a significant number of old comics.
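The daily limit check can be sketched as a small counter that resets when the date changes. This is a Python illustration under stated assumptions, not the site's Perl: the class name and method are hypothetical, and the counter lives in memory, whereas a CGI script would have to keep the count in a file or database between requests.

```python
import datetime


class DailySearchQuota:
    """Counts searches per calendar day and refuses new ones past the limit."""

    def __init__(self, limit=1000):
        self.limit = limit
        self.day = None
        self.used = 0

    def try_consume(self, today=None):
        """Count one search and return True if any remain today, else return False."""
        today = today or datetime.date.today()
        if today != self.day:
            # First search of a new day: reset the counter.
            self.day = today
            self.used = 0
        if self.used >= self.limit:
            return False
        self.used += 1
        return True


quota = DailySearchQuota()
if quota.try_consume():
    print("search allowed")
else:
    print("out of searches until tomorrow")
```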
Search tool
The search tool is part of the Perl software I've written for this project. Below is a breakdown of the steps the program follows:
- Generates a query if the user did not enter one
- Checks to make sure it's not over the Google search limit for the day
- Gets the preliminary search from Google
- Chooses three pages from the top ten results to download
- Gets the cached versions of the pages from Google
- Filters the pages (see "Filter tool" below)
- Ranks the lines of text based on number of keywords and total word count
- Randomly grabs one sentence or 200 characters from the three top-rated lines
- Returns those top three choices to the Flash front-end
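The ranking and excerpt steps above could look something like the following Python sketch. The scoring rule (sort by keyword hits, then by word count) and the excerpt rule (one random sentence, cut off at 200 characters) are my guesses at one reasonable reading of the list; the function names are hypothetical and the real Perl code may weigh things differently.

```python
import random
import re


def score_line(line, keywords):
    """Return (keyword hits, word count) so lines sort by hits, then by length."""
    words = re.findall(r"[a-z0-9']+", line.lower())
    hits = sum(1 for word in words if word in keywords)
    return (hits, len(words))


def pick_excerpt(line, rng, max_chars=200):
    """Pick one random sentence from the line, capped at max_chars characters."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", line) if s.strip()]
    if not sentences:
        return ""
    return rng.choice(sentences)[:max_chars]


def top_three_quotes(lines, query, rng=None):
    """Rank lines against the query and return an excerpt from each of the top three."""
    rng = rng or random.Random()
    keywords = set(query.lower().split())
    ranked = sorted(lines, key=lambda line: score_line(line, keywords), reverse=True)
    return [pick_excerpt(line, rng) for line in ranked[:3]]


filtered_lines = [
    "Cats are funny. Funny cats rule!",
    "Nothing relevant here at all.",
    "A funny line.",
    "cats",
]
print(top_three_quotes(filtered_lines, "funny cats"))
```

Passing in a seeded `random.Random` makes the choice repeatable, which is handy when testing a tool whose whole point is randomness.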
Filter tool
The software runs each web page through a series of filters to get rid of tags, whitespace, and other junk that makes it difficult to get interesting quotes for the script. This filter system has been crafted by experiment over the last two years, and, while it's not perfect, it performs quite well.
- Removes anything before the <BODY> tag
- Removes other inline tags
- Removes many HTML special characters
- Removes whitespace
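A simplified Python sketch of the four filter steps follows. The regular expressions are illustrative assumptions, not the filters I actually use, which were tuned by experiment. In particular, treating a handful of block-level tags as line breaks is a choice made for this example so that the ranking step has separate lines to work with.

```python
import re

# Assumption: these tags mark line boundaries; every other tag simply vanishes.
BLOCK_TAGS = r"</?(?:p|br|div|li|tr|td|h[1-6])\b[^>]*>"


def filter_page(html):
    """Reduce an HTML page to a list of plain-text lines."""
    # 1. Drop everything up to and including the opening <body> tag.
    match = re.search(r"<body\b[^>]*>", html, flags=re.IGNORECASE)
    if match:
        html = html[match.end():]
    # 2. Turn block-level tags into line breaks, then remove all remaining tags.
    html = re.sub(BLOCK_TAGS, "\n", html, flags=re.IGNORECASE)
    html = re.sub(r"<[^>]+>", "", html)
    # 3. Remove character entities such as &nbsp; or &#169;.
    html = re.sub(r"&#?\w+;", " ", html)
    # 4. Collapse runs of whitespace and drop empty lines.
    lines = (re.sub(r"\s+", " ", line).strip() for line in html.split("\n"))
    return [line for line in lines if line]


page = (
    "<html><head><title>Skip me</title></head><BODY bgcolor='white'>"
    "<p>Hello&nbsp;<b>funny</b>   world</p>"
    "<div>Second   line &copy; 2005</div></body></html>"
)
print(filter_page(page))
```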

briley at curragh-labs.org || Copyright © 2005, Brendan Riley || Tuesday, 22-Mar-2005 18:40:32 PST