Metadata Rhymes with Metadata

Metadata!

What is it? It’s data, but it’s more than that: it’s data about data. What does that mean? Well, to pick a hypothetical situation, let’s say Karl takes a photograph of me biting Abby. The photograph is data. But let’s say, the photo was taken on August 9th, 2003, at the Won Kee in Plymouth, New Hampshire, while the photographer was picking his nose. That’s data about our photograph, or data about data. In a word, it’s metadata.

“So,” you’re asking, “why should I care?” Well, I’m glad you asked! Sit a spell, won’t you?

I read a lot of web comics. Generally, they’re pretty funny, and occasionally I like to pass certain comics off to friends. Trouble is, I can hardly ever find said comics when I really want to. Most of the comics I read have several hundred archived strips. Sinfest alone has nearly 1,300. Since they exist only as images, there is virtually no way to search through past strips for a particular word or phrase or character.

Enter Goats.

That came out wrong.

Anyway, Goats. Goats records the script for each comic, as well as each frame’s location and the props involved. Do a search for “kittens, pop tarts” to see just how useful this saved script data is. (Hint: kittens = pop tarts.) This is amazing, and every comic should be creating this sort of metadata. It’s something I’ve been thinking about of late, and it’s great to see that a webcomic has already picked it up, and seems to have done a stellar job to boot.
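To make the idea concrete, here is a rough Python sketch of what a per-strip record and a search over it might look like. This is not how Goats actually stores or searches its data; the `Strip` class, its field names, the `search` function, and the two sample strips (dates included) are all made up for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Strip:
    """Hypothetical metadata record for a single comic strip."""
    date: str                                            # publication date (sample value)
    script: str                                          # transcribed dialogue
    locations: List[str] = field(default_factory=list)   # where each frame takes place
    props: List[str] = field(default_factory=list)       # objects that appear in the strip


def search(strips, *terms):
    """Return the strips whose script, locations, or props mention every term."""
    hits = []
    for strip in strips:
        # Flatten all the metadata into one lowercase string to match against.
        haystack = " ".join([strip.script, *strip.locations, *strip.props]).lower()
        if all(term.lower() in haystack for term in terms):
            hits.append(strip)
    return hits


# Invented sample archive, for illustration only.
archive = [
    Strip("2003-01-01", "Kittens are basically pop tarts.", ["kitchen"], ["toaster"]),
    Strip("2003-01-02", "I need a drink.", ["bar"], ["pint glass"]),
]

matches = search(archive, "kittens", "pop tarts")
```

Plain substring matching is the crudest thing that could work, but it shows why a transcribed script beats an image: once the words exist as text, finding a strip is a one-liner.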

I envision a distributed network of comic readers, each reading a few past strips of their favorite comics and creating a huge database of all the comic metadata you could ask for. That would be joyous. Deep sigh. I’ll Google for it tomorrow. ;)


5 Comments

  1. Matt says:

    Penny Arcade has a keyword search which contains props for its comics. I hate to admit how many times I’ve searched for Fruit Fucker.

    I am a huge advocate of metadata. My only problem with your distributed network is having too much metadata. You have to ask yourself what’s important and what’s not. Let’s say I saw a comic, but I don’t remember what strip it was or who the creator was. All I remember is that it had a cat. The cat is pretty damn funny. Well, if I searched for a web comic that has cats and a result came back for every comic that had a cat in the background (or any little thing that might resemble a feline), I’d probably short myself. Yeah, you could have relevance rankings, but (as Google shows us) sometimes somebody else’s ranking is much different than our own.

    Metadata is very cool, though. I really hope someone creates a standard for it and incorporates it into all OSes.

  2. The problem with Google and the kind of search you talked about is the lack of distinction between the different “cats,” as it were. I find it hard to search on Google because nearly every result is either a) a weblog entry, b) a mailing list question, or c) an online catalog offering to sell me something. Google has no good way to separate types of pages, since it deals with no metadata whatsoever. (Indeed, there is hardly any metadata available.)

    It’s the same with your “cat” search. Maybe the character speaking was a cat, like in Garfield. Maybe it was just a background prop, like the cat in Penny-Arcade. Maybe it’s a desk lamp shaped like a cat. Maybe somebody said “cat.”

    Useful metadata is more than just a text file with keywords. You need ontologies (categorical breakdowns) as well.
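    As a rough sketch of the difference, here is some Python where each tag carries a role instead of being a bare keyword. Everything in it is hypothetical: the `Role` categories, the `Tag` record, the `find` function, and the strip IDs are invented for illustration and aren't taken from any real comic's search engine.

```python
from enum import Enum
from typing import List, NamedTuple, Optional


class Role(Enum):
    """Hypothetical categories for how a term relates to a strip."""
    SPEAKER = "speaker"      # a character who talks in the strip
    PROP = "prop"            # an object or background element
    DIALOGUE = "dialogue"    # a word somebody says


class Tag(NamedTuple):
    strip_id: str
    term: str
    role: Role


# Invented sample tags: three strips that all involve a "cat" in different ways.
tags = [
    Tag("strip-a", "cat", Role.SPEAKER),
    Tag("strip-b", "cat", Role.PROP),
    Tag("strip-c", "cat", Role.DIALOGUE),
]


def find(tags: List[Tag], term: str, role: Optional[Role] = None) -> List[str]:
    """Return IDs of strips tagged with `term`, optionally limited to one role."""
    return [
        tag.strip_id
        for tag in tags
        if tag.term == term and (role is None or tag.role is role)
    ]


every_cat = find(tags, "cat")                   # keyword-only search: all three strips
talking_cats = find(tags, "cat", Role.SPEAKER)  # only strips where a cat speaks
```

    A bare keyword search for “cat” returns all three strips, while the role filter narrows it to the one you actually remember.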

  3. Matt says:

    I understand that completely, and that’s the whole point of metadata: to categorize certain aspects of data. I just got the impression from your post that you were suggesting a “keywords” file associated with each comic that anyone could contribute to. Which scares me.

  4. Karl says:

    Hey, sexylosers does it by characters that are in the comic. (:

  5. Search rocks
    Jon just forwarded me a link to this post in praise of our search engine. I love reading this kind…
