It’s been very enlightening finding out that we’re still smarter than those little programs that go around and index websites. I realize that computers like IBM’s Watson are making waves on Jeopardy, but semantic understanding seems to still be primarily the domain of humans. Bots come up with the darndest things!
A team of really great indexer/taxonomists and I are learning first-hand, in the database project we’re working on, how literal web bots are when they “crawl” a website looking for terms. Our main job is to analyze their “choices,” keeping the good and throwing out the ridiculous. The web crawler bots go to the sites that the client wants to get information from and come back with some real doozies, because all those terms on the site are just 1s and 0s to the bot. No distinction between “beadsandbaubles” and “beads and baubles” (and no sense that beads should be a separate keyword from baubles). No concept that “industry” is such a general term that it is virtually useless for search in the context we’re working in.
Ah, there it is. Context. Meaning. Only the humans know what those little scratches called “letters” actually mean when they are combined. Semantics and significance are still important to the humans who search for information, so, in order to provide them with quality as well as quantity in search results, we get to make decisions that enhance a site’s quality in the minds of potential customers. A valuable application of our semantic understanding and an important contribution to the businesses we serve.
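For the technically curious, here is a rough Python sketch of the two judgment calls described above: pulling apart a run-together term like “beadsandbaubles” and tossing a too-general term like “industry.” It is only an illustration, not the tool we actually use. The function names, the tiny vocabulary, and the too-general list are all made up for this example, and notice that a human still has to supply both lists.

```python
def split_run_together(term, vocabulary):
    """Split a term with no spaces into known words; return None if it can't be done."""
    term = term.lower()
    # best[i] holds a list of known words that exactly covers term[:i], or None.
    best = [None] * (len(term) + 1)
    best[0] = []
    for end in range(1, len(term) + 1):
        for start in range(end):
            if best[start] is not None and term[start:end] in vocabulary:
                best[end] = best[start] + [term[start:end]]
                break
    return best[len(term)]


def review_terms(raw_terms, vocabulary, too_general):
    """Return candidate keywords in first-seen order, minus the too-general ones."""
    kept = []
    for raw in raw_terms:
        for piece in raw.lower().split():
            # Fall back to the piece itself when it can't be split into known words.
            words = split_run_together(piece, vocabulary) or [piece]
            for word in words:
                if word not in too_general and word not in kept:
                    kept.append(word)
    return kept


# Hypothetical lists that a human indexer would have to curate.
vocabulary = {"beads", "and", "baubles", "industry"}
too_general = {"and", "industry"}

crawled = ["beadsandbaubles", "beads and baubles", "industry"]
keywords = review_terms(crawled, vocabulary, too_general)
print(keywords)
```

With these made-up lists, both spellings of the phrase boil down to the same two keywords, and “industry” is dropped. The code never decides what counts as a real word or a useless one; it only applies the lists a person wrote.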
PI’s Pick of the Week
Let me introduce you to Louise Harnby, a professional proofreader and a very fine writer of extremely useful information on the proofreading task and the freelance word expert’s life. Her blog, The Proofreader’s Parlour (yes, she’s British, and “parlour” is fine), is always worthwhile (although now, by linking to it, I’ll have to spend extra time proofreading this post!).