Will.Whim

A weblog by Will Fitzgerald

Monthly Archives: April 2007

for people to comment

I used the term ‘shyster’ in a post on a mail list, and someone thought it derogatory. Looking into it, I agree; I apologized and will not use it in the future. Others want to discuss whether or not it is derogatory; here’s a place to discuss this.

I reserve the right to remove comments that are or approach hate speech.

Future of Search conference

The Future of Search conference coming up on May 4 looks interesting, with lots of heavy hitters from Google, Yahoo, Microsoft and other companies; Powerset’s CEO, Barney Pell, is on the list.

WordNet, saints and Robin Hood

Why are you reading this, when you should be reading Natalia’s ball-peen hammer to the head?

One of the many things that it takes for a computer to “understand” text (as we are trying to do at Powerset) is for it to recognize names and what they refer to. So, for example, in the sentence “Maid Marian is the female companion to the legendary figure Robin Hood.”, a computer needs to see that the names Maid Marian and Robin Hood are present, and have some kind of internal representation of Ms. Marian and Mr. Hood.

WordNet is a kind of super-dictionary that knows things like John F. Kennedy was a president, and Robin Hood is a fictional character. Alas, it knows naught of Marian (although it does know Little John).

But, all abstractions are leaky, so I’m not particularly interested in bashing WordNet. Of course there will be gaps—some small (like not having Marian); some large (WordNet thinks saints of the Catholic flavor are a kind of god). One of the things I’m working on at Powerset is addressing some of these gaps.

What’s more disturbing is when there seem to be structural problems in the representational system. I am currently working on “named entity recognition,” meaning (at least) building systems that find names of things in running text and know what types of things are being named. So, last week, I was trying to get a list of all the names of people in WordNet. Unfortunately, fictional characters are not “people” to WordNet. Oddly, there is one exception: Ali Baba is both a fictional character and a woodcutter (which, in turn, is a kind of person). But Ali Baba doesn’t cut down trees and chop wood as a job (as the WordNet gloss has woodcutters do); he cuts down imaginary trees and chops imaginary wood for his imaginary job; everything about him is fictional, right down to his forty thieves (well, maybe not the fortiness, but that’s another essay).
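To make the problem concrete, here is a minimal sketch of the kind of lookup I mean. It does not touch the real WordNet; the HYPERNYMS table is a toy I made up for illustration (the entries and the is_a helper are assumptions, not WordNet's actual data or API). It just follows "is a kind of" links upward and asks whether a name ever reaches "person".

```python
# Toy "is a kind of" graph. The entries are illustrative only and are
# NOT copied from WordNet; they just mimic the shape of the problem.
HYPERNYMS = {
    "John F. Kennedy": ["president"],
    "president": ["person"],
    "woodcutter": ["person"],
    "Robin Hood": ["fictional character"],
    "Little John": ["fictional character"],
    "Ali Baba": ["fictional character", "woodcutter"],
    "fictional character": ["imaginary being"],
}


def is_a(term, target):
    """Return True if target can be reached from term by following hypernym links."""
    seen = set()
    stack = [term]
    while stack:
        node = stack.pop()
        if node == target:
            return True
        if node in seen:
            continue
        seen.add(node)
        stack.extend(HYPERNYMS.get(node, []))
    return False


NAMES = ["John F. Kennedy", "Robin Hood", "Little John", "Ali Baba"]

# Keep only the names that reach "person" somewhere up the chain.
people = [name for name in NAMES if is_a(name, "person")]
print(people)  # ['John F. Kennedy', 'Ali Baba']
```

In this toy, Robin Hood and Little John drop out, and Ali Baba sneaks in only because of his second job as a woodcutter.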

I think the right solution for this, within the WordNet framework, is to take advantage of WordNet’s adjectives. That is, noun types (like person or woodcutter) could be modified by adjectival descriptions (like fictional or imaginary). After all, this is what adjectives are for, more or less. There’s a bunch of tricky stuff involved in doing this, but it’s the kind of thing lexicographers love, I think. So let ‘em at it!
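As a rough sketch of the adjective idea (my own toy representation, not anything WordNet provides): each name gets a noun type plus a set of modifiers, and "fictional" becomes something you filter on rather than a separate branch of the hierarchy. The TypedName class, the PERSON_TYPES set and the entries are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TypedName:
    """A name, its noun type, and adjectival modifiers (all hypothetical)."""
    name: str
    noun_type: str                      # e.g. "person" or "woodcutter"
    modifiers: frozenset = frozenset()  # e.g. frozenset({"fictional"})


# Noun types that count as kinds of person in this toy example.
PERSON_TYPES = {"person", "woodcutter", "president"}

ENTRIES = [
    TypedName("John F. Kennedy", "president"),
    TypedName("Robin Hood", "person", frozenset({"fictional"})),
    TypedName("Ali Baba", "woodcutter", frozenset({"fictional"})),
]


def people(entries, include_fictional=True):
    """List names whose noun type is a kind of person, optionally skipping fictional ones."""
    return [
        e.name
        for e in entries
        if e.noun_type in PERSON_TYPES
        and (include_fictional or "fictional" not in e.modifiers)
    ]


print(people(ENTRIES))
print(people(ENTRIES, include_fictional=False))
```

The point of the design is that Robin Hood and Ali Baba are treated the same way: both are people, both are fictional, and a named entity recognizer can decide for itself whether the fictional ones count.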

(Rewritten April 29; translated into English)

Semantic Parser, best in show

I got an email today from Larry Hunter at University of Colorado School of Medicine, who writes to say that a DMAP-based parser was the “international world-champion” in the BioCreAtIvE (Critical Assessment of Information Extraction systems in Biology) data text mining challenge. The chart above shows the results from the Protein-Protein Interaction Task.

My colleague at Powerset, Jim Firby, and I worked on some of the semantic parsing technology used by the Colorado group. Members of the Center for Computational Pharmacology Biomedical Text Mining Group were the direct researchers on this project, though. Congratulations, Larry and team!

Two small occasional works

This has been a delightful weekend. On Saturday, I attended the Golden Gate Sacred Harp singing in San Francisco (with a nice Après-chant at Philip’s house, nicely described by Linda, who also has pictures of the singing). In the morning today, Jeff Shrager took me to see redwoods and banana slugs and tidepools, capped by some nice bluegrass music at a cafe. Worship at Sojourners was also good.

Back in Michigan, I missed some cool things happening. Bess hosted an Earth Day parade for our neighborhood, with lots of bikes and kids and people dressing up like something they admired in nature (Bess went as a maple tree). Some 60 people came; I wish I could have been there. Or perhaps at Sam Sommers and Beth Hall’s house blessing in Elkhart that I was disappointed to miss.

Anyway, I wrote two small occasional works for this weekend. One was the opening prayer for the Golden Gate singing; the other was a text to sing to a new tune that Thomas Malone composed for Sam and Beth’s house blessing. First, the text for the tune (in “Common Meter,” so you can sing it to Amazing Grace or the Gilligan’s Island theme song). It helps to remember Sam and Beth’s last names:

Home Blessing

The halls we build seem grand and strong
And sturdy in our sight;
Eternal eyes can see them fall
Swift as a summer’s night.

So, gracious Lord, we ask Thou grant
Our homes be built on praise,
And guide us to Thy boundless halls
And ceaseless summer days.

And here’s the prayer, with footnotes, even, since it’s a bit of a pastiche:

Lord,

When you laid earth’s foundation the morning stars sang together and the angels shouted for joy [1]. Out of the mouths of babes and sucklings you have perfected praise [2]. The very stones would cry out should we forget our voices [3].

We have neither the innocence of babes nor subtlety of angels to praise you as you deserve. What we do have is this time and this place and these voices.

So, in the singing of our spiritual songs, let us sing with a soul flying away to you; help us not give over the struggle till we feel ourselves come into a holy symphony with the saints [4] who with the angels and babes and stones and stars shout out your perfect praise.

I ask this in the name of Jesus Christ, the author and perfecter of our faith [5]. Amen.

[1] After Job 38:4,7.
[2] Matthew 21:16. On the title page of William Billings’s New England Psalm-Singer, the first book composed of music written in America.
[3] After Luke 19:40
[4] After Cotton Mather (from The Accomplished Singer, quoted in the biography William Billings of Boston, by David P. McKay and Richard Crawford).
[5] After Hebrews 12:2

Predictshunz 4 2000 frm 1900

K h00t!:

“There will be No C, X or Q in our every-day alphabet. They will be abandoned because unnecessary. Spelling by sound will have been adopted, first by the newspapers. English will be a language of condensed words expressing condensed ideas, and will be more extensively spoken than any other. Russian will rank second.”

C,Q–ok! But X? wtf? No haxxors n 1900, imho.

& Russia? wft? Xhina #2. English #1, ok. (English #1 in 1900, I spek).

ttyl.

MP3 players for every student!

Last night, we went to a Good Friday service which included a meal, and I sat down next to my very Republican friend, Butch, who never has anything good to say about government (despite, or perhaps because of, having been a government social services employee for years). We don’t usually talk politics–the gap between us is too large and our friendship too fragile for this–but he started complaining about recent proposals to raise taxes to fix Michigan roads. Our roads are a bit of a mess, a special shame in the state that started and for many years hosted the automobile industry. I asked him whether he thought we should just leave the roads unfixed; his point was that fixing the roads is always given as the reason to raise taxes, but the money goes to other purposes altogether. At this point, I needed to be quiet. I didn’t have any facts at hand, and I wasn’t going to win any arguments in any case.

But Michigan is in a pretty bad way, and it will take innovative government and private partnerships to work our way out. So it was depressing to read that the House Democrats are proposing spending $38 million to provide MP3 players (using, I think, ‘iPod’ as a generic term for an MP3 player) to every Michigan student for individualized education, according to the best write-up I can find, from Michigan Technology News.

It is very hard for me to believe the Dems would suggest such a chuckle-headed proposal, which has no chance to pass, and, if passed, would make absolutely no positive contribution to the education of Michigan children. This is definitely a ‘Worse Than Failure’ moment; it makes one think that sneaky Republican pirates have p0wnd the budget proposal. But alas: ‘never attribute to malice what can be adequately explained by stupidity’ seems to apply here.

I don’t think this will help convince Butch that raising taxes might be needful.

Daylight saving time shift

It looks like the early change-over to Daylight Saving Time failed to save energy; the failure was predicted by Berkeley scientists and even the Department of Energy.