
Knowledge elicitation for the knowledge interface


Yesterday Dzevad mailed me a link to a Google video that (as they say) blew my mind and gave me some much-needed inspiration for the knowledge elicitation that has to precede the business of building knowledge interfaces (let’s say, to begin with, simply in the two domains of “peoples’ museums” and ICT4D). The issue, in a nutshell, had been this:

  • museums, whose business (according to the ICOM Statutes) is to “acquire, conserve, research, communicate and exhibit, for purposes of study, education, enjoyment, the tangible and intangible evidence of people and their environment”, communicate a representation of, rather than the ‘raw’ (though that’s probably not the right word) actuality of, “people and their environment”. Several things interpose between the visitor and the ostensible subject, as much a veil as a window: the physical topography (and constraints) of the exhibition space; the limited collection of artefacts to which the museum has access (and, equally significantly, the artefacts that are missing); the decontextualisation within museum space of artefacts from their originating historical, geographical, social, political, economic, industrial, and epistemic contexts of occurrence and use; the exhibition design and display decisions; and the explicit commentaries and explanations presented to the museum visitor. Every representation is consequently to some degree a misrepresentation. I therefore have two interests, one as an anthropologist, the other as a cognitive scientist: [1] we best understand a people if we see them as they see themselves, know why they believe what they believe, know why they act as they act, know the world in the way they know the world, and can share their (inter-)subjectivity (see note 1); [2] since the intersubjectively experienced world of others is inevitably bound to be remote from our own, we need a ‘translation service’: a knowledge interface that intervenes not as an opaque and impermeable maquette of the Other but as an interpreter, a channel, a facilitator of ‘dialogue’.
  • much of what I’ve written above about museums applies equally to development practices in general and to the enterprise of ICT4D in particular. Projects have failed for want of dialogue, with donor and provider organisations endeavouring to impose solutions to misconstrued problems and misconceived opportunities. There is an urgent need to develop tools that will assist in the capture, structuring, and interpretation of the knowledge and interests of the end-user communities.

First, watch the video (51:31 minutes):

A quick summary (but, in case you’ve skipped it, do take the time to watch the video anyway):

  1. computers can’t interpret images, i.e., can’t tell what the image content is
  2. therefore image search on, say, Google is done on the basis of captions and on words found in the proximity of an image (and this may lead to errors)
  3. human beings, however, can interpret images, hence the best way to prepare images for search is to get human beings to label them
  4. Luis von Ahn devised the online ESP Game (see note 2) to engage web users, through an entertaining game, in labelling image content. Users are randomly paired and shown the same image; the goal is for each player to guess what the other player is typing. (Note: they are not asked to do the more restricting task of labelling the image; rather, the more open task of merely guessing what, when presented with a common image, the other player is typing: this will mean that an image with mixed content can ultimately be labelled for all of its constituent objects.) When they finally both type the same word (i.e. have agreed a label for the image), both players get points and can move on to the next shared image. The word on which the players agree will generally turn out to be a good descriptive label for (at least some object within the whole content of) the image, since it will have come from two independent sources. ‘Taboo’ words (words that two or more players have previously agreed on and that are thereafter excluded for that image) ensure that, over enough plays, each component element in an image gets labelled. (Von Ahn estimates that 5000 people playing regularly over a period of around two months would be sufficient to label all the images in Google Images.)
  5. Cheating is preempted by the use of already-labelled test images inserted at random in the sequence presented to paired players. Players’ common guesses are stored only if they also agree on the test images.
  6. The ESP Game will, over enough plays, record for any image a number of agreed labels for the component contents (e.g., man, hat, glasses, moustache, tie, eyebrows, …). But it won’t identify in what region of an image a specific component object is located. Hence a second two-player game: Peekaboom. In this game, one of the players (‘Boom’) gets an image (containing, say, a butterfly) and a word (‘butterfly’), taken directly from the output of the ESP Game; the partner player, ‘Peek’, initially has only a blank screen. Boom has to get Peek to guess the word, and does so by clicking on that part of the image that contains the corresponding object (in this case, the butterfly). That region (circled) of the total image is then replicated to Peek’s screen, and Peek has to guess the word. (Hints may be given by Boom with regard to part of speech … noun, verb, adjective, …) When Peek has guessed correctly, both players get points and they swap roles. The selected region is at the same time recorded for the image, and can then be used as the basis for an image search (with the object area of the image circled).
  7. The third game, Verbosity, collects ‘commonsense facts’ such as “water quenches thirst” or “cars usually have four wheels” (von Ahn’s examples). “Now the thing about commonsense facts”, says von Ahn, “is that it is estimated that each one of us has literally hundreds of millions of them in our head; and these are what allow us to act normal and navigate our world successfully”. The challenge is to get such facts into computers so that computers will be able to reason with them in much the same way (or, at least, with much the same output) that humans do. The game has two players, a Narrator and a Guesser; at the beginning of every round the Narrator is given a word and has to get the Guesser to guess it. To do so, he can select one of many sentence templates available for the word (such as “is a __” or “is typically near __”) and fill it with an appropriate word. Given, for example, the word MILK, he might use these two templates with the blanks filled in thus: “is a LIQUID” and “is typically near CEREAL”, and send them to the Guesser. When, after enough such hints, the Guesser guesses the word, both players get points. A more detailed description of Verbosity is published in von Ahn et al. (2006).
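The agreement-plus-taboo mechanism of the ESP Game (item 4 above) can be sketched roughly as follows. This is only an illustration in Python, not von Ahn's implementation: the function names `first_agreed_label` and `play_rounds`, the turn-by-turn interleaving of the two players' guesses, and the rule that a word becomes taboo after a single agreement are all assumptions made for the example.

```python
from typing import Iterable, List, Optional, Set, Tuple


def first_agreed_label(
    guesses_a: Iterable[str],
    guesses_b: Iterable[str],
    taboo: Set[str],
) -> Optional[str]:
    """Return the first non-taboo word that both players have typed.

    Assumption: guesses arrive one at a time, alternating between players.
    """
    seen_a: Set[str] = set()
    seen_b: Set[str] = set()
    iter_a, iter_b = iter(guesses_a), iter(guesses_b)
    while True:
        word_a = next(iter_a, None)
        word_b = next(iter_b, None)
        if word_a is None and word_b is None:
            return None  # both players ran out of guesses without agreeing
        if word_a is not None:
            word_a = word_a.strip().lower()
            if word_a not in taboo:
                seen_a.add(word_a)
                if word_a in seen_b:
                    return word_a
        if word_b is not None:
            word_b = word_b.strip().lower()
            if word_b not in taboo:
                seen_b.add(word_b)
                if word_b in seen_a:
                    return word_b


def play_rounds(rounds: List[Tuple[List[str], List[str]]]) -> List[str]:
    """Label one image over several plays, making each agreed word taboo."""
    taboo: Set[str] = set()
    labels: List[str] = []
    for guesses_a, guesses_b in rounds:
        label = first_agreed_label(guesses_a, guesses_b, taboo)
        if label is not None:
            labels.append(label)
            taboo.add(label)  # later pairs must agree on something new
    return labels
```

Feeding the same pair of guess lists through `play_rounds` several times shows the effect of the taboo list: each play is forced onto a different component of the image until the guesses are exhausted.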

The ESP Game is a ‘Symmetric Verification Game’: both players are given the same input and have to agree on a label. Peekaboom and Verbosity are ‘Asymmetric Verification Games’: player 1 is given an input (a picture object in Peekaboom, a word in Verbosity) and produces an output (a selected region of an image, a set of contingent facts); player 2, seeing only that output, has to guess the input.
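To make the asymmetric case concrete, here is a minimal Python sketch of a single Verbosity round. It is an illustration under stated assumptions rather than the published design: the name `play_verbosity_round`, the modelling of the Guesser as a callback, and the decision to record every hint sent before a correct guess as a collected fact are all hypothetical. Only the two templates and the MILK example come from the description above.

```python
from typing import Callable, List, Optional, Tuple

# A hint is a sentence template plus the word the Narrator fills in,
# e.g. ("is a __", "LIQUID").
Hint = Tuple[str, str]
Guesser = Callable[[List[Hint]], Optional[str]]


def play_verbosity_round(
    secret: str, hints: List[Hint], guesser: Guesser
) -> List[str]:
    """Send hints one at a time; return collected facts if the word is guessed.

    Assumption: a correct guess is taken as evidence that the hints sent so
    far are sensible facts about the secret word. A failed round yields nothing.
    """
    sent: List[Hint] = []
    for hint in hints:
        sent.append(hint)
        guess = guesser(sent)  # the Guesser sees only the hints, never the word
        if guess is not None and guess.strip().lower() == secret.strip().lower():
            return [f"{secret} {t.replace('__', f)}" for t, f in sent]
    return []
```

With the hints `("is a __", "LIQUID")` and `("is typically near __", "CEREAL")` and a Guesser that eventually answers "milk", the round would yield the sentences "MILK is a LIQUID" and "MILK is typically near CEREAL".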

Both the ESP Game and Peekaboom have obvious usefulness for ‘virtual museums’ in that they would enable the collaborative tagging of (graphical representations of) artefacts.

It’s the third game, Verbosity, that obviously interests me the most, suggesting as it does a generic strategy for knowledge elicitation that is both productive and participatory. The idea for the game came in response to the problem that “Efforts for collecting common-sense facts”, such as the Cyc project, “have been unable to collect a large enough fraction of common human knowledge. After 20 years, much less than five million facts have been collected–far from the estimated hundreds of millions that are required. … if our game is played as much as other popular games, we can collect millions of facts in just a few weeks” (von Ahn et al., 2006). To set the context for my further comments and reflections below, it will be useful to summarise exactly and solely what the game seeks to achieve:

  1. The goal of Verbosity is to collect commonsense facts; it uses a game format to do so in a way that accelerates the process of fact-collection beyond other initiatives such as Cyc.
  2. It uses templates (“is a kind of __”, “is used for __”, …) to guide the Narrator and constrain the format of the collected data.
  3. It has mechanisms to ensure that every fact is verified.
  4. It is non-committal about how the collection of facts will subsequently be processed or used.
  5. It makes no assumptions about the cultural background or cultural diversity of its players (or, perhaps more precisely, by default assumes cultural homogeneity) (see note 3).
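Points 2 and 3 of this summary (templates constrain the data; every fact is verified) suggest a simple data structure, sketched below in Python. The sketch is hypothetical throughout: the post does not say how Verbosity verifies facts, so the rule used here (a fact counts as verified once a minimum number of distinct player pairs have produced it) is an assumption, as are the class name `FactStore`, the relation names, and the `min_pairs` parameter. Only the template strings are taken from the text.

```python
from collections import defaultdict
from typing import Dict, Optional, Set, Tuple

# Template strings are from the post; the relation names are invented here.
TEMPLATES: Dict[str, str] = {
    "is a kind of __": "kind_of",
    "is used for __": "used_for",
    "is typically near __": "near",
}

Triple = Tuple[str, str, str]  # (subject, relation, object)


class FactStore:
    """Collects template-constrained facts and tracks independent support."""

    def __init__(self, min_pairs: int = 2) -> None:
        self.min_pairs = min_pairs  # assumed verification threshold
        self._support: Dict[Triple, Set[str]] = defaultdict(set)

    def record(
        self, subject: str, template: str, filler: str, pair_id: str
    ) -> Optional[Triple]:
        """Store one hint; reject anything that does not use a known template."""
        relation = TEMPLATES.get(template)
        if relation is None:
            return None
        triple = (subject.lower(), relation, filler.lower())
        self._support[triple].add(pair_id)  # repeats by one pair count once
        return triple

    def verified(self) -> Set[Triple]:
        """Facts produced independently by at least `min_pairs` player pairs."""
        return {
            triple
            for triple, pairs in self._support.items()
            if len(pairs) >= self.min_pairs
        }
```

Because the store keeps nothing about who the player pairs are beyond an identifier, it embodies point 5 as well: agreement between any two pairs counts the same, whatever their cultural background.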

That, through game play, Verbosity not only engages the participation of ordinary web users but (except for the constraint of having to use pre-given sentence templates) in effect surrenders ownership and control of the fact collecting to the users, is for von Ahn a matter of simple expediency. I believe it has a more significant implication: it enables an ‘autoethnography’ / ‘autoethnoepistemology’ as community members generate their own corpus of ‘commonsense facts’ about the epistemic world in which they live.

The game also interests me as much for the questions it raises as for the problems it solves. Questions that immediately come to mind are:

  • to what extent are ‘facts’ in isolation really facts? I’m inclined towards what one might think of as a neo-Saussurian (or maybe neo-Trierian) ‘structural semantics’ for facts as much as for words
  • as noted above, Verbosity makes no assumptions about the cultural background or cultural diversity of its players. “What everybody knows” will to some degree vary from culture to culture, community to community.


Related projects

No time to talk about these at the moment, but worth listing here:

Push Singh’s Open Mind Common Sense initiative
Doug Lenat’s Cyc, OpenCyc, and ResearchCyc projects
Chris McKinstry’s Mindpixel

That both Push Singh and Chris McKinstry committed suicide (2006) invites dark humour on the perils of doing research in commonsense reasoning.


von Ahn, L. & Dabbish, L. (2004). ‘Labeling Images with a Computer Game’. CHI 2004, April 24–29, 2004, Vienna, Austria. [PDF]

von Ahn, L., Kedia, M., & Blum, M. (2006). ‘Verbosity: A Game for Collecting Common-Sense Facts’. In ACM CHI Notes 2006. [PDF]

Gíslason, H. (2003). ‘Gathering Common Sense’. [Blog]. Accessed 25 May 2007 at:

Thompson, C. (2007). ‘For Certain Tasks, the Cortex Still Beats the CPU’, Wired Magazine, issue 15.07

  1. “[Ethnography] shifts the focus of research from the perspective of the ethnographer as an outsider to a discovery of the insider’s point of view. … It is a systematic attempt to discover the knowledge a group of people have learned and are using to organize their behavior”, Spradley & McCurdy, 1972, p.9. This applies properly only to peoples and cultures that still flourish; extinct historical peoples are another matter that I address elsewhere.
  2. Now licenced to Google as ‘Google Image Labeler’, at []
  3. In two short sentences von Ahn both acknowledges and effectively dismisses the influence of cultural belief on the veracity of facts: “Something to note is that many of the sentences not rated as true by all were debatable–for example, ‘Buddha is a kind of god’. Thus, even without our mechanisms for validating facts, the collected data was extremely accurate”. ‘Debatable’ may not have been, it seems to me, the most helpful choice of adjective.
