Analysis and retrieval of structured text

Mr Malcolm Clark, PhD research student, School of Computing
Fri, 10 Oct 2008 14:00 GMT in A23 (St Andrew St.)

This research focuses on the overlap between information retrieval (IR), cognitive science and genre, combining the three for one particular goal: to analyse and retrieve structured text. Structured textual documents are normally composed of several layers or sections which together form types, or genres, of text, preserved in particular in e-mail and Wikipedia (XHTML). At present, the IR community of computer scientists and other researchers, such as computational linguists, is using genre to categorise documents in digitally structured media, e.g. Hypertext Markup Language (HTML), but more research needs to be done on linking cognitive science modelling techniques and IR. There has been much discussion on the definition of genre, but for the purpose of this project, genre is defined as purpose and form. This research intends to examine the attributes of genre to discover how they are used and, in particular, how they are perceived.
