Projects in 2002–2003

Fair, Isaac, and Company, Inc.: Identification of Suspicious Investors

David Panton
Henry Krieger

The Patriot Act requires banks to scrutinize certain transactions involving an "immediate family member or close associate of a senior foreign political figure". Fair Isaac would like to be able to compile lists of individuals associated with such figures automatically, so that banks can detect money-laundering activity more effectively.

This project will focus on the narrower problem of detecting relationships between individuals based on their co-occurrence and context in selected databases. This is an instance of a much more general problem that is often referred to as "relationship discovery", "link analysis", or "multi-relational data mining".

The project will involve formulating the problem, proposing a solution approach, then implementing a prototype in software and testing its performance on data supplied by Fair Isaac. One approach could be to use Bayesian techniques to compute the strengths of relationships between entities based on co-occurrence in a data corpus, but the project team will be free to decide on the approach. The team may benefit from work carried out during a previous HMC/FI Math Clinic, "Intelligent Techniques for Scanning and Extracting Information from Text". That clinic's final report provides a good overview of text-processing techniques, and that team's software may be useful for the current project.
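To make the co-occurrence idea concrete, here is a minimal sketch of one possible Bayesian scoring scheme. It is an illustration only, not the approach the team chose: the function name `cooccurrence_strengths`, the Beta prior parameters `alpha` and `beta`, and the sample records are all assumptions. Each record is treated as a set of entity names, and the strength of the link from a to b is taken to be the posterior mean of the probability that b appears in a record given that a does.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_strengths(records, alpha=1.0, beta=1.0):
    """Return {(a, b): posterior mean of P(b in record | a in record)}.

    Hypothetical scoring scheme: a Beta(alpha, beta) prior is placed on the
    conditional co-occurrence probability, so pairs seen only once or twice
    are pulled toward the prior mean instead of receiving an extreme score.
    """
    entity_counts = Counter()  # number of records mentioning each entity
    pair_counts = Counter()    # number of records mentioning both entities
    for record in records:
        names = set(record)
        entity_counts.update(names)
        for a, b in combinations(sorted(names), 2):
            pair_counts[(a, b)] += 1
            pair_counts[(b, a)] += 1
    return {
        (a, b): (n_ab + alpha) / (entity_counts[a] + alpha + beta)
        for (a, b), n_ab in pair_counts.items()
    }


# Made-up records; each one lists the entities named in a single document.
records = [
    {"Official X", "Person A"},
    {"Official X", "Person A", "Person B"},
    {"Official X", "Person B"},
    {"Person C"},
]
strengths = cooccurrence_strengths(records)
for pair, score in sorted(strengths.items(), key=lambda kv: -kv[1]):
    print(pair, round(score, 3))
```

The score is deliberately asymmetric: an associate may appear almost only alongside the political figure, while the figure appears with many other people. Pairs that never co-occur are omitted from the result.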


  • Brie Finger
  • Shea Lawrence
  • Jonathan Nadel
  • Michael Vrable

Overture Services, Inc.: Improved Relevance Ordering For Web Search

Lesley Ward

This project will build on previous Math Clinic projects on Web search, with the goal of improving techniques for topic-dependent relevance ordering. Of the different approaches to ranking Web search results, some are static and global, using the Web as a whole to vote on the relative importance of pages, while others are more dynamic and local, ranking pages only within a search result set or a topic cluster. In a purely static and global approach, if page A appears before page B when a user enters a search query Q, then A will appear before B for every other query Q' whose results include both pages. Kleinberg's HITS technique for link analysis is an example of a local technique; Brin and Page's PageRank is global. In a Summer 2002 Math Clinic, a specific approach was proposed for focusing link analysis, effectively making it more local. This 2002–2003 Math Clinic project will begin by evaluating that approach against some competing baseline approaches, then look at ways to blend local and global relevance scores, and compare local analysis within a topic cluster to local analysis within a search result set.
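The distinction between global and local scores, and what blending them might look like, can be sketched as follows. This is a hedged illustration, not the Summer 2002 approach or the method the team adopted: the toy link graph, the local scores, and the helper names `pagerank`, `blend`, and `order_by` are all assumptions. The global score is a basic PageRank computed by power iteration, and the blend is a simple weighted average.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Basic PageRank by power iteration over {page: [pages it links to]}."""
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for p in pages:
            targets = links.get(p, [])
            if targets:
                # Split this page's damped rank evenly among its out-links.
                share = damping * rank[p] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # A page with no out-links spreads its rank over all pages.
                for t in pages:
                    new_rank[t] += damping * rank[p] / n
        rank = new_rank
    return rank


def blend(global_scores, local_scores, weight=0.5):
    """Weighted average over a result set; weight=1.0 uses local scores only."""
    return {
        p: weight * local_scores[p] + (1.0 - weight) * global_scores.get(p, 0.0)
        for p in local_scores
    }


def order_by(scores):
    """Pages sorted from highest score to lowest."""
    return sorted(scores, key=scores.get, reverse=True)


# Toy Web graph (hypothetical): A links to B and C, B links to C, C links to A.
global_rank = pagerank({"A": ["B", "C"], "B": ["C"], "C": ["A"]})

# Hypothetical query-specific scores for a result set containing A and B.
local_scores = {"A": 0.1, "B": 0.9}

print(order_by(blend(global_rank, local_scores, weight=0.0)))  # global order
print(order_by(blend(global_rank, local_scores, weight=1.0)))  # local order
```

With `weight=0.0` the ordering of A and B is the same for every query, which is the static, global behavior described above; raising the weight lets the query-specific scores reorder the same pages.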


  • Erin Bodine
  • David Gleich
  • Cathy Kurata
  • Jordon Kwan