Considering Search: Search Topics

_NOTE: This series of essays is still rather informal as I'm still organizing my thoughts. I've been spending a great deal of time researching search issues, and I'm going to build the basic structure here and keep fleshing it out as I refine my thesis. I have found no other presentations of this information, so I think this is important to post even as a work-in-progress._

Essays in this search series:

I've been trying to develop a framework for understanding search -- what it is, how it's used, how to test it, etc. -- and I've been finding that it is a very complicated topic, and that most people who discuss it don't share the same understandings.

I've put together some very high-level considerations about how search should work. These are broad usability points for any search system:

  • some results are usually better than no results
  • relevant results are better than irrelevant results
  • users shouldn't have to learn a new language to find what they want
  • searches shouldn't fail because of bad data
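
To make the first two points concrete, here is a minimal Python sketch of one way a search could prefer relevant results but still return something when a strict match fails. It is only an illustration: the `search` function, the list-of-titles catalog, and the whitespace tokenization are all assumptions of mine, not a description of any particular engine.

```python
def search(query, catalog):
    """Return titles matching every query term, else titles matching any term.

    `catalog` is assumed to be a plain list of title strings (hypothetical).
    """
    terms = query.lower().split()
    if not terms:
        return []

    def words(title):
        return set(title.lower().split())

    # Strict pass: every query term must appear in the title.
    strict = [title for title in catalog
              if all(term in words(title) for term in terms)]
    if strict:
        return strict

    # Relaxed pass: keep titles matching at least one term,
    # ordered by how many terms they match (ties keep catalog order).
    scored = [(sum(term in words(title) for term in terms), title)
              for title in catalog]
    scored = [pair for pair in scored if pair[0] > 0]
    scored.sort(key=lambda pair: -pair[0])
    return [title for _, title in scored]
```

With a catalog of "Python Cookbook", "Learning Python", and "Java Cookbook", the query "python recipes" has no strict match, so the relaxed pass returns the two Python titles rather than an empty page.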

I also have some assumptions about how people use search:

  • users don't fully understand general search methodology
  • users don't fully understand the search interface
  • users don't know the best type of search for their needs
  • users may be confused about the scope of a search
  • users have difficulty formulating queries
  • users may not understand the logic applied to their query
  • users don't fully understand the subject domain
  • many users employ search as a means of navigation
  • users are likely to make simple input mistakes

Types of Information Collections

I've been trying to classify the different kinds of searches. We've probably all used search, but how do we classify it? How should search be classified? On which of the following, if any, should you base a categorization scheme?

  • what the search engine does with your query string
  • the query language used
  • the interface layout and structure
  • how the search results are handled, organized or prioritized
  • the kind of information being searched against

The first thing to note is that search behaves differently, and users expect different things of search, depending on the kind of information collection. A search against a document collection (or index), like what you do when you search on AltaVista or Google, is not the same as a search against a product catalog, such as what you'd find at an online bookstore. This is an especially important distinction when discussing commerce sites, because finding a product is part of the purchasing "track".

Understanding a Little About How Search Works

There are also different types of search, based on how the search functionality performs the query or handles the data. I'm still building this list of search types or mechanisms. To understand the differences between types of search, you need to understand a little about the different layers that make up a search system.

From a quality assurance point of view, search can be tough to test. Comparing search engines is usually like comparing apples with oranges: the logic behind the scenes is often different and follows different priorities, so the definition of "success" for one search engine may not match that of another.

Measuring a particular search engine, and tracking its performance and accuracy over time and through code iterations, is possible, however, and I describe some of the useful tests for product catalog search engines.
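
As a rough sketch of what tracking accuracy over time could look like, the Python below scores a search function against a fixed set of queries with hand-judged relevant items, using precision over the top results. The names `precision_at_k` and `evaluate`, the `judged_queries` mapping, and the choice of precision as the metric are my own assumptions for illustration; the tests described in the essay may measure different things.

```python
def precision_at_k(results, relevant, k=10):
    """Fraction of the returned top-k results that were judged relevant.

    Divides by the number of results actually returned (at most k),
    so a short result list is not penalized for being short.
    """
    top = results[:k]
    if not top:
        return 0.0
    return sum(1 for item in top if item in relevant) / len(top)


def evaluate(search_fn, judged_queries, k=10):
    """Mean precision over a fixed set of judged queries.

    `judged_queries` is assumed to map each query string to the set of
    items a human judged relevant for it (hypothetical test data).
    """
    scores = [precision_at_k(search_fn(query), relevant, k)
              for query, relevant in judged_queries.items()]
    return sum(scores) / len(scores) if scores else 0.0
```

Running `evaluate` with the same judged queries after each code iteration gives one number per build, which makes a regression in result quality visible even when the engine's internal logic has changed.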

Search References and Resources

I'm compiling the search references that I found useful, including some interesting surveys, reports, and books.