Indexes

Indexes are special access methods used to retrieve data within the data-store. Indexes allow users to select items from the data-store by matching words contained within those items. An index may be built over any field within the source data. Indexes may also be built over data that will not reside within the data-store. This is accomplished by defining a temporary field to hold the data only for the purpose of indexing. Indexes provide a much more efficient way of searching large amounts of data than simple record scanning.

Indexing within SLICCWARE Flexible Search is accomplished through a number of index types: keyword indexes, range indexes, timelines, spatial indexes, and geo-spatial indexes. Any one of these index types may be combined with any number of the same or other index types to produce a result set within the search engine.
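As a rough illustration of combining indexes, the sketch below assumes that each index lookup yields a set of item identifiers and that the lookups are combined with a logical AND (set intersection). Both points are assumptions made for the example; the function name combine_all and the sample identifiers are hypothetical and are not part of Flexible Search.

```python
def combine_all(*result_sets):
    """Intersect the item-id sets returned by several index lookups (logical AND)."""
    if not result_sets:
        return set()
    combined = set(result_sets[0])
    for ids in result_sets[1:]:
        combined &= set(ids)
    return combined

# Hypothetical lookups: ids matching a keyword, a timeline slot, and a grid cell.
keyword_hits = {1, 2, 3, 5}
timeline_hits = {2, 3, 4}
spatial_hits = {2, 3, 5}

result = combine_all(keyword_hits, timeline_hits, spatial_hits)
```

Only items present in every lookup survive, so adding another index to the query can only narrow the result set under this assumption.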

Flexible Search defines seven different types of indexes:

Type

Description

Keyword

Used to index string and text data. A keyword index can be constructed to support close matches (through synonym lookup and stemming), exact matches (including phrase matching), or both. Individual words can be excluded from the index by defining them as stop words within the translate table.
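The difference between exact and close matching can be sketched with a small inverted index. This is a simplified illustration, not the product's implementation: the stop-word list is hypothetical, crude_stem is a deliberately rough stand-in for a real stemmer, and synonym lookup and phrase matching are left out.

```python
import re
from collections import defaultdict

STOP_WORDS = {"the", "a", "of", "and"}  # hypothetical stop-word list

def crude_stem(word):
    """Rough stand-in for a real stemmer: strips a few English suffixes."""
    for suffix in ("ing", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def build_keyword_index(items, stem=False):
    """Map each indexed word (or its stem) to the set of item ids containing it."""
    index = defaultdict(set)
    for item_id, text in items.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            if word in STOP_WORDS:
                continue  # stop words never enter the index
            index[crude_stem(word) if stem else word].add(item_id)
    return index

items = {1: "The indexing of words", 2: "An index and a word"}
exact = build_keyword_index(items)             # exact-match index
close = build_keyword_index(items, stem=True)  # close-match (stemmed) index
```

Here a search for "index" finds only item 2 in the exact index, but both items in the stemmed index, because "indexing" is reduced to "index" before it is stored.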

Range

An index covering a specified numeric range with a predefined number of slots corresponding to uniform subdivisions of the total range. A range index can be thought of as a ruler, where the entire range corresponds to the whole ruler and each slot corresponds to a division on the ruler. A range index can only be created over a numeric field. Items that fall outside the range may be included within the slot corresponding to the nearest endpoint, included in a special overflow slot, or discarded from the index.
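The ruler analogy can be made concrete with a short sketch of slot assignment. The function name, the policy names ("clamp", "overflow", "discard"), and the choice to treat the upper endpoint as outside the range are all assumptions made for illustration.

```python
def range_slot(value, low, high, slots, outside="clamp"):
    """Return the slot number for value, or None if the item is discarded.

    Slots 0..slots-1 divide [low, high) evenly; the overflow slot is numbered `slots`.
    """
    if low <= value < high:
        width = (high - low) / slots
        # min() guards against floating-point rounding at the top of the range.
        return min(int((value - low) / width), slots - 1)
    if outside == "clamp":      # nearest endpoint's slot
        return 0 if value < low else slots - 1
    if outside == "overflow":   # special overflow slot
        return slots
    if outside == "discard":    # item is left out of the index
        return None
    raise ValueError(f"unknown policy: {outside!r}")

slot = range_slot(25, low=0, high=100, slots=10)
```

With a range of 0 to 100 and ten slots, each slot is 10 units wide, so the value 25 falls in slot 2 (counting from zero).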

Timeline

Special form of range index which is defined using date-time constants or expressions. A timeline should only be created over a numeric field used to hold date, time, or date-time data. The endpoints are defined by date-time literals such as “Jan1, 2000”, and the size of each slot is a time duration value such as ‘1day’. Items that fall outside the range may be included within the slot corresponding to the nearest endpoint, included in a special overflow slot, or discarded from the index.
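A timeline slot can be computed in the same way as a range slot, with date-time endpoints and a duration as the slot size. The sketch below uses Python datetime objects rather than a numeric field for readability; the function and policy names are hypothetical.

```python
from datetime import datetime, timedelta

def timeline_slot(when, start, end, slot_size, outside="clamp"):
    """Return the slot number for `when`, or None if the item is discarded."""
    # Number of slots, rounded up so a partial final slot still counts.
    slots = -((start - end) // slot_size)
    if start <= when < end:
        return (when - start) // slot_size
    if outside == "clamp":      # nearest endpoint's slot
        return 0 if when < start else slots - 1
    if outside == "overflow":   # special overflow slot
        return slots
    if outside == "discard":    # item is left out of the index
        return None
    raise ValueError(f"unknown policy: {outside!r}")

start = datetime(2000, 1, 1)
end = datetime(2000, 1, 11)
one_day = timedelta(days=1)

slot = timeline_slot(datetime(2000, 1, 3, 12, 0), start, end, one_day)
```

With a ten-day timeline and one-day slots, an item dated at noon on the third day lands in slot 2 (counting from zero).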

Sliding Timeline

A powerful variation of a primary timeline. It allows the timeline to be based upon the current date and time. As time moves forward, the limits of the timeline are automatically adjusted, and items falling outside the new timeline are removed from the data-store and associated indexes. This form of primary index can be very useful in organizing such things as daily publications, current news-wire articles, or upcoming events and movies.
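The sliding behavior can be sketched as a window that is recomputed from the current time, with items outside the new window dropped. The function slide and its parameters back and ahead are hypothetical names; how Flexible Search actually expresses the window limits is not described here.

```python
from datetime import datetime, timedelta

def slide(items, now, back, ahead=timedelta(0)):
    """Recompute the window [now - back, now + ahead] and split items accordingly.

    Returns (kept, removed): a dict of items still inside the window and a
    sorted list of the ids that fell outside it.
    """
    start, end = now - back, now + ahead
    kept = {item_id: when for item_id, when in items.items() if start <= when <= end}
    removed = sorted(set(items) - set(kept))
    return kept, removed

articles = {
    "a": datetime(2000, 1, 1),
    "b": datetime(2000, 1, 5),
    "c": datetime(2000, 1, 9),
}

# A seven-day window evaluated on Jan 9 covers Jan 2 through Jan 9.
kept, removed = slide(articles, now=datetime(2000, 1, 9), back=timedelta(days=7))
```

A non-zero ahead value would cover cases such as upcoming events, where the window extends past the current time.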

Spatial

Two-dimensional index similar to a one-dimensional range index. A spatial index can only be constructed over a pair of numeric fields. A spatial index can be thought of as a grid composed of rectangles, or cells, with an X and Y direction. The grid is defined by a base (X,Y) coordinate pair, a width and height for each cell within the grid, and the number of cells in each direction. Items that fall outside the grid may be included within the nearest cell, included in a special overflow slot, or discarded from the index.
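The grid definition above maps directly onto a cell calculation. The following is a minimal sketch under the assumption that cells are numbered from zero starting at the base coordinate; the function name, the policy names, and the "overflow" marker are hypothetical.

```python
import math

def grid_cell(x, y, base_x, base_y, cell_w, cell_h, cols, rows, outside="clamp"):
    """Return the (column, row) cell for a point, "overflow", or None if discarded."""
    col = math.floor((x - base_x) / cell_w)
    row = math.floor((y - base_y) / cell_h)
    if 0 <= col < cols and 0 <= row < rows:
        return (col, row)
    if outside == "clamp":      # nearest cell on the edge of the grid
        return (min(max(col, 0), cols - 1), min(max(row, 0), rows - 1))
    if outside == "overflow":   # special overflow slot
        return "overflow"
    if outside == "discard":    # item is left out of the index
        return None
    raise ValueError(f"unknown policy: {outside!r}")

# A 4 x 4 grid of 5 x 5 cells based at the origin.
cell = grid_cell(12.5, 3.0, 0, 0, 5, 5, cols=4, rows=4)
```

The point (12.5, 3.0) falls in the third column and first row, that is, cell (2, 0).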

Geo-Spatial

Special form of spatial index which is based upon earth mapping coordinates in fractions of degrees. A geo-spatial index can only be constructed over a pair of numeric fields containing geo-spatial coordinates. One field must be identified as containing the longitude, and the other as containing the latitude. A geo-spatial index can be thought of as a flat map of the earth, which is necessarily distorted. The cells or rectangles used to partition the map are uniform in degrees, but regions near the poles appear much larger on the map than they would on a globe.

The grid is defined by a base (X,Y) coordinate pair, a width and height in fractional degrees for each cell within the grid, and the number of cells in each direction. Items that fall outside the grid may be included within the slot corresponding to the nearest cell, included in a special overflow slot, or discarded from the index.
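The distortion near the poles follows from cells being uniform in degrees rather than in ground distance. The sketch below locates a cell from a longitude and latitude, then estimates the east-west ground width of a cell at a given latitude using a spherical-earth approximation. The function names are hypothetical, and the 111.32 km figure is the approximate length of one degree along the equator.

```python
import math

KM_PER_DEGREE = 111.32  # approximate length of one degree along the equator

def geo_cell(lon, lat, base_lon, base_lat, cell_w, cell_h):
    """Return the (column, row) of the cell containing a longitude/latitude pair."""
    col = math.floor((lon - base_lon) / cell_w)
    row = math.floor((lat - base_lat) / cell_h)
    return (col, row)

def cell_width_km(lat, cell_w):
    """Approximate east-west ground width of a cell at the given latitude."""
    return cell_w * KM_PER_DEGREE * math.cos(math.radians(lat))

# One-degree cells covering the whole globe, based at (-180, -90).
cell = geo_cell(-122.4, 37.8, base_lon=-180, base_lat=-90, cell_w=1.0, cell_h=1.0)
at_equator = cell_width_km(0, 1.0)
at_60_north = cell_width_km(60, 1.0)
```

A one-degree cell is about 111 km wide at the equator but only about half that at 60 degrees latitude, even though both cells occupy the same area on the flat map.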

Package

Special kind of primary index. It maps the entire package to a single slot. It is used when further division of the package by a primary index is not warranted. A package index could be used to organize and make available a package of enduring reference articles on a current news site that uses a sliding timeline for all other articles.

Range, timeline, spatial and geo-spatial indexes may be used as a primary index for a particular package. Sliding timeline and package indexes must be the primary index for the package they are defined across.