
TYPES OF RESTRICTIONS

1) Content Restriction

   Input: <property>, <text>, <fuzzy level>

   Matches documents which contain <text> in <property>.

   <property> may be any textual OLE property or a special property.  The
   special properties include CONTENTS (the main body of the document), ALL
   (search all properties), and user-defined PSEUDO-PROPERTIES (text
   distinguished for purposes of content search).

   <fuzzy level> describes how exactly <text> has to match the document.
   Fuzzy level 0 is exact match.  Fuzzy level 1 is prefix match (each word
   is treated as a prefix).  Fuzzy level 2 is morphological stemming
   ("run" would match "run", "running", "ran", etc.).

   The result of a content query may be out-of-date.
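
   The first two fuzzy levels can be sketched in C++ as below.  This is
   an illustration only: FuzzyLevel and WordMatches are hypothetical
   names, the comparison is assumed to be case-sensitive, and level 2 is
   left as a placeholder because it needs a language-specific stemmer.

```cpp
#include <cassert>
#include <string>

// Hypothetical names for illustration; not taken from the real sources.
enum class FuzzyLevel { Exact = 0, Prefix = 1, Stem = 2 };

// Returns true if one query word matches one document word.
// Assumption: comparison is case-sensitive and works on single words.
inline bool WordMatches(const std::string& queryWord,
                        const std::string& docWord,
                        FuzzyLevel level)
{
    switch (level) {
    case FuzzyLevel::Prefix:
        // The query word must be a prefix of the document word.
        return docWord.compare(0, queryWord.size(), queryWord) == 0;
    case FuzzyLevel::Stem:   // placeholder: a real stemmer is required
    case FuzzyLevel::Exact:
    default:
        return docWord == queryWord;
    }
}
```

   With this sketch, "run" matches "running" at level 1 but not at
   level 0.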

2) Property Restriction

   Input: <property>, <relop>, <value>

   Matches documents where <property> <relop> <value>.

   <property> must be a true OLE property or one of a few special
   properties that are valid only in query results.  The special
   properties are RANK (how well the restriction matches the object),
   HITCOUNT (number of content index 'hits'), and RANK VECTOR (for use
   with vector restriction).

   <relop> is one of: <, <=, =, !=, >=, >, SOME OF, and ALL OF.  The last
   two are bitwise operations valid only for integer types. In C++ syntax,
   SOME OF is (<property> & <value>) != 0, and ALL OF is (<property> &
   <value>) == <value>.

   <value> is a STGVARIANT.

   The result of a property query always reflects the last saved state of
   all objects.
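
   The relational operators above, including the two bitwise ones, can
   be written out as a small evaluator.  This is a minimal sketch that
   assumes a 32-bit unsigned property rather than a full STGVARIANT;
   RelOp and EvalRelOp are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical enumeration of the <relop> values listed above.
enum class RelOp { LT, LE, EQ, NE, GE, GT, SomeOf, AllOf };

// Evaluates <property> <relop> <value> for an integer property.
inline bool EvalRelOp(std::uint32_t property, RelOp op, std::uint32_t value)
{
    switch (op) {
    case RelOp::LT:     return property <  value;
    case RelOp::LE:     return property <= value;
    case RelOp::EQ:     return property == value;
    case RelOp::NE:     return property != value;
    case RelOp::GE:     return property >= value;
    case RelOp::GT:     return property >  value;
    case RelOp::SomeOf: return (property & value) != 0;      // any bit set
    case RelOp::AllOf:  return (property & value) == value;  // every bit set
    }
    return false;
}
```

   For a property of 0x21 and a value of 0x03, SOME OF is true (bit
   0x01 is shared) while ALL OF is false (bit 0x02 is missing).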

TYPES OF INDEXES

1) Content Index

   The content index is a mapping from <property>,<words> back to the
   documents which contain <words> in <property>.

   There is no scoping within the content index.

   The content index is lazily updated.  It may be out-of-date.

2) Value Index

   A value index is a mapping from <property>,<range of values> back to the
   documents which have a value within <range of values> for the
   <property>.

   In other words, the possible range of values for a data type
   (VT_FILETIME, VT_I4, etc.) is divided into "buckets".  Every possible
   value falls into one of these buckets.  Note that the mapping is from
   bucket to document, not value to document.  A search for SIZE == 500
   might map to a bucket covering 250 to 525, so the result of the index
   lookup would be all files with SIZE from 250 to 525, not just those
   having SIZE == 500.

   There is no scoping within a value index.

   Value indices can be used in conjunction with the content index.  They
   are lazily updated with the same frequency as the content index.

   There is no administration necessary to set up value indices.  All
   properties are value indexed except a few hard-coded exceptions.  This
   may change in the future.
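
   The bucket mapping described above can be sketched as follows.  The
   names ValueIndex, Lookup, and QueryEquals are hypothetical, and the
   bucket boundaries are made up to mirror the SIZE == 500 example.  The
   final filtering step against the stored property values is an
   assumption of this sketch; the text above only states that the
   lookup returns the whole bucket.

```cpp
#include <cassert>
#include <cstdint>
#include <iterator>
#include <map>
#include <vector>

using DocId = std::uint32_t;

// Hypothetical value index: key = inclusive lower bound of a bucket.
// A bucket extends up to (but not including) the next key.
struct ValueIndex {
    std::map<std::int64_t, std::vector<DocId>> buckets;

    // Returns the documents in the bucket containing 'value', or nullptr
    // if 'value' lies below the first bucket.
    const std::vector<DocId>* Lookup(std::int64_t value) const
    {
        auto it = buckets.upper_bound(value);
        if (it == buckets.begin()) return nullptr;
        return &std::prev(it)->second;
    }
};

// Assumed post-filter: check each candidate against its actual value.
inline std::vector<DocId> QueryEquals(
    const ValueIndex& index,
    const std::map<DocId, std::int64_t>& values,
    std::int64_t wanted)
{
    std::vector<DocId> result;
    const std::vector<DocId>* candidates = index.Lookup(wanted);
    if (candidates == nullptr) return result;
    for (DocId doc : *candidates) {
        auto it = values.find(doc);
        if (it != values.end() && it->second == wanted)
            result.push_back(doc);
    }
    return result;
}
```

   The index lookup alone returns every document in the bucket; only
   the comparison against the real values narrows that to exact matches.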

3) View Index

   A view index is a B-Tree.  It contains a complete sorted list of files
   for a single directory.  Besides key columns, the view can contain
   additional unsorted columns.  These improve retrieval efficiency but
   have less effect on query efficiency.

   View indices must be created by an administrator.

4) Directory Index

   Listed for completeness.  This is a view index on the filename property.
   It is always available.

RULES FOR MATCHING QUERY WITH INDEX (in order of precedence)

1) If a query contains a content restriction, use content index, adding
   value indices if appropriate.

2) If one or more properties of a property restriction are used in the
   sort order of a view index, and the query is shallow, then use view
   index.

   Note that properties of the view must be used in order.  A view on SIZE
   and FILENAME could be used for queries involving SIZE, and queries
   involving both SIZE and FILENAME, but not for queries involving just
   FILENAME.

   If more than one view is applicable, the view with the most sort keys
   appearing in the restriction is used.  Thus, given two views,
   (SIZE, FILENAME) and (SIZE, ATTRIBUTES), a query on SIZE and FILENAME
   would use the former.

3) If one or more properties of a property restriction are value indexed,
   and the value index is reasonably up-to-date, and the query is
   shallow, then use value indexing.

4) If 1, 2, and 3 do not apply, or if the volume is downlevel (not OFS),
   then use the directory index (i.e. enumeration).
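
   The precedence rules above can be sketched as a single selection
   function.  All type and function names here are hypothetical.  Two
   interpretations are assumed: the downlevel check is applied first,
   since it overrides the other rules, and "most keys" is counted as
   the number of leading sort keys of the view that appear in the
   restriction, since view keys must be used in order.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

enum class Strategy { ContentIndex, ViewIndex, ValueIndex, DirectoryIndex };

struct Query {
    bool hasContentRestriction = false;
    bool isShallow = false;
    std::set<std::string> restrictedProperties;
};

struct Volume {
    bool isDownlevel = false;
    bool valueIndexUpToDate = false;
    std::set<std::string> valueIndexedProperties;
    std::vector<std::vector<std::string>> views;  // sort keys, in order
};

// Counts how many leading sort keys of a view appear in the restriction.
std::size_t LeadingKeysUsed(const std::vector<std::string>& viewKeys,
                            const std::set<std::string>& props)
{
    std::size_t n = 0;
    while (n < viewKeys.size() && props.count(viewKeys[n]) != 0) ++n;
    return n;
}

// Returns the index of the best applicable view, or -1 if none applies.
int BestView(const Volume& volume, const Query& query)
{
    int best = -1;
    std::size_t bestCount = 0;
    for (std::size_t i = 0; i < volume.views.size(); ++i) {
        std::size_t count =
            LeadingKeysUsed(volume.views[i], query.restrictedProperties);
        if (count > bestCount) {
            bestCount = count;
            best = static_cast<int>(i);
        }
    }
    return best;
}

Strategy ChooseStrategy(const Volume& volume, const Query& query)
{
    if (volume.isDownlevel) return Strategy::DirectoryIndex;       // rule 4
    if (query.hasContentRestriction) return Strategy::ContentIndex; // rule 1
    if (query.isShallow && BestView(volume, query) >= 0)
        return Strategy::ViewIndex;                                 // rule 2
    if (query.isShallow && volume.valueIndexUpToDate) {             // rule 3
        for (const std::string& prop : query.restrictedProperties)
            if (volume.valueIndexedProperties.count(prop) != 0)
                return Strategy::ValueIndex;
    }
    return Strategy::DirectoryIndex;                                // rule 4
}
```

   Given views (SIZE, FILENAME) and (SIZE, ATTRIBUTES), a shallow query
   on SIZE and FILENAME selects the first view, while a query on
   ATTRIBUTES alone matches neither view and falls through to the later
   rules.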