<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
|
||
|
|
||
|
<html>
|
||
|
|
||
|
<head>
|
||
|
<title>Microsoft Index Server Guide: Glossary</title>
|
||
|
<meta name="FORMATTER" content="Microsoft FrontPage 1.1">
|
||
|
<meta name="GENERATOR" content="Microsoft FrontPage 1.1">
|
||
|
</head>
|
||
|
|
||
|
<body bgcolor="#FFFFFF">
|
||
|
<!--Headerbegin--><p align=center><a name="TOP"><img src="onepix.gif" alt="Space" align=middle width=1 height=1></a> <a href="default.htm#Top"><img src="toc.gif" alt=" Contents" align=middle border=0 width=89 height=31></a> <a href="errhandl.htm"><img src="previous.gif" alt="Previous" align=middle border=0 width=32 height=31></a> <a href="faq.htm"><img src="next.gif" alt="Next" align=middle border=0 width=32 height=31></a> </p>
|
||
|
<hr>
|
||
|
<!--Headerend--><p align=left><a name="Glossary"><font size=6><strong>Glossary</strong></font></a></p>
|
||
|
<hr>
|
||
|
<h3 align=left><a href="#sectA"><font color="#000000">A</font></a><font color="#000000"> </font><a href="#sectB"><font color="#000000">B</font></a><font color="#000000"> </font><a href="#sectC"><font color="#000000">C</font></a><font color="#000000"> </font><a href="#sectD"><font color="#000000">D</font></a><font color="#000000"> </font><a href="#sectE"><font color="#000000">E</font></a><font color="#000000"> </font><a href="#sectF"><font color="#000000">F</font></a><font color="#000000"> </font><a href="#sectG"><font color="#000000">G</font></a><font color="#000000"> </font><a href="#sectI"><font color="#000000">I</font></a><font color="#000000"> </font><a href="#sectL"><font color="#000000">L</font></a><font color="#000000"> </font><a href="#sectM"><font color="#000000">M</font></a><font color="#000000"> </font><a href="#sectN"><font color="#000000">N</font></a><font color="#000000"> </font><a href="#sectO"><font color="#000000">O</font></a><font color="#000000"> </font><a href="#sectP"><font color="#000000">P</font></a><font color="#000000"> </font><a href="#sectQ"><font color="#000000">Q</font></a><font color="#000000"> </font><a href="#sectR"><font color="#000000">R</font></a><font color="#000000"> </font><a href="#sectS"><font color="#000000">S</font></a><font color="#000000"> </font><a href="#sectT"><font color="#000000">T</font></a><font color="#000000"> </font><a href="#sectW"><font color="#000000">W</font></a></h3>
|
||
|
<p>Select the first letter of the word from the list above to jump to the appropriate section of the glossary. </p>
|
||
|
<hr>
|
||
|
<h1><a href="#TOP"><img src="up.gif" alt="To Top" align=middle border=0 width=14 height=11></a><a name="sectA">A</a></h1>
|
||
|
<blockquote>
|
||
|
<p><a name="Abstract"><strong>Abstract</strong></a><strong><br>
|
||
|
</strong>A summary of a document or HTML page. Microsoft Index Server can automatically generate a document
|
||
|
abstract using information contained within the document, such as Heading information in HTML pages and
|
||
|
property information on documents. Also called a <em>characterization</em>.</p>
|
||
|
<p> <a name="ACL"><strong>Access Control List (ACL)<br>
</strong></a>A list of Windows NT permissions that you can set on a file or a folder, allowing some users to access it while
denying access to others. For details, see the Windows NT documentation.</p>
|
||
|
</blockquote>
|
||
|
<hr>
|
||
|
<h1><a href="#TOP"><img src="up.gif" alt="To Top" align=middle border=0 width=14 height=11></a><a name="sectB">B</a></h1>
|
||
|
<blockquote>
|
||
|
<p><a name="Boolean"><strong>Boolean</strong></a><strong><br>
|
||
|
</strong>A type of variable that can have only two values, typically 1 or 0. Boolean variables are often used to express
|
||
|
conditions that are either <strong>TRUE</strong> or <strong>FALSE</strong>. Queries with Boolean operators (<strong>AND</strong>, <strong>OR</strong>, <strong>NOT</strong>, and
|
||
|
<strong>NEAR</strong>) are referred to as Boolean queries.</p>
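<p>As a rough illustration of how Boolean operators combine results, the sketch below treats each word as a set of matching document IDs. The index contents, the document IDs, and the <tt>docs</tt> helper are hypothetical and are not part of Index Server; <strong>NEAR</strong> is omitted because it also needs word positions.</p>

```python
# Toy inverted index: word -> set of document IDs (illustrative data only).
INDEX = {
    "grand": {1, 2},
    "jury": {1, 3},
    "election": {2, 3},
}
ALL_DOCS = {1, 2, 3}


def docs(word):
    """Return the set of document IDs that contain the word."""
    return INDEX.get(word, set())


both = docs("grand") & docs("jury")      # grand AND jury
either = docs("grand") | docs("jury")    # grand OR jury
not_jury = ALL_DOCS - docs("jury")       # NOT jury
```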
|
||
|
<p><a name="breaker"><strong>Breaker, word</strong></a><strong><br>
|
||
|
</strong>An Index Server language utility that is responsible for identifying words in a document. As the document
|
||
|
contents are emitted by the content filter, the word breaker identifies where the words are located in the
|
||
|
sentence. There is one word breaking module for each of the languages supported by Index Server.</p>
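<p>A minimal sketch of the idea, assuming simple English-style text: the hypothetical <tt>break_words</tt> function below reports each word together with its character offset. It is not the Index Server word breaker, and real word breakers handle far more language-specific cases.</p>

```python
import re

# Hypothetical pattern: runs of letters or digits, with an optional
# apostrophe suffix such as 's. Real word breakers are language-specific.
WORD_RE = re.compile(r"[A-Za-z0-9]+(?:'[A-Za-z]+)?")


def break_words(text):
    """Return (word, start_offset) pairs for each word found in text."""
    return [(m.group(0), m.start()) for m in WORD_RE.finditer(text)]


pairs = break_words("The grand jury said Friday.")
```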
|
||
|
</blockquote>
|
||
|
<hr>
|
||
|
<h1><a href="#TOP"><img src="up.gif" alt="To Top" align=middle border=0 width=14 height=11></a><a name="sectC">C</a></h1>
|
||
|
<dl>
|
||
|
<dd><a name="CatalogDrive"><strong>Catalog<br>
|
||
|
</strong></a>The directory in which Index Server data is stored. The data is stored in the directory Catalog.wci under the path
|
||
|
chosen at the time of installation.</dd>
|
||
|
<dd> </dd>
|
||
|
<dd><a name="Characterization"><strong>Characterization</strong></a><strong><br>
</strong>See <a href="#Abstract">Abstract</a>.</dd>
|
||
|
<dd> </dd>
|
||
|
<dd><a name="Childprocess"><strong>Child process</strong></a><br>
|
||
|
An executing computer program that is started by another executing program. For example, if Process-A is running and
|
||
|
it executes another program, Process-B, Process-B is a child process of Process-A.</dd>
|
||
|
<dd> </dd>
|
||
|
<dd><a name="Corpus"><strong>Corpus<br>
|
||
|
</strong></a>The collection of documents and HTML pages indexed by Index Server.</dd>
|
||
|
<dd> </dd>
|
||
|
<dd><a name="Cursor"><strong>Cursor</strong></a><br>
|
||
|
A pointer into the content index. Functionally the same as a database cursor, it points to the next record to
retrieve from the information store.</dd>
|
||
|
</dl>
|
||
|
<hr>
|
||
|
<h1><a href="#TOP"><img src="up.gif" alt="To Top" align=middle border=0 width=14 height=11></a><a name="sectD">D</a></h1>
|
||
|
<dl>
|
||
|
<dd><a name="DLL"><strong>DLL, dynamic-link library</strong></a><strong><br>
|
||
|
</strong>A collection of programs that can be accessed and executed by other programs running on the computer. These files
|
||
|
typically use the extension .dll. For example, the Microsoft Word filter DLL Wordfilt.dll may contain several programs
|
||
|
(the content filters) that read different versions of Word files. These different programs are packaged together in a single
|
||
|
dynamic-link library for convenience and efficiency.</dd>
|
||
|
<dd> </dd>
|
||
|
<dd><a name="DirtyShutdown"><strong>Dirty shutdown</strong></a><br>
|
||
|
Any unusual or abnormal shutdown of Index Server or IIS. Index Server has a specific shutdown sequence that
|
||
|
must be followed to guarantee that updates to the index happen correctly. If this shutdown sequence is not followed, the
|
||
|
index may become corrupted. For example, a power failure is considered a dirty shutdown because Index Server is not
|
||
|
given the chance to execute its shutdown sequence.</dd>
|
||
|
</dl>
|
||
|
<hr>
|
||
|
<h1><a href="#TOP"><img src="up.gif" alt="To Top" align=middle border=0 width=14 height=11></a><a name="sectE">E</a></h1>
|
||
|
<dl>
|
||
|
<dd><a name="Embedding"><strong>Embedding, embedded object</strong></a><strong><br>
|
||
|
</strong>Typically data from one program that is stored within the data of another program. For example, a user may create a
|
||
|
Microsoft Word document. Later the user creates a spreadsheet using Microsoft Excel and inserts this spreadsheet in
|
||
|
the Microsoft Word document. The spreadsheet is embedded in the Word document and is referred to as an
|
||
|
embedding, or embedded object.</dd>
|
||
|
</dl>
|
||
|
<hr>
|
||
|
<h1><a href="#TOP"><img src="up.gif" alt="To Top" align=middle border=0 width=14 height=11></a><a name="sectF">F</a></h1>
|
||
|
<dl>
|
||
|
<dd><a name="Filter"><strong>Filter, content</strong></a><strong><br>
|
||
|
</strong>An Index Server component that is responsible for reading a document from the disk and extracting the textual content
|
||
|
from that document. Typically filters are associated with particular document formats. For example, Microsoft Word
|
||
|
documents have their contents extracted by a different filter than Microsoft Excel documents.</dd>
|
||
|
<dd> </dd>
|
||
|
<dd><a name="FilterDLL"><strong>Filter DLL</strong></a><strong><br>
|
||
|
</strong>A dynamic-link library (DLL) that collects together a number of content filters.</dd>
|
||
|
<dd> </dd>
|
||
|
<dd><a name="Freetext"><strong>Free-text query</strong></a><strong><br>
|
||
|
</strong>With a free-text query, the user can enter any set of words or phrases, or even a complete sentence, as the query
|
||
|
restriction. Index Server examines this text, identifies all the nouns and noun phrases, and posts a query with those
|
||
|
terms. For example, assume the user typed the following free-text query:<blockquote>
|
||
|
<p><em>The Fulton County Grand Jury said Friday an investigation of Atlanta’s recent primary election
|
||
|
produced no evidence that any irregularities took place.</em></p>
|
||
|
</blockquote>
|
||
|
<p>The system would identify the following words and noun phrases: </p>
|
||
|
<blockquote>
|
||
|
<p><strong>Words: </strong>Fulton, county, grand, jury, Friday, investigation, Atlanta, recent, primary, election, produce,
|
||
|
evidence, irregularities.</p>
|
||
|
<p><strong>Phrases: </strong>Fulton county grand jury, primary election, grand jury, Atlanta’s recent primary election</p>
|
||
|
</blockquote>
|
||
|
<p>The words and phrases are combined into a restriction, weighted for proper ranking, and posted as a query against the
|
||
|
corpus. </p>
|
||
|
<p><strong>Note</strong>   You must preface all free-text queries with <em>$contents</em>. </p>
|
||
|
<p><a name="FuzzyQuery"><strong>Fuzzy Query</strong></a><strong><br>
|
||
|
</strong>Fuzzy queries search for words that are similar to the words or text entered in the query restriction. Rather than looking
|
||
|
for only exact matches, the system will modify the words in the query and look for these modified forms. </p>
|
||
|
<p>The system supports simple wildcards (such as those in MS-DOS®) and regular expression matching (as used in
|
||
|
UNIX) against textual properties. Content queries support simple prefix matching (for example, “dog*” will return
|
||
|
“dogmatic” and “doghouse”). The system also provides linguistic stemming support that matches inflected and base
|
||
|
forms of query words. (For example, “swim” is expanded to “swimming”, “swam”, “swum”, and so on.) </p>
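<p>The three matching styles described above can be sketched roughly as follows. The word list and function names are illustrative assumptions, and Python's <tt>fnmatch</tt> and <tt>re</tt> modules stand in for the wildcard and regular-expression syntax; the exact syntax accepted by Index Server may differ.</p>

```python
import fnmatch
import re

# Illustrative vocabulary; a real index would supply its own terms.
WORDS = ["dog", "dogmatic", "doghouse", "cat", "catalog"]


def prefix_match(prefix, words):
    """Prefix matching, as in the "dog*" content query example."""
    return [w for w in words if w.startswith(prefix)]


def wildcard_match(pattern, words):
    """MS-DOS-style wildcards (* and ?), compared case-sensitively."""
    return [w for w in words if fnmatch.fnmatchcase(w, pattern)]


def regex_match(pattern, words):
    """Regular-expression matching against the whole word."""
    rx = re.compile(pattern)
    return [w for w in words if rx.fullmatch(w)]
```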
|
||
|
</dd>
|
||
|
</dl>
|
||
|
<hr>
|
||
|
<h1><a href="#TOP"><img src="up.gif" alt="To Top" align=middle border=0 width=14 height=11></a><a name="sectG">G</a></h1>
|
||
|
<dl>
|
||
|
<dd><a name="GUID"><strong>GUID<br>
|
||
|
</strong></a>A globally unique identifier (GUID), expressed as <em>xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx</em>. For example,
F29F85E0-4FF9-1068-AB91-08002B27B3D9.</dd>
|
||
|
</dl>
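<p>A small sketch of checking that a string follows the 8-4-4-4-12 hexadecimal layout described above. The <tt>is_guid</tt> helper and the sample value are hypothetical and only illustrate the format.</p>

```python
import re

# Five groups of hexadecimal digits: 8-4-4-4-12, separated by hyphens.
GUID_RE = re.compile(r"[0-9A-Fa-f]{8}(?:-[0-9A-Fa-f]{4}){3}-[0-9A-Fa-f]{12}")


def is_guid(text):
    """Return True if text is written in the 8-4-4-4-12 GUID layout."""
    return GUID_RE.fullmatch(text) is not None


sample_ok = is_guid("01234567-89AB-CDEF-0123-456789ABCDEF")
```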
|
||
|
<hr>
|
||
|
<h1><a href="#TOP"><img src="up.gif" alt="To Top" align=middle border=0 width=14 height=11></a><a name="sectI">I</a></h1>
|
||
|
<dl>
|
||
|
<dd><a name="IndexedDirectory"><strong>Indexed Directory<br>
|
||
|
</strong></a>A directory pointed to by a virtual root that is configured by the administrator to be indexed by Index Server.</dd>
|
||
|
</dl>
|
||
|
<hr>
|
||
|
<h1><a href="#TOP"><img src="up.gif" alt="To Top" align=middle border=0 width=14 height=11></a><a name="sectL">L</a></h1>
|
||
|
<dl>
|
||
|
<dd><a name="Locale"><strong>Locale</strong></a><strong><br>
|
||
|
</strong>Used to indicate language information. For example, a Web server may have a locale variable that indicates the default
|
||
|
language used on that server. A server in Seattle will probably have a locale of EN-US (U.S. English) whereas a server
|
||
|
in Berlin would have a locale of DE (German, or Deutsch). Web browsers can also specify a locale to indicate the language
|
||
|
that the user of that browser understands. Documents and Web pages also can specify a locale to indicate what
|
||
|
language the text is in.</dd>
|
||
|
</dl>
|
||
|
<hr>
|
||
|
<h1><a href="#TOP"><img src="up.gif" alt="To Top" align=middle border=0 width=14 height=11></a><a name="sectM">M</a></h1>
|
||
|
<dl>
|
||
|
<dd><a name="MIndex"><strong>Master index</strong></a><strong><br>
|
||
|
</strong>A <a href="#PIndex">persistent index</a> that contains the indexed data for a large number of documents. </dd>
|
||
|
<dd> </dd>
|
||
|
<dd><a name="Metadata"><strong>Metadata</strong></a><strong><br>
|
||
|
</strong>Data used to describe other data. For example, Index Server must maintain data that describes the data in the content
|
||
|
index. This data that Index Server maintains is called metadata because it describes how data in the index is stored.</dd>
|
||
|
</dl>
|
||
|
<hr>
|
||
|
<h1><a href="#TOP"><img src="up.gif" alt="To Top" align=middle border=0 width=14 height=11></a><a name="sectN">N</a></h1>
|
||
|
<dl>
|
||
|
<dd><a name="NLS"><strong>National Language Support (NLS)</strong></a><strong><br>
|
||
|
</strong>Helps applications developed for the Win32 application programming interface (API) adapt to the differing language
|
||
|
and locale-specific needs of users around the world.</dd>
|
||
|
<dd> </dd>
|
||
|
<dd><a name="Noisewords"><strong>Noise words</strong></a><strong><br>
|
||
|
</strong>Words that are not significant in searches, such as <em>a</em>, <em>an</em>, and <em>the</em> in English.</dd>
|
||
|
<dd> </dd>
|
||
|
<dd><a name="Normalizer"><strong>Normalizer, word</strong></a><strong><br>
|
||
|
</strong>An Index Server component that accepts words and converts them into a standard representation before placing
|
||
|
them in the index.</dd>
|
||
|
</dl>
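<p>To make the noise-word and normalizer entries above concrete, here is a minimal sketch. It assumes that normalization means Unicode NFKC plus case folding and that the noise-word list contains only the three English examples given; neither assumption describes the actual Index Server behavior.</p>

```python
import unicodedata

# Assumed noise-word list, taken from the examples in the glossary entry.
NOISE_WORDS = {"a", "an", "the"}


def normalize_word(word):
    """Apply Unicode NFKC and fold case so variants share one form."""
    return unicodedata.normalize("NFKC", word).casefold()


def index_terms(words):
    """Normalize each word, then drop noise words."""
    normalized = (normalize_word(w) for w in words)
    return [w for w in normalized if w not in NOISE_WORDS]
```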
|
||
|
<hr>
|
||
|
<h1><a href="#TOP"><img src="up.gif" alt="To Top" align=middle border=0 width=14 height=11></a><a name="sectO">O</a></h1>
|
||
|
<dl>
|
||
|
<dd><a name="Overhead"><strong>Overhead, disk</strong></a><strong><br>
|
||
|
</strong>The amount of space required to store the index information. </dd>
|
||
|
</dl>
|
||
|
<hr>
|
||
|
<h1><a href="#TOP"><img src="up.gif" alt="To Top" align=middle border=0 width=14 height=11></a><a name="sectP">P</a></h1>
|
||
|
<dl>
|
||
|
<dd><a name="PIndex"><strong>Persistent index</strong></a><strong><br>
|
||
|
</strong>An index with data stored on a disk.</dd>
|
||
|
<dd> </dd>
|
||
|
<dd><a name="Phrase"><strong>Phrase</strong></a><strong><br>
|
||
|
</strong>See <a href="#Freetext">Free-text query</a> for an example.</dd>
|
||
|
<dd> </dd>
|
||
|
<dd><a name="Property"><strong>Property</strong></a><strong><br>
|
||
|
</strong>Data associated with a file, but not actually stored within the contents of a file. For example, a Microsoft Word
|
||
|
document may possess an AUTHOR property, which gives information about the person who wrote the document.
|
||
|
Properties are often accessible by the operating system directly and do not require the original application to read them.
|
||
|
For example, Windows 95 lets you read the AUTHOR property on a Microsoft Word document without having to
|
||
|
start Microsoft Word or even have it installed on your computer.</dd>
|
||
|
<dd> </dd>
|
||
|
<dd><a name="PropertyValue"><strong>Property Value</strong></a><strong><br>
|
||
|
</strong>The data contained in a property. If a document is authored by John Smith, the AUTHOR property contains (its value
|
||
|
is) "<tt>John Smith</tt>".</dd>
|
||
|
<dd> </dd>
|
||
|
<dd><a name="PROPID"><strong>PROPID<br>
|
||
|
</strong></a>An integer that uniquely identifies a property. This integer can be expressed as a decimal (base-10) or hexadecimal
(base-16) number.</dd>
|
||
|
</dl>
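<p>A short sketch of reading a PROPID written in either notation. The <tt>0x</tt> prefix for hexadecimal values is an assumption made for this example, and <tt>parse_propid</tt> is a hypothetical helper rather than an Index Server function.</p>

```python
def parse_propid(text):
    """Parse a PROPID written in decimal ("5") or hexadecimal ("0x5").

    Assumes hexadecimal values carry a 0x prefix; anything else is
    treated as decimal.
    """
    text = text.strip()
    if text.lower().startswith("0x"):
        return int(text, 16)
    return int(text, 10)
```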
|
||
|
<hr>
|
||
|
<h1><a href="#TOP"><img src="up.gif" alt="To Top" align=middle border=0 width=14 height=11></a><a name="sectQ">Q</a></h1>
|
||
|
<dl>
|
||
|
<dd><a name="Query"><strong>Query</strong></a><strong><br>
|
||
|
</strong>In Index Server, the process of searching for specific data in a set of files and returning links to the files containing that
|
||
|
data.</dd>
|
||
|
</dl>
|
||
|
<hr>
|
||
|
<h1><a href="#TOP"><img src="up.gif" alt="To Top" align=middle border=0 width=14 height=11></a><a name="sectR">R</a></h1>
|
||
|
<dl>
|
||
|
<dd><a name="Regex"><strong>Regex</strong></a><strong><br>
|
||
|
</strong>An abbreviation for <a href="#regex">regular expression</a>.</dd>
|
||
|
<dd> </dd>
|
||
|
<dd><a name="Registeredbinaryfile"><strong>Registered binary file</strong></a><strong><br>
|
||
|
</strong>A binary file is typically an executable file with the extension .exe. The term <em>binary file</em> can also refer to a file whose
|
||
|
disk format is unknown. A registered binary file is one whose format is known and is entered in the system registry, but
|
||
|
which is not assigned a content filter.</dd>
|
||
|
<dd> </dd>
|
||
|
<dd><a name="regex"><strong>Regular expression</strong></a><strong><br>
|
||
|
</strong>An expression syntax used by many operating systems, especially UNIX, to specify similarity between words and
|
||
|
phrases. A powerful way to express wildcards in textual expressions.</dd>
|
||
|
<dd> </dd>
|
||
|
<dd><a name="Restriction"><strong>Restriction</strong></a><strong><br>
|
||
|
</strong>A description of what to look for in a query. A restriction narrows the focus of a query.</dd>
|
||
|
<dd> </dd>
|
||
|
<dd><a name="Resultset"><strong>Result set</strong></a><strong><br>
|
||
|
</strong>The information returned by Index Server in response to a query. Also used to define the set of properties or columns
|
||
|
to return from the files that matched the query restriction.</dd>
|
||
|
</dl>
|
||
|
<hr>
|
||
|
<h1><a href="#TOP"><img src="up.gif" alt="To Top" align=middle border=0 width=14 height=11></a><a name="sectS">S</a></h1>
|
||
|
<dl>
|
||
|
<dd><a name="Scan"><strong>Scan</strong></a><strong><br>
|
||
|
</strong>The action of checking all files and directories for modifications among the virtual roots selected for indexing. When
|
||
|
Index Server is first activated it must scan all directories and files to find the documents that may have changed since
|
||
|
Index Server was shut down. Scanning is a background operation, so queries can still be executed while it runs. Once scanning is
|
||
|
complete, Index Server can usually use change notifications to keep its indexes up to date.</dd>
|
||
|
<dd> </dd>
|
||
|
<dd><a name="Scope"><strong>Scope</strong></a><strong><br>
|
||
|
</strong>A query scope specifies the set of documents that must be searched. Typically scopes are specified by a directory path
|
||
|
on a storage volume, such as D:\Docs. Index Server can also use virtual roots to indicate scope.</dd>
|
||
|
<dd> </dd>
|
||
|
<dd><a name="SIndex"><strong>Shadow index</strong></a><strong><br>
|
||
|
</strong>A <a href="#PIndex">persistent index</a> created by merging word lists and sometimes other shadow indexes into a single index.</dd>
|
||
|
<dd> </dd>
|
||
|
<dd><a name="Sleeptime"><strong>Sleep time</strong></a><strong><br>
|
||
|
</strong>A waiting period during which a particular operation does not take place. For example, if index merging takes place
|
||
|
every 24 hours, the sleep time between merges is 24 hours.</dd>
|
||
|
<dd> </dd>
|
||
|
<dd><a name="Stemmer"><strong>Stemmer, word</strong></a><strong><br>
|
||
|
</strong>An Index Server component that takes a word and generates grammatically correct variations of that word. Different
|
||
|
languages require their own stemmers. For example, the English stemmer, if given the word “swam”, would generate
“swim”, “swam”, “swum”, “swimming”, “swims”, and so on.</dd>
|
||
|
<dd> </dd>
|
||
|
<dd><a name="Stopwords"><strong>Stop words</strong></a><strong><br>
|
||
|
</strong>See <a href="#Noisewords">Noise words</a>.</dd>
|
||
|
<dd> </dd>
|
||
|
</dl>
|
||
|
<hr>
|
||
|
<h1><a href="#TOP"><img src="up.gif" alt="To Top" align=middle border=0 width=14 height=11></a><a name="sectT">T</a></h1>
|
||
|
<dl>
|
||
|
<dd><a name="Timeslice"><strong>Time slice</strong></a><strong><br>
|
||
|
</strong>A specific amount of time dedicated to a computational task.</dd>
|
||
|
</dl>
|
||
|
<hr>
|
||
|
<h1><a href="#TOP"><img src="up.gif" alt="To Top" align=middle border=0 width=14 height=11></a><a name="sectW">W</a></h1>
|
||
|
<dl>
|
||
|
<dd> </dd>
|
||
|
<dd><a name="Wordbreaker"><strong>Word breaker DLL</strong></a><strong><br>
|
||
|
</strong>See <a href="#breaker">Breaker, word</a>.</dd>
|
||
|
</dl>
|
||
|
<!--Footerbegin--><hr>
|
||
|
<p align=center><a href="default.htm#Top"><img src="toc.gif" alt=" Contents" align=middle border=0 width=89 height=31></a> <a href="errhandl.htm"><img src="previous.gif" alt="Previous" align=middle border=0 width=32 height=31></a> <a href="#TOP"><img src="up_end.gif" alt="To Top" align=middle border=0 width=32 height=31></a> <a href="faq.htm"><img src="next.gif" alt="Next" align=middle border=0 width=32 height=31></a></p>
|
||
|
<hr>
|
||
|
<p align=center><em>© 1996 by Microsoft Corporation. All rights reserved.</em> <!--Footerend--></p>
|
||
|
</body>
|
||
|
|
||
|
</html>
|