windows-nt/Source/XPSP1/NT/inetsrv/query/web/html/help/filtrhlp.htm

364 lines
25 KiB
HTML
Raw Normal View History

2020-09-26 03:20:57 -05:00
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
<html>
<head>
<title>Microsoft Index Server Guide: Filtering</title>
<meta name="FORMATTER" content="Microsoft FrontPage 1.1">
<meta name="FORMATTER" content="Microsoft FrontPage 1.1">
<meta name="GENERATOR" content="Microsoft FrontPage 1.1">
</head>
<body bgcolor="#FFFFFF">
<!--Headerbegin--><p align=center><a name="TOP"><img src="onepix.gif" alt="Space" align=middle width=1 height=1></a> <a href="default.htm#Top"><img src="toc.gif" alt=" Contents" align=middle border=0 width=89 height=31></a> <a href="indexhlp.htm"><img src="previous.gif" alt="Previous" align=middle border=0 width=32 height=31></a> <a href="scanhlp.htm"><img src="next.gif" alt="Next" align=middle border=0 width=32 height=31></a> </p>
<hr>
<!--Headerend--><p><a name="Filtering"><font size=6><strong>Filtering</strong></font></a></p>
<p align=left><!--Chaptoc--></p>
<blockquote>
<p><a href="filtrhlp.htm#CiDaemon">CiDaemon Process</a> <br>
<a href="filtrhlp.htm#FilterDLLs">Filter DLLs</a> <br>
<a href="filtrhlp.htm#RegisteringFileTypes">Associating File Types with Extensions</a> <br>
<a href="filtrhlp.htm#Word-BreakerDLLs">Word-Breaker DLLs</a> <br>
<a href="filtrhlp.htm#NoiseWords">Noise Words</a> <br>
<a href="filtrhlp.htm#CiDaemonPrioritySettings">CiDaemon Priority Settings</a> <br>
<a href="filtrhlp.htm#RelatedPerformanceCounters">Related Performance Counters</a> <br>
<a href="filtrhlp.htm#DiskFullCondition">Disk Full Condition</a> <br>
</p>
</blockquote>
<hr>
<!--ChaptocEnd--><p>Microsoft Index Server filters documents by inserting data from the document files into content indexes. Content filters break
documents into words (keys) and create <a href="indexhlp.htm#WordLists">word lists</a>, which supply raw data for the index. Filtering is a three-step process:</p>
<ol>
<li>A <a href="#FilterDLLs"><em>filter DLL</em></a> (dynamic link library) extracts the text and properties out of a document. </li>
<li>A <a href="#Word-BreakerDLLs"><em>word-breaker DLL</em></a><em> </em>parses the text and textual properties into <em>words</em>.</li>
<li><a href="#NoiseWords"><em>Noise words</em></a> (also known as <em>Stop Words</em>) are removed from the data extracted from the document and the remaining
words are stored in the index. </li>
</ol>
<hr>
<h1><a href="#TOP"><img src="up.gif" alt="To Top" align=middle border=0 width=14 height=11></a><a name="CiDaemon">CiDaemon Process</a></h1>
<p>The <em>CiDaemon</em> process is a child process created by the Microsoft Index Server engine. The Index Server engine gives a list
of documents to the CiDaemon process and it is responsible for filtering the documents by identifying the correct <a href="#FilterDLLs"><em>filter DLL</em>
</a>and <a href="#Word-BreakerDLLs"><em>word-breaker DLL</em></a> associated with a specific document.</p>
<p>Filtering is done as a background activity so as not to interfere with any foreground activity. On local drives, if a document
opened by the CiDaemon process for reading is needed by another process for writing, the CiDaemon process closes the
document as soon as possible. The document will be retried for filtering at a later time. (This feature is not available on
network shares.)</p>
<p>If the CiDaemon process stops, it will be automatically restarted by the Index Server engine.</p>
<hr>
<h1><a href="#TOP"><img src="up.gif" alt="To Top" align=middle border=0 width=14 height=11></a><a name="FilterDLLs">Filter DLLs</a></h1>
<p>A filter DLL &#147;understands&#148; one or more document formats and is capable of extracting text and properties out of those
document types. A filter DLL implements the <em>IFilter</em> ActiveX interface. The CiDaemon process uses the IFilter interface to
extract the text out of a document. To track down a problem with a filter DLL, an administrator needs to know where to look
to find out the filter DLL for a particular document. Editing the registry is also a good way to avoid filtering documents with no
useful content.</p>
<hr>
<p><font color="#FF0000"><strong>Caution</strong></font>&#160;&#160;&#160;Editing the registry incorrectly can cause serious problems, including corruption that may make it necessary to
reinstall Windows&#160;NT or Microsoft Index Server. Using the Registry Editor to edit entries in the registry is equivalent to editing
raw sectors on a hard disk. If you make mistakes, your computer&#146;s configuration could be damaged. You should edit registry
entries only for settings that you cannot adjust through the user interface, and be very careful whenever you edit the registry
directly. </p>
<hr>
<p>Document types and the associated filter DLL entries are specified in the registry under the
\HKEY_LOCAL_MACHINE\Software\Classes tree. To find out the filter DLL associated with a particular document type,
navigate through the registry entries in the \HKEY_LOCAL_MACHINE\Software\Classes tree.</p>
<p>The four steps to find out the filter DLL for a document follow. The example is for HTML files.</p>
<h2>Step 1: Determine the CLSID</h2>
<p>Find the <em>CLSID</em> associated with the document type under the registry key
\HKEY_LOCAL_MACHINE\SOFTWARE\Classes. Let this be &lt;<em>Value1&gt;.</em></p>
<blockquote>
<pre>\HKEY_LOCAL_MACHINE\SOFTWARE\Classes
htmlfile
= Class for WWW HTML files
CLSID
= {<a name="25336920-03F9-11CF-8FD0-00AA00686F13">25336920-03F9-11CF-8FD0-00AA00686F13</a>}</pre>
</blockquote>
<h2>Step 2: Determine the Persistent Handler</h2>
<p>Using &lt;<em>Value1&gt;</em> found out in Step 1, find the <a href="#PersistentHandler"><em>PersistentHandler</em></a> value for the
\HKEY_LOCAL_MACHINE\SOFTWARE\Classes\CLSID\<em>&lt;Value1&gt;</em> key. Let this be <em>&lt;Value2&gt;.</em></p>
<blockquote>
<pre>\HKEY_LOCAL_MACHINE\SOFTWARE\Classes\CLSID
<a name="Value1">{25336920-03F9-11CF-8FD0-00AA00686F13}</a>
= WWW HTML files
<a name="PersistentHandler">PersistentHandler</a>
= {<a name="Value2">EEC97550-47A9-11CF-B952-00AA0051FE20</a>}</pre>
</blockquote>
<h2>Step 3: Determine the IFilter Persistent Handler GUID</h2>
<p>Using &lt;<em>Value2&gt;</em> determined in Step 2, find the <em>IFilter Persistent Handler GUID</em> for the document type. The value under the
key \HKEY_LOCAL_MACHINE\SOFTWARE\Classes\CLSID\<em>&lt;Value2&gt;</em>\PersistentAddinsRegistered\<br>
89BCB740-6119-101A-BCB7-00DD010655AF yields the <em>IFilter Persistent Handler GUID</em> for this document type<em>.</em> Let
this be <em>&lt;Value3&gt;.</em> 89BCB740-6119-101A-BCB7-00DD010655AF is the <em>IFilter</em> interface GUID.</p>
<blockquote>
<pre>\Registry\Machine\Software\Classes\CLSID
{<a name="HTMLPERSHANDLER">EEC97550-47A9-11CF-B952-00AA0051FE20</a>}
= REG_SZ HTML File Persistent Handler
PersistentAddinsRegistered
{<a name="IFILTERGUID">89BCB740-6119-101A-BCB7-00DD010655AF</a>}
= REG_SZ {<a name="Value3">E0CA5340-4534-11CF-B952-00AA0051FE20</a>} </pre>
</blockquote>
<h2>Step 4: Determine the Filter DLL</h2>
<p>Using <em>&lt;Value3&gt;</em> determined in Step 3, the filter DLL can be found under the entry
\HKEY_LOCAL_MACHINE\SOFTWARE\Classes\CLSID\<em>&lt;Value3&gt;</em>\InprocServer32.</p>
<blockquote>
<pre>\Registry\Machine\Software\Classes\CLSID
{<a name="HTMLPERSHANDLERDLL">E0CA5340-4534-11CF-B952-00AA0051FE20</a>}
= REG_SZ HTML Filter
InprocServer32
= REG_SZ htmlfilt.dll</pre>
</blockquote>
<p>In this example, the filter DLL for HTML documents is Htmlfilt.dll.</p>
<hr>
<h1><a href="#TOP"><img src="up.gif" alt="To Top" align=middle border=0 width=14 height=11></a><a name="RegisteringFileTypes">Associating File Types with Extensions</a></h1>
<p>File types are associated with file extensions under the \HKEY_LOCAL_MACHINE\SOFTWARE\Classes tree. Following
are the associations for <em>htmlfile</em> document type:</p>
<blockquote>
<pre>\HKEY_LOCAL_MACHINE\SOFTWARE\Classes
.htm
= REG_SZ htmlfile
.html
= REG_SZ htmlfile
.htx
= REG_SZ htmlfile
.stm
= REG_SZ htmlfile
</pre>
</blockquote>
<p>By default, the extensions listed above are considered to be <em>htmlfile</em> documents. To add another extension to this list, an entry
must be created in the registry associating that extension with htmlfile type. For example, to treat .htx files as htmlfile type, add
the following entry:</p>
<blockquote>
<pre>\HKEY_LOCAL_MACHINE\SOFTWARE\Classes
.htx
= REG_SZ htmlfile
</pre>
</blockquote>
<h2>Adding Filter DLLs</h2>
<p>To add new filter DLLs, please refer to the documentation provided with the filter DLLs.</p>
<h2>Removing Filter DLLs</h2>
<p>To remove a filter DLL, the <em>IFilter</em> <em>PersistentHandler </em>entry associated with a document type and the filter DLL entry must
be deleted. Please refer to the <a href="#FilterDLLs"><em>Filter DLLs</em></a> section to see how to find out a <em>IFilter PersistentHandler</em> for a particular
document type.</p>
<p>For example, to remove the installed Htmlfilt.dll, the following two entries must be removed:</p>
<blockquote>
<pre>\Registry\Machine\Software\Classes\CLSID
{EEC97550-47A9-11CF-B952-00AA0051FE20}
PersistentAddinsRegistered
{89BCB740-6119-101A-BCB7-00DD010655AF}
= REG_SZ {<a href="#HTMLPERSHANDLERDLL">E0CA5340-4534-11CF-B952-00AA0051FE20</a>} </pre>
</blockquote>
<blockquote>
<pre>\Registry\Machine\Software\Classes\CLSID
{E0CA5340-4534-11CF-B952-00AA0051FE20}
= REG_SZ HTML Filter
InprocServer32
= REG_SZ htmlfilt.dll</pre>
</blockquote>
<h2><a name="BinaryFiles">Binary Files</a> - NULL Filter</h2>
<p>When a <em>registered binary file</em> is encountered, the <em>NULL filter </em>is used. The NULL filter retrieves only the <em>system properties</em>.
The contents of a binary file are not filtered. Examples of system properties are the FileName, last Write time, file Size,
Attributes, and so on.</p>
<p>A file with a certain extension is considered to be a binary file if its type in the registry is set to <em>BinaryFile.</em> For example, to
associate the extension<em> </em>.lib with the binary file type, add the following entry to the registry:</p>
<blockquote>
<pre>\HKEY_LOCAL_MACHINES\Software\Classes
\.lib
= REG_SZ BinaryFile</pre>
</blockquote>
<p>The class BinaryFile is a predefined type that uses the NULL filter for its IFilter implementation.</p>
<hr>
<p><font color="#FF0000"><strong>Warning</strong></font>&#160;&#160;&#160;If the extension for which you wish to use the NULL filter already has a file type, do not change it to BinaryFile.
Doing so could damage your Windows NT installation. Instead, use the following procedure to set the implementation of the
IFilter interface for the file type. </p>
<hr>
<p>When a file extension already has a file type, use the previous procedure to lookup the PersistentAddinsRegistered key and set
the IFilter interface implementation. The example below is for files with the extension .dll.</p>
<h2>Step 1: Determine the file type</h2>
<p>Find the file type associated with the file extension .dll.</p>
<blockquote>
<pre>\HKEY_LOCAL_MACHINE\Software\Classes
\.dll
= REG_SZ dllfile
</pre>
</blockquote>
<h2>Step 2: Determine the CLSID</h2>
<p>Look up the CLSID associated with the dllfile type in the registry.</p>
<blockquote>
<pre>\HKEY_LOCAL_MACHINE\Software\Classes
dllfile
= REG_SZ Application Extension
CLSID
= REG_SZ {3cf51a00-84eb-11ce-ac07-00004c752752}</pre>
</blockquote>
<h2>Step 3: Determine the Persistent Handler</h2>
<p>Look up the persistent handler GUID for the CLSID in the registry. If there is no persistent handler, set it to the CLSID for the
persistent handler of the NULL filter, &#147;{098F2470-BAE0-11CD-B579-08002B30BFEB}&#148;. Otherwise, continue with the
next step.</p>
<blockquote>
<pre>\HKEY_LOCAL_MACHINE\Software\Classes
CLSID
{3cf51a00-84eb-11ce-ac07-00004c752752}
PersistentHandler
= REG_SZ {098F2470-BAE0-11CD-B579-08002B30BFEB}</pre>
</blockquote>
<h2>Step 4: Set the IFilter Persistent Handler</h2>
<p>Look up the CLSID found in the step above and set the IFilter handler {89BCB740-6119-101A-BCB7-00DD010655AF}
to the NULL filter GUID {C3278E90-BEA7-11CD-B579-08002B30BFEB}.</p>
<blockquote>
<pre>\HKEY_LOCAL_MACHINE\Software\Classes
CLSID
{098F2470-BAE0-11CD-B579-08002B30BFEB}
PersistentAddinsRegistered
{89BCB740-6119-101A-BCB7-00DD010655AF}
= REG_SZ {C3278E90-BEA7-11CD-B579-08002B30BFEB}</pre>
</blockquote>
<p>Here is a list of default extensions for binary files:</p>
<blockquote>
<p>.aif,.avi,.cgm,.com,.dct,.dic,.dll,.exe,.eyb,.fnt,.ghi,.gif,<br>
.hqx,.ico,.inv,.jbf,.jpg,.m14,.mov,.movie,.mv,<br>
.pdf,.pic,.pma,.pmc,.pml,.pmr,.psd,.sc2,<br>
.tar,.tif,.tiff,.ttf,.wav,.wll,.wlt,.wmf,.z,.z96,.zip</p>
</blockquote>
<h2><a name="DefaultFilter">Default Filter</a></h2>
<p>In Index Server, a default filter filters both the system properties (such as file name) and the contents of a file. The default filter
does not &#147;understand&#148; any document formats; when filtering the contents of a file, it treats the file as a sequence of characters.
Index Server uses the default filter when a extension of a file has no association in the registry, and if the value of the registry
setting <a href="reghelp.htm#FilterFilesWithUnknownExtensions">FilterFilesWithUnknownExtensions</a> is 1.</p>
<p><strong>Note</strong>&#160;&#160;&#160;The default filter filters plain text and files of unknown origin. It assumes all text to be in the default codepage of the
server.</p>
<h2>Corrupted Files</h2>
<p>If a file is corrupted, the filter DLL may not be able to properly interpret the contents of that file. To get a list of files that could
not be filtered, see <a href="adminhlp.htm#UnfilteredFiles"><u>Unfiltered Files</u></a>. An <a href="errorhlp.htm#MaxRetries">event</a> is also written to the event log. Sometimes a file cannot be filtered because of a
defective third-party filter DLL. After verifying the contents of a file, an administrator should report the problems to the filter
DLL vendor. Files protected by passwords are not filtered.</p>
<h2><a name="MaximumRetries">Maximum Retries</a></h2>
<p>If a document cannot be filtered, it will be retried a certain maxium number of times.&#160;If the document still cannot be filtered,
then it will be considered to be an <em>unfiltered</em> file. The registry key <a href="reghelp.htm#FilterRetries"><em>FilterRetries</em></a> controls the maximum number of retries for a
document.</p>
<p>To get a list of all the files that could not be filtered, issue the query @unfiltered = TRUE.</p>
<h2>Unknown Extensions</h2>
<p>A file with an extension that does not have an association in the registry is treated as an <em>Unknown</em> <em>Extension</em>. The behavior of
Index Server depends upon the registry setting <a href="reghelp.htm#FilterFilesWithUnknownExtension"><em>FilterFilesWithUnknownExtension</em></a>. If this value is set to 0, then the NULL
Filter is used to filter those files. Otherwise, the <a href="#DefaultFilter">default filter DLL</a> is used to filter the contents.</p>
<h2>Filtering Directories</h2>
<p>By default, directories are <em>not</em> filtered and will not appear in query results. To filter directories, set the registry key
<a href="reghelp.htm#FilterDirectories"><em>FilterDirectories</em></a> to 1. When directories are filtered, their system properties are filtered.</p>
<h2><a name="Characterization">Characterization</a></h2>
<p>CiDaemon process is capable of automatically generating summaries or <a href="glossary.htm#Abstract"><em>characterization</em> (also called abstract)</a> for documents.
If the registry key <a href="reghelp.htm#GenerateCharacterization"><em>GenerateCharacterization</em></a><a href="reghelp.htm#GenerateCharacterization"> </a>is set to 1, the characterization will be automatically generated. The maximum
number of chatacters in the generated characterization is controlled by the registry key <a href="reghelp.htm#MaxCharacterization"><em>MaxCharacterization</em></a><em>. </em></p>
<h2><a name="Pre-InstalledFilterDLLs">Preinstalled Filter DLLs</a></h2>
<p>The list of document types for which filter DLLS are preinstalled is given below:</p>
<ul>
<li>HTML 3.0 or lower</li>
<li>Microsoft&#174; Word</li>
<li>Microsoft&#174; Excel</li>
<li>Microsoft&#174; Powerpoint&#174;</li>
<li>Plain Text </li>
<li>Binary Files</li>
</ul>
<hr>
<h1><a href="#TOP"><img src="up.gif" alt="To Top" align=middle border=0 width=14 height=11></a><a name="Word-BreakerDLLs">Word-Breaker DLLs</a></h1>
<p>A <em>word-breaker DLL</em> parses the text and textual properties returned by the <a href="#Filter DLLs"><em>filter DLL</em></a> into words. The word-breaker DLL is
language dependent. The following languages are supported by Microsoft Index Server: </p>
<ul>
<li>Dutch</li>
<li>English (U.S. and International)</li>
<li>French</li>
<li>German</li>
<li>Italian</li>
<li>Spanish</li>
<li>Swedish</li>
</ul>
<hr>
<h1><a href="#TOP"><img src="up.gif" alt="To Top" align=middle border=0 width=14 height=11></a><a name="NoiseWords">Noise Words</a></h1>
<p>Words that are not significant for searching are called <em>noise words</em> or <em>stop words</em>. Noise words are stored in
<em>%systemroot%\</em>system32 directory in various noise word files (Noise.enu, by default). The noise word files are language
dependent. The noise word file for a particular language is specified in the registry under the key:</p>
<blockquote>
<p><code>HKEY_LOCAL_MACHINE\SYSTEM<br>
\SYSTEM <br>
&#160;\CurrentControlSet <br>
&#160;&#160;\Control <br>
&#160;&#160;&#160;\ContentIndex<br>
&#160;&#160;&#160;&#160;\Language<br>
&#160;&#160;&#160;&#160; \&lt;</code><code><em>language</em></code><code>&gt;<br>
&#160;&#160;&#160;&#160; </code><code><tt>&#160;\</tt></code><tt>NoiseFile</tt></p>
</blockquote>
<p>For example, the noise word file for <em>English_US</em> is listed as the registry key:</p>
<blockquote>
<p><code>HKEY_LOCAL_MACHINE\SYSTEM<br>
\SYSTEM <br>
&#160;\CurrentControlSet <br>
&#160;&#160;\Control <br>
&#160;&#160;&#160;\ContentIndex<br>
&#160;&#160;&#160;&#160;\Language<br>
&#160;&#160;&#160;&#160; \English_US<br>
&#160;&#160;&#160;&#160; &#160;</code><code><tt>\</tt></code><tt>NoiseFile<br>
</tt><code><tt>&#160;&#160;&#160;&#160; &#160; \</tt></code><tt>noise.enu</tt></p>
</blockquote>
<p>The noise word files can be edited with a text editor to either add new words or remove words that are not considered &#147;noise&#148;
at a particular installation. Note that querying for noise words will not yield any hits. </p>
<p><font color="#FF0000">Removing all noise words from the<em> </em>noise word files can significantly increase the size of indexes.</font></p>
<hr>
<h1><a href="#TOP"><img src="up.gif" alt="To Top" align=middle border=0 width=14 height=11></a><a name="CiDaemonPrioritySettings">CiDaemon Priority Settings</a></h1>
<p>The CiDaemon priority is controlled by two settings:</p>
<ul>
<li><a href="reghelp.htm#ThreadClassFilter" name="ThreadClassFilter">ThreadClassFilter</a></li>
<li><a href="reghelp.htm#ThreadPriorityFilter" name="ThreadPriorityFilter">ThreadPriorityFilter</a></li>
</ul>
<p>ThreadClassFilter<em> </em>specifies the priority class of the filter daemon. The possible values are:</p>
<table border=1 cellpadding=5 cellspacing=0 width=100%>
<tr><td valign=top width=40%><a name="NORMAL_PRIORITY_CLASS "><font size=2>NORMAL_PRIORITY_CLASS </font></a></td><td valign=top width=60%><font size=2>0x00000020</font></td></tr>
<tr><td valign=top width=40%><font size=2>IDLE_PRORITY_CLASS (default)</font></td><td valign=top width=60%><font size=2>0x00000040</font></td></tr>
<tr><td valign=top width=40%><font size=2>HIGH_PRIORITY_CLASS </font></td><td valign=top width=60%><font size=2>0x00000080</font></td></tr>
<tr><td valign=top width=40%><font size=2>REALTIME_PRIORITY_CLASS </font></td><td valign=top width=60%><font size=2>0x00000100</font></td></tr>
</table>
<p>ThreadPriorityFilter<em> </em>specifies the priority in the specific class. The possible values are:</p>
<table border=1 cellpadding=5 cellspacing=0 width=100%>
<tr><td valign=top width=40%><font size=2>THREAD_PRIORITY_LOWEST </font></td><td valign=top width=60%><font size=2>-2</font></td></tr>
<tr><td valign=top width=40%><font size=2>THREAD_PRIORITY_BELOW_NORMAL</font></td><td valign=top width=60%><font size=2>-1</font></td></tr>
<tr><td valign=top width=40%><a name="THREAD_PRIORITY_NORMAL"><font size=2>THREAD_PRIORITY_NORMAL</font></a><font size=2> </font></td><td valign=top width=60%><font size=2>0</font></td></tr>
<tr><td valign=top width=40%><font size=2>THREAD_PRIORITY_ABOVE_NORMAL (default)</font></td><td valign=top width=60%><font size=2>+1</font></td></tr>
<tr><td valign=top width=40%><font size=2>THREAD_PRIORITY_HIGHEST </font></td><td valign=top width=60%><font size=2>+2</font></td></tr>
</table>
<p>By default the CiDaemon process is set to run in the idle priority class to prevent interference with normal foreground activity.
On a busy server, this might result in the files never being filtered. To run the <a href="#CiDaemon">CiDaemon</a> process as a <em>normal process</em>, set the
<em>ThreadClassFilter</em> to <a href="#NORMAL_PRIORITY_CLASS ">NORMAL_PRIORITY_CLASS</a> and <em>ThreadPriorityFilter</em> to <a href="#THREAD_PRIORITY_NORMAL">THREAD_PRIORITY_NORMAL</a>.
<font color="#FF0000">Setting <em>ThreadClassFilter </em>to HIGH_PRIORITY_CLASS or REALTIME_PROCESS_CLASS is <em>not</em> recommended
because it may interfere with normal activity on the system.</font></p>
<hr>
<h1><a href="#TOP"><img src="up.gif" alt="To Top" align=middle border=0 width=14 height=11></a><a name="RelatedPerformanceCounters">Related Performance Counters</a></h1>
<p>The following counters are present under the performance monitor object <em>Content Index.</em> </p>
<table border=1 cellpadding=5 cellspacing=0 width=100%>
<tr><th align=left valign=bottom width=40%><font size=2><strong>Counter Name </strong></font></th><th align=left valign=bottom width=60%><font size=2><strong>Explanation</strong></font></th></tr>
<tr><td valign=top width=40%><font size=2># documents filtered </font></td><td valign=top width=60%><font size=2>The number of documents filtered since the indexing was started in the <em>current</em>
process instantiation. Note that this does not include the documents filtered in prior
runs of Index Server.</font></td></tr>
<tr><td valign=top width=40%><font size=2>Files to be filtered</font></td><td valign=top width=60%><font size=2>These are the files remaining to be filtered.</font></td></tr>
<tr><td valign=top width=40%><font size=2>Total # of documents</font></td><td valign=top width=60%><font size=2>Total number of documents known to the index. </font></td></tr>
</table>
<p>The following counters are present under the perfmon object <em>Content Index Filter</em></p>
<table border=1 cellpadding=5 cellspacing=0 width=100%>
<tr><th align=left valign=bottom width=40%><font size=2><strong>Counter Name </strong></font></th><th align=left valign=bottom width=60%><font size=2><strong>Explanation</strong></font></th></tr>
<tr><td valign=top width=40%><font size=2>Binding Time</font></td><td valign=top width=60%><font size=2>Average time (in milliseconds) to bind to a </font><a href="#Filter DLLs"><font size=2>filter DLL</font></a><font size=2>.</font></td></tr>
<tr><td valign=top width=40%><font size=2>Filter Speed</font></td><td valign=top width=60%><font size=2>Speed (in megabytes per hour) at which documents are filtered.</font></td></tr>
<tr><td valign=top width=40%><font size=2>Total Filter Speed</font></td><td valign=top width=60%><font size=2>Speed (in megabytes per hour) at which documents are indexed. This includes both
the time to filter document contents, plus time to filter properties and generate
abstracts.</font></td></tr>
</table>
<hr>
<h1><a href="#TOP"><img src="up.gif" alt="To Top" align=middle border=0 width=14 height=11></a><a name="DiskFullCondition">Disk Full Condition</a></h1>
<p>If the free disk space on the index disk starts running low (less than 3 MB), filtering will be temporarily paused. A <a href="errorhlp.htm#LowDiskEvent">disk-full</a>
event will be written to the event log. The administrator should free up disk space by deleting or moving files from that drive.</p>
<!--Footerbegin--><hr>
<p align=center><a href="default.htm#Top"><img src="toc.gif" alt=" Contents" align=middle border=0 width=89 height=31></a> <a href="indexhlp.htm"><img src="previous.gif" alt="Previous" align=middle border=0 width=32 height=31></a> <a href="#TOP"><img src="up_end.gif" alt="To Top" align=middle border=0 width=32 height=31></a> <a href="scanhlp.htm"><img src="next.gif" alt="Next" align=middle border=0 width=32 height=31></a> </p>
<hr>
<p align=center><em>&#169; 1996 by Microsoft Corporation. All rights reserved.<!--Footerend--></em></p>
</body>
</html>