<!-- $XConsortium: dtsrdbfl.sgm /main/5 1996/08/30 15:13:01 rws $ -->
<!-- (c) Copyright 1996 Digital Equipment Corporation. -->
<!-- (c) Copyright 1996 Hewlett-Packard Company. -->
<!-- (c) Copyright 1996 International Business Machines Corp. -->
<!-- (c) Copyright 1996 Sun Microsystems, Inc. -->
<!-- (c) Copyright 1996 Novell, Inc. -->
<!-- (c) Copyright 1996 FUJITSU LIMITED. -->
<!-- (c) Copyright 1996 Hitachi. -->
<![ %CDE.C.CDE; [<RefEntry Id="CDE.INFO.dtsrdbfiles">]]>
<RefMeta>
<RefEntryTitle>dtsrdbfiles</RefEntryTitle>
<ManVolNum>special file</ManVolNum>
</RefMeta>
<RefNameDiv>
<RefName>dtsrdbfiles</RefName>
<RefPurpose>
Describes the complete set of DtSearch database files
</RefPurpose>
</RefNameDiv>
<RefSect1>
<Title>DESCRIPTION</Title>
<Para>Each DtSearch database consists of a set of core files
that are created and maintained by the DtSearch offline build tools.
Each database may also include a set of one or more language files
that vary depending on the DtSearch language of the database.
Some language files are part of the DtSearch package but
may also be enhanced by the database developer.
</para>
<para>All database files for a single database must be located in the same
directory. The directory is specified in the offline build tools by the
optional path prefix in the <literal>&minus;d</literal><Symbol Role="Variable">dbname</Symbol> argument. The directory is specified for
the online API by a <systemitem class="environvar">PATH</systemitem>
configuration file (ocf file).
</para>
<refsect2>
<Title>Core Files</Title>
<Para>The base name of each core file is formed by appending a period and
a 3-character name extension to the 1- to 8-character database name
specified at creation time. Core files are binary and accessible only
via DtSearch programs.
</para>
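<para>The naming rule above can be illustrated with a minimal Python sketch.
The function name <literal>core_file_name</literal> and the database name
<literal>mydb</literal> are hypothetical and are not part of DtSearch; the
sketch only checks the length limits stated above.
</para>
<programlisting>
```python
def core_file_name(dbname: str, ext: str) -> str:
    """Form a core file base name: database name, period, 3-character extension.

    Hypothetical helper for illustration only; not a DtSearch function.
    """
    # The database name is 1 to 8 characters long.
    if len(dbname) not in range(1, 9):
        raise ValueError("database name must be 1 to 8 characters")
    # The name extension is exactly 3 characters long.
    if len(ext) != 3:
        raise ValueError("extension must be exactly 3 characters")
    return f"{dbname}.{ext}"


print(core_file_name("mydb", "dbe"))
```
</programlisting>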
<para>The DtSearch core files are as follows:
</para>
<variablelist>
<varlistentry><term><Symbol Role="Variable">dbname</Symbol>.dbe</term>
<listitem>
<para>Database dictionary file. Binary schema created by
<command>dtsrcreate</command> from <filename>dtsearch.dbe</filename>.
Never modified thereafter.
</para>
</listitem>
</varlistentry>
<varlistentry><term><Symbol Role="Variable">dbname</Symbol>.k00</term>
<listitem>
<para>Main key file for database documents. Created and initialized by
<command>dtsrcreate</command>, updated by <command>dtsrload</command>.
Contains the b-tree of unique keys for each document.
</para>
</listitem>
</varlistentry>
<varlistentry><term><Symbol Role="Variable">dbname</Symbol>.k01</term>
<listitem>
<para>Optional key file for database documents. Created and initialized by
<command>dtsrcreate</command>. Contains the b-tree of optional keys for
each document. Not currently used.
</para>
</listitem>
</varlistentry>
<varlistentry><term><Symbol Role="Variable">dbname</Symbol>.d00</term>
<listitem>
<para>Documents header file. Created by <command>dtsrcreate</command>, updated
by <command>dtsrload</command>. Contains the database's configuration
status record and, for each document in the database, a header record
and one or more abstract records.
</para>
</listitem>
</varlistentry>
<varlistentry><term><Symbol Role="Variable">dbname</Symbol>.d01</term>
<listitem>
<para>Compressed text file. Created by <command>dtsrcreate</command>, but
updated by <command>dtsrload</command> only for AusText-type databases.
Repository of compressed text for each document.
</para>
</listitem>
</varlistentry>
<varlistentry><term><Symbol Role="Variable">dbname</Symbol>.k21,
<Symbol Role="Variable">dbname</Symbol>.k22,
<Symbol Role="Variable">dbname</Symbol>.k23</term>
<listitem>
<para>Key files for words and stems. Created and initialized by
<command>dtsrcreate</command>, updated by <command>dtsrindex</command>.
Contain the b-tree of each word and stem indexed for the database. The
k21 file finds "short" words, 1 to 15 bytes, in the d21 file. The k22
file finds "long" words, 16 to 39 bytes, in the d22 file. The k23 file
finds "huge" words, 40 to 133 bytes, in the d23 file. The long and huge
word files may be unused, depending on the database maximum word size
specified at creation time.
</para>
</listitem>
</varlistentry>
<varlistentry><term><Symbol Role="Variable">dbname</Symbol>.d21,
<Symbol Role="Variable">dbname</Symbol>.d22,
<Symbol Role="Variable">dbname</Symbol>.d23</term>
<listitem>
<para>Data files for words and stems. Created and initialized by
<command>dtsrcreate</command>, updated by <command>dtsrindex</command>.
For each word, contain document counts, the offset to the inverted index
(d99 file), and storage recovery data. The d21 file contains short words, the
d22 file contains long words, and the d23 file contains huge words. The long
and huge word files may be unused, depending on the database maximum
word size specified at creation time.
</para>
</listitem>
</varlistentry>
</variablelist>
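<para>As an illustration of the word size classes described above, the
following Python sketch maps the byte length of a word to the key and data
file pair that would hold it. The function <literal>word_files</literal> and
the database name <literal>mydb</literal> are hypothetical and are not part of
DtSearch, and the sketch ignores the maximum word size chosen at creation time.
</para>
<programlisting>
```python
# Byte-length ranges taken from the k21/k22/k23 descriptions above.
# Each entry: (class label, allowed byte lengths, file number suffix).
WORD_CLASSES = [
    ("short", range(1, 16), "21"),   # 1 to 15 bytes
    ("long", range(16, 40), "22"),   # 16 to 39 bytes
    ("huge", range(40, 134), "23"),  # 40 to 133 bytes
]


def word_files(dbname: str, word: bytes):
    """Return (class label, key file, data file) for a word given as bytes.

    Hypothetical helper for illustration only; not a DtSearch function.
    """
    size = len(word)
    for label, sizes, suffix in WORD_CLASSES:
        if size in sizes:
            return label, f"{dbname}.k{suffix}", f"{dbname}.d{suffix}"
    raise ValueError(f"word length {size} is outside 1 to 133 bytes")


print(word_files("mydb", b"index"))
```
</programlisting>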
</refsect2>
<refsect2>
<Title>Language Files</Title>
<Para>Databases also need a set of files associated with the DtSearch language
of the database. When looking for these files, DtSearch first looks
for a customized version applicable only to that database, and then looks
for the generic language version. Like core files, the base file name of
a customized language file is formed from the database name and a
3-character extension. The alternative generic language files are named
with a language name and the same 3-character extension.
</para>
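<para>The lookup order described above can be sketched in Python as follows.
The function <literal>find_language_file</literal> is hypothetical and is not
part of DtSearch. The sketch also assumes, purely for illustration, that the
customized and generic files are searched for in one directory; the actual
search locations are not specified here.
</para>
<programlisting>
```python
import os


def find_language_file(directory: str, dbname: str, language: str, ext: str):
    """Return the path of the language file to use, or None if none exists.

    Hypothetical helper for illustration only; not a DtSearch function.
    """
    # The customized, database-specific file takes precedence over the
    # generic file named after the language.
    candidates = [f"{dbname}.{ext}", f"{language}.{ext}"]
    for name in candidates:
        path = os.path.join(directory, name)
        if os.path.isfile(path):
            return path
    return None
```
</programlisting>
<para>For example, <literal>find_language_file(".", "mydb", "eng", "stp")</literal>
would return the path of <filename>mydb.stp</filename> if that file exists in
the current directory, and otherwise the path of <filename>eng.stp</filename>
if that one exists.
</para>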
<para>Language files are mandatory or optional depending on the language.
See &cdeman.dtsrlangfiles; for formats of language files.
</para>
<para>The DtSearch language-related files are as follows:
</para>
<variablelist>
<varlistentry><term><Symbol Role="Variable">dbname</Symbol>.stp</term>
<listitem>
<para>Stop file. The supported stop files are:
</para>
<simplelist>
<member>
<filename>eng.stp</filename> &minus; for
<systemitem class="constant">DtSrLaENG</systemitem> and
<systemitem class="constant">DtSrLaENG2</systemitem>
</member>
<member>
<filename>esp.stp</filename> &minus; for
<systemitem class="constant">DtSrLaESP</systemitem>
</member>
<member>
<filename>fra.stp</filename> &minus; for
<systemitem class="constant">DtSrLaFRA</systemitem>
</member>
<member>
<filename>deu.stp</filename> &minus; for
<systemitem class="constant">DtSrLaDEU</systemitem>
</member>
<member>
<filename>ita.stp</filename> &minus; for
<systemitem class="constant">DtSrLaITA</systemitem>
</member>
</simplelist>
<para>Stop lists are mandatory for European languages, and
optional for other supported languages.
</para>
</listitem>
</varlistentry>
<varlistentry><term><Symbol Role="Variable">dbname</Symbol>.inc</term>
<listitem>
<para>Include list. An include list is optional for all supported languages.
There are no generic versions of include lists.
</para>
</listitem>
</varlistentry>
<varlistentry><term><filename>eng.sfx</filename></term>
<listitem>
<para>Suffix file for <systemitem class="constant">DtSrLaENG</systemitem> and
<systemitem class="constant">DtSrLaENG2</systemitem>.
It is not currently required for other supported languages.
</para>
</listitem>
</varlistentry>
<varlistentry><term><Symbol Role="Variable">dbname</Symbol>.knj</term>
<listitem>
<para>Kanji compounds file. The generic version is
<filename>jpn.knj</filename>, for
<systemitem class="constant">DtSrLaJPN2</systemitem>.
A kanji compounds file is mandatory only for language number 7
(<systemitem class="constant">DtSrLaJPN2</systemitem>),
a supported Japanese language.
</para>
</listitem>
</varlistentry>
</variablelist>
<RefSect3>
<Title>Examples</Title>
<para>Files associated with a minimum
<systemitem class="constant">DtSrLaENG</systemitem> database
(English, ASCII) that uses no customized or optional files:
</para>
<programlisting>
All core files plus <filename>eng.stp</filename>, <filename>eng.sfx</filename>.
</programlisting>
<para>Files for a <systemitem class="constant">DtSrLaITA</systemitem>
database (Italian, ISO Latin-1)
with enhanced stop list and an include list:
</para>
<programlisting>
All core files plus <Symbol Role="Variable">dbname</Symbol>.stp, <Symbol Role="Variable">dbname</Symbol>.inc.
</programlisting>
<para>Files associated with a minimum <systemitem class="constant">DtSrLaJPN</systemitem>
database
(Japanese with full, automatic kanji compounding)
that uses no customized or optional files:
</para>
<programlisting>
Only core files.
</programlisting>
<para>Files for a <systemitem class="constant">DtSrLaJPN2</systemitem>
database (Japanese with kanji compounds
from a word list), with optional stop list for ASCII substrings:
</para>
<programlisting>
All core files plus <Symbol Role="Variable">dbname</Symbol>.stp, <filename>jpn.knj</filename>.
</programlisting>
</refsect3>
</refsect2>
</refsect1>
<RefSect1>
<Title>SEE ALSO</Title>
<Para>&cdeman.dtsrcreate;,
&cdeman.dtsrload;,
&cdeman.dtsrindex;,
&cdeman.DtSrAPI;,
&cdeman.dtsrlangfiles;,
&cdeman.dtsrocffile;,
&cdeman.DtSearch;
</Para>
</RefSect1>
</RefEntry>