Commit 66d633c

DOCS-1046 add default_language and language_override options for text index
DOCS-1046 language and text search DOCS-1046 language and text search DOCS-1046 language and text search
1 parent 945c946 commit 66d633c

1 file changed: 96 additions, 3 deletions

source/release-notes/2.4.txt

@@ -299,6 +299,59 @@ likely to appear in responses than words from other fields.
 when creating the index, you can find the name using
 :method:`db.collection.getIndexes()`
 
+.. _text-index-specify-language:
+
+Specify Languages for Text Index
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The default language associated with the indexed data determines the
+list of stop words and the rules for the stemmer and tokenizer. The
+default language for the indexed data is ``english``.
+
+Use the ``default_language`` option when creating the ``text`` index to
+specify a different language. See :ref:`text-search-languages`.
+
+The following example creates a ``text`` index on the
+``content`` field and sets the ``default_language`` to
+``spanish``:
+
+.. code-block:: javascript
+
+   db.collection.ensureIndex( { content : "text" },
+                              { default_language: "spanish" } )
+
+If a collection contains documents that are in different languages, the
+individual documents can specify the language to use.
+
+- By default, if the documents in the collection contain a field named
+  ``language``, the value of the ``language`` field overrides the
+  default language.
+
+  For example, the following document overrides the default language
+  ``spanish`` with ``portuguese``, the value in its ``language`` field.
+
+  .. code-block:: javascript
+
+     { content: "A sorte protege os audazes", language: "portuguese" }
+
+- To use a different field to override the default language, specify the
+  field with the ``language_override`` option when creating the index.
+
+  For example, if the documents contain the field named ``myLanguage``
+  instead of ``language``, create the ``text`` index with the
+  ``language_override`` option.
+
+  .. code-block:: javascript
+
+     db.collection.ensureIndex( { content : "text" },
+                                { language_override: "myLanguage" } )
+
+.. .. note::
+.. If you specify a ``default_language`` of ``"none"``, or the override
+   language is ``"none"``, the :dbcommand:`text` command will not stem
+   the words. The command will also consider all words, i.e., it will not
+   drop the stop words.
+
 Text Queries
 ^^^^^^^^^^^^
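
As a reading aid for the override rules added in the hunk above, here is a minimal sketch in plain JavaScript of how the language for a single document might be chosen. The helper name ``effectiveLanguage`` and the shape of its ``indexOptions`` argument are hypothetical. The sketch models the documented behavior only and is not MongoDB's implementation.

```javascript
// Hypothetical helper: models the documented rules, not the server's code.
function effectiveLanguage(doc, indexOptions) {
  var options = indexOptions || {};
  // Field consulted for a per-document override: "language" unless the
  // index was created with the language_override option.
  var overrideField = options.language_override || "language";
  // Index-wide default: "english" unless default_language was set.
  var defaultLanguage = options.default_language || "english";
  var value = doc[overrideField];
  return typeof value === "string" ? value : defaultLanguage;
}

// Index options mirroring the ensureIndex() examples in the hunk.
var spanishIndex = { default_language: "spanish" };
var customFieldIndex = { language_override: "myLanguage" };

// Document from the hunk: its language field overrides "spanish".
var a = effectiveLanguage(
  { content: "A sorte protege os audazes", language: "portuguese" },
  spanishIndex);

// Illustrative document with no override field: falls back to the default.
var b = effectiveLanguage({ content: "La suerte protege a los audaces" },
  spanishIndex);

// Illustrative document using the custom override field.
var c = effectiveLanguage({ content: "Bonjour", myLanguage: "french" },
  customFieldIndex);
```

Under these assumptions, ``a`` is ``"portuguese"``, ``b`` is ``"spanish"``, and ``c`` is ``"french"``.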

@@ -363,9 +416,11 @@ cursor.
 
 :param string language:
 
-Optional. Specify the language that determines the tokenization,
-stemming, and the stop words for the search. The default language
-is ``english``.
+Optional. Specify, for the search, the language that determines
+the list of stop words and the rules for the stemmer and
+tokenizer. The default language is the value of the
+``default_language`` field specified during the index creation.
+See :ref:`text-search-languages` for the supported languages.
 
 :return:
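
The changed default for the ``language`` parameter can be illustrated with a small sketch that builds the options document for a ``text`` search. ``buildTextOptions`` is a hypothetical helper, not part of the shell. The ``search`` and ``language`` field names follow the parameter list in this file, and the sketch assumes that omitting ``language`` is how a caller lets the index's ``default_language`` apply.

```javascript
// Hypothetical helper: builds the options document for a text search.
function buildTextOptions(search, language) {
  var options = { search: search };
  // Include language only when the caller asks for one. When it is
  // omitted, the search is expected to use the index's default_language.
  if (typeof language === "string") {
    options.language = language;
  }
  return options;
}

// Relies on the index default (for example "spanish").
var withDefault = buildTextOptions("audaces");

// Explicitly searches with Portuguese stop words and stemming rules.
var withOverride = buildTextOptions("audazes", "portuguese");
```

In the ``mongo`` shell, a document like ``withOverride`` would typically be passed as ``db.collection.runCommand( "text", withOverride )``.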

@@ -477,6 +532,44 @@ cursor.
 document, you cannot mix inclusions (i.e. ``<fieldA>: 1``) and
 exclusions (i.e. ``<fieldB>: 0``), except for the ``_id`` field.
 
+.. _text-search-languages:
+
+Languages Supported in Text Search
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The ``text`` index and the :dbcommand:`text` command support the
+following languages:
+
+- ``danish``
+
+- ``dutch``
+
+- ``english``
+
+- ``finnish``
+
+- ``french``
+
+- ``german``
+
+- ``hungarian``
+
+- ``italian``
+
+- ``norwegian``
+
+- ``portuguese``
+
+- ``romanian``
+
+- ``russian``
+
+- ``spanish``
+
+- ``swedish``
+
+- ``turkish``
+
 .. _kerberos-authentication:
 
 New Modular Authentication System with Support for Kerberos
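
The language list added in the last hunk lends itself to a client-side check before an index is created or a search is issued. The following is a minimal sketch under stated assumptions: ``SUPPORTED_TEXT_LANGUAGES`` and ``isSupportedTextLanguage`` are hypothetical names, matching is assumed to be exact and lowercase, and ``"none"`` is left out because it appears only in a commented-out note.

```javascript
// The fifteen languages listed in the hunk, in the same order.
var SUPPORTED_TEXT_LANGUAGES = [
  "danish", "dutch", "english", "finnish", "french",
  "german", "hungarian", "italian", "norwegian", "portuguese",
  "romanian", "russian", "spanish", "swedish", "turkish"
];

// Hypothetical helper: true only for an exact match against the list.
function isSupportedTextLanguage(name) {
  return SUPPORTED_TEXT_LANGUAGES.indexOf(name) !== -1;
}

var spanishOk = isSupportedTextLanguage("spanish");
var klingonOk = isSupportedTextLanguage("klingon");
```

A check like this reports an unsupported name on the client before any command is sent. The server remains the authority on what it accepts.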
