From 915f77a20d217450b77ebe8fd154b00ff4e89872 Mon Sep 17 00:00:00 2001 From: Sam Kleinman Date: Tue, 8 Jan 2013 21:05:47 -0500 Subject: [PATCH 1/5] DOCS-953: text index release notes. --- source/release-notes/2.4.txt | 239 ++++++++++++++++++++++++++++++++++- 1 file changed, 233 insertions(+), 6 deletions(-) diff --git a/source/release-notes/2.4.txt b/source/release-notes/2.4.txt index da0477d94af..1f28ea0bb45 100644 --- a/source/release-notes/2.4.txt +++ b/source/release-notes/2.4.txt @@ -40,6 +40,238 @@ process. Changes ------- +Text Indexes +~~~~~~~~~~~~ + +.. note:: + + The ``text`` index type is currently an experimental feature and + you must enable it at run time. Interfaces and on-disk format may + change in future releases. + +Background +`````````` + +MongoDB 2.3.2 added a new ``text`` index type that creates a special +index that allows rich arbitrary queries over the content of a string +field in MongoDB. MongoDB updates ``text`` indexes in real time as +clients update data in MongoDB. Queries that use the ``text`` index +will always be able to find the latest data using the text index. + +However, ``text`` indexes have large storage requirements and incur +**significant** performance costs: + +- Building ``text`` indexes takes time. For larger data sets, it may + take many minutes or hours to build a text index. + +- ``text`` indexes will impede insertion throughput for collection, as + MongoDB must update index entries for each word in the source + collection. + +Additionally, the current *experimental* implementation of ``text`` +indexes have the following limitations and behaviors: + +- MongoDB stores words stemmed during insertion in the index, using + simple suffix stemming, including support for a number of + languages. MongoDB automatically stems :dbcommand:`text` queries at + before beginning the query. + +- queries drop stop words (i.e. "the," "an," "a," "and," etc.) 
+ +- the index does not store phrases or information about the proximity + of words in the documents. As a result, **only** use phrase queries + when the entire collection fits in RAM. + +- queries that negate + +- you may only create a single text index on a collection at a time. + +.. important:: Do not enable or use ``text`` indexes on production + systems. + +For production-grade search requirements consider using a third-party +search tool, and the `mongo-connector `_ +or a similar integration strategy to provide more advanced search +capabilities. + +Test ``text`` Indexes +````````````````````` + +.. important:: The ``text`` index type is an experimental feature and + you must enable the feature before creating or accessing a text + index. To enable text indexes issue the following command at the + :program:`mongo` shell: + + .. code-block:: javascript + + db.adminComand( { setParameter: 1, textSearchEnabled: true } ) + + You can also start the :program:`mongod` with the following + invocation: + + .. code-block:: sh + + mongo --setParameter textSearchEnabled=true + +Create Text Indexes +^^^^^^^^^^^^^^^^^^^ + +To create a text index, use the following invocation of +:method:`~db.collection.ensureIndex()`: + +.. code-block:: javascript + + db.collection.ensureIndex( { content: "text" } ) + +Your ``text`` index can include content from multiple fields, and from +fields in sub-documents, as in the following: + +.. code-block:: javascript + + db.collection.ensureIndex( { content: "text", + users.comment: "text", + users.profiles: "text" } ) + +These indexes may run into the :limit:`Index Name Length` limit, to +avoid creating an index with a too-long name, you can specify a name +in the options category, as in the following: + +.. 
code-block:: javascript + + db.collection.ensureIndex( { content: "text", + users.profiles: "text" }, + { name: "TextIndex" } ) + +When creating a ``text`` indexes you may also specify *weights* for +specific, which help shape the result set. Consider the following: + +.. code-block:: javascript + + db.collection.ensureIndex( { content: "text", users.profiles: "text" }, + { name: "TextIndex", + weights: { content: 1, + users.profiles: 2 } } ) + +This example creates a ``text`` index on the top-level field named +``content`` and the ``profiles`` field in the ``users`` +sub-documents. Furthermore, ``content`` field has a weight of 1 and +the ``users.profiles`` field has a weight of 2. + +You can add a conventional ascending or descending index field as a +prefix of the index so that queries can limit the number of index +entries the query must review to perform the query. Create this index +with the following command: + +.. code-block:: javascript + + db.collection.ensureIndex( { username: "1", + users.profiles: "text" } ) + +Alternately you can specify a conventional ascending or descending +field as a suffix to a ``text`` index make it possible to use the +``text`` index to return covered queries using a projection, as in the +following index: + +.. code-block:: javascript + + db.collection.ensureIndex( { users.profiles: "text" + username: "1" } ) + +Finally, you may use the special wild card field name specifier +(i.e. ``$**``) to specify index weights and fields. Consider the +following: + +.. code-block:: javascript + + db.collection.ensureIndex( { "$**": "text" + username: "1" }, + { name: "TextIndex" } ) + +This creates an index named ``TextIndex``, that indexes all string +data in every field of every document in a collection + +.. warning:: Create indexes using the wildcard operator with extreme + caution. + +You may also specify weights for a text index with compound fields, as +in the following: + +.. 
code-block:: javascript + + db.collection.ensureIndex( { content: "text", + users.profiles: "text", + comments: "text", + keywords: "text" + about: "text" }, + { name: "TextIndex", + weight: + { "$**": 5 + content: 10, + user.profiles: 2, + comments: 1 } } ) + + +This index, named ``TextIndex``, includes a number of fields, that all +have a weight of 5, except for the: + +- ``content`` field that has a weight of 5, +- ``users.profiles`` that has a weight of 2, and +- ``comments`` that has a weight of 1. + +This means that results that match words in the ``content`` field more +than all other fields in the index, and that the ``user.profiles`` and +``comments`` fields will be less likely to appear in responses than +words from other fields. + +Text Queries +^^^^^^^^^^^^ + +MongoDB 2.3.2 introduces the :dbcommand:`text` command to provide +query support for ``text`` indexes. Unlike normal MongoDB queries, +:dbcommand:`text` returns a document rather than a cursor. + +.. dbcommand:: text + + The :dbcommand:`text` provides an interface to search text context + stored in ta ``text`` index. Consider the following prototype: + :dbcommand:`text`: + + .. code-block:: javascript + + db.runCommand( text: { search: , filter: } ) + + The :dbcommand:`text` command has the following parameters: + + :param string query: + + A text string that MongoDB stems and uses to query the ``text`` + index. + + :param document filter: + + Optional. A :ref:`query document ` to + further limit the results of the query using another database + field. If the index as a compound ascending or descending index + field, this query will use that index field. + + Unless your index includes an ordered field as a subset of the + + :param document projection: + + Optional. Allows you to limit the fields returned by the query + to only those specified. + + :return: + + :dbcommand:`text` returns results in the form of a + document. Results must fit within the :limit:`BSON Document + Size`. 
Use a projection setting to limit the size of the result + set. + +.. example:: + + Consider the following examples of :dbcommand:`text` queries: + Additional Authentication Features ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ @@ -53,7 +285,7 @@ Additional Authentication Features - `Amazon Linux 6.4 `_ - `Red Hat Enterprise Linux 6.2 `_ - `Ubuntu 11.04 `_ - - `SUSE 11 `_ + - `SUSE 11 `_ An improved authentication system is a core focus of the entire 2.3 cycle, as of 2.3.1, the following components of the new authentication @@ -258,11 +490,6 @@ for ``s2d`` indexes as ``2d`` indexes. .. operator:: $intersect -.. note:: - - In 2.3.2, the :operator:`$intersect` operator will - become :operator:`$geoIntersects` - The :operator:`$intersect` selects all indexed points that intersect with the provided geometry. (i.e. ``Point``, ``LineString``, and ``Polygon``.) You must pass :operator:`$intersect` a document From 027eac36f800c3eed1bc125ea02a5acb72dbeadd Mon Sep 17 00:00:00 2001 From: Sam Kleinman Date: Thu, 10 Jan 2013 13:04:24 -0500 Subject: [PATCH 2/5] DOCS-953: examples for the release notes --- source/release-notes/2.4.txt | 223 ++++++++++++++++++++++++++++------- 1 file changed, 180 insertions(+), 43 deletions(-) diff --git a/source/release-notes/2.4.txt b/source/release-notes/2.4.txt index 1f28ea0bb45..326195dcaf0 100644 --- a/source/release-notes/2.4.txt +++ b/source/release-notes/2.4.txt @@ -68,6 +68,11 @@ However, ``text`` indexes have large storage requirements and incur MongoDB must update index entries for each word in the source collection. +- some :dbcommand:`text` searches may affect performance on your + :program:`mongod`, particularly for negation queries and phrase + matches that cannot use the index as effectively as other kinds of + queries. + Additionally, the current *experimental* implementation of ``text`` indexes have the following limitations and behaviors: @@ -76,15 +81,21 @@ indexes have the following limitations and behaviors: languages. 
MongoDB automatically stems :dbcommand:`text` queries at before beginning the query. -- queries drop stop words (i.e. "the," "an," "a," "and," etc.) +- indexes and queries drop stop words (i.e. "the," "an," "a," "and," + etc.) - the index does not store phrases or information about the proximity of words in the documents. As a result, **only** use phrase queries when the entire collection fits in RAM. -- queries that negate +- :dbcommand:`text` queries with negations must scan the content of + the document to guarantee the negation. As a result, **only** use + phrase queries when the entire collection fits in RAM. + +- MongoDB does not stem phrases or negations in :dbcommand:`text` + queries. -- you may only create a single text index on a collection at a time. +- a collection may only have a single ``text`` index at a time. .. important:: Do not enable or use ``text`` indexes on production systems. @@ -104,42 +115,43 @@ Test ``text`` Indexes .. code-block:: javascript - db.adminComand( { setParameter: 1, textSearchEnabled: true } ) + db.adminCommand( { setParameter: 1, textSearchEnabled: true } ) You can also start the :program:`mongod` with the following invocation: .. code-block:: sh - mongo --setParameter textSearchEnabled=true + mongod --setParameter textSearchEnabled=true Create Text Indexes ^^^^^^^^^^^^^^^^^^^ -To create a text index, use the following invocation of +To create a ``text`` index, use the following invocation of :method:`~db.collection.ensureIndex()`: .. code-block:: javascript db.collection.ensureIndex( { content: "text" } ) -Your ``text`` index can include content from multiple fields, and from -fields in sub-documents, as in the following: +``text`` indexes ignore all string data in the ``content`` field. Your +``text`` index can include content from multiple fields, or arrays, +and from fields in sub-documents, as in the following: .. 
code-block:: javascript db.collection.ensureIndex( { content: "text", - users.comment: "text", - users.profiles: "text" } ) + "users.comments": "text", + "users.profiles": "text" } ) -These indexes may run into the :limit:`Index Name Length` limit, to +These indexes may run into the :limit:`Index Name Length` limit. To avoid creating an index with a too-long name, you can specify a name -in the options category, as in the following: +in the options parameter, as in the following: .. code-block:: javascript db.collection.ensureIndex( { content: "text", - users.profiles: "text" }, + "users.profiles": "text" }, { name: "TextIndex" } ) When creating a ``text`` indexes you may also specify *weights* for @@ -147,10 +159,11 @@ specific, which help shape the result set. Consider the following: .. code-block:: javascript - db.collection.ensureIndex( { content: "text", users.profiles: "text" }, + db.collection.ensureIndex( { content: "text", + "users.profiles": "text" }, { name: "TextIndex", weights: { content: 1, - users.profiles: 2 } } ) + "users.profiles": 2 } } ) This example creates a ``text`` index on the top-level field named ``content`` and the ``profiles`` field in the ``users`` @@ -165,7 +178,16 @@ with the following command: .. code-block:: javascript db.collection.ensureIndex( { username: "1", - users.profiles: "text" } ) + "users.profiles": "text" } ) + +If you create an ascending or descending index as a prefix of a +``text`` index: + +- MongoDB will only index documents that have the prefix field + (i.e. ``username``) and + +- All :dbcommand:`text` queries using this index must specify the + prefix field in the ``filter`` query. Alternately you can specify a conventional ascending or descending field as a suffix to a ``text`` index make it possible to use the @@ -174,7 +196,7 @@ following index: .. 
code-block:: javascript - db.collection.ensureIndex( { users.profiles: "text" + db.collection.ensureIndex( { "users.profiles": "text", username: "1" } ) Finally, you may use the special wild card field name specifier @@ -183,7 +205,7 @@ following: .. code-block:: javascript - db.collection.ensureIndex( { "$**": "text" + db.collection.ensureIndex( { "$**": "text", username: "1" }, { name: "TextIndex" } ) @@ -199,46 +221,52 @@ in the following: .. code-block:: javascript db.collection.ensureIndex( { content: "text", - users.profiles: "text", + "users.profiles": "text", comments: "text", - keywords: "text" + keywords: "text", about: "text" }, { name: "TextIndex", - weight: - { "$**": 5 - content: 10, - user.profiles: 2, - comments: 1 } } ) + weights: + { content: 10, + "user.profiles": 2, + keywords: 5, + about: 5 } } ) -This index, named ``TextIndex``, includes a number of fields, that all -have a weight of 5, except for the: +This index, named ``TextIndex``, includes a number of fields, with the +following weights: -- ``content`` field that has a weight of 5, -- ``users.profiles`` that has a weight of 2, and -- ``comments`` that has a weight of 1. +- ``content`` field that has a weight of 10, +- ``users.profiles`` that has a weight of 2, +- ``comments`` that has a weight of 1, +- ``keywords`` that has a weight of 5, and +- ``about`` that has a weight of 5. -This means that results that match words in the ``content`` field more -than all other fields in the index, and that the ``user.profiles`` and -``comments`` fields will be less likely to appear in responses than -words from other fields. +This means that documents that match words in the ``content`` field +will appear in the result set more than all other fields in the index, +and that the ``user.profiles`` and ``comments`` fields will be less +likely to appear in responses than words from other fields. 
Text Queries ^^^^^^^^^^^^ MongoDB 2.3.2 introduces the :dbcommand:`text` command to provide query support for ``text`` indexes. Unlike normal MongoDB queries, -:dbcommand:`text` returns a document rather than a cursor. +:dbcommand:`text` returns a document rather than a +cursor. .. dbcommand:: text The :dbcommand:`text` provides an interface to search text context - stored in ta ``text`` index. Consider the following prototype: + stored in the ``text`` index. Consider the following prototype: :dbcommand:`text`: .. code-block:: javascript - db.runCommand( text: { search: , filter: } ) + db.collection.runCommand( text: { search: , + filter: , + projection: , + limit: } ) The :dbcommand:`text` command has the following parameters: @@ -251,16 +279,19 @@ query support for ``text`` indexes. Unlike normal MongoDB queries, Optional. A :ref:`query document ` to further limit the results of the query using another database - field. If the index as a compound ascending or descending index + field. If the index is a compound ascending or descending index field, this query will use that index field. - Unless your index includes an ordered field as a subset of the - :param document projection: Optional. Allows you to limit the fields returned by the query to only those specified. + :param number limit: + + Optional. Specify the maximum number of documents to include in + the response. + :return: :dbcommand:`text` returns results in the form of a @@ -268,10 +299,116 @@ query support for ``text`` indexes. Unlike normal MongoDB queries, Size`. Use a projection setting to limit the size of the result set. -.. example:: + Unlike other queries in MongoDB, :dbcommand:`text` takes all the + words in a query an selects matching documents from the index using + a logical ``OR`` operator. However, consider the following + behaviors of :dbcommand:`text` queries: + + - MongoDB adds the words in phrase (i.e. 
queries enclosed in + quotation marks,) to the search and then adds the praise itself + to the query joined with a logical ``AND``. + + - :dbcommand:`text` adds all negations to the query with the + logical ``AND`` operator. - Consider the following examples of :dbcommand:`text` queries: +.. example:: + + Consider the following examples of :dbcommand:`text` queries. All + examples assume that you have a ``text`` index on the field named + ``content`` in a collection named ``collection``. + + #. Create a ``text`` index on the ``content`` field to enable text + search on the field: + + .. code-block:: javascript + + db.collection.ensureIndex( { content: "text" } ) + + #. Search for a single word ``search``: + + .. code-block:: javascript + + db.collection.runCommand( "text", { search: "search" } ) + + This query returns documents that contain the word + ``search``, case-insensitive, in the ``content`` field. + + #. Search for multiple words, ``create`` or ``search`` or ``fields``: + + .. code-block:: javascript + + db.collection.runCommand( "text", { search: "create search fields" } ) + + This query returns documents that contain the either ``creat`` + **or** ``search`` **or** ``field`` in the ``content`` field. The + :dbcommand:`text` command has stemmed the word ``create`` to + ``creat`` and the word ``fields`` to ``field`` for the search. + + #. Search for the exact phrase ``create search fields``: + + .. code-block:: javascript + + db.collection.runCommand( "text", { search: "\"create search fields\""" } ) + + This query returns documents that contain the exact phrase + ``create search fields`` **and** (``creat`` **or** ``search`` + **or** ``field``), all case-insensitive, in the ``content`` + field. + + .. note:: + + Exact phrase matches use the index for the initial selection + phase, but the query must scan the entire result document set + to fulfill the query. + + #. 
Search for documents that contain the words ``create`` or ``search``, + but **not** ``fields``: + + .. code-block:: javascript + + db.collection.runCommand( "text", { search: "create search -fields" } ) + + Use the ``-`` as a prefix to terms to specify negation in the + search string. The query returns documents that contain the + either ``creat`` **or** ``search``, but **not** ``field``, all + case-insensitive, in the ``content`` field. Prefixing a word + with a hyphen (``-``) negates a word: + + - The negated word filters out documents from the result set, + after selecting documents. + + - A ```` that only contains negative words returns no match. + + - A hyphenated word, such as ``case-insensitive``, is not a negated + word. The :dbcommand:`text` command ignores the hyphen and + searches for either ``case`` or the stem ``insensit``. + + #. Search for a single word ``search`` with an additional ``filter`` on + the ``about`` field, but **limit** the results to 2 documents with the + highest score and return only the ``comments`` field in the matching + documents: + + .. code-block:: javascript + + db.collection.runCommand( "text", { + search: "insensitive", + filter: { about: /something/ }, + limit: 2, + projection: { comments: 1, _id: 0 } + } + ) + + - The ``filter`` :ref:`query document ` + is uses a :operator:`regular expression <$regex>`. See the + :ref:`query operators ` page for available query + operators. + + - The ``projection`` must explicitly exclude (``0``) the ``_id`` + field. Within the ``projection`` document, you cannot mix + inclusions (i.e. ``: 1``) and exclusions (i.e. ``: + 0``), except for the ``_id`` field. 
+ Additional Authentication Features ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ @@ -285,7 +422,7 @@ Additional Authentication Features - `Amazon Linux 6.4 `_ - `Red Hat Enterprise Linux 6.2 `_ - `Ubuntu 11.04 `_ - - `SUSE 11 `_ + - `SUSE 11 `_ An improved authentication system is a core focus of the entire 2.3 cycle, as of 2.3.1, the following components of the new authentication From 9d93e2e9ce03b88862084704e9c35c2760f27c53 Mon Sep 17 00:00:00 2001 From: Sam Kleinman Date: Thu, 10 Jan 2013 14:21:29 -0500 Subject: [PATCH 3/5] DOCS-953 minor fixes --- source/release-notes/2.4.txt | 22 +++++++++++----------- 1 file changed, 11 insertions(+), 11 deletions(-) diff --git a/source/release-notes/2.4.txt b/source/release-notes/2.4.txt index 326195dcaf0..67075ce053d 100644 --- a/source/release-notes/2.4.txt +++ b/source/release-notes/2.4.txt @@ -13,8 +13,8 @@ are *not for production use* under any circumstances. document are subject to change before the 2.4.0 release. This document will eventually contain the full release notes for -MongoDB 2.4; during the development cycle this document will contain -documentation of new features and functionality only available in the +MongoDB 2.4; during the development cycle this document will containd +ocumentation of new features and functionality only available in the 2.3 releases. .. contents:: See the :doc:`full index of this page <2.4-changes>` for @@ -177,7 +177,7 @@ with the following command: .. code-block:: javascript - db.collection.ensureIndex( { username: "1", + db.collection.ensureIndex( { username: 1, "users.profiles": "text" } ) If you create an ascending or descending index as a prefix of a @@ -197,7 +197,7 @@ following index: .. code-block:: javascript db.collection.ensureIndex( { "users.profiles": "text", - username: "1" } ) + username: 1 } ) Finally, you may use the special wild card field name specifier (i.e. ``$**``) to specify index weights and fields. Consider the @@ -206,7 +206,7 @@ following: .. 
code-block:: javascript db.collection.ensureIndex( { "$**": "text", - username: "1" }, + username: 1 }, { name: "TextIndex" } ) This creates an index named ``TextIndex``, that indexes all string @@ -263,17 +263,18 @@ cursor. .. code-block:: javascript - db.collection.runCommand( text: { search: , - filter: , - projection: , - limit: } ) + db.collection.runCommand( "text": { search: , + filter: , + projection: , + limit: } ) The :dbcommand:`text` command has the following parameters: :param string query: A text string that MongoDB stems and uses to query the ``text`` - index. + index. When specifying phrase matches, you must escape quote + characters (i.e. ``"``) with backslashes (i.e. ``\``). :param document filter: @@ -311,7 +312,6 @@ cursor. - :dbcommand:`text` adds all negations to the query with the logical ``AND`` operator. - .. example:: Consider the following examples of :dbcommand:`text` queries. All From dc699fc9f31c97896e8ab2921c1668be899e07d4 Mon Sep 17 00:00:00 2001 From: Sam Kleinman Date: Thu, 10 Jan 2013 17:49:56 -0500 Subject: [PATCH 4/5] DOCS-929: corrections to introduction from discussion with paul --- source/release-notes/2.4.txt | 63 +++++++++++++++++++----------------- 1 file changed, 34 insertions(+), 29 deletions(-) diff --git a/source/release-notes/2.4.txt b/source/release-notes/2.4.txt index 67075ce053d..923864cd15c 100644 --- a/source/release-notes/2.4.txt +++ b/source/release-notes/2.4.txt @@ -52,21 +52,35 @@ Text Indexes Background `````````` -MongoDB 2.3.2 added a new ``text`` index type that creates a special -index that allows rich arbitrary queries over the content of a string -field in MongoDB. MongoDB updates ``text`` indexes in real time as -clients update data in MongoDB. Queries that use the ``text`` index -will always be able to find the latest data using the text index. +MongoDB2.3.2 includes a new ``text`` index type. ``text`` indexes +support boolean text search queries. 
Any set of fields containing +string data may be text indexed. You may only maintain a single +``text`` index per collection. ``text`` indexes are fully consistent +and updated in real-time as applications insert, update, or delete +documents from the database. The ``text`` index and query system +supports language specific stemming and stop-words. Additionally: + +- indexes and queries drop stop words (i.e. "the," "an," "a," "and," + etc.) + +- MongoDB stores words stemmed during insertion in the index, using + simple suffix stemming, including support for a number of + languages. MongoDB automatically stems :dbcommand:`text` queries at + before beginning the query. However, ``text`` indexes have large storage requirements and incur **significant** performance costs: -- Building ``text`` indexes takes time. For larger data sets, it may - take many minutes or hours to build a text index. +- Text indexes can be large. They contain one index entry for each + unique word indexed for each document inserted. -- ``text`` indexes will impede insertion throughput for collection, as - MongoDB must update index entries for each word in the source - collection. +- Building a ``text`` index is very similar to building a large + multi-key index, and therefore may take longer than building a + simple ordered (scalar)index. + +- ``text`` indexes will impede insertion throughput, because MongoDB + must add an index entry for each unique word in each indexed field + of each new source document. - some :dbcommand:`text` searches may affect performance on your :program:`mongod`, particularly for negation queries and phrase @@ -76,21 +90,10 @@ However, ``text`` indexes have large storage requirements and incur Additionally, the current *experimental* implementation of ``text`` indexes have the following limitations and behaviors: -- MongoDB stores words stemmed during insertion in the index, using - simple suffix stemming, including support for a number of - languages. 
MongoDB automatically stems :dbcommand:`text` queries at - before beginning the query. - -- indexes and queries drop stop words (i.e. "the," "an," "a," "and," - etc.) - -- the index does not store phrases or information about the proximity - of words in the documents. As a result, **only** use phrase queries - when the entire collection fits in RAM. - -- :dbcommand:`text` queries with negations must scan the content of - the document to guarantee the negation. As a result, **only** use - phrase queries when the entire collection fits in RAM. +- ``text`` indexes do not store phrases or information about the + proximity of words in the documents. As a result, phrase queries + will run much more effectively when the entire collection fits in + RAM. - MongoDB does not stem phrases or negations in :dbcommand:`text` queries. @@ -100,10 +103,12 @@ indexes have the following limitations and behaviors: .. important:: Do not enable or use ``text`` indexes on production systems. -For production-grade search requirements consider using a third-party -search tool, and the `mongo-connector `_ -or a similar integration strategy to provide more advanced search -capabilities. +.. May be worth including this: + + For production-grade search requirements consider using a + third-party search tool, and the `mongo-connector + `_ or a similar + integration strategy to provide more advanced search capabilities. 
Test ``text`` Indexes ````````````````````` From e8625759c9998ab833bc51ccc0043d2d8a256dfc Mon Sep 17 00:00:00 2001 From: kay Date: Thu, 10 Jan 2013 18:19:50 -0500 Subject: [PATCH 5/5] DOCS-929 edits to release notes for text search --- source/release-notes/2.4.txt | 148 +++++++++++++++++++---------------- 1 file changed, 82 insertions(+), 66 deletions(-) diff --git a/source/release-notes/2.4.txt b/source/release-notes/2.4.txt index 923864cd15c..20cd3a43e17 100644 --- a/source/release-notes/2.4.txt +++ b/source/release-notes/2.4.txt @@ -13,8 +13,8 @@ are *not for production use* under any circumstances. document are subject to change before the 2.4.0 release. This document will eventually contain the full release notes for -MongoDB 2.4; during the development cycle this document will containd -ocumentation of new features and functionality only available in the +MongoDB 2.4; during the development cycle this document will contain +documentation of new features and functionality only available in the 2.3 releases. .. contents:: See the :doc:`full index of this page <2.4-changes>` for @@ -98,6 +98,8 @@ indexes have the following limitations and behaviors: - MongoDB does not stem phrases or negations in :dbcommand:`text` queries. +- the index is case insensitive. + - a collection may only have a single ``text`` index at a time. .. important:: Do not enable or use ``text`` indexes on production @@ -115,7 +117,7 @@ Test ``text`` Indexes .. important:: The ``text`` index type is an experimental feature and you must enable the feature before creating or accessing a text - index. To enable text indexes issue the following command at the + index. To enable text indexes, issue the following command at the :program:`mongo` shell: .. code-block:: javascript @@ -139,7 +141,7 @@ To create a ``text`` index, use the following invocation of db.collection.ensureIndex( { content: "text" } ) -``text`` indexes ignore all string data in the ``content`` field. 
Your +``text`` indexes catalog all string data in the ``content`` field. Your ``text`` index can include content from multiple fields, or arrays, and from fields in sub-documents, as in the following: @@ -149,6 +151,13 @@ and from fields in sub-documents, as in the following: "users.comments": "text", "users.profiles": "text" } ) +The default name for the index consists of the ```` +concatenated with ``_text``, as in the following: + +.. code-block:: javascript + + "content_text_users.comments_text_users.profiles_text" + These indexes may run into the :limit:`Index Name Length` limit. To avoid creating an index with a too-long name, you can specify a name in the options parameter, as in the following: @@ -159,8 +168,11 @@ in the options parameter, as in the following: "users.profiles": "text" }, { name: "TextIndex" } ) -When creating a ``text`` indexes you may also specify *weights* for -specific, which help shape the result set. Consider the following: +When creating ``text`` indexes you may specify *weights* for specific +fields. *Weights* are factored into the relevant score for each +document. The score for a given word in a document is the weighted sum +of the frequency for each of the indexed fields in that document. +Consider the following: .. code-block:: javascript @@ -172,18 +184,14 @@ specific, which help shape the result set. Consider the following: This example creates a ``text`` index on the top-level field named ``content`` and the ``profiles`` field in the ``users`` -sub-documents. Furthermore, ``content`` field has a weight of 1 and +sub-documents. Furthermore, the ``content`` field has a weight of 1 and the ``users.profiles`` field has a weight of 2. -You can add a conventional ascending or descending index field as a -prefix of the index so that queries can limit the number of index -entries the query must review to perform the query. Create this index -with the following command: - -.. 
code-block:: javascript - - db.collection.ensureIndex( { username: 1, - "users.profiles": "text" } ) +You can add a conventional ascending or descending index field(s) as a +prefix or suffix of the index so that queries can limit the number of +index entries the query must review to perform the query. You cannot +include :ref:`multi-key ` index field nor +:ref:`geospatial ` index field. If you create an ascending or descending index as a prefix of a ``text`` index: @@ -194,19 +202,29 @@ If you create an ascending or descending index as a prefix of a - All :dbcommand:`text` queries using this index must specify the prefix field in the ``filter`` query. -Alternately you can specify a conventional ascending or descending -field as a suffix to a ``text`` index make it possible to use the -``text`` index to return covered queries using a projection, as in the -following index: +Create this index with the following operation: + +.. code-block:: javascript + + db.collection.ensureIndex( { username: 1, + "users.profiles": "text" } ) + +Alternatively you create an ascending or descending index as a suffix +to a ``text`` index. Then the ``text`` index can support +:ref:`covered queries ` if the +:dbcommand:`text` command specifies a ``projection`` option. + +Create this index with the following operation: .. code-block:: javascript db.collection.ensureIndex( { "users.profiles": "text", username: 1 } ) -Finally, you may use the special wild card field name specifier -(i.e. ``$**``) to specify index weights and fields. Consider the -following: +Finally, you may use the special wild card field specifier (i.e. +``$**``) to specify index weights and fields. Consider the following +example that indexes any string value in the data of every field of +every document in a collection and names it ``TextIndex``: .. 
code-block:: javascript @@ -214,14 +232,8 @@ following: username: 1 }, { name: "TextIndex" } ) -This creates an index named ``TextIndex``, that indexes all string -data in every field of every document in a collection - -.. warning:: Create indexes using the wildcard operator with extreme - caution. - -You may also specify weights for a text index with compound fields, as -in the following: +By default, an index field has a weight of ``1``. You may specify +weights for a ``text`` index with compound fields, as in the following: .. code-block:: javascript @@ -237,7 +249,6 @@ in the following: keywords: 5, about: 5 } } ) - This index, named ``TextIndex``, includes a number of fields, with the following weights: @@ -268,25 +279,31 @@ cursor. .. code-block:: javascript - db.collection.runCommand( "text": { search: , - filter: , - projection: , - limit: } ) + db.collection.runCommand( "text", { search: , + filter: , + projection: , + limit: , + language: } ) The :dbcommand:`text` command has the following parameters: - :param string query: + :param string search: A text string that MongoDB stems and uses to query the ``text`` index. When specifying phrase matches, you must escape quote - characters (i.e. ``"``) with backslashes (i.e. ``\``). + characters as ``\"``. :param document filter: Optional. A :ref:`query document ` to further limit the results of the query using another database - field. If the index is a compound ascending or descending index - field, this query will use that index field. + field. You can use any valid MongoDB query in the filter + document, except if the index includes an ascending or descending + index field as a prefix. + + If the index includes an ascending or descending index field, the + ``filter`` is required and the ``filter`` query must be an + equality match. :param document projection: @@ -298,6 +315,11 @@ cursor. Optional. Specify the maximum number of documents to include in the response. + :param string language: + + Optional. 
Specify the language that determines the tokenization, + stemming, and the stop words for the search. + :return: :dbcommand:`text` returns results in the form of a @@ -305,15 +327,19 @@ cursor. Size`. Use a projection setting to limit the size of the result set. - Unlike other queries in MongoDB, :dbcommand:`text` takes all the - words in a query an selects matching documents from the index using - a logical ``OR`` operator. However, consider the following - behaviors of :dbcommand:`text` queries: - - - MongoDB adds the words in phrase (i.e. queries enclosed in - quotation marks,) to the search and then adds the praise itself - to the query joined with a logical ``AND``. - + The implicit connector between the terms of a multi-term search is a + disjunction (``OR``). Search for ``"first second"`` searches + for ``"first"`` or ``"second"``. The scoring system will prefer + documents that contain all terms. + + However, consider the following behaviors of :dbcommand:`text` + queries: + + - With phrases (i.e. terms enclosed in escaped quotes), the search + performs an ``AND`` with any other terms in the search string; + e.g. search for ``"\"twinkle twinkle\" little star"`` searches for + ``"twinkle twinkle"`` and (``"little"`` or ``"star"``). + - :dbcommand:`text` adds all negations to the query with the logical ``AND`` operator. @@ -345,27 +371,17 @@ cursor. db.collection.runCommand( "text", { search: "create search fields" } ) - This query returns documents that contain the either ``creat`` - **or** ``search`` **or** ``field`` in the ``content`` field. The - :dbcommand:`text` command has stemmed the word ``create`` to - ``creat`` and the word ``fields`` to ``field`` for the search. + This query returns documents that contain either ``create`` + **or** ``search`` **or** ``field`` in the ``content`` field. #. Search for the exact phrase ``create search fields``: ..
code-block:: javascript - db.collection.runCommand( "text", { search: "\"create search fields\""" } ) + db.collection.runCommand( "text", { search: "\"create search fields\"" } ) This query returns documents that contain the exact phrase - ``create search fields`` **and** (``creat`` **or** ``search`` - **or** ``field``), all case-insensitive, in the ``content`` - field. - - .. note:: - - Exact phrase matches use the index for the initial selection - phase, but the query must scan the entire result document set - to fulfill the query. + ``create search fields``. #. Search for documents that contain the words ``create`` or ``search``, but **not** ``fields``: @@ -385,9 +401,9 @@ cursor. - A ```` that only contains negative words returns no match. - - A hyphenated word, such as ``case-insensitive``, is not a negated - word. The :dbcommand:`text` command ignores the hyphen and - searches for either ``case`` or the stem ``insensit``. + - A hyphenated word, such as ``case-insensitive``, is not a + negation. The :dbcommand:`text` command treats the hyphen + as a delimiter. #. Search for a single word ``search`` with an additional ``filter`` on the ``about`` field, but **limit** the results to 2 documents with the