Commit 48121d8

author
Christoph Büscher
committed
Add ERR to ranking evaluation documentation
This change adds a section about the Expected Reciprocal Rank metric (ERR) to the Ranking Evaluation documentation.
1 parent d07b4ec commit 48121d8

1 file changed

+50
-0
lines changed

docs/reference/search/rank-eval.asciidoc

Lines changed: 50 additions & 0 deletions
@@ -259,6 +259,56 @@ in the query. Defaults to 10.
259259
|`normalize` | If set to `true`, this metric will calculate the https://en.wikipedia.org/wiki/Discounted_cumulative_gain#Normalized_DCG[Normalized DCG].
260260
|=======================================================================
261261

262+
[float]
263+
==== Expected Reciprocal Rank (ERR)
264+
265+
Expected Reciprocal Rank (ERR) is an extension of the classical reciprocal rank for the graded relevance case
266+
(Chapelle, Olivier, Donald Metzler, Ya Zhang, and Pierre Grinspan. 2009.
267+
http://olivier.chapelle.cc/pub/err.pdf[Expected reciprocal rank for graded relevance].)
268+
269+
It is based on the assumption of a cascade model of search, in which a user scans through ranked search
270+
results in order and stops at the first document that satisfies the information need. For this reason, it
271+
is a good metric for question answering and navigation queries, but is less suited to survey-oriented information needs
272+
where the user is interested in finding several relevant documents in the top k results.
273+
274+
The metric models the expected reciprocal of the position of the result at which a user stops.
275+
This means that a relevant document in a top ranking position contributes much to the overall ERR score. The same
276+
document contributes much less to the score at a lower rank, and less still if there are some
277+
relevant documents preceding it. In this way, ERR discounts documents that are shown below very relevant documents
278+
and introduces some kind of dependency in the ordering of relevant documents.
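
The discounting described above can be illustrated with a small calculation. The following is a minimal sketch in Python that follows the ERR formula from the cited paper by Chapelle et al., not the Elasticsearch source code: a document with grade `g` is assumed to satisfy the user with probability `(2^g - 1) / 2^maximum_relevance`, and each rank contributes that probability, weighted by the chance that the user got that far, divided by the rank. The function name and its arguments are hypothetical and exist only for this illustration.

```python
from typing import Sequence


def expected_reciprocal_rank(grades: Sequence[int], maximum_relevance: int) -> float:
    """ERR for a ranked list of relevance grades, best-ranked document first.

    Illustrative only: follows the formula in Chapelle et al. (2009).
    """
    err = 0.0
    p_reach = 1.0  # probability that the user gets as far as the current rank
    for rank, grade in enumerate(grades, start=1):
        # Probability that a document with this grade satisfies the user.
        p_satisfied = (2 ** grade - 1) / 2 ** maximum_relevance
        # The user stops here with probability p_reach * p_satisfied.
        err += p_reach * p_satisfied / rank
        # Otherwise the user continues to the next rank.
        p_reach *= 1.0 - p_satisfied
    return err


# A grade-3 document at rank 2, preceded by a non-relevant document:
after_irrelevant = expected_reciprocal_rank([0, 3], maximum_relevance=3)
# The same document at rank 2, preceded by another grade-3 document:
after_relevant = expected_reciprocal_rank([3, 3], maximum_relevance=3)
first_alone = expected_reciprocal_rank([3], maximum_relevance=3)
print(after_irrelevant, after_relevant - first_alone)
```

In this sketch the grade-3 document at rank 2 adds 0.4375 to the score when it follows a non-relevant document, but only about 0.055 when it follows another grade-3 document, because the user has most likely already stopped at rank 1.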
279+
[source,js]
--------------------------------
GET /twitter/_rank_eval
{
    "requests": [
    {
        "id": "JFK query",
        "request": { "query": { "match_all": {}}},
        "ratings": []
    }],
    "metric": {
        "expected_reciprocal_rank": {
            "maximum_relevance" : 3,
            "k" : 20
        }
    }
}
--------------------------------
// CONSOLE
// TEST[setup:twitter]
300+
301+
The `expected_reciprocal_rank` metric takes the following parameters:
302+
303+
[cols="<,<",options="header",]
304+
|=======================================================================
305+
|Parameter |Description
306+
| `maximum_relevance` | Mandatory parameter. The highest relevance grade used in the user-supplied
307+
relevance judgments.
308+
|`k` | Sets the maximum number of documents retrieved per query. This value will act in place of the usual `size` parameter
309+
in the query. Defaults to 10.
310+
|=======================================================================
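
Because `maximum_relevance` has to agree with the grades used in the ratings, it can be convenient to build the request body programmatically. The following Python sketch is one way to do that. The helper `build_err_request` and the document ids are hypothetical, and taking the highest observed grade as `maximum_relevance` is an assumption: if the grading scale goes higher than any grade actually assigned, the top of the scale should be passed instead.

```python
import json


def build_err_request(request_id, query, ratings, k=10):
    """Build a _rank_eval request body that uses the ERR metric.

    Hypothetical helper: derives maximum_relevance from the supplied ratings.
    """
    # Assumption: the highest grade seen in the ratings is the top of the scale.
    max_grade = max((r["rating"] for r in ratings), default=0)
    return {
        "requests": [
            {"id": request_id, "request": {"query": query}, "ratings": ratings}
        ],
        "metric": {
            "expected_reciprocal_rank": {"maximum_relevance": max_grade, "k": k}
        },
    }


# Example ratings with made-up document ids.
ratings = [
    {"_index": "twitter", "_id": "1", "rating": 3},
    {"_index": "twitter", "_id": "2", "rating": 1},
]
body = build_err_request("JFK query", {"match_all": {}}, ratings, k=20)
print(json.dumps(body, indent=2))
```

The printed JSON can be sent as the body of `GET /twitter/_rank_eval`.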
311+
262312
[float]
263313
=== Response format
264314
