Commit 1607e6e

First draft of IEP 1.
1 parent e7ef21d commit 1607e6e

1 file changed

+138
-0
lines changed

docs/iris/src/IEP/IEP001.doc

Lines changed: 138 additions & 0 deletions
@@ -0,0 +1,138 @@
# IEP 1 - Enhanced indexing

## Background

Currently, to select a subset of a Cube based on coordinate values we use something like:
[source,python]
----
cube.extract(iris.Constraint(realization=3,
                             model_level_number=[1, 5],
                             latitude=lambda cell: 40 <= cell <= 60))
----
On the plus side, this works irrespective of the dimension order of the data, but the drawbacks with this form of indexing include:

* It uses a completely different syntax to position-based indexing, e.g. `cube[4, 0:6]`.
* It uses a completely different syntax to pandas and xarray value-based indexing, e.g. `df.loc[4, 0:6]`.
* It is long-winded.

Similarly, selecting a subset of a Cube using positional indices, where the position of the dimension is not known in advance, has no standard syntax _at all_! Instead, it requires code akin to:
[source,python]
----
# Select everything along every dimension ...
key = [slice(None)] * cube.ndim
# ... then restrict the dimension described by the named coordinate.
key[cube.coord_dims('model_level_number')[0]] = slice(3, 9, 2)
cube[tuple(key)]
----
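The key-building pattern above can be wrapped in a small helper. The following is a minimal sketch, not part of Iris: `positional_key` is a hypothetical name, and the dimension number is assumed to have been looked up already (for a real cube, via `cube.coord_dims(...)[0]` as in the snippet above).

```python
def positional_key(ndim, dim, index):
    """Build a full indexing key that applies `index` to dimension `dim`.

    Hypothetical helper for illustration only; it is not an Iris API.
    """
    if not 0 <= dim < ndim:
        raise ValueError(f"dimension {dim} is out of range for {ndim} dimensions")
    # Start by selecting everything along every dimension ...
    key = [slice(None)] * ndim
    # ... then narrow the one dimension of interest.
    key[dim] = index
    return tuple(key)


# For a 3-dimensional object whose levels lie along dimension 1:
key = positional_key(3, 1, slice(3, 9, 2))
print(key)
```

With a real cube, the resulting tuple would be passed straight to `cube[...]`, which is exactly the boilerplate the proposal aims to remove.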

The only form of indexing that is well supported is indexing by position where the dimension order is known:
[source,python]
----
cube[4, 0:6, 30:]
----

## Proposal

Provide indexing helpers on the Cube to extend support to all permutations of positional vs. named dimensions and positional vs. coordinate-value based selection.

### Extended pandas style

Use a single helper for indexing by position, and a single helper for indexing by value. The helper names are taken from pandas, but their behaviour is extended by making them callable to support named dimensions.

|===
2.2+| 2+h|Index by
h|Position h|Value

.2+h|Dimension
h|Position

a|[source,python]
----
cube[:, 2]  # No change
cube.iloc[:, 2]
----

a|[source,python]
----
cube.loc[:, 1.5]
----

h|Name

a|[source,python]
----
cube[dict(height=2)]
cube.iloc[dict(height=2)]
cube.iloc(height=2)
----

a|[source,python]
----
cube.loc[dict(height=1.5)]
cube.loc(height=1.5)
----
|===
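To illustrate how one helper could be both subscriptable and callable, here is a minimal sketch. Everything in it is an assumption made for illustration: `PositionIndexer` is a hypothetical class, dimension names are given as a plain list, and the `getitem` callback stands in for the Cube's existing positional indexing.

```python
class PositionIndexer:
    """Hypothetical `iloc`-like helper that is subscriptable *and* callable."""

    def __init__(self, dim_names, getitem):
        self._dim_names = list(dim_names)
        # Callback that performs plain positional indexing on the owner.
        self._getitem = getitem

    def _select(self, named):
        # Expand {name: index} into a full positional key.
        key = [slice(None)] * len(self._dim_names)
        for name, index in named.items():
            if name not in self._dim_names:
                raise KeyError(name)
            key[self._dim_names.index(name)] = index
        return self._getitem(tuple(key))

    def __call__(self, **named):
        # Supports the `iloc(height=2)` form.
        return self._select(named)

    def __getitem__(self, key):
        # Supports `iloc[dict(height=2)]` as well as `iloc[:, 2]`.
        if isinstance(key, dict):
            return self._select(key)
        return self._getitem(key)


# Stand-in for a cube: simply echo the positional key that was resolved.
iloc = PositionIndexer(["time", "height", "latitude"], getitem=lambda key: key)
by_position = iloc[:, 2]
by_dict = iloc[dict(height=2)]
by_keyword = iloc(height=2)
```

In this sketch the dictionary and keyword forms resolve to the same positional key, and the dictionary form also accepts names that cannot be written as Python keyword arguments.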

### xarray style

xarray introduces a second set of helpers for accessing named dimensions that provide the callable syntax `(foo=...)`.

|===
2.2+| 2+h|Index by
h|Position h|Value

.2+h|Dimension
h|Position

a|[source,python]
----
cube[:, 2]  # No change
----

a|[source,python]
----
cube.loc[:, 1.5]
----

h|Name

a|[source,python]
----
cube[dict(height=2)]
cube.isel(height=2)
----

a|[source,python]
----
cube.loc[dict(height=1.5)]
cube.sel(height=1.5)
----
|===
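One way to think about the value-based column is as a translation step in front of the position-based one: `sel(height=1.5)` becomes `isel(height=2)` once the value has been located. The sketch below assumes one-dimensional coordinates held as plain lists and exact matching; `value_to_position` and `sel_to_isel` are hypothetical names, not Iris or xarray functions.

```python
def value_to_position(points, value):
    """Return the position of `value` within `points`, or raise KeyError."""
    try:
        return list(points).index(value)
    except ValueError:
        # Assumption: a missing value is an error, as it is for pandas.
        raise KeyError(value) from None


def sel_to_isel(coords, **values):
    """Translate value-based selections into position-based ones.

    `coords` maps a dimension name to its 1-D coordinate points.
    """
    return {name: value_to_position(coords[name], value)
            for name, value in values.items()}


coords = {"height": [0.5, 1.0, 1.5, 2.0], "realization": [1, 2, 3]}
positions = sel_to_isel(coords, height=1.5)
```

Here `positions` is `{"height": 2}`, which could then be handed to a position-based helper. Whether a missing value should raise or fall back to a nearest-neighbour match is one of the open questions for this proposal.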

### TODO
* Consistent terminology
* `coord.name()` vs. `var_name` vs. "dimension name"?
* Names that aren't valid Python identifiers
* Inclusive vs. exclusive
** Default: Inclusive? (as for pandas & xarray)
** Use boolean otherwise.
* Multi-dimensional coordinates
* Non-orthogonal coordinates
* Bounds
* Boolean array indexing
* Lambdas?
* What to do about constrained loading?
* Relationship to http://scitools.org.uk/iris/docs/v1.9.2/iris/iris/cube.html#iris.cube.Cube.intersection[iris.cube.Cube.intersection]?
* Relationship to interpolation (especially nearest-neighbour)?
** e.g. What to do about values that don't exist?
*** pandas raises a `KeyError`
*** xarray supports (several) nearest-neighbour schemes via http://xarray.pydata.org/en/stable/indexing.html#nearest-neighbor-lookups[`data.sel()`]
*** Apparently http://holoviews.org/[holoviews] does nearest-neighbour interpolation.
* Time handling
** e.g. Rich Signell's http://nbviewer.jupyter.org/gist/rsignell-usgs/13d7ce9d95fddb4983d4cbf98be6c71d[xarray/iris comparison]
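
The "inclusive vs. exclusive" item can be made concrete with a small sketch. It assumes a one-dimensional coordinate whose points are sorted in ascending order; `value_slice` and its `inclusive` flag are hypothetical and are only one possible reading of "use boolean otherwise".

```python
from bisect import bisect_left, bisect_right


def value_slice(points, start, stop, inclusive=True):
    """Convert a value range into a positional slice over sorted `points`."""
    # First position whose point is >= start.
    lower = bisect_left(points, start)
    if inclusive:
        # One past the last position whose point is <= stop.
        upper = bisect_right(points, stop)
    else:
        # One past the last position whose point is < stop.
        upper = bisect_left(points, stop)
    return slice(lower, upper)


levels = [1, 2, 3, 4, 5, 6]
inclusive = value_slice(levels, 2, 5)
exclusive = value_slice(levels, 2, 5, inclusive=False)
```

With these inputs, `levels[inclusive]` keeps the end point 5 and `levels[exclusive]` drops it, which is the difference between label-based slicing in pandas and ordinary Python slicing.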

## References
. Iris
* http://scitools.org.uk/iris/docs/v1.9.2/iris/iris.html#iris.Constraint[iris.Constraint]
* http://scitools.org.uk/iris/docs/v1.9.2/userguide/subsetting_a_cube.html[Subsetting a cube]
. http://pandas.pydata.org/pandas-docs/stable/indexing.html[pandas indexing]
. http://xarray.pydata.org/en/stable/indexing.html[xarray indexing]
. http://legacy.python.org/dev/peps/pep-0472/[PEP 472 - Support for indexing with keyword arguments]
