-
Notifications
You must be signed in to change notification settings - Fork 229
Description
Description of the desired feature
This is a parallel issue to #578 and #1442. While researching how to read GeoTIFFs into an xarray.DataArray
, specifically GeoTIFFs with RGB bands stored in a single band but with a color lookup table, I discovered that there wasn't really a straightforward solution 😅 So this issue is mainly to jot down some notes.
1) Reading GeoTiff into xarray.DataArray
Method A: rioxarray.open_rasterio
Pros: Reads in data with proper dtype (e.g. uint8) and proper spatial_ref
Cons: rioxarray
is a bit of a heavy dependency as it requires scipy
(edit: may be resolved with corteva/rioxarray#413, edit2: rioxarray=0.8.0
has made scipy
optional!)
import pygmt
import rioxarray
filename_or_obj = pygmt.which(fname="@earth_day_01d", download="a")
with rioxarray.open_rasterio(filename_or_obj) as dataarray:
print(dataarray)
gives
<xarray.DataArray (band: 1, y: 180, x: 360)>
array([[[251, 251, ..., 251, 251],
[251, 252, ..., 251, 252],
...,
[ 68, 16, ..., 68, 217],
[149, 149, ..., 149, 16]]], dtype=uint8)
Coordinates:
* band (band) int64 1
* x (x) float64 -179.5 -178.5 -177.5 -176.5 ... 177.5 178.5 179.5
* y (y) float64 89.5 88.5 87.5 86.5 ... -86.5 -87.5 -88.5 -89.5
spatial_ref int64 0
Attributes:
scale_factor: 1.0
add_offset: 0.0
Method B: xr.open_dataarray(..., engine="rasterio")
Pros: Requires only Easier to put into rasterio
(a Pythonic GDAL wrapper) to be installed, which keeps things lightweightpygmt.load_dataarray
as it only requires the 'engine' argument
Cons: Seems to read in data as float32 by default
Note: There was an xr.open_rasterio
function in xarray v0.19.0, but it is pending deprecation, see pydata/xarray#4697, and removal PR at pydata/xarray#5808.
with xr.open_dataarray(filename_or_obj) as dataarray:
print(dataarray)
produces
<xarray.DataArray 'band_data' (band: 1, y: 180, x: 360)>
array([[[251., 251., ..., 251., 251.],
[251., 252., ..., 251., 252.],
...,
[ 68., 16., ..., 68., 217.],
[149., 149., ..., 149., 16.]]], dtype=float32)
Coordinates:
* band (band) int64 1
* x (x) float64 -179.5 -178.5 -177.5 -176.5 ... 177.5 178.5 179.5
* y (y) float64 89.5 88.5 87.5 86.5 ... -86.5 -87.5 -88.5 -89.5
spatial_ref int64 ...
TLDR My preference is to use (A) rioxarray.open_rasterio
because it reads in data properly as uint8, with a fallback to (B) xr.open_dataarray(..., engine=rasterio)
if only rasterio
is installed.
2) Reading RGB colors from a GeoTIFF
Here's where things get a bit tricky. Neither rioxarray.open_rasterio
nor xarray.open_dataarray
reads in GeoTIFFs with color interpretation, i.e. RGB colors stored in a GeoTIFF will be read as a single band rather than 3 separate bands! This includes the @earth_day_01d
and @earth_night_01d
imagery. To be clear, non-color GeoTIFFs (e.g. DEMs, single band Landsat/Sentinel-2 images) are fine.
Good news is that the color interpretation mechanism is present in rasterio
see https://github.com/mapbox/rasterio/blob/1.2.8/docs/topics/color.rst and rasterio/rasterio#100. This is wrapping around the GDAL Color Table stuff (https://gdal.org/user/raster_data_model.html#color-table)
Bad news is that we'll need to have that mechanism in rioxarray
or xarray
in order to use it (unless we want to go with a pure rasterio
route). There is an issue open though at corteva/rioxarray#191, so it might be a matter of helping to open a Pull Request in those libraries to implement the feature.
3) Implementing GMT_IMAGE in PyGMT
So once we've solved (1) and (2) above, the next step would be, how to pass that 3-band RGB color xarray.DataArray
into GMT? Max mentioned at #578 (comment) that we might need to use the GMT_IMAGE
resource, but that hasn't been added into pygmt.clib as of PyGMT v0.4.1.
Probably need something like a virtualfile_from_image
function, similar to virtualfile_from_grid
(done in #159) at https://github.com/GenericMappingTools/pygmt/blame/v0.4.1/pygmt/clib/session.py#L1278
Are you willing to help implement and maintain this feature? Something to do for PyGMT v0.6.0 I think.
The steps I think would be to:
- Add
rioxarray
and/orrasterio
as an optional dependency for PyGMT (added in Add Figure.tilemap to plot XYZ tile maps #2394) - Contribute to
rioxarray
to get the color interpretation working (Read colormap and colorinterp from GeoTIFF as an attribute corteva/rioxarray#414) - Modify
pygmt.load_dataarray
function to handle reading in RGB GeoTIFFs - Implement
GMT_IMAGE
in PyGMT (Wrap GMT's standard data type GMT_IMAGE for images #3338, GMT_IMAGE: Implement the GMT_IMAGE.to_dataarray method for 3-band images #3128) - maybe some other stuff too
References:
- GDAL Raster Data Model https://gdal.org/user/raster_data_model.html#color-table
- OGC GeoTIFF Standard https://www.ogc.org/standards/geotiff
- TIFF standard, see Section 5: Palette-color Images https://www.adobe.io/content/dam/udp/en/open/standards/tiff/TIFF6.pdf