Conversation

@glevava
Member

@glevava glevava commented Nov 7, 2025

Hi @lukaszlacinski I propose some minor changes in the convert2stac method according to the CMIP6 JSON schema.

"stac_extensions": [
#"https://stac-extensions.github.io/cmip6/v3.0.0/schema.json",
"https://esgf.github.io/stac-transaction-api/cmip6/v1.0.0/schema.json",
"https://stac-extensions.github.io/cmip6/v3.0.0/schema.json",
Collaborator

Have we tested/confirmed if this schema is working with our data?

Member Author

@glevava glevava Nov 10, 2025

@sturoscy-personal Not sure it was ever formally confirmed.
What I can say is that this is still the version currently applied in the latest CMIP6 schema we released a few days ago:
https://esgvoc.ipsl.fr/api/v1/apps/jsg/cmip6

This latest schema works correctly and is fully compatible with the test payloads generated using this PR.

I believe the updated schema should also be pushed to the STAC CMIP6 extension repository.
Would you like us to open a PR there as well?

Collaborator

@lukaszlacinski lukaszlacinski Nov 10, 2025

I used the PR to regenerate 88k STAC Items and validated each of them with stac-validator using the new schema at https://esgvoc.ipsl.fr/api/v1/apps/jsg/cmip6. 71% of the Items validated successfully; 29% failed.
Here are the first 10 Items that failed validation:

  • esgfng-payloads/CMIP6.AerChemMIP.NOAA-GFDL.GFDL-ESM4.hist-piAer.r1i1p1f1.AERmon.wetnoy.gr1.v20180701_eagle.alcf.anl.gov.json
    "error_message": "'Wet Deposition Rate of NOy including Aerosol Nitrate' is not one of <snap>. Error is in properties -> cmip6:variable_long_name "
  • esgfng-payloads/CMIP6.C4MIP.MIROC.MIROC-ES2L.1pctCO2Ndep-bgc.r1i1p1f2.Amon.rsds.gn.v20191129_eagle.alcf.anl.gov.json
    "error_message": "'Surface Downwelling Shortwave Radiation' is not one of <snap>. Error is in properties -> cmip6:variable_long_name "
  • esgfng-payloads/CMIP6.CMIP.CAS.FGOALS-g3.1pctCO2.r2i1p1f1.day.psl.gn.v20191223_eagle.alcf.anl.gov.json
    "error_message": "'air_pressure_at_mean_sea_level' is not one of <snap>. Error is in properties -> cmip6:variable_cf_standard_name "
  • esgfng-payloads/CMIP6.CFMIP.CNRM-CERFACS.CNRM-CM6-1.amip-4xCO2.r1i1p1f2.AERmon.rsutcsaf.gr.v20190820_eagle.alcf.anl.gov.json
    "error_message": "'toa outgoing clear-sky shortwave radiation' is not one of <snap>. Error is in properties -> cmip6:variable_long_name "
  • esgfng-payloads/CMIP6.AerChemMIP.NIMS-KMA.UKESM1-0-LL.ssp370-lowNTCF.r1i1p1f2.Amon.rsds.gn.v20201020_eagle.alcf.anl.gov.json
    "error_message": "'Surface Downwelling Shortwave Radiation' is not one of <snap>. Error is in properties -> cmip6:variable_long_name "
  • esgfng-payloads/CMIP6.C4MIP.NASA-GISS.GISS-E2-1-G.1pctCO2-rad.r101i1p1f1.Emon.cSoilTree.gn.v20190815_eagle.alcf.anl.gov.json
    "error_message": "'soil_carbon_content' is not one of <snap>. Error is in properties -> cmip6:variable_cf_standard_name "
  • esgfng-payloads/CMIP6.AerChemMIP.NOAA-GFDL.GFDL-ESM4.ssp370pdSST.r1i1p1f1.Emon.vegHeight.gr1.v20180701_eagle.alcf.anl.gov.json
    "error_message": "'canopy height' is not one of <snap>. Error is in properties -> cmip6:variable_long_name "
  • esgfng-payloads/CMIP6.C4MIP.NASA-GISS.GISS-E2-1-G-CC.ssp585-bgc.r1i1p1f1.Omon.fbddtalk.gn.v20190815_eagle.alcf.anl.gov.json
    "error_message": "'ocnBgChem' is not one of ['seaIce', 'ocean', 'aerosol', 'land', 'landIce', 'atmos', 'ocnBgchem', 'atmosChem']. Error is in properties -> cmip6:realm -> 0 "
  • esgfng-payloads/CMIP6.AerChemMIP.NOAA-GFDL.GFDL-ESM4.piClim-2xdust.r1i1p1f1.AERmon.pan.gr1.v20180701_eagle.alcf.anl.gov.json
    "error_message": "'PAN volume mixing ratio' is not one of <snap>. Error is in properties -> cmip6:variable_long_name "
  • esgfng-payloads/CMIP6.CFMIP.IPSL.IPSL-CM6A-LR.amip-p4K-lwoff.r1i1p1f1.Amon.pfull.gr.v20180928_eagle.alcf.anl.gov.json
    "error_message": "'Pressure on Model Levels' is not one of <snap>. Error is in properties -> cmip6:variable_long_name "

It appears that 3 properties defined as enum in the new schema are missing many values. I checked the first 1,000 failing Items and the errors are distributed across the 3 properties as follows:

824 "Error is in properties -> cmip6:variable_long_name "
140 "Error is in properties -> cmip6:variable_cf_standard_name "
 36 "Error is in properties -> cmip6:realm -> 0 "
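
For anyone who wants to reproduce a breakdown like the one above, here is a minimal sketch that tallies failures by the "Error is in ..." suffix of each error message. It assumes the messages have already been collected into a list of strings; the helper names `error_location` and `tally_locations` are hypothetical, and the sample messages are copied from the failures listed earlier in this comment.

```python
import re
from collections import Counter

# Matches the trailing "Error is in <path>" part of an error message,
# in the format shown in the failures listed above.
_LOCATION = re.compile(r"Error is in (?P<path>.+?)\s*$")


def error_location(message):
    """Return the property path named in the message, or 'unknown'."""
    match = _LOCATION.search(message)
    return match.group("path") if match else "unknown"


def tally_locations(messages):
    """Count how many messages point at each property path."""
    return Counter(error_location(m) for m in messages)


# Sample messages copied from the report above (not the full data set).
messages = [
    "'canopy height' is not one of <snap>. Error is in properties -> cmip6:variable_long_name ",
    "'soil_carbon_content' is not one of <snap>. Error is in properties -> cmip6:variable_cf_standard_name ",
    "'ocnBgChem' is not one of ['seaIce', 'ocean', 'aerosol', 'land', 'landIce', 'atmos', 'ocnBgchem', 'atmosChem']. Error is in properties -> cmip6:realm -> 0 ",
    "'Pressure on Model Levels' is not one of <snap>. Error is in properties -> cmip6:variable_long_name ",
]

for location, count in tally_locations(messages).most_common():
    print(count, location)
```

Messages that do not end with an "Error is in ..." suffix are counted under `unknown` rather than dropped, so the totals still add up to the number of failures.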

I guess the schema, https://esgvoc.ipsl.fr/api/v1/apps/jsg/cmip6, was generated based on a small subset of CMIP6 data.
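
One of the failures above, 'ocnBgChem' against an enum that contains 'ocnBgchem', differs only in letter case, so not every failure has to be a value that is absent from the schema. A minimal sketch for separating the two cases is below. The function name `classify_against_enum` is hypothetical, and the enum list is the one printed in the `cmip6:realm` error message; it does not say which spelling is the correct one.

```python
def classify_against_enum(value, allowed):
    """Classify a value as 'ok', 'case_mismatch', or 'missing' relative to an enum."""
    if value in allowed:
        return ("ok", value)
    # Index the allowed values by their case-folded form to spot case-only differences.
    by_folded = {a.casefold(): a for a in allowed}
    match = by_folded.get(value.casefold())
    if match is not None:
        return ("case_mismatch", match)
    return ("missing", None)


# Enum copied from the cmip6:realm error message above.
REALM_ENUM = ['seaIce', 'ocean', 'aerosol', 'land', 'landIce', 'atmos', 'ocnBgchem', 'atmosChem']

print(classify_against_enum("ocnBgChem", REALM_ENUM))  # ('case_mismatch', 'ocnBgchem')
print(classify_against_enum("ocean", REALM_ENUM))      # ('ok', 'ocean')
```

Running the same check over the `cmip6:variable_long_name` and `cmip6:variable_cf_standard_name` failures would show how many are case-only differences and how many are genuinely missing from the enum.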

@sturoscy-personal sturoscy-personal changed the title minor changes in convert2stac method Minor changes in convert2stac method Nov 10, 2025
3 participants