GribStream Blog
ERA5 reanalysis actuals are arriving in the GribStream API
GribStream is onboarding ECMWF ERA5 single-level reanalysis as hourly valid-time actuals on a 0.25 degree global grid, launching with 2020 onward while deeper history remains staged.
GribStream is onboarding ECMWF ERA5 as a valid-time actuals dataset in the public API. Users can query ERA5 with the same timeseries workflow used for forecast datasets, but interpret each row as a reanalysis field at its valid time rather than as a forecast lead time.
This matters because ERA5 is not just another model feed. It is ECMWF's fifth-generation global reanalysis, produced for the Copernicus Climate Change Service, and the official ERA5 record reaches back to January 1940. ERA5 combines historical observations with a fixed ECMWF model and data-assimilation system to build a gridded record of the atmosphere, land surface, and related climate variables.
Why ERA5 Matters
Operational forecast archives are essential, but they are not a stable historical reference by themselves. Forecast systems change, physics packages move, data assimilation changes, and delivery formats evolve. If you train a model or verify a product across many years, those changes can become part of the measured signal.
ERA5 reduces that problem by reanalyzing history with a consistent system. Observations are blended with model physics through data assimilation to produce complete hourly global fields. It is still a reanalysis, not a station network, so observing-system changes and local representativeness still matter. But for gridded actuals, forecast verification, climate-aware backtests, and machine-learning labels, ERA5 is one of the most valuable public references available.
How GribStream Exposes It
In GribStream, ERA5 uses valid-time actuals semantics:
- forecasted_at equals forecasted_time
- horizon = 0
- member = 0
- coordinates, variables, aliases, CSV, JSON, and NDJSON work like the rest of the API
That makes ERA5 straightforward to join with IFS Oper, IFS ENS, AIFS Oper, and other forecast datasets. A verification workflow can ask for the forecast and the later ERA5 actual using the same coordinate and parameter selectors.
At the API level, that means ERA5 does not need a separate integration path. Existing coordinate lists, variable selectors, aliases, CSV, JSON, and NDJSON workflows can be reused for validation, labels, backtests, and dashboards.
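As a minimal sketch of that verification join, the snippet below pairs forecast rows with ERA5 rows that share a valid time and location, then computes the forecast error. It assumes the rows have already been fetched and parsed into dictionaries. The forecasted_at and forecasted_time fields follow the semantics described above, while the lat, lon, and temp keys, the function name, and the sample values are illustrative assumptions rather than the documented response schema.

```python
from datetime import datetime, timezone


def join_on_valid_time(forecast_rows, actual_rows, value_key):
    """Pair forecast rows with ERA5 actuals that share valid time and location."""
    # Index ERA5 rows by (valid time, lat, lon). Valid-time actuals have
    # forecasted_at equal to forecasted_time, so anything else is skipped.
    actuals = {
        (row["forecasted_time"], row["lat"], row["lon"]): row[value_key]
        for row in actual_rows
        if row["forecasted_at"] == row["forecasted_time"]
    }

    joined = []
    for row in forecast_rows:
        key = (row["forecasted_time"], row["lat"], row["lon"])
        if key not in actuals:
            continue  # no ERA5 actual for this valid time yet
        lead = row["forecasted_time"] - row["forecasted_at"]
        joined.append({
            "forecasted_at": row["forecasted_at"],
            "forecasted_time": row["forecasted_time"],
            "lead_hours": lead.total_seconds() / 3600,
            "forecast": row[value_key],
            "actual": actuals[key],
            "error": row[value_key] - actuals[key],
        })
    return joined


# Illustrative rows only; "lat", "lon", and "temp" are assumed column names.
t0 = datetime(2024, 1, 1, 0, tzinfo=timezone.utc)
t6 = datetime(2024, 1, 1, 6, tzinfo=timezone.utc)
forecast_rows = [
    {"forecasted_at": t0, "forecasted_time": t6, "lat": 52.5, "lon": 13.4, "temp": 275.0},
]
era5_rows = [
    {"forecasted_at": t6, "forecasted_time": t6, "lat": 52.5, "lon": 13.4, "temp": 274.0},
]

pairs = join_on_valid_time(forecast_rows, era5_rows, "temp")
for pair in pairs:
    print(pair["lead_hours"], pair["error"])
```

Keying the join on valid time rather than issue time is what lets a single ERA5 row verify every forecast run that targeted that hour.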
The practical reference is the public API and catalog. Use the catalog to see which ERA5 variables are currently available, and expect that list to grow as we validate more fields and customers ask for broader coverage.
Initial Core Variables
The first ERA5 release is intentionally focused. It targets a core set of high-value single-level fields on the CDS 0.25 degree global latitude/longitude grid:
- near-surface temperature and moisture: 2t, 2d
- wind: 10u, 10v, 100u, 100v
- pressure: sp, msl
- cloud and column water: tcc, tcw, tcwv
- precipitation and snow: tp, sf, sd
- radiation and surface state: ssrd, strd, skt
That set is enough for common actuals workflows: temperature error, wind verification, solar and surface-energy features, snow and precipitation labels, pressure fields, cloud screening, and water-vapor context.
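Several of those workflows start with small derivations from the raw fields. The sketch below assumes values arrive in ERA5's native units (kelvin for 2t, metres per second for the 10u and 10v wind components, metres of water equivalent for tp); check the catalog for the units actually served. The helper names are hypothetical.

```python
import math


def kelvin_to_celsius(kelvin):
    """Convert a temperature such as 2t from kelvin to degrees Celsius."""
    return kelvin - 273.15


def wind_speed(u, v):
    """Wind speed in m/s from eastward (u) and northward (v) components."""
    return math.hypot(u, v)


def wind_direction_from(u, v):
    """Meteorological direction in degrees: where the wind blows FROM.

    0 means from the north, 90 from the east, 180 from the south, 270 from the west.
    """
    return math.degrees(math.atan2(-u, -v)) % 360.0


def metres_to_millimetres(metres):
    """Convert a depth such as tp from metres to millimetres."""
    return metres * 1000.0


# Example: a 10 m wind with u = 3 m/s and v = 4 m/s.
speed = wind_speed(3.0, 4.0)
direction = wind_direction_from(3.0, 4.0)
print(f"{speed:.1f} m/s from {direction:.0f} degrees")
```

The same two wind helpers apply unchanged to the 100u and 100v components.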
Backfill Plan
The official ERA5 archive is too large to treat as a single launch event. GribStream is launching ERA5 with coverage from January 1, 2020 onward. Older years can be added later if customer demand justifies the extra storage and backfill cost, instead of delaying access until the full 1940-present record is mirrored.
That staged rollout is deliberate. The highest-value path is to make a reliable recent-history actuals archive available first, then extend deeper history and add variables over time.
Recent ERA5 data also deserves a small caveat. Copernicus publishes recent fields as ERA5T about five days behind real time, then later replaces them with final ERA5, typically around two to three months after the month in question. For the newest dates, treat ERA5 as near-real-time reanalysis until the final product has replaced ERA5T.
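One way to act on that caveat in a pipeline is to tag rows by how likely they are to still be ERA5T. The sketch below is a conservative date-based heuristic built only from the timings quoted above. It is not an authoritative flag from Copernicus or from the API, and the function name, status strings, and thresholds are assumptions.

```python
from datetime import datetime, timedelta, timezone

# Thresholds taken from the timings quoted in the text (assumed, not exact).
ERA5T_DELAY = timedelta(days=5)  # ERA5T trails real time by about five days
FINAL_AFTER_MONTHS = 3           # upper end of "two to three months"


def era5_status(valid_time, now):
    """Guess whether a valid time is unpublished, likely ERA5T, or likely final."""
    if now - valid_time < ERA5T_DELAY:
        return "not_yet_published"
    # Whole calendar months between the valid-time month and the current month.
    months_elapsed = (now.year - valid_time.year) * 12 + (now.month - valid_time.month)
    if months_elapsed <= FINAL_AFTER_MONTHS:
        return "likely_era5t"
    return "likely_final"


now = datetime(2024, 6, 15, tzinfo=timezone.utc)
for valid_time in (
    datetime(2024, 6, 13, tzinfo=timezone.utc),
    datetime(2024, 4, 1, tzinfo=timezone.utc),
    datetime(2024, 1, 10, tzinfo=timezone.utc),
):
    print(valid_time.date(), era5_status(valid_time, now))
```

Rows tagged as likely ERA5T can be refreshed later, once the final product has had time to replace them.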
Where It Fits
ERA5 is the GribStream choice when the question is "what happened on the grid?" rather than "what did a model think would happen?" It is especially useful for:
- forecast verification against ECMWF, NOAA, and AI model outputs
- ML labels for temperature, wind, precipitation, solar, snow, and cloud workflows
- bias correction and calibration pipelines
- weather-impact backtesting for energy, agriculture, logistics, insurance, and event operations
- joining actuals with forecast archives through the same API shape
The model page is live at ERA5, and the catalog shows the currently available archive window as backfilling progresses.
Sources
- ECMWF/Copernicus Knowledge Base, "ERA5: data documentation": https://confluence.ecmwf.int/spaces/CKB/pages/76414402/ERA5+data+documentation
- Copernicus Climate Data Store, "ERA5 hourly data on single levels from 1940 to present": https://cds.climate.copernicus.eu/datasets/reanalysis-era5-single-levels?tab=overview
- ECMWF, "ECMWF Reanalysis v5": https://www.ecmwf.int/en/forecasts/dataset/ecmwf-reanalysis-v5
