Provides access to, and downloads from, the Copernicus Marine Data Store using the R language. This package has been developed primarily around …
- GLOBAL_MULTIYEAR_PHY_001_030 31 Dec 1992 to a month or more lag
- GLOBAL_ANALYSISFORECAST_PHY_001_024 31 Oct 2020 to 9 days from present (forecast)
With no or minimal modification it can work with other products.
In 2023/2024 Copernicus migrated to a new service model; learn more here. This migration introduced the Copernicus Marine Toolbox which provides a Python API and a command line interface (CLI). This package leverages the latter.
The toolbox is under active development, so if you run into trouble (as we sometimes do!), try re-installing it. We have some notes here.
Copernicus serves so many data resources that finding what you want can be a challenge. Check out the new Marine Data Store, and check out the listing here.
Like data offerings from OBPG, Copernicus strives to provide consistent dataset identifiers that are easily decoded programmatically (and, with practice, by eye). To download programmatically you must have the datasetID in hand. Learn more about Copernicus nomenclature rules here. We leverage the data catalogs, which link a product to one or more of its datasets. Did you catch that? Copernicus distributes products, each of which comes as one or more datasets. There’s more on that later.
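As a small illustration of decoding an identifier programmatically, the sketch below splits a dataset ID on underscores using base R. The ID is one that appears later in this README. The field labels (origin, group, area, and so on) are informal, assumed names chosen for illustration; they are not taken from the official nomenclature, so check the Copernicus rules before relying on them.

```r
# Split a Copernicus dataset ID into its underscore-delimited pieces.
# NOTE: the labels are assumed, informal names for illustration only.
decode_dataset_id <- function(dataset_id) {
  parts <- strsplit(dataset_id, "_", fixed = TRUE)[[1]]
  labels <- c("origin", "group", "area", "thematic", "type",
              "resolution", "temporal")
  # Label only as many pieces as we have names for; IDs can vary in length.
  n <- min(length(parts), length(labels))
  nm <- paste0("part", seq_along(parts))
  nm[seq_len(n)] <- labels[seq_len(n)]
  stats::setNames(parts, nm)
}

id <- "cmems_mod_glo_bgc-bio_anfc_0.25deg_P1D-m"
parts <- decode_dataset_id(id)
print(parts)
```

The final piece of this particular ID, P1D-m, lines up with the "daily mean" wording in the dataset name reported by the product catalog.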
The Copernicus Marine Toolbox command-line application, copernicusmarine, provides two primary methods for downloading data: get and subset. The get method is not well documented, but subset does what its name implies: it subsets resources by variable, spatial bounding box, depth, and time. This package supports only subset.
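To make the subsetting idea concrete, here is a minimal sketch of how the arguments for a subset request might be assembled in base R. The helper sketch_subset_args() is hypothetical and is not part of this package. The flag names follow the toolbox documentation as we understand it and may change between toolbox versions, so confirm them with `copernicusmarine subset --help`.

```r
# Hypothetical helper (NOT part of this package): assemble the command-line
# arguments for `copernicusmarine subset`. Flag names are assumed from the
# toolbox documentation; verify with `copernicusmarine subset --help`.
sketch_subset_args <- function(dataset_id, vars, bb, depth, time, ofile) {
  c("subset",
    "--dataset-id", dataset_id,
    # interleave one --variable flag with each requested variable
    as.vector(rbind("--variable", vars)),
    "--minimum-longitude", bb[["xmin"]],
    "--maximum-longitude", bb[["xmax"]],
    "--minimum-latitude",  bb[["ymin"]],
    "--maximum-latitude",  bb[["ymax"]],
    "--minimum-depth", depth[1],
    "--maximum-depth", depth[2],
    "--start-datetime", format(time[1], "%Y-%m-%d"),
    "--end-datetime",   format(time[2], "%Y-%m-%d"),
    "--output-directory", dirname(ofile),
    "--output-filename",  basename(ofile))
}

args <- sketch_subset_args(
  dataset_id = "cmems_mod_glo_phy-cur_anfc_0.083deg_P1D-m",
  vars  = c("uo", "vo"),
  bb    = c(xmin = -72, xmax = -63, ymin = 39, ymax = 46),
  depth = c(0, 1),
  time  = as.Date(c("2025-05-12", "2025-05-21")),
  ofile = file.path(tempdir(), "example.nc"))

# Show the command that would be run; nothing is downloaded here.
cat("copernicusmarine", args, "\n")
# To actually run it (requires the toolbox and credentials):
# system2("copernicusmarine", args)
```

Building the arguments as a character vector, rather than one pasted string, keeps each value separate when it is handed to system2().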
remotes::install_github("BigelowLab/copernicus")
You can preconfigure a credentials file (required) and a path definition file (optional) to streamline accessing and storing data.
You must have credentials to access Copernicus holdings - if you don’t have them now please request access here. Go straight to it here.
If you plan to use our directory-driven database storage system then you should set the root path for the data directory; keep in mind that you can always change or override it. We don’t actually run this in the README, but you can copy-and-paste to use in R. Again, replace the path with one suiting your own situation.
copernicus::set_root_path("/the/path/to/copernicus/data")
Once it is set, you shouldn’t have to set it again (unless you want to change the path).
If you are using R within an RStudio session, you may encounter issues where system() and system2() can’t find the copernicusmarine application. This is not the case when you run R outside of the RStudio context. Technically, this is an environment path issue, which you can remedy by providing the full path specification for the app argument to the function build_cli_subset(). By default, app = 'copernicusmarine', but you may need to include the full path specification. We provide a mechanism for storing this path in a configuration file once, and then it will work without issue in subsequent sessions of R. Here’s how we set ours.
First determine the app path in the terminal session (outside of RStudio context).
$ which copernicusmarine
/opt/copernicus/bin/copernicusmarine
Then set the path.
copernicus::set_copernicus_app("/opt/copernicus/bin/copernicusmarine")
This is optional (but worth it if you operate within RStudio). You can retrieve the application path with get_copernicus_app(), which defaults to copernicusmarine if you didn’t set the path.
copernicus::get_copernicus_app()
## [1] "/opt/copernicus/bin/copernicusmarine"
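If you are unsure whether your current R session can see the application at all, a quick check with base R's Sys.which() may help. This is a minimal sketch; find_app() is a hypothetical helper and not part of this package.

```r
# Hypothetical helper: look up an executable on the PATH visible to the
# current R session. Returns the full path, or NA if it cannot be found.
find_app <- function(app = "copernicusmarine") {
  path <- unname(Sys.which(app))
  if (is.na(path) || !nzchar(path)) NA_character_ else path
}

app_path <- find_app()
if (is.na(app_path)) {
  message("copernicusmarine was not found on this session's PATH; ",
          "consider storing the full path with set_copernicus_app()")
} else {
  message("found: ", app_path)
}
```

Running this once inside RStudio and once in a plain terminal R session is a simple way to confirm that the difference really is the search path.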
You can download a product catalog for local storage.
suppressPackageStartupMessages({
library(copernicus)
library(stars)
})
ok = copernicus::fetch_product_catalog(product_id = "GLOBAL_ANALYSISFORECAST_BGC_001_028")
This downloads into a “catalogs” directory within your data directory. Now read it in.
x = copernicus::read_product_catalog(product_id = "GLOBAL_ANALYSISFORECAST_BGC_001_028")
x
## # A tibble: 31 × 7
## product_id title dataset_id dataset_name short_name standard_name units
## <chr> <chr> <chr> <chr> <chr> <chr> <chr>
## 1 GLOBAL_ANALYSIS… Glob… cmems_mod… daily mean … nppv net_primary_… mg m…
## 2 GLOBAL_ANALYSIS… Glob… cmems_mod… daily mean … o2 mole_concent… mmol…
## 3 GLOBAL_ANALYSIS… Glob… cmems_mod… Monthly mea… nppv net_primary_… mg m…
## 4 GLOBAL_ANALYSIS… Glob… cmems_mod… Monthly mea… o2 mole_concent… mmol…
## 5 GLOBAL_ANALYSIS… Glob… cmems_mod… daily mean … dissic mole_concent… mol …
## 6 GLOBAL_ANALYSIS… Glob… cmems_mod… daily mean … ph sea_water_ph… 1
## 7 GLOBAL_ANALYSIS… Glob… cmems_mod… daily mean … talk sea_water_al… mol …
## 8 GLOBAL_ANALYSIS… Glob… cmems_mod… Monthly mea… dissic mole_concent… mol …
## 9 GLOBAL_ANALYSIS… Glob… cmems_mod… Monthly mea… ph sea_water_ph… 1
## 10 GLOBAL_ANALYSIS… Glob… cmems_mod… Monthly mea… talk sea_water_al… mol …
## # ℹ 21 more rows
By default this provides a flattened table of available datasets along with the variables for each (if any). Here’s the first dataset (a constituent of the product suite).
dplyr::filter(x, dataset_id %in% "cmems_mod_glo_bgc-bio_anfc_0.25deg_P1D-m") |>
dplyr::glimpse()
## Rows: 2
## Columns: 7
## $ product_id <chr> "GLOBAL_ANALYSISFORECAST_BGC_001_028", "GLOBAL_ANALYSISF…
## $ title <chr> "Global Ocean Biogeochemistry Analysis and Forecast", "G…
## $ dataset_id <chr> "cmems_mod_glo_bgc-bio_anfc_0.25deg_P1D-m", "cmems_mod_g…
## $ dataset_name <chr> "daily mean fields from Global Ocean Biogeochemistry Ana…
## $ short_name <chr> "nppv", "o2"
## $ standard_name <chr> "net_primary_production_of_biomass_expressed_as_carbon_p…
## $ units <chr> "mg m-3 day-1", "mmol m-3"
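One practical use of the catalog is looking up which variable short names belong to a dataset before building a request. The sketch below uses a small stand-in data frame containing only values visible in the output above, so it runs without downloading anything; dataset_vars() is a hypothetical helper, not a function from this package.

```r
# Stand-in for the catalog table, using only values shown in the output above.
catalog <- data.frame(
  dataset_id = rep("cmems_mod_glo_bgc-bio_anfc_0.25deg_P1D-m", 2),
  short_name = c("nppv", "o2"),
  units      = c("mg m-3 day-1", "mmol m-3"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: list the variable short names for one dataset.
dataset_vars <- function(catalog, id) {
  catalog$short_name[catalog$dataset_id %in% id]
}

vars <- dataset_vars(catalog, "cmems_mod_glo_bgc-bio_anfc_0.25deg_P1D-m")
print(vars)
```

The same subsetting should work on the tibble returned by read_product_catalog(), assuming it keeps the dataset_id and short_name columns shown above.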
CMEMS offers many products - which ones have we set up for this package? (Note, it’s subject to change).
GLOBAL_MULTIYEAR_PHY_001_030 31 Dec 1992 to a month or more lag
GLOBAL_ANALYSISFORECAST_PHY_001_024 31 Oct 2020 to 9 days from present (forecast)
GLOBAL_ANALYSISFORECAST_BGC_001_028 2021-10-01 to 9 days from present
To fetch data we’ll focus on ocean physics daily forecast which serves daily mean sea surface currents. We’ll define a date range and the bounding box that covers the Gulf of Maine (gom), and we’ll confine the request to just the surface data.
suppressPackageStartupMessages({
library(stars)
library(copernicus)
library(dplyr)
})
product_id = "GLOBAL_ANALYSISFORECAST_PHY_001_024"
dataset_id = "cmems_mod_glo_phy-cur_anfc_0.083deg_P1D-m"
vars = c("uo","vo")
bb = c(xmin = -72, xmax = -63, ymin = 39, ymax = 46)
path = copernicus_path(product_id, "gom") |>
  make_path()
depth = c(0, 1)              # just the top 1 meter
time = c(0, 9) + Sys.Date()  # from today through 9 days ahead
ofile = copernicus_path("tmp",
                        paste0(product_id, "__", dataset_id, ".nc"))
ok = download_copernicus_cli_subset(dataset_id = dataset_id,
                                    vars = vars,
                                    depth = depth,
                                    bb = bb,
                                    time = time,
                                    ofile = ofile)
x = stars::read_stars(ofile)
## uo, vo,
x
## stars object with 4 dimensions and 2 attributes
## attribute(s):
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## uo [m/s] -0.9230007 -0.08498658 -0.01408345 0.01123039 0.04875003 1.479645
## vo [m/s] -1.8433951 -0.06975559 -0.01406787 -0.02037171 0.03712043 1.345416
## NA's
## uo [m/s] 26800
## vo [m/s] 26800
## dimension(s):
## from to offset delta refsys x/y
## x 1 109 -72.04 0.08333 NA [x]
## y 1 85 46.04 -0.08333 NA [y]
## depth 1 1 0.494 [m] NA NA
## time 1 10 2025-05-12 UTC 1 days POSIXct
plot(x['uo'], axes = TRUE)
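The two variables requested here, uo and vo, are the eastward and northward components of the current, so a common next step is to combine them into speed and direction. The sketch below shows the arithmetic on plain numeric vectors. The helper names are hypothetical, and applying them to the stars object is left to the reader.

```r
# Hypothetical helpers: combine eastward (u) and northward (v) velocity
# components into speed and compass bearing.
current_speed <- function(u, v) {
  sqrt(u^2 + v^2)
}

# Bearing toward which the water flows, in degrees clockwise from north.
current_bearing <- function(u, v) {
  (atan2(u, v) * 180 / pi) %% 360
}

u <- c(3, 1, -1, 0)   # eastward components, m/s
v <- c(4, 0, 0, -1)   # northward components, m/s
speed <- current_speed(u, v)
bearing <- current_bearing(u, v)
print(data.frame(u = u, v = v, speed = speed, bearing = bearing))
```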