Mirror API
The Mirror API is responsible for mapping file requests to a URL, local filename, file-like object, or something that can construct a NetCDF Dataset. Mirrors are usually constructed using argopandas.url_mirror() and/or argopandas.file_mirror().
- class Mirror
The Mirror class is the abstract base class for other mirror types. You can define your own subclass and use it in the main API if you have a non-standard mapping of files and would like to use features of the package-level API.
- Parameters
path – A path to a file on the GDAC (e.g., /dac/csio/1234/1234_meta.nc)
- filename(path) → str
Get a filename for this path. The filename is not guaranteed to exist unless prepare() is called first.
- netcdf_dataset_src(path)
Return the best available input to argopandas.netcdf.NetCDFWrapper.
- open(path) → BinaryIO
Get a file-like object for this path.
- prepare(path_iter)
Prepare the mirror for loading all the paths in path_iter (e.g., by downloading them).
- Parameters
path_iter – An iterable of paths.
- url(path)
Return the URL to path without checking if it exists.
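As a rough illustration of the interface described above, the sketch below serves files from an in-memory dictionary. It is a stand-in rather than package code: InMemoryMirror, its files argument, and the memory:// scheme are hypothetical, and the class is deliberately not derived from Mirror so that it runs without argopandas installed. In practice you would subclass Mirror instead. netcdf_dataset_src() is left out because the inputs accepted by NetCDFWrapper are not specified in this section.

```python
import io
from typing import BinaryIO, Dict, Iterable


class InMemoryMirror:
    """Hypothetical stand-in that follows the method names of Mirror."""

    def __init__(self, files: Dict[str, bytes], base_url: str = "memory://gdac"):
        self._files = files        # GDAC-style path -> raw file contents
        self._base_url = base_url  # made-up scheme, for illustration only

    def url(self, path: str) -> str:
        # Built by string joining; existence is not checked.
        return self._base_url.rstrip("/") + "/" + path.lstrip("/")

    def prepare(self, path_iter: Iterable[str]) -> None:
        # A network-backed mirror might download here. This one only
        # verifies that every requested path is available.
        missing = [p for p in path_iter if p not in self._files]
        if missing:
            raise FileNotFoundError(", ".join(missing))

    def open(self, path: str) -> BinaryIO:
        # Return a fresh binary file-like object on each call.
        return io.BytesIO(self._files[path])


path = "/dac/csio/1234/1234_meta.nc"
mirror = InMemoryMirror({path: b"placeholder bytes"})
mirror.prepare([path])
with mirror.open(path) as f:
    content = f.read()
```

Calling prepare() before open() follows the order suggested by the method descriptions, even though this particular stand-in has nothing to download.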
- class CachedUrlMirror(root, cache_dir=None)
This is the most common mirror, which uses a cache to avoid downloading the same file more than once. By default the cache is reset when the session is restarted; however, you can set a persistent cache using cache_dir.
- __init__(root, cache_dir=None)
- Parameters
root – The URL of the base directory. This can be anything supported by urllib.request.urlopen.
cache_dir – The path to the local persistent cache or None to use a temporary directory.
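The download-once behaviour described above can be sketched with the standard library alone. This is a minimal illustration under stated assumptions, not the package's implementation: SketchCache and its downloads counter are hypothetical names, and the demo uses a file:// root (which urllib.request.urlopen supports) so that it runs without network access.

```python
import os
import pathlib
import shutil
import tempfile
import urllib.request
from typing import Iterable, Optional


class SketchCache:
    """Hypothetical download-once cache; not the package's implementation."""

    def __init__(self, root: str, cache_dir: Optional[str] = None):
        self._root = root.rstrip("/")
        self._tmp = None
        if cache_dir is None:
            # Session-scoped cache: removed when this object is cleaned up.
            self._tmp = tempfile.TemporaryDirectory()
            cache_dir = self._tmp.name
        self._cache_dir = cache_dir
        self.downloads = 0  # counter kept only to show cache hits below

    def filename(self, path: str) -> str:
        return os.path.join(self._cache_dir, *path.strip("/").split("/"))

    def prepare(self, path_iter: Iterable[str]) -> None:
        for path in path_iter:
            dest = self.filename(path)
            if os.path.exists(dest):
                continue  # cache hit: skip the download
            os.makedirs(os.path.dirname(dest), exist_ok=True)
            url = self._root + "/" + path.lstrip("/")
            with urllib.request.urlopen(url) as src, open(dest, "wb") as dst:
                shutil.copyfileobj(src, dst)
            self.downloads += 1


# Demo against a file:// root so that no network access is needed.
with tempfile.TemporaryDirectory() as source_dir:
    path = "/dac/csio/1234/1234_meta.nc"
    source_file = pathlib.Path(source_dir, *path.strip("/").split("/"))
    source_file.parent.mkdir(parents=True)
    source_file.write_bytes(b"placeholder bytes")

    cache = SketchCache(pathlib.Path(source_dir).as_uri())
    cache.prepare([path])
    cache.prepare([path])  # second call finds the cached copy
    cached_bytes = pathlib.Path(cache.filename(path)).read_bytes()
```

Passing an existing directory as cache_dir would keep the files between sessions, which is the trade-off the cache_dir parameter describes. A more careful version would also guard against partially written files being treated as cache hits.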
- class FileMirror(root)
The FileMirror maps a root directory on a filesystem. This is useful if you have a local copy of Argo downloaded via rsync or via a stable DOI version of the GDAC. This can also be a partial copy if you have a few files you need to access frequently.
- __init__(root)
- Parameters
root – The root directory containing the files.
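With a partial local copy, it can help to know in advance which paths are actually on disk. The sketch below shows one way to check, assuming that a GDAC-style path maps directly onto the root directory; local_filename and missing_paths are hypothetical helpers, and how FileMirror itself treats missing files is not stated here.

```python
import os
import tempfile
from typing import Iterable, List


def local_filename(root: str, path: str) -> str:
    # Join the GDAC-style path onto the root, ignoring the leading slash.
    return os.path.join(root, *path.strip("/").split("/"))


def missing_paths(root: str, paths: Iterable[str]) -> List[str]:
    # With a partial copy, only some paths resolve to files on disk.
    return [p for p in paths if not os.path.isfile(local_filename(root, p))]


with tempfile.TemporaryDirectory() as root:
    present = "/dac/csio/1234/1234_meta.nc"
    absent = "/dac/csio/1234/1234_prof.nc"
    os.makedirs(os.path.dirname(local_filename(root, present)))
    with open(local_filename(root, present), "wb") as f:
        f.write(b"placeholder bytes")

    not_found = missing_paths(root, [present, absent])
```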
- class UrlMirror(root)
The UrlMirror is a cache-less mirror that only uses URL connections. You probably want the CachedUrlMirror unless you are doing real-time work that might be affected by an out-of-date cache. Note that filename() is not supported by the UrlMirror (use open() instead).
- __init__(root)
- Parameters
root – The URL of the base directory. This can be anything supported by urllib.request.urlopen.
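To make the real-time trade-off concrete, the sketch below opens a fresh connection on every read, so a change at the source is visible on the next call. It is an illustration only: open_uncached is a hypothetical helper rather than part of the package, and a file:// root stands in for a remote server so that the example runs offline.

```python
import pathlib
import tempfile
import urllib.request
from typing import BinaryIO


def open_uncached(root: str, path: str) -> BinaryIO:
    # Every call opens a new connection; nothing is written to disk.
    return urllib.request.urlopen(root.rstrip("/") + "/" + path.lstrip("/"))


with tempfile.TemporaryDirectory() as source_dir:
    path = "/dac/csio/1234/1234_meta.nc"
    source_file = pathlib.Path(source_dir, *path.strip("/").split("/"))
    source_file.parent.mkdir(parents=True)
    root = pathlib.Path(source_dir).as_uri()

    source_file.write_bytes(b"version 1")
    with open_uncached(root, path) as f:
        first = f.read()

    source_file.write_bytes(b"version 2")  # the upstream file changes
    with open_uncached(root, path) as f:
        second = f.read()  # the change is visible on the next read
```

The cost of this approach is that every read repeats the transfer, which is why a cached mirror is usually the better default.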