Mirror API
The Mirror API is responsible for mapping file requests to a
URL, local filename, file-like object, or something that can
construct a NetCDF Dataset. Mirrors are usually constructed
using argopandas.url_mirror() and/or
argopandas.file_mirror().
- class Mirror
The Mirror class is the abstract base class for other mirror types. You can define your own subclass and use it in the main API if you have a non-standard mapping of files and would like to use features of the package-level API.
- Parameters
path – A path to a file on the GDAC (e.g., /dac/csio/1234/1234_meta.nc)
- filename(path) -> str
Get a filename for this path. The filename is not guaranteed to exist unless prepare() is called first.
- netcdf_dataset_src(path)
Return the best available input to
argopandas.netcdf.NetCDFWrapper.
- open(path) -> BinaryIO
Get a file-like object for this path.
- prepare(path_iter)
Prepare the mirror for loading all the paths in path_iter (e.g., by downloading them).
- Parameters
path_iter – An iterable of paths.
- url(path)
Return the URL to path without checking if it exists.
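As a rough illustration of the subclassing idea described above, the sketch below defines a stand-in base class with the documented method names and an in-memory subclass. MirrorSketch and InMemoryMirror are hypothetical names invented for this example; a real custom mirror would subclass the package's Mirror class instead, and its exact requirements may differ from what is assumed here.

```python
import io
from typing import BinaryIO, Dict, Iterable


class MirrorSketch:
    """Hypothetical stand-in for the abstract base class (not the real Mirror)."""

    def url(self, path: str) -> str:
        raise NotImplementedError()

    def filename(self, path: str) -> str:
        raise NotImplementedError()

    def open(self, path: str) -> BinaryIO:
        raise NotImplementedError()

    def prepare(self, path_iter: Iterable[str]) -> None:
        raise NotImplementedError()


class InMemoryMirror(MirrorSketch):
    """Serves GDAC-style paths from a dict of bytes (illustration only)."""

    def __init__(self, files: Dict[str, bytes]):
        self._files = dict(files)

    def url(self, path: str) -> str:
        # Built without checking that the path exists, as url() is documented
        return "memory://" + path.lstrip("/")

    def open(self, path: str) -> BinaryIO:
        return io.BytesIO(self._files[path])

    def prepare(self, path_iter: Iterable[str]) -> None:
        # Nothing to download here; just fail early on unknown paths
        for path in path_iter:
            if path not in self._files:
                raise FileNotFoundError(path)


path = "/dac/csio/1234/1234_meta.nc"
mirror = InMemoryMirror({path: b"placeholder bytes"})
mirror.prepare([path])
with mirror.open(path) as f:
    content = f.read()
```

The subclass leaves filename() unimplemented because nothing is stored on disk; whether the package-level API tolerates that for a given operation is not covered by this sketch.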
- class CachedUrlMirror(root, cache_dir=None)
This is the most common mirror, which uses a cache to avoid downloading the same file more than once. By default the cache is reset when the session is restarted; however, you can set a persistent cache using cache_dir.
- __init__(root, cache_dir=None)
- Parameters
root – The URL of the base directory. This can be anything supported by urllib.request.urlopen.
cache_dir – The path to the local persistent cache or None to use a temporary directory.
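The caching behaviour described above can be sketched roughly as follows. This is not the CachedUrlMirror implementation: CachingSketch and its downloads counter are hypothetical, and the "remote" root is a local directory addressed through a file:// URL (which urllib.request.urlopen accepts) so that the example runs without network access.

```python
import os
import shutil
import tempfile
import urllib.request
from pathlib import Path


class CachingSketch:
    """Illustrative download-once cache; not the argopandas implementation."""

    def __init__(self, root, cache_dir=None):
        self.root = root.rstrip("/")
        self._temp = None
        if cache_dir is None:
            # No persistent cache requested: fall back to a temporary directory
            self._temp = tempfile.TemporaryDirectory()
            cache_dir = self._temp.name
        self.cache_dir = cache_dir
        self.downloads = 0  # counter used only to make the caching visible

    def url(self, path):
        return self.root + "/" + path.lstrip("/")

    def filename(self, path):
        return os.path.join(self.cache_dir, *path.strip("/").split("/"))

    def prepare(self, path_iter):
        for path in path_iter:
            dest = self.filename(path)
            if os.path.exists(dest):
                continue  # already cached, skip the download
            os.makedirs(os.path.dirname(dest), exist_ok=True)
            with urllib.request.urlopen(self.url(path)) as src, open(dest, "wb") as dst:
                shutil.copyfileobj(src, dst)
            self.downloads += 1


# Stand-in "remote" directory containing one placeholder file
remote = Path(tempfile.mkdtemp())
(remote / "dac" / "csio" / "1234").mkdir(parents=True)
(remote / "dac" / "csio" / "1234" / "1234_meta.nc").write_bytes(b"placeholder bytes")

path = "/dac/csio/1234/1234_meta.nc"
mirror = CachingSketch(remote.as_uri())
mirror.prepare([path])
mirror.prepare([path])  # the second call finds the cached copy
```

Passing a directory as cache_dir in this sketch keeps the downloaded files after the object is discarded, which mirrors the persistent-cache option described above.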
- class FileMirror(root)
The FileMirror maps a root directory on a filesystem. This is useful if you have a local copy of Argo downloaded via rsync or via a stable DOI version of the GDAC. This can also be a partial copy if you have a few files you need to access frequently.
- __init__(root)
- Parameters
root – The root directory containing the files.
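To make the root-directory mapping concrete, here is a small sketch of how a GDAC-style path might be joined onto a local root, and how a partial copy could be checked for the files it holds. The helper local_filename and the second requested path (1234_prof.nc) are assumptions made for illustration, not part of the package.

```python
import os
import tempfile


def local_filename(root, path):
    """Join a GDAC-style path onto a local root directory (hypothetical helper)."""
    return os.path.join(root, *path.strip("/").split("/"))


# Build a tiny partial copy holding a single placeholder file
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, "dac", "csio", "1234"))
with open(local_filename(root, "/dac/csio/1234/1234_meta.nc"), "wb") as f:
    f.write(b"placeholder bytes")

# Only the files present in the partial copy can be served locally
requested = ["/dac/csio/1234/1234_meta.nc", "/dac/csio/1234/1234_prof.nc"]
available = [p for p in requested if os.path.exists(local_filename(root, p))]
```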
- class UrlMirror(root)
The UrlMirror is a cache-less mirror that only uses URL connections. You probably want the CachedUrlMirror unless you are doing real-time work that might be affected by an out-of-date cache. Note that filename() is not supported by the UrlMirror (use open() instead).
- __init__(root)
- Parameters
root – The URL of the base directory. This can be anything supported by
urllib.request.urlopen.
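The difference from a cached mirror can be sketched as below: every open() goes straight to the URL and no local file is ever written, which is why a filename cannot be offered. UrlOnlySketch is a hypothetical class, the exception raised by filename() is an assumption rather than the package's documented behaviour, and the root is again a file:// URL so the example runs offline.

```python
import tempfile
import urllib.request
from pathlib import Path


class UrlOnlySketch:
    """Illustrative cache-less mirror; not the argopandas implementation."""

    def __init__(self, root):
        self.root = root.rstrip("/")

    def url(self, path):
        return self.root + "/" + path.lstrip("/")

    def open(self, path):
        # Every call opens a fresh connection; nothing is stored locally
        return urllib.request.urlopen(self.url(path))

    def filename(self, path):
        # Assumed behaviour: there is no local file to point at
        raise NotImplementedError("no local file exists; use open() instead")


# Stand-in "remote" directory containing one placeholder file
remote = Path(tempfile.mkdtemp())
(remote / "dac" / "csio" / "1234").mkdir(parents=True)
(remote / "dac" / "csio" / "1234" / "1234_meta.nc").write_bytes(b"placeholder bytes")

mirror = UrlOnlySketch(remote.as_uri())
with mirror.open("/dac/csio/1234/1234_meta.nc") as f:
    content = f.read()
```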