Fully implementing python version of USGS variometer data retrieval #4

@DCarSunDr

We've determined that Python can provide a more time-efficient means of retrieving the 10 Hz variometer data (see #392), so we are pressing forward with a full adaptation of "gmag_usgs_download_variometer_files.php". I've dubbed the Python version "gmag_retrieve_usgs_variometer.py", for conciseness.

We will need a script in /src/thmsoc (do we want to add a subdirectory?) which handles the actual retrieval, and a src/thmsoc/cli script which handles argument parsing.
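
As a starting point for the cli script, here is a minimal argparse sketch. The function name `parse_args` and the flags `--start`, `--end`, `--days`, and `--rate` are all placeholders I made up for illustration, not an agreed interface; the 1 Hz / 10 Hz choices come from the sampling rates mentioned in this issue.

```python
import argparse
from datetime import date, timedelta


def parse_args(argv=None):
    """Parse command-line arguments (hypothetical flag names)."""
    parser = argparse.ArgumentParser(
        description="Retrieve USGS variometer data (illustrative sketch)."
    )
    # One or more station names, passed positionally.
    parser.add_argument("stations", nargs="+", help="one or more station names")
    parser.add_argument(
        "--start", required=True, type=date.fromisoformat,
        help="first day to retrieve (YYYY-MM-DD)",
    )
    # The span is given either as an end date or as a number of days.
    span = parser.add_mutually_exclusive_group()
    span.add_argument(
        "--end", type=date.fromisoformat,
        help="last day to retrieve, inclusive (YYYY-MM-DD)",
    )
    span.add_argument("--days", type=int, default=1, help="number of days to retrieve")
    parser.add_argument(
        "--rate", type=int, choices=(1, 10), default=10,
        help="sampling rate in Hz",
    )
    args = parser.parse_args(argv)

    # Normalize "start + days" into an explicit inclusive end date.
    if args.end is None:
        args.end = args.start + timedelta(days=args.days - 1)
    if args.end < args.start:
        parser.error("end date must not be earlier than start date")
    return args
```

Keeping the parsing in its own function that accepts `argv` makes it easy to call from tests, e.g. `parse_args(["stn1", "stn2", "--start", "2024-01-01", "--days", "3"])`.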

We should determine which functions (if any) should be split into separate scripts, to make it easier to adapt them to other networks/use cases. For instance, an availability checker could be generalized.

If we'd like to fold the "call_gmag_usgs_download_variometer_files.ksh" script into "gmag_retrieve_usgs_variometer.py", we'd need to add support for logging (and possibly db_log) and enable it to send the database queries file to the MySQL server.

Currently, "gmag_retrieve_usgs_variometer.py" is able to retrieve a file given a station name, a date, and a sampling rate. It has a function to construct USGS query URLs (which we might want to make part of a separate thmsoc/thmsoc_gmag URL-handling script, since USGS observatory magnetometers use a similar URL format but a different path, which could be passed as an argument). It can also split the retrieval into segments, by default using 24-hour segments for 1 Hz data and 4-hour segments for 10 Hz data.
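
For discussion, the segmenting and URL construction described above could look roughly like the sketch below. This is not the code currently in "gmag_retrieve_usgs_variometer.py"; the function names, the `SEGMENT_HOURS` table, and the query parameter names (`id`, `starttime`, `endtime`, `sampling_period`) are assumptions for illustration, and the base URL is left as an argument so the observatory and variometer paths can share the same function.

```python
from datetime import datetime, timedelta
from urllib.parse import urlencode

# Default segment length in hours, keyed by sampling rate in Hz
# (24 h for 1 Hz data, 4 h for 10 Hz data, as described in this issue).
SEGMENT_HOURS = {1: 24, 10: 4}


def split_into_segments(start, end, rate_hz):
    """Split [start, end) into consecutive (seg_start, seg_end) pairs."""
    step = timedelta(hours=SEGMENT_HOURS[rate_hz])
    segments = []
    seg_start = start
    while seg_start < end:
        # The final segment is clipped so it never runs past `end`.
        seg_end = min(seg_start + step, end)
        segments.append((seg_start, seg_end))
        seg_start = seg_end
    return segments


def build_query_url(base_url, station, seg_start, seg_end, rate_hz):
    """Build a query URL; the parameter names here are assumed, not confirmed."""
    params = {
        "id": station,
        "starttime": seg_start.isoformat(),
        "endtime": seg_end.isoformat(),
        "sampling_period": 1 / rate_hz,  # seconds between samples
    }
    return f"{base_url}?{urlencode(params)}"
```

With these defaults, one day of 10 Hz data becomes six 4-hour requests, while one day of 1 Hz data is a single request.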

Additionally, "gmag_retrieve_usgs_variometer.py" will need to be able to:

  • Accept a list under a "station names" argument
  • Loop over the passed station names, and for each station name, loop over the start and end dates / date + days to retrieve data files
  • Handle USGS and THEMIS station name aliases. Since the toml file is used for paths, we might want a separate "gmag" config/definitions file which we could load under specific parameters
  • Check availability from IRIS/EARTHSCOPE (maybe should also be a separate script)
  • Load the latest calibration history date (note: this will likely need to handle XML format, but that should be doable by decoding the urllib.request response)
  • Create database queries as strings, append them to .sql file
  • Write to mirror_dir for 1 Hz data (need to update toml)
  • Write files (maybe start with a temporary directory), and if script is interrupted, delete file. Once script is complete, make sure files get placed into proper workdir
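
To make the last two output-related items concrete (writing via a temporary directory, and appending database queries to a .sql file), here is a rough sketch. The function names `write_then_publish` and `append_queries` are hypothetical, and the real script would take its directories from the toml configuration rather than from arguments.

```python
import shutil
import tempfile
from pathlib import Path


def write_then_publish(payloads, workdir):
    """Write files to a temporary directory, then move them into workdir.

    `payloads` maps file names to bytes. If anything raises before the
    move (including KeyboardInterrupt), the temporary directory and the
    partial files in it are deleted when the `with` block exits.
    """
    workdir = Path(workdir)
    workdir.mkdir(parents=True, exist_ok=True)
    published = []
    with tempfile.TemporaryDirectory() as tmp:
        tmp_dir = Path(tmp)
        # Stage every file first so workdir never sees partial output.
        for name, data in payloads.items():
            (tmp_dir / name).write_bytes(data)
        # Only after all writes succeed, move the files into place.
        for name in payloads:
            dest = workdir / name
            shutil.move(str(tmp_dir / name), str(dest))
            published.append(dest)
    return published


def append_queries(sql_path, queries):
    """Append query strings to a .sql file, one terminated statement per line."""
    with open(sql_path, "a", encoding="utf-8") as handle:
        for query in queries:
            handle.write(query.rstrip().rstrip(";") + ";\n")
```

Note that the cleanup relies on Python unwinding normally, so it covers Ctrl-C and ordinary exceptions but not a hard kill of the process; leftover temporary directories in that case would need separate handling.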

Labels: enhancement (New feature or request)
