Juno Archive Storage
Juno is a storage system located in Beltsville, MD. It is part of the SCINet HPC system, funded by USDA-ARS. The SCINet initiative’s mission is to grow USDA’s research capacity by providing scientists with high-performance computing and professional training support.
Explore the resources to learn more:
- SCINet website: https://scinet.usda.gov ⤴
- USDA-ARS website: https://www.ars.usda.gov/ ⤴
- Introduction to SCINet HPC in this workbook: What is SCINet? ⤴
What is Juno used for?
In addition to its computing capabilities, the SCINet HPC system offers storage for managing data and results:
- Tier 1 storage: short-term storage on each computing cluster (Atlas ⤴, Ceres ⤴) that is not backed up, intended for code, data, and intermediate results while you run a series of computational jobs
- Juno storage: large, multi-petabyte ARS long-term storage that is periodically backed up to tape
Learn more about the recommended SCINet data and storage procedures from the guide ⤴ provided by SCINet VRSC.
Benefits of using Juno
There are several reasons to move final results that are difficult to recreate to the backed-up Juno archive storage:
- Archiving final results in a backed-up storage system helps protect against data loss due to hardware failure or other unforeseen events.
- Archiving final results helps preserve the data for long-term use, beyond the lifetime of short-term cluster storage.
- Archiving final results makes sharing and collaboration with other researchers easier, because the data is stored in a centralized, accessible location.
- Archiving final results supports the reproducibility of research findings, because other researchers can access the original data and results.
Juno access points
Juno transfer node: @nal-dtn.scinet.usda.gov
Juno end point via Globus: “NAL DTN 0” (recommended)
*A SCINet account is required for access.
To obtain a SCINet account, submit a SCINet Account Request. To learn more, visit the official Sign up for a SCINet account ⤴ guide.
Copy your data to Juno
Globus Online ⤴ is the recommended method for transferring data to and from the SCINet clusters. It provides faster data transfer than scp, has a graphical interface, and does not require a GA verification code for every file transfer.
• using Globus (preferred)
Follow the step-by-step guide: Globus Data Transfer ⤴ to learn how to transfer data to and from Juno storage.
Juno end point via Globus: “NAL DTN 0”
• using command line
For small data transfers, you may move data to Juno storage using command-line approaches:
- First, use a terminal window on your local machine to log in to the transfer node on one of the SCINet clusters.
- Then, use the rsync command to synchronize (copy new content or update changes) your data with your project directory on Juno:
rsync -avz --no-p --no-g ttt nal-dtn.scinet.usda.gov:/LTS/project/<project_name>/
Note that the file system is organized slightly differently on the computing clusters (/project/project_name) and on Juno storage (/LTS/project/project_name).
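The path difference noted above is easy to get wrong when typing the destination by hand. The following is a minimal sketch, not an official SCINet tool, of two small shell helpers: one maps a cluster path to its Juno counterpart, and the other prints the matching rsync command without running it. The function names, the project name my_project, and the directory results_2024 are hypothetical placeholders; the rsync flags and the host name are the ones used in the command above.

```shell
#!/usr/bin/env bash
# Sketch only: function names, my_project, and results_2024 are hypothetical.

# Map a cluster path (/project/<name>/...) to its Juno path (/LTS/project/<name>/...).
to_juno_path() {
    case "$1" in
        /project/*) printf '/LTS%s\n' "$1" ;;
        *) printf 'not a /project path: %s\n' "$1" >&2; return 1 ;;
    esac
}

# Print (but do not run) the rsync command that copies a directory to Juno.
# -a archive mode, -v verbose, -z compress in transit;
# --no-p and --no-g skip preserving permissions and group, as in the text above.
juno_rsync_cmd() {
    local src dest
    src="$1"
    dest="$(to_juno_path "$2")" || return 1
    printf 'rsync -avz --no-p --no-g %s nal-dtn.scinet.usda.gov:%s/\n' "$src" "$dest"
}

# Example: show the command for copying results_2024 into the project's Juno directory.
juno_rsync_cmd results_2024 /project/my_project
```

Printing the command first lets you check the destination before any data moves. When you run the real command, adding rsync's -n (--dry-run) option previews which files would be transferred without copying them. The helpers do not quote paths, so they assume names without spaces.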