Employing Directed Internship and Apprenticeship for Fostering HPC Training and Education

Elizabeth Bautista and Nitin Sukhija

Volume 12, Issue 2 (February 2021), pp. 33–36

https://doi.org/10.22369/issn.2153-4136/12/2/8

BibTeX
@article{jocse-12-2-8,
  author={Elizabeth Bautista and Nitin Sukhija},
  title={Employing Directed Internship and Apprenticeship for Fostering HPC Training and Education},
  journal={The Journal of Computational Science Education},
  year=2021,
  month=feb,
  volume=12,
  number=2,
  pages={33--36},
  doi={10.22369/issn.2153-4136/12/2/8}
}

Positions in High Performance Computing (HPC) are difficult to fill, especially that of Site Reliability Engineer in an operational area. At the National Energy Research Scientific Computing Center (NERSC) at Lawrence Berkeley National Laboratory (LBNL), the Operations team manages the HPC computational facility, which has a complex cooling ecosystem, and also serves as the wide area network operations center. The position therefore requires skills in four specific areas: system administration, storage administration, facility management, and wide area networking. No single educational program teaches all of these skills, so a new graduate needs extensive training before becoming proficient in every area. NERSC's proximity to Silicon Valley makes finding qualified candidates even harder. In response, NERSC has implemented a new approach patterned after apprenticeship programs in the trades. The program requires an intern or apprentice to meet milestones during the internship or apprenticeship, with constant evaluation, feedback, mentorship, and hands-on work that allow candidates to demonstrate their growing skills and eventually earn a career position.