High Performance Computing (HPC) System Engineer
Information Technology/Computing | Livermore, CA | 03/17/2023
Job Code: SES.2 Science & Engineering MTS 2 / SES.3 Science & Engineering MTS 3
Position Type: Career Indefinite
Security Clearance: Anticipated DOE Q clearance (requires U.S. citizenship and a federal background investigation)
Drug Test: Required for external applicant(s) selected for this position (includes testing for use of marijuana)
Medical Exam: Not applicable
Join us and make YOUR mark on the World!
Are you interested in joining some of the brightest talent in the world to strengthen the United States’ security? Come join Lawrence Livermore National Laboratory (LLNL) where our employees apply their expertise to create solutions for BIG ideas that make our world a better place.
We are committed to a diverse and equitable workforce with an inclusive culture that values and celebrates the diversity of our people, talents, ideas, experiences, and perspectives. This is essential to innovation and creativity for continued success of the Laboratory’s mission.
$123,960 - $166,992 Annually for the SES.2 level
$148,650 - $200,328 Annually for the SES.3 level
Please note that the pay range information is a general guideline only. Many factors are taken into consideration when setting starting pay including education, experience, the external labor market, and internal equity.
We have an opening for a High Performance Computing (HPC) System Engineer to support one of the largest supercomputer centers in the world. The selected candidate will work in a challenging and team-oriented environment supporting Livermore Computing’s (LC) high performance computing clusters. You will apply fundamental knowledge of HPC systems and contribute to technical projects using creativity and imagination. The position requires the ability to serve periodically on a rotating off-hours on-call list. This position is in the Livermore Computing Division within the Computation Directorate.
This position will be filled at either the SES.2 or SES.3 level based on knowledge and related experience as assessed by the hiring team. Additional job responsibilities (outlined below) will be assigned if hired at the higher level.
In this role you will
- Provide system administration support for Linux-based HPC, Network Attached Storage (NAS), Infrastructure, and Parallel file system servers and clusters.
- Participate in the design and implementation of multiple Linux-based HPC, Infrastructure and Parallel file system servers and clusters.
- Build, configure, and maintain multiple RAID controllers and disk enclosure systems.
- Deploy and maintain high-speed cluster fabrics for compute and storage networks.
- Monitor and install software releases, operating system patches, and third-party utilities, with emphasis on overall system security.
- Improve the quality of service for end users, working with system engineers, Hotline, and Operations staff.
- Troubleshoot and determine the root cause of moderately complex system issues.
- Respond to system problems and user questions in person, via email, and via a trouble ticket system.
- Perform other duties as assigned.
Additional job responsibilities at the SES.3 level
- Analyze and tune performance of complex computer, network, file system and disk sub-systems.
- Investigate, evaluate, test, and recommend technical solutions for future systems.
- Develop tools and procedures to monitor and automate system tasks on servers and clusters.
Qualifications
- Ability to secure and maintain a U.S. DOE Q-level security clearance, which requires U.S. citizenship.
- Bachelor’s degree in computer science or related field or the equivalent combination of education and related experience.
- Broad experience with Linux systems including installation, configuration, networking, backups, updates and patching, and system security.
- Broad experience with or knowledge of HPC environments and technologies such as high-speed cluster fabrics (InfiniBand), job scheduling (Slurm), and parallel file systems (Lustre and GPFS).
- Comprehensive knowledge of scripting and programming languages, such as Perl, Python, and bash/csh/ksh.
- Proficient with disk and storage systems, such as host-based RAID controllers, software RAID and vendor RAID systems.
- Comprehensive experience with version control and configuration management systems, such as Git, Ansible, and CFEngine.
- Demonstrated ability to work with limited direction in a dynamic environment with competing priorities.
- Ability to work off-hours and on-call (intermittently either as needed or as part of a rotation).
- Proficient communication and interpersonal skills, including the ability to work and communicate with other technical staff and end users.
Additional qualifications at the SES.3 level
- Significant experience with Linux system administration in support of several independent but inter-related systems and software packages, and knowledge of container technologies, Kubernetes, and other virtual machine software environments.
- Advanced knowledge of and significant experience providing innovative solutions to broadly defined tasks and problems.
- Advanced communication and interpersonal skills, including the ability to interact effectively with system developers and vendors with minimal direction.
Qualifications We Desire
- Master’s degree in computer science or related field.
- Experience with local, parallel, and distributed file systems, such as XFS, ZFS, GPFS, and Lustre, and with NAS platforms, such as NetApp FAS systems running ONTAP 9.x.
- Design and deployment experience with container technologies (Singularity, Docker, Podman), Kubernetes (OpenShift), and other virtualization environments, such as KVM and VMware ESXi 6.7/7.x.
Additional Information
All your information will be kept confidential according to EEO guidelines.
This is a Career Indefinite position, open to Lab employees and external candidates.
Why Lawrence Livermore National Laboratory?
- Flexible Benefits Package
- Relocation Assistance
- Education Reimbursement Program
- Flexible schedules (*depending on project needs)
- Inclusion, Diversity, Equity and Accountability (IDEA) - visit https://www.llnl.gov/diversity
- Our core beliefs - visit https://www.llnl.gov/diversity/our-values
- Employee engagement - visit https://www.llnl.gov/diversity/employee-engagement
This position requires a Department of Energy (DOE) Q-level clearance. If you are selected, we will initiate a Federal background investigation to determine if you meet eligibility requirements for access to classified information or matter. Also, all L or Q cleared employees are subject to random drug testing. Q-level clearance requires U.S. citizenship.
Pre-Employment Drug Test
External applicant(s) selected for this position must pass a post-offer, pre-employment drug test. This includes testing for use of marijuana as Federal Law applies to us as a Federal Contractor.
Equal Employment Opportunity
We are an equal opportunity employer that is committed to providing all with a work environment free of discrimination and harassment. All qualified applicants will receive consideration for employment without regard to race, color, religion, marital status, national origin, ancestry, sex, sexual orientation, gender identity, disability, medical condition, pregnancy, protected veteran status, age, citizenship, or any other characteristic protected by applicable laws.
We invite you to review the Equal Employment Opportunity posters which include EEO is the Law and Pay Transparency Nondiscrimination Provision.
Our goal is to create an accessible and inclusive experience for all candidates applying and interviewing at the Laboratory. If you need a reasonable accommodation during the application or the recruiting process, please use our online form to submit a request.
California Privacy Notice
The California Consumer Privacy Act (CCPA) grants privacy rights to all California residents. The law also entitles job applicants, employees, and non-employee workers to be notified of what personal information LLNL collects and for what purpose. The Employee Privacy Notice can be accessed here.