GPU Cluster Software Engineer

Warren, MI, US

We are seeking a highly skilled GPU Cluster Software Engineer with strong expertise in VMware and CPU/GPU cluster technologies. This engineer will play a critical role in designing, implementing, and managing high-performance compute clusters that support advanced workloads, including AI, ML, and HPC applications.

Key Responsibilities

  • Design, deploy, and manage enterprise-scale CPU/GPU clusters for high-performance workloads.
  • Configure, maintain, and optimize VMware virtualization platforms (vSphere, ESXi, vCenter, vSAN).
  • Integrate GPU virtualization technologies (e.g., NVIDIA GRID, vGPU) into VMware environments.
  • Perform performance tuning, capacity planning, and resource optimization for compute clusters.
  • Implement automation and orchestration tools to streamline cluster operations and provisioning.
  • Monitor, troubleshoot, and optimize cluster performance to ensure system reliability.
  • Collaborate with research and engineering teams to support compute-intensive applications (AI/ML/HPC).
  • Ensure system scalability, security, and efficiency across multi-user environments.

Required Skills & Qualifications

  • Hands-on expertise with VMware virtualization technologies (vSphere, ESXi, vCenter, vSAN).
  • Proven experience in building and managing CPU/GPU clusters in enterprise or research environments.
  • Strong knowledge of GPU virtualization (NVIDIA GRID, vGPU) and integration with VMware.
  • Proficiency in cluster monitoring, troubleshooting, and optimization.
  • Solid understanding of networking and storage concepts in clustered environments.
  • Experience supporting compute-intensive workloads such as AI, ML, or HPC.
  • Familiarity with automation/orchestration tools (e.g., Ansible, Terraform, Kubernetes, or similar).
  • Excellent problem-solving skills and ability to work in a fast-paced, collaborative environment.

Education

  • Master's or Ph.D. in Computer Science, Computer Engineering, or a related field.
