Lead Machine Learning Infrastructure Engineer - Infrastructure & Data
Location: Campbell
Posted on: June 23, 2025
Job Description:
Upwork ($UPWK) is the world’s work marketplace. We serve
everyone from one-person startups to large, Fortune 100 enterprises
with a powerful, trust-driven platform that enables companies and
talent to work together in new ways that unlock their potential.
Last year, more than $3.8 billion of work was done through Upwork
by skilled professionals who are gaining more control by finding
work they are passionate about and innovating their careers.

The Machine Learning Infrastructure & Data team is responsible for
architecting and building the foundational systems and tools that
enable efficient development, deployment, and management of machine
learning models at scale. As a Lead Machine Learning Infrastructure
Engineer, you will be pivotal in designing, developing, and
maintaining robust and scalable infrastructure components to
support Upwork’s machine learning initiatives. You will work
closely with cross-functional teams—including machine learning
researchers, data scientists, and software engineers—to build
state-of-the-art platforms and tools that accelerate the
development and deployment of machine learning models.

Responsibilities:
• Design, implement, and optimize distributed systems and infrastructure components to support large-scale machine learning workflows, including data ingestion, feature engineering, model training, and serving.
• Develop and maintain frameworks, libraries, and tools that streamline the end-to-end machine learning lifecycle, from data preparation and experimentation to model deployment and monitoring.
• Architect and implement highly available, fault-tolerant, and secure systems that meet the performance and scalability requirements of production machine learning workloads.
• Collaborate with machine learning researchers and data scientists to understand their requirements and translate them into scalable and efficient software solutions.
• Stay current with advancements in machine learning infrastructure, distributed computing, and cloud technologies, integrating them into our platform to drive innovation.
• Mentor junior engineers, conduct code reviews, and uphold engineering best practices to ensure the delivery of high-quality software solutions.

What it takes to catch our eye:
• Strong technical expertise in designing and building scalable ML infrastructure.
• Experience with distributed systems and cloud-based ML platforms.
• Proficiency in programming languages such as Python, Java, or Scala.
• Deep understanding of ML workflows, including data pipelines, model training, and deployment.
• Passion for innovation and eagerness to implement the latest advancements in ML infrastructure.
• Strong problem-solving skills and ability to optimize complex systems for performance and reliability.
• Collaborative mindset with excellent communication skills to work across teams.
• Ability to thrive in a fast-paced, dynamic environment with evolving technical challenges.

Keywords: Gilroy, Lead Machine Learning Infrastructure Engineer - Infrastructure & Data, IT / Software / Systems, Campbell, California