Lead Engineer
Loamist
Posted on Apr 22, 2026
The role
The Lead Engineer is responsible for advancing how large language models are used across development, evaluation, and the product stack. This role owns hands-on engineering work across LLM-powered product capabilities, internal AI-native development workflows, agent systems, and the harness used to measure quality and reliability in production. The ideal candidate is a senior builder who can prototype quickly, ship reliably, and push the practical boundaries of LLMs in real software systems.
What success looks like
- Extend and improve the LLM harness used to test prompts, models, agents, and product behaviors.
- Ship production-ready LLM capabilities supporting both internal engineering workflows and customer-facing functionality.
- Build and refine agents, tool-use patterns, and orchestration logic that expand what the product stack can do reliably.
- Design and run rigorous evaluations for model quality, agent behavior, tool use, latency, and regression tracking.
- Support backend services and integrations connecting LLM systems to the broader product and operational stack.
Core responsibilities
LLM Product Engineering
- Build and refine LLM-powered product capabilities across internal tools and external product surfaces.
- Prototype, ship, and iterate on prompt pipelines, retrieval workflows, structured outputs, reasoning chains, and human-in-the-loop review flows.
- Work closely with product and design partners to translate ambiguous user needs into reliable model-powered experiences.
- Push model performance through experimentation with context construction, tool use, orchestration patterns, fallback logic, and guardrails.
Agentic Development & AI-Native Engineering
- Use Claude Code, Codex, and comparable AI-native tools to accelerate implementation, refactoring, debugging, testing, and code review.
- Design and build agents that retrieve context, plan, call tools, write or modify code, and operate safely within defined controls.
- Develop reusable engineering patterns for agent state, tool orchestration, permissions, error recovery, and human approval loops.
- Help shape internal engineering practices for how LLMs and agentic tools are used throughout the development lifecycle.
Evaluations, Harness & Reliability
- Own or co-own the LLM harness and evaluation framework used to benchmark prompts, models, agents, and product behaviors.
- Use LM Evaluation Harness or comparable tooling for benchmark suites, regression tests, release gates, and comparative baselines.
- Partner with product, QA, and operations stakeholders to define measurable quality thresholds for shipping AI features.
- Establish logging, tracing, feedback loops, and error-analysis workflows to improve model and agent performance over time.
Platform, Backend & Integrations
- Build and maintain backend services, APIs, and orchestration layers connecting LLM systems to the product stack.
- Support integrations with internal platforms, external systems, and workflow tools to operationalize model outputs.
- Contribute to environment setup, deployment workflows, release planning, production monitoring, and incident response.
Required qualifications
- 5+ years in backend, platform, or product engineering in production environments.
- Strong development skills in Python and at least one of TypeScript/JavaScript, Java, or Node.js, with experience building APIs, services, and developer-facing tooling.
- Hands-on experience building and shipping LLM-powered applications, internal AI tools, or agentic workflows in production.
- Practical experience using Claude Code, Codex, or similar agentic coding tools as part of day-to-day engineering work.
- Experience building agents or multi-step LLM workflows using tools, retrieval, memory or state handling, and external actions.
- Experience designing and running LLM and agent evaluations: regression suites, benchmark harnesses, and release criteria.
- Ability to turn exploratory model experimentation into stable, maintainable, and observable production systems.
- Comfort operating in a lean, fast-moving environment with high ownership and strong cross-functional communication.
Nice to have
- Experience maintaining or extending an internal LLM harness, benchmark framework, or model evaluation platform.
- Experience with observability, prompt management, experiment tracking, and production monitoring for AI systems.
- Experience with document processing, retrieval, workflow automation, or reasoning over structured and semi-structured enterprise data.
- Familiarity with integrations across operational systems, APIs, middleware, or event-driven architectures.
- Experience supporting release readiness, production incident response, or reliability engineering for AI-enabled systems.
Interested in this role?
Compensation is tailored to the employment model, experience, and demonstrated AI/LLM capability.