AI engineer: you ship product features on top of foundation models. Your day is prompts, retrieval, evals, tool calls, agents, latency budgets, and cost dashboards. Most of your work is application-layer.
ML engineer: you build custom models using your company's proprietary data. Your day is feature engineering, training pipelines, drift monitoring, sub-100ms inference, and statistical guarantees. Most of your work is model-layer.
If you love product engineering and want to ship user-facing features fast, AI engineering. If you love data, math, and the kind of problems where the company's edge is the model itself, ML engineering.
Three years ago this question barely existed. Anyone who shipped anything machine-learning-adjacent had the same title: ML engineer. They trained the model, deployed the model, monitored the model, retrained the model. The role bundled everything that touched a model into one job.
Then foundation models swallowed the middle of the field. By late 2024, most product teams that wanted "AI features" stopped training models entirely. They called an LLM API, layered in retrieval, wired up tool calls, wrote evals, and shipped. That work didn't look like ML engineering anymore. It looked like product engineering with an LLM in the loop. The industry called it AI engineering, and the title stuck.
Meanwhile, the ML engineering jobs that survived narrowed and got harder. The problems where the company's edge is its proprietary data — fraud, ranking, demand forecasting, computer vision on custom datasets — got more specialized, not less. Generic "build me a sklearn model" work mostly got automated or absorbed. What's left is the hard stuff.
That split is now the cleanest way to think about the two roles. Below: what each one actually does day-to-day, how the skill stacks differ, the compensation reality, and how to decide which one to target.
What an AI engineer actually does
Open the calendar of an AI engineer at a typical AI-product company in 2026 and you'll see something like this. Morning: review the overnight eval run, flag a regression in the agent's tool-calling step, file a ticket for the prompt that drifted. Afternoon: pair with a product engineer on a new tool the agent needs, instrument the call, write the eval cases. Evening: investigate a cost spike from yesterday, find the runaway loop, ship the fix.
The work has a few defining characteristics:
- Application-layer. Almost none of the work is training models. The model is a vendor API or an open-weights checkpoint pulled from Hugging Face. The work is everything around it: input shaping, retrieval, tools, output validation, fallback paths.
- Iteration is the loop. The unit of work is "ship a prompt change, watch the eval, ship the next change." Most days look more like a frontend engineer iterating on a UI than an ML engineer iterating on a training run.
- Evals are the artifact. The strongest AI engineers spend more time on the eval suite than on the prompts. A good evals harness is the compound asset that lets a team ship faster every week. A bad one means every change is a coin flip.
- Cost and latency are first-class. Token spend is a real line item. So is p95 latency for an agent that calls four tools. AI engineering is the first ML-adjacent role where SRE-style instincts pay off more than statistical instincts.
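To make the "ship a prompt change, watch the eval" loop concrete, here is a minimal sketch of a regression gate over a golden set. Everything in it is illustrative: the golden set, the substring-match scoring rule, the 0.02 tolerance, and the two stand-in model functions are assumptions, not a description of any particular team's harness.

```python
from typing import Callable, Sequence, Tuple

# Hypothetical golden set: (prompt, substring the answer must contain).
GOLDEN_SET = [
    ("What is 2 + 2?", "4"),
    ("Capital of France?", "Paris"),
    ("Opposite of hot?", "cold"),
]


def pass_rate(model_fn: Callable[[str], str],
              cases: Sequence[Tuple[str, str]]) -> float:
    """Fraction of cases whose output contains the expected substring."""
    passed = sum(
        1 for prompt, expected in cases
        if expected.lower() in model_fn(prompt).lower()
    )
    return passed / len(cases)


def is_regression(baseline: float, candidate: float,
                  tolerance: float = 0.02) -> bool:
    """Flag a change whose score drops more than `tolerance` below baseline."""
    return candidate < baseline - tolerance


# Stand-ins for two prompt versions. In practice each would wrap a model call.
def old_prompt_model(prompt: str) -> str:
    answers = {"What is 2 + 2?": "4", "Capital of France?": "Paris",
               "Opposite of hot?": "Cold"}
    return answers[prompt]


def new_prompt_model(prompt: str) -> str:
    answers = {"What is 2 + 2?": "4", "Capital of France?": "Lyon",
               "Opposite of hot?": "Cold"}
    return answers[prompt]


baseline = pass_rate(old_prompt_model, GOLDEN_SET)
candidate = pass_rate(new_prompt_model, GOLDEN_SET)
if is_regression(baseline, candidate):
    print(f"regression: {baseline:.2f} -> {candidate:.2f}")
```

Real suites usually replace the substring check with task-specific scoring or an LLM-as-judge step, but the gate itself tends to stay this simple.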
The skill stack maps cleanly to a product engineer with a few LLM-specific additions. Strong async I/O, API design, retries, testing — the regular stuff. On top: structured output, streaming, embeddings, vector search, hybrid retrieval, reranking, eval design, tool calling, basic agent patterns.
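As one illustration of how the "regular stuff" and the LLM-specific additions meet, the sketch below validates structured output and retries on failure. The schema (`intent`, `confidence`), the `call_model` parameter, and the `fake_model` stub are all hypothetical; a real implementation would call a vendor SDK and would likely feed the validation error back into the retry prompt.

```python
import json
from typing import Callable


def parse_structured(raw: str) -> dict:
    """Parse model output and check it against a small hypothetical schema."""
    data = json.loads(raw)  # raises json.JSONDecodeError, a ValueError subclass
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    if not isinstance(data.get("intent"), str):
        raise ValueError("'intent' must be a string")
    confidence = data.get("confidence")
    if isinstance(confidence, bool) or not isinstance(confidence, (int, float)):
        raise ValueError("'confidence' must be a number")
    return data


def call_with_validation(call_model: Callable[[str], str], prompt: str,
                         max_attempts: int = 3) -> dict:
    """Call the model until its output validates or attempts run out."""
    last_error = None
    for _ in range(max_attempts):
        try:
            return parse_structured(call_model(prompt))
        except ValueError as err:
            last_error = err  # malformed output: try again
    raise RuntimeError(
        f"no valid output after {max_attempts} attempts") from last_error


# Stub that fails twice before returning valid JSON (stands in for a real API).
_responses = iter([
    "Sure! Here you go:",
    '{"intent": "refund"}',
    '{"intent": "refund", "confidence": 0.9}',
])


def fake_model(prompt: str) -> str:
    return next(_responses)


result = call_with_validation(fake_model, "Classify: I want my money back")
```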
What an ML engineer actually does
An ML engineer at a company where ML is the product looks very different. Morning: investigate why yesterday's drift detector fired on the ranking model, dig into the feature importance, find the data-pipeline bug that poisoned a feature for half the request volume. Afternoon: kick off a retraining run on the new label set, review a teammate's PR that proposes adding three new features — flag one of them as label leakage, suggest the right windowing instead. Evening: review the design doc for the new real-time feature store the team is building.
Characteristics that define the work:
- Proprietary data is the moat. The reason the role exists at all is that the company has data nobody else has. A fraud team's labels. A marketplace's click-through history. A search team's query logs. The model is custom because the data is custom.
- The pipeline is the product. Training, feature engineering, serving, monitoring, retraining — all of it has to run reliably forever. The model itself is often a smaller part of the work than the surrounding pipeline.
- Statistical fluency is required. Distribution shift, label leakage, selection bias, online vs offline metrics — this is the daily vocabulary. You can't fake your way through it.
- Performance constraints are tight. Sub-100ms p99 inference for a ranker. Real-time feature computation. These are hard latency ceilings that foundation models can't meet at any reasonable cost.
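For a feel of what a drift detector might compute, here is a minimal sketch of the Population Stability Index (PSI), one common way to compare a feature's training distribution against live traffic. The bin count, the epsilon floor, the synthetic data, and the 0.25 alert threshold are assumptions; the threshold is a widely quoted rule of thumb, not a statistical guarantee.

```python
import math
from typing import List, Sequence


def psi(expected: Sequence[float], actual: Sequence[float],
        bins: int = 10, eps: float = 1e-6) -> float:
    """Population Stability Index between a reference and a live sample."""
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0  # avoid zero width for constant features

    def histogram(values: Sequence[float]) -> List[float]:
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            idx = min(max(idx, 0), bins - 1)  # clamp outliers into edge bins
            counts[idx] += 1
        # Floor empty bins at eps so the log term stays finite.
        return [max(c / len(values), eps) for c in counts]

    e, a = histogram(expected), histogram(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


# Synthetic example: a feature whose live values have shifted upward by 3.
training = [i / 100 for i in range(1000)]
live = [v + 3 for v in training]

ALERT_THRESHOLD = 0.25  # rule-of-thumb cutoff, assumed here
drifted = psi(training, live) > ALERT_THRESHOLD
```

A detector like this only says that something moved. Working out whether the cause is real-world change or a data-pipeline bug is the investigation described above.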
The skill stack: solid software engineering, plus genuine ML fundamentals (loss functions, regularization, evaluation methodology, common failure modes), plus the production-ML stack (feature stores, model registries, batch and streaming pipelines, monitoring).
Side-by-side
| Dimension | AI Engineer | ML Engineer |
|---|---|---|
| Where the model comes from | Foundation model API or open-weights checkpoint | Trained in-house on proprietary data |
| Primary skill stack | Strong product engineering + LLM API fluency | Strong SWE + ML fundamentals + production ML |
| Daily work | Prompts, retrieval, evals, tools, agents | Pipelines, features, training, monitoring, drift |
| What can ruin your day | Cost spike, eval regression, agent loop bug | Drift alert, feature leak, retraining job timeout |
| What you optimize for | Quality, latency, cost | Accuracy, calibration, robustness |
| Where you live in the stack | Application layer | Model + data layer |
| Math you actually need | Intuition for embeddings & attention | Real statistics & modeling theory |
| How fast you ship | Daily-to-weekly cycles | Weekly-to-monthly cycles |
Where the roles overlap
The overlap zone has gotten interesting. A few areas where both roles increasingly meet:
Evals. ML engineers have always needed strong offline evaluation. AI engineers have had to invent the practice from scratch for LLM products. Both groups now share vocabulary around golden sets, regression suites, LLM-as-judge, and online-offline gap analysis. If you're strong at evals, both worlds will hire you.
Retrieval and ranking. Retrieval-augmented generation looks an awful lot like search. Search has always been the home turf of ML engineers. AI engineers are now learning that the hard problems in RAG are mostly the hard problems in search — hybrid retrieval, reranking, query understanding — with prompts wrapped around them.
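Hybrid retrieval often comes down to merging a keyword ranking with a vector ranking. One standard way to do that is Reciprocal Rank Fusion (RRF), sketched below. The document IDs are made up, and `k=60` is the constant commonly used with RRF rather than a tuned value.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def reciprocal_rank_fusion(rankings: Sequence[Sequence[str]],
                           k: int = 60) -> List[str]:
    """Fuse ranked lists: each document scores sum(1 / (k + rank))."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by document ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


keyword_hits = ["doc_a", "doc_b", "doc_c"]  # e.g. from a BM25 index
vector_hits = ["doc_c", "doc_a", "doc_d"]   # e.g. from an embedding search
fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
print(fused)  # ['doc_a', 'doc_c', 'doc_b', 'doc_d']
```

RRF uses only rank positions, so it sidesteps the problem that keyword scores and vector similarities live on different scales. A reranker would typically run on the fused list afterward.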
Observability and cost. Production ML systems and production LLM systems both need deep monitoring. Both groups care about latency, throughput, and the cost of each unit of work. The skill of building those dashboards is the same skill.
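The core of such a dashboard is small. The sketch below computes per-call cost and a nearest-rank p95 latency from logged call records. The `CallRecord` fields and the per-million-token prices are hypothetical placeholders; real prices vary by vendor and model.

```python
import math
from dataclasses import dataclass
from typing import Sequence


@dataclass
class CallRecord:
    latency_ms: float
    input_tokens: int
    output_tokens: int


# Hypothetical prices in USD per million tokens.
PRICE_PER_M_INPUT = 3.00
PRICE_PER_M_OUTPUT = 15.00


def cost_usd(record: CallRecord) -> float:
    """Cost of one call under the assumed prices."""
    return (record.input_tokens * PRICE_PER_M_INPUT
            + record.output_tokens * PRICE_PER_M_OUTPUT) / 1_000_000


def percentile(values: Sequence[float], pct: int) -> float:
    """Nearest-rank percentile: smallest value covering pct% of the sample."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


# Twenty synthetic calls with latencies of 100 ms, 200 ms, ..., 2000 ms.
records = [CallRecord(latency_ms=100.0 * i, input_tokens=1000, output_tokens=500)
           for i in range(1, 21)]

p95_latency = percentile([r.latency_ms for r in records], 95)
total_cost = sum(cost_usd(r) for r in records)
```

The same two functions work whether the unit of work is an LLM call or a ranking request. Only the cost model changes.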
If your goal is to keep optionality open between the two roles, lean into these overlap areas early. Evals especially — it's the highest-leverage skill in both worlds and the bar is still very low at most companies.
The compensation reality
This is where the two roles diverge less than people expect. Salary data through early 2026 shows a modest gap that varies significantly by level and company type. At the median, ML engineer base salaries in the US run roughly 10–15% higher than AI engineer base salaries, reflecting the role's historically specialized math, statistics, and distributed-systems requirements. The picture shifts materially at senior, staff, and principal levels: AI engineers at frontier AI labs frequently match or exceed ML engineer total comp, and 2026 offer data shows AI engineers commanding a notably higher salary ceiling.
The median gap is also closing fast. At product startups where AI features are the entire roadmap, AI engineer total comp has caught up entirely. The bigger differentiator now is whether the company sees you as a senior product engineer who happens to work on AI or as a research-adjacent specialist. That framing shifts compensation more than the title does.
Two practical notes on comp:
- Equity at AI-native startups is often where the upside is. A senior AI engineer joining a Series A AI-product company in 2024-2025 is sitting on equity that has, in many cases, repriced 5-10x. That's not the base salary doing the work.
- ML engineer roles at scale companies are the steadier paycheck. Higher base, mature equity, less variance. If your priority is stability over upside, that's the trade.
Which one should you target?
Here's the framework that maps most cleanly to how teams actually hire in 2026:
Target AI engineering if you want to ship user-facing features in days, not quarters. You're already a strong backend or full-stack engineer. You have product intuition. You want to be in the room when the team decides what to build.
Target ML engineering if you want to spend your time on training pipelines, feature engineering, and the deep statistical questions. You're comfortable with longer cycles. You care more about model accuracy than user-facing iteration speed.
Two situational tiebreakers:
- What's the company's actual moat? If the company's edge is proprietary data and custom models, ML engineering is where you'll have impact and where the comp will be best. If the company's edge is the product experience built on top of a frontier model, AI engineering is the leverage point.
- How fast do you want feedback? AI engineering operates on daily or weekly cycles — ship, eval, learn. ML engineering often runs on weekly or monthly cycles — train, evaluate, productionize. Some people thrive on the faster loop. Others find it shallow. Be honest about which one matches your energy.
How to position yourself in the next 90 days
If you're a software engineer trying to break into AI engineering, the ramp is shorter than people think. Three to six months of focused work is enough to be hirable. The pattern that works:
- Build a retrieval-augmented chatbot end-to-end. Real data, real embeddings, hybrid retrieval, reranking. Not a demo — something you'd let a stranger use.
- Add a tool-calling agent. Give it three or four real tools, write the failure cases, build the eval harness. Most of the learning happens in the failure cases.
- Instrument cost and latency. Build the dashboard. Optimize one expensive path. Document the trade-off you made and why.
- Write about it. A blog post with real numbers, real failure modes, and real trade-offs is worth more than three certifications.
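For the tool-calling step, a useful first failure case to handle is the runaway loop. The sketch below is a minimal agent loop with a step budget and tool errors surfaced as observations. The `decide` function, the action dictionary shape, and the `add` tool are hypothetical stand-ins for a real model call and real tools.

```python
from typing import Any, Callable, Dict, List, Tuple


class StepBudgetExceeded(RuntimeError):
    """Raised when the agent hits max_steps without a final answer."""


def run_agent(decide: Callable[[str, List[Tuple[dict, Any]]], dict],
              tools: Dict[str, Callable[..., Any]],
              task: str, max_steps: int = 5) -> str:
    history: List[Tuple[dict, Any]] = []
    for _ in range(max_steps):
        action = decide(task, history)  # in practice: a model call
        if action["type"] == "final":
            return action["answer"]
        tool = tools.get(action["tool"])
        if tool is None:
            observation = f"error: unknown tool {action['tool']!r}"
        else:
            try:
                observation = tool(**action["args"])
            except Exception as err:  # report tool failures back to the agent
                observation = f"error: {err}"
        history.append((action, observation))
    raise StepBudgetExceeded(f"no final answer within {max_steps} steps")


TOOLS = {"add": lambda a, b: a + b}


def scripted_decide(task: str, history: List[Tuple[dict, Any]]) -> dict:
    """Stub policy: call `add` once, then answer with its result."""
    if not history:
        return {"type": "tool", "tool": "add", "args": {"a": 2, "b": 3}}
    return {"type": "final", "answer": str(history[-1][1])}


def looping_decide(task: str, history: List[Tuple[dict, Any]]) -> dict:
    """Stub policy that never finishes, to exercise the step budget."""
    return {"type": "tool", "tool": "add", "args": {"a": 1, "b": 1}}


answer = run_agent(scripted_decide, TOOLS, "add 2 and 3")
```

Each branch here (unknown tool, tool exception, exhausted budget) is a natural eval case, which is a reasonable place to start the harness.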
If you're an ML engineer wondering whether to migrate, the honest answer is: not necessarily. The ML engineering jobs that survived the foundation-model wave are deeper and better-paid than they were three years ago. Generic ML work moved up the stack, but the hard ML work stayed where it was — and there are fewer people in the world who can do it well. If you're already strong there, lean in.
If you want optionality across both worlds, focus on evals, retrieval/ranking, and production observability — the three areas where the skills transfer cleanly in both directions.
The bottom line
AI engineering and ML engineering are now two different jobs. They sound similar, the postings overlap, and recruiters use the titles interchangeably — but inside teams, the work has split. AI engineers ship products. ML engineers build models. Both roles will be in demand for a long time. The right one for you depends on whether you want to spend your day in the application layer or the model layer, and whether you'd rather optimize for shipping speed or model quality.
Pick the role that matches the work you actually want to do, not the title that sounds more impressive on a profile.