Research Intern, Inference (Fall 2026)

Together AI · San Francisco · Full-time · Research · Posted Jun 12, 2026

What it’s like to work at Together AI

AI Infrastructure · San Francisco

Employee Rating: 4.1/5
Work-Life Balance: 3.8/5
Open Roles: 57
Culture values: open-source · eng-driven · learning · flat · many-hats

What employees love

  • Open-source AI infrastructure — real mission alignment, not just marketing
  • Small team with outsized research output; your work has direct impact

What could be better

  • Early-stage means wearing many hats constantly — not for specialists
  • Limited career ladder; growth paths are still being defined

About the Role

The Inference Research team is dedicated to building the next generation of efficient, scalable, and reliable serving systems for large foundation models, directly contributing to the mission of advancing open and transparent AI. Our work operates at the critical intersection of cutting-edge model architectures, high-performance systems engineering, and deep hardware optimization. We focus on co-designing software, algorithms, and models to significantly lower the cost and latency of modern AI systems.

As a research intern, you will dive into the complexities of distributed inference, compiler-aware optimization, and novel inference-time computation strategies (such as speculative decoding and phase-aware execution). You will be tasked with co-designing and implementing cross-layer optimizations across models, systems, and hardware, with a focus on areas like KV cache design and large-scale serving architectures.

Projects aim to unlock unprecedented performance and scale for foundation models, enabling faster serving, larger model deployment (e.g., Mixture-of-Experts), and robust, reproducible evaluation under realistic serving workloads.

Responsibilities

Requirements

Preferred Qualifications

Internship Program Details

Our fall internship program runs 12 to 16 weeks, during which you’ll have the opportunity to work with industry-leading engineers building a cloud from the ground up and possibly contribute to influential open-source projects. The internship dates are September 14 to December 18.

About Together AI

Together AI is a research-driven artificial intelligence company. We believe open and transparent AI systems will drive innovation and create the best outcomes for society, and together we are on a mission to significantly lower the cost of modern AI systems by co-designing software, algorithms, and models. We have contributed to leading open-source research, models, and datasets to advance the frontier of AI, and our team has been behind technological advancements such as FlashAttention, Mamba, FlexGen, Petals, Mixture of Agents, and RedPajama.

Compensation

We offer competitive compensation, housing stipends, and other benefits. The estimated US hourly rate for this role is $58-63/hr. Our hourly rates are determined by location, level, and role. Individual compensation will be determined by experience, skills, and job-related knowledge.

Equal Opportunity

Together AI is an Equal Opportunity Employer and is proud to offer equal employment opportunity to everyone regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity, veteran status, and more.

Please see our privacy policy at https://www.together.ai/privacy

Frequently Asked Questions

What is the work-life balance like at Together AI?
Together AI has a work-life balance score of 3.8/5 based on employee reviews. This is about average for the AI/tech industry.
What is Together AI’s culture like?
Together AI is characterized by these culture values: open-source, eng-driven, learning, flat, and many-hats. Based on employee reviews, the company has an overall rating of 4.1/5. Employees cite its open-source AI infrastructure and real mission alignment as highlights.
How many open roles does Together AI have?
Together AI currently has 57 open roles across departments including engineering, product, sales, and more. Roles are refreshed daily from their careers page.
Is this role remote-friendly?
This role is located in San Francisco. Check the job description above for specific location and remote work details.