Engineering

AI Engineer (LLM & Applied ML)

McKinney, TX

Hybrid

Full-time

As an AI Engineer (LLM & Applied ML) at Confer's McKinney, TX HQ, you design and ship the model-driven core of our agent products: RAG pipelines, fine-tuned open models (using PEFT methods such as LoRA), and agentic systems with MCP and multi-agent orchestration. You work in Python and PyTorch, evaluate rigorously, and help set technical direction.

About the role

Confer Inc., doing business as Confer Solutions AI, is an applied-AI company. We build intelligent agents, and the orchestration that makes teams of them work together, to automate real, end-to-end work. Founded by Yatin Karnik, our CEO and a former 18-year Senior Vice President at Wells Fargo, Confer pairs teams of AI agents with our own engineers to take products from idea to production. Across everything we build, the throughline is the same: intelligent agents that automate what humans shouldn't have to do.

This role is based at Confer's headquarters in McKinney, Texas, working hybrid alongside the founding team. You'll sit with the people setting product direction and see your work reach customers quickly. The pace is fast and the feedback loop is short: you build alongside both our engineers and our AI agents, and what you ship goes straight into real products.

As an AI Engineer focused on LLMs and applied ML, you'll design and ship the model-driven systems at the core of our agent products. You'll build retrieval-augmented generation (RAG) pipelines over vector databases; fine-tune and adapt open models (BERT, LLaMA, Mistral, and others) with supervised fine-tuning (SFT) and parameter-efficient fine-tuning (PEFT) methods such as LoRA; and build agentic systems using the Model Context Protocol (MCP), tool use, and multi-agent orchestration. You'll work in Python and PyTorch with Hugging Face Transformers, orchestrate with LangChain or LangGraph, route across models, evaluate rigorously with tools like Langfuse, and deploy with Docker. You'll also help set technical direction and mentor engineers earlier in their careers.

What you'll do

Design and build RAG pipelines over vector databases.
Fine-tune and adapt open models (BERT, LLaMA, Mistral) using SFT and PEFT methods such as LoRA.
Build agentic systems with MCP, tool use, and multi-agent orchestration.
Establish disciplined model evaluation (accuracy, faithfulness, latency) with tools like Langfuse.
Deploy and operate models in production with Docker.
Mentor engineers and help set technical direction.

What we're looking for

A Master's (MS) degree in computer science, machine learning, AI, or a related field.
3–6 years building production ML or AI systems.
Strong Python and PyTorch, with hands-on Hugging Face Transformers experience.
Practical experience preparing datasets and fine-tuning models (SFT, and PEFT methods such as LoRA).
Experience building RAG systems over vector databases using LangChain, LangGraph, or LlamaIndex.
Experience with agentic AI / MCP, multi-agent orchestration, and rigorous model evaluation (e.g., Langfuse).

Nice to have

Experience fine-tuning or serving open models like LLaMA or Mistral at scale.
Multi-model routing (LiteLLM / OpenRouter) and workflow orchestration (e.g., Temporal).
A track record of mentoring engineers and shaping technical direction.
