Smart Forum, Islamabad · 2025

Multi-Agent Wellness Assistant

A five-agent LLM system: orchestrator + four domain specialists.

AI Internship · Smart Forum, Islamabad · Jul – Aug 2025

The Problem

Wellness questions don't sort themselves. A user typing "I'm stressed and binge-ate, what should I do?" needs the diet agent and the mental-health agent - not a single generic chatbot, but also not a UI that asks them to pick a tab first. The product brief was a single assistant that feels general but actually delegates to specialists, with two hard constraints:

  • Personalized by default. Every response should be informed by what the user has logged - meals, workouts, sleep, mood, goals. A diet recommendation that ignores yesterday's logged meals is worse than no recommendation.
  • Operationally safe. The mental-health surface needs to recognize when to back off and escalate, and the orchestrator needs to firmly turn away anything outside the wellness domain instead of trying to answer politics or coding questions.

Our Approach

Five agents, gated supervisor pattern. The orchestrator is the only thing the user talks to; specialists are invoked under it.

  1. Orchestrator - routes, assembles per-user context, runs the deterministic out-of-scope (OOS) filter, composes the final reply.
  2. Vision agent - analyzes meal photos using Llama-3.2-11B-Vision-Instruct; returns structured items + estimated portions.
  3. Diet agent - nutrition recommendations grounded in the user's recent meal log + goals.
  4. Exercise planner - generates plans against fitness level, available equipment, and weekly history.
  5. Mental-health conversationalist - supportive conversation tuned for active listening; watches for escalation triggers.

Multi-agent wellness orchestration diagram: the user talks to a single-surface chat which routes to the Orchestrator, which fans out to Vision, Diet, Exercise, and Mental Health agents, and reads context from a 7-table SQLite store.

Figure 1 · Single-surface chat, orchestrated routing.
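
The context-assembly step can be sketched roughly as follows. This is a minimal illustration, not the project's code: the write-up only says the store is a 7-table SQLite database, so the three tables, their columns, and the `assemble_context` helper below are all hypothetical.

```python
import sqlite3

# Hypothetical schema: the table and column names are illustrative assumptions.
SCHEMA = """
CREATE TABLE meals    (user_id INTEGER, logged_at TEXT, description TEXT, kcal INTEGER);
CREATE TABLE workouts (user_id INTEGER, logged_at TEXT, activity TEXT, minutes INTEGER);
CREATE TABLE goals    (user_id INTEGER, goal TEXT);
"""

def assemble_context(conn, user_id, limit=5):
    """Collect a user's most recent log rows into a plain dict for prompting."""
    conn.row_factory = sqlite3.Row  # rows become addressable by column name
    context = {}
    # Table names come from this fixed tuple, never from user input,
    # so interpolating them into the SQL string is safe here.
    for table in ("meals", "workouts"):
        rows = conn.execute(
            f"SELECT * FROM {table} WHERE user_id = ? "
            "ORDER BY logged_at DESC LIMIT ?",
            (user_id, limit),
        ).fetchall()
        context[table] = [dict(row) for row in rows]
    goal_rows = conn.execute(
        "SELECT goal FROM goals WHERE user_id = ?", (user_id,)
    ).fetchall()
    context["goals"] = [row["goal"] for row in goal_rows]
    return context

# Usage with an in-memory database and made-up rows.
conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.execute("INSERT INTO meals VALUES (1, '2025-07-20T13:00', 'chicken biryani', 650)")
conn.execute("INSERT INTO meals VALUES (1, '2025-07-21T08:00', 'oatmeal', 300)")
conn.execute("INSERT INTO goals VALUES (1, 'lose 3 kg')")
print(assemble_context(conn, user_id=1))
```

The returned dict lists the newest rows first, so a specialist prompt built from it sees the most recent meals before older ones. A user with no logged workouts simply gets an empty list for that key.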

Per-turn activity diagram: a user message is checked by an in-scope filter (out-of-scope messages get a polite refusal); valid messages with an image first run through the Vision agent, then the Orchestrator plans a route to Diet, Exercise, and Mental Health specialists in parallel (each reading from SQLite), and finally the Orchestrator composes a unified reply.

Figure 2 · Per-turn activity flow.
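
The per-turn flow in Figure 2 might look something like the sketch below. It is a simplified illustration under stated assumptions: the agent functions are stand-ins for model calls, the keyword hints in `ROUTE_HINTS` are a toy replacement for the orchestrator's route planning, the SQLite reads are omitted, and the `CLARIFY` fallback for an empty route is my own addition.

```python
import asyncio
import re

REFUSAL = "Sorry, I can only help with wellness topics."
CLARIFY = "Could you tell me a bit more about what you need help with?"

# Toy keyword hints standing in for the orchestrator's route planning (assumption).
ROUTE_HINTS = {
    "diet": {"ate", "meal", "diet"},
    "exercise": {"workout", "gym", "run"},
    "mental_health": {"stressed", "anxious", "sad"},
}
BLOCKED = {"election", "politics"}  # hypothetical out-of-scope keywords

def words_of(message):
    """Lowercase the message and split it into a set of alphabetic words."""
    return set(re.findall(r"[a-z]+", message.lower()))

async def vision_agent(image):
    # Stand-in for the meal-photo model call.
    return {"items": [], "portions": []}

async def specialist(name, message, context):
    # Stand-in for one specialist's LLM call.
    await asyncio.sleep(0)
    return f"[{name}] reply to: {message}"

async def handle_turn(message, image=None):
    words = words_of(message)
    if words & BLOCKED:                 # 1. in-scope filter
        return REFUSAL
    context = {}
    if image is not None:               # 2. Vision runs first when a photo is attached
        context["meal_photo"] = await vision_agent(image)
    route = [name for name, hints in ROUTE_HINTS.items() if words & hints]
    if not route:
        return CLARIFY
    # 3. Fan out to the chosen specialists concurrently; gather keeps input order.
    replies = await asyncio.gather(
        *(specialist(name, message, context) for name in route)
    )
    return "\n".join(replies)           # 4. compose one unified reply

print(asyncio.run(handle_turn("I'm stressed and binge-ate, what should I do?")))
```

For the example message, the route contains both the diet and the mental-health specialist, so the printed reply has two lines. A real composer would merge the specialists' outputs rather than join them with a newline.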

The mid-project pivot

V1 had a multi-page UI: one tab per specialist. It looked sensible on paper. In practice, watching real users revealed two patterns that broke it:

  1. Users opened the diet tab and asked an exercise question. The diet agent gave a confused answer. They switched tabs and re-asked.
  2. Cross-domain questions ("I'm stressed and binge-ate") had no good home. Users picked a tab semi-randomly and got a half-answer.

The fix was structural, not cosmetic: collapse to a single chat surface and put the routing decision inside the orchestrator instead of asking the user to make it. After the consolidation, the cross-domain case became the easy case - the orchestrator dispatches to two specialists in parallel, reads back, and composes a unified reply.

The lesson I took away: agent design is a UX problem before it is a modeling problem. The model wasn't broken; the surface was making users do work that the system should have done.

Takeaways

  • Agent design is a UX problem before it is a modeling problem. The biggest single quality jump came from collapsing a multi-page UI into a unified chat - no model change, no prompt change.
  • Gated supervisor > free agent swarm at this size. Five agents under a deterministic orchestrator is debuggable; five agents talking to each other is not.
  • Context assembly is the unsung hero. Most agent-quality issues I tracked weren't prompt issues; they were "the agent didn't know X about the user" issues - fixed by reading the right rows from SQLite, not by tuning temperature.
  • Cheap deterministic filters before expensive LLM calls. The keyword-based OOS gate cut routing-call volume meaningfully and made the system feel more decisive on edge cases.
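
A keyword-based gate of the kind described in the last takeaway could be as small as the sketch below. The keyword lists, the three-way verdict, and the rule that wellness terms win over out-of-scope terms are all assumptions for illustration; the write-up does not publish the real lists or logic.

```python
import re

# Hypothetical keyword lists; the real ones are not given in the write-up.
WELLNESS_TERMS = {
    "meal", "diet", "ate", "calories", "workout", "exercise",
    "sleep", "stressed", "mood", "anxious",
}
OUT_OF_SCOPE_TERMS = {
    "election", "politics", "president", "code", "python", "javascript",
}

def scope_gate(message):
    """Return 'in', 'out', or 'unsure' without calling any model."""
    words = set(re.findall(r"[a-z]+", message.lower()))
    if words & WELLNESS_TERMS:
        return "in"       # wellness terms win, even next to an off-topic word
    if words & OUT_OF_SCOPE_TERMS:
        return "out"      # refuse immediately; no LLM routing call is spent
    return "unsure"       # leave the decision to the (more expensive) router

for text in (
    "Who should I vote for in the election?",
    "Writing code all night is ruining my sleep",
    "Hello there",
):
    print(scope_gate(text), "-", text)
```

Checking wellness terms first is a deliberate choice in this sketch: a message that mentions coding but is really about sleep stays in scope, and only messages with no wellness signal at all are refused outright.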