Implementation Plan · Spike

Text reply goes live

Route an opted-in contact's inbound SMS through the new ai_agent runtime end-to-end — find-or-spawn one long-lived thread per contact, reconstruct the owner's session, and either send autonomously or draft-and-escalate, with the behavior chosen by capability, not prose.

Issue · #1454 (bundles #1424 + #1425)
Epic · #1422 · builds on merged #1423
Scope · backend only, no UI this spike
Milestone · AI Agents v3

↳ How this plan diverges from the issue body:

The binding is per-phone-number (phone_number.ai_agent_id), not per-contact (contact_text_agent) — so you can run multiple agents and assign each to specific numbers.
Suggest-vs-send is a dedicated ai_agent.send_mode column (the issue left the gate open between three options).
Suggest mode reuses the legacy 3-reply picker. A new tool writes drafts into text_agent_suggestion and the existing InlineSuggestionsPanel renders them on the message — pick & send wires up for free.
Assignment UI is dropped from this spike — assignment is set via DB/seed for now; the UI lands later in the agent view (assign numbers to an agent), not phone settings.

00 What we're building

Today, every inbound SMS goes to the legacy stateless text_agent (generates suggestions, optional auto-reply, no memory). This spike adds a parallel agentic path: a number assigned an ai_agent routes its inbound to a stateful, per-contact agent thread that holds conversation memory and acts with real domain tools.

Today — legacy text_agent

Stateless: re-reads history every message, no thread.
Binding: contact_text_agent link or phone_number.text_agent_id.
Behavior: auto_reply boolean + confidence threshold.
Output: writes text_agent_suggestion rows; optionally sends best one.

This spike — ai_agent runtime

Stateful: one long-lived thread per (agent, contact), keyed by correlation_key="contact:{id}".
Binding: phone_number.ai_agent_id — set = opt-in.
Behavior: send_mode (autonomous | suggest) — a capability, not a number.
Output: real send_sms (autonomous) or escalate_to_user push with a draft (suggest).

Non-opted-in numbers are untouched

If phone_number.ai_agent_id is NULL, the inbound still flows to the legacy text_agent exactly as today. No backfill, no regression — the agentic path is purely additive and opt-in.

01 Decisions locked

Four product/architecture forks, resolved with you before planning.

DEFAULT AGENT

Seed a new "Text Reply" system agent, cloned per member (like Follow-up Drafter). Not the assistant clone, not the drafter.

BINDING

New phone_number.ai_agent_id column. Resolve: phone.ai_agent_id ?? legacy text_agent. Multiple agents, assign each to numbers.

SEND GATE

New ai_agent.send_mode = autonomous | suggest. Suggest drops send_sms; both keep escalate_to_user.

ASSIGNMENT UI

Dropped this spike. Assignment via DB/seed. The picker lands later in the agent view, not phone settings.

02 End-to-end flow

The reroute lives in handle_inbound_message (twilio_events_api.rs), at the point that today spawns the legacy text_agent.

Twilio inbound webhook ──▶ handle_inbound_message()
                         │
┌────────────────────────┴───────────────────────────┐
│ resolve phone (by `to`) · resolve/create contact   │
│ create_message(Inbound) · opt-out keyword check    │
└────────────────────────┬───────────────────────────┘
                         │
            ┌────────────┴────────────┐
            │   phone.ai_agent_id ?   │
            └──────┬───────────┬──────┘
              Some │           │ None
                   ▼           ▼
╔══════════════════════════╗  ┌─────────────────────────────┐
║ AGENTIC PATH (new)       ║  │ LEGACY text_agent (today)   │
╟──────────────────────────╢  │ handle_inbound_message_..   │
║ gate: opted-out? empty?  ║  └─────────────────────────────┘
║   └─ yes ▶ skip          ║
║ find_or_spawn_contact_   ║
║ thread(agent, contact,   ║
║   InboundSms{ phone_id,  ║
║   message_id, text })    ║
╚════════════╦═════════════╝
             ▼
enqueue_and_maybe_spawn ─▶ worker turn (run_ai_agent_thread)
                             │  build_session_for_member(owner)
                             │  + domain tools (send_mode gate)
                             │  + escalate_to_user (allowlist)
             ┌───────────────┴───────────────┐
             ▼                               ▼
┌──────────────────┐          ┌───────────────────────────────┐
│ autonomous:      │          │ suggest:                      │
│ send_sms(from =  │          │ propose_sms_replies([3])      │
│ texted number)   │          │ ▶ rows in text_agent_         │
│                  │          │ suggestion ▶ existing         │
│                  │          │ picker on the message         │
└──────────────────┘          └───────────────────────────────┘
                              (proactive "drafts waiting" push = follow-up issue)

Reuses the merged #1423 machinery wholesale

The worker turn, build_session_for_member, the domain-tool bundle, and escalate_to_user all already exist and are exercised by the Loquent Assistant. This spike only adds the routing into that runtime plus the send-mode gate.

03 Schema & migrations

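Three migrations in this section (a fourth, on text_agent_suggestion, is described in §6 and ships with the column migrations in build step 1). Columns first, then the seed (which depends on the columns existing). Schema files regenerate via just generate and are never hand-edited.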

Migration	Change	Notes
`…_phone_number_add_ai_agent_id`	`phone_number.ai_agent_id UUID NULL` FK → `ai_agent(id)` `ON DELETE SET NULL`	Deleting the agent silently disables agentic routing (falls back to legacy) — desirable.
`…_ai_agent_add_send_mode`	`ai_agent.send_mode TEXT NOT NULL DEFAULT 'autonomous'` `CHECK (send_mode IN ('autonomous','suggest'))`	Default `'autonomous'` so existing assistant clones are unaffected (their path ignores the gate anyway — §6).
`…_seed_text_reply_agent`	Insert base "Text Reply" agent + backfill per-member clones	Mirrors `m20260610_190000_seed_followup_drafter_agent.rs` verbatim in structure.

Verified gap: the clone path must explicitly carry send_mode

The runtime clone builds an ActiveModel with ..Default::default() (→ NotSet → DB default 'autonomous'), and the seed backfill SQL has a fixed column list. Both must explicitly copy send_mode from the source, or every cloned Text Reply agent silently becomes autonomous and starts sending. This is the single highest-risk correctness bug in the plan.
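The failure mode above can be reproduced without a database. The following is a minimal sketch, assuming hypothetical stand-ins (`Field`, `AgentDraft`, `insert`) for the ORM's active-value and active-model types; none of these names come from the codebase. It shows how a struct-update `..Default::default()` leaves `send_mode` unset, so the simulated insert falls back to the column default.

```rust
// Hypothetical stand-in for an ORM "active value": either explicitly set or left unset.
#[derive(Debug, Clone, PartialEq)]
enum Field<T> {
    Set(T),
    NotSet,
}

impl<T> Default for Field<T> {
    fn default() -> Self {
        Field::NotSet
    }
}

// Hypothetical stand-in for the agent's active model (only two columns shown).
#[derive(Debug, Default)]
struct AgentDraft {
    name: Field<String>,
    send_mode: Field<String>,
}

// Source row being cloned.
struct Agent {
    name: String,
    send_mode: String,
}

// Mirrors the column default from the send_mode migration.
const DB_DEFAULT_SEND_MODE: &str = "autonomous";

/// Simulates the INSERT: unset columns take the database default.
/// Returns (name, send_mode) as stored.
fn insert(draft: AgentDraft) -> (String, String) {
    let name = match draft.name {
        Field::Set(v) => v,
        Field::NotSet => String::new(),
    };
    let send_mode = match draft.send_mode {
        Field::Set(v) => v,
        Field::NotSet => DB_DEFAULT_SEND_MODE.to_string(),
    };
    (name, send_mode)
}

/// The buggy shape: send_mode is swallowed by `..Default::default()`.
fn clone_lossy(src: &Agent) -> AgentDraft {
    AgentDraft {
        name: Field::Set(src.name.clone()),
        ..Default::default()
    }
}

/// The fixed shape: send_mode is copied explicitly from the source.
fn clone_explicit(src: &Agent) -> AgentDraft {
    AgentDraft {
        name: Field::Set(src.name.clone()),
        send_mode: Field::Set(src.send_mode.clone()),
    }
}
```

Cloning a `suggest` agent through `clone_lossy` stores `autonomous`, while `clone_explicit` preserves `suggest`. The same reasoning applies to the seed backfill SQL, where the fix is adding `send_mode` to the fixed column list.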

04 The Text Reply agent

A new platform system agent (org/user NULL), cloned per member so each member's threads run on their own owned agent — the owner's session is what grants the domain tools.

Configuration

Field	Value	Why
`TEXT_REPLY_AGENT_ID`	`a1a1…0005`	New well-known constant in `constants.rs` (mirrored hardcoded in the migration crate).
`attach_domain_tools`	`true`	Attaches the owner's session-gated CRM/messaging bundle (incl. `send_sms`, read tools).
`tools_allowlist`	`["escalate_to_user"]`	The meta-tool that pushes a draft/handoff to the human owner. Confirmed working.
`send_mode`	`'suggest'` (seeded default)	Safe default: drafts and escalates rather than auto-sending. Flip to `autonomous` per-agent.
`persona` / `goals`	SMS-reply tuned	"Reply from the same number; in suggest mode, hand the draft to your owner via `escalate_to_user`; send only when confident."

Provisioning

Append TEXT_REPLY_AGENT_ID to DEFAULT_USER_AGENT_SOURCES ([Uuid; 3] → [Uuid; 4]) so new members get a clone at signup, and the seed migration backfills clones for existing members.

Whose agent acts on a shared org number?

Phone numbers are org-scoped; agent clones are member-scoped. The assigned ai_agent_id row carries its own user_id/org_id — that member's session is reconstructed for tool permissions. So "which member acts" is answered explicitly by the assignment. If that owner later leaves the org, build_session_for_member returns Ok(None) → no domain tools → the turn degrades gracefully (it can still escalate).

05 Find-or-spawn helper

The reusable seam (shared with #1429's Scheduled trigger): one long-lived thread per (agent, contact), race-safe under concurrent inbound.

// src/mods/ai_agent/services/find_or_spawn_contact_thread_service.rs (new)
pub async fn find_or_spawn_contact_thread(
    db: &DatabaseConnection,
    agent_id: Uuid,
    contact_id: Uuid,
    payload: AiThreadEventPayload,   // event-type parameterized — InboundSms OR Scheduled
) -> Result<Uuid, AppError> {
    let key = format!("contact:{contact_id}");

    // 1. Reuse a non-archived thread for (agent, key) if one exists.
    if let Some(id) = active_thread_for(db, agent_id, &key).await? {
        enqueue_and_maybe_spawn(id, None, payload).await?;
        return Ok(id);
    }

    // 2. Else insert one WITH the correlation_key. The partial unique index
    //    uq_ai_thread_agent_correlation_active guards against a concurrent
    //    insert; on 23505 we re-query the winner's thread.
    let id = match spawn_ai_agent_thread(db, agent_id, Some(key.clone())).await {
        Ok(id) => id,
        Err(AppError::Database(e)) if is_unique_violation(&e) =>
            active_thread_for(db, agent_id, &key).await?
                .ok_or_else(|| AppError::Internal("race lost but no thread".into()))?,
        Err(e) => return Err(e),
    };
    enqueue_and_maybe_spawn(id, None, payload).await?;
    Ok(id)
}

Required change	File
Extend `spawn_ai_agent_thread(db, agent_id)` → add `correlation_key: Option<String>` param (1 existing caller updated to pass `None`)	`spawn_ai_agent_thread_service.rs`
New `find_or_spawn_contact_thread` + small `active_thread_for` query helper	new service file
No change to `enqueue_and_maybe_spawn` — its claim gate is on `state`, which correctly wakes an idle thread and leaves a running one to drain	`enqueue_and_maybe_spawn_service.rs`

Race guard verified against the live index

CREATE UNIQUE INDEX uq_ai_thread_agent_correlation_active ON ai_thread (agent_id, correlation_key) WHERE state <> 'archived' AND correlation_key IS NOT NULL — exactly the predicate we need. is_unique_violation() already detects Postgres 23505 on a sea_orm::DbErr.
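To make the race argument concrete, here is a small in-memory sketch of the same find, insert, re-query pattern. It assumes a hypothetical `ThreadStore` whose map plays the role of the partial unique index and whose `insert_thread` returns a `UniqueViolation` error in place of Postgres 23505. Integer ids stand in for UUIDs, and none of these types exist in the codebase.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};
use std::thread;

#[derive(Debug, PartialEq)]
enum StoreError {
    UniqueViolation, // stands in for Postgres error code 23505
}

// Hypothetical in-memory stand-in for the ai_thread table plus its unique index.
#[derive(Default)]
struct ThreadStore {
    // (agent_id, correlation_key) -> thread_id for non-archived threads.
    active: Mutex<HashMap<(u32, String), u64>>,
    next_id: Mutex<u64>,
}

impl ThreadStore {
    fn active_thread_for(&self, agent_id: u32, key: &str) -> Option<u64> {
        let active = self.active.lock().unwrap();
        active.get(&(agent_id, key.to_string())).copied()
    }

    fn insert_thread(&self, agent_id: u32, key: &str) -> Result<u64, StoreError> {
        let mut active = self.active.lock().unwrap();
        if active.contains_key(&(agent_id, key.to_string())) {
            return Err(StoreError::UniqueViolation);
        }
        let mut next = self.next_id.lock().unwrap();
        *next += 1;
        active.insert((agent_id, key.to_string()), *next);
        Ok(*next)
    }
}

/// Lookup and insert are separate steps, so two callers can both miss the
/// lookup; the loser of the insert re-queries and adopts the winner's thread.
fn find_or_spawn(store: &ThreadStore, agent_id: u32, contact_id: u32) -> u64 {
    let key = format!("contact:{contact_id}");
    if let Some(id) = store.active_thread_for(agent_id, &key) {
        return id;
    }
    match store.insert_thread(agent_id, &key) {
        Ok(id) => id,
        Err(StoreError::UniqueViolation) => store
            .active_thread_for(agent_id, &key)
            .expect("a unique violation implies a winning thread exists"),
    }
}

/// Fires `workers` concurrent inbound events for the same (agent, contact).
fn race(workers: usize) -> Vec<u64> {
    let store = Arc::new(ThreadStore::default());
    let handles: Vec<_> = (0..workers)
        .map(|_| {
            let store = Arc::clone(&store);
            thread::spawn(move || find_or_spawn(&store, 1, 42))
        })
        .collect();
    handles.into_iter().map(|h| h.join().unwrap()).collect()
}
```

Every worker in `race` should come back with the same thread id regardless of interleaving, which is the property the race-path unit test in §14 needs to assert against the real index.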

06 Suggest vs. send gate

The send_mode column swaps the agent's terminal move. Autonomous sends; suggest drafts pickable options. Both decided at the single non-assistant tool-build site.

// run_ai_agent_thread_service.rs — build_domain_tools_for_agent(&db, &agent, reply_ctx)
let mut tools = collect_agent_domain_rig_tools(&session, tier_flags);
match agent.send_mode.as_str() {
    "suggest" => {
        // Cannot send. Swaps send_sms for a draft-options tool bound to the inbound.
        tools.retain(|t| t.name() != "send_sms");
        if let Some(ctx) = &reply_ctx {
            tools.push(build_propose_sms_replies_tool(ctx.clone()));
        }
    }
    _ => /* autonomous: send_sms stays, curried to reply-from the texted number (§7) */ {}
}
tools

Mode	Terminal tool	Behavior
`autonomous`	`send_sms` (+ `escalate_to_user`)	Read bundle + sends from the texted number when confident; escalates when not.
`suggest`	`propose_sms_replies` (+ `escalate_to_user`)	Read bundle − `send_sms`. Drafts 2–3 options; physically cannot send. Escalate stays for genuine "can't draft / needs judgment" cases.
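The table can be expressed as a pure function over tool names, which is also a convenient shape for the unit test in §14. This is a sketch under assumptions: tools are reduced to plain strings, `read_contact` is a made-up read tool, and the typed `SendMode` enum is a suggestion rather than what the plan specifies (the plan matches on the raw column string).

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum SendMode {
    Autonomous,
    Suggest,
}

impl SendMode {
    /// Parses the column value. Unknown strings are rejected instead of
    /// falling through to autonomous, so a bad value cannot enable sending.
    fn parse(raw: &str) -> Option<SendMode> {
        match raw {
            "autonomous" => Some(SendMode::Autonomous),
            "suggest" => Some(SendMode::Suggest),
            _ => None,
        }
    }
}

/// Applies the send-mode gate to the domain tool bundle.
/// `has_reply_ctx` models whether an inbound reply context is available.
fn gate_tools(
    mut tools: Vec<&'static str>,
    mode: SendMode,
    has_reply_ctx: bool,
) -> Vec<&'static str> {
    match mode {
        SendMode::Suggest => {
            // Suggest mode cannot send: drop send_sms, add the draft tool.
            tools.retain(|t| *t != "send_sms");
            if has_reply_ctx {
                tools.push("propose_sms_replies");
            }
        }
        SendMode::Autonomous => {
            // send_sms stays in the bundle; the default-from currying is handled elsewhere.
        }
    }
    tools
}
```

With a bundle of `["read_contact", "send_sms"]`, suggest mode yields `["read_contact", "propose_sms_replies"]` and autonomous mode returns the bundle unchanged. Parsing into an enum at the boundary is one way to avoid the `_ =>` arm treating an unexpected value as autonomous; the CHECK constraint already makes that case unlikely.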

Suggest mode = pickable reply options (reuse the live picker)

The suggest agent's job is to draft, not send. Its terminal tool — propose_sms_replies(options: [2–3 strings]) — is built per-turn, curried with the inbound reply_ctx (contact_id, message_id) the way escalate_to_user is bound to its thread. It writes one row into the existing text_agent_suggestion table; from there the whole legacy surface wires up with zero new UI:

propose_sms_replies([d1, d2, d3])
  └─ INSERT text_agent_suggestion { message_id, contact_id, ai_agent_id, suggestions:[…] }
        │
InlineSuggestionsPanel (existing) ── get_suggestions_api(message_id) ──▶ 3 cards on the message
        │
"Use this reply" ──▶ composer prefilled (ai_origin="suggestion") ──▶ send from the texted number

Reuse / change	Detail
reuse `text_agent_suggestion` table	One row per inbound, `suggestions` JSON blob of `{body, confidence}` — exactly what the panel reads.
migration nullable + provenance	Make `text_agent_id` nullable; add `ai_agent_id UUID NULL` so an ai_agent can own the row. (When the `text_agent` mod is retired later, the logic migrates back cleanly.)
reuse `InlineSuggestionsPanel` + `get_suggestions_api`	Renders the 3 cards on a tapped inbound message; the picker/compose/send path is untouched. Fetch is by `message_id`/org, not `text_agent_id`, so ai_agent rows show as-is.
reuse composer + `ai_origin`	"Use this reply" prefills with `ai_origin="suggestion"`; send goes out from the texted number via the §7 reply context.
new `propose_sms_replies` tool	The only net-new piece. Reply-context-bound; validates 2–3 options; inserts the row.
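The "validates 2–3 options" step of the new tool might look like the sketch below. The error type and the trimming and empty-option rules are assumptions for illustration; the plan only states the 2–3 count.

```rust
#[derive(Debug, PartialEq)]
enum ProposeError {
    /// The model supplied fewer than 2 or more than 3 options.
    WrongCount(usize),
    /// The option at this index was blank after trimming (assumed rule).
    EmptyOption(usize),
}

/// Validates draft replies before they are written to text_agent_suggestion.
/// Returns the trimmed options on success.
fn validate_options(options: &[String]) -> Result<Vec<String>, ProposeError> {
    if !(2..=3).contains(&options.len()) {
        return Err(ProposeError::WrongCount(options.len()));
    }
    let mut cleaned = Vec::with_capacity(options.len());
    for (index, option) in options.iter().enumerate() {
        let trimmed = option.trim();
        if trimmed.is_empty() {
            return Err(ProposeError::EmptyOption(index));
        }
        cleaned.push(trimmed.to_string());
    }
    Ok(cleaned)
}
```

Returning a typed error lets the tool hand the reason back to the model so it can retry with a corrected option list instead of writing a malformed row.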

Proactive "drafts are waiting" alert → follow-up issue

This spike makes the drafts appear in-conversation (the user sees them when they open the thread). The proactive push/in-app notification that pulls them there — "{Agent} drafted 3 replies for {contact}" deep-linking to /messaging/{contact_id} — ships as a separate follow-up issue. The OS tray can't carry tappable reply buttons without new platform work, so that path is notification → deep-link → in-app picker.

Why this site, and why the assistant is safe

is_assistant_flavor is true only for system_source_id == LOQUENT_ASSISTANT_AGENT_ID; that path uses assemble_assistant_turn and never calls build_domain_tools_for_agent. Text Reply has a different source id, so it always takes the attach_domain_tools branch — making this the exact and only chokepoint. build_domain_tools_for_agent already gets &agent; it gains one param, reply_ctx, threaded from the triggering InboundSms (the same value §7 needs).

07 Reply from the right number · settled · curry

The verification's most important finding: send_sms resolves the from number from the member, not the number the contact texted.

The bug if we do nothing

send_sms with no explicit phone_number_id falls back to resolve_sender_phone → the member's default/assigned phone. In an org with two numbers, a contact texting line B could get a reply from line A. send_sms does accept an explicit phone_number_id (and validates it) — but the inbound phone_number_id is currently never surfaced to the agent.

Settled now that suggest mode forces the same plumbing: a ReplyContext { contact_id, message_id, phone_number_id } is extracted from the freshest InboundSms pending event and threaded into build_domain_tools_for_agent. It drives both terminal tools:

autonomous — send_sms is curried so phone_number_id defaults to the texted line even if the model omits it (model can still override).
suggest — propose_sms_replies is bound to (contact_id, message_id); the human's pick sends from the same line via the composer.
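The autonomous-mode currying can be sketched as a closure that fills in the sender only when the model leaves it out. `SendSmsArgs` and `curry_send_sms` are hypothetical names, and integer ids stand in for UUIDs; the real `send_sms` argument shape is not shown in this plan.

```rust
// Mirrors the ReplyContext described above; u64 ids stand in for Uuid.
#[derive(Debug, Clone, Copy, PartialEq)]
struct ReplyContext {
    contact_id: u64,
    message_id: u64,
    phone_number_id: u64,
}

// Hypothetical shape of the arguments the model passes to send_sms.
#[derive(Debug, PartialEq)]
struct SendSmsArgs {
    contact_id: u64,
    body: String,
    phone_number_id: Option<u64>,
}

/// Returns a function that defaults the sender to the texted line.
/// An explicit phone_number_id from the model is left untouched.
fn curry_send_sms(ctx: ReplyContext) -> impl Fn(SendSmsArgs) -> SendSmsArgs {
    move |mut args: SendSmsArgs| {
        if args.phone_number_id.is_none() {
            args.phone_number_id = Some(ctx.phone_number_id);
        }
        args
    }
}
```

Applied per turn, this means the member-level fallback in `resolve_sender_phone` is only reached when there is no inbound reply context at all.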

Q1 resolved — the curry comes for free

We have to thread ReplyContext in for the suggestions tool anyway, so currying send_sms's default-from is nearly zero extra cost — and makes a wrong-number reply impossible by default. This was the original open Q1.

08 Webhook reroute & safety gates

One branch in handle_inbound_message, plus three guards the verification flagged.

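// twilio_events_api.rs: replaces the current legacy text_agent spawn (~line 265)
match phone.ai_agent_id {
    Some(agent_id) => {
        // gate 1: don't auto-engage an opted-out contact
        // gate 2: skip media-only / empty-body inbound (nothing to reason about)
        if opted_out || body_for_agent.trim().is_empty() {
            tracing::info!("agentic path skipped (opt-out or empty body)");
        } else {
            // find_or_spawn_contact_thread takes `db` first (§5). The spawned task must be
            // 'static, so it owns a clone of the connection (assumes a `db` handle in scope).
            let db = db.clone();
            let payload =
                AiThreadEventPayload::inbound_sms(message.id, contact_id, phone.id, body);
            tokio::spawn(async move {
                // Log failures instead of silently dropping the Result.
                if let Err(e) =
                    find_or_spawn_contact_thread(&db, agent_id, contact_id, payload).await
                {
                    tracing::error!("agentic path: find_or_spawn_contact_thread failed: {e:?}");
                }
            });
        }
        // gate 3: legacy text_agent is NOT spawned for opted-in numbers (no reply race)
    }
    None => { tokio::spawn(handle_inbound_message_for_text_agent(/* unchanged */)); }
}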

Gate	Behavior	Source
Opt-out	If the contact opted out of SMS, skip the agentic spawn entirely (autonomous send would be blocked downstream anyway; this saves the turn).	risk: opt-out HIGH
Empty / MMS-only	If body is empty (media-only), skip spawning — the turn would have nothing to act on; the message + media still persist and notify.	risk: empty-body HIGH
Reply race	Opted-in numbers skip the legacy `text_agent` so a single inbound can't yield two replies.	risk: path-collision HIGH
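The three gates reduce to a small decision function, which may be easier to unit-test than the webhook handler itself. This is a sketch only: the `Route` enum and `route_inbound` are hypothetical names, and ids are integers rather than UUIDs.

```rust
#[derive(Debug, PartialEq)]
enum Route {
    /// Hand the inbound to the agent thread for this agent.
    Agentic { agent_id: u64 },
    /// Number is opted in, but this inbound is skipped; legacy is NOT run either.
    AgenticSkipped(&'static str),
    /// Number has no ai_agent_id: legacy text_agent, unchanged.
    Legacy,
}

/// Decides where an inbound SMS goes, given the number's ai_agent_id,
/// the contact's opt-out state, and the message body.
fn route_inbound(ai_agent_id: Option<u64>, opted_out: bool, body: &str) -> Route {
    match ai_agent_id {
        // Non-opted-in numbers keep today's behavior regardless of the gates.
        None => Route::Legacy,
        // gate 1: never engage an opted-out contact.
        Some(_) if opted_out => Route::AgenticSkipped("opt-out"),
        // gate 2: media-only or whitespace-only bodies give the turn nothing to act on.
        Some(_) if body.trim().is_empty() => Route::AgenticSkipped("empty body"),
        // gate 3 is structural: this arm never falls back to Legacy.
        Some(agent_id) => Route::Agentic { agent_id },
    }
}
```

Keeping `AgenticSkipped` distinct from `Legacy` makes gate 3 explicit in the type: a skipped inbound on an opted-in number does not fall through to the legacy responder.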

Plan automation still fires alongside — see Q2

wake_plans_for_contact and create_plans_from_sms run independently of this branch. A plan could also send a reply to the same inbound. The legacy text_agent race is closed by gate 3; the plan overlap is a separate decision (Q2) — plans are opt-in business automation, so my default is to leave them and document the overlap.

09 `InboundSms` payload

A new enum variant. Verified blast radius: exactly two exhaustive match arms; every other read site deserializes and degrades gracefully.

// ai_thread_event_payload_type.rs
pub enum AiThreadEventPayload {
    UserMessage(AiThreadUserMessage),
    InboundSms(AiThreadInboundSms),          // NEW
}
pub struct AiThreadInboundSms {
    pub message_id: Uuid,
    pub contact_id: Uuid,
    pub phone_number_id: Uuid,   // the texted number → reply from here (§7)
    pub text: String,
}
// as_text() → &m.text (joins conversation history)
// page_context() → None (SMS has no app page)

The drained event's as_text() becomes the turn prompt via payload_text(); page_context()=None is safe (freshest_page_context uses find_map). The text may be wrapped with a one-line context preamble naming the inbound number so the model knows which line to reply from.
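A reduced model of the payload shows why `page_context() = None` is harmless under `find_map` and how the freshest inbound can be picked from a drained batch. The `UserMessage` fields and both `freshest_*` helpers below are assumptions for illustration (the real variant wraps `AiThreadUserMessage`), ids are integers, and the batch is assumed to be ordered oldest first.

```rust
#[derive(Debug, Clone, PartialEq)]
struct InboundSms {
    message_id: u64,
    contact_id: u64,
    phone_number_id: u64, // the texted number: reply from here
    text: String,
}

#[derive(Debug, Clone, PartialEq)]
enum Payload {
    // Assumed shape: the real variant wraps AiThreadUserMessage.
    UserMessage { text: String, page: Option<String> },
    InboundSms(InboundSms),
}

impl Payload {
    /// Text that joins the conversation history.
    fn as_text(&self) -> &str {
        match self {
            Payload::UserMessage { text, .. } => text.as_str(),
            Payload::InboundSms(m) => m.text.as_str(),
        }
    }

    /// SMS has no app page, so the new variant contributes nothing here.
    fn page_context(&self) -> Option<&str> {
        match self {
            Payload::UserMessage { page, .. } => page.as_deref(),
            Payload::InboundSms(_) => None,
        }
    }
}

/// find_map skips events without a page, so InboundSms never masks an older page.
fn freshest_page_context(pending: &[Payload]) -> Option<&str> {
    pending.iter().rev().find_map(|p| p.page_context())
}

/// Newest InboundSms in the batch: the source for the turn's reply context.
fn freshest_inbound(pending: &[Payload]) -> Option<&InboundSms> {
    pending.iter().rev().find_map(|p| match p {
        Payload::InboundSms(m) => Some(m),
        _ => None,
    })
}
```

When several inbounds batch into one turn, taking the newest one means the reply goes out from the line the contact texted most recently.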

10 Verification findings

A 15-agent workflow read the actual code and adversarially checked every load-bearing claim. Distilled — "confirmed" = design holds; "gap" = the plan now closes something it originally missed.

verification agents

HIGH gaps now closed

open questions left (Q2, Q3)

design-breaking refutations

Finding	Verdict	Resolution in this plan
`send_sms` resolves from by member, not texted number	HIGH	§7 + Q1 — bind reply to inbound `phone_number_id`
Clone drops `send_mode` (`..Default::default()` → autonomous)	HIGH	§3 — explicit `Set(send_mode)` in both clone paths
Reply-path collision (text_agent + plans + agent)	HIGH	§8 gate 3 (text_agent) + Q2 (plans)
Opt-out not enforced at spawn; empty/MMS-only turns	HIGH	§8 gates 1 & 2
`escalate_to_user` = allowlist meta-tool, pushes draft to owner	CONFIRMED	suggest-mode delivery works as designed
Partial unique index + `is_unique_violation` race guard	CONFIRMED	§5 find-or-spawn
Gate site (`build_domain_tools_for_agent`); assistant untouched	CONFIRMED	§6 one-line filter
Payload variant blast radius = 2 match arms	CONFIRMED	§9
Legacy 3-reply picker (`InlineSuggestionsPanel`) is live + reusable	CONFIRMED	§6 — suggest mode feeds it via `text_agent_suggestion`
Long-lived thread history growth (no TTL reaper)	MEDIUM	§13 — accept + defer; existing 100-event window caps it
Mid-turn inbound batches into next turn (pre-existing runtime behavior)	MEDIUM	§13 — not introduced here; no event lost

11 Open questions for you

Two decisions left (Q2, Q3). Q1 settled itself once suggest mode required the same reply-context plumbing.

Q1 · RESOLVED

How hard do we bind the reply to the texted number?

Settled — curry the reply context. Suggest mode forces us to thread ReplyContext into the turn anyway, so send_sms's default-from is curried to the texted number (model can override). A wrong-number reply is impossible by default. See §7.

Q2

When a number is agent-handled, do plan SMS automations still run?

Option 1: Leave plan automation as-is (only suppress legacy text_agent). Plans are explicit, opt-in business workflows; document that an inbound may trigger both. Minimal, non-breaking.
Option 2: Also suppress create_plans_from_sms / plan replies for agent-handled numbers, so the agent is the single responder. Cleaner UX, but changes plan behavior and needs product sign-off.

My take: option 1 for the spike — keep blast radius tight; revisit plan↔agent orchestration as its own issue.

Q3

Seeded send_mode default for Text Reply: suggest or autonomous?

suggest — the agent drafts and escalates for human approval. Safest "goes live" posture; flip individual agents to autonomous when trusted.
autonomous — sends immediately when confident. Higher wow-factor, higher risk on day one.

My take: seed suggest; we'll demonstrate both modes in testing by flipping the column.

12 Build order

Bottom-up: schema → payload → helper → gate → reroute → seed → tests. Each step compiles before the next.

1 · Migrations: columns · small

phone_number.ai_agent_id + ai_agent.send_mode + text_agent_suggestion (make text_agent_id nullable, add ai_agent_id). Register in lib.rs; you run just generate.

2 · InboundSms payload · small

Variant + struct + inbound_sms() ctor + 2 match arms + round-trip tests.

3 · spawn + find-or-spawn · medium

Extend spawn_ai_agent_thread with correlation_key; new find_or_spawn_contact_thread + race-path test.

4 · Reply context + send_mode gate · medium

Extract ReplyContext from the freshest InboundSms; thread it into build_domain_tools_for_agent. Curry send_sms default-from; on suggest, drop send_sms. Unit test: suggest drops send_sms, autonomous keeps + curries it.

5 · propose_sms_replies tool · medium

New reply-context-bound tool: validate 2–3 options, insert a text_agent_suggestion row (ai_agent_id set). Added to suggest-mode turns. The existing panel/fetch/pick/send path renders it; no UI work.

6 · Webhook reroute + gates · medium

Branch on phone.ai_agent_id with opt-out / empty-body / text_agent-skip gates.

7 · Seed Text Reply + clone fix · medium

Constant, seed migration (mirrors drafter), DEFAULT_USER_AGENT_SOURCES, and the explicit send_mode copy in the runtime clone.

8 · Verify + review · small

cargo check both targets, /review-code, manual inbound test (suggest + autonomous).

13 Out of scope (noted in PR)

follow-up issue Proactive "drafts are waiting" alert — push/in-app notification "{Agent} drafted N replies for {contact}" deep-linking to /messaging/{contact_id}, reusing the inbound-SMS notify path. This spike surfaces drafts in-conversation; the alert that pulls the user there ships next.
defer Assignment UI — lands later in the agent view (assign numbers to this agent) + a set/list API with auth.
defer Idle-TTL thread archival — a reaper cron. Threads stay long-lived; the existing 100-event history window bounds per-turn cost. Revisit with summarization.
defer Per-contact behavior override — repurposing contact_text_agent stays a future seam (per-thread tool gating).
defer send_mode toggle UI — set via DB for the spike; a per-agent control ships with the assignment UI.
defer Media-aware turns — MMS-only inbound is skipped for now (gate 2) rather than fed to the agent.
note Mid-turn batching — rapid inbounds collapse into the next turn. Pre-existing runtime behavior, no event loss; not re-architected here.

14 Testing

Unit

InboundSms serde round-trip; as_text() returns the body; page_context()=None.
find_or_spawn: first call inserts; second reuses (no duplicate); simulated unique-violation re-queries the same thread.
send_mode gate: suggest bundle excludes send_sms and includes propose_sms_replies; autonomous includes send_sms (curried) and not the propose tool; both include escalate_to_user.
propose_sms_replies inserts a text_agent_suggestion row the existing get_suggestions_api returns.
Clone preserves send_mode='suggest' (guards the §3 correctness bug).

Manual (acceptance criteria)

Set phone.ai_agent_id; inbound with no thread spawns one keyed contact:{id}; follow-up reuses it.
autonomous agent replies via send_sms from the texted number; suggest agent drafts 3 options that appear on the message (tap → send), and does not send on its own.
Non-opted-in number still uses legacy text_agent — no regression.
Opted-out / media-only inbound does not spawn a turn.
