Make the Voice Prompt Smaller Before You Make It Stricter

When a voice assistant starts doing the wrong thing, the first instinct is usually to make the prompt stricter.

Add another rule.

Repeat the warning.

Tell the model not to guess.

Tell it to be careful.

That can feel like progress, but it often just makes the prompt heavier. The behavior may still fail, because the failure was never one general problem.

That was the useful lesson from the Northbridge Messenger assistant work.

The assistant was not failing in one neat way. It had different risks in different parts of the call.

The caller-number logic had one kind of risk.

The closing flow had another.

The phone-number readback had another.

The knowledge lookup rules had another.

If those are treated as one prompt problem, the instructions start to blur together. The prompt becomes a pile of good-sounding cautions, but the model still has too much room to improvise.

The better move was to turn the prompt into gates.

A gate is a small decision point with a clear source of truth, a clear fallback, and a clear check before the conversation moves on.
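
As a rough illustration, that definition can be written down as a tiny data structure. This is only a sketch: the `Gate` class, its field names, and `run_gate` are hypothetical and are not taken from the Northbridge Messenger prompt itself.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Gate:
    """One decision point in the call (hypothetical structure)."""
    name: str
    # Reads the value from call state; returns None when it is missing.
    source_of_truth: Callable[[dict], Optional[str]]
    # What the assistant does when the value is missing or fails the check.
    fallback: str
    # Must pass before the conversation moves on.
    check: Callable[[str], bool]


def run_gate(gate: Gate, state: dict) -> str:
    """Proceed only when the source of truth yields a value that passes the check."""
    value = gate.source_of_truth(state)
    if value is not None and gate.check(value):
        return f"proceed:{value}"
    return f"fallback:{gate.fallback}"
```

The point of the shape is that every gate answers the same three questions, so none of them can quietly borrow an answer from another.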

The first gate was caller state.

The assistant should not infer whether a callback number exists. It should check the runtime value first.

If the caller number is present and usable, the assistant can confirm it.

If the number is withheld or unresolved, the assistant must treat it as withheld and ask for a callback number directly.

That rule is not glamorous, but it matters. It stops the model from pretending a missing variable is a conversational detail it can smooth over.

Runtime state wins. Not assumption.
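
A minimal sketch of that gate, assuming the platform passes the caller number in as a string or leaves it empty. The marker values, the seven-digit threshold, and the action names are assumptions for illustration, not the actual runtime contract.

```python
from typing import Optional

# Hypothetical placeholder values a telephony platform might send
# when the caller's number is hidden or could not be resolved.
WITHHELD_MARKERS = {"", "anonymous", "withheld", "unknown", "private", "restricted"}


def caller_gate(runtime_number: Optional[str]) -> str:
    """Pick the next action from the runtime value alone, never from inference."""
    if runtime_number is None:
        return "ask_for_callback_number"
    cleaned = runtime_number.strip()
    if cleaned.lower() in WITHHELD_MARKERS:
        return "ask_for_callback_number"
    digits = [ch for ch in cleaned if ch in "0123456789"]
    # Assumed usability threshold: fewer than 7 digits is treated as unusable.
    if len(digits) < 7:
        return "ask_for_callback_number"
    return "confirm_number"
```

An unresolved template variable such as `{{caller_number}}` contains no digits, so it falls through to the same fallback as a withheld number.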

The second gate was closing behavior.

A vague instruction like “end the call politely” leaves too much open.

In a voice agent, the closing path is not just copy. It is part of the control system.

The prompt needed approved closing utterances, a shared terminal phrase, and a defined fallback role for the direct end-call action.

The normal path should be phrase-triggered: the assistant speaks the terminal phrase and the call ends. The hard end-call action should be fallback-only.

That separation gives the assistant one job at a time.

Say the approved close.

Then stop.
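
One way to picture the separation is below. The phrases, the outcome names, and the `call_still_open` flag are all made up for this sketch; the real approved closings and the platform's end-call mechanism are not shown in this post.

```python
# Hypothetical terminal phrase shared by every approved closing.
TERMINAL_PHRASE = "Goodbye for now."

# Hypothetical approved closings, keyed by call outcome.
APPROVED_CLOSINGS = {
    "message_taken": "Thanks, I have passed your message on. " + TERMINAL_PHRASE,
    "no_message": "Thanks for calling. " + TERMINAL_PHRASE,
}


def closing_utterance(outcome: str) -> str:
    """Normal path: return an approved closing; unknown outcomes get the default."""
    return APPROVED_CLOSINGS.get(outcome, APPROVED_CLOSINGS["no_message"])


def should_force_end(last_utterance: str, call_still_open: bool) -> bool:
    """Fallback path: use the hard end-call action only if the terminal
    phrase was already spoken and the call is somehow still open."""
    return call_still_open and last_utterance.endswith(TERMINAL_PHRASE)
```

Because every closing ends with the same phrase, the fallback has exactly one thing to look for.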

The third gate was number readback.

Phone numbers are not ordinary chat text. They are captured data, and the readback is the validation step.

The assistant needed to repeat numbers as plain spoken digits only.

No markup-shaped speech.

No extra punctuation.

No compressed phrasing that sounds natural but hides mistakes.

Just the captured digit sequence, spoken one digit at a time, then confirmed.

This is where small formatting choices become operational risk. If the assistant sounds casual, uncertain, or decorative while reading back a number, the caller may miss the error.

Capture. Repeat. Confirm. Move on.
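
The readback rule is simple enough to sketch directly. The function name is hypothetical, and saying "zero" rather than "oh" is an assumption; the point is only that the spoken form is built from the captured digits and nothing else.

```python
DIGIT_WORDS = {
    "0": "zero", "1": "one", "2": "two", "3": "three", "4": "four",
    "5": "five", "6": "six", "7": "seven", "8": "eight", "9": "nine",
}


def readback(captured: str) -> str:
    """Turn a captured number into plain spoken digits, one at a time."""
    # Keep only ASCII digits: brackets, dashes, spaces, and plus signs are dropped.
    digits = [ch for ch in captured if ch in DIGIT_WORDS]
    if not digits:
        raise ValueError("no digits captured; ask for the number again")
    return " ".join(DIGIT_WORDS[d] for d in digits)


print(readback("(021) 555-0100"))
# prints: zero two one five five five zero one zero zero
```

The assistant would then ask the caller to confirm that sequence before moving on.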

The fourth gate was knowledge lookup.

This is where many assistant prompts quietly become unsafe.

If the answer is in the inline company summary or fixed prompt fields, the assistant can use that.

If it is not there, it should use the knowledge tool.

If the tool does not return a clear answer, the assistant should say it does not have that information.

That is the boundary.

The assistant does not get to expand the business from general world knowledge. It does not get to invent services, pricing, process details, device support, or operational promises because they sound plausible.

It knows the inline summary, the fixed fields, and the tool result.

No more than that.
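
The lookup order can be sketched as a three-step function. Everything named here is an assumption: `inline_facts` stands in for the inline summary and fixed fields, `lookup_tool` stands in for whatever knowledge tool the platform provides, and the refusal wording is a placeholder.

```python
from typing import Callable, Dict, Optional

# Placeholder wording for the "no information" response.
NO_INFO = "I'm sorry, I don't have that information."


def answer(
    topic: str,
    inline_facts: Dict[str, str],
    lookup_tool: Callable[[str], Optional[str]],
) -> str:
    """Answer from inline facts first, then the tool, then admit the gap."""
    # 1. Inline company summary or fixed prompt fields.
    if topic in inline_facts:
        return inline_facts[topic]
    # 2. Knowledge tool, accepted only when it returns non-empty text.
    result = lookup_tool(topic)
    if result and result.strip():
        return result.strip()
    # 3. No clear answer: say so instead of inventing one.
    return NO_INFO
```

There is deliberately no fourth branch that falls back to general world knowledge.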

The reason the phased approach worked is simple: each gate had a different failure mode.

Caller-number handling is about runtime truth.

Closing behavior is about termination control.

Number readback is about speech formatting and confirmation.

Knowledge lookup is about boundaries.

Trying to solve all of that with one stronger paragraph would have made the prompt longer, but not clearer.

Solving it gate by gate made each rule easier to test. It also made the assistant easier to reason about when something went wrong.

That is the lesson I would carry into other voice-agent work.

Do not harden the whole prompt as one blob.

Find the gates.

Give each gate one source of truth.

Give each gate one fallback.

Give each gate one check before the conversation moves on.

Reliability in voice does not come from a clever answer. It comes from behavior the caller can predict, confirm, and trust.

