The Four Pricing Models
Conversational AI platform licensing comes in four shapes, and the first job of any buyer is to know which one a quote uses. Per-request/usage pricing bills each text or voice call to the platform — the model used by Amazon Lex and Google Dialogflow CX. Per-session pricing charges each time the AI engages a customer, regardless of outcome. Per-resolution pricing charges only when an issue is resolved end to end without a human. Contract-based enterprise pricing bundles volume, channels and modules into a bespoke annual quote — the Kore.ai and Cognigy model, and the fully opaque approach taken by newer entrants such as Ada, Sierra and Decagon.
The unit matters more than the rate. A low per-session price can cost more than a higher per-resolution price if a single issue routinely takes several sessions to resolve, and the reverse holds when most issues are settled in one session. This is the same "what am I actually paying for" discipline that runs through the AI contract negotiation deep dive and the metric work in AI vendor benchmarking.
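A minimal sketch of that comparison, using the indicative $0.10 per session and $0.99 per resolution figures from the benchmark table. The monthly issue count, the sessions-per-issue figure and the 50% resolution rate are illustrative assumptions, not vendor data, and the function names are hypothetical.

```python
def per_session_bill(issues: int, sessions_per_issue: float,
                     price_per_session: float) -> float:
    """Monthly bill when every session is charged, resolved or not."""
    return issues * sessions_per_issue * price_per_session


def per_resolution_bill(issues: int, resolution_rate: float,
                        price_per_resolution: float) -> float:
    """Monthly bill when only fully automated resolutions are charged."""
    return issues * resolution_rate * price_per_resolution


def break_even_sessions_per_issue(resolution_rate: float,
                                  price_per_resolution: float,
                                  price_per_session: float) -> float:
    """Sessions per issue at which the two models cost the same."""
    return resolution_rate * price_per_resolution / price_per_session


# Assumed workload: 10,000 issues a month, half resolved without a human.
ISSUES = 10_000
RESOLUTION_RATE = 0.5

session_bill = per_session_bill(ISSUES, 3, 0.10)
resolution_bill = per_resolution_bill(ISSUES, RESOLUTION_RATE, 0.99)
break_even = break_even_sessions_per_issue(RESOLUTION_RATE, 0.99, 0.10)

print(f"per-session bill:    ${session_bill:,.2f}")
print(f"per-resolution bill: ${resolution_bill:,.2f}")
print(f"break-even sessions per issue: {break_even:.2f}")
```

Under these assumptions the per-session model stays cheaper until an issue takes about five sessions to close; past that point the "low" rate is the more expensive one.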
Vendor Cost Benchmarks
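The published per-unit prices span more than three orders of magnitude, from $0.00075 per text request to $1.50 per resolution, although the units being billed differ. Usage-based platforms are cheap per unit, but costs accumulate once telephony and speech services are added; contract-based platforms front-load cost into an annual commitment.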
| Platform | Model | Indicative price |
|---|---|---|
| Amazon Lex | Per request | $0.00075 per text request · $0.004 per voice request |
| Google Dialogflow CX | Per request / minute | $0.007 per text request · $0.07–$0.20 per voice minute |
| Fin / Intercom | Per resolution | $0.99 per resolution |
| Zendesk | Per resolution | $1.50 per resolution |
| Freshworks Freddy | Per session | ~$0.10 per session ($100/1,000) |
| Kore.ai | Enterprise contract | $1,200–$2,000/mo → $50K–$200K+/yr |
| Cognigy | Enterprise contract | $2,000–$3,000/mo → $100K+/yr |
For usage-based voice deployments, all-in annual costs commonly land in the $10,000–$100,000+ range once speech, telephony and backend usage are counted, well above what the per-request rate alone implies. Treat the rate card as the floor, not the estimate.
The Per-Resolution Trap
Per-resolution pricing is marketed as fairness (you pay only for value delivered), and at first glance it is attractive. But it contains a perverse incentive: the better your bot performs, the more you pay. As you improve containment and resolve more issues automatically, your per-resolution bill rises in lockstep, and at scale $0.99 or $1.50 per resolution adds up quickly. The vendor captures the upside of every improvement you fund.
Per-resolution pricing punishes AI improvement. Before you accept it, model the bill at your target containment rate — not today's — and negotiate volume tiers so success does not become your largest cost line.
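One way to run that model is sketched below. The $0.99 rate comes from the benchmark table; the monthly conversation volume, the 30% and 60% containment rates and the tier schedule are hypothetical placeholders to be replaced with your own traffic and the vendor's actual offer.

```python
from typing import List, Optional, Tuple

# (upper bound of tier in resolutions, rate per resolution); None = unbounded.
# Hypothetical schedule for illustration only, not any vendor's price list.
HYPOTHETICAL_TIERS: List[Tuple[Optional[int], float]] = [
    (10_000, 0.99),
    (50_000, 0.80),
    (None, 0.60),
]


def flat_bill(conversations: int, containment: float, rate: float) -> float:
    """Monthly bill at a single flat per-resolution rate."""
    return conversations * containment * rate


def tiered_bill(resolutions: float,
                tiers: List[Tuple[Optional[int], float]]) -> float:
    """Monthly bill when each tier's rate applies only to the volume inside it."""
    total = 0.0
    lower = 0
    for upper, rate in tiers:
        if upper is None or resolutions <= upper:
            return total + (resolutions - lower) * rate
        total += (upper - lower) * rate
        lower = upper
    raise ValueError("the last tier must be unbounded (upper=None)")


CONVERSATIONS = 50_000  # assumed monthly volume

today = flat_bill(CONVERSATIONS, 0.30, 0.99)
target = flat_bill(CONVERSATIONS, 0.60, 0.99)
target_tiered = tiered_bill(CONVERSATIONS * 0.60, HYPOTHETICAL_TIERS)

print(f"flat rate, 30% containment:   ${today:,.2f}")
print(f"flat rate, 60% containment:   ${target:,.2f}")
print(f"tiered rate, 60% containment: ${target_tiered:,.2f}")
```

Doubling containment doubles the flat-rate bill; a tier schedule is what keeps the marginal resolution cheaper than the first one.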
The mirror-image risk on per-session pricing is the multi-session issue: you pay for every session even when it takes three of them to resolve one problem. Either way, the definition of a billable "resolution" or "session" must be written into the contract. Vendors and buyers count them differently, and the gap is real money.
The Hidden Stack
The headline price is rarely the bill. Add-on stacking is the main inflator: AI message overages at roughly $0.50–$2.00 per extra message, channel add-ons such as WhatsApp at $59+/month, multilingual premiums, and human-agent seats at $29–$169 per agent per month for the 35–50% of conversations the AI cannot resolve. Migration runs $3,500–$17,000, and custom-build maintenance 15–20% of the original build cost per year.
The cumulative effect is severe: a platform quoting $1,500/month often runs $4,000–$8,000/month fully loaded by month three, which helps explain why 56% of companies miss their AI cost forecasts by 11–25%, and 24% by more than 50%. The "scaling paradox" makes it worse: on variable pricing, a seasonal volume spike can double the bill with no warning. These dynamics echo the usage-cost traps we map in negotiating AI compute costs and the agent-licensing maths in AI agent platform contracts.
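A rough sketch of how the stack builds up from a $1,500 headline quote. The unit prices are taken from the ranges quoted in this section; the overage volume and seat count are assumptions chosen for illustration, and the `MonthlyStack` class is a hypothetical helper, not part of any vendor tooling.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class MonthlyStack:
    """Recurring monthly cost lines for one platform deployment."""
    base_fee: float          # headline platform quote
    overage_messages: int    # AI messages beyond the included allowance
    overage_rate: float      # price per extra message
    channel_addons: float    # e.g. a WhatsApp channel fee
    agent_seats: int         # human agents handling escalations
    seat_price: float        # price per agent per month

    def line_items(self) -> Dict[str, float]:
        return {
            "base fee": self.base_fee,
            "message overages": self.overage_messages * self.overage_rate,
            "channel add-ons": self.channel_addons,
            "agent seats": self.agent_seats * self.seat_price,
        }

    def total(self) -> float:
        return sum(self.line_items().values())


stack = MonthlyStack(
    base_fee=1_500.0,
    overage_messages=2_000,  # assumed volume
    overage_rate=0.50,       # low end of the quoted $0.50-$2.00 range
    channel_addons=59.0,     # WhatsApp add-on floor
    agent_seats=10,          # assumed team size
    seat_price=169.0,        # top of the quoted $29-$169 range
)

for name, amount in stack.line_items().items():
    print(f"{name:<18} ${amount:>9,.2f}")
print(f"{'fully loaded':<18} ${stack.total():>9,.2f}")
```

With these inputs the fully loaded figure is close to three times the headline quote, before any one-off migration cost or multilingual premium is counted.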
Voice vs Text: The Cost Multiplier
The channel mix is a cost decision that buyers routinely underestimate. On the same platform, voice is dramatically more expensive than text: Amazon Lex charges $0.004 per voice request against $0.00075 per text request, a multiplier of roughly 5.3×, and Google Dialogflow CX bills voice at $0.07–$0.20 per minute once speech recognition and synthesis are added on top of the per-request rate. A budget modelled on text economics will be overshot badly the moment voice volume scales.
Voice also drags in dependencies the rate card omits: telephony and carrier charges, speech-to-text and text-to-speech services, and higher latency sensitivity that can force a more expensive tier. For a contact centre weighing channels, the right move is to model voice and text as separate cost lines with separate volume assumptions, then negotiate speech services into the platform commitment rather than buying them piecemeal. Treating "a conversation" as one undifferentiated unit — when a voice conversation can cost several times a text one — is how a forecast quietly doubles, compounding the add-on stacking covered above.
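The separate-cost-lines approach can be sketched as follows. The two Amazon Lex rates are the indicative figures from the benchmark table; the request volumes, the voice minutes and the $0.02 per-minute allowance for telephony and speech services are hypothetical assumptions to be replaced with quoted figures.

```python
LEX_TEXT_RATE = 0.00075   # per text request (indicative, from the table)
LEX_VOICE_RATE = 0.004    # per voice request (indicative, from the table)


def text_line(requests: int, rate: float = LEX_TEXT_RATE) -> float:
    """Monthly platform cost for the text channel."""
    return requests * rate


def voice_line(requests: int, minutes: float,
               rate: float = LEX_VOICE_RATE,
               extras_per_minute: float = 0.0) -> float:
    """Monthly voice cost: platform requests plus per-minute extras.

    extras_per_minute stands in for telephony, speech-to-text and
    text-to-speech charges that the rate card does not include.
    """
    return requests * rate + minutes * extras_per_minute


# Assumed monthly volumes, modelled separately per channel.
text_cost = text_line(requests=500_000)
voice_cost = voice_line(requests=500_000, minutes=100_000,
                        extras_per_minute=0.02)  # hypothetical extras rate

print(f"text line:  ${text_cost:,.2f}")
print(f"voice line: ${voice_cost:,.2f}")
print(f"rate-card multiplier: {LEX_VOICE_RATE / LEX_TEXT_RATE:.2f}x")
print(f"all-in multiplier:    {voice_cost / text_cost:.2f}x")
```

Keeping the two lines apart makes the gap between the rate-card multiplier and the all-in multiplier visible, which is the number to bring to the negotiation.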
How to Negotiate the Contract
Five moves protect the budget. First, pin the billable-unit definition in writing (what counts as a resolution, a session, a request) and reconcile it to your own analytics. Second, cap variable charges: fix the overage-message rate, bundle the channels you need, and put a ceiling on volume-driven charges so a seasonal spike cannot double the bill. Third, use the annual term as leverage: annual contracts are the norm, so trade a volume and term commitment for a discount rather than accepting list rates.
Fourth, validate the quoted resolution rate in a pilot before committing — a containment claim is a sales figure until proven on your traffic. Fifth, secure data-export and model-portability rights so you are not trapped if quality slips, the same protection we insist on in multi-model AI strategy. For the full evaluation framework, work through the AI Procurement Checklist and the AI Contract Red Flags brief, benchmark the platform clouds via the AWS and Google Cloud hubs, and request a confidential briefing before you sign.