What CALM Replaces
Classic Rasa used stories (example conversation paths) and rules (strict if-then conditions) to define dialogue management. Training required hundreds of story examples to handle variations. Edge cases required more stories. The system was powerful but labour-intensive to maintain.
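For contrast, a classic Rasa story for the same kind of goal looked roughly like this (an illustrative sketch; the intent and response names are invented for the example):

```yaml
stories:
  - story: book appointment happy path
    steps:
      - intent: request_appointment
      - action: utter_ask_appointment_type
      - intent: inform
        entities:
          - appointment_type: general
      - action: utter_ask_date
      - intent: inform
      - action: action_create_appointment
```

Every variation in ordering or phrasing that the NLU model could not generalize over needed another story like this one.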
CALM (Conversational AI with Language Models) replaces this with:
- Flows -- structured definitions of what the bot should accomplish, written in YAML
- An LLM that interprets user messages and decides which flow step to execute next
- Slot filling backed by the LLM -- no more extensive NLU training for entity extraction
The result: less training data required, more graceful handling of unexpected inputs, and better performance on out-of-distribution user messages.
Flows: The Core CALM Concept
A flow defines a conversation goal and the steps to achieve it. The LLM decides which step is appropriate given the user's current message and conversation context.
```yaml
flows:
  book_appointment:
    description: Book a medical appointment for the user.
    steps:
      - id: ask_appointment_type
        collect: appointment_type
        ask_before_filling: true
        description: Ask the user what type of appointment they need
        next:
          - if: slots.appointment_type == 'urgent'
            then: check_urgent_availability
          - else: ask_preferred_date
      - id: ask_preferred_date
        collect: preferred_date
        ask_before_filling: true
        description: Ask the user for their preferred appointment date
        next: confirm_booking
      - id: check_urgent_availability
        action: action_check_urgent_slots
        next: confirm_booking
      - id: confirm_booking
        action: action_create_appointment
        next: END
```
The LLM reads the flow descriptions and slot definitions to understand the intent. It does not need story examples -- it reasons from the YAML definition.
Slot Collection With CALM
CALM's LLM automatically extracts slots from user messages during a flow. You define what to collect; the LLM handles the extraction.
```yaml
slots:
  appointment_type:
    type: categorical
    values:
      - general
      - specialist
      - urgent
    mappings:
      - type: from_llm
  preferred_date:
    type: text
    mappings:
      - type: from_llm
```
Use type: from_llm for most slot extraction in CALM. The LLM handles variation ('I need to see someone urgently' -> appointment_type = 'urgent') without requiring NLU training examples for each variation.

Custom Actions Still Use Python
Business logic (database lookups, API calls, custom validations) still lives in a Python action server. The interface is the same as classic Rasa.
```python
from typing import Any, Dict, List, Text

from rasa_sdk import Action, Tracker
from rasa_sdk.executor import CollectingDispatcher


class ActionCreateAppointment(Action):
    def name(self) -> Text:
        return 'action_create_appointment'

    def run(
        self,
        dispatcher: CollectingDispatcher,
        tracker: Tracker,
        domain: Dict[Text, Any],
    ) -> List[Dict[Text, Any]]:
        appointment_type = tracker.get_slot('appointment_type')
        preferred_date = tracker.get_slot('preferred_date')

        # Your booking logic here
        booking_id = create_booking_in_crm(appointment_type, preferred_date)

        dispatcher.utter_message(
            text=f'Your {appointment_type} appointment has been booked. '
            f'Reference: {booking_id}'
        )
        return []
```
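The create_booking_in_crm helper above is left to your integration code. A minimal, self-contained sketch is below; the payload shape and reference format are assumptions for illustration, and the actual CRM call is only indicated in a comment:

```python
import uuid
from datetime import datetime


def create_booking_in_crm(appointment_type: str, preferred_date: str) -> str:
    """Create a booking record and return a reference ID.

    In production this would call your CRM's API; here we only
    build the payload and generate a reference locally.
    """
    if not appointment_type or not preferred_date:
        raise ValueError('appointment_type and preferred_date are required')

    payload = {
        'type': appointment_type,
        'date': preferred_date,
        'created_at': datetime.utcnow().isoformat(),
    }
    # e.g. requests.post(CRM_URL, json=payload, timeout=5)
    booking_id = f'APT-{uuid.uuid4().hex[:8].upper()}'
    return booking_id
```

Keeping this logic in a plain function, separate from the Action class, makes it easy to unit-test without running an action server.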
Migrating From Classic Rasa to CALM
There is no automated migration path. CALM is a different architecture. The migration approach:
- Identify your highest-traffic stories and convert them to CALM flows (one flow per user goal)
- Replace intent-based NLU with flow descriptions the LLM can reason from
- Keep your action server code -- it is unchanged
- Convert slot mappings to from_llm type where applicable
- Run CALM alongside classic Rasa in parallel during transition (enterprise feature)
- Retrain and test iteratively -- CALM requires testing for LLM reasoning quality, not story coverage
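Testing in the last step above usually means end-to-end conversation tests rather than story coverage. A sketch of what such a test could look like, assuming Rasa Pro's e2e test format (keys may vary by version) and the flow and slots defined earlier:

```yaml
test_cases:
  - test_case: urgent appointment is routed correctly
    steps:
      - user: I need to see someone urgently
      - slot_was_set:
          - appointment_type: urgent
```

Because the LLM's routing is non-deterministic, run these suites repeatedly rather than once.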
CALM is only available in Rasa Pro, not Rasa Open Source. Before planning a migration, evaluate whether your use case justifies Rasa Pro's enterprise pricing vs switching to a cloud chatbot builder that has already integrated LLMs natively.

CALM Limitations to Know
- LLM dependency: CALM requires an LLM call for each dialogue step. This adds latency (~200-500ms per turn) and cost compared to classic Rasa's deterministic rules.
- Determinism: LLM-backed dialogue is less deterministic than rules. Two identical inputs may take slightly different paths. This matters for compliance-sensitive scenarios.
- Debugging: classic Rasa showed exactly which story matched. CALM's LLM reasoning is less transparent -- you see the outcome, not the full reasoning chain.
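The latency and cost overheads compound per turn, so it is worth estimating them for your expected conversation length. A back-of-envelope calculation, where the per-call figures are illustrative assumptions rather than Rasa benchmarks:

```python
def estimate_overhead(turns: int,
                      llm_latency_ms: float = 350.0,
                      cost_per_call_usd: float = 0.002) -> dict:
    """Estimate added latency and LLM cost for one conversation.

    Assumes one LLM call per dialogue turn, which is the minimum;
    a turn that triggers re-planning may call the LLM more than once.
    """
    return {
        'added_latency_ms': turns * llm_latency_ms,
        'llm_cost_usd': round(turns * cost_per_call_usd, 4),
    }


# A 10-turn booking conversation, using the midpoint of the
# ~200-500 ms per-turn range mentioned above:
print(estimate_overhead(10))
```

Multiply by daily conversation volume to compare against classic Rasa, whose rule-based dialogue management adds effectively zero per-turn inference cost.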