What CALM Replaces
Classic Rasa used stories (example conversation paths) and rules (strict if-then conditions) to define dialogue management. Training required hundreds of story examples to handle variations. Edge cases required more stories. The system was powerful but labour-intensive to maintain.
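For contrast, a classic Rasa story for the same kind of goal looked roughly like this (an illustrative sketch; the intent and response names are invented for the example):

```yaml
stories:
  - story: book appointment happy path
    steps:
      - intent: request_appointment
      - action: utter_ask_appointment_type
      - intent: inform
        entities:
          - appointment_type: general
      - action: utter_ask_date
      - intent: inform
      - action: action_create_appointment
```

Every variation in ordering or phrasing that the NLU model could not generalize over needed another story like this one.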
CALM (Conversational AI with Language Models) replaces this with:
- Flows -- structured definitions of what the bot should accomplish, written in YAML
- An LLM that interprets user messages and decides which flow step to execute next
- Slot filling backed by the LLM -- no more extensive NLU training for entity extraction
The result: less training data required, more graceful handling of unexpected inputs, and better performance on out-of-distribution user messages.
Flows: The Core CALM Concept
A flow defines a conversation goal and the steps to achieve it. The LLM decides which step is appropriate given the user's current message and conversation context.
```yaml
flows:
  book_appointment:
    description: Book a medical appointment for the user.
    steps:
      - id: ask_appointment_type
        collect: appointment_type
        ask_before_filling: true
        description: Ask the user what type of appointment they need
        next:
          - if: slots.appointment_type == 'urgent'
            then: check_urgent_availability
          - else: ask_preferred_date
      - id: ask_preferred_date
        collect: preferred_date
        ask_before_filling: true
        description: Ask the user for their preferred appointment date
        next: confirm_booking
      - id: check_urgent_availability
        action: action_check_urgent_slots
        next: confirm_booking
      - id: confirm_booking
        action: action_create_appointment
        next: END
```
The LLM reads the flow descriptions and slot definitions to understand the intent. It does not need story examples -- it reasons from the YAML definition.
Slot Collection With CALM
CALM's LLM automatically extracts slots from user messages during a flow. You define what to collect; the LLM handles the extraction.
```yaml
slots:
  appointment_type:
    type: categorical
    values:
      - general
      - specialist
      - urgent
    mappings:
      - type: from_llm
  preferred_date:
    type: text
    mappings:
      - type: from_llm
```
Use type: from_llm for most slot extraction in CALM. The LLM handles variation ('I need to see someone urgently' -> appointment_type = 'urgent') without requiring NLU training examples for each variation.

Custom Actions Still Use Python
Business logic (database lookups, API calls, custom validations) still lives in a Python action server. The interface is the same as classic Rasa.
```python
from typing import Any, Dict, List, Text

from rasa_sdk import Action, Tracker
from rasa_sdk.executor import CollectingDispatcher


class ActionCreateAppointment(Action):
    def name(self) -> Text:
        return 'action_create_appointment'

    def run(
        self,
        dispatcher: CollectingDispatcher,
        tracker: Tracker,
        domain: Dict[Text, Any],
    ) -> List[Dict[Text, Any]]:
        appointment_type = tracker.get_slot('appointment_type')
        preferred_date = tracker.get_slot('preferred_date')

        # Your booking logic here
        booking_id = create_booking_in_crm(appointment_type, preferred_date)

        dispatcher.utter_message(
            text=f'Your {appointment_type} appointment has been booked. '
            f'Reference: {booking_id}'
        )
        return []
```
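The create_booking_in_crm helper above is left to your integration code. A minimal, self-contained sketch is below; the payload shape and reference format are assumptions for illustration, and the actual CRM call is only indicated in a comment:

```python
import uuid
from datetime import datetime


def create_booking_in_crm(appointment_type: str, preferred_date: str) -> str:
    """Create a booking record and return a reference ID.

    In production this would call your CRM's API; here we only
    build the payload and generate a reference locally.
    """
    if not appointment_type or not preferred_date:
        raise ValueError('appointment_type and preferred_date are required')

    payload = {
        'type': appointment_type,
        'date': preferred_date,
        'created_at': datetime.utcnow().isoformat(),
    }
    # e.g. requests.post(CRM_URL, json=payload, timeout=5)
    booking_id = f'APT-{uuid.uuid4().hex[:8].upper()}'
    return booking_id
```

Keeping this logic in a plain function, separate from the Action class, makes it easy to unit-test without running an action server.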
Migrating From Classic Rasa to CALM
There is no automated migration path. CALM is a different architecture. The migration approach:
- Identify your highest-traffic stories and convert them to CALM flows (one flow per user goal)
- Replace intent-based NLU with flow descriptions the LLM can reason from
- Keep your action server code -- it is unchanged
- Convert slot mappings to from_llm type where applicable
- Run CALM alongside classic Rasa in parallel during transition (enterprise feature)
- Retrain and test iteratively -- CALM requires testing for LLM reasoning quality, not story coverage
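Testing in the last step above usually means end-to-end conversation tests rather than story coverage. A sketch of what such a test could look like, assuming Rasa Pro's e2e test format (keys may vary by version) and the flow and slots defined earlier:

```yaml
test_cases:
  - test_case: urgent appointment is routed correctly
    steps:
      - user: I need to see someone urgently
      - slot_was_set:
          - appointment_type: urgent
```

Because the LLM's routing is non-deterministic, run these suites repeatedly rather than once.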
CALM is only available in Rasa Pro, not Rasa Open Source. Before planning a migration, evaluate whether your use case justifies Rasa Pro's enterprise pricing vs switching to a cloud chatbot builder that has already integrated LLMs natively.

CALM Limitations to Know
- LLM dependency: CALM requires an LLM call for each dialogue step. This adds latency (~200-500ms per turn) and cost compared to classic Rasa's deterministic rules.
- Determinism: LLM-backed dialogue is less deterministic than rules. Two identical inputs may take slightly different paths. This matters for compliance-sensitive scenarios.
- Debugging: classic Rasa showed exactly which story matched. CALM's LLM reasoning is less transparent -- you see the outcome, not the full reasoning chain.
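The latency and cost overheads compound per turn, so it is worth estimating them for your expected conversation length. A back-of-envelope calculation, where the per-call figures are illustrative assumptions rather than Rasa benchmarks:

```python
def estimate_overhead(turns: int,
                      llm_latency_ms: float = 350.0,
                      cost_per_call_usd: float = 0.002) -> dict:
    """Estimate added latency and LLM cost for one conversation.

    Assumes one LLM call per dialogue turn, which is the minimum;
    a turn that triggers re-planning may call the LLM more than once.
    """
    return {
        'added_latency_ms': turns * llm_latency_ms,
        'llm_cost_usd': round(turns * cost_per_call_usd, 4),
    }


# A 10-turn booking conversation, using the midpoint of the
# ~200-500 ms per-turn range mentioned above:
print(estimate_overhead(10))
```

Multiply by daily conversation volume to compare against classic Rasa, whose rule-based dialogue management adds effectively zero per-turn inference cost.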