Built on Modulate’s Ensemble Listening Model, Velma Enterprise API is the missing layer in real-world voice conversations. Experience it at Customer Contact Week Las Vegas.
Boston, MA – June 3, 2026 – Modulate, the frontier conversational voice intelligence company, today released its flagship Velma model through its developer API. Previously available only to enterprises, the leading voice-native conversation intelligence model is now open to any developer to access and deploy. It natively understands audio and provides live insights into emotion, intent, behavioral risk, and conversational context. Modulate will showcase its powerful voice intelligence API at Customer Contact Week (CCW), taking place June 22-25, 2026, in Las Vegas.
Velma Enterprise API is designed to help organizations move from post-call analysis to continuous, real-time conversational understanding and intervention. This expansion marks the next phase of Modulate’s Velma platform, moving beyond transcription and point solutions toward a broader enterprise intelligence layer for live voice conversations.
As enterprises race to deploy voice AI across customer experience, fraud prevention, trust and safety, contact centers, and AI agent workflows, most systems still rely on speech-to-text as the foundation for understanding. That approach reduces conversations to flattened transcripts, stripping out critical signals such as urgency, hesitation, confusion, silence, emotional state, deception, and conversational context. As a result, the limitations of transcript-first systems are becoming more apparent. Organizations need infrastructure that can continuously monitor live interactions for fraud, escalation risk, compliance failures, customer vulnerability, and AI agent behavior while there is still time to intervene.
Velma is designed to close that gap. Powered by Modulate’s Ensemble Listening Model (ELM) architecture, the API provides enterprises with a real-time listening layer that identifies and interprets the signals that determine what is actually happening in a conversation beyond just the words spoken.
“As enterprises deploy more AI across customer interactions, they’re realizing that transcription alone is an incomplete foundation for understanding conversations,” said Mike Pappas, CEO and co-founder of Modulate. “The excitement we’re seeing from operators, compliance teams, and customer experience leaders comes from finally having infrastructure that can interpret conversational and emotional context in real time, beyond the transcript.”
Unlike general-purpose voice systems that rely on a single large model or a transcript-first workflow, Velma uses an ensemble of specialized models that work together to analyze conversational audio across multiple dimensions. By analyzing raw conversational audio directly rather than relying solely on transcripts, Velma can detect emotional signals, conversational dynamics, behavioral patterns, and non-verbal cues that traditional speech-to-text pipelines often miss. This approach enables enterprises to extract structured insights from voice in real time while maintaining the transparency, efficiency, and scalability needed for production environments.
Velma Enterprise API can support use cases including:
- Fraud and risk detection: Identifying signs of synthetic audio, urgency, manipulation, policy avoidance, or other risk signals during live interactions.
- Customer experience and contact center intelligence: Helping teams understand caller emotion, frustration, confusion, escalation risk, and service needs in real time.
- AI agent oversight: Detecting when AI agents may be making inaccurate claims, violating policies, or failing to respond appropriately to customer needs.
- Trust and safety: Recognizing harmful, abusive, or policy-violating behavior in live voice environments.
- Operational intelligence: Turning conversational audio into structured, explainable signals that can inform review, escalation, training, and decision-making workflows.
- Compliance and vulnerable customer protection: Helping organizations identify signs of distress, confusion, disclosure failures, or regulatory risk during live interactions.
The Velma updates introduce expanded real-time conversational understanding capabilities for organizations that need to move beyond post-call review toward continuous monitoring, explainable decision support, and live operational awareness across voice channels.
Pappas added, “Fraud, customer dissatisfaction, policy violations, and AI failures don’t politely happen only in the first 30 seconds of a call. Enterprises need systems that can listen continuously, explain what they are hearing, and help humans act quickly.”
Modulate developed its voice intelligence technology in some of the most demanding real-world audio environments: large-scale online video games, where conversations are live, noisy, and emotionally charged. That foundation has shaped the company’s approach to enterprise AI, where voice systems must be accurate, cost-effective, explainable, and resilient enough to operate at scale. It also shaped Modulate’s approach to voice intelligence infrastructure: continuously listening, understanding conversational behavior in context, and producing structured outputs that enterprises can trust and operationalize in real time.
With the Velma Enterprise API, Modulate is bringing that real-world voice intelligence infrastructure to enterprise teams building the next generation of AI-powered customer experience, fraud prevention, safety, and automation systems.
Velma Enterprise API is available now. To learn more, visit modulate.ai.
Experience Velma at Customer Contact Week Las Vegas
Modulate will showcase Velma with live demonstrations at Customer Contact Week (CCW) Las Vegas, June 22-25, 2026. Attendees can visit Modulate at Booth #1738 or schedule a briefing with the team to learn how enterprises are deploying voice-native conversational intelligence across customer experience, fraud prevention, compliance, and AI oversight workflows. To book a demo at CCW Las Vegas, visit: https://www.modulate.ai/lp/events/ccw-las-vegas-2026.
To schedule a press or analyst briefing, contact Kristin Canders at [email protected].
About Modulate
Modulate is a voice intelligence company building AI models and APIs designed to understand real-world conversational audio at scale. Its technology combines speech recognition, acoustic analysis, and conversational context to deliver reliable, explainable, and cost-effective voice intelligence for developers and enterprises.
For more information or to get started, visit modulate.ai.
Media Contact
Kristin Canders
Grithaus Agency