Powering Southeast Asia’s banks with AI-driven communications 

Image generated by Deeptech Times using Google Gemini

As Southeast Asia’s financial institutions ramp up digital transformation, giants like Sacombank, Vietcombank and Home Credit Vietnam are leaning on AI to reshape customer engagement. Increasingly, customers interacting with these banks will find their queries fielded by AI voice agents rather than traditional staff.

The underlying technology is a collaboration between real-time communications specialist Agora and Vietnamese tech heavyweight FPT Corp. The joint solution fuses Agora’s ultra-low-latency RTC and conversational AI platform with FPT’s enterprise AI ecosystem and local deployment experience. The result: streamlined service workflows, greater operational efficiency, and modernisation of client interactions, all while keeping pace with evolving regulatory demands.

With mobile-first strategies sweeping the region, consumer expectations have shifted towards seamless and personalised banking. At the same time, regulators are tightening requirements around data privacy, operational resilience and cyber protection. This confluence is turning conversational AI from a futuristic add-on into core infrastructure for banks.

Deployment is swift, according to company representatives: what starts as a proof of concept often moves to production in about three months. Sacombank, for instance, reported that its next-generation AI contact centre, powered by voice agents, boosted call-handling capacity by over 58 per cent and now manages up to 41,000 calls daily. The bank credits the change with higher net promoter scores and a smoother customer journey.

Meanwhile, Vietcombank relies on AI agents across messaging channels to answer everyday inquiries, covering everything from credit cards to loans and foreign exchange, freeing up employees to tackle more complex work.

Home Credit Vietnam, cited in an NVIDIA case study, utilises FPT AI voice agents for contact centre automation on a massive scale, handling millions of monthly customer interactions. The system maintains consistent quality, with an average satisfaction score of 4.5 out of 5 since launch.

Agora co-founder Tony Wang says the technology supports Vietnamese and 47 other languages, reflecting the region’s linguistic diversity. Security concerns loom large given the patchwork of regulations covering electronics, cybersecurity and consumer data in Southeast Asia. Both companies stress that the platform’s low-latency, enterprise-grade architecture is built to handle high volumes securely and compliantly.

Wang notes that audio access for the AI agent is determined by each bank’s configuration. Audio streams are either transcribed, passed to an LLM for a response and rendered back as speech, or handled by end-to-end multimodal models such as OpenAI Realtime or Gemini Live.

Data privacy remains nuanced, shaped by each client’s integration choices: what data is shared with LLMs, how long logs are retained and which redaction policies apply. Agora does not provide a universal list of accessed data fields for all deployments.
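To make the redaction choice concrete, here is a minimal Python sketch of masking sensitive values in a transcript before it is shared with an LLM. It is an illustration only, not Agora's or FPT's implementation: the function name `redact`, the two patterns and the placeholder labels are all assumptions, and a real deployment would follow the bank's own policy.

```python
import re

# Hypothetical patterns for illustration; a real policy would be set by the bank.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
# A run of nine or more digits, optionally separated by single spaces or hyphens
# (card numbers, account numbers, phone numbers).
LONG_NUMBER_RE = re.compile(r"\d(?:[ -]?\d){8,}")


def redact(transcript: str) -> str:
    """Mask e-mail addresses and long digit runs before the text leaves the bank."""
    masked = EMAIL_RE.sub("[EMAIL]", transcript)
    masked = LONG_NUMBER_RE.sub("[NUMBER]", masked)
    return masked


if __name__ == "__main__":
    print(redact("My card is 4111 1111 1111 1111 and email a.b@example.com"))
```

Short numbers, such as a loan term in months, pass through unchanged, so the model still sees enough context to answer everyday questions.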

Tony Wang, co-founder, Agora
IMAGE: Agora

Agora’s approach to data security echoes practices seen across Silicon Valley, with Wang outlining a two-pronged strategy: audio and video streams are encrypted using industry-standard protocols, while all data travelling across the platform is protected by the same encryption used by leading secure websites. Access to sensitive information is tightly managed through ephemeral tokens, so user permissions are short-lived and do not persist beyond their intended session.
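The idea behind ephemeral tokens can be sketched in a few lines of Python. This is a generic illustration built on an HMAC signature and an expiry timestamp, not Agora's actual token format; the function names, the claim fields (`uid`, `ch`, `exp`) and the 60-second lifetime are assumptions made for the example.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def issue_token(secret: bytes, user_id: str, channel: str,
                ttl_s: int, now: Optional[float] = None) -> str:
    """Sign a short-lived grant for one user on one channel (hypothetical format)."""
    now = time.time() if now is None else now
    payload = json.dumps({"uid": user_id, "ch": channel, "exp": now + ttl_s},
                         sort_keys=True).encode()
    sig = hmac.new(secret, payload, hashlib.sha256).digest()
    return (base64.urlsafe_b64encode(payload).decode() + "."
            + base64.urlsafe_b64encode(sig).decode())


def verify_token(secret: bytes, token: str, channel: str,
                 now: Optional[float] = None) -> bool:
    """Accept the token only if the signature matches, the channel is right
    and the expiry time has not passed."""
    now = time.time() if now is None else now
    try:
        payload_b64, sig_b64 = token.split(".")
        payload = base64.urlsafe_b64decode(payload_b64)
        sig = base64.urlsafe_b64decode(sig_b64)
    except ValueError:  # malformed token or bad base64
        return False
    expected = hmac.new(secret, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(sig, expected):
        return False
    claims = json.loads(payload)
    return claims["ch"] == channel and now < claims["exp"]


if __name__ == "__main__":
    secret = b"demo-secret"  # placeholder; never hard-code real secrets
    token = issue_token(secret, "user-1", "support", ttl_s=60)
    print(verify_token(secret, token, "support"))
```

Because the expiry is part of the signed payload, a client cannot extend its own access; once the lifetime lapses it has to request a fresh token.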

The mechanics behind Agora’s live session channels are equally straightforward. When a user speaks, their voice is routed to an AI agent that joins the channel, listening and responding in real time. The process runs in three steps: converting speech to text, running the text through an AI model, and converting the generated response back to speech, all in as little as 650 milliseconds. For banks seeking granular control, features such as interruption handling and detailed timing analytics are built in.
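The three-step turn and its per-stage timing can be sketched as follows. This is a minimal Python illustration under stated assumptions, not Agora's API: `run_turn`, `TurnResult` and the three stand-in stage functions are hypothetical, and only the 650 millisecond figure comes from the article.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Dict

LATENCY_BUDGET_MS = 650.0  # best-case turn time quoted in the article


@dataclass
class TurnResult:
    reply_audio: bytes
    timings_ms: Dict[str, float] = field(default_factory=dict)

    @property
    def total_ms(self) -> float:
        return sum(self.timings_ms.values())


def run_turn(audio: bytes,
             stt: Callable[[bytes], str],
             llm: Callable[[str], str],
             tts: Callable[[str], bytes]) -> TurnResult:
    """Run speech-to-text, the language model and text-to-speech in order,
    recording how long each stage takes in milliseconds."""
    timings: Dict[str, float] = {}

    def timed(name, stage, value):
        start = time.perf_counter()
        out = stage(value)
        timings[name] = (time.perf_counter() - start) * 1000.0
        return out

    text = timed("stt", stt, audio)
    reply = timed("llm", llm, text)
    speech = timed("tts", tts, reply)
    return TurnResult(reply_audio=speech, timings_ms=timings)


# Stand-in stages so the sketch runs without any external service.
def fake_stt(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_llm(text: str) -> str:
    return f"You asked: {text}"


def fake_tts(text: str) -> bytes:
    return text.encode("utf-8")


if __name__ == "__main__":
    result = run_turn(b"What is my card limit?", fake_stt, fake_llm, fake_tts)
    print(result.reply_audio)
    print(result.timings_ms, result.total_ms <= LATENCY_BUDGET_MS)
```

Keeping a separate timing per stage shows which step consumes the budget. With an end-to-end multimodal model, the three callables would collapse into a single audio-in, audio-out stage.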

Wang notes that connecting a multimodal model such as OpenAI Realtime API or Gemini Live can fundamentally alter this workflow. These advanced models process voice inputs natively, eliminating the need for the traditional three-step conversion pipeline and further accelerating response times. 

FPT Smart Cloud CRO Mark Hall Andrew adds that the company adheres to strict personal data protection policies, referencing both GDPR and Vietnam’s PDP Decree, underscoring transparency, security and data minimisation at every turn.

Mark Hall Andrew, chief revenue officer, FPT Smart Cloud
IMAGE: FPT Corporation

The Agora–FPT partnership enables both firms to meet rising demand for AI engagement platforms in regional financial services.
