AWS Machine Learning Blog

Use generative AI to increase agent productivity through automated call summarization

Your contact center serves as the vital link between your business and your customers. Every call to your contact center is an opportunity to learn more about your customers’ needs and how well you are meeting those needs.

Most contact centers require their agents to summarize their conversation after every call. Call summarization is a valuable tool that helps contact centers understand and gain insights from customer calls. Additionally, accurate call summaries enhance the customer journey by eliminating the need for customers to repeat information when transferred to another agent.

In this post, we explain how to use the power of generative AI to reduce the effort and improve the accuracy of creating call summaries and call dispositions. We also show how to get started quickly using the latest version of our open source solution, Live Call Analytics with Agent Assist.

Challenges with call summaries

As contact centers collect more speech data, the need for efficient call summarization has grown significantly. However, most summaries are empty or inaccurate because manually creating them is time-consuming, impacting agents’ key metrics like average handle time (AHT). Agents report that summarizing can take up to a third of the total call, so they skip it or fill in incomplete information. This hurts the customer experience—long holds frustrate customers while the agent types, and incomplete summaries mean asking customers to repeat information when transferred between agents.

The good news is that automating and solving the summarization challenge is now possible through generative AI.

Generative AI is helping summarize customer calls accurately and efficiently

Generative AI is powered by very large machine learning (ML) models referred to as foundation models (FMs) that are pre-trained on vast amounts of data at scale. A subset of these FMs focused on natural language understanding are called large language models (LLMs) and are able to generate human-like, contextually relevant summaries. The best LLMs can process even complex, non-linear sentence structures with ease and determine various aspects, including topic, intent, next steps, outcomes, and more. Using LLMs to automate call summarization allows for customer conversations to be summarized accurately and in a fraction of the time needed for manual summarization. This in turn enables contact centers to deliver superior customer experience while reducing the documentation burden on their agents.

The following two videos show the solution in action. The first shows Live Call Analytics with Agent Assist summarizing a call after it ends and generating a follow-up email. The second shows how a manager can join an escalating call and generate an in-progress call summary to gain context faster.

Solution overview

The following diagram illustrates the solution workflow.

The first step to generating abstractive call summaries is transcribing the customer call. Accurate, ready-to-use transcripts are crucial for generating accurate and effective call summaries. Amazon Transcribe can help you create transcripts with high accuracy for your contact center calls. Amazon Transcribe is a feature-rich speech-to-text API with state-of-the-art speech recognition models that are fully managed and continuously trained. Customers such as The New York Times, Slack, Zillow, Wix, and thousands of others use Amazon Transcribe to generate highly accurate transcripts to improve their business outcomes. A key differentiator for Amazon Transcribe is its ability to protect customer data by redacting sensitive information from the audio and text. Although protecting customer privacy and safety is important in general to contact centers, it’s even more important to mask sensitive information such as bank account information and Social Security numbers before generating automated call summaries, so that it doesn’t leak into the summaries.
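As a minimal sketch of a defensive check on a transcript before it reaches the LLM, the following Python masks anything that still looks like a Social Security number or a long account number. It is meant as a safety net alongside Amazon Transcribe redaction, not a replacement for it. The function name `mask_residual_pii`, the two patterns, and the `[PII]` placeholder are all assumptions chosen for illustration. Real transcripts may render digits as words or with pauses, which these patterns would miss.

```python
import re

# Hypothetical patterns for this sketch only; tune them to your own data.
_PATTERNS = [
    re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),  # SSN written as 123-45-6789
    re.compile(r"\b\d{8,17}\b"),           # long digit runs such as account numbers
]


def mask_residual_pii(text, placeholder="[PII]"):
    """Replace digit patterns that look sensitive and report how many were found.

    Returns a (masked_text, match_count) tuple so the caller can log or alert
    when the upstream redaction appears to have missed something.
    """
    count = 0
    for pattern in _PATTERNS:
        text, replaced = pattern.subn(placeholder, text)
        count += replaced
    return text, count
```

A nonzero count is a useful signal in its own right: it suggests that redaction was not enabled or not configured for that entity type on the transcription job.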

For customers who are already using Amazon Connect, our omnichannel cloud contact center, Contact Lens for Amazon Connect provides real-time transcription and analytics features natively. However, if you want to use generative AI with your existing contact center, we have developed solutions that do most of the heavy lifting associated with transcribing conversations in real time or post-call from your existing contact center, and generating automated call summaries using generative AI. Additionally, the solution detailed in this section allows you to integrate with your Customer Relationship Management (CRM) system to automatically update your CRM of choice with generated call summaries. In this example, we use our Live Call Analytics with Agent Assist (LCA) solution to generate real-time call transcriptions and call summaries with LLMs hosted on Amazon Bedrock. You can also write an AWS Lambda function and provide LCA the function’s Amazon Resource Name (ARN) in the AWS CloudFormation parameters, and use the LLM of your choice.

The following simplified LCA architecture illustrates call summarization with Amazon Bedrock.

LCA is provided as a CloudFormation template that deploys the preceding architecture and allows you to transcribe calls in real time. The workflow steps are as follows:

  1. Call audio can be streamed via SIPREC from your telephony system to Amazon Chime SDK Voice Connector, which buffers the audio in Amazon Kinesis Video Streams. LCA also supports other audio ingestion mechanisms, such as Genesys Cloud Audiohook.
  2. Amazon Chime SDK Call Analytics then streams the audio from Kinesis Video Streams to Amazon Transcribe, and writes the JSON output to Amazon Kinesis Data Streams.
  3. A Lambda function processes the transcription segments and persists them to an Amazon DynamoDB table.
  4. After the call ends, Amazon Chime SDK Voice Connector publishes an Amazon EventBridge notification that triggers a Lambda function that reads the persisted transcript from DynamoDB, generates an LLM prompt (more on this in the following section), and runs an LLM inference with Amazon Bedrock. The generated summary is persisted to DynamoDB and can be used by the agent in the LCA user interface. You can optionally provide a Lambda function ARN that will be run after the summary is generated to integrate with third-party CRM systems.
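The summarization step in the workflow above can be sketched in Python as follows. This is an illustration under stated assumptions, not LCA's actual code. The item attributes `StartTime`, `Speaker`, and `Text`, and the helpers `load_segments` and `save_summary`, are hypothetical stand-ins for the DynamoDB reads and writes. The request body in `make_bedrock_invoker` assumes a model that accepts the Anthropic Claude text-completions format on Amazon Bedrock; other models expect a different body.

```python
import json


def segments_to_text(items):
    """Order transcript segments by start time and render one line per utterance."""
    ordered = sorted(items, key=lambda item: item["StartTime"])
    return "\n".join(f'{item["Speaker"]}: {item["Text"]}' for item in ordered)


def make_bedrock_invoker(bedrock_runtime, model_id):
    """Wrap a boto3 "bedrock-runtime" client in a prompt -> completion function.

    Assumes the Claude text-completions request and response shape.
    """
    def invoke(prompt):
        body = json.dumps({
            # This format expects the prompt to begin with a blank line before "Human:".
            "prompt": "\n\n" + prompt,
            "max_tokens_to_sample": 512,
            "temperature": 0,
        })
        response = bedrock_runtime.invoke_model(modelId=model_id, body=body)
        return json.loads(response["body"].read())["completion"]
    return invoke


def summarize_call(call_id, load_segments, invoke_llm, save_summary, template):
    """Read the transcript, build the prompt, run the inference, persist the result."""
    items = load_segments(call_id)   # hypothetical: query the transcript table
    if not items:
        return None                  # nothing transcribed yet, so nothing to summarize
    prompt = template.replace("{transcript}", segments_to_text(items))
    summary = invoke_llm(prompt).strip()
    save_summary(call_id, summary)   # hypothetical: write the summary back
    return summary
```

In a Lambda handler you would pass `make_bedrock_invoker(boto3.client("bedrock-runtime"), model_id)` as `invoke_llm`. Passing the storage and model calls in as functions keeps the flow testable without AWS credentials, and nothing in the sketch depends on the call having ended.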

LCA can also call the summarization Lambda function during the call, because the transcript can be fetched and a prompt created at any time, even while the call is in progress. This is useful when a call is transferred to another agent or escalated to a supervisor. Rather than putting the customer on hold and explaining the call, the new agent can quickly read an auto-generated summary that covers the current issue and what the previous agent tried to do to resolve it.

Example call summarization prompt

You can run LLM inferences with prompt engineering to generate and improve your call summaries. You can modify the prompt templates to see what works best for the LLM you select. The following is an example of the default prompt for summarizing a transcript with LCA. We replace the {transcript} placeholder with the actual transcript of the call.

Human: Answer the questions below, defined in <question></question> based on the transcript defined in <transcript></transcript>. If you cannot answer the question, reply with 'n/a'. Use gender neutral pronouns. When you reply, only respond with the answer.

<question>
What is a summary of the transcript?
</question>

<transcript>
{transcript}
</transcript>

Assistant:
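To make the placeholder substitution concrete, here is a minimal Python sketch that fills the default prompt shown above. The helper names `format_transcript` and `build_prompt`, and the `(speaker, text)` segment shape, are assumptions for illustration rather than LCA's internal API.

```python
DEFAULT_PROMPT_TEMPLATE = (
    "Human: Answer the questions below, defined in <question></question> "
    "based on the transcript defined in <transcript></transcript>. "
    "If you cannot answer the question, reply with 'n/a'. "
    "Use gender neutral pronouns. "
    "When you reply, only respond with the answer.\n\n"
    "<question>\nWhat is a summary of the transcript?\n</question>\n\n"
    "<transcript>\n{transcript}\n</transcript>\n\n"
    "Assistant:"
)


def format_transcript(segments):
    """Join (speaker, text) pairs into one line per utterance."""
    return "\n".join(f"{speaker}: {text.strip()}" for speaker, text in segments)


def build_prompt(template, segments):
    """Substitute the formatted transcript for the {transcript} placeholder."""
    # str.replace is used instead of str.format so that braces appearing in
    # the transcribed text cannot be misread as format fields.
    return template.replace("{transcript}", format_transcript(segments))


if __name__ == "__main__":
    example = [
        ("AGENT", "Hello, how can I help?"),
        ("CALLER", "My card was declined."),
    ]
    print(build_prompt(DEFAULT_PROMPT_TEMPLATE, example))
```

Keeping the template as data rather than code makes it straightforward to try different wording for the model you select without touching the surrounding logic.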

LCA runs the prompt and stores the generated summary. Besides summarization, you can direct the LLM to generate almost any text that is important for agent productivity. For example, you can have it choose from a set of topics that were covered during the call (agent disposition), generate a list of required follow-up tasks, or even write an email to the caller thanking them for the call.

The following screenshot is an example of agent follow-up email generation in the LCA user interface.

With a well-engineered prompt, some LLMs have the ability to generate all of this information in a single inference as well, reducing inference cost and processing time. The agent can then use the generated response within a few seconds of ending the call for their after-contact work. You can also integrate the generated response automatically into your CRM system.
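One way to approach single-inference generation is to ask several questions in one prompt and request a JSON object in reply. The sketch below illustrates that idea. The question keys, the prompt wording, and the function names are hypothetical, and whether a given model reliably returns clean JSON is something to verify for the LLM you select. The parser therefore falls back to 'n/a' for anything it cannot read.

```python
import json

# Hypothetical question set for this sketch; adapt the keys and wording as needed.
QUESTIONS = {
    "summary": "What is a summary of the transcript?",
    "topic": "Which one of these topics best describes the call: {topics}?",
    "follow_ups": "What follow-up tasks does the agent need to complete?",
}


def build_multi_prompt(transcript, topics):
    """Build one prompt that asks every question and requests a JSON reply."""
    questions = dict(QUESTIONS)
    questions["topic"] = questions["topic"].replace("{topics}", ", ".join(topics))
    question_block = "\n".join(
        f'<question name="{name}">{text}</question>'
        for name, text in questions.items()
    )
    return (
        "Human: Answer each question below based on the transcript defined in "
        "<transcript></transcript>. If you cannot answer a question, use 'n/a'. "
        "Use gender neutral pronouns. Reply with a single JSON object whose keys "
        "are " + ", ".join(questions) + " and nothing else.\n\n"
        + question_block
        + "\n\n<transcript>\n" + transcript + "\n</transcript>\n\nAssistant:"
    )


def parse_multi_response(completion):
    """Extract the JSON object from the reply, defaulting each key to 'n/a'."""
    fallback = {key: "n/a" for key in QUESTIONS}
    start, end = completion.find("{"), completion.rfind("}")
    if start == -1 or end < start:
        return fallback
    try:
        data = json.loads(completion[start:end + 1])
    except json.JSONDecodeError:
        return fallback
    return {key: data.get(key, "n/a") for key in QUESTIONS}
```

Because every key always comes back, downstream code such as a CRM update can treat a missing answer the same way it treats an explicit 'n/a'.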

The following screenshot shows an example summary in the LCA user interface.

It’s also possible to generate a summary while the call is still ongoing (see the following screenshot), which can be especially helpful for long customer calls.

Prior to generative AI, agents had to follow the conversation while also taking notes and performing other tasks. By automatically transcribing the call and using LLMs to create summaries, we can lower the mental burden on the agent, so they can focus on delivering a superior customer experience. This also leads to more accurate after-call work, because the summary is based on a transcript of what was actually said during the call, not just on what the agent noted down or remembered.

Summary

The sample LCA application is provided as open source—use it as a starting point for your own solution, and help us make it better by contributing back fixes and features via GitHub pull requests. For information about deploying LCA, refer to Live call analytics and agent assist for your contact center with Amazon language AI services. Browse to the LCA GitHub repository to explore the code, sign up to be notified of new releases, and check out the README for the latest documentation updates. For customers who are already on Amazon Connect, you can learn more about generative AI with Amazon Connect by referring to How contact center leaders can prepare for generative AI.


About the authors

Christopher Lott is a Senior Solutions Architect in the AWS AI Language Services team. He has 20 years of enterprise software development experience. Chris lives in Sacramento, California and enjoys gardening, aerospace, and traveling the world.

Smriti Ranjan is a Principal Product Manager in the AWS AI/ML team focusing on language and search services. Prior to joining AWS, she worked at Amazon Devices and other technology startups leading product and growth functions. Smriti lives in Boston, MA and enjoys hiking, attending concerts and traveling the world.