26th October 2023
Gennadiy Bezko at Miarec explores call scoring in the contact centre, comparing manual, keyword-based, and generative AI-based options, and explaining the benefits and use cases of each.
“Did the agent say their name?”, “Did they get consent for recording?”, “Did the agent resolve the matter on the first call?” These are just a few of the standard call scoring questions that contact centre supervisors try to answer every day.
With hundreds or even thousands of calls to get through, making call scoring for quality assurance as efficient as possible is key.
With advances in generative AI and Voice Analytics, automated call scoring is quickly becoming the preferred method for many contact centre managers. But there is still a time and place for manual call scoring.
In fact, the best contact centre managers use all three methods in conjunction, with each playing to its strengths.
In this blog post, we’ll explore three main call scoring methods for quality assurance (manual, keyword-based, and Generative AI-based automatic call scoring), explore their pros and cons, and provide best practices when using any of the approaches so that you can make an informed decision on which one works best for your organization’s needs.
Manual call scoring is the most basic form of quality assurance and refers to the process of evaluating customer service calls by listening to call recordings, assessing the performance of the agent, and assigning scores based on predetermined criteria.
It involves human judgment and expertise and does not require any Voice Analytics or AI-based tools.
Manual call scoring requires trained personnel familiar with the criteria used for evaluation, such as customer satisfaction, politeness, problem resolution accuracy, etc.
There are two distinct benefits of doing manual call scoring compared to more automated methods:
First, direct human judgment provides higher levels of accuracy for the specific calls reviewed.
Secondly, there is no reliance on technology, which means there’s no risk of technical errors in the assessment.
But manual call scoring also has its disadvantages. It is time-consuming, as each recording must be listened to in full before it can be assessed.
Additionally, there is potential bias when assessing calls due to personal preferences or other factors influencing an evaluator’s opinion about a particular interaction.
Lastly, scaling up this process can be difficult if a company has many customers or large volumes of calls every day.
While it could be done with a clipboard or an Excel sheet, in its simplest software form, agent evaluation functionality allows contact centre managers to evaluate call recordings using customizable forms.
These forms are based on predetermined criteria and integrated into the call recording detail screen.
The supervisor is supported by various software features, e.g., they can speed up the call while listening.
After the supervisor evaluates the call, the system automatically calculates the score and creates an evaluation report for the call, including a score expressed as a percentage and color-coded reasoning for each section.
This allows you to instantly understand which sections had problems and which were done well. In addition, you can track and report on your agents’ and teams’ performance over time.
In summary, manual call scoring for quality assurance is very time-consuming and prone to human error, making it necessary to explore automated solutions.
However, manual call scoring won’t disappear. The idea of Auto QA is to score or analyze calls en masse, freeing up humans to focus on more detailed reviews on specific calls.
Keyword-based call scoring for quality assurance is a process that uses artificial intelligence (AI) algorithms to analyze calls for specific keywords or phrases based on preset syntax expressions.
Keyword-based call scoring offers many benefits over manual evaluations as it can evaluate 100% of your calls effectively for simple use cases where specific keywords are indicative of the call’s content. This is especially useful to check if compliance statements have been read or call scripts are adhered to.
For example, to answer the question “Did the agent notify the caller that the call is being recorded?”, the system would search the following key phrases in a call transcript: “This call may be monitored”, “This call is being recorded”, “You are on a recorded line”, etc. Such a list of all variants of key phrases can be long.
The more advanced systems support advanced query expressions, like “call NEAR (recorded OR monitored)”, which can match many variants of the key phrase at once.
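A query such as “call NEAR (recorded OR monitored)” can be approximated with a simple proximity check. The sketch below is a minimal illustration of the idea, not the expression syntax of any particular Voice Analytics product; real engines support far richer operators.

```python
import re

def near_match(transcript: str, term: str, alternatives: list[str],
               window: int = 5) -> bool:
    """Return True if `term` occurs within `window` words of any word
    in `alternatives` -- a simplified version of a NEAR/OR query."""
    words = re.findall(r"[a-z']+", transcript.lower())
    term_positions = [i for i, w in enumerate(words) if w == term]
    alt_positions = [i for i, w in enumerate(words) if w in alternatives]
    return any(abs(t - a) <= window
               for t in term_positions for a in alt_positions)

# "call" appears 3 words before "monitored", so this matches:
print(near_match("This call may be monitored for quality purposes.",
                 "call", ["recorded", "monitored"]))  # True
```

Because one proximity rule covers many phrasings (“this call is being recorded”, “your call may be monitored”), it is far less brittle than maintaining an exhaustive list of exact phrases.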
Despite its many advantages over traditional methods, there are some potential challenges associated with automated call scoring systems that should be taken into consideration before implementing one in your organization.
Most of these shortcomings can be resolved by using Generative AI-based auto evaluation, discussed below.
Despite the noted drawbacks, keyword-based evaluation performs effectively in scenarios where agents adhere to a script and calls exhibit a high degree of uniformity.
One example of keyword-based call scoring is Auto Score Card, an AI-driven automatic call scoring feature powered by a Voice Analytics module within the Conversational Intelligence platform.
It is quick and easy to set up, although it does require some work and diligence to define the keywords and phrases using our well-documented expression syntax.
Based on the predefined criteria, it will look for keywords and phrases associated with the call scoring criteria.
For example, one criterion might be thanking the caller for calling today. If the agent used the prescribed script, they receive a good score for that segment.
Each criterion can be weighted, allowing you to emphasize more important aspects over less important ones.
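The weighted-scoring idea can be sketched in a few lines. The criterion names and weights below are illustrative, not a real scorecard.

```python
def weighted_score(results: dict[str, bool], weights: dict[str, float]) -> float:
    """Compute a percentage score from pass/fail criteria, where weights
    emphasize more important criteria over less important ones."""
    total = sum(weights.values())
    earned = sum(weights[c] for c, passed in results.items() if passed)
    return round(100 * earned / total, 1)

score = weighted_score(
    {"greeting": True, "recording_notice": True, "issue_resolved": False},
    {"greeting": 1.0, "recording_notice": 2.0, "issue_resolved": 3.0},
)
print(score)  # 50.0 -- failing the heaviest criterion costs half the score
```

Note how the weights change the outcome: the agent passed two of three criteria, but missing the highest-weighted one pulls the score down to 50%, not 67%.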
Auto Score Card is highly effective when combined with sentiment scores. For example, when an agent has to place a customer on hold, the positive language they use is reflected in a good agent sentiment score.
In summary, keyword-based call scoring may still be useful in some scenarios, but it has limitations due to the unstructured nature of conversations.
The third method of Auto QA employs Generative AI, particularly large language models, to analyze telephone conversations. Large language models demonstrate superior comprehension skills, enabling the system to assess the call in its entirety.
This recent advancement in technology marks a significant step in automated quality assurance and analysis of customer interactions.
A key difference between the Generative AI-based and keyword-based auto evaluation is the following:
Instead of scanning for pre-defined keywords or key phrases within a call transcript, the Generative AI model is provided with a transcript alongside a list of scoring questions phrased naturally.
For example, a question might be, “Based on the provided call transcript, did the agent ask for the name of the caller at the beginning of the conversation?”
The AI is capable of answering this question based on the essence of the conversation, rather than relying on the presence of specific words or phrases.
In some cases, the AI can even respond with “Not applicable, the caller mentioned their name at the outset of the conversation, saying ‘Hi, this is David Schmidt.'”
Generative AI-based automatic call scoring offers many benefits, including the ability to evaluate 100% of calls and to express scoring criteria in plain natural language.
There are a few challenges to consider with Auto QA. For example, like keyword-based QA, it requires an investment in Voice Analytics software as well as highly accurate Speech-to-Text transcription.
Imagine being able to write a simple prompt in natural language to get an assessment of whether the issue was resolved on the first call or whether the agent was professional and courteous. That’s where Auto QA comes in.
Auto QA, which takes advantage of Generative AI, gives companies the opportunity to capture the maximum benefit from their customer interactions.
With the addition of Sentiment Analysis and Topical Analysis, Auto QA enhances the quantity and quality of insights that can be gained from conversations, providing a more thorough view of customer service agent performance and customer encounters.
In summary, generative AI is the future of QA. It comes ready to use and is equipped with basic questionnaires/scorecards to get you started. Because it is based on generative AI, it is much easier to customize.
While each of the three methods represents a significant functional advance over the last, these tools are most powerful when used in a complementary way.
While AI, especially with advancements in Generative AI and natural language processing, showcases promising strides in accurately evaluating agent performance, the human touch embodies an irreplaceable nuance and understanding of contextual subtleties.
AI can evaluate 100% of interactions in the contact centre, which can significantly optimize the evaluation process.
However, the empathetic understanding, experiential judgment, and adaptive feedback that human evaluators offer are crucial for the holistic development of agents.
Moreover, humans can discern the emotional tone and underlying customer sentiments in a way that AI might not fully grasp.
The integration of AI could augment the evaluation process, making it more data-driven and timely, yet the comprehensive insight provided by human evaluators remains pivotal.
AI tools, especially generative AI, can pre-score calls, allowing human evaluators to focus on calls that need the most attention.
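This pre-scoring triage can be reduced to a simple filter: route low-scoring calls to human evaluators. The field names and threshold below are purely illustrative.

```python
def flag_for_review(calls: list[dict], threshold: float = 70.0) -> list[str]:
    """Return the IDs of calls whose AI pre-score falls below the
    threshold, so human evaluators can focus their time there."""
    return [c["id"] for c in calls if c["ai_score"] < threshold]

calls = [
    {"id": "c1", "ai_score": 92.0},
    {"id": "c2", "ai_score": 55.5},
    {"id": "c3", "ai_score": 68.0},
]
print(flag_for_review(calls))  # ['c2', 'c3']
```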
Also, we put feedback loops in place, allowing users to provide feedback on the AI’s assessments, which can be used to improve the system’s accuracy.
In conclusion, when it comes to manual vs. keyword-based vs. automatic call scoring for quality assurance, there are pros and cons to each approach.
Manual call scoring is time-consuming but sometimes provides more accurate results as the agent’s performance can be evaluated in detail.
While keyword-based call scoring offers a more efficient way of assessing agents and is able to provide detailed feedback on their performance, it requires experience working with syntax expressions and is limited to certain scenarios.
Generative AI-based call scoring is the most advanced and efficient option, requiring minimal effort from managers and agents. This allows contact centres to evaluate agent performance at scale.
We offer a range of QA tools, from manual to AI-driven, each with its strengths and use cases. However, rather than choosing one approach, the vision is to have these tools work in harmony, leveraging the speed and breadth of AI with the depth and nuance of human judgment.
Quality assurance in a call centre is measured by analyzing customer interactions to identify areas of improvement.
This can be done through automated quality management systems, which record calls and use voice analytics to score the conversation using pre-defined criteria.
These systems also provide feedback on agent performance, allowing managers to monitor and improve the overall quality of service provided by their contact centre.
Additionally, compliance officers can use these recordings for audit purposes, and customer service teams can review them to ensure that agents are following company policies and procedures.
Creating a robust call scorecard for agent evaluation in a contact centre is essential for maintaining a high level of customer satisfaction and agent performance.
An effective call scorecard combines several key elements: clear, predetermined criteria, weighting that reflects each criterion’s importance, and consistent application across agents and teams.
The exact percentage of calls that should be evaluated depends on the size and scope of your contact centre, as well as the nature of customer interactions.
Generally speaking, a good starting point is to have at least 5 calls reviewed for quality assurance purposes for each agent weekly.
This can help ensure that customer service standards are being met and that any issues are identified quickly so they can be addressed promptly.
However, evaluating 100% of your calls gives you a complete and, therefore, much more accurate picture.
Additionally, having automated Voice Analytics tools in place can help identify trends or conversation patterns that may require further investigation or additional training for agents.