Oct 1, 2024
AI in Consulting Practices #4: Measuring Equals Learning
Dennie van den Biggelaar, Onesurance, in Know Your Stuff!, VVP 4-2024
In this fourth part of the series AI in Consulting Practices, we focus on a crucial question: how do you know, and measure, whether your AI system is really doing what it should? In the first part (VVP 1, 2024), AI strategist Dennie van den Biggelaar showed how to get started with machine learning (a specific branch of AI); in the second part (VVP 2), how to operationalize AI in your business processes; and in the third part (VVP 3), how to integrate AI software into an existing IT landscape.
Measuring the effectiveness of an AI application starts with defining clear 'business KPIs.' These KPIs are crucial because they guide which aspects of your business operations you want to improve and how you can make these improvements measurable. For an insurance company, these goals might include increasing revenue, improving retention, increasing policy density, or increasing STP acceptance. Establishing these KPIs provides a framework for both the development and evaluation of the AI application.
Human and machine
In practice, AI applications often work alongside human experts. Therefore, it's important to measure the performance of both the AI and the human separately and together. This provides insight into the effectiveness of the collaboration and helps you determine where improvements are possible.
Example: active customer management. Imagine you have an AI algorithm that identifies customers with a high likelihood of churning. If the sales team or advisor does not follow up adequately on these signals, the intended reduction in churn may not materialize. By measuring performance per employee, you can discover whether certain employees achieve better results than others; these insights can then be shared to strengthen the team as a whole.
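To make this concrete, here is a minimal sketch (in Python, with entirely hypothetical employee names and data) of how follow-up and retention rates per employee could be tracked for AI-generated churn signals:

```python
from collections import defaultdict

# Hypothetical log of AI churn signals: which employee received the
# signal, whether they followed up, and whether the customer stayed.
signals = [
    {"employee": "Anna", "followed_up": True,  "retained": True},
    {"employee": "Anna", "followed_up": True,  "retained": False},
    {"employee": "Anna", "followed_up": False, "retained": False},
    {"employee": "Bram", "followed_up": True,  "retained": True},
    {"employee": "Bram", "followed_up": True,  "retained": True},
    {"employee": "Bram", "followed_up": False, "retained": False},
]

def per_employee_stats(rows):
    """Follow-up rate and retention rate per employee."""
    totals = defaultdict(lambda: {"signals": 0, "followed_up": 0, "retained": 0})
    for r in rows:
        t = totals[r["employee"]]
        t["signals"] += 1
        t["followed_up"] += r["followed_up"]
        t["retained"] += r["retained"]
    return {
        emp: {
            "follow_up_rate": t["followed_up"] / t["signals"],
            "retention_rate": t["retained"] / t["signals"],
        }
        for emp, t in totals.items()
    }
```

Comparing these rates across employees is exactly the kind of human-side measurement the article describes: it shows whether the AI's signals are being acted on, separately from whether the signals themselves are accurate.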
Technical performance
To assess the technical performance of a predictive algorithm, several indicators are used: accuracy (how often the algorithm makes the correct prediction), precision (the reliability of its positive predictions), sensitivity, also called recall (how well the model detects all relevant outcomes), Area Under the Curve or AUC (a summary of the model's prediction quality across different decision thresholds), and log loss (how close the predicted probabilities are to the actual outcomes).
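These indicators can all be computed from the true labels and the model's predicted probabilities. A self-contained sketch in plain Python (the data here is illustrative, not from the article):

```python
import math

def classification_metrics(y_true, y_prob, threshold=0.5):
    """Accuracy, precision, sensitivity (recall), and log loss for a
    binary classifier, given true labels and predicted probabilities."""
    y_pred = [1 if p >= threshold else 0 for p in y_prob]
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    eps = 1e-15  # keep probabilities away from 0 and 1 so log() is defined
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "sensitivity": tp / (tp + fn) if tp + fn else 0.0,
        "log_loss": -sum(
            t * math.log(max(p, eps)) + (1 - t) * math.log(max(1 - p, eps))
            for t, p in zip(y_true, y_prob)
        ) / len(y_true),
    }

def auc(y_true, y_prob):
    """Area Under the ROC Curve via pairwise comparison: the fraction of
    (positive, negative) pairs the model ranks correctly (ties count 0.5)."""
    pos = [p for t, p in zip(y_true, y_prob) if t == 1]
    neg = [p for t, p in zip(y_true, y_prob) if t == 0]
    wins = sum(1.0 if pp > pn else 0.5 if pp == pn else 0.0
               for pp in pos for pn in neg)
    return wins / (len(pos) * len(neg))
```

Note that accuracy, precision, and sensitivity depend on the chosen threshold, while AUC and log loss evaluate the probabilities themselves, which is why they complement each other.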
In addition to these indicators, speed, efficiency, and scalability matter. Speed, or latency, is how quickly the AI application responds to a request. Efficiency is measured by the application's memory usage, and scalability by the number of predictions made within a certain time frame (throughput). Together, these factors determine whether an algorithm can operate at scale.
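Latency and throughput are straightforward to measure around any prediction function. A minimal sketch, where `dummy_predict` is a stand-in for a real model call:

```python
import time

def dummy_predict(x):
    # Placeholder for a real model inference call.
    return x * 0.5

def measure(fn, inputs):
    """Average latency (seconds per prediction) and throughput
    (predictions per second) over a batch of inputs."""
    start = time.perf_counter()
    for x in inputs:
        fn(x)
    elapsed = time.perf_counter() - start
    n = len(inputs)
    return {"avg_latency": elapsed / n, "throughput": n / elapsed}
```

In practice you would run this against the deployed endpoint under realistic load, since network overhead and concurrency dominate real-world latency.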
Robust and ethical
An AI application must not only perform well technically but also be robust and ethically sound. Robustness includes the model's ability to remain effective as its environment changes, so that predictions do not silently degrade (model drift and shift). In addition, shifts in the input data relative to the data the model was trained on (data drift and shift) must be detected and monitored. Ethical considerations, such as preventing discrimination based on gender, ethnicity, or age, are equally crucial to ensure the AI operates fairly and responsibly.
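One common way to monitor data drift (one technique among several, not prescribed by the article) is the Population Stability Index, which compares the distribution of a feature in live data against the training data. A minimal sketch:

```python
import math

def psi(expected, actual, bins=10):
    """Population Stability Index between a reference (training) sample
    and a live sample. As a rule of thumb, values above ~0.2 are often
    treated as significant drift. Live values outside the reference
    range fall into no bin, which itself inflates the index."""
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins

    def share(sample, i):
        low, high = lo + i * width, lo + (i + 1) * width
        count = sum(1 for x in sample
                    if low <= x < high or (i == bins - 1 and x == high))
        return max(count / len(sample), 1e-6)  # avoid log(0)

    return sum(
        (share(actual, i) - share(expected, i))
        * math.log(share(actual, i) / share(expected, i))
        for i in range(bins)
    )
```

A PSI of zero means the live distribution matches the training distribution; the larger the value, the stronger the signal that the model should be revalidated or retrained.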
'Measuring the effectiveness of an AI application is a complex but necessary process.'
Uptime and reliability
As with any cloud-based application, the uptime of an AI application is crucial, especially in production environments. A common standard in a Service Level Agreement (SLA) is an uptime of 99.9 percent, which corresponds to at most about 8.8 hours of downtime per year, or roughly one failure in every 1,000 interactions with the application. To ensure this reliability, a backup application is often deployed that can take over in case of failures.
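The downtime budget implied by an SLA percentage is simple arithmetic, sketched here:

```python
def allowed_downtime_hours_per_year(uptime_pct):
    """Maximum downtime per year (in hours) implied by an SLA uptime
    percentage, e.g. 99.9% -> about 8.8 hours per year."""
    return (1 - uptime_pct / 100) * 365 * 24
```

Each extra nine is a tenfold tighter budget: 99.99 percent allows under an hour of downtime per year, which is why failover capacity is usually required at that level.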
From prototype to production
Setting up an AI application is a step-by-step process. In the prototype phase, the focus is primarily on testing the predictive power of the algorithm and minimizing any discrimination. If the AI application passes these tests, the next step is to assess whether it actually improves the desired business KPIs. The scalability of the model is also taken into consideration at this stage.
Once the AI is in production, the focus shifts to ensuring uptime and monitoring the robustness of the AI over time. By systematically measuring and evaluating, you can continuously improve and ensure that your AI application does what it is supposed to do, now and in the future.
Measuring impact
One of the most effective methods to measure whether an AI application delivers the desired results is through A/B testing. In this method, the target audience is randomly divided into two groups: one group (Group A) uses the new AI application, while the other group (Group B) uses the traditional method or a previous version of the system without AI. By comparing the performance of both groups, you can determine how effective the AI is in improving the business KPIs.
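The mechanics of such a test can be sketched in a few lines: randomly assign customers to the two groups, then test whether the difference in outcomes (here, hypothetical retention counts) is statistically significant. This uses a standard two-proportion z-test, one common choice for comparing A/B outcomes:

```python
import math
import random

def ab_split(customer_ids, seed=42):
    """Randomly assign customers to group A (with AI) or group B (without)."""
    rng = random.Random(seed)
    groups = {"A": [], "B": []}
    for cid in customer_ids:
        groups[rng.choice("AB")].append(cid)
    return groups

def two_proportion_z_test(success_a, n_a, success_b, n_b):
    """Two-sided z-test for the difference between two proportions,
    e.g. retention rates in group A versus group B."""
    p_a, p_b = success_a / n_a, success_b / n_b
    p_pool = (success_a + success_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return z, p_value
```

For example, 120 retained customers out of 1,000 in the AI group versus 90 out of 1,000 in the control group yields a p-value below 0.05, suggesting the improvement is unlikely to be chance; the figures here are illustrative, not from the article.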
The success of an AI application greatly depends on how insights from A/B tests are integrated into business operations. For instance, if an A/B test shows that a particular AI tool leads to higher policy density, this could be a reason to roll out the tool more broadly within the organization.
Effective
Measuring the effectiveness of an AI application is a complex but necessary process. It begins with defining clear business KPIs and evaluating both the technical performance and the collaboration between human and machine. Robustness, ethical considerations, and uptime are just as important as the predictive power of the algorithm. A/B testing then lets you reliably determine whether the AI application genuinely contributes to achieving your business goals. In the end, the application must not only function well technically but also demonstrably improve your business outcomes.
The original article was published in VVP and can be read online.