Africa: Navigating AI hallucinations and biases towards a safer fintech future

15 November 2023
– 5 Minute Read

Fintech players are expected to move quickly to harness the power of generative artificial intelligence (GAI) to improve their service offerings and remain competitive. But is the fintech sector fully aware of the risks that need to be managed at this turning point?

The potential benefits of GAI for fintech firms are compelling. By relying on GAI to automate decision-making, these firms could offer their clients access to real-time support from user-friendly chatbots and personalised investment advice informed by dynamic market analyses. GAI could also sharpen their fraud and money laundering detection and prevention capabilities and enhance the speed and accuracy of their risk assessment and price determination processes.

However, GAI systems also present serious risks, and understanding their architecture is key to identifying and addressing those risks in practice.

Many GAI systems are built on large language models (LLMs) – algorithms that are trained on large and complex datasets and use deep learning techniques to generate new content. Their training datasets are made up of predetermined categories of data and may be updated over time, in some cases with users’ prompts and the outputs that LLMs generate from those prompts. In simple terms, LLMs function like automated digital libraries of growing volumes of data that can generate text, images and other media at the click of a button with minimal inputs.

What risks do LLMs present to fintech firms?

While LLMs may help fintech firms to work smarter with their data, LLMs are only as good as their training datasets and typically lack the ability to reliably verify and ensure the accuracy of their outputs. This presents serious risks to fintech firms and their clients, particularly where the design of LLMs and the contents of their training datasets reflect biases that contribute to inaccurate and discriminatory decisions being generated. These risks are exacerbated by the notorious ability of LLMs to present fictitious ‘hallucinations’ as fact in a compelling way.

Other risks include: (i) data protection risks (where training data or prompts hold personal information of clients or employees); (ii) intellectual property (IP) infringement risks (where the outputs generated incorporate or otherwise infringe a third party’s IP rights); and (iii) regulatory risks (where the use of GAI systems infringes applicable financial services or insurance legislation).

How can biased datasets used by fintech firms result in unfair discrimination?

Biases in LLMs can subsist in both the structure of their decision-making processes and the data on which the processes rely. These biases can be baked into an LLM’s decision-making and outputs.

For example, a GAI system may be used to generate creditworthiness reports using processes and datasets that favour certain demographics of clients over others by under- or overrepresenting various input factors. If, for example, communities in certain geographical areas are inaccurately represented in these datasets, members of those communities may be issued with biased creditworthiness reports that inaccurately find they have low creditworthiness ratings.

Similarly, biases of a GAI system’s developers that are incorporated into the structure of its decision-making process (e.g. where undue weight is given to metrics that correlate strongly with the age, race, sex, gender, education or ethnicity of individuals) may lead to outputs that are not only biased but also false.

If clients are denied access to loan or credit facilities on the basis of inaccurate and biased outputs, this may lead to unfair discrimination and economic exclusion – an outcome that would be adverse to the well-being of those clients and to the reputation of those firms.
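
One practical way to look for the kind of skew described above is to compare approval rates across groups of clients. The sketch below is illustrative only: the group labels, the sample decisions and the 0.8 review threshold are all assumptions made for the example (the threshold echoes a rule of thumb used in some employment contexts and is not a legal standard for credit decisions).

```python
from collections import defaultdict


def approval_rates(decisions):
    """Return the share of approved applications per group.

    `decisions` is an iterable of (group, approved) pairs.
    """
    totals = defaultdict(int)
    approved = defaultdict(int)
    for group, ok in decisions:
        totals[group] += 1
        if ok:
            approved[group] += 1
    return {group: approved[group] / totals[group] for group in totals}


def disparate_impact_ratio(rates):
    """Lowest group approval rate divided by the highest."""
    highest = max(rates.values())
    if highest == 0:
        return 1.0  # nobody was approved, so no group was favoured
    return min(rates.values()) / highest


# Hypothetical decisions for applicants from two geographical areas.
decisions = (
    [("area_a", True)] * 8 + [("area_a", False)] * 2
    + [("area_b", True)] * 4 + [("area_b", False)] * 6
)

rates = approval_rates(decisions)
ratio = disparate_impact_ratio(rates)

REVIEW_THRESHOLD = 0.8  # assumption for this sketch, not a legal test
needs_review = ratio < REVIEW_THRESHOLD
```

A low ratio does not prove unfair discrimination on its own, but it signals that the underlying data and decision logic deserve a closer human review.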

What are AI hallucinations and how do they occur?

An AI hallucination occurs where an LLM generates outputs that appear factual but are actually false. This is typically triggered by two related factors.

Firstly, an LLM is usually designed to predict the next item in a sequence (such as the next word in a sentence) based on an analysis of its training data, without verifying the accuracy of that data or of the outputs it generates. Secondly, those outputs are likely to be false where they are generated from inaccurate, incomplete or biased training data.

The risk of hallucination increases where the input prompts are vague or misleading, and where false outputs are fed back into the training data without correction and then replicated in subsequent outputs. Things can therefore go from bad to worse when an LLM hallucinates.
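
The prediction mechanism behind hallucinations can be illustrated with a deliberately tiny model. The sketch below is a toy word-pair predictor, far simpler than any real LLM, and the training sentences and function names are hypothetical. It shows how a system that only predicts the most likely next word can make a confident statement about something it has no data on.

```python
from collections import Counter, defaultdict


def train_bigrams(sentences):
    """Count, for each word, which words follow it in the training text."""
    follows = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for current, nxt in zip(words, words[1:]):
            follows[current][nxt] += 1
    return follows


def complete(follows, prompt, max_words=6):
    """Extend the prompt by repeatedly appending the most frequent next word.

    Nothing here checks whether the resulting statement is true.
    """
    words = prompt.lower().split()
    for _ in range(max_words):
        candidates = follows.get(words[-1])
        if not candidates:
            break  # the model has never seen anything follow this word
        words.append(candidates.most_common(1)[0][0])
    return " ".join(words)


# Hypothetical training text: it mentions loans only, never claims.
training = [
    "the loan was approved",
    "the loan was approved",
    "the loan was declined",
]

model = train_bigrams(training)
output = complete(model, "the claim was")
```

Here `output` is "the claim was approved", even though the training text contains no information about any claim. The model has simply reproduced the most common pattern it has seen, which is the essence of a hallucination.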

How can these risks be mitigated?

It is not easy to retrospectively trace and address the exact roots of a problem given that the architecture of GAI systems is typically opaque. It is therefore important that GAI systems are built, by default and by design, in a way that is responsive to these risks.

Legal recourse for LLM users tends to be limited, as developers provide LLMs on an ‘as is’ basis without warranties and subject to substantial limitations on their liability. Responsibility for use of the outputs generally falls squarely on the user. Wherever possible, it is best to ensure risks and liabilities are appropriately distributed by way of contract between LLM developers, fintech firms and clients.

In addition, fintech firms that rely on LLMs must proactively implement practical mitigation measures to address data and decision-making risks before deploying GAI systems to service clients. Ultimately, fintech firms need to appreciate that they are responsible for ensuring that their use of GAI systems is intelligent and lawful.