As Artificial Intelligence (AI) technologies become integrated into business processes, organisations face potential litigation risks stemming from the effects of AI bias. These risks include litigation involving discrimination, product liability, corporate disclosure, and intellectual property. Individuals and organisations can begin to manage this risk by understanding the origins of AI bias and how to mitigate it. This article identifies different forms of AI bias and discusses how allegations relating to AI-based systems can be analysed from a technical perspective.

Types and Sources of AI Bias

AI bias occurs when a system produces results that are systematically prejudiced. Diverse backgrounds and perspectives are crucial to managing bias effectively, as bias can emerge both from technical sources and from insufficient AI governance within the organisation.

From a technical perspective, the underlying data set that is used to ‘train’ the AI system is often a key focus of inquiry. Training is the process of teaching AI to make predictions or decisions based on a set of data. Many AI technologies are powerful pattern recognition systems that rely on the patterns learnt from the training data to generate output. Any pattern, including inherent biases in the original data set, could be preserved in the AI system, leading to biased outputs. A common source of bias in the training data is the underlying prejudices that exist in the data generation process itself (e.g., society).

Direct vs Indirect Bias

Using protected attributes (e.g., person’s gender, race, or age) explicitly in the prediction or decision-making process is called direct bias. In contrast, indirect bias occurs when protected attributes are omitted from a decision-making process but can be inferred from other correlated attributes (e.g., inferring a job candidate’s gender based on being the Captain of a men’s soccer team).

Direct bias commonly originates in the training data itself. For example, a loan approval system trained using data from lenders who preferentially lent to young white males will likely reproduce this preference in its approval decisions. Bias in the data set can also emerge from how the training data was sampled. For example, a facial recognition system trained primarily on one ethnic group is unlikely to perform as well on individuals from other ethnic groups.

Indirect bias consists of often subtle, hidden forms of bias that manifest in AI systems. Even if protected attributes are removed from the training data, biases may still arise due to correlations between protected attributes and other non-protected variables. For example, an AI-driven job applicant screening platform that was not explicitly designed to incorporate geographical location may learn to subtly favour individuals living in specific regions. This indirect bias may stem from correlated characteristics, such as living closer to prominent universities or technology hubs. Therefore, even though location was not directly used in the decision-making process, it indirectly influenced the system’s recommendations, leading to a skewed preference for candidates from certain locations.

From a governance perspective, how an AI system is applied in practice relative to its original intended purpose can also contribute to AI bias. Examples of this include the misapplication of an AI system by applying it in a context that it was not designed for (e.g., a credit scoring AI trained on data from one country and then applied to another). This misapplication is sometimes referred to as the ‘Portability Trap’, a reference to how portable (or not) the software implementation of the AI is from one domain to another.

Governance of AI systems refers to the framework for managing an organisation’s use of AI and includes policies, procedures, systems, and structures. Increasingly, organisations are viewing AI governance as an ongoing process, as AI bias and related challenges are not static. For example, a system that adapts and learns over time may become biased without adequate oversight due to the raw data it uses to train the algorithm. Alternatively, a system that is initially unbiased, but is prevented from future “learning”, may become biased due to changes in the underlying societal context in which it is used (e.g., if a previously unrepresented demographic begins using the AI or if the characteristics of a given demographic change). Thus, the risk of bias in an AI system exists irrespective of whether the system evolves or is static in time.

Finally, it is worth noting for completeness that the term ‘bias’ has a secondary use within the AI development community to describe the tendency of an AI system to consistently learn incorrect aspects of the data. Bias in this context refers to the assumptions of the model used and whether they are appropriate for the task (e.g., using linear models for non-linear data).

Other Sources of Bias

The previous examples represent a small sample of possible sources of AI bias. Examples of other sources of AI bias include:

i. Optimisation bias. Where a system emphasises a particular goal (e.g., platform engagement), potentially leading to unintended outcomes such as siloed and extremist communities.
ii. Algorithmic bias. Where an AI system is designed or programmed in a way that inherently favours one group over another, regardless of the data it is trained on.
iii. Measurement bias. Where an inappropriate metric is used to evaluate the performance of an AI system, or the AI system is given an inappropriate measure to rank people (e.g., hours spent in the office instead of results produced).

These examples highlight the diversity of bias sources, ranging from deeply technical issues to measurement and management issues.

Analysing and Managing AI Bias

Given that AI bias can originate from a variety of sources, it can be beneficial to view each AI system holistically and within its particular context. This may require adopting both technical and organisational perspectives. From a technical perspective this might include the analysis of input data, algorithm choice, output analysis, and algorithm performance review. From an organisational perspective this might include the analysis of real-world impacts, employee education, oversight mechanisms, and system transparency.

To understand the bias in an AI system, several levels of analysis can be performed, each providing a different level of assurance. These include:

i. Input-based tests that analyse the raw data used to train the system. This may help identify potential pitfalls before the AI is even created. However, unbiased data does not guarantee that the final AI is unbiased.
ii. Output-based tests on the trained AI. These tests examine the actual results from the AI system.
iii. Active interrogation of the AI system through meticulously designed scenarios aimed at ‘stress testing’ the AI. Observing a model’s response to challenging inputs allows a more thorough examination than the outcome-only scrutiny of output-based tests.

Input-Based Tests

At the core of an AI system is the input data that is used to train the AI. Any biases present in the raw data can be exacerbated once the AI is trained. For this reason, auditing the raw data can be a crucial step in understanding AI bias. Both statistical methods and exploratory data analysis can be used to understand the distribution of demographics that comprise the training data set. Uneven demographic distributions, that is, under- or over-represented groups, can lead to bias in systems trained on this data. Sampling techniques such as stratified sampling can help alleviate issues of uneven or non-representative data.
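
By way of illustration, the following minimal sketch (written in Python using the pandas and scikit-learn libraries, with entirely hypothetical applicant records and column names) shows how the demographic make-up of a raw data set might be audited and how stratified sampling can preserve that make-up when the data is partitioned:

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical applicant records; the column names are illustrative only.
df = pd.DataFrame({
    "gender":   ["F", "M", "M", "F", "M", "M", "F", "M", "F", "M"],
    "approved": [0,   1,   1,   0,   1,   0,   1,   1,   1,   0],
})

# Representation of each demographic group in the raw data.
print(df["gender"].value_counts(normalize=True))

# Proportion of positive outcomes associated with each group.
print(df.groupby("gender")["approved"].mean())

# A stratified split preserves the demographic mix in both partitions,
# helping to avoid under-representation introduced by random sampling.
train, test = train_test_split(
    df, test_size=0.3, stratify=df["gender"], random_state=0
)
print(train["gender"].value_counts(normalize=True))
```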

Choosing the correct statistical test to uncover non-representative groups depends on the underlying data available, the specific question of interest, and the appropriateness of the test’s assumptions and limitations. Finally, analysing the raw data can identify potential causes of bias but is insufficient for analysing the AI model or its governance.

Output-Based Tests

Output-based tests represent a set of methodologies for detecting and measuring biases in AI systems once they are trained. These tests focus on examining the results or outputs generated by an AI system rather than probing its internal mechanisms or input data. A benefit of these tests lies in their ability to reveal the behaviour of the AI post-training, highlighting any bias in its real-world application. If the tester is external to an organisation, and does not have access to the raw data or the trained model for active interrogation, these may be the only practicable tests for gauging AI bias.

Active Interrogation

Active interrogation methods represent a suite of techniques for gaining a deep understanding of bias within AI systems. Rather than solely analysing the raw data or existing outputs from the AI system, active interrogation techniques typically involve curating new inputs to pressure test specific aspects of the AI system (i.e., to see if it generates biased results). Many of these techniques could be described as counterfactual analysis, which involves changing the value of certain input attributes and observing how the model’s output changes. Changes to protected attributes that cause large changes to the model’s output can be an indication of AI bias.
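
As a simple sketch of a counterfactual probe (in Python with scikit-learn, using a small and entirely hypothetical model and data set), one might flip a protected attribute on an otherwise identical record and compare the model’s predictions:

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression

# Hypothetical training data; the protected attribute is included purely
# to demonstrate the counterfactual probe on an illustrative model.
X = pd.DataFrame({
    "gender_male": [1, 1, 0, 0, 1, 0, 1, 0],
    "income":      [60, 80, 55, 90, 40, 70, 85, 50],
})
y = [1, 1, 0, 1, 0, 1, 1, 0]
model = LogisticRegression(max_iter=1000).fit(X, y)

# Counterfactual probe: flip only the protected attribute and observe
# how the predicted approval probability changes.
applicant = pd.DataFrame({"gender_male": [1], "income": [65]})
counterfactual = applicant.assign(gender_male=0)

p_original = model.predict_proba(applicant)[0, 1]
p_flipped = model.predict_proba(counterfactual)[0, 1]
print(f"Change attributable to the protected attribute: {p_original - p_flipped:+.3f}")
```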

Active interrogation techniques are a powerful suite of tools for gaining insight into bias within AI systems; however, as they often rely on advanced AI or analytic approaches themselves, a depth of technical knowledge is currently necessary to apply and interpret them. This has reduced their adoption within the legal profession to date.

Quantitative Tests for Raw Data Analysis

Different tests may be required depending on the type of bias. The Difference in Positive Proportions of Labels is a test used to detect potential demographic-based biases by examining the ratio of positive outcomes within a demographic, not just their presence in the data set. It emphasises the importance of having a balanced distribution of positive and negative outcomes across different demographics to avoid bias in model training. This method helps detect issues such as having a demographic represented in the raw data, but only associated with negative outcomes, a pattern the AI would likely reproduce.
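
A basic version of this comparison can be computed directly from the raw data. The sketch below (Python with pandas, using hypothetical groups and labels) contrasts the proportion of positive labels for each demographic group:

```python
import pandas as pd

# Hypothetical labelled training data for two demographic groups.
df = pd.DataFrame({
    "group": ["A", "A", "A", "A", "B", "B", "B", "B", "B", "B"],
    "label": [1,   1,   1,   0,   1,   0,   0,   0,   0,   1],
})

# Proportion of positive labels within each group.
positive_rates = df.groupby("group")["label"].mean()
print(positive_rates)

# A large gap suggests one group appears in the raw data mainly with
# negative outcomes, a pattern a trained AI may reproduce.
print("Difference in positive proportions:", positive_rates["A"] - positive_rates["B"])
```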

Statistical methods can also be applied. For example, variants of Chi-squared tests can be used to determine whether a significant association exists between two variables. That is, these tests can assess whether an AI system’s outputs are independent of key attributes such as ethnicity or gender. An association between one of these attributes and a positive (or negative) outcome may indicate AI bias. Often multiple groups are considered in an analysis, requiring more advanced ANOVA (Analysis of Variance) methods. ANOVA methods help detect statistically significant differences between measured demographics. These and other correlation analysis techniques can be useful for detecting potential causes of direct and indirect bias before an AI is trained.
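
For instance, a Chi-squared test of independence might be run as follows (a Python sketch using pandas and SciPy on hypothetical figures); an ANOVA, such as SciPy’s f_oneway, could be substituted where more than two groups or a continuous outcome are involved:

```python
import pandas as pd
from scipy.stats import chi2_contingency

# Hypothetical outcomes broken down by ethnicity.
df = pd.DataFrame({
    "ethnicity": ["X"] * 100 + ["Y"] * 100,
    "outcome":   ["positive"] * 70 + ["negative"] * 30
               + ["positive"] * 45 + ["negative"] * 55,
})

# Contingency table of outcomes by group.
table = pd.crosstab(df["ethnicity"], df["outcome"])
print(table)

# Chi-squared test of independence: a small p-value suggests the outcome
# is associated with ethnicity rather than independent of it.
chi2, p_value, dof, expected = chi2_contingency(table)
print(f"chi2 = {chi2:.2f}, p-value = {p_value:.4f}")
```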

Fairness Measures for Output-Based Analysis

Tests for direct bias often involve analysing the outputs of an AI system to identify biases that are directly embedded within its decision-making or predictions. Direct biases are often the simplest to detect and commonly arise from imbalanced or prejudiced training data. A range of established fairness metrics exist for measuring bias. For example, the demographic parity test (also referred to as ‘statistical parity’) is one such test that could be used to assess an AI for making bank loan decisions. Such a test could check whether the loan acceptance rate (in technical terms, the positive rate) is statistically identical across key demographics such as gender or race. However, care is required when using these measures to assess the AI, as an equal acceptance rate for loans across demographics can lead to some groups being held to a higher standard. That is, members of one group capable of repaying the loan will not be granted a loan when comparable members of a different group would be.

An alternative fairness metric for directly assessing bias, called equal opportunity, instead focuses on ensuring that equal rates of applicants who can repay the loan are accepted (in technical terms, the true positive rate). That is, demographic parity focuses on ensuring equal rates of loans are granted for each demographic, while equal opportunity focuses on ensuring that equal rates of loans are granted among qualifying individuals from each demographic. Importantly, adjusting an AI system to be fair under one of these tests may directly conflict with the system’s fairness under another. Finding a middle ground where both tests are satisfied is not always possible, and the choice of fairness metric may be influenced by the context in which the AI system is used. These represent only two of many such fairness tests, each emphasising different, sometimes conflicting, aspects of fairness.
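
To make the contrast concrete, the sketch below (Python with pandas, using entirely hypothetical decisions and ground-truth labels) computes both quantities side by side: the overall approval rate per group for demographic parity, and the approval rate among applicants who can repay for equal opportunity:

```python
import pandas as pd

# Hypothetical loan decisions with ground-truth repayment ability.
df = pd.DataFrame({
    "group":     ["A", "A", "A", "A", "A", "A", "B", "B", "B", "B", "B", "B"],
    "can_repay": [1,   1,   1,   0,   0,   1,   1,   1,   0,   0,   1,   0],
    "approved":  [1,   1,   0,   1,   0,   1,   1,   0,   0,   0,   1,   0],
})

# Demographic parity: approval rate per group, regardless of ability to repay.
parity = df.groupby("group")["approved"].mean()
print("Approval rate per group (demographic parity):")
print(parity)

# Equal opportunity: approval rate among applicants who can in fact repay
# (the true positive rate per group).
opportunity = df[df["can_repay"] == 1].groupby("group")["approved"].mean()
print("Approval rate among those able to repay (equal opportunity):")
print(opportunity)
```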

Disparate Impact for AI

Disparate impact refers to the disproportionate or adverse effects of a facially neutral policy on protected groups, even if no discrimination was intended. Disparate impact tests are used in U.S. anti-discrimination law and are becoming more widely adopted in the context of AI bias analysis. The most common test for disparate impact compares the rate of positive or favourable outcomes between groups. A group experiencing less than 80% of the rate of positive outcomes of the most favoured group is often taken as an indication of adverse effects. However, this 80% test (often called the four-fifths rule) has faced criticism regarding its applicability to AI analysis, including its simplicity, its arbitrary 80% threshold, and its lack of consideration of relative group sizes.
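
Applied to a hypothetical data set, this comparison reduces to a short calculation (sketched here in Python with pandas; all figures are illustrative):

```python
import pandas as pd

# Hypothetical selection outcomes for two groups.
df = pd.DataFrame({
    "group":    ["A"] * 50 + ["B"] * 50,
    "selected": [1] * 30 + [0] * 20 + [1] * 18 + [0] * 32,
})

# Rate of favourable outcomes per group.
rates = df.groupby("group")["selected"].mean()

# Disparate impact ratio of each group against the most favoured group;
# the conventional four-fifths rule flags ratios below 0.8.
impact_ratio = rates / rates.max()
print(impact_ratio)
print("Potentially adversely affected:", list(impact_ratio[impact_ratio < 0.8].index))
```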

The lack of consideration of relative group sizes leaves this disparate impact test susceptible to Simpson’s paradox. Simpson’s paradox is a statistical phenomenon where a trend appears in different groups of data but disappears or reverses when these groups are combined. This paradox was highlighted in the 1973 UC Berkeley Admissions Study where preference for male or female applicants reversed depending on how the data was aggregated.
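
The reversal is easy to reproduce with a small worked example. The figures below are invented purely to illustrate the effect (a Python sketch using pandas) and are not drawn from the Berkeley study itself:

```python
import pandas as pd

# Invented admissions figures that echo the Berkeley pattern; the numbers
# are hypothetical and chosen only to exhibit the reversal.
data = pd.DataFrame(
    [
        ("X", "male",   100, 80),
        ("X", "female",  10,  9),
        ("Y", "male",    20,  5),
        ("Y", "female", 100, 30),
    ],
    columns=["department", "gender", "applicants", "admitted"],
)

# Within each department, female applicants are admitted at a higher rate.
data["rate"] = data["admitted"] / data["applicants"]
print(data[["department", "gender", "rate"]])

# Pooled across departments, the trend reverses, because each gender
# applies to the two departments in very different proportions.
pooled = data.groupby("gender")[["applicants", "admitted"]].sum()
print(pooled["admitted"] / pooled["applicants"])
```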

Active Interrogation Techniques

AI interpretability techniques such as Partial Dependence Plots and Shapley Additive Explanations can be employed to help analyse the interaction of each attribute and determine their influence on final outcomes.

Partial Dependence Plots provide visual representations of the relationship between an input attribute (e.g., being male, or being white) and model outcomes (e.g., approved or denied loan) while holding all other factors constant. This allows for the identification of relationships between protected attributes and outcomes that should not exist. However, this method does not capture interactions between attributes (e.g., the combined effect of being both white and male, as opposed to each attribute considered on its own).
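
A rough sketch of how such a dependence might be computed programmatically, using recent versions of scikit-learn’s partial_dependence on a deliberately biased synthetic model (all names and figures are hypothetical), is shown below; in practice the result would usually be plotted, for example via PartialDependenceDisplay:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.inspection import partial_dependence

# Hypothetical features: column 0 is a protected attribute (0/1),
# column 1 is income. The outcome deliberately leaks the protected attribute.
rng = np.random.default_rng(0)
X = np.column_stack([rng.integers(0, 2, 500), rng.normal(60, 15, 500)])
y = (X[:, 1] + 10 * X[:, 0] + rng.normal(0, 5, 500) > 65).astype(int)

model = GradientBoostingClassifier().fit(X, y)

# Average partial dependence of the prediction on the protected attribute,
# holding the other feature at its observed values. A pronounced dependence
# that should not exist is a flag for direct bias.
result = partial_dependence(model, X, features=[0])
print(result["average"])
```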

Shapley Additive Explanations are a tool that can capture these complex interactions. This method breaks down the result of the AI system and assigns each input attribute a portion of the impact in arriving at that outcome, while accounting for attribute interactions. As well as identifying which protected attributes have contributed to the decision-making or predictions, this technique also quantifies the degree of contribution, allowing a more nuanced examination of the level of bias within an AI.
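
One common open-source implementation is the shap package; the sketch below assumes that package is available and applies it to a small synthetic model (all data and names are hypothetical):

```python
import numpy as np
import shap  # assumes the open-source shap package is installed
from sklearn.ensemble import RandomForestClassifier

# Hypothetical data: column 0 is a protected attribute, column 1 is income,
# and the outcome deliberately leaks the protected attribute.
rng = np.random.default_rng(1)
X = np.column_stack([rng.integers(0, 2, 300), rng.normal(60, 15, 300)])
y = (X[:, 1] + 8 * X[:, 0] > 64).astype(int)

model = RandomForestClassifier(random_state=0).fit(X, y)

def predict_positive(data):
    """Predicted probability of the positive (approved) class."""
    return model.predict_proba(data)[:, 1]

# Attribute each prediction to the input features, accounting for interactions.
explainer = shap.Explainer(predict_positive, X)
shap_values = explainer(X)

# Mean absolute attribution per feature: a large value on the protected
# attribute quantifies how strongly it drives the model's decisions.
print(np.abs(shap_values.values).mean(axis=0))
```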

Adversarial testing is a method of using AI to detect bias in AI. It works by using a second AI, called the adversarial model, to design scenarios that will ‘trick’ the system under examination into making mistakes. This can reveal vulnerabilities in the original AI’s decision-making process that may lead to bias. A variant of adversarial testing involves assigning the second AI the task of predicting protected attributes based on the results of the system under examination. If the second model can correctly predict protected attributes from those results, it implies the original system is indirectly using those attributes, suggesting bias.
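
The sketch below illustrates the second variant in a stripped-down form (Python with scikit-learn, on wholly synthetic data): a simple secondary classifier attempts to recover a protected attribute from nothing but the original system’s outputs:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Hypothetical set-up: `scores` are outputs of the system under examination
# for a sample of individuals; `protected` is their protected attribute.
rng = np.random.default_rng(2)
protected = rng.integers(0, 2, 1000)
scores = 0.4 * protected + rng.normal(0, 0.3, 1000)  # outputs that leak the attribute

# Secondary "adversary" model: can it recover the protected attribute
# from the original system's outputs alone?
adversary = LogisticRegression()
accuracy = cross_val_score(adversary, scores.reshape(-1, 1), protected, cv=5).mean()

# Accuracy well above the 50% base rate suggests the outputs indirectly
# encode the protected attribute.
print(f"Adversary accuracy: {accuracy:.2f}")
```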

AI Governance

AI governance is a system for managing bias in the AI tools developed or used by organisations. To help mitigate risk, leadership teams might consider being proactive in exploring the implementation of governance that could be appropriate to the organisation’s responsibilities and risk appetite. These may include policies and procedures around managing bias in raw data, trained models, and AI use cases. This will likely be a continuous process throughout the lifecycle of an AI system to effectively mitigate bias risk. While specific AI regulation is at a formative stage, future litigation may involve analysing the governance structures that lead to bias in AI models including organisational structure, controls, systems, decision making bodies, and policies.

Conclusion

A growing reliance on AI systems within an organisation may necessitate the development of a comprehensive governance framework to address potential AI bias risks. Bias can emerge from societal prejudices present in the training data, misapplication of AI technologies, or algorithmic choice. Fortunately, detecting and mitigating AI bias can often be achieved with the correct analytical tools, which range from statistical tests to active interrogation techniques. Continuous, effective AI governance is another important component for mitigating AI-related risks. While AI bias presents a considerable challenge, these analytical tools and robust governance practices provide a multifaceted framework to address these risks.

Disclaimer: The content of this article is general in nature and is presented for informative purposes. It is not intended to constitute tax, financial or legal advice, whether general or personal, nor is it intended to imply any recommendation or opinion about a financial or legal product. It does not take into consideration your personal situation and may not be relevant to your circumstances. Before taking any action, consider your own particular circumstances and seek professional advice. This content is protected by copyright laws and various other intellectual property laws. It is not to be modified, reproduced or republished without prior written consent.
