2.5 Bias
Bias-In, Bias-Out

Many AI models are trained on data where social biases are present. These biases are then encoded into the patterns, relationships, rules, and decision-making processes of the AI and have a direct impact on the output.
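The "bias-in, bias-out" pattern can be illustrated with a deliberately tiny sketch. This is not how any real Large Language Model is built; it is a toy word-pair counter, and the corpus, the function names `train` and `next_word_probs`, and the variable names are all hypothetical and invented for illustration. It only aims to show that a model which learns from skewed examples will tend to reproduce that skew in its output.

```python
from collections import Counter, defaultdict

# Hypothetical toy corpus, invented for illustration only.
# Three of the four sentences use "he", so the data is skewed.
corpus = [
    "the engineer said he was late",
    "the engineer said he was busy",
    "the engineer said he was done",
    "the engineer said she was late",
]


def train(sentences):
    """Count which word follows each word in the training sentences."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.split()
        for current, following in zip(words, words[1:]):
            counts[current][following] += 1
    return counts


def next_word_probs(counts, word):
    """Turn the raw follow-up counts for `word` into proportions."""
    total = sum(counts[word].values())
    return {w: c / total for w, c in counts[word].items()}


model = train(corpus)
probs = next_word_probs(model, "said")
print(probs)  # {'he': 0.75, 'she': 0.25}
```

The toy model never "decides" anything about engineers. It mirrors the 3-to-1 imbalance in its training sentences, which is the same mechanism, at a vastly smaller scale, by which social biases in training data can surface in generated output.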
Bias can be easy to spot, such as in this AI-generated image, which shows a predominantly white class of 2023 at Western, but it can also be much harder to see. AI-generated text reflects the dominant ideologies, discourses, language, values, and knowledge structures of the datasets the model was trained on. For example, Large Language Models may be more likely to reproduce certain dominant forms of English, underrepresenting regional, cultural, racial, or class differences (D’Agostino, 2023).
The ethical issue is twofold. First, the information generated by Generative AI is more likely to reflect dominant social identities, meaning that students who use AI may not be exposed to certain worldviews or perspectives, and some students may not feel that their experiences and identities are reflected in the output. Second, using Generative AI to produce knowledge will continue to reinforce the dominance of these ideologies, values, and knowledge structures, contributing to further inequities in representation.
As an instructor, it’s important to be aware of this limitation of AI tools. If you ask your students to use these tools, it’s also important to teach them critical AI literacies so that they, too, can identify and reflect on these issues of representation, bias, and equity.
Some Generative AI companies have taken steps to correct for biases in the training data by establishing content policies or other guardrails intended to prevent biased or discriminatory output. However, these guardrails are applied inconsistently and depend on the ethical standards of each Generative AI company.

Large Language Models (LLMs) are computational models that are trained on huge datasets of text to recognize common patterns and relationships in natural language. They can be used for generating texts that mimic human language.
Generative AI is a subset of Deep Learning that can use learned rules or patterns to generate new content.