Part 5: GenAI Risks & Mitigations | Module 3

Introduction

Generative AI offers powerful capabilities but introduces unique risks that differ from traditional AI systems. Understanding these risks - and practical mitigation strategies - is essential for responsible deployment and effective governance.

Hallucinations

💭

Hallucinations

LLMs confidently generate false information - fabricated facts, non-existent citations, invented statistics. This isn't a bug to be fixed; it's inherent to how these models work.

Why it happens: LLMs predict plausible-sounding text, not necessarily true text. They have no mechanism to verify factual accuracy or "know" that they don't know something.

Mitigations

Implement human review for high-stakes outputs
Use Retrieval-Augmented Generation (RAG) to ground responses in verified sources
Ask models to cite sources and verify those citations
Prompt for uncertainty: "If you're not certain, say so"
Design systems expecting errors, not preventing them entirely

Bias and Fairness

⚖

Bias and Fairness Issues

LLMs are trained on internet text that contains societal biases. These biases manifest in outputs - stereotyped characterizations, unequal treatment, and discriminatory content.

Forms of bias: Gender stereotypes in professional contexts, racial biases in content generation, cultural biases favoring Western perspectives, socioeconomic assumptions.

Mitigations

Test outputs across diverse demographic scenarios
Include explicit fairness guidelines in prompts
Use diverse teams to review and test applications
Monitor production outputs for bias patterns
Provide feedback mechanisms for users to report issues

Copyright and Intellectual Property

©

Copyright Concerns

LLMs are trained on copyrighted content. They can sometimes reproduce substantial portions of training data verbatim. The legal landscape for AI-generated content is unsettled and evolving rapidly.

Key questions: Who owns AI-generated content? Is training on copyrighted work "fair use"? Can generated content infringe copyright?

Mitigations

Use models with clear licensing and indemnification
Implement plagiarism detection on generated content
Document that content is AI-generated
Avoid prompts that request reproduction of specific works
Monitor legal developments and update policies accordingly

Legal Uncertainty

Multiple lawsuits are challenging AI training practices. Courts have not yet established clear precedent. Organizations should consult legal counsel and consider risk tolerance when deploying GenAI for content creation.

Data Privacy and Confidentiality

🔒

Privacy Risks

Data sent to LLM APIs may be stored, logged, or used for training. Employees may inadvertently share confidential information with external services.

Mitigations

Review vendor data policies carefully
Use enterprise agreements with data protection commitments
Consider on-premises or private cloud deployment
Implement data loss prevention (DLP) controls
Train employees on what should not be shared
Anonymize or redact sensitive data before processing

Prompt Injection and Security

💻

Prompt Injection Attacks

Malicious inputs can manipulate LLM behavior, potentially bypassing safety measures, extracting system prompts, or causing unintended actions. This is a fundamental vulnerability in LLM systems.

Types: Direct injection (user input), indirect injection (through processed content like emails or documents), jailbreaking (bypassing safety guidelines).

Mitigations

Input validation and sanitization
Separate user content from system instructions
Limit model capabilities and permissions
Monitor for suspicious patterns
Don't expose raw model outputs to sensitive systems
Implement rate limiting and abuse detection

No Perfect Defense

Prompt injection is an unsolved problem. Current mitigations reduce but don't eliminate risk. Assume that determined attackers may be able to manipulate model behavior. Design systems with this assumption - limit what the model can do, not just what it should do.

Misinformation and Harmful Content

⚠

Harmful Content Generation

LLMs can generate convincing misinformation, malicious code, phishing emails, and other harmful content. Safety guardrails help but can be bypassed.

Mitigations

Choose models with robust safety training
Implement content moderation on outputs
Use case-specific restrictions and guardrails
Monitor for misuse patterns
Have clear policies and consequences for misuse

Reliability and Consistency

🔄

Unpredictable Behavior

LLMs can produce different outputs for the same input. Behavior can change subtly with model updates. This unpredictability complicates testing and quality assurance.

Mitigations

Use temperature=0 for more deterministic outputs
Implement output validation and retry logic
Build automated test suites for critical behaviors
Monitor for regressions when models are updated
Consider model versioning and controlled rollouts

Risk Assessment Framework

When evaluating GenAI use cases, consider:

Impact of errors: What's the worst case if the model makes a mistake?
Human oversight: Is there a human review before consequential actions?
Reversibility: Can harm be undone if discovered later?
Verification ability: Can outputs be checked for accuracy?
Sensitivity: Does the application involve personal data or vulnerable populations?
Regulatory context: Are there legal requirements that apply?

Risk-Proportionate Controls

Match controls to risk level. A chatbot answering general FAQs needs different governance than an AI system processing loan applications. Don't let perfect be the enemy of good, but don't deploy high-risk applications without appropriate safeguards.

Key Takeaways

Hallucinations are inherent to LLMs - design systems expecting errors
Bias from training data manifests in outputs - test across diverse scenarios
Copyright and IP questions remain legally unsettled - monitor developments
Data privacy requires careful vendor evaluation and data handling policies
Prompt injection is an unsolved security challenge - limit model capabilities
Match governance controls to the risk level of each application
Human oversight remains essential for high-stakes decisions