Part 5 of 6

GenAI Risks & Mitigations

⏱ 50-60 min read☆ Risk

Introduction

Generative AI offers powerful capabilities but introduces unique risks that differ from traditional AI systems. Understanding these risks - and practical mitigation strategies - is essential for responsible deployment and effective governance.

Hallucinations

💭

Hallucinations

LLMs confidently generate false information - fabricated facts, non-existent citations, invented statistics. This isn't a bug to be fixed; it's inherent to how these models work.

Why it happens: LLMs predict plausible-sounding text, not necessarily true text. They have no mechanism to verify factual accuracy or "know" that they don't know something.

Mitigations
  • Implement human review for high-stakes outputs
  • Use Retrieval-Augmented Generation (RAG) to ground responses in verified sources
  • Ask models to cite sources and verify those citations
  • Prompt for uncertainty: "If you're not certain, say so"
  • Design systems expecting errors, not preventing them entirely

Bias and Fairness

Bias and Fairness Issues

LLMs are trained on internet text that contains societal biases. These biases manifest in outputs - stereotyped characterizations, unequal treatment, and discriminatory content.

Forms of bias: Gender stereotypes in professional contexts, racial biases in content generation, cultural biases favoring Western perspectives, socioeconomic assumptions.

Mitigations
  • Test outputs across diverse demographic scenarios
  • Include explicit fairness guidelines in prompts
  • Use diverse teams to review and test applications
  • Monitor production outputs for bias patterns
  • Provide feedback mechanisms for users to report issues

Copyright and Intellectual Property

©

Copyright Concerns

LLMs are trained on copyrighted content. They can sometimes reproduce substantial portions of training data verbatim. The legal landscape for AI-generated content is unsettled and evolving rapidly.

Key questions: Who owns AI-generated content? Is training on copyrighted work "fair use"? Can generated content infringe copyright?

Mitigations
  • Use models with clear licensing and indemnification
  • Implement plagiarism detection on generated content
  • Document that content is AI-generated
  • Avoid prompts that request reproduction of specific works
  • Monitor legal developments and update policies accordingly

Legal Uncertainty

Multiple lawsuits are challenging AI training practices. Courts have not yet established clear precedent. Organizations should consult legal counsel and consider risk tolerance when deploying GenAI for content creation.

Data Privacy and Confidentiality

🔒

Privacy Risks

Data sent to LLM APIs may be stored, logged, or used for training. Employees may inadvertently share confidential information with external services.

Mitigations
  • Review vendor data policies carefully
  • Use enterprise agreements with data protection commitments
  • Consider on-premises or private cloud deployment
  • Implement data loss prevention (DLP) controls
  • Train employees on what should not be shared
  • Anonymize or redact sensitive data before processing

Prompt Injection and Security

💻

Prompt Injection Attacks

Malicious inputs can manipulate LLM behavior, potentially bypassing safety measures, extracting system prompts, or causing unintended actions. This is a fundamental vulnerability in LLM systems.

Types: Direct injection (user input), indirect injection (through processed content like emails or documents), jailbreaking (bypassing safety guidelines).

Mitigations
  • Input validation and sanitization
  • Separate user content from system instructions
  • Limit model capabilities and permissions
  • Monitor for suspicious patterns
  • Don't expose raw model outputs to sensitive systems
  • Implement rate limiting and abuse detection

No Perfect Defense

Prompt injection is an unsolved problem. Current mitigations reduce but don't eliminate risk. Assume that determined attackers may be able to manipulate model behavior. Design systems with this assumption - limit what the model can do, not just what it should do.

Misinformation and Harmful Content

Harmful Content Generation

LLMs can generate convincing misinformation, malicious code, phishing emails, and other harmful content. Safety guardrails help but can be bypassed.

Mitigations
  • Choose models with robust safety training
  • Implement content moderation on outputs
  • Use case-specific restrictions and guardrails
  • Monitor for misuse patterns
  • Have clear policies and consequences for misuse

Reliability and Consistency

🔄

Unpredictable Behavior

LLMs can produce different outputs for the same input. Behavior can change subtly with model updates. This unpredictability complicates testing and quality assurance.

Mitigations
  • Use temperature=0 for more deterministic outputs
  • Implement output validation and retry logic
  • Build automated test suites for critical behaviors
  • Monitor for regressions when models are updated
  • Consider model versioning and controlled rollouts

Risk Assessment Framework

When evaluating GenAI use cases, consider:

  • Impact of errors: What's the worst case if the model makes a mistake?
  • Human oversight: Is there a human review before consequential actions?
  • Reversibility: Can harm be undone if discovered later?
  • Verification ability: Can outputs be checked for accuracy?
  • Sensitivity: Does the application involve personal data or vulnerable populations?
  • Regulatory context: Are there legal requirements that apply?

Risk-Proportionate Controls

Match controls to risk level. A chatbot answering general FAQs needs different governance than an AI system processing loan applications. Don't let perfect be the enemy of good, but don't deploy high-risk applications without appropriate safeguards.

Key Takeaways

  • Hallucinations are inherent to LLMs - design systems expecting errors
  • Bias from training data manifests in outputs - test across diverse scenarios
  • Copyright and IP questions remain legally unsettled - monitor developments
  • Data privacy requires careful vendor evaluation and data handling policies
  • Prompt injection is an unsolved security challenge - limit model capabilities
  • Match governance controls to the risk level of each application
  • Human oversight remains essential for high-stakes decisions