Ensuring the Reliability, Robustness, and Ethical Compliance of LLMs


Date
Aug 22, 2024, 3:30–4:00 PM

Location
HEC Montréal
501 De la Gauchetière Street West, Montréal, QC

Large Language Models (LLMs) have revolutionized natural language processing by enabling human-like text generation and conversation. However, their non-deterministic nature, coupled with the complexities of modern applications, raises significant concerns around reliability, robustness, and ethical compliance. This presentation delves into the challenges of integrating LLMs into real-world systems, particularly around testing and validation. We explore the limitations of traditional unit testing in handling LLM-generated outputs and offer new approaches such as LLM judges, black-box confidence estimation, and post-run output validation. Furthermore, we introduce tools like DeepEval and Guardrails AI, which help test and guard LLM generations while ensuring alignment with business, legal, and ethical standards. The discussion highlights the importance of balancing creative, generative power with rigorous validation processes to maintain trustworthiness in LLM applications. Through this comprehensive examination, we aim to provide practical solutions for improving the reliability, robustness, and ethical governance of LLM systems in production environments.
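
As a concrete illustration of the LLM-judge approach mentioned above, here is a minimal sketch using DeepEval's pytest-style API (LLMTestCase, AnswerRelevancyMetric, assert_test). The example question, answer, and 0.7 threshold are illustrative assumptions, not material from the talk.

```python
# A minimal sketch of LLM-as-judge testing with DeepEval.
# Assumes `pip install deepeval` and an OPENAI_API_KEY in the
# environment, since DeepEval's built-in metrics are LLM-backed.
from deepeval import assert_test
from deepeval.metrics import AnswerRelevancyMetric
from deepeval.test_case import LLMTestCase

def test_refund_policy_answer():
    # Hypothetical input/output pair; in practice, `actual_output`
    # is produced by the LLM application under test.
    test_case = LLMTestCase(
        input="What is your refund policy?",
        actual_output="Any item can be returned within 30 days for a full refund.",
    )
    # The judge metric scores relevancy on a 0-1 scale; the 0.7
    # threshold is an arbitrary choice for this sketch.
    assert_test(test_case, [AnswerRelevancyMetric(threshold=0.7)])
```

Such tests run through DeepEval's pytest integration (e.g., `deepeval test run test_llm.py`), so LLM-judge checks slot into an existing test suite.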
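
Black-box confidence estimation needs no access to token probabilities: one common heuristic is to sample the model several times and use agreement among the answers as a proxy for confidence. The sketch below implements that self-consistency idea; `call_llm` is a hypothetical placeholder for any chat-completion client.

```python
from collections import Counter

def call_llm(prompt: str, temperature: float = 0.8) -> str:
    """Hypothetical stand-in for any chat-completion client."""
    raise NotImplementedError("plug in your LLM client here")

def self_consistency_confidence(prompt: str, n_samples: int = 5) -> tuple[str, float]:
    # Sample at a non-zero temperature so the model can disagree
    # with itself, then report the modal answer and its frequency
    # as a rough black-box confidence score.
    answers = [call_llm(prompt).strip().lower() for _ in range(n_samples)]
    top_answer, count = Counter(answers).most_common(1)[0]
    return top_answer, count / n_samples
```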
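
For post-run output validation, Guardrails AI wraps generations in declarative validators. The snippet below follows the library's documented `Guard().use(...)` pattern with the hub's ToxicLanguage validator; the validator choice, thresholds, and sample output are illustrative assumptions.

```python
# A minimal sketch of post-run output guarding with Guardrails AI.
# Assumes `pip install guardrails-ai` and
# `guardrails hub install hub://guardrails/toxic_language`.
from guardrails import Guard
from guardrails.hub import ToxicLanguage

guard = Guard().use(
    ToxicLanguage, threshold=0.5, validation_method="sentence", on_fail="exception"
)

# Validation happens after the LLM has produced its output: a
# passing string is returned, while a failing one raises an error
# that the application can catch and handle (e.g., regenerate).
guard.validate("Thanks for reaching out! We're happy to help.")
```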

Houssem Ben Braiek
Ph.D., M.Sc., Eng.

I am an ML Tech Lead with a background in software engineering, holding M.Sc. and Ph.D. degrees from Polytechnique Montreal with distinction. My role involves supervising and guiding the development of machine learning solutions for intelligent automation systems. As an active SEMLA member, I contribute to research projects in trustworthy AI, teach advanced technical courses in SE4ML and MLOps, and organize workshops.