Benchmarking Large Language Models and Privacy Protection
Organization
University of Ottawa
Published
2025
Project Leader(s)
Rafal Kulik
Summary
In the current digital age, the accelerated growth of data generated by individuals has fueled advances in artificial intelligence (AI), particularly the development and use of large language models (LLMs). These models, which mimic human understanding and generation of language, are being integrated into various societal tools. The expanding deployment of LLMs requires an appropriate strategy for ensuring user privacy. Addressing privacy challenges requires going beyond traditional privacy measures and carefully balancing the models’ performance needs against the need to protect sensitive data. The challenge lies in the dual need to harness the potential of LLMs to drive innovation and improve services, while ensuring the confidentiality and privacy of the data used to train these models.
The project addressed several aspects related to data privacy in complex LLMs.
First, the project explains the basic techniques and applications of LLMs, as well as privacy mechanisms and their relevance to LLMs. Then, anonymization and data privacy techniques were applied and their effect on model performance was studied. In particular, the project thoroughly analyses current data privacy challenges related to ChatGPT.
In addition to the technical part, the project provides a thorough analysis of existing legal and policy frameworks governing LLMs, identifying gaps and forecasting future legal needs.
The final report consists of a non-technical description, a policy and legal analysis, and code with numerical analysis. It is accompanied by a project webpage with educational materials.
Project deliverables are available in the following language(s):
English
OPC Funded Project
This project received funding support through the Office of the Privacy Commissioner of Canada’s Contributions Program. The opinions expressed in the summary and report(s) are those of the authors and do not necessarily reflect those of the Office of the Privacy Commissioner of Canada. Summaries have been provided by the project authors. Please note that the projects appear in their language of origin.
Contact Information
Rafal Kulik
Department of Mathematics and Statistics
University of Ottawa
150 Louis Pasteur, STEM building
Ottawa, Ontario K1N 6N5
Email: rkulik@uottawa.ca