Factors related to GDPR compliance promises in privacy policies: a machine learning and NLP approach

Abstract

This paper employs machine learning (ML) and natural language processing (NLP) techniques to examine the relationship between organizational factors of data processing entities, such as company size and headquarters location, and the GDPR compliance promises they disclose in their privacy policies. Our methodology comprises three main stages, each representing a key contribution. First, we developed five NLP-based classification models with precision scores of at least 0.908 to assess different GDPR compliance promises in privacy policies. Second, we collected a data set of 8,614 organizations in the EU containing organizational information and the GDPR compliance promises derived from each organization’s privacy policy. Third, we analyzed which organizational factors correlate with these GDPR compliance promises. The findings reveal, among other things, that being a small or medium-sized enterprise correlates negatively with the disclosure of two GDPR privacy policy core requirements. Moreover, among headquarters locations, Denmark performs best in terms of correlating positively with the disclosure of GDPR privacy policy core requirements, whereas Spain, Italy, and Slovenia correlate negatively with multiple requirements. This study contributes to the novel field of GDPR compliance, offering valuable insights for policymakers and practitioners to enhance data protection practices and mitigate non-compliance risks.

Keywords - general data protection regulation; data protection; privacy policy; natural language processing; machine learning.

Authors

Abdel-Jaouad Aberkane, Seppe vanden Broucke, Geert Poels

Full Journal Article

DOI: https://doi.org/10.12821/ijispm130202

Scripts

Stage 1: Training Classification Models
Stage 2: Scraping & Classification of Scraped Privacy Policies
Stage 3: Analysis
- Output: Statsmodels output
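
The abstract reports precision scores for the Stage 1 classification models. As a reminder of what that metric measures, here is a minimal standard-library sketch of precision for a binary classifier. The function name `precision` and the label lists are hypothetical and made up for illustration; they are not taken from the repository's scripts or data.

```python
def precision(y_true, y_pred, positive=1):
    """Precision = TP / (TP + FP) for the given positive label."""
    # True positives: predicted positive and actually positive.
    tp = sum(1 for t, p in zip(y_true, y_pred) if p == positive and t == positive)
    # False positives: predicted positive but actually not positive.
    fp = sum(1 for t, p in zip(y_true, y_pred) if p == positive and t != positive)
    if tp + fp == 0:
        raise ValueError("no positive predictions; precision is undefined")
    return tp / (tp + fp)


# Made-up labels: 1 = policy contains the compliance promise, 0 = it does not.
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]

score = precision(y_true, y_pred)
print(f"precision = {score:.3f}")
```

A high precision means that when a model flags a privacy policy as containing a given promise, that flag is rarely wrong; it says nothing about how many promises the model misses.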

Data

Data set containing organizational information and GDPR classification (based on privacy policy) of 8,614 organizations in the EU.
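
To give a feel for what a negative correlation between an organizational factor and a disclosure means, the sketch below computes the phi coefficient for two binary variables using only the standard library. This is not the analysis used in Stage 3, which relies on Statsmodels. The field names `is_sme` and `discloses_requirement` and the six records are hypothetical and do not reflect the actual schema or contents of the data set.

```python
import math


def contingency(records, factor, outcome):
    """Count the four cells of a 2x2 table for two binary fields."""
    a = b = c = d = 0
    for r in records:
        f, o = bool(r[factor]), bool(r[outcome])
        if f and o:
            a += 1  # factor present, outcome present
        elif f:
            b += 1  # factor present, outcome absent
        elif o:
            c += 1  # factor absent, outcome present
        else:
            d += 1  # factor absent, outcome absent
    return a, b, c, d


def phi_coefficient(a, b, c, d):
    """Phi coefficient of a 2x2 table; ranges from -1 to 1."""
    denom = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    if denom == 0:
        raise ValueError("a row or column of the table is empty")
    return (a * d - b * c) / denom


# Made-up records with hypothetical field names (illustration only).
records = [
    {"is_sme": 1, "discloses_requirement": 1},
    {"is_sme": 1, "discloses_requirement": 0},
    {"is_sme": 1, "discloses_requirement": 0},
    {"is_sme": 0, "discloses_requirement": 1},
    {"is_sme": 0, "discloses_requirement": 1},
    {"is_sme": 0, "discloses_requirement": 0},
]

table = contingency(records, "is_sme", "discloses_requirement")
phi = phi_coefficient(*table)
print(table, round(phi, 3))
```

A negative value indicates that, in this toy sample, organizations with the factor disclose the requirement less often than those without it. Unlike a regression model, this simple measure does not control for other factors.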