芦田仁
In recent years, machine learning has been increasingly deployed across diverse fields such as image recognition and natural language processing. However, the risk that trained models leak private information about their training data remains a significant challenge. Membership inference attacks (MIA) against machine learning models, formalized by Shokri et al. in 2017, allow an attacker to infer from a model's responses whether a given data point was part of its training set. In 2022, Carlini et al. proposed the Likelihood Ratio Attack (LiRA), which uses multiple shadow models. LiRA achieves very high attack success rates in the low false-positive-rate regime, and its authors introduced evaluating attack performance specifically in that regime as a new criterion. They also showed explicitly that DP-SGD, which applies differential privacy during training, is resistant to MIA. More recently, machine learning models that protect data with local differential privacy (LDP) have been studied, but their resistance to MIA remains unclear. In particular, since each record consists of features and a label, neither how each should be protected nor how attack models against such protections should be defined has been examined. This study formalizes anonymization settings for LDP-based machine learning as fully-private data, label-only-private data, and data-only-private data. It further formalizes three levels of MIA attack models against these settings: the privacy black-box model, the quasi-privacy black-box model, and the privacy white-box model. Using LiRA, we then evaluate the attack resistance of the label-only-private and data-only-private settings against the quasi-privacy black-box model.
