It is widely believed that the collection and use of personal data can help solve various problems in society; however, privacy protection is essential when collecting such data. Typical privacy protection measures include k-anonymity and l-diversity, but these cannot mathematically evaluate the level of privacy against arbitrary attackers. Differential privacy is a privacy measure with a mathematical guarantee: it quantifies, in terms of added noise, how well an individual's data in a database is protected from being inferred. Furthermore, local differential privacy has attracted attention as a measure that mathematically guarantees privacy not only against data users but also against data collectors: data providers perturb their data with random noise before sending it to the collector. Attempts have been made to apply local differential privacy to machine learning; however, with local differential privacy it is difficult to balance utility and privacy protection for multidimensional data. It is therefore necessary to suppress the noise by combining dimensionality reduction and data discretization. In this study, we applied local differential privacy to machine learning by discretizing the data according to its distribution, enabling learning with higher accuracy. In addition, by adjusting the noise intensity according to the attributes of the machine learning dataset, learning remains possible even under strict privacy conditions.
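To illustrate the perturbation step described above, the sketch below shows k-ary randomized response, a standard local differential privacy mechanism for discretized (categorical) data. This is a generic illustration of the technique, not the specific mechanism used in this study; the function name and parameters are hypothetical.

```python
import math
import random

def randomized_response(value, categories, epsilon, rng=random):
    """k-ary randomized response: report the true category with
    probability e^eps / (e^eps + k - 1), otherwise report a uniformly
    random other category. Satisfies epsilon-local differential privacy,
    since any two inputs map to any output with probability ratio
    at most e^eps."""
    k = len(categories)
    p_true = math.exp(epsilon) / (math.exp(epsilon) + k - 1)
    if rng.random() < p_true:
        return value
    # Report one of the remaining categories uniformly at random.
    others = [c for c in categories if c != value]
    return rng.choice(others)
```

A smaller epsilon (stricter privacy) lowers the probability of reporting the true value, which is why discretizing into fewer, distribution-aware categories can preserve more utility: the noise probability mass is spread over fewer alternatives.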
