It is widely believed that the collection and use of personal data can help solve various problems in society; however, privacy protection is essential when collecting such data. Typical privacy protection measures include k-anonymity and l-diversity, but these cannot mathematically evaluate the level of privacy against arbitrary attackers. Differential privacy is a privacy measure with a mathematical guarantee: it quantifies, in terms of added noise, how well an individual's data in a database is protected from being inferred. Furthermore, local differential privacy has attracted attention as a measure that mathematically guarantees privacy not only against data users but also against data collectors: data providers perturb their data with random noise before sending it to the collector. Attempts have been made to apply local differential privacy to machine learning; however, with local differential privacy it is difficult to balance utility and privacy protection for multidimensional data. It is therefore necessary to suppress the noise by combining dimensionality reduction and data discretization. In this study, we applied local differential privacy to machine learning by discretizing the data according to its distribution, enabling learning with higher accuracy. In addition, by adjusting the noise intensity according to the attributes of the machine learning dataset, learning remains possible even under strict privacy conditions.
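To illustrate the perturbation step described above, the sketch below shows k-ary randomized response, a standard local differential privacy mechanism for discretized (categorical) data. This is a generic illustration of the technique, not the specific mechanism used in this study; the function name and parameters are hypothetical.

```python
import math
import random

def randomized_response(value, categories, epsilon, rng=random):
    """k-ary randomized response: report the true category with
    probability e^eps / (e^eps + k - 1), otherwise report a uniformly
    random other category. Satisfies epsilon-local differential privacy,
    since any two inputs map to any output with probability ratio
    at most e^eps."""
    k = len(categories)
    p_true = math.exp(epsilon) / (math.exp(epsilon) + k - 1)
    if rng.random() < p_true:
        return value
    # Report one of the remaining categories uniformly at random.
    others = [c for c in categories if c != value]
    return rng.choice(others)
```

A smaller epsilon (stricter privacy) lowers the probability of reporting the true value, which is why discretizing into fewer, distribution-aware categories can preserve more utility: the noise probability mass is spread over fewer alternatives.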
