In recent years, large amounts of personal data have been collected as big data, and the results of analyzing them are used for a variety of purposes. One way to make use of such data is to estimate the frequencies of the most frequently occurring items in a data set. However, because such data often contain private personal information, privacy protection is necessary when the data are collected and used. Differential privacy is one such measure: it protects individual privacy by adding noise to the collected and analyzed data. In general, differential privacy mechanisms involve a trade-off between privacy protection and data utility. A differential privacy mechanism has also been proposed for the frequency estimation task described above, but its usefulness suffers when a high level of privacy protection is required. In contrast, the DIP method is a differential privacy mechanism that adds noise to the data itself rather than to the output statistics and preserves the distribution of the original data after anonymization, so it can be applied to various types of analysis. In this study, we focus on the DIP method and propose a method that protects privacy without degrading the usefulness of frequency estimation.
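
To make the privacy-utility trade-off mentioned above concrete, the following is a minimal sketch of the generic Laplace mechanism applied to item counts. It is not the DIP method and not the method proposed in this study; it perturbs output statistics rather than the data itself. The function names, the toy data, and the assumption that each individual contributes exactly one item (so that adding or removing one record changes one count by 1) are all illustrative assumptions.

```python
import random
from collections import Counter

def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # is Laplace-distributed with location 0 and scale `scale`.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def noisy_counts(items, domain, epsilon, rng):
    """Return Laplace-perturbed counts for every value in `domain`."""
    counts = Counter(items)
    # Assumption: sensitivity 1 (one record changes one count by 1).
    scale = 1.0 / epsilon
    return {v: counts[v] + laplace_noise(scale, rng) for v in domain}

def mean_abs_error(items, domain, epsilon, trials=200, seed=0):
    """Average absolute error of the noisy counts over repeated runs."""
    rng = random.Random(seed)
    true = Counter(items)
    total = 0.0
    for _ in range(trials):
        noisy = noisy_counts(items, domain, epsilon, rng)
        total += sum(abs(noisy[v] - true[v]) for v in domain) / len(domain)
    return total / trials

# Hypothetical toy data set: 100 records over three item values.
data = ["a"] * 50 + ["b"] * 30 + ["c"] * 20
domain = ["a", "b", "c"]

err_strong = mean_abs_error(data, domain, epsilon=0.1)  # stronger privacy
err_weak = mean_abs_error(data, domain, epsilon=2.0)    # weaker privacy
print(f"epsilon=0.1: {err_strong:.2f}, epsilon=2.0: {err_weak:.2f}")
```

In this sketch a smaller epsilon (stronger privacy) yields a larger noise scale, so the estimated frequencies drift further from the true counts; this is the kind of usefulness loss under strong privacy that the abstract refers to.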
