Haruto Kubo

In recent years, machine learning has been widely used in various fields, and protecting the personal information contained in training data has become an important issue. In particular, when the attacker is also a model user, membership inference attacks (MIAs), which aim to infer whether a specific data sample was included in the training dataset of a target model, pose a serious risk of information leakage. Most conventional studies on MIAs assume a cloud-based centralized inference environment in which the target model is executed entirely on a server.
By contrast, Split Computing (SC), which has recently attracted attention, divides a model into a front-end and a back-end; the front-end runs on the client and the back-end on the server, which reduces server load and power consumption. In an SC environment, a black-box attacker who is also a model user can exploit not only the model outputs and probability distributions available in a cloud-based centralized inference environment, but also previously inaccessible information from the front-end model, including its gradients. However, black-box MIAs targeting SC environments have not yet been sufficiently studied.
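The front-end/back-end division described above can be illustrated with a minimal sketch. It is written in plain Python with a toy two-layer network; the names `make_linear`, `split_model`, and `run`, the weights, and the split index are all hypothetical and are not taken from the models evaluated in this study.

```python
from typing import Callable, List, Sequence, Tuple

Layer = Callable[[List[float]], List[float]]


def make_linear(weights: List[List[float]], bias: List[float]) -> Layer:
    """Return a toy fully connected layer (hypothetical helper)."""
    def layer(x: List[float]) -> List[float]:
        return [sum(w * v for w, v in zip(row, x)) + b
                for row, b in zip(weights, bias)]
    return layer


def relu(x: List[float]) -> List[float]:
    return [max(0.0, v) for v in x]


def run(layers: Sequence[Layer], x: List[float]) -> List[float]:
    """Apply the layers in order."""
    for layer in layers:
        x = layer(x)
    return x


def split_model(layers: Sequence[Layer],
                split_index: int) -> Tuple[List[Layer], List[Layer]]:
    """Split a layer list into a client-side front-end and a server-side back-end."""
    return list(layers[:split_index]), list(layers[split_index:])


# Toy model; the weights are arbitrary illustration values.
full_model = [
    make_linear([[1.0, -1.0], [0.5, 2.0]], [0.0, 0.1]),
    relu,
    make_linear([[1.0, 1.0], [-1.0, 0.5]], [0.0, 0.0]),
]
front_end, back_end = split_model(full_model, split_index=2)

x = [1.0, 2.0]
intermediate = run(front_end, x)       # computed on the client
output = run(back_end, intermediate)   # computed on the server
```

Because the client holds `front_end`, a model user in this setting can observe `intermediate` and the front-end parameters directly, which is the additional information the SC-specific attacks rely on.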
In this study, we extend black-box MIAs (Shokri et al., 2017) and white-box MIAs (Nasr et al., 2020) to attacks that utilize front-end model information obtainable in SC environments, and experimentally evaluate their effectiveness. Furthermore, we propose a novel black-box MIA optimized for SC environments.
Unlike conventional shadow models, which reconstruct model behavior from labels, the proposed method reconstructs the behavior of the back-end model from output probability distributions. By leveraging the additional information available from the front-end model, it achieves attack accuracy comparable to that of white-box attacks.
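To see why output probability distributions carry more information than labels, consider the following sketch. It compares a hard-label training signal with a soft-label one for a shadow model. The use of KL divergence as the soft-label loss is an assumption made for illustration; the loss actually used by the proposed method is not specified here, and all function names and numbers are hypothetical.

```python
import math
from typing import List


def softmax(logits: List[float]) -> List[float]:
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def hard_label_loss(shadow_probs: List[float], target_probs: List[float]) -> float:
    """Cross-entropy against the target's predicted label only (label-based shadow training)."""
    label = max(range(len(target_probs)), key=target_probs.__getitem__)
    return -math.log(shadow_probs[label])


def soft_label_loss(shadow_probs: List[float], target_probs: List[float]) -> float:
    """KL(target || shadow): uses the full probability distribution (assumed loss)."""
    return sum(t * math.log(t / s)
               for t, s in zip(target_probs, shadow_probs) if t > 0)


# Hypothetical shadow back-end output for one sample.
shadow = softmax([2.0, 1.0, 0.0])

# Two hypothetical target outputs with the same predicted label (class 0)
# but very different confidence.
confident_target = [0.98, 0.01, 0.01]
uncertain_target = [0.50, 0.30, 0.20]

hard_a = hard_label_loss(shadow, confident_target)
hard_b = hard_label_loss(shadow, uncertain_target)
soft_a = soft_label_loss(shadow, confident_target)
soft_b = soft_label_loss(shadow, uncertain_target)
```

In this sketch `hard_a` and `hard_b` are identical, since both targets predict the same label, while `soft_a` and `soft_b` differ. A shadow model trained on labels alone therefore cannot distinguish the two target behaviors, whereas one trained on probability distributions can.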
Experimental results on CIFAR-10 show that, for SimpleCNN, applying Nasr et al.'s method in the SC environment achieved an attack accuracy of 63-73%, whereas the proposed method achieved an attack accuracy of 65-76%, reducing the drop in accuracy relative to the cloud-based Nasr et al. attack by up to 60%. Similarly, for AlexNet, the proposed method achieved an attack accuracy of 64-68%. These results confirm that high-accuracy black-box MIAs are feasible even in SC environments.