Abstract:
As Internet, Internet of Things (IoT), and Artificial Intelligence (AI) technologies rapidly evolve, data has become a critical driving force behind economic and technological advancement. Companies can leverage data analysis to gain comprehensive insights into customer behavior, market trends, and operational performance, thereby making informed decisions and enhancing overall performance. However, a single organization's data may not be sufficient for comprehensive data analysis, posing a significant challenge. For instance, developing an accurate marketing model to target users may necessitate data from multiple sources, such as telecom operators, social networking sites, and e-commerce platforms. This data scarcity necessitates data-sharing mechanisms, which are often fraught with concerns surrounding data privacy, ethics, and legality. In this regard, Federated Learning (FL), a novel machine learning paradigm, has garnered increasing attention. FL participants train models locally, safeguard data privacy, and exchange only model parameters with servers or other peers, thereby fully capitalizing on the value of data. This "data-available-but-not-visible" approach is gaining popularity in data-intensive fields.
Many FL tasks cannot be accomplished in a single instance and instead require sustained collaboration among multiple parties. For example, when multiple medical institutions jointly develop an FL model to detect and manage chronic diseases, they must continuously accumulate clinical data, learn from evolving cases, and improve the model's robustness and predictive power so that it reflects the latest medical knowledge and practice. Current literature on FL cooperative behavior and incentive mechanisms, however, primarily focuses on cross-device federated learning and considers only one-off cooperation. Such modeling is inadequate for characterizing practical long-term cross-silo FL patterns. On the one hand, cross-silo FL participants, who themselves accumulate a certain amount of data, have more complex and diverse strategic options than those in cross-device FL: participants can choose to join public training or to improve their model utility solely through local training. On the other hand, when cooperation shifts from a one-off to a long-term scenario, time inconsistency issues may induce free-riding, incentivizing participants to delay their data contributions while enjoying the benefits of others' contributions. To address these limitations, this study concentrates on the long-term cross-silo FL process, establishing a dynamic game model to characterize federated clients' interactive strategies and proposing a reinforcement learning-based incentive mechanism that encourages rational participants to contribute, with the aim of boosting the FL system's overall revenue.
This paper first establishes a dynamic game model to characterize federated clients' long-term interactive strategies. We devise a cooperation contract in which the central server transmits the aggregated parameters only to contributors in the current training period. With the long-term cross-silo FL cooperation process divided into several model training periods, clients face two strategic choices in each period: participate in public federated training, or retain their data for local training only. At the end of each period, clients receive feedback parameters from the central server and gain benefits according to their local models' accuracy. In this framework, clients face a trade-off between participation costs and the potential benefits of contributing early. Because information accumulates in the model as clients contribute, clients also confront a cross-period decision problem of allocating resources over the entire long-term FL cooperation process. Based on these assumptions, this paper constructs a game tree to derive the game solution, in which clients' decisions in each training period rest on full knowledge of past cooperation and rational expectations of future actions. Through backward induction, we solve for the clients' equilibrium strategy, which exhibits intermittent contribution gaps and clearly deviates from the socially optimal cooperative pattern.
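To illustrate the mechanics of the backward-induction step, the following Python sketch solves a single representative client's per-period participate-or-local-only decision over a finite horizon. The horizon length, participation cost, and accuracy-gain function are illustrative assumptions rather than the paper's actual payoff specification.

# Minimal sketch of backward induction for one client's per-period
# participate / local-only decision. T, PARTICIPATION_COST, and
# accuracy_gain() are illustrative assumptions, not the paper's payoffs.

T = 10                      # number of training periods (assumed)
PARTICIPATION_COST = 0.15   # per-period cost of contributing data/compute (assumed)


def accuracy_gain(shared_rounds: int, participate: bool) -> float:
    """Assumed diminishing-returns benefit: only contributors receive the
    aggregated parameters, so accuracy depends on rounds joined so far."""
    rounds = shared_rounds + (1 if participate else 0)
    base = 1.0 - 0.5 ** rounds                    # value of the shared model
    return base if participate else 0.6 * base    # non-contributors fall behind


def solve_by_backward_induction():
    # value[t][k]: best continuation payoff from period t onward, given the
    # client has already participated in k public training rounds
    value = [[0.0] * (T + 1) for _ in range(T + 1)]
    policy = [[False] * (T + 1) for _ in range(T)]
    for t in range(T - 1, -1, -1):
        for k in range(t + 1):
            join = (accuracy_gain(k, True) - PARTICIPATION_COST
                    + value[t + 1][k + 1])
            stay_local = accuracy_gain(k, False) + value[t + 1][k]
            policy[t][k] = join >= stay_local
            value[t][k] = max(join, stay_local)
    return policy


if __name__ == "__main__":
    policy, k = solve_by_backward_induction(), 0
    for t in range(T):
        act = policy[t][k]
        print(f"period {t}: {'participate' if act else 'local only'}")
        k += 1 if act else 0

This single-client sketch only demonstrates the recursion itself; the paper's equilibrium analysis involves multiple interacting clients, and it is their equilibrium paths that exhibit the intermittent contribution gaps described above.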
Building on the above game analysis, this paper subsequently designs a dynamic incentive scheme based on reinforcement learning, setting incentives for different training periods according to clients' cooperation progress. First, we regard the FL organization as a central planner responsible for issuing incentives before each training period to encourage federated clients' input. A Deep Reinforcement Learning (DRL) agent assists the central planner in making incentive decisions, with the federated clients serving as the environment with which the agent interacts. On the one hand, we carefully design the state, action, and reward of the DRL method to fully capture the information of the federated learning cooperation process. On the other hand, we introduce enhancements to the traditional Deep Q-Network (DQN) method, such as Double Deep Q-Network (DDQN), prioritized replay, and noisy networks, to improve its performance. Extensive experiments verify the scheme's effectiveness in improving the system's total revenue while controlling incentive costs. A reasonable penalty on incentive costs guides the DRL agent toward the most cost-effective incentive scheme, precisely targeting the periods in which clients' willingness to cooperate is low, and the system revenue under the same budget significantly surpasses that of fixed incentives.
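As a rough illustration of the agent's training loop, the following PyTorch sketch shows a Double DQN update in which the reward is the per-period system revenue minus a penalty on incentive spending. The state dimension, number of discrete incentive levels, penalty weight, and network sizes are assumptions, and the prioritized-replay and noisy-network components are omitted for brevity.

# Minimal sketch of the incentive agent's learning loop (Double DQN only;
# prioritized replay, noisy networks, and target-network syncing omitted).
# STATE_DIM, N_INCENTIVE_LEVELS, COST_PENALTY, and network sizes are assumptions.

import random
from collections import deque

import torch
import torch.nn as nn

STATE_DIM = 8            # e.g., cooperation progress and past contributions (assumed)
N_INCENTIVE_LEVELS = 5   # discrete incentive amounts the planner can issue (assumed)
COST_PENALTY = 0.1       # weight on incentive spending in the reward (assumed)
GAMMA = 0.95


class QNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(STATE_DIM, 64), nn.ReLU(),
            nn.Linear(64, N_INCENTIVE_LEVELS),
        )

    def forward(self, x):
        return self.net(x)


online, target = QNet(), QNet()
target.load_state_dict(online.state_dict())
optimizer = torch.optim.Adam(online.parameters(), lr=1e-3)
replay = deque(maxlen=10_000)   # stores (state, action, reward, next_state) tensors


def reward(system_revenue: float, incentive_paid: float) -> float:
    """Per-period system revenue minus a penalty on incentive cost."""
    return system_revenue - COST_PENALTY * incentive_paid


def train_step(batch_size: int = 32):
    if len(replay) < batch_size:
        return
    s, a, r, s2 = (torch.stack(x) for x in zip(*random.sample(replay, batch_size)))
    with torch.no_grad():
        # Double DQN: the online net selects the next action, the target net evaluates it
        next_a = online(s2).argmax(dim=1, keepdim=True)
        td_target = r + GAMMA * target(s2).gather(1, next_a).squeeze(1)
    q = online(s).gather(1, a.unsqueeze(1)).squeeze(1)
    loss = nn.functional.smooth_l1_loss(q, td_target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()


# Illustrative usage: fill the buffer with dummy transitions and take one step.
for _ in range(64):
    replay.append((torch.randn(STATE_DIM),
                   torch.tensor(random.randrange(N_INCENTIVE_LEVELS)),
                   torch.tensor(reward(system_revenue=1.0, incentive_paid=0.5)),
                   torch.randn(STATE_DIM)))
train_step()

In the actual scheme, the dummy transitions above would be replaced by interactions with the federated clients, whose contribution responses to the issued incentives determine the observed system revenue.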
This paper not only theoretically uncovers the dynamic patterns in long-term cross-silo federated learning cooperation but also proposes innovative incentive mechanisms to enhance cooperation efficiency, offering fresh insights and methodologies for effectively facilitating data sharing and cooperation in the contemporary information era.