Fed-Platform
A lightweight, browser-based federated learning platform
FedPlatform is a web-based federated learning platform for cross-silo, privacy-preserving machine learning. Implementing current federated learning algorithms requires considerable computational skill, and deploying the correct dependencies is often a nightmare. I therefore propose a web-based federated machine learning platform built on a socket communication tool and Pyodide, a Python distribution compiled to WebAssembly. By providing a well-defined Python environment, with packages, in the front end, FedPlatform frees users/trainers from configuring the deployment.
Outline
Fire up the central aggregator
First, start the server with Node.js, then open port 2333 on localhost in the browser.
Create two trainers, Alice and Bob
Alice and Bob want to train a model together on FedPlatform, so each of them opens a browser and enters the Linear Regression project.
Platform interface
This is what Alice sees in the FedPlatform interface. She can even talk to Bob in the chat!
Federated Learning
Initiating a federated training task
To initiate a federated learning task, a user/trainer uploads or drags the data file to the left menu, then hits the Start Training button. Note that the data is loaded only into the local browser and never leaves your device. (Refreshing the page discards the uploaded data.)
After Alice uploads her data and hits the Start Training button, the system pops up a confirmation window. Here, I used the Breast Cancer Wisconsin (Diagnostic) Data Set from the UCI Machine Learning Repository, split horizontally (by rows) into two parts, one for Alice and one for Bob.
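A horizontal split gives each trainer a different subset of rows with the same columns. The sketch below shows one way to produce such a split. The helper `horizontal_split`, the column names `feature_1` and `feature_2`, and the toy values are assumptions made for illustration and are not taken from FedPlatform or the real dataset.

```python
import csv
import io

def horizontal_split(rows, n_parts=2):
    """Split a list of records row-wise into n_parts nearly equal chunks."""
    size, extra = divmod(len(rows), n_parts)
    parts, start = [], 0
    for i in range(n_parts):
        # The first `extra` chunks take one additional row each.
        end = start + size + (1 if i < extra else 0)
        parts.append(rows[start:end])
        start = end
    return parts

# Toy stand-in for the real CSV file (made-up values, hypothetical feature names).
raw = "\n".join([
    "feature_1,feature_2,Diagnosis_M",
    "1.0,2.0,1",
    "2.0,1.0,0",
    "3.0,5.0,1",
    "4.0,1.0,0",
    "5.0,7.0,1",
])
records = list(csv.DictReader(io.StringIO(raw)))

# Every part keeps all columns; only the rows are divided.
alice_rows, bob_rows = horizontal_split(records, 2)
```

Each part can then be written to its own file and uploaded by the corresponding trainer.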
Selecting the label for training
It is essential to agree on the same target label with your fellow trainers in advance. Assume Alice and Bob have decided to use the label Diagnosis_M as the training target.
Training result
After FedPlatform receives the same training label from all users/trainers, the central aggregator checks that the variables are aligned across trainers and looks for NaNs in the dataset. For now, FedPlatform simply removes entries with missing values. The training results are shown here.
The result is lossless compared with a centralized linear regression.
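The two checks described above can be sketched as follows. This is not FedPlatform's actual code. It assumes that each trainer reports only its column names to the aggregator and drops incomplete rows locally, and the helper names `is_missing`, `drop_missing`, and `columns_aligned` are hypothetical.

```python
import math

def is_missing(value):
    """Treat None, empty strings, the string 'nan', and float NaN as missing."""
    if value is None:
        return True
    if isinstance(value, str):
        return value.strip() == "" or value.strip().lower() == "nan"
    return isinstance(value, float) and math.isnan(value)

def drop_missing(rows):
    """Keep only the records in which no field is missing."""
    return [r for r in rows if not any(is_missing(v) for v in r.values())]

def columns_aligned(column_lists):
    """Return True when every trainer reports the same set of column names."""
    first = set(column_lists[0])
    return all(set(cols) == first for cols in column_lists[1:])

# Toy records with made-up values; feature_1 is a hypothetical column name.
alice = [
    {"feature_1": 1.0, "Diagnosis_M": 1},
    {"feature_1": float("nan"), "Diagnosis_M": 0},
]
bob = [
    {"feature_1": 2.0, "Diagnosis_M": 0},
    {"feature_1": "", "Diagnosis_M": 1},
]

aligned = columns_aligned([list(alice[0]), list(bob[0])])
alice_clean = drop_missing(alice)
bob_clean = drop_missing(bob)
```

Dropping rows is the simplest policy and matches the behavior described above; imputation would be an alternative, but it is outside what the platform currently does.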
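One common way to obtain a lossless federated linear regression on horizontally split data is to have each trainer send only the summary matrices X^T X and X^T y. Summing these over trainers gives exactly the matrices of the pooled data, so solving the normal equations yields the centralized coefficients. The source does not state that FedPlatform works this way, so the sketch below is an assumption. The function names and the toy data are made up for illustration.

```python
def local_statistics(X, y):
    """Compute X^T X and X^T y for one trainer, with an intercept column prepended."""
    rows = [[1.0] + list(x) for x in X]
    d = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(d)] for i in range(d)]
    xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(d)]
    return xtx, xty

def add_statistics(stats):
    """Element-wise sum of the per-trainer (X^T X, X^T y) pairs."""
    d = len(stats[0][1])
    xtx = [[sum(s[0][i][j] for s in stats) for j in range(d)] for i in range(d)]
    xty = [sum(s[1][i] for s in stats) for i in range(d)]
    return xtx, xty

def solve(a, b):
    """Solve a @ w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    w = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        w[i] = (m[i][n] - sum(m[i][j] * w[j] for j in range(i + 1, n))) / m[i][i]
    return w

# Toy data (made-up values): two features per row, one binary target.
alice_X, alice_y = [[1.0, 2.0], [2.0, 1.0], [3.0, 5.0]], [1.0, 0.0, 1.0]
bob_X, bob_y = [[4.0, 1.0], [5.0, 7.0], [6.0, 2.0]], [0.0, 1.0, 0.0]

# Federated: the aggregator only ever sees the summed statistics.
federated = solve(*add_statistics([
    local_statistics(alice_X, alice_y),
    local_statistics(bob_X, bob_y),
]))

# Centralized: the same fit on the pooled rows, for comparison.
centralized = solve(*local_statistics(alice_X + bob_X, alice_y + bob_y))
```

In this sketch the raw rows never leave a trainer; only small d-by-d and d-by-1 summaries are shared, and the coefficients agree with the centralized fit up to floating-point rounding.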
Roadmap
- Develop a dashboard web page
- Include more statistical models
- Migrate to TensorFlow.js for deep learning models
- Develop a Knowledge Distillation model