Distributed deep learning and inference without sharing raw data

MIT Alliance for Distributed and Private Machine Learning

Abstract: Friction in data sharing is a major challenge for large-scale machine learning. Recent techniques such as Federated Learning, Differential Privacy and Split Learning aim to address siloed and unstructured data, privacy and regulation of data sharing, and incentive models for data-transparent ecosystems. Split learning is a new technique developed at the MIT Media Lab’s Camera Culture group that allows participating entities to train machine learning models without sharing any raw data.

New Program: MIT Alliance for Distributed and Private Machine Learning

The program will explore the main challenges in data friction that make the capture, analysis and deployment of AI technologies difficult. The challenges include siloed and unstructured data, privacy and regulation of data sharing, and incentive models for data-transparent ecosystems. The research program will study automated machine learning (AutoML), privacy-preserving machine learning (PrivateML), and intrinsic as well as extrinsic data valuation (Data Markets). By working with a stakeholder and innovator network, we aim to create a standard for data-transparent ecosystems that can simultaneously address the privacy and utility of data. Our broad focus will be on key technologies such as Differential Privacy, Federated Learning and Split Learning.

The program members will meet four times a year, publish case studies of AI on siloed data, develop a curated GitHub archive, and engage in discussions of privacy-aware data sharing protocols, working towards a data exchange standard. We expect this integrated program to lead to many publications, training of talent, new technologies and standards.

MIT Media Lab consortium members can join the alliance as special interest group (SIG) members. Non-member companies, startups and non-profits can join via an undirected research gift. To learn more about the program, please contact vepakom(at)mit.edu

New: COVID19 SafePaths Project at (CovidSafePaths.org)

Safe Paths, a global community-led movement, develops free, open-source, privacy-by-design tools for residents, public health officials, and larger communities to flatten the curve of COVID-19, reduce fear, and prevent a surveillance-state response to the pandemic.

CovidSafePaths References

1.) Apps Gone Rogue: Maintaining Personal Privacy in an Epidemic, 2020. (PDF)

2.) COVID-19 Contact-Tracing Mobile Apps: Evaluation and Assessment for Decision Makers, 2020. (PDF)

3.) Bluetooth based Proximity, Multi-hop Analysis and Bi-directional Trust: Epidemics and More, 2020. (PDF)

Split Learning and Inference:

Split learning removes barriers for collaboration in a whole range of sectors including healthcare, finance, security, logistics, governance, operations and manufacturing.

For example, a split learning configuration as shown below allows resource-constrained local hospitals with smaller individual datasets to collaborate and build a machine learning model that offers superior healthcare diagnostics, without sharing any raw data with each other, as necessitated by trust, regulation and privacy.

Landscape of related work: As shown below, split learning fills the gap of performing advanced AI tasks, such as training machine learning models in distributed settings, with a substantial level of data protection.

Videos of Privacy aware AI, Split Learning at World Economic Forum and Niti Aayog

Health Grid: Blockchain-based Data Marketplace | Ramesh Raskar | WEF 2019
RAMESH RASKAR INTERVIEW WITH BLOXLIVE AT THE WEF
AI for All | Speedtalk | Ramesh Raskar
Ramesh Raskar: UNC-Chapel Hill Convocation Speaker | 2019
Ramesh Raskar: TEDxBeaconStreet | 2019

Key technical idea: In the simplest configuration of split learning, each client (for example, a radiology center) trains a partial deep network up to a specific layer known as the cut layer. The outputs at the cut layer are sent to another entity (a server or another client), which completes the rest of the forward pass without looking at raw data from any client. This completes a round of forward propagation without sharing raw data. The gradients are then back propagated at that entity from its last layer down to the cut layer. The gradients at the cut layer (and only these gradients) are sent back to the radiology client centers, where the rest of back propagation is completed. This process continues until the distributed split learning network is trained, without any party looking at another's raw data.
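
The round trip described above can be sketched in a few lines of plain Python. This is a minimal illustration and not the group's implementation: the `Client` and `Server` classes, the one-layer networks, the toy dataset and the squared-error loss are all assumptions made for the example, and the server is assumed to hold the labels. Only the cut-layer activations `h` and the cut-layer gradients `grad_h` cross between the two parties.

```python
import math
import random


class Client:
    """Holds the raw data and the layers up to the cut layer (hypothetical)."""

    def __init__(self, n_in, n_cut, rng):
        self.w = [[rng.uniform(-0.5, 0.5) for _ in range(n_cut)]
                  for _ in range(n_in)]

    def forward(self, x):
        # Compute cut-layer activations; only these leave the client.
        self.x = x
        self.h = [math.tanh(sum(x[i] * self.w[i][j] for i in range(len(x))))
                  for j in range(len(self.w[0]))]
        return self.h

    def backward(self, grad_h, lr):
        # Finish back propagation locally, starting from the cut-layer gradients.
        for j, (g, h) in enumerate(zip(grad_h, self.h)):
            grad_pre = g * (1.0 - h * h)  # derivative of tanh
            for i, xi in enumerate(self.x):
                self.w[i][j] -= lr * grad_pre * xi


class Server:
    """Holds the layers after the cut layer; never sees raw inputs."""

    def __init__(self, n_cut, rng):
        self.v = [rng.uniform(-0.5, 0.5) for _ in range(n_cut)]

    def train_step(self, h, y, lr):
        y_hat = sum(hj * vj for hj, vj in zip(h, self.v))
        err = y_hat - y                        # d(loss)/d(y_hat) for 0.5 * err^2
        grad_h = [err * vj for vj in self.v]   # gradient at the cut layer
        self.v = [vj - lr * err * hj for vj, hj in zip(self.v, h)]
        return 0.5 * err * err, grad_h


def train(epochs=200, lr=0.1, seed=0):
    rng = random.Random(seed)
    # Toy "private" dataset held by the client: y = x0 - x1.
    data = [([a, b], a - b) for a in (0.0, 0.5, 1.0) for b in (0.0, 0.5, 1.0)]
    client, server = Client(2, 4, rng), Server(4, rng)
    history = []
    for _ in range(epochs):
        total = 0.0
        for x, y in data:
            h = client.forward(x)                        # client -> server
            loss, grad_h = server.train_step(h, y, lr)   # server -> client
            client.backward(grad_h, lr)
            total += loss
        history.append(total / len(data))
    return history


history = train()
print(f"first epoch loss {history[0]:.4f}, last epoch loss {history[-1]:.4f}")
```

In a real deployment the two classes would run on different machines and the two commented transfers would be network messages; here they are ordinary function calls so the flow is easy to follow.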

SplitNN Architectures, Leakage Prevention and Diverse Applications


Potential Partner/Want to connect with us?

Please fill this simple form to reach out

Frequently asked questions

  1. How does split learning work and what is new in our approach?
    Split learning attains high resource efficiency for distributed deep learning in comparison to existing methods by splitting the model's architecture across distributed entities. It communicates only the activations and gradients at the split layer, unlike other popular methods that share weights or gradients from all the layers. Split learning requires no raw data sharing, of either labels or features.

  2. How is raw data protected and who can get positively impacted?
    Split learning requires absolutely no raw data sharing. Sectors such as healthcare, finance, security, surveillance and others where data sharing is prohibited will benefit from our approach for training distributed deep learning models. Another modality of split learning, called NoPeek SplitNN, also drastically reduces leakage from any communicated activations by reducing their distance correlation with the raw data, while maintaining model performance through a categorical cross-entropy loss.

  3. How long will it take to transition from laboratory setting to actual deployments between cooperating entities?
    The approach is easily deployable for inter- and intra-organizational collaboration and is highly versatile in terms of possible network topologies. Because of its high resource efficiency in terms of computation, memory and communication bandwidth, it is also naturally suited to distributed learning where the clients are pervasive edge devices such as mobile phones or IoT devices, as well as to larger devices and organizations.
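
The distance correlation mentioned in the answer to question 2 is a standard statistic that can be computed directly from samples. The sketch below is a plain-Python version of the sample distance correlation between raw inputs and the activations that would be communicated; it is an illustration of the quantity only, not the NoPeek training code. The function names and the toy vectors are assumptions, and how the statistic is weighted against the cross-entropy term during training is not specified here.

```python
import math


def _centered_distances(rows):
    """Pairwise Euclidean distances, double-centered."""
    n = len(rows)
    d = [[math.dist(rows[i], rows[j]) for j in range(n)] for i in range(n)]
    row_mean = [sum(r) / n for r in d]
    grand = sum(row_mean) / n
    # d is symmetric, so column means equal row means.
    return [[d[i][j] - row_mean[i] - row_mean[j] + grand for j in range(n)]
            for i in range(n)]


def distance_correlation(xs, zs):
    """Sample distance correlation between paired samples xs and zs."""
    a, b = _centered_distances(xs), _centered_distances(zs)
    n = len(xs)
    dcov2 = sum(a[i][j] * b[i][j] for i in range(n) for j in range(n)) / n**2
    dvar_x = sum(v * v for row in a for v in row) / n**2
    dvar_z = sum(v * v for row in b for v in row) / n**2
    if dvar_x == 0 or dvar_z == 0:
        return 0.0  # a constant sample carries no dependence
    return math.sqrt(max(dcov2, 0.0) / math.sqrt(dvar_x * dvar_z))


# Hypothetical toy data: four raw inputs and two candidate activations.
raw = [[0.0], [1.0], [2.0], [3.0]]
leaky = [[2.0 * x[0] + 1.0] for x in raw]       # linear in the raw data
scrambled = [[1.0], [-1.0], [1.0], [-1.0]]      # keeps only the parity

dcor_leaky = distance_correlation(raw, leaky)
dcor_scrambled = distance_correlation(raw, scrambled)
print(dcor_leaky, dcor_scrambled)
```

Activations that are a linear function of the raw data score at the top of the range, while activations that discard most of the input score lower; a leakage-reducing objective would push communicated activations toward the lower end while a task loss keeps them useful.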


Splintering Papers:

1.) Splintering with distributions: A stochastic decoy scheme for private computation, Praneeth Vepakomma, Julia Balla, Ramesh Raskar, (PDF) (2020)

Split Learning Papers:

1.) Distributed learning of deep neural network over multiple agents, Otkrist Gupta and Ramesh Raskar, In: Journal of Network and Computer Applications 116, (PDF) (2018)

2.) NoPeek: Information leakage reduction to share activations in distributed deep learning, Praneeth Vepakomma, Otkrist Gupta, Abhimanyu Dubey, Ramesh Raskar, (PDF) (2020)

3.) DISCO: Dynamic and Invariant Sensitive Channel Obfuscation, Abhishek Singh, Ayush Chopra, Vivek Sharma, Ethan Z. Garza, Emily Zhang, Praneeth Vepakomma, Ramesh Raskar, (PDF) (2020)

4.) FedML: A Research Library and Benchmark for Federated Machine Learning. (PDF) (2020)

5.) Split learning for health: Distributed deep learning without sharing raw patient data, Praneeth Vepakomma, Otkrist Gupta, Tristan Swedish, Ramesh Raskar, Accepted to ICLR 2019 Workshop on AI for social good. (PDF) (2018)

6.) Detailed comparison of communication efficiency of split learning and federated learning, Abhishek Singh, Praneeth Vepakomma, Otkrist Gupta, Ramesh Raskar, (PDF) (2019)

7.) ExpertMatcher: Automating ML Model Selection for Users in Resource Constrained Countries, Vivek Sharma, Praneeth Vepakomma, Tristan Swedish, Ken Chang, Jayashree Kalpathy-Cramer, and Ramesh Raskar (PDF) (2019)

8.) Split Learning for collaborative deep learning in healthcare, Maarten G. Poirot, Praneeth Vepakomma, Ken Chang, Jayashree Kalpathy-Cramer, Rajiv Gupta, Ramesh Raskar (2019)

Survey Papers:

1.) Advances and open problems in federated learning (with 58 authors from 25 institutions!) (PDF) (2019)

2.) No Peek: A Survey of private distributed deep learning, Praneeth Vepakomma, Tristan Swedish, Ramesh Raskar, Otkrist Gupta, Abhimanyu Dubey, (PDF) (2018)

3.) A Review of Homomorphic Encryption Libraries for Secure Computation, Sai Sri Sathya, Praneeth Vepakomma, Ramesh Raskar, Ranjan Ramachandra, Santanu Bhattacharya, (PDF) (2018)

Differentially Private Data Structures:

1.) DAMS: Meta-estimation of private sketch data structures for differentially private COVID-19 contact tracing, Praneeth Vepakomma, Subha Nawer Pushpita, Ramesh Raskar, PPML-NeurIPS 2020, (PDF) (2020)

AutoML Papers:

1.) Accelerating neural architecture search using performance prediction, Bowen Baker, Otkrist Gupta, Ramesh Raskar, Nikhil Naik, In: conference paper at ICLR, (PDF) (2018)

2.) Designing neural network architecture using reinforcement learning, Bowen Baker, Otkrist Gupta, Nikhil Naik & Ramesh Raskar, In: conference paper at ICLR, (PDF) (2017)

We are giving a half-day tutorial at CVPR 2019: On Distributed Private Machine Learning for Computer Vision: Federated Learning, Split Learning and Beyond by Brendan McMahan (Google, USA), Jakub Konečný (Google, USA), Otkrist Gupta (LendBuzz), Ramesh Raskar (MIT Media Lab, Cambridge, Massachusetts, USA), Hassan Takabi (University of North Texas, Texas, USA) and Praneeth Vepakomma (MIT Media Lab, Cambridge, Massachusetts, USA).

Recent talk on Split Learning at Datacouncil.ai SF 2019 (Slides)

Split learning’s computational and communication efficiency on clients:

Client-side communication costs are significantly reduced because the data to be transmitted is restricted to the activations and gradients at the split, produced by the initial layers of the split learning network (splitNN), rather than the weights of the whole network. Client-side computation costs are also significantly reduced, since each client learns the weights of only those initial layers. In terms of model performance, the accuracy of splitNN remained competitive with other distributed deep learning methods such as federated learning and large-batch synchronous SGD, with a drastically smaller client-side computational burden when training on a larger number of clients. This is shown below in teraflops of computation and gigabytes of communication for split learning used to train ResNet and VGG architectures over 100 and 500 clients on the CIFAR-10 and CIFAR-100 datasets.
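
A back-of-the-envelope calculation shows why the client-side communication can be smaller. The sketch below counts values exchanged per client under two simplifying assumptions that are ours, not measurements from the experiments: a federated client uploads and downloads the full set of model parameters once per round, and a split learning client uploads one cut-layer activation vector and downloads one gradient vector of the same size per training sample. The parameter count and cut-layer size are hypothetical; the 500 samples per client correspond to the 50,000 CIFAR-10 training images divided evenly over 100 clients.

```python
def federated_values_per_round(n_params):
    # Upload the full local update, download the full aggregated model.
    return 2 * n_params


def split_values_per_epoch(n_samples, cut_size):
    # Per sample: one activation vector up, one gradient vector down.
    return 2 * n_samples * cut_size


n_params = 10_000_000   # hypothetical model size
n_samples = 500         # 50,000 training images / 100 clients
cut_size = 2_048        # hypothetical number of values at the cut layer

fed = federated_values_per_round(n_params)
split = split_values_per_epoch(n_samples, cut_size)
print(f"federated: {fed:,} values, split: {split:,} values, ratio {fed / split:.1f}x")
```

The comparison depends entirely on these inputs: split learning's cost grows with the amount of local data and the width of the cut layer, while the federated cost grows with model size, so a small model trained on large local datasets can reverse the ordering.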

Versatile plug-and-play configurations of split learning

Versatile configurations of split learning cater to various practical settings: i) multiple entities holding different modalities of patient data, ii) centralized and local health entities collaborating on multiple tasks, iii) learning without sharing labels, iv) multi-task split learning, v) multi-hop split learning, and other hybrid possibilities, to name a few, as shown below and further detailed in our paper here (PDF)
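
As one concrete illustration, configuration (i) can be pictured as several clients each running their own layers on their own modality, with the server joining the cut-layer outputs. The forward-pass sketch below is a toy under stated assumptions: the function names, the single tanh layer per client, the fixed weights, and the "imaging" and "lab" feature vectors are all hypothetical, and concatenation is only one possible way for the server to merge the parts.

```python
import math


def client_forward(features, weights):
    """One tanh layer; `weights` holds one weight list per output unit."""
    return [math.tanh(sum(f * w for f, w in zip(features, unit)))
            for unit in weights]


def server_forward(activation_parts, weights):
    """Concatenate the cut-layer outputs from all clients, then score them."""
    joined = [a for part in activation_parts for a in part]
    return sum(a * w for a, w in zip(joined, weights)), joined


# Hypothetical toy inputs for one patient, held by two different entities.
imaging_features = [0.2, 0.7, 0.1]   # stays at the imaging center
lab_features = [1.0, 0.0]            # stays at the lab

imaging_weights = [[0.5, -0.2, 0.1], [0.3, 0.3, -0.4]]   # 3 inputs -> 2 units
lab_weights = [[0.6, -0.1]]                               # 2 inputs -> 1 unit
server_weights = [0.4, -0.3, 0.8]                         # 2 + 1 cut-layer values

h_imaging = client_forward(imaging_features, imaging_weights)
h_lab = client_forward(lab_features, lab_weights)
score, joined = server_forward([h_imaging, h_lab], server_weights)
print(len(joined), score)
```

During training, the server would split the gradient of `joined` back into the same parts and return each part to the client that produced it.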

News stories

  1. (A new AI method can train on medical records without revealing patient data)

  2. (A little-known AI method can train on your health data without threatening your privacy)

  3. (The Algorithm Newsletter: The privacy-preserving AI technique that will transform healthcare)

  4. (Les Echos: Medical secrecy, artificial intelligence and RGPD: irreconcilable? Not so sure…)

Ramesh Raskar, Associate Professor, MIT Media Lab; Principal Investigator
Praneeth Vepakomma, MIT
Abhishek Singh, MIT
Otkrist Gupta, MIT Affiliate
Vitor Pamplona, MIT Affiliate
Kevin Pho, MIT

OpenMined Collaborators:
Andrew Trask, Adam J. Hall, Théo Ryffel
