Federated Multilingual Models for Medical Transcript Analysis
Hi, I am Tau from Microsoft L5DC, and this is Federated Multilingual Models for Medical Transcript Analysis. This work is used in a product that we are developing called Text Analytics for Health. Text Analytics for Health is a platform for understanding clinical text. Among its features is named entity recognition: we recognize over 30 types of entities that are relevant to the medical domain, such as treatment name, diagnosis, medication name, etc. On top of this, we also identify various relations, such as the relation between an examination and its results, and we can attach a time expression to the entity it refers to. We also identify different types of assertions, such as negation. For example, if a diagnosis is negated, then instead of only presenting to the client that the entity appears, we also tag it as negated, for a better understanding of the clinical intent of the text. And finally, we also do entity linking: we link all the entities found to UMLS, so clients can use the structured information from those ontologies.
So how do we do it? Behind the scenes, there’s a large language model implemented in Torch, and it reads a lot of clinical text in different languages. But as everyone knows, clinical texts are private and protected and cannot be accessed by just anyone. So what we do is a federated learning pipeline. First, each client is provisioned in its own subscription, with its own compute and its own data, and only the client’s compute can access that client’s data. Then we distribute a language model to all the clients, where they train on their own data and send their updates back to the central model. Once the central model has aggregated the weights from all the clients, it can run evaluation and redistribute the model for another round of training. This repeats until the model has converged.
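The training loop described above (distribute, train locally, aggregate, evaluate, repeat) can be sketched in a few lines. This is only an illustrative toy, not the pipeline from the talk: it assumes a size-weighted average of client weights (the talk does not say which aggregation rule is used), replaces the language model with a two-parameter linear model, and uses synthetic data. All names here (`local_train`, `aggregate`, `evaluate`, `federated_training`, `make_client`) are hypothetical.

```python
import random


def local_train(weights, data, lr=0.05, epochs=5):
    """One client's local update: plain SGD on a toy model y = w * x + b."""
    w, b = weights
    for _ in range(epochs):
        for x, y in data:
            err = (w * x + b) - y
            w -= lr * err * x
            b -= lr * err
    return (w, b)


def aggregate(client_weights, client_sizes):
    """Average client weights, weighting each client by its dataset size (assumed rule)."""
    total = sum(client_sizes)
    w = sum(cw[0] * n for cw, n in zip(client_weights, client_sizes)) / total
    b = sum(cw[1] * n for cw, n in zip(client_weights, client_sizes)) / total
    return (w, b)


def evaluate(weights, data):
    """Mean squared error of the model on one client's data."""
    w, b = weights
    return sum(((w * x + b) - y) ** 2 for x, y in data) / len(data)


def federated_training(clients, rounds=50, tol=1e-6):
    """Repeat distribute -> local training -> aggregation -> evaluation until the loss stops changing."""
    global_weights = (0.0, 0.0)
    sizes = [len(data) for data in clients]
    history = []
    for _ in range(rounds):
        # Each client starts from the current global model and only touches its own data.
        updates = [local_train(global_weights, data) for data in clients]
        global_weights = aggregate(updates, sizes)
        # In this toy, evaluation is the size-weighted loss over all clients.
        loss = sum(evaluate(global_weights, d) * len(d) for d in clients) / sum(sizes)
        history.append(loss)
        if len(history) > 1 and abs(history[-2] - history[-1]) < tol:
            break
    return global_weights, history


def make_client(n, seed, x_lo, x_hi):
    """Synthetic client data drawn from y = 2x + 1 over a client-specific range of x."""
    rng = random.Random(seed)
    return [(x, 2.0 * x + 1.0) for x in (rng.uniform(x_lo, x_hi) for _ in range(n))]


# Three clients with different data distributions and sizes.
clients = [make_client(20, 0, -1, 0), make_client(30, 1, 0, 1), make_client(10, 2, -1, 1)]
weights, history = federated_training(clients)
print("final weights:", weights, "rounds run:", len(history))
```

In this sketch only model weights leave a client; the raw `(x, y)` pairs are never passed to `aggregate`, which mirrors the privacy constraint described in the talk.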
If you want to access some of the resources that we used for this work, you can start with the Azure ML pipeline for federated learning, where you can create and run federated learning pipelines of your own: you implement the training step and the evaluation step, and you have your own federated learning pipeline. Then you can use FLUTE, which is a platform for federated learning simulation. If you want to run a simulation of your experiment and check its validity, you can do this with FLUTE, saving all the overhead of slow communication between different subscriptions by running the simulation on a single cluster. And finally, you can access Text Analytics for Health for more information about our product. Thank you.