Curriculum Vitae

General Information

Full Name: Wentao Li
Languages: English, Chinese
Programming: Python (PyTorch, TensorFlow), R, JavaScript, PLINK, Docker, MongoDB
Skills: Machine Learning, Deep Learning, Genomic Studies, Medical Imaging Studies, Federated Learning, Privacy-preserving AI


Education

  • 2021 - present
    PhD candidate in Biomedical Informatics
    University of Texas Health Science Center at Houston (UTHealth), US
    • Dean’s Excellence Award, 2021.
    • Jingchun Sun Memorial Scholarship, 2023.
  • 2018 - 2020
    Master of Science in Statistics
    University of California, San Diego, US
  • 2014 - 2018
    Bachelor of Science in Mathematics
    Shanghai Maritime University, China
    • Dean’s List of SMU, 2016.
    • First Class Scholarship of SMU, 2017.


Experience

  • 2023 - present
    Research Assistant
    The University of Texas MD Anderson Cancer Center, US
    • Multi-modality modeling for cancer research
    • Pan-cancer study
  • 2021 - 2023
    Research Assistant
    The University of Texas Health Science Center at Houston, US
    • Developed and published a genetic algorithm for federated learning in privacy-preserving genome-wide association studies (GWAS) using generalized linear mixed models (GLMMs).
    • Conducted evaluation experiments for federated genomic data analysis with the OpenSNP dataset.
    • Developed a privacy-preserving federated learning genetic algorithm based on the R package ‘Generalized linear Mixed Model Association Tests’ (GMMAT).
  • 2020
    Research Intern
    The University of Texas Health Science Center at Houston, US
    • Developed and published a privacy-preserving federated learning method to approximate the intractable marginal log-likelihood function in generalized linear mixed models (GLMMs) for cohort studies.
    • Conducted experiments on adding differential privacy to federated GLMMs.
    • Hosted federated training across sites in Houston, San Diego, and Munich using the previously published VERTIcal Grid lOgistic regression with Confidence Interval (VERTIGO-CI).
  • 2019 - 2020
    Research Assistant
    University of California San Diego, US
    • Developed two logistic-regression-based prediction models in R and Python for partitioned data: Grid Binary LOgistic REgression (GLORE) for horizontally partitioned data and VERTIcal Grid lOgistic regression (VERTIGO) for vertically partitioned data.
    • Developed, with proof, a VERTIGO-based algorithm that can transmit confidence intervals, and published the method as VERTIGO-CI.
    • Set up Docker containers for the prediction models (VERTIGO with Confidence Intervals and GLORE) and tested their privacy-preserving prediction capability with data from Oklahoma, Texas, and San Diego.
    • Cleaned correlated genetic data using quality control (QC) procedures in PLINK.

Seminars & Speeches

  • 2021
    AMIA 2021 Virtual Informatics Summit
    • Presented the published conference paper ‘VERTIcal Grid lOgistic regression with Confidence Interval’.

Research Interests

  • Privacy-preserving machine learning
    • Federated Learning
    • Differential Privacy
    • Secure Multi-party Computation
    • Homomorphic Encryption
  • Genome-Wide Association Studies (GWAS)
    • Generalized linear Mixed Model Association Tests
    • Kinship relationship estimation
  • Medical imaging research
    • Multimodality imaging modeling (CT/PET/MRI)

Open Source Projects

  • 2022 - present
    Federated Learning Platform (FedPlatform) development
    • Developed a lightweight, browser-based cross-silo federated learning platform.
    • Embedded a Python distribution in the browser to run federated learning tasks, so participants do not need to install any dependencies.
    • Completed a simulated multi-party data collaboration test of federated linear regression.
    • Ongoing work aims to bridge isolated data islands and provide a user-friendly platform for non-professional users to collaborate on federated learning tasks.
  • 2022 - present
    FedML MLOpsCloud-Web development
    • Developed a web-based cross-silo federated learning feature in FedML.
    • Designed and deployed a generalized framework for web-based federated learning that aligns model structures during communication between browsers (TensorFlow.js) and the server (PyTorch).

Other Interests

  • Hobbies: Hiking, BBQ, etc.