for people ages 40-85 (full criteria)
at San Diego, California and other locations
study started
completion around



The study will collect a cross-sectional dataset of 4,000 people across the US from diverse racial/ethnic groups who are either 1) healthy, or 2) in one of three stages of diabetes severity (pre-diabetes/diet controlled, oral medication and/or non-insulin-injectable medication controlled, or insulin dependent), forming a total of four groups of participants. Clinical data (social determinants of health surveys, continuous glucose monitoring data, biomarkers, genetic data, retinal imaging, cognitive testing, etc.) will be collected. The purpose of this project is to generate data that will allow the future creation of artificial intelligence/machine learning (AI/ML) algorithms aimed at defining disease trajectories and underlying genetic links in different racial/ethnic cohorts. A smaller subgroup of participants will be invited to come for a follow-up visit in year 4 of the project (longitudinal arm of the study). Data will be placed in an open-source repository, and samples will be sent to the study sample repository and used for future research.


The Artificial Intelligence Ready and Equitable Atlas for Diabetes Insights (AI-READI) project seeks to create a flagship, ethically sourced dataset to enable future generations of artificial intelligence/machine learning (AI/ML) research to provide critical insights into type 2 diabetes mellitus (T2DM), including salutogenic pathways to return to health. The ability to understand and affect the course of complex, multi-organ diseases such as T2DM has been limited by a lack of well-designed, high-quality, large, and inclusive multimodal datasets. The AI-READI team of investigators aims to collect a cross-sectional dataset of 4,000 people across the US and longitudinal data from 10% of the study cohort. The study cohort will be balanced for self-reported race/ethnicity, gender, and diabetes disease stage. Data collection will be specifically designed to permit downstream pseudo-time manifold analysis, an approach used to predict disease trajectories by collecting and learning from complex, multimodal data from participants with differing disease severity (normal to insulin-dependent T2DM). The long-term objective of this project is to develop a foundational dataset in T2DM, agnostic to existing classification criteria or biases, which can be used to reconstruct a temporal atlas of T2DM development and reversal towards health (i.e., salutogenesis). Six cross-disciplinary project modules involving teams located across eight institutions will work together to develop this flagship dataset. Data will be optimized for downstream AI/ML research and made publicly available. This project will also create a roadmap for ethical and equitable research that focuses on the diversity of the research participants and the workforce involved at all stages of the research process (study design and data collection, curation, analysis, and sharing and collaboration).


Type 2 Diabetes, Machine Learning, Retinal Imaging, Data Sharing, Equitable Data Collection, Diabetes Mellitus, Type 2 Diabetes Mellitus, Pre-diabetes/Diet Controlled, Insulin Dependent


You can join if…

Open to people ages 40-85

  • Adults (≥ 40 years old)
  • Patients with and without type 2 diabetes
  • Able to provide consent
  • Must be able to read and speak English

You CAN'T join if…

  • Adults older than 85 years of age
  • Pregnancy
  • Gestational diabetes
  • Type 1 diabetes
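For readers who want to apply the criteria above programmatically, the following is a minimal sketch of a pre-screening check, not the study's actual screening procedure. The `Candidate` record, its field names, and the two functions are hypothetical and exist only for illustration; the rules themselves (ages 40-85, able to consent, reads and speaks English, no pregnancy, gestational diabetes, or type 1 diabetes) are taken from the lists above.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Candidate:
    """Hypothetical pre-screening record; field names are illustrative only."""
    age: int
    can_consent: bool
    reads_and_speaks_english: bool
    pregnant: bool = False
    gestational_diabetes: bool = False
    type_1_diabetes: bool = False


def screening_failures(c: Candidate) -> List[str]:
    """Return the listed criteria that the candidate does not meet."""
    reasons = []
    # Inclusion criteria
    if c.age < 40:
        reasons.append("younger than 40")
    if not c.can_consent:
        reasons.append("unable to provide consent")
    if not c.reads_and_speaks_english:
        reasons.append("cannot read and speak English")
    # Exclusion criteria
    if c.age > 85:
        reasons.append("older than 85")
    if c.pregnant:
        reasons.append("pregnancy")
    if c.gestational_diabetes:
        reasons.append("gestational diabetes")
    if c.type_1_diabetes:
        reasons.append("type 1 diabetes")
    return reasons


def is_eligible(c: Candidate) -> bool:
    """A candidate passes pre-screening only if no criterion fails."""
    return not screening_failures(c)
```

Returning the list of failed criteria, rather than a bare yes/no, lets a coordinator see why a candidate was screened out. Having or not having type 2 diabetes is not checked because both groups are eligible.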


  • UC San Diego
    San Diego, California 92093, United States
  • University of Washington
    Seattle, Washington 98109, United States
  • University of Alabama, Birmingham
    Birmingham, Alabama 35233, United States


accepting new patients by invitation only
Start Date
Completion Date
University of Washington
Study Type
Expecting 4000 study participants
Last Updated