Access Data

Qualified researchers may obtain access to the full breadth of individual-level PPMI data, including clinical, imaging, ‘omics, genetic, sensor, and biomarker data. Apply for data access or log in below.

The PPMI data structure has been optimized as a result of the study expansion. Two changes occurred in September 2021 that are crucial to all investigators downloading PPMI data:

  • Some variable definitions/names have been adjusted, which will impact code written based on the previous structure. The revised data dictionary reflects all changes in data structure.
  • PPMI participants have now been assigned to cohorts and subgroups in an analytic dataset, based on a central review of the most recent longitudinal data. The analytic dataset cohort assignments should be used rather than the enrollment cohort assignments for all data analysis. An Excel file providing definitions for analytic datasets and an accompanying guidance document will now be available with data downloads.
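As a rough illustration of the second point, the sketch below groups participants by their analytic cohort and lists those whose analytic cohort differs from their enrollment cohort. The column names (participant_id, enrollment_cohort, analytic_cohort), the participant IDs, and the cohort labels are hypothetical placeholders invented for this example; the real variable names are defined in the revised data dictionary.

```python
# Hypothetical column names; consult the revised data dictionary for the real ones.
ID_COL = "participant_id"
ENROLL_COL = "enrollment_cohort"
ANALYTIC_COL = "analytic_cohort"

# Toy rows invented for illustration only; these are not PPMI data.
rows = [
    {ID_COL: "P001", ENROLL_COL: "PD", ANALYTIC_COL: "PD"},
    {ID_COL: "P002", ENROLL_COL: "Prodromal", ANALYTIC_COL: "PD"},
    {ID_COL: "P003", ENROLL_COL: "Healthy Control", ANALYTIC_COL: "Healthy Control"},
]


def group_by_analytic_cohort(records):
    """Group participant IDs by analytic cohort, never by enrollment cohort."""
    groups = {}
    for record in records:
        cohort = record.get(ANALYTIC_COL)
        if not cohort:
            # Fail loudly instead of silently falling back to the enrollment cohort.
            raise ValueError(f"no analytic cohort for {record[ID_COL]}")
        groups.setdefault(cohort, []).append(record[ID_COL])
    return groups


def reassigned(records):
    """Return IDs whose analytic cohort differs from the enrollment cohort."""
    return [r[ID_COL] for r in records if r[ANALYTIC_COL] != r[ENROLL_COL]]


print(group_by_analytic_cohort(rows))
print(reassigned(rows))
```

Raising an error when the analytic cohort is missing is a design choice of this sketch: it avoids mixing the two assignment schemes within a single analysis.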

Please reach out with questions or feedback.

Registered users

Investigators who have been granted access to PPMI data can click the button below to access the login page.

Go to login page

New users

Investigators seeking access to PPMI data must do the following:

Applications for data access are reviewed by the Data and Publications Committee within one week of receipt.

Apply for data access

Investigators using PPMI data will be asked to provide annual updates on the analyses they have performed. This information will be displayed publicly on the PPMI Web site on an Ongoing Analyses page. Investigators will also be asked to provide new data generated using PPMI data back to the Data and Publications Committee so that it can be integrated into the database for use by future investigators.

Although every attempt has been made to ensure that the contents of the database are correct, data are provided by multiple parties and occasional errors may occur. All reasonable measures will be taken to ensure that errors are corrected promptly and completely; however, no explicit guarantee is provided regarding the accuracy of any data contained within this database. All data are made available as they are collected and are updated daily; they are therefore not curated. Investigators who suspect errors in the database should contact us.

Whole exome and whole genome sequencing were performed on DNA samples extracted from whole blood (details may be found in the Exome Sequencing and Genome Sequencing Methods documents available in the Genetic Data Download section of the PPMI database). The RNA Sequencing project generated data from raw sequencing reads of PPMI samples, and the Foundational Data Initiative for Parkinson's Disease (FOUNDIN-PD) project generated data from DNA, RNA, and proteins, using multiple assays, for 95 induced pluripotent stem cell (iPSC) lines. Genomic VCF (gVCF) data files are currently available for download by authorized PPMI investigators from within the PPMI database.

PPMI also makes the raw data files available. Whole Exome Sequencing BAM files will require approximately 15 TB of storage space and FASTQ files approximately 130 GB. Whole Genome Sequencing FASTQ files will require approximately 70 TB of storage space. Other genetic data are also available, including FOUNDIN-PD genetic data (approximately 25 TB) and RNA sequencing data (approximately 154 TB).
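For planning purposes, the approximate figures above can be totaled before requesting a transfer. The sketch below is only an illustration: it assumes decimal units (1 TB = 1000 GB), and the dataset labels and the 10% headroom default are choices made for this example, not PPMI requirements.

```python
# Approximate storage figures quoted above, in decimal terabytes (1 TB = 1000 GB).
STORAGE_TB = {
    "WES BAM": 15.0,
    "WES FASTQ": 130 / 1000,  # quoted as 130 GB
    "WGS FASTQ": 70.0,
    "FOUNDIN-PD": 25.0,
    "RNA sequencing": 154.0,
}


def total_storage_tb(datasets):
    """Sum the approximate storage needed for the named datasets."""
    return sum(STORAGE_TB[name] for name in datasets)


def fits(datasets, free_tb, headroom=0.10):
    """Return True if the datasets fit in free_tb with fractional headroom to spare.

    The 10% default headroom is an assumption of this sketch.
    """
    return total_storage_tb(datasets) * (1 + headroom) <= free_tb


print(f"All raw data: about {total_storage_tb(STORAGE_TB):.2f} TB")
print(fits(["WES BAM", "WES FASTQ"], free_tb=20))
```

Under these assumptions, everything listed comes to roughly 264 TB, so most requests will need to target a subset of the files.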

Investigators interested in obtaining data in BAM and/or FASTQ format may submit a request for these data to be provided via IBM Aspera Connect file transfer software. Investigators requesting FASTQ and/or BAM files must have an active PPMI database account and must submit a Genetic Data Request Form, available in the Genetic Data Download section of the PPMI database, to ppmi@loni.usc.edu. Data requests will be filled in the order received. We will contact you once your request is approved to facilitate a data transfer via IBM Aspera Connect.

PPMI data that have the potential to influence other assessments (such as those completed by clinical staff with prodromal cohorts) may be sequestered. For this reason, alpha-synuclein seeding amplification assay results from PPMI prodromal participants are sequestered. Researchers interested in accessing these data may request them from study leadership. Please refer to the Data Access Guidelines for more information. Note: Sequestered data requests should be sent in PDF format.

Usage Statistics

Number of Downloads

21,589,003

Clinical Downloads: 984,095

Image Downloads: 20,604,908

This number represents unique downloads. A single download means downloading a single .csv file on IDA, not one download of the entire dataset.

Downloads by Country

United States of America: 4,765,351

Downloads by Sector

University Research: 17,973,878
Pharmaceutical and Biotech: 1,319,940
Government: 956,290
Pharmaceutical: 364,533
Scanner Mfg: 8,446
Other: 965,916

Resources

This interactive tool provides an aggregate-level snapshot of PPMI cohort demographics and data collection time points.

Our data dictionary describes participant data and includes variable and schema definitions. Values are found in the codebook.

Have a question on PPMI design, data collection methods or access details?

The study’s data and biosample collection, storage and analysis methods have become field-wide standards.


PPMI follows thousands of individuals with varied connections to Parkinson's disease.