OpDEA: Optimizing Proteomics Data Differential Expression Analysis via High-Performing Rules and Ensemble Inference


Abstract

Identification of differentially expressed proteins in a proteomics workflow typically encompasses five key steps: raw data quantification, expression matrix construction, matrix normalization, missing value imputation (MVI), and differential expression analysis. The plethora of options in each step makes it challenging to identify optimal workflows that maximize the identification of differentially expressed proteins. To identify optimal workflows and their common properties, we conduct an extensive study involving 34,576 combinatoric experiments on 24 gold standard spike-in datasets. Applying frequent pattern mining techniques to top-ranked workflows, we uncover high-performing rules that demonstrate optimality has conserved properties. Via machine learning, we confirm optimal workflows are indeed predictable, with average cross-validation F1 scores and Matthew's correlation coefficients surpassing 0.84. We introduce an ensemble inference to integrate results from individual top-performing workflows for expanding differential proteome coverage and resolve inconsistencies. Ensemble inference provides gains in pAUC (up to 4.61%) and G-mean (up to 11.14%) and facilitates effective aggregation of information across varied quantification approaches such as topN, directLFQ, MaxLFQ intensities, and spectral counts. However, further development and evaluation are needed to establish acceptable frameworks for conducting ensemble inference on multiple proteomics workflows.

Workflow of a DEA process for label-free proteomics data and available tools for each step in the workflow

M-VIDIA: A Multi-View representation to Increase modality Depth using Integrative AI


Abstract

Artificial intelligence (AI) models are powerful tools for addressing data challenges such as complexity, sparsity, and noise. Multi-view learning (MVL), which leverages multiple data representations (“views”), holds great potential by enhancing signal quality and improving task performance. However, its application in biomedical research remains largely underexplored. To address this gap, we introduce Multi-View representation to Increase Modality Depth using Integrative AI (M-VIDIA), a novel MVL framework. M-VIDIA integrates diverse views to boost performance across various tasks, including differential expression analysis, patient diagnosis, and cell clustering. Our findings demonstrate that M-VIDIA consistently outperforms other approaches, significantly improving sensitivity and signal quality. This represents a major advancement in data-driven AI, highlighting the crucial role of high-quality data representation in producing reproducible and interpretable results in biomedical research. M-VIDIA is available for use at http://www.ai4pro.tech:3838/.

Overall pipeline of M-VIDIA and its performance on DEA

Acknowledgement and Citation

Acknowledgement

This research/project is supported by the National Research Foundation, Singapore under its Industry Alignment Fund – Prepositioning (IAF-PP) Funding Initiative. Any opinions, findings and conclusions or recommendations expressed in this material are those of the author(s) and do not reflect the views of National Research Foundation, Singapore.

This work was partly supported by the National Innovation Fellow Program of the MOST of China (J.L., Grant No. E327130001).

WWBG also acknowledges support from an MOE Tier 1 award (RS08/21).


Publication

Please cite the following papers:

Hui Peng, He Wang, Weijia Kong, Jinyan Li*, Wilson Wen Bin Goh*. (2024). Optimizing differential expression analysis for proteomics data via high-performing rules and ensemble inference. Nat Commun 15, 3922 (2024). https://doi.org/10.1038/s41467-024-47899-w

Hui Peng, Wilson Wen Bin Goh*. (2024). M-VIDIA: A Multi-View representation to Increase modality Depth using Integrative AI.

FragPipe workflow benchmarking results with label-free DDA data


workflow benchmarking
performance distributions

Maxquant workflow benchmarking resultswith label-free DDA data


workflow benchmarking
performance distributions

DIA-NN workflow benchmarking results with label-free DIA


workflow benchmarking
performance distributions

Spectronaut workflow benchmarking results with label-free DIA


workflow benchmarking
performance distributions

FragPipe workflow benchmarking results with TMT data


workflow benchmarking
performance distributions

Maxquant workflow benchmarking results with TMT data


workflow benchmarking
performance distributions

Recommend optimal Workflow for DDA label-free data

We highly recommend you to use our offline tool or R package instead for testing DEA workflows !!! (see following link:)

offline OpDEA toolkit: OpDEA toolkit

R package: R package


Do you have any preferred selections?

click the botton again to hide !!!

ensemble inference may help improve the true positive rates, Do you help try it?

click the botton again to hide !!!

We suggest to apply the FragPipe-specific top-ranked workflow:
DEqMS|FragPipe|dlfq|missForest|blank:        expression matrix:dlfq        normalization:None        MVI:missForest        DEA tool:DEqMS
or your can apply one of the workflows including the following choices:
expression matrix: [directLFQ] intensity          normalization: None          MVI: [SeqKNN]          DEA tool: [limma, ROTS]
You can view the details of your selected workflow in benchmarking-DDA_LFQ-FragPipe page

thresholds:

Volcano plot of the differential expression analysis (DEA)

DEA with the suggested workflow, please wait for the results...


DEA results.zip
Per your info, our suggestions are as follows:

You can view the details of this workflow in benchmarking-DDA_LFQ-FragPipe page



thresholds:

Volcano plot of the differential expression analysis (DEA)

DEA with the suggested workflow, please wait for the results...


DEA results.zip
We suggest to use the following workflows to conduct ensemble inference:


thresholds:

Volcano plot of the differential expression analysis (DEA)

DEA with the suggested ensemble inference workflow, please wait for the results...


DEA results.zip

please check in which step:

expression matrix types

Normalization methods

MVI algorithms

DEA tools

Do you have any preferred selections?

click the botton again to hide !!!

ensemble inference may help improve the true positive rates, Do you help try it?

click the botton again to hide !!!

We suggest to apply the maxquant-specific top-ranked workflow:
DEqMS|Maxquant|dlfq|Impseq|blank:        expression matrix:dlfq        normalization:None        MVI:Impseq        DEA tool:DEqMS
or your can apply one of the workflows including the following choices:
expression matrix: directLFQ intensity          normalization: [None]          MVI: [Impseq]          DEA tool: limma
You can view the details of your selected workflow in benchmarking-DDA_LFQ-Maxquant page

thresholds:

Volcano plot of the differential expression analysis (DEA)

DEA with the suggested workflow, please wait for the results...


DEA results.zip
Per your info, our suggestions are as follows:

You can view the details of this workflow in benchmarking-DDA_LFQ-Maxquant page


thresholds:

Volcano plot of the differential expression analysis (DEA)

DEA with the suggested workflow, please wait for the results...


DEA results.zip
We suggest to use the following workflows to conduct ensemble inference:


thresholds:

Volcano plot of the differential expression analysis (DEA)

DEA with the suggested ensemble inference workflow, please wait for the results...


DEA results.zip

please check in which step:

expression matrix types

Normalization methods

MVI algorithms

DEA tools

Recommend optimal Workflow for DIA label-free data

We highly recommend you to use our offline tool or R package instead for testing DEA workflows !!! (see following link:)

offline OpDEA toolkit: OpDEA toolkit

R package: R package


Do you have any preferred selections?

click the botton again to hide !!!

ensemble inference may help improve the true positive rates, Do you help try it?

click the botton again to hide!!!

We suggest to apply the DIA-NN-specific top-ranked workflow:
limma|DIANN|dlfq|MinDet|blank:        expression matrix:dlfq        normalization:None        MVI:MinDet        DEA tool:limma
or your can apply one of the workflows including the following choices:
expression matrix: "directLFQ intensity"          normalization: [None]          MVI: [MinDect]          DEA tool: [limma]
You can view the details of your selected workflow in benchmarking-DIA_LFQ-DIANN page

thresholds:

Volcano plot of the differential expression analysis (DEA)

DEA with the suggested workflow, please wait for the results...


DEA results.zip
Per your info, our suggestions are as follows:

You can view the details of this workflow in benchmarking-DIA_LFQ-DIANN page


thresholds:

Volcano plot of the differential expression analysis (DEA)

DEA with the suggested workflow, please wait for the results...


DEA results.zip
We suggest to use the following workflows to conduct ensemble inference:


thresholds:

Volcano plot of the differential expression analysis (DEA)

DEA with the suggested ensemble inference workflow, please wait for the results...


DEA results.zip

please check in which step:

expression matrix types

Normalization methods

MVI algorithms

DEA tools

Do you have any preferred selections?

click the botton again to hide!!!

ensemble inference may help improve the true positive rates, Do you help try it?

click the botton again to hide!!!

We suggest to apply the Spectronaut-specific top-ranked workflow:
limma|DIANN|dlfq|MinDet|blank:        expression matrix:dlfq        normalization:None        MVI:MinDet        DEA tool:limma
or your can apply one of the workflows including the following choices:
expression matrix: "directLFQ intensity"          normalization: [None]          MVI: [Impseq]          DEA tool: [ROTS, limma]
You can view the details of your selected workflow in benchmarking-DIA_LFQ-Spectronaut page

thresholds:

Volcano plot of the differential expression analysis (DEA)

DEA with the suggested workflow, please wait for the results...


DEA results.zip
Per your info, our suggestions are as follows:

You can view the details of this workflow in benchmarking-DIA_LFQ-Spectronaut page


thresholds:

Volcano plot of the differential expression analysis (DEA)

DEA with the suggested workflow, please wait for the results...


DEA results.zip
We suggest to use the following workflows to conduct ensemble inference:


thresholds:

Volcano plot of the differential expression analysis (DEA)

DEA with the suggested ensemble inference workflow, please wait for the results...


DEA results.zip

please check in which step:

expression matrix types

Normalization methods

MVI algorithms

DEA tools

Recommend optimal Workflow for TMT data

We highly recommend you to use our offline tool or R package instead for testing DEA workflows !!! (see following link:)

offline OpDEA toolkit: OpDEA toolkit

R package: R package


Do you have any preferred selections?

click the botton again to hide !!!

We suggest to apply the TMT-FragPipe-specific top-ranked workflow:
limma|FragPipe|abd|SeqKNN|blank:        expression matrix:abd        normalization:None        MVI:SeqKNN        DEA tool:limma
or your can apply one of the workflows including the following choices:
expression matrix: TMT-Integrator abundance          normalization: None          MVI: [SeqKNN]          DEA tool: limma
You can view the details of your selected workflow in benchmarking-TMT-FragPipe page

You can view the details of this workflow in benchmarking-DIA_LFQ-Spectronaut page


thresholds:

Volcano plot of the differential expression analysis (DEA)

DEA with the suggested workflow, please wait for the results...


DEA results.zip
Per your info, our suggestions are as follows:
You can view the details of your selected workflow in benchmarking-TMT-FragPipe page

You can view the details of this workflow in benchmarking-DIA_LFQ-Spectronaut page


thresholds:

Volcano plot of the differential expression analysis (DEA)

DEA with the suggested workflow, please wait for the results...


DEA results.zip

please check in which step:

expression matrix types

Normalization methods

MVI algorithms

DEA tools

Do you have any preferred selections?

click the botton again to hide !!!

We suggest to apply the TMT-Maxquant-specific top-ranked workflow:
proDA|Maxquant|intensity|bpca|blank:        expression matrix:intensity        normalization:None        MVI:bpca        DEA tool:proDA
or your can apply one of the workflows including the following choices:
expression matrix: Reporter intensity          normalization: None          MVI: [bpca]          DEA tool: [limma, ROTS]
You can view the details of your selected workflow in benchmarking-TMT-FragPipe page

You can view the details of this workflow in benchmarking-DIA_LFQ-Spectronaut page


thresholds:

Volcano plot of the differential expression analysis (DEA)

DEA with the suggested workflow, please wait for the results...


DEA results.zip
Per your info, our suggestions are as follows:

You can view the details of this workflow in benchmarking-TMT-FragPipe page


You can view the details of this workflow in benchmarking-DIA_LFQ-Spectronaut page


thresholds:

Volcano plot of the differential expression analysis (DEA)

DEA with the suggested workflow, please wait for the results...


DEA results.zip

please check in which step:

Normalization methods

MVI algorithms

DEA tools

Extracting views for M-VIDIA


This page provides the function of extracting view data from raw quantification outputs

!!! We strongly recommend using our offline tool (zendo) to extract views as the process is time consuming especially for DIA data:

example raw quantification data:

Download from zenodo via the following link
Zenodo: example_raw_quantification.zip;

FragPipe-DDA quantification

please provide following file paths.


please choose views you want to extract


Maxquant-DDA quantification

please provide following file paths.


please choose views you want to extract


DIA-NN-DIA quantification

please provide following file paths.


please choose views you want to extract


Spectronaut-DIA quantification

please provide following file paths.

For Spectronaut, please Use this export schema iq.rs to make a long report, as we use 'iq' R package to extract the views. See https://github.com/tvpham/iq.


please choose views you want to extract


M-VIDIA for proteomics data analysis


Input files

!!!please prepare required files per following instructions.

Please provide view files

File preparation instructions:

Step 1. Extraction of views from raw quantification outputs.

We provide a module helping extract views from your proteomics data and use them as inputs to MVIDIA. see View Extraction page.

example view data:

Download from zenodo via the following link
Zenodo: views.zip;

Step 2. Select view files e.g., e.g., dlfq.tsv, maxlfq.tsv, etc. from Step 1.

Step 3. Preparing a design.tsv file with at least columns of sample_name, condition, and replicate name, the sample name should be the name as the view file column names for intensities

Step 4. Compressing the view files and design file to a views.zip file.

Parameters


Volcano plot of the differential expression analysis (DEA)

DEA with MVIDIA, please wait for the results...

DEA_results.zip

M-VIDIA for Disease diagnosis


Input files

!!!please prepare required files per following instructions.

Please provide view files

File preparation instructions:

Step 1. Extraction of views from raw quantification outputs.

We provide a module helping extract views from your proteomics data and use them as inputs to MVIDIA. see View Extraction page.

example view data:

Download from zenodo via the following link
Zenodo: train.zip;
Download from zenodo via the following link
Zenodo: test.zip;

Step 2. Select a group of training view files e.g., e.g., dlfq.tsv, maxlfq.tsv, etc. from training views extracted from Step 1.

Step 3. Preparing a design.tsv file for training data, with at least columns of sample_name, sample_ID, and class, the sample_name should be the name as the view file column names for intensities

Step 4. Compressing the training view files and design file to a train_views.zip file.

Step 5. Select the same group of test view files e.g., e.g., dlfq.tsv, maxlfq.tsv, etc. from testing views extracted from Step 1.

Step 6. Preparing a design.tsv file for testing data, with at least columns of sample_name and sample_ID, the sample_name should be the name as the view file column names for intensities

Step 7. Compressing the testing view files and design file to a test_views.zip file.

Parameters



Classification of patients with MVIDIA, please wait for the results...

PD_results.zip

M-VIDIA for cell clustering


Input files

!!!please prepare required files per following instructions.

Please provide view files

File preparation instructions:

Step 1. Extraction of views from raw quantification outputs.

We provide a module helping extract views from your proteomics data and use them as inputs to MVIDIA. see View Extraction page.

example view data:

Download from zenodo via the following link
Zenodo: sc.zip;

Step 2. Select view files e.g., e.g., dlfq.tsv, maxlfq.tsv, etc. from Step 1. (also can provide a design file with true labels.)

Step 3. Compressing the view files to a single .zip file.

Parameters



Single cell clustering with MVIDIA, please wait for the results...

Clustering_results.zip

Avaliable Resources


Raw data links for DEA workflow benchmarking

label-free DDA data

HYE5600735_LFQ: PXD028735

HYE6600735_LFQ: PXD028735

HYEqe735_LFQ: PXD028735

HYEtims735_LFQ: PXD028735

HYtims134_LFQ: PXD036134

HEtims425_LFQ: PXD021425;

YUltq006_LFQ: PDC000006

YUltq099_LFQ: PXD002099

YUltq819_LFQ: PXD001819

HEqe408_LFQ: PXD018408

HYqfl683_LFQ: PXD007683

HYEtims777_LFQ: PXD014777;


label-free DIA data

HYEtims735_DIA: PXD028735

MYtims709_DIA: PXD034709

HEqe408_DIA: PXD018408

HEof_n600_DIA: PXD026600

HEof_w600_DIA: PXD026600

HYtims134_DIA: PXD036134

HEqe777_DIA: PXD019777

HEqe777_DIA:


TMT data

HEqe277_TMT10: PXD013277

HYqfl683_TMT11: PXD007683

HYms2faims815_TMT16: PXD020815

HYsps2815_TMT16: PXD020815

HYms2815_TMT16: PXD020815

Quantification data for DEA workflow benchmarking

Please download from zenodo via the following link

Zenodo: Raw quantification results from FragPipe, Maxquant, DIA-NN and Spectronaut

Or download from google drive via the following link

google Drive: Raw quantification results of DDA data; DIA data; TMT data

DEA workflow benchmarking result data

expression matrices are available at:

Download from zenodo via the following link
Zenodo: DDA_expression_matrices;
Download from zenodo via the following link
Zenodo: DIA_expression_matrices;
Download from zenodo via the following link
Zenodo: TMT_expression_matrices;

Performance metrices are available at:

Download from zenodo via the following link
Zenodo: DDA_performance_metrics;
Download from zenodo via the following link
Zenodo: DIA_performance_metrics;
Download from zenodo via the following link
Zenodo: TMT_performance_metrics;

Workflow ranks are available at:

Download from zenodo via the following link
Zenodo: DDA_workflow_ranks;
Download from zenodo via the following link
Zenodo: DIA_workflow_ranks;
Download from zenodo via the following link
Zenodo: TMT_workflow_ranks;

view data for M-VIDIA case studies

view data for Thyroid Nodule Classification (TNC) dataset are available at:

Download from zenodo via the following link
Zenodo: discovery set;
Download from zenodo via the following link
Zenodo: retrospective test set;

Single Mouse Oocytes (SMO) data are available at:

Download from zenodo via the following link
Zenodo: proteomics;
Download from zenodo via the following link
Zenodo: transcriptomics;

Pick-up Single-cell Proteomic Analysis (PiSPA) dataset are available at:

Download from zenodo via the following link
Zenodo: single cell DDA;
Download from zenodo via the following link
Zenodo: single cell DIA;

Introduction of the webserver


Introdcution of the webserver:

This toolkit includes 6 main function panels including Introduction, Benchmarking, Suggestion & DEA, MVIDIA, Data and Help. In the [Introduction] panel, we first present the abstract of this work, then we show an overview of the proteomics data differential expression analysis workflow and the options available in each workflow steps. We later present the abstract of our newly design multi-view learning framework for proteomics data analysis together with the pipeline of the MVIDIA. At last, the Acknowledgement and Citation information are shown.


In the [Benchmarking] panel,

users can view our benchmarking results including checking the rank position and performance metric values of a workflow tested with our benchmaking datasets.


In the [Suggestion & DEA] panel,

we provide the tools for suggesting optimal workflows and conduct differential expression analysis with the suggested workflow directly.


In the [MVIDIA] panel,

we provide three functional modules for completing proteomics data analysis tasks such as DEA, patient diagnosis and cell clustering.


In the [Data] panel,

the user can get the links where raw proteomics data are available. The raw quantification results of the raw data, the extracted expression matrices and our benchmarking results can be downloaded.

The data used in our case studies in our MVIDIA paper are provided.

We also provide the link for downloading our offline toolkit with the same function as the webserver.


In the [Help] panel,

We introduce the webserver and show what the users can do with our toolkit


View benchmarking results


View workflow benchmarking results:
Step 1: Choosing an interested setting, e.g., label-free DDA data quantified with FragPipe, click the item of DDA_LFQ-FragPipe.

There are 7852, 7852, 6284, 6284, 4720 and 1568 workflows under settings of LFQ_DDA-FragPipe, LFQ_DDA-Maxquant, LFQ_DIA-DIANN, LFQ_DIA-Spectronaut, TMT-FragPipe and TMT-Maxquant.


Step 2: Filtering workflows, e.g., check the options in the 4 checkboxes that you are interested in, or by using a key word.

Drag the scroll bar below the table can check the ranking with a specific metric. The ranking of workflows is based-on the average ranking based on the five performance metrics (the column of avg_rank_mean).


Step 3: Click one row of the table to check the performance distributions testing on different datasets.

12 DDA datasets, 7 DIA datasets and 5 TMT datasets were used to evaluate the performances of workflows, the boxplot shows the performance distributions of the 5 metrics.

Only one row is permitted to be select each time, click the same row twice can cancel the selection.


Step 4: View the details of performance distrubutions of the five metrics in the right bottom figure.

Optimal workflow recommendation & DEA


Workflow suggestion and testing:
Step 1: Choose data type that the user has.

Lable-free DDA (DDA), label free DIA (DIA) and TMT data are supported. Choose the one correponding to the data you have.


Step 2: Choose a quantification platform.

For DDA data, FragPipe and Maxquant are supported. For DIA data, DIA-NN and Spectronaut are supported. For TMT data, again, FragPipe and Maxquant are supported. Just choose the one that you used to quantify your proteomics data.


Step 3: Choose to suggest single workflow or try the ensemble inference.

If single workflow is preferred. The user should choose whether the have some preferred options in the workflow step. If yes, then check Yes, and select the options in below checkboxes, otherwise select No.

For DDA and DIA data, our server also support to suggest ensemble inference where multiple workflow are integrated by a p-value integration method. The ens_multi-quant approach is used defaulty.

The imputation method missForest and MLE are quite time-consuming, if they are suggested, we will replace them with MinProb.


Step 4: Click the suggest worklfow button or Try ensemble inference button.

Our server will suggest the top 1st workflow after filtering the workflows with user specified option preference. The potential alternative options will also be shown according to our option comparsion results


Step 5: Upload raw quantification result data, specify thresholds and submit the DEA task.

The user should upload their raw quantification result data for DEA, e.g., the combined_protein.tsv file from FragPipe. The designed file showing the sample condition and group information must be uploaded at the same time. Read the file requirement carefully before uploading your files. The log2FC threshold and adj.pvalue (p-value is ajusted with BH method) threshold should be specified. At last, click DEA button to submit the task.


Step 6: Download DEA results.

After submitting the DEA task, the user should wait for a while till the task being completed. A volcano plot will be generated and a link for the user downloading their DEA results can be found below the volcano plot.


MVIDIA DEA


Applying MVIDIA to DEA tasks:
Step 1: Choose the task "DEA" under MVIDIA in the left menu.
Step 2 Preparing your input data.

We provide a module helping extract views from your proteomics data and use them as inputs to MVIDIA. see "View Extraction" page.


Step 3: Upload view data.

After extracting the view data, Compressing the view files (e.g., dlfq.tsv, maxlfq.tsv, etc.) and the design file (design.tsv) to a zip file named "views.zip". Then, upload the zip file to the server.


Step 4: Setting the parameters.

A.) Please specify the parameters including:

B.) Quantification platform, e.g., FragPipe for DDA data, DIA-NN for DIA data, etc.

C.) proteomics data acqusition mode, DDA or DIA

D.) views used, 3 views are suggested, but please select at least 2 views, e.g., dlfq and maxlfq, etc.

E.) The thesholds for defining differentially expressed proteins (DEPs), including the logFC and the adj.pvalue.

F.) Specifying whehter the data should be normalized and the missing values should be imputed. Avaialbe options please see the pages in Benchmarking.

G.) Specify the group label that you are willing to compare, e.g., A, B.

H.) Choose the multi-view DEA method, MVIDIA, Concat or MLE.


Step 5: After specifying the parameters, please click the MVIDIA-DEA button to submit the task.
Step 6: Check DEA results.

After submitting the DEA task, the user should wait for a while till the task being completed. A volcano plot and the a venn plot showing the DEP overlaps obtained by single view methods and the multi-view method will be generated. The result file can be downloaded through the blue link.


MVIDIA patient diagnosis


Applying MVIDIA to patient diagnosis (classification) tasks:
Step 1: Choose the task "Patient Diagnosis" under MVIDIA in the left menu.
Step 2 Preparing your input data.

We provide a module helping extract views from your proteomics data and use them as inputs to MVIDIA. see View Extraction page.


Step 3: Upload view data.

After extracting the view data for both training data and test data, Compressing the view files (e.g., dlfq.tsv, maxlfq.tsv, etc.) and the design file (design.tsv) to two zip files named "train.zip" and "test.zip". Then, upload the zip files to the server.


Step 4: Setting the parameters.

Please specify the parameters including:

A.) Quantification platform, e.g., FragPipe for DDA data, DIA-NN for DIA data, etc.

B.) proteomics data acqusition mode, DDA or DIA

C.) views used, 3 views are suggested, but please select at least 2 views, e.g., dlfq and maxlfq, etc.

D.) The signatures used for patient diagnosis, it is a set of proteins, can be DEPs detected by our MVIDIA-DEA.

!!! please make sure the proteins are in quotes and seperated by ",", e.g., "sp|A0A075B6S2|KVD29_HUMAN,sp|A0A075B6S5|KV127_HUMAN"

E.) Choose the multi-view DEA method, MVIDIA, Concat or MLE.

F.) Specify the label of the positive class


Step 5: After specifying the parameters, please click the "Patient Diagnosis" button to submit the task.
Step 6: Check diagnosis results.

After submitting the patient diagnosis task, the user should wait for a while till the task being completed. A notification will appear showing the running time after the task being completed. The result file can be downloaded through the blue link.


MVIDIA cell clustering


Applying MVIDIA to cell clustering tasks:
Step 1: Choose the task "Cell Clustering" under MVIDIA in the left menu.
Step 2 Preparing your input data.

We provide a module helping extract views from your proteomics data and use them as inputs to MVIDIA. see View Extraction page.


Step 3: Upload view data.

After extracting the view data, Compressing the view files (e.g., dlfq.tsv, maxlfq.tsv, etc.) and the design file (design.tsv) to a zip file named "sc.zip". Then, upload the zip file to the server.


Step 4: Setting the parameters.

Please specify the parameters including:

A.) Quantification platform, e.g., FragPipe for DDA data, DIA-NN for DIA data, etc.

B.) proteomics data acqusition mode, DDA or DIA

C.) views used, 3 views are suggested, but please select at least 2 views, e.g., dlfq and maxlfq, etc.

D.) Cell filtering threshold, per the missing rate.

E.) Specifying the number of clusters.

F.) Specify the hyperparameters for GCCA used for cell embedding.


Step 5: After specifying the parameters, please click the "Cell-Clustering" button to submit the task.
Step 6: Check clustering results.

After submitting the cell clustering task, the user should wait for a while till the task being completed. A notification will appear showing the running time after the task being completed. The result file can be downloaded through the blue link.


Contact Info


Contact:

OpDEA

Any problem about OpDEA please contact Hui Peng:

Email: hui.peng@ntu.edu.sg

or cdph2009@163.com


Or, can email to the corresponding authors:

Jinyan Li: jinyan.li@siat.ac.cn;

Wilson Wen Bin Goh: wilsongoh@ntu.edu.sg


M-VIDIA

Any problem about MVIDIA please contact Hui Peng:

Email: hui.peng@ntu.edu.sg

or cdph2009@163.com


Or, can email to the corresponding author:

Wilson Wen Bin Goh: wilsongoh@ntu.edu.sg