EviCor

Molecular data — global interaction network — drug response


A quick demo of a 2D plot of protein expression vs. enrichment scores. Note that the demo executes in real time, so for robustness it is recommended to let it run without interference.
Correlation between cell line response to alisertib and expression of a gene.
Correlation between expression of a gene and sensitivity to tamoxifen was first discovered in cell lines. It was then validated in the TCGA breast cancer cohort: survival of tamoxifen-treated patients depended on expression of the same gene.
Demo: using expression data for several genes and M0 macrophage count to predict overall survival time in the TCGA skin cutaneous melanoma (SKCM) cohort.

Correlates of drug response
Data exploration
Multivariate models

Intro
FAQ
Demo
Plots
Models
Compatibility
REST API
Citation
Version

EviCor supports simple user requests regarding correlation of molecular profiles in large scale datasets. This may be helpful in a range of contexts, from evaluating agreement between omics platforms to identifying potential markers of drug response.

Data from two major public resources, The Cancer Genome Atlas (TCGA) and The Cancer Cell Line Encyclopedia (CCLE), can be explored as plots of gene, protein, and drug sensitivity profiles, or used to identify correlates of response to anti-cancer drugs. Furthermore, such correlations can be traced to clinical treatment profiles in TCGA. An extra feature of EviCor is network context, which is instrumental in both biomarker discovery and evaluation of candidate molecules.

Refer to this page to get answers to your questions and instructions for some typical tasks.


Demo "2D"
Demo "gene X drug"
Demo "survival"

The EviCor website offers a variety of interactive plots for exploratory data analysis. Plots can be shared via automatically generated URLs.

Available plot types

The plot type is always determined by the data types of the chosen variables. The following table describes the available plot types and their dimensionality.

Plot type	Dimensionality	Description
Bar	1D	Basic bar plot for categorical data
Pie	1D	Alternative representation for categorical data
Histogram	1D/2D	1D: basic histogram; 2D: stacked histogram (by categories)
Venn	2D	Venn diagram for categorical data; shows the intersection of 2 categories
Box	2D/3D	Interactive box plots with whiskers and outliers. In the 3D case (grouped box plot), one of the groups is represented by color
Kaplan-Meier plot	2D/3D	Kaplan-Meier survival plots; numeric and categorical data can be used to define groups
Scatter	2D/3D	2D plot for numerical data. In a 3D plot, the 3rd dimension is represented by data point color and/or shape; data for the 3rd dimension can be either numeric or categorical.

Meta-codes

Data for plotting can be selected by codes (for TCGA), by tissue of origin (for CCLE), or by meta-codes. A meta-code is a synonym for a collection of codes; e.g., the "metastatic" meta-code corresponds to all TCGA codes that represent samples from metastases.

Available data transformations

EviCor allows the user to rescale numerical data. The following transformations are available:

Scale	Description
Original	Keep values as they are stored in EviCor DB
Square root (sqrt)	Apply square root transformation to data
Logarithmic (log)	Apply log transformation to data
Beta	Convert values into beta-scale (methylation data only)
M (M-value)	Convert values into M-values (methylation data only)
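
The Beta and M scales for methylation data are commonly related by a logit transform, M = log2(beta / (1 - beta)). The Python sketch below illustrates that standard relationship only; how EviCor itself treats boundary values (beta of exactly 0 or 1) is not stated here, so the `eps` guard and the function names are assumptions made for this example.

```python
import math


def beta_to_m(beta, eps=1e-6):
    """Convert a methylation beta value (0..1) to an M-value.

    Uses the standard relation M = log2(beta / (1 - beta)).
    eps is a hypothetical guard that keeps beta away from 0 and 1.
    """
    b = min(max(beta, eps), 1.0 - eps)
    return math.log2(b / (1.0 - b))


def m_to_beta(m):
    """Inverse transform: beta = 2^M / (2^M + 1)."""
    return 2.0 ** m / (2.0 ** m + 1.0)


# A half-methylated site maps to M = 0; the two transforms invert each other.
m_half = beta_to_m(0.5)
round_trip = m_to_beta(beta_to_m(0.8))
```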

Plot legends

In many cases plot legends contain additional information. Some of it is descriptive and defined by the user (the chosen data source, etc.); some is calculated by R scripts, such as different types of correlations. Correlations, when available, are calculated for the data on the X and Y axes.

REST API

The EviCor web platform allows users to create plots via the REST API. Refer to the respective tab for more details.

The EviCor web platform can create predictive models from pre-loaded data.

Available models

At the moment, EviCor supports only glmnet models for regression and binomial classification. glmnet is an implementation of the elastic net algorithm. More information on glmnet, including the original publications, is available on the official CRAN page of the package.
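
For orientation, the elastic net objective that glmnet minimizes for the gaussian family can be written as below (notation follows the glmnet documentation, not EviCor itself). Here N is the number of samples, lambda is the overall penalty strength and alpha is the mixing parameter between ridge (alpha = 0) and lasso (alpha = 1):

```latex
\min_{\beta_0,\,\beta}\;
\frac{1}{2N}\sum_{i=1}^{N}\bigl(y_i - \beta_0 - x_i^{\top}\beta\bigr)^2
\;+\;
\lambda\left[\frac{1-\alpha}{2}\,\lVert\beta\rVert_2^2 + \alpha\,\lVert\beta\rVert_1\right]
```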

Predictive variables

EviCor allows the user to combine variables from different data types for better performance. For example, you can try to predict gene expression of TP53 from the gene expression of one set of genes and the copy number of another set of genes (the sets may intersect). Additionally, you can use predictive variables of the same data type (e.g. gene expression) for the same genes but measured on different platforms (e.g. Agilent and Illumina HiSeq v.2). To use all IDs for a chosen platform as predictive variables, type [all] in the respective IDs input.

Transferring significant correlates

Once you have retrieved significant correlations via the "Correlates of drug response" tab, you may use some or all of them as predictive variables. EviCor lets you transfer them to the "Multivariate models" tab using a built-in buffer. You can either add IDs one by one by pressing the "Add to clipboard" button (icon with two documents, one above the other) or add all unique IDs by pressing "Copy all IDs for transfering" in the table header. The buffer stores only unique records. Next, select the predictive variables input on the "Multivariate models" tab, then either select the required IDs in the clipboard window and choose the option to transfer the chosen IDs, or copy all IDs.

Meta-variables

Just like meta-codes, EviCor supports meta-variables. A meta-variable is a mnemonic code that stands for a large set of IDs. As of version 1.3.1, only one meta-variable is supported: [all]. Use it if you want, for example, to test all genes from a certain platform as predictive variables.

Validation parameters

EviCor lets the user withhold part of the data for model validation. To use this option, tick the "Model validation" checkbox; you can then adjust the number of folds in k-fold cross-validation and the portion of samples withheld for validation.
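
As a rough illustration of these two settings, the Python sketch below first sets aside a fraction of samples as a final test set and then divides the remaining samples into k folds. This is only one plausible reading of the options; the actual splitting logic on the server is not documented here, and the function name, the fixed random seed and the rounding rule are assumptions.

```python
import random


def split_samples(sample_ids, validation_fraction=0.1, nfolds=10, seed=0):
    """Withhold a test set, then split the rest into nfolds folds.

    Hypothetical helper: mirrors the 'validation_fraction' and 'nfolds'
    options conceptually, not the server's real implementation.
    """
    rng = random.Random(seed)
    shuffled = list(sample_ids)
    rng.shuffle(shuffled)

    # Samples withheld for independent validation.
    n_test = int(round(len(shuffled) * validation_fraction))
    test_set = shuffled[:n_test]
    training = shuffled[n_test:]

    # Round-robin assignment of the remaining samples to folds.
    folds = [training[i::nfolds] for i in range(nfolds)]
    return test_set, folds


test_set, folds = split_samples(range(100), validation_fraction=0.1, nfolds=10)
```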

Exploring model creation results

After the model has been created successfully, the user sees a multi-tab dialog. This dialog shows the training procedure (refer to the glmnet documentation for an explanation) and the correlation of predicted vs. actual values on the training and testing (if validation was selected) data sets.

Performance metrics

EviCor offers the following performance metrics, which are calculated on the training and testing (if the validation option is selected) data sets:

Metric	Explanation
AIC	Akaike information criterion
BIC	Bayesian information criterion
RSS	Residual sum of squares
Accuracy	Proportion of correctly classified entities at 0.5 threshold
p	Statistical significance (for survival prediction only)
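
Two of these metrics have simple textbook definitions, sketched in Python below. This is an illustration rather than EviCor's own code: the function names are made up for the example, and counting a probability of exactly 0.5 as the positive class is an assumption.

```python
def rss(actual, predicted):
    """Residual sum of squares between actual and predicted values."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted))


def accuracy(labels, probabilities, threshold=0.5):
    """Proportion of correctly classified entities at the given threshold.

    Assumption: probabilities >= threshold are assigned to class 1.
    """
    predicted = [1 if p >= threshold else 0 for p in probabilities]
    correct = sum(1 for y, y_hat in zip(labels, predicted) if y == y_hat)
    return correct / len(labels)


example_rss = rss([1, 2, 3], [1, 2, 5])
example_accuracy = accuracy([1, 0, 1, 0], [0.9, 0.2, 0.4, 0.6])
```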

Using generated models in your pipeline

The generated model can be downloaded by pressing the respective button and loaded into the user's R environment (model name: model). Refer to the glmnet documentation for model usage tips.

Known compatibility issues

This site works best with Google Chrome (v. 71 or later) and Mozilla Firefox (v. 60 or later). It is also compatible with Edge.
Some functions do not work with IE 11.
The site also works with Apple Safari (v. 12.1, macOS Mojave v10.14.6); other versions of Safari may have compatibility issues.

EviCor offers REST API for all functional tabs. Some functions, such as batch jobs, are available only via REST API. All scripts are accessible through https://www.evicor.org/cgi/script_name.

Retrieving correlations

Script name: cor_datatables_json.cgi
Returns: correlations in JSON format
Parameters:

source: mandatory, correlation source (TCGA/CCLE);
datatype: mandatory, data type (e.g. GE, COPY...). Can be set to all;
cohort: mandatory, data cohort, can be set to 'all' for TCGA, must be set to all for CCLE;
platform: mandatory, correlation platform (e.g. Agilent), can be set to all;
screen: mandatory, correlation screen (e.g. GDSC1), can be set to all for CCLE, must be set to all for TCGA;
id: mandatory, gene/protein/pathway/drug name;
fdr: mandatory, FDR threshold, use dot as a decimal separator;
mindrug: mandatory, minimal number of samples treated with drug;
data_columns: mandatory, comma-separated list of columns to retrieve from SQL:

for CCLE: gene,feature,ancova_q_1x,ancova_p_2x_cov1,ancova_p_2x_feature,ancova_q_2x_feature
for TCGA: gene,feature,followup,followup_part,q_drug,q_expr,q_interaction,n_patients,n_treated

filter_columns: mandatory, columns to use for filtering (e.g. column with FDR values);
concat_operator: mandatory if more than one value specified in filter_columns, comma-separated list of logical operators to use for filtering conditions generation. For n columns in filter_columns this list should either contain n-1 operators (applied in order) or one operator (applied to all);
limit_by: mandatory, name of the column by which results are sorted (in ascending order), number of returned results is limited to 1000.

Example: https://www.evicor.org/cgi/cor_datatables_json.cgi?source=TCGA&datatype=GE&cohort=BRCA&platform=IlluminaHiSeq_RNASeqV2&screen=all&id=tamoxifen&fdr=0.005&mindrug=10&data_columns=gene,feature,followup,followup_part,q_drug,q_expr,q_interaction,n_patients,n_treated&filter_columns=q_expr,q_interaction&concat_operator=OR&limit_by=q_interaction
Usage tips: The concat_operator option may be confusing, so here is a short example. Suppose columns col1, col2 and col3 are chosen for filtering and the FDR threshold is 0.05. If concat_operator is set to OR,AND, the resulting condition is (col1<0.05) OR (col2<0.05) AND (col3<0.05). If it is set to AND, the resulting condition is (col1<0.05) AND (col2<0.05) AND (col3<0.05).
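
The rule for combining filter_columns with concat_operator can be sketched in Python as follows. The function name is hypothetical and the code only mimics the condition strings quoted in the usage tips; how the server groups the conditions is not documented here. Note that in standard SQL, AND binds more tightly than OR, which may matter for mixed operator lists.

```python
def build_filter_condition(columns, operators, fdr):
    """Build a filter condition string from columns and logical operators.

    Hypothetical helper. For n columns, 'operators' must contain either
    n-1 operators (applied in order) or one operator (applied to all).
    """
    if not columns:
        raise ValueError("at least one filter column is required")

    if len(columns) == 1:
        operators = []
    elif len(operators) == 1:
        # One operator is reused between every pair of columns.
        operators = list(operators) * (len(columns) - 1)
    elif len(operators) != len(columns) - 1:
        raise ValueError("expected 1 or n-1 operators for n columns")

    parts = [f"({columns[0]}<{fdr})"]
    for op, col in zip(operators, columns[1:]):
        parts.append(op)
        parts.append(f"({col}<{fdr})")
    return " ".join(parts)


mixed = build_filter_condition(["col1", "col2", "col3"], ["OR", "AND"], 0.05)
uniform = build_filter_condition(["col1", "col2", "col3"], ["AND"], 0.05)
```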

Creating plots

Script name: rplot.cgi
Returns: plot file name and meta-info in JSON format
Parameters:

type: mandatory, plot type (e.g. scatter);
source: mandatory, data source (TCGA or CCLE);
cohort: mandatory, used cohort;
datatypes: mandatory, list of datatypes separated by commas;
platforms: mandatory, list of platforms separated by commas;
ids: mandatory, list of genes/antibodies/pathways/drugs names, must match length of datatypes/platforms, if platform has no ids - id should be skipped (e.g. tp53,,mdm2);
codes: mandatory, TCGA codes (01, 06...) or metacodes (all, normal...) for TCGA, all or tissue name for CCLE;
scales: mandatory for most plot types, axis scales (original...);
surv_period: optional, used for KM plots only; the part of the follow-up interval to use, 1 = 100% = full interval, 0.25 = 25% etc.; default is 1.

Example: https://www.evicor.org/cgi/rplot.cgi?type=bar&source=TCGA&cohort=BRCA&datatypes=CLIN&platforms=vital_status&ids=%2C&codes=&scales=
Usage tips: This script returns the plot file name and plot metadata. The plot is either an HTML page (plotly plots, used for the majority of plot types) or a PNG image (for Venn diagrams). If only the plot name is returned, this indicates an error. The REST API does not provide error explanations.
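
A request URL for rplot.cgi can be assembled programmatically. The Python sketch below only builds the query string and checks that datatypes, platforms and ids have matching lengths; it sends no request. The function name is hypothetical, and the example values for codes and scales (including the comma-separated form of scales) are illustrative assumptions rather than verified inputs.

```python
from urllib.parse import urlencode

RPLOT_URL = "https://www.evicor.org/cgi/rplot.cgi"


def build_rplot_url(plot_type, source, cohort, datatypes, platforms, ids,
                    codes="", scales=""):
    """Assemble an rplot.cgi URL from per-variable lists (hypothetical helper).

    Use an empty string in 'ids' for a platform that has no ids,
    which yields a skipped entry such as 'tp53,,mdm2'.
    """
    if not (len(datatypes) == len(platforms) == len(ids)):
        raise ValueError("datatypes, platforms and ids must have equal length")

    params = {
        "type": plot_type,
        "source": source,
        "cohort": cohort,
        "datatypes": ",".join(datatypes),
        "platforms": ",".join(platforms),
        "ids": ",".join(ids),
        "codes": codes,
        "scales": scales,
    }
    # urlencode percent-encodes commas as %2C.
    return RPLOT_URL + "?" + urlencode(params)


url = build_rplot_url(
    "scatter", "TCGA", "BRCA",
    datatypes=["GE", "GE"],
    platforms=["illuminahiseq_rnaseq", "illuminahiseq_rnaseq"],
    ids=["TP53", "MDM2"],
    codes="all",                 # illustrative metacode
    scales="original,original",  # assumed format: one scale per axis
)
print(url)
```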

Creating models

Script name: model_predict.cgi
Returns: model name (without any file extensions)
Parameters:

method: mandatory, method for creating models, at the moment only glmnet is supported;
source: mandatory, data source for model creation (TCGA/CCLE);
cohort: mandatory, cohort for model creation;
multiopt: mandatory, TCGA codes or metacodes (for TCGA) or CCLE tissues;
xdatatypes: mandatory, comma-separated list of data types for independent variables;
xplatforms: mandatory, comma-separated list of platforms for independent variables;
xids: mandatory, lists of ids (if applicable) for independent variables, empty space if ids do not exist for the chosen platforms (refer to the examples for format); can be set to [all];
rdatatype: mandatory, datatype for the dependent (response) variable;
rplatform: mandatory, platform for the dependent (response) variable;
rid: mandatory, response variable id (if applicable, otherwise empty);
family: mandatory for glmnet, glmnet family (gaussian/cox...), refer to glmnet documentation for further explanation;
measure: mandatory for glmnet with cross-validation, which loss should be used to pick the best model, refer to glmnet documentation;
alpha: mandatory for glmnet, mixing parameter describing balance between lasso and ridge regression (1 - lasso, 0 - ridge);
nlambda: mandatory for glmnet, number of lambda values, refer to glmnet documentation;
minlambda: mandatory for glmnet, min lambda value, refer to glmnet documentation;
validation: mandatory, if cross-validation should be used (binary flag);
validation_fraction: mandatory for glmnet with validation flag set to TRUE, numeric, defines share of records to be used for independent validation, use dot as a decimal separator;
nfolds: mandatory for glmnet with validation flag set to TRUE, number of folds for N-folds cross-validation (see cv.glmnet in glmnet documentation);
standardize: mandatory for glmnet, if variables should be standardized (binary flag);
stat_file: mandatory, filename to which performance metrics should be saved (use auto to automatically assign the name - see the table below);
extended_output: mandatory, if model creation parameters should be saved to the specified stat_file;
header: mandatory, if specified stat_file should have header; recommended to be set to TRUE. You can collect information on many models in one file, in this case specify this parameter as TRUE only for the first query and make sure that all models belong to the same family and validation option is always the same - so columns are the same (also, extended_output should be set the same for all the models).

Example: https://www.evicor.org/cgi/model_predict.cgi?method=glmnet&source=TCGA&cohort=BRCA&rdatatype=GE&rplatform=illuminahiseq_rnaseq&rid=TP53&xdatatypes=GE&xplatforms=illuminahiseq_rnaseq&xids=%5BA1CF%7CA2BP1%7CA4GALT%7CA4GNT%7CAAA1%7CAADAC%7CAADACL2%7CAADACL3%7CAADACL4%7CAADAT%7CAAK1%7CAANAT%7CAARS2%7CAASS%7CABCA4%7CABI2%7CABP1%7CABR%7CACAA1%7CACAN%7CACTA1%7CACTN1%7CADA%7CADAM7%7CADAP1%7CADAT2%7CADAT3%7CAFM%7CAGBL4%7CAHDC1%7CFOXA2%7CFOXB2%7CBCL2%7CTGFBI%7CTGFBRAP1%7CPTGFR%7CARID1A%7CARID1B%7CARID2%7CARID3B%7CARID4A%7CARID4B%7CCACNA1B%7CCACNA1C%7CCACNA2D1%7CCACNA2D2%7CCACNG2%7CCACNA2D4%7CCACNG3%7CCACNG4%7CCACNG7%7CCACNG8%7CVEGFC%7CHRAS%7CKRAS%7CNRAS%7CPDGFA%7CPDGFC%7CPDGFD%7CMDM1%7CMDM2%7CRBL1%7CRBL2%7CRBP1%7CCTRB1%7CPRB1%5D&multiopt=cancer&family=gaussian&measure=deviance&standardize=false&alpha=0.5&nlambda=10&minlambda=0.01&validation=true&nfolds=10&validation_fraction=0.1&stat_file=auto&extended_output=false&header=true
Usage tips: Using just model name, you can access graphical interpretation of the created model, RData model file, coefficients (in JSON format) and performance metrics (in JSON and csv format). If you have received value 'model578045416464' from the script, you can access the following files in https://www.evicor.org/pics/plots/ :

File name	Meaning
model578045416464.RData	glmnet model for R environment
coeff.model578045416464.json	Model coefficients (with weights) in JSON format
model578045416464_model.png	Graphical abstract of the created model
model578045416464_training.png	Graphical abstract on training step
model578045416464_validation.png	Graphical abstract on validation step (if set)
model578045416464.csv	Performance metrics and some additional information in csv format with header
perf.model578045416464.json	Performance metrics in JSON format

If the error for this model was thrown - it is stored in file model578045416464_error.json.
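
The file naming scheme above can be captured in a small helper. The Python sketch below derives the URLs from a returned model name; the function and dictionary key names are made up for this example, and it assumes that the error file is stored in the same directory as the other files.

```python
PLOTS_BASE = "https://www.evicor.org/pics/plots/"

# File name patterns taken from the table above; {m} is the model name.
FILE_PATTERNS = {
    "rdata": "{m}.RData",
    "coefficients_json": "coeff.{m}.json",
    "model_png": "{m}_model.png",
    "training_png": "{m}_training.png",
    "validation_png": "{m}_validation.png",
    "metrics_csv": "{m}.csv",
    "metrics_json": "perf.{m}.json",
    "error_json": "{m}_error.json",  # assumed to live in the same directory
}


def model_file_urls(model_name):
    """Return a dict of result file URLs for a model name (hypothetical helper)."""
    return {key: PLOTS_BASE + pattern.format(m=model_name)
            for key, pattern in FILE_PATTERNS.items()}


urls = model_file_urls("model578045416464")
print(urls["coefficients_json"])
```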

Batch models creation

Script name: models_from_correlations_batch.cgi
Returns: an error or nothing. The response arrives only after all models have been created, so do not call this script synchronously; a link to the results will be sent to the specified email address.
Parameters:

source: mandatory, data source (TCGA/CCLE) for both correlation retrieval and model creation;
datatype: mandatory, same as datatype for cor_datatables_json.cgi; however, several comma-separated values are allowed, number should be the same as number of predictor types;
cohort: mandatory, same as cohort for cor_datatables_json.cgi; may be a comma-separated list, length matches with datatype parameter list;
screen: mandatory, same as screen for cor_datatables_json.cgi; similar to datatype and cohort may be a list;
id: mandatory, same as id for cor_datatables_json.cgi; similar to previous parameters;
fdr: mandatory, same as fdr for cor_datatables_json.cgi; similar to previous parameters;
mindrug: mandatory, same as mindrug for cor_datatables_json.cgi; unlike previous parameters, this one is always a single value;
columns: mandatory, same as columns for cor_datatables_json.cgi;
method: mandatory, same as method for model_predict.cgi;
model_cohort: mandatory, same as cohort for model_predict.cgi;
multiopt: mandatory, same as multiopt for model_predict.cgi;
xdatatypes: mandatory, same as xdatatypes for model_predict.cgi;
xplatforms: mandatory, same as xplatforms for model_predict.cgi;
additional_xids: optional (recommended), ids that should be passed to glmnet even if they do not pass the filters; if there are no additional xids, leave blank;
rdatatype: mandatory, same as rdatatype for model_predict.cgi;
rplatform: mandatory, same as rplatform for model_predict.cgi;
rid: mandatory, same as rid for model_predict.cgi;
family: mandatory for glmnet, same as family for model_predict.cgi;
measure: mandatory for glmnet, same as measure for model_predict.cgi;
alpha: mandatory for glmnet, same as alpha for model_predict.cgi;
nlambda: mandatory for glmnet, same as nlambda for model_predict.cgi;
minlambda: mandatory for glmnet, same as minlambda for model_predict.cgi;
validation: mandatory, same as validation for model_predict.cgi;
validation_fraction: mandatory for glmnet with validation flag set to TRUE, same as validation_fraction for model_predict.cgi;
nfolds: mandatory for glmnet with validation flag set to TRUE, same as nfolds for model_predict.cgi;
standardize: mandatory, same as standardize for model_predict.cgi;
iter: mandatory, desired number of models;
stat_file: mandatory, name of the file (without extension) to save your data in (try to give your stat_file a unique name);
extended_output: mandatory, binary flag, if set to TRUE - some additional info will be written into stat_file (such as source, rdatatype etc.);
mail: mandatory, provide the correct email, links with the results will be sent to it when your batch job is done;

Example: https://www.evicor.org/cgi/models_from_correlations_batch.cgi?source=TCGA&datatype=GE&cohort=LUAD&platform=all&screen=all&id=gemcitabine&fdr=0.05&mindrug=10&columns=gene,feature,followup_part,interaction,drug,expr,n_patients,n_treated,followup,q&iter=3&model_cohort=LUAD;method=glmnet&rdatatype=CLIN&rplatform=os&xdatatypes=GE&xplatforms=illuminahiseq_rnaseqv2&additional_xids=&multiopt=01&family=cox&measure=deviance&standardize=false&alpha=1&nlambda=10&minlambda=0.01&validation=true&nfolds=10&validation_fraction=0.1&stat_file=/your_file/&mail=/your_email/
Usage tips: Always specify a unique name for your output file! Files are written in append mode, so if you (or someone else) have used the same file name before, you will get "mixed" results. When the job is done, the results will be available at https://www.evicor.org/pics/plots/your_file.csv.
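
One simple way to avoid name collisions is to append a random suffix to the stat_file value. The Python sketch below is an assumption-laden illustration: the helper names and the prefix are hypothetical, and it is not confirmed which characters or name lengths the server accepts.

```python
import uuid


def unique_stat_file(prefix="batch"):
    """Return a stat_file name with a random hex suffix (hypothetical helper)."""
    return f"{prefix}_{uuid.uuid4().hex}"


def results_url(stat_file):
    """URL where the CSV results are expected once the batch job is done."""
    return f"https://www.evicor.org/pics/plots/{stat_file}.csv"


stat_file = unique_stat_file("luad_gemcitabine")
print(results_url(stat_file))
```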

Using JS API

All the aforementioned functions are available through the API specified in the drugs.js module. Some parameters (like columns) are specified in druggable_config.js.

Datatypes, platforms and screens

Below you can find valid names of datatypes, platforms, variables to use with REST API.

Datatypes
Platforms
Independent variables
Dependent variables

Network-based biomarker discovery and validation:

Marcela Franco, Ashwini Jeggari, Sylvain Peuget, Franziska Böttger, Galina Selivanova, Andrey Alexeyenko. Prediction of response to anti-cancer drugs becomes robust via network integration of molecular data. Sci Rep 9, 2379 (2019). doi: 10.1186/1471-2105-13-226.

EviNet web resource:

Ashwini Jeggari, Zhanna Alekseenko, José Dias, Johan Ericson, and Andrey Alexeyenko. EviNet: a web platform for network enrichment analysis with flexible definition of gene sets. Nucleic Acids Res 2018 Jul 2;46(W1):W163-W170. doi: 10.1002/1878-0261.12350.

Version history:
1.0.0 (25th of March, 2020) - stable release: exploring correlations, creating and sharing plots, creating and downloading predictive models.
1.1.0 (8th of April, 2021) - stable release: improved functionality, model comparison, REST API.
1.2.0 (21st of June, 2021) - experimental version: improved functionality, more responsive design.
1.3.0 (10th of September, 2021) - stable version: improved REST API, improved performance, new data, style changes, new tools for working with plots and models.
1.3.1 (9th of November, 2021) - stable version: improved error handling, visual adjustments, updated documentation, bugfixes.
1.3.2 (22nd of December, 2021) - stable version: improved KM plots, updated legend format for all plots, model viewer window now uses plotly, visual improvements, bugfixes
1.3.3 (20th of January, 2022) - stable version: 3D boxplots, visual improvements, bugfixes, extended help
