EviCor

Molecular data — global interaction network — drug response


EviCor supports simple user queries about correlations between molecular profiles in large-scale datasets. This can be helpful in a range of contexts, from evaluating agreement between omics platforms to identifying potential markers of drug response.

Data from two major public resources, The Cancer Genome Atlas (TCGA) and The Cancer Cell Line Encyclopedia (CCLE), can be explored as plots of gene, protein, and drug sensitivity profiles or used to identify correlates of response to anti-cancer drugs. Such correlations can further be traced to clinical treatment profiles in TCGA. An extra feature of EviCor is network context, which is instrumental both in biomarker discovery and in the evaluation of candidate molecules.

Refer to this page to get answers to your questions and instructions for some typical tasks.


Demo "2D"
Demo "gene X drug"
Demo "survival"

The EviCor website offers a variety of interactive plots for exploratory data analysis. Plots can be shared via automatically generated URLs.


Available plot types

The plot type is always determined by the data types of the chosen variables. The following table describes the available plot types and their dimensionality.

| Plot type | Dimensionality | Description |
| --- | --- | --- |
| Bar | 1D | Basic bar plot for categorical data |
| Pie | 1D | Alternative representation for categorical data |
| Histogram | 1D/2D | 1D: basic histogram; 2D: stacked histogram (by categories) |
| Venn | 2D | Venn diagrams for categorical data, showing the intersection of 2 categories |
| Box | 2D | Interactive box plots with whiskers and outliers |
| Kaplan-Meier plot | 2D/3D | Kaplan-Meier survival plots; numeric and categorical data can be used to define groups |
| Scatter | 2D/3D | 2D plot for numerical data. In a 3D plot, the 3rd dimension is represented by data point color and/or shape; data for the 3rd dimension can be either numeric or categorical. |

Meta-codes

Data for plotting can be selected by codes (for TCGA), by tissue of origin (for CCLE), or by meta-codes. A meta-code is a synonym for a collection of codes; e.g. the "metastatic" meta-code corresponds to all TCGA codes that represent samples from metastases.

Available data transformations

EviCor allows the user to scale data when working with numerical data types. The following transformations are available:

| Scale | Description |
| --- | --- |
| Original | Keep numbers as they are stored in the EviCor DB |
| Square root (sqrt) | Apply a square root transformation to the data |
| Logarithmic (log) | Apply a log transformation to the data |
| Beta | Convert values into beta-scale (methylation data only) |
| M (M-value) | Convert values into M-values (methylation data only) |
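
As an illustration of the two methylation scales, the sketch below uses the commonly cited logit relationship between beta-values and M-values, M = log2(beta / (1 - beta)). Whether EviCor applies exactly this formula (for instance, with an offset for values at 0 or 1) is an assumption; the function names are hypothetical and not part of EviCor.

```python
import math


def beta_to_m(beta: float) -> float:
    """Convert a methylation beta-value (strictly between 0 and 1) to an M-value."""
    if not 0.0 < beta < 1.0:
        raise ValueError("beta must lie strictly between 0 and 1")
    return math.log2(beta / (1.0 - beta))


def m_to_beta(m: float) -> float:
    """Convert an M-value back to the beta scale (inverse of beta_to_m)."""
    return 2.0 ** m / (2.0 ** m + 1.0)
```

A beta-value of 0.5 (half of the signal methylated) maps to an M-value of 0, and the two functions are inverses of each other.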

Plot legends

In many cases plot legends contain additional information. Some of it is descriptive and follows from the user's choices (chosen data source etc.); some is calculated by R scripts, such as different types of correlations. Correlations, when available, are calculated between the data on the X and Y axes.

REST API

EviCor web platform allows user to create plots using REST API. Refer to the respective tab for more details.
EviCor web platform can create predictive models from pre-loaded data.

Available models

At the moment EviCor supports only glmnet models for regression and binomial classification. glmnet is an implementation of the elastic net algorithm. You can find more information on glmnet, including the original publications, on the official CRAN page of the package.
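
For orientation, the elastic net objective that glmnet minimizes for the Gaussian family can be written as below, following the glmnet documentation. The notation (N samples, responses y_i, predictor vectors x_i, intercept \beta_0, coefficients \beta) is introduced here for illustration; \alpha is the lasso/ridge mixing parameter and \lambda the penalty strength.

```latex
\min_{\beta_0,\,\beta}\;
\frac{1}{2N}\sum_{i=1}^{N}\bigl(y_i - \beta_0 - x_i^{\top}\beta\bigr)^2
\;+\;
\lambda\left[\frac{1-\alpha}{2}\,\lVert\beta\rVert_2^2 + \alpha\,\lVert\beta\rVert_1\right]
```

With \alpha = 1 the penalty reduces to the lasso, and with \alpha = 0 to ridge regression.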

Predictive variables

EviCor allows the user to combine variables from different data types for better performance. For example, you can try to predict gene expression of TP53 based on gene expression of a certain set of genes and copy number of another set of genes (these sets can intersect). Additionally, you can have predictive variables of the same data type (e.g. gene expression) for the same genes, but measured using different platforms (e.g. Agilent and Illumina HiSeq v.2).

Transferring significant correlates

Once you have retrieved significant correlations via the "Correlates of drug response" tab, you may use some or all of them as predictive variables. EviCor lets you transfer them to the "Multivariate models" tab using a built-in buffer. You can either add IDs one by one by pressing the "Add to clipboard" button (icon with two documents one above another) or add all unique IDs by pressing "Copy all IDs for transfering" in the table header. The buffer stores only unique records. Next, select the predictive variables input on the "Multivariate models" tab, then either pick the required IDs in the clipboard window and choose the option to transfer the chosen IDs, or copy all IDs.

Meta-variables

In addition to meta-codes, EviCor supports meta-variables. A meta-variable is a mnemonic code that stands for a large set of IDs. As of version 1.3.1, only one meta-variable is supported: all. Use it if you want, for example, to test all genes from a certain platform as predictive variables.

Validation parameters

EviCor lets the user withhold part of the data for model validation. To use this option, tick the "Model validation" checkbox; you can then adjust the number of folds for k-fold cross-validation and the portion of samples withheld for validation.
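
The interplay of the two settings can be sketched as follows: a fraction of the samples is set aside as a final test set, and the remaining samples are divided into folds for cross-validation. This is only an illustration of the general scheme; how EviCor actually assigns samples is not documented here, and the function name and defaults are hypothetical.

```python
import random


def split_samples(sample_ids, validation_fraction=0.1, nfolds=10, seed=0):
    """Split samples into a training set, a withheld test set, and CV folds.

    Returns (train, test, folds), where folds partitions the training set.
    """
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    ids = list(sample_ids)
    rng.shuffle(ids)
    n_test = round(len(ids) * validation_fraction)
    test, train = ids[:n_test], ids[n_test:]
    # Deal training samples into nfolds groups of near-equal size.
    folds = [train[i::nfolds] for i in range(nfolds)]
    return train, test, folds
```

With 100 samples, a withheld fraction of 0.1, and 10 folds, this yields 10 test samples and 10 folds of 9 training samples each.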

Exploring model creation results

After a model has been created successfully, the user sees a multi-tab dialog. The dialog shows the training procedure (refer to the glmnet documentation for an explanation) and the correlation of predicted vs. actual values on the training and, if validation was selected, testing data sets.

Performance metrics

EviCor calculates a number of performance metrics on the training and testing (if the validation option is selected) data sets. The following metrics are available:

| Metric | Explanation |
| --- | --- |
| AIC | Akaike information criterion |
| BIC | Bayesian information criterion |
| RSS | Residual sum of squares |
| Accuracy | Proportion of correctly classified entities at 0.5 threshold |
| p | Statistical significance (for survival prediction only) |
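
Two of these metrics are simple enough to illustrate directly. The sketch below computes RSS for a regression model and accuracy at a 0.5 threshold for a binomial model. It is not EviCor's implementation; in particular, assigning a probability of exactly 0.5 to the positive class is an assumption, and the function names are hypothetical.

```python
def rss(actual, predicted):
    """Residual sum of squares between observed and predicted values."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted))


def accuracy(labels, probabilities, threshold=0.5):
    """Share of samples whose thresholded probability matches the 0/1 label."""
    predicted = [1 if p >= threshold else 0 for p in probabilities]
    correct = sum(1 for y, y_hat in zip(labels, predicted) if y == y_hat)
    return correct / len(labels)
```

For example, predictions [1, 2, 5] against observations [1, 2, 3] give an RSS of 4.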

Using generated models in your pipeline

The generated model can be downloaded by pressing the respective button and loaded into user's R environment (model name: model). Refer to glmnet documentation for model usage tips.
Known compatibility issues:
This site works best with Google Chrome (v. 71 or later) and Mozilla Firefox (v. 60 or later). It is also compatible with Edge.
Some functions do not work with IE 11.
The site also works with Apple Safari (v. 12.1, macOS Mojave v10.14.6), other versions of Safari browser can have compatibility issues.

EviCor offers REST API for all functional tabs. Some functions, such as batch jobs, are available only via REST API. All scripts are accessible through https://www.evicor.org/cgi/script_name.


Retrieving correlations

Script name: cor_datatables_json.cgi
Returns: correlations in JSON format
Parameters:
source: mandatory, correlation source (TCGA/CCLE);
datatype: mandatory, data type (e.g. GE, COPY...). Can be set to 'all';
cohort: mandatory, data cohort, can be set to 'all' for TCGA, must be set to 'all' for CCLE;
platform: mandatory, correlation platform (e.g. Agilent), can be set to 'all';
screen: mandatory, correlation screen (e.g. GDSC1), can be set to 'all' for CCLE, must be set to 'all' for TCGA;
id: mandatory, gene/protein/pathway/drug name;
fdr: mandatory, FDR threshold, use dot as decimal separator;
mindrug: mandatory, minimal number of samples treated with drug;
data_columns: mandatory, comma-separated list of columns to retrieve from SQL:
for CCLE: gene,feature,ancova_p_1x,ancova_p_2x_cov1,ancova_p_2x_feature,ancova_q_2x_feature,ancova_q_2x_feature
for TCGA: gene,feature,followup_part,interaction,drug,expr,n_patients,n_treated,followup,q
filter_columns: mandatory, columns to use for filtering (e.g. column with FDR values);
concat_operator: mandatory if more than one value specified in filter_columns, comma-separated list of logical operators to use for filtering conditions generation. For n columns in filter_columns this list should either contain n-1 operators (applied in order) or one operator (applied to all).
Example:
Usage tips: The concat_operator option may be confusing; the following short example is meant to clarify it. Suppose columns col1, col2 and col3 are chosen for filtering and the FDR threshold is 0.05. If concat_operator is set to OR,AND, the resulting condition is: (col1<0.05) OR (col2<0.05) AND (col3<0.05). Setting this parameter to AND applies that operator between all columns and results in the following filtering condition: (col1<0.05) AND (col2<0.05) AND (col3<0.05).
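
The rule for combining filter_columns with concat_operator can be sketched in a few lines. This is a client-side illustration of the documented behavior (n-1 operators applied in order, or one operator applied to all), not the server's code; the function name is hypothetical.

```python
def build_filter(columns, operators, fdr):
    """Build the filtering condition string from columns and logical operators.

    operators must contain either len(columns) - 1 entries (applied in order)
    or a single entry (applied between all columns).
    """
    if not columns:
        raise ValueError("at least one filter column is required")
    if len(operators) == 1:
        operators = list(operators) * (len(columns) - 1)
    if len(operators) != len(columns) - 1:
        raise ValueError("expected one operator or len(columns) - 1 operators")
    parts = [f"({columns[0]}<{fdr})"]
    for op, col in zip(operators, columns[1:]):
        parts.append(op)
        parts.append(f"({col}<{fdr})")
    return " ".join(parts)
```

Calling build_filter(["col1", "col2", "col3"], ["OR", "AND"], 0.05) produces the mixed condition, while passing ["AND"] joins all three columns with AND.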


Creating plots

Script name: rplot.cgi
Returns: plot file name
Parameters:
type: mandatory, plot type (e.g. scatter);
source: mandatory, data source (TCGA or CCLE);
cohort: mandatory, used cohort;
datatypes: mandatory, list of datatypes separated by commas;
platforms: mandatory, list of platforms separated by commas;
ids: mandatory, list of genes/antibodies/pathways/drugs names, must match length of datatypes/platforms, if platform has no id - id should be skipped (e.g. 'tp53,,mdm2');
codes: mandatory, TCGA codes (01, 06...) or metacodes (all, healthy...) for TCGA, 'all' or tissue name for CCLE;
scales: mandatory for most plot types, axis scales (original...).
Example: https://www.evicor.org/cgi/rplot.cgi?type=bar&source=TCGA&cohort=BRCA&datatypes=CLIN&platforms=vital_status&ids=%2C&codes=&scales=
Usage tips: This script returns the plot file name and plot metadata. The plot is either an HTML page (plotly plots, for the majority of plot types) or a PNG image (for Venn diagrams). If only the plot name is returned, an error has occurred. The REST API does not provide error explanations.
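
A request URL such as the example above can be assembled with Python's standard library instead of being encoded by hand. The sketch below only builds the URL and sends nothing; the helper names script_url and join_ids are hypothetical, and the parameter values are the ones from the documented example.

```python
from urllib.parse import urlencode

BASE_URL = "https://www.evicor.org/cgi/"


def script_url(script_name, params):
    """Return the full URL of an EviCor script with URL-encoded parameters."""
    return BASE_URL + script_name + "?" + urlencode(params)


def join_ids(ids):
    """Join ids with commas; None marks a platform without an id (left empty)."""
    return ",".join("" if i is None else i for i in ids)


example = script_url("rplot.cgi", {
    "type": "bar",
    "source": "TCGA",
    "cohort": "BRCA",
    "datatypes": "CLIN",
    "platforms": "vital_status",
    "ids": ",",
    "codes": "",
    "scales": "",
})
```

Here join_ids(["tp53", None, "mdm2"]) gives 'tp53,,mdm2', the format described for the ids parameter.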


Creating models

Script name: model_predict.cgi
Returns: model name (without any file extensions)
Parameters:
method: mandatory, method for creating models, at the moment only 'glmnet' is supported;
source: mandatory, data source for model creation (TCGA/CCLE);
cohort: mandatory, cohort for model creation;
multiopt: mandatory, TCGA codes or metacodes (for TCGA) or CCLE tissues;
xdatatypes: mandatory, comma-separated list of data types for independent variables;
xplatforms: mandatory, comma-separated list of platforms for independent variables;
xids: mandatory, lists of ids (if applicable) for independent variables, empty space if ids do not exist for the chosen platforms (refer to the examples for format);
rdatatype: mandatory, datatype for dependent (response) variable;
rplatform: mandatory, platform for dependent (response) variable;
rid: mandatory, response variable id (if applicable, otherwise empty);
family: mandatory for glmnet, glmnet family (gaussian/cox...), refer to glmnet documentation for further explanation;
measure: mandatory for glmnet with cross-validation, which loss should be used to pick the best model, refer to glmnet documentation;
alpha: mandatory for glmnet, mixing parameter describing balance between lasso and ridge regression (1 - lasso, 0 - ridge);
nlambda: mandatory for glmnet, number of lambda values, refer to glmnet documentation;
minlambda: mandatory for glmnet, min lambda value, refer to glmnet documentation;
validation: mandatory, if cross-validation should be used (binary flag);
validation_fraction: mandatory for glmnet with validation flag set to TRUE, numeric, defines share of records to be used for independent validation, use dot as decimal separator;
nfolds: mandatory for glmnet with validation flag set to TRUE, number of folds for N-folds cross-validation (see cv.glmnet in glmnet documentation);
standardize: mandatory for glmnet, if variables should be standardized (binary flag);
stat_file: mandatory, filename to which performance metrics should be saved (use auto to automatically assign the name - see the table below);
extended_output: mandatory, if model creation parameters should be saved to the specified stat_file;
header: mandatory, if specified stat_file should have header; recommended to be set to true. You can collect information on many models in one file, in this case specify this parameter as true only for the first query and make sure that all models belong to the same family and validation option is always the same - so columns are the same (also, extended_output should be set the same for all the models).
Example: https://www.evicor.org/cgi/model_predict.cgi?method=glmnet&source=TCGA&cohort=BRCA&rdatatype=GE&rplatform=illuminahiseq_rnaseq&rid=TP53&xdatatypes=GE&xplatforms=illuminahiseq_rnaseq&xids=%5BA1CF%7CA2BP1%7CA4GALT%7CA4GNT%7CAAA1%7CAADAC%7CAADACL2%7CAADACL3%7CAADACL4%7CAADAT%7CAAK1%7CAANAT%7CAARS2%7CAASS%7CABCA4%7CABI2%7CABP1%7CABR%7CACAA1%7CACAN%7CACTA1%7CACTN1%7CADA%7CADAM7%7CADAP1%7CADAT2%7CADAT3%7CAFM%7CAGBL4%7CAHDC1%7CFOXA2%7CFOXB2%7CBCL2%7CTGFBI%7CTGFBRAP1%7CPTGFR%7CARID1A%7CARID1B%7CARID2%7CARID3B%7CARID4A%7CARID4B%7CCACNA1B%7CCACNA1C%7CCACNA2D1%7CCACNA2D2%7CCACNG2%7CCACNA2D4%7CCACNG3%7CCACNG4%7CCACNG7%7CCACNG8%7CVEGFC%7CHRAS%7CKRAS%7CNRAS%7CPDGFA%7CPDGFC%7CPDGFD%7CMDM1%7CMDM2%7CRBL1%7CRBL2%7CRBP1%7CCTRB1%7CPRB1%5D&multiopt=cancer&family=gaussian&measure=deviance&standardize=false&alpha=0.5&nlambda=10&minlambda=0.01&validation=true&nfolds=10&validation_fraction=0.1&stat_file=auto&extended_output=false&header=true
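
In the example above, the xids value decodes to a bracketed list of gene names separated by "|" (%5B, %7C and %5D are the URL-encoded forms of "[", "|" and "]"). A minimal sketch for producing such a value for a single list of ids follows; the function name is hypothetical, and how several lists are combined for several data types is not covered here.

```python
from urllib.parse import quote


def encode_xids(ids):
    """Encode one list of ids as a URL-escaped '[ID1|ID2|...]' string."""
    return quote("[" + "|".join(ids) + "]", safe="")
```

For instance, encode_xids(["MDM1", "MDM2"]) gives %5BMDM1%7CMDM2%5D.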
Usage tips: Using just the model name, you can access a graphical interpretation of the created model, the RData model file, the coefficients (in JSON format) and the performance metrics (in JSON and csv format). If you have received the value 'model578045416464' from the script, you can access the following files in https://www.evicor.org/pics/plots/ :

| File name | Meaning |
| --- | --- |
| model578045416464.RData | glmnet model for R environment |
| coeff.model578045416464.json | Model coefficients (with weights) in JSON format |
| model578045416464_model.png | Graphical abstract of the created model |
| model578045416464_training.png | Graphical abstract of the training step |
| model578045416464_validation.png | Graphical abstract of the validation step (if set) |
| model578045416464.csv | Performance metrics and some additional information in csv format with header |
| perf.model578045416464.json | Performance metrics in JSON format |

If an error was thrown for this model, it is stored in the file model578045416464_error.json.
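
The naming scheme above can be turned into a small helper that derives every file URL from a returned model name. This is a sketch based only on the file names listed in this section; the function name and dictionary keys are hypothetical.

```python
PLOTS_URL = "https://www.evicor.org/pics/plots/"


def model_files(model_name):
    """Map a model name returned by model_predict.cgi to its output file URLs."""
    return {
        "rdata": f"{PLOTS_URL}{model_name}.RData",
        "coefficients": f"{PLOTS_URL}coeff.{model_name}.json",
        "model_plot": f"{PLOTS_URL}{model_name}_model.png",
        "training_plot": f"{PLOTS_URL}{model_name}_training.png",
        "validation_plot": f"{PLOTS_URL}{model_name}_validation.png",
        "metrics_csv": f"{PLOTS_URL}{model_name}.csv",
        "metrics_json": f"{PLOTS_URL}perf.{model_name}.json",
        "error": f"{PLOTS_URL}{model_name}_error.json",
    }
```

The validation plot exists only if validation was requested, and the error file only if model creation failed.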


Batch models creation

Script name: models_from_correlations_batch.cgi
Returns: an error or nothing. 'done' is returned only after all models have been created, so do not call this script synchronously; a link to the results is sent to the specified email.
Parameters:
source: mandatory, data source (TCGA/CCLE) for both correlation retrieval and model creation;
datatype: mandatory, same as datatype for cor_datatables_json.cgi; however, several comma-separated values are allowed, number should be the same as number of predictor types;
cohort: mandatory, same as cohort for cor_datatables_json.cgi; may be a comma-separated list, length matches with datatype parameter list;
screen: mandatory, same as screen for cor_datatables_json.cgi; similar to datatype and cohort may be a list;
id: mandatory, same as id for cor_datatables_json.cgi; similar to previous parameters;
fdr: mandatory, same as fdr for cor_datatables_json.cgi; similar to previous parameters;
mindrug: mandatory, same as mindrug for cor_datatables_json.cgi; unlike previous parameters, this one is always a single value;
columns: mandatory, same as columns for cor_datatables_json.cgi;
method: mandatory, same as method for model_predict.cgi;
model_cohort: mandatory, same as cohort for model_predict.cgi;
multiopt: mandatory, same as multiopt for model_predict.cgi;
xdatatypes: mandatory, same as xdatatypes for model_predict.cgi;
xplatforms: mandatory, same as xplatforms for model_predict.cgi;
additional_xids: optional, specifies which ids should be passed to glmnet even if they do not pass the filters; if there are no additional xids, leave blank;
rdatatype: mandatory, same as rdatatype for model_predict.cgi;
rplatform: mandatory, same as rplatform for model_predict.cgi;
rid: mandatory, same as rid for model_predict.cgi;
family: mandatory for glmnet, same as family for model_predict.cgi;
measure: mandatory for glmnet, same as measure for model_predict.cgi;
alpha: mandatory for glmnet, same as alpha for model_predict.cgi;
nlambda: mandatory for glmnet, same as nlambda for model_predict.cgi;
minlambda: mandatory for glmnet, same as minlambda for model_predict.cgi;
validation: mandatory, same as validation for model_predict.cgi;
validation_fraction: mandatory for glmnet with validation flag set to TRUE, same as validation_fraction for model_predict.cgi;
nfolds: mandatory for glmnet with validation flag set to TRUE, same as nfolds for model_predict.cgi;
standardize: mandatory, same as standardize for model_predict.cgi;
iter: mandatory, desired number of models;
stat_file: mandatory, name of the file without extension to save your data in (try to give a unique name to your stat_file);
extended_output: mandatory, binary flag, if set to TRUE - some additional info will be written into stat_file (such as source, rdatatype etc.);
mail: mandatory, provide the correct email, links with the results will be sent to it when your batch job is done;
Example: https://www.evicor.org/cgi/models_from_correlations_batch.cgi?source=TCGA&datatype=GE&cohort=LUAD&platform=all&screen=all&id=gemcitabine&fdr=0.05&mindrug=10&columns=gene,feature,followup_part,interaction,drug,expr,n_patients,n_treated,followup,q&iter=3&model_cohort=LUAD;method=glmnet&rdatatype=CLIN&rplatform=os&xdatatypes=GE&xplatforms=illuminahiseq_rnaseqv2&additional_xids=&multiopt=01&family=cox&measure=deviance&standardize=false&alpha=1&nlambda=10&minlambda=0.01&validation=true&nfolds=10&validation_fraction=0.1&stat_file=/your_file/&mail=/your_email/
Usage tips: Always try to specify a unique name for your output file! Files are created in append mode, so if you (or someone else) have used the specified file name before, you will get "mixed" results! When the job is done, the results will be available in the file https://www.evicor.org/pics/plots/your_file.csv
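
One way to reduce the risk of a name collision is to append a random suffix to the stat_file name on the client side. The sketch below does this with a UUID and derives the expected results URL; the helper names are hypothetical, and it is an assumption that the server accepts names of this length and character set.

```python
import uuid

PLOTS_URL = "https://www.evicor.org/pics/plots/"


def unique_stat_file(prefix="batch"):
    """Return a stat_file name (no extension) with a random 32-character suffix."""
    return f"{prefix}_{uuid.uuid4().hex}"


def results_url(stat_file):
    """Return the URL where the csv results of a batch job are expected."""
    return f"{PLOTS_URL}{stat_file}.csv"
```

The generated name is passed as stat_file in the request, and results_url gives the address to check once the notification email arrives.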


Using JS API

All the aforementioned functions are available through API specified in drugs.js module. Some parameters (like columns) are specified in druggable_config.js.


Datatypes, platforms and screens

Below you can find valid names of datatypes, platforms, variables to use with REST API.

Datatypes
Platforms
Independent variables
Dependent variables
Network-based biomarker discovery and validation:

Marcela Franco, Ashwini Jeggari, Sylvain Peuget, Franziska Böttger, Galina Selivanova, Andrey Alexeyenko. Prediction of response to anti-cancer drugs becomes robust via network integration of molecular data. Sci Rep 9, 2379 (2019). doi: 10.1186/1471-2105-13-226.

EviNet web resource:

Ashwini Jeggari, Zhanna Alekseenko, José Dias, Johan Ericson, Andrey Alexeyenko. EviNet: a web platform for network enrichment analysis with flexible definition of gene sets. Nucleic Acids Res 2018 Jul 2;46(W1):W163-W170. doi: 10.1002/1878-0261.12350.

Version history:
1.0.0 (25th of March, 2020) - stable release: exploring correlations, creating and sharing plots, creating and downloading predictive models.
1.1.0 (8th of April, 2021) - stable release: improved functionality, model comparison, REST API.
1.2.0 (21st of June, 2021) - experimental version: improved functionality, more responsive design.
1.3.0 (10th of September, 2021) - stable version: improved REST API, improved performance, new data, style changes, new tools for working with plots and models.
1.3.1 (9th of November, 2021) - stable version: improved error handling, visual adjustments, updated documentation, bugfixes.
