Open
26 commits
c6d3e75
Add inferelator R implementation for comparison testing
AaronWatters Jul 20, 2016
8715af1
disable inferelator.R reports
AaronWatters Jul 20, 2016
1e75c12
Merge branch 'master' into add-Inferelator
AaronWatters Jul 20, 2016
4a22648
Refactor call to subprocess for reuse.
AaronWatters Jul 20, 2016
f950fe5
simple driver for running inferelator.R with no validation yet
AaronWatters Jul 20, 2016
d6203d2
test folder exists before rmtree
AaronWatters Jul 20, 2016
12ef12a
add failure hook for travis debugging
AaronWatters Jul 20, 2016
d7d8a65
missing ggplot2 package
AaronWatters Jul 20, 2016
9adbed2
more checking and cleanup
AaronWatters Jul 20, 2016
47e63d2
remove ggplot2 dependancy for travis runs
AaronWatters Jul 21, 2016
23a36f9
check that output files get created in inferelator test
AaronWatters Jul 21, 2016
6e2d634
remove bogus files
AaronWatters Jul 26, 2016
514fe77
add tsv dump for design and response
AaronWatters Jul 26, 2016
4c64294
add dump of priors
AaronWatters Jul 26, 2016
f4b5400
dump bootstrap matrix
AaronWatters Jul 26, 2016
10361ae
add simplified job that uses tfa
AaronWatters Jul 26, 2016
2785e95
add end-to-end run with tfa
AaronWatters Jul 26, 2016
bee94f8
dump activities matrix
AaronWatters Jul 26, 2016
74828cf
try to fix travis problem
AaronWatters Jul 26, 2016
96967db
skip test that requires corpcore R package in TRAVIS for now.
AaronWatters Jul 26, 2016
a1ac0f8
py3 fix
AaronWatters Jul 26, 2016
2036fad
comment dead code and dump X Y
AaronWatters Jul 26, 2016
dbc2ba0
dump mutual information ms_bg
AaronWatters Jul 26, 2016
0e7d1a4
dump clr matrix
AaronWatters Jul 26, 2016
e7d77e7
save correct matrix
AaronWatters Jul 28, 2016
090ed5e
comment out non-working package
AaronWatters Jul 28, 2016
2 changes: 2 additions & 0 deletions .travis.yml
@@ -26,13 +26,15 @@ install:
# install R components
- sudo R -f inferelator_ng/R_code/packages.R
script:
- export TRAVIS_TESTING=noninteractive
- python -c "import os; print(repr(os.name))"
- coverage run --source=inferelator_ng setup.py test
#- nosetests
after_success:
- codecov
after_failure:
- pwd
- which Rscript
- find .
- cat ./inferelator_ng/tests/artifacts/run_mi.R
- R -f ./inferelator_ng/tests/artifacts/run_mi.R
132 changes: 132 additions & 0 deletions Inferelator/README
@@ -0,0 +1,132 @@
NOTE:

This directory contains a subset of https://github.com/ChristophH/Inferelator
for testing and comparison purposes.

Call the inferelator script from the base directory (the one containing this
README) with a job config file as argument.

Example call: Rscript inferelator.R jobs/dream4_cfg.R
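
The example call above can also be issued from Python, which is how a comparison test might drive the R script. The following is a minimal sketch, not the driver added in this pull request: the function names `build_inferelator_command` and `run_inferelator` are hypothetical, and it assumes `Rscript` is on the PATH.

```python
import subprocess


def build_inferelator_command(job_config, script="inferelator.R", rscript="Rscript"):
    """Return the argument list for one Inferelator run (hypothetical helper)."""
    return [rscript, script, job_config]


def run_inferelator(job_config, base_dir="."):
    """Run the job from base_dir, the directory that contains inferelator.R.

    subprocess.check_call raises CalledProcessError if Rscript exits
    with a non-zero status, so a failed run is not silently ignored.
    """
    command = build_inferelator_command(job_config)
    return subprocess.check_call(command, cwd=base_dir)


if __name__ == "__main__":
    # Only prints the command; call run_inferelator(...) to execute it.
    print(" ".join(build_inferelator_command("jobs/dream4_cfg.R")))
```

Passing the command as a list avoids shell quoting problems, and `cwd` keeps the relative paths in the job config valid.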



--------------------------------------------------------------------------------
Default parameters and a brief explanation of each one
--------------------------------------------------------------------------------

PARS$input.dir <- 'input/dream4' # path to the input files

PARS$exp.mat.file <- 'expression.tsv' # required; see definition below
PARS$tf.names.file <- 'tf_names.tsv' # required; see definition below
PARS$meta.data.file <- 'meta_data.tsv' # assume all steady state if NULL
PARS$priors.file <- 'gold_standard.tsv' # no priors if NULL
PARS$gold.standard.file <- 'gold_standard.tsv' # no evaluation if NULL
PARS$leave.out.file <- NULL # file with list of conditions that will be ignored
PARS$randomize.expression <- FALSE # whether to scramble input expression

PARS$job.seed <- 42 # random seed; can be NULL
PARS$save.to.dir <- file.path(PARS$input.dir, date.time.str) # output directory
PARS$num.boots <- 20 # number of bootstraps; no bootstrapping with a value of 1
PARS$max.preds <- 10 # max number of predictors based on CLR to pass to model
# selection method
PARS$mi.bins <- 10 # number of bins to use for mutual information calculation
PARS$cores <- 8 # number of cpu cores

PARS$delT.max <- 110 # max number of time units allowed between time series
# conditions
PARS$delT.min <- 0 # min number of time units allowed between time series
# conditions
PARS$tau <- 45 # constant related to half life of mRNA (see Core model)

PARS$perc.tp <- 0 # percent of true priors that will be used; can be vector
PARS$perm.tp <- 1 # number of permutations of true priors
PARS$perc.fp <- 0 # percent of false priors (100 = as many false priors as
# there are true priors); can be vector
PARS$perm.fp <- 1 # number of permutations of false priors
PARS$pr.sel.mode <- 'random' # prior selection mode: 'random' or 'tf'
# if 'random', the true priors are randomly chosen
# from all prior edges; if 'tf',
# PARS$perc.tp is interpreted as the percent of
# TFs to use for true priors and all interactions
# for the chosen TFs will be used

PARS$eval.on.subset <- FALSE # whether to evaluate only on the part of the
# network that has connections in the gold
# standard; if TRUE false priors will only be
# drawn from that part of the network

PARS$method <- 'BBSR' # which method to use; either 'MEN' or 'BBSR'
PARS$prior.weight <- 1 # the weight for the priors; has to be larger than 1
# for priors to have an effect

PARS$use.tfa <- FALSE # whether to estimate transcription factor activities and
# use those in the regression models
# if TRUE, interactions in priors file should be signed,
# i.e. -1 for repression and +1 for activation
PARS$prior.ss <- FALSE # whether to also sub-sample from the prior matrix during
# each bootstrap; if TRUE, priors are sampled randomly with
# replacement; if FALSE, all priors are used as is

PARS$output.summary <- TRUE # write a summary tsv and RData file of network

PARS$output.report <- TRUE # create html network report

PARS$output.tf.plots <- TRUE # create png files with plots of TFs and targets

--------------------------------------------------------------------------------
Required Input Files
--------------------------------------------------------------------------------

expression.tsv
--------------
expression values; must include row (genes) and column (conditions) names

tf_names.tsv
------------
one TF name on each line; must be subset of the row names of the expression data
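
The subset requirement above can be checked before a run. This is a minimal sketch using only the Python standard library; the function names are hypothetical, and it assumes the first line of the expression file is the header and the first field of every later line is the gene name.

```python
import csv
import io


def read_row_names(tsv_text):
    """Return the first field of every non-empty line after the header."""
    rows = list(csv.reader(io.StringIO(tsv_text), delimiter="\t"))
    return [row[0] for row in rows[1:] if row]


def missing_tfs(tf_names_text, expression_text):
    """Return TF names that do not appear as row names of the expression data."""
    genes = set(read_row_names(expression_text))
    tfs = [line.strip() for line in tf_names_text.splitlines() if line.strip()]
    return [tf for tf in tfs if tf not in genes]


if __name__ == "__main__":
    expression = "\tcond1\tcond2\nG1\t1.0\t2.0\nG2\t0.5\t0.1\n"
    tf_names = "G1\nG3\n"
    print(missing_tfs(tf_names, expression))
```

An empty result means every listed TF has an expression row.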



--------------------------------------------------------------------------------
Optional Input Files
--------------------------------------------------------------------------------

meta_data.tsv
-------------
the meta data describing the conditions; must include column names;
has five columns:
isTs: TRUE if the condition is part of a time-series, FALSE otherwise
is1stLast: "e" if not part of a time-series; "f" if first; "m" middle; "l" last
prevCol: name of the preceding condition in time-series; NA if "e" or "f"
del.t: time in minutes since prevCol; NA if "e" or "f"
condName: name of the condition
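
The column rules above can be turned into a small consistency check. This is a sketch under stated assumptions, not part of the Inferelator code: the function name `check_meta_data` is hypothetical, logical values are assumed to be written as the strings TRUE and FALSE, and missing values as NA.

```python
import csv
import io

REQUIRED_COLUMNS = ["isTs", "is1stLast", "prevCol", "del.t", "condName"]


def check_meta_data(tsv_text):
    """Return a list of problems found in a meta data table; empty if none."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    missing = [c for c in REQUIRED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        return ["missing columns: " + ", ".join(missing)]
    rows = list(reader)
    names = {row["condName"] for row in rows}
    problems = []
    for row in rows:
        name, position = row["condName"], row["is1stLast"]
        if position not in ("e", "f", "m", "l"):
            problems.append("%s: is1stLast must be e, f, m or l" % name)
            continue
        # "e" marks a condition outside any time-series, so isTs must be FALSE.
        if (row["isTs"] == "TRUE") != (position != "e"):
            problems.append("%s: isTs disagrees with is1stLast" % name)
        if position in ("e", "f"):
            if row["prevCol"] != "NA" or row["del.t"] != "NA":
                problems.append("%s: prevCol and del.t must be NA" % name)
        elif row["prevCol"] not in names:
            problems.append("%s: prevCol names an unknown condition" % name)
    return problems


if __name__ == "__main__":
    table = (
        "isTs\tis1stLast\tprevCol\tdel.t\tcondName\n"
        "FALSE\te\tNA\tNA\tss1\n"
        "TRUE\tf\tNA\tNA\tt0\n"
        "TRUE\tl\tt0\t30\tt30\n"
    )
    print(check_meta_data(table))
```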

priors.tsv
----------
matrix of 0 and 1 indicating whether there is prior knowledge of an
interaction between one TF and a gene; one row for each gene, one column for
each TF; must include row (genes) and column (TF) names

gold_standard.tsv
-----------------
needed for validation; matrix of 0 and 1 indicating whether there is an
interaction between one TF and a gene; one row for each gene, one column for
each TF; must include row (genes) and column (TF) names
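
Both matrix files described above share one layout, so a single reader can validate either. The sketch below is illustrative only: `read_interaction_matrix` is a hypothetical name, and it assumes the header row either starts with an empty corner cell or omits it, and that entries are written as the plain integers 0 and 1. For signed priors (see PARS$use.tfa) the allowed values can be widened.

```python
import csv
import io


def read_interaction_matrix(tsv_text, allowed=("0", "1")):
    """Parse a gene-by-TF matrix and return (tf_names, {gene: {tf: value}})."""
    rows = [r for r in csv.reader(io.StringIO(tsv_text), delimiter="\t") if r]
    if len(rows) < 2 or len(rows[1]) < 2:
        raise ValueError("expected a header row and at least one gene row")
    header, body = rows[0], rows[1:]
    # Take the last `width` header cells so a leading corner cell is skipped.
    width = len(body[0]) - 1
    tf_names = header[-width:]
    matrix = {}
    for row in body:
        gene, values = row[0], row[1:]
        if len(values) != len(tf_names):
            raise ValueError("row %s has %d values, expected %d"
                             % (gene, len(values), len(tf_names)))
        bad = [v for v in values if v not in allowed]
        if bad:
            raise ValueError("row %s has unexpected entries: %s" % (gene, bad))
        matrix[gene] = dict(zip(tf_names, (int(v) for v in values)))
    return tf_names, matrix


if __name__ == "__main__":
    text = "\tTF1\tTF2\nG1\t1\t0\nG2\t0\t0\n"
    tf_names, matrix = read_interaction_matrix(text)
    print(tf_names, matrix["G1"])
    # Signed priors: allow -1 as well.
    read_interaction_matrix("TF1\nG1\t-1\n", allowed=("-1", "0", "1"))
```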



--------------------------------------------------------------------------------
Output Files
--------------------------------------------------------------------------------

One or more betas_frac_tp_X_perm_X--frac_fp_X_perm_X_X.RData files, one file
per combination of true priors, false priors, and prior weight. Each RData file
contains two lists of length PARS$num.boots: every entry of the first is a
matrix of betas, and every entry of the second is a matrix of confidence scores
(rescaled betas).

One or more combinedconf_frac_tp_X_perm_X--frac_fp_X_perm_X_X.RData files with
one matrix each. The matrix is the rank-combined version of the confidence
scores of all bootstraps.

A params_and_input.RData file with data objects holding the user-set parameters,
the input, and input-derived objects.