Usage
Mathieu Fourment edited this page Apr 22, 2015
- Sequence names in the alignment and tree files must match.
- Sequence names cannot contain these characters: ,[]():;
- Sequences contain the time of sampling at the end of their names, and the date should start with an underscore (e.g. seq1_2010.5). Sequences are assumed to be isochronous if dates are not present.
- No duplicate sequence names.
- The tag for calibration points for an internal node inside a tree should be: [&cal_height={10,15}].
- Calibration points are only for isochronous sequences (for now).
- NEXUS files:
  - Names can contain spaces, but the whole name has to be surrounded by quotes (e.g. "seq 1_2001.3").
  - Keep the file simple.
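
The naming rules above can be checked before running physher. The following is a minimal sketch, not part of physher itself: the function names `check_names`, `sampling_date`, and `parse_calibration` are hypothetical, the date is assumed to be a plain decimal number after the last underscore, and the two numbers in the calibration tag are returned as written (treating them as a lower and an upper bound is an assumption).

```python
import re

# Characters that the rules above forbid in sequence names.
FORBIDDEN = set(",[]():;")

# Assumption: the sampling date is a decimal number after the last underscore.
_DATE = re.compile(r"_(\d+(?:\.\d+)?)$")

# Calibration tag as written on this page, e.g. [&cal_height={10,15}].
_CAL = re.compile(r"\[&cal_height=\{\s*([^,{}\s]+)\s*,\s*([^,{}\s]+)\s*\}\]")


def check_names(names):
    """Return a list of human-readable problems found in the names."""
    problems = []
    seen = set()
    for name in names:
        bad = sorted(FORBIDDEN.intersection(name))
        if bad:
            problems.append(f"{name!r} contains forbidden characters: {''.join(bad)}")
        if name in seen:
            problems.append(f"{name!r} is duplicated")
        seen.add(name)
    return problems


def sampling_date(name):
    """Return the trailing date as a float, or None if the name has no date."""
    match = _DATE.search(name)
    return float(match.group(1)) if match else None


def parse_calibration(text):
    """Return the two numbers of the first cal_height tag, or None if absent."""
    match = _CAL.search(text)
    if match is None:
        return None
    return float(match.group(1)), float(match.group(2))
```

For example, `sampling_date("seq1_2010.5")` gives `2010.5`, while `sampling_date("seq1")` gives `None`, which corresponds to the undated case described above.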
>seq1_2001.3
AAAAAAAA
>seq2_2011.3
AAAAAAAT
>seq3_2013.72
AAAAAAAG
>seq4_2008.2
AAAAAAAC
or
#NEXUS
begin data;
seq1_2001.3 AATCTCGA
seq2_2011.3 AAACTCGA
seq3_2013.72 AATTTCGA
seq4_2008.2 AATTTCGG
end;
(((seq1_2001.3, seq2_2011.3), seq3_2013.72), seq4_2008.2);
or
#NEXUS
begin trees;
(((seq1_2001.3, seq2_2011.3), seq3_2013.72), seq4_2008.2);
end;
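
Because sequence names in the alignment and tree files must match, it can help to compare them before a run. The sketch below is an illustration under stated assumptions, not physher's own parser: it handles only FASTA input and plain Newick strings without quoted names, branch annotations, or internal node labels that contain separators, and it treats the whole FASTA header line as the name. The function names are hypothetical.

```python
import re


def fasta_names(text):
    """Return the header of each FASTA record, without the leading '>'."""
    return [line[1:].strip() for line in text.splitlines() if line.startswith(">")]


def newick_tip_names(newick):
    """Return tip labels of a plain Newick string.

    A tip label follows '(' or ',' and runs up to the next
    separator character; labels after ')' (internal nodes) are skipped.
    """
    found = re.findall(r"[(,]\s*([^\s(),:;][^(),:;]*)", newick)
    return [name.strip() for name in found]


def compare_names(fasta_text, newick):
    """Return (names only in the alignment, names only in the tree)."""
    alignment = set(fasta_names(fasta_text))
    tree = set(newick_tip_names(newick))
    return sorted(alignment - tree), sorted(tree - alignment)


if __name__ == "__main__":
    fasta = ">seq1_2001.3\nAAAAAAAA\n>seq2_2011.3\nAAAAAAAT\n"
    tree = "(seq1_2001.3, seq2_2011.3);"
    print(compare_names(fasta, tree))
```

Two empty lists mean that every name appears in both files.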
physher expects a configuration file as argument.
Each line of the configuration file is of the form: key=value
Example:
input.sequences = fluA.fa
input.tree = fluA.tree
output.stem = fluA
substmodel.type = HKY
clock = strict
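
To make the key=value format concrete, here is a minimal sketch of how such a file could be read into a dictionary. It is not physher's parser: the function name `parse_config` is hypothetical, whitespace around keys and values is assumed to be insignificant (as the example above suggests), and comment syntax is not handled because this page does not describe one.

```python
def parse_config(text):
    """Parse 'key = value' lines into a dict, skipping blank lines."""
    config = {}
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line:
            continue
        # Split on the first '=' only, so values may contain '='.
        key, sep, value = line.partition("=")
        if not sep or not key.strip():
            raise ValueError(f"line {lineno}: expected key=value, got {raw!r}")
        config[key.strip()] = value.strip()
    return config


if __name__ == "__main__":
    example = """
    input.sequences = fluA.fa
    input.tree = fluA.tree
    output.stem = fluA
    substmodel.type = HKY
    clock = strict
    """
    print(parse_config(example))
```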
Alternatively, physher can be run from the command line:
./physher -m HKY -i fluA.fa -t fluA.tree -C strict -o fluA
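
The configuration keys and the command-line options correspond one to one, as the option tables on this page show. The sketch below builds the command line above from a configuration dictionary. It covers only the five keys used in the example, the helper name `to_command` is hypothetical, and the option order simply mirrors the command shown above.

```python
import shlex

# Key-to-option pairs taken from the option tables on this page,
# listed in the order used by the example command.
KEY_TO_FLAG = {
    "substmodel.type": "-m",
    "input.sequences": "-i",
    "input.tree": "-t",
    "clock": "-C",
    "output.stem": "-o",
}


def to_command(config, executable="./physher"):
    """Return the argument list equivalent to the given configuration."""
    unknown = sorted(set(config) - set(KEY_TO_FLAG))
    if unknown:
        raise KeyError(f"no option known for: {', '.join(unknown)}")
    args = [executable]
    for key, flag in KEY_TO_FLAG.items():
        if key in config:
            args += [flag, config[key]]
    return args


if __name__ == "__main__":
    config = {
        "input.sequences": "fluA.fa",
        "input.tree": "fluA.tree",
        "output.stem": "fluA",
        "substmodel.type": "HKY",
        "clock": "strict",
    }
    # Quote each argument so paths with spaces survive a shell.
    print(" ".join(shlex.quote(arg) for arg in to_command(config)))
```

The resulting list can be passed directly to `subprocess.run`, which avoids shell quoting issues altogether.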
| Option | Key | Type | Value | Default | Note |
|---|---|---|---|---|---|
| -i | input.sequences | string | path to sequence file | Mandatory | sequence file in FASTA or NEXUS format |
| -t | input.tree | string | path to rooted tree file | Mandatory | tree file in newick or NEXUS format |
| -o | output.stem | string | output file name | input.sequences | |
| Option | Key | Type | Value | Default | Note |
|---|---|---|---|---|---|
| -m | substmodel.type | string | GTR, HKY, K80, JC69, or GY94 | Mandatory | 00000, 00001,...,01234 can also be used |
| -r | substmodel.kappa | real | value > 0 | empirical | Only for HKY, K80, and GY94 |
| -r | substmodel.rates | array of real | values > 0 | empirical | Nucleotide only. For GTR it could be: 0.1,0.1,0.2,0.6,0.9 |
| | substmodel.rates.fixed | boolean | true or false | false | Fixed to values specified in substmodel.rates |
| -f | substmodel.freqs | array of real | 0 < values < 1 | empirical | Nucleotide model only. e.g. 0.1,0.2,0.3,0.4 |
| | substmodel.freqs.fixed | boolean | true or false | false | Fixed to values specified in substmodel.freqs |
| | substmodel.codon.omega | real | value > 0 | 1 | GY94 only |
| | substmodel.heterogeneity | string | no, gamma, inv, or gammainv | no | |
| -c | substmodel.heterogeneity.gamma.cat | integer | value > 0 | 4 | Number of rate categories |
| -a | substmodel.heterogeneity.gamma.alpha | real | value > 0 | 0.3 | Parameter of gamma distribution |
| | substmodel.heterogeneity.gamma.alpha.fixed | boolean | true or false | false | Fixed to value specified in substmodel.heterogeneity.gamma.alpha |
| | substmodel.heterogeneity.pinv | real | 0 < value < 1 | 0.5 | proportion of invariant sites |
| Option | Key | Type | Value | Default | Note |
|---|---|---|---|---|---|
| -C | clock | string | strict, local, or discrete | If not set, no clock is assumed | |
| -S | clock.algorithm | string | ga or greedy | ga (local and discrete clocks); greedy (local clock only) | |
| --forward | clock.forward | boolean | true or false | true | Time runs forward when larger numbers are closer to the present and smaller numbers are further in the past |
| --rate | clock.rate | integer | value > 0 | | Not fixed; used only as an initial guess for a better start |
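
The effect of clock.forward can be illustrated by converting sampling dates to heights above the most recent sample. This is only a sketch of the convention described in the table, with a hypothetical helper name `tip_heights`; how physher performs the conversion internally is not stated on this page.

```python
def tip_heights(dates, forward=True):
    """Map each sequence name to its height above the most recent sample.

    forward=True:  larger numbers are more recent (e.g. calendar years).
    forward=False: larger numbers are older (e.g. years before present).
    """
    if forward:
        latest = max(dates.values())
        return {name: latest - date for name, date in dates.items()}
    most_recent = min(dates.values())
    return {name: date - most_recent for name, date in dates.items()}


if __name__ == "__main__":
    # Calendar years: the 2011 sample is the most recent one.
    print(tip_heights({"seq1": 2001.0, "seq2": 2011.0}, forward=True))
    # Years before present: the sample dated 5 is the most recent one.
    print(tip_heights({"seq1": 15.0, "seq2": 5.0}, forward=False))
```

In both calls the most recent sample gets height 0 and the other sample gets height 10.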
| Option | Key | Type | Value | Default | Note |
|---|---|---|---|---|---|
| --ga-pop | ga.popsize | integer | value > 0 | 30 | |
| --ga-gen | ga.ngen | integer | value > 0 | 200 | |
| --ga-no-improv | ga.maxnoimprovement | integer | value > 0 | 50 | |
| Option | Key | Type | Value | Default | Note |
|---|---|---|---|---|---|
| -b | bootstrap | integer | value >= 0 | 0 | |
| --b-threads | bootstrap.threads | integer | value > 0 | 10 | |
| Option | Key | Type | Value | Default | Note |
|---|---|---|---|---|---|
| -R | random.seed | integer | value >= 0 | random number | |
| -T | nthreads | integer | value > 0 | 10 | Number of threads used by GA and greedy algorithm |
| --gc | sequences.geneticcode | integer | 0 <= value <= 14 | 0 | Only for codon models. See the genetic codes table |
| --ic | ic | string | AIC, AICc, or BIC | AICc | For greedy and genetic algorithms |
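
For reference, the three information criteria accepted by the ic option have standard textbook definitions, sketched below. Here `lnl` is the maximized log-likelihood, `k` the number of free parameters, and `n` the sample size; what physher uses for `n` (for example the number of alignment sites) is not stated on this page and is left to the caller.

```python
import math


def aic(lnl, k):
    """Akaike information criterion: 2k - 2 lnL."""
    return 2 * k - 2 * lnl


def aicc(lnl, k, n):
    """AIC with the small-sample correction 2k(k+1)/(n-k-1)."""
    if n - k - 1 <= 0:
        raise ValueError("AICc requires n > k + 1")
    return aic(lnl, k) + (2 * k * (k + 1)) / (n - k - 1)


def bic(lnl, k, n):
    """Bayesian information criterion: k ln(n) - 2 lnL."""
    return k * math.log(n) - 2 * lnl


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    lnl, k, n = -100.0, 5, 100
    print(aic(lnl, k), aicc(lnl, k, n), bic(lnl, k, n))
```

Lower values indicate a better trade-off between fit and number of parameters under all three criteria.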
| Value | Type |
|---|---|
| 0 | Universal |
| 1 | Vertebrate Mitochondrial |
| 2 | Yeast |
| 3 | Mold Protozoan Mitochondrial |
| 4 | Mycoplasma |
| 5 | Invertebrate Mitochondrial |
| 6 | Ciliate |
| 7 | Echinoderm Mitochondrial |
| 8 | Euplotid Nuclear |
| 9 | Bacterial |
| 10 | Alternative Yeast |
| 11 | Ascidian Mitochondrial |
| 12 | Flatworm Mitochondrial |
| 13 | Blepharisma Nuclear |
| 14 | No stops |