This is a subproject of EVA.
This work is supported by the National Science Foundation under Grant No. 2034247.
- Spark 3.0.0, Hadoop 3.2.0, Scala 2.12.8 (for Adam-Cannoli)
- Spark 2.4.7, Hadoop 2.7.6, Scala 2.11.12 (for GATK4)
Hadoop 3+ lists the data nodes in etc/hadoop/workers (this file replaces etc/hadoop/slaves from Hadoop 2). To verify that the data nodes are up, run hdfs dfsadmin -report
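As a rough sketch of that check, the two helpers below compare the number of hosts in the workers file with the number of live DataNodes reported by HDFS. The function names are hypothetical, and the parsing assumes the report contains a line of the form "Live datanodes (N):"; adjust it if your Hadoop version prints something different.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; not part of AVAH or Hadoop.

# Count hosts in a workers file, ignoring comments and blank lines.
count_workers() {
  sed -e 's/#.*$//' -e '/^[[:space:]]*$/d' "$1" | wc -l | tr -d ' '
}

# Read `hdfs dfsadmin -report` output on stdin and print the live DataNode count.
# Assumption: the report has a line such as "Live datanodes (3):".
count_live_datanodes() {
  sed -n 's/^Live datanodes (\([0-9][0-9]*\)):.*/\1/p'
}

# Compare both counts; the argument is the path to etc/hadoop/workers.
check_datanodes() {
  local expected live
  expected=$(count_workers "$1")
  live=$(hdfs dfsadmin -report 2>/dev/null | count_live_datanodes)
  echo "workers file: ${expected}, live DataNodes: ${live:-unknown}"
  [ "$expected" = "$live" ]
}
```

For example, check_datanodes "$HADOOP_HOME/etc/hadoop/workers" returns a non-zero status when the counts differ.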
The instructions are here.
This is a Scala project. You can use sbt to compile and package the project. The JAR file should be copied manually to lib/ before executing AVAH.
If you change the scalaVersion in build.sbt, run reload in the sbt shell before rebuilding the JAR.
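The build-and-copy step could be scripted roughly as follows. This is a minimal sketch, assuming sbt's default output layout (target/scala-<binary version>/) and that it is run from the project root; the function names are hypothetical, and the Scala version passed in must match the scalaVersion in build.sbt.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for packaging the project and copying the JAR to lib/.

# Derive the Scala binary version (e.g. 2.12) from a full version (e.g. 2.12.8).
scala_binary_version() {
  local full="$1"
  echo "${full%.*}"
}

# Print the directory where sbt places the packaged JAR by default.
jar_dir() {
  echo "target/scala-$(scala_binary_version "$1")"
}

# Package with sbt, then copy the resulting JAR(s) into lib/.
build_and_copy() {
  local scala_version="$1"
  sbt package || return 1
  mkdir -p lib
  cp "$(jar_dir "$scala_version")"/*.jar lib/
}
```

For example, build_and_copy 2.12.8 would package the project and copy the JAR from target/scala-2.12/ to lib/.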
To check YARN jobs:
yarn application -list
To kill YARN jobs:
yarn application -kill <application_ID>
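When several jobs need to be stopped at once, the two commands above can be combined. The sketch below is illustrative only: the function names are hypothetical, and it assumes application IDs appear in the first column of the listing in the form application_<timestamp>_<sequence>.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; not part of AVAH or YARN.

# Read `yarn application -list` output on stdin; print one application ID per line.
# Header lines are skipped because their first field is not an application ID.
extract_app_ids() {
  awk '$1 ~ /^application_[0-9]+_[0-9]+$/ { print $1 }'
}

# Kill every application that `yarn application -list` currently reports.
kill_all_listed() {
  yarn application -list 2>/dev/null | extract_app_ids | while read -r id; do
    yarn application -kill "$id"
  done
}
```

Run extract_app_ids on its own first to review which applications would be affected before calling kill_all_listed.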
To see YARN queues:
mapred queue -list
To change YARN's scheduler configuration from the command line:
yarn schedulerconf
Examples:
yarn schedulerconf -global yarn.scheduler.maximum-allocation-mb=16384
yarn schedulerconf -global yarn.scheduler.maximum-allocation-vcores=32
yarn schedulerconf -global yarn.scheduler.maximum-allocation-mb=16384,yarn.scheduler.maximum-allocation-vcores=32
yarn queue -status default
yarn logs -applicationId <application_ID>
yarn top
yarn node -all -list
yarn node -showDetails -list
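The yarn schedulerconf examples above pass several properties as a single comma-separated -global argument. A small sketch of assembling that argument from individual key=value pairs is shown below; the function names are hypothetical, and the second helper only prints the command so it can be reviewed before it is run.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; not part of AVAH or YARN.

# Join all arguments with commas, e.g. "a=1 b=2" becomes "a=1,b=2".
join_conf() {
  local IFS=,
  echo "$*"
}

# Print (do not execute) the schedulerconf command for the given key=value pairs.
schedulerconf_global_cmd() {
  echo "yarn schedulerconf -global $(join_conf "$@")"
}
```

For example, schedulerconf_global_cmd yarn.scheduler.maximum-allocation-mb=16384 yarn.scheduler.maximum-allocation-vcores=32 prints the same command as the third example above.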
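To record CPU, memory, load, disk, and network usage with dstat (one sample every 2 seconds, 10 samples, saved to report.csv):
dstat --cpu --mem --load --top-cpu --top-mem -dn --output report.csv 2 10
or, to disable the intermediate screen updates between samples:
dstat --cpu --mem --load --top-cpu --top-mem -dn --noupdate --output report.csv 2 10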