From 95286c31be780365f1da4b62a2cf6583b8b05abb Mon Sep 17 00:00:00 2001 From: RosCraddock <109593931+RosCraddock@users.noreply.github.com> Date: Tue, 10 Sep 2024 10:04:20 +0100 Subject: [PATCH 01/19] Genome_Vignette --- vignettes/articles/Genome.Rmd | 191 ++++++++++++++++++++++++++++++++-- 1 file changed, 183 insertions(+), 8 deletions(-) diff --git a/vignettes/articles/Genome.Rmd b/vignettes/articles/Genome.Rmd index fe3fdb9d..22296159 100644 --- a/vignettes/articles/Genome.Rmd +++ b/vignettes/articles/Genome.Rmd @@ -1,6 +1,6 @@ --- -title: "Genome" -author: "Vignette Author" +title: "The Genome" +author: "Rosalind Craddock" date: "`r Sys.Date()`" output: rmarkdown::html_vignette vignette: > @@ -14,11 +14,186 @@ knitr::opts_chunk$set( collapse = TRUE, comment = "#>" ) + +library(AlphaSimR) +``` + +**AlphaSimR** is built upon the basics of genetics for the stochastic simulation of crop and animal breeding programmes. This vignette will provide an overview of the genome, it's inheritance, and the tools used to read it with supporting examples and code. + +# What is a genome? +The genome is the complete set of genetic information within an organism. It is made up of deoxyribonucleic acid (DNA), consisting of sequences of four types of nucleotide bases: adenine (A), cytosine (C), guanine (G), and thymine (T). Within one DNA molecule, there are two strands joined by base-pairs: A binds with T and C binds with G. This forms a double helix structure which can be visualised as a spiral ladder. These DNA molecules are condensed around supporting protein to form a thread-like structure known as chromosomes. Due to the different demographic history across species, how the DNA is package can vary substantial including the number of chromosome and the number of chromosome copies (ploidy). For example, humans have 23 pairs of chromosomes (22 autosomnal, one non-autosomnal) for ~three billion base-pairs. Whereas cattle have 30 pairs of chromosomes for fewer base-pairs at ~2.7 billion. As these species are both diploid, in each chromosome pair, one has been inherited from the father (paternal) and one has been inherited from the mother (maternal). + +## Genes, alleles, loci, genotypes, and haplotypes. +Although similar, there are essential differences in the terminology used. This next section will provide a short glossary for genes, allele, loci, genotypes, and haplotypes. + +Genes are segments of DNA that are translated to proteins. Typically they will contain a few hundred to more than two billion bases-pairs with a codon (triplet base-pair) coding for either an amino acid or to start/stop. + +An allele in a gene is a gene variant. More generally, an allele represents a possible sequence of bases at a particular site on a genome (known as a locus). When biallelic (two variants per gene) the allele is either ancestral (denoted 0) or a mutant (denoted 1). Per the name, new alleles arise from mutation. + +A genotype is the combination of alleles across chromosome copies that an organism has at a specified locus or across loci. For instance, there are three possible genotypes for a diploid at a biallelic locus with ancestral allele A and mutant allele a: ancestral homozygous (aa) with allele dosage 0, heterozygous (aA or Aa) with allele dosage 1, and mutant homozygous (AA) with allele dosage 2. Genotypes can differ between individuals due to random segregation in inheritance. + +A haplotype (haploid genotype) describes the physical grouping of alleles along the same chromosome. This means that these group of alleles tend to be inherited together. New haplotypes are created through recombination. + +What does this look like in **AlphaSimR**? + +Let's start with creating the founding genome using `runMacs`. As mentioned, cattle have 30 pairs of chromosomes, but for the purpose of this simulation we are only interested in three. When we inspect the founderGenome we see that it has a ploidy of two with 12 loci across three chromosomes (with four segregating sites on each). + +```{r} +founderGenome = runMacs(nInd = 10, + nChr = 3, + segSites = 4, + species = "CATTLE") +# Set simulation parameters +SP = SimParam$new(founderGenome) +SP$setTrackRec(TRUE) +# Inspect the founderGenome +founderGenome + +``` +Then we can observe the genotypes with `pullSegSiteGeno`. Each row represents one of the ten individuals and each column represents a defined loci labelled as the Chromosome_Locus. All will have a value of zero to two equivalent to the allele dosage. Commonly, this would be homozygous ancestral (0), heterozygous (1), and homozygous mutant (2). Since this is a diploid, we are observing these genotypes across the pair of chromosomes. Where the ancestral allele is not known, the allele dosage will be randomly assigned. + +```{r} +pullSegSiteGeno(founderGenome) +``` + +The haplotypes can be observed with `pullSegSiteHaplo`. Each row represents the sequence of alleles on each chromosome per individual. Since this is diploid, we observe two for each individual: one maternal (id_1) and the other paternal (id_2). All will have a value of 0 or 1 for either ancestral or mutant allele respectingly. As before, each column represents a defined loci. + +```{r} +pullSegSiteHaplo(founderGenome) +``` + + +# How are genomes inherited? +Genetic code is inherited from an individuals mother and father. This involves cells known as gametes: eggs (from mother) and sperm (from father). These cells have half the number of chromosomes to other cells in the species. For example, in diploid species like humans gametes are haploid cells. These have been developed through interphase (DNA replication) and meiosis (cell division). The mechanism of interphase and meiosis are outside the scope of this vignette, however is important due to their contribution to genetic variation. + +## Genetic Variation: the DNA lottery +Broadly speaking, there are three contributions to genetic variation: mutation, crossing over (or recombination), and independent assortment. + +Mutation occurs during DNA replication in interphase. Although the rate is very small (~2.5x10^-8^ mutation per nucleotide), when considering the whole genome mutation is not negligible. + +Both crossing over and independent assortment occurs in the first part of meiosis known as the reductional division. Crossing over refers to the swapping of genetic material at random between chromatids resulting in variation. Independent Assortment refers to the random orientation of homologous chromosome pairs resulting in gametes with many different assortments of chromosomes. (recall there are 23 pairs in a human). + +Within genetics, albeit quantitative or population, we are interested in how individual's genomes differ to the next. Thus, it is this variation we wish to capture to understand how genetics influence desired (or undesired) traits in crop and animal breeding. + +In **AlphaSimR**: + +1) Random mutations can be added to individuals in populations with `mutate`. For the purpose of demonstration, the mutation is at a much higher rate than would be expected. + +```{r} +basePop = newPop(founderGenome) + +mutatedBasePop = mutate(basePop, mutRate = 0.1, simParam = SP) + +pullSegSiteGeno(basePop) + +pullSegSiteGeno(mutatedBasePop) +``` + +2) Independent assortment and recombination can be observed across generations. + +The first step is to make the cross in the base population to produce the second generation. For this, `randCross` can be used to create one cross, producing a pair of full siblings. + +```{r} +secondGenPop = randCross(pop = basePop, nCrosses = 1, nProgeny = 2, simParam = SP) + +str(secondGenPop) + +# Collect the iid for mother and father +mother = as.integer(secondGenPop@mother[1]) +father = as.integer(secondGenPop@father[1]) +``` + +Then, the haplotypes can be extracted for the mother, father, and two progeny to observe the inheritance. Recall that id_1 is the maternal haplotype (inherited from mother) and id_2 is the paternal haplotype (inherited from father). For clarity, no mutation is included in this example. All variation observed is due to independent assortment and/or recombination. + +```{r} +pullSegSiteHaplo(basePop[basePop@iid == mother,]) +``` + +```{r} +pullSegSiteHaplo(basePop[basePop@iid == father,]) +``` + +```{r} +pullSegSiteHaplo(secondGenPop) +``` +In the above, independent assortment will always be observed. This is clear since despite being full siblings they inherit a different arrangement of haplotypes from the mother and the father. We may observe changes to the haplotypes via recombination between the four sites across a chromosome (either chromosome one, two, and/or three). However, this depends on the rate of recombination. Nevertheless, even if present it can be hard or even impossible to observe through the haplotypes alone, even in this small sample size. Instead we can use the `pullIbdHaplo` to observe this. For further details on recombination, please see the vignette on Recombination. + +```{r} +pullIbdHaplo(basePop[basePop@iid == mother,]) +pullIbdHaplo(basePop[basePop@iid == father,]) +pullIbdHaplo(secondGenPop) +``` +# How are genomes read? +For a genome to be read, first it needs to be sequenced. The techniques for this include Sanger sequencing or next-generation sequencing. These are then used to identify single nucleotide polymorphisms (SNPs). SNPs highlight variation at specific loci; equivalent to markers. Other terminology include traits and QTL: Quantitative Trait Locus. Traits are characteristics of interest such as milk yield or growth rate that have a genetic component. QTL are a region of DNA that's associated with a specific trait that varies in the population. This could be identifiable by a marker (eg. SNP) or could only be correlated. + +In **AlphaSimR**: + +Let's start a new population. + +```{r} +founderGenome = runMacs(nInd = 10, + nChr = 3, + segSites = 4, + species = "CATTLE") +# Set simulation parameters +SP = SimParam$new(founderGenome) +``` + +Then, add an additive trait with three QTLs per chromosome using `SP$addTraitA` and one SNP per chromosome with `SP$addSnpChip`. + +```{r} +SP$addTraitA(nQtlPerChr = 3) +SP$addSnpChip(nSnpPerChr = 1) +``` + +After, create a base population. + +```{r} +basePop = newPop(founderGenome, simParam = SP) + +# Inspect basePop +basePop ``` -Introduction - Pull from MOOC - Explain the genome and its representation - Haplotypes / genotypes - QTL/SNP - Genetic map vs physical map + +The SNPs can be read with `pullSnpGeno` or, if the location in the genome is known, `pullMarkerGeno` can be used. + +```{r} +# From SNP +SNP = pullSnpGeno(pop = basePop, snpChip =1) +print(SNP) + +# From marker +location = colnames(SNP) +pullMarkerGeno(basePop, markers = location) +``` + + +The defined trait (Trait 1) can be inspected via the genetic values calculated for each individual. + +```{r} +basePop@gv +``` +The genetic map for the QTLs can be retrieved via `getQtlMap`. Here the site of each QTL is identified per chromosome. The pos notes the position of the QTL on each chromosome in Morgans. + +```{r} +getQtlMap(trait = 1, simParam = SP) +``` + + +## What is a genetic map? + +A genetic map (also known as linkage map) provides the genetic distance in Morgans between genetic markers, QTLs, and/or segregating sites on a chromosome. This differs from a physical map which gives the physical distance in the DNA sequence by the number of nucleotides. Thus, is based on the actual structure of the genetic material. Although **AlphaSimR** can import a physical map for all loci through the `lociMap-class`, it can only output genetic maps of defined segregating sites, including QTLs, SNPs, and all defined loci. For this, the observed recombination frequencies between loci are used to show how genetic information is shuffled in chromosomes. Thus, highlighting how markers are inherited. + +For a genetic map of all defined loci (or segregating sites), `getGenMap` is used. `getSnpMap` can be used for the genetic map of the SNP(s) on each chromosome. As hinted previously, `getQtlMap` can be used for the genetic map of the QTL(s). + +```{r} +getGenMap(SP) +getSnpMap(snpChip = 1, simParam = SP) +``` + + +# How is this used in plant and animal breeding? +By understanding the genome, breeders can produce crop and livestock with desirable traits for use in specific production systems. For example, a breeder may desire to have polled cattle to increase animal welfare and human safety. When considering the breeding design for a species, breed, and or system, there are many factors to consider with potential to introduce unintended consequences. These include balancing the trade off between short-term genetic gain and long-term loss in genetic variation as well as antagonistic relationships between desired traits commonly observed between production and functional traits in animals. **AlphaSimR** provides a fast and inexpensive way to explore and test a wide range of breeding designs for long term genetic improvement across plants and animals through forward-in-time stochastic simulations. + +For a more detailed overview of AlphaSimR in plant and animal breeding with exercises, please see the edx course: Breeding Programme Modelling with AlphaSimR. Available at: https://learning.edx.org/course/course-v1:EdinburghX+ISB001+1T2021/home \ No newline at end of file From 6ebba5a22cca1af0d556471149f37c6563a988f3 Mon Sep 17 00:00:00 2001 From: Gregor Gorjanc Date: Sat, 7 Dec 2024 11:28:09 +0100 Subject: [PATCH 02/19] Added asCategorical() --- NEWS.md | 4 ++ R/phenotypes.R | 106 +++++++++++++++++++++++++++++++ tests/testthat/test_phenotypes.R | 22 +++++++ 3 files changed, 132 insertions(+) create mode 100644 tests/testthat/test_phenotypes.R diff --git a/NEWS.md b/NEWS.md index b63cae23..8b2e5b9d 100644 --- a/NEWS.md +++ b/NEWS.md @@ -1,3 +1,7 @@ +# AlphaSimR 1.6.2 + +*added `asCategorical` to convert a normal (Gaussian) trait to an ordered categorical (threshold) trait + # AlphaSimR 1.6.1 *fixed bug in `mergePops` and `[` (subset) methods - they were failing for populations that had a misc slot with a matrix - we now check if a misc slot element is a matrix and rbind them for `mergePops` and subset rows for `[` (assuming the first dimension represents individuals) diff --git a/R/phenotypes.R b/R/phenotypes.R index 19f73e09..93410f06 100644 --- a/R/phenotypes.R +++ b/R/phenotypes.R @@ -261,4 +261,110 @@ setPheno = function(pop, h2=NULL, H2=NULL, varE=NULL, corE=NULL, return(pop) } +#' @title Convert a normal (Gaussian) trait to an ordered categorical (threshold) +#' trait +#' @param x matrix, values for one or more traits (if not a matrix, +#' we cast to a matrix) +#' @param threshold numeric or list, when numeric, provide a vector of threshold +#' values to convert continuous values into categories for a single trait +#' (the thresholds specify left-closed and right-opened intervals [t1, t2), +#' which can be changed with \code{include.lowest} and \code{right}; +#' ensure you add \code{-Inf} and \code{Inf} or min and max to cover the whole +#' range of values; otherwise you will get \code{NA} values); +#' when list, provide a list of numeric thresholds with \code{NULL} to skip +#' conversion for a specific trait (see examples) +#' @param include.lowest logical, see \code{\link{cut}} +#' @param right logical, see \code{\link{cut}} +#' @details If input trait is normal (Gaussian) then this function generates a +#' categorical trait according to the ordered probit model. +#' @return matrix of values with some traits recorded as ordered categories +#' in the form of 1:nC with nC being the number of categories. +#' @examples +#' founderPop = quickHaplo(nInd=10, nChr=1, segSites=10) +#' SP = SimParam$new(founderPop) +#' \dontshow{SP$nThreads = 1L} +#' SP$addTraitA(nQtlPerChr = 10, mean = c(0, 0), var = c(1, 2), +#' corA = matrix(data = c(1.0, 0.6, +#' 0.6, 1.0), ncol = 2)) +#' pop = newPop(founderPop) +#' pop = setPheno(pop, varE = c(1, 1)) +#' pheno(pop) +#' #Convert a single input trait +#' asCategorical(x = pheno(pop)[, 2]) +#' #Demonstrate threshold argument +#' asCategorical(x = pheno(pop)[, 2], threshold = c(-1, 0, 1)) +#' asCategorical(x = pheno(pop)[, 2], threshold = c(-Inf, -1, 0, 1, Inf)) +#' #Convert multiple input traits +#' try(asCategorical(x = pheno(pop))) +#' asCategorical(x = pheno(pop), +#' threshold = list(NULL, +#' c(-Inf, 0, Inf))) +#' asCategorical(x = pheno(pop), +#' threshold = list(c(-Inf, -2, -1, 0, 1, 2, Inf), +#' c(-Inf, 0, Inf))) +#' @export +asCategorical = function(x, threshold = c(-Inf, 0, Inf), + include.lowest = TRUE, right = FALSE) { + if (!is.matrix(x)) { + x = as.matrix(x) + } + if (is.numeric(threshold)) { + if (ncol(x) > 1) { + stop("When x contains more than one column, you must supply a list of thresholds! See examples.") + } + threshold = list(threshold) + } + for (trt in 1:ncol(x)) { + if (!is.null(threshold[[trt]])) { + x[, trt] = as.numeric(cut(x = x[, trt], breaks = threshold[[trt]], + include.lowest = include.lowest, right = right)) + } + } + return(x) +} +#' @title Convert a normal (Gaussian) trait to a count (Poisson) trait +#' @param x matrix, values for one or more traits (if not a matrix, +#' we cast to a matrix) +#' @param TODO numeric or list, when numeric, provide a vector of TODO +#' @return matrix of values with some traits recoded as counts +#' @details If input trait is normal (Gaussian) then this function generates a +#' count trait according to the Poisson generalised linear model. +#' @examples +#' founderPop = quickHaplo(nInd=20, nChr=1, segSites=10) +#' SP = SimParam$new(founderPop) +#' \dontshow{SP$nThreads = 1L} +#' SP$addTraitA(nQtlPerChr = 10, mean = c(0, 0), var = c(1, 2), +#' corA = matrix(data = c(1.0, 0.6, +#' 0.6, 1.0), ncol = 2)) +#' pop = newPop(founderPop) +#' pop = setPheno(pop, varE = c(1, 1)) +#' pheno(pop) +#' #Convert a single input trait +#' asCount(x = pheno(pop)[, 2]) +#' asCount(x = pheno(pop)[, 2], TODO = c(-1, 0, 1)) +#' asCount(x = pheno(pop)[, 2], TODO = c(-Inf, -1, 0, 1, Inf)) +#' #Convert multiple input traits +#' try(asCount(x = pheno(pop))) +#' asCount(x = pheno(pop), +#' TODO = list(NULL, +#' ???)) +#' TODO export +# asCount = function(x, TODO = 10) { +# if (!is.matrix(x)) { +# x = as.matrix(x) +# } +# if (is.numeric(TODO)) { +# if (ncol(x) > 1) { +# stop("When x contains more than one column, you must supply a list of TODO! See examples.") +# } +# TODO = list(TODO) +# } +# for (trt in 1:ncol(x)) { +# if (!is.null(TODO[[trt]])) { +# TODO: need to think what to do with an intercept and lambda +# x[, trt] = round(exp(x[, trt])) +# } +# } +# return(x) +# } diff --git a/tests/testthat/test_phenotypes.R b/tests/testthat/test_phenotypes.R new file mode 100644 index 00000000..38122112 --- /dev/null +++ b/tests/testthat/test_phenotypes.R @@ -0,0 +1,22 @@ +context("phenotypes") + +test_that("asCategorical_converts_correctly",{ + cont = matrix(data = 0, nrow = 7, ncol = 3) + cont[, 1] = c(-3, -2, -1, 0, 1, 2, 3) + cont[, 2] = c(-3, -2, -1, 0, 1, 2, 3) + cont[, 3] = c(-3, -2, -1, 0, 1, 2, 3) + expect_equal(asCategorical(x = cont[, 2]), + matrix(c(1, 1, 1, 2, 2, 2, 2))) + expect_equal(asCategorical(x = cont[, 2], threshold = c(-1, 0, 1)), + matrix(c(NA, NA, 1, 2, 2, NA, NA))) + expect_equal(asCategorical(x = cont[, 2], threshold = c(-Inf, -1, 0, 1, Inf)), + matrix(c(1, 1, 2, 3, 4, 4, 4))) + expect_error(asCategorical(x = cont)) + cont2 = asCategorical(x = cont, + threshold = list(NULL, + c(-Inf, 0, Inf), + NULL)) + cont2Exp = cont + cont2Exp[, 2] = c(1, 1, 1, 2, 2, 2, 2) + expect_equal(cont2, cont2Exp) +}) From 16d01206691e091d722a8a656019e9d41e97b53a Mon Sep 17 00:00:00 2001 From: Gregor Gorjanc Date: Thu, 12 Dec 2024 19:51:55 +0000 Subject: [PATCH 03/19] asCategorical() added option to provide p instead of thresholds --- R/phenotypes.R | 119 +++++++++++++++++++++++++------ tests/testthat/test_phenotypes.R | 28 +++++++- 2 files changed, 124 insertions(+), 23 deletions(-) diff --git a/R/phenotypes.R b/R/phenotypes.R index 93410f06..28c3fac5 100644 --- a/R/phenotypes.R +++ b/R/phenotypes.R @@ -265,14 +265,26 @@ setPheno = function(pop, h2=NULL, H2=NULL, varE=NULL, corE=NULL, #' trait #' @param x matrix, values for one or more traits (if not a matrix, #' we cast to a matrix) -#' @param threshold numeric or list, when numeric, provide a vector of threshold -#' values to convert continuous values into categories for a single trait +#' @param p NULL, numeric or list, when \code{NULL} the \code{threshold} argument +#' takes precedence; when numeric, provide a vector of probabilities of +#' categories to convert continuous values into categories for a single trait +#' (if probabilities do not sum to 1, another category is added and a warning +#' is raised); when list, provide a list of numeric probabilities - list node +#' with \code{NULL} will skip conversion for a specific trait (see examples); +#' internally \code{p} is converted to \code{threshold} hence input +#' \code{threshold} is overwritten +#' @param mean numeric, assumed mean(s) of the normal (Gaussian) trait(s); +#' used only when \code{p} is given +#' @param var numeric, assumed variance(s) of the normal (Gaussian) trait(s); +#' used only when \code{p} is given +#' @param threshold NULL, numeric or list, when numeric, provide a vector of +#' threshold values to convert continuous values into categories for a single trait #' (the thresholds specify left-closed and right-opened intervals [t1, t2), #' which can be changed with \code{include.lowest} and \code{right}; #' ensure you add \code{-Inf} and \code{Inf} or min and max to cover the whole #' range of values; otherwise you will get \code{NA} values); -#' when list, provide a list of numeric thresholds with \code{NULL} to skip -#' conversion for a specific trait (see examples) +#' when list, provide a list of numeric thresholds - list node with \code{NULL} +#' will skip conversion for a specific trait (see examples) #' @param include.lowest logical, see \code{\link{cut}} #' @param right logical, see \code{\link{cut}} #' @details If input trait is normal (Gaussian) then this function generates a @@ -283,38 +295,104 @@ setPheno = function(pop, h2=NULL, H2=NULL, varE=NULL, corE=NULL, #' founderPop = quickHaplo(nInd=10, nChr=1, segSites=10) #' SP = SimParam$new(founderPop) #' \dontshow{SP$nThreads = 1L} -#' SP$addTraitA(nQtlPerChr = 10, mean = c(0, 0), var = c(1, 2), +#' trtMean = c(0, 0) +#' trtVarG = c(1, 2) +#' SP$addTraitA(nQtlPerChr = 10, mean = trtMean, var = trtVarG, #' corA = matrix(data = c(1.0, 0.6, #' 0.6, 1.0), ncol = 2)) #' pop = newPop(founderPop) -#' pop = setPheno(pop, varE = c(1, 1)) +#' trtVarE = c(1, 1) +#' trtVarP = trtVarG + trtVarE +#' pop = setPheno(pop, varE = trtVarE) #' pheno(pop) +#' #' #Convert a single input trait -#' asCategorical(x = pheno(pop)[, 2]) -#' #Demonstrate threshold argument -#' asCategorical(x = pheno(pop)[, 2], threshold = c(-1, 0, 1)) -#' asCategorical(x = pheno(pop)[, 2], threshold = c(-Inf, -1, 0, 1, Inf)) -#' #Convert multiple input traits +#' asCategorical(x = pheno(pop)[, 1]) +#' +#' #Demonstrate threshold argument (in units of pheno SD) +#' asCategorical(x = pheno(pop)[, 1], threshold = c(-1, 0, 1) * sqrt(trtVarP[1])) +#' asCategorical(x = pheno(pop)[, 1], threshold = c(-Inf, -1, 0, 1, Inf) * sqrt(trtVarP[1])) +#' asCategorical(x = pheno(pop)[, 1], threshold = c(-Inf, 0, Inf)) +#' +#' #Demonstrate p argument +#' asCategorical(x = pheno(pop)[, 1], p = 0.5, var = trtVarP[1]) +#' asCategorical(x = pheno(pop)[, 1], p = c(0.5, 0.5), var = trtVarP[1]) +#' asCategorical(x = pheno(pop)[, 1], p = c(0.25, 0.5, 0.25), var = trtVarP[1]) +#' +#' #Convert multiple input traits (via threshold or p argument) #' try(asCategorical(x = pheno(pop))) #' asCategorical(x = pheno(pop), -#' threshold = list(NULL, -#' c(-Inf, 0, Inf))) +#' threshold = list(c(-Inf, 0, Inf), +#' NULL)) +#' try(asCategorical(x = pheno(pop), p = c(0.5, 0.5))) +#' asCategorical(x = pheno(pop), +#' p = list(c(0.5, 0.5), +#' NULL), +#' mean = trtMean, var = trtVarP) +#' +#' asCategorical(x = pheno(pop), +#' threshold = list(c(-Inf, 0, Inf), +#' c(-Inf, -2, -1, 0, 1, 2, Inf) * sqrt(trtVarP[2]))) +#' q = c(-2, -1, 0, 1, 2) +#' p = pnorm(q) +#' p = c(p[1], p[2]-p[1], p[3]-p[2], p[4]-p[3], p[5]-p[4], 1-p[5]) #' asCategorical(x = pheno(pop), -#' threshold = list(c(-Inf, -2, -1, 0, 1, 2, Inf), -#' c(-Inf, 0, Inf))) +#' p = list(c(0.5, 0.5), +#' p), +#' mean = trtMean, var = trtVarP) #' @export -asCategorical = function(x, threshold = c(-Inf, 0, Inf), +asCategorical = function(x, p = NULL, mean = 0, var = 1, + threshold = c(-Inf, 0, Inf), include.lowest = TRUE, right = FALSE) { if (!is.matrix(x)) { x = as.matrix(x) } + nTraits = ncol(x) + if (!is.null(p)) { + if (is.numeric(p)) { + if (nTraits > 1) { + stop("When x contains more than one column, you must supply a list of probabilities! See examples.") + } + p = list(p) + } + if (length(p) != nTraits) { + stop("You must supply probabilities for all traits in x !") + } + if (length(mean) != nTraits) { + stop("You must supply means for all traits in x !") + } + if (length(var) != nTraits) { + stop("You must supply variances for all traits in x !") + } + threshold = p + for (trt in 1:nTraits) { + if (!is.null(p[[trt]])) { + pSum = sum(p[[trt]]) + if (!(pSum == 1)) { # TODO: how can we do floating point aware comparison? + warning("Probabilities do not sum to 1 - creating one more category!") + p[[trt]] = c(p[[trt]], 1 - pSum) + } + tmp = qnorm(p = cumsum(p[[trt]]), mean = mean[trt], sd = sqrt(var[trt])) + if (!(-Inf %in% tmp)) { + tmp = c(-Inf, tmp) + } + if (!(Inf %in% tmp)) { + tmp = c(tmp, Inf) + } + threshold[[trt]] = tmp + } + } + } if (is.numeric(threshold)) { - if (ncol(x) > 1) { + if (nTraits > 1) { stop("When x contains more than one column, you must supply a list of thresholds! See examples.") } threshold = list(threshold) } - for (trt in 1:ncol(x)) { + if (length(threshold) != nTraits) { + stop("You must supply thresholds for all traits in x !") + } + for (trt in 1:nTraits) { if (!is.null(threshold[[trt]])) { x[, trt] = as.numeric(cut(x = x[, trt], breaks = threshold[[trt]], include.lowest = include.lowest, right = right)) @@ -354,13 +432,14 @@ asCategorical = function(x, threshold = c(-Inf, 0, Inf), # if (!is.matrix(x)) { # x = as.matrix(x) # } +# nTraits = ncol(x) # if (is.numeric(TODO)) { -# if (ncol(x) > 1) { +# if (nTraits > 1) { # stop("When x contains more than one column, you must supply a list of TODO! See examples.") # } # TODO = list(TODO) # } -# for (trt in 1:ncol(x)) { +# for (trt in 1:nTraits) { # if (!is.null(TODO[[trt]])) { # TODO: need to think what to do with an intercept and lambda # x[, trt] = round(exp(x[, trt])) diff --git a/tests/testthat/test_phenotypes.R b/tests/testthat/test_phenotypes.R index 38122112..bf9b4bae 100644 --- a/tests/testthat/test_phenotypes.R +++ b/tests/testthat/test_phenotypes.R @@ -5,12 +5,25 @@ test_that("asCategorical_converts_correctly",{ cont[, 1] = c(-3, -2, -1, 0, 1, 2, 3) cont[, 2] = c(-3, -2, -1, 0, 1, 2, 3) cont[, 3] = c(-3, -2, -1, 0, 1, 2, 3) - expect_equal(asCategorical(x = cont[, 2]), + + expect_equal(asCategorical(x = cont[, 1]), matrix(c(1, 1, 1, 2, 2, 2, 2))) - expect_equal(asCategorical(x = cont[, 2], threshold = c(-1, 0, 1)), + expect_equal(asCategorical(x = cont[, 1], threshold = c(-1, 0, 1)), matrix(c(NA, NA, 1, 2, 2, NA, NA))) - expect_equal(asCategorical(x = cont[, 2], threshold = c(-Inf, -1, 0, 1, Inf)), + expect_equal(asCategorical(x = cont[, 1], threshold = c(-Inf, -1, 0, 1, Inf)), + matrix(c(1, 1, 2, 3, 4, 4, 4))) + + expect_warning(asCategorical(x = cont[, 1], p = 0.5)) + expect_equal(suppressWarnings(asCategorical(x = cont[, 1], p = 0.5)), + asCategorical(x = cont[, 1], p = c(0.5, 0.5))) + + trtMean = apply(X = cont, MARGIN = 2, FUN = mean) + trtVar = apply(X = cont, MARGIN = 2, FUN = var) + expect_equal(asCategorical(x = cont[, 1], p = c(0.5, 0.5), var = trtVar[1]), + matrix(c(1, 1, 1, 2, 2, 2, 2))) + expect_equal(asCategorical(x = cont[, 1], p = c(2/7, 1/7, 1/7, 3/7), var = trtVar[1]), matrix(c(1, 1, 2, 3, 4, 4, 4))) + expect_error(asCategorical(x = cont)) cont2 = asCategorical(x = cont, threshold = list(NULL, @@ -19,4 +32,13 @@ test_that("asCategorical_converts_correctly",{ cont2Exp = cont cont2Exp[, 2] = c(1, 1, 1, 2, 2, 2, 2) expect_equal(cont2, cont2Exp) + + expect_error(asCategorical(x = cont, p = c(0.5, 0.5))) + pList = list(NULL, c(0.5, 0.5), NULL) + expect_error(asCategorical(x = cont, p = pList)) + expect_error(asCategorical(x = cont, p = pList, mean = trtMean)) + cont2 = asCategorical(x = cont, p = pList, mean = trtMean, var=trtVar) + cont2Exp = cont + cont2Exp[, 2] = c(1, 1, 1, 2, 2, 2, 2) + expect_equal(cont2, cont2Exp) }) From 2928e9e05c19975d9947d8537b362fb0e406488b Mon Sep 17 00:00:00 2001 From: Gregor Gorjanc Date: Sun, 22 Dec 2024 08:28:56 +0000 Subject: [PATCH 04/19] File rename --- tests/testthat/{test_phenotypes.R => test-phenotypes.R} | 0 1 file changed, 0 insertions(+), 0 deletions(-) rename tests/testthat/{test_phenotypes.R => test-phenotypes.R} (100%) diff --git a/tests/testthat/test_phenotypes.R b/tests/testthat/test-phenotypes.R similarity index 100% rename from tests/testthat/test_phenotypes.R rename to tests/testthat/test-phenotypes.R From 555e3b6419968a43a221ed82fac7fb778a43d323 Mon Sep 17 00:00:00 2001 From: Chris Gaynor Date: Tue, 24 Dec 2024 21:00:19 -0600 Subject: [PATCH 05/19] Update NEWS.md Setting version to devel flag --- NEWS.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/NEWS.md b/NEWS.md index 8b2e5b9d..f6e0fc9f 100644 --- a/NEWS.md +++ b/NEWS.md @@ -1,4 +1,4 @@ -# AlphaSimR 1.6.2 +# AlphaSimR 1.6.1.9990 *added `asCategorical` to convert a normal (Gaussian) trait to an ordered categorical (threshold) trait From 59e2619fa7ae5a511cb87da052c22e39c9a88a4e Mon Sep 17 00:00:00 2001 From: Chris Gaynor Date: Thu, 2 Jan 2025 12:18:23 -0600 Subject: [PATCH 06/19] Provided internal documentation for all R functions --- DESCRIPTION | 4 +- NAMESPACE | 2 +- R/Class-Pop.R | 4 +- R/Class-SimParam.R | 46 ++++++++ R/GS.R | 12 ++ R/misc.R | 34 ++---- R/phenotypes.R | 20 +++- R/pullGeno.R | 23 +++- R/selection.R | 50 +++++--- man/addError.Rd | 19 +++ man/asCategorical.Rd | 111 ++++++++++++++++++ man/calcPheno.Rd | 25 ++++ man/checkSexes.Rd | 21 ++++ man/convertTraitsToNames.Rd | 22 ++++ man/dot-newPop.Rd | 3 +- man/getCandidates.Rd | 17 +++ man/getFam.Rd | 17 +++ man/getLociNames.Rd | 19 +++ man/getResponse.Rd | 24 ++++ man/rnormWithSeed.Rd | 20 ++++ man/sampAddEff.Rd | 27 +++++ man/sampDomEff.Rd | 30 +++++ man/sampEpiEff.Rd | 29 +++++ man/selectLoci.Rd | 23 ++++ man/transMat.Rd | 19 +-- .../articles/SimpleGeneticSimulations.Rmd | 31 +++++ vignettes/intro.R | 22 ++-- 27 files changed, 597 insertions(+), 77 deletions(-) create mode 100644 man/addError.Rd create mode 100644 man/asCategorical.Rd create mode 100644 man/calcPheno.Rd create mode 100644 man/checkSexes.Rd create mode 100644 man/convertTraitsToNames.Rd create mode 100644 man/getCandidates.Rd create mode 100644 man/getFam.Rd create mode 100644 man/getLociNames.Rd create mode 100644 man/getResponse.Rd create mode 100644 man/rnormWithSeed.Rd create mode 100644 man/sampAddEff.Rd create mode 100644 man/sampDomEff.Rd create mode 100644 man/sampEpiEff.Rd create mode 100644 man/selectLoci.Rd create mode 100644 vignettes/articles/SimpleGeneticSimulations.Rmd diff --git a/DESCRIPTION b/DESCRIPTION index 4c35cd3b..a8fa50f6 100644 --- a/DESCRIPTION +++ b/DESCRIPTION @@ -1,8 +1,8 @@ Package: AlphaSimR Type: Package Title: Breeding Program Simulations -Version: 1.6.1 -Date: 2024-11-01 +Version: 1.6.1.9000 +Date: 2025-01-02 Authors@R: c(person("Chris", "Gaynor", email = "gaynor.robert@hotmail.com", role = c("aut", "cre"), comment = c(ORCID = "0000-0003-0558-6656")), person("Gregor", "Gorjanc", role = "ctb", diff --git a/NAMESPACE b/NAMESPACE index 144729ec..7886b032 100644 --- a/NAMESPACE +++ b/NAMESPACE @@ -12,6 +12,7 @@ export(RRBLUP_SCA2) export(SimParam) export(aa) export(addSegSite) +export(asCategorical) export(attrition) export(bv) export(cChr) @@ -102,7 +103,6 @@ export(solveRRBLUP_EM) export(solveRRBLUP_EM2) export(solveRRBLUP_EM3) export(solveUVM) -export(transMat) export(usefulness) export(varA) export(varAA) diff --git a/R/Class-Pop.R b/R/Class-Pop.R index fecc8858..afbd34e3 100644 --- a/R/Class-Pop.R +++ b/R/Class-Pop.R @@ -615,7 +615,7 @@ newPop = function(rawPop,simParam=NULL,...){ return(.newPop(rawPop=rawPop,simParam=simParam,...)) } -#' @title Create new population (internal) +#' @title Create new population #' #' @description #' Creates a new \code{\link{Pop-class}} from an object of @@ -636,6 +636,8 @@ newPop = function(rawPop,simParam=NULL,...){ #' function in simParam #' #' @return Returns an object of \code{\link{Pop-class}} +#' +#' @keywords internal .newPop = function(rawPop, id=NULL, mother=NULL, father=NULL, iMother=NULL, iFather=NULL, isDH=NULL, femaleParentPop=NULL, maleParentPop=NULL, diff --git a/R/Class-SimParam.R b/R/Class-SimParam.R index 8fd0031b..c9d0d19a 100644 --- a/R/Class-SimParam.R +++ b/R/Class-SimParam.R @@ -2526,6 +2526,21 @@ SimParam = R6Class( ) #### External helpers ---- + +#' @title Sample additive effects +#' +#' @description Samples deviates from a normal distribution or gamma distribution +#' with a random sign +#' +#' @param qtlLoci total number of loci +#' @param nTraits number of traits +#' @param corr correlation between traits +#' @param gamma indicator of trait should use a gamma distribution +#' @param shape gamma distribution shape parameter +#' +#' @returns a matrix with dimensions qtlLoci by nTraits +#' +#' @keywords internal sampAddEff = function(qtlLoci,nTraits,corr,gamma,shape){ addEff = matrix(rnorm(qtlLoci@nLoci*nTraits), ncol=nTraits)%*%transMat(corr) @@ -2538,6 +2553,22 @@ sampAddEff = function(qtlLoci,nTraits,corr,gamma,shape){ return(addEff) } +#' @title Sample dominance effects +#' +#' @description Samples dominance deviation effects from a normal distribution +#' and uses previously sampled additive effects to form dominance +#' effects +#' +#' @param qtlLoci total number of loci +#' @param nTraits number of traits +#' @param addEff previously sampled additive effects +#' @param corDD correlation between dominance degrees +#' @param meanDD mean value of dominance degrees +#' @param varDD variance of dominance degrees +#' +#' @returns a matrix with dimensions qtlLoci by nTraits +#' +#' @keywords internal sampDomEff = function(qtlLoci,nTraits,addEff,corDD, meanDD,varDD){ domEff = matrix(rnorm(qtlLoci@nLoci*nTraits), @@ -2548,6 +2579,21 @@ sampDomEff = function(qtlLoci,nTraits,addEff,corDD, return(domEff) } +#' @title Sample epistatic effects +#' +#' @description Samples epistatic effects from a normal distribution or gamma distribution +#' with a variance relative to the variance of the additive effects +#' +#' @param qtlLoci total number of loci +#' @param nTraits number of traits +#' @param corr correlation between epistatic effects +#' @param gamma indicator of trait should use a gamma distribution +#' @param shape gamma distribution shape parameter +#' @param relVar desired variance for epistatic effects +#' +#' @returns a matrix with dimensions qtlLoci by nTraits +#' +#' @keywords internal sampEpiEff = function(qtlLoci,nTraits,corr,gamma,shape,relVar){ epiEff = matrix(rnorm(qtlLoci@nLoci*nTraits/2), ncol=nTraits)%*%transMat(corr) diff --git a/R/GS.R b/R/GS.R index 0f65ac2c..f99cca77 100644 --- a/R/GS.R +++ b/R/GS.R @@ -1,3 +1,15 @@ +#' @title Convert traits to a vector of names +#' +#' @description This function processes the traits arguments from GS models. It usually +#' receives a number indicating the trait, but may also be the trait name +#' itself or a custom function. +#' +#' @param traits the traits argument from a GS model +#' @param simParam the simulation parameters object +#' +#' @returns a vector of names for traits +#' +#' @keywords internal convertTraitsToNames = function(traits, simParam){ if(is.character(traits)){ # Suspect trait is a name diff --git a/R/misc.R b/R/misc.R index 6f6d0ef0..c88b37e2 100644 --- a/R/misc.R +++ b/R/misc.R @@ -442,24 +442,8 @@ usefulness = function(pop,trait=1,use="gv",p=0.1, #' input matrix wasn't too far removed from a valid #' correlation matrix. #' -#' @examples -#' # Create an 2x2 correlation matrix -#' R = 0.5*diag(2) + 0.5 -#' -#' # Sample 1000 uncorrelated deviates from a -#' # bivariate standard normal distribution -#' X = matrix(rnorm(2*1000), ncol=2) -#' -#' # Compute the transformation matrix -#' T = transMat(R) -#' -#' # Apply the transformation to the deviates -#' Y = X%*%T -#' -#' # Measure the sample correlation -#' cor(Y) #' -#' @export +#' @keywords internal transMat = function(R){ # Check if matrix is symmetric # Stop if it is not @@ -637,10 +621,18 @@ attrition = function(pop, p){ return(pop[take]) } -# Sample deviates from a standard normal distribution -# n is the number of deviates -# u is a deviate from a uniform distribution [0,1] -# Seed is generated from u + +#' Sample normal deviates using a seed +#' +#' This function is designed for future expansion. The intent is to use it +#' to develop a GxE model based on compound symmetry. In this model, the +#' p-value used in the current GxE model will serve a a seed for sampling +#' environmental effects. +#' +#' @param n number of deviates to sample +#' @param u seed +#' +#' @keywords internal rnormWithSeed = function(n, u){ glbEnv = globalenv() origSeed = glbEnv$.Random.seed diff --git a/R/phenotypes.R b/R/phenotypes.R index 28c3fac5..72d3b64f 100644 --- a/R/phenotypes.R +++ b/R/phenotypes.R @@ -1,4 +1,10 @@ -#Adds random error to a matrix of genetic values +#' Add residual error to genetic values +#' +#' @param gv matrix of genetic values +#' @param varE residual variances, vector or matrix +#' @param reps number of reps for phenotype +#' +#' @keywords internal addError = function(gv, varE, reps){ nTraits = ncol(gv) nInd = nrow(gv) @@ -24,7 +30,17 @@ addError = function(gv, varE, reps){ return(pheno) } -#See setPheno documentation + +#' Calculate phenotypes +#' +#' @param pop an object of class Pop +#' @param varE a vector or matrix of residual variances +#' @param reps number of reps for phenotype +#' @param p p-value for environment +#' @param traits number of traits +#' @param simParam simulation parameters object +#' +#' @keywords internal calcPheno = function(pop, varE, reps, p, traits, simParam){ nTraits = length(traits) diff --git a/R/pullGeno.R b/R/pullGeno.R index df27c650..f7adec02 100644 --- a/R/pullGeno.R +++ b/R/pullGeno.R @@ -1,3 +1,15 @@ +#' Find loci on specific chromosomes +#' +#' This function alters the lociPerChr and lociLoc vectors to reflect +#' only loci on a specific chromosomes. +#' +#' @param chr chromosomes to select +#' @param inLociPerChr original lociPerChr vector +#' @param inLociLoc original lociLoc vector +#' +#' @return a list with lociPerChr and lociLoc +#' +#' @keywords internal selectLoci = function(chr, inLociPerChr, inLociLoc){ if(is.null(chr)){ return(list(lociPerChr=inLociPerChr, @@ -25,10 +37,13 @@ selectLoci = function(chr, inLociPerChr, inLociLoc){ lociLoc=outLociLoc)) } -# Retrieves Marker names from genMap -# lociPerChr, number of loci per chromosome -# lociLoc, position of loci on chromosome -# genMap, genetic map with names +#' Retrieves marker names from genMap +#' +#' @param lociPerChr number of loci per chromosome +#' @param lociLoc position of loci on chromosome +#' @param genMap internal AlphaSimR genetic map with names +#' +#' @keywords internal getLociNames = function(lociPerChr, lociLoc, genMap){ lociNames = character(length(lociLoc)) start = end = 0L diff --git a/R/selection.R b/R/selection.R index 9d70d09c..0cb94ee1 100644 --- a/R/selection.R +++ b/R/selection.R @@ -1,14 +1,13 @@ -# Returns a vector response from a population -# pop is an object of class Pop or HybridPop -# trait is a vector of traits or a function -# (must work like trait(pop@use, ...) or trait(use(pop, ...), ...)). -# See examples in \code{selectInd}. -# use is a character ("rand", "gv", "ebv", "pheno", or "bv"; note that -# "bv" doesn't work on class HybridPop) -# or a function (must work like use(pop, trait, ...)). -# See examples in \code{selectInd}. -# simParam is an object of class SimParam, it is only called when use="bv" -# ... are additional arguments passed to trait when trait is a function +#' Returns a vector response from a population +#' +#' @param pop an object of class Pop or HybirdPop +#' @param trait a vector or custom function +#' @param use a character ("rand", "gv", "ebv", "pheno", or "bv"; +#' note that "bv" doesn't work on class HybridPop) +#' @param simParam simulation parameters are only used when use="bv" +#' @param ... are additional arguments passed to trait when trait is a function +#' +#' @keywords internal getResponse = function(pop,trait,use,simParam=NULL,...){ if(is(trait,"function")){ if(is.character(use)){ @@ -73,9 +72,14 @@ getResponse = function(pop,trait,use,simParam=NULL,...){ return(response) } -# Converts candidates to a vector of positive numbers -# for the individuals that are candidates. This function -# handle indexing by id and negative value indexing +#' Identify candidate individuals +#' +#' This function handles indexing by id and negative value indexing +#' +#' @param pop a population object +#' @param candidates a vector of candidates (numeric or character) +#' +#' @keywords internal getCandidates = function(pop, candidates){ if(is.character(candidates)){ candidates = match(candidates, pop@id) @@ -94,7 +98,16 @@ getCandidates = function(pop, candidates){ return(candidates) } -# Returns a vector of individuals in a population with the required sex +#' Find individuals of desired sex +#' +#' This function ignores sex if sexes is "no" in SimParam. +#' +#' @param pop a population +#' @param sex the desired sex (M or F) +#' @param simParam simulation parameters object +#' @param ... captures use of old gender argument +#' +#' @keywords internal checkSexes = function(pop,sex,simParam,...){ sex = toupper(sex) eligible = 1:pop@nInd @@ -116,7 +129,12 @@ checkSexes = function(pop,sex,simParam,...){ } } -# Returns a vector of families +#' Determine families +#' +#' @param pop a population object +#' @param famType type of family (fullsib "B", maternal "M", or paternal "F") +#' +#' @keywords internal getFam = function(pop,famType){ famType = toupper(famType) if(famType=="B"){ diff --git a/man/addError.Rd b/man/addError.Rd new file mode 100644 index 00000000..79213747 --- /dev/null +++ b/man/addError.Rd @@ -0,0 +1,19 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/phenotypes.R +\name{addError} +\alias{addError} +\title{Add residual error to genetic values} +\usage{ +addError(gv, varE, reps) +} +\arguments{ +\item{gv}{matrix of genetic values} + +\item{varE}{residual variances, vector or matrix} + +\item{reps}{number of reps for phenotype} +} +\description{ +Add residual error to genetic values +} +\keyword{internal} diff --git a/man/asCategorical.Rd b/man/asCategorical.Rd new file mode 100644 index 00000000..1060b9d2 --- /dev/null +++ b/man/asCategorical.Rd @@ -0,0 +1,111 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/phenotypes.R +\name{asCategorical} +\alias{asCategorical} +\title{Convert a normal (Gaussian) trait to an ordered categorical (threshold) + trait} +\usage{ +asCategorical( + x, + p = NULL, + mean = 0, + var = 1, + threshold = c(-Inf, 0, Inf), + include.lowest = TRUE, + right = FALSE +) +} +\arguments{ +\item{x}{matrix, values for one or more traits (if not a matrix, +we cast to a matrix)} + +\item{p}{NULL, numeric or list, when \code{NULL} the \code{threshold} argument +takes precedence; when numeric, provide a vector of probabilities of +categories to convert continuous values into categories for a single trait +(if probabilities do not sum to 1, another category is added and a warning +is raised); when list, provide a list of numeric probabilities - list node +with \code{NULL} will skip conversion for a specific trait (see examples); +internally \code{p} is converted to \code{threshold} hence input +\code{threshold} is overwritten} + +\item{mean}{numeric, assumed mean(s) of the normal (Gaussian) trait(s); +used only when \code{p} is given} + +\item{var}{numeric, assumed variance(s) of the normal (Gaussian) trait(s); +used only when \code{p} is given} + +\item{threshold}{NULL, numeric or list, when numeric, provide a vector of +threshold values to convert continuous values into categories for a single trait +(the thresholds specify left-closed and right-opened intervals [t1, t2), +which can be changed with \code{include.lowest} and \code{right}; +ensure you add \code{-Inf} and \code{Inf} or min and max to cover the whole +range of values; otherwise you will get \code{NA} values); +when list, provide a list of numeric thresholds - list node with \code{NULL} +will skip conversion for a specific trait (see examples)} + +\item{include.lowest}{logical, see \code{\link{cut}}} + +\item{right}{logical, see \code{\link{cut}}} +} +\value{ +matrix of values with some traits recorded as ordered categories + in the form of 1:nC with nC being the number of categories. +} +\description{ +Convert a normal (Gaussian) trait to an ordered categorical (threshold) + trait +} +\details{ +If input trait is normal (Gaussian) then this function generates a + categorical trait according to the ordered probit model. +} +\examples{ +founderPop = quickHaplo(nInd=10, nChr=1, segSites=10) +SP = SimParam$new(founderPop) +\dontshow{SP$nThreads = 1L} +trtMean = c(0, 0) +trtVarG = c(1, 2) +SP$addTraitA(nQtlPerChr = 10, mean = trtMean, var = trtVarG, + corA = matrix(data = c(1.0, 0.6, + 0.6, 1.0), ncol = 2)) +pop = newPop(founderPop) +trtVarE = c(1, 1) +trtVarP = trtVarG + trtVarE +pop = setPheno(pop, varE = trtVarE) +pheno(pop) + +#Convert a single input trait +asCategorical(x = pheno(pop)[, 1]) + +#Demonstrate threshold argument (in units of pheno SD) +asCategorical(x = pheno(pop)[, 1], threshold = c(-1, 0, 1) * sqrt(trtVarP[1])) +asCategorical(x = pheno(pop)[, 1], threshold = c(-Inf, -1, 0, 1, Inf) * sqrt(trtVarP[1])) +asCategorical(x = pheno(pop)[, 1], threshold = c(-Inf, 0, Inf)) + +#Demonstrate p argument +asCategorical(x = pheno(pop)[, 1], p = 0.5, var = trtVarP[1]) +asCategorical(x = pheno(pop)[, 1], p = c(0.5, 0.5), var = trtVarP[1]) +asCategorical(x = pheno(pop)[, 1], p = c(0.25, 0.5, 0.25), var = trtVarP[1]) + +#Convert multiple input traits (via threshold or p argument) +try(asCategorical(x = pheno(pop))) +asCategorical(x = pheno(pop), + threshold = list(c(-Inf, 0, Inf), + NULL)) +try(asCategorical(x = pheno(pop), p = c(0.5, 0.5))) +asCategorical(x = pheno(pop), + p = list(c(0.5, 0.5), + NULL), + mean = trtMean, var = trtVarP) + +asCategorical(x = pheno(pop), + threshold = list(c(-Inf, 0, Inf), + c(-Inf, -2, -1, 0, 1, 2, Inf) * sqrt(trtVarP[2]))) +q = c(-2, -1, 0, 1, 2) +p = pnorm(q) +p = c(p[1], p[2]-p[1], p[3]-p[2], p[4]-p[3], p[5]-p[4], 1-p[5]) +asCategorical(x = pheno(pop), + p = list(c(0.5, 0.5), + p), + mean = trtMean, var = trtVarP) +} diff --git a/man/calcPheno.Rd b/man/calcPheno.Rd new file mode 100644 index 00000000..a5db2224 --- /dev/null +++ b/man/calcPheno.Rd @@ -0,0 +1,25 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/phenotypes.R +\name{calcPheno} +\alias{calcPheno} +\title{Calculate phenotypes} +\usage{ +calcPheno(pop, varE, reps, p, traits, simParam) +} +\arguments{ +\item{pop}{an object of class Pop} + +\item{varE}{a vector or matrix of residual variances} + +\item{reps}{number of reps for phenotype} + +\item{p}{p-value for environment} + +\item{traits}{number of traits} + +\item{simParam}{simulation parameters object} +} +\description{ +Calculate phenotypes +} +\keyword{internal} diff --git a/man/checkSexes.Rd b/man/checkSexes.Rd new file mode 100644 index 00000000..17b61d2f --- /dev/null +++ b/man/checkSexes.Rd @@ -0,0 +1,21 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/selection.R +\name{checkSexes} +\alias{checkSexes} +\title{Find individuals of desired sex} +\usage{ +checkSexes(pop, sex, simParam, ...) +} +\arguments{ +\item{pop}{a population} + +\item{sex}{the desired sex (M or F)} + +\item{simParam}{simulation parameters object} + +\item{...}{captures use of old gender argument} +} +\description{ +This function ignores sex if sexes is "no" in SimParam. +} +\keyword{internal} diff --git a/man/convertTraitsToNames.Rd b/man/convertTraitsToNames.Rd new file mode 100644 index 00000000..766286ae --- /dev/null +++ b/man/convertTraitsToNames.Rd @@ -0,0 +1,22 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/GS.R +\name{convertTraitsToNames} +\alias{convertTraitsToNames} +\title{Convert traits to a vector of names} +\usage{ +convertTraitsToNames(traits, simParam) +} +\arguments{ +\item{traits}{the traits argument from a GS model} + +\item{simParam}{the simulation parameters object} +} +\value{ +a vector of names for traits +} +\description{ +This function processes the traits arguments from GS models. It usually +receives a number indicating the trait, but may also be the trait name +itself or a custom function. +} +\keyword{internal} diff --git a/man/dot-newPop.Rd b/man/dot-newPop.Rd index 3b9ce34a..223a9f1a 100644 --- a/man/dot-newPop.Rd +++ b/man/dot-newPop.Rd @@ -2,7 +2,7 @@ % Please edit documentation in R/Class-Pop.R \name{.newPop} \alias{.newPop} -\title{Create new population (internal)} +\title{Create new population} \usage{ .newPop( rawPop, @@ -52,3 +52,4 @@ Returns an object of \code{\link{Pop-class}} Creates a new \code{\link{Pop-class}} from an object of of the Pop superclass. } +\keyword{internal} diff --git a/man/getCandidates.Rd b/man/getCandidates.Rd new file mode 100644 index 00000000..587a14c2 --- /dev/null +++ b/man/getCandidates.Rd @@ -0,0 +1,17 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/selection.R +\name{getCandidates} +\alias{getCandidates} +\title{Identify candidate individuals} +\usage{ +getCandidates(pop, candidates) +} +\arguments{ +\item{pop}{a population object} + +\item{candidates}{a vector of candidates (numeric or character)} +} +\description{ +This function handles indexing by id and negative value indexing +} +\keyword{internal} diff --git a/man/getFam.Rd b/man/getFam.Rd new file mode 100644 index 00000000..adf31119 --- /dev/null +++ b/man/getFam.Rd @@ -0,0 +1,17 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/selection.R +\name{getFam} +\alias{getFam} +\title{Determine families} +\usage{ +getFam(pop, famType) +} +\arguments{ +\item{pop}{a population object} + +\item{famType}{type of family (fullsib "B", maternal "M", or paternal "F")} +} +\description{ +Determine families +} +\keyword{internal} diff --git a/man/getLociNames.Rd b/man/getLociNames.Rd new file mode 100644 index 00000000..a4fa6ef3 --- /dev/null +++ b/man/getLociNames.Rd @@ -0,0 +1,19 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/pullGeno.R +\name{getLociNames} +\alias{getLociNames} +\title{Retrieves marker names from genMap} +\usage{ +getLociNames(lociPerChr, lociLoc, genMap) +} +\arguments{ +\item{lociPerChr}{number of loci per chromosome} + +\item{lociLoc}{position of loci on chromosome} + +\item{genMap}{internal AlphaSimR genetic map with names} +} +\description{ +Retrieves marker names from genMap +} +\keyword{internal} diff --git a/man/getResponse.Rd b/man/getResponse.Rd new file mode 100644 index 00000000..3adc1dca --- /dev/null +++ b/man/getResponse.Rd @@ -0,0 +1,24 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/selection.R +\name{getResponse} +\alias{getResponse} +\title{Returns a vector response from a population} +\usage{ +getResponse(pop, trait, use, simParam = NULL, ...) +} +\arguments{ +\item{pop}{an object of class Pop or HybirdPop} + +\item{trait}{a vector or custom function} + +\item{use}{a character ("rand", "gv", "ebv", "pheno", or "bv"; +note that "bv" doesn't work on class HybridPop)} + +\item{simParam}{simulation parameters are only used when use="bv"} + +\item{...}{are additional arguments passed to trait when trait is a function} +} +\description{ +Returns a vector response from a population +} +\keyword{internal} diff --git a/man/rnormWithSeed.Rd b/man/rnormWithSeed.Rd new file mode 100644 index 00000000..59b27da7 --- /dev/null +++ b/man/rnormWithSeed.Rd @@ -0,0 +1,20 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/misc.R +\name{rnormWithSeed} +\alias{rnormWithSeed} +\title{Sample normal deviates using a seed} +\usage{ +rnormWithSeed(n, u) +} +\arguments{ +\item{n}{number of deviates to sample} + +\item{u}{seed} +} +\description{ +This function is designed for future expansion. The intent is to use it +to develop a GxE model based on compound symmetry. In this model, the +p-value used in the current GxE model will serve a a seed for sampling +environmental effects. +} +\keyword{internal} diff --git a/man/sampAddEff.Rd b/man/sampAddEff.Rd new file mode 100644 index 00000000..490356da --- /dev/null +++ b/man/sampAddEff.Rd @@ -0,0 +1,27 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/Class-SimParam.R +\name{sampAddEff} +\alias{sampAddEff} +\title{Sample additive effects} +\usage{ +sampAddEff(qtlLoci, nTraits, corr, gamma, shape) +} +\arguments{ +\item{qtlLoci}{total number of loci} + +\item{nTraits}{number of traits} + +\item{corr}{correlation between traits} + +\item{gamma}{indicator of trait should use a gamma distribution} + +\item{shape}{gamma distribution shape parameter} +} +\value{ +a matrix with dimensions qtlLoci by nTraits +} +\description{ +Samples deviates from a normal distribution or gamma distribution +with a random sign +} +\keyword{internal} diff --git a/man/sampDomEff.Rd b/man/sampDomEff.Rd new file mode 100644 index 00000000..f5e69a89 --- /dev/null +++ b/man/sampDomEff.Rd @@ -0,0 +1,30 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/Class-SimParam.R +\name{sampDomEff} +\alias{sampDomEff} +\title{Sample dominance effects} +\usage{ +sampDomEff(qtlLoci, nTraits, addEff, corDD, meanDD, varDD) +} +\arguments{ +\item{qtlLoci}{total number of loci} + +\item{nTraits}{number of traits} + +\item{addEff}{previously sampled additive effects} + +\item{corDD}{correlation between dominance degrees} + +\item{meanDD}{mean value of dominance degrees} + +\item{varDD}{variance of dominance degrees} +} +\value{ +a matrix with dimensions qtlLoci by nTraits +} +\description{ +Samples dominance deviation effects from a normal distribution +and uses previously sampled additive effects to form dominance +effects +} +\keyword{internal} diff --git a/man/sampEpiEff.Rd b/man/sampEpiEff.Rd new file mode 100644 index 00000000..e20cb3da --- /dev/null +++ b/man/sampEpiEff.Rd @@ -0,0 +1,29 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/Class-SimParam.R +\name{sampEpiEff} +\alias{sampEpiEff} +\title{Sample epistatic effects} +\usage{ +sampEpiEff(qtlLoci, nTraits, corr, gamma, shape, relVar) +} +\arguments{ +\item{qtlLoci}{total number of loci} + +\item{nTraits}{number of traits} + +\item{corr}{correlation between epistatic effects} + +\item{gamma}{indicator of trait should use a gamma distribution} + +\item{shape}{gamma distribution shape parameter} + +\item{relVar}{desired variance for epistatic effects} +} +\value{ +a matrix with dimensions qtlLoci by nTraits +} +\description{ +Samples epistatic effects from a normal distribution or gamma distribution +with a variance relative to the variance of the additive effects +} +\keyword{internal} diff --git a/man/selectLoci.Rd b/man/selectLoci.Rd new file mode 100644 index 00000000..5104c07d --- /dev/null +++ b/man/selectLoci.Rd @@ -0,0 +1,23 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/pullGeno.R +\name{selectLoci} +\alias{selectLoci} +\title{Find loci on specific chromosomes} +\usage{ +selectLoci(chr, inLociPerChr, inLociLoc) +} +\arguments{ +\item{chr}{chromosomes to select} + +\item{inLociPerChr}{original lociPerChr vector} + +\item{inLociLoc}{original lociLoc vector} +} +\value{ +a list with lociPerChr and lociLoc +} +\description{ +This function alters the lociPerChr and lociLoc vectors to reflect +only loci on a specific chromosomes. +} +\keyword{internal} diff --git a/man/transMat.Rd b/man/transMat.Rd index 16451077..d9f563f7 100644 --- a/man/transMat.Rd +++ b/man/transMat.Rd @@ -29,21 +29,4 @@ desired correlation, but will hopefully be close if the input matrix wasn't too far removed from a valid correlation matrix. } -\examples{ -# Create an 2x2 correlation matrix -R = 0.5*diag(2) + 0.5 - -# Sample 1000 uncorrelated deviates from a -# bivariate standard normal distribution -X = matrix(rnorm(2*1000), ncol=2) - -# Compute the transformation matrix -T = transMat(R) - -# Apply the transformation to the deviates -Y = X\%*\%T - -# Measure the sample correlation -cor(Y) - -} +\keyword{internal} diff --git a/vignettes/articles/SimpleGeneticSimulations.Rmd b/vignettes/articles/SimpleGeneticSimulations.Rmd new file mode 100644 index 00000000..f4337670 --- /dev/null +++ b/vignettes/articles/SimpleGeneticSimulations.Rmd @@ -0,0 +1,31 @@ +--- +title: "Simple Genetic Simulations" +author: "Chris Gaynor" +date: "`r Sys.Date()`" +output: rmarkdown::html_vignette +vignette: > + %\VignetteIndexEntry{Vignette Title} + %\VignetteEngine{knitr::rmarkdown} + %\VignetteEncoding{UTF-8} +--- + +```{r setup, include = FALSE} +knitr::opts_chunk$set( + collapse = TRUE, + comment = "#>" +) +``` + + + + +## Genotypes to phenotypes + + +## Creating new genotypes + + + + + + diff --git a/vignettes/intro.R b/vignettes/intro.R index 8deb9c57..4a64ddbd 100644 --- a/vignettes/intro.R +++ b/vignettes/intro.R @@ -1,29 +1,29 @@ ## ----eval=FALSE--------------------------------------------------------------- -# founderPop = quickHaplo(nInd=1000, nChr=10, segSites=1000) +# founderPop = quickHaplo(nInd=1000, nChr=10, segSites=1000) ## ----eval=FALSE--------------------------------------------------------------- -# SP = SimParam$new(founderPop) +# SP = SimParam$new(founderPop) ## ----eval=FALSE--------------------------------------------------------------- -# SP$addTraitA(nQtlPerChr=1000) +# SP$addTraitA(nQtlPerChr=1000) ## ----eval=FALSE--------------------------------------------------------------- -# SP$setSexes("yes_sys") +# SP$setSexes("yes_sys") ## ----eval=FALSE--------------------------------------------------------------- -# pop = newPop(founderPop) +# pop = newPop(founderPop) ## ----eval=FALSE--------------------------------------------------------------- -# genMean = meanG(pop) +# genMean = meanG(pop) ## ----eval=FALSE--------------------------------------------------------------- -# for(generation in 1:20){ -# pop = selectCross(pop=pop, nFemale=500, nMale=25, use="gv", nCrosses=1000) -# genMean = c(genMean, meanG(pop)) -# } +# for(generation in 1:20){ +# pop = selectCross(pop=pop, nFemale=500, nMale=25, use="gv", nCrosses=1000) +# genMean = c(genMean, meanG(pop)) +# } ## ----eval=FALSE--------------------------------------------------------------- -# plot(0:20, genMean, xlab="Generation", ylab="Mean Genetic Value", type="l") +# plot(0:20, genMean, xlab="Generation", ylab="Mean Genetic Value", type="l") ## ----message=FALSE, warning=FALSE--------------------------------------------- library(AlphaSimR) From 2c72418dda051abe8b41388f89e6a5cd4f8e9c4c Mon Sep 17 00:00:00 2001 From: Chris Gaynor Date: Thu, 2 Jan 2025 13:15:13 -0600 Subject: [PATCH 07/19] close #214 --- DESCRIPTION | 2 +- R/Class-SimParam.R | 25 ++++++++++++++++++++++--- man/SimParam.Rd | 22 ++++++++++++++++++++-- 3 files changed, 43 insertions(+), 6 deletions(-) diff --git a/DESCRIPTION b/DESCRIPTION index a8fa50f6..fe89da0b 100644 --- a/DESCRIPTION +++ b/DESCRIPTION @@ -1,7 +1,7 @@ Package: AlphaSimR Type: Package Title: Breeding Program Simulations -Version: 1.6.1.9000 +Version: 1.6.1.9001 Date: 2025-01-02 Authors@R: c(person("Chris", "Gaynor", email = "gaynor.robert@hotmail.com", role = c("aut", "cre"), comment = c(ORCID = "0000-0003-0558-6656")), diff --git a/R/Class-SimParam.R b/R/Class-SimParam.R index c9d0d19a..769052f5 100644 --- a/R/Class-SimParam.R +++ b/R/Class-SimParam.R @@ -188,8 +188,26 @@ SimParam = R6Class( invisible(self) }, - #' @description Sets restrictions on which segregating sites - #' can serve as a SNP and/or QTL. + #' @description Sets restrictions on which segregating sites can + #' serve as a SNP and/or QTL. The default behavior of AlphaSimR is + #' to randomly sample QTL or SNP from all eligible sites and then + #' mark the sampled sites ineligible to be sampled as the other + #' type (e.g. if a site is sampled as a QTL it will be marked as + #' ineligible to be sampled as a SNP). This behavior is designed + #' to produce the most challenging scenario for genomic selection + #' when the markers used for prediction are not causal. + #' + #' Setting overlap=TRUE will prevent the addition of loci to the + #' ineligible list, but it won't remove sites already added to these + #' lists. Thus, timing of when restrSegSites is called matters. It + #' should be called before any addTrait or addSnpChip functions with + #' the overlap=TRUE argument to freely allow loci to overlap. + #' + #' The minQtlPerChr and minSnpPerChr arguments can be used with + #' overlap=FALSE to preallocate sites as QTL and SNP respectively. This + #' option is useful when simulating multiple traits and/or SNP chips, + #' because it can be used to guarantee that enough eligible sites are + #' available when running addTrait and or addSnpChip functions. #' #' @param minQtlPerChr the minimum number of segregating sites for #' QTLs. Can be a single value or a vector values for each chromosome. @@ -2184,7 +2202,8 @@ SimParam = R6Class( pot[[chr]] = tmp[tmp%in%pot[[chr]]] } } - stopifnot(sapply(pot,length)>=nSitesPerChr) + + stopifnot("Not enough eligible sites, see ?SimParam and method restrSegSites()" = sapply(pot,length)>=nSitesPerChr) # Sample sites lociLoc = lapply(1:self$nChr,function(x){ diff --git a/man/SimParam.Rd b/man/SimParam.Rd index 2205a7ec..ff427163 100644 --- a/man/SimParam.Rd +++ b/man/SimParam.Rd @@ -581,8 +581,26 @@ pop2@id # 1:10 \if{html}{\out{}} \if{latex}{\out{\hypertarget{method-SimParam-restrSegSites}{}}} \subsection{Method \code{restrSegSites()}}{ -Sets restrictions on which segregating sites -can serve as a SNP and/or QTL. +Sets restrictions on which segregating sites can +serve as a SNP and/or QTL. The default behavior of AlphaSimR is +to randomly sample QTL or SNP from all eligible sites and then +mark the sampled sites ineligible to be sampled as the other +type (e.g. if a site is sampled as a QTL it will be marked as +ineligible to be sampled as a SNP). This behavior is designed +to produce the most challenging scenario for genomic selection +when the markers used for prediction are not causal. + +Setting overlap=TRUE will prevent the addition of loci to the +ineligible list, but it won't remove sites already added to these +lists. Thus, timing of when restrSegSites is called matters. It +should be called before any addTrait or addSnpChip functions with +the overlap=TRUE argument to freely allow loci to overlap. + +The minQtlPerChr and minSnpPerChr arguments can be used with +overlap=FALSE to preallocate sites as QTL and SNP respectively. This +option is useful when simulating multiple traits and/or SNP chips, +because it can be used to guarantee that enough eligible sites are +available when running addTrait and or addSnpChip functions. \subsection{Usage}{ \if{html}{\out{
}}\preformatted{SimParam$restrSegSites( minQtlPerChr = NULL, From 722cd517da20a3e10ebe1242f23c06c942414720 Mon Sep 17 00:00:00 2001 From: Chris Gaynor Date: Thu, 2 Jan 2025 14:09:28 -0600 Subject: [PATCH 08/19] Reintroduced count-location model for diploids when p=1 --- DESCRIPTION | 2 +- src/meiosis.cpp | 134 +++++++++++++++++++++++++----------------------- 2 files changed, 70 insertions(+), 66 deletions(-) diff --git a/DESCRIPTION b/DESCRIPTION index fe89da0b..743c11b8 100644 --- a/DESCRIPTION +++ b/DESCRIPTION @@ -1,7 +1,7 @@ Package: AlphaSimR Type: Package Title: Breeding Program Simulations -Version: 1.6.1.9001 +Version: 1.6.1.9002 Date: 2025-01-02 Authors@R: c(person("Chris", "Gaynor", email = "gaynor.robert@hotmail.com", role = c("aut", "cre"), comment = c(ORCID = "0000-0003-0558-6656")), diff --git a/src/meiosis.cpp b/src/meiosis.cpp index 44205e2a..718a1848 100644 --- a/src/meiosis.cpp +++ b/src/meiosis.cpp @@ -54,77 +54,84 @@ arma::Mat RecHist::getHist(arma::uword ind, } // Samples the locations for chiasmata via a gamma process -// start, the position downstream to start the gamma process (should be a negative value) // end, the length of the interval used to sample // v, the interference parameter // p, the proportion of non-interfering crossovers // n, the number of gamma deviates sampled at a time (affects performance, not results) -arma::vec sampleChiasmata(double start, double end, double v, +arma::vec sampleChiasmata(double end, double v, double p, arma::uword n=40){ if((1-p)<1e-6){ - // All chiasmata from type 2 pathway - // Changing v and p to model type 2 with gamma model - p = 0; - v = 1; - } - - if(p<1e-6){ // Gamma model - // Sample deviates from a gamma distribution - arma::vec output = arma::randg(n, arma::distr_param(v,1.0/(2.0*v))); - - // Find locations on genetic map - output = cumsum(output)+start; - - // Add additional values if max position less than end - while(output(output.n_elem-1)(n, arma::distr_param(v,1.0/(2.0*v))); - tmp = cumsum(tmp) + output(output.n_elem-1); - output = join_cols(output, tmp); - } - - // Remove values less than 0 - output = output(find(output>0)); - - // Select values less than the end - return output(find(output(n, arma::distr_param(v,1.0/(2.0*v*(1-p)))); - - // Find locations on genetic map - type1 = cumsum(type1)+start; + // No crossover interference + // Switching to count-location model + n = samplePoisson(end); + arma::vec x(n, arma::fill::randu); + return sort(x); - // Add additional values if max position less than end - while(type1(type1.n_elem-1)(n, arma::distr_param(v,1.0/(2.0*v*(1-p)))); - tmp = cumsum(tmp) + type1(type1.n_elem-1); - type1 = join_cols(type1, tmp); - } - - // Remove values less than 0 - type1 = type1(find(type1>0)); - - // Select values less than the end - type1 = type1(find(type1(n, arma::distr_param(1.0,1.0/(2.0*p))); + }else{ + // Using gamma or gamma-sprinkling model - // Find locations on genetic map - type2 = cumsum(type2); + // Choose a starting location 9-10 Morgans away + arma::vec u(1, arma::fill::randu); + double start = u(0)-10; - // Add additional values if max position less than end - while(type2(type2.n_elem-1)(n, arma::distr_param(1.0,1.0/(2.0*p))); - tmp = cumsum(tmp) + type2(type2.n_elem-1); - type2 = join_cols(type2, tmp); + if(p<1e-6){ // Gamma model + // Sample deviates from a gamma distribution + arma::vec output = arma::randg(n, arma::distr_param(v,1.0/(2.0*v))); + + // Find locations on genetic map + output = cumsum(output)+start; + + // Add additional values if max position less than end + while(output(output.n_elem-1)(n, arma::distr_param(v,1.0/(2.0*v))); + tmp = cumsum(tmp) + output(output.n_elem-1); + output = join_cols(output, tmp); + } + + // Remove values less than 0 + output = output(find(output>0)); + + // Select values less than the end + return output(find(output(n, arma::distr_param(v,1.0/(2.0*v*(1-p)))); + + // Find locations on genetic map + type1 = cumsum(type1)+start; + + // Add additional values if max position less than end + while(type1(type1.n_elem-1)(n, arma::distr_param(v,1.0/(2.0*v*(1-p)))); + tmp = cumsum(tmp) + type1(type1.n_elem-1); + type1 = join_cols(type1, tmp); + } + + // Remove values less than 0 + type1 = type1(find(type1>0)); + + // Select values less than the end + type1 = type1(find(type1(n, arma::distr_param(1.0,1.0/(2.0*p))); + + // Find locations on genetic map + type2 = cumsum(type2); + + // Add additional values if max position less than end + while(type2(type2.n_elem-1)(n, arma::distr_param(1.0,1.0/(2.0*p))); + tmp = cumsum(tmp) + type2(type2.n_elem-1); + type2 = join_cols(type2, tmp); + } + + // Select values less than the end + type2 = type2(find(type2 findBivalentCO(const arma::vec& genMap, double v, double p){ arma::uword startPos=0, endPos, readChr=0, nCO; double genLen = genMap(genMap.n_elem-1); - // Choose a starting location 9-10 Morgans away - arma::vec u(1, arma::fill::randu); - double start = u(0)-10; // Find crossover positions - arma::vec posCO = sampleChiasmata(start, genLen, v, p); + arma::vec posCO = sampleChiasmata(genLen, v, p); if(posCO.n_elem==0){ arma::Mat output(1,2,arma::fill::ones); return output; From ff2ae8cbd1b9ab7e792820f7ea9032ace2f09aa9 Mon Sep 17 00:00:00 2001 From: Chris Gaynor Date: Sun, 16 Feb 2025 16:34:10 -0600 Subject: [PATCH 09/19] added activeQtl to SimParam for adding traits, doesn't handle removing traits --- DESCRIPTION | 4 +- R/Class-SimParam.R | 111 ++++++++++++++++++ man/findLociMapSuperset.Rd | 25 ++++ man/findQtlIndex.Rd | 24 ++++ src/getGv.cpp | 4 +- src/meiosis.cpp | 16 +-- .../articles/SimpleGeneticSimulations.Rmd | 26 +++- 7 files changed, 194 insertions(+), 16 deletions(-) create mode 100644 man/findLociMapSuperset.Rd create mode 100644 man/findQtlIndex.Rd diff --git a/DESCRIPTION b/DESCRIPTION index 743c11b8..95ac2c4b 100644 --- a/DESCRIPTION +++ b/DESCRIPTION @@ -1,8 +1,8 @@ Package: AlphaSimR Type: Package Title: Breeding Program Simulations -Version: 1.6.1.9002 -Date: 2025-01-02 +Version: 1.6.1.9003 +Date: 2025-02-16 Authors@R: c(person("Chris", "Gaynor", email = "gaynor.robert@hotmail.com", role = c("aut", "cre"), comment = c(ORCID = "0000-0003-0558-6656")), person("Gregor", "Gorjanc", role = "ctb", diff --git a/R/Class-SimParam.R b/R/Class-SimParam.R index 769052f5..b5d5eb1e 100644 --- a/R/Class-SimParam.R +++ b/R/Class-SimParam.R @@ -89,6 +89,8 @@ SimParam = R6Class( private$.hasHap = logical() private$.hap = list() private$.isFounder = logical() + private$.activeQtl = NULL + private$.qtlIndex = list() invisible(self) }, @@ -2151,6 +2153,8 @@ SimParam = R6Class( .hasHap="logical", .hap="list", .isFounder="logical", + .activeQtl = "LociMap", + .qtlIndex = "integer", # Determines whether not a simulation has started using lastId as an indicator .isRunning = function(){ @@ -2165,6 +2169,34 @@ SimParam = R6Class( .addTrait = function(lociMap,varA=NA_real_,varG=NA_real_,varE=NA_real_){ stopifnot(is.numeric(varA),is.numeric(varG),is.numeric(varE), length(varA)==1,length(varG)==1,length(varE)==1) + + if(self$nTraits == 0L){ + # Initializing .activeQtl + private$.activeQtl = new("LociMap", + nLoci = lociMap@nLoci, + lociPerChr = lociMap@lociPerChr, + lociLoc = lociMap@lociLoc) + private$.qtlIndex[[1L]] = seq_len(private$.activeQtl@nLoci) + }else{ + # Check if exiting activeQtl is a superset + testSet = findLociMapSuperset(private$.activeQtl, + lociMap) + if(is.null(testSet)){ + # Existing activeQtl is a superset, just add qtlIndex + private$.qtlIndex[[self$nTraits + 1L]] = findQtlIndex(private$.activeQtl, lociMap) + }else{ + # New activeQtl object required, recompute all values for qtlIndex + private$.activeQtl = testSet + + for(i in seq_len(self$nTraits)){ + private$.qtlIndex[[i]] = findQtlIndex(private$.activeQtl, self$traits[[i]]) + } + + private$.qtlIndex[[self$nTraits + 1L]] = + findQtlIndex(private$.activeQtl, lociMap) + } + } + private$.traits[[self$nTraits + 1L]] = lociMap private$.varA = c(private$.varA,varA) private$.varG = c(private$.varG,varG) @@ -2631,3 +2663,82 @@ isSimParam = function(x) { ret = is(x, class2 = "SimParam") return(ret) } + +#' @title Find LociMap superset +#' +#' @description Compares to a \code{\link{LociMap-class}} objects to determine if +#' the first one is a superset of the second. If it is, the function returns NULL. +#' If it is not, the function return a \code{\link{LociMap-class}} object that is +#' a superset of both a \code{\link{LociMap-class}} objects. +#' +#' @param lociMap1 a \code{\link{LociMap-class}} that is tested to determine if +#' it is a superset +#' @param lociMap2 a second \code{\link{LociMap-class}} that is tested +#' +#' @returns NULL if locMap1 is a superset, or a \code{\link{LociMap-class}} if it +#' is not +#' +#' @keywords internal +findLociMapSuperset = function(lociMap1, lociMap2){ + + loc1 = paste(rep(x = seq_len(length(lociMap1@lociPerChr)), + times = lociMap1@lociPerChr), + lociMap1@lociLoc, sep="_") + + loc2 = paste(rep(x = seq_len(length(lociMap2@lociPerChr)), + times = lociMap2@lociPerChr), + lociMap2@lociLoc, sep="_") + + inSet1 = loc2 %in% loc1 + + if(all(inSet1)){ + # lociMap1 is a superset of lociMap2, no action needed + return(NULL) + } + + # Create an ordered matrix with columns chr and lociLoc for loci superset + loc = unique(c(loc1, loc2)) + tab = sapply(loc, function(x){ + y = strsplit(x, split="_", fixed=TRUE) + return(as.integer(unlist(y))) + }) + tab = t(tab) + tab = tab[order(tab[,1],tab[,2]),] + + # Return LociMap for superset of loci + return(new("LociMap", + nLoci = nrow(tab), + lociPerChr = tabulate(tab[,1], nbins=length(lociMap1@lociPerChr)), + lociLoc = unname(tab[,2]))) +} + + +#' @title Find trait QTL index +#' +#' @description Compares to a \code{\link{LociMap-class}} objects to determine if +#' the first one is a superset of the second. If it is, the function returns NULL. +#' If it is not, the function return a \code{\link{LociMap-class}} object that is +#' a superset of both a \code{\link{LociMap-class}} objects. +#' +#' @param activeQtl a \code{\link{LociMap-class}} representing all active QTL +#' @param traitQtl a \code{\link{LociMap-class}} representing QTL for a trait of +#' interest +#' +#' @returns an integer vector for relative positions of traitQtl on activeQtl +#' +#' @keywords internal +findQtlIndex = function(activeQtl, traitQtl){ + + activeLoci = paste(rep(x = seq_len(length(activeQtl@lociPerChr)), + times = activeQtl@lociPerChr), + activeQtl@lociLoc, sep="_") + + traitLoci = paste(rep(x = seq_len(length(traitQtl@lociPerChr)), + times = traitQtl@lociPerChr), + traitQtl@lociLoc, sep="_") + + return(match(traitLoci, activeLoci)) +} + + + diff --git a/man/findLociMapSuperset.Rd b/man/findLociMapSuperset.Rd new file mode 100644 index 00000000..0a419164 --- /dev/null +++ b/man/findLociMapSuperset.Rd @@ -0,0 +1,25 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/Class-SimParam.R +\name{findLociMapSuperset} +\alias{findLociMapSuperset} +\title{Find LociMap superset} +\usage{ +findLociMapSuperset(lociMap1, lociMap2) +} +\arguments{ +\item{lociMap1}{a \code{\link{LociMap-class}} that is tested to determine if +it is a superset} + +\item{lociMap2}{a second \code{\link{LociMap-class}} that is tested} +} +\value{ +NULL if locMap1 is a superset, or a \code{\link{LociMap-class}} if it +is not +} +\description{ +Compares to a \code{\link{LociMap-class}} objects to determine if +the first one is a superset of the second. If it is, the function returns NULL. +If it is not, the function return a \code{\link{LociMap-class}} object that is +a superset of both a \code{\link{LociMap-class}} objects. +} +\keyword{internal} diff --git a/man/findQtlIndex.Rd b/man/findQtlIndex.Rd new file mode 100644 index 00000000..983e1577 --- /dev/null +++ b/man/findQtlIndex.Rd @@ -0,0 +1,24 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/Class-SimParam.R +\name{findQtlIndex} +\alias{findQtlIndex} +\title{Find trait QTL index} +\usage{ +findQtlIndex(activeQtl, traitQtl) +} +\arguments{ +\item{activeQtl}{a \code{\link{LociMap-class}} representing all active QTL} + +\item{traitQtl}{a \code{\link{LociMap-class}} representing QTL for a trait of +interest} +} +\value{ +an integer vector for relative positions of traitQtl on activeQtl +} +\description{ +Compares to a \code{\link{LociMap-class}} objects to determine if +the first one is a superset of the second. If it is, the function returns NULL. +If it is not, the function return a \code{\link{LociMap-class}} object that is +a superset of both a \code{\link{LociMap-class}} objects. +} +\keyword{internal} diff --git a/src/getGv.cpp b/src/getGv.cpp index 2ff69eb6..e87be70a 100644 --- a/src/getGv.cpp +++ b/src/getGv.cpp @@ -97,7 +97,7 @@ arma::field getGvE(const Rcpp::S4& trait, output.set_size(1); output(0).set_size(nInd); } - arma::vec x(ploidy+1); // Genotype dossage + arma::vec x(ploidy+1); // Genotype dosage for(arma::uword i=0; i getGv(const Rcpp::S4& trait, output.set_size(1); output(0).set_size(nInd); } - arma::vec x(ploidy+1); // Genotype dossage + arma::vec x(ploidy+1); // Genotype dosage for(arma::uword i=0; i sampleQuadChiasmata(double start, double exchange, double end, - double v, double p, arma::uword n1=40, arma::uword n2=8){ +arma::field sampleQuadChiasmata(double exchange, double end, double v, + double p, arma::uword n1=40, arma::uword n2=8){ arma::field output(4); arma::vec u(1, arma::fill::randu); + double start = u(0)-10; // Randomly set order of chromosome arms arma::uvec arm = {0, 1, 2, 3}; @@ -341,7 +341,7 @@ arma::Mat findBivalentCO(const arma::vec& genMap, double v, double p){ return output; } - // Thin crossovers + // Thin crossovers arma::vec thin(posCO.n_elem, arma::fill::randu); posCO = posCO(find(thin>0.5)); nCO = posCO.n_elem; @@ -387,13 +387,9 @@ arma::field > findQuadrivalentCO(const arma::vec& genMap, arma::vec u(1, arma::fill::randu); double exchange = u(0)*genLen; - // Sample start point for gamma model - u.randu(); - double start = u(0)-10; - // Determine crossover positions // Returns field with crossover positions in each arm of the quadrivalent - arma::field posCO = sampleQuadChiasmata(start, exchange, genLen, v, p); + arma::field posCO = sampleQuadChiasmata(exchange, genLen, v, p); // Set chromatid configuration for each chiasmata arma::field chromatidPairs(4); // matches posCO diff --git a/vignettes/articles/SimpleGeneticSimulations.Rmd b/vignettes/articles/SimpleGeneticSimulations.Rmd index f4337670..72229e84 100644 --- a/vignettes/articles/SimpleGeneticSimulations.Rmd +++ b/vignettes/articles/SimpleGeneticSimulations.Rmd @@ -16,14 +16,36 @@ knitr::opts_chunk$set( ) ``` +In this vignette I will show how genetic simulations work using a simple examples. +I will start by showing how to simulate phenotypes from genotypes. I will then +show how to simulate new genotypes from existing genotypes. +## Simulating phenotypes from genotypes +Simulating phenotypes from genotypes is a straightforward process that can be +done fairly easily using just base R functionality. -## Genotypes to phenotypes -## Creating new genotypes +-Simulate inbred genotypes +-Simulate QTL effects + +-Calculate genetic values +-Scale genetic values to desired variance +-Shift mean + +-Simulate phenotypes + + +## Simulating new genotypes + +-Describe meiosis +-Break genotypes into haplotypes +-Create F1 by pairing chromosomes +-Assign loci to a genetic map +-Randomly choose a starting chromosome +-Model recombination From 55a3552e82e04ba4aed2b202b22742fe0fdae85b Mon Sep 17 00:00:00 2001 From: Chris Gaynor Date: Sun, 16 Feb 2025 17:24:48 -0600 Subject: [PATCH 10/19] activeQtl works with removing and modifying traits --- DESCRIPTION | 2 +- R/Class-SimParam.R | 74 ++++++++++++++++++++++++++++++++++++++++++---- man/SimParam.Rd | 5 ++++ 3 files changed, 75 insertions(+), 6 deletions(-) diff --git a/DESCRIPTION b/DESCRIPTION index 95ac2c4b..3bbd541e 100644 --- a/DESCRIPTION +++ b/DESCRIPTION @@ -1,7 +1,7 @@ Package: AlphaSimR Type: Package Title: Breeding Program Simulations -Version: 1.6.1.9003 +Version: 1.6.1.9004 Date: 2025-02-16 Authors@R: c(person("Chris", "Gaynor", email = "gaynor.robert@hotmail.com", role = c("aut", "cre"), comment = c(ORCID = "0000-0003-0558-6656")), diff --git a/R/Class-SimParam.R b/R/Class-SimParam.R index b5d5eb1e..ea1526dd 100644 --- a/R/Class-SimParam.R +++ b/R/Class-SimParam.R @@ -1621,6 +1621,10 @@ SimParam = R6Class( private$.varE = diag(private$.varE) } private$.varE[traitPos] = varE + + # Update activeQtl + private$.setActiveQtl() + invisible(self) }, @@ -1635,14 +1639,18 @@ SimParam = R6Class( private$.isRunning() } stopifnot(max(traits)<=self$nTraits, min(traits)>0) - private$.traits[-traits] - private$.varA[-traits] - private$.varG[-traits] + private$.traits = private$.traits[-traits] + private$.varA = private$.varA[-traits] + private$.varG = private$.varG[-traits] if(is.matrix(private$.varE)){ - private$.varE[-traits,-traits] + private$.varE = private$.varE[-traits,-traits] }else{ - private$.varE[-traits] + private$.varE = private$.varE[-traits] } + + # Update activeQtl + private$.setActiveQtl() + invisible(self) }, @@ -2299,6 +2307,43 @@ SimParam = R6Class( } return(posList) + }, + + # Sets activeQtl and qtlIndex slots + # Run after removing or modifying existing traits + .setActiveQtl = function(){ + + private$.activeQtl = NULL + private$.qtlIndex = list() + + if(self$nTraits > 0L){ + # Set activeQtl using trait 1 + private$.activeQtl = new("LociMap", + nLoci = self$traits[[1L]]@nLoci, + lociPerChr = self$traits[[1L]]@lociPerChr, + lociLoc = self$traits[[1L]]@lociLoc) + + private$.qtlIndex[[1L]] = seq_len(private$.activeQtl@nLoci) + } + + if(self$nTraits > 1L){ + # Update activeQtl, if needed + for(i in 2:self$nTraits){ + testSet = findLociMapSuperset(private$.activeQtl, + self$traits[[i]]) + if(!is.null(testSet)){ + private$.activeQtl = testSet + } + } + + # Update qtlIndex + for(i in seq_len(self$nTraits)){ + private$.qtlIndex[[i]] = findQtlIndex(private$.activeQtl, + self$traits[[i]]) + } + } + + invisible(self) } ), @@ -2572,6 +2617,25 @@ SimParam = R6Class( }else{ stop("`$version` is read only",call.=FALSE) } + }, + + #' @field activeQtl a LociMap representing all active QTL in simulation + activeQtl=function(value){ + if(missing(value)){ + private$.activeQtl + }else{ + stop("`$activeQtl` is read only",call.=FALSE) + } + }, + + #' @field qtlIndex a list of vectors giving trait specific QTL indices + #' relative to all active QTL + qtlIndex=function(value){ + if(missing(value)){ + private$.qtlIndex + }else{ + stop("`$qtlIndex` is read only",call.=FALSE) + } } ) ) diff --git a/man/SimParam.Rd b/man/SimParam.Rd index ff427163..71681a5e 100644 --- a/man/SimParam.Rd +++ b/man/SimParam.Rd @@ -377,6 +377,11 @@ genetic map} \item{\code{varE}}{default error variance} \item{\code{version}}{the version of AlphaSimR used to generate this object} + +\item{\code{activeQtl}}{a LociMap representing all active QTL in simulation} + +\item{\code{qtlIndex}}{a list of vectors giving trait specific QTL indices +relative to all active QTL} } \if{html}{\out{
}} } From b07d249b59528043a13e4209b37ca742eb5f8c1b Mon Sep 17 00:00:00 2001 From: Chris Gaynor Date: Sat, 8 Mar 2025 20:07:06 -0600 Subject: [PATCH 11/19] added getGvIndex for faster multitrait simulations --- DESCRIPTION | 4 +- NEWS.md | 2 + R/Class-Pop.R | 19 ++-- R/Class-SimParam.R | 2 +- R/RcppExports.R | 4 + src/RcppExports.cpp | 17 +++ src/getGv.cpp | 261 ++++++++++++++++++++++++++++++++++++++++++++ 7 files changed, 299 insertions(+), 10 deletions(-) diff --git a/DESCRIPTION b/DESCRIPTION index 3bbd541e..c1696493 100644 --- a/DESCRIPTION +++ b/DESCRIPTION @@ -1,8 +1,8 @@ Package: AlphaSimR Type: Package Title: Breeding Program Simulations -Version: 1.6.1.9004 -Date: 2025-02-16 +Version: 1.6.1.9005 +Date: 2025-03-08 Authors@R: c(person("Chris", "Gaynor", email = "gaynor.robert@hotmail.com", role = c("aut", "cre"), comment = c(ORCID = "0000-0003-0558-6656")), person("Gregor", "Gorjanc", role = "ctb", diff --git a/NEWS.md b/NEWS.md index f6e0fc9f..d6c48132 100644 --- a/NEWS.md +++ b/NEWS.md @@ -2,6 +2,8 @@ *added `asCategorical` to convert a normal (Gaussian) trait to an ordered categorical (threshold) trait +*improved computational performance of simulations with multiple traits + # AlphaSimR 1.6.1 *fixed bug in `mergePops` and `[` (subset) methods - they were failing for populations that had a misc slot with a matrix - we now check if a misc slot element is a matrix and rbind them for `mergePops` and subset rows for `[` (assuming the first dimension represents individuals) diff --git a/R/Class-Pop.R b/R/Class-Pop.R index afbd34e3..171fdb98 100644 --- a/R/Class-Pop.R +++ b/R/Class-Pop.R @@ -713,14 +713,19 @@ newPop = function(rawPop,simParam=NULL,...){ pheno = gv if(simParam$nTraits>=1){ + tmp = getGvIndex(rawPop, simParam$traits, simParam$activeQtl, + simParam$qtlIndex, simParam$nTraits, simParam$nThreads) + + gv = tmp[[1]] + colnames(gv) = simParam$traitNames + + gxeTmp = tmp[[2]] + dim(gxeTmp) = NULL # Account for matrix bug in RcppArmadillo + + # Move over gxeTmp for traits with GxE for(i in seq_len(simParam$nTraits)){ - tmp = getGv(simParam$traits[[i]], rawPop, simParam$nThreads) - gv[,i] = tmp[[1]] - - colnames(gv)[i] = simParam$traits[[i]]@name - - if(length(tmp)>1){ - gxe[[i]] = tmp[[2]] + if(.hasSlot(simParam$traits[[i]], "gxeEff")){ + gxe[[i]] = gxeTmp[[i]] } } } diff --git a/R/Class-SimParam.R b/R/Class-SimParam.R index ea1526dd..3da3d3ce 100644 --- a/R/Class-SimParam.R +++ b/R/Class-SimParam.R @@ -2162,7 +2162,7 @@ SimParam = R6Class( .hap="list", .isFounder="logical", .activeQtl = "LociMap", - .qtlIndex = "integer", + .qtlIndex = "list", # Determines whether not a simulation has started using lastId as an indicator .isRunning = function(){ diff --git a/R/RcppExports.R b/R/RcppExports.R index ca5bf5c7..e9df840f 100644 --- a/R/RcppExports.R +++ b/R/RcppExports.R @@ -272,6 +272,10 @@ getGv <- function(trait, pop, nThreads) { .Call(`_AlphaSimR_getGv`, trait, pop, nThreads) } +getGvIndex <- function(pop, traitList, activeQtl, qtlIndex, nTraits, nThreads) { + .Call(`_AlphaSimR_getGvIndex`, pop, traitList, activeQtl, qtlIndex, nTraits, nThreads) +} + getHybridGv <- function(trait, females, femaleParents, males, maleParents, nThreads) { .Call(`_AlphaSimR_getHybridGv`, trait, females, femaleParents, males, maleParents, nThreads) } diff --git a/src/RcppExports.cpp b/src/RcppExports.cpp index 848613d6..7ad8f059 100644 --- a/src/RcppExports.cpp +++ b/src/RcppExports.cpp @@ -588,6 +588,22 @@ BEGIN_RCPP return rcpp_result_gen; END_RCPP } +// getGvIndex +Rcpp::List getGvIndex(const Rcpp::S4& pop, const Rcpp::List& traitList, const Rcpp::S4& activeQtl, const arma::field >& qtlIndex, arma::uword nTraits, int nThreads); +RcppExport SEXP _AlphaSimR_getGvIndex(SEXP popSEXP, SEXP traitListSEXP, SEXP activeQtlSEXP, SEXP qtlIndexSEXP, SEXP nTraitsSEXP, SEXP nThreadsSEXP) { +BEGIN_RCPP + Rcpp::RObject rcpp_result_gen; + Rcpp::RNGScope rcpp_rngScope_gen; + Rcpp::traits::input_parameter< const Rcpp::S4& >::type pop(popSEXP); + Rcpp::traits::input_parameter< const Rcpp::List& >::type traitList(traitListSEXP); + Rcpp::traits::input_parameter< const Rcpp::S4& >::type activeQtl(activeQtlSEXP); + Rcpp::traits::input_parameter< const arma::field >& >::type qtlIndex(qtlIndexSEXP); + Rcpp::traits::input_parameter< arma::uword >::type nTraits(nTraitsSEXP); + Rcpp::traits::input_parameter< int >::type nThreads(nThreadsSEXP); + rcpp_result_gen = Rcpp::wrap(getGvIndex(pop, traitList, activeQtl, qtlIndex, nTraits, nThreads)); + return rcpp_result_gen; +END_RCPP +} // getHybridGv arma::field getHybridGv(const Rcpp::S4& trait, const Rcpp::S4& females, arma::uvec femaleParents, const Rcpp::S4& males, arma::uvec maleParents, int nThreads); RcppExport SEXP _AlphaSimR_getHybridGv(SEXP traitSEXP, SEXP femalesSEXP, SEXP femaleParentsSEXP, SEXP malesSEXP, SEXP maleParentsSEXP, SEXP nThreadsSEXP) { @@ -882,6 +898,7 @@ static const R_CallMethodDef CallEntries[] = { {"_AlphaSimR_calcGenoFreq", (DL_FUNC) &_AlphaSimR_calcGenoFreq, 4}, {"_AlphaSimR_calcChrFreq", (DL_FUNC) &_AlphaSimR_calcChrFreq, 1}, {"_AlphaSimR_getGv", (DL_FUNC) &_AlphaSimR_getGv, 3}, + {"_AlphaSimR_getGvIndex", (DL_FUNC) &_AlphaSimR_getGvIndex, 6}, {"_AlphaSimR_getHybridGv", (DL_FUNC) &_AlphaSimR_getHybridGv, 6}, {"_AlphaSimR_getNonFounderIbd", (DL_FUNC) &_AlphaSimR_getNonFounderIbd, 3}, {"_AlphaSimR_getFounderIbd", (DL_FUNC) &_AlphaSimR_getFounderIbd, 2}, diff --git a/src/getGv.cpp b/src/getGv.cpp index e87be70a..8e9334f1 100644 --- a/src/getGv.cpp +++ b/src/getGv.cpp @@ -223,3 +223,264 @@ arma::field getGv(const Rcpp::S4& trait, } return output; } + +// Basic implementation of getGv for multiple traits +arma::field getGvIndexStd(const Rcpp::S4& trait, + arma::Mat& genoMat, + arma::Col qtlIndex, + arma::uword ploidy, + int nThreads){ + qtlIndex = qtlIndex-1; // R to C++ + arma::field output; + bool hasD = trait.hasSlot("domEff"); + bool hasGxe = trait.hasSlot("gxeEff"); + arma::uword nInd = genoMat.n_rows; + double dP = double(ploidy); + arma::vec a,d,g; + a = Rcpp::as(trait.slot("addEff")); + if(hasD){ + d = Rcpp::as(trait.slot("domEff")); + } + arma::mat gv(nInd,nThreads),gxe; + gv.fill(double(trait.slot("intercept"))/double(nThreads)); + if(hasGxe){ + g = Rcpp::as(trait.slot("gxeEff")); + output.set_size(2); + output(0).set_size(nInd); + output(1).set_size(nInd); + gxe.set_size(nInd,nThreads); + gxe.fill(double(trait.slot("gxeInt"))/double(nThreads)); + }else{ + output.set_size(1); + output(0).set_size(nInd); + } + arma::vec x(ploidy+1); // Genotype dosage + for(arma::uword i=0; i getGvIndexE(const Rcpp::S4& trait, + arma::Mat& genoMat, + arma::Col qtlIndex, + arma::uword ploidy, + int nThreads){ + qtlIndex = qtlIndex-1; // R to C++ + arma::field output; + bool hasD = trait.hasSlot("domEff"); + bool hasGxe = trait.hasSlot("gxeEff"); + arma::uword nInd = genoMat.n_rows; + double dP = double(ploidy); + arma::mat E; + E = Rcpp::as(trait.slot("epiEff")); + E.col(0) -= 1; //R to C++ + E.col(1) -= 1; //R to C++ + arma::vec a,d,g; + a = Rcpp::as(trait.slot("addEff")); + if(hasD){ + d = Rcpp::as(trait.slot("domEff")); + } + arma::mat gv(nInd,nThreads),gxe; + gv.fill(double(trait.slot("intercept"))/double(nThreads)); + if(hasGxe){ + g = Rcpp::as(trait.slot("gxeEff")); + output.set_size(2); + output(0).set_size(nInd); + output(1).set_size(nInd); + gxe.set_size(nInd,nThreads); + gxe.fill(double(trait.slot("gxeInt"))/double(nThreads)); + }else{ + output.set_size(1); + output(0).set_size(nInd); + } + arma::vec x(ploidy+1); // Genotype dosage + for(arma::uword i=0; i getGvIndexA2(const Rcpp::S4& trait, + const Rcpp::S4& pop, + int nThreads){ + arma::field output; + bool hasD = trait.hasSlot("domEff"); + arma::uword nInd = pop.slot("nInd"); + arma::uword ploidy = pop.slot("ploidy"); + double dP = double(ploidy); + const arma::Col& lociPerChr = trait.slot("lociPerChr"); + arma::uvec lociLoc = trait.slot("lociLoc"); + arma::vec a1,a2,d; + a1 = Rcpp::as(trait.slot("addEff")); + a2 = Rcpp::as(trait.slot("addEffMale")); + if(hasD){ + d = Rcpp::as(trait.slot("domEff")); + } + arma::mat gv(nInd,nThreads); + gv.fill(double(trait.slot("intercept"))/double(nThreads)); + output.set_size(1); + output(0).set_size(nInd); + // Half ploidy for xa + arma::vec xa(ploidy/2+1); + for(arma::uword i=0; i maternalGeno = getMaternalGeno(Rcpp::as > >(pop.slot("geno")), + lociPerChr, lociLoc, nThreads); + arma::Mat paternalGeno = getPaternalGeno(Rcpp::as > >(pop.slot("geno")), + lociPerChr, lociLoc, nThreads); + +#ifdef _OPENMP +#pragma omp parallel for schedule(static) num_threads(nThreads) +#endif + for(arma::uword i=0; i >& qtlIndex, + arma::uword nTraits, + int nThreads){ + + // Get summary variables + arma::uword nInd = pop.slot("nInd"); + arma::uword ploidy = pop.slot("ploidy"); + + // Get genotypes + arma::Mat genoMat; + genoMat = getGeno(Rcpp::as > >(pop.slot("geno")), + Rcpp::as >(activeQtl.slot("lociPerChr")), + Rcpp::as(activeQtl.slot("lociLoc")), + nThreads); + + // Format output + arma::mat gv(nInd, nTraits); + arma::field gxe(nTraits); + + + // Loop over traits + for(arma::uword i=0; i(output[0]); + + if(output.length() == 2){ + // Has GxE + gxe(i) = Rcpp::as(output[1]); + } + } + + return Rcpp::List::create(Rcpp::Named("gv") = gv, + Rcpp::Named("gxe") = gxe); +} \ No newline at end of file From 967df7fe6a17959e21e304df4c2ab9babbdd8798 Mon Sep 17 00:00:00 2001 From: Chris Gaynor Date: Sat, 8 Mar 2025 20:52:02 -0600 Subject: [PATCH 12/19] renamed meiosis vignette --- .../{Meiosis.Rmd => GeneticRecombination.Rmd} | 36 +++++++++---------- 1 file changed, 17 insertions(+), 19 deletions(-) rename vignettes/articles/{Meiosis.Rmd => GeneticRecombination.Rmd} (52%) diff --git a/vignettes/articles/Meiosis.Rmd b/vignettes/articles/GeneticRecombination.Rmd similarity index 52% rename from vignettes/articles/Meiosis.Rmd rename to vignettes/articles/GeneticRecombination.Rmd index eb574454..94f7d7f5 100644 --- a/vignettes/articles/Meiosis.Rmd +++ b/vignettes/articles/GeneticRecombination.Rmd @@ -1,10 +1,10 @@ --- -title: "Meiosis" -author: "Vignette Author" +title: "Genetic Recombination" +author: "Chris Gaynor" date: "`r Sys.Date()`" output: rmarkdown::html_vignette vignette: > - %\VignetteIndexEntry{Vignette Title} + %\VignetteIndexEntry{Genetic Recombination} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- @@ -18,26 +18,24 @@ knitr::opts_chunk$set( ## Placeholder for future development -Introduction - Pull from MOOC - Explain the genome and its representation - Recombination / meiosis - Stages of meiosis +Introduction with basic of inheritance +Review meiosis + Stages of meiosis Chromosome pairing Chiasma (crossovers) Crossover interference Segregation Diploids first and then polyploids Centromeres for polyploids - - Implementation - Gamma sprinkling model - Crossover interference - recHist - pedigree - Sex specific recombination rates - - - - +Review genetic maps + Morgan's fly lab + Haldane mapping function + Kosambi mapping function +Recombination models + Count-location model + Gamma model + Gamma sprinkling model + recHist + pedigree + Sex specific recombination rates From 70a77ffc19d4136ae27d5a01a57817a9e44b40c3 Mon Sep 17 00:00:00 2001 From: Chris Gaynor Date: Sun, 9 Mar 2025 10:52:09 -0500 Subject: [PATCH 13/19] changing intro vignette name to AlphaSimR --- DESCRIPTION | 4 +- vignettes/{intro.Rmd => AlphaSimR.Rmd} | 0 vignettes/articles/AlphaSimR.Rmd | 25 ------------ vignettes/intro.R | 55 -------------------------- 4 files changed, 3 insertions(+), 81 deletions(-) rename vignettes/{intro.Rmd => AlphaSimR.Rmd} (100%) delete mode 100644 vignettes/articles/AlphaSimR.Rmd delete mode 100644 vignettes/intro.R diff --git a/DESCRIPTION b/DESCRIPTION index c1696493..672675d2 100644 --- a/DESCRIPTION +++ b/DESCRIPTION @@ -17,7 +17,9 @@ Authors@R: c(person("Chris", "Gaynor", email = "gaynor.robert@hotmail.com", person("Audrey", "Martin", role = "ctb", comment = c(ORCID = "0000-0003-2235-0098")), person("Philip", "Greenspoon", role = "ctb", - comment = c(ORCID = "0000-0001-6284-7248"))) + comment = c(ORCID = "0000-0001-6284-7248")), + person("Ros", "Craddock", role = "ctb", + comment = c(ORCID = "0009-0004-1578-1580"))) Description: The successor to the 'AlphaSim' software for breeding program simulation [Faux et al. (2016) ]. Used for stochastic simulations of breeding programs to the level of DNA diff --git a/vignettes/intro.Rmd b/vignettes/AlphaSimR.Rmd similarity index 100% rename from vignettes/intro.Rmd rename to vignettes/AlphaSimR.Rmd diff --git a/vignettes/articles/AlphaSimR.Rmd b/vignettes/articles/AlphaSimR.Rmd deleted file mode 100644 index c56fbb7d..00000000 --- a/vignettes/articles/AlphaSimR.Rmd +++ /dev/null @@ -1,25 +0,0 @@ ---- -title: "AlphaSimR" -author: "Vignette Author" -date: "`r Sys.Date()`" -output: rmarkdown::html_vignette -vignette: > - %\VignetteIndexEntry{Vignette Title} - %\VignetteEngine{knitr::rmarkdown} - %\VignetteEncoding{UTF-8} ---- - -```{r setup, include = FALSE} -knitr::opts_chunk$set( - collapse = TRUE, - comment = "#>" -) -``` - -## Placeholder for future development - -Explain the package - -Point to important pages for further information (introduction) - -Include links to external sources diff --git a/vignettes/intro.R b/vignettes/intro.R deleted file mode 100644 index 4a64ddbd..00000000 --- a/vignettes/intro.R +++ /dev/null @@ -1,55 +0,0 @@ -## ----eval=FALSE--------------------------------------------------------------- -# founderPop = quickHaplo(nInd=1000, nChr=10, segSites=1000) - -## ----eval=FALSE--------------------------------------------------------------- -# SP = SimParam$new(founderPop) - -## ----eval=FALSE--------------------------------------------------------------- -# SP$addTraitA(nQtlPerChr=1000) - -## ----eval=FALSE--------------------------------------------------------------- -# SP$setSexes("yes_sys") - -## ----eval=FALSE--------------------------------------------------------------- -# pop = newPop(founderPop) - -## ----eval=FALSE--------------------------------------------------------------- -# genMean = meanG(pop) - -## ----eval=FALSE--------------------------------------------------------------- -# for(generation in 1:20){ -# pop = selectCross(pop=pop, nFemale=500, nMale=25, use="gv", nCrosses=1000) -# genMean = c(genMean, meanG(pop)) -# } - -## ----eval=FALSE--------------------------------------------------------------- -# plot(0:20, genMean, xlab="Generation", ylab="Mean Genetic Value", type="l") - -## ----message=FALSE, warning=FALSE--------------------------------------------- -library(AlphaSimR) - -## ----------------------------------------------------------------------------- -# Creating Founder Haplotypes -founderPop = quickHaplo(nInd=1000, nChr=10, segSites=1000) - -# Setting Simulation Parameters -SP = SimParam$new(founderPop) - -## ----include=FALSE------------------------------------------------------------ -SP$nThreads = 1L - -## ----------------------------------------------------------------------------- -SP$addTraitA(nQtlPerChr=1000) -SP$setSexes("yes_sys") - -# Modeling the Breeding Program -pop = newPop(founderPop) -genMean = meanG(pop) -for(generation in 1:20){ - pop = selectCross(pop=pop, nFemale=500, nMale=25, use="gv", nCrosses=1000) - genMean = c(genMean, meanG(pop)) -} - -# Examining the Results -plot(0:20, genMean, xlab="Generation", ylab="Mean Genetic Value", type="l") - From 26bb3af7556b91669867545b2c6fb900c3d0f83b Mon Sep 17 00:00:00 2001 From: Chris Gaynor Date: Sun, 9 Mar 2025 13:59:17 -0500 Subject: [PATCH 14/19] webpage test --- DESCRIPTION | 4 +- vignettes/articles/DataImport.Rmd | 21 ------ vignettes/articles/GeneticRecombination.Rmd | 12 +-- vignettes/articles/GenomicSelection.Rmd | 39 ---------- vignettes/articles/Misc.Rmd | 31 -------- vignettes/articles/QuantGen.Rmd | 32 -------- .../articles/SimpleGeneticSimulations.Rmd | 53 -------------- vignettes/articles/ToDo | 73 +++++++++++++++++++ vignettes/articles/Traits.Rmd | 35 --------- 9 files changed, 82 insertions(+), 218 deletions(-) delete mode 100644 vignettes/articles/DataImport.Rmd delete mode 100644 vignettes/articles/GenomicSelection.Rmd delete mode 100644 vignettes/articles/Misc.Rmd delete mode 100644 vignettes/articles/QuantGen.Rmd delete mode 100644 vignettes/articles/SimpleGeneticSimulations.Rmd create mode 100644 vignettes/articles/ToDo delete mode 100644 vignettes/articles/Traits.Rmd diff --git a/DESCRIPTION b/DESCRIPTION index 672675d2..d4416965 100644 --- a/DESCRIPTION +++ b/DESCRIPTION @@ -1,8 +1,8 @@ Package: AlphaSimR Type: Package Title: Breeding Program Simulations -Version: 1.6.1.9005 -Date: 2025-03-08 +Version: 1.6.1.9006 +Date: 2025-03-09 Authors@R: c(person("Chris", "Gaynor", email = "gaynor.robert@hotmail.com", role = c("aut", "cre"), comment = c(ORCID = "0000-0003-0558-6656")), person("Gregor", "Gorjanc", role = "ctb", diff --git a/vignettes/articles/DataImport.Rmd b/vignettes/articles/DataImport.Rmd deleted file mode 100644 index 10f55d57..00000000 --- a/vignettes/articles/DataImport.Rmd +++ /dev/null @@ -1,21 +0,0 @@ ---- -title: "Data Import" -author: "Vignette Author" -date: "`r Sys.Date()`" -output: rmarkdown::html_vignette -vignette: > - %\VignetteIndexEntry{Vignette Title} - %\VignetteEngine{knitr::rmarkdown} - %\VignetteEncoding{UTF-8} ---- - -```{r setup, include = FALSE} -knitr::opts_chunk$set( - collapse = TRUE, - comment = "#>" -) -``` - -## Placeholder for future development - -Pull from example script diff --git a/vignettes/articles/GeneticRecombination.Rmd b/vignettes/articles/GeneticRecombination.Rmd index 94f7d7f5..eb945550 100644 --- a/vignettes/articles/GeneticRecombination.Rmd +++ b/vignettes/articles/GeneticRecombination.Rmd @@ -18,7 +18,8 @@ knitr::opts_chunk$set( ## Placeholder for future development -Introduction with basic of inheritance +Introduction including basic details of inheritance + Review meiosis Stages of meiosis Chromosome pairing @@ -27,15 +28,16 @@ Review meiosis Segregation Diploids first and then polyploids Centromeres for polyploids + Review genetic maps - Morgan's fly lab + Development of maps in Morgan's fly lab Haldane mapping function Kosambi mapping function + Recombination models Count-location model Gamma model Gamma sprinkling model - recHist - pedigree - Sex specific recombination rates + +Recombination tracking in AlphaSimR diff --git a/vignettes/articles/GenomicSelection.Rmd b/vignettes/articles/GenomicSelection.Rmd deleted file mode 100644 index 5acee4e3..00000000 --- a/vignettes/articles/GenomicSelection.Rmd +++ /dev/null @@ -1,39 +0,0 @@ ---- -title: "Genomic Selection" -author: "Vignette Author" -date: "`r Sys.Date()`" -output: rmarkdown::html_vignette -vignette: > - %\VignetteIndexEntry{Vignette Title} - %\VignetteEngine{knitr::rmarkdown} - %\VignetteEncoding{UTF-8} ---- - -```{r setup, include = FALSE} -knitr::opts_chunk$set( - collapse = TRUE, - comment = "#>" -) -``` - -## Placeholder for future development - -Introduction - Point to genome (QTL/SNP) and trait - -How to build a training population - Mention fixed effect - -How to fit a model - -How to get GEBVs - Accuracy of selection (vs BV or GV, train vs pred) - -How to do selections - -Explain models - Use Guilherme's paper - Explain 1 vs 2 - -How to use external software - diff --git a/vignettes/articles/Misc.Rmd b/vignettes/articles/Misc.Rmd deleted file mode 100644 index bcb3b982..00000000 --- a/vignettes/articles/Misc.Rmd +++ /dev/null @@ -1,31 +0,0 @@ ---- -title: "Misc" -author: "Vignette Author" -date: "`r Sys.Date()`" -output: rmarkdown::html_vignette -vignette: > - %\VignetteIndexEntry{Vignette Title} - %\VignetteEngine{knitr::rmarkdown} - %\VignetteEncoding{UTF-8} ---- - -```{r setup, include = FALSE} -knitr::opts_chunk$set( - collapse = TRUE, - comment = "#>" -) -``` - -## Placeholder for future development - -Introduction - -Explain individual level slot - set/get - -Explain population slot - -Finalize pop? - - - diff --git a/vignettes/articles/QuantGen.Rmd b/vignettes/articles/QuantGen.Rmd deleted file mode 100644 index f6bdc78f..00000000 --- a/vignettes/articles/QuantGen.Rmd +++ /dev/null @@ -1,32 +0,0 @@ ---- -title: "Quantitative Genetics" -author: "Vignette Author" -date: "`r Sys.Date()`" -output: rmarkdown::html_vignette -vignette: > - %\VignetteIndexEntry{Vignette Title} - %\VignetteEngine{knitr::rmarkdown} - %\VignetteEncoding{UTF-8} ---- - -```{r setup, include = FALSE} -knitr::opts_chunk$set( - collapse = TRUE, - comment = "#>" -) -``` - -## Placeholder for future development - -Introduction - Point to vignette on traits as first step - History - Fisher (response to selection, variance components) - Bulmer (genic variance) - -Decomposition of genetic values - Reference population (HWE vs non-HWE) - Breeding value - Dominance deviations - Epistatic deviations - Variance, genic variance, LD covariance diff --git a/vignettes/articles/SimpleGeneticSimulations.Rmd b/vignettes/articles/SimpleGeneticSimulations.Rmd deleted file mode 100644 index 72229e84..00000000 --- a/vignettes/articles/SimpleGeneticSimulations.Rmd +++ /dev/null @@ -1,53 +0,0 @@ ---- -title: "Simple Genetic Simulations" -author: "Chris Gaynor" -date: "`r Sys.Date()`" -output: rmarkdown::html_vignette -vignette: > - %\VignetteIndexEntry{Vignette Title} - %\VignetteEngine{knitr::rmarkdown} - %\VignetteEncoding{UTF-8} ---- - -```{r setup, include = FALSE} -knitr::opts_chunk$set( - collapse = TRUE, - comment = "#>" -) -``` - -In this vignette I will show how genetic simulations work using a simple examples. -I will start by showing how to simulate phenotypes from genotypes. I will then -show how to simulate new genotypes from existing genotypes. - -## Simulating phenotypes from genotypes - -Simulating phenotypes from genotypes is a straightforward process that can be -done fairly easily using just base R functionality. - - - --Simulate inbred genotypes - --Simulate QTL effects - --Calculate genetic values --Scale genetic values to desired variance --Shift mean - --Simulate phenotypes - - -## Simulating new genotypes - --Describe meiosis --Break genotypes into haplotypes --Create F1 by pairing chromosomes --Assign loci to a genetic map --Randomly choose a starting chromosome --Model recombination - - - - - diff --git a/vignettes/articles/ToDo b/vignettes/articles/ToDo new file mode 100644 index 00000000..cefd3f47 --- /dev/null +++ b/vignettes/articles/ToDo @@ -0,0 +1,73 @@ +Vignettes: + +Traits + Background from existing vignette + altAddTraitAD (its own section?) + + Practical examples + How to get effects + Manual adjustments + importation + Mixture distributions + rescaleTrait + + +SimpleGeneticSimulations + In this vignette I will show how genetic simulations work using a simple examples. + I will start by showing how to simulate phenotypes from genotypes. I will then + show how to simulate new genotypes from existing genotypes. + + ## Simulating phenotypes from genotypes + + Simulating phenotypes from genotypes is a straightforward process that can be + done fairly easily using just base R functionality. + + +QuantGen + Introduction + Point to vignette on traits as first step + History + Fisher (response to selection, variance components) + Bulmer (genic variance) + + Decomposition of genetic values + Reference population (HWE vs non-HWE) + Breeding value + Dominance deviations + Epistatic deviations + Variance, genic variance, LD covariance + +Misc + Introduction + + Explain individual level slot + set/get + + Explain population slot + + Finalize pop? + + +GenomicSelection + Introduction + Point to genome (QTL/SNP) and trait + + How to build a training population + Mention fixed effect + + How to fit a model + + How to get GEBVs + Accuracy of selection (vs BV or GV, train vs pred) + + How to do selections + + Explain models + Use Guilherme's paper + Explain 1 vs 2 + + How to use external software + + +DataImport + Pull from example script diff --git a/vignettes/articles/Traits.Rmd b/vignettes/articles/Traits.Rmd deleted file mode 100644 index 41685812..00000000 --- a/vignettes/articles/Traits.Rmd +++ /dev/null @@ -1,35 +0,0 @@ ---- -title: "Traits" -author: "Vignette Author" -date: "`r Sys.Date()`" -output: rmarkdown::html_vignette -vignette: > - %\VignetteIndexEntry{Vignette Title} - %\VignetteEngine{knitr::rmarkdown} - %\VignetteEncoding{UTF-8} ---- - -```{r setup, include = FALSE} -knitr::opts_chunk$set( - collapse = TRUE, - comment = "#>" -) -``` - -## Placeholder for future development - -Background from existing vignette - altAddTraitAD (its own section?) - -Practical examples - How to get effects - Manual adjustments - importation - Mixture distributions - rescaleTrait - - - - - - From 46b7e1b1ae68f75101624006049f6ebf5b79dfa5 Mon Sep 17 00:00:00 2001 From: Chris Gaynor Date: Sat, 5 Apr 2025 10:27:30 -0500 Subject: [PATCH 15/19] minor changes for genMap and finalizePop --- DESCRIPTION | 4 +- NEWS.md | 4 ++ R/Class-Pop.R | 2 +- R/Class-SimParam.R | 35 ++++++++-- man/AlphaSimR-package.Rd | 1 + man/SimParam.Rd | 6 +- vignettes/AlphaSimR.R | 55 +++++++++++++++ vignettes/articles/GeneticRecombination.Rmd | 75 +++++++++++++++------ 8 files changed, 147 insertions(+), 35 deletions(-) create mode 100644 vignettes/AlphaSimR.R diff --git a/DESCRIPTION b/DESCRIPTION index d4416965..78ff1d9a 100644 --- a/DESCRIPTION +++ b/DESCRIPTION @@ -1,8 +1,8 @@ Package: AlphaSimR Type: Package Title: Breeding Program Simulations -Version: 1.6.1.9006 -Date: 2025-03-09 +Version: 1.6.1.9007 +Date: 2025-04-05 Authors@R: c(person("Chris", "Gaynor", email = "gaynor.robert@hotmail.com", role = c("aut", "cre"), comment = c(ORCID = "0000-0003-0558-6656")), person("Gregor", "Gorjanc", role = "ctb", diff --git a/NEWS.md b/NEWS.md index d6c48132..3dc7eb5c 100644 --- a/NEWS.md +++ b/NEWS.md @@ -4,6 +4,10 @@ *improved computational performance of simulations with multiple traits +*added support for data.frames in SimParam genetic map switching functions + +*changed finalizePop function call in `.newPop` to pass simParam as an argument + # AlphaSimR 1.6.1 *fixed bug in `mergePops` and `[` (subset) methods - they were failing for populations that had a misc slot with a matrix - we now check if a misc slot element is a matrix and rbind them for `mergePops` and subset rows for `[` (assuming the first dimension represents individuals) diff --git a/R/Class-Pop.R b/R/Class-Pop.R index 171fdb98..3780933b 100644 --- a/R/Class-Pop.R +++ b/R/Class-Pop.R @@ -757,7 +757,7 @@ newPop = function(rawPop,simParam=NULL,...){ simParam=simParam) } - output = simParam$finalizePop(output,...) + output = simParam$finalizePop(output, simParam=simParam, ...) if(simParam$isTrackPed){ if(simParam$isTrackRec){ diff --git a/R/Class-SimParam.R b/R/Class-SimParam.R index 3da3d3ce..34f697b1 100644 --- a/R/Class-SimParam.R +++ b/R/Class-SimParam.R @@ -1922,14 +1922,21 @@ SimParam = R6Class( #' positions. If NULL, the centromere are assumed to #' be metacentric. switchGenMap = function(genMap, centromere=NULL){ - genMap = lapply(genMap, function(x) x-x[1]) # Set position 1 to 0 + if(is.data.frame(genMap)){ + genMap = importGenMap(genMap) + }else{ + genMap = lapply(genMap, function(x) x-x[1]) # Set position 1 to 0 + } + if(is.null(centromere)){ centromere=sapply(genMap,max)/2 } stopifnot(length(genMap)==self$nChr, centromere<=sapply(genMap,max)) + tmp = do.call("c",lapply(genMap,length)) stopifnot(all(tmp==private$.segSites)) + private$.sepMap = FALSE private$.femaleMap = genMap private$.maleMap = NULL @@ -1948,14 +1955,21 @@ SimParam = R6Class( #' positions. If NULL, the centromere are assumed to #' be metacentric. switchFemaleMap = function(genMap, centromere=NULL){ - genMap = lapply(genMap, function(x) x-x[1]) # Set position 1 to 0 + if(is.data.frame(genMap)){ + genMap = importGenMap(genMap) + }else{ + genMap = lapply(genMap, function(x) x-x[1]) # Set position 1 to 0 + } + if(is.null(centromere)){ centromere=sapply(genMap,max)/2 } stopifnot(length(genMap)==self$nChr, centromere<=sapply(genMap,max)) + tmp = do.call("c",lapply(genMap,length)) stopifnot(all(tmp==private$.segSites)) + if(private$.sepMap){ private$.femaleMap = genMap private$.femaleCentromere = centromere @@ -1979,14 +1993,21 @@ SimParam = R6Class( #' positions. If NULL, the centromere are assumed to #' be metacentric. switchMaleMap = function(genMap, centromere=NULL){ - genMap = lapply(genMap, function(x) x-x[1]) # Set position 1 to 0 + if(is.data.frame(genMap)){ + genMap = importGenMap(genMap) + }else{ + genMap = lapply(genMap, function(x) x-x[1]) # Set position 1 to 0 + } + if(is.null(centromere)){ centromere=sapply(genMap,max)/2 } stopifnot(length(genMap)==self$nChr, centromere<=sapply(genMap,max)) + tmp = do.call("c",lapply(genMap,length)) stopifnot(all(tmp==private$.segSites)) + private$.sepMap = TRUE private$.maleMap = genMap private$.maleCentromere = centromere @@ -2450,7 +2471,7 @@ SimParam = R6Class( } }, - #' @field genMap "matrix" of chromosome genetic maps + #' @field genMap list of chromosome genetic maps genMap=function(value){ if(missing(value)){ if(private$.sepMap){ @@ -2459,7 +2480,7 @@ SimParam = R6Class( genMap[[i]] = (private$.femaleMap[[i]]+ private$.maleMap[[i]])/2 } - as.matrix(genMap) + genMap }else{ private$.femaleMap } @@ -2468,7 +2489,7 @@ SimParam = R6Class( } }, - #' @field femaleMap "matrix" of chromosome genetic maps for + #' @field femaleMap list of chromosome genetic maps for #' females femaleMap=function(value){ if(missing(value)){ @@ -2478,7 +2499,7 @@ SimParam = R6Class( } }, - #' @field maleMap "matrix" of chromosome genetic maps for + #' @field maleMap list of chromosome genetic maps for #' males maleMap=function(value){ if(missing(value)){ diff --git a/man/AlphaSimR-package.Rd b/man/AlphaSimR-package.Rd index 58e7cbee..b1ca5791 100644 --- a/man/AlphaSimR-package.Rd +++ b/man/AlphaSimR-package.Rd @@ -44,6 +44,7 @@ Other contributors: \item Thiago Oliveira (\href{https://orcid.org/0000-0002-4555-2584}{ORCID}) [contributor] \item Audrey Martin (\href{https://orcid.org/0000-0003-2235-0098}{ORCID}) [contributor] \item Philip Greenspoon (\href{https://orcid.org/0000-0001-6284-7248}{ORCID}) [contributor] + \item Ros Craddock (\href{https://orcid.org/0009-0004-1578-1580}{ORCID}) [contributor] } } diff --git a/man/SimParam.Rd b/man/SimParam.Rd index 71681a5e..9401beb2 100644 --- a/man/SimParam.Rd +++ b/man/SimParam.Rd @@ -342,12 +342,12 @@ autopolyploid. (default is 0)} \item{\code{sepMap}}{are there seperate genetic maps for males and females} -\item{\code{genMap}}{"matrix" of chromosome genetic maps} +\item{\code{genMap}}{list of chromosome genetic maps} -\item{\code{femaleMap}}{"matrix" of chromosome genetic maps for +\item{\code{femaleMap}}{list of chromosome genetic maps for females} -\item{\code{maleMap}}{"matrix" of chromosome genetic maps for +\item{\code{maleMap}}{list of chromosome genetic maps for males} \item{\code{centromere}}{position of centromeres genetic map} diff --git a/vignettes/AlphaSimR.R b/vignettes/AlphaSimR.R new file mode 100644 index 00000000..4a64ddbd --- /dev/null +++ b/vignettes/AlphaSimR.R @@ -0,0 +1,55 @@ +## ----eval=FALSE--------------------------------------------------------------- +# founderPop = quickHaplo(nInd=1000, nChr=10, segSites=1000) + +## ----eval=FALSE--------------------------------------------------------------- +# SP = SimParam$new(founderPop) + +## ----eval=FALSE--------------------------------------------------------------- +# SP$addTraitA(nQtlPerChr=1000) + +## ----eval=FALSE--------------------------------------------------------------- +# SP$setSexes("yes_sys") + +## ----eval=FALSE--------------------------------------------------------------- +# pop = newPop(founderPop) + +## ----eval=FALSE--------------------------------------------------------------- +# genMean = meanG(pop) + +## ----eval=FALSE--------------------------------------------------------------- +# for(generation in 1:20){ +# pop = selectCross(pop=pop, nFemale=500, nMale=25, use="gv", nCrosses=1000) +# genMean = c(genMean, meanG(pop)) +# } + +## ----eval=FALSE--------------------------------------------------------------- +# plot(0:20, genMean, xlab="Generation", ylab="Mean Genetic Value", type="l") + +## ----message=FALSE, warning=FALSE--------------------------------------------- +library(AlphaSimR) + +## ----------------------------------------------------------------------------- +# Creating Founder Haplotypes +founderPop = quickHaplo(nInd=1000, nChr=10, segSites=1000) + +# Setting Simulation Parameters +SP = SimParam$new(founderPop) + +## ----include=FALSE------------------------------------------------------------ +SP$nThreads = 1L + +## ----------------------------------------------------------------------------- +SP$addTraitA(nQtlPerChr=1000) +SP$setSexes("yes_sys") + +# Modeling the Breeding Program +pop = newPop(founderPop) +genMean = meanG(pop) +for(generation in 1:20){ + pop = selectCross(pop=pop, nFemale=500, nMale=25, use="gv", nCrosses=1000) + genMean = c(genMean, meanG(pop)) +} + +# Examining the Results +plot(0:20, genMean, xlab="Generation", ylab="Mean Genetic Value", type="l") + diff --git a/vignettes/articles/GeneticRecombination.Rmd b/vignettes/articles/GeneticRecombination.Rmd index eb945550..fa88d5d8 100644 --- a/vignettes/articles/GeneticRecombination.Rmd +++ b/vignettes/articles/GeneticRecombination.Rmd @@ -5,8 +5,11 @@ date: "`r Sys.Date()`" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{Genetic Recombination} - %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} + %\VignetteEngine{knitr::rmarkdown} +editor_options: + markdown: + wrap: 72 --- ```{r setup, include = FALSE} @@ -16,28 +19,56 @@ knitr::opts_chunk$set( ) ``` -## Placeholder for future development +This article provides an overview of the biology of genetic +recombination and how it is modeled within AlphaSimR. I will first +provide a brief review of meiosis with special emphasis on how process +differs between diploid and autopolyploid species. I will then discuss +genetic maps before concluding with an overview specific models used to +simulate genetic recombination. + +## Meiosis + +Usual process of diploids, two copies of each chromosome, one from +mother and one from father + +Chromosome doubling, sister chromatids + +Chromosome pairing in meiosis I, bivalent pair or tetrad + +Chiasma form which may result in crossovers + +Homologous pairs split + +Sister chromatids split in meiosis II to produce haploid + +### Meiosis in Polypoids + +There are two primary types of polyploids: autopolyploids and +allopolyploids. + +Allopolyploids, such as wheat, have multiple genome copies that come +from different species. + +Meiosis and recombination proceeds similarly to diploids. + +Autopolyploids have multiple copies of the same genome. This + +## Genetic Maps + +## Recombination Models + +### Recombination Models in Autopolyploids Introduction including basic details of inheritance -Review meiosis - Stages of meiosis - Chromosome pairing - Chiasma (crossovers) - Crossover interference - Segregation - Diploids first and then polyploids - Centromeres for polyploids - -Review genetic maps - Development of maps in Morgan's fly lab - Haldane mapping function - Kosambi mapping function - -Recombination models - Count-location model - Gamma model - Gamma sprinkling model - -Recombination tracking in AlphaSimR +Review meiosis Stages of meiosis Chromosome pairing Chiasma (crossovers) +Crossover interference Segregation Diploids first and then polyploids +Centromeres for polyploids + +Review genetic maps Development of maps in Morgan's fly lab Haldane +mapping function Kosambi mapping function +Recombination models Count-location model Gamma model Gamma sprinkling +model + +Recombination tracking in AlphaSimR From 7c1ffd3a9de055476f794f141aa3301241dc7f85 Mon Sep 17 00:00:00 2001 From: janaobsteter Date: Wed, 28 May 2025 13:55:24 +0200 Subject: [PATCH 16/19] Added names to recHist --- R/Class-SimParam.R | 338 +++++++++++++++++++++++---------------------- 1 file changed, 170 insertions(+), 168 deletions(-) diff --git a/R/Class-SimParam.R b/R/Class-SimParam.R index 34f697b1..19449b0a 100644 --- a/R/Class-SimParam.R +++ b/R/Class-SimParam.R @@ -191,33 +191,33 @@ SimParam = R6Class( }, #' @description Sets restrictions on which segregating sites can - #' serve as a SNP and/or QTL. The default behavior of AlphaSimR is - #' to randomly sample QTL or SNP from all eligible sites and then - #' mark the sampled sites ineligible to be sampled as the other - #' type (e.g. if a site is sampled as a QTL it will be marked as - #' ineligible to be sampled as a SNP). This behavior is designed - #' to produce the most challenging scenario for genomic selection + #' serve as a SNP and/or QTL. The default behavior of AlphaSimR is + #' to randomly sample QTL or SNP from all eligible sites and then + #' mark the sampled sites ineligible to be sampled as the other + #' type (e.g. if a site is sampled as a QTL it will be marked as + #' ineligible to be sampled as a SNP). This behavior is designed + #' to produce the most challenging scenario for genomic selection #' when the markers used for prediction are not causal. - #' - #' Setting overlap=TRUE will prevent the addition of loci to the - #' ineligible list, but it won't remove sites already added to these - #' lists. Thus, timing of when restrSegSites is called matters. It - #' should be called before any addTrait or addSnpChip functions with + #' + #' Setting overlap=TRUE will prevent the addition of loci to the + #' ineligible list, but it won't remove sites already added to these + #' lists. Thus, timing of when restrSegSites is called matters. It + #' should be called before any addTrait or addSnpChip functions with #' the overlap=TRUE argument to freely allow loci to overlap. - #' - #' The minQtlPerChr and minSnpPerChr arguments can be used with - #' overlap=FALSE to preallocate sites as QTL and SNP respectively. This - #' option is useful when simulating multiple traits and/or SNP chips, - #' because it can be used to guarantee that enough eligible sites are + #' + #' The minQtlPerChr and minSnpPerChr arguments can be used with + #' overlap=FALSE to preallocate sites as QTL and SNP respectively. This + #' option is useful when simulating multiple traits and/or SNP chips, + #' because it can be used to guarantee that enough eligible sites are #' available when running addTrait and or addSnpChip functions. #' - #' @param minQtlPerChr the minimum number of segregating sites for + #' @param minQtlPerChr the minimum number of segregating sites for #' QTLs. Can be a single value or a vector values for each chromosome. #' @param minSnpPerChr the minimum number of segregating sites for SNPs. #' Can be a single value or a vector values for each chromosome. - #' @param excludeQtl an optional vector of segregating site names to + #' @param excludeQtl an optional vector of segregating site names to #' exclude from consideration as a viable QTL. - #' @param excludeSnp an optional vector of segregating site names to + #' @param excludeSnp an optional vector of segregating site names to #' exclude from consideration as a viable SNP. #' @param overlap should SNP and QTL sites be allowed to overlap. #' @param minSnpFreq minimum allowable frequency for SNP loci. @@ -236,7 +236,7 @@ SimParam = R6Class( # Handle any named QTL exclusions if(!is.null(excludeQtl)){ matchList = private$.findNamedLoci(excludeQtl) - + # Make exclusions restr = self$invalidQtl for(i in seq_len(self$nChr)){ @@ -244,7 +244,7 @@ SimParam = R6Class( } self$invalidQtl = restr } - + # Handle any named SNP exclusions if(!is.null(excludeSnp)){ # Check if the SNP list matches the QTL list @@ -255,11 +255,11 @@ SimParam = R6Class( findMatch = FALSE } } - + if(findMatch){ matchList = private$.findNamedLoci(excludeSnp) } - + # Make exclusions restr = self$invalidSnp for(i in seq_len(self$nChr)){ @@ -267,7 +267,7 @@ SimParam = R6Class( } self$invalidSnp = restr } - + if(overlap){ # Not setting any restrictions if overlap is allow # Existing restrictions will be left in place @@ -341,11 +341,11 @@ SimParam = R6Class( } invisible(self) }, - - #' @description - #' Allows for the manual setting of founder haplotypes. This functionality + + #' @description + #' Allows for the manual setting of founder haplotypes. This functionality #' is not fully documented, because it is still experimental. - #' + #' #' @param hapMap a list of founder haplotypes setFounderHap = function(hapMap){ private$.hap = hapMap @@ -386,12 +386,12 @@ SimParam = R6Class( self$snpChips[[self$nSnpChips + 1L]] = snpChip invisible(self) }, - + #' @description - #' Assigns SNPs to a SNP chip by supplying marker names. This function does - #' check against excluded SNPs and will not add the SNPs to the list of - #' excluded QTL for the purpose of avoiding overlap between SNPs and QTL. - #' Excluding these SNPs from being used as QTL can be accomplished using + #' Assigns SNPs to a SNP chip by supplying marker names. This function does + #' check against excluded SNPs and will not add the SNPs to the list of + #' excluded QTL for the purpose of avoiding overlap between SNPs and QTL. + #' Excluding these SNPs from being used as QTL can be accomplished using #' the excludeQtl argument in SimParam's restrSegSites function. #' #' @param markers a vector of names for the markers @@ -406,21 +406,21 @@ SimParam = R6Class( #' SP$addSnpChipByName(c("1_1","1_3")) addSnpChipByName = function(markers, name=NULL){ genMap = private$.femaleMap - + # Check that the markers are present on the map genMapMarkerNames = unlist(lapply(genMap, names)) stopifnot(all(markers%in%genMapMarkerNames)) - + # Create lociPerChr and lociLoc lociPerChr = integer(length(genMap)) lociLoc = vector("list", length(genMap)) - + # Loop through chromosomes for(i in seq_len(length(genMap))){ - + # Initialize lociLoc lociLoc[[i]] = integer() - + # Find matches if they exist take = match(names(genMap[[i]]), markers) lociPerChr[i] = length(na.omit(take)) @@ -429,18 +429,18 @@ SimParam = R6Class( } } lociLoc = unlist(lociLoc) - + snpChip = new("LociMap", nLoci=sum(lociPerChr), lociPerChr=lociPerChr, lociLoc=lociLoc) - + if(is.null(name)){ snpChip@name = paste0("Chip",self$nSnpChips + 1L) }else{ snpChip@name = name } - + self$snpChips[[self$nSnpChips + 1L]] = snpChip invisible(self) }, @@ -637,14 +637,14 @@ SimParam = R6Class( } invisible(self) }, - - #' @description - #' An alternative method for adding a trait with additive and dominance effects - #' to an AlphaSimR simulation. The function attempts to create a trait matching - #' user defined values for number of QTL, inbreeding depression, additive genetic + + #' @description + #' An alternative method for adding a trait with additive and dominance effects + #' to an AlphaSimR simulation. The function attempts to create a trait matching + #' user defined values for number of QTL, inbreeding depression, additive genetic #' variance and dominance genetic variance. - #' - #' @param nQtlPerChr number of QTLs per chromosome. + #' + #' @param nQtlPerChr number of QTLs per chromosome. #' Can be a single value or nChr values. #' @param mean desired mean of the trait #' @param varA desired additive variance @@ -656,42 +656,42 @@ SimParam = R6Class( #' @param force should the check for a running simulation be #' ignored. Only set to TRUE if you know what you are doing. #' @param name optional name for trait - #' - #' @details - #' This function will always add a trait to 'SimParam', unless an error occurs - #' with picking QTLs. The resulting trait will always have the desired mean and - #' additive genetic variance. However, it may not have the desired values for - #' inbreeding depression and dominance variance. Thus, it is strongly recommended - #' to check the output printed to the console to determine how close the trait's + #' + #' @details + #' This function will always add a trait to 'SimParam', unless an error occurs + #' with picking QTLs. The resulting trait will always have the desired mean and + #' additive genetic variance. However, it may not have the desired values for + #' inbreeding depression and dominance variance. Thus, it is strongly recommended + #' to check the output printed to the console to determine how close the trait's #' parameters came to these desired values. - #' - #' The mean and additive genetic variance will always be achieved exactly. The - #' function attempts to achieve the desired dominance variance and inbreeding - #' depression while staying within the user supplied constraints for the + #' + #' The mean and additive genetic variance will always be achieved exactly. The + #' function attempts to achieve the desired dominance variance and inbreeding + #' depression while staying within the user supplied constraints for the #' acceptable range of dominance degree mean and variance. If the desired values - #' are not being achieved, the acceptable range need to be increased and/or the - #' number of QTL may need to be increased. There are not limits to setting the - #' range for dominance degree mean and variance, but care should be taken to - #' with regards to the biological feasibility of the limits that are supplied. - #' The default limits were somewhat arbitrarily set, so I make not claim to + #' are not being achieved, the acceptable range need to be increased and/or the + #' number of QTL may need to be increased. There are not limits to setting the + #' range for dominance degree mean and variance, but care should be taken to + #' with regards to the biological feasibility of the limits that are supplied. + #' The default limits were somewhat arbitrarily set, so I make not claim to #' how reasonable these limits are for routine use. - #' - #' Inbreeding depression in this function is defined as the difference in mean - #' genetic value between a population with the same allele frequency as the - #' reference population (population used to initialize SimParam) in - #' Hardy-Weinberg equilibrium compared to a population with the same allele - #' frequency that is fully inbred. This is equivalent to the amount the mean of - #' a population increases when going from an inbreeding coefficient of 1 (fully - #' inbred) to a population with an inbreeding coefficient of 0 (Hardy-Weinberg - #' equilibrium). Note that the sign of the value should (usually) be positive. - #' This corresponds to a detrimental effect of inbreeding when higher values of + #' + #' Inbreeding depression in this function is defined as the difference in mean + #' genetic value between a population with the same allele frequency as the + #' reference population (population used to initialize SimParam) in + #' Hardy-Weinberg equilibrium compared to a population with the same allele + #' frequency that is fully inbred. This is equivalent to the amount the mean of + #' a population increases when going from an inbreeding coefficient of 1 (fully + #' inbred) to a population with an inbreeding coefficient of 0 (Hardy-Weinberg + #' equilibrium). Note that the sign of the value should (usually) be positive. + #' This corresponds to a detrimental effect of inbreeding when higher values of #' the trait are considered biologically beneficial. - #' - #' Summary information on this trait is printed to the console when silent=FALSE. - #' The summary information reports the inbreeding depression and dominance - #' variance for the population as well as the dominance degree mean and variance + #' + #' Summary information on this trait is printed to the console when silent=FALSE. + #' The summary information reports the inbreeding depression and dominance + #' variance for the population as well as the dominance degree mean and variance #' applied to the trait. - #' + #' #' @examples #' #Create founder haplotypes #' founderPop = quickHaplo(nInd=10, nChr=1, segSites=10) @@ -700,7 +700,7 @@ SimParam = R6Class( #' SP = SimParam$new(founderPop) #' \dontshow{SP$nThreads = 1L} #' SP$altAddTraitAD(nQtlPerChr=10, mean=0, varA=1, varD=0.05, inbrDepr=0.2) - altAddTraitAD = function(nQtlPerChr,mean=0,varA=1,varD=0,inbrDepr=0, + altAddTraitAD = function(nQtlPerChr,mean=0,varA=1,varD=0,inbrDepr=0, limMeanDD=c(0,1.5),limVarDD=c(0,0.5), silent=FALSE,force=FALSE,name=NULL){ if(!force){ @@ -712,10 +712,10 @@ SimParam = R6Class( if(is.null(name)){ name = paste0("Trait",self$nTraits+1) } - + # Pick QTL qtlLoci = private$.pickLoci(nQtlPerChr) - + # Create list of arguments for optimization argsList = argAltAD(LociMap = qtlLoci, Pop = self$founderPop, @@ -724,16 +724,16 @@ SimParam = R6Class( varD = varD, inbrDepr = inbrDepr, nThreads = self$nThreads) - + # Run optim to optimize meanDD and varDD optOut = optim(par = c(mean(limMeanDD), mean(sqrt(limVarDD))), - fn = objAltAD, + fn = objAltAD, gr = NULL, method = "L-BFGS-B", lower = c(limMeanDD[1], sqrt(limVarDD[1])), upper = c(limMeanDD[2], sqrt(limVarDD[2])), args = argsList) - + # Finalize creation of trait output = finAltAD(input = optOut$par, args = argsList) trait = new("TraitAD", @@ -743,7 +743,7 @@ SimParam = R6Class( intercept=c(output$intercept), name=name) private$.addTrait(trait,varA,output$varG) - + # Report trait details if(!silent){ cat("A new trait called", name, "was added. \n") @@ -752,10 +752,10 @@ SimParam = R6Class( cat(" meanDD =", output$meanDD, "\n") cat(" varDD =", output$varDD, "\n") } - + invisible(self) }, - + #' @description #' Randomly assigns eligible QTLs for one or more additive GxE traits. @@ -1001,7 +1001,7 @@ SimParam = R6Class( #' @examples #' #Create founder haplotypes #' founderPop = quickHaplo(nInd=10, nChr=1, segSites=10) - #' + #' #' #Set simulation parameters #' SP = SimParam$new(founderPop) #' \dontshow{SP$nThreads = 1L} @@ -1621,10 +1621,10 @@ SimParam = R6Class( private$.varE = diag(private$.varE) } private$.varE[traitPos] = varE - + # Update activeQtl private$.setActiveQtl() - + invisible(self) }, @@ -1647,18 +1647,18 @@ SimParam = R6Class( }else{ private$.varE = private$.varE[-traits] } - + # Update activeQtl private$.setActiveQtl() - + invisible(self) }, #' @description Defines a default values for error - #' variances used in \code{\link{setPheno}}. These defaults - #' will be used to automatically generate phenotypes when new + #' variances used in \code{\link{setPheno}}. These defaults + #' will be used to automatically generate phenotypes when new #' populations are created. See the details section of \code{\link{setPheno}} - #' for more information about each arguments and how they + #' for more information about each arguments and how they #' should be used. #' #' @param h2 a vector of desired narrow-sense heritabilities @@ -1681,7 +1681,7 @@ SimParam = R6Class( stopifnot(isSymmetric(corE), nrow(corE)==self$nTraits) } - + # Set error variances (.varE) if(!is.null(h2)){ stopifnot(length(h2)==self$nTraits, @@ -1715,7 +1715,7 @@ SimParam = R6Class( }else{ private$.varE = rep(NA_real_, self$nTraits) } - + # Set error correlations if(!is.null(corE)){ if(is.matrix(private$.varE)){ @@ -1729,7 +1729,7 @@ SimParam = R6Class( varE = varE%*%corE%*%varE private$.varE = varE } - + invisible(self) }, @@ -1927,16 +1927,16 @@ SimParam = R6Class( }else{ genMap = lapply(genMap, function(x) x-x[1]) # Set position 1 to 0 } - + if(is.null(centromere)){ centromere=sapply(genMap,max)/2 } stopifnot(length(genMap)==self$nChr, centromere<=sapply(genMap,max)) - + tmp = do.call("c",lapply(genMap,length)) stopifnot(all(tmp==private$.segSites)) - + private$.sepMap = FALSE private$.femaleMap = genMap private$.maleMap = NULL @@ -1960,16 +1960,16 @@ SimParam = R6Class( }else{ genMap = lapply(genMap, function(x) x-x[1]) # Set position 1 to 0 } - + if(is.null(centromere)){ centromere=sapply(genMap,max)/2 } stopifnot(length(genMap)==self$nChr, centromere<=sapply(genMap,max)) - + tmp = do.call("c",lapply(genMap,length)) stopifnot(all(tmp==private$.segSites)) - + if(private$.sepMap){ private$.femaleMap = genMap private$.femaleCentromere = centromere @@ -1998,16 +1998,16 @@ SimParam = R6Class( }else{ genMap = lapply(genMap, function(x) x-x[1]) # Set position 1 to 0 } - + if(is.null(centromere)){ centromere=sapply(genMap,max)/2 } stopifnot(length(genMap)==self$nChr, centromere<=sapply(genMap,max)) - + tmp = do.call("c",lapply(genMap,length)) stopifnot(all(tmp==private$.segSites)) - + private$.sepMap = TRUE private$.maleMap = genMap private$.maleCentromere = centromere @@ -2054,12 +2054,14 @@ SimParam = R6Class( } private$.hasHap = c(private$.hasHap, rep(FALSE, nNewInd)) private$.isFounder = c(private$.isFounder, rep(TRUE, nNewInd)) + names(newRecHist) = id private$.recHist = c(private$.recHist, newRecHist) private$.lastHaplo = tmpLastHaplo }else{ # Add hist to recombination history private$.hasHap = c(private$.hasHap, rep(FALSE, nNewInd)) private$.isFounder = c(private$.isFounder, rep(FALSE, nNewInd)) + names(hist) = id private$.recHist = c(private$.recHist, hist) } private$.pedigree = rbind(private$.pedigree, tmp) @@ -2184,7 +2186,7 @@ SimParam = R6Class( .isFounder="logical", .activeQtl = "LociMap", .qtlIndex = "list", - + # Determines whether not a simulation has started using lastId as an indicator .isRunning = function(){ if(private$.lastId==0L){ @@ -2193,12 +2195,12 @@ SimParam = R6Class( stop("lastId doesn't equal 0, you must run resetPed to proceed") } }, - + # Adds a trait to simulation and ensures all fields are propagated .addTrait = function(lociMap,varA=NA_real_,varG=NA_real_,varE=NA_real_){ stopifnot(is.numeric(varA),is.numeric(varG),is.numeric(varE), length(varA)==1,length(varG)==1,length(varE)==1) - + if(self$nTraits == 0L){ # Initializing .activeQtl private$.activeQtl = new("LociMap", @@ -2208,7 +2210,7 @@ SimParam = R6Class( private$.qtlIndex[[1L]] = seq_len(private$.activeQtl@nLoci) }else{ # Check if exiting activeQtl is a superset - testSet = findLociMapSuperset(private$.activeQtl, + testSet = findLociMapSuperset(private$.activeQtl, lociMap) if(is.null(testSet)){ # Existing activeQtl is a superset, just add qtlIndex @@ -2216,24 +2218,24 @@ SimParam = R6Class( }else{ # New activeQtl object required, recompute all values for qtlIndex private$.activeQtl = testSet - + for(i in seq_len(self$nTraits)){ private$.qtlIndex[[i]] = findQtlIndex(private$.activeQtl, self$traits[[i]]) } - - private$.qtlIndex[[self$nTraits + 1L]] = + + private$.qtlIndex[[self$nTraits + 1L]] = findQtlIndex(private$.activeQtl, lociMap) } } - + private$.traits[[self$nTraits + 1L]] = lociMap private$.varA = c(private$.varA,varA) private$.varG = c(private$.varG,varG) private$.varE = c(private$.varE,varE) invisible(self) }, - - # Samples eligible loci for traits or SNP chips and ensures that they + + # Samples eligible loci for traits or SNP chips and ensures that they # are added to the exclusion list when applicable .pickLoci = function(nSitesPerChr, QTL=TRUE, minFreq=NULL, refPop=NULL){ stopifnot(length(nSitesPerChr)==self$nChr) @@ -2263,7 +2265,7 @@ SimParam = R6Class( pot[[chr]] = tmp[tmp%in%pot[[chr]]] } } - + stopifnot("Not enough eligible sites, see ?SimParam and method restrSegSites()" = sapply(pot,length)>=nSitesPerChr) # Sample sites @@ -2296,9 +2298,9 @@ SimParam = R6Class( lociLoc=as.integer(lociLoc)) return(loci) }, - + # Returns physical positions of named loci in a list format - # Input order is not preserved. This function is intended as + # Input order is not preserved. This function is intended as # a helper for restrSegSites .findNamedLoci = function(lociNames){ # Loci names @@ -2307,7 +2309,7 @@ SimParam = R6Class( if(any(is.na(take))){ stop("One or more loci are not on the genetic map. Beware of case sensitivity.") } - + # Find positions using an interval search strategy on the cumulative sum take = unique(take) posList = rep(list(integer()), self$nChr) @@ -2315,55 +2317,55 @@ SimParam = R6Class( for(i in take){ # Identify chromosome chr = findInterval(i, cumSumSegSite, left.open = TRUE) + 1L - + # Identify position if(chr>1L){ pos = i - cumSumSegSite[chr-1L] }else{ pos = i } - + # Add site to list posList[[chr]] = c(posList[[chr]], pos) } - + return(posList) }, - + # Sets activeQtl and qtlIndex slots # Run after removing or modifying existing traits .setActiveQtl = function(){ - + private$.activeQtl = NULL private$.qtlIndex = list() - + if(self$nTraits > 0L){ # Set activeQtl using trait 1 private$.activeQtl = new("LociMap", nLoci = self$traits[[1L]]@nLoci, lociPerChr = self$traits[[1L]]@lociPerChr, lociLoc = self$traits[[1L]]@lociLoc) - + private$.qtlIndex[[1L]] = seq_len(private$.activeQtl@nLoci) } - + if(self$nTraits > 1L){ # Update activeQtl, if needed for(i in 2:self$nTraits){ - testSet = findLociMapSuperset(private$.activeQtl, + testSet = findLociMapSuperset(private$.activeQtl, self$traits[[i]]) if(!is.null(testSet)){ private$.activeQtl = testSet } } - + # Update qtlIndex for(i in seq_len(self$nTraits)){ - private$.qtlIndex[[i]] = findQtlIndex(private$.activeQtl, + private$.qtlIndex[[i]] = findQtlIndex(private$.activeQtl, self$traits[[i]]) } } - + invisible(self) } @@ -2638,8 +2640,8 @@ SimParam = R6Class( }else{ stop("`$version` is read only",call.=FALSE) } - }, - + }, + #' @field activeQtl a LociMap representing all active QTL in simulation activeQtl=function(value){ if(missing(value)){ @@ -2648,8 +2650,8 @@ SimParam = R6Class( stop("`$activeQtl` is read only",call.=FALSE) } }, - - #' @field qtlIndex a list of vectors giving trait specific QTL indices + + #' @field qtlIndex a list of vectors giving trait specific QTL indices #' relative to all active QTL qtlIndex=function(value){ if(missing(value)){ @@ -2664,8 +2666,8 @@ SimParam = R6Class( #### External helpers ---- #' @title Sample additive effects -#' -#' @description Samples deviates from a normal distribution or gamma distribution +#' +#' @description Samples deviates from a normal distribution or gamma distribution #' with a random sign #' #' @param qtlLoci total number of loci @@ -2675,7 +2677,7 @@ SimParam = R6Class( #' @param shape gamma distribution shape parameter #' #' @returns a matrix with dimensions qtlLoci by nTraits -#' +#' #' @keywords internal sampAddEff = function(qtlLoci,nTraits,corr,gamma,shape){ addEff = matrix(rnorm(qtlLoci@nLoci*nTraits), @@ -2690,9 +2692,9 @@ sampAddEff = function(qtlLoci,nTraits,corr,gamma,shape){ } #' @title Sample dominance effects -#' -#' @description Samples dominance deviation effects from a normal distribution -#' and uses previously sampled additive effects to form dominance +#' +#' @description Samples dominance deviation effects from a normal distribution +#' and uses previously sampled additive effects to form dominance #' effects #' #' @param qtlLoci total number of loci @@ -2703,7 +2705,7 @@ sampAddEff = function(qtlLoci,nTraits,corr,gamma,shape){ #' @param varDD variance of dominance degrees #' #' @returns a matrix with dimensions qtlLoci by nTraits -#' +#' #' @keywords internal sampDomEff = function(qtlLoci,nTraits,addEff,corDD, meanDD,varDD){ @@ -2716,7 +2718,7 @@ sampDomEff = function(qtlLoci,nTraits,addEff,corDD, } #' @title Sample epistatic effects -#' +#' #' @description Samples epistatic effects from a normal distribution or gamma distribution #' with a variance relative to the variance of the additive effects #' @@ -2728,7 +2730,7 @@ sampDomEff = function(qtlLoci,nTraits,addEff,corDD, #' @param relVar desired variance for epistatic effects #' #' @returns a matrix with dimensions qtlLoci by nTraits -#' +#' #' @keywords internal sampEpiEff = function(qtlLoci,nTraits,corr,gamma,shape,relVar){ epiEff = matrix(rnorm(qtlLoci@nLoci*nTraits/2), @@ -2750,37 +2752,37 @@ isSimParam = function(x) { } #' @title Find LociMap superset -#' -#' @description Compares to a \code{\link{LociMap-class}} objects to determine if -#' the first one is a superset of the second. If it is, the function returns NULL. +#' +#' @description Compares to a \code{\link{LociMap-class}} objects to determine if +#' the first one is a superset of the second. If it is, the function returns NULL. #' If it is not, the function return a \code{\link{LociMap-class}} object that is #' a superset of both a \code{\link{LociMap-class}} objects. #' -#' @param lociMap1 a \code{\link{LociMap-class}} that is tested to determine if +#' @param lociMap1 a \code{\link{LociMap-class}} that is tested to determine if #' it is a superset #' @param lociMap2 a second \code{\link{LociMap-class}} that is tested #' -#' @returns NULL if locMap1 is a superset, or a \code{\link{LociMap-class}} if it +#' @returns NULL if locMap1 is a superset, or a \code{\link{LociMap-class}} if it #' is not -#' +#' #' @keywords internal findLociMapSuperset = function(lociMap1, lociMap2){ - + loc1 = paste(rep(x = seq_len(length(lociMap1@lociPerChr)), times = lociMap1@lociPerChr), lociMap1@lociLoc, sep="_") - + loc2 = paste(rep(x = seq_len(length(lociMap2@lociPerChr)), times = lociMap2@lociPerChr), lociMap2@lociLoc, sep="_") - + inSet1 = loc2 %in% loc1 - + if(all(inSet1)){ # lociMap1 is a superset of lociMap2, no action needed return(NULL) } - + # Create an ordered matrix with columns chr and lociLoc for loci superset loc = unique(c(loc1, loc2)) tab = sapply(loc, function(x){ @@ -2789,9 +2791,9 @@ findLociMapSuperset = function(lociMap1, lociMap2){ }) tab = t(tab) tab = tab[order(tab[,1],tab[,2]),] - + # Return LociMap for superset of loci - return(new("LociMap", + return(new("LociMap", nLoci = nrow(tab), lociPerChr = tabulate(tab[,1], nbins=length(lociMap1@lociPerChr)), lociLoc = unname(tab[,2]))) @@ -2799,29 +2801,29 @@ findLociMapSuperset = function(lociMap1, lociMap2){ #' @title Find trait QTL index -#' -#' @description Compares to a \code{\link{LociMap-class}} objects to determine if -#' the first one is a superset of the second. If it is, the function returns NULL. +#' +#' @description Compares to a \code{\link{LociMap-class}} objects to determine if +#' the first one is a superset of the second. If it is, the function returns NULL. #' If it is not, the function return a \code{\link{LociMap-class}} object that is #' a superset of both a \code{\link{LociMap-class}} objects. #' #' @param activeQtl a \code{\link{LociMap-class}} representing all active QTL -#' @param traitQtl a \code{\link{LociMap-class}} representing QTL for a trait of +#' @param traitQtl a \code{\link{LociMap-class}} representing QTL for a trait of #' interest #' #' @returns an integer vector for relative positions of traitQtl on activeQtl -#' +#' #' @keywords internal findQtlIndex = function(activeQtl, traitQtl){ - + activeLoci = paste(rep(x = seq_len(length(activeQtl@lociPerChr)), times = activeQtl@lociPerChr), activeQtl@lociLoc, sep="_") - + traitLoci = paste(rep(x = seq_len(length(traitQtl@lociPerChr)), times = traitQtl@lociPerChr), traitQtl@lociLoc, sep="_") - + return(match(traitLoci, activeLoci)) } From fe2d5c1f453402758fdde46011a86894b74dd40a Mon Sep 17 00:00:00 2001 From: Gregor Gorjanc Date: Wed, 28 May 2025 14:21:50 +0200 Subject: [PATCH 17/19] Added NEWS item, recompiled docs, and ran checks --- NEWS.md | 2 + man/SimParam.Rd | 110 ++++++++++++++++++------------------- man/findLociMapSuperset.Rd | 8 +-- man/findQtlIndex.Rd | 6 +- man/sampAddEff.Rd | 2 +- man/sampDomEff.Rd | 4 +- 6 files changed, 67 insertions(+), 65 deletions(-) diff --git a/NEWS.md b/NEWS.md index 3dc7eb5c..d8a91f1d 100644 --- a/NEWS.md +++ b/NEWS.md @@ -1,5 +1,7 @@ # AlphaSimR 1.6.1.9990 +*added names to `SP$recHist` + *added `asCategorical` to convert a normal (Gaussian) trait to an ordered categorical (threshold) trait *improved computational performance of simulations with multiple traits diff --git a/man/SimParam.Rd b/man/SimParam.Rd index 9401beb2..ca5ccae1 100644 --- a/man/SimParam.Rd +++ b/man/SimParam.Rd @@ -380,7 +380,7 @@ genetic map} \item{\code{activeQtl}}{a LociMap representing all active QTL in simulation} -\item{\code{qtlIndex}}{a list of vectors giving trait specific QTL indices +\item{\code{qtlIndex}}{a list of vectors giving trait specific QTL indices relative to all active QTL} } \if{html}{\out{}} @@ -587,24 +587,24 @@ pop2@id # 1:10 \if{latex}{\out{\hypertarget{method-SimParam-restrSegSites}{}}} \subsection{Method \code{restrSegSites()}}{ Sets restrictions on which segregating sites can -serve as a SNP and/or QTL. The default behavior of AlphaSimR is -to randomly sample QTL or SNP from all eligible sites and then -mark the sampled sites ineligible to be sampled as the other -type (e.g. if a site is sampled as a QTL it will be marked as -ineligible to be sampled as a SNP). This behavior is designed -to produce the most challenging scenario for genomic selection +serve as a SNP and/or QTL. The default behavior of AlphaSimR is +to randomly sample QTL or SNP from all eligible sites and then +mark the sampled sites ineligible to be sampled as the other +type (e.g. if a site is sampled as a QTL it will be marked as +ineligible to be sampled as a SNP). This behavior is designed +to produce the most challenging scenario for genomic selection when the markers used for prediction are not causal. -Setting overlap=TRUE will prevent the addition of loci to the -ineligible list, but it won't remove sites already added to these -lists. Thus, timing of when restrSegSites is called matters. It -should be called before any addTrait or addSnpChip functions with +Setting overlap=TRUE will prevent the addition of loci to the +ineligible list, but it won't remove sites already added to these +lists. Thus, timing of when restrSegSites is called matters. It +should be called before any addTrait or addSnpChip functions with the overlap=TRUE argument to freely allow loci to overlap. -The minQtlPerChr and minSnpPerChr arguments can be used with -overlap=FALSE to preallocate sites as QTL and SNP respectively. This -option is useful when simulating multiple traits and/or SNP chips, -because it can be used to guarantee that enough eligible sites are +The minQtlPerChr and minSnpPerChr arguments can be used with +overlap=FALSE to preallocate sites as QTL and SNP respectively. This +option is useful when simulating multiple traits and/or SNP chips, +because it can be used to guarantee that enough eligible sites are available when running addTrait and or addSnpChip functions. \subsection{Usage}{ \if{html}{\out{
}}\preformatted{SimParam$restrSegSites( @@ -620,16 +620,16 @@ available when running addTrait and or addSnpChip functions. \subsection{Arguments}{ \if{html}{\out{
}} \describe{ -\item{\code{minQtlPerChr}}{the minimum number of segregating sites for +\item{\code{minQtlPerChr}}{the minimum number of segregating sites for QTLs. Can be a single value or a vector values for each chromosome.} \item{\code{minSnpPerChr}}{the minimum number of segregating sites for SNPs. Can be a single value or a vector values for each chromosome.} -\item{\code{excludeQtl}}{an optional vector of segregating site names to +\item{\code{excludeQtl}}{an optional vector of segregating site names to exclude from consideration as a viable QTL.} -\item{\code{excludeSnp}}{an optional vector of segregating site names to +\item{\code{excludeSnp}}{an optional vector of segregating site names to exclude from consideration as a viable SNP.} \item{\code{overlap}}{should SNP and QTL sites be allowed to overlap.} @@ -700,7 +700,7 @@ SP$setSexes("yes_sys") \if{html}{\out{}} \if{latex}{\out{\hypertarget{method-SimParam-setFounderHap}{}}} \subsection{Method \code{setFounderHap()}}{ -Allows for the manual setting of founder haplotypes. This functionality +Allows for the manual setting of founder haplotypes. This functionality is not fully documented, because it is still experimental. \subsection{Usage}{ \if{html}{\out{
}}\preformatted{SimParam$setFounderHap(hapMap)}\if{html}{\out{
}} @@ -758,10 +758,10 @@ SP$addSnpChip(10) \if{html}{\out{}} \if{latex}{\out{\hypertarget{method-SimParam-addSnpChipByName}{}}} \subsection{Method \code{addSnpChipByName()}}{ -Assigns SNPs to a SNP chip by supplying marker names. This function does -check against excluded SNPs and will not add the SNPs to the list of -excluded QTL for the purpose of avoiding overlap between SNPs and QTL. -Excluding these SNPs from being used as QTL can be accomplished using +Assigns SNPs to a SNP chip by supplying marker names. This function does +check against excluded SNPs and will not add the SNPs to the list of +excluded QTL for the purpose of avoiding overlap between SNPs and QTL. +Excluding these SNPs from being used as QTL can be accomplished using the excludeQtl argument in SimParam's restrSegSites function. \subsection{Usage}{ \if{html}{\out{
}}\preformatted{SimParam$addSnpChipByName(markers, name = NULL)}\if{html}{\out{
}} @@ -950,9 +950,9 @@ SP$addTraitAD(10, meanDD=0.5) \if{html}{\out{}} \if{latex}{\out{\hypertarget{method-SimParam-altAddTraitAD}{}}} \subsection{Method \code{altAddTraitAD()}}{ -An alternative method for adding a trait with additive and dominance effects -to an AlphaSimR simulation. The function attempts to create a trait matching -user defined values for number of QTL, inbreeding depression, additive genetic +An alternative method for adding a trait with additive and dominance effects +to an AlphaSimR simulation. The function attempts to create a trait matching +user defined values for number of QTL, inbreeding depression, additive genetic variance and dominance genetic variance. \subsection{Usage}{ \if{html}{\out{
}}\preformatted{SimParam$altAddTraitAD( @@ -972,7 +972,7 @@ variance and dominance genetic variance. \subsection{Arguments}{ \if{html}{\out{
}} \describe{ -\item{\code{nQtlPerChr}}{number of QTLs per chromosome. +\item{\code{nQtlPerChr}}{number of QTLs per chromosome. Can be a single value or nChr values.} \item{\code{mean}}{desired mean of the trait} @@ -997,38 +997,38 @@ ignored. Only set to TRUE if you know what you are doing.} \if{html}{\out{
}} } \subsection{Details}{ -This function will always add a trait to 'SimParam', unless an error occurs -with picking QTLs. The resulting trait will always have the desired mean and -additive genetic variance. However, it may not have the desired values for -inbreeding depression and dominance variance. Thus, it is strongly recommended -to check the output printed to the console to determine how close the trait's +This function will always add a trait to 'SimParam', unless an error occurs +with picking QTLs. The resulting trait will always have the desired mean and +additive genetic variance. However, it may not have the desired values for +inbreeding depression and dominance variance. Thus, it is strongly recommended +to check the output printed to the console to determine how close the trait's parameters came to these desired values. -The mean and additive genetic variance will always be achieved exactly. The -function attempts to achieve the desired dominance variance and inbreeding -depression while staying within the user supplied constraints for the +The mean and additive genetic variance will always be achieved exactly. The +function attempts to achieve the desired dominance variance and inbreeding +depression while staying within the user supplied constraints for the acceptable range of dominance degree mean and variance. If the desired values -are not being achieved, the acceptable range need to be increased and/or the -number of QTL may need to be increased. There are not limits to setting the -range for dominance degree mean and variance, but care should be taken to -with regards to the biological feasibility of the limits that are supplied. -The default limits were somewhat arbitrarily set, so I make not claim to +are not being achieved, the acceptable range need to be increased and/or the +number of QTL may need to be increased. There are not limits to setting the +range for dominance degree mean and variance, but care should be taken to +with regards to the biological feasibility of the limits that are supplied. +The default limits were somewhat arbitrarily set, so I make not claim to how reasonable these limits are for routine use. -Inbreeding depression in this function is defined as the difference in mean -genetic value between a population with the same allele frequency as the -reference population (population used to initialize SimParam) in -Hardy-Weinberg equilibrium compared to a population with the same allele -frequency that is fully inbred. This is equivalent to the amount the mean of -a population increases when going from an inbreeding coefficient of 1 (fully -inbred) to a population with an inbreeding coefficient of 0 (Hardy-Weinberg -equilibrium). Note that the sign of the value should (usually) be positive. -This corresponds to a detrimental effect of inbreeding when higher values of +Inbreeding depression in this function is defined as the difference in mean +genetic value between a population with the same allele frequency as the +reference population (population used to initialize SimParam) in +Hardy-Weinberg equilibrium compared to a population with the same allele +frequency that is fully inbred. This is equivalent to the amount the mean of +a population increases when going from an inbreeding coefficient of 1 (fully +inbred) to a population with an inbreeding coefficient of 0 (Hardy-Weinberg +equilibrium). Note that the sign of the value should (usually) be positive. +This corresponds to a detrimental effect of inbreeding when higher values of the trait are considered biologically beneficial. -Summary information on this trait is printed to the console when silent=FALSE. -The summary information reports the inbreeding depression and dominance -variance for the population as well as the dominance degree mean and variance +Summary information on this trait is printed to the console when silent=FALSE. +The summary information reports the inbreeding depression and dominance +variance for the population as well as the dominance degree mean and variance applied to the trait. } @@ -1629,10 +1629,10 @@ ignored. Only set to TRUE if you know what you are doing} \if{latex}{\out{\hypertarget{method-SimParam-setVarE}{}}} \subsection{Method \code{setVarE()}}{ Defines a default values for error -variances used in \code{\link{setPheno}}. These defaults -will be used to automatically generate phenotypes when new +variances used in \code{\link{setPheno}}. These defaults +will be used to automatically generate phenotypes when new populations are created. See the details section of \code{\link{setPheno}} -for more information about each arguments and how they +for more information about each arguments and how they should be used. \subsection{Usage}{ \if{html}{\out{
}}\preformatted{SimParam$setVarE(h2 = NULL, H2 = NULL, varE = NULL, corE = NULL)}\if{html}{\out{
}} diff --git a/man/findLociMapSuperset.Rd b/man/findLociMapSuperset.Rd index 0a419164..1542d2dd 100644 --- a/man/findLociMapSuperset.Rd +++ b/man/findLociMapSuperset.Rd @@ -7,18 +7,18 @@ findLociMapSuperset(lociMap1, lociMap2) } \arguments{ -\item{lociMap1}{a \code{\link{LociMap-class}} that is tested to determine if +\item{lociMap1}{a \code{\link{LociMap-class}} that is tested to determine if it is a superset} \item{lociMap2}{a second \code{\link{LociMap-class}} that is tested} } \value{ -NULL if locMap1 is a superset, or a \code{\link{LociMap-class}} if it +NULL if locMap1 is a superset, or a \code{\link{LociMap-class}} if it is not } \description{ -Compares to a \code{\link{LociMap-class}} objects to determine if -the first one is a superset of the second. If it is, the function returns NULL. +Compares to a \code{\link{LociMap-class}} objects to determine if +the first one is a superset of the second. If it is, the function returns NULL. If it is not, the function return a \code{\link{LociMap-class}} object that is a superset of both a \code{\link{LociMap-class}} objects. } diff --git a/man/findQtlIndex.Rd b/man/findQtlIndex.Rd index 983e1577..8ee3e63d 100644 --- a/man/findQtlIndex.Rd +++ b/man/findQtlIndex.Rd @@ -9,15 +9,15 @@ findQtlIndex(activeQtl, traitQtl) \arguments{ \item{activeQtl}{a \code{\link{LociMap-class}} representing all active QTL} -\item{traitQtl}{a \code{\link{LociMap-class}} representing QTL for a trait of +\item{traitQtl}{a \code{\link{LociMap-class}} representing QTL for a trait of interest} } \value{ an integer vector for relative positions of traitQtl on activeQtl } \description{ -Compares to a \code{\link{LociMap-class}} objects to determine if -the first one is a superset of the second. If it is, the function returns NULL. +Compares to a \code{\link{LociMap-class}} objects to determine if +the first one is a superset of the second. If it is, the function returns NULL. If it is not, the function return a \code{\link{LociMap-class}} object that is a superset of both a \code{\link{LociMap-class}} objects. } diff --git a/man/sampAddEff.Rd b/man/sampAddEff.Rd index 490356da..b4d0ff74 100644 --- a/man/sampAddEff.Rd +++ b/man/sampAddEff.Rd @@ -21,7 +21,7 @@ sampAddEff(qtlLoci, nTraits, corr, gamma, shape) a matrix with dimensions qtlLoci by nTraits } \description{ -Samples deviates from a normal distribution or gamma distribution +Samples deviates from a normal distribution or gamma distribution with a random sign } \keyword{internal} diff --git a/man/sampDomEff.Rd b/man/sampDomEff.Rd index f5e69a89..552c9ae0 100644 --- a/man/sampDomEff.Rd +++ b/man/sampDomEff.Rd @@ -23,8 +23,8 @@ sampDomEff(qtlLoci, nTraits, addEff, corDD, meanDD, varDD) a matrix with dimensions qtlLoci by nTraits } \description{ -Samples dominance deviation effects from a normal distribution -and uses previously sampled additive effects to form dominance +Samples dominance deviation effects from a normal distribution +and uses previously sampled additive effects to form dominance effects } \keyword{internal} From 01002d57c875d5901b6c7cad2b8afbc792c8088d Mon Sep 17 00:00:00 2001 From: Gregor Gorjanc Date: Fri, 1 Aug 2025 17:07:22 +0100 Subject: [PATCH 18/19] Update CITATION Added Bancic et al. (2025) citation so we direct users to plant breeding examples/scripts --- inst/CITATION | 13 ++++++++++++- 1 file changed, 12 insertions(+), 1 deletion(-) diff --git a/inst/CITATION b/inst/CITATION index 5940f96c..5d25638e 100644 --- a/inst/CITATION +++ b/inst/CITATION @@ -1,5 +1,5 @@ bibentry( - bibtype = "Article", + bibtype = "Article", title = "AlphaSimR: an R package for breeding program simulations", author = "R Chris Gaynor, Gregor Gorjanc, John M Hickey", journal = "G3 Gene|Genomes|Genetics", @@ -9,3 +9,14 @@ bibentry( issue = "2", url = "https://doi.org/10.1093/g3journal/jkaa017" ) + +bibentry( + bibtype = "Article", + title = "Plant breeding simulations with AlphaSimR", + author = "Jon Bančič, Philip Greenspoon, R Chris Gaynor, Gregor Gorjanc", + journal = "Crop Science", + year = "2025", + volume = "65", + number = "e21312", + url = "https://doi.org/10.1002/csc2.21312" +) From 40722ed2d3b6deca05c4c9beeab13ad9c389c996 Mon Sep 17 00:00:00 2001 From: Chris Gaynor Date: Mon, 1 Sep 2025 16:20:30 -0500 Subject: [PATCH 19/19] pre release 2.0.0 --- DESCRIPTION | 4 ++-- NEWS.md | 4 +++- 2 files changed, 5 insertions(+), 3 deletions(-) diff --git a/DESCRIPTION b/DESCRIPTION index 78ff1d9a..e73b575b 100644 --- a/DESCRIPTION +++ b/DESCRIPTION @@ -1,8 +1,8 @@ Package: AlphaSimR Type: Package Title: Breeding Program Simulations -Version: 1.6.1.9007 -Date: 2025-04-05 +Version: 2.0.0 +Date: 2025-09-01 Authors@R: c(person("Chris", "Gaynor", email = "gaynor.robert@hotmail.com", role = c("aut", "cre"), comment = c(ORCID = "0000-0003-0558-6656")), person("Gregor", "Gorjanc", role = "ctb", diff --git a/NEWS.md b/NEWS.md index d8a91f1d..45d4d7a9 100644 --- a/NEWS.md +++ b/NEWS.md @@ -1,4 +1,4 @@ -# AlphaSimR 1.6.1.9990 +# AlphaSimR 2.0.0 *added names to `SP$recHist` @@ -10,6 +10,8 @@ *changed finalizePop function call in `.newPop` to pass simParam as an argument +*updated version numbering to follow tidyverse format with a major version indicating backwards compatibility has been broken + # AlphaSimR 1.6.1 *fixed bug in `mergePops` and `[` (subset) methods - they were failing for populations that had a misc slot with a matrix - we now check if a misc slot element is a matrix and rbind them for `mergePops` and subset rows for `[` (assuming the first dimension represents individuals)