--- title: "Interactive Use of nprcgenekeepr" subtitle: "An R Package for the Genetic Management of Colonies" output: html_document: df_print: paged vignette: > %\VignetteEngine{knitr::rmarkdown_notangle} %\VignetteIndexEntry{Interactive Use of nprcgenekeepr} %\usepackage[UTF-8]{inputenc} --- ```{r setup, include=FALSE} knitr::opts_chunk$set( echo = TRUE, include = TRUE, results = "markup", cache = FALSE ) pdf.options(useDingbats = TRUE) start_time <- proc.time() ``` ```{r, include = FALSE} library(stringi) # library(rmsutilityr) suppressMessages(library(ggplot2)) library(knitr) library(nprcgenekeepr) set_seed(1L) ``` ## Introduction This tutorial demonstrates the major functions used within the Shiny application provided by the __nprcgenekeepr__ package and provides sufficient insight into those functions that they may be used independently. This tutorial is primarily directed toward someone with experience using R who wants to better understand how the Shiny application works or to perform some actions not directly supported by the Shiny application. Please provide any comments, questions, or bug reports through the GitHub issue tracker at [https://github.com/rmsharp/nprcgenekeepr/issues]( https://github.com/rmsharp/nprcgenekeepr/issues). ## Installation and Help You can install **nprcgenekeepr** from GitHub with the following code. ```{r gh-installation, eval = FALSE, echo=TRUE} install.packages(nprcgenekeepr) ## Use the following code to get the development version # install.packages("devtools") # devtools::install_github("rmsharp/nprcgenekeepr") ``` All missing dependencies should be automatically installed. You can get help from the __R__ console with ```{r gh-help, eval = FALSE, echo=TRUE} ?nprcgenekeepr ``` __The help provided by this (_nprcgenekeepr.R_) needs to be more complete and include links to the tutorials.__ ## Reading in a Pedigree A pedigrees can be imported using either Excel worksheets or text files that contain all of the pedigree information or using either Excel worksheets or text files that contain a list of focal animals with the remainder of the pedigree information is pulled in through the LabKey API. This tutorial will use a pedigree file that can be created using the __makeExamplePedigreeFile__ function as shown below. The function __makeExamplePedigreeFile__ both saves a file and returns the full path name to the saved file, which we are saving into the variable _pedigreeFile_. Note: the user will select where to store the file. ```{r make-example-pedigree, eval = FALSE, include = TRUE} library(nprcgenekeepr) pedigreeFile <- makeExamplePedigreeFile() ``` This writes _ExamplePedigree.csv_ to a place you select within your file system. ```{r get-file-name, eval = TRUE, echo = FALSE, include = FALSE} pedigreeFile <- "../inst/extdata/ExamplePedigree.csv" ``` You use the file name provided by the __makeExamplePedigreeFile__ function to tell __read.table__ what file to read. ```{r read-in-example-pedigree} breederPedCsv <- read.table(pedigreeFile, sep = ",", header = TRUE, stringsAsFactors = FALSE ) breederPedCsv$fromCenter <- "TRUE" breederPedCsv$fromCenter[ suppressWarnings(sample( which(is.na(breederPedCsv$sire) & is.na(breederPedCsv$dam)), round(0.8 * length(which( is.na(breederPedCsv$sire) & is.na(breederPedCsv$dam) ))) )) ] <- "FALSE" ``` Note the number of rows read. Each row represents an individual within the pedigree. ```{r row-count} nrow(breederPedCsv) ``` The next step is to put the information read from the file into a pedigree object. This is done with the __qcStudbook__ function, which examines the file contents and tests for common pedigree errors. You can see the errors that can be detected by __qcStudbook__ by returning the empty error list with __getEmptyErrorLst()__. We are not showing the output of the function call now because later in this tutorial we will explore errors in more depth. __qcStudbook__ can take four arguments _sb_, _minParentAge_ (in years), _reportChanges_, and _reportErrors_. However, all but _sb_ have default values and only the _sb_ argument is required. It is prudent to ensure that parents are at least of breeding age, which is species specific. I have used a _minParentAge_ of 2 years.[^1] [^1]: Setting the _minParentAge_ to 3.5 and above will cause an error along with the creation of a file _~/lowParentAge.csv_ that will list the parents with low age at the birth of an offspring. ```{r form-pedigree-object} breederPed <- qcStudbook(breederPedCsv, minParentAge = 2L) ``` If __qcStudbook__ reports an error, change the call by adding the __reportErrors__ argument set to __TRUE__ and examine the returned object. More on this is presented in the __Pedigree Errors__ section. ## Identifying Focal Animals You may want to focus your work on a _focal_ group of animals. This can be done by reading in a list of animal IDs that make up the _focal_ group and use that list to update the pedigree. Alternatively you can created a list of animal IDs based on criteria you have selected. For example, to select living animals at the facility with at least one parent, the following can be used. ```{r select-focal-animals-from-pedigree, results='asis'} focalAnimals <- breederPed$id[!(is.na(breederPed$sire) & is.na(breederPed$dam)) & is.na(breederPed$exit)] print(stri_c( "There are ", length(focalAnimals), " animals in the vector _focalAnimals_." )) ``` As can be seen, these animals have at least one parent and have not left the facility. ```{r short-pedigree-list-of-focal-animals} breederPed[breederPed$id %in% focalAnimals, c("id", "sire", "dam", "exit")][1L:10L, ] ``` We indicate that these are the animals of interest by using the __setPopulation__ function. This function simply sets a column named _population_[^2] to the logical value of __TRUE__ if the row represents an animal in the list and __FALSE__ otherwise. [^2]: The _population_ column is created and added to the pedigree object if it does not already exist. The first line of code below sets the _population_ column and the second counts the number of rows where the value was set to __TRUE__. ```{r set-population-flag} breederPed <- setPopulation(ped = breederPed, ids = focalAnimals) nrow(breederPed[breederPed$population, ]) ``` The IDs used to populate the _population_ flag can be used to trim the pedigree so that it contains only those individuals who are in the ID list or are ancestors of those individuals. ```{r trim-pedigree} trimmedPed <- trimPedigree(focalAnimals, breederPed) nrow(breederPed) nrow(trimmedPed) ``` The __trimPedigree__ function has the ability to remove those ancestors that do not contribute genetic information. Uninformative founders are those individuals who are parents of only one individual and who have no parental information. (_Currently genotypic information is ignored by __trimPedigree___). ```{r removed-uninformatiive-animals} trimmedPedInformative <- trimPedigree(focalAnimals, breederPed, removeUninformative = TRUE ) nrow(trimmedPedInformative) ``` We can find all of the animals that are in the trimmed pedigree but are not focal animals. ```{r get-animals-added-to-focal-animals} nonfocalInTrimmedPed <- trimmedPed$id[!trimmedPed$id %in% focalAnimals] length(nonfocalInTrimmedPed) ``` We can see which of these `r length(nonfocalInTrimmedPed)` are and are not parents. We will first make sure we have all of the parents by getting our list of parents from the entire pedigree. We then demonstrate that they are all in the trimmed pedigree. ```{r find-focal-parents} allFocalParents <- c( breederPed$sire[breederPed$id %in% focalAnimals], breederPed$dam[breederPed$id %in% focalAnimals] ) trimmedFocalParents <- c( trimmedPed$sire[trimmedPed$id %in% focalAnimals], trimmedPed$dam[trimmedPed$id %in% focalAnimals] ) all.equal(allFocalParents, trimmedFocalParents) # Are the IDs the same? ``` However, not all of the animals in the trimmed pedigree are either the focal animals or their parents. They are more distant ancestors as we will show. ```{r find-non-focal-non-focal-parent} notFocalNotParent <- trimmedPed$id[!trimmedPed$id %in% c(focalAnimals, allFocalParents)] length(notFocalNotParent) ``` ```{r find-grandparents, echo = FALSE, include=FALSE, eval = TRUE} allFocalGrandParents <- c( breederPed$sire[breederPed$id %in% allFocalParents], breederPed$dam[breederPed$id %in% allFocalParents] ) ## Not all parents are known so the unknown individuals (NA) are removed. allFocalGrandParents <- allFocalGrandParents[!is.na(allFocalGrandParents)] trimmedFocalGrandParents <- c( trimmedPed$sire[trimmedPed$id %in% allFocalParents], trimmedPed$dam[trimmedPed$id %in% allFocalParents] ) trimmedFocalGrandParents <- trimmedFocalGrandParents[!is.na(trimmedFocalGrandParents)] all.equal(allFocalGrandParents, trimmedFocalGrandParents) ``` Since the trimming process is supposed to retain the focal animals and their ancestors, we will leave it as an exercise for you to demonstrate that at least some of the remaining animals are grandparents of the focal animals. _Hint: there are `r length(trimmedFocalGrandParents)` grandparents in both the trimmed and the complete pedigree_. ```{r added-animals, echo=FALSE, include=FALSE, eval=TRUE} # get-informative-and-uninformative-added-animals trimmedPedInformative <- trimPedigree(focalAnimals, breederPed, removeUninformative = TRUE ) uninformative <- trimmedPed$id[!trimmedPed$id %in% trimmedPedInformative$id] notFocalInTrimmedPed <- trimmedPed$id[!trimmedPed$id %in% focalAnimals] additionalAnimals <- nrow(trimmedPed) - length(focalAnimals) geneticallyInformative <- nrow(trimPedigree(focalAnimals, breederPed, removeUninformative = TRUE )) - length(focalAnimals) ``` As you can see from the number of rows in the full pedigree (`r nrow(breederPed)`) versus the trimmed pedigree (`r nrow(trimmedPed)`), trimmed pedigrees can be much smaller. Of the additional `r additionalAnimals` animals, `r geneticallyInformative` provide genetic information while the others (`r length(uninformative)`) are genetically uninformative. ```{r animals-no-birth-no-exit, echo = FALSE, eval = TRUE, include=FALSE} unknownBirth <- breederPed$id[is.na(breederPed$birth)] unknownBirthOrExit <- breederPed$id[is.na(breederPed$birth) | !is.na(breederPed$exit)] knownPed <- breederPed[!breederPed$id %in% unknownBirthOrExit, ] otherIds <- knownPed$id[!knownPed$id %in% trimmedPed$id[is.na(trimmedPed$exit)]] ``` As is shown below only `r length(otherIds)` (`r get_and_or_list(otherIds)`) living animals are still in the colony but are not in the trimmed pedigree.[^3] [^3]: All animals within the colony have a known birth date. ```{r show-living-animals-not-in-trimmed-pedigree, results = 'asis'} unknownBirth <- breederPed$id[is.na(breederPed$birth)] knownExit <- breederPed$id[!is.na(breederPed$exit)] unknownBirthKnownExit <- breederPed$id[is.na(breederPed$birth) | !is.na(breederPed$exit)] knownPed <- breederPed[!breederPed$id %in% unknownBirthKnownExit, ] otherIds <- knownPed$id[!knownPed$id %in% trimmedPed$id[is.na(trimmedPed$exit)]] print(stri_c( "The living animals in the pedigree that are not in the trimmed ", "pedigree are ", get_and_or_list(otherIds), "." )) ``` ## Age Sex Pyramid Plot You can examine the population structure using an age-sex pyramid plot with a single function. We will limit our view to just the focal animals and their living relatives. This is appropriate for colony management because in addition to the genetic diversity we seek, we have to remain cognizant of the age and sex distributions within the colonies we manage. ```{r plot-focal-age-sex-pyramid, include = TRUE} #| fig.alt: > #| Age Sex Pyramid Plot. getPyramidPlot(ped = trimmedPed[is.na(trimmedPed$exit), ]) ``` ## Genetic Value Analysis Your genetic value analysis must be carefully performed. The next three commands set up the entire pedigree for analysis. The first of these three commands set all of the pedigree members to be part of the population of interest by setting the _population_ column to __TRUE__ for all individuals. ```{r set-entire-pedigree-as-population} ped <- setPopulation(breederPed, NULL) ``` Note that a new pedigree object (__ped__) is being created. ```{r full-pedigree-genetic-value-summary} probands <- ped$id[ped$population] ped <- trimPedigree(probands, ped, removeUninformative = FALSE, addBackParents = FALSE ) ``` The arguments to __reportGV__ are all optional except for _ped_, but you may often want to non-default values. - __ped__ Pedigree information in data.frame format - __guIter__ Integer indicating the number of iterations for the gene-drop analysis. Default is 5000 iterations - __guThresh__ Integer indicating the threshold number of animals for defining a unique allele. Default considers an allele "unique" if it is found in only 1 animal. - __pop__ Character vector with animal IDs to consider as the population of interest. The default is NULL. - __byID__ Logical variable of length 1 that is passed through to eventually be used by alleleFreq(), which calculates the count of each allele in the provided vector. If byID is TRUE and ids are provided, the function will only count the unique alleles for an individual (homozygous alleles will be counted as 1). ```{r full-geneticValue} geneticValue <- reportGV(ped, guIter = 50L, guThresh = 3L, byID = TRUE, updateProgress = NULL ) summary(geneticValue) ``` What happens if we limit our analysis to the trimmed pedigree? Remember that the trimmed pedigree still contains all of the genetic information that the full pedigree has for the focal animals. ```{r trimmed-genetic-value-analysis} trimmedGeneticValue <- reportGV(trimmedPed, guIter = 50L, guThresh = 3L, byID = TRUE, updateProgress = NULL ) summary(trimmedGeneticValue) ``` It is clear that limiting your analysis to the animals of interest reduces the effort required to examine the animals of interest. ### Detailed look at the Genetic Value Report object The names of the object within the genetic value report object (_trimmedGeneticValue_) can be listed as shown in the next line of code. ```{r list-genetic-value-objects} names(trimmedGeneticValue) ``` The _report_ object (an R dataframe) can in-turn be examined. ```{r list-report-object-parts} names(trimmedGeneticValue$report) ## column names nrow(trimmedGeneticValue$report) ## Number of rows ``` The report is more conveniently used as a separate object. The next section of code rounds some of the numerical values and converts all columns to characters for display as a table where only the first 10 lines are included. ```{r look-at-genetic-value-report} rpt <- trimmedGeneticValue[["report"]] rpt$indivMeanKin <- round(rpt$indivMeanKin, 5L) rpt$zScores <- round(rpt$zScores, 2L) rpt$gu <- round(rpt$gu, 5L) rpt <- toCharacter(rpt) names(rpt) <- headerDisplayNames(names(rpt)) knitr::kable(rpt[1L:10L, ]) # needs more work for display purposes. ``` We start the next lines of code by getting a fresh copy of the genetic value report since we changed all of the numeric values to characters in the last section to print the table. These lines demonstrate one way of extracting the component objects from the _genetic value_ object. ```{r kinship-and-founders} rpt <- trimmedGeneticValue[["report"]] kmat <- trimmedGeneticValue[["kinship"]] f <- trimmedGeneticValue[["total"]] mf <- trimmedGeneticValue[["maleFounders"]] ff <- trimmedGeneticValue[["femaleFounders"]] nmf <- trimmedGeneticValue[["nMaleFounders"]] nff <- trimmedGeneticValue[["nFemaleFounders"]] fe <- trimmedGeneticValue[["fe"]] fg <- trimmedGeneticValue[["fg"]] ``` It is informative to examine the distribution of _genetic uniqueness_, _mean kinship_, and _z-scores_ (normalized _mean kinship_ values). Creation of the boxplot for the _genetic uniqueness_ values is shown below. ```{r genetic-uniqueness-boxplot} #| fig.alt: > #| Genetic Uniqueness Box Plot. gu <- rpt[, "gu"] guBox <- ggplot(data.frame(gu = gu), aes(x = "", y = gu)) + geom_boxplot( color = "darkblue", fill = "lightblue", notch = TRUE, #| fig.alt: > #| Histogram of time between eruptions for Old Faithful. #| It is a bimodal distribution with peaks at 50-55 and #| 80-90 minutes. outlier.color = "red", outlier.shape = 1L ) + theme_classic() + geom_jitter(width = 0.2) + coord_flip() + ylab("Score") + ggtitle("Genetic Uniqueness") print(guBox) ``` Extraction of the individual _mean kinship_ values and their corresponding z-scores is shown in the next code chunk. ```{r extraction-of-mk-zs} mk <- rpt[, "indivMeanKin"] zs <- rpt[, "zScores"] ``` Creation of boxplots for the _mean kinship_ and _z-scores_ is left as an exercise. ## Breeding Group Formation The primary purpose of __nprcgenekeepr__ is to form breeding groups according to our best information regarding maintaining the genetic characteristics we desire and the realities associated with other animal husbandry needs. There are several options you must consider when forming groups using __nprcgenekeepr__, which we will examine using code below. - Animals used to form groups - _high-value_: Randomly select from only high-value animals in genetic value analysis - _all_: Randomly select from all animals in genetic value analysis - _candidates_: Use candidate animals entered below to form groups - Sex ratio - Randomly assign animals without regard to sex - Use a harem structure with one breeding male per group - Specify the sex ratio between 0.5 and 10 (F/M) - Whether or not to pre-populate (seed) groups with animals of your choice - Number of groups to be formed - Whether or not to ignore females at or above the minimum parent age - Number of simulations used to search for the optimal group makeup - Whether or not kinship coefficients are to be included in results You decisions regarding each of the above options are expressed in a call to the function __groupAddAssign__. A complete description of the function and its arguments is available using the code shown below. ```{r groupAddAssign-help-request, eval = FALSE} ?groupAddAssign ``` Below is the descriptions of the function parameters extracted from the documentation near the time this tutorial was prepared. - __candidates__ Character vector of IDs of the animals available for use in forming the groups. The animals that may be present in currentGroups are not included within candidates. - __currentGroups__ List of character vectors of IDs of animals currently assigned to groups. Defaults to a list with character(0) in each sub-list element (one for each group being formed) assuming no groups are pre-populated. - __kmat__ Numeric matrix of pairwise kinship values. Rows and columns are named with animal IDs. - __ped__ Dataframe that is the 'Pedigree'. It contains pedigree information including the IDs listed in candidates. - __threshold__ Numeric value indicating the minimum kinship level to be considered in group formation. Pairwise kinship below this level will be ignored. The default values is 0.015625. - __ignore__ List of character vectors representing the sex combinations to be ignored. If provided, the vectors in the list specify if pairwise kinship should be ignored between certain sexes. Default is to ignore all pairwise kinship between females. - __minAge__ Integer value indicating the minimum age to consider in group formation. Pairwise kinships involving an animal of this age or younger will be ignored. Default is 1 year. - __iter__ Integer indicating the number of times to perform the random group formation process. Default value is 1000 iterations. - __numGp__ Integer value indicating the number of groups that should be formed from the list of IDs. Default is 1. - __updateProgress__ Function or NULL. If this function is defined, it will be called during each iteration to update a shiny::Progress object. - __harem__ Logical variable when set to TRUE, the formed groups have a single male at least minAge old. - __sexRatio__ Numeric value indicating the ratio of females to males x from 0.5 to 20 by increments of 0.5. - __withKin__ Logical variable when set to TRUE, the kinship matrix for the group is returned along with the group and score. Defaults to not return the kinship matrix. This maintains compatibility with earlier versions. We will use the _trimmedPed_ pedigree in our code. For illustration purposes we are going to keep the numbers of candidates, groups, and iterations fairly small. We will get first some animal IDs to use for our candidates by selecting animals at least 2 years old at the time this pedigree was sampled (01-01-2015). ```{r get-candidates} candidates <- trimmedPed$id[trimmedPed$birth < as.Date("2013-01-01") & !is.na(trimmedPed$birth) & is.na(trimmedPed$exit)] table(trimmedPed$sex[trimmedPed$id %in% candidates]) ``` Our candidates are made up of `r table(trimmedPed$sex[trimmedPed$id %in% candidates])[[1]]` females and `r table(trimmedPed$sex[trimmedPed$id %in% candidates])[[2]]` males. The parameters __currentGroups__, __threshold__, __ignore__, __minAge__, __sexRatio__, __withKin__, and __updateProgress__ are allowed to take their default values. The setting of the __sexRatio__ parameter to 0 is ignored in the following call of the __groupAddAssign__ function. This is consistent with the a value of 0 making little since in a breeding colony. The empty seventh group at the bottom is evidence that all of the candidate animals could be placed in a group without exceeding the default value of 0.015625. ### Harems The following group assignments will be forming harem groups. This is done by setting __harem__ to \code{TRUE}. Setting `iter` to 100 or more will increase optimal composition of breeding groups ```{r create-harem-groups} haremGrp <- groupAddAssign( candidates = candidates, kmat = trimmedGeneticValue[["kinship"]], ped = trimmedPed, iter = 10L, numGp = 6L, harem = TRUE ) haremGrp$group ``` We can identify and list the males in each group with the following code. ```{r list-males-in-harem-groups} sapply(haremGrp$group, function(ids) { ids[ids %in% trimmedPed$id[trimmedPed$sex == "M"]] }) ``` It is easy to notice that the male is listed first within each breeding group. We can also see the number of animals and the sex ratios created in each group. Since these are harem groups the sex ratios are determined by the number of animals in the individual groups. ```{r harem-count-and-sexratios} lines <- sapply(haremGrp$group, function(ids) { paste0( "Count: ", length(ids), " Sex Ratio: ", round(calculateSexRatio(ids, trimmedPed), 2) ) }) for (line in lines) print(line) ``` Examination of this table shows that of the `r table(trimmedPed$sex[trimmedPed$id %in% candidates])[[1]]` females `r (sum(sapply(haremGrp$group[-1], function(ids) length(ids))) - length(haremGrp$group[-1]))` are included. ### Controlling Sex Ratios The following group assignments will be forming harem groups. This is done by setting __harem__ to \code{TRUE}. ```{r create-sexratio-groups} sexRatioGrp <- groupAddAssign( candidates = candidates, kmat = trimmedGeneticValue[["kinship"]], ped = trimmedPed, iter = 10L, numGp = 6L, sexRatio = 9.0 ) sexRatioGrp$group ``` Again we can identify and list the males in each group with the following code. ```{r list-males-in-sexratio-groups} sapply(sexRatioGrp$group, function(ids) { ids[ids %in% trimmedPed$id[trimmedPed$sex == "M"]] }) ``` We can also see the number of animals and the sex ratios created in each group. ```{r list-count-and-sexratios} lines <- sapply(sexRatioGrp$group, function(ids) { paste0( "Count: ", length(ids), " Sex Ratio: ", round(calculateSexRatio(ids, trimmedPed), 2L) ) }) for (line in lines) print(line) ``` Examination of this table shows that of the `r table(trimmedPed$sex[trimmedPed$id %in% candidates])[[1]]` females `r (sum(sapply(sexRatioGrp$group[-1], function(ids) length(ids))) - length(sexRatioGrp$group[-1]))` are included. ## Pedigree Errors As stated earlier you can see which types of errors are detected by _qcStudbook_ by looking at names returned by _getEmptyErrorLst()_ as shown below. ```{r getEmptyErrorLst} names(getEmptyErrorLst()) ``` Each is defined below. ```{r make-errorList-definition-tbl, echo = FALSE, eval=TRUE} errorTypes <- names(getEmptyErrorLst()) errorDescriptions <- c( "Database connection failed: configuration or permissions are invalid", "Columns that must be within the pedigree file are missing.", "Values, which are supposed to be dates, cannot be interpreted as a date.", "Parents were too young on the date of birth of to have been the parent.", "Individuals listed as female or hermaphroditic and as a sire.", "Individuals are listed as male and as a dam.", "Individuals who are listed as both a sire and a dam.", "IDs listed more than once.", stri_c( "Columns that have been changed to conform to internal naming ", "conventions and what they were changed to." ) ) errorTbl <- data.frame( Error = errorTypes, Definition = errorDescriptions, stringsAsFactors = FALSE ) ``` ```{r print-error-definition-tbl, echo=FALSE} knitr::kable(errorTbl) ``` We are going to use the small imaginary pedigree listed below that has multiple errors to discuss pedigree error detection and reporting. First note the birth dates of ego_id _o4_ (2006-04-13) and the purported sire _s2_ (2006-06-19). Note also the purported birth date of the _d2_ and the birth dates of her offspring. Obviously dates or IDs may be incorrect. This is the pedigree. _(We will discuss the column names shortly.)_ ```{r list-pedOne} knitr::kable(nprcgenekeepr::pedOne) ``` If we try to convert this pedigree file into the standardized studbook format, we are going to get an error message and the creation of a file in the R sessions temporary directory that lists the records that have generated the errors. ```{r summary-pedOne-no-errors, error = TRUE} pedOne <- nprcgenekeepr::pedOne # put it in the local environment ped <- qcStudbook(pedOne, minParentAge = 0.0) ``` The contents of _lowParentAge.csv_ is shown below. ```{r read-lowParentAge.csv, echo = FALSE, include = FALSE} lowParentAge <- read.csv(paste0(tempdir(), "/lowParentAge.csv")) ``` ```{r print-lowParentAge.csv, echo = FALSE} knitr::kable(lowParentAge) ``` Examination of the ages of the parents reveals the issues being reported. We can remove the errors by changing the birth dates of _o4_ from 2006-04-13 to 2015-09-16 and of _d2_ from 2015-09-16 to 2006-04-13. ```{r correct-birthdate, error = TRUE} pedOne$birth_date[pedOne$ego_id == "o4"] <- as.Date("2015-09-16") pedOne$birth_date[pedOne$ego_id == "d2"] <- as.Date("2006-04-13") ``` Note the changes made to the column names between the original __pedOne__ pedigree and the pedigree (__ped__) we get from __qcStudbook__. We have chosen to limit the displayed pedigree by selecting the _ego_id_'s and _id_'s in __pedOne__ and __ped__ respectively. ```{r display-corrected-birth-records} ped <- qcStudbook(pedOne, minParentAge = 0.0) ped[ped$id %in% c("s2", "d2", "o3", "o4"), ] ``` However, the preferred method of creating the standardized studbook format with __qcStudbook__ is to examine all errors found and correcting them before proceeding. This is done by explicitly requesting that all aspects inconsistent with the studbook format be identified by setting _reportChanges_ and _reportErrors_ to \code{TRUE}. ```{r look-for-errors} errorList <- qcStudbook(pedOne, minParentAge = 0.0, reportChanges = TRUE, reportErrors = TRUE ) summary(errorList) ``` We will discuss each of these newly identified errors in a moment, however, let's look at shortening this report, because often you will not be interested in the more trivial changes to the column names made by __qcStudbook__ and in those cases you simply choose not to report changes to the column names as is shown here by setting _reportChanges_ to \code{FALSE}. For this illustration, we are going to bring back the original copy of __pedOne__ to see how the suspicious parents are reported by the __summary__ function. ```{r look-for-errors-only} pedOne <- nprcgenekeepr::pedOne errorList <- qcStudbook(pedOne, minParentAge = 0L, reportChanges = FALSE, reportErrors = TRUE ) options(width = 90L) summary(errorList) ``` The first two errors mentioned are of particular interest. Currently __qcStudbook__ automatically changes the sex of dams to _F_ (female) and sires to _M_ (male) when __reportErrors__ is set to \code{FALSE}. ## Genetic Loops This feature is not supported within the Shiny application and is not fully implemented. To use the __findLoops__ function run the following code and select a pedigree as your input file that has a loop in it. We are continuing to use the example pedigree that comes with the software _Example_Pedigree.csv_. ```{r look-at-loops} exampleTree <- createPedTree(breederPed) exampleLoops <- findLoops(exampleTree) ``` You can count how many loops you have with the following code. ```{r countLoops} length(exampleLoops) nLoops <- countLoops(exampleLoops, exampleTree) sum(unlist(nLoops[nLoops > 0L])) ``` You can list the first 10 sets of ids, sires and dams in loops with the following line of code: ```{r listLoops} examplePedigree[unlist(exampleLoops), c("id", "sire", "dam")][1L:10L, ] ``` ```{r elapsed-time} elapsed_time <- get_elapsed_time_str(start_time) ``` The current date and time is `r Sys.time()`. The processing time for this document was `r elapsed_time`. ```{r session-info} sessionInfo() ```