Training a Random Forest Classifier for Population Structure Identification (3 months ago)
Download reference data | Set-up | Match study genotypes and reference data | Filter reference and study data for non A-T or G-C SNPs | Renaming variant identifiers | Filtering out shared SNPs between study and reference dataset | Conducting markerQC, pruning LD, and individual QC | PCA | Training a random forest classifier in R | Predicting ancestries of new study data | Evaluating and Tuning of Classification Model | Parameter Tuning via Grid Search | Evaluating/Interpreting the RF | References
Genotype quality control with plinkQC (3 months ago)
Introduction | Per-individual quality control | Per-marker quality control | Clean data | Workflow | Create QC-ed dataset | Step-by-step | Individuals with discordant sex information | Individuals with outlying missing genotype and/or heterozygosity rates | Related individuals | Ancestry Predictions of Data | Markers with excessive missingness rate | Markers with deviation from HWE | Markers with low minor allele frequency | References
Processing 1000 Genomes reference data for ancestry estimation (5 months ago)
Introduction | Workflow | Set-up | PLINK software | Download and decompress 1000 Genomes phase 3 data | Convert 1000 Genomes phase 3 data to plink 1 binary format | References
Processing HapMap III reference data for ancestry estimation (5 months ago)
Introduction | Workflow | Set-up | Download and convert HapMap phase III data | Update annotation | Update the reference data | References
my-vignette (8 months ago)