Genotyping Support—Getting Started
Find valuable information.
Optimize your experiments to get the best results. We’ve compiled a detailed knowledge base of the top tips and tricks to meet your research needs.
View the relevant questions below:
Axiom™ arrays formerly required a total of 200 ng of gDNA per sample, with the exception of the Axiom™ Genome-Wide Pan-African Array Set, which required a total of 300 ng of gDNA per sample (100 ng per array in the set). As of 2016, there are new guidelines for sample input. All human Axiom™ arrays (except the Axiom™ Genome-Wide Pan-African Array Set) require a total of 100 ng. The Axiom™ Genome-Wide Pan-African Array Set still requires a total of 300 ng, or 100 ng per array. Diploid plants and animals require 150 ng per array, and polyploid plants and animals require 200 ng per array. For Axiom™ Microbiome Arrays, a total of 50 ng of gDNA, or 17.5 μL of cDNA reaction plus 2.5 μL of reduced TE buffer, is required as starting material per array. Please refer to the Axiom 2.0 gDNA sample preparation QRC for more details. Starting DNA must be double-stranded for the purpose of accurate concentration determination. gDNA must be of high purity. DNA should be free of DNA polymerase inhibitors.
Examples of inhibitors include high concentrations of heme (from blood) and high concentrations of chelating agents (e.g., EDTA). The gDNA extraction/purification method should render DNA that is generally salt-free because high concentrations of particular salts can also inhibit enzyme reactions. DNA purity is indicated by the OD260/OD280 and OD260/OD230 ratios. The OD260/OD280 ratio should be between 1.8 and 2.0, and the OD260/OD230 ratio should be greater than 1.5. We recommend that DNA samples that do not meet these criteria be cleaned up as described under Genomic DNA Cleanup in the Axiom user guide. DNA must not be degraded. The approximate average size of gDNA may be assessed on a 1% agarose gel using an appropriate size standard control. Approximately 90% of the DNA must be greater than 10 kb in size. Control DNA can be run on the same gel for side-by-side comparison.
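The purity and integrity criteria above can be checked programmatically before samples are plated. The following is a minimal sketch, not a vendor tool: the `SampleQC` record and the `qc_failures` function are hypothetical names, and the fraction of DNA above 10 kb is assumed to have been estimated separately from the gel image.

```python
from dataclasses import dataclass
from typing import List

# Thresholds quoted in the text above.
OD_260_280_RANGE = (1.8, 2.0)   # acceptable OD260/OD280 range
OD_260_230_MIN = 1.5            # OD260/OD230 must be greater than this
MIN_FRACTION_OVER_10KB = 0.90   # roughly 90% of the DNA should exceed 10 kb


@dataclass
class SampleQC:
    """Hypothetical record of spectrophotometer and gel readings for one sample."""
    name: str
    od_260_280: float
    od_260_230: float
    fraction_over_10kb: float


def qc_failures(sample: SampleQC) -> List[str]:
    """Return a list of human-readable reasons the sample fails QC (empty if it passes)."""
    failures = []
    low, high = OD_260_280_RANGE
    if not (low <= sample.od_260_280 <= high):
        failures.append(f"OD260/OD280 = {sample.od_260_280} is outside {low}-{high}")
    if sample.od_260_230 <= OD_260_230_MIN:
        failures.append(f"OD260/OD230 = {sample.od_260_230} is not greater than {OD_260_230_MIN}")
    if sample.fraction_over_10kb < MIN_FRACTION_OVER_10KB:
        failures.append("less than about 90% of the DNA is larger than 10 kb")
    return failures
```

A sample that returns a non-empty list would be a candidate for the cleanup procedure described in the user guide.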
50ng of input mass is optimal.
The species included on the array represent all known microbial sequences available in the NCBI database as of October 2014 for which probes could be designed. Any sequences deposited after October 2014 are not represented on the array.
Human fecal samples or cDNA samples made from purified RNA have been validated.
DNA was extracted from fecal samples using the PowerSoil™ DNA Isolation Kit from MoBio (Cat. No. 12888-50 or 12888-100).
12,513 species can be detected on the Axiom™ Microbiome Array. These species come from five microbial domains: archaea, bacteria, fungi, protozoa, and viruses. For a complete list of species, please contact the Technical Support Team.
81 species are not detectable on Axiom™ Microbiome Array. For a complete list of species not detected, please contact the Technical Support Team.
Approximately 648,000 SNPs are tiled on the array.
We obtained information for more than 46 million SNPs from sequencing efforts coordinated by the Affymetrix™ Bovine Consortium (basic and applied researchers in the bovine community). From these, we selected SNPs to use based on physical coverage of the bovine genome and the number of breeds in which the SNPs were observed.
We then screened many millions of SNPs against approximately 400 samples from the Affymetrix™ Bovine Consortium and bovine HapMap samples (Texas A&M University). The result is a database of ~3 million validated SNPs, which means each has been demonstrated as a real, truly polymorphic SNP and has been proven to work in the assay. This implies that customers should be able to select SNPs from our database and create their own custom array, and that the SNPs chosen should perform at a very high level in the assay.
Finally, we selected a subset of more than 648,000 SNPs primarily based on genetic coverage for a selection of breeds and secondarily based on physical coverage of the bovine genome.
SNPs were selected to represent polymorphisms from a comprehensive set of commercially important breeds of dairy and beef cattle from both Bos indicus and Bos taurus. Our primary goal was to obtain high levels of genetic coverage. SNP pairwise linkage disequilibrium (LD) values (r²) were calculated to identify SNPs in strong LD. Then we selected the smallest number of SNPs needed to cover the known genetic variation for each breed we prioritized (first five breeds in Table 1). After SNPs were selected for optimal genetic coverage, the distances between the selected SNPs were calculated, and inter-SNP gaps were filled (starting with the largest) by selecting polymorphic SNPs from the five major breeds: Holstein, Angus, Nelore, Jersey, and Fleckvieh.
Genetic coverage was calculated based on SNP pairwise LD (r²) results obtained from genotype data of reasonably unrelated samples within each breed of interest. The default r² threshold for genetic coverage was 0.8. Unless otherwise specified, all SNPs with r² results for a given breed are included in the target coverage set. This set includes all SNPs that are converted in the screen and have polymorphic genotype data for the breed of interest. For breeds for which bovine HapMap genotype data were available for enough of the samples used in the screen, HapMap SNPs with polymorphic genotypes were also included in the target coverage set.
LD calculations were carried out for each breed separately.
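The idea of covering known variation with as few SNPs as possible at an r² threshold of 0.8 can be illustrated with a small sketch. This is not the selection procedure actually used for the array. It assumes genotypes coded as 0, 1, or 2 copies of one allele, approximates r² by the squared correlation of those genotype codes, and uses a simple greedy choice; the function names are hypothetical. Following the text, it would be run on one breed at a time.

```python
from typing import Dict, List


def r_squared(x: List[int], y: List[int]) -> float:
    """Squared Pearson correlation of two genotype vectors (coded 0/1/2)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a monomorphic SNP carries no LD information
    return cov * cov / (var_x * var_y)


def greedy_tag_snps(genotypes: Dict[str, List[int]], threshold: float = 0.8) -> List[str]:
    """Greedily pick SNPs until every SNP is covered by a tag at r² >= threshold."""
    names = list(genotypes)
    # covers[s] holds every SNP that s can stand in for, including itself.
    covers = {
        s: {t for t in names
            if s == t or r_squared(genotypes[s], genotypes[t]) >= threshold}
        for s in names
    }
    uncovered = set(names)
    tags = []
    while uncovered:
        # Choose the SNP that covers the most still-uncovered SNPs.
        best = max(names, key=lambda s: len(covers[s] & uncovered))
        tags.append(best)
        uncovered -= covers[best]
    return tags
```

Greedy selection does not guarantee the true minimum set, but it shows how a higher threshold leads to more tag SNPs and a lower threshold to fewer.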
There are no copy number probes on the array. However, measuring copy number variation (CNV) in the bovine genome can be performed using a custom whole-genome sampling assay design (a cartridge-based SNP Array 6.0 genotyping assay).
The Axiom Genome-Wide BOS 1 Array Plate is designed for use with the Axiom™ 2.0 Reagent Kit. Assay kits are available for 96 (one plate), 192 (two plates), and 768 (eight plates) reactions.
The assay requires 200 ng of high-quality, double-stranded genomic DNA that is not highly degraded. Genomic DNA must be of high purity and free of DNA polymerase inhibitors such as high salt, hemes, and chelating agents. For details about general assay requirements for genomic DNA, please refer to Chapter 2 of the Axiom™ 2.0 Genotyping Assay User Guide.
A variety of samples, including those originating from blood, semen, nasal swabs, hair bulbs, and ear punch tissue, were tested using the Axiom™ Genome-Wide BOS 1 Array.
Genotyping Console™ (GTC) Software version 4.1 is used to analyze data from the Axiom™ Genome-Wide BOS 1 Array Plate. A guide (vignette) is available that describes how to perform SNP filtering for quality control and normalization. Please look for the Axiom™ Genome-Wide BOS 1 APT Vignettes.
Supporting files are available to assist with SNP genotyping and can be found on the website. The supporting files include:
- Sample genotyping script for APT
- CDF library file
- GeneChip™ Command Console™ Software (AGCC) library files
- Sample CEL (data) files and CEL file list
- Genotype and quality control (QC) sample output files
- Genotype and QC support files
- NetAffx™ Analysis Center annotation and alignment files
Dish QC (DQC) is the recommended QC metric for Axiom™ Genome-Wide Arrays in Genotyping Console™ Software. The default threshold is greater than or equal to 0.82 for each sample. It operates by measuring signal at a collection of sites in the genome that are known not to vary from one individual to the next. Because the metric monitors non-polymorphic locations, at each position it is known which of the two channels in the assay should contain signal and which should be just background. DQC is a measure of the extent to which the distribution of signal values is separated from background values, with 0 indicating no separation and 1 indicating perfect separation.
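To make the idea of a 0-to-1 separation measure concrete, here is an illustrative sketch. It is not the DQC formula used by the software, which is not given here. As an assumption, it scores separation as 2 × AUC − 1, where AUC is the fraction of signal/background pairs in which the signal value is higher, and then applies the 0.82 threshold to per-sample scores. Both function names are hypothetical.

```python
from typing import Dict, List


def separation_score(signal: List[float], background: List[float]) -> float:
    """Illustrative separation measure: 0 = no separation, 1 = perfect separation."""
    wins = 0.0
    for s in signal:
        for b in background:
            if s > b:
                wins += 1.0
            elif s == b:
                wins += 0.5  # ties count as half
    auc = wins / (len(signal) * len(background))
    return max(0.0, 2.0 * auc - 1.0)


def passing_samples(dqc_by_sample: Dict[str, float], threshold: float = 0.82) -> List[str]:
    """Keep samples whose DQC value is greater than or equal to the threshold."""
    return [name for name, dqc in dqc_by_sample.items() if dqc >= threshold]
```

In practice the DQC values would be read from the software's QC output rather than computed this way; only the thresholding step mirrors the documented default.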
The average sample file size is 28 MB.
A number of service providers have experience running the Axiom™ Assay and the genotyping call algorithm that is used with the Axiom™ Genome-Wide BOS 1 Array Plate.
Human Mitochondrial Resequencing Array 2.0
|Human Mitochondrial Resequencing Array 2.0|
|Fluidics Protocol (FS450; User Prepared Solutions)||Mini_DNAARRAY_WS5_450|
|Feature Size||8 µm|
|Probe Pairs per SNP||Eight unique 25-mer probes per base position (4 oligonucleotide probes per strand)|
|Hybridization Volume||80 µL|
|SARS Resequencing Array|
|Fluidics Protocol (FS450; User Prepared Solutions)||DNAARRAY_WS4_450|
|Feature Size||25x20 µm|
|Probe Pairs per SNP||Eight unique 25-mer probes per base position|
|Hybridization Volume||200 µL|
For array hybridization, washing, staining and scanning, the following instrumentation is required:
- GeneChip™ Scanner 3000 7G
- Fluidics Station 450
- Hybridization Oven 640 or Hybridization Oven 645
Data acquisition and analysis of Resequencing Arrays requires the following software:
- Affymetrix™ GeneChip™ Command Console™ Software (AGCC)
- GeneChip™ Sequence Analysis Software (GSEQ)
Affymetrix™ Mouse Diversity Genotyping Array
SNPs were selected to represent polymorphisms from a comprehensive set of common inbred laboratory and wild-derived mice. For additional information, please see “A customized and versatile high-density genotyping array for the mouse” (Yang H., et al., Nature Methods, 2009).
To determine how a specific gene is covered by the Mouse Diversity Genotyping Array, please visit the NetAffx™ Analysis Center. Query the gene of interest in the genotyping search field and select the Mouse Diversity Array to return the list of SNPs and their annotations to the gene.
Yes. The array contains more than 916,000 non-polymorphic copy number probe sets for the detection of copy number variation—the largest number of copy number probes for mice on the market. These probe sets are targeted to functional elements and regions known to harbor segmental duplications. However, because so little information on copy number variation in mice exists, the copy number application is for discovery use only. Copy number analysis is not supported because there are no known informatics tools for this purpose. You can find annotation comma-separated values (CSV) files for the copy number probes by visiting the NetAffx Analysis Center and selecting the Mouse Diversity Array.
Please visit The Jackson Laboratory, which has annotation for all the probes. These annotation files will be updated regularly. You can also find the annotation files for SNP and copy number probes by visiting the NetAffx Analysis Center and choosing “Mouse Diversity Array” from the drop-down menu. Sample data, probe set data, library files, alignment, annotation, and sequence files can be found here.
Content found on the Mouse Diversity Genotyping Array is so comprehensive that it will not fit on the format required for array plates.
The Mouse Diversity Genotyping Array is designed for use with the Affymetrix™ SNP 6.0 Core Reagent Kit. This kit is available in 100 reactions and more information for ordering can be found on our website.
Total genomic DNA (500 ng) is digested with Nsp I and Sty I restriction enzymes and ligated to adaptors that recognize the cohesive 4 base pair (bp) overhangs. All fragments resulting from restriction enzyme digestion, regardless of size, are substrates for adaptor ligation. A generic primer that recognizes the adaptor sequence is used to amplify adaptor-ligated DNA fragments. PCR conditions have been optimized to preferentially amplify fragments in the 200 to 1,100 bp size range. PCR amplification products for each restriction enzyme digest are combined and purified using polystyrene beads. The amplified DNA is then fragmented, labeled, and hybridized to a Mouse Diversity Genotyping Array.
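The digestion and size-selection logic can be explored in silico. The sketch below is an approximation under stated assumptions: it uses the commonly cited recognition sites for Nsp I (RCATG^Y) and Sty I (C^CWWGG), considers only the top strand, ignores adaptor length, and treats the 200 to 1,100 bp window as a hard cutoff even though PCR only amplifies that range preferentially. The function names are hypothetical.

```python
import re
from typing import List

# Recognition site (as a lookahead, so overlapping sites are all found) and
# top-strand cut offset from the start of the site.
ENZYMES = {
    "NspI": (re.compile(r"(?=[AG]CATG[CT])"), 5),  # RCATG^Y
    "StyI": (re.compile(r"(?=CC[AT][AT]GG)"), 1),  # C^CWWGG
}


def digest(sequence: str, enzyme: str) -> List[str]:
    """Cut a linear sequence at every site for one enzyme and return the fragments."""
    pattern, offset = ENZYMES[enzyme]
    cuts = [m.start() + offset for m in pattern.finditer(sequence.upper())]
    bounds = [0] + cuts + [len(sequence)]
    return [sequence[a:b] for a, b in zip(bounds, bounds[1:])]


def amplifiable(fragments: List[str], min_len: int = 200, max_len: int = 1100) -> List[str]:
    """Keep fragments in the size range that the PCR step favors."""
    return [f for f in fragments if min_len <= len(f) <= max_len]
```

Because the two digests are run separately and their PCR products are combined afterwards, `digest` would be called once per enzyme on the same input sequence.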
Although the array contains more than 916,000 non-polymorphic copy number probe sets for the detection of copy number variation, copy number analysis is not supported because there are no known informatics tools for this purpose. There are some third-party software options that can be utilized. For more information, please contact technical support directly.
The Mouse Diversity Genotyping Array contains:
- No. of SNP probes = 623,000
- Average distance between SNPs = 4,300 bp
- No. of CNV probes = 916,000
Additional information of the content and design of the array can be found in the publication, "A customized and versatile high-density genotyping array for the mouse" (Yang H., et al., Nature Methods, 2009).
The assay requires 500 ng of high-quality, double-stranded genomic DNA that is not highly degraded. Genomic DNA must be free of PCR and other enzymatic inhibitors such as high salt, heme, and EDTA. For details about general assay requirements for genomic DNA, please refer to Chapter 3, page 19 of Affymetrix™ Genome-Wide Human SNP Nsp/Sty 6.0 User Guide (Cat. No. 901182, 901153, 901150). We have tested a variety of extraction and purification methods as well as cleanup procedures that are also listed in this user guide.
Scanning an individual Mouse Diversity Genotyping Array takes about 35 minutes.
Use the GenomeWideSNP6_450 protocol with the GeneChip™ Fluidics Station 450.
The protocol for processing the SNP Affymetrix™ Genome-Wide Human SNP 5.0/6.0 Assay is manual. Some customers have automated assay processing using their own liquid-handling equipment. At this time, we do not recommend a particular automated protocol.
Genotype calls are made using Affymetrix™ Genotyping Console™ Software (GTC). However, there are no QC methods, no signature SNPs, and no copy number data analysis functionality, and the output is not compatible with PLINK. A modified version of the BRLMM-P algorithm is used for the analysis. Birdseed is not available for this array.
Detailed instructions for downloading and using GTC, along with sample CEL files, can be found on the website. We strongly recommend against clustering data that have been run in different facilities, as this will introduce systematic errors.
The output of GTC is a text file that is compatible with multiple third-party software tools. We work closely with a number of GeneChip™ array-compatible software providers that offer genome-wide linkage or association analysis solutions for Affymetrix™ genotyping arrays. GeneChip™ array-compatible software providers that have been shown to support the genotype format for the Mouse Diversity Genotyping Array include:
- Golden Helix SNP & Variation Suite (SVS) – Golden Helix has tested data from the Mouse Diversity Genotyping Array and demonstrated compatibility. SVS can import SNP calls from the output text files through an automated import wizard. Once imported, data can be easily manipulated, augmented, and prepared with a full complement of QC tools, and then analyzed with powerful association methods, in-depth statistical analyses, and robust visualization tools. SVS can also import intensity data directly from the array’s CEL files, detecting changes in copy number and enabling CNV association studies. For more information, please visit www.goldenhelix.com or email firstname.lastname@example.org.
- JMP™ Genomics from SAS – JMP Genomics can import text genotypes and annotation for the new array using the import individual text files process. Imported genotype text files should be transposed into standard JMP Genomics SNP format for downstream analysis (individuals in rows, SNPs in columns) using the transpose rectangular process. For more information, please visit www.jmp.com/software/genomics/ or email email@example.com.
- Partek™ Genomics Suite™ – Partek has tested sample data from the Mouse Diversity Genotyping Array and can import SNP calls from the output text files. Partek Genomics Suite supports single-marker association workflows as well as inheritance tests. For more information, please visit www.partek.com or email firstname.lastname@example.org.
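Several of these tools expect individuals in rows and SNPs in columns, whereas genotype text output is often laid out with one SNP per row. The sketch below shows the transposition on an in-memory table. The layout (a header row with a probe set ID column followed by one column per sample) and the function name are assumptions for illustration, not a description of the GTC export format.

```python
from typing import List


def transpose_genotypes(rows: List[List[str]]) -> List[List[str]]:
    """Swap rows and columns of a rectangular genotype table."""
    widths = {len(row) for row in rows}
    if len(widths) > 1:
        raise ValueError("table is not rectangular")
    return [list(column) for column in zip(*rows)]


# Hypothetical SNP-per-row table: header row, then one row per SNP.
snp_rows = [
    ["probeset_id", "sample1", "sample2"],
    ["SNP_1", "AA", "AB"],
    ["SNP_2", "BB", "AB"],
]
sample_rows = transpose_genotypes(snp_rows)
```

After transposition, the first row lists the SNP identifiers and each following row holds one sample's calls.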
Library files contain information about probe array design layout, probe use and content, scanning and analysis parameters, and other characteristics. These files are unique for each probe array type. Library files for the Mouse Diversity Genotyping Array are called MouseDIV and are located on the web page.
We have developed an instructional vignette describing a method to identify samples that may negatively affect the overall performance of the experiment and should be removed before genotyping. A separate instructional vignette details the process to genotype the array, including the necessary normalization steps.
The average sample file size is 66 MB.
Affymetrix™ GTC software does not perform copy number analysis. However, the developers of this array (The Jackson Laboratory) have developed an R package that can be used for CNV analysis.
A number of service providers have experience running the SNP 5.0/6.0 assay and the genotyping call algorithm that is used with the Mouse Diversity Genotyping Array. The list includes The Jackson Laboratory, the microarray and computational analysis group that participated in the design and validation of the Mouse Diversity Genotyping Array. The Jackson Laboratory has experience assaying more than 1,000 samples, has access to the most up-to-date methods, and has achieved excellent genotyping call accuracy. Their service spans basic delivery of raw data and genotype calls to custom bioinformatics analysis. For more information, please visit their website or email email@example.com.
Customers must purchase a minimum of 60 arrays to begin a study, because this number of samples is required for BRLMM-P to cluster the data properly. For subsequent orders, the minimum order is approximately 30 arrays.
Genome-Wide Human SNP Array 6.0
Typically, customers can run 48 samples per week (5 days) with three fluidics stations, one scanner, and 1.5 technicians. Using the 96-sample protocol, customers can run 96 samples in 6 days with the same instrumentation, or 96 samples in 5 days with six fluidics stations and two scanners. A group that staggers the protocol during the week or has more instrumentation can achieve higher throughput, but this is the standard.
Yes. The combination of the SNP Array 6.0 and Genotyping Console 2.1 provides a great set of tools for researchers who would like to study copy number changes in cancer as well as copy number variation in association studies.
The SNP Array 6.0 provides:
- The highest resolution across the genome to detect and define chromosomal aberrations.
- The average median SNP + CNV inter-marker distance is 680 base pairs. It also has the highest coverage of known copy number variants (90.5% of 3400 known regions). This gives researchers more power and confidence to detect chromosomal aberrations and makes it easier to define boundaries and breakpoints.
- Provides allele-specific copy number. This allows customers to perform LOH and allele-specific analyses. The clear advantage of including this information is the ability to differentiate between mechanisms that cause the underlying biological effect. For example, a copy-neutral event is only detectable with this additional information: it shows no change in total copy number, but LOH is present.
- A platform that supports a portfolio of products for genomic research—copy number genotype, gene expression and splice variant analysis on a single industry standard microarray platform.
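The point about copy-neutral events can be made concrete with a small sketch. It assumes a diploid autosomal region and that allele-specific copy numbers for the A and B alleles have already been estimated for a segment. The function name and the labels are hypothetical and are not part of Genotyping Console.

```python
def classify_segment(copies_a: int, copies_b: int, normal_total: int = 2) -> str:
    """Label a segment from its allele-specific copy numbers."""
    total = copies_a + copies_b
    # LOH: DNA is present, but one of the two alleles is absent.
    loh = total > 0 and min(copies_a, copies_b) == 0
    if total == normal_total:
        return "copy-neutral LOH" if loh else "normal"
    if total < normal_total:
        return "loss with LOH" if loh else "loss"
    return "gain with LOH" if loh else "gain"
```

A segment with two copies of allele A and none of allele B has the same total copy number as a normal segment, so only the allele-specific view separates the two cases.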
49 format, 5 µm Feature Size
All SNPs are tiled with PM-only probes, with 3-4 replicate probe pairs per SNP.
- Contains 906,600 SNPs
- All screened in 500 distinct samples (270 HapMap plus diversity panels)
- Unbiased selection of 494,000 SNPs from 5.0 and 500k tiled on the 6.0
- 6k SNPs not tiled due to lack of cluster classification or multiple hits to the genome. More information available below
- 482,000 SNPs; historical SNPs from 500k and 5.0 out of the 494,000 SNPs can be analyzed with the default library file and the SNP 6.0 genotyping algorithm (Birdseed)
- Selection of additional 424,000 SNPs
- Tag SNPs
- SNPs from chromosomes X
- Y Chromosome SNPs (257 in default, 900 in full)
- Y Chromosome CN Probes (8,583)
- Mitochondrial SNPs (119 in default, 465 in full)
- 100K New SNPs added to the HapMap database
- SNPs in recombination hotspots
- Contains 946,000 Non-Polymorphic Probes
|Average Minor Allele Frequency (MAF)||19.6% in HapMap Caucasians; 18.2% in HapMap Asians; 20.6% in HapMap Africans|
|Average Heterozygosity||26.7% in HapMap Caucasians; 24.6% in HapMap Asians; 28.5% in HapMap Africans|
The SNP 6.0 Assay requires 500 ng genomic DNA.
GeneChip™ Scanner 3000 7G System
Fluidics Station 450 (Please note that the FS400 is not supported with SNP 6.0 arrays.)
Genotyping Console 2.1
The chp file is 66 MB
The chp file is 30 MB
- 6.0 contains over 906k SNPs
- 6.0 Contains 946,000 non-polymorphic probes for CN variation
- CN/SNP combined intermarker genome coverage of 680 bp
The lower the P-value, the better. In a whole-genome study, the general P-value significance cutoff for genes associated with a disease is 10⁻⁴ or less.
500 ng of total genomic DNA.
The same assay is used (and samples run on the 5.0 can be re-hybed on the 6.0).
DNA is digested with Nsp I and Sty I restriction enzymes and ligated to adaptors that recognize the cohesive four base pair (bp) overhangs. All fragments resulting from restriction enzyme digestion, regardless of size, are substrates for adaptor ligation. A generic primer that recognizes the adaptor sequence is used to amplify adaptor-ligated DNA fragments. PCR conditions have been optimized to preferentially amplify fragments in the 200 to 1,100 bp size range. PCR amplification products for each restriction enzyme digest are combined and purified using activated beads. The amplified DNA is then fragmented, labeled, and hybridized to a Genome-Wide Human SNP 6.0 Array.
The recommended number of samples that should be clustered and analyzed using the Birdseed v1 algorithm is a minimum of 44. A minimum of 15 female samples should be included for robust gender determination results.
Scanning an individual SNP 6.0 array takes about 35 minutes.
There are a couple of ways to determine how a specific gene of interest is covered by the SNP 6.0 array. The first is to use the NetAffx™ Analysis Center and query the gene of interest in the genotyping search field. This will return the list of SNPs and their annotations to the gene. Within one of these records is a link to the UCSC Genome Browser, which provides a visual display of the SNPs on your gene of interest.
Another way to determine how a specific gene of interest is covered by the SNP 6.0 array would be to go to the UCSC genome browser directly and search for the gene of interest. Please be sure that the SNP field is visible and not hidden under the option.
For the Genome-Wide Human SNP Array 6.0, the fluidics script for the FS450 is GenomeWideSNP6_450, and the library files are named GenomeWideSNP_6.
906,600 SNPs on the SNP 6.0 array can be accessed utilizing Genotyping Console 2.1. In addition, the SNP 6.0 array contains 945,826 non-polymorphic “potential” copy number variation probes.
The SNP 6.0 HapMap CEL files have been archived by NCBI. The files are available for download through the NCBI ftp server. To get the full set of CEL files, all the files with the extension .tgz should be downloaded. Any .tgz files will need to be extracted (or unzipped) twice. There is freeware called 7-Zip File Manager that can be used to extract the files. For additional information, please contact technical support.
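With 7-Zip, a .tgz file is typically unpacked in two passes: first the gzip layer, then the tar layer. Assuming that is what extracting "twice" refers to, Python's standard `tarfile` module handles both layers in a single call. The sketch below unpacks every .tgz file in a download folder. The directory names are placeholders and the function name is hypothetical. Only extract archives from a source you trust.

```python
import tarfile
from pathlib import Path
from typing import List


def extract_all_tgz(download_dir: str, out_dir: str) -> List[str]:
    """Extract every .tgz archive in download_dir into out_dir.

    Returns the sorted names of all files found under out_dir afterwards.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    for archive in sorted(Path(download_dir).glob("*.tgz")):
        # "r:gz" removes the gzip layer and reads the tar layer in one step.
        with tarfile.open(archive, "r:gz") as tar:
            tar.extractall(out)
    return sorted(p.name for p in out.rglob("*") if p.is_file())
```

For example, `extract_all_tgz("downloads", "cel_files")` would leave the CEL files under `cel_files`, ready to be listed in a CEL file list.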
- Each SNP is represented by a pair of 2 Perfect Match (PM) probe sequences, one for Allele A and one for Allele B, which are placed in adjacent positions on the array.
- Each pair of SNP probes is replicated at least 3 times on the array. The replicates are distributed across the array such that each replicate is located in a different quadrant.
- Copy-number probes are Perfect Match (PM) non-polymorphic sequences located in the stripes between quadrants.
DMET Plus Array
Different probe lengths can help differentiate alleles when the GC content varies. Because a single condition is used for hybridization and wash, it is difficult to optimize across a wide range of GC content. For example, if a SNP is in a sequence with high GC content, a shorter probe (23-mer or 21-mer) might improve allelic discrimination. If the SNP is in an AT-rich sequence and has an overall lower Tm, then a slightly longer probe could improve discrimination.
We used multiple populations (HapMap and Extended HapMap) when building the large internal reference set. The actual SNPs on the array come largely from the literature, and we are unaware of any population bias in this collection. Much of the content came from the PharmaADME website. This consortium assembled a list of markers in genes that were prioritized into different classes. The most important of these is referred to as Core markers, which means that they have been validated in the scientific literature as having clinical relevance in drug responses. Markers were also reviewed with several pharmaceutical companies for input on important ADME markers.
There are no stopping points designed into the assay.
Blood is the only validated sample type on the assay. If you are interested in running other sample types (saliva, buccal) please contact technical support.
The mPCR step allows for genes that have pseudogenes or other regions of high homology to be amplified for accurate genotyping. The mPCR is designed to preferentially amplify the real gene of interest.
The PhyloChip Array is a custom array currently developed by Dr. Gary Andersen at the Lawrence Berkeley National Laboratory. The array identifies and measures the relative abundance of >50,000 individual microbes. PhyloChip relies on the analysis of all nine variable regions of the 16S gene, providing more in-depth taxonomic classification than other common approaches. The array is only available through Second Genome Solutions and cannot be purchased directly from our company.
For Research Use Only. Not for use in diagnostic procedures.