This data set is expected to be useful for a variety of purposes, including software and workflow demonstration and the development of probe-level analysis methods for making genotype calls from probe intensity data.

The data set consists of 48 samples, each run on both the Nsp and Sty arrays (a total of 48 x 2 = 96 hybridizations). The samples consist of 13 trios (5 HapMap CEPH trios, 5 HapMap Yoruban trios and 3 other non-HapMap trios) and 9 unrelated HapMap Asian samples. In total, 39 of the 48 samples are among the samples used in the International HapMap Project.

Particularly useful is the large number of reference genotypes that the HapMap Project has made available, which can be used in conjunction with this data set. The HapMap data access policy limits redistribution rights on these genotypes, so they cannot be made available directly by Affymetrix, but the reference data can be downloaded directly from the HapMap Project. As of HapMap release 16c1, a total of 124,624 SNPs have reference genotypes available for the samples shared here (65,246 SNPs for Nsp and 59,378 SNPs for Sty). These numbers are steadily increasing with each HapMap update. The details of the analysis method used by GTYPE to determine genotype calls from probe intensity data have been published in Bioinformatics.

The data set has been split into 13 zip files for convenient download. These can be unzipped on top of one another. The file with the word 'base' in its filename is required. The other 12 zip files each contain a distinct collection of chip data, so users who want only part of the data may download any subset of them.
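As an illustration of unzipping the parts on top of one another, the following is a minimal Python sketch using only the standard library. The helper name `extract_parts` and the destination directory are hypothetical, and the internal layout of the archives is not assumed; the README.txt in the 'base' file remains the authoritative guide.

```python
import zipfile
from pathlib import Path


def extract_parts(zip_paths, dest_dir):
    """Extract every archive into the same directory, one on top of another.

    Returns the member names of all archives, in extraction order.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    extracted = []
    for zip_path in zip_paths:
        with zipfile.ZipFile(zip_path) as archive:
            # Members with the same relative path overwrite earlier copies.
            archive.extractall(dest)
            extracted.extend(archive.namelist())
    return extracted


# Example call (paths are placeholders for wherever the zips were saved):
# extract_parts(["500K_data-base.zip", "500K_data-nsp-1.zip"], "500K_data")
```

Listing the 'base' file first is only a convention in this sketch; since all archives land in the same directory, the order should not matter unless two archives carry a file with the same path.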

The data is provided in two versions. Each version contains the same data but in different file formats. Version 1 (Table 1) contains raw CEL, CHP and EXP files and is suitable for use outside of the GCOS/GTYPE framework. It is expected to be mainly of interest to users doing low-level probe analysis. Version 2 (Table 2) contains DTT format files and is expected to be mainly of interest to users who wish to integrate the data with the GCOS/GTYPE applications.
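For scripted downloads of a subset, the archive names for either version can be generated from the naming pattern visible in the tables below. This is a small sketch only: the function `archive_names` and its parameters are hypothetical, and the pattern is inferred from the listed filenames rather than documented anywhere.

```python
def archive_names(version, arrays=("nsp", "sty"), parts=range(1, 7)):
    """Build the list of zip names for one version of the data set.

    version: 1 for CEL/CHP/EXP files, 2 for DTT files.
    arrays:  which arrays to include ("nsp", "sty" or both).
    parts:   which of the six numbered parts per array to include.
    """
    if version not in (1, 2):
        raise ValueError("version must be 1 or 2")
    # The 'base' file is required whichever subset is chosen.
    names = ["500K_data-base.zip"]
    for array in arrays:
        for part in parts:
            if version == 1:
                names.append(f"500K_data-{array}-{part}.zip")
            else:
                # Version 2 names use a capital 'D' and a 'dtt' infix.
                names.append(f"500K_Data-{array}-dtt-{part}.zip")
    return names
```

With the defaults, each version yields 13 names (the 'base' file plus 12 data files); passing, say, `arrays=("sty",)` and `parts=[6]` restricts the list to the 'base' file and a single Sty archive.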

In either case, a file named README.txt in the 'base' file gives detailed instructions on how to use the data. MD5 checksums are provided in the tables below for verifying the integrity of downloaded data.
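One way to check a downloaded file against its published checksum is sketched below in Python, using the standard `hashlib` module. The helper names `md5_of_file` and `verify` are hypothetical; the example checksum in the comment is the one listed for the 'base' file.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected_md5):
    """Return True if the file's MD5 digest matches the published value."""
    return md5_of_file(path) == expected_md5.strip().lower()


# Example call (assumes the file was saved in the current directory):
# verify("500K_data-base.zip", "30abc864a12205eda21af3a9d33d3a27")
```

Reading in chunks keeps memory use small for archives of a few hundred megabytes. A mismatch most likely indicates an incomplete or corrupted download.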

Version 1 release of data: CEL, CHP, and EXP format

(Suitable for use outside of GCOS/GTYPE framework)

| File | Size | MD5 checksum | Description |
|---|---|---|---|
| 500K_data-base.zip | 101 MB | 30abc864a12205eda21af3a9d33d3a27 | Documentation and library files for entire data set |
| 500K_data-nsp-1.zip | 215 MB | b563e2b3a809983f36fb1d5dbddd0e43 | Probe intensities and genotype calls for 8 HapMap samples on Nsp array |
| 500K_data-nsp-2.zip | 221 MB | b30c9d56ad9fff4b777d6a2bad98c1f6 | Probe intensities and genotype calls for 8 HapMap samples on Nsp array |
| 500K_data-nsp-3.zip | 210 MB | 6ad65f1d4b8fd3a4fe807a6b4b8a843e | Probe intensities and genotype calls for 8 HapMap samples on Nsp array |
| 500K_data-nsp-4.zip | 220 MB | 50aaf3213d2ce0f43c6b6188b241f6be | Probe intensities and genotype calls for 8 HapMap samples on Nsp array |
| 500K_data-nsp-5.zip | 215 MB | e504591076cac04c9dd78320ff622916 | Probe intensities and genotype calls for 7 HapMap and 1 non-HapMap samples on Nsp array |
| 500K_data-nsp-6.zip | 216 MB | 50a796fccbf06c7d767260a50a03094c | Probe intensities and genotype calls for 8 non-HapMap samples on Nsp array |
| 500K_data-sty-1.zip | 209 MB | 2a45a4f4fc282029c6aae652250c75fb | Probe intensities and genotype calls for 8 HapMap samples on Sty array |
| 500K_data-sty-2.zip | 222 MB | 654a6d82147930a5bbc207643d16ba1f | Probe intensities and genotype calls for 8 HapMap samples on Sty array |
| 500K_data-sty-3.zip | 216 MB | c86549f0011c019da253b88ba9cd8e60 | Probe intensities and genotype calls for 8 HapMap samples on Sty array |
| 500K_data-sty-4.zip | 221 MB | 446683507a8df1dd6affd4a792bf844f | Probe intensities and genotype calls for 8 HapMap samples on Sty array |
| 500K_data-sty-5.zip | 221 MB | 5a368488d8383a95ab801e32e10f5806 | Probe intensities and genotype calls for 7 HapMap and 1 non-HapMap samples on Sty array |
| 500K_data-sty-6.zip | 222 MB | d488c793bc1ac3747719226267fb283e | Probe intensities and genotype calls for 8 non-HapMap samples on Sty array |

Version 2 release of data: DTT format

(Intended for use within GCOS/GTYPE framework)

| File | Size | MD5 checksum | Description |
|---|---|---|---|
| 500K_data-base.zip | 101 MB | 30abc864a12205eda21af3a9d33d3a27 | Documentation and library files for entire data set |
| 500K_Data-nsp-dtt-1.zip | 215 MB | eb6812d72feab9167deb28c7cad81ce5 | Archived DTT files for 8 HapMap samples on Nsp array |
| 500K_Data-nsp-dtt-2.zip | 221 MB | 419b0311dd55105a36e0bf3acf0ab957 | Archived DTT files for 8 HapMap samples on Nsp array |
| 500K_Data-nsp-dtt-3.zip | 210 MB | cfcede32d13d1deabe11c6b5c4d3795e | Archived DTT files for 8 HapMap samples on Nsp array |
| 500K_Data-nsp-dtt-4.zip | 221 MB | 34b25f3e188b1b7373e635f121c13efa | Archived DTT files for 8 HapMap samples on Nsp array |
| 500K_Data-nsp-dtt-5.zip | 215 MB | b19b1bdbdb4aa83ed35d45390d71c738 | Archived DTT files for 7 HapMap and 1 non-HapMap samples on Nsp array |
| 500K_Data-nsp-dtt-6.zip | 216 MB | 177077a197712bbc77dcca2059b3b76c | Archived DTT files for 8 non-HapMap samples on Nsp array |
| 500K_Data-sty-dtt-1.zip | 209 MB | 0dde6dbf688d32eb79252b1d3a161f33 | Archived DTT files for 8 HapMap samples on Sty array |
| 500K_Data-sty-dtt-2.zip | 222 MB | 70c459d8da59f13a384b40d870c38f02 | Archived DTT files for 8 HapMap samples on Sty array |
| 500K_Data-sty-dtt-3.zip | 216 MB | 4ef48cfc72c9dafc712effaff53eb0b3 | Archived DTT files for 8 HapMap samples on Sty array |
| 500K_Data-sty-dtt-4.zip | 221 MB | feca84b7dcc8d473487c73ddfa94d7ea | Archived DTT files for 8 HapMap samples on Sty array |
| 500K_Data-sty-dtt-5.zip | 222 MB | c94a4d7e3434a4538e8966374ba01bac | Archived DTT files for 7 HapMap and 1 non-HapMap samples on Sty array |
| 500K_Data-sty-dtt-6.zip | 222 MB | f9eb0b7db9fe3a375d6ccbb8c83103db | Archived DTT files for 8 non-HapMap samples on Sty array |