Are you a life scientist interested in applications to analyze and manage the vast amounts of data generated by the SOLiD® System? Are you an independent software vendor or academic researcher interested in developing software for next-generation sequencing? On this site, you will find resources to help, including:

  • Academic and open-source software
  • Commercial software
  • Sample data sets
  • Technical documentation
  • Data conversion tools

Application-Specific Software

Chromatin Immunoprecipitation Sequencing (ChIP-Seq)

  • ChIP-Seq File Formatting Tool
    Enables you to convert result files generated by aligning SOLiD® reads to reference sequences using SOLiD® software into a format compatible with most available ChIP-seq analysis tools. Available as source code.
  • MACS
    Model-based Analysis of ChIP-Seq (MACS), a tool for identifying enriched regions, such as transcription factor binding sites, in ChIP-Seq data.
  • QuEST
    A Kernel Density Estimator-based package for analyzing massively parallel sequencing data from chromatin immunoprecipitations (ChIP-Seq or ChIPseq).

Gene Expression

  • SOLiD® SAGE™ Analysis Software
    A tool for taking the raw sequence data files from SOLiD® SAGE™ reads and matching them to known sequences in your reference database of choice.

de novo Sequencing

  • Velvet
    De novo assembly of short reads (EMBL-EBI)
  • ADiR
    ADiR is an assembler for AB SOLiD® colorspace reads that makes use of color transition properties in resolving overlap and consensus.


  • AB Inversion Tool
    This stand-alone tool enables you to detect inversions in SOLiD® data.
  • AB CNV Tool
    This tool is designed to detect copy number variation using SOLiD® data from a single human sample mapped to the human reference sequence.

Small RNA Analysis

  • Small RNA Analysis Pipeline Tool
    A small RNA SOLiD® System Analysis Pipeline for human whole genome alignment of small RNA sequencing data with tag counting features. Available as source code.

Whole Transcriptome Analysis

  • AB WT Analysis Pipeline
    Allows you to align transcriptome reads to a reference genome, count tags for exons and genes, and output data in base space.
    A computational pipeline that provides an automatic and integrated way to align color-space sequencing data, collate this information, and generate files for examining gene-expression data in a genomic context.

General Software (for Multiple Applications)

Data Analysis Tools

  • Galaxy
    Galaxy allows you to analyze multiple alignments, compare genomic annotations, profile metagenomic samples, and much much more—without the need to install or download anything. For developers, it is an open-source, scalable framework for tool and data integration.

Data Conversion Tools

  • Sequence Read Format Conversion Tool
    Enables you to convert native SOLiD® data into a community-driven standard (Short Read Format) for NCBI submission. Available as source code.
  • SOLiD® BaseQV Tool
    A tool for converting SOLiD® output files to base-sequence data with associated quality values.
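
For developers new to color space, the sketch below shows the basic idea of turning a color-space read into bases. It is a minimal illustration, not the method used by the SOLiD® BaseQV Tool, which this page does not describe. It assumes the standard two-base encoding, in which each color (0 to 3) is the XOR of the two-bit codes of adjacent bases (A=0, C=1, G=2, T=3) and the read starts with a known primer base. The function name `decode_colorspace` is hypothetical, and quality values and missing color calls (often written as `.`) are not handled.

```python
# Two-bit codes for each base; a color is the XOR of adjacent base codes.
BASE_TO_CODE = {"A": 0, "C": 1, "G": 2, "T": 3}
CODE_TO_BASE = "ACGT"


def decode_colorspace(read: str) -> str:
    """Decode a color-space read such as 'T0123' into base space.

    The first character is the known primer base. It seeds the decoding
    and is not included in the returned sequence.
    """
    prev = BASE_TO_CODE[read[0]]
    bases = []
    for color in read[1:]:
        prev ^= int(color)  # next base code = previous base code XOR color
        bases.append(CODE_TO_BASE[prev])
    return "".join(bases)


if __name__ == "__main__":
    print(decode_colorspace("T0123"))  # prints TGAT
```

Because every base depends on the one before it, a single wrong color call changes all bases that follow in this naive decoding. That is one reason mapping tools usually align reads in color space first and convert to base space afterwards.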

Genome Browsers

  • SOLiD® Alignment Browser
    A genome annotation viewer and editor which is based on the Apollo Genome Annotation Curation Tool.
  • UCSC Genome Browser
    The Genome Browser zooms and scrolls over chromosomes, showing the work of annotators worldwide.

Mapping Tools

  • Color Space Mapping Tool
    Enables you to map SOLiD® color space reads to whole human genomes. Available as source code.
  • SHRiMP
    The SHort Read Mapping Package.
    Enables you to map short reads to reference sequences quickly and accurately.
  • SOCS
    This software uses iterative mismatch tolerances to speed up high tolerance mapping. It finds optimal alignments for each read across one or more reference files. It also provides maps of sequence census and isolated mismatches.
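
As a rough illustration of what "iterative mismatch tolerances" can mean, the toy sketch below tries to place a read with zero mismatches first and raises the allowed mismatch count only when nothing is found. It is not the SOCS algorithm. It scans the reference naively, ignores color-space error rules, and the names `mismatches_within` and `map_read` are hypothetical.

```python
def mismatches_within(read, window, limit):
    """Count mismatches, returning None as soon as the count exceeds limit."""
    count = 0
    for a, b in zip(read, window):
        if a != b:
            count += 1
            if count > limit:
                return None
    return count


def map_read(read, reference, max_tolerance=3):
    """Return (tolerance, positions) for the lowest tolerance with a hit.

    Returns None when the read cannot be placed within max_tolerance.
    """
    span = len(read)
    for tolerance in range(max_tolerance + 1):
        hits = [
            pos
            for pos in range(len(reference) - span + 1)
            if mismatches_within(read, reference[pos:pos + span], tolerance) is not None
        ]
        if hits:
            # Lower tolerances found nothing, so these are the best placements.
            return tolerance, hits
    return None


if __name__ == "__main__":
    print(map_read("2032", "0120312031"))  # prints (1, [2, 6])
```

Reads that match well stop at the cheap low-tolerance passes, and only the difficult reads go on to the more expensive high-tolerance passes. The placements returned are optimal in the sense that no placement with fewer mismatches exists.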

1000 Genomes Project

  • 1000 Genomes Project Data
    This 1000 Genomes Project data contains the first set of SNP calls for 4 individuals that are part of the high coverage pilot project. These SNPs represent the preliminary analysis of a portion of the data so far collected and are released in accordance with the Ft. Lauderdale agreement for community resource projects. The data consist of README files, fastq files (nucleotides and qualities) suitable for use with most aligners, and md5 checksum data.
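
After downloading, you may want to confirm that each file matches its published md5 checksum. The sketch below is one way to do that with the Python standard library. The helper names `md5_of_file` and `matches_checksum` are hypothetical, and the layout of the checksum files in this data set is not described here, so the expected value is passed in as a plain hexadecimal string.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the md5 hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected_hex):
    """Compare a file's md5 digest with an expected hex string."""
    return md5_of_file(path) == expected_hex.strip().lower()
```

Reading in chunks matters here because fastq files from a high coverage run can be far larger than available memory.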


  • SOLiD® Human Sample NA18507
    Human sample NA18507 was sequenced using the SOLiD® System. Data sets are available to the scientific community through the public database hosted by the National Center for Biotechnology Information (NCBI). The public availability of this sequence data will help scientists gain a greater understanding of human genetic variation and potentially help explain differences in individual susceptibility and response to treatment for disease. The data may also be used to further enable the development of analytical tools for next generation sequence data.
  • E. coli DH10B Mate Pair Data
    Sample resequencing data from E. coli DH10B. It is the statistical analysis output from a quarter of a slide of a 50 bp mate-pair run.
  • E. coli DH10B Fragment Data
    This data set was sequenced using the SOLiD® System. It consists of a sequencing run from a fragment library of E. coli DH10B.

Small RNA Analysis

  • Human Small RNA Data Set
    Total RNA was isolated from human tissue samples and subsequently fractionated using flashPAGE. The small RNA fraction was converted to double-stranded DNA templates suitable for sequencing using the Applied Biosystems SOLiD® Small RNA Expression Kit. The small RNA libraries were then clonally amplified by ePCR and run on a SOLiD® instrument. Each library generated between 40 and 45 million reads of 35 bp (colors) in length, for a total of 175 million reads.

Whole Transcriptome Analysis

  • Whole Transcriptome Data Set
    This data set was sequenced using the SOLiD® System. It consists of an mRNA sequencing run using the methodology published by Cloonan, N. et al. in Nature Methods, doi: 10.1038/nmeth.1223 (2008).
  • Whole Transcriptome Paired-End Data Set
    This data set was generated by sequencing SOLiD® Total RNA-Seq prepared libraries using paired-end reads of 50 bp (forward) and 25 bp (reverse) on the SOLiD® 4 System. The data provided is the mapping output and whole transcriptome results from the SOLiD® BioScope™ 1.2.1 WT analysis pipeline.

SOLiD® Data Format and File Definitions Guide (PDF)
This guide is a comprehensive description of the Human Genome data sets available through NCBI. Files defined in this guide are a representative listing of standard outputs from the SOLiD® analysis pipeline and can be used as a reference for software developers.

SOLiD® General Feature Format Document (PDF)
This document describes the general feature format and the data and analysis results it contains: colorspace calls, quality values (QV), and variant annotations.


Company/Software   Supported Applications

BioTeam
BioTeam provides services for integrating sequencers, computing and storage systems, and managing data with WikiLIMS™.
  • Data Management

BioTique Systems
BioTique offers commercially developed software, HT-BLIS (High-Throughput HT-BLIS), which enables academic and research organizations to process the massive amounts of data generated by high-throughput genomic analysis systems.
  • Data Management

CLC bio
Comprehensive Next Generation Sequencing analysis solutions with native Color Space analysis of SOLiD® System data. CLC bio's customizable solutions include accelerated analysis of Genomics, Transcriptomics, and Epigenomics data. CLC bio offers everything from user-friendly desktop applications to full High-Performance Computing enterprise solutions, completely integrated with existing workflows.
  • ChIP-Seq
  • De Novo Assembly
  • Whole Transcriptome Analysis

Genologics
Lab and data management solutions for life sciences research. The Genologics platform automates workflows and data capture by integrating with technology platforms such as the SOLiD® System.
  • Data Management

We offer a high-throughput sequencing service with the SOLiD® System as an Applied Biosystems Service Provider in Europe. We have implemented and optimized all the applications for this platform, and we collaborate with two academic bioinformatics groups to provide the most complete information to our customers.
  • Small RNA Analysis
  • Whole Transcriptome Analysis

GenomeQuest
GenomeQuest is a web-based software application for managing and mining sequence data and a management system for sequence reference data.
  • Data Management

Geospiza GeneSifter®
Geospiza's GeneSifter software offers award-winning ease of use, enabling you to understand the biological significance of your SOLiD® System data and draw meaningful conclusions quickly.
  • Data Management
  • Whole Transcriptome Analysis
  • Small RNA Analysis
  • ChIP-Seq

Partek® Genomics Suite™
Partek® Genomics Suite™ takes your Next Generation Sequencing data all the way from aligned reads to gene ontology and pathway analysis and beyond. The user-friendly RNA-Seq, ChIP-Seq, and DNA-Seq data analysis workflows guide you through sophisticated statistical analysis into biological interpretation within our intuitive and interactive genome browser. With support for all microarray and next generation sequencing technologies and assays, our Integrative Genomics workflows make functional genomics concrete.
  • DNA-Seq including SNP duo and trio calling
  • RNA-Seq including Whole Transcriptome, Small RNA and Allele Specific Expression
  • ChIP-Seq
  • Methylation

SoftGenetics
SoftGenetics, world leader in Genetic Analysis Software, features NextGENe 2nd Gen Analysis software for the SOLiD® System. The software provides a biologist-friendly interface that requires no scripting, which reduces bioinformatics costs, and includes many unique technologies, among them its patent-pending Condensation Tool to increase analysis accuracy. It provides a host of applications: de novo, SNP and structural discovery, transcriptome, ChIP-Seq, mRNA, and other applications.
  • ChIP-Seq
  • De Novo Assembly
  • Resequencing
  • Whole Transcriptome