Bioconductor is an open source, open development software project to provide tools for the analysis and comprehension of high-throughput genomic data. It is based primarily on the R programming language.


The Bioconductor release version is updated twice each year, and is appropriate for most users. There is also a development version, to which new features and packages are added prior to incorporation in the release. A large number of meta-data packages provide pathway, organism, microarray and other annotations.

The Bioconductor project started in 2001 and is overseen by a core team, based primarily at the Fred Hutchinson Cancer Research Center, and by other members coming from US and international institutions. It gained widespread exposure in a 2004 Genome Biology paper.


The Bioconductor project relies on peer review of candidate add-on packages to ensure that it grows with high-quality, scientifically relevant software. It has achieved a virtuous cycle: its success has attracted new scientific software developers, who in turn have contributed more and more to the project.

Available Packages

Microarray Analysis

Import Affymetrix, Illumina, Nimblegen, Agilent, and other platforms. Perform quality assessment, normalization, differential expression, clustering, classification, gene set enrichment, genetical genomics and other workflows for expression, exon, copy number, SNP, methylation and other assays. Access GEO, ArrayExpress, Biomart, UCSC, and other community resources.

Pre-processing Affymetrix arrays

  • affy - Methods for Affymetrix Oligonucleotide Arrays

Assessing differential expression

  • limma - Linear Models for Microarray Data

Sequence Data

Import fasta, fastq, ELAND, MAQ, BWA, Bowtie, BAM, gff, bed, wig, and other sequence formats. Trim, transform, align, and manipulate sequences. Perform quality assessment, ChIP-seq, differential expression, RNA-seq, and other workflows. Access the Sequence Read Archive.

Range-based calculation, data manipulation and representation

  • IRanges - Infrastructure for manipulating intervals on sequences
  • GenomicRanges - Representation and manipulation of genomic intervals
  • genomeIntervals - Operations on genomic intervals
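
As a rough illustration of the kind of range-based calculation these packages support, the sketch below finds overlapping pairs between two sets of closed integer intervals in plain Python. It is not the IRanges or GenomicRanges interface; the function name `find_overlaps`, the tuple representation, and the example coordinates are assumptions made for this example.

```python
from typing import List, Tuple

# Assumed representation: closed interval (start, end), with start <= end.
Interval = Tuple[int, int]


def find_overlaps(query: List[Interval], subject: List[Interval]) -> List[Tuple[int, int]]:
    """Return (query_index, subject_index) pairs that share at least one position."""
    hits = []
    for qi, (qs, qe) in enumerate(query):
        for si, (ss, se) in enumerate(subject):
            # Two closed intervals overlap when each starts no later than the other ends.
            if qs <= se and ss <= qe:
                hits.append((qi, si))
    return hits


# Hypothetical coordinates, for illustration only.
reads = [(1, 10), (20, 30), (50, 60)]
genes = [(5, 25), (61, 70)]
print(find_overlaps(reads, genes))  # → [(0, 0), (1, 0)]
```

This version compares every pair and is therefore quadratic in the number of intervals; it is meant only to show the overlap condition, not to scale to genome-sized data.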

Alignment, pattern matching and data manipulation of large biological sequences

  • Biostrings - String objects representing biological sequences, and matching algorithms
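
To make "pattern matching and data manipulation" concrete, here is a minimal plain-Python sketch of two common operations on DNA strings: reverse complement and exact pattern search. It does not use Biostrings and does not reproduce its interface; the function names `reverse_complement` and `match_pattern`, and the choice of 1-based start positions, are assumptions for illustration.

```python
# Translation table for the four unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Complement each base, then reverse the sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def match_pattern(pattern: str, subject: str) -> list:
    """Return 1-based start positions of exact matches, including overlapping ones."""
    starts = []
    pos = subject.find(pattern)
    while pos != -1:
        starts.append(pos + 1)  # convert Python's 0-based index to 1-based
        pos = subject.find(pattern, pos + 1)
    return starts


print(reverse_complement("ATGC"))         # → GCAT
print(match_pattern("GAT", "AGATTGAT"))   # → [2, 6]
```

The sketch handles only the bases A, C, G and T and only exact matches; ambiguity codes and mismatch-tolerant searching are left out.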

File I/O, quality assessment, and high-level, general purpose data summary

  • ShortRead - Classes and methods for high-throughput short-read sequencing data.
  • Rsamtools - Import aligned BAM file format sequences into R / Bioconductor

Import and export of tracks on the UCSC genome browser

  • rtracklayer - R interface to genome browsers and their annotation tracks

Accessing and manipulating curated whole-genome representations

  • BSgenome - Infrastructure for Biostrings-based genome data packages

Annotation of sequence features across common genomes

  • GenomicFeatures - Tools for making and manipulating transcript centric annotations

Access to BioMart databases

  • biomaRt - Interface to BioMart databases (e.g. Ensembl, COSMIC, WormBase and Gramene)

Querying and retrieving data from the Sequence Read Archive

  • SRAdb - A compilation of metadata from NCBI SRA and tools

ChIP-seq and related (e.g., motif discovery, identification of high-coverage segments) activities

  • CSAR - Statistical tools for the analysis of ChIP-seq data
  • chipseq - A package for analyzing chipseq data
  • ChIPseqR - Identifying Protein Binding Sites in High-Throughput Sequencing Data
  • ChIPsim - Simulation of ChIP-seq experiments
  • ChIPpeakAnno - Batch annotation of the peaks identified from either ChIP-seq or ChIP-chip experiments.
  • rGADEM - De novo motif discovery
  • segmentSeq - Methods for identifying small RNA loci from high-throughput sequencing data
  • BayesPeak - Bayesian Analysis of ChIP-seq Data
  • PICS - Probabilistic inference of ChIP-seq

Differential expression and RNA-seq style analysis

  • Genominator - Analyze, manage and store genomic data
  • edgeR - Empirical analysis of digital gene expression data in R
  • baySeq - Empirical Bayesian analysis of patterns of differential expression in count data
  • DESeq - Digital gene expression analysis based on the negative binomial distribution
  • DEGseq - Identify Differentially Expressed Genes from RNA-seq data

High Throughput Assays

Import, transform, edit, analyze and visualize flow cytometric, mass spec, HTqPCR, cell-based, and other assays.

Analysis of flow cytometry data

  • flowCore - Basic structures for flow cytometry data
  • flowViz - Visualization for flow cytometry
  • flowQ - Quality control for flow cytometry
  • flowStats - Statistical methods for the analysis of flow cytometry data
  • flowUtils - Utilities for flow cytometry
  • flowFP - Fingerprinting for Flow Cytometry
  • flowTrans - Parameter Optimization for Flow Cytometry Data Transformation
  • iFlow - A Graphical User Interface for Flow Cytometry Tools

Clustering flow cytometry data

  • flowClust - Clustering for Flow Cytometry
  • flowMeans - Non-parametric Flow Cytometry Data Gating
  • flowMerge - Cluster Merging for Flow Cytometry Data
  • SamSPECTRAL - Identifies cell population in flow cytometry data.

Data structures and algorithms for cell-based high-throughput screens (HTS)

  • cellHTS2 - Analysis of cell-based screens - revised version of cellHTS
  • RNAither - Statistical analysis of high-throughput RNAi screens
  • RTCA - Open-source toolkit to analyse data from xCELLigence System (RTCA) by Roche

High-throughput qPCR Assays

  • HTqPCR - Automated analysis of high-throughput qPCR data
  • ddCt - The ddCt Algorithm for the Analysis of Quantitative Real-Time PCR (qRT-PCR)
  • qpcrNorm - Data-driven normalization strategies for high-throughput qPCR data.
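
The ddCt (delta-delta-Ct) calculation named above can be sketched in a few lines of Python. This is the textbook form of the method, assuming perfect amplification efficiency (a doubling per cycle), and is not the interface of the ddCt package; the function name `relative_expression` and the Ct values are hypothetical.

```python
def relative_expression(ct_target_sample: float,
                        ct_ref_sample: float,
                        ct_target_calibrator: float,
                        ct_ref_calibrator: float) -> float:
    """Fold change of a target gene in a sample relative to a calibrator (2^-ddCt)."""
    # Normalise the target gene to the reference gene within each condition.
    d_ct_sample = ct_target_sample - ct_ref_sample
    d_ct_calibrator = ct_target_calibrator - ct_ref_calibrator
    # Compare the normalised sample against the normalised calibrator.
    dd_ct = d_ct_sample - d_ct_calibrator
    # Assumes the amount of product doubles every cycle.
    return 2.0 ** (-dd_ct)


# Hypothetical Ct values: the target crosses threshold two cycles earlier
# in the sample than in the calibrator, with equal reference-gene Ct.
print(relative_expression(22.0, 18.0, 24.0, 18.0))  # → 4.0
```

A lower Ct means more starting template, so a negative ddCt corresponds to higher expression in the sample than in the calibrator.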

Mass Spectrometry and Proteomics data

  • clippda - A package for the clinical proteomic profiling data analysis
  • MassArray - Analytical Tools for MassArray Data
  • MassSpecWavelet - Mass spectrum processing by wavelet-based algorithms
  • PROcess - Ciphergen SELDI-TOF Processing
  • flagme - Analysis of Metabolomics GC/MS Data
  • xcms - LC/MS and GC/MS Data Analysis

Infrastructure for image-based phenotyping and automation of other image-related tasks

  • EBImage - Image processing toolbox for R


Annotation

Use microarray probe, gene, pathway, gene ontology, homology and other annotations. Access GO, KEGG, NCBI, BioMart, UCSC, vendor, and other sources.

Make and manipulate annotation mapping objects

  • AnnotationDbi - Annotation Database Interface

Deal with annotation questions that involve categorical data

  • Category - Category Analysis

Hypergeometric testing using the Gene Ontology found in the GO.db package

  • GOstats - Tools for manipulating GO and microarrays.
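
For readers unfamiliar with hypergeometric testing, the sketch below computes the upper-tail probability that underlies a typical over-representation test: the chance of seeing at least `k` annotated genes in a selection of `n` genes drawn from a universe of `N` genes, of which `K` carry the annotation. It is written in plain Python (3.8 or later, for `math.comb`) and does not reproduce the GOstats interface; the function name and the example numbers are assumptions.

```python
from math import comb


def hypergeom_upper_tail(k: int, N: int, K: int, n: int) -> float:
    """P(X >= k) when n genes are drawn without replacement from N, K of them annotated."""
    total = comb(N, n)          # all ways to choose the selected gene list
    upper = min(K, n)           # cannot observe more hits than annotated or selected genes
    favourable = sum(
        comb(K, i) * comb(N - K, n - i)   # i annotated genes, n - i unannotated genes
        for i in range(k, upper + 1)
    )
    return favourable / total


# Hypothetical numbers: 10 genes in the universe, 4 annotated to a term,
# 3 genes selected, 2 of which carry the annotation.
p_value = hypergeom_upper_tail(k=2, N=10, K=4, n=3)
print(round(p_value, 4))  # → 0.3333
```

A small probability suggests that the term is over-represented in the selected list; in practice one such test is run per term, so some correction for multiple testing is usually needed.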

Tools for making use of annotations

  • annotate - Annotation for microarrays

Pull annotation data directly from web based annotation resources

  • biomaRt - Interface to BioMart databases (e.g. Ensembl, COSMIC, WormBase and Gramene)


Bioconductor has become a vital software platform for the worldwide genomic research community. As of July 2010, Google Scholar lists 2375 scientific documents that cite the ground-breaking 2004 Genome Biology paper "Bioconductor: open software development for computational biology and bioinformatics". The 2004 Bioconductor paper is the second most accessed article of all time from Genome Biology.

Bioconductor citations in leading scientific journals increased from January 2003 to July 2010. Table 4 contains the results of PubMed searches for “bioconductor” over different time frames. It shows at least 233 journal citations from January 2003 to July 2010, with nearly 50% (115) appearing in Bioinformatics. A sample of 84 publications citing Bioconductor in 2009 or 2010 is listed in the bibliography of this report.

Competing Projects

  • BioPerl

  • Biopython

  • BioJava

  • BioRuby

Associated Projects

Useful Link

