The National Human Genome Research Institute (NHGRI) [1] started ENCODE in September 2003. The current effort involves 28 leading teams: 10 for the production-scale effort, 4 for the mouse production-scale effort, 6 for the pilot-scale effort, 1 for the data coordination center, 1 for the data analysis center and 6 for the technology development effort [2]. The NHGRI funded a total of 8 requests for application (RFAs), amounting to more than 40 million dollars (10 million for the pilot study alone) [3]. From 2009, new grants were funded to expand ENCODE to the mouse genome and proteogenomics [4].

The following 3 pictures show the locations of the participating institutions, as well as the different projects and associated main PIs constituting the consortium.

Grants

7 November 2009

New grants funded: NHGRI has funded 5 new ENCODE grants as part of the American Recovery and Reinvestment Act. The new grants include the expansion of ENCODE to the mouse genome and proteogenomics.


ENCODE aims to identify all functional elements in the human genome sequence.



Regulatory elements: DNase hypersensitivity assays, assays of DNA methylation, and chromatin immunoprecipitation (ChIP) of proteins that interact with DNA, including modified histones and transcription factors, followed by sequencing (ChIP-Seq)
Transcribed elements: RNA-Seq
Long-range interactions: 5C


- 2003 NHGRI initiates international consortium

- 2003-2007 ENCODE pilot phase (1% of the human genome)

- 2006 EGASP: the human ENCODE Genome Annotation Assessment Project

- 2006 GENCODE: producing a reference annotation for ENCODE

- 2007-2011 ENCODE scaling-up


Access to the data

All data produced within the ENCODE project is submitted to, and released by, the Data Coordination Center at UC Santa Cruz (DCC). Users may freely download and analyze ENCODE-produced data, and also publish papers based on it after a 9-month embargo period (see Data Release Terms). Genome-mapped data is viewable through the UCSC Genome Browser.

The UCSC ENCODE data download interface is often criticized for being confusing and not very user-friendly. As an alternative, the CRG provides the ENCODE RNA Dashboard, a portal to all ENCODE transcriptome data, together with a short tutorial on how to batch-download ENCODE data files.
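As a rough illustration of what batch-downloading involves, the sketch below selects files from a metadata index and builds their download URLs. It assumes a tab-separated index in which each line holds a file name followed by `key=value` pairs separated by semicolons. That layout, the file names, the metadata keys and the base URL are all hypothetical and are not taken from the DCC or the CRG tutorial.

```python
from urllib.parse import urljoin


def parse_index_line(line):
    """Split one index line into (filename, metadata dict).

    Assumed layout: '<filename>\t<key>=<value>; <key>=<value>; ...'
    """
    filename, _, raw_meta = line.rstrip("\n").partition("\t")
    metadata = {}
    for pair in raw_meta.split(";"):
        key, sep, value = pair.strip().partition("=")
        if sep:  # skip empty or malformed fragments
            metadata[key] = value
    return filename, metadata


def select_urls(index_lines, base_url, **wanted):
    """Return download URLs for files whose metadata matches every wanted key."""
    urls = []
    for line in index_lines:
        if not line.strip():
            continue
        filename, metadata = parse_index_line(line)
        if all(metadata.get(k) == v for k, v in wanted.items()):
            urls.append(urljoin(base_url, filename))
    return urls


# Hypothetical index content and base URL, for illustration only.
EXAMPLE_INDEX = [
    "sampleA_rep1.bam\tcell=K562; type=bam; replicate=1",
    "sampleA_rep1.bigWig\tcell=K562; type=bigWig; replicate=1",
    "sampleB_rep1.bam\tcell=GM12878; type=bam; replicate=1",
]
EXAMPLE_BASE = "http://example.org/encode/"

if __name__ == "__main__":
    for url in select_urls(EXAMPLE_INDEX, EXAMPLE_BASE, cell="K562", type="bam"):
        print(url)
```

The resulting list of URLs could then be passed to any download tool. Filtering on metadata before downloading avoids fetching large files that are not needed.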


As of 11/04/2011, the ENCODE project has produced 57 scientific publications. The paper on the pilot project has been cited 772 times.

Competing Projects

The FANTOM project (Functional Annotation of the Mammalian Genome) is another international project aimed at the functional annotation of genomes, including the human genome. It was launched by Japanese researchers in 2000 and is now in its fifth phase, whose objective is to generate transcriptional regulatory models that define all human cell types.

Associated Projects

The modENCODE project (Model Organism ENCyclopedia Of DNA Elements) is a project analogous to ENCODE that focuses on Drosophila melanogaster and Caenorhabditis elegans. modENCODE, which started in 2006, is run as an open consortium and welcomes any investigator willing to abide by the criteria for participation.

As for ENCODE, both computational and experimental approaches are being applied by modENCODE. In addition, model organisms allow the validation of the identified DNA elements with experiments that cannot be performed on humans.

The modENCODE consortium published two papers in Science in December 2010 (both freely accessible), summarizing the main findings on the two model organisms. Several companion papers were also published in other high-impact journals.

The modENCODE Data Coordination Centre (DCC) maintains a webpage to access all public data.

Useful Links

UCSC data repositories

