Welcome to the ACGT 2011 Retreat Wiki

Welcome to the AGCT wiki. This place is where the bioinformatics community of the PRBB will post the public proceedings of its retreat. The schedule is available here


Follow this link


Each group will populate a wiki dedicated to one of the projects listed below. These can include high-throughput projects like ENCODE or 1000 Genomes, but also community-based projects like Bioconductor. Do not hesitate to add your favorite project.

For each wiki, we have designed a Project Description Template. It is by no means exhaustive, and you should feel free to extend it. This template will be the backbone of your wiki and of its presentations.

We have also planned a series of short Round Table Discussions for Tuesday morning (list of topics here). If you find one you like, sign up on this wikipage (max. 10 participants per topic). Otherwise, add your own.

The workshop will go as follows:

Before the retreat: sign up for a Project (list below) and for a Round Table

  • No more than 10 people per project
  • If the project you are interested in is full, sign up for another one
  • If all projects are full, create a new one

During the retreat:

  1. Create a wikipage named after your project
  2. Link it to the main page (i.e. to the corresponding entry in the Project list section)
  3. Use the description template to fill in your wikipage. The goal is to put together as many informative facts as possible
  4. Create a summary section (Heading 2) with at least one highly informative picture
  5. Develop the rest of the wiki, adding one section for each item (a section can span more than one page)
  6. Link it to the Project List section

Project List

1. 1000 Genomes: A deep catalog of human genetic variation

This international effort was launched in 2008 with support from the National Institutes of Health, the Wellcome Trust Sanger Institute, the Beijing Genomics Institute, and in-kind contributions by a number of private companies. The project’s goal is to sequence the genomes of at least 1,000 people, discovering both SNPs and structural variants, and to place them in a public database. The project has now been scaled up to 2,500 people from 27 different populations.


  1. Andy Pohl
  2. Giancarlo Castellano
  3. Cedrik Magis
  4. Ernesto Lowy
  5. Natalia Petit
  6. Guglielmo Roma
  7. Darek Kedra
  8. Paolo Di Tommaso

2. International Cancer Genomics Projects:

Perhaps the most prominent and ambitious is the International Cancer Genome Consortium:

The ICGC goal is to obtain a comprehensive description of genomic, transcriptomic and epigenomic changes in 50 different tumor types and/or subtypes which are of clinical and societal importance across the globe. Several countries, including Spain, are participating. There are several other projects that started earlier or in parallel and have the same aims:

The Cancer Genome Project from the Wellcome Trust Sanger Institute:

The Cancer Genome Atlas (TCGA) project, from NIH:

and also from NIH, the Cancer Genome Anatomy and Characterization Projects:


  1. Sonia Althammer
  2. Michael Schroeder
  3. Abul Islam
  4. Christian Pérez-Llamas
  5. Abel Gonzalez Perez
  6. Pedro Ferreira
  7. David González Knowles
  8. Jose A Espinosa
  9. David Tamborero

3. Energy genomics at the Department of Energy Joint Genome Institute

In the field of bioenergy, sequence data can be applied to the problem of reducing the U.S. dependency on imported oil by improving biomass yield and the efficiency of processes used to convert plant materials into liquid fuels and valuable byproducts.

The project's wiki can be found here.

From Fernando Muñiz (So sorry, won't be attending due to sickness. I'm posting some links from home. Dispose as you see fit):

The EERA: 1,000 researchers currently working for the development of the next generation of energy technologies:
Short rotation energy crops could generate 4% of current UK electricity demand[1]
Production of dimethylfuran for liquid fuels from biomass-derived carbohydrates
Sustainable Bio-Composites from Renewable Resources: Opportunities and Challenges in the Green Materials World
Biodiesel fuel production from algae as renewable energy (pdf)


  1. Fernando Muñiz
  2. Fran Supek
  3. Cinzia De Benedictis
  4. Gabriela Aguileta
  5. Paolo Ribeca


4. ENCODE

The National Human Genome Research Institute (NHGRI) launched a public research consortium named ENCODE, the Encyclopedia Of DNA Elements, in September 2003, to carry out a project to identify all functional elements in the human genome sequence. With the success of the initial phases of the ENCODE Project, NHGRI funded new awards in September 2007 to scale the ENCODE Project to a production phase on the entire genome along with additional pilot-scale studies.

The project's wiki can be edited here.


  1. Nicolás Bellora
  2. Julien Lagarde
  3. Angelika Merkel
  4. Giovanni Bussotti
  5. Andrea Tanzer
  6. Sarah Djebali
  7. Alba Jené
  8. Marco Mariotti
  9. Marta Melé
  10. Toni Hermoso
  11. Luca Cozzuto

5. Bioconductor

Bioconductor provides tools for the analysis and comprehension of high-throughput genomic data. Bioconductor uses the R statistical programming language, and is open source and open development.

Please edit the wiki page for this topic.


  1. Robert Castelo
  2. Armand Gutiérrez
  3. João Moura
  4. Xavier Rafael
  5. Jordi Deu-Pons
  6. Francesco Mancuso
  7. Rory Johnson
  8. Maik Röder
  9. Didac Santesmasses

6. Human Microbiome

The Common Fund's Human Microbiome Project (HMP) aims to characterize the microbial communities found at several different sites on the human body, including nasal passages, oral cavities, skin, gastrointestinal tract, and urogenital tract, and to analyze the role of these microbes in human health and disease.


  1. Toby
  2. Margo Meer
  3. Carme Arnan
  4. Marina Marcet
  5. Marianela Masin

7. International Epigenomics Projects:

Project page

The NIH Roadmap Epigenomics Mapping Consortium was launched with the goal of producing a public resource of human epigenomic data to catalyze basic biology and disease-oriented research. The Consortium leverages experimental pipelines built around next-generation sequencing technologies to map DNA methylation, histone modifications, chromatin accessibility and small RNA transcripts in stem cells and primary ex vivo tissues selected to represent the normal counterparts of tissues and organ systems frequently involved in human disease.

There are other projects with similar objectives:

The International Human Epigenome Consortium:

and EpiGeneSys:

See also:


  1. Juan González-Vallinas
  2. Eneritz Agirre
  3. Amadís Pagès
  4. Ionas Erb
  5. Inna Povolotskaya
  6. Javier Prado

8. Ocean Genomics

Project page

The above issue of Nature reports on how genome sequences from the sea can lead to a new understanding of the biodiversity, ecology and biogeochemistry of marine environments.

See also the Marine Microbiology Initiative from the Gordon and Betty Moore Foundation:


  1. Gunes Gundem
  2. Sonja Haenzelmann
  3. Michael Breen
  4. Fedya
  5. Onuralp
  6. Joao Curado
  7. Salvador Capella
  8. Francisco Camara Ferreira
  9. Alexandros Pittis
  10. Jean-François Taly

9. Personal Genomes and Diseases

Geuvadis is an EU-funded project that aims to use next-generation DNA sequencing technologies to sequence genes from patients and controls (RNASeq and exonSeq). Several disease conditions will be investigated.

The Personal Genome Project aims to record the DNA and medical history of 100,000 individuals to understand how the genome determines our risk of disease.

See here for the personal_genomes page.


  1. Steve Laurie
  2. Núria Radó
  3. Macarena Toll-Riera
  4. JL Villanueva
  5. Jørgen Skancke
  6. Leszek Pryszcz
  7. Anna Kedzierska
  8. Cinta Pegueroles
  9. Jia Ming Chang
  10. Gabriel Santpere

10. Synthetic genomics

Synthetic genomics combines methods for the chemical synthesis of DNA with computational techniques to design it. These methods allow scientists and engineers to construct genetic material that would be impossible or impractical to produce using more conventional biotechnological approaches. For example, using synthetic genomics it is possible to design and assemble chromosomes, genes and gene pathways, and even whole genomes.

Please edit the wiki page for this topic.


  1. Sophia Derdak

Project Description Template

Each group will investigate a different large-scale project and will fill in a number of items on the wiki page. Ideally, each item listed below will be a section in your wiki and will be listed in the summary section.

  • Organization: project starting date, main driving institution, teams involved, sources/amount of funding.
  • Aims: initial aims
  • Development: changes in aims, scaling up, inclusion of new teams, etc.
  • Results: any published results to date.
  • Access to the data: which data have been made available to date?
  • Impact: expected effect on scientific and/or social progress.
  • Competing Projects
  • Associated Projects
  • Useful Link: Section

? (add your own)

Scientific Presentation

Improving the assessment of the outcome of non-synonymous SNPs with a Consensus deleteriousness score (Condel)

Abel Davi Gonzalez Perez

Several large ongoing initiatives that profit from next-generation sequencing technologies have driven, and will continue to drive in the coming years, the emergence of long catalogs of missense single nucleotide polymorphisms (SNPs) in the human genome. As a consequence, various methods and their related computational tools have been developed to classify these missense SNPs as likely deleterious or probably neutral polymorphisms. The outputs produced by these computational tools differ in nature and are thus difficult to compare and integrate. This makes it hard to obtain more accurate classifications by taking advantage of the possible complementarity between different tools. Here we propose an effective approach to integrate the output of some of these tools into a unified classification, based on a Weighted Average of the normalized Scores of the individual methods (WAS). (The approach is illustrated in this paper with the integration of five tools.) We show that this WAS outperforms each individual method in the task of classifying missense SNPs as deleterious or neutral. Furthermore, we demonstrate that this WAS can be used not only for classification purposes (deleterious vs. neutral mutation), but also as an indicator of the impact of the mutation on the functionality of the mutant protein. In other words, it may be used as a deleteriousness score of missense SNPs. Therefore, we recommend the use of this WAS as a Consensus deleteriousness score of missense mutations (Condel).
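As a rough illustration of the weighted-average idea in the abstract above, the sketch below normalizes the raw scores of several tools to a common [0, 1] scale and then combines them. It is only a sketch under stated assumptions: the tool names, score ranges, weights and the 0.5 cutoff are hypothetical placeholders, and the min-max normalization is a stand-in, since this page does not describe how Condel actually normalizes or weights the individual methods.

```python
from typing import Dict


def normalize(score: float, lo: float, hi: float,
              higher_is_deleterious: bool = True) -> float:
    """Min-max normalize a raw score to [0, 1], where 1 = most deleterious.

    The range (lo, hi) and the score direction are assumptions that must be
    supplied per tool; they are not taken from the Condel paper.
    """
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    x = (score - lo) / (hi - lo)
    x = min(1.0, max(0.0, x))  # clamp out-of-range scores
    return x if higher_is_deleterious else 1.0 - x


def weighted_average_score(normalized: Dict[str, float],
                           weights: Dict[str, float]) -> float:
    """Combine normalized per-tool scores into one weighted average."""
    total_weight = sum(weights[tool] for tool in normalized)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(normalized[tool] * weights[tool] for tool in normalized) / total_weight


def classify(was: float, cutoff: float = 0.5) -> str:
    """Label a mutation from its combined score (hypothetical cutoff)."""
    return "deleterious" if was >= cutoff else "neutral"


# Hypothetical example with two made-up tools.
scores = {
    "tool_a": normalize(0.02, 0.0, 1.0, higher_is_deleterious=False),
    "tool_b": normalize(0.90, 0.0, 1.0),
}
weights = {"tool_a": 1.0, "tool_b": 3.0}  # placeholder weights
combined = weighted_average_score(scores, weights)
print(combined, classify(combined))
```

Because the combined value stays on the same [0, 1] scale as the inputs, it can be read either as a class label (via the cutoff) or directly as a graded score, which mirrors the two uses mentioned in the abstract.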

Fine grain structural Classification with the T-RMSD

Cedrik Magis

Round Tables

1. New Models For Databases (NoSQL and the rest)


  1. Christian Pérez-Llamas
  2. Maik Röder
  3. Armand Gutiérrez
  4. Salvador Capella
  5. Paolo Di Tommaso
  6. Jordi Deu-Pons
  7. Fernando Muñiz Fernandez
  8. Jean-François Taly
  9. Toni Hermoso Pulido
  10. Darek Kedra

2. Disease (and other) ontologies


  1. Fran Supek
  2. Sophia Derdak

3. The war on cancer: Where is the battle-ground?


  1. Alba Jené

4. Darwinian medicine: a case for cancer


  1. Gunes Gundem
  2. Sonja Haenzelmann
  3. Sonia Althammer
  4. Jørgen Skancke
  5. Abul Islam
  6. Steve Laurie
  7. Núria Radó

5. Smoke without fire? How much of expression is rubbish, how much is functional?


  1. Tobias Warnecke
  2. Onuralp
  3. Joao Curado
  4. Andrea Tanzer
  5. Inna Povolotskaya
  6. Julien Lagarde
  7. Rory Johnson
  8. Ionas Erb
  9. Pedro Ferreira
  10. Cinzia De Benedictis
  11. Angelika Merkel