REPORT ON THE MEETING OF THE TRYPANOSOMA CRUZI GENOME INITIATIVE

Hinxton, England. April 26-28, 1999
Sponsored by : WHO - TDR.
Organizers : Wim Degrave and Antonio Gonzalez.
Report : Wim M. Degrave, secretary of the T. cruzi genome project (WHO/TDR)

PARTICIPANTS :

Drs.
Alberto Carlos Frasch (coordinator; Buenos Aires, Argentina)
Bjorn Andersson (Uppsala, Sweden)
John Kelly (London, UK)
Jacqueline Bua (Buenos Aires, Argentina)
Antonio Gonzalez (Granada, Spain)
Alberto Delgado (Granada, Spain)
Wim Degrave (Rio de Janeiro, Brazil)
Etel Gimba (Rio de Janeiro, Brazil)
Martti Tammi (Uppsala, Sweden)
Anh-Nhi Tran (Uppsala, Sweden)
Najib El-Sayed (TIGR, USA)

Boris Dobrokhotov (WHO/TDR)

Unable to attend : Jose Franco da Silveira (Sao Paulo, Brazil), Bianca Zingales (Sao Paulo, Brazil), Jose Luis Ramirez (Caracas, Venezuela), Denis Le Paslier (Paris, France), Rick Tarleton (Athens, USA), John Swindle (Seattle, USA).

The T. cruzi Genome Planning Meeting was held together with the Leishmania Genome Meeting. Several sessions were common to the two networks, alternating with parasite-specific sessions. Most, if not all, participants found this format highly beneficial and effective.

Presentations on progress in the different parasite genome projects.

Dr. Dobrokhotov gave a brief overview on the restructuring of WHO and of TDR, and stressed the importance of functional genomics in the future workplans for the parasite genome initiatives.
Dr. Jenny Blackwell gave an overview of current progress in the Leishmania Genome initiative, while Peter Myler reviewed the genomic sequencing of chromosome 1 and the partial sequence of chromosome 3 of Leishmania major Friedlin. Dan Lawson (Sanger Center) reported progress on the sequencing of chromosome 4.
Dr. Carlos Frasch reviewed the general progress in the T. cruzi genome initiative, followed by Bjorn Andersson, who reported on the sequence of chromosome 3 (620 kb).
David Johnston (Schistosome network) presented an overview of the general progress in the Schistosoma network, and explained the EST clustering done for that initiative.
Dan Lawson explained the flow of information during large scale sequencing at the Sanger Center.
Mark Blaxter reviewed progress within the Filaria genome initiative.
Dan Lawson (Sanger Center) and Najib El-Sayed (TIGR) reported on large scale sequencing within the T. brucei network.

A workgroup consisting of Peter Myler, Al Ivens, Wim Degrave, Dan Lawson, Mark Blaxter, Bjorn Andersson and Najib El-Sayed met several times to discuss a common gene nomenclature and functional gene classification. The resulting proposal can be found on Al Ivens' page, and comments are invited from the scientific community.

A visit to the sequencing facilities at the Sanger Center was organized for all participants.

A hands-on BioInformatics session, centered around large scale sequencing projects and annotation, was organized by Al Ivens, Mark Blaxter and Dan Lawson.

Joint session on functional genomics.

During the joint session, presentations were given on functional genomics approaches using micro-array techniques, optical mapping, antisense strategies, systematic knock-out of genes, chromosome fragmentation, mutagenesis and use of transposons, transformation vectors and genetic elements, proteomics, combinatorial library screening and DNA vaccination experiments. Needs for further developments common to all parasite genome projects were discussed.

SUMMARY OF COMMUNICATIONS AND TECHNICAL PROGRESS FOR THE TRYPANOSOMA CRUZI NETWORK.

EST sequencing.

Anh-Nhi Tran opened the session with a report on the EST sequencing effort in Uppsala. A total of 23,000 clones from the epimastigote normalized library had been ordered, and an analysis of 5009 sequences, using Phrap, revealed 778 clusters (containing 52% of the sequences) and 2382 singletons (48%). Sequence redundancy was thus around 50%. About 3000 distinct transcripts were identified (amongst which 280 full-length genes), of which 65% had no match in the database. Examples were given of transcripts (e.g. succinyl-CoA ligase) showing polymorphism in the splice acceptor and polyadenylation sites.
This information was complemented by Carlos Frasch, Antonio Gonzalez, Alberto Delgado and Wim Degrave with EST sequences obtained at the University of San Martin, the Instituto de Parasitologia e Biomedicina and the Fundacao Oswaldo Cruz, respectively, and the following conclusions were drawn :
  1. A total of 6900 ESTs have been deposited by the different labs, and about 2000 sequences remain to be deposited; deposition will be completed by the end of May. This number of sequences is estimated to represent at least 50% of the total number of genes expressed in the epimastigote stage and at least 30% of the total number of single genes in T. cruzi. As the redundancy is now considerable, it was decided that no further effort would be invested in the epimastigote library.
  2. A trypomastigote and an amastigote library will be made in the laboratory of Antonio Gonzalez, and 200 clones of each will be sequenced to estimate the usefulness and quality of the libraries.
  3. The laboratories that have been sequencing ESTs will transfer their raw data chromatograms to Martti Tammi (Univ. of Uppsala) to be reanalyzed, using Phrap, in a cluster analysis and a final assembly of all ESTs.
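
The clustering figures reported from Uppsala can be cross-checked with a few lines of arithmetic. The following Python sketch is illustrative only: it assumes that every one of the 5009 analyzed sequences is either a singleton or a member of exactly one cluster, and the function and variable names are hypothetical, not part of any tool used by the network.

```python
# Figures quoted in the report for the Uppsala EST analysis.
TOTAL_SEQUENCES = 5009
CLUSTERS = 778
SINGLETONS = 2382


def est_cluster_summary(total, n_clusters, n_singletons):
    """Summarize a clustering run.

    Assumption (not stated in the report): each sequence is either a
    singleton or belongs to exactly one cluster.
    """
    clustered = total - n_singletons
    return {
        "clustered_sequences": clustered,
        "clustered_fraction": clustered / total,
        "singleton_fraction": n_singletons / total,
        # Each cluster and each singleton counts as one putative transcript.
        "distinct_groups": n_clusters + n_singletons,
        "mean_cluster_size": clustered / n_clusters,
    }


summary = est_cluster_summary(TOTAL_SEQUENCES, CLUSTERS, SINGLETONS)
for name, value in summary.items():
    print(f"{name}: {value:.3f}" if isinstance(value, float) else f"{name}: {value}")
```

Under that assumption the clustered and singleton fractions come out close to the 52% and 48% quoted, and the number of clusters plus singletons is in line with the "about 3000 distinct transcripts" mentioned above.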

Random Genome Sampling Sequencing (GSS)

Carlos Frasch reported on the single pass sequencing of 3500 random genomic clones, totalling 1.5 Mbp. About 10,000 such sequences are expected per year over two years, totalling about 20% of the genome. Of an initial sample of 2229 sequences, 33% had significant matches to the database; of those, 75% were with T. cruzi sequences and 25% with other organisms. Remarkably, 4.9% of the sequences matched the 195 bp repeat sequence (estimated copy number 5000), 4.2% the DGF-1 hypothetical protein (>3000 amino acids; estimated 70 copies), 1.5% the sialidase superfamily (estimated 250 copies), 1.3% mucins (estimated 250 copies), 1.3% GP63 (estimated 250 copies) and 0.2% cruzipain (estimated 100 copies).
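
The GSS throughput figures above imply an average read length and, indirectly, a genome size. The Python sketch below works through that arithmetic. It rests on assumptions not stated in the report: that future reads will have the same average length as the first 3500, and that "20% of the genome" refers to raw bases sequenced without correcting for overlap between reads. The resulting genome size is therefore only a rough back-calculation, not a figure given at the meeting.

```python
# Figures quoted in the report.
CLONES_SEQUENCED = 3500
BASES_SEQUENCED = 1.5e6        # 1.5 Mbp
READS_PER_YEAR = 10_000
YEARS = 2
TARGET_GENOME_FRACTION = 0.20  # "about 20% of the genome"


def mean_read_length(bases, reads):
    """Average number of bases per single pass read."""
    return bases / reads


def implied_genome_size(reads, read_length_bp, genome_fraction):
    """Genome size for which `reads` reads make up `genome_fraction`.

    Assumption: the fraction counts raw bases, ignoring read overlap.
    """
    return reads * read_length_bp / genome_fraction


read_bp = mean_read_length(BASES_SEQUENCED, CLONES_SEQUENCED)
planned_reads = READS_PER_YEAR * YEARS
genome_bp = implied_genome_size(planned_reads, read_bp, TARGET_GENOME_FRACTION)

print(f"mean read length: {read_bp:.0f} bp")
print(f"planned raw sequence: {planned_reads * read_bp / 1e6:.1f} Mbp")
print(f"implied genome size: {genome_bp / 1e6:.0f} Mbp")
```

Changing `TARGET_GENOME_FRACTION` or the read length shows how sensitive the back-calculated genome size is to these inputs.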

Genomic sequencing.

Bjorn Andersson reviewed the genomic sequencing effort in Uppsala. Chromosome 3 (and a considerable part of its homologue) is nearly finished, and chromosome 4 has been started. A transcriptional strand switch region was identified in chromosome 3, but short polycistronic transcription units were also identified inwards from either telomere. Surprisingly, apart from some genes missing between the two homologous chromosomes, one difference in every 50 bp (about 2%) was observed in the region analyzed in both homologues. It is not known at present whether this observation can be extended to other regions of the genome.

A new BAC library was prepared by Dr. Denis Le Paslier (CEPH, France), containing about 2000 clones with an average insert size of 100 kb. The library has been gridded in Andersson's lab (Uppsala) and is now being used for initial BAC end sequencing.
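
How deeply a library of this size covers the genome depends on the genome size, which this report does not state. The Python sketch below therefore treats genome size as a free parameter and uses the standard Clarke-Carbon estimate, P = 1 - (1 - I/G)^N, for the probability that a given locus is present in at least one of N clones with insert size I. The genome sizes in the loop are hypothetical placeholders chosen for illustration, not values from the meeting.

```python
# Library figures quoted in the report.
N_CLONES = 2000
INSERT_BP = 100_000  # average insert size, 100 kb


def library_coverage(n_clones, insert_bp, genome_bp):
    """Return (fold coverage, probability a given locus is in the library).

    Uses the Clarke-Carbon estimate and assumes clones are drawn
    uniformly at random from the genome.
    """
    fold = n_clones * insert_bp / genome_bp
    p_locus = 1 - (1 - insert_bp / genome_bp) ** n_clones
    return fold, p_locus


# Hypothetical genome sizes, for illustration only.
for genome_mbp in (40, 60, 80):
    fold, p_locus = library_coverage(N_CLONES, INSERT_BP, genome_mbp * 1e6)
    print(f"{genome_mbp} Mbp genome: {fold:.1f}x coverage, "
          f"P(locus present) = {p_locus:.3f}")
```

The uniform-sampling assumption is optimistic for a genome rich in repeats, so these values are best read as upper bounds.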

A joint project application by the University of Washington, TIGR and the University of Uppsala has been submitted to NIH for funding, foreseeing the end sequencing of 24,000 BACs, which, together with the EST and GSS sequencing, would cover about 60% of the genome. In a second phase, 15-20 Mb of sequence will be obtained by full BAC sequencing. A final decision on funding is expected by September 1999. The group decided to write a letter of support from the network.

Physical mapping.

The current state of physical mapping was reviewed.
A partial map of chromosome 1 and full maps of chromosomes 3 and 4 are available from Andersson's group.

Jose Franco da Silveira (EPM, Brazil) constructed a physical map of about 800 kb from chromosome band XVI and its homologue. Furthermore, about 300 probes, mostly ESTs, have been mapped to chromosomes. These are now also being mapped to YAC clones, in order to assist in the construction of a low density map of the genome. It is expected that the BAC end sequencing initiative, if funded, will be decisive for the construction of a full map of the genome.

Jacqueline Bua reported on her efforts to obtain an overlapping cosmid map of the genome, in collaboration with Joerg Hoheisel in Heidelberg. Initially, cosmid pools were used as probes in a hybridization assay, but due to background signal problems the strategy was changed to the use of cDNA probes. Even then, ESTs containing repetitive sequences had to be subtracted, and the available computer program could not deal with the amount of data, making manual scoring necessary. It is still hoped that a whole genome cosmid map will be achieved.

Functional studies and post-genome activities.

A few post-genomics projects are currently ongoing in the T. cruzi initiative, such as the search for virulence factors in T. cruzi using DNA array techniques (Rick Tarleton) and the further development of genetic tools for T. cruzi.
John Kelly (London School, UK) reported on the latter, reviewing current gene deletion, expression and shuttle vectors and the use of dominant-negative approaches. Further efforts are being invested in the development of negative selection systems and inducible expression systems. Of particular new interest is the chromosome knock-out system now being tested, which uses a chromosome fragmentation vector containing T. brucei telomere sequences and is expected to replace chromosomal sequences between the cloned target and the telomere end.

The following group discussion stressed the need for new projects in the areas of proteomics, micro-arraying with either GSS or EST sequences, more refined genetic tools and developments in the field of Bioinformatics.

Data analysis, database construction and distribution.

Martti Tammi (Uppsala) reported on the development and adaptation of software for large scale sequencing analysis, including a sequence assembly and clipping program, an adaptation of Blast for parallel computing and a genefinder program.

A joint initiative for the analysis of all T. cruzi ESTs (in Uppsala, using Phrap) was mentioned above.

Wim Degrave (Fiocruz) communicated the release of the new version of TcruziDB, to be included in the joint parasite genome database CD (by Martin Aslett, EBI). He also promised to update the T. cruzi WWW pages at http://www.dbbm.fiocruz.br. Furthermore, a post-doctoral student from Fiocruz will join Al Ivens and Dan Lawson at the Sanger Center for annotation of T. cruzi sequences and maintenance and integration of the databases.