REPORT ON
THE MEETING OF THE TRYPANOSOMA CRUZI GENOME INITIATIVE
Hinxton, England. April 26-28, 1999
Sponsored by: WHO - TDR.
Organizers: Wim Degrave and Antonio Gonzalez.
Report: Wim M. Degrave, secretary of the T. cruzi genome project (WHO/TDR)
PARTICIPANTS: Drs.
Alberto Carlos Frasch (coordinator; Buenos Aires, Argentina)
Bjorn Andersson (Uppsala, Sweden)
John Kelly (London, UK)
Jacqueline Bua (Buenos Aires, Argentina)
Antonio Gonzalez (Granada, Spain)
Alberto Delgado (Granada, Spain)
Wim Degrave (Rio de Janeiro, Brazil)
Etel Gimba (Rio de Janeiro, Brazil)
Martti Tammi (Uppsala, Sweden)
Anh-Nhi Tran (Uppsala, Sweden)
Najib El-Sayed (TIGR, USA)
Boris Dobrokhotov (WHO/TDR)
Unable to attend: Jose Franco da Silveira (Sao Paulo, Brazil), Bianca Zingales
(Sao Paulo, Brazil), Jose Luis Ramirez (Caracas, Venezuela), Denis Le Paslier
(Paris, France), Rick Tarleton (Athens, USA), John Swindle (Seattle, USA).
The T. cruzi Genome Planning Meeting was held together with the Leishmania
Genome Meeting. Several of the sessions were common to the two networks,
alternating with parasite-specific sessions. Most, if not all, participants
felt that this format was highly beneficial and effective.
Presentations on progress in the different parasite genome projects.
Dr. Dobrokhotov gave a brief overview of the restructuring of WHO and of TDR,
and stressed the importance of functional genomics in the future workplans for
the parasite genome initiatives.
Dr. Jenny Blackwell gave an overview of current progress in the Leishmania
Genome initiative, while Peter Myler reviewed the genomic sequencing of
chromosome 1 and the partial sequence of chromosome 3 of Leishmania major
Friedlin. Dan Lawson (Sanger Center) reported progress on the sequencing of
chromosome 4.
Dr. Carlos Frasch reviewed the general progress in the T. cruzi genome
initiative, followed by Bjorn Andersson, who reported on the sequence of
chromosome 3 (620 kb).
David Johnston (Schistosome network) presented an overview of the general
progress in the Schistosoma network, and explained the EST clustering done for
that initiative.
Dan Lawson explained the flow of information during large scale sequencing at
the Sanger Center.
Mark Blaxter reviewed progress within the Filaria genome initiative.
Dan Lawson (Sanger Center) and Najib El-Sayed (TIGR) reported on large scale
sequencing within the T. brucei network.
A workgroup consisting of Peter Myler, Al Ivens, Wim Degrave, Dan Lawson, Mark
Blaxter, Bjorn Andersson and Najib El-Sayed met several times to discuss a
common gene nomenclature and functional gene classification. The resulting
proposal can be found on Al Ivens' page, and comments from the scientific
community are invited.
A visit to the sequencing facilities at the Sanger Center was organized for all
participants.
A hands-on bioinformatics session, centered on large scale sequencing projects
and annotation, was organized by Al Ivens, Mark Blaxter and Dan Lawson.
Joint session on functional genomics.
During the joint session, presentations were given on functional genomics
approaches using micro-array techniques, optical mapping, antisense strategies,
systematic knock-out of genes, chromosome fragmentation, mutagenesis and use of
transposons, transformation vectors and genetic elements, proteomics,
combinatorial library screening and DNA vaccination experiments. Needs for
further developments common to all parasite genome projects were discussed.
SUMMARY OF COMMUNICATIONS AND TECHNICAL PROGRESS FOR THE TRYPANOSOMA CRUZI NETWORK.
EST sequencing.
Anh-Nhi Tran opened the session by reporting on the EST sequencing effort in
Uppsala. 23,000 clones from the epimastigote normalized library had been
ordered, and an analysis of 5009 sequences, using Phrap, revealed 778 clusters
(containing 52% of the sequences) and 2382 singletons (48%). Sequence
redundancy was thus around 50%. About 3000 distinct transcripts were identified
(amongst which 280 full-length genes), of which 65% had no match in the
database. Examples were given of transcripts (e.g. succinyl-CoA ligase) with
polymorphism in the splice acceptor and polyadenylation sites.
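The percentages quoted for the Uppsala analysis follow from the cluster counts. The Python sketch below reproduces that arithmetic. It assumes that every cluster and every singleton stands for one distinct transcript, which is an interpretation rather than something stated in the report, and the function name is hypothetical.

```python
# Figures quoted in the report for the Uppsala EST analysis.
TOTAL_SEQUENCES = 5009
CLUSTERS = 778
SINGLETONS = 2382


def clustering_summary(total, clusters, singletons):
    """Summarise a cluster analysis of single-pass EST reads.

    Assumption (not from the report): each cluster and each singleton
    is counted as one distinct transcript.
    """
    clustered = total - singletons      # reads that fell into a multi-read cluster
    distinct = clusters + singletons    # one putative transcript per cluster/singleton
    return {
        "clustered_reads": clustered,
        "clustered_fraction": clustered / total,
        "singleton_fraction": singletons / total,
        "distinct_transcripts": distinct,
        "mean_reads_per_cluster": clustered / clusters,
    }


summary = clustering_summary(TOTAL_SEQUENCES, CLUSTERS, SINGLETONS)
for key, value in summary.items():
    if isinstance(value, float):
        print(f"{key}: {value:.3f}")
    else:
        print(f"{key}: {value}")
```

Under this assumption the counts give 3160 putative transcripts, in line with the "about 3000" quoted, and the clustered fraction (52%) corresponds to the redundancy of around 50%.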
This information was complemented by Carlos Frasch, Antonio Gonzalez, Alberto
Delgado and Wim Degrave with EST sequences obtained at the University of San
Martin, the Instituto de Parasitologia e Biomedicina and the Fundacao Oswaldo
Cruz, respectively, and the following conclusions were drawn:
- A total of 6900 ESTs have been deposited by the different labs, and about
2000 sequences remain to be deposited, which will be completed by the end of
May. This number of sequences is estimated to represent at least 50% of the
total number of genes expressed in the epimastigote stage and at least 30% of
the total number of single genes in T. cruzi. As the redundancy is now
considerable, it was decided that no further effort would be invested in the
epimastigote library.
- A trypomastigote and an amastigote library will be made in the laboratory of
Antonio Gonzalez, and 200 clones of each will be sequenced to estimate the
usefulness and quality of the libraries.
- The laboratories that have been sequencing ESTs will transfer their raw
chromatogram data to Martti Tammi (Univ. of Uppsala) to be reanalyzed, using
Phrap, in a cluster analysis and a final assembly of all ESTs.
Random Genome Sampling Sequencing (GSS)
Carlos Frasch reported on the single pass sequencing of 3500 random genomic
clones, totalling 1.5 Mbp. About 10,000 such sequences per year are expected
over two years, totalling about 20% of the genome. Of an initial sample of 2229
sequences, 33% had significant matches to the database; of those, 75% were with
T. cruzi sequences and 25% with other organisms. Remarkably, 4.9% of the
sequences matched the 195 bp repeat sequence (estimated copy number 5000), 4.2%
the DGF-1 hypothetical protein (>3000 amino acids; estimated 70 copies), 1.5%
the sialidase superfamily (estimated 250 copies), 1.3% mucins (estimated 250
copies), 1.3% GP63 (estimated 250 copies) and 0.2% cruzipain (estimated 100
copies).
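The GSS figures can be cross-checked with a few lines of arithmetic. The Python sketch below uses only the numbers quoted in this section. The helper names are hypothetical, and the projection assumes that future reads have the same mean length as the first 3500 and do not overlap, which the report does not state.

```python
def mean_read_length(total_bp, n_reads):
    """Mean length of a single-pass read, in base pairs."""
    return total_bp / n_reads


def split_hits(n_sequences, hit_fraction, same_species_fraction):
    """Return (all hits, hits to the same species, hits to other organisms)."""
    hits = round(n_sequences * hit_fraction)
    same = round(hits * same_species_fraction)
    return hits, same, hits - same


# 3500 clones totalling 1.5 Mbp.
read_len = mean_read_length(1_500_000, 3500)

# 10,000 reads per year for two years, assuming the same mean read length.
projected_bp = 2 * 10_000 * read_len

# Genome size implied if that yield really is 20% of the genome
# (a derived figure under the assumptions above, not a number from the report).
implied_genome_bp = projected_bp / 0.20

# 2229 sequences, 33% with database matches, 75% of those to T. cruzi.
hits, tcruzi_hits, other_hits = split_hits(2229, 0.33, 0.75)

print(f"mean read length: {read_len:.0f} bp")
print(f"projected yield: {projected_bp / 1e6:.1f} Mbp")
print(f"implied genome size: {implied_genome_bp / 1e6:.0f} Mbp")
print(f"database hits: {hits} ({tcruzi_hits} T. cruzi, {other_hits} other)")
```

The implied genome size is only as reliable as the 20% estimate and the no-overlap assumption, so it should be read as a consistency check rather than a measurement.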
Genomic sequencing.
Bjorn Andersson reviewed the genomic sequencing effort in Uppsala. Chromosome 3
(and a considerable part of its homologue) is nearly finished, and chromosome 4
has been started. A transcriptional strand switch region was identified in
chromosome 3, but short polycistronic transcription units running inwards from
either telomere were also identified. Surprisingly, apart from some genes
missing from one or the other of the two homologous chromosomes, one difference
in every 50 bp was observed in the region analyzed in both homologues. It is at
present not known whether this observation can be extended to other regions of
the genome.
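One difference in every 50 bp corresponds to a divergence of 2% between the homologues. The Python sketch below shows one simple way such a rate could be computed from a pairwise alignment. It is not the method used in Uppsala, the function name is hypothetical, and the two sequences are toy data made up for illustration.

```python
def differences_per_site(seq_a, seq_b):
    """Fraction of aligned positions at which two sequences differ.

    Both sequences must come from the same alignment (equal length).
    Gap ("-") and ambiguous ("N") positions are skipped, which is an
    assumption of this sketch.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    mismatches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in "-N" or b in "-N":
            continue
        compared += 1
        if a != b:
            mismatches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return mismatches / compared


# Toy 50 bp "homologues" differing at a single position (made-up data).
toy_a = "ACGTACGTAC" * 5
toy_b = toy_a[:24] + "T" + toy_a[25:]

rate = differences_per_site(toy_a, toy_b)
print(f"{rate:.2%} divergence, i.e. one difference per {1 / rate:.0f} bp")
```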
A new BAC library was prepared by Dr. Denis Le Paslier (CEPH, France),
containing about 2000 clones with an average insert size of 100 kb. The library
has been gridded in Andersson's lab (Uppsala) and is now being used for initial
BAC end sequencing.
A joint project application by the University of Washington, TIGR and the
University of Uppsala has been submitted to NIH for funding. It foresees the
end sequencing of 24,000 BACs, which, together with the EST and GSS sequencing,
would cover about 60% of the genome. In a second phase, 15-20 Mb of sequence
will be obtained by full BAC sequencing. A final decision on funding is
expected by September 1999. The group decided to write a letter of support on
behalf of the network.
Physical mapping.
The current state of physical mapping was reviewed.
A partial map of chromosome 1 and full maps of chromosomes 3 and 4 are
available from Andersson's group.
Jose Franco da Silveira (EPM, Brazil) constructed a physical map of about
800 kb from chromosome band XVI and its homologue. Furthermore, about 300
probes, mostly ESTs, have been mapped to chromosomes. These are now also being
mapped to YAC clones, in order to assist in the construction of a low density
map of the genome. It is expected that the BAC end sequencing initiative, if
funded, will be decisive for the construction of a full map of the genome.
Jacqueline Bua reported on her efforts to obtain an overlapping cosmid map of
the genome, in collaboration with Joerg Hoheisel in Heidelberg. Initially,
cosmid pools were used as probes in a hybridization assay, but because of
background signal problems the strategy was changed to the use of cDNA probes.
Even then, ESTs containing repetitive sequences had to be subtracted, and the
available computer program could not deal with the amount of data, making
manual scoring necessary. It is still hoped that a whole genome cosmid map will
be achieved.
Functional studies and post-genome activities.
A few post-genomics projects are currently ongoing in the T. cruzi initiative,
such as the search for virulence factors in T. cruzi using DNA array techniques
(Rick Tarleton) and the further development of genetic tools for T. cruzi.
John Kelly (London School, UK) reported on the latter, reviewing current gene
deletion, expression and shuttle vectors and the use of dominant-negative
approaches. Further efforts are being invested in the development of negative
selection systems and inducible expression systems. Of particular new interest
is the chromosome knock-out system now being tested, which uses a chromosome
fragmentation vector containing T. brucei telomere sequences and is expected to
replace the chromosomal sequences between the cloned target and the telomere
end.
The group discussion that followed stressed the need for new projects in the
areas of proteomics, micro-arraying with either GSS or EST sequences, more
refined genetic tools and developments in the field of bioinformatics.
Data analysis, database construction and distribution.
Martti Tammi (Uppsala) reported on the development and adaptation of software
for large scale sequencing analysis, including a sequence assembly and clipping
program, an adaptation of Blast for parallel computing and a genefinder program.
A joint initiative for the analysis of all T. cruzi ESTs (in Uppsala, using
Phrap) was mentioned above.
Wim Degrave (Fiocruz) announced the release of the new version of TcruziDB,
to be included in the joint parasite genome database CD (by Martin Aslett, EBI).
He also promised to update the T. cruzi WWW pages at http://www.dbbm.fiocruz.br.
Furthermore, a post-doctoral student from Fiocruz will join Al Ivens and Dan
Lawson at the Sanger Center for annotation of T. cruzi sequences and
maintenance and integration of the databases.