PARTICIPANTS : Drs. Bianca Zingales, Alberto Carlos Frasch (coordinator for the WHO/TDR T. cruzi genome project), Turan Urmenyi (representing Edson Rondinelli), Bjorn Andersson (representing Ulf Pettersson), Jose Luiz Ramirez, John Kelly, John Swindle, Andres Ruiz, Antonio Gonzalez, Jens Hanke (representing Joerg Hoheisel), Mariano Levin, Carlos Morel, Wim Degrave, Daniel Sanchez, Boris Dobrokhotov
Unable to attend : Jose Franco da Silveira, Denis Le Paslier.
Mariano Levin reported on the newly made BAC libraries TCI and TCII, constructed at CEPH, Paris (the result of the efforts of Drs. Gloria Franco, Tulio Santos and Denis Le Paslier), a copy of which (about 4500 clones) recently arrived in Buenos Aires (INGEBI). They represent 2x genome coverage and were made by Hind III partial digestion and size selection in the range of 200-300 kb. However, the mean insert size was determined to be only around 30 kb and 50 kb, respectively. Only 4% of the BACs seem to have undergone recombination. Considerable effort was invested in trying to improve the mean insert size, but without success. Similar results were obtained with a Schistosoma mansoni BAC library made at CEPH at the same time. It was discussed whether T. cruzi might have AT-rich intergenic regions and "fragile" DNA that hinder large-insert cloning. John Swindle reported similar problems with his BAC library. It was recalled that the T. cruzi YAC library also had a smaller insert size than expected, and that cloning had proceeded with an efficiency of less than 10% of that of human YAC cloning. The BAC I and II libraries made in 1996 do not grow very well at present. It was also suggested that the integrity of the new BAC libraries could be checked against the sequence/map of chromosome 3, now nearing completion.
The YAC library (made in 1996 at CEPH) has been gridded on high-density filters (at Research Genetics, USA) and the first filters are now available. Twenty copies will be made. The YAC library from John Swindle has not yet been included. Levin recalled that the YAC libraries were made by Eco RI partial digestion and size selection in the range of 500-1000 kb, and thus probably represent only the larger T. cruzi chromosomes.
Results were also shown from Jose Franco da Silveira, who mapped 9 YACs to chromosome XVI (and its homologue XVII) and determined the positions of 15 marker sequences as well as restriction sites for rare-cutting enzymes. A contig of 880 kb was thus characterized in detail.
Dr. Jose Luiz Ramirez reported on the cloning and characterization of T. cruzi telomeric sequences. He recalled that, during the workshop held in Granada, Spain (1996), 10 cosmids were selected which hybridized to telomeric sequences (using T. brucei telomeres as probes). However, only 5 proved to contain telomeres. A specific telomere cloning strategy was presented, using a special adaptor on the telomere side and digestion with Pst I on the internal side. An overview was given of the telomere organization, along with preliminary data on specific genes that might be clustered in subtelomeric locations. Hybridizations with telomeric probes to blots of two-dimensional pulsed-field gels (first direction, followed by Not I digestion and a second pulsed-field run at 90 degrees) showed a particularly strong signal for chromosome IX, whereas chromosome I yielded only a single spot. It was observed that a careful analysis of these results might contribute to determining the exact number of chromosomal bands in T. cruzi.
Jens Hanke reported on the mapping efforts using the Hoheisel cosmid library, which represents about 14 times the genome of T. cruzi CL-Brener. Library construction from chromosomal bands excised from pulsed-field gels turned out not to be feasible. Instead, such bands, after labeling, were used to screen the full cosmid library, and 300-400 cosmids specific to chromosomes 1-4 were identified. Mapping was performed by hybridization of anonymous cosmids to filter arrays of the sublibraries, allowing for contig construction. Gaps could be closed using genetic markers and by hybridization to a BAC library (J. Swindle). In this way, chromosomes 3 and 4 were fully mapped.
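As a rough guide to what the coverage figures quoted above imply, the fraction of a genome expected to be present in a library of c-fold coverage is often approximated by 1 - exp(-c). This is a minimal sketch, not part of the reported work: it assumes clones are drawn uniformly at random from the genome, which real libraries with cloning bias do not satisfy, and the function name is purely illustrative.

```python
import math


def expected_representation(coverage: float) -> float:
    """Poisson estimate of the genome fraction present in a clone library.

    Assumes clones sample the genome uniformly at random (an idealization).
    """
    if coverage < 0:
        raise ValueError("coverage must be non-negative")
    return 1.0 - math.exp(-coverage)


# Coverage values taken from the report: 14x cosmid library, 2x BAC libraries.
for name, cov in [("cosmid library", 14.0), ("BAC libraries", 2.0)]:
    frac = expected_representation(cov)
    print(f"{name}: {cov:g}x coverage, expected fraction present {frac:.4f}")
```

Under this idealized model a 2x library would be expected to miss a noticeable part of the genome, while a 14x library would be close to complete. Regions that are hard to clone would lower both figures.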
Andres Ruiz reported on the whole-genome mapping effort using the same cosmid library. It was found that the use of cosmids as probes against the cosmid library resulted in very high background. Thus, pools of cDNAs are now being used, and it was calculated that 1200 hybridizations would be necessary (288 having been done so far).
Daniel Sanchez reported on the EST sequencing in Buenos Aires, using the normalized epimastigote cDNA library. Templates were sequenced from the 5' end, and an average useful read length of 420 bp was obtained, with an estimated error of 2-4%. Sequence homologies were analyzed using BlastX and BlastN, with a P value threshold for statistical significance of lower than 10E-5. A total of 1949 ESTs were obtained and deposited in dbEST, of which 33% showed similarity with sequences in the database. Of the ESTs obtained, 72% were unique sequences and 28% were obtained more than once in this sample. An initial functional classification of the putative encoded proteins was started. It was also observed that the mucin gene family consists of more than 500 genes. Details of this EST sequencing effort can be found at http://www.iib.unsam.edu.ar/genomelab/tcruzi/5ests.html.
Wim Degrave reported on the sequencing and deposit of 800 ESTs at Fiocruz, also from the normalized library. Similarity with database sequences and redundancy were similar to those mentioned above.
Bjorn Andersson reported on the sequencing of about 3800 ESTs, mostly (3100) from the 3' end. The sequences have not been deposited yet, but will be shortly. The present sequencing rate is about 1400 ESTs/month. Details on the sequencing effort at Uppsala University can be found at http://www.medgen.uu.se/public/groups/UGSC/
Antonio Gonzalez concluded the EST reports with the sequencing of 390 ESTs (from the 5' end), not yet deposited, and observed that the normalized EST library is not completely directional, as some clones are inverted.
During the following discussion, it was observed that the total number of ESTs sequenced by the group had now reached close to 7000, with still more than 50% to be deposited. By the end of 1998, more than 10,000 sequences should be available, and an urgent effort should be made to determine the current redundancy of the normalized epimastigote library, in order to have a clear picture of the efficiency of further sequencing.
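The totals quoted in the discussion can be cross-checked against the individual reports. The short tally below is only a bookkeeping sketch: the counts and deposit status are those stated above (the Uppsala figure is "about 3800"), while the group labels are shorthand introduced here for illustration.

```python
# (label, ESTs sequenced, deposited in dbEST?) as stated in the reports above.
reports = [
    ("Buenos Aires (Sanchez)", 1949, True),
    ("Fiocruz (Degrave)", 800, True),
    ("Uppsala (Andersson)", 3800, False),  # approximate figure, not yet deposited
    ("Gonzalez group", 390, False),        # not yet deposited
]

total = sum(count for _, count, _ in reports)
pending = sum(count for _, count, deposited in reports if not deposited)
pending_fraction = pending / total

print(f"total ESTs: {total}")
print(f"awaiting deposit: {pending} ({pending_fraction:.0%})")
```

The sum comes to just under 7000, and the undeposited share to roughly 60%, consistent with the figures given in the discussion.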
Turan Urmenyi commented on the difficulties concerning the construction of cDNA libraries of metacyclic and bloodstream forms. Problems with the efficiency of metacyclogenesis have been overcome to some extent. Cultures with 20-30% of metacyclic trypomastigotes are being obtained with some regularity and purified metacyclics are being collected. Cell numbers adequate for total RNA extraction will be achieved soon. In addition, tissue culture-derived bloodstream form trypomastigotes are also being collected for total RNA extraction. Since it would be difficult and time-consuming to obtain enough trypomastigotes to isolate poly(A)+ RNA, it was decided to generate the cDNA library from total RNA by RT-PCR using the mini-exon and oligo(dT) as primers.
Bjorn Andersson reported on the Swedish genomic sequencing effort, concentrated on chromosome 3. A minimal set of 15-20 cosmids was chosen to cover the 600 kb chromosome. Mapping was done by hybridization and cosmid walking. The group has a current capacity of 960 reactions/week (equivalent to more than one cosmid), but has in the past experienced limitations in mapping, staff and bioinformatics. Four cosmid sequences have been deposited in GenBank; others are available on the WWW site in Uppsala or by contacting Bjorn Andersson. In total, more than 450 kb of sequence has been completed. The homologous chromosome has 400 kb of extra sequence, but the content of this surplus has not been completely identified. A 93.4 kb region has recently been analyzed in detail and the results submitted for publication. In this region, 151 ORFs greater than 300 bp have been found (using, amongst other programs, GRAIL), and 29 ORFs greater than 700 bp. Several repetitive sequences, among them a 400 bp repeat sequence specific for chromosome 3 and its homologue, were also identified, along with a strand switch region possibly containing regulatory sequences.
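To illustrate what counting ORFs above a length threshold involves, the sketch below scans all six reading frames of a DNA string for ATG-to-stop stretches longer than a given number of base pairs. It is a simplified stand-in and not the method used by the Uppsala group, which relied on GRAIL among other programs: it assumes the standard stop codons, requires an ATG start, and ignores everything a real gene-prediction tool models. All names in it are illustrative.

```python
from typing import Iterator, Tuple

STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code (assumed here)
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_orfs(seq: str, min_len: int = 300) -> Iterator[Tuple[str, int, int]]:
    """Yield (strand, frame, length_bp) for ATG..stop ORFs longer than min_len.

    Length includes the start and stop codons. Only the first ATG after the
    previous stop is used as the start of each ORF.
    """
    seq = seq.upper()
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i
                elif start is not None and codon in STOP_CODONS:
                    length = i + 3 - start
                    if length > min_len:
                        yield strand, frame, length
                    start = None


# Toy sequence: one 306 bp ORF on the forward strand, frame 0.
toy = "ATG" + "GCC" * 100 + "TAA"
print(list(find_orfs(toy, min_len=300)))
print(list(find_orfs(toy, min_len=700)))
```

Raising `min_len` from 300 to 700 mirrors the two thresholds quoted in the report; on real sequence, the drop in ORF count between the two gives a feel for how many short ORFs are likely to be chance occurrences.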
John Kelly discussed currently available shuttle vector systems for T. cruzi transformation, as well as reproducible plating techniques, noting that it takes 21-28 days to grow colonies.
John Swindle discussed the feasibility of post-genome projects in T. cruzi, estimating 1500-2000 essential genes, as in yeast. Several points were raised, stressing that, at this stage, post-genomics should concentrate on important sequences which are either essential, developmentally regulated or unique to the parasite. Systematic disruption of all genes is not feasible, but subsets of genes could be tried, aiming principally at single-copy genes and using techniques such as antisense and ribozymes. An inducible system is not yet available. He announced plans to use functional complementation in yeast in order to identify essential genes in T. cruzi.
John Swindle further announced plans by several American laboratories and the Swedish group to submit a joint project to NIH in order to obtain funds for the sequencing of at least part of the T. cruzi genome. However, he expressed considerable doubts as to whether clone CL-Brener, thus far the chosen prototype strain for the T. cruzi genome initiative, was a good choice, as CL-Brener might have a larger genome size than some other strains. The project would envisage cosmid and BAC end sequencing, using newly constructed libraries with sufficient depth, aiming mostly at gene discovery and mapping through end sequencing, and would then proceed with chromosome shotgun sequencing. All participants gave full support to this new initiative and considered it very relevant to achieving the goals proposed in the T. cruzi genome project.
Carlos Morel argued that the studied strain should belong to lineage 1, and that clear evidence, based on demonstrated facts, should be available before a change of strains/clones could be considered. The choice of a new strain should be rational and evidence-based. Furthermore, good collaboration between the TDR-funded group and the new initiative is essential.
Wim Degrave reported on the new version of the TcruziDB database, which will be included on the CD to be released jointly by the parasite genome initiatives. He also reported on the joint planning with the Leishmania and T. brucei initiatives to promote sequence annotation, database refinement, the organization of training courses in bioinformatics, the functional classification of predicted protein sequences with database similarity, and the drawing of parasite-specific biochemical pathways. The group agreed to contribute one third of the salary of one additional person, based at EBI, UK, dedicated to sequence annotation and analysis, together with the Leishmania and T. brucei initiatives.
The possibility of submission of new projects was discussed, and several fields of need were identified :
1. Bioinformatics : database updates and distribution, analysis of EST redundancy in the normalized epimastigote library, and cooperation with the other trypanosomatid genome projects in this field.
2. Genomic sequencing of single chromosomes, and joining efforts to start a Genome Sequence Survey.
3. Mapping, including BAC end sequencing, construction of YAC contigs, and further minimum tiling mapping of cosmids and marker identification.
4. Post-genomic activities in the field of microchips, microarrays, and development of new tools for functional studies in T. cruzi and systematic gene knock-out.