When it comes to public access, the tree of life has holes.
A new study co-authored by University of Florida researchers shows about 70 percent of published genetic sequence comparisons are not publicly accessible, leaving researchers worldwide unable to get to critical data they may need to tackle a host of problems ranging from climate change to disease control.
Scientists are using the genetic data to construct the largest open-access tree of life as part of the National Science Foundation’s $5.6-million Assembling, Visualizing and Analyzing the Tree of Life project. Understanding organismal relationships is increasingly valuable for tracking the origin and spread of emerging diseases, creating agricultural and pharmaceutical products, studying climate change, controlling invasive species and establishing plans for conservation and ecosystem restoration.
The study appearing today in PLoS Biology describes a significant challenge for the project, which is expected to produce an initial draft tree by the end of the year. It highlights the need for developing more effective methods for storing data for long-term use and urges journals to adopt more stringent data-sharing policies.
“I think what we need is a major change in our mindset about just how important it is to deposit your data – this has to be a standard part of what we do,” said co-author Doug Soltis, a distinguished professor at the Florida Museum of Natural History on the UF campus and in UF’s biology department. “Because if it’s not there, it’s lost forever. These are really, really important for long-term use, as we’re seeing now in our efforts to build a tree.”
Estimates of the amount of missing data were based on 7,539 peer-reviewed studies about animals, fungi, seed plants, bacteria and various microscopic organisms. Soltis said the missing genetic data has required project collaborators to contact hundreds of researchers to request information, or attempt to reproduce the sequence alignments and analyses, which is extremely labor intensive.
“There are ambiguities with the alignments, you have to make certain judgment calls, and so an alignment that I do is not going to be the same as an alignment that somebody else does,” said lead author Bryan Drew, a postdoctoral researcher in UF’s biology department. “It’s hard to assess a publication’s validity in a lot of cases if you don’t have access to the alignments. To me, that’s the biggest problem with all of this.”
Challenges include complicated mechanisms for uploading data and inconsistencies between journals – some require or strongly recommend data be stored in an online database and others do not, Drew said. The most widely used, publicly accessible databases include GenBank, TreeBASE and Dryad. Most journals require DNA sequences be deposited in GenBank, but comparatively few require the sequence alignments to be publicly archived. When study co-authors emailed researchers to obtain missing information, a majority did not respond, and the co-authors were rarely successful in retrieving the data.
“A lot of the authors I contacted said their data was in TreeBASE, but they were unaware of the next step needed after acceptance by the journal – the researchers didn’t know they had to go back into TreeBASE and actually make the data available to the public,” Drew said.
Elizabeth Kellogg, a professor in the department of biology at the University of Missouri-St. Louis who was not involved with the study, said she is not surprised about the large amount of missing information.