Vale unlocks rich bioinformatics resource

New work by mining giant Vale on the application of DNA sequencing technologies and bioinformatics is said to be advancing the industry’s understanding of the importance of tools such as DNA barcoding in biodiversity management at a time when financial institutions and other mining stakeholders are increasing scrutiny of the environmental performance marker.

DNA markers previously all but non-existent for Amazon flora

One of the authors of the Brazil-based miner's extensive work in the field this decade, Instituto Tecnológico Vale's (ITV) Guilherme Oliveira, told Mining Journal's Future of Mining channel that laws in the country, as in many others, required comprehensive evaluation of environmental impacts - and impact minimisation - before projects could proceed towards development.

Miners generally have become more proactive and vocal about energy consumption levels and greenhouse gas emissions. Oliveira indicated these tended to be the industry's headline markers.

"Sector reports, such as the Responsible Mining Index, indicate progress," he said.

"But there is still a gap concerning biodiversity management."

Environmental impact studies that attempted to measure potential biodiversity loss, and outline mitigation responses, typically made "no allowance for the loss of species", Oliveira said.

The "first critical question - to determine which species are present" often could not be adequately answered.

"Traditional means of assessing and monitoring biodiversity require expensive and prolonged field expeditions and the identification of specimens by specialised taxonomists. However, there is a recognised lack of taxonomy experts to classify a massive volume of collected samples appropriately," Oliveira said.

The "taxonomist impediment" and conventional analogical approach hindered data auditing, the identification of difficult specimens - involving, in many cases, extended exchanges between scientific collections - and accurate data analysis.

"The global rate of species extinction is significantly higher than that of the description of new species, making imperative the use of next-generation species identification and cataloguing procedures based on molecular and bioinformatics tools.

"Recent developments in the use of molecular tools brought a revolution to environment and species assessment practices.

"The use of high-throughput DNA sequencing methodologies connects biodiversity studies with big data science by coupling data production and concurrent analysis, including the use of artificial intelligence and deep learning tools.

"These approaches will make the combination of DNA sequencing technologies and bioinformatics tools an industry standard [in future], or at least widespread in environmental studies."

Vale established ITV in 2010 and in the past three years has stepped up work on DNA barcoding and molecular approaches to studying biodiversity at operations and conservation sites. And not just in any part of the planet.

"The focus is on the mega-diverse Amazonian forest, specifically the ferruginous savanna rocky outcrops in the National Forest of Carajás [in Pará in northern Brazil] where, among others, Vale's S11D project is located," Oliveira said.

"The [challenges of biodiversity mapping and management] are boosted by operating in a biodiversity-rich and understudied area."

Oliveira said DNA markers were previously all but non-existent for Amazon flora.

"When we started, about 260 species were known in the area," he said.

"By the time we finished there were over 1,100 species, some new ones and some that are endemic.

"We have done extensive field work at ITV [with] many collaborators, [including] the Museu Paraense Emilio Goeldi, a federal research institution, as the main partner.

"All of the plants were deposited in the public herbarium where they are available to any scientist worldwide.

"The entire flora of the ferruginous savanna was studied [and subsequently published in the journal of the Brazilian Botanical Gardens], and DNA barcodes were produced for essentially all 1,000-plus known species, generating an extraordinary resource that will contribute to bridging the knowledge gap about the local species.

"This work pretty much doubled all of the DNA barcoding knowledge of the entire Brazilian flora produced globally and deposited at BOLD Systems [the international Barcode of Life Data System].

"The availability of a referenced DNA barcode library has also been used for the identification of the plants on mine land rehabilitation areas, creating a much clearer picture of the evolution of restoration efforts.

"The reference library is supporting the full deployment of molecular approaches such as metabarcoding and eDNA for environmental monitoring. These methods are currently under validation for the ferruginous savannas, but are also being implemented for the cave fauna and fish that inhabit the many watercourses in the region."

Instituto Tecnológico Vale has sown the seeds of future mining industry biodiversity management and reporting

Oliveira said DNA barcodes, which had been deployed globally to identify species and build public DNA barcode databases such as www.boldsystems.org, offered the broadest and highest-level means of plant and animal genomic analysis and reference.

While full-scale genomics was, "for the most part, extreme for environmental studies", genome-wide approaches had been developed to address large-scale issues.

"Genome-based approaches are essential for the definition of conservation strategies based on the fine resolution of population genetics, unequivocal species distinction and the identification of genetic markers for important traits," he said.

"Currently, environment-based DNA analysis has attracted significant interest because the information regarding the species present at a location can be assessed either by collecting specimens en masse [metabarcoding] or by the DNA track left in the environment [eDNA].

"Metabarcoding produces regular DNA barcodes from a mixture of specimens collected together, rather than per individual. The use of next-generation sequencing technologies and computational analysis resolves the DNA sequences generated.
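
As a rough illustration of how sequences from a mixed sample might be resolved against a reference barcode library, the Python sketch below assigns each read to the closest reference by simple percent identity. It is not ITV's software: the species names, toy sequences, the 0.9 threshold and the function names are all assumptions made for the example, and real pipelines rely on proper sequence alignment and error filtering rather than position-by-position comparison.

```python
from collections import Counter

# Hypothetical reference library: species label -> barcode sequence (toy data).
REFERENCE = {
    "species_A": "ACGTACGTACGTACGTACGT",
    "species_B": "ACGTTCGTACCTACGAACGT",
    "species_C": "TTGCACGGACGTTAGCACGA",
}


def identity(a, b):
    """Fraction of matching positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be pre-aligned to the same length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def assign(read, reference=REFERENCE, threshold=0.9):
    """Return the best-matching species label, or None if below threshold."""
    best_name, best_score = None, 0.0
    for name, seq in reference.items():
        score = identity(read, seq)
        if score > best_score:
            best_name, best_score = name, score
    # The threshold is an illustrative assumption, not a published cut-off.
    return best_name if best_score >= threshold else None


def tally(reads, reference=REFERENCE, threshold=0.9):
    """Count how many reads from a mixed sample map to each species."""
    return Counter(assign(r, reference, threshold) for r in reads)


if __name__ == "__main__":
    mixed_sample = [
        "ACGTACGTACGTACGTACGT",  # exact match to species_A
        "ACGTACGTACGTACGTACGA",  # one mismatch against species_A
        "GGGGGGGGGGGGGGGGGGGG",  # matches nothing in the library
    ]
    print(tally(mixed_sample))
```

Reads that fall below the threshold are counted under `None`, which is where a fuller reference library would be expected to improve identification.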

"eDNA utilises DNA present in the environment. Soil samples can be used to investigate plants and animals and water samples for fish eDNA, for example.

"The objective in this case is not necessarily to detect every species in the area but to establish baseline data that will change according to environmental modifications. Baseline data is quite useful as a site monitoring tool and for the identification of environmental biomarkers. In any case, the existence of a reference library will also significantly improve the identification of a species.
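
One simple way to picture baseline monitoring is to compare the set of taxa detected in a later survey with the set recorded at baseline. The Python sketch below does this with a Jaccard similarity score; the choice of metric, the function names and the taxon labels are assumptions for illustration only, and the article does not say which measures ITV uses.

```python
def jaccard(a, b):
    """Jaccard similarity between two collections of taxon labels."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # two empty surveys are treated as identical
    return len(a & b) / len(a | b)


def compare_to_baseline(baseline, survey):
    """Summarise how a later survey differs from the baseline detections."""
    baseline, survey = set(baseline), set(survey)
    return {
        "lost": sorted(baseline - survey),    # in baseline, not detected now
        "gained": sorted(survey - baseline),  # detected now, not in baseline
        "similarity": jaccard(baseline, survey),
    }


if __name__ == "__main__":
    # Hypothetical taxon labels from two eDNA sampling rounds.
    baseline = ["taxon_1", "taxon_2", "taxon_3", "taxon_4"]
    follow_up = ["taxon_2", "taxon_3", "taxon_4", "taxon_5"]
    print(compare_to_baseline(baseline, follow_up))
```

A falling similarity score over successive surveys would flag a site for closer inspection, without requiring that every species present be detected.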

"[ITV's] DNA barcoding efforts will continue until we have exhausted the ferruginous canga flora.

"We have initiated work on forest species, fish, ants and bats.

"To obtain greater resolution on difficult-to-diagnose species we have been deploying the complete sequencing of the chloroplast genome of plants and of mitochondria of animals. We have also been using genome-wide approaches for maximum resolution.

"For species of special interest we are developing reference genomes, in addition to functional genomics to understand how they respond to different environmental conditions.

"The entire canga area has been mapped also from the perspective of the microbes that are present. This will allow us to use this information to develop environment area-specific signatures for monitoring."

The volume and calibre of data generated and probed with ITV-developed software - "including machine learning approaches" - was dramatically changing how biodiversity data was captured and analysed, Oliveira said.

ITV had just bought two new DNA sequencers, a PacBio and a MinION, to become the only laboratory in South America with all current sequencing technologies available.

"We have also invested in compute power to be able to handle all of the data, including a GPU cluster for artificial intelligence applications," Oliveira said.

"The culture of information sharing adopted by the molecular biology and bioinformatics communities is enormously influential.

"Open data empowers the corporate and scientific communities to make instantaneous species identification, by the comparison of DNA sequence identity, and the use of computational and artificial intelligence tools to promptly derive in-depth data analysis.

"Auditing is made possible, and data reanalysis allowed, as new reference libraries are made available, and DNA databases grow.

"And stakeholders can assess progress made on species conservation using species-specific DNA markers or environmental surrogates [such as] metabarcoding and eDNA.

"Overall this is a transformative opportunity for a fast, financially sound, objective, auditable, and open approach to biodiversity assessment and monitoring, in line with the demands of a data-driven and empowered contemporary society, and the industry 4.0 transformation."