Surfing the Research Data Wave

As a growing number of researchers and rising research expenditure worldwide generate more and more data, our scholarly practice of communicating scientific results can hardly keep up with it. Even managing your own data manually is time-consuming and error-prone, and accessing and re-analyzing data from other research groups is almost impossible. A lack of standards, incomplete metadata, and missing original data make it very difficult to reproduce published results. More and more researchers feel like they are drowning in a tsunami of data.

This also applies to studies on the catalytic activity, selectivity, and stability of enzymes and enzymatic networks, a field of research that is equally important for industrial biotechnology and biomedicine. Matters are further complicated in this area because data describing enzymatic experiments is particularly complex: an enzymatic reaction depends on many factors, such as the protein sequence of the enzyme, the recombinant host organism, the reaction conditions, and non-enzymatic secondary reactions. Other effects, such as inactivation or inhibition of the enzyme or evaporation of the medium, also affect the results.

The new, standardized data exchange format "EnzymeML," presented by 23 authors from 14 different research institutions in the scientific journal Nature Methods, gives hope in this respect. EnzymeML can completely record the results of an enzymatic experiment, from the reaction conditions to the measured data, as well as the kinetic model used to analyze the experimental data and the estimated kinetic parameters. The format thus provides a seamless communication channel between experimental platforms, electronic lab notebooks, enzyme kinetics modeling tools, publication platforms, and enzymatic reaction databases. "We demonstrate the feasibility and usefulness of the EnzymeML toolbox using six scenarios where data and metadata from various enzymatic reactions is collected, analyzed, and uploaded to public databases for future use," explains first author Simone Lauterbach.

Because EnzymeML documents are structured and standardized, the experimental results they encode are interoperable and can be reused by other groups. And because an EnzymeML document is machine-readable, it can be used in an automated workflow to store, visualize, and analyze data, as well as to reanalyze previously published data, with no restrictions on the size of each data set or the number of experiments.

"The digitalization of biocatalysis increases the efficiency of data management, visualization, and analysis," emphasizes Professor Jürgen Pleiss, corresponding author and project coordinator. Furthermore, digitalization improves the reproducibility of experiments and data analyses, thus promoting trust in scientific results. "The EnzymeML toolbox makes best use of rapidly growing enzymatic data and is a useful tool that allows researchers to surf the research data wave."

- This press release was provided by the University of Stuttgart