Prior to the age of digitization, the norm in the lab was to scribble information in a notebook before storing said book along with hundreds or even thousands of others in stacks of boxes. This traditional method of recording had obvious issues. Even if the correct notebook could be located among a sea of boxes, there was no guarantee that the notes would be legible. Plus, there was no practical way to compile similar or related data from multiple books.
Enter electronic laboratory notebooks (ELNs). These make the lives of lab personnel considerably easier by providing a tool to record, store, and analyze vast amounts of data. While ELNs have obvious advantages, they don’t offer the whole solution. As Gabrielle Whittick, project leader and consultant at the Pistoia Alliance, explains: “In theory, ELNs are easy to search and archive, and users can link samples, experiments, and results. In reality, this isn’t always the case.”
Whittick goes on to say that because of how individual researchers work, the range of nomenclature used, and the variance of structured and unstructured data, search and retrieval doesn’t always deliver the most accurate results.
But there are solutions to these problems underway. Here, we examine the pros and cons of ELNs more closely and reveal how semantic enrichment is helping bridge the gap between a slew of disorganized information and valuable, usable data.
ELNs and their advantages and drawbacks
ELNs have revolutionized the way in which laboratories operate. They allow users to input all data associated with their work, including material information, experiment equipment and conditions, and, of course, results. As Whittick notes, “ELNs are vital to how researchers work today, as a digital solution to record and store data safely and securely.” They are also increasingly useful as collaborative tools, enabling researchers to share knowledge across organizations and with partners.
Whittick reveals that the early focus of ELNs was to improve data capture by facilitating the transition from paper-based notes to digital inputs. Even with this component, there have been some issues. “Bespoke ELNs tailored to lab workflows are most useful, but ‘out of the box’ ELNs may not fit how a researcher works, which limits the benefits,” says Whittick. She also notes that if an ELN is not platform-agnostic, a researcher needs to be based in a lab to use it, and can’t utilize it from home or on the move.
To overcome these issues and accommodate the changing way in which personnel work, remote and mobile access to ELNs is necessary. Indeed, Whittick notes that digital-native researchers entering the lab in the early days of their careers expect digital solutions to be accessible.
While most of these challenges are readily solved, recording, storing, and accessing data is only part of the solution. There is also the issue of the usability of the data being accessed. With vast amounts of data input into ELNs, there can be challenges in compiling and sorting information such that researchers can easily locate and retrieve the data points they require. “Some captured experimental data are therefore locked in ELNs, and rendered unusable and unsearchable. This results in duplicated experiments and time spent tracking down and wrangling data,” explains Whittick.
Another problem arises with non-compatible ELNs. For example, partner organizations may use different ELN systems, which can actually end up creating more work for both parties. A large potential benefit of ELNs is the ability to collaborate, but this is stifled by issues of inefficient data extraction and system incompatibility.
How semantic enrichment of ELN data can help
The Pistoia Alliance is currently working on a large-scale initiative that is set to overcome many of the challenges faced by ELN users, dubbed the Semantic Enrichment of ELN Data (SEED) project. Whittick reveals that semantic enrichment of data includes enriching free text in ELNs with metadata for every relevant term from agreed ontologies. “It also uses dedicated ontologies for improved data management, incorporating additional data like attributes, mappings, and annotations,” she explains. “This creates relationships between ontology classes to help to describe and define them.”
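To make the idea concrete, here is a minimal sketch of dictionary-based annotation of free text, written in Python. It is an illustration only and not the SEED project's actual pipeline: the mini-ontology, the `EX:` concept IDs, the labels, and the `annotate` function are all hypothetical, and a real system would draw its terms from agreed ontologies such as BAO.

```python
import re

# Hypothetical mini-ontology: synonym -> (concept ID, preferred label).
# The EX: IDs are placeholders for illustration, not real BAO identifiers.
ONTOLOGY = {
    "microsomal stability": ("EX:0001", "microsomal stability assay"),
    "hlm stability": ("EX:0001", "microsomal stability assay"),
    "caco-2": ("EX:0002", "Caco-2 permeability assay"),
    "herg": ("EX:0003", "hERG inhibition assay"),
}


def annotate(text):
    """Return one annotation per ontology synonym found in the free text."""
    annotations = []
    for synonym, (concept_id, label) in ONTOLOGY.items():
        # Match the synonym only as a whole term, ignoring case.
        pattern = r"(?<!\w)" + re.escape(synonym) + r"(?!\w)"
        for match in re.finditer(pattern, text, flags=re.IGNORECASE):
            annotations.append({
                "start": match.start(),
                "end": match.end(),
                "matched": match.group(0),
                "concept_id": concept_id,
                "label": label,
            })
    # Report annotations in the order they appear in the text.
    return sorted(annotations, key=lambda a: a["start"])


entry = "Ran HLM stability on cmpd 12; hERG follow-up scheduled."
for annotation in annotate(entry):
    print(annotation)
```

In this sketch the researcher's shorthand "HLM stability" and the formal term "microsomal stability" resolve to the same concept ID, which is the kind of metadata that makes differently worded entries comparable.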
The alliance brings together more than a dozen organizations to contribute to the project, spanning large pharma companies, technology and information providers, and academia. These include AstraZeneca, Bayer, Biogen, Bristol Myers Squibb, CDD, Elsevier, GSK, Linguamatics, Merck, Pfizer, Sanofi, SciBite, University of Southampton, and Takeda. The first phase of the project involved the development of new standard assay ontologies for ADME (absorption, distribution, metabolism, and excretion), PD (pharmacodynamics), and drug safety, because there was a gap in existing ontologies. “These have now been added to BioAssay Ontology (BAO) and are freely available,” Whittick notes. “As a cross-pharma project team, we built new standards and added them to the key ontology BAO, and then used this in the semantic enrichment process.”
The next phase of the SEED project is underway and aims to continue to make ELN data more searchable and usable. With metadata assigned to each relevant term, data can become readily accessible for future analysis. The aim is to develop a set of standards for ELN data structure across the pharma and life science industries. As part of this, the project advocates alignment with the FAIR principles (findability, accessibility, interoperability, and reusability), as published in Scientific Data in 2016.
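The difference that term-level metadata makes to retrieval can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not a description of any real ELN: the entry IDs, entry text, `EX:` concept IDs, and function names are all hypothetical, and the `concepts` sets stand in for the output of an enrichment step.

```python
from collections import defaultdict

# Hypothetical ELN entries. The "concepts" sets stand in for metadata added by
# semantic enrichment; the EX: IDs are placeholders, not real ontology IDs.
ENTRIES = {
    "ELN-001": {"text": "Microsomal stability run for compound 12.",
                "concepts": {"EX:0001"}},
    "ELN-002": {"text": "HLM stab. repeated at 1 uM.",
                "concepts": {"EX:0001"}},
    "ELN-003": {"text": "Caco-2 permeability, apical to basolateral.",
                "concepts": {"EX:0002"}},
}


def keyword_search(entries, phrase):
    """Plain text search: finds only entries that contain the exact phrase."""
    phrase = phrase.lower()
    return sorted(eid for eid, e in entries.items() if phrase in e["text"].lower())


def build_concept_index(entries):
    """Map each concept ID to the set of entry IDs annotated with it."""
    index = defaultdict(set)
    for entry_id, entry in entries.items():
        for concept_id in entry["concepts"]:
            index[concept_id].add(entry_id)
    return index


def concept_search(index, concept_id):
    """Metadata search: finds every entry tagged with the concept."""
    return sorted(index.get(concept_id, set()))


index = build_concept_index(ENTRIES)
print(keyword_search(ENTRIES, "microsomal stability"))
print(concept_search(index, "EX:0001"))
```

Here the keyword search returns only ELN-001, because ELN-002 records the same assay in shorthand, while the concept search returns both entries. That is the sense in which metadata makes data locked behind inconsistent nomenclature findable again.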
ELNs are incredibly useful tools in today’s laboratories, but there are barriers to utilizing them to their full potential. Semantic enrichment is paving the way for users to be able to more efficiently extract data and enhance collaboration opportunities. As Whittick puts it: “In short, semantic enrichment unlocks the value of scientific data currently ‘trapped’ in ELNs.”