Selecting the Right Informatics Management System
Steve Thomas, an investigator within the Drug Metabolism and Pharmacokinetics department at GSK, talks to contributing editor Tanuja Koppal, PhD, about his experiences implementing a database of metabolic knowledge that helps the company store, share, and search data around the globe. The process involved analyzing internal needs, evaluating several options, and finding the right informatics solution to give GSK scientists access to each other's findings and so prevent error, repetition, and inefficiency.
Q: Can you tell us a little about your department and the work that you do?
A: I am in the Biotransformation and Drug Disposition group at GSK, with about 40 scientists at our location here in Ware, UK. In drug development we are a hub for drug metabolism and pharmacokinetic (DMPK) studies, looking to make sure that a benign drug hasn’t been turned into something toxic as the body tries to get rid of it. We have a mix of chemists, biochemists, and biologists, and many are analytical specialists for the spectral identification of the structure of small molecules. So, many people with whom I work have expertise in mass spectrometry (MS) or nuclear magnetic resonance (NMR) or both.
We work with fairly high-end NMR instruments fitted with cryoprobes that give us exquisite sensitivity, so we can get information from the very small amounts of material that come back from clinical trials. For MS, we have matrix-assisted laser desorption/ionization (MALDI) and time-of-flight (TOF) instruments, as we need the power of these instruments to tease out the materials from the complex biological matrices they come buried in.
Q: Working with different types of instruments and data, what kind of informational challenges do you face?
A: For us to be able to get a coherent picture of what the body does to our drug molecules, we need to be able to bring all the data together into one place, just like humans bringing together pieces of a jigsaw puzzle. We have two techniques that are complementary to each other. We have the sensitivity of MS along with the selectivity of NMR, and we need all that data put together to be able to find the molecular structure. Prior to 2009, that place was an analyst’s head. We had a very talented analyst who had been with the company for decades, and when he retired we realized just how reliant we were on people’s memories. The call then went out to get a database approach to try to replace that dependence on human memory.
Q: What did getting this database involve?
A: We had to carry out a due diligence process to see what was available in the marketplace at that time that fitted our workflow and the success criteria we were looking to achieve. We looked at a number of different vendors and found that ACD/ChemFolder Enterprise and ACD/SpecManager Enterprise from ACD/Labs fitted our workflow best. It was a work in progress, since the two pieces of software managed structural and spectral data in two separate databases. We wanted it to hold integrated structural and spectral data for a complete biotransformation map (the body can produce upwards of 50 metabolites from a single drug). So we wanted to push the software further than it was intended to go. We did a pilot trial and the software held up, allowing us to store the data with a biotransformation map, which included a schematic of the complete metabolic fate of a drug as a top-level executive summary. So the people who were interested in just knowing what happens to the drug in the body could look at the top-level summary, and people who were more interested in the analysis behind those structures could dig deeper into the database.
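The two-level structure described here, a top-level map of the drug's metabolic fate with detailed spectral evidence underneath, can be pictured as a simple tree. The sketch below is purely illustrative; the node and field names are hypothetical and this is not the ACD/Labs data model:

```python
# Hypothetical sketch of a biotransformation map: the top level shows
# the metabolic fate of the parent drug, while each metabolite node
# carries the underlying spectral evidence for readers who dig deeper.

biotransformation_map = {
    "parent": "Drug-X",
    "metabolites": [
        {"id": "M1", "reaction": "hydroxylation",
         "evidence": {"MS": "fragmentation data", "NMR": "1H assignments"}},
        {"id": "M2", "reaction": "glucuronidation",
         "evidence": {"MS": "fragmentation data"}},
    ],
}

def executive_summary(bmap):
    """Top-level view: what happens to the drug, without the spectra."""
    return [(m["id"], m["reaction"]) for m in bmap["metabolites"]]

summary = executive_summary(biotransformation_map)
# -> [('M1', 'hydroxylation'), ('M2', 'glucuronidation')]
```

The point of the split is that an executive reader only ever touches the summary, while an analyst can follow a node down to the evidence that proves the structure.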
One of the things that became apparent was that this database could be searched from various angles. As you associate data with a molecule, you build up metadata as well as an information-rich environment around its basic structure. People were interested in this data for many reasons. Some wanted the NMR data to help with their analysis, while others needed the fragmentation data from MS or wanted to know which liver enzyme caused a metabolite to be produced. So what people really wanted was a data cube: a database that could be picked up and turned around to look at whichever facet you were interested in. They wanted something they could drill into from whatever angle their question came from. So that's how we started using ACD/ChemFolder Enterprise, which has since morphed into a single integrated chemical and analytical knowledge management solution in their latest offering, the ACD/Spectrus Platform.
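The "data cube" idea, attaching metadata facets to each record and querying from any angle, can be sketched in a few lines. The field names (species, enzyme, technique) are assumptions for illustration; this is not the ACD/Labs API:

```python
# Minimal sketch of faceted search over spectral records.
# Every record carries metadata facets; a query pins down any subset
# of them, so the same store answers NMR, MS, and enzymology questions.

records = [
    {"metabolite": "M1", "species": "human", "matrix": "plasma",
     "enzyme": "CYP3A4", "technique": "MS"},
    {"metabolite": "M2", "species": "rat", "matrix": "urine",
     "enzyme": "CYP2D6", "technique": "NMR"},
    {"metabolite": "M3", "species": "human", "matrix": "urine",
     "enzyme": "CYP3A4", "technique": "NMR"},
]

def facet_search(records, **facets):
    """Return records matching every requested facet value."""
    return [r for r in records
            if all(r.get(k) == v for k, v in facets.items())]

human_hits = facet_search(records, species="human")            # M1 and M3
cyp3a4_nmr = facet_search(records, enzyme="CYP3A4", technique="NMR")  # M3
```

Each caller asks from their own perspective (species, enzyme, or technique) and the same underlying data answers all of them, which is the essence of the cube metaphor.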
Q: How did you go about defining the success criteria for what you were looking for?
A: You have to speak to internal customers to find out what is required now, and then you need a crystal ball to see if this solution can grow to fulfill your requirements in the years to come. One of the first criteria we had was ease of putting data in. If putting data into the database is as painful as pulling teeth, then you end up with compliance issues. Getting away from the power of memory was the second criterion, and with moving to a database that was a given. We also needed the internal processes and backups in place to make sure that if the data became corrupted we could “wind back” a couple of days and retrieve the stored data. Other criteria for selection included ways to speed up the process of interpretation by being able to look at and interrogate the data belonging to a colleague who is located across the Atlantic. This would be akin to looking into his lab notebook to see what he has worked on in order to help with a similar problem that I am facing. Other criteria included improving the confidence we had in the hits we got from the database, reducing the likelihood of making mistakes in elucidating the data, and increasing our ability to share the data when we wanted to conduct a richer peer review.
Q: So is it easy to share the data using this database, and how secure is it?
A: Data security is taken care of by our IT personnel, and we have licensed access to the database from an Oracle-based server behind a firewall. If you need to share your data with external customers such as contract research organizations (CROs), you need to put special procedures in place. You could sanitize a certain space in your database and give the external customers access to it, so they don't get access to your entire database, although that's not something we do here.
Q: Is there something that is lacking or can be improved upon?
A: The ability to share our data with the rest of the organization still requires licenses to the software. This is currently changing; soon anybody in the company who could benefit from the data will be able to access it. Just as the power of the biotransformation map that links the parent molecule to its metabolites works for us, I can see other groups, such as degradation chemists, benefitting from this as well. They also have clusters of molecules where the active ingredient is broken down over time or by the environment, so they would benefit from a similar approach.
Q: Were you able to customize the database to fit your needs?
A: It’s almost scarily flexible! You have carte blanche to rename bits of metadata, including what species you saw the chemical structure in, what biological matrix was used, when the analysis was done, the instrument it was done on, and the name of the analyst. You are just creating areas on the database that you can then control, and as the database grows you can use all this information to do a very specific search. We are happy with the speed with which we can search. However, there is a lag when you work directly with the remote database. So we first have to create a local database as a part of our workflow. The idea is to do all the work locally and then export the data to the remote database at the end of the day or when the job is done.
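The local-first workflow described here, working against a fast local store during the day and pushing finished records to the remote database in one batch, is a common pattern. A minimal sketch, with hypothetical class and method names rather than the vendor's actual tooling:

```python
# Sketch of a local-first workflow: analysts write to a local store
# (no network lag during analysis), then export to the remote database
# at the end of the day. The remote here is just a list standing in
# for the Oracle-backed server.

class LocalStore:
    def __init__(self):
        self.pending = []

    def add(self, record):
        # Fast local write; no round-trip to the remote database.
        self.pending.append(record)

    def export_to(self, remote):
        # Batch upload, then clear the local queue of pending records.
        exported = list(self.pending)
        remote.extend(exported)
        self.pending.clear()
        return len(exported)

remote_db = []
local = LocalStore()
local.add({"metabolite": "M1", "analyst": "ST"})
local.add({"metabolite": "M2", "analyst": "ST"})
pushed = local.export_to(remote_db)   # 2 records pushed, local queue empty
```

The design trade-off is the one the interview names: interactive work stays fast because it never waits on the remote link, at the cost of the shared database lagging until the next export.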
Q: Is the database fairly intuitive or did you have to undergo extensive training to use it?
A: The first criterion was the ease of getting data in, so the intuitive nature of the software was a priority. However, ACD/Labs did provide the necessary training, and we have had subsequent training on the various releases of the software as it has evolved. We tend to run with our core users, who are more experienced in using the database, and for our part-time users, who use the database for some spectral identification, we have our in-house resource work through any issues with them. We have training manuals and user guides online that people can use, allowing most problems to be very easily resolved.
Q: How do you justify the return on investment for this database?
A: The thinking we have embraced to justify our investment is to remind ourselves that this approach could save an error from being made or stop an interpretation from being wrong. If clinical trials "go wrong" because an incorrect interpretation was used, then the ramifications can be severe. This database is a rich resource that can "link up" our organization; it can save us from the possible negative outcomes of continuing to rely on human memory, which are really scary.
Q: What is your advice to lab managers in a similar situation?
A: First, talk to your internal customers and find out how much can be gained by linking your data to those of others in your organization. The idea is that your data creates reports, and you want those reports to go as far across your organization as they possibly can. Also, you want to get something that works for you. Find a piece of data that is particularly onerous and use it in a demonstration to find the weak links in your current software or system. We did that, and we did end up "breaking" the software when we evaluated it. ACD/Labs was very proactive and responsive to what we had found, and we worked together through the pilot trial to fix it. These were not just requests we were making to suit our workflows; ACD/Labs understood that what we were uncovering would make their software a better product.
Steve Thomas has a degree in chemistry from Warwick University, UK. Always intrigued by puzzles, he gravitated to analytical chemistry, choosing a third-year project in mass spectrometry under Prof. Keith Jennings. Steve graduated in 1990, taking a position in the NMR department of Merck’s Neuroscience Research Centre at Terlings Park. While gaining a wealth of experience in medicinal chemistry support, he became analytically bilingual, speaking both NMR and mass spec, to tackle the most challenging aspect of the role: the structural identification of drug metabolites. He expanded on this role, leaving Merck in 2006 for GSK, to join the Biotransformation and Drug Disposition group as an investigator within the Drug Metabolism and Pharmacokinetics department at Ware in the UK. The seamless combination of analytical techniques to generate reliable definitive structures was even more vital as he was moving from a discovery to a development environment. To facilitate assignments and add confidence to results, ready access to past analyses and knowledge proved invaluable but elusive. It became clear that the company was generating far more data than any one individual could keep in their head. Steve led the efforts to find a suitable platform to store, search, and share their data globally. Such a database mitigated the risk of duplicated effort, but required a deep dive into the dark arts of informatics. His success was measured by the quality of the resulting repository of knowledge that didn’t forget, go senile, retire, or leave the company for a competitor.