Relinquishing Control of Data Systems Can Result in Making Bad Data Faster
In the pre-digital era, changes to laboratory instruments didn’t have the impact they do today. An instrument might be replaced, but access to the recorded data didn’t change; the paper charts and notebooks were still there and accessible for someone to read. In the digital era, the data is tied to the instrument’s data system, and without careful planning, data system changes can impact access to important instrument data.
Data management in the pre-digital era was pretty simple. You worked with notebooks (paper) and instrument output that was either read and entered in a notebook or recorded on charts or film. Everything was human-readable.
“Managing” the data was a matter of managing the notebook and cataloging charts and films in file drawers and cabinets. There wasn’t much concern about media compatibility, because the only thing between the data and your eyes was air; as long as you had the education needed to understand the output, you were ready to go. Reevaluating the data was a matter of finding the right material and looking at it. It was all there. Even decades later, if you could locate the material, you could make sense of it. Data management was a matter of managing physical items, keeping track of where they were, making copies for backup, and storing them so that they would be protected from deterioration.
When instrument data systems arrived in labs, things changed. In the early days adopting them was a choice; we looked for products to offload routine work. Later they became part of the instrument purchase, bundled as an instrument-data system package. These systems were readily accepted because they relieved us of tedious work: rather than having an analyst evaluate a hundred feet or more of strip-chart paper from a chromatograph, the system did it and even printed a nicely formatted report. That convenience came with changes, and at a cost that wasn’t always obvious.
In the digital era, things are a lot different
The first change was in the need to acquire new skills; you knew how to do the analysis, but getting the computer to do it took some learning. Data had to be entered in order to identify samples, and those sample IDs had to be matched to the designations the computer used. You also had to understand how control parameters worked and how they affected the analysis: using the wrong set of parameters, or just relying on the vendor’s defaults, could produce erroneous results. This becomes more significant as vendors move toward the instrument as an appliance, where the instrument is essentially a closed box and all the interaction is through the computer systems. As more of the control functions are assumed by onboard processing, your ability to independently verify settings (flow rates, etc., depending on the type of instrument) and to ensure that the device is operating according to your specifications is reduced. Another change was in data utilization and management.
One of the most significant changes is that the data is digital rather than analog. You no longer view the actual detector output, only the computer’s estimation of it, and it isn’t a continuous data stream but a sequence of points representing the detector’s continuous signal. If enough points are taken, the distinction may not matter for the initial calculations, but it will if you want to reevaluate the data later, perhaps looking for anything unusual about the shape of a peak or for small, unexpected components. Then the difference between a sequence of points representing the data and a pen trace is significant.
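The effect of sampling on what you can see later can be sketched in a few lines. This is a hypothetical illustration, not taken from any particular instrument: a simulated detector trace with a main peak and a small secondary peak is digitized at two different intervals, and a simple local-maximum count shows that the coarse sampling loses the small feature entirely.

```python
import math

def detector_signal(t):
    """Simulated continuous detector output: a main peak plus a small
    nearby peak. All values are hypothetical, chosen for illustration."""
    gauss = lambda t, mu, sigma: math.exp(-((t - mu) / sigma) ** 2 / 2)
    return gauss(t, 5.0, 0.3) + 0.15 * gauss(t, 6.0, 0.15)

def sample(dt, t_max=10.0):
    """Digitize the signal at a fixed interval, as a data system would."""
    return [detector_signal(i * dt) for i in range(int(t_max / dt) + 1)]

def count_peaks(points):
    """Count local maxima in the sampled point sequence."""
    return sum(
        1
        for i in range(1, len(points) - 1)
        if points[i - 1] < points[i] > points[i + 1]
    )

fine = sample(0.02)   # dense sampling: both peaks are resolved
coarse = sample(0.5)  # sparse sampling: the small peak disappears
print(count_peaks(fine), count_peaks(coarse))  # prints: 2 1
```

The point is not the arithmetic but the planning decision it represents: the sampling rate chosen before the run determines which questions the stored data can answer afterward.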
Did the computer keep the raw data? If not, you have to rerun the sample (do you still have it?). In some cases the printed report may include a graphic representing the data. Is it sufficient for your needs? Can you detect small shoulders the way you might have on a strip chart? Making allowances for these kinds of contingencies is something you have to do before the samples are processed: you have to anticipate what questions might be asked and make sure the system is capable of capturing enough data to answer them. In the pre-digital days, the data was recorded on paper and you had all of it.
Second, there is the process of looking at the data. It is no longer a matter of looking directly at the recording but of looking at it through a series of components. You need a computer with access to the data files (which may require authorization) and the software application used to acquire and process that data. Without the application, you can’t make sense of the files; you may not even have access to the files if the application maintains its own data directory structure. Exported data files may not have all the information you want (the vendor may not export everything), and then you need something to view it with. If the exported data is in the form of a PDF file, what you see is what you get; you may be able to expand the image, but all you are doing is making the pixels bigger, not getting more detail.
Data storage management
Third, there is the issue of data storage management. Not only do you have to manage the print output and the data files, but you also need to preserve the computer’s application-data structure if you want to look at anything beyond the printed report (paper or PDF). Going paperless reduces those options to one. That means you have to set up a file management system with a catalog of entries and provide on-site and off-site backup. On-site backup protects against the failure of a computer or disk drive; off-site backup protects against more significant failures.
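A minimal sketch of that file management idea follows. It is not a substitute for a real backup product; the directory arguments are placeholders, and in practice the off-site copy would go to another facility or a remote service. It catalogs each data file (name, size, SHA-256 checksum) and copies it to both backup locations.

```python
import hashlib
import json
import shutil
from pathlib import Path

def catalog_and_backup(data_dir, onsite_dir, offsite_dir):
    """Catalog each data file and copy it to on-site and off-site
    backup directories. The paths are placeholders for illustration."""
    onsite, offsite = Path(onsite_dir), Path(offsite_dir)
    onsite.mkdir(parents=True, exist_ok=True)
    offsite.mkdir(parents=True, exist_ok=True)
    catalog = []
    for f in sorted(Path(data_dir).iterdir()):
        if not f.is_file():
            continue
        catalog.append({
            "file": f.name,
            "bytes": f.stat().st_size,
            "sha256": hashlib.sha256(f.read_bytes()).hexdigest(),
        })
        shutil.copy2(f, onsite / f.name)   # survives a failed computer or disk
        shutil.copy2(f, offsite / f.name)  # survives a site-level loss
    # The catalog itself is part of the record; store it with each copy.
    for target in (onsite, offsite):
        (target / "catalog.json").write_text(json.dumps(catalog, indent=2))
    return catalog
```

The checksums matter as much as the copies: they let you verify, years later, that a restored file is actually the file that was archived.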
What you back up is important. It isn’t restricted to the data files but includes the tools needed to work with them, and that can mean everything from the operating system (some vendor applications may not keep up with operating system changes, particularly the products of small companies), to database systems (which change independently of operating systems and applications), to the applications, AND the data. Omitting elements of this structure may leave you with the data but no means of using it or gaining value from it.
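One lightweight way to keep that whole-stack inventory is to record a manifest alongside the data. The sketch below is an assumption about how a lab might do this, not a vendor feature; the application names and version strings in the example are hypothetical.

```python
import json
import platform
import sys

def stack_manifest(applications):
    """Record the software stack alongside the data files so a restore
    can rebuild a usable system, not just recover bytes. The
    `applications` mapping is supplied by the lab."""
    return {
        "os": f"{platform.system()} {platform.release()}",
        "python": sys.version.split()[0],
        "applications": applications,
    }

manifest = stack_manifest({
    "chromatography_data_system": "4.2.1",  # hypothetical app version
    "results_database": "PostgreSQL 13.4",  # hypothetical DB version
})
print(json.dumps(manifest, indent=2))
```

When a backup is restored years later, a manifest like this tells you which operating system, database, and application versions you need to reassemble before the data files become usable again.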
One way of backing up everything is to duplicate the computer system and everything on it. That isn’t as difficult as it might seem. The IT concept of “virtualization” allows you to capture an entire computer’s data structure—everything from the operating system on up through the applications and data—and store it as a large file or “container.” The container can be stored on data center servers and run on an as-needed basis. Once the container is accessed, it looks and behaves like the computer system you were used to working on, with access to the applications and data. You may not be able to acquire new data, but you can work with the data that is there. You can create as many containers as you like, and doing so for a system before an upgrade would be a good way of preserving laboratory data systems for future use.
Managing laboratory operations
There are other issues as well. As we relinquish control over the devices, we move toward push-button science. Are people going to lose critical skills as they become dependent on systems to carry out the analysis? How are you going to verify that the systems are working as you need them to work? “Trust the vendor” is not a good option. When analysts look at data, they consciously or unconsciously look for clues that something is off: extra peaks, a baseline that doesn’t look right, a separation that isn’t as clean as it should be. Will the systems allow you to build that scrutiny into the analysis? We need to maintain control over the systems and how they function. Relinquishing that control in the name of higher productivity may result in making bad data faster.
Planning is an essential part of managing lab operations. For example, if your lab has a number of the same type of instrument, can you use a multi-instrument, multiuser system to manage them? Rather than several instrument-computer pairings, having the same type of instrument connected to a common data system simplifies data and system management, reduces costs, and increases operational flexibility.
The extent of your use of automation technologies (including instrument systems, robotics, laboratory information management systems, electronic lab notebooks, scientific data management systems, lab execution systems, etc.) is up to your discretion. Their use can have a beneficial effect on lab operations. In the past, advances in instrumentation (new features, more speed, more capability) were incremental. The use of advanced technologies, however, is not just an incremental step over existing instruments; digital systems working in concert with measuring equipment change the dynamics of laboratory management and operations. It is a transformational change in how labs work, well beyond anything taught in schools. These systems require thought, education, and planning to be fully effective. They are part of the continuing and rapidly developing evolution of lab technologies, and being prepared for them is a necessary course of action. Our role is to help you with that preparation.
“Labs in Transition℠” is a series of articles discussing the impact of modern information technologies on lab personnel and operations. Rather than dealing with “hype,” the series focuses on how these technologies and products transform laboratory work and the skills needed to work with them. The articles are supported by the ILA’s educational (http://www.institutelabauto.org/courses/coursesovrvw2.htm) and “Elements of Lab Technology Management” (http://www.institutelabauto.org/publications/ELTM.html) programs.