The optimal instrument interface depends on your compliance requirements, the instrument itself, the integration outcomes you need, and, of course, the amount of effort you wish to expend. In this article, we define the interface selection criteria and then discuss the interface options.
|Interface Option|Security|Integration Potential|Setup Difficulty|Extraction Difficulty|
|Print & Scan|Medium|Low|Low|N/A|
|Real-Time Serial/IP|High|High|Medium|Medium|
|Enterprise Content Management|High|High|Very High|Low|
|Direct Printer Capture|High|Medium|Low|Depends|
Security is defined by the opportunity for data tampering: the fewer opportunities an interface leaves, the more secure it is. For compliant labs, data custody must be maintained throughout the interface activities by either physical or procedural controls. If compliance is not required, simpler, less costly options can be selected.
Integration potential is the ability to extract specific data from the instrument output. The low end of the spectrum is essentially qualitative information such as a scan or picture of the instrument data. The high or quantitative end of the spectrum is a well-structured data source from which individual results with full context can be extracted. The end user requirements should drive selection of this option. There is no need to employ a more complex option if qualitative information is sufficient.
Setup difficulty is a measure of how much effort is required to connect the instrument.
Extraction difficulty is relevant only for quantitative interfaces and is determined by how well the interface data is structured. Report files are often more difficult to parse because the information is heavily formatted for readability. They are also a maintenance concern, since unusual information can cause parsing errors. Common data sources, in increasing order of structure, are CSV, XML, database table, database query, and web service.
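To make the difference in extraction difficulty concrete, here is a minimal Python sketch that pulls the same two results from a structured CSV export and from a formatted report. Both sample exports, the column names, and the report layout are invented for illustration and do not correspond to any particular instrument.

```python
import csv
import io
import re

# Hypothetical CSV export: one row per result, columns carry the context.
CSV_EXPORT = """sample_id,analyte,value,unit
S-001,Water,0.52,%
S-002,Water,0.47,%
"""

# Hypothetical human-readable report carrying the same results.
REPORT_EXPORT = """\
        Karl Fischer Report          Page 1
Sample: S-001      Water content:   0.52 %
Sample: S-002      Water content:   0.47 %
"""


def parse_csv(text):
    """Structured source: column names identify every field."""
    return [
        {"sample_id": row["sample_id"], "value": float(row["value"]), "unit": row["unit"]}
        for row in csv.DictReader(io.StringIO(text))
    ]


# The pattern encodes layout assumptions (labels, spacing, unit position),
# so a change in the report template can silently stop it from matching.
REPORT_LINE = re.compile(
    r"Sample:\s*(?P<sample_id>\S+)\s+Water content:\s*(?P<value>[\d.]+)\s*(?P<unit>\S+)"
)


def parse_report(text):
    """Formatted source: results are recovered by pattern matching."""
    results = []
    for line in text.splitlines():
        match = REPORT_LINE.search(line)
        if match:
            results.append({
                "sample_id": match["sample_id"],
                "value": float(match["value"]),
                "unit": match["unit"],
            })
    return results
```

The CSV parser relies only on column names, whereas the report parser depends on the exact wording and spacing of each line, which is where the maintenance burden comes from.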
Print & scan
Historically, scientists obtained printed reports from instruments. The reports were sized and pasted into paper notebooks. With the advent of LIMS and ELN, the paper reports could be scanned and attached as electronic media within a compliant audit-trailed solution. This interface option is supported by most instrumentation and is relatively easy to implement. Setup is limited to purchase of a scanner and a secure compliant storage solution. This interface option does have some large negative aspects, however. Quantitative data extraction is not likely, and individual data points must be manually transcribed, with the potential for transcription errors and, of course, more scientist effort. Lack of searchability is another negative aspect of the print & scan interface. Although it is possible to deploy textual search tools against scans, these tools are not 100 percent effective and cannot be considered validated. Finally, security is a concern as the printed documents may be modified and the scanned files may be manipulated prior to attachment. Print & scan should be deployed only if the other interface options have been exhausted.
Direct printer capture
The next step up from print & scan is direct printer capture, which uses a custom printer driver to capture the instrument report output and then stores the resulting electronic file in a secure space.
A custom printer driver or a printer application (utilizing the same technology as software such as the Adobe PDF driver) is designed to immediately send “printed” files to secure storage. Printer drivers are compatible with virtually any Windows application that can print files, which makes this option widely applicable. It also supports compliance in regulated environments, as there are rarely intermediary “stops” for the files. A custom printer driver can also provide the instrument output in text format to facilitate parsing of individual data points.
If the printer driver is not directly connected to secure storage, there will be a gap in security while the file is “parked” on a local drive or file share.
Direct printer capture is a secure interface method, applicable to most instruments and with the potential to extract quantitative information.
File share
A shared file location, such as a network share or a hosted service like Dropbox, is easy to set up and use and can handle all file formats, including Word and Excel. File shares allow files to be easily shared among groups of people. This accessibility is both a positive and a negative characteristic. It is essential to employ a robust access control list on a file share that allows users to write files but restricts modification or deletion. Backup and archival processes must be well defined and adhered to.
File shares are ubiquitous across most organizations, but controlled, secure shares are rare. File shares are the go-to option for labs that do not have storage solutions such as LIMS, ELN, or content management systems.
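The "write but do not modify" rule can be illustrated at the application level. The following Python sketch is only an illustration under assumed names (`deposit`, `share_dir`); real enforcement belongs in the access control list of the share itself, which a script running under the user's account cannot replace.

```python
import hashlib
import os
import shutil
import stat
import tempfile
from pathlib import Path


def deposit(source: Path, share_dir: Path) -> str:
    """Copy a file into the share without overwriting, then mark it read-only.

    Returns the SHA-256 digest of the stored copy so it can be recorded
    elsewhere and compared later to detect modification.
    """
    target = share_dir / source.name
    # Mode "xb" raises FileExistsError if the target already exists,
    # so an earlier deposit is never silently replaced.
    with open(source, "rb") as src, open(target, "xb") as dst:
        shutil.copyfileobj(src, dst)
    # Drop write permission on the stored copy. This is a convenience only;
    # it does not substitute for share-level access control.
    os.chmod(target, stat.S_IREAD)
    return hashlib.sha256(target.read_bytes()).hexdigest()
```

Recording the returned digest in a separate, controlled location gives a simple way to check later that the stored file is still the one that was deposited.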
“Real-time” instrument interface
Another way to connect instrumentation to an ELN is through a direct connection to the device itself. Real-time interfaces are typically established using a serial (RS-232) or network connection to either a LIMS or ELN solution. A real-time interface captures data from the instrument immediately and delivers it directly to the LIMS or ELN solution. The authenticity of the data is implicit, as there are no steps between the instrument and the ELN application. Once the instruments are interfaced appropriately, there is rarely a problem, but getting them set up is a little more difficult than with the previous strategies. It is important to note that the setup of a “real-time” connection varies between instruments and can depend on the vendor and model. Some instruments may have only a serial connection, whereas others have network connections available.
Special hardware that converts serial output to an IP connection removes the need for a dedicated cable between the instrument and the host computer. Expertise is often required to get these connections working initially.
The instrument data stream must also be parsed. This requires instrument-specific drivers capable of “understanding” the stream of data coming from the instrument and translating it into a human-readable format.
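As a rough illustration of what such a driver does, the Python sketch below parses a byte stream from a hypothetical balance. The frame layout (a status flag, a signed number, a unit, and a CRLF terminator) is an assumption made for this example; real instruments define their own protocols, and the bytes could arrive over a serial port or a network socket.

```python
import re
from dataclasses import dataclass


@dataclass
class Reading:
    stable: bool   # True when the instrument reports a stable value
    value: float
    unit: str


# Hypothetical frame layout: status flag, comma, signed number, space, unit.
FRAME = re.compile(rb"^(ST|US),([+-]\d+\.\d+) (\w+)$")


def parse_frames(buffer: bytes):
    """Split a byte buffer on CRLF and parse each complete frame.

    Returns (readings, remainder). The remainder holds a trailing partial
    frame that should be prepended to the next chunk read from the device.
    """
    *complete, remainder = buffer.split(b"\r\n")
    readings = []
    for raw in complete:
        match = FRAME.match(raw)
        if match is None:
            # Fail loudly rather than guess at an unknown frame.
            raise ValueError(f"unrecognized frame: {raw!r}")
        status, value, unit = match.groups()
        readings.append(
            Reading(status == b"ST", float(value.decode("ascii")), unit.decode("ascii"))
        )
    return readings, remainder
```

Keeping the unparsed remainder matters because a stream can deliver half a frame at a time; discarding it would lose or corrupt readings.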
Other instruments produce massive amounts of data and are extremely complex, almost forcing a connection straight to the instrument database. Examples of this would be Empower, tiamo (connecting to Oven KF devices), or any application that stores scientific data. We refer to these as “application to application” interfaces. The idea is that an application, your ELN, can access the instrument database and retrieve the data in a meaningful way. One advantage of this approach is that the application has access to large amounts of raw instrument data. However, when working with data generated by third-party applications, it is important to understand the database structure, which, even for the same application, is likely to change from version to version. There is also the possibility that an ELN may not be granted access or given support when attempting to connect to the database. Remember, these vendors are also trying to sell services, and getting your ELN connected may be costly. This does not even include what the initial setup may cost the lab.
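One way to guard against structure changes between versions is to verify the expected columns before querying. The Python sketch below uses an in-memory SQLite database purely as a stand-in; the table and column names are hypothetical, and a real instrument database would be reached through its own driver and catalog views.

```python
import sqlite3

# Hypothetical columns the interface expects in a vendor "results" table.
EXPECTED_COLUMNS = {"sample_id", "analyte", "value", "unit"}


def missing_columns(conn, table):
    """Return the expected columns that the table does not provide."""
    # PRAGMA table_info is SQLite-specific; other databases expose similar
    # metadata through their own catalog views.
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    present = {row[1] for row in rows}  # index 1 is the column name
    return EXPECTED_COLUMNS - present


def fetch_results(conn, table="results"):
    """Read results only if the table still has the expected shape."""
    # The table name is interpolated directly, so it must come from trusted
    # configuration and never from user input.
    missing = missing_columns(conn, table)
    if missing:
        raise RuntimeError(f"schema changed, missing columns: {sorted(missing)}")
    return conn.execute(
        f"SELECT sample_id, analyte, value, unit FROM {table} ORDER BY sample_id"
    ).fetchall()
```

Failing before the query runs turns a silent mapping error after a vendor upgrade into an explicit, reviewable failure.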
Today, the “big data” paradigm is well established. Under this paradigm, data should be shared across systems and organizations, and specialized data sources should be designed to be accessed and consumed by different systems and applications.
To support this, SciCord has created SciMart, a type of data mart that provides relatively simple, analysis-specific tables, which can be produced by scientists with only basic IT knowledge. To create the data mart, rather complex queries are executed for each data mart table. These queries are CPU intensive and must be run during off-hours to avoid degrading production system performance. Despite this minor limitation, the queries turn generic results tables into specific analytical tables.
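The general idea of turning a generic results table into an analysis-specific table can be sketched with a small pivot query. This Python example is not SciMart's implementation; the table names, column names, and data are invented, and SQLite is used only so that the sketch is self-contained.

```python
import sqlite3


def build_mart(conn):
    """Rebuild a hypothetical analysis-specific table from generic results."""
    conn.executescript("""
    DROP TABLE IF EXISTS mart_release;
    -- One row per sample, one column per test of interest.
    CREATE TABLE mart_release AS
    SELECT sample_id,
           MAX(CASE WHEN test = 'assay' THEN value END) AS assay_pct,
           MAX(CASE WHEN test = 'water' THEN value END) AS water_pct
    FROM results
    GROUP BY sample_id;
    """)


conn = sqlite3.connect(":memory:")
# Generic results table: one row per sample and test.
conn.executescript("""
CREATE TABLE results (sample_id TEXT, test TEXT, value REAL);
INSERT INTO results VALUES
  ('S-001', 'assay', 99.1), ('S-001', 'water', 0.52),
  ('S-002', 'assay', 98.7), ('S-002', 'water', 0.47);
""")
build_mart(conn)
rows = conn.execute("SELECT * FROM mart_release ORDER BY sample_id").fetchall()
```

Because the mart table is rebuilt from scratch, a job like this can be scheduled off-hours and the analysis tools then read only the simple, precomputed table.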
Enterprise Content Management system
Enterprise content management (ECM) systems, such as Agilent OpenLab, MasterControl, Documentum, and SharePoint, are designed to store and manage process documentation for laboratories. An ECM is capable of storing large amounts of company files, which can enhance business processes. ECMs contain powerful security features and are more controlled than the file share method we described earlier. They are a popular choice for highly regulated industries, such as pharmaceuticals.
With this greater security, the ECM may be more difficult to access, which adds a layer of complexity between the ECM and the ELN. How does the authentication work? Is it as simple as usernames, passwords, and being granted permissions? One of the easier scenarios would consist of a simple HTTP connection (usually implemented through temporary links), but this is not common; more than likely, a custom interface module will need to be developed to interact with the ECM and obtain the data.
Beyond security, an added benefit is that once the data is validated (reviewed and approved), there is no need to duplicate parts of the process to ensure data accuracy, as this has already been completed.
Most of the above examples rely on the parsing of files. The exceptions are real-time interfacing methods and some forms of database interfacing, but even these may require some parsing or data translation. Parsing files is an automated process that reduces the work of the scientist or reviewer and limits the possibility of transcription error. In fact, as long as the parsing is done correctly, there should be no possibility of transcription error unless a formatting change makes the mapping incorrect.
To parse files, it is necessary to have experience mapping data, as the data needs to be directed to the appropriate areas within a template or ELN document. If an input format changes, the parsing component will likely need to be updated; otherwise, there will be a disconnect between what is searched for and what is returned.
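The mapping step, and the way a format change can be caught instead of passing silently, might look like the following Python sketch. The labels and template field names are hypothetical.

```python
# Hypothetical mapping from instrument output labels to ELN template fields.
FIELD_MAP = {
    "Sample ID": "sample_id",
    "Water content": "water_pct",
    "Operator": "analyst",
}


def map_to_template(parsed: dict) -> dict:
    """Rename parsed labels to template fields, failing loudly on drift."""
    missing = [label for label in FIELD_MAP if label not in parsed]
    unexpected = [label for label in parsed if label not in FIELD_MAP]
    if missing or unexpected:
        # A renamed or added label usually means the input format changed,
        # so the mapping needs review before any data is accepted.
        raise ValueError(f"format changed: missing={missing}, unexpected={unexpected}")
    return {FIELD_MAP[label]: value for label, value in parsed.items()}
```

Rejecting unknown or missing labels is stricter than necessary for some labs, but it prevents a changed report layout from quietly filling the wrong template field.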
Instruments vary, and the ways in which instrument data is obtained also vary. Who knows whether a universal approach will ever exist, but for now, you must consider several of these methods and find the solutions that work best for you and your lab.