Chromatography, a technique essential for separating, identifying, quantifying, and purifying the components of mixtures, is crucial in advancing scientific research. As the importance of chromatography continues to grow, so does the need for efficient data management. There is also growing interest in making chromatography data easily accessible for advanced analytics, including Artificial Intelligence (AI), to maximize its overall value. In this two-part blog series, we will review how chromatography data is prepared, meaning how it is centralized and engineered into a vendor-agnostic format, and explore various use cases where it can be effectively employed.
Shimadzu LabSolutions: Empowering Chromatographers Worldwide
Shimadzu LabSolutions is a suite of software designed to manage data and workflows in analytical laboratories. It can control a wide range of scientific instrumentation, including high-performance liquid chromatography (HPLC) and gas chromatography (GC) systems. LabSolutions ensures data integrity, security, and traceability, supporting compliance with regulatory standards such as FDA 21 CFR Part 11.
Integration with Data Analytics and AI
Shimadzu and TetraScience have entered into a groundbreaking partnership. This collaboration seamlessly integrates LabSolutions with the Tetra Scientific Data and AI Cloud. The result? Data generated through LabSolutions can now be automatically centralized within a customer's dedicated instance of the TetraScience cloud-based platform. This integration paves the way for versatile data utilization across various use cases and marks a significant step toward making chromatography data AI-ready.
Benefits for Chromatographers Worldwide
This partnership is not just about technology; it's about empowering chromatographers worldwide. Data generated through LabSolutions is transformed into a standardized, accessible format, opening up new possibilities for data analysis and utilization with diverse data analytics tools and AI applications. Regardless of their location, chromatographers can now harness the power of AI for enhanced research.
Standardization for Global Accessibility
Data standardization is key in chromatography. LabSolutions and TetraScience offer a standardized data structure that is ideal for storage and accessibility. This format is vendor-agnostic, ensuring that chromatographers worldwide can access and analyze their data consistently and efficiently.
So what’s new?
Let's delve into the details of this new release. In collaboration with Shimadzu, TetraScience is proud to introduce the first release of a Tetra LabSolutions intermediate data schema (IDS). With this IDS, data is stored in JavaScript Object Notation (JSON), an open and vendor-agnostic format. The schema captures scientific data in a form that is easy to interpret, vendor-neutral, and, for some fields, even instrument-class agnostic. Because every file follows the same structure, data can be queried and aggregated across instruments and runs rather than one proprietary file at a time. For example, users can query by sample ID or user ID across all data generated within their organization.
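To make the sample ID and user ID example concrete, here is a minimal Python sketch. The documents and field names (`samples`, `users`, `systems`, `id`) are simplified assumptions made up for illustration and are not the actual Tetra LabSolutions IDS layout; the point is only that a shared JSON structure lets one small function index every run the same way.

```python
import json
from collections import defaultdict

# Hypothetical, heavily simplified IDS-like documents.
# Real IDS field names and nesting may differ.
raw_documents = [
    '{"samples": [{"id": "S-001"}], "users": [{"id": "alice"}], "systems": [{"name": "HPLC-01"}]}',
    '{"samples": [{"id": "S-002"}], "users": [{"id": "bob"}], "systems": [{"name": "HPLC-02"}]}',
    '{"samples": [{"id": "S-001"}], "users": [{"id": "bob"}], "systems": [{"name": "HPLC-02"}]}',
]


def index_by(documents, section, key="id"):
    """Group documents by the value of `key` found in each entry of `section`."""
    index = defaultdict(list)
    for doc in documents:
        for entry in doc.get(section, []):
            value = entry.get(key)
            if value is not None:
                index[value].append(doc)
    return index


documents = [json.loads(text) for text in raw_documents]
by_sample = index_by(documents, "samples")
by_user = index_by(documents, "users")

# Every run of sample S-001, regardless of instrument or analyst.
systems_for_s001 = [doc["systems"][0]["name"] for doc in by_sample["S-001"]]
print(systems_for_s001)
print(len(by_user["bob"]))
```

In practice the same lookup would usually be done through the platform's search or SQL interfaces rather than by loading files by hand.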
Why JSON?
JSON is an attractive format for lab automation and data science due to its lightweight nature and ease of access for both humans and machines. Its key-value pairing structure enables efficient data serialization and parsing. This structure makes storing, transmitting, and working with complex datasets straightforward in most of today’s programming languages. Beyond creating standardized structures for raw data, the Tetra Data Platform extends its utility by automatically extracting data into relational database table views. These views are readily accessible through SQL queries, offering additional versatility and ease of access for data manipulation and analysis.
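As a rough illustration of what querying such table views can look like, the sketch below uses Python's built-in `sqlite3` module as a local stand-in for the platform's SQL interface. The `peaks` table, its columns, and the values in it are assumptions invented for this example; the actual view names and columns are described in the platform documentation.

```python
import sqlite3

# In-memory database standing in for the platform's SQL endpoint.
# The table name, columns, and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE peaks (
        sample_id TEXT,
        peak_name TEXT,
        retention_time_min REAL,
        area REAL
    );
    INSERT INTO peaks VALUES
        ('S-001', 'caffeine',   3.21, 1500.0),
        ('S-001', 'impurity A', 4.05,   50.0),
        ('S-002', 'caffeine',   3.19, 1450.0);
    """
)

# Aggregate peak results per sample with plain SQL.
rows = conn.execute(
    """
    SELECT sample_id, COUNT(*) AS n_peaks, SUM(area) AS total_area
    FROM peaks
    GROUP BY sample_id
    ORDER BY sample_id
    """
).fetchall()

for sample_id, n_peaks, total_area in rows:
    print(sample_id, n_peaks, total_area)

conn.close()
```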
The Tetra LabSolutions IDS is parsed from data-rich .lcb and .lcd files after they are ingested into the Tetra Data Platform. This inaugural release of the LabSolutions data schema supports HPLC instruments that use UV-visible and refractive index (RID) detectors. The raw chromatographic detector values are stored as “datacubes.” They can be used to recreate chromatograms or to build new visualizations, such as use case–specific chromatogram overlays created with a scientist's favorite third-party and open-source tools.
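The sketch below shows one way detector values from a datacube could be turned into traces for an overlay. The datacube layout used here (a `dimensions` list holding a time `scale` and a `measures` list holding signal `value`s) is an assumption for illustration, as are the numbers; check the IDS documentation for the real structure.

```python
# Hypothetical, simplified datacubes: one time axis and one detector signal each.
datacubes = [
    {
        "name": "run 1, UV 254 nm",
        "dimensions": [{"name": "time", "unit": "min", "scale": [0.0, 0.5, 1.0, 1.5]}],
        "measures": [{"name": "absorbance", "unit": "mAU", "value": [0.0, 2.0, 8.0, 2.0]}],
    },
    {
        "name": "run 2, UV 254 nm",
        "dimensions": [{"name": "time", "unit": "min", "scale": [0.0, 0.5, 1.0, 1.5]}],
        "measures": [{"name": "absorbance", "unit": "mAU", "value": [0.0, 1.0, 4.0, 1.0]}],
    },
]


def trace_from_datacube(cube):
    """Pair each time point with its detector reading."""
    times = cube["dimensions"][0]["scale"]
    signal = cube["measures"][0]["value"]
    if len(times) != len(signal):
        raise ValueError("time axis and signal have different lengths")
    return list(zip(times, signal))


def normalize(trace):
    """Scale a trace so its tallest point equals 1.0, which makes overlays comparable."""
    tallest = max(y for _, y in trace)
    if tallest == 0:
        return list(trace)
    return [(t, y / tallest) for t, y in trace]


overlay = {cube["name"]: normalize(trace_from_datacube(cube)) for cube in datacubes}
for name, trace in overlay.items():
    print(name, trace)
```

The resulting lists of (time, signal) pairs can be handed to any plotting library to draw the overlay itself.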
In addition to raw measurements, the Tetra IDS also includes processed results as calculated by LabSolutions, including retention time, area percent, asymmetry, and resolution. These values are cataloged in the Peaks section of the IDS under Results. They can be used directly in task-specific calculations and passed on to downstream repositories, such as an ELN or LIMS, through industrialized Tetra Connectors.
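As an example of a task-specific calculation on these peak results, the following sketch flags peaks whose resolution or asymmetry falls outside chosen limits. The field names, the peak values, and the thresholds are all illustrative assumptions, not values taken from the IDS or from any method specification.

```python
# Hypothetical peak results, loosely modeled on the values named above.
peaks = [
    {"name": "A", "retention_time": 2.1, "area_percent": 12.5, "asymmetry": 1.1, "resolution": None},
    {"name": "B", "retention_time": 2.4, "area_percent": 80.0, "asymmetry": 1.0, "resolution": 1.2},
    {"name": "C", "retention_time": 4.0, "area_percent": 7.5, "asymmetry": 1.9, "resolution": 6.3},
]


def flag_peaks(peaks, min_resolution=1.5, max_asymmetry=1.5):
    """Return {peak name: [reasons]} for peaks outside the given limits.

    The default limits are placeholders; use the criteria from your own method.
    """
    flagged = {}
    for peak in peaks:
        reasons = []
        resolution = peak.get("resolution")
        # The first peak often has no resolution value, so skip missing entries.
        if resolution is not None and resolution < min_resolution:
            reasons.append("low resolution")
        asymmetry = peak.get("asymmetry")
        if asymmetry is not None and asymmetry > max_asymmetry:
            reasons.append("high asymmetry")
        if reasons:
            flagged[peak["name"]] = reasons
    return flagged


flags = flag_peaks(peaks)
print(flags)
```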
Other highlights of the new Tetra LabSolutions IDS include method data, such as pump control parameters; high-level system details, such as software and software version; and instrument components. The IDS also captures sample details, including sample ID, sample holder type, and injection location. Together, these values make up the Tetra Data that serves as the foundation for numerous value-adding use cases.
In the second segment of this blog series, we will delve into the diverse applications of Tetra LabSolutions IDS, demonstrating its capability to generate substantial value for chromatographers.
Existing customers can learn more about these updates by visiting the Shimadzu LabSolutions ReadMe on the Tetra Data Platform (TDP) at slug: shimadzu-labsolutions-raw-to-ids. It describes our parsing strategies in detail and provides examples of accessing the data via SQL. Feel free to reach out to your Customer Success Manager with any questions.
Not a customer, but curious to learn more? Please connect with us. We’d be excited to hear from you.