Continuing the expansion of the data analytics portfolio available with Thermo Scientific™ SampleManager LIMS™ software, a new machine learning Profiling capability is now available for the Data Analytics solution. The Profiling capability predicts the result of a yet-to-be-executed binary test using historical data and machine learning (ML) techniques.
Anticipating the result of a test without conducting it has several benefits for a lab:
- Identify and eliminate the need to perform redundant tests.
- Reduce the number of samples tested.
- Save money by not using expensive reagents and consumables.
- Generate a prediction for all eligible samples without any extra cost or time.
- Take actions by anticipating the result of the test.
- Fail samples early when they would not go on to pass the test.
The Profiling capability of the Data Analytics Solution has many potential applications. For example, a food and beverage company might apply the Profiling capability to enable supervised learning in the food production process. In this case, SampleManager LIMS would use historical data to gain an understanding of the critical variables that determine whether a product is safe for consumers. This holistic approach considers not only the values of the individual critical variables themselves, but also the relationships between them. If a sample were to be flagged as failing, the system would alert stakeholders in advance to issue adjustments or investigations to avoid any risk to finished products.
With the Profiling capability, SampleManager LIMS software uses the results of previous tests applied to similar samples as inputs to create an ML pipeline that leverages the predictive power of XGBoost, a gradient tree boosting library.
This capability provides a no-code, easy-to-use framework for users to create various models based on different data, interpret the results, and create their own predictions. Users can easily automate the models so they are continuously trained, allowing predictions on new samples to be made without user interaction. The Profiling capability uses the new Statistical Script functionality introduced in SampleManager LIMS software version 21.0.
Using the Profiling capability involves four main steps in SampleManager LIMS software:
- Profiling entity creation
- Model training
- Results evaluation
- Sample prediction
Step 1 – Profiling entity creation
Profiling entities are the foundation of the Profiling capability. When creating a profiling entity, you select the data the model will use. You can also set the number of parameter combinations to try during model training to improve model performance. Normally, a higher number of combinations gives more accurate results but requires a longer training time.
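To make the trade-off between the number of parameter combinations and training time concrete, here is a minimal sketch of drawing a fixed number of candidate settings from a search space. The article does not describe how SampleManager LIMS chooses its combinations, so the random-sampling approach, the `sample_combinations` helper, and the values in `SEARCH_SPACE` are all assumptions for illustration. The parameter names follow common XGBoost settings.

```python
import itertools
import random

# Hypothetical search space. The names mirror common XGBoost parameters,
# but the values SampleManager LIMS actually explores are not documented here.
SEARCH_SPACE = {
    "max_depth": [3, 5, 7],
    "learning_rate": [0.01, 0.1, 0.3],
    "n_estimators": [100, 300],
    "subsample": [0.7, 1.0],
}

def sample_combinations(space, n_combinations, seed=0):
    """Return up to n_combinations distinct parameter settings."""
    names = sorted(space)
    # Enumerate the full grid, then draw a random subset without replacement.
    grid = [dict(zip(names, values))
            for values in itertools.product(*(space[name] for name in names))]
    rng = random.Random(seed)
    return rng.sample(grid, min(n_combinations, len(grid)))

combos = sample_combinations(SEARCH_SPACE, n_combinations=10)
print(len(combos), "candidate settings; each one costs a full training run")
```

Because every candidate setting requires training and scoring a model, doubling the number of combinations roughly doubles the training work, which is why a higher setting trades time for a better chance of finding a well-performing model.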
You must also select the names of the tests to use as predictors, along with the test that serves as the target variable.
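Conceptually, choosing predictor tests and a target test amounts to reshaping historical results into one row per sample. The sketch below shows that idea in plain Python. The sample IDs, test names, result values, and the `build_training_rows` helper are hypothetical, and the way SampleManager LIMS stores and assembles this data internally is not described in this article.

```python
# Hypothetical long-format history: (sample_id, test_name, result).
HISTORY = [
    ("S1", "pH", 6.8), ("S1", "moisture", 12.1), ("S1", "micro_screen", True),
    ("S2", "pH", 5.2), ("S2", "moisture", 15.4), ("S2", "micro_screen", False),
    ("S3", "pH", 6.9), ("S3", "moisture", 11.8),  # target test not run yet
]

def build_training_rows(history, predictors, target):
    """Pivot results to one row per sample; keep only fully populated rows."""
    by_sample = {}
    for sample_id, test_name, result in history:
        by_sample.setdefault(sample_id, {})[test_name] = result

    sample_ids, features, labels = [], [], []
    for sample_id, results in sorted(by_sample.items()):
        # A sample can be used for training only if the target and
        # every predictor test already have a result.
        if target in results and all(p in results for p in predictors):
            sample_ids.append(sample_id)
            features.append([results[p] for p in predictors])
            labels.append(int(results[target]))
    return sample_ids, features, labels

ids, X, y = build_training_rows(HISTORY, ["pH", "moisture"], "micro_screen")
print(ids, X, y)
```

In this toy data, sample S3 has predictor results but no target result, so it is left out of training. It is the kind of sample for which a trained model would later generate a prediction.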
Two additional options are available when creating the profiling entity: automated training and automated prediction. Automated training retrains the model on a recurring basis and should only be used when new samples with new results are being added for the predictor and target tests. With automated prediction enabled, the model generates predictions on a recurring basis.
Step 2 – Model training
After creating your profiling entity, you can train your new model in SampleManager LIMS software.
To train the model, select the profiling entity and click on “Train Model.” The model training duration varies based on the amount of data evaluated, the number of parameter combinations chosen, and hardware specifications.
Step 3 – Results evaluation
When the training process is complete, the next step is evaluating the results. SampleManager LIMS software enables you to see the results of the model and experiment with the model without impacting your data. The system displays the ROC-AUC, Cut Point, Sensitivity, Specificity, and Accuracy values. The ROC Curve, Cut Point Distribution, and Variable Importance plots are updated with the model results.
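For readers less familiar with these metrics, the following sketch computes them from scratch using their standard definitions. The labels and probabilities are made-up illustration data, and treating a probability at or above the cut point as a positive prediction is an assumed convention. The article does not state how SampleManager LIMS calculates these values internally.

```python
def confusion_counts(y_true, y_prob, cut_point):
    """Count true/false positives and negatives at a given cut point."""
    tp = fp = tn = fn = 0
    for truth, prob in zip(y_true, y_prob):
        predicted = prob >= cut_point  # assumed convention: >= means positive
        if predicted and truth:
            tp += 1
        elif predicted and not truth:
            fp += 1
        elif not predicted and truth:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn

def summary_metrics(y_true, y_prob, cut_point):
    tp, fp, tn, fn = confusion_counts(y_true, y_prob, cut_point)
    return {
        "sensitivity": tp / (tp + fn),   # share of true positives detected
        "specificity": tn / (tn + fp),   # share of true negatives detected
        "accuracy": (tp + tn) / len(y_true),
    }

def roc_auc(y_true, y_prob):
    """Probability that a random positive outranks a random negative."""
    pos = [p for t, p in zip(y_true, y_prob) if t]
    neg = [p for t, p in zip(y_true, y_prob) if not t]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Made-up labels and predicted probabilities for eight samples.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_prob = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.7, 0.2]

metrics = summary_metrics(y_true, y_prob, cut_point=0.5)
auc = roc_auc(y_true, y_prob)
print(metrics, auc)
```

Sensitivity, specificity, and accuracy depend on the chosen cut point, while ROC-AUC summarizes ranking quality across all possible cut points. That is why the two kinds of values are reported side by side.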
Using the sample prediction function, you can select a sample to calculate the probability that its target test result is “true,” along with the corresponding label. These sample predictions are for informational purposes only and are not stored within SampleManager LIMS software.
Step 4 – Sample prediction
Once you are satisfied with the model, you can use it to predict samples’ results. You can apply the model to one sample or select multiple samples to predict. The results of these predictions are stored in the SampleManager LIMS database.
Getting more from your data
Tools like the Data Analytics Solution in SampleManager LIMS software provide a practical way for laboratories across all industries to get more from their data. Want to learn more about the data visualization capabilities available in SampleManager LIMS software? Watch our on-demand webinar on advancing data visualization in the lab, or read more about the powerful preconfigured dashboards in SampleManager LIMS.