How photonics and data science can transform disease diagnosis

(Image: Arek Socha from Pixabay)

Researchers from the Leibniz Institute of Photonic Technology and Friedrich Schiller University of Jena reveal how combining photonics and data science enables faster, label-free disease detection.

Written by Shuxia Guo, Elena Corbetta and Thomas Bocklitz from Friedrich Schiller University of Jena, Leibniz Institute of Photonic Technology.

Healthcare systems worldwide are facing unprecedented challenges. On the one hand, chronic, infectious, lifestyle- and age-related diseases keep increasing as a consequence of longer lifespans and an expanding global transport network. On the other, both the number of clinicians and the available funding decrease year by year. Technical breakthroughs are urgently needed to provide better healthcare while also lowering costs.

As enablers of innovative and transformative tools, photonic technologies demonstrate vast potential in this regard. They allow the biochemical composition and molecular structure of samples to be investigated in a non-destructive, non-invasive and label-free manner. This provides the opportunity to answer biomedical questions at a molecular level and paves the way for early diagnosis, a critical factor in the treatment of many serious diseases such as cancer.

Data science extends the role of photonic technologies in biomedical investigations, because it translates photonic signals effectively and efficiently into meaningful biological knowledge. Photonic data science in biomedical investigations normally follows the workflow shown in the image below. Biological samples (tissue samples, for example) are imaged in tile mode, i.e., each tile covers a small area of the sample. The tiles are then reconstructed into one image with a large field of view that covers the complete sample area.
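The tile-to-mosaic step described above can be sketched in a few lines. This is a minimal illustration only: it assumes equally sized, non-overlapping tiles delivered in row-major order, and the function name stitch_tiles is hypothetical. Real stitching pipelines typically also handle tile overlap, registration and illumination correction, none of which is shown here.

```python
import numpy as np

def stitch_tiles(tiles, n_rows, n_cols):
    """Assemble equally sized, non-overlapping tiles (row-major order) into one image."""
    if len(tiles) != n_rows * n_cols:
        raise ValueError("number of tiles does not match the grid")
    # Join the tiles of each grid row side by side, then stack the rows vertically.
    rows = [np.hstack(tiles[r * n_cols:(r + 1) * n_cols]) for r in range(n_rows)]
    return np.vstack(rows)

# Four hypothetical 2x2 tiles, each filled with its own index.
tiles = [np.full((2, 2), i) for i in range(4)]
mosaic = stitch_tiles(tiles, n_rows=2, n_cols=2)
print(mosaic.shape)  # (4, 4)
```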

The quality of the measured image is enhanced by preprocessing algorithms to reduce artefacts from the measurement, before any modelling methods are applied to extract biological knowledge. Our latest study into photonic image analysis covers these three topics: image quality assessment, tissue classification and computational staining.


The normal workflow of photonic data science in biomedical investigations (Image: Friedrich Schiller University of Jena / Leibniz-IPHT: under CC-BY 4.0)

Image quality assessment

Good-quality images play a critical role in ensuring the validity of any imaging-based investigation. This is particularly important for biomedical applications, in which samples of different medical status often show only subtle differences, especially in early-disease diagnostics. It is therefore necessary to assess image quality effectively and reliably.

Existing methods usually work by comparing a measured image with the ground truth, commonly pixel by pixel. But this full-reference mechanism often limits the performance and applicability of these methods, and their results often agree poorly with visual perception. To tackle these limitations, we proposed a multi-marker strategy, which assesses image quality from multiple perspectives in a reference-free manner.

Multi-marker image quality assessment (Image: Friedrich Schiller University of Jena / Leibniz-IPHT: under CC-BY 4.0)

The workflow of the method is depicted at the top of the above diagram. A classifier, in this case a linear discriminant analysis (LDA), is constructed using multiple standard quality markers calculated from semi-synthetic images with known artefacts. The semi-synthetic images were themselves generated by corrupting high-quality images with varying levels of artefacts modelled with prior knowledge, including blurring, noise and vignetting effects.
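To make the idea of semi-synthetic images concrete, the sketch below corrupts a clean image with the three artefact types named above. The specific artefact models (Gaussian blur, additive Gaussian noise, a quadratic radial falloff) and all parameter names are assumptions chosen for illustration, not the models used in the study.

```python
import numpy as np

def blur(image, sigma):
    """Gaussian blur applied in the Fourier domain (periodic boundaries assumed)."""
    fy = np.fft.fftfreq(image.shape[0])[:, None]
    fx = np.fft.fftfreq(image.shape[1])[None, :]
    # Fourier transform of a unit-area Gaussian with standard deviation `sigma` pixels.
    transfer = np.exp(-2.0 * (np.pi * sigma) ** 2 * (fx ** 2 + fy ** 2))
    return np.real(np.fft.ifft2(np.fft.fft2(image) * transfer))

def add_noise(image, sigma, rng):
    """Additive zero-mean Gaussian noise (an assumed noise model)."""
    return image + rng.normal(0.0, sigma, size=image.shape)

def vignette(image, strength):
    """Darken the image towards the corners with a quadratic radial falloff."""
    h, w = image.shape
    y, x = np.mgrid[0:h, 0:w]
    r2 = ((y - (h - 1) / 2) / (h / 2)) ** 2 + ((x - (w - 1) / 2) / (w / 2)) ** 2
    mask = 1.0 - strength * np.clip(r2 / 2.0, 0.0, 1.0)
    return image * mask

rng = np.random.default_rng(0)
clean = rng.random((64, 64))          # stand-in for a high-quality image
blurred = blur(clean, sigma=2.0)
noisy = add_noise(clean, sigma=0.1, rng=rng)
vignetted = vignette(clean, strength=0.5)
```

Varying sigma and strength over a range would give the "varying levels of artefacts" with known labels that a classifier can be trained on.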

A set of reference-free metrics is employed to quantify the artefacts contained in an image. Fourier ring correlation (FRC) is used to estimate the image resolution, the degradation of which is characterised by the sum of the FRC curve. The signal-to-background noise ratio (SNRσ) and the contrast to background (SNRμ) are adapted from the definition of the signal-to-noise ratio (SNR), where the presence of noise is represented by the sum of the high-frequency components of the image power spectrum and the structural complexity. The vignetting effect is detected by the energy ratio between the edge pixels and the whole image.
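Two of these markers can be sketched as follows. The text does not specify how FRC is applied to a single image or how "edge pixels" are defined, so the diagonal sub-sampling split, the border width and the use of mean squared intensity as "energy" are all assumptions made for this illustration.

```python
import numpy as np

def frc_curve(img_a, img_b):
    """Fourier ring correlation between two equally sized images."""
    h, w = img_a.shape
    fa = np.fft.fftshift(np.fft.fft2(img_a))
    fb = np.fft.fftshift(np.fft.fft2(img_b))
    y, x = np.indices((h, w))
    # Integer ring index: distance of each frequency bin from the zero frequency.
    radius = np.hypot(y - h // 2, x - w // 2).astype(int)
    n_rings = min(h, w) // 2
    keep = radius < n_rings
    ring = radius[keep]
    cross = np.bincount(ring, weights=np.real(fa * np.conj(fb))[keep], minlength=n_rings)
    pow_a = np.bincount(ring, weights=(np.abs(fa) ** 2)[keep], minlength=n_rings)
    pow_b = np.bincount(ring, weights=(np.abs(fb) ** 2)[keep], minlength=n_rings)
    return cross / np.sqrt(pow_a * pow_b + 1e-12)

def frc_sum(image):
    """Sum of the FRC curve of two sub-images taken from one image (assumed split)."""
    h2, w2 = image.shape[0] // 2, image.shape[1] // 2
    sub_a = image[0:2 * h2:2, 0:2 * w2:2]   # even rows, even columns
    sub_b = image[1:2 * h2:2, 1:2 * w2:2]   # odd rows, odd columns
    return float(frc_curve(sub_a, sub_b).sum())

def edge_energy_ratio(image, border=8):
    """Mean squared intensity in a border band divided by that of the whole image."""
    energy = image.astype(float) ** 2
    edge = np.ones(image.shape, dtype=bool)
    edge[border:-border, border:-border] = False
    return float(energy[edge].mean() / energy.mean())
```

With these definitions, a uniformly illuminated image gives an edge energy ratio of 1, and darkened borders push the ratio below 1.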

The calculated metrics are then fed into LDA to classify the type of artefact contained in the corresponding image, with the output decision score indicating the probability of an image belonging to each class. Furthermore, by including ‘reference’ (‘artefact-free’) as one of the classes, we obtained the probability that the image is artefact-free, and that enabled us to rank the quality of the input images.
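The classification and ranking step can be illustrated with a small, self-contained LDA written in NumPy. The marker values below are synthetic clusters invented for the example, and fit_lda and predict_proba are hypothetical helper names; the sketch only shows how the posterior probability of the 'reference' class can serve as a quality score.

```python
import numpy as np

def fit_lda(X, y):
    """Fit LDA: per-class means, shared (pooled) covariance, class priors."""
    classes, idx = np.unique(y, return_inverse=True)
    means = np.array([X[idx == k].mean(axis=0) for k in range(len(classes))])
    priors = np.bincount(idx) / len(y)
    centred = X - means[idx]
    cov = centred.T @ centred / (len(X) - len(classes))
    weights = means @ np.linalg.pinv(cov)
    bias = -0.5 * np.sum(weights * means, axis=1) + np.log(priors)
    return classes, weights, bias

def predict_proba(model, X):
    """Posterior class probabilities from the linear discriminant scores."""
    _, weights, bias = model
    scores = X @ weights.T + bias
    scores = scores - scores.max(axis=1, keepdims=True)  # numerical stability
    p = np.exp(scores)
    return p / p.sum(axis=1, keepdims=True)

# Synthetic quality-marker vectors (4 markers per image), one cluster per class.
rng = np.random.default_rng(0)
centres = {"reference": [0, 0, 0, 0], "blur": [3, 0, 0, 0],
           "noise": [0, 3, 0, 0], "vignetting": [0, 0, 3, 0]}
X = np.vstack([rng.normal(c, 0.3, size=(50, 4)) for c in centres.values()])
y = np.repeat(list(centres), 50)
model = fit_lda(X, y)

# Three new images: artefact-free, strongly blurred, mildly blurred.
new_markers = np.array([[0.0, 0, 0, 0], [3.0, 0, 0, 0], [1.5, 0, 0, 0]])
proba = predict_proba(model, new_markers)
quality = proba[:, list(model[0]).index("reference")]  # P(artefact-free)
ranking = np.argsort(-quality)                         # best image first
```

Because the score is a probability attached to a named class, the ranking stays interpretable: a low score can be traced back to the artefact class that absorbed the probability mass.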

Given images with different artefacts (bottom left of the above image), the resulting decision scores were used to rank the images by quality (bottom middle). Example images of different ranks (bottom right) are labelled as good, medium and low quality, with the ranking scores aligning with visual inspection. The proposed approach proved to be simple, reliable and interpretable for image quality assessment without the need for reference images.

Computational staining

Among all photonic techniques, multimodal nonlinear imaging has demonstrated particular potential in biomedical investigations. It combines three nonlinear imaging modalities: coherent anti-Stokes Raman scattering (CARS), two-photon excited fluorescence (TPEF) and second-harmonic generation (SHG), which are sensitive to lipids/proteins, auto-fluorophores and collagen, respectively.

Such a combination enables multimodal imaging to detect different biomolecules simultaneously, providing a rich molecular signal for biomedical use. Furthermore, as a label-free, non-destructive and non-invasive approach, it provides a well-suited alternative to histopathological staining, which remains the gold standard for cancer diagnosis.

Bringing multimodal imaging into clinical use remains challenging, however, as pathologists are trained to diagnose with H&E-stained images. The multimodal image therefore needs to be converted into a colour space similar to that of the H&E image. This is the idea behind computational staining: obtaining a pseudo-stained H&E image computationally, without actually staining the sample.

Bearing this idea in mind, we developed deep neural networks to transform multimodal images into pseudo-H&E images. The transformation was achieved in both supervised and unsupervised manners, based on conditional generative adversarial networks (CGANs) and cycle conditional generative adversarial networks (cycle CGANs), respectively. The performance of the models was benchmarked both qualitatively and quantitatively, demonstrating comparable results between the two models.

Example results are shown in the image below, demonstrating a good match between the prediction and the ground truth target image. Because it does not require paired H&E and multimodal images, however, the cycle CGAN avoids the need for image registration, which remains a critical challenge. This study showed the promise of computational staining based on multimodal imaging for pathological annotation, which offers considerable scope to improve the efficiency of cancer diagnosis.


Results of GAN-based generation for computational staining (Image: Friedrich Schiller University of Jena / Leibniz-IPHT: under CC-BY 4.0)

Tissue classification

Besides transforming multimodal images into pseudo-H&E images for pathological annotation, photonics can be deployed alongside data science for clinical use in a more straightforward manner. Here, machine learning or deep learning models are developed to extract features from the image and translate them into biological knowledge such as disease status. Depending on the application, the model can produce a pixel-wise output (full annotation) or a single output for each image (weak annotation).

Recently, we achieved head and neck cancer detection based on multimodal imaging and a convolutional neural network (CNN). The multimodal imaging is performed with a newly developed endomicroscope, which combines the three nonlinear imaging modalities: CARS, TPEF and SHG. To achieve cancer detection, parallel tissue sections were annotated by pathologists on histopathological images at pixel level.

These annotations were then projected to the corresponding positions on the multimodal image with an image co-registration approach. Thereafter, CNN-based semantic segmentation models were constructed that take the multimodal images as input and output the tissue types.

The modelling was conducted in two cases. In the first case, the tissue was segmented into six groups: healthy epithelium, tumour, tumour stroma, necrosis, other tissue and background. The prediction demonstrated a good match with the annotation. In the second case, to put the focus more on cancer detection, we merged tumour, necrosis and tumour stroma into “tissue to resect”, while the other two classes except background became “tissue to preserve”. This formed a three-class segmentation CNN model, example predictions of which are shown in the image below. A balanced accuracy of around 92% demonstrates a satisfactory performance for cancer detection. The study demonstrated the huge potential of endoscopic multimodal imaging, used together with data science, for the automated, in-situ detection of head and neck cancer.
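The class merging and the balanced accuracy metric mentioned above can be illustrated as follows. The integer encoding of the classes and the tiny label maps are made up for the example; balanced accuracy is computed in its usual sense, as the mean of the per-class recalls.

```python
import numpy as np

# Assumed integer encoding of the six annotated groups.
SIX_CLASSES = ["background", "healthy epithelium", "tumour",
               "tumour stroma", "necrosis", "other tissue"]
# Lookup table: 0 = background, 1 = tissue to preserve, 2 = tissue to resect.
MERGE = np.array([0, 1, 2, 2, 2, 1])

def merge_labels(label_map):
    """Map a six-class label image to the three-class scheme."""
    return MERGE[label_map]

def balanced_accuracy(y_true, y_pred, n_classes):
    """Mean recall over the classes that occur in y_true."""
    recalls = []
    for c in range(n_classes):
        mask = y_true == c
        if mask.any():
            recalls.append(np.mean(y_pred[mask] == c))
    return float(np.mean(recalls))

# Tiny illustrative label maps (pixel-wise annotation and prediction).
annotation = merge_labels(np.array([[0, 1, 2], [3, 4, 5]]))
prediction = merge_labels(np.array([[0, 1, 2], [3, 1, 5]]))
score = balanced_accuracy(annotation, prediction, n_classes=3)
```

Averaging recalls per class rather than counting correct pixels keeps a large background area from dominating the score, which matters when the tissue to resect covers only a small part of the image.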


Results of CNN prediction for tissue classification (Image: Friedrich Schiller University of Jena / Leibniz-IPHT: under CC-BY 4.0)

In summary, photonic technologies applied in combination with data science have become a powerful tool in biomedical investigations and hold the promise of next-generation healthcare. The study demonstrates this by covering three main aspects:

Firstly, photonic data science offers the opportunity to speed up pathological diagnosis: physical staining is replaced by pseudo-staining generated computationally from photonic images acquired in a non-destructive, non-invasive and label-free manner.
Secondly, with powerful computational algorithms such as deep neural networks, subtle differences caused by changes in health status can be extracted. This enables the early diagnosis of diseases, a critical factor for the survival rate in many of them.
Finally, data science makes photonic technologies more practical by reducing the inevitable measurement artefacts and keeping image quality under control.

With continuous development of both photonic technologies and data science, we expect substantial contributions of photonics to biomedical investigations and healthcare.
