Why isn't there a competition or benchmarking for microscopes?

An honest question. I am currently comparing spinning disc to light sheet and I am wondering why it is practically impossible to find image data of the same specimen from both light sheet and spinning disc.

Is it impossible to come up with a standardized comparison?

The individual researcher usually has only limited access to equipment and does not have the time and resources for a thorough comparison.

I only found a single recent paper that compares two microscopes based on one specimen:

Ryu, Youngjae, Yoonju Kim, Sang-Joon Park, Sung Rae Kim, Hyung-Jun Kim, and Chang Man Ha. "Comparison of Light-Sheet Fluorescence Microscopy and Fast-Confocal Microscopy for Three-Dimensional Imaging of Cleared Mouse Brain." Methods and Protocols 6, no. 6 (December 2023): 108.


Hi @macromeer, welcome to the forum; it's a good question.

You hit the nail on the head there.

While not "impossible" in the strictest sense of the word, it's extremely hard (even harder than it might appear) to do a rigorous, fair comparison between two different microscopes. I can't overemphasize that. Moreover, there are multiple things one might wish to standardize when making the comparison (SNR, speed, irradiance and/or bleaching), and that choice affects the interpretation. Also, even measuring the metrics of performance (resolution, contrast, SNR, bleaching, toxicity) is not trivial.
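
To give a flavour of what "not trivial" means even for something as simple-sounding as SNR: the sketch below (synthetic data and hand-picked ROIs, all hypothetical) is one common way to estimate it from a single image, and it already sweeps detector offset, gain, and flat-field differences between microscopes under the rug.

```python
import numpy as np

def estimate_snr(image, signal_mask, background_mask):
    """Rough single-image SNR: mean background-subtracted signal divided by
    the standard deviation of the background. Ignores detector offset/gain,
    flat-field and shot-noise statistics, which differ between microscopes."""
    signal = image[signal_mask].mean() - image[background_mask].mean()
    noise = image[background_mask].std(ddof=1)
    return signal / noise

# Hypothetical synthetic image: Poisson background plus a bright square
rng = np.random.default_rng(0)
img = rng.poisson(lam=100, size=(64, 64)).astype(float)  # background ~100 counts
img[24:40, 24:40] += 400                                 # "signal" region
sig_mask = np.zeros(img.shape, bool)
sig_mask[24:40, 24:40] = True
bg_mask = np.zeros(img.shape, bool)
bg_mask[:16, :16] = True
print(f"SNR ~ {estimate_snr(img, sig_mask, bg_mask):.1f}")
```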

I like the concept of a benchmark or competition, but honestly my first thought is that doing all of this correctly is so much more than simply setting some parameters in the software that I'm not sure it would be practically useful to compare measurements made by different researchers at different institutions on different (physical) samples, etc. There are just too many ways that could lead to erroneous conclusions.

This leaves us in a place where the most useful thing you can generally do is compare the practical utility of two or more specific microscopes (and operators) you have direct access to, and resist the temptation to broadcast and generalize those findings as representative of an entire class or "type" of microscope technology or product. The paper you link can be seen as something like that. Those authors are happier with their (specific) LSFM than their (specific) spinning disc for the purpose of imaging whole cleared mouse brain. That's great! :slight_smile: But I would caution anyone against taking much more away from that paper. Note that they used different objective lenses, different cameras, different pixel sizes and sampling rates, different exposure times, different laser powers, etc. It's all fine if what one wants to say is "if I sit down at this exact microscope vs. that exact microscope and work with what I'm given without changing too many parameters, which image generally looks qualitatively better?", but I'm not sure I'd go much farther than that.
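
Just to illustrate how much one of those differences (pixel size and sampling) can matter on its own, here's a quick Nyquist check with hypothetical camera and objective numbers (not the actual ones from the paper):

```python
def sample_pixel_size_um(camera_pixel_um, total_magnification):
    """Pixel size projected back to the sample plane."""
    return camera_pixel_um / total_magnification

def nyquist_limit_um(wavelength_nm, na, samples_per_resel=2.3):
    """Largest sample-plane pixel that still Nyquist-samples a lateral
    resolution of 0.61*lambda/NA (using the commonly quoted ~2.3x factor)."""
    return 0.61 * wavelength_nm / na / 1000.0 / samples_per_resel

# Hypothetical setup: 6.5 µm camera pixels behind a 4x / 0.2 NA objective
px = sample_pixel_size_um(6.5, 4)     # 1.625 µm at the sample
limit = nyquist_limit_um(525, 0.2)    # ~0.70 µm
print(f"pixel {px:.2f} µm vs Nyquist limit {limit:.2f} µm -> undersampled: {px > limit}")
```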

As a specific example of things that require more investigation, take their resolution measurements:

| scope | objective | theoretical resolution @ 525 nm | actual measured FWHM |
| --- | --- | --- | --- |
| light sheet | 2.5x/0.12 NA | ~2.7 µm | 6.04 µm |
| spinning disc | 4x/0.2 NA | ~1.6 µm | 7.43 µm |

My first question would be: why was the measured resolution for the spinning disc so much worse than theory? (It was just a bead on a coverslip.) It makes me suspect some sort of error. (Note also that they used a 4 µm bead, well above the diffraction limit for both objective lenses, to make these measurements, so the reported FWHM is really the bead convolved with the PSF rather than the PSF alone.)
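
If you want to sanity-check those numbers yourself, here's a rough sketch: the "theoretical" column matches 0.61·λ/NA, and even a crude quadrature correction for the 4 µm bead (a Gaussian approximation, not something the authors did) still leaves the measured values well above theory.

```python
import math

def rayleigh_lateral_um(wavelength_nm, na):
    """Theoretical lateral resolution (Rayleigh criterion, 0.61*lambda/NA) in µm."""
    return 0.61 * wavelength_nm / na / 1000.0

def bead_corrected_fwhm_um(measured_fwhm_um, bead_diameter_um):
    """Crude correction for a non-sub-resolution bead: subtract the bead
    diameter in quadrature (Gaussian approximation, rough at best)."""
    return math.sqrt(max(measured_fwhm_um**2 - bead_diameter_um**2, 0.0))

for scope, na, fwhm in [("light sheet 2.5x/0.12 NA", 0.12, 6.04),
                        ("spinning disc 4x/0.2 NA", 0.20, 7.43)]:
    print(f"{scope}: theory ~{rayleigh_lateral_um(525, na):.2f} µm, "
          f"bead-corrected FWHM ~{bead_corrected_fwhm_um(fwhm, 4.0):.2f} µm")
```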


In any case, making truly careful comparisons that are actually generalizable is just incredibly hard.


Thank you very much! How about using a standardized specimen, like those used for calibration, and comparing the captured image data, e.g. by applying the same image segmentation pipeline (or an ensemble of pipelines) and corresponding metrics, similar to the evaluation of machine learning models?
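
Concretely, I imagine something like the sketch below: a fixed (and versioned) segmentation pipeline applied to registered images of the same standard specimen from each system, scored against a reference annotation with an overlap metric such as Dice. Everything here is hypothetical and only meant to illustrate the idea.

```python
import numpy as np
from skimage.filters import threshold_otsu
from skimage.measure import label

def segment(image):
    """Placeholder pipeline: Otsu threshold + connected-component labelling.
    A real benchmark would fix and version the entire pipeline."""
    return label(image > threshold_otsu(image))

def dice(mask_a, mask_b):
    """Dice coefficient between two binary masks (1.0 = identical)."""
    a, b = mask_a > 0, mask_b > 0
    denom = a.sum() + b.sum()
    return 2.0 * np.logical_and(a, b).sum() / denom if denom else 1.0

# Tiny synthetic demo standing in for two modalities imaging the same object
rng = np.random.default_rng(0)
truth = np.zeros((64, 64))
truth[20:44, 20:44] = 1.0
img_a = truth * 100 + rng.normal(10, 5, truth.shape)   # stand-in for modality A
img_b = truth * 80 + rng.normal(10, 8, truth.shape)    # stand-in for modality B
print("A vs reference:", round(dice(segment(img_a), truth), 3))
print("B vs reference:", round(dice(segment(img_b), truth), 3))
# With real data: load registered images of the standard specimen from each
# system (e.g. tifffile.imread) and compare against a curated reference mask.
```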

We compared different microscopes/imaging modalities in the paper below. We discuss the difficulties of fairly comparing microscopes and explain the (great!) lengths we went to in developing and validating our method. It's an old paper now but the principles still apply.

Murray, J.M. et al. (2007) Evaluating performance in three-dimensional fluorescence microscopy. J. Microsc. 228, 390–405.


Yeah, I highly recommend @jennifer's paper as an excellent example of just how much thought needs to go into a fair comparison of two microscopes. (It gets even harder when trying to compare modalities with dramatically different illumination strategies, like LSFM and confocal.)

As for the standardized specimen you mention, it's also a great idea. The beads described in Jennifer's paper were specially designed. John Murray also published a strategy for getting exactly 240 FPs in a viral capsid (An icosahedral virus as a fluorescent calibration standard: a method for counting protein molecules in cells by fluorescence microscopy - PubMed), which can be used in theory. There are also Argolight slides and things like that.
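
In principle, a standard like that lets you convert measured intensity into an absolute molecule count, roughly along these lines (made-up numbers; assumes identical imaging conditions, full FP maturation and a linear detector):

```python
def molecules_from_intensity(spot_intensity, standard_intensity, copies_per_standard=240):
    """Estimate molecule count by ratio to a calibration standard carrying a
    known number of fluorescent proteins (e.g. the 240-FP viral capsid).
    Assumes identical imaging conditions, full maturation, linear detection."""
    return spot_intensity * copies_per_standard / standard_intensity

# Hypothetical background-subtracted integrated intensities
capsid = 1.2e6   # mean intensity of the capsid standard
spot = 3.0e5     # intensity of a spot of interest
print(f"~{molecules_from_intensity(spot, capsid):.0f} molecules")  # ~60
```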


The move away from the RMS standard to manufacturer-specific infinity optics adds significantly to the issues raised by @talley and @jennifer. You can't just 'use the same objective' on different microscopes, because each brand of infinity-corrected microscope has its own proprietary 'tube lens' designed to work together with its own brand-specific objectives.
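
To make that concrete: the major manufacturers use different tube-lens focal lengths (roughly 200 mm for Nikon and Leica, 180 mm for Olympus/Evident, 165 mm for Zeiss), so quite apart from the optical-correction mismatch, putting an objective on another brand's stand changes the magnification you actually get. A toy calculation:

```python
def effective_magnification(nominal_mag, design_tube_mm, actual_tube_mm):
    """Magnification obtained when an objective designed for one tube-lens
    focal length sits behind a different tube lens (ignores the aberrations
    introduced by the mismatch, which are often the bigger problem)."""
    return nominal_mag * actual_tube_mm / design_tube_mm

# Example: a nominal 20x objective designed for a 180 mm tube lens,
# used on a stand with a 200 mm tube lens (approximate focal lengths).
print(effective_magnification(20, design_tube_mm=180, actual_tube_mm=200))  # ~22.2
```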

Still, it should be possible to construct some meaningful test for a range of specific types of specimen preparation, looking at specific features with a given set of parameters. I agree it will be 'difficult but not impossible', and it would be very instructive to see because it might influence interpretation of results published in the literature.

One thing you can be reasonably sure of: the big microscope manufacturers are not going to want to sponsor such a study unless they can be sure that their brand/scope will come out at (or near) the top of the resulting 'league table', because you can imagine how prospective microscope buyers would use the data.

P


We have dabbled in comparisons as well, and I share the sentiments above. I think there has been a fair bit of development in test samples that might take us further. Considering everything involved (SNR, detector sensitivity, bleaching, laser output, resolution, etc.), having stable, well-defined test samples is a must. Comparing different types of detectors is an issue in itself: PMTs, HyDs, SPADs and cameras give very intriguing results.

We've had some good results using FNTDs** (in which we made squares with different intensities locally using a multi-photon setup) and have tried developing super-resolution test samples using etching techniques***. Making a good comparison will definitely need at least multiple samples and many measurements. But I think the best test will remain a difficult biological sample.
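
As one example of what such intensity-step samples enable, a photon-transfer-style mean/variance fit gives a camera's conversion gain, which you need before intensity values from two detectors can be compared at all. This is a rough sketch for cameras only (PMT/HyD/SPAD statistics need a different treatment), shown here on synthetic data:

```python
import numpy as np

def mean_variance_gain(image_pairs):
    """Photon-transfer-style gain estimate for a camera.

    image_pairs: (frame_a, frame_b) pairs at different stable illumination
    levels, e.g. one pair per intensity square. Differencing the two frames
    removes fixed-pattern structure; the slope of variance vs. mean is the
    conversion gain in ADU per electron (read noise/offset sit in the intercept).
    """
    means, variances = [], []
    for a, b in image_pairs:
        a, b = a.astype(float), b.astype(float)
        means.append((a.mean() + b.mean()) / 2.0)
        variances.append(np.var(a - b, ddof=1) / 2.0)  # /2: differencing doubles variance
    slope, _ = np.polyfit(means, variances, 1)
    return slope

# Synthetic demo: simulate a camera with 0.5 ADU/e- at five illumination levels
rng = np.random.default_rng(1)
gain = 0.5
pairs = [(gain * rng.poisson(n, (128, 128)), gain * rng.poisson(n, (128, 128)))
         for n in (200, 500, 1000, 2000, 5000)]
print(f"estimated gain ~ {mean_variance_gain(pairs):.2f} ADU/e-")
```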

I do think that if you are a heavy user of microscopy, it is a very good exercise to try at least; I have learned a lot (and am still learning) about different detectors and how they function that way. Quite surprisingly, even similar detectors behave differently between different microscope companies.

** https://onlinelibrary.wiley.com/doi/full/10.1111/jmi.12686
*** Electron-beam patterned calibration structures for structured illumination microscopy | Scientific Reports


I know that QUAREP-LiMi is working on standardizing quality control criteria, metadata, performance indicators, etc. specifically for light microscopy: https://quarep.org/
