How AI Is Transforming PET Scan Interpretation

From automated lesion detection to SUV-based biomarker extraction, a clinical perspective on where AI adds genuine value - and where the hype outpaces the evidence.

Positron emission tomography has always generated more data than any radiologist can fully extract in a reading session. A whole-body FDG PET/CT study contains hundreds of transaxial slices across fused modalities, with tracer uptake patterns that shift based on the patient's metabolic state, blood glucose at the time of injection, and scan timing post-injection. AI does not change that complexity - but it changes how much of it a physician needs to manage manually.

The Gap Between Acquisition and Interpretation

PET scanning produces quantitative metabolic data. The standard metric, SUVmax (maximum standardized uptake value), is the highest single-voxel tracer concentration in a region of interest, normalized to the injected dose and the patient's body weight. But SUVmax alone is a blunt instrument. Lesion geometry, background uptake in adjacent structures, partial volume effects from small lesions, and the quality of attenuation correction all affect whether that number accurately reflects metabolic activity.
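For readers who want the normalization spelled out, the body-weight SUV can be written as a short calculation. This is a minimal sketch rather than any vendor's implementation: the function names and arguments are illustrative, the measured concentration is assumed to be referenced to scan start, tissue density is assumed to be about 1 g/mL, and the injected dose is decay-corrected with the fluorine-18 half-life.

```python
import math

F18_HALF_LIFE_MIN = 109.77  # physical half-life of fluorine-18, in minutes


def suv_body_weight(activity_bq_per_ml, injected_dose_bq, weight_kg, uptake_min,
                    half_life_min=F18_HALF_LIFE_MIN):
    """Body-weight SUV for one voxel value (illustrative helper).

    Assumes the concentration is referenced to scan start and that
    tissue density is ~1 g/mL, so the ratio is effectively unitless.
    """
    # Decay-correct the injected dose from injection time to scan start.
    decay = math.exp(-math.log(2) * uptake_min / half_life_min)
    dose_at_scan_bq = injected_dose_bq * decay
    weight_g = weight_kg * 1000.0
    return activity_bq_per_ml / (dose_at_scan_bq / weight_g)


def suv_max(roi_activities_bq_per_ml, injected_dose_bq, weight_kg, uptake_min):
    """SUVmax: the single hottest voxel in the region of interest."""
    return max(
        suv_body_weight(a, injected_dose_bq, weight_kg, uptake_min)
        for a in roi_activities_bq_per_ml
    )
```

Because the result rests on a single voxel, noise and reconstruction settings move it directly, which is one reason it is a blunt instrument on its own.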

Until recently, the nuclear medicine physician handled all of this manually - drawing regions of interest by hand, visually estimating background, and mentally correcting for known scanner limitations. A busy PET center reading 20 studies per day leaves limited time for thorough quantitative analysis on every case. This is the gap AI tools are positioned to close.

Deep learning-based lesion detection models, trained on large annotated datasets of FDG PET/CT studies, can automatically identify and mark hypermetabolic foci above a configurable SUV threshold. More sophisticated models segment each lesion and extract volumetric metrics - total lesion glycolysis (TLG) and metabolic tumor volume (MTV) - that have stronger prognostic associations in several cancer types than SUVmax alone.
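The volumetric metrics can be illustrated with a simple fixed-threshold calculation. This sketch stands in for the segmentation step only: real tools segment in 3D with learned models, whereas here a lesion is just a flat list of SUV values, and the function name, the dictionary keys, and the default threshold of 2.5 are assumptions chosen for illustration.

```python
def lesion_metrics(suv_voxels, voxel_volume_ml, suv_threshold=2.5):
    """Return SUVmax, MTV (mL) and TLG for one candidate lesion.

    suv_voxels: flat iterable of SUV values in the lesion's bounding region.
    suv_threshold: configurable cutoff; 2.5 is only a placeholder default.
    """
    active = [v for v in suv_voxels if v >= suv_threshold]
    if not active:
        return {"suv_max": None, "mtv_ml": 0.0, "tlg": 0.0}

    mtv_ml = len(active) * voxel_volume_ml   # metabolic tumor volume
    suv_mean = sum(active) / len(active)     # mean SUV inside the segmented volume
    return {
        "suv_max": max(active),
        "mtv_ml": mtv_ml,
        "tlg": suv_mean * mtv_ml,            # TLG = SUVmean x MTV
    }
```

Both MTV and TLG depend on where the threshold is set, so the same threshold rule has to be applied at every time point being compared.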

Where Automated Detection Actually Performs Well

Published performance data on AI-assisted PET reading is more nuanced than vendor marketing suggests. The strongest evidence exists for a specific subset of tasks: detecting bone metastases on FDG PET in patients with known primary cancer, identifying pathological lymph nodes above a size threshold, and flagging studies where lesion burden has changed significantly from a prior scan.

These tasks share a common characteristic - they involve finding objects that differ from a well-established normal background pattern. When a model is trained specifically on oncology FDG studies from a defined scanner protocol, sensitivity for detecting lesions above 1 cm diameter reaches 90-95% in controlled validation studies. That figure does not generalize uniformly - scanner type, acquisition protocol (2D vs. 3D mode, time per bed position), and tracer preparation all affect how well a model trained at Institution A performs at Institution B.

The partial volume effect remains a genuine limitation. For lesions smaller than roughly twice the scanner's spatial resolution (the resolution itself is typically 4-6 mm on modern PET/CT systems), measured SUV systematically underestimates true uptake. AI models that do not explicitly account for partial volume correction will generate quantitatively biased outputs for small lesions - which are precisely the ones where early detection matters most in staging studies.
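The size dependence can be made concrete with an idealized calculation. The sketch below assumes a uniform cubic lesion on zero background and an isotropic Gaussian point spread function. Neither holds exactly for real lesions or scanners, so the numbers indicate the trend only. The function name is illustrative.

```python
import math


def peak_recovery_cube(side_mm, fwhm_mm):
    """Peak recovery coefficient (measured peak / true uptake) for a uniform
    cubic lesion on zero background, blurred by an isotropic Gaussian PSF."""
    sigma = fwhm_mm / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    # 1D recovery at the centre of a box of width side_mm. A Gaussian is
    # separable, so the 3D value for a cube is the 1D value cubed.
    one_d = math.erf(side_mm / (2.0 * math.sqrt(2.0) * sigma))
    return one_d ** 3


for side in (5, 10, 20):
    print(side, "mm ->", round(peak_recovery_cube(side, fwhm_mm=5.0), 2))
```

Under these assumptions the recovered peak falls well below the true value once the lesion side approaches the resolution, while a lesion several times larger is recovered almost fully.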

Structured Reporting and the Extraction Problem

Beyond lesion detection, AI's contribution to PET interpretation increasingly lies in structured data extraction. PERCIST (PET Response Criteria in Solid Tumors) requires measuring SULpeak (peak SUV normalized to lean body mass) in the most metabolically active lesion, with up to five of the hottest lesions assessed as a secondary measure, comparing to prior study values, and classifying overall response as complete metabolic response, partial metabolic response, stable metabolic disease, or progressive metabolic disease. That workflow, done rigorously by hand, takes 15-25 minutes per study.
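The classification step itself is simple once the measurements exist. The following is a hedged sketch for a single target-lesion SULpeak measurement, not a complete PERCIST implementation. The 30% and 0.8 SUL unit thresholds are the commonly cited PERCIST 1.0 values and should be checked against the published criteria. The James formula is one common lean body mass estimate, used here as an assumption. All function and parameter names are illustrative.

```python
def lean_body_mass_james(weight_kg, height_cm, sex):
    """Lean body mass (kg) by the James formula; sex is "M" or "F"."""
    ratio_sq = (weight_kg / height_cm) ** 2
    if sex == "M":
        return 1.10 * weight_kg - 128.0 * ratio_sq
    return 1.07 * weight_kg - 148.0 * ratio_sq


def sul_from_suv(suv_bw, weight_kg, lbm_kg):
    """Convert a body-weight SUV to SUL (normalized to lean body mass)."""
    return suv_bw * lbm_kg / weight_kg


def percist_category(baseline_sul, followup_sul, new_lesions=False,
                     uptake_resolved=False, pct=30.0, abs_units=0.8):
    """Simplified response category from baseline and follow-up SULpeak."""
    if new_lesions:
        return "PMD"   # progressive metabolic disease
    if uptake_resolved:
        return "CMR"   # complete metabolic response
    change = followup_sul - baseline_sul
    pct_change = 100.0 * change / baseline_sul
    # Both the relative and the absolute criterion must be met.
    if pct_change <= -pct and change <= -abs_units:
        return "PMR"   # partial metabolic response
    if pct_change >= pct and change >= abs_units:
        return "PMD"
    return "SMD"       # stable metabolic disease
```

For example, `percist_category(10.0, 6.0)` returns `"PMR"`, while `percist_category(2.0, 1.3)` returns `"SMD"` because the 35% drop does not reach 0.8 SUL units. The sketch leaves out other parts of the criteria, such as checks on liver background and scan comparability.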

Automated PERCIST calculation requires the system to identify corresponding lesions across baseline and follow-up scans, account for differences in scanner calibration and scan timing, apply lean body mass normalization, and handle cases where lesions appear or disappear between time points. This is a registration and matching problem as much as a detection problem. Current AI tools handle straightforward cases reliably, but they fail in non-trivial ways when scan quality differs significantly between time points or when intervening treatment has changed lesion morphology.
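The matching part of the problem can be sketched as nearest-centroid pairing. This is a deliberately simple stand-in for what production tools do: it assumes both scans have already been registered into a common coordinate frame, represents each lesion by a centroid only, and uses a hypothetical 15 mm distance limit. The function name and return layout are illustrative.

```python
import math


def match_lesions(baseline, followup, max_dist_mm=15.0):
    """Greedy nearest-centroid matching between two time points.

    baseline, followup: dicts of lesion id -> (x, y, z) centroid in mm,
    assumed to be in a common, already registered coordinate frame.
    Returns (pairs, disappeared, new).
    """
    # All candidate pairs, closest first.
    candidates = sorted(
        (math.dist(b_xyz, f_xyz), b_id, f_id)
        for b_id, b_xyz in baseline.items()
        for f_id, f_xyz in followup.items()
    )
    pairs, used_b, used_f = [], set(), set()
    for dist, b_id, f_id in candidates:
        if dist > max_dist_mm:
            break  # remaining candidates are even farther apart
        if b_id in used_b or f_id in used_f:
            continue
        pairs.append((b_id, f_id))
        used_b.add(b_id)
        used_f.add(f_id)

    disappeared = sorted(set(baseline) - used_b)  # present only at baseline
    new = sorted(set(followup) - used_f)          # present only at follow-up
    return pairs, disappeared, new
```

A centroid-only rule like this breaks down in exactly the situations described above: when registration is poor, or when treatment has split, merged, or reshaped lesions between scans.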

As we discuss in our article on quantitative PET imaging and AI-derived biomarkers, the move from single-value SUVmax to multi-parametric biomarker extraction is where the next generation of imaging AI is focused - and where the evidence base is still being established.

The Physician's Role Does Not Diminish - It Changes

A common concern from nuclear medicine residents and junior physicians is that AI-assisted interpretation tools will reduce their diagnostic skill development. The opposite argument is more defensible: when AI handles the routine cataloguing of metabolically active foci, the physician's attention is freed for the clinically complex cases where pattern recognition and clinical correlation are genuinely irreplaceable.

The cases that benefit most from physician expertise are not the straightforward "active lymphoma vs. background" reads. They are the incidentalomas - the unexpected finding that requires differential diagnosis, the equivocal mediastinal node in a patient with inflammatory bowel disease, the brown fat uptake pattern misregistered as a lymph node. AI systems trained on labeled oncology data have no basis for handling these cases well, and the evidence from clinical deployment confirms this - false positive rates on atypical presentations remain high.

The physician who understands how an AI model was trained, what its confidence thresholds represent, and where its failure modes cluster will use it more effectively than one who treats it as an oracle. This is not a new dynamic - it mirrors the relationship between radiologists and CAD (computer-aided detection) systems for mammography, where over-reliance on CAD contributed to performance problems documented in the literature.

Protocol Standardization as a Prerequisite

One underappreciated barrier to effective AI integration in PET interpretation is protocol variability. Institutions running FDG PET/CT for oncology may use different uptake times (45 vs. 60 vs. 90 minutes), different blood glucose cutoffs for proceeding with the scan, and different reconstruction algorithms (OSEM vs. TOF-OSEM with different iteration/subset parameters). A model trained on 60-minute uptake studies from GE Discovery scanners will produce systematically different outputs when applied to 45-minute studies from a Siemens Biograph system.

This is not a theoretical concern. Published multi-site validation studies routinely report 10-20% degradation in detection sensitivity when models are applied to scanner/protocol combinations not represented in training data. The practical implication is that AI deployment for PET interpretation should begin with a prospective performance validation on the institution's specific scanner and protocol before being used in clinical workflow - a step that is frequently skipped in practice.
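A local validation check can be as simple as comparing per-lesion sensitivity on the institution's own cases with the figure reported for the training protocol. The sketch below is illustrative only: it treats degradation as a relative drop, which is an assumption about how the published figures are expressed, and the 10% acceptance limit and function names are hypothetical.

```python
def sensitivity(true_positives, false_negatives):
    """Per-lesion detection sensitivity on a locally annotated test set."""
    total = true_positives + false_negatives
    if total == 0:
        raise ValueError("no reference lesions in the validation set")
    return true_positives / total


def relative_degradation(reference_sens, local_sens):
    """Fractional drop in sensitivity relative to the reference figure."""
    return (reference_sens - local_sens) / reference_sens


def passes_local_validation(tp, fn, reference_sens, max_relative_drop=0.10):
    """Return (passed, local_sensitivity) for the site's scanner and protocol."""
    local = sensitivity(tp, fn)
    passed = relative_degradation(reference_sens, local) <= max_relative_drop
    return passed, local
```

With 80 detected and 20 missed lesions against a reference sensitivity of 0.92, the check fails under the assumed 10% limit. A real validation would also report confidence intervals and false positives per study.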

A Realistic Assessment

AI is making PET scan interpretation faster and more consistent for a well-defined subset of tasks: detecting hypermetabolic lesions above a threshold size, quantifying total lesion burden in oncology staging, and generating structured follow-up comparisons. For these tasks, the evidence supports clinical use with appropriate physician oversight.

For tasks involving clinical integration, ambiguous findings, incidental pathology, or rare presentations, AI tools provide limited value today - and pretending otherwise misleads both clinicians and administrators who make procurement decisions. The physicians who will get the most from these tools are those who understand what the model can and cannot do, rather than those expecting it to replace interpretive judgment.

NucliVision's approach is to present AI outputs as structured inputs to the physician's reading workflow, not as independent diagnostic conclusions. The enhanced images, extracted metrics, and preliminary report draft appear in the viewer alongside the original data - the physician decides what is clinically meaningful.
