Trust me, I DO have conviction in scaling laws, the paradigm that data + compute is all you need. I find the “less structure, more intelligence” idea very appealing [1].
BUT, we are just not there yet, at least in the medical imaging domain.
So, here we are, leveraging prior knowledge (aka domain knowledge) to boost training efficiency and data utilization.
Before I dive into medical imaging specifics, I would like to point out:
Many natural imaging CV tasks leverage prior knowledge as well
- I would argue pose landmark detection relies on a HUMAN-DEFINED interpretation of the skeleton, instead of a naturally learned representation.
- BEV feature aggregation is predefined by human-calculated epipolar consistency.
- Speaking of autonomous vehicles, this target trajectory prediction paper on how to model the coordinate system in the transformer positional embedding is pretty interesting [1].
Radiologists utilize prior knowledge to build better visualizations
- Curved multiplanar reformation (CPR) and stretched multiplanar reformation (sMPR) are commonly used to diagnose coronary disease. In essence, they stretch the 3D vessel along a plane into a straight line, which makes its structure and interior easier to interpret by discarding the rest of the vast 3D voxel volume.
- Similarly, rib-unfolding visualization makes it easier and faster to pinpoint rib fractures, even for non-experts.
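To make the "stretch the vessel into a straight line" idea concrete, here is a toy sketch in plain Python. It is not a clinical CPR/sMPR implementation: it assumes the centerline is already given as a list of `(z, y, x)` points, uses nearest-neighbour sampling, and picks the sampling direction with a fixed `reference` axis that must not be parallel to the vessel. The names `straighten`, `half_width`, and `reference` are made up for illustration.

```python
import math

def _normalize(v):
    n = math.sqrt(sum(c * c for c in v))
    return tuple(c / n for c in v)

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def straighten(volume, centerline, half_width, reference=(1.0, 0.0, 0.0)):
    """Return a 2D image with one row per centerline point.

    volume is indexed as volume[z][y][x]; each row samples the volume
    across the vessel, perpendicular to the local tangent.
    Assumption: `reference` is never parallel to the tangent.
    """
    depth, height, width = len(volume), len(volume[0]), len(volume[0][0])
    last = len(centerline) - 1
    rows = []
    for i, point in enumerate(centerline):
        # Tangent from neighbouring centerline points (one-sided at the ends).
        prev_pt = centerline[max(i - 1, 0)]
        next_pt = centerline[min(i + 1, last)]
        tangent = _normalize(tuple(n - p for n, p in zip(next_pt, prev_pt)))
        # Direction perpendicular to the vessel along which we sample.
        normal = _normalize(_cross(tangent, reference))
        row = []
        for offset in range(-half_width, half_width + 1):
            # Nearest-neighbour lookup; samples outside the volume become 0.
            z, y, x = (round(c + offset * d) for c, d in zip(point, normal))
            inside = 0 <= z < depth and 0 <= y < height and 0 <= x < width
            row.append(volume[z][y][x] if inside else 0)
        rows.append(row)
    return rows
```

However the vessel bends in 3D, it always ends up in the middle column of the output, which is exactly why the reformatted view is easier to read than the raw volume.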
Examples in medical imaging analysis
There are MANY more; I just name a few I’ve come across over the years.
- Mass detection in mammography: mass features should be correlated across multiple views.
- Vertebra localization and identification in CT: vertebrae should form a coherent centerline along the spine.
- Suspicious node malignancy prediction: should take the condition of surrounding lymph nodes into account.
- Vessel centerline extraction: easier/better to model with a graph (nodes) than with segmentation (voxels).
  DeformCL: Learning Deformable Centerline Representation for Vessel Extraction in 3D Medical Image
Not to mention that algorithms also build on top of specialized reconstructions made for human interpretation; unsurprisingly, what’s easier for a radiologist to tell is also easier for a model to learn.
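As a small illustration of the graph-versus-voxels point in the vessel centerline item above, the sketch below turns a set of centerline voxels into an adjacency dict and reads endpoints and bifurcations straight off the node degrees. This is only a toy conversion under my own assumptions (26-connectivity, a clean one-voxel-wide centerline); it is not how DeformCL represents centerlines, and the function names are hypothetical.

```python
from itertools import product

def voxels_to_graph(voxels):
    """Map centerline voxels (z, y, x) to an adjacency dict (26-connectivity)."""
    nodes = set(voxels)
    offsets = [d for d in product((-1, 0, 1), repeat=3) if d != (0, 0, 0)]
    graph = {v: [] for v in sorted(nodes)}
    for z, y, x in graph:
        for dz, dy, dx in offsets:
            neighbour = (z + dz, y + dy, x + dx)
            if neighbour in nodes:
                graph[(z, y, x)].append(neighbour)
    return graph

def endpoints_and_branches(graph):
    """Degree-1 nodes are vessel ends; degree >= 3 nodes are bifurcations."""
    ends = [v for v, nbrs in graph.items() if len(nbrs) == 1]
    branches = [v for v, nbrs in graph.items() if len(nbrs) >= 3]
    return ends, branches
```

Questions like "where does the vessel branch?" become a one-line degree check on the graph, whereas on a voxel mask they need extra post-processing.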
What is leveraging prior knowledge anyway?
Essentially, leveraging prior knowledge is the act/technique of carrying biologically, physically, or pathologically grounded knowledge into the model, in the form of “tensor affinity”:
- Choose a level of abstraction: pixel space, spatial feature space, semantic feature space, instance feature space, etc.
- Heuristically set the “communication rule” (which tensor shall talk to which).
- Choose a way of communication (feature aggregation); the transformer is the default.
- Note that the result might be warped into a whole new space, e.g. dense pixels -> sparse nodes, or vice versa.
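The recipe above can be sketched as attention restricted by a hand-written affinity mask. This is a minimal sketch in plain Python, assuming instance-level tokens (say, one feature vector per vertebra candidate), a "neighbours along the spine" communication rule, and identity query/key/value projections instead of learned weights. `chain_affinity` and `masked_attention` are names I made up for illustration.

```python
import math

def chain_affinity(n, reach=1):
    """Communication rule: token i may talk to tokens within `reach` positions."""
    return [[abs(i - j) <= reach for j in range(n)] for i in range(n)]

def masked_attention(features, affinity):
    """Single-head dot-product attention restricted by a boolean affinity matrix.

    features: list of N feature vectors (lists of floats).
    affinity: N x N booleans; affinity[i][j] is True if token i may attend to j.
    Assumption: every row allows at least one token (e.g. itself).
    """
    dim = len(features[0])
    out = []
    for i, query in enumerate(features):
        allowed = [j for j, ok in enumerate(affinity[i]) if ok]
        scores = [sum(q * k for q, k in zip(query, features[j])) / math.sqrt(dim)
                  for j in allowed]
        # Numerically stable softmax over the allowed tokens only.
        peak = max(scores)
        weights = [math.exp(s - peak) for s in scores]
        total = sum(weights)
        out.append([sum(w / total * features[j][d] for w, j in zip(weights, allowed))
                    for d in range(dim)])
    return out

# Four tokens ordered along the spine; each one only hears its direct neighbours.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [5.0, -3.0]]
mixed = masked_attention(tokens, chain_affinity(len(tokens)))
```

The prior lives entirely in the mask: swapping `chain_affinity` for a multi-view correspondence or a lymph-node neighbourhood changes the domain knowledge without touching the aggregation code.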
References
- X: ↩