Accurate detection of cephalometric landmarks is crucial for orthodontic diagnosis and treatment planning. Current methods rely on a cascading form of multiple models to achieve higher accuracy, which greatly complicates both training and deployment processes. In this paper, we introduce a novel regression paradigm capable of simultaneously detecting all cephalometric landmarks in high-resolution X-ray images. Our approach only utilizes the encoder module from the transformer to design a dual-encoder architecture, enabling precise detection of cephalometric landmark positions from coarse to fine. Specifically, the entire model architecture comprises three main components: a feature extractor module, a reference encoder module, and a finetune encoder module. These components are respectively responsible for feature extraction and fusion for X-ray images, coarse localization of cephalometric landmark, and fine-tuning of cephalometric landmark positioning. Notably, our framework is fully end-to-end differentiable and innately learns to exploit the interdependencies among cephalometric landmarks. Experiments demonstrate that our method significantly surpasses the current state-of-the-art methods in Mean Radical Error (MRE) and the 2mm Success Detection Rate (SDR) metrics, while also reducing computational resource consumption. Our code will be available soon.
Live content is unavailable. Log in and register to view live content