2024-04-30
2024-06-28
2024-06-06
Manuscript received January 17, 2024; revised February 1, 2024; accepted February 28, 2024; published May 8, 2024.
Abstract: CT imaging has advanced significantly, yet extracting precise information from complex image data remains challenging. This study focuses on automating the extraction of the spine region from CT images. We adopt the U-Net architecture and apply a multi-scale blurring technique to the data to obtain a multi-resolution representation. This method is designed to capture information at various granularities, from fine detail to broader structures. After applying this multi-step blur, we calculate the difference between adjacent blurred images to exploit the changes between resolutions. Although feeding the blurred results directly into the U-Net model may yield satisfactory results, computing the differences between blurred images focuses the model on the nuances of these changes. To further enhance accuracy, we use ensemble learning, leveraging the weights from the training processes of multiple models to average their outputs during prediction. With this approach, we achieved a Dice score of 96.8% and improved the accuracy of CT image extraction.
Keywords: convolutional neural network, U-Net, medical image processing, spine segmentation, ensemble learning, attention gate
Cite: Kai Yang and Masayuki Kikuchi, "Automatic Detection of Spine Region Using Multiple Pseudo 3D U-Net Models with Weighted Average Voting and Attention Mechanisms," Journal of Image and Graphics, Vol. 12, No. 2, pp. 152-157, 2024.
Copyright © 2024 by the authors. This is an open access article distributed under the Creative Commons Attribution License (CC BY-NC-ND 4.0), which permits use, distribution and reproduction in any medium, provided that the article is properly cited, the use is non-commercial and no modifications or adaptations are made.
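The preprocessing and ensembling steps summarized in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The Gaussian kernel, the sigma values, the inclusion of the unblurred image as the first level, the voting weights, and the 0.5 threshold are all assumptions made for illustration, and the function names (`blur_differences`, `weighted_average_vote`, `dice`) are hypothetical. The U-Net itself is omitted; the sketch only shows how difference-of-blur channels might be built, how per-model probability maps might be averaged, and how a Dice score is computed.

```python
import numpy as np


def gaussian_kernel(sigma: float) -> np.ndarray:
    """1D Gaussian kernel normalized to sum to 1 (radius of about 3 sigma)."""
    radius = max(1, int(round(3 * sigma)))
    x = np.arange(-radius, radius + 1, dtype=float)
    kernel = np.exp(-(x ** 2) / (2 * sigma ** 2))
    return kernel / kernel.sum()


def blur(image: np.ndarray, sigma: float) -> np.ndarray:
    """Separable Gaussian blur with edge padding; output shape equals input shape."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    out = image.astype(float)
    for axis in range(out.ndim):
        pad = [(radius, radius) if a == axis else (0, 0) for a in range(out.ndim)]
        padded = np.pad(out, pad, mode="edge")
        out = np.apply_along_axis(
            lambda v: np.convolve(v, kernel, mode="valid"), axis, padded
        )
    return out


def blur_differences(image: np.ndarray, sigmas=(1.0, 2.0, 4.0)) -> np.ndarray:
    """Differences between adjacent blur levels, stacked as channels.

    Assumption: the unblurred image counts as the first level, so three
    sigmas yield three difference channels.
    """
    levels = [image.astype(float)] + [blur(image, s) for s in sigmas]
    diffs = [levels[i] - levels[i + 1] for i in range(len(levels) - 1)]
    return np.stack(diffs, axis=0)


def weighted_average_vote(prob_maps, weights, threshold: float = 0.5) -> np.ndarray:
    """Weighted average of per-model probability maps, then thresholding.

    The weights are hypothetical (for example, validation scores); they are
    normalized to sum to 1 before averaging.
    """
    probs = np.asarray(prob_maps, dtype=float)   # shape: (models, ...)
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    mean = np.tensordot(w, probs, axes=1)        # contract over the model axis
    return mean >= threshold


def dice(pred, target, eps: float = 1e-7) -> float:
    """Dice coefficient between two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    intersection = np.logical_and(pred, target).sum()
    return float((2.0 * intersection + eps) / (pred.sum() + target.sum() + eps))


if __name__ == "__main__":
    # Toy "slice": a bright square on a dark background.
    image = np.zeros((16, 16))
    image[4:12, 4:12] = 1.0
    features = blur_differences(image)
    print(features.shape)
```

In such a setup, the stacked difference channels would replace the raw slice as the network input, and the mask returned by `weighted_average_vote` would be compared against the ground-truth mask with `dice`.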