Ⅰ. INTRODUCTION
Computed tomography (CT) is used in various medical fields, such as disease diagnosis and biopsy guidance, and the frequency of such examinations is increasing rapidly. With rapid hardware and software developments, CT can significantly shorten scan times and provide images of excellent quality. Unlike plain X-ray imaging, CT can acquire images of abdominal organs more easily. In accordance with international standards, images acquired using CT can be stored and transferred through a picture archiving and communication system (PACS). Although rapid diagnosis is necessary, the increase in CT scans and the resulting volume of images has been identified as a factor that slows the diagnostic process[1]. Furthermore, diagnosis using CT imaging involves identifying organic relationships such as the shape, location, and size of organs.

Recent advances in deep learning have significantly affected social and industrial sectors by enabling predictions of greater accuracy. In medical imaging, the accuracy and speed of image recognition and classification are improving and are among the most actively investigated topics[2]. The convolutional neural network (CNN) and you only look once (YOLO) algorithms, specifically, are widely used for object detection in image recognition[3]. A CNN is an artificial neural network based on convolution operations that is primarily used in image recognition[4]. It is typically used for classifying images, videos, and texts but has the limitation of a long processing time[5]. YOLO, on the other hand, performs recognition relatively quickly compared with a CNN[6]. Accordingly, this study aims to evaluate the accuracy of normal kidney and vertebrae recognition in abdominal CT images using YOLOv3.
Ⅱ. MATERIAL AND METHODS
1. CT image acquisition
In this study, 1000 pancreas CT images were acquired from The Cancer Imaging Archive[7]; 900 of the images were used for deep learning training, and the remaining 100 images were used to evaluate the trained model.
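For illustration, the following is a minimal sketch of the 900/100 split described above, assuming the slices are stored as individual PNG files in a directory named pancreas_ct/ (a hypothetical path) and that Darknet-style training and validation list files are produced.

```python
import random
from pathlib import Path

# Hypothetical directory containing the 1,000 pancreas CT slices as image files.
image_paths = sorted(Path("pancreas_ct").glob("*.png"))
assert len(image_paths) == 1000

random.seed(42)                     # fixed seed so the split is reproducible
random.shuffle(image_paths)

train_images = image_paths[:900]    # 900 images for YOLOv3 transfer learning
eval_images = image_paths[900:]     # 100 images held out for evaluation

# Darknet-style list files: one image path per line.
Path("train.txt").write_text("\n".join(str(p) for p in train_images))
Path("valid.txt").write_text("\n".join(str(p) for p in eval_images))
```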
2. Deep learning preprocessing
Yolo_mark was used for marking and labeling, as shown in Fig. 1, where bounding boxes were drawn on the kidneys and vertebrae for YOLOv3 transfer learning.
Fig. 1. Kidney and vertebrae labeling using YOLO_mark.
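Yolo_mark stores the annotations for each image as a plain-text file in the YOLO label format, one line per bounding box containing the class index followed by the box centre and size normalized to the image dimensions. The following minimal sketch illustrates writing and reading such a file; the class indices (0 for kidney, 1 for vertebrae), coordinates, and file name are assumptions made for illustration only.

```python
from typing import List, Tuple

# One annotation: (class_id, x_center, y_center, width, height), all coordinates
# normalized to [0, 1] relative to the image width and height (YOLO label format).
Annotation = Tuple[int, float, float, float, float]

def write_yolo_labels(path: str, boxes: List[Annotation]) -> None:
    """Write one 'class x_center y_center width height' line per box."""
    with open(path, "w") as f:
        for class_id, xc, yc, w, h in boxes:
            f.write(f"{class_id} {xc:.6f} {yc:.6f} {w:.6f} {h:.6f}\n")

def read_yolo_labels(path: str) -> List[Annotation]:
    """Parse a Yolo_mark-style label file back into a list of boxes."""
    boxes = []
    with open(path) as f:
        for line in f:
            class_id, xc, yc, w, h = line.split()
            boxes.append((int(class_id), float(xc), float(yc), float(w), float(h)))
    return boxes

# Example: one kidney (class 0) and one vertebra (class 1) box on a slice.
write_yolo_labels("slice_0001.txt", [(0, 0.32, 0.55, 0.18, 0.22),
                                     (1, 0.50, 0.60, 0.20, 0.25)])
print(read_yolo_labels("slice_0001.txt"))
```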
3. YOLOv3 network configuration and training
Seven convolution layers, excluding the input and output layers, were configured, and the input data were resized to 416 × 416 pixels. A 3 × 3 convolution filter with a padding and stride of 1 was applied to each layer, followed by batch normalization. Furthermore, a leaky rectified linear unit was applied as the activation function, as shown in the network displayed in Fig. 2[8,9].
Fig. 2. Diagram of YOLOv3 network.
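The repeated convolution block described above (a 3 × 3 filter with stride 1 and padding 1, followed by batch normalization and a leaky ReLU) can be sketched as follows in PyTorch. This is only an illustrative sketch, not the authors' actual Darknet configuration; the channel widths are assumptions.

```python
import torch
import torch.nn as nn

def conv_block(in_channels: int, out_channels: int) -> nn.Sequential:
    """3x3 convolution, stride 1, padding 1, batch normalization, leaky ReLU."""
    return nn.Sequential(
        nn.Conv2d(in_channels, out_channels, kernel_size=3, stride=1, padding=1, bias=False),
        nn.BatchNorm2d(out_channels),
        nn.LeakyReLU(0.1, inplace=True),
    )

# Seven convolution blocks between the input and output layers; the channel
# widths below are illustrative assumptions, not the values used in the paper.
channels = [3, 32, 64, 128, 256, 512, 1024, 1024]
backbone = nn.Sequential(*[conv_block(channels[i], channels[i + 1]) for i in range(7)])

x = torch.randn(1, 3, 416, 416)   # input resized to 416 x 416 pixels
features = backbone(x)
print(features.shape)             # spatial size is preserved by stride 1 / padding 1
```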
YOLOv3 training was performed on a system running the Ubuntu 16.04 operating system with a GeForce GTX 1080 Ti GPU (11 GB) and 32 GB of RAM.
4. Analysis of deep learning progress
The YOLOv3 transfer learning progress was analyzed using the average loss, the Region 82 and Region 94 average intersection over union (IoU), and the class, .5R, and .75R indicators. The average loss indicates the error rate with respect to the correct response; a value closer to 0 indicates that the training is closer to the correct response. Region 82 corresponds to the detection layer that uses the largest anchor masks and is responsible for predicting large objects, whereas Region 94 uses the medium masks. The IoU, computed for each region, is an assessment measure for object detection obtained by dividing the area of intersection between the ground-truth bounding box of the actual object and the predicted bounding box by the area of their union; it is used as an indicator of accuracy in object detection. The closer the value of Eq. (1) is to 1, the greater the accuracy.
\(\text{IoU} = \frac{\text{Area of Overlap}}{\text{Area of Union}}\) (1)
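Eq. (1) can be computed directly from two axis-aligned bounding boxes. The following minimal sketch assumes boxes given as (x_min, y_min, x_max, y_max) pixel coordinates.

```python
def iou(box_a, box_b):
    """Intersection over Union of two boxes given as (x_min, y_min, x_max, y_max)."""
    # Coordinates of the intersection rectangle.
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])

    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)          # area of overlap
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter                         # area of union

    return inter / union if union > 0 else 0.0

# Example: ground-truth kidney box vs. a predicted box.
print(iou((100, 120, 180, 200), (110, 130, 190, 210)))     # approximately 0.62
```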
Furthermore, the closer the class, .5R, and .75R indicators are to 1, the higher the correct response rate. New weights were acquired through the training monitored by these indicators.
5. Analysis of deep learning model validation
The new weights acquired from YOLOv3 transfer learning were applied to 100 pancreas CT images to measure detection accuracy. Furthermore, real-time detection using a camera was confirmed.
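As an illustration of how the trained weights could be applied to a test image, the following is a minimal sketch using OpenCV's DNN module, which loads Darknet configuration and weight files directly. The file names (yolov3-kidney.cfg, yolov3-kidney.weights, test_slice.png), class order, and confidence threshold are assumptions; the authors' actual validation pipeline is not described in detail.

```python
import cv2
import numpy as np

# Hypothetical file names for the trained network and a test slice.
net = cv2.dnn.readNetFromDarknet("yolov3-kidney.cfg", "yolov3-kidney.weights")
layer_names = net.getLayerNames()
output_layers = [layer_names[i - 1] for i in net.getUnconnectedOutLayers().flatten()]

image = cv2.imread("test_slice.png")
h, w = image.shape[:2]

# YOLOv3 expects a 416 x 416 input scaled to [0, 1].
blob = cv2.dnn.blobFromImage(image, 1 / 255.0, (416, 416), swapRB=True, crop=False)
net.setInput(blob)
outputs = net.forward(output_layers)

classes = ["kidney", "vertebrae"]      # assumed class order from labeling
for output in outputs:
    for detection in output:
        scores = detection[5:]
        class_id = int(np.argmax(scores))
        confidence = float(scores[class_id])
        if confidence > 0.5:
            cx, cy, bw, bh = detection[0:4] * np.array([w, h, w, h])
            x, y = int(cx - bw / 2), int(cy - bh / 2)
            cv2.rectangle(image, (x, y), (x + int(bw), y + int(bh)), (0, 255, 0), 2)
            cv2.putText(image, f"{classes[class_id]} {confidence:.2f}",
                        (x, y - 5), cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 1)

cv2.imwrite("detection_result.png", image)
```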
Ⅲ. RESULTS
As a result of YOLOv3 training, no distinct learning effects were observed in any indicator after 1520 epochs; hence, training was discontinued.
1. Results of deep learning progress
1.1 Average loss
The average loss approached 0 after 300 epochs. Training was continued up to 1520 epochs to further improve the learning of Regions 82 and 94. The results by epoch are shown in Fig. 3.
Fig. 3. Average loss value by epochs.
1.2 Region 82 and Region 94
At 1520 epochs, the average IoU, class, .5R, and .75R were 0.82, 0.99, 1.00, and 0.81 for Region 82, respectively, and 0.82, 0.99, 1.00, and 0.88 for Region 94, respectively. It was confirmed that the accuracy of each indicator improved as the number of epochs increased. The results are shown in Table 1 and Fig. 4.
Table 1. Results of YOLOv3 Region indicators by epochs
Fig. 4. Results of YOLOv3 Region indicators by epochs.
2. Deep learning model validation
The accuracy verification on 100 pancreas CT images using the weights generated from YOLOv3 transfer learning demonstrated kidney and vertebrae detection accuracies of 83.00% and 82.45%, respectively. Moreover, the function of the model during real-time detection using a camera was confirmed. The results are shown in Fig. 5.
Fig. 5. Real-time CT image object detection using YOLOv3.
Ⅳ. DISCUSSION
The increased use of CT in medicine has led to many more examinations, resulting in delays in image readings. A rapid review and diagnosis is therefore necessary, and various deep learning methods are being applied in medicine[10-11]. Deep learning in medical imaging is used through various models, such as object detection for lesion diagnosis, medical image visualization using three-dimensional augmented reality, medical image segmentation using the U-Net model, and computer-aided diagnosis (CAD), with accuracy and predictive power increasing alongside advances in computer science[12-16]. In addition, Korean companies such as Lunit and VUNO are advancing deep learning in medical fields and plan to use it in clinical practice upon approval from the Ministry of Food and Drug Safety. In this study, an object detection model using pancreas CT images demonstrated organ detection accuracy exceeding 80%. In comparable previous studies, such as a cancer classification study by Guo et al. (AUC 0.989) and a study by Tokai et al. (80.9% accuracy), detection accuracy ranged from a minimum of 80.9% to a maximum of 89%[17-19]. As such, it can be concluded that kidney and vertebrae detection in pancreas CT images using YOLOv3 is reliable. Nonetheless, learning was performed only for normal kidneys and vertebrae; hence, its performance in lesion diagnosis could not be verified. This indicates the need to implement a real-time lesion detection model in future studies.
Ⅴ. CONCLUSION
In this study, YOLOv3 successfully recognized the kidneys and vertebrae in abdominal CT images; hence, these results may serve as basic data for medical image object detection using deep learning.
Acknowledgement
This research was supported by a grant funded by the 2020 University Innovation Support Project of Eulji University.
References
- E. S. Amis Jr., P. F. Butler, K. E. Applegate, S. B. Birnbaum, L. F. Brateman, J. M. Hevezi, F. A. Mettler, R. L. Morin, M. J. Pentecost, G. G. Smith, K. J. Strauss, R. K. Zeman, "American College of Radiology white paper on radiation dose in medicine", Journal of the American College of Radiology, Vol. 4, No. 5, pp. 272-284, 2007. http://dx.doi.org/10.1016/j.jacr.2007.03.002
- Y. J. Kim, K. G. Kim, "Development of an Optimized Deep Learning Model for Medical Imaging", Journal of the Korean Society of Radiology, Vol. 81, No. 6, pp. 1274-1289, 2020. https://doi.org/10.3348/jksr.2020.0171
- J. Feng, X. Wu, Y. Zhang, "Lane detection base on deep learning", 2018 11th International Symposium on Computational Intelligence and Design (ISCID), IEEE, pp. 315-318, 2018. https://doi.org/10.1109/ISCID.2018.00078
- Y. H. Lee, Y. Kim, "Comparison of CNN and YOLO for Object Detection", Journal of the Semiconductor & Display Technology, Vol. 19, No. 1, pp. 85-92, 2020.
- J. Redmon, A. Farhadi, "YOLOv3: An Incremental Improvement", arXiv preprint arXiv:1804.02767, 2018.
- J. W. Chae, H. C. Cho, "Detecting Abnormal Behavior of Cattle based on Object Detection Algorithm", The Transactions of the Korean Institute of Electrical Engineers, Vol. 69, No. 3, pp. 468-473, 2020. http://dx.doi.org/10.5370/KIEE.2020.69.3.468
- H. R. Roth, A. Farag, E. B. Turkbey, L. Lu, J. Liu, R. M. Summers, "Data From Pancreas-CT", The Cancer Imaging Archive, 2016. http://doi.org/10.7937/K9/TCIA.2016.tNB1kqBU
- D. H. Kim, D. Y. Kim, I. H. Lee, "Performance Analysis of Wireless Communication Systems Using Deep Learning Based Transmit Power Control in Nakagami Fading Channels", Journal of the Korea Institute of Information and Communication Engineering, Vol. 24, No. 6, pp. 744-750, 2020. https://doi.org/10.6109/JKIICE.2020.24.6.744
- J. S. Han, K. C. Kwak, "Image Classification Using Convolutional Neural Network and Extreme Learning Machine Classifier Based on ReLU Function", Journal of KIIT, Vol. 15, No. 2, pp. 15-23, 2017. http://dx.doi.org/10.14801/jkiit.2017.15.2.15
- R. Azuma, Y. Baillot, R. Behringer, S. Feiner, S. Julier, B. MacIntyre, "Recent Advances in Augmented Reality", IEEE Computer Graphics and Applications, Vol. 21, No. 6, pp. 34-47, 2001. http://dx.doi.org/10.1109/38.963459
- S. H. Lim, Y. J. Kim, K. G. Kim, "Three-Dimensional Visualization of Medical Image using Image Segmentation Algorithm based on Deep Learning", Journal of Korea Multimedia Society, Vol. 23, No. 3, pp. 468-475, 2020. https://doi.org/10.9717/kmms.2020.23.3.468
- J. Jang, C. M. Tschabrunn, M. Barkagan, E. Anter, B. Menze, R. Nezafat, et al., "Three-dimensional Holographic Visualization of High Resolution Myocardial Scar on Hololens", PLoS One, Vol. 13, No. 10, e0205188, 2018. http://dx.doi.org/10.1371/journal.pone.0205188
- S. Sivakumar, C. Chandrasekar, "Lung Nodule Detection Using Fuzzy Clustering and Support Vector Machines", International Journal of Engineering and Technology, Vol. 5, No. 1, pp. 179-185, 2013. https://doi.org/10.1007/s10916-016-0539-9
- S. J. Park, Y. J. Kim, D. Y. Park, J. W. Jung, "Evaluation of Transfer Learning in Gastroscopy Image Classification using Convolutional Neural Network", Journal of Korea Society of Medical and Biological Engineering, Vol. 39, No. 5, pp. 213-219, 2018. https://doi.org/10.9718/JBER.2018.39.5.213
- D. Shen, G. Wu, H. I. Suk, "Deep Learning in Medical Image Analysis", Annual Review of Biomedical Engineering, Vol. 19, pp. 221-248, 2017. https://doi.org/10.1146/annurev-bioeng-071516-044442
- O. Ronneberger, P. Fischer, T. Brox, "U-Net: Convolutional Networks for Biomedical Image Segmentation", Proceedings of the International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI), Vol. 9351, pp. 234-241, 2015.
- L. Guo, X. Xiao, C. Wu, X. Zeng, Y. Zhang, J. Du, S. Bai, J. Xie, Z. Zhang, Y. Li, X. Wang, O. Cheung, M. Sharma, J. Liu, B. Hu, "Real-time automated diagnosis of precancerous lesions and early esophageal squamous cell carcinoma using a deep learning model (with videos)", Gastrointestinal Endoscopy, Vol. 91, No. 1, pp. 41-51, 2020. http://dx.doi.org/10.1016/j.gie.2019.08.018
- Y. Tokai, T. Yoshio, K. Aoyama, Y. Horie, S. Yoshimizu, Y. Horiuchi, A. Ishiyama, T. Tsuchida, T. Hirasawa, Y. Sakakibara, T. Yamada, S. Yamaguchi, J. Fujisaki, T. Tada, "Application of artificial intelligence using convolutional neural networks in determining the invasion depth of esophageal squamous cell carcinoma", Esophagus, Vol. 17, pp. 250-256, 2020. https://doi.org/10.1007/s10388-020-00716-x
- Y. Horie, T. Yoshio, K. Aoyama, S. Yoshimizu, Y. Horiuchi, A. Ishiyama, T. Hirasawa, T. Tsuchida, T. Ozawa, S. Ishihara, Y. Kumagai, M. Fujishiro, I. Maetani, J. Fujisaki, T. Tada, "Diagnostic outcomes of esophageal cancer by artificial intelligence using convolutional neural networks", Gastrointestinal Endoscopy, Vol. 89, No. 1, pp. 25-32, 2019. https://doi.org/10.1016/j.gie.2018.07.037