• Title/Summary/Keyword: Image deep learning


Deep Learning Algorithm to Identify Cancer Pictures (딥러닝 기반 암세포 사진 분류 알고리즘)

  • Seo, Young-Min; Han, Jong-Ki
    • Journal of Broadcast Engineering / v.23 no.5 / pp.669-681 / 2018
  • The CNN (Convolutional Neural Network) is one of the most important techniques for identifying the kinds of objects in captured pictures. Whereas conventional models have been used for low-resolution images, techniques to recognize high-resolution images are becoming crucial in the field of artificial intelligence. In this paper, we propose an efficient CNN model based on dilated convolution and thresholding techniques to increase the recognition ratio and decrease the computational complexity. The simulation results show that the proposed algorithm outperforms the conventional method and that the thresholding technique enhances the performance of the proposed model.
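
The abstract does not include code; a minimal PyTorch sketch of a dilated-convolution block with a simple activation-thresholding step follows. The layer sizes, the threshold value, and where the thresholding is applied are assumptions for illustration, not the paper's published design.

```python
import torch
import torch.nn as nn

class DilatedBlock(nn.Module):
    """Convolution block with an enlarged receptive field via dilation,
    followed by an activation-thresholding step (illustrative only)."""
    def __init__(self, in_ch, out_ch, dilation=2, threshold=0.1):
        super().__init__()
        # padding = dilation keeps spatial size for a 3x3 kernel
        self.conv = nn.Conv2d(in_ch, out_ch, kernel_size=3,
                              padding=dilation, dilation=dilation)
        self.bn = nn.BatchNorm2d(out_ch)
        self.threshold = threshold

    def forward(self, x):
        y = torch.relu(self.bn(self.conv(x)))
        # Zero out weak activations; one plausible reading of the
        # abstract's thresholding technique for cutting computation.
        return torch.where(y > self.threshold, y, torch.zeros_like(y))

x = torch.randn(1, 3, 512, 512)      # a high-resolution input patch
print(DilatedBlock(3, 16)(x).shape)  # torch.Size([1, 16, 512, 512])
```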

Sparse Feature Convolutional Neural Network with Cluster Max Extraction for Fast Object Classification

  • Kim, Sung Hee; Pae, Dong Sung; Kang, Tae-Koo; Kim, Dong W.; Lim, Myo Taeg
    • Journal of Electrical Engineering and Technology / v.13 no.6 / pp.2468-2478 / 2018
  • We propose the Sparse Feature Convolutional Neural Network (SFCNN) to reduce the volume of convolutional neural networks (CNNs). Despite the superior classification performance of CNNs, their enormous network volume requires high computational cost and long processing time, making real-time applications such as online training difficult. We propose an advanced network that reduces the volume of conventional CNNs by producing a region-based sparse feature map. To produce the sparse feature map, two complementary region-based value extraction methods, cluster max extraction and local value extraction, are proposed. Cluster max is selected as the main function based on experimental results. To evaluate SFCNN, we conduct experiments against two conventional CNNs. The network trains 59 times faster and tests 81 times faster than the VGG network, with a 1.2% loss of accuracy in multi-class classification on the Caltech101 dataset. In vehicle classification using the GTI Vehicle Image Database, the network trains 88 times faster and tests 94 times faster than the conventional CNNs, with a 0.1% loss of accuracy.
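
A toy sketch of region-based max extraction producing a sparse feature map, under the assumption that a uniform grid of regions stands in for the paper's clustering; the actual SFCNN design is not reproduced here.

```python
import torch
import torch.nn.functional as F

def cluster_max_extraction(feat, grid=4):
    """Toy region-based sparse feature map: keep only the maximum
    activation inside each (grid x grid) region, zeroing the rest.
    A uniform grid stands in for the paper's region clustering."""
    b, c, h, w = feat.shape
    # Max value per region via max pooling...
    pooled = F.max_pool2d(feat, kernel_size=(h // grid, w // grid))
    # ...then scatter back: keep entries equal to their region max.
    up = F.interpolate(pooled, size=(h, w), mode='nearest')
    return torch.where(feat == up, feat, torch.zeros_like(feat))

feat = torch.randn(1, 8, 32, 32)
sparse = cluster_max_extraction(feat)
print((sparse != 0).float().mean())  # fraction of retained activations
```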

Enhanced Sign Language Transcription System via Hand Tracking and Pose Estimation

  • Kim, Jung-Ho; Kim, Najoung; Park, Hancheol; Park, Jong C.
    • Journal of Computing Science and Engineering / v.10 no.3 / pp.95-101 / 2016
  • In this study, we propose a new system for constructing parallel corpora for sign languages, which are generally under-resourced in comparison to spoken languages. In order to achieve scalability and accessibility in data collection and corpus construction, our system utilizes deep learning-based techniques and predicts depth information to perform pose estimation on hands captured in video recordings from a single RGB camera. These estimated poses are then transcribed into expressions in SignWriting. We quantitatively evaluate the accuracy of the hand tracking and hand pose estimation modules of our system using the American Sign Language Image Dataset and the American Sign Language Lexicon Video Dataset. The evaluation results show that our transcription system has high potential to be successfully employed in constructing a sizable sign language corpus from various types of video resources.

Cultural Region-based Clustering of SNS Big Data and Users Preferences Analysis (문화권 클러스터링 기반 SNS 빅데이터 및 사용자 선호도 분석)

  • Rho, Seungmin
    • Journal of Advanced Navigation Technology / v.22 no.6 / pp.670-674 / 2018
  • Social network service (SNS) data, including comments/text, images, videos, blogs, and user experiences, contain a wealth of information that can be used to build recommendation systems for various clients and to provide insightful results to business analysts. Multimedia data, especially visual data such as images and videos, are the richest source of SNS data; they can reflect a particular region's or culture's values and interests, and they form a gigantic portion of the overall data. Mining such huge amounts of data to extract actionable intelligence requires efficient and smart data analysis methods. The purpose of this paper is to focus on this particular modality and to devise ways to model, index, and retrieve such data as and when desired.

Synthesis of contrast CT image using deep learning network (딥러닝 네트워크를 이용한 조영증강 CT 영상 생성)

  • Woo, Sang-Keun
    • Proceedings of the Korean Society of Computer Information Conference / 2019.01a / pp.465-467 / 2019
  • In this paper, we conducted a study on obtaining contrast-enhanced CT images using a deep learning network capable of image generation. CT is one of the medical imaging techniques used, on the basis of high-resolution images, to diagnose a patient's diseases and cancer cells. In particular, a CT image acquired after administering a contrast agent is called a contrast-enhanced CT image. Contrast-enhanced CT images emphasize the image contrast between constituent materials, allowing clinicians to improve the accuracy of diagnosis and treatment-response assessment. However, many patients experience adverse reactions to contrast agents, and for such patients it is impossible to acquire contrast-enhanced CT images. Therefore, in this study, to minimize unnecessary radiation exposure for patients who cannot obtain contrast-enhanced images as well as for general patients, we generated contrast-enhanced CT images from plain CT images using an image-generating deep learning technique. A generative adversarial network (GAN) model was used as the image-generation network. The results showed that images that had undergone histogram equalization produced better results than images generated from CT images without any preprocessing, and the generated images showed high structural similarity to the actual images. In conclusion, contrast-enhanced CT images could be generated using a deep learning image-generation model; we expect that this will minimize unnecessary radiation exposure to patients and that the generated contrast-enhanced CT images will contribute to accurate diagnosis and treatment-response assessment.
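
The preprocessing finding reported above (histogram equalization before feeding slices to the GAN) can be sketched as follows; the function and the stand-in input are illustrative, not the authors' code.

```python
import numpy as np

def histogram_equalize(ct_slice, n_bins=256):
    """Histogram-equalize one CT slice: map each pixel through the
    normalized cumulative distribution of its intensity bin."""
    flat = ct_slice.ravel()
    hist, bin_edges = np.histogram(flat, bins=n_bins)
    cdf = hist.cumsum().astype(np.float64)
    cdf = (cdf - cdf.min()) / (cdf.max() - cdf.min())  # normalize to [0, 1]
    # Assign each pixel to its bin, then look up the equalized value.
    bin_idx = np.clip(np.digitize(flat, bin_edges[:-1]) - 1, 0, n_bins - 1)
    return cdf[bin_idx].reshape(ct_slice.shape)

slice_ = np.random.rand(512, 512).astype(np.float32)  # stand-in CT slice
eq = histogram_equalize(slice_)
print(eq.min(), eq.max())  # equalized intensities span [0, 1]
```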


Epileptic Seizure Detection for Multi-channel EEG with Recurrent Convolutional Neural Networks (순환 합성곱 신경망를 이용한 다채널 뇌파 분석의 간질 발작 탐지)

  • Yoo, Ji-Hyun
    • Journal of IKEEE / v.22 no.4 / pp.1175-1179 / 2018
  • In this paper, we propose a recurrent CNN (Convolutional Neural Network) for detecting seizures in patients using EEG signals. In the proposed method, the data were mapped into images so as to preserve the spectral characteristics of the EEG signal and the positions of the electrodes. After this spectral preprocessing, the images were input into the CNN, and the spatial and temporal features were extracted without a wavelet transform. Results on the Children's Hospital Boston-MIT (CHB-MIT) dataset showed a sensitivity of 90% and a false positive rate (FPR) of 0.85 per hour.
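
A minimal sketch of the overall shape of such a model: a CNN encodes each spectrally mapped EEG frame and a recurrent layer models the frame sequence. All dimensions, layer sizes, and the use of a GRU are assumptions for illustration.

```python
import torch
import torch.nn as nn

class RecurrentCNN(nn.Module):
    """Toy recurrent CNN for seizure detection: a CNN encodes each
    spectrally mapped EEG frame, a GRU models the frame sequence."""
    def __init__(self, n_channels=23, hidden=64):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv2d(n_channels, 16, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4), nn.Flatten())  # -> 16*4*4 per frame
        self.gru = nn.GRU(16 * 4 * 4, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 2)            # seizure / non-seizure

    def forward(self, x):                # x: (batch, time, ch, freq, pos)
        b, t = x.shape[:2]
        feats = self.cnn(x.flatten(0, 1)).view(b, t, -1)
        out, _ = self.gru(feats)
        return self.head(out[:, -1])     # classify from the last frame

x = torch.randn(2, 10, 23, 16, 16)  # 2 clips, 10 frames, 23 EEG channels
print(RecurrentCNN()(x).shape)      # torch.Size([2, 2])
```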

Comparative Analysis of Deep Learning Researches for Compressed Video Quality Improvement (압축 영상 화질 개선을 위한 딥 러닝 연구에 대한 분석)

  • Lee, Young-Woon; Kim, Byung-Gyu
    • Journal of Broadcast Engineering / v.24 no.3 / pp.420-429 / 2019
  • Recently, research using Convolutional Neural Network (CNN)-based approaches has been actively conducted to improve the reduced quality of video compressed with block-based video coding standards such as H.265/HEVC. This paper aims to summarize and analyze the network models in these quality enhancement studies. First, the detailed components of CNNs for quality enhancement are reviewed, and prior studies in the image domain are summarized. Next, related studies are summarized in three aspects, network structure, dataset, and training methods, and representative models' implementations and experimental results are presented for performance comparison.

Comparison of Code Similarity Analysis Performance of funcGNN and Siamese Network (funcGNN과 Siamese Network의 코드 유사성 분석 성능비교)

  • Choi, Dong-Bin; Jo, In-su; Park, Young B.
    • Journal of the Semiconductor & Display Technology / v.20 no.3 / pp.113-116 / 2021
  • As artificial intelligence technologies, including deep learning, develop, these technologies are being introduced to code similarity analysis. Building on the traditional analysis method of calculating the graph edit distance (GED) after converting source code into a control flow graph (CFG), some studies calculate the GED through a graph neural network (GNN) trained on the converted CFGs, while methods for analyzing code similarity through a CNN by imaging the CFG are also being studied. In this paper, to determine which approach will be effective and efficient for future research on code similarity analysis using artificial intelligence, code similarity is measured with funcGNN, which measures code similarity using a GNN, and with a Siamese network, an image similarity analysis model, and the accuracies are compared and analyzed. As a result of the analysis, the error rate of the Siamese network (0.0458) was higher than that of funcGNN (0.0362).
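
A minimal sketch of the Siamese side of this comparison: a shared CNN embeds two CFG images and similarity is scored from the distance between embeddings. The architecture details are assumptions, not the model evaluated in the paper.

```python
import torch
import torch.nn as nn

class SiameseCFG(nn.Module):
    """Toy Siamese network: a shared CNN embeds two CFG images,
    and similarity is scored from the L1 distance of the embeddings."""
    def __init__(self, emb_dim=32):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(8, 16, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4),
            nn.Flatten(), nn.Linear(16 * 4 * 4, emb_dim))
        self.score = nn.Linear(emb_dim, 1)

    def forward(self, img_a, img_b):
        ea, eb = self.encoder(img_a), self.encoder(img_b)
        # Similarity from the element-wise distance of the embeddings.
        return torch.sigmoid(self.score(torch.abs(ea - eb)))

a = torch.rand(4, 1, 64, 64)  # batch of CFGs rendered as grayscale images
b = torch.rand(4, 1, 64, 64)
print(SiameseCFG()(a, b).shape)  # torch.Size([4, 1]) similarity scores
```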

SEL-RefineMask: A Seal Segmentation and Recognition Neural Network with SEL-FPN

  • Dun, Ze-dong; Chen, Jian-yu; Qu, Mei-xia; Jiang, Bin
    • Journal of Information Processing Systems / v.18 no.3 / pp.411-427 / 2022
  • Mining historical and cultural information from the seals in ancient books is of great significance. However, ancient Chinese seal samples are scarce and carving methods are diverse, and traditional digital image processing methods based on greyscale have difficulty achieving superior segmentation and recognition performance. Recently, some deep learning algorithms have been proposed to address this problem; however, current neural networks are difficult to train owing to the lack of datasets. To solve the aforementioned problems, we propose SEL-RefineMask, which combines a selector of feature pyramid network (SEL-FPN) with RefineMask to segment and recognize seals. We designed the SEL-FPN to intelligently select the specific layer that represents the appropriate scale in the FPN, reducing the number of anchor frames. We performed experiments on several instance segmentation networks as baseline methods, and the top-1 segmentation result of 64.93% is 5.73% higher than that of humans. The top-1 result of the SEL-RefineMask network reached 67.96%, surpassing the baseline results. After segmentation, a vision transformer was used to recognize the segmentation output, and the accuracy reached 91%. Furthermore, a dataset of seals in ancient Chinese books (SACB) for segmentation and a small seal font (SSF) dataset for recognition were established, both of which are publicly available online.
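
The core idea of SEL-FPN, selecting one FPN level per input rather than running all of them, might be sketched as below; the simple pooled-feature scorer is an assumption, not the paper's selector design.

```python
import torch
import torch.nn as nn

class LevelSelector(nn.Module):
    """Toy stand-in for SEL-FPN: score the FPN levels for an input and
    keep only the best-scoring level, so later stages (anchors, heads)
    run on one scale instead of all of them."""
    def __init__(self, channels=256):
        super().__init__()
        self.scorer = nn.Linear(channels, 1)

    def forward(self, fpn_levels):       # list of (b, c, h_i, w_i) maps
        # Global-average-pool each level, score it, pick the argmax.
        pooled = torch.stack(
            [f.mean(dim=(2, 3)) for f in fpn_levels], dim=1)  # (b, L, c)
        scores = self.scorer(pooled).squeeze(-1)              # (b, L)
        best = scores.argmax(dim=1)                           # (b,)
        return best, fpn_levels

levels = [torch.randn(2, 256, s, s) for s in (64, 32, 16, 8)]
best, _ = LevelSelector()(levels)
print(best)  # index of the selected FPN level per image
```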

Sign2Gloss2Text-based Sign Language Translation with Enhanced Spatial-temporal Information Centered on Sign Language Movement Keypoints (수어 동작 키포인트 중심의 시공간적 정보를 강화한 Sign2Gloss2Text 기반의 수어 번역)

  • Kim, Minchae; Kim, Jungeun; Kim, Ha Young
    • Journal of Korea Multimedia Society / v.25 no.10 / pp.1535-1545 / 2022
  • Sign language can have a completely different meaning depending on the direction of the hand or a change of facial expression, even with the same gesture. In this respect, it is crucial to capture the spatial-temporal structure information of each movement. However, sign language translation studies based on Sign2Gloss2Text convey only comprehensive spatial-temporal information about the entire sign language movement. Consequently, the detailed information (facial expressions, gestures, etc.) of each movement that is important for sign language translation is not emphasized. Accordingly, in this paper, we propose Spatial-temporal Keypoints Centered Sign2Gloss2Text Translation, named STKC-Sign2Gloss2Text, to supplement the sequential and semantic information of keypoints, which are the core of recognizing and translating sign language. STKC-Sign2Gloss2Text consists of two steps: Spatial Keypoints Embedding, which extracts 121 major keypoints from each image, and Temporal Keypoints Embedding, which emphasizes sequential information using a Bi-GRU over the extracted sign language keypoints. The proposed model outperformed the Sign2Gloss2Text baseline on all Bilingual Evaluation Understudy (BLEU) scores on the development (DEV) and test (TEST) sets; in particular, it achieved a TEST BLEU-4 of 23.19, an improvement of 1.87, demonstrating the effectiveness of the proposed methodology.
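
A minimal sketch of the Temporal Keypoints Embedding step described above: per-frame keypoint coordinates are projected and run through a bidirectional GRU. The projection and hidden sizes are assumptions; only the 121-keypoint count comes from the abstract.

```python
import torch
import torch.nn as nn

class TemporalKeypointsEmbedding(nn.Module):
    """Toy temporal embedding: per-frame keypoint coordinates are
    projected and run through a Bi-GRU to emphasize sequential
    information, as in the abstract's Temporal Keypoints Embedding."""
    def __init__(self, n_keypoints=121, dim=128):
        super().__init__()
        self.proj = nn.Linear(n_keypoints * 2, dim)  # (x, y) per keypoint
        self.bigru = nn.GRU(dim, dim // 2, batch_first=True,
                            bidirectional=True)      # output dim = dim

    def forward(self, kp):             # kp: (batch, frames, keypoints, 2)
        b, t = kp.shape[:2]
        h = self.proj(kp.reshape(b, t, -1))
        out, _ = self.bigru(h)         # (batch, frames, dim)
        return out

kp = torch.randn(2, 30, 121, 2)       # 2 clips, 30 frames, 121 keypoints
print(TemporalKeypointsEmbedding()(kp).shape)  # torch.Size([2, 30, 128])
```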