• Title/Summary/Keyword: deep similarity

Convergence evaluation method using multisensory and matching painting and music using deep learning based on imaginary soundscape (Imaginary Soundscape 기반의 딥러닝을 활용한 회화와 음악의 매칭 및 다중 감각을 이용한 융합적 평가 방법)

  • Jeong, Hayoung;Kim, Youngjun;Cho, Jundong
    • Journal of the Korea Convergence Society / v.11 no.11 / pp.175-182 / 2020
  • In this study, we introduce a deep learning technique for matching classical music to paintings, in order to design soundscapes that help viewers appreciate paintings, and we propose an evaluation index for how well a painting and a piece of music match. The evaluation consisted of a suitability rating on a 5-point Likert scale and an evaluation from a multimodal perspective. For the best deep learning-based match between painting and music, the suitability score of the 13 test participants was 3.74/5.0, and the average cosine similarity of their multimodal evaluations was 0.79. We expect multimodal evaluation to serve as an index that can measure a new user experience. In addition, this study aims to improve the experience of multisensory artworks by proposing an interaction between the visual and auditory senses. The proposed method for matching painting and music can be used in multisensory artwork exhibitions, and it could further increase the accessibility of artworks for visually impaired people.
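The cosine similarity reported above compares two rating vectors by the angle between them. The following is a minimal sketch of that measure; the function name and the example vectors are illustrative assumptions, and the abstract does not say how the participants' multimodal ratings were encoded.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length numeric vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Hypothetical ratings of a painting and a music piece on the same attributes.
painting_ratings = [4.0, 2.0, 5.0, 1.0]
music_ratings = [5.0, 1.0, 4.0, 2.0]
similarity = cosine_similarity(painting_ratings, music_ratings)
```

A value near 1 means the two rating profiles point in nearly the same direction, regardless of their overall magnitude.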

Handwritten One-time Password Authentication System Based On Deep Learning (심층 학습 기반의 수기 일회성 암호 인증 시스템)

  • Li, Zhun;Lee, HyeYoung;Lee, Youngjun;Yoon, Sooji;Bae, Byeongil;Choi, Ho-Jin
    • Journal of Internet Computing and Services / v.20 no.1 / pp.25-37 / 2019
  • Inspired by the rapid development of deep learning and online biometrics-based authentication, we propose a handwritten one-time password authentication system which employs deep learning-based handwriting recognition and writer verification techniques. We design a convolutional neural network to recognize handwritten digits and a Siamese network to compute the similarity between the input handwriting and the genuine user's handwriting. We propose the first application of the second edition of NIST Special Database 19 for a writer verification task. Our system achieves 98.58% accuracy in the handwriting recognition task, and about 93% accuracy in the writer verification task based on four input images. We believe the proposed handwriting-based biometric technique has potential for use in a variety of online authentication services under the FIDO framework.
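A writer-verification decision based on several input images can be sketched as follows. This assumes that each image has already been mapped to an embedding vector by one branch of a Siamese network, that distances are Euclidean, and that the per-image distances are averaged. None of these details is given in the abstract, and `verify_writer` and `threshold` are hypothetical names.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length embedding vectors."""
    if len(u) != len(v):
        raise ValueError("embeddings must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def verify_writer(input_embeddings, reference_embedding, threshold):
    """Accept the writer if the mean distance to the reference is small enough.

    Returns (accepted, mean_distance). Averaging over the input images is an
    assumption made for this sketch.
    """
    if not input_embeddings:
        raise ValueError("at least one input embedding is required")
    distances = [euclidean(e, reference_embedding) for e in input_embeddings]
    mean_distance = sum(distances) / len(distances)
    return mean_distance <= threshold, mean_distance


# Four hypothetical embeddings of handwritten digits from one login attempt.
attempt = [[0.1, 0.9], [0.2, 0.8], [0.1, 1.0], [0.0, 0.9]]
accepted, score = verify_writer(attempt, [0.1, 0.9], threshold=0.5)
```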

Stochastic Non-linear Hashing for Near-Duplicate Video Retrieval using Deep Feature applicable to Large-scale Datasets

  • Byun, Sung-Woo;Lee, Seok-Pil
    • KSII Transactions on Internet and Information Systems (TIIS) / v.13 no.8 / pp.4300-4314 / 2019
  • With the development of video-related applications, the amount of media content has increased dramatically. Internet videos include a substantial number of near-duplicate videos (NDVs), so near-duplicate video retrieval (NDVR) is important for eliminating near-duplicates from web video searches. This paper proposes a novel NDVR system that supports large-scale retrieval and delivers efficient and accurate retrieval performance. To this end, we extract keyframes from each video at regular intervals and then extract both commonly used features (LBP and HSV) and new image features from each keyframe. A recent study introduced a new image feature that can provide more robust information than existing features even when images undergo geometric changes and complex editing. We convert the vector set of extracted features to binary code through a set of hash functions, so that similarity comparison is more efficient because similar videos are more likely to map into the same buckets. Lastly, we calculate similarity to search for NDVs; we examine the effectiveness of the NDVR system and compare it against previous NDVR systems using the public video collection CC_WEB_VIDEO. The proposed NDVR system's performance is very promising compared to previous NDVR systems.
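The idea of hashing feature vectors to binary codes so that similar videos land in the same bucket can be illustrated with random-hyperplane hashing. This is a simpler, linear scheme swapped in for illustration only; it is not the stochastic non-linear hashing that the paper proposes, and all names below are hypothetical.

```python
import random


def make_hyperplanes(num_bits, dim, seed=0):
    """Draw one random Gaussian hyperplane per output bit."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]


def hash_code(vector, hyperplanes):
    """Binary code: one bit per hyperplane, set by the side the vector falls on."""
    return tuple(
        1 if sum(w * x for w, x in zip(plane, vector)) >= 0 else 0
        for plane in hyperplanes
    )


def hamming(code_a, code_b):
    """Number of differing bits between two equal-length codes."""
    return sum(a != b for a, b in zip(code_a, code_b))


def build_buckets(features, hyperplanes):
    """Group video ids by hash code; each distinct code is one bucket."""
    buckets = {}
    for video_id, vector in features.items():
        buckets.setdefault(hash_code(vector, hyperplanes), []).append(video_id)
    return buckets


planes = make_hyperplanes(num_bits=8, dim=3, seed=1)
original = [0.2, -1.0, 0.5]
brighter_copy = [2 * x for x in original]  # same direction, larger magnitude
buckets = build_buckets({"a": original, "b": brighter_copy}, planes)
```

Candidates are then only compared within a bucket (or within a small Hamming radius), which avoids an exhaustive pairwise comparison over the whole collection.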

Automatic space type classification of architectural BIM models using Graph Convolutional Networks

  • Yu, Youngsu;Lee, Wonbok;Kim, Sihyun;Jeon, Haein;Koo, Bonsang
    • International conference on construction engineering and project management / 2022.06a / pp.752-759 / 2022
  • The instantiation of spaces as a discrete entity allows users to utilize BIM models in a wide range of analyses. However, in practice, their utility has been limited as spaces are erroneously entered due to human error and often omitted entirely. Recent studies attempted to automate space allocation using artificial intelligence approaches. However, there has been limited success as most studies focused solely on the use of geometric features to distinguish spaces. In this study, in addition to geometric features, semantic relations between spaces and elements were modeled and used to improve space classification in BIM models. Graph Convolutional Networks (GCN), a deep learning algorithm specifically tailored for learning in graphs, was deployed to classify spaces via a similarity graph that represents the relationships between spaces and their surrounding elements. Results confirmed that accuracy (ACC) was 0.08 higher than that of the baseline model in which only geometric information was used. Most notably, GCN was able to correctly distinguish spaces with no apparent difference in geometry by discriminating the specific elements that were provided by the similarity graph.
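A single graph-convolution layer can be sketched as below. It uses the widely cited propagation rule H' = ReLU(Â H W), where Â is the symmetrically normalized adjacency matrix with self-loops added. Whether the paper uses exactly this variant is an assumption, and the tiny two-node graph is purely illustrative.

```python
import math


def matmul(a, b):
    """Plain nested-list matrix product."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def normalize_adjacency(adj):
    """Return D^-1/2 (A + I) D^-1/2 for an adjacency matrix with a zero diagonal."""
    n = len(adj)
    a_tilde = [
        [adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)
    ]
    degree = [sum(row) for row in a_tilde]
    return [
        [a_tilde[i][j] / math.sqrt(degree[i] * degree[j]) for j in range(n)]
        for i in range(n)
    ]


def gcn_layer(a_hat, features, weights):
    """One layer: aggregate neighbor features, apply weights, then ReLU."""
    z = matmul(matmul(a_hat, features), weights)
    return [[max(0.0, value) for value in row] for row in z]


# Two connected nodes (e.g. a space and an adjacent element), one feature each.
a_hat = normalize_adjacency([[0.0, 1.0], [1.0, 0.0]])
hidden = gcn_layer(a_hat, [[1.0], [3.0]], [[1.0]])
```

Each node's output mixes its own features with those of its neighbors, which is how relational information from the graph enters the classification.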

Deep Learning Based Semantic Similarity for Korean Legal Field (딥러닝을 이용한 법률 분야 한국어 의미 유사판단에 관한 연구)

  • Kim, Sung Won;Park, Gwang Ryeol
    • KIPS Transactions on Software and Data Engineering / v.11 no.2 / pp.93-100 / 2022
  • Data search relies mainly on keyword-oriented methods, but these are poorly suited to the legal field, where professional terms are widely used. In response, this paper proposes an effective data search method for the legal field. We describe embedding methods optimized for determining the similarity between sentences in natural language processing for the legal domain. After embedding legal sentences either by keywords using TF-IDF or semantically using the Universal Sentence Encoder, we propose an optimal way to search for data by combining these embeddings with BERT models to check the similarity between sentences in the legal field.
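The keyword-based half of this pipeline, TF-IDF embedding followed by a similarity comparison, can be sketched as follows. The whitespace tokenizer, the smoothed IDF formula, and the English example sentences are all assumptions made for illustration. The paper works on Korean legal text and combines TF-IDF with Universal Sentence Encoder and BERT, neither of which is reproduced here.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one sparse {term: weight} TF-IDF vector per document."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))
    # Smoothed inverse document frequency (one common variant among several).
    idf = {t: math.log((1 + n_docs) / (1 + c)) + 1.0 for t, c in doc_freq.items()}
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({t: (c / len(tokens)) * idf[t] for t, c in counts.items()})
    return vectors


def sparse_cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


sentences = [
    "the lessee shall pay rent",
    "the lessee shall pay rent monthly",
    "criminal liability for theft",
]
vectors = tfidf_vectors(sentences)
related = sparse_cosine(vectors[0], vectors[1])
unrelated = sparse_cosine(vectors[0], vectors[2])
```

Sentences that share no terms score exactly zero here, which is the limitation that motivates adding semantic embeddings.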

Efficient Recognition of Easily-confused Chinese Herbal Slices Images Using Enhanced ResNeSt

  • Qi Zhang;Jinfeng Ou;Huaying Zhou
    • KSII Transactions on Internet and Information Systems (TIIS) / v.18 no.8 / pp.2103-2118 / 2024
  • Automated recognition of Chinese herbal slices (CHS) based on computer vision plays a critical role in the practical application of intelligent Chinese medicine. Because of the complexity and similarity of herbal images, identifying Chinese herbal slices remains a challenging task. In particular, easily-confused CHS show greater inter-class and intra-class complexity and similarity, and existing deep learning models are poorly adapted to identifying them efficiently. To address these problems comprehensively, we first built a novel, small dataset of easily-confused CHS, which comprises twelve categories forming six pairs, with about 2395 samples. Furthermore, we propose a ResNeSt-CHS model that combines multilevel perception fusion (MPF) and perceptive sparse fusion (PSF) blocks to recognize easily-confused CHS images efficiently. To verify the superiority of ResNeSt-CHS and the effectiveness of our dataset, we conducted experiments, which show that ResNeSt-CHS is optimal for easily-confused CHS recognition, with a 2.1% improvement over the original ResNeSt model. Additionally, the results indicate that ResNeSt-CHS achieves high accuracy even on a relatively small-scale dataset. The model obtains state-of-the-art classification performance on easily-confused CHS, with an accuracy of 90.8%, far exceeding other models (EfficientNet, Transformer, ResNeSt, etc.) on the evaluation criteria.

Unsupervised Non-rigid Registration Network for 3D Brain MR images (3차원 뇌 자기공명 영상의 비지도 학습 기반 비강체 정합 네트워크)

  • Oh, Donggeon;Kim, Bohyoung;Lee, Jeongjin;Shin, Yeong-Gil
    • The Journal of Korean Institute of Next Generation Computing / v.15 no.5 / pp.64-74 / 2019
  • Although non-rigid registration is in high demand in clinical practice, it has high computational complexity, and ensuring the accuracy and robustness of registration is very difficult. This study proposes a method that applies non-rigid registration to 3D magnetic resonance images of the brain in an unsupervised learning setting using a deep learning network. The network receives images from two different patients as inputs and produces a feature vector between the two images; it then transforms the target image to match the source image by creating a displacement vector field. The network is based on a U-Net shape so that the feature vectors can account for both global and local differences between the two images when performing the registration. Because a regularization term is added to the loss function, a transformation result similar to real brain movement can be obtained after trilinear interpolation is applied. This method enables non-rigid registration with a single-pass deformation from only two arbitrary input images through unsupervised learning. Therefore, it can run faster than non-learning-based registration methods that require iterative optimization. Our experiment was performed with 3D magnetic resonance images of 50 human brains, and the Dice similarity coefficient confirmed an approximately 16% similarity improvement after registration with our method. It also showed performance similar to that of the non-learning-based method, with about a 10,000-fold speed increase. The proposed method can be used for non-rigid registration of various kinds of medical image data.
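The Dice similarity coefficient used as the overlap measure above is defined as twice the size of the intersection divided by the sum of the two region sizes. A minimal sketch on flattened binary masks follows; returning 1.0 when both masks are empty is a convention chosen here, not something the abstract specifies.

```python
def dice_coefficient(mask_a, mask_b):
    """Dice similarity coefficient of two flattened binary masks (0/1 values)."""
    if len(mask_a) != len(mask_b):
        raise ValueError("masks must have the same number of voxels")
    intersection = sum(1 for a, b in zip(mask_a, mask_b) if a and b)
    total = sum(1 for a in mask_a if a) + sum(1 for b in mask_b if b)
    if total == 0:
        return 1.0  # both masks empty: treated as perfect agreement by convention
    return 2.0 * intersection / total


# Hypothetical labels of the same structure before and after registration.
fixed_label = [1, 1, 1, 0, 0, 0]
warped_label = [0, 1, 1, 1, 0, 0]
overlap = dice_coefficient(fixed_label, warped_label)
```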

An evaluation methodology for cement concrete lining crack segmentation deep learning model (콘크리트 라이닝 균열 분할 딥러닝 모델 평가 방법)

  • Ham, Sangwoo;Bae, Soohyeon;Lee, Impyeong;Lee, Gyu-Phil;Kim, Donggyou
    • Journal of Korean Tunnelling and Underground Space Association / v.24 no.6 / pp.513-524 / 2022
  • Recently, detecting damage to civil infrastructure from digital images using deep learning has become a very popular research topic. To apply these methodologies in the field, it is essential to explain the robustness of deep learning models. Our research points out that existing pixel-based evaluation metrics for deep learning models are not sufficient for crack detection, because cracks have a linear appearance, and proposes a new evaluation methodology that explains crack segmentation deep learning models more rationally. Specifically, we design, implement, and validate a methodology that generates a tolerance buffer alongside skeletonized ground truth data and prediction results, so as to consider the overall topological similarity of the ground truth and the prediction rather than pixel-wise accuracy. With our methodology, we could overcome the over-estimation and under-estimation problems of crack segmentation model evaluation, and we expect that it can explain crack segmentation deep learning models better.
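The tolerance-buffer idea can be illustrated with a small sketch: a predicted crack pixel counts as correct if it lies within a fixed distance of any ground-truth pixel, and vice versa. It assumes both inputs are already skeletonized and given as sets of `(row, col)` pixels, and it uses a square (Chebyshev) buffer. The paper's actual buffer shape and scoring are not described in the abstract, so every name and choice below is hypothetical.

```python
def within_tolerance(pixel, others, tol):
    """True if any pixel in `others` lies within `tol` (Chebyshev distance)."""
    row, col = pixel
    return any(max(abs(row - r), abs(col - c)) <= tol for r, c in others)


def buffered_precision_recall(ground_truth, prediction, tol):
    """Precision and recall where matches within the tolerance buffer count."""
    if not ground_truth or not prediction:
        raise ValueError("both pixel sets must be non-empty")
    matched_pred = sum(1 for p in prediction if within_tolerance(p, ground_truth, tol))
    matched_gt = sum(1 for g in ground_truth if within_tolerance(g, prediction, tol))
    return matched_pred / len(prediction), matched_gt / len(ground_truth)


# A horizontal crack predicted one pixel below its annotated position.
truth = {(0, 0), (0, 1), (0, 2)}
predicted = {(1, 0), (1, 1), (1, 2)}
strict = buffered_precision_recall(truth, predicted, tol=0)
buffered = buffered_precision_recall(truth, predicted, tol=1)
```

With zero tolerance the shifted prediction scores nothing, while a one-pixel buffer recognizes it as the same crack, which is the under-estimation effect the abstract refers to.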

Deep Learning-Based Computed Tomography Image Standardization to Improve Generalizability of Deep Learning-Based Hepatic Segmentation

  • Seul Bi Lee;Youngtaek Hong;Yeon Jin Cho;Dawun Jeong;Jina Lee;Soon Ho Yoon;Seunghyun Lee;Young Hun Choi;Jung-Eun Cheon
    • Korean Journal of Radiology / v.24 no.4 / pp.294-304 / 2023
  • Objective: We aimed to investigate whether image standardization using deep learning-based computed tomography (CT) image conversion would improve the performance of deep learning-based automated hepatic segmentation across various reconstruction methods. Materials and Methods: We collected contrast-enhanced dual-energy CT of the abdomen that was obtained using various reconstruction methods, including filtered back projection, iterative reconstruction, optimum contrast, and monoenergetic images with 40, 60, and 80 keV. A deep learning based image conversion algorithm was developed to standardize the CT images using 142 CT examinations (128 for training and 14 for tuning). A separate set of 43 CT examinations from 42 patients (mean age, 10.1 years) was used as the test data. A commercial software program (MEDIP PRO v2.0.0.0, MEDICALIP Co. Ltd.) based on 2D U-NET was used to create liver segmentation masks with liver volume. The original 80 keV images were used as the ground truth. We used the paired t-test to compare the segmentation performance in the Dice similarity coefficient (DSC) and difference ratio of the liver volume relative to the ground truth volume before and after image standardization. The concordance correlation coefficient (CCC) was used to assess the agreement between the segmented liver volume and ground-truth volume. Results: The original CT images showed variable and poor segmentation performances. The standardized images achieved significantly higher DSCs for liver segmentation than the original images (DSC [original, 5.40%-91.27%] vs. [standardized, 93.16%-96.74%], all P < 0.001). The difference ratio of liver volume also decreased significantly after image conversion (original, 9.84%-91.37% vs. standardized, 1.99%-4.41%). In all protocols, CCCs improved after image conversion (original, -0.006-0.964 vs. standardized, 0.990-0.998). Conclusion: Deep learning-based CT image standardization can improve the performance of automated hepatic segmentation using CT images reconstructed using various methods. Deep learning-based CT image conversion may have the potential to improve the generalizability of the segmentation network.
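The concordance correlation coefficient mentioned above is commonly computed with Lin's formula, CCC = 2·cov(x, y) / (var(x) + var(y) + (mean(x) − mean(y))²). The sketch below assumes that definition with population (1/n) moments, which the abstract does not spell out, and the volumes are made-up numbers.

```python
def concordance_correlation(x, y):
    """Lin's concordance correlation coefficient using population moments."""
    if len(x) != len(y) or not x:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    denominator = var_x + var_y + (mean_x - mean_y) ** 2
    if denominator == 0.0:
        raise ValueError("CCC is undefined when both inputs are the same constant")
    return 2.0 * cov_xy / denominator


# Hypothetical liver volumes (mL): ground truth versus segmented.
truth_volumes = [1.0, 2.0, 3.0]
biased_volumes = [2.0, 3.0, 4.0]  # perfectly correlated but offset by 1
ccc = concordance_correlation(truth_volumes, biased_volumes)
```

Unlike Pearson correlation, the coefficient drops below 1 for a constant offset, so it penalizes systematic over- or under-estimation of volume.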

Optimization of Abdominal X-ray Images using Generative Adversarial Network to Realize Minimized Radiation Dose (방사선 조사선량의 최소화를 위한 생성적 적대 신경망을 활용한 복부 엑스선 영상 최적화 연구)

  • Sangwoo Kim;Jae-Dong Rhim
    • Journal of the Korean Society of Radiology / v.17 no.2 / pp.191-199 / 2023
  • This study aimed to propose minimized radiation doses with optimized abdominal X-ray images by applying the Deep Blind Image Super-Resolution Generative Adversarial Network (BSRGAN) technique. Measured entrance surface doses (ESD) were collected while changing the exposure conditions. Under identical exposures, abdominal images were acquired and processed with the BSRGAN. The images reconstructed by the BSRGAN were compared with a reference image acquired at 80 kVp and 320 mA using mean squared error (MSE), peak signal-to-noise ratio (PSNR), and structural similarity index measure (SSIM). In addition, signal profile analysis was employed to validate the effect of the images reconstructed by the BSRGAN. The exposure conditions with the lowest MSE (about 0.285) were 90 kVp, 125 mA and 100 kVp, 100 mA, which reduced the ESD by about 52 to 53% while exhibiting PSNR = 37.694 and SSIM = 0.999. The signal intensity variations under the optimized conditions were smaller than those of the reference image. This means that the optimized exposure conditions can yield reasonable image quality with a substantial decrease in radiation dose, indicating that they sufficiently reflect the As Low As Reasonably Achievable (ALARA) principle of radiation protection.
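Two of the image-quality metrics named above follow directly from their standard definitions: MSE is the mean of squared pixel differences, and PSNR = 10·log10(MAX² / MSE). The sketch below assumes images flattened to lists of pixel values and a caller-supplied peak value `max_value`. The study's pixel range and normalization are not stated in the abstract, and SSIM is omitted because it needs windowed statistics.

```python
import math


def mean_squared_error(reference, test):
    """Mean of squared pixel differences between two flattened images."""
    if len(reference) != len(test) or not reference:
        raise ValueError("images must be non-empty and the same size")
    return sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference)


def psnr(reference, test, max_value):
    """Peak signal-to-noise ratio in dB; infinite for identical images."""
    mse = mean_squared_error(reference, test)
    if mse == 0.0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / mse)


# Hypothetical 8-bit pixel rows from a reference and a reconstructed image.
reference_row = [120.0, 130.0, 140.0, 150.0]
reconstructed_row = [121.0, 129.0, 141.0, 149.0]
quality_db = psnr(reference_row, reconstructed_row, max_value=255.0)
```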