• Title/Summary/Keyword: Similarity Learning

Predicting Learning Achievement Using Big Data Cluster Analysis - Focusing on Longitudinal Study (빅데이터 군집 분석을 이용한 학습성취도 예측 - 종단 연구를 중심으로)

  • Ko, Sujeong
    • Journal of Digital Contents Society / v.19 no.9 / pp.1769-1778 / 2018
  • As the value of big data grows, research using big data analysis technology is being carried out in education as well as in industry. In this paper, we propose a method to predict learning achievement using big data cluster analysis. In the proposed method, students in the Korea Children and Youth Panel Survey (KCYPS) are grouped by similar learning habits using the K-means algorithm, based on the learning habits of first-year middle school students, and the features of each group are extracted. Next, using the extracted group features, first-year middle school students in the test group are assigned to groups with similar learning habits using cosine similarity; neighbors are then selected and learning achievement is predicted. The results indicate that learning habits in middle school are closely related to outcomes at university, and that they make it possible to predict learning achievement in high school and satisfaction with university and major.
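The group-assignment step described in this abstract can be sketched as follows. This is a minimal illustration, assuming the K-means centroids have already been computed; the centroid values, the three-feature habit vector, and the function names are hypothetical and not taken from the paper.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_a * norm_b)

def nearest_group(student, centroids):
    """Index of the centroid with the highest cosine similarity to the student."""
    return max(range(len(centroids)),
               key=lambda i: cosine_similarity(student, centroids[i]))

# Hypothetical group features (centroids) and one test-group student.
centroids = [[5.0, 1.0, 1.0], [1.0, 4.0, 4.0]]
student = [4.0, 2.0, 1.0]
group = nearest_group(student, centroids)
print(group)
```

Neighbors for the prediction step would then be drawn from the students already assigned to the returned group.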

Development of Deep Recognition of Similarity in Show Garden Design Based on Deep Learning (딥러닝을 활용한 전시 정원 디자인 유사성 인지 모형 연구)

  • Cho, Woo-Yun;Kwon, Jin-Wook
    • Journal of the Korean Institute of Landscape Architecture / v.52 no.2 / pp.96-109 / 2024
  • The purpose of this study is to propose a method for evaluating the similarity of show gardens using the deep learning models VGG-16 and ResNet50. A model for judging the similarity of show gardens based on VGG-16 and ResNet50 was developed and named DRG (Deep Recognition of similarity in show Garden design). The model was built with an algorithm using GAP (global average pooling) and the Pearson correlation coefficient, and similarity accuracy was analyzed by comparing the total number of similar images retrieved at the 1st (Top1), 3rd (Top3), and 5th (Top5) ranks with the original images. The image data used for the DRG model consisted of 278 works from the Le Festival International des Jardins de Chaumont-sur-Loire, 27 works from the Seoul International Garden Show, and 17 works from the Korea Garden Show. Image analysis was conducted with the DRG model both within the same group and across different groups, resulting in guidelines for assessing show garden similarity. First, for overall image similarity analysis, applying data augmentation techniques with the ResNet50 model was best suited. Second, for image analysis focusing on internal structure and outer form, it was effective to apply a filter of a fixed size (16cm × 16cm) to generate images emphasizing form and then compare similarity using the VGG-16 model. An image size of 448 × 448 pixels and the original full-color image were suggested as the optimal settings. Based on these findings, a quantitative method for assessing show gardens is proposed, and it is expected to contribute to the continued development of garden culture through interdisciplinary research.
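The GAP-plus-Pearson ranking described above can be illustrated with a small sketch. It assumes the convolutional feature maps have already been extracted (here they are hand-written nested lists); the gallery contents and function names are hypothetical, and this is not the DRG implementation itself.

```python
import math

def global_average_pool(feature_maps):
    """Reduce each channel (a 2D list) to its mean value."""
    pooled = []
    for channel in feature_maps:
        values = [v for row in channel for v in row]
        pooled.append(sum(values) / len(values))
    return pooled

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if std_x == 0 or std_y == 0:
        return 0.0  # convention chosen here for constant vectors
    return cov / (std_x * std_y)

def top_k(query, gallery, k):
    """Names of the k gallery vectors most correlated with the query."""
    ranked = sorted(gallery, key=lambda name: pearson(query, gallery[name]),
                    reverse=True)
    return ranked[:k]

# Hypothetical pooled feature vectors for three gallery images.
gallery = {"a": [1.0, 2.0, 3.0, 4.0],
           "b": [4.0, 3.0, 2.0, 1.0],
           "c": [1.0, 2.0, 3.0, 5.0]}
query = [2.0, 4.0, 6.0, 8.0]
print(top_k(query, gallery, 2))
```

Top1, Top3, and Top5 accuracy would then be the share of queries whose known counterpart appears in `top_k(query, gallery, k)` for k = 1, 3, and 5.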

Course recommendation system using deep learning (딥러닝을 이용한 강좌 추천시스템)

  • Min-Ah Lim;Seung-Yeon Hwang;Dong-Jin Shin;Jae-Kon Oh;Jeong-Joon Kim
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.23 no.3 / pp.193-198 / 2023
  • We study a learner-customized lecture recommendation project using deep learning. Recommendation systems are common on the web and in apps; examples include recommending featured videos based on users' clicks and advertising items in users' areas of interest on SNS. In this study, Word2Vec-based sentence similarity was used to filter candidates in two stages, and courses were recommended through the Surprise library. The system conveniently provides users with course data in the classification they want. Surprise is a Python scikit library (scikit-surprise) commonly used in recommendation systems. By analyzing the data, the system runs at high speed, and deep learning is used to produce more precise results through course steps. When a user enters a keyword of interest, the similarity between the keyword and the course titles is computed, followed by the similarity with the extracted video data and voice text, and the highest-ranking video data is recommended through the Surprise library.

Automated Ulna and Radius Segmentation model based on Deep Learning on DEXA (DEXA에서 딥러닝 기반의 척골 및 요골 자동 분할 모델)

  • Kim, Young Jae;Park, Sung Jin;Kim, Kyung Rae;Kim, Kwang Gi
    • Journal of Korea Multimedia Society / v.21 no.12 / pp.1407-1416 / 2018
  • The purpose of this study was to train a model for ulna and radius bone segmentation based on convolutional neural networks and to verify the segmentation model. The data consisted of 840 training, 210 tuning, and 200 verification samples. The learning model for the ulna and radius bones was based on U-Net (19 convolutional and 8 max-pooling layers) and was trained with a batch size of 8, a learning rate of 0.0001, and 200 epochs. On the training data, the average sensitivity was 0.998, the specificity was 0.972, the accuracy was 0.979, and the Dice similarity coefficient was 0.968. On the validation data, the average sensitivity was 0.961, the specificity was 0.978, the accuracy was 0.972, and the Dice similarity coefficient was 0.961. The deep convolutional neural network based model performed well for segmentation of the ulna and radius bones.
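For reference, the four metrics reported in this abstract can be computed from pixel-wise confusion counts as in the sketch below. The masks are tiny made-up examples flattened to lists; the function name is hypothetical, and the definitions are the standard ones rather than code from the paper.

```python
def segmentation_metrics(pred, truth):
    """Sensitivity, specificity, accuracy, and Dice from binary masks."""
    tp = fp = tn = fn = 0
    for p, t in zip(pred, truth):
        if p and t:
            tp += 1          # bone pixel correctly segmented
        elif p and not t:
            fp += 1          # background labeled as bone
        elif not p and t:
            fn += 1          # bone pixel missed
        else:
            tn += 1          # background correctly left out

    def ratio(num, den):
        return num / den if den else 0.0

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "dice": ratio(2 * tp, 2 * tp + fp + fn),
    }

# Made-up flattened masks: 1 = bone, 0 = background.
pred = [1, 1, 0, 0, 1, 0, 0, 0]
truth = [1, 0, 0, 0, 1, 1, 0, 0]
metrics = segmentation_metrics(pred, truth)
print(metrics)
```

The Dice coefficient ignores true negatives, which is why it can differ noticeably from accuracy when the background dominates the image.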

Word Sense Similarity Clustering Based on Vector Space Model and HAL (벡터 공간 모델과 HAL에 기초한 단어 의미 유사성 군집)

  • Kim, Dong-Sung
    • Korean Journal of Cognitive Science / v.23 no.3 / pp.295-322 / 2012
  • In this paper, we cluster similar word senses by applying the vector space model and HAL (Hyperspace Analog to Language). HAL measures correlation among words within a context window of a given size (Lund and Burgess 1996). The similarity between a word pair is measured by cosine similarity based on the vector space model, which reduces the distortion of space between high-frequency and low-frequency words (Salton et al. 1975, Widdows 2004). We use PCA (Principal Component Analysis) and SVD (Singular Value Decomposition) to reduce the large number of dimensions of the similarity matrix. For sense similarity clustering, we adopt both supervised and unsupervised learning methods. For the unsupervised method, we use clustering. For the supervised method, we use SVM (Support Vector Machine), the Naive Bayes classifier, and the Maximum Entropy method.
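A toy version of the HAL-plus-cosine pipeline is sketched below. It assumes the commonly described HAL weighting, in which a preceding word at distance d inside a window of size w contributes w - d + 1; the window size, the sample sentence, and the function names are illustrative assumptions, not the paper's settings.

```python
import math
from collections import defaultdict

def hal_counts(tokens, window=2):
    """Weighted co-occurrence of each word with the words preceding it."""
    counts = defaultdict(float)
    for i, word in enumerate(tokens):
        for d in range(1, window + 1):
            if i - d < 0:
                break
            # Closer neighbors receive larger weights.
            counts[(word, tokens[i - d])] += window - d + 1
    return counts

def hal_vector(word, counts, vocab):
    """Concatenate the word's row (preceding context) and column (following)."""
    row = [counts.get((word, other), 0.0) for other in vocab]
    col = [counts.get((other, word), 0.0) for other in vocab]
    return row + col

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

tokens = "the cat sat the dog sat".split()
counts = hal_counts(tokens, window=2)
vocab = sorted(set(tokens))
cat, dog, sat = (hal_vector(w, counts, vocab) for w in ("cat", "dog", "sat"))
print(cosine_similarity(cat, dog), cosine_similarity(cat, sat))
```

In this toy corpus "cat" and "dog" share contexts, so their vectors are closer to each other than either is to "sat"; dimensionality reduction and clustering would operate on vectors like these.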

Detecting Similar Designs Using Deep Learning-based Image Feature Extracting Model (딥러닝 기반 이미지 특징 추출 모델을 이용한 유사 디자인 검출에 대한 연구)

  • Lee, Byoung Woo;Lee, Woo Chang;Chae, Seung Wan;Kim, Dong Hyun;Lee, Choong Kwon
    • Smart Media Journal / v.9 no.4 / pp.162-169 / 2020
  • Design is a key factor that determines the competitiveness of products in the textile and fashion industry. Measuring the similarity of a proposed design is very important for preventing unauthorized copying and confirming originality. In this study, a deep learning technique was used to quantify features from images of textile designs, and similarity was measured using Spearman correlation coefficients. To verify that similar samples were actually detected, 300 images were randomly rotated or changed in color. The Top-3 and Top-5 results, ordered by similarity value, were examined to see whether the rotated or color-changed samples were detected. As a result, the VGG-16 model recorded significantly higher performance than AlexNet. For rotated images, the VGG-16 model achieved its highest performance of 64% and 73.67% in the Top-3 and Top-5, respectively. For color-changed images, the highest performance was 86.33% and 90% in the Top-3 and Top-5, respectively.
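The Top-k check with Spearman correlation can be sketched as follows. The feature vectors stand in for deep features of an original design and of gallery images; all values and names are hypothetical, and Spearman's coefficient is computed here as the Pearson correlation of average ranks.

```python
import math

def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman rank correlation of two equal-length vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in rx))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in ry))
    return cov / (std_x * std_y) if std_x and std_y else 0.0

def hit_in_top_k(query, gallery, target, k):
    """True if the target image is among the k most similar gallery items."""
    ranked = sorted(gallery, key=lambda name: spearman(query, gallery[name]),
                    reverse=True)
    return target in ranked[:k]

# Hypothetical features: an original design and three gallery images.
original = [0.1, 0.5, 0.9, 0.3]
gallery = {"rotated_copy": [0.2, 0.6, 1.0, 0.4],
           "other_1": [0.9, 0.5, 0.1, 0.3],
           "other_2": [0.5, 0.1, 0.3, 0.9]}
print(hit_in_top_k(original, gallery, "rotated_copy", 1))
```

Top-3 and Top-5 detection rates correspond to the fraction of transformed images for which `hit_in_top_k` returns true with k = 3 and k = 5.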

Viewpoint Unconstrained Face Recognition Based on Affine Local Descriptors and Probabilistic Similarity

  • Gao, Yongbin;Lee, Hyo Jong
    • Journal of Information Processing Systems / v.11 no.4 / pp.643-654 / 2015
  • Face recognition under controlled settings, such as limited viewpoint and illumination change, can now achieve good performance. However, real-world application of face recognition is still challenging. In this paper, we propose combining Affine Scale Invariant Feature Transform (SIFT) and probabilistic similarity for face recognition under a large viewpoint change. Affine SIFT is an extension of the SIFT algorithm that detects affine-invariant local descriptors. Affine SIFT generates a series of different viewpoints using affine transformation, which allows for a viewpoint difference between the gallery face and the probe face. However, the human face is not planar, as it contains significant 3D depth, so Affine SIFT does not work well for significant changes in pose. To complement this, we combine it with probabilistic similarity, which computes the log likelihood between the probe and gallery faces based on the sum of squared differences (SSD) distribution obtained in an offline learning process. Our experimental results show that our framework achieves markedly better recognition accuracy than the other algorithms compared on the FERET database.

The alignment between contextual and model generalization: An application with PISA 2015

  • Wan Ren;Wendy Chan
    • Communications for Statistical Applications and Methods / v.31 no.5 / pp.467-485 / 2024
  • Policymakers and educational researchers have grown increasingly interested in the extent to which study results generalize across different groups of students. Current generalization research in education has largely focused on the compositional similarity among students based on a set of observable characteristics. However, generalization is defined differently across various disciplines. While the concept of compositional similarity is prominent in causal research, generalization among the statistical learning community refers to the extent to which a model produces accurate predictions across samples and populations. The purpose of this study is to assess the extent to which concepts related to contextual generalization (based on compositional similarity) are associated with the ideas related to model generalization (based on accuracy of prediction). We use observational data from the Programme for International Student Assessment (PISA) 2015 wave as a case study to examine the conditions under which contextual and model generalization are aligned. We assess the correlations between statistical measures that quantify compositional similarity and prediction accuracy and discuss the implications for generalization research.

A code-based chromagram similarity for cover song identification (커버곡 검색을 위한 코드 기반 크로마그램 유사도)

  • Seo, Jin Soo
    • The Journal of the Acoustical Society of Korea / v.38 no.3 / pp.314-319 / 2019
  • Computing chromagram similarity is indispensable in constructing a cover song identification system. This paper proposes a code-based chromagram similarity to reduce the computational and storage costs of cover song identification. By learning a song-specific codebook, a chromagram sequence is converted into a code sequence, which reduces the feature storage cost. We build a lookup table over the learned codebooks to compute chromagram similarity efficiently. Experiments on two music datasets were performed to compare the proposed code-based similarity with the conventional one in terms of cover song search accuracy, feature storage, and computational cost.
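The general idea of replacing frame-level comparisons with table lookups can be sketched as below. This is a simplified illustration under several assumptions: the codebooks are given rather than learned, frames have 3 dimensions instead of the 12 chroma bins, and the dot product stands in for the frame similarity; none of these details are taken from the paper.

```python
def nearest_code(frame, codebook):
    """Index of the codeword closest to the frame (squared Euclidean distance)."""
    return min(range(len(codebook)),
               key=lambda k: sum((f - c) ** 2 for f, c in zip(frame, codebook[k])))

def encode(frames, codebook):
    """Convert a frame sequence into a sequence of code indices."""
    return [nearest_code(frame, codebook) for frame in frames]

def build_lookup_table(codebook_a, codebook_b):
    """Precompute the similarity of every codeword pair once."""
    return [[sum(x * y for x, y in zip(a, b)) for b in codebook_b]
            for a in codebook_a]

def similarity_matrix(codes_a, codes_b, table):
    """Frame-by-frame similarity obtained purely by table lookup."""
    return [[table[i][j] for j in codes_b] for i in codes_a]

# Hypothetical codebooks (one per song) and short frame sequences.
codebook_a = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
codebook_b = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
song_a = [[0.9, 0.1, 0.0], [0.1, 0.8, 0.1]]
song_b = [[0.0, 0.9, 0.1], [0.1, 0.7, 0.2]]

codes_a = encode(song_a, codebook_a)
codes_b = encode(song_b, codebook_b)
table = build_lookup_table(codebook_a, codebook_b)
print(similarity_matrix(codes_a, codes_b, table))
```

Each song is stored as a short list of integers plus its codebook, and the pairwise similarity matrix that a sequence-alignment step would consume costs only one table read per entry.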

SVM based Clustering Technique for Processing High Dimensional Data (고차원 데이터 처리를 위한 SVM기반의 클러스터링 기법)

  • Kim, Man-Sun;Lee, Sang-Yong
    • Journal of the Korean Institute of Intelligent Systems / v.14 no.7 / pp.816-820 / 2004
  • Clustering is the process of dividing similar data objects in a data set into clusters and extracting meaningful information from the data. The main issues in clustering are the effective clustering of high-dimensional data and optimization. This study proposes a method of measuring similarity based on SVM and a new, efficient method of calculating the number of clusters. The high-dimensional data are mapped into a feature space using kernel functions, and the similarity between neighboring clusters is then measured. For the clusters created, the desired number of clusters can be obtained from the measured similarity value and the value of Δd. To verify the proposed methods, the authors used six datasets from the UCI Machine Learning Repository and obtained the expected number of clusters as well as improved cohesiveness compared with the results of previous research.