Search | Korea Science

Collaborative Similarity Metric Learning for Semantic Image Annotation and Retrieval

Wang, Bin;Liu, Yuncai
- KSII Transactions on Internet and Information Systems (TIIS) / v.7 no.5 / pp.1252-1271 / 2013
Automatic image annotation has become an increasingly important research topic owing to its key role in image retrieval. At the same time, it is highly challenging on large-scale datasets with large variance. Practical approaches generally rely on similarity measures defined over images and on multi-label prediction methods. More specifically, those approaches usually 1) leverage similarity measures that are predefined or learned by optimizing for either ranking or annotation, which might not be adaptive enough to the dataset; and 2) predict labels separately, without taking the correlation between labels into account. In this paper, we propose a method for image annotation that collaboratively learns a similarity metric from the dataset and models the dataset's label correlation. The similarity metric is learned by simultaneously optimizing 1) image ranking using a structural SVM (SSVM) and 2) image annotation using correlated label propagation, both with respect to the similarity metric. The learned similarity metric, which fully exploits the available information in the dataset, would improve the two collaborative components, ranking and annotation, and consequently the retrieval system itself. We evaluated the proposed method on the Corel5k, Corel30k and EspGame databases. The results for annotation and retrieval show the competitive performance of the proposed method.
https://doi.org/10.3837/tiis.2013.05.018

Learning Similarity with Probabilistic Latent Semantic Analysis for Image Retrieval

Li, Xiong;Lv, Qi;Huang, Wenting
- KSII Transactions on Internet and Information Systems (TIIS) / v.9 no.4 / pp.1424-1440 / 2015
Searching for the intended images among a large number of candidates is a challenging problem. Content-based image retrieval (CBIR) is the most promising way to tackle this problem; its most important topic is measuring the similarity of images so as to cover variance in shape, color, pose, illumination, etc. While previous works have made significant progress, their ability to adapt to the dataset has not been fully explored. In this paper, we propose a similarity learning method based on a probabilistic generative model, namely probabilistic latent semantic analysis (PLSA). It first derives a Fisher kernel, a function over the parameters and variables, based on PLSA. Then, the parameters are determined by simultaneously maximizing the log-likelihood function of PLSA and the retrieval performance over the training dataset. The main advantages of this work are twofold: (1) deriving a similarity measure based on PLSA, which fully exploits the data distribution and Bayesian inference; (2) learning the model parameters by simultaneously maximizing the fit of the model to the data and the retrieval performance. The proposed method (PLSA-FK) is empirically evaluated on three datasets, and the results exhibit promising performance.
https://doi.org/10.3837/tiis.2015.04.009

A Study on Designing Metadata Standard for Building AI Training Dataset of Landmark Images (랜드마크 이미지 AI 학습용 데이터 구축을 위한 메타데이터 표준 설계 방안 연구)

Kim, Jinmook
- Journal of the Korean Society for Library and Information Science / v.54 no.2 / pp.419-434 / 2020
The purpose of this study is to design and propose a metadata standard for building an AI training dataset of landmark images. To this end, we first comprehensively examined and analyzed the state of the art in the types of image retrieval systems and their indexing methods. We then investigated open training datasets and machine learning tools for image object recognition. Subsequently, we selected metadata elements optimized for the AI training dataset of landmark images and defined the input data for each element. We conclude the study with implications and suggestions for developing application services using the results of the study.
https://doi.org/10.4275/KSLIS.2020.54.2.419

Multiclass Music Classification Approach Based on Genre and Emotion

Jonghwa Kim
- International Journal of Internet, Broadcasting and Communication / v.16 no.3 / pp.27-32 / 2024
Reliable and fine-grained musical metadata are required for efficient search of the rapidly increasing number of music files. In particular, since the primary motives for listening to music are its emotional effect, diversion, and the memories it awakens, emotion classification, along with genre classification, is crucial. In this paper, as an initial approach towards a "ground-truth" dataset for music emotion and genre classification, we carefully built a music corpus from labels provided by a large number of ordinary people. To verify the suitability of the dataset through classification results, we extracted features according to the MPEG-7 audio standard and applied different machine learning models, based on statistics and deep neural networks, to automatically classify the dataset. Using standard hyperparameter settings, we reached an accuracy of 93% for genre classification and 80% for emotion classification, and we believe that our dataset can be used as a meaningful comparative dataset in this research field.
https://doi.org/10.7236/IJIBC.2024.16.3.27

An Object-Level Feature Representation Model for the Multi-target Retrieval of Remote Sensing Images

Zeng, Zhi;Du, Zhenhong;Liu, Renyi
- Journal of Computing Science and Engineering / v.8 no.2 / pp.65-77 / 2014
To address the problem of multi-target retrieval (MTR) of remote sensing images, this study proposes a new object-level feature representation model. The model provides an enhanced application image representation that improves the efficiency of MTR. Generating the model in our scheme includes processes such as object-oriented image segmentation, feature parameter calculation, and symbolic image database construction. The proposed model uses the spatial representation method of the extended nine-direction lower-triangular (9DLT) matrix to combine spatial relationships among objects, and organizes the image features according to MPEG-7 standards. A similarity metric is proposed that improves the precision of similarity retrieval. Our method provides a trade-off strategy that supports flexible matching on the target features or on the spatial relationship between the query target and the image database. We implement this retrieval framework on a dataset of remote sensing images. Experimental results show that the proposed model achieves competitive, high retrieval precision.
https://doi.org/10.5626/JCSE.2014.8.2.65

Learning-Based Multiple Pooling Fusion in Multi-View Convolutional Neural Network for 3D Model Classification and Retrieval

Zeng, Hui;Wang, Qi;Li, Chen;Song, Wei
- Journal of Information Processing Systems / v.15 no.5 / pp.1179-1191 / 2019
We design an ingenious view-pooling method named learning-based multiple pooling fusion (LMPF) and apply it to the multi-view convolutional neural network (MVCNN) for 3D model classification or retrieval. By this means, the multi-view feature maps projected from a 3D model can be compiled into a simple and effective feature descriptor. The LMPF method fuses the max pooling method and the mean pooling method by learning a set of optimal weights. Compared with hand-crafted approaches such as max pooling and mean pooling, the LMPF method can effectively decrease information loss because of its "learning" ability. Experiments on the ModelNet40 and McGill datasets are presented, and the results verify that LMPF outperforms those previous methods to a great extent.
https://doi.org/10.3745/JIPS.02.0120
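
The abstract above describes fusing max pooling and mean pooling across views with learned weights. The following is a minimal NumPy sketch of that idea only, not the authors' implementation: the function name `fuse_view_pooling`, the single weight pair, and the softmax normalization are all assumptions made for illustration, and the weights are fixed here rather than learned by training.

```python
import numpy as np


def fuse_view_pooling(view_features, weights):
    """Fuse max- and mean-pooled view features with a weight pair.

    view_features: array of shape (num_views, feature_dim), one row per view.
    weights: two raw scores; a softmax (an assumption of this sketch) turns
             them into a convex combination of the two pooled descriptors.
    """
    view_features = np.asarray(view_features, dtype=float)
    max_pooled = view_features.max(axis=0)    # element-wise max over views
    mean_pooled = view_features.mean(axis=0)  # element-wise mean over views

    raw = np.asarray(weights, dtype=float)
    w = np.exp(raw - raw.max())               # numerically stable softmax
    w = w / w.sum()
    return w[0] * max_pooled + w[1] * mean_pooled


# Three hypothetical views with two-dimensional features.
views = np.array([[1.0, 4.0], [3.0, 0.0], [2.0, 2.0]])
fused_equal = fuse_view_pooling(views, weights=[0.0, 0.0])
fused_max = fuse_view_pooling(views, weights=[100.0, -100.0])
```

With equal raw scores the result is the midpoint between the max-pooled and mean-pooled descriptors; pushing one score far above the other recovers plain max or mean pooling. In a trained network the two scores would be parameters updated by backpropagation.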

Visual Semantic Based 3D Video Retrieval System Using HDFS

Ranjith Kumar, C.;Suguna, S.
- KSII Transactions on Internet and Information Systems (TIIS) / v.10 no.8 / pp.3806-3825 / 2016
This paper presents a new framework for visual-semantic-based 3D video search and retrieval applications. Recent 3D retrieval applications focus on shape analysis, such as object matching, classification and retrieval, rather than on video retrieval as a whole. In this context, we explore the 3D-CBVR (Content-Based Video Retrieval) concept for the first time. For this purpose, we intend to build on BOVW and MapReduce in a 3D framework. We combine shape, color and texture for feature extraction, using a combination of geometric and topological features for shape and a 3D co-occurrence matrix for color and texture. After successful extraction of local descriptors, the TB-PCT (Threshold-Based Predictive Clustering Tree) algorithm is used to generate the visual codebook. Matching is then performed using a soft weighting scheme with the L₂ distance function. As a final step, the retrieved results are ranked according to the index value and returned. To handle the large amount of data and retrieve it efficiently, we have incorporated HDFS into our design. Using a 3D video dataset, we evaluate the performance of the proposed system; the results indicate that it gives accurate results and also reduces time complexity.
https://doi.org/10.3837/tiis.2016.08.021

Enhancing the Performance of Blog Retrieval by User Tagging and Social Network Analysis (사용자 태그와 중심성 지수를 이용한 블로그 검색 성능 향상에 관한 연구)

Kim, Eun-Hee;Chung, Young-Mee
- Journal of the Korean Society for Information Management / v.27 no.1 / pp.61-77 / 2010
Blogs are now one of the major information resources on the web. The purpose of this study is to enhance the performance of blog retrieval by means of user-assigned tags and trackback information. To this end, retrieval experiments were performed with a dataset of 4,908 blog pages together with their associated trackback URLs. In the experiments, text terms, user tags, and network centrality values based on trackbacks were combined in various ways as retrieval features. The experimental results showed that employing user tags and network centrality values as retrieval features, in addition to text terms, could improve the performance of blog retrieval.
https://doi.org/10.3743/KOSIM.2010.27.1.061

Semi-supervised Cross-media Feature Learning via Efficient L_2,q Norm

Zong, Zhikai;Han, Aili;Gong, Qing
- KSII Transactions on Internet and Information Systems (TIIS) / v.13 no.3 / pp.1403-1417 / 2019
With the rapid growth of multimedia data, research on cross-media feature learning is significant for many applications, such as multimedia search and recommendation. Existing methods are sensitive to noise and edge information in multimedia data. In this paper, we propose a semi-supervised method for cross-media feature learning based on the $L_{2,q}$ norm to improve the performance of cross-media retrieval; it is more robust and efficient than previous methods. In our method, noise and edge information have less effect on the results of cross-media retrieval, and the dynamic patch information of multimedia data is employed to increase the accuracy of cross-media retrieval. Our method can reduce the interference of noise and edge information and achieve fast convergence. Extensive experiments on the XMedia dataset illustrate that our method performs better than the state-of-the-art methods.
https://doi.org/10.3837/tiis.2019.03.016
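
The abstract above relies on the $L_{2,q}$ norm but does not define it. As a rough illustration, the sketch below computes the commonly used row-wise form, (sum over rows of the row's L2 norm raised to q) raised to 1/q. Whether the paper uses exactly this form (for example, with or without the outer 1/q root) is an assumption here, and the function name `l2q_norm` is hypothetical.

```python
import numpy as np


def l2q_norm(W, q):
    """Row-wise L_{2,q} value of a matrix W (assumed definition).

    Takes the L2 norm of each row, raises it to the power q, sums over rows,
    and takes the 1/q root. For 0 < q < 1 this is a quasi-norm, not a norm.
    """
    if q <= 0:
        raise ValueError("q must be positive")
    row_norms = np.linalg.norm(np.asarray(W, dtype=float), axis=1)
    return float(np.sum(row_norms ** q) ** (1.0 / q))


# Hypothetical projection matrix whose rows have L2 norms 5, 0 and 10.
W = np.array([[3.0, 4.0], [0.0, 0.0], [6.0, 8.0]])
value_q1 = l2q_norm(W, 1.0)  # sum of row norms (the L_{2,1} norm)
value_q2 = l2q_norm(W, 2.0)  # coincides with the Frobenius norm
```

With q = 1 the value is the sum of the row norms, and with q = 2 it equals the Frobenius norm, which is a quick sanity check on the definition.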

Designing Dataset Management and Service System for Digital Libraries Using DCAT (DCAT을 활용한 디지털도서관 데이터셋 관리와 서비스 설계)

Park, Jin Ho
- Journal of the Korean Society for Library and Information Science / v.53 no.2 / pp.247-266 / 2019
The purpose of this study is to propose using DCAT, a W3C standard, to manage and serve datasets, which are becoming increasingly important as new knowledge and information resources. To do this, we first analyzed the classes and properties of the four core classes of DCAT. In addition, we modeled and presented a system that can manage and serve various datasets based on DCAT in a digital library. The system is divided into source data, dataset management, linked data connection, and user service. In particular, a DCAT mapping function is suggested for dataset management. This feature can ensure interoperability among various datasets.
https://doi.org/10.4275/KSLIS.2019.53.2.247
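
To make the DCAT-based description above more concrete, here is a small sketch that assembles a JSON-LD style record from three DCAT classes (dcat:Catalog, dcat:Dataset, dcat:Distribution). It is not the system proposed in the paper: the helper name `make_dcat_catalog`, the titles, and the example.org URL are hypothetical, and only a few well-known properties (dct:title, dcat:dataset, dcat:distribution, dcat:downloadURL) are used.

```python
import json


def make_dcat_catalog(catalog_title, dataset_title, download_url):
    """Build a minimal catalog -> dataset -> distribution record as a dict."""
    distribution = {
        "@type": "dcat:Distribution",
        "dcat:downloadURL": {"@id": download_url},
    }
    dataset = {
        "@type": "dcat:Dataset",
        "dct:title": dataset_title,
        "dcat:distribution": [distribution],
    }
    return {
        "@context": {
            "dcat": "http://www.w3.org/ns/dcat#",
            "dct": "http://purl.org/dc/terms/",
        },
        "@type": "dcat:Catalog",
        "dct:title": catalog_title,
        "dcat:dataset": [dataset],
    }


# Hypothetical values, for illustration only.
catalog = make_dcat_catalog(
    "Digital library datasets",
    "Loan statistics",
    "https://example.org/loans.csv",
)
serialized = json.dumps(catalog, indent=2)
```

A mapping function of the kind the abstract mentions would translate a library's native dataset descriptions into such a record, so that different datasets share one vocabulary.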
