• Title/Summary/Keyword: search similarity


Analysis and Estimation for Market Share of Biologics based on Google Trends Big Data (구글 트렌드 빅데이터를 통한 바이오의약품의 시장 점유율 분석과 추정)

  • Bong, Ki Tae;Lee, Heesang
    • Journal of Korean Society of Industrial and Systems Engineering / v.43 no.2 / pp.14-24 / 2020
  • Google Trends is a useful tool that not only lets users set search periods but also provides search volume for specific countries, regions, and cities. Existing research has shown that big data from Google Trends can be used for online market analysis of opinion-sensitive products instead of an on-site survey. This study investigated the market share of tumor necrosis factor-alpha (TNF-α) inhibitors, pharmaceutical products in great demand, based on big data provided by Google Trends. In this case study, consumer interest data from Google Trends were compared with the actual product sales of the top three TNF-α inhibitors (Enbrel, Remicade, and Humira). The correlation and relative gap between sales-based market share and interest-based market share were analyzed statistically. In addition, for the country-specific analysis, three major countries (USA, Germany, and France) were selected for market share analysis of the top three TNF-α inhibitors. As a result, significant correlation and similarity were identified. In the case of Remicade's biosimilars, consumer interest in two biosimilar products (Inflectra and Renflexis) increased after FDA approval. The analysis showed that Google Trends is a powerful tool for estimating the market share of biosimilars. This study is the first investigation of market share analysis for pharmaceutical products using Google Trends big data, and it shows that global and regional market share analysis and estimation are applicable to interest-sensitive products.
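
The comparison between sales-based and interest-based market share described in this abstract can be illustrated with a minimal sketch. The numbers below are made up for three hypothetical products and are not data from the study. The definition of `relative_gap` is also an assumption, since the abstract does not state the formula the authors used.

```python
from math import sqrt

def shares(values):
    """Convert raw values (sales or search interest) into fractions of the total."""
    total = sum(values)
    return [v / total for v in values]

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def relative_gap(sales_share, interest_share):
    """Absolute difference relative to the sales-based share (assumed definition)."""
    return [abs(i - s) / s for s, i in zip(sales_share, interest_share)]

# Illustrative, invented values for three hypothetical products.
sales = [50.0, 30.0, 20.0]
interest = [45.0, 35.0, 20.0]

sales_share = shares(sales)
interest_share = shares(interest)
r = pearson(sales_share, interest_share)
gaps = relative_gap(sales_share, interest_share)
```

A correlation near 1 together with small gaps would indicate that search interest tracks sales share closely for these products. With only a handful of products, such a coefficient should be read with caution.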

Fast Handwriting Recognition Using Model Graph (모델 그래프를 이용한 빠른 필기 인식 방법)

  • Oh, Se-Chang
    • Journal of the Korea Institute of Information and Communication Engineering / v.16 no.5 / pp.892-898 / 2012
  • Rough classification methods are used to improve recognition speed in many character recognition problems. In this case, an error in rough classification can produce an irreversible result. To reduce this risk, methods that duplicate each model in several classes are used, but these methods cannot completely rule out errors caused by rough classification. In this paper, a recognition method is proposed that increases speed by matching models selectively, without any increase in error. The method constructs a model graph using the similarity between models. A search process then begins from a particular point in the model graph. In this process, the matching of unnecessary models that are not similar to the input pattern is reduced. The proposed method is applied to the recognition of handwritten digits and upper- and lowercase English letters. In the experiments, the proposed method was compared with the basic method, which matches all models against the input pattern. By controlling the out-degree of the model graph and the number of candidates maintained during the search, the same recognition rate as the basic method was obtained while the recognition speed increased to 2.45 times that of the basic method.
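
The idea of searching a similarity graph of models instead of matching every model can be sketched as follows. This is a generic best-first search under stated assumptions, not the paper's algorithm: models are plain feature vectors, similarity is squared Euclidean distance, and the function names `build_model_graph` and `graph_search` are hypothetical. The parameters `out_degree` and `num_candidates` stand in for the two controls named in the abstract.

```python
def distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def build_model_graph(models, out_degree):
    """Link every model to its `out_degree` most similar models."""
    graph = {}
    for i, m in enumerate(models):
        others = sorted((j for j in range(len(models)) if j != i),
                        key=lambda j: distance(m, models[j]))
        graph[i] = others[:out_degree]
    return graph

def graph_search(models, graph, pattern, start, num_candidates):
    """Expand only the best candidates; return the winner and the match count."""
    matched = {start: distance(pattern, models[start])}
    expanded = set()
    while True:
        best = sorted(matched, key=matched.get)[:num_candidates]
        todo = [n for n in best if n not in expanded]
        if not todo:
            break
        for node in todo:
            expanded.add(node)
            for nb in graph[node]:
                if nb not in matched:
                    matched[nb] = distance(pattern, models[nb])
    winner = min(matched, key=matched.get)
    return winner, len(matched)

# Toy example: ten one-dimensional models at positions 0..9.
models = [(float(i),) for i in range(10)]
graph = build_model_graph(models, out_degree=2)
pattern = (7.2,)
winner, n_matched = graph_search(models, graph, pattern, start=5, num_candidates=2)
exhaustive = min(range(len(models)), key=lambda i: distance(pattern, models[i]))
```

In this toy case the search reaches the same winner as exhaustive matching while matching fewer models. In general a greedy search of this kind can miss the best model, and larger values of `out_degree` and `num_candidates` trade speed for safety.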

Extended Semantic Web Services Retrieval Model for the Intelligent Web Services (지능형 웹 서비스를 위한 확장된 시맨틱 웹서비스 검색 모델)

  • Choi, Ok-Kyung;Han, Sang-Yong;Lee, Zoon-Ky
    • The KIPS Transactions:PartD / v.13D no.5 s.108 / pp.725-730 / 2006
  • Recently, Web services have become a key technology that is indispensable for e-business. Because they can provide the desired information or service regardless of time and place, the integration of application systems within a single business or between multiple businesses using standardized technologies is being realized over open networks and the Internet. However, current Web services retrieval systems, which are based on text-oriented search, cannot provide reliable search results that take into account the similarity or interrelation between terms. Currently there are no Web services retrieval models containing such semantic web functions. This research aims to solve these problems by designing and implementing an extended Semantic Web Services Retrieval Model that is capable of searching general web documents, UDDI, and semantic web documents. Execution results are presented in this paper, and the model's efficiency and accuracy are verified through them.

The Kernel Trick for Content-Based Media Retrieval in Online Social Networks

  • Cha, Guang-Ho
    • Journal of Information Processing Systems / v.17 no.5 / pp.1020-1033 / 2021
  • Nowadays, online or mobile social network services (SNS) are very popular and widely used in our society and daily lives to instantly share, disseminate, and search information. In particular, SNS such as YouTube, Flickr, Facebook, and Amazon allow users to upload billions of images or videos and also provide a large amount of multimedia information to users. Information retrieval in multimedia-rich SNS is a very useful but challenging task. Content-based media retrieval (CBMR) is the process of obtaining the relevant image or video objects for a given query from a collection of information sources. However, CBMR suffers from the curse of dimensionality due to the inherently high-dimensional features of media data. This paper investigates the effectiveness of the kernel trick in CBMR, specifically kernel principal component analysis (KPCA) for dimensionality reduction. KPCA is a nonlinear extension of linear principal component analysis (LPCA) that discovers nonlinear embeddings using the kernel trick. The fundamental idea of KPCA is to map the input data into a high-dimensional feature space through a nonlinear kernel function and then compute the principal components in that mapped space. This paper investigates the potential of KPCA in CBMR for feature extraction or dimensionality reduction. Using the Gaussian kernel in our experiments, we compute the principal components of an image dataset in the transformed space and then use them as new feature dimensions for the image dataset. Moreover, KPCA can be applied to many other domains besides CBMR where LPCA has been used to extract features and where the nonlinear extension would be effective. Our results from extensive experiments demonstrate that the potential of KPCA is very encouraging compared with LPCA in CBMR.
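
The KPCA step described in this abstract (Gaussian kernel, centering in feature space, principal components of the kernel matrix) can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: it extracts only the first component by power iteration, the bandwidth `sigma` and the toy data are arbitrary assumptions, and a real system would use a linear algebra library to obtain several components.

```python
from math import exp, sqrt

def gaussian_kernel(x, y, sigma=1.0):
    """Gaussian (RBF) kernel between two vectors."""
    d2 = sum((a - b) ** 2 for a, b in zip(x, y))
    return exp(-d2 / (2.0 * sigma ** 2))

def centered_kernel_matrix(data, sigma=1.0):
    """Kernel matrix centered in feature space: K - 1K - K1 + 1K1."""
    n = len(data)
    k = [[gaussian_kernel(data[i], data[j], sigma) for j in range(n)]
         for i in range(n)]
    row_mean = [sum(row) / n for row in k]
    total_mean = sum(row_mean) / n
    return [[k[i][j] - row_mean[i] - row_mean[j] + total_mean
             for j in range(n)] for i in range(n)]

def top_component(kc, iterations=200):
    """Leading eigenpair of the centered kernel matrix by power iteration."""
    n = len(kc)
    v = [float(i + 1) for i in range(n)]
    for _ in range(iterations):
        w = [sum(kc[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * sum(kc[i][j] * v[j] for j in range(n))
                     for i in range(n))
    return eigenvalue, v

def project(kc, eigenvalue, v):
    """Coordinates of the training points on the first kernel component."""
    alpha = [x / sqrt(eigenvalue) for x in v]
    n = len(v)
    return [sum(kc[i][j] * alpha[j] for j in range(n)) for i in range(n)]

# Toy data: two well-separated groups of 2-D points (invented for illustration).
data = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (3.0, 3.0), (3.1, 3.0), (3.0, 3.1)]
kc = centered_kernel_matrix(data, sigma=1.0)
eigenvalue, v = top_component(kc)
proj = project(kc, eigenvalue, v)
```

On this toy data the first kernel component separates the two groups by sign, which is the kind of new feature dimension the abstract refers to.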

A Image Search Algorithm using Coefficients of The Cosine Transform (여현변환 계수를 이용한 이미지 탐색 알고리즘)

  • Lee, Seok-Han
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology / v.12 no.1 / pp.13-21 / 2019
  • Content-based image retrieval makes use of features within an image, such as color, texture, and shape, to retrieve data. We present a novel approach for improving retrieval accuracy based on a DCT Filter-Bank. First, we perform the DCT on a given image and generate a Filter-Bank using the DCT coefficients for each color channel. In this step, the DC coefficient and a limited number of AC coefficients are used. Next, a feature vector is obtained from the histogram of the quantized DC coefficients. Then, the AC coefficients in the Filter-Bank are separated into three main groups indicating horizontal, vertical, and diagonal edge directions, respectively, according to their spatial-frequency properties. Each directional group creates its histogram after the Otsu binarization technique is applied. Finally, we project each histogram onto the horizontal and vertical axes and generate a feature vector for each group. The computed DC and AC feature vector bins are concatenated and used in the similarity checking procedure. In experiments using 1,000 databases, this approach outperformed the previous retrieval method, which used color information.
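
The DC part of the feature described in this abstract can be sketched as follows: a block-wise 2-D DCT, a histogram of the quantized DC coefficients, and a histogram comparison. This is a simplified illustration under assumptions, not the paper's Filter-Bank: the block size, number of bins, single gray channel, and the use of histogram intersection as the similarity measure are all choices made here, and the directional AC grouping and Otsu binarization are omitted.

```python
from math import cos, pi, sqrt

def dct2(block):
    """Orthonormal 2-D DCT-II of a square block given as a list of rows."""
    n = len(block)

    def c(k):
        return sqrt(1.0 / n) if k == 0 else sqrt(2.0 / n)

    out = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            s = 0.0
            for x in range(n):
                for y in range(n):
                    s += (block[x][y]
                          * cos((2 * x + 1) * u * pi / (2 * n))
                          * cos((2 * y + 1) * v * pi / (2 * n)))
            out[u][v] = c(u) * c(v) * s
    return out

def dc_histogram(image, block=4, bins=4, max_pixel=255):
    """Histogram of quantized DC coefficients over non-overlapping blocks.

    Assumes the image height and width are multiples of `block`.
    """
    hist = [0] * bins
    max_dc = block * max_pixel  # largest DC value of an orthonormal DCT block
    for top in range(0, len(image), block):
        for left in range(0, len(image[0]), block):
            tile = [row[left:left + block] for row in image[top:top + block]]
            dc = dct2(tile)[0][0]
            hist[min(int(dc / max_dc * bins), bins - 1)] += 1
    return hist

def histogram_intersection(h1, h2):
    """Similarity in [0, 1] between two histograms with the same total count."""
    return sum(min(a, b) for a, b in zip(h1, h2)) / max(sum(h1), 1)

# Toy 4x8 gray image: a black block next to a white block.
image = [[0] * 4 + [255] * 4 for _ in range(4)]
hist = dc_histogram(image)
flat = dct2([[100] * 4 for _ in range(4)])
```

For a constant block the DC coefficient carries all the energy and every AC coefficient is zero, which is why the DC histogram behaves like a coarse brightness or color feature.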

Target Image Exchange Model for Object Tracking Based on Siamese Network (샴 네트워크 기반 객체 추적을 위한 표적 이미지 교환 모델)

  • Park, Sung-Jun;Kim, Gyu-Min;Hwang, Seung-Jun;Baek, Joong-Hwan
    • Journal of the Korea Institute of Information and Communication Engineering / v.25 no.3 / pp.389-395 / 2021
  • In this paper, we propose a target image exchange model to improve the performance of an object tracking algorithm based on a Siamese network. A Siamese network-based object tracking algorithm tracks the object by finding the most similar part of the search image using only the target image specified in the first frame of the sequence. Since similarity is compared only between the object in the first frame and the search image, once tracking fails, errors accumulate and the tracker drifts to a part other than the tracked object. Therefore, we design a CNN (Convolutional Neural Network)-based model to check whether tracking is progressing well, and we define the target image exchange timing using the score output by the Siamese network-based object tracking algorithm. The performance of the proposed model was evaluated using the VOT-2018 dataset, achieving an accuracy of 0.611 and a robustness of 22.816.

Process Standardization for the Construction of Job-Exposure Matrix Using the Work Environment Measurement Database (작업환경측정 결과 데이터베이스를 활용한 직무노출매트릭스 구축을 위한 공정 표준화)

  • Sangjun Choi;Ju-Hyun Park;Dong-Hee Koh;Donguk Park;Hwan-Cheol Kim;Dae Sung Lim;Yeji Sung;Kyoung Yoon Ko;Ji Seon Lim;Hoekyeong Seo
    • Journal of Korean Society of Occupational and Environmental Hygiene / v.33 no.1 / pp.78-90 / 2023
  • Objectives: The purpose of this study is to standardize the process code of the work environment measurement database (WEMD) for the construction of a job-exposure matrix (JEM). Methods: The standard process code (SPC) was reclassified based on process similarity, drawing upon the code used in the existing K2B. It was supplemented through review by industrial hygiene experts. In addition, an index word database related to the SPC was created and used for SPC search. A pilot evaluation project was conducted by experts to evaluate the validity of the newly reclassified standard process code. Results: A total of 70 final SPCs were developed, including 31 processes related to the construction industry. Using the Shiny program, we developed a standard code finder that can be used on the web (https://kscf.shinyapps.io/scf_app/). The pilot evaluation found that the standard codes were easier to search for than the previous codes, so the finder was highly utilized. Conclusions: It is expected that JEM construction using industry-process information drawing on WEMD data will be possible using the 70 newly standardized process codes.

Elicitation of Collective Intelligence by Fuzzy Relational Methodology (퍼지관계 이론에 의한 집단지성의 도출)

  • Joo, Young-Do
    • Journal of Intelligence and Information Systems / v.17 no.1 / pp.17-35 / 2011
  • Collective intelligence is commons-based production arising from the collaboration and competition of many peer individuals. In other words, it is the aggregation of individual intelligence that leads to the wisdom of crowds. Recently, the utilization of collective intelligence has become an emerging research area, since it has been adopted as an important principle of Web 2.0, which aims at openness, sharing, and participation. This paper introduces an approach that seeks collective intelligence through cognition of the relations and interactions among individual participants. It describes a methodology well suited to evaluating individual intelligence, with information retrieval and classification as the application field. The research investigates how to derive and represent such cognitive intelligence from individuals by applying fuzzy relational theory to personal construct theory and the knowledge grid technique. Crucial to this research is formally implementing and interpretively processing the cognitive knowledge of participants who engage in mutual relations and social interaction. What is needed is a technique for analyzing the structure of cognitive intelligence in the form of a Hasse diagram, which is an instantiation of this perceptive intelligence of human beings. The search for collective intelligence requires a theory of similarity to deal with two underlying problems: clustering social subgroups of individuals by identifying individual intelligence and the commonality among intelligences, and then eliciting collective intelligence by aggregating the congruence or sharing among all participants of the entire group. Unlike standard approaches to similarity based on statistical techniques, the method presented employs a theory of fuzzy relational products, with the related computational procedures, to cover issues of similarity and dissimilarity.

Identification and Biochemical Characterization of Xylanase-producing Streptomyces glaucescens subsp. WJ-1 Isolated from Soil in Jeju Island, Korea (제주도 토양에서 분리한 xylanase 생산균주 Streptomyces glaucescens subsp. WJ-1의 동정 및 효소의 생화학적 특성 연구)

  • Kim, Da Som;Jung, Sung Cheol;Bae, Chang Hwan;Chi, Won-Jae
    • Microbiology and Biotechnology Letters / v.45 no.1 / pp.43-50 / 2017
  • A xylan-degrading bacterium (strain WJ-1) was isolated from soil collected from Jeju Island, Republic of Korea. Strain WJ-1 was characterized as a gram-positive, aerobic, and spore-forming bacterium. The predominant fatty acid in this bacterium was anteiso-$C_{15:0}$ (42.99%). A similarity search based on 16S rRNA gene sequences suggested that the strain belonged to the genus Streptomyces. Further, strain WJ-1 shared the highest sequence similarity with the type strains Streptomyces spinoverrucosus NBRC 14228, S. minutiscleroticus NBRC 13000, and S. glaucescens NBRC 12774. Together, they formed a coherent cluster in a phylogenetic tree based on the neighbor-joining algorithm. The DNA G+C content of strain WJ-1 was 74.7 mol%. The level of DNA-DNA relatedness between strain WJ-1 and the most closely related species, S. glaucescens NBRC 12774, was 85.7%. DNA-DNA hybridization, 16S rRNA gene sequence similarity, and the phenotypic and chemotaxonomic characteristics suggest that strain WJ-1 constitutes a novel subspecies of S. glaucescens. Thus, the strain was designated as S. glaucescens subsp. WJ-1 (Korean Agricultural Culture Collection [KACC] accession number 92086). Additionally, strain WJ-1 secreted thermostable endo-type xylanases that converted xylan to xylooligosaccharides such as xylotriose and xylotetraose. The enzymes exhibited optimal activity at pH 7.0 and $55^{\circ}C$.

SOM-Based $R^{*}-Tree$ for Similarity Retrieval (자기 조직화 맵 기반 유사 검색 시스템)

  • O, Chang-Yun;Im, Dong-Ju;O, Gun-Seok;Bae, Sang-Hyeon
    • The KIPS Transactions:PartD / v.8D no.5 / pp.507-512 / 2001
  • Feature-based similarity has become an important research issue in multimedia database systems. The features of multimedia data are useful for discriminating between multimedia objects. The performance of conventional multidimensional data structures tends to deteriorate as the number of dimensions of the feature vectors increases. The $R^{*}-Tree$ is the most successful variant of the R-Tree. In this paper, we propose a SOM-based $R^{*}-Tree$ as a new indexing method for high-dimensional feature vectors. The SOM-based $R^{*}-Tree$ combines the SOM and the $R^{*}-Tree$ to achieve search performance that scales better to high dimensionalities. Self-Organizing Maps (SOMs) provide a mapping from high-dimensional feature vectors onto a two-dimensional space. The map is called a topological feature map, and it preserves the mutual relationships (similarity) in the feature spaces of the input data, clustering mutually similar feature vectors in neighboring nodes. Each node of the topological feature map holds a codebook vector. We experimentally compare the retrieval time cost of the SOM-based $R^{*}-Tree$ with that of an SOM and an $R^{*}-Tree$ using color feature vectors extracted from 40,000 images. The results show that the SOM-based $R^{*}-Tree$ outperforms both the SOM and the $R^{*}-Tree$ because it reduces the number of nodes needed to build the $R^{*}-Tree$ and the retrieval time cost.
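
The SOM half of this method can be sketched briefly: train a small map, then group the feature vectors by their best-matching node so that only the codebook vectors of non-empty nodes need to be indexed. This is a rough illustration under assumptions, not the paper's system. The grid size, learning-rate and radius schedules, and the random toy "color" vectors are all invented here, and the R*-Tree that would index the codebook vectors is not implemented.

```python
import random
from math import exp

def sqdist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def best_matching_unit(codebook, vec):
    """Grid position of the node whose codebook vector is closest to `vec`."""
    return min(codebook, key=lambda node: sqdist(codebook[node], vec))

def train_som(data, rows, cols, epochs=50, seed=0):
    """Online SOM training with a Gaussian neighbourhood on a rows x cols grid."""
    rng = random.Random(seed)
    dim = len(data[0])
    codebook = {(r, c): [rng.random() for _ in range(dim)]
                for r in range(rows) for c in range(cols)}
    for epoch in range(epochs):
        frac = 1.0 - epoch / epochs          # decays from 1 toward 0
        rate = 0.5 * frac                    # learning rate
        radius = 1.0 + max(rows, cols) / 2.0 * frac
        for vec in data:
            bmu = best_matching_unit(codebook, vec)
            for node, w in codebook.items():
                grid_d2 = (node[0] - bmu[0]) ** 2 + (node[1] - bmu[1]) ** 2
                h = exp(-grid_d2 / (2.0 * radius ** 2))
                for k in range(dim):
                    w[k] += rate * h * (vec[k] - w[k])
    return codebook

def assign_to_nodes(codebook, data):
    """Group data indices by best-matching node; empty nodes are left out."""
    buckets = {}
    for idx, vec in enumerate(data):
        buckets.setdefault(best_matching_unit(codebook, vec), []).append(idx)
    return buckets

# Toy "color" vectors in [0, 1]^3, generated for illustration only.
rng = random.Random(1)
data = [[rng.random() for _ in range(3)] for _ in range(60)]
codebook = train_som(data, rows=3, cols=3)
buckets = assign_to_nodes(codebook, data)
```

A query would first be mapped to its best-matching node and then compared only with the vectors in that bucket, and possibly neighbouring ones. Indexing just the non-empty nodes is what keeps the tree small in the approach the abstract describes.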
