• Title/Summary/Keyword: New Category


Classification of e-mail Using Dynamic Category Hierarchy and Automatic category generation (자동 카테고리 생성과 동적 분류 체계를 사용한 이메일 분류)

  • Ahn Chan Min;Park Sang Ho;Lee Ju-Hong;Choi Bum-Ghi;Park Sun
    • Journal of Intelligence and Information Systems / v.10 no.2 / pp.79-89 / 2004
  • As the volume of e-mail messages has increased, a new technique for efficient e-mail classification is needed. E-mail classification falls into two classes: binary classification and multi-classification. Current binary classification methods are mostly spam mail classification methods based on rule-driven, Bayesian, SVM, and similar approaches. Current multi-classification methods are based on clustering, which groups e-mails by similarity. In this paper, we propose a novel method for e-mail classification. It combines an automatic category generation method based on the vector model with a dynamic category hierarchy construction method. This method can multi-classify e-mail automatically and manage a large amount of e-mail efficiently. In addition, it increases search accuracy through dynamic reclassification of e-mails.


Evaluating Carbon Dioxide Emission from Cadastral Category based on Tier 3 Approach (Tier 3 방식에 의거한 지목별 온실가스 배출 실태평가)

  • Kim, Dae-Ho;Um, Jung-Sup
    • Spatial Information Research / v.19 no.3 / pp.11-22 / 2011
  • Carbon dioxide emissions are usually calculated from official energy consumption statistics produced for a number of specialized industrial processes, such as refineries and power plants. The aim of this research was to evaluate the potential of the cadastral system for monitoring carbon dioxide emitted from land use. An empirical study of a cadastral category was conducted to demonstrate how on-site measurement can assist in estimating carbon dioxide emissions in land-use-specific settings. The cadastral category based analysis made it possible to identify area-wide patterns of carbon dioxide emission that cannot be obtained from traditional government statistics. It identified successively increasing trends in human-related parcels such as housing land and decreasing trends of carbon dioxide in sink parcels (e.g., forest). The results indicate that the cadastral parcel could be used not only as a tool to monitor carbon dioxide emissions but also as evidence for restricting development activities that adversely affect carbon dioxide emissions, such as road construction. As a result, the research findings establish the new concept of "carbon dioxide emission monitoring based on cadastral category", proposed as the initial aim of this paper.

Retailer's Store Brand Product Line Design and Product Assortment Decision in the Vertically Differentiated Product Category (수직적으로 차별화된 제품 카테고리 내에서 소매상의 스토어 브랜드 제품군 디자인 및 제품구색에 대한 의사결정)

  • Chung, Hwan
    • Journal of the Korean Operations Research and Management Science Society / v.36 no.3 / pp.107-120 / 2011
  • The increased availability of store brand suppliers now provides retailers with opportunities to create their own lines of vertically differentiated multiple store brands within a product category. As the number of store brands increases, the retailer's shelf space becomes more crowded, which may force the retailer to consider dropping some national brands from its assortment. Despite these trends, the problem of product line design in a vertically differentiated product category has been analyzed mainly from a manufacturer's perspective in the marketing literature, and it is not known to what extent the findings of the existing product line design literature provide applicable strategic guidelines for the new problem faced by retailers. In this study, we address this deficiency in the literature and conduct an in-depth study of the retailer's strategic design of a line of store brands and its assortment decision within the context of retail category management. We analyze the retailer's decision about not only how to design a line of store brands but also which national brand to drop from its assortment. The results of our analysis are as follows. First, if the retailer has to drop one of the national brands from its assortment, it is best to drop the low-quality national brand rather than the high-quality national brand. Second, the retailer should position the high-quality store brand relatively close in quality to the high-quality national brand that remains on its shelf, so as to maximize the retail margin from the national brand. On the other hand, the retailer should set the quality of the low-quality store brand at a lower level than that of the low-quality national brand to increase total category demand by attracting more price-sensitive consumers. By doing so, the retailer can also minimize cannibalization between the two store brands. Lastly, our analysis shows that the introduction of a line of store brands improves consumer welfare by increasing the real values of all products on the shelf.

Design and Implementation of Web Directory Engine Using Dynamic Category Hierarchy (동적분류에 의한 주제별 웹 검색엔진의 설계 및 구현)

  • Choi Bum-Ghi;Park Sun;Park Tae-Su;Song Jae-Won;Lee Ju-Hong
    • Journal of Internet Computing and Services / v.7 no.2 / pp.71-80 / 2006
  • Web search engines offer two main methods: directory searching and keyword searching. Keyword searching shows a high recall rate but tends to return too many results for users to find the pages they want. Directory searching makes it difficult to find the desired pages when users select an improper category because they do not know the exact one; that is, it shows high precision but low recall. We designed and implemented a new web search engine to resolve the problems of the directory search method. It regards a category as a fuzzy set that contains keywords and calculates the degree of inclusion between categories. The merit of this method is that it enhances the recall rate of directory searching by expanding subcategories on the basis of similarity.
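
The abstract above does not state how the degree of inclusion between categories is computed. The following is a minimal sketch that assumes one common fuzzy subsethood measure, the sum of element-wise minima divided by the total membership of the first set. The function name `inclusion_degree`, the keyword weights, and the two example categories are all hypothetical and are not taken from the paper.

```python
def inclusion_degree(a, b):
    """Degree to which fuzzy set `a` is included in fuzzy set `b`.

    Each set maps a keyword to a membership weight in [0, 1].
    Assumed measure: sum(min(a_k, b_k)) / sum(a_k).
    """
    total = sum(a.values())
    if total == 0:
        return 0.0  # an empty category is treated as not included anywhere
    overlap = sum(min(w, b.get(k, 0.0)) for k, w in a.items())
    return overlap / total


# Hypothetical categories with keyword membership weights.
sports = {"soccer": 1.0, "league": 0.8, "score": 0.5}
soccer = {"soccer": 1.0, "league": 0.6}

print(inclusion_degree(soccer, sports))  # soccer lies fully inside sports
print(inclusion_degree(sports, soccer))  # sports is only partly inside soccer
```

Under this assumed measure the relation is asymmetric, which is what lets a narrow category be attached as a subcategory of a broader one.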


Improving Classification Accuracy in Hierarchical Trees via Greedy Node Expansion

  • Byungjin Lim;Jong Wook Kim
    • Journal of the Korea Society of Computer and Information / v.29 no.6 / pp.113-120 / 2024
  • With the advancement of information and communication technology, we can easily generate various forms of data in our daily lives. To efficiently manage such a large amount of data, systematic classification into categories is essential. For effective search and navigation, data is organized into a tree-like hierarchical structure known as a category tree, which is commonly seen in news websites and Wikipedia. As a result, various techniques have been proposed to classify large volumes of documents into the terminal nodes of category trees. However, document classification methods using category trees face a problem: as the height of the tree increases, the number of terminal nodes multiplies exponentially, which increases the probability of misclassification and ultimately leads to a reduction in classification accuracy. Therefore, in this paper, we propose a new node expansion-based classification algorithm that satisfies the classification accuracy required by the application, while enabling detailed categorization. The proposed method uses a greedy approach to prioritize the expansion of nodes with high classification accuracy, thereby maximizing the overall classification accuracy of the category tree. Experimental results on real data show that the proposed technique provides improved performance over naive methods.
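
The abstract does not specify the accuracy model or the exact expansion rule, so the following is only a toy sketch of greedy node expansion under stated assumptions. Each node carries a hypothetical `weight` (the share of documents reaching it) and a hypothetical `split_acc` (the accuracy of routing a document among its children). Overall accuracy is modelled as the weighted path accuracy over the current terminal nodes. At each step the sketch expands the node whose expansion leaves the highest overall accuracy, and it stops when no expansion would keep the accuracy at or above `required_acc`.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    weight: float           # assumed share of documents reaching this node
    split_acc: float = 1.0  # assumed accuracy of routing among this node's children
    children: list = field(default_factory=list)


def greedy_expand(root, required_acc):
    """Return (terminal node names, estimated accuracy) after greedy expansion."""
    leaves = [(root, 1.0)]  # (node, accuracy of the path leading to it)
    overall = 1.0           # with only the root, nothing can be misclassified
    while True:
        best = None
        for i, (node, path_acc) in enumerate(leaves):
            if not node.children:
                continue
            # Expanding `node` multiplies the path accuracy of its documents by split_acc.
            new_overall = overall - node.weight * path_acc * (1.0 - node.split_acc)
            if best is None or new_overall > best[0]:
                best = (new_overall, i)
        if best is None or best[0] < required_acc:
            break  # nothing left to expand, or every expansion breaks the requirement
        overall, i = best
        node, path_acc = leaves.pop(i)
        leaves.extend((child, path_acc * node.split_acc) for child in node.children)
    return [node.name for node, _ in leaves], overall


# Hypothetical category tree.
news = Node("news", 0.6, 0.9, [Node("politics", 0.3), Node("sports", 0.3)])
wiki = Node("wiki", 0.4, 0.7, [Node("science", 0.2), Node("history", 0.2)])
root = Node("root", 1.0, 0.95, [news, wiki])

names, acc = greedy_expand(root, required_acc=0.85)
print(names, round(acc, 3))  # ['wiki', 'politics', 'sports'] 0.893
```

In this example `wiki` stays unexpanded because splitting it would push the estimated accuracy down to 0.779, below the required 0.85.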

A Methodology for Automatic Multi-Categorization of Single-Categorized Documents (단일 카테고리 문서의 다중 카테고리 자동확장 방법론)

  • Hong, Jin-Sung;Kim, Namgyu;Lee, Sangwon
    • Journal of Intelligence and Information Systems / v.20 no.3 / pp.77-92 / 2014
  • Recently, numerous documents including unstructured data and text have been created due to the rapid increase in the usage of social media and the Internet. Each document is usually assigned a specific category for the convenience of users. In the past, categorization was performed manually. However, manual categorization not only cannot guarantee accuracy but also requires a large amount of time and huge costs. Many studies have been conducted on the automatic creation of categories to overcome the limitations of manual categorization. Unfortunately, most of these methods cannot be applied to complex documents with multiple topics because they assume that one document can be categorized into one category only. To overcome this limitation, some studies have attempted to categorize each document into multiple categories. However, they are also limited in that their learning process requires training on a multi-categorized document set. These methods therefore cannot be applied to multi-categorization of most documents unless multi-categorized training sets are provided. To remove the requirement of a multi-categorized training set imposed by traditional multi-categorization algorithms, we propose a new methodology that can extend the category of a single-categorized document to multiple categories by analyzing relationships among categories, topics, and documents. First, we find the relationship between documents and topics by using the result of topic analysis for single-categorized documents. Second, we construct a correspondence table between topics and categories by investigating the relationship between them. Finally, we calculate the matching scores of each document for multiple categories. The results imply that a document can be classified into a certain category if and only if the matching score is higher than the predefined threshold. For example, we can classify a certain document into three categories that have larger matching scores than the predefined threshold. The main contribution of our study is that our methodology can improve the applicability of traditional multi-category classifiers by generating multi-categorized documents from single-categorized documents. Additionally, we propose a module for verifying the accuracy of the proposed methodology. For performance evaluation, we performed intensive experiments with news articles. News articles are clearly categorized by theme, and they contain less vulgar language and slang than other typical text documents. We collected news articles from July 2012 to June 2013. The number of articles varies widely across categories. This is because readers have different levels of interest in each category. Additionally, the result is also attributed to the differences in the frequency of the events in each category. In order to minimize the distortion of the result from the number of articles in different categories, we extracted 3,000 articles equally from each of the eight categories. Therefore, the total number of articles used in our experiments was 24,000. The eight categories were "IT Science," "Economy," "Society," "Life and Culture," "World," "Sports," "Entertainment," and "Politics." By using the news articles that we collected, we calculated the document/category correspondence scores by utilizing topic/category and document/topic correspondence scores. The document/category correspondence score can be said to indicate the degree of correspondence of each document to a certain category. As a result, we could present two additional categories for each of the 23,089 documents. Precision, recall, and F-score were 0.605, 0.629, and 0.617, respectively, when only the top 1 predicted category was evaluated, whereas they were 0.838, 0.290, and 0.431 when the top 1 to 3 predicted categories were considered. It was very interesting to find a large variation between the scores of the eight categories on precision, recall, and F-score.
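
The F-scores reported in the abstract above can be checked against the reported precision and recall. The abstract does not say which F-score variant was used; this sketch assumes the usual F1, the harmonic mean of precision and recall, which reproduces both reported values. The function name `f_score` is illustrative.

```python
def f_score(precision, recall):
    """F1 score: harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


top1 = f_score(0.605, 0.629)    # only the top 1 predicted category
top1_3 = f_score(0.838, 0.290)  # top 1 to 3 predicted categories

print(round(top1, 3), round(top1_3, 3))  # 0.617 0.431
```

Both values match the abstract to three decimal places.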

Selecting Ordering Policy and Items Classification Based on Canonical Correlation and Cluster Analysis

  • Nagasawa, Keisuke;Irohara, Takashi;Matoba, Yosuke;Liu, Shuling
    • Industrial Engineering and Management Systems / v.11 no.2 / pp.134-141 / 2012
  • It is difficult to find an appropriate ordering policy for many types of items. One of the reasons for this difficulty is that each item has a different demand trend. We classify items by shipment trend and then decide the ordering policy for each item category. In this study, we show that categorizing items by their statistical characteristics leads to an ordering policy suitable for each category. We analyze the ordering policy and shipment trend and propose a new method for selecting the ordering policy, based on finding the strongest relation between the classification of the items and the ordering policy. In our numerical experiment, from actual shipment data of about 5,000 items over the past year, we calculated many statistics that represent the trend of each item. Next, we applied canonical correlation analysis between the evaluations of ordering policies and the various statistics. Furthermore, we applied cluster analysis to the statistics concerning the performance of ordering policies. Finally, we separated the items into several categories and showed that the appropriate ordering policies differ for each category.

Collaborations in Fashion and Arts Across Industry Disciplines (패션, 예술, 산업의 협업사례 고찰)

  • Park, Kyung-Ae;Kim, Sook-Kyung
    • Journal of the Korean Society of Clothing and Textiles / v.33 no.7 / pp.1152-1163 / 2009
  • Product development and marketing that appeal to consumer emotions are important, as shown by a variety of product and service industries that integrate fashion and arts into product design and marketing through collaboration. This study analyzed the patterns in collaborations of fashion and arts across industry disciplines. A total of 278 collaboration cases reported in news articles were collected from internet databases. Cases were categorized into 5 disciplines (fashion-fashion, arts-arts, fashion-arts, fashion-other industries, and arts-other industries), and each category was analyzed by frequency distribution and collaboration type along with related partner and industry characteristics. Collaborations with other industries were observed more often than internal ones, and individuals (rather than firms) were more involved in collaborations. Though the collaboration characteristics differed by partner category and sub-category, by individual or firm, and by related industries, a variety of collaborations integrating fashion and arts into product design and development, new brand launches, product line extension, and co-marketing were observed across product and industry disciplines. The study also described fashion and arts that were integrated into consumer lifestyles.

Regional Distribution of Government-sponsored Research Institutes in Science & Technology (이공계 정부출연연구기관의 지방이전방안)

  • Chung Sun-Yang
    • Journal of Korea Technology Innovation Society / v.8 no.spc1 / pp.410-432 / 2005
  • Korea's government-sponsored research institutes (GRIs) have contributed greatly to the economic development of Korea. They have become major components of the Korean national innovation system. Recently, however, they have been blamed for low productivity and inefficiency, as well as insufficient contribution to national development. This paper argues that the major problem of Korea's GRIs lies in their concentration in a few regions, e.g., Seoul, Gyeonggi, and Daedeok. It argues that they should be fairly distributed among regions in order to contribute effectively to the development of Korea. In this regard, this paper explores the relevant policy options for effectively distributing Korean GRIs among regions. It suggests two categories of distribution scenarios. The first category is based on the types of GRIs to be distributed. This category has three scenarios: existing GRIs, branch institutes of existing GRIs, and new GRIs. The second category is based on the jurisdiction of GRIs. It also has three scenarios: the GRI system as an independent sector, a GRI-university cooperation system, and integration of GRIs into regional universities. Each of these scenarios has advantages and disadvantages. Therefore, a satisfactory scenario must be found by mixing scenarios from both categories.


Comparison Shopping System Based on RSS with Ontology Matching (온톨로지 매칭을 이용한 RSS 기반의 비교쇼핑 시스템)

  • Park, Sang-Un
    • The Journal of Information Systems / v.20 no.3 / pp.41-61 / 2011
  • To buy products through the Internet, consumers spend much time and effort collecting and comparing product information from various online shopping malls. Consumers can save effort by using price comparison sites, but comparison shopping has some shortcomings. First, comparison sites do not show the lowest price of some products that are sold in shopping malls. Second, the product information provided by comparison sites is sometimes wrong. Third, there are too many results. To overcome these shortcomings, we propose a comparison shopping system based on RSS that uses ontology matching. We used the current RSS standard for syntactic interoperability instead of suggesting new standards. Moreover, we used ontology matching for semantic interoperability to compare product information with different ontologies. The proposed ontology matching consists of three steps. The first step is finding the exact sense in WordNet for a given product category, and the second step is searching for matching product category candidates among the products of RSS feeds. The final step is calculating the similarities of the candidates to the target product category. In the experiments, we obtained better recall rates that are suitable for e-commerce environments, and the results show that our system is effective in product comparison.