• Title/Summary/Keyword: keyword filtering

A study on the Filtering of Spam E-mail using n-Gram indexing and Support Vector Machine (n-Gram 색인화와 Support Vector Machine을 사용한 스팸메일 필터링에 대한 연구)

  • 서정우;손태식;서정택;문종섭
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.14 no.2
    • /
    • pp.23-33
    • /
    • 2004
  • With the rapid growth of the Internet, message exchange by e-mail is also increasing quickly. Despite the convenience of e-mail, however, the time and cost that individuals and enterprises waste on spam mail has become a big issue. Many solutions have been studied to counter the harmful effects of spam mail. Typical methods include pattern matching on keywords and probabilistic methods such as Naive Bayesian classification. In this paper, we propose a method for separating spam mail from normal mail using a Support Vector Machine, which performs well on pattern classification problems, to compensate for the shortcomings of existing research. In particular, the proposed method carries out the learning procedure efficiently by using a word dictionary that includes an index generated by n-Gram. Finally, we verify the proposed method by comparing its spam mail separation accuracy with that of existing research.
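The n-Gram indexing step mentioned in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the use of character bigrams, the function names, and the sample mails are all assumptions made for illustration, and the classifier itself (the Support Vector Machine) is left out. The sketch only shows how an n-gram dictionary could turn each mail into a fixed-length count vector that a classifier might consume.

```python
from collections import Counter

def char_ngrams(text, n=2):
    """Return all overlapping character n-grams of text."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]

def build_index(corpus, n=2):
    """Assign an integer index to every distinct n-gram, in first-seen order."""
    index = {}
    for doc in corpus:
        for gram in char_ngrams(doc, n):
            index.setdefault(gram, len(index))
    return index

def vectorize(text, index, n=2):
    """Count the n-grams of text that appear in the index; unseen ones are ignored."""
    counts = Counter(g for g in char_ngrams(text, n) if g in index)
    vec = [0] * len(index)
    for gram, count in counts.items():
        vec[index[gram]] = count
    return vec

# Hypothetical sample mails, used only to exercise the functions.
mails = ["free money now", "meeting at noon"]
index = build_index(mails)
features = [vectorize(mail, index) for mail in mails]
```

Each row of `features` has one entry per dictionary n-gram, so mails of any length map to vectors of the same size.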

Keyword Extraction through Text Mining and Open Source Software Category Classification based on Machine Learning Algorithms (텍스트 마이닝을 통한 키워드 추출과 머신러닝 기반의 오픈소스 소프트웨어 주제 분류)

  • Lee, Ye-Seul;Back, Seung-Chan;Joe, Yong-Joon;Shin, Dong-Myung
    • Journal of Software Assessment and Valuation
    • /
    • v.14 no.2
    • /
    • pp.1-9
    • /
    • 2018
  • The proportion of users and companies using open source continues to grow. The open source software market is growing rapidly not only abroad but also in Korea. However, compared with the continuous development of open source software, there is little research on open source software subject classification, and no software classification system has been specified either. At present, users enter or tag the subject directly, which leads to misclassification and inconvenience. Research on open source software classification can also serve as a basis for open source software evaluation, recommendation, and filtering. Therefore, in this study, we propose a method for classifying open source software using machine learning models and compare the performance of those models.

A Design and Implementation of Dynamic Electronic Map Creation System for Mobile phone Map Service Using Raster Method (래스터 방식을 이용한 모바일 전화기용 지도 서비스를 위한 동적 전자 지도 생성 시스템 설계 및 구현)

  • Seo Ii-Soo;Nam In-Gil;Lee Jeong-Bae;Choi Jin-Oh;Kim Mi-Ram
    • The KIPS Transactions:PartD
    • /
    • v.12D no.1 s.97
    • /
    • pp.151-158
    • /
    • 2005
  • To use an existing map database on a mobile phone, we propose a technique for dynamically creating a wireless map that can be converted into a raster image and transmitted. We moved client module functions such as coordinate conversion, data compression, and decoding to the server, which makes it possible to run a Java browser on a resource-restricted mobile phone for dynamic creation of the wireless map. By performing the general map work at the server, we made the wireless electronic map service possible without a map database dedicated to mobile phones. By also performing the map filtering work at the server, we kept the client waiting time below the time limit. We verified the performance of the proposed technique by entering a keyword in the user interface to search for a region or facility and confirming that a raster electronic map usable on the mobile phone was created dynamically.

The Method for Real-time Complex Event Detection of Unstructured Big data (비정형 빅데이터의 실시간 복합 이벤트 탐지를 위한 기법)

  • Lee, Jun Heui;Baek, Sung Ha;Lee, Soon Jo;Bae, Hae Young
    • Spatial Information Research
    • /
    • v.20 no.5
    • /
    • pp.99-109
    • /
    • 2012
  • Recently, with the growth of social media and the spread of smartphones, the amount of data has increased considerably through heavy use of SNS (Social Network Service). As a result, the Big Data concept has emerged, and many researchers are seeking ways to make the best use of big data. To maximize the creative value of the big data held by many companies, it must be combined with existing data. Because the physical and theoretical storage structures of data sources differ so much, a system that can integrate and manage them is needed. MapReduce was developed as a system whose advantage is fast data processing through distributed processing. However, it is difficult to construct and store a system for all keywords, and the storage and search steps make real-time processing difficult to some extent. Processing complex events without a structure for handling different kinds of data also incurs extra cost. To solve this problem, an existing Complex Event Processing System can be used. Such a system takes data from different sources and combines them, enabling complex event processing that is especially useful for real-time processing of stream data. Nevertheless, unstructured text data from SNS and internet articles is managed as text, so strings must be compared every time a query is processed, which results in poor performance. Therefore, we aim to manage unstructured data and process queries quickly in a complex event processing system. We extend the data complex function to give strings a theoretical schema. This is done by converting string keywords into integer type through filtering that uses a keyword set. In addition, by using the Complex Event Processing System and processing stream data in memory in real time, we aim to reduce the time spent reading data for query processing after it has been stored on disk.
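The idea of filtering text with a keyword set and converting string keywords to integers, as described in the abstract above, can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's system: the function names, the whitespace tokenization, and the sample keywords are hypothetical. The point it shows is that once keywords are mapped to integer IDs, a query can be matched with integer set operations instead of repeated string comparisons.

```python
def build_keyword_ids(keywords):
    """Map each distinct keyword to a stable integer ID (sorted order)."""
    return {kw: i for i, kw in enumerate(sorted(set(keywords)))}

def encode_event(text, keyword_ids):
    """Keep only tokens that are in the keyword set and replace them with their IDs."""
    return [keyword_ids[tok] for tok in text.lower().split() if tok in keyword_ids]

def matches(encoded, required_ids):
    """Return True if every required keyword ID occurs in the encoded event."""
    return required_ids.issubset(encoded)

# Hypothetical keyword set and incoming message.
keyword_ids = build_keyword_ids(["fire", "flood", "seoul"])
event = encode_event("Big fire near Seoul station", keyword_ids)
query = {keyword_ids["fire"], keyword_ids["seoul"]}
is_hit = matches(event, query)
```

Tokens outside the keyword set are dropped at encoding time, so later query evaluation touches only small integers.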

Trend on the Recycling Technologies for Waste Catalyst by the Patent and Paper Analysis (특허(特許)와 논문(論文)으로 본 폐촉매(廢觸媒) 재활용(再活用) 기술(技術) 동향(動向))

  • Lee, Jin-Young;Pak, Jong-Jin;Cho, Young-Ju;Cho, Bong-Gyoo
    • Resources Recycling
    • /
    • v.22 no.2
    • /
    • pp.53-61
    • /
    • 2013
  • Since the 2000s, regulations on stationary sources of nitrogen oxides (NOx) and on the total amount of discharged pollutants have taken effect, and large-scale emitters of nitrogen oxides such as power plants, combined heat and power plants, incinerators, and chemical plants have begun introducing SCR (Selective Catalytic Reduction) denitrification facilities. As emission regulations are gradually strengthened and their coverage expanded, the use of SCR denitrification catalysts is growing. Since 2010, catalysts already installed in power plants and incinerators have been replaced with new ones because of deactivation, so technologies for recycling waste SCR denitrification catalyst as a resource rather than as waste are urgently needed. In this study, papers and patents on recycling technologies for waste catalyst were analyzed. The search was limited to open patents of the USA (US), European Union (EP), Japan (JP), and Korea (KR) and to SCI journals from 1975 to 2012. Patents and journals were collected by keyword search and filtered by filtering criteria. Trends in the patents and journals were analyzed by year, country, company, and technology.

Implementation of an Efficient Microbial Medical Image Retrieval System Applying Knowledge Databases (지식 데이타베이스를 적용한 효율적인 세균 의료영상 검색 시스템의 구현)

  • Shin Yong Won;Koo Bong Oh
    • Journal of the Korea Society of Computer and Information
    • /
    • v.10 no.1 s.33
    • /
    • pp.93-100
    • /
    • 2005
  • This study designs and implements an efficient microbial medical image retrieval system, based on knowledge and image content, that supports more accurate decisions on colonies as well as efficient education for new technicians. To this end, we first address overall inference that sets up a flexible search path using a rule base, in order to reduce the time required for microbial identification by searching for the fastest path through the identification phases based on heuristic knowledge. Next, we propose a color feature extraction method that extracts color feature vectors of visual content from a microbial image, in particular a bacteria image, using the HSV color model. In addition, for better retrieval performance on large microbial databases, we present an integrated indexing technique that combines a B+-tree for indexing simple attributes, an inverted file structure for the list of medical text keywords, and a scan-based filtering method for high-dimensional color feature vectors. Finally, the implemented system shows that complex microbial images can be managed and retrieved effectively using knowledge and the visual content itself. We expect the proposed system to rapidly decrease the learning time for novice technicians by organizing clinical knowledge well.

Trend on the Recycling Technologies for Waste Magnesium by the Patent and Paper Analysis (특허(特許)와 논문(論文)으로 본 폐(廢)마그네슘 재활용(再活用) 기술(技術) 동향(動向))

  • Moon, Byoung-Gi;You, Bong-Sun;Cho, Young-Ju;Cho, Bong-Gyoo
    • Resources Recycling
    • /
    • v.22 no.3
    • /
    • pp.73-80
    • /
    • 2013
  • Metal prices are rising rapidly because of increasing industrial demand for metals and limited available resources. As a result, securing a stable supply of these metal resources has been recognized as a core element of national competitiveness and sustained economic growth. In the case of magnesium and its alloys, which depend entirely on imports, low-grade magnesium scrap from end-of-life vehicles and 3C (Camera, Computer, Communication) parts and magnesium wastes such as sludge and dross generated during the melting process are hardly recycled. Accordingly, the development and commercialization of recycling technology for low-grade magnesium scrap is urgently needed to improve the efficiency of resource circulation and to secure a stable supply and demand of metal resources. In this study, papers and patents on recycling technologies for waste magnesium were analyzed. The search was limited to open patents of the USA (US), European Union (EP), Japan (JP), and Korea (KR) and to SCI journals from 1974 to 2012. Patents and journals were collected by keyword search and filtered by filtering criteria. Trends in the patents and journals were analyzed by year, country, company, and technology.

Technical Trend on the Recycling Technologies for Stripping Process Waste Solution by the Patent and Paper Analysis (특허(特許)와 논문(論文)으로 본 스트리핑 공정폐액(工程廢液) 재활용(再活用) 기술(技術) 동향(動向))

  • Lee, Ho-Kyung;Lee, In-Gyoo;Park, Myung-Jun;Koo, Kee-Kahb;Cho, Young-Ju;Cho, Bong-Gyoo
    • Resources Recycling
    • /
    • v.22 no.4
    • /
    • pp.81-90
    • /
    • 2013
  • Since the 1990s, with the rapid development of the information and communication industry, demand for semiconductors and LCDs has continued to increase. Consequently, demand is rising dramatically for the sensitizer that is central to forming fine circuit patterns and for the thinner and stripper liquor, the most expensive materials, used to dilute and remove photoresist, which creates a need to recycle waste thinner and stripper liquor. Recently, recycling technologies for stripping process waste solution have been widely studied from economic and environmental perspectives and in terms of the efficiency of the stripping process. In this study, papers and patents on recycling technologies for waste solution from the stripping process were analyzed. The search was limited to open patents of the USA (US), European Union (EP), Japan (JP), and Korea (KR) and to SCI journals from 1981 to 2010. Patents and journals were collected by keyword search and filtered by filtering criteria. Trends in the patents and journals were analyzed by year, country, company, and technology.

Geographical Name Denoising by Machine Learning of Event Detection Based on Twitter (트위터 기반 이벤트 탐지에서의 기계학습을 통한 지명 노이즈제거)

  • Woo, Seungmin;Hwang, Byung-Yeon
    • KIPS Transactions on Software and Data Engineering
    • /
    • v.4 no.10
    • /
    • pp.447-454
    • /
    • 2015
  • This paper proposes geographical name denoising by machine learning for Twitter-based event detection. Recently, the increasing number of smartphone users has led to growth in SNS users. In particular, Twitter's short messages (up to 140 characters) and follow service let it convey and diffuse information quickly. These characteristics, together with its mobile-optimized design, give Twitter a fast information propagation speed, so it can play a role in reporting disasters or events. Related research used individual Twitter users as sensors to detect events that occur in the real world, employing geographical names as keywords on the basis that an event occurs in a specific place. However, it ignored the noise arising from homographs of geographical names, which became an important factor lowering the accuracy of event detection. In this paper, we apply denoising with two methods, removal and forecasting. First, after a filtering step that uses a purpose-built noise database, we determine whether a word is a geographical name by using Naive Bayesian classification. Finally, using experimental data, we obtained the probability values for machine learning. With the forecasting technique proposed in this paper, the reliability of the need for the denoising technique turned out to be 89.6%.
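The two-stage idea in the abstract above (filter out known noise first, then use Naive Bayesian classification to decide whether an ambiguous word is a geographical name) can be sketched as below. This is a hedged illustration, not the paper's method: the class name, the Laplace smoothing, the noise term list, and the English toy tweets built around the ambiguous word "jordan" are all assumptions.

```python
import math
from collections import Counter

NOISE_TERMS = {"shoes", "sale"}  # hypothetical stand-in for a noise database

def is_noise(words):
    """Filtering step: flag a tweet that contains any known noise term."""
    return any(w in NOISE_TERMS for w in words)

class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.word_counts = {}        # label -> Counter of word frequencies
        self.doc_counts = Counter()  # label -> number of training tweets
        self.vocab = set()

    def fit(self, docs, labels):
        for words, label in zip(docs, labels):
            self.doc_counts[label] += 1
            self.word_counts.setdefault(label, Counter()).update(words)
            self.vocab.update(words)

    def predict(self, words):
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.doc_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            # Log prior plus smoothed log likelihood of each word.
            score = math.log(n_docs / total_docs)
            for w in words:
                score += math.log((counts[w] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label

# Toy training data: "jordan" as a place versus as something else.
train = [
    (["flood", "in", "jordan", "today"], "place"),
    (["earthquake", "near", "jordan", "border"], "place"),
    (["jordan", "scored", "thirty", "points"], "other"),
    (["jordan", "played", "well", "tonight"], "other"),
]
model = NaiveBayes()
model.fit([words for words, _ in train], [label for _, label in train])

tweet = ["flood", "near", "jordan"]
label = None if is_noise(tweet) else model.predict(tweet)
```

Tweets caught by the noise filter never reach the classifier, which mirrors the removal-then-forecasting order described in the abstract.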