• Title/Summary/Keyword: POS Data

Spam Filter by Using χ² Statistics and Support Vector Machines (카이제곱 통계량과 지지벡터기계를 이용한 스팸메일 필터)

  • Lee, Song-Wook
    • The KIPS Transactions:PartB / v.17B no.3 / pp.249-254 / 2010
  • We propose an automatic spam filter for e-mail using Support Vector Machines (SVM). We use the lexical form of each word and its part-of-speech (POS) tag as features and select features using chi-square statistics. For the experiments, we represent each feature by TF (term frequency), TF-IDF, or binary weight. After training on the selected features, the SVM classifies each e-mail as spam or not. In the experiments, the selected features improved the performance of our system, and we achieved an overall accuracy of 98.9% on the TREC05-p1 spam corpus.
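
The chi-square feature selection step described in this abstract can be illustrated with a minimal sketch. It assumes each e-mail is already reduced to a list of hypothetical `word/POS` feature strings and that labels are 1 for spam and 0 for non-spam; the function names and the 2x2 contingency-table formulation are illustrative choices, not the paper's actual implementation.

```python
from collections import Counter


def chi_square(a, b, c, d):
    """Chi-square statistic for a 2x2 table.

    a: spam e-mails containing the feature, b: non-spam containing it,
    c: spam without it, d: non-spam without it.
    """
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0
    return n * (a * d - b * c) ** 2 / denom


def select_features(docs, labels, k):
    """Return the k features with the highest chi-square score.

    docs: list of feature lists (e.g. 'free/JJ'); labels: 1 = spam, 0 = non-spam.
    Ties are broken alphabetically so the result is deterministic.
    """
    n_spam = sum(labels)
    n_ham = len(labels) - n_spam
    spam_df, ham_df = Counter(), Counter()
    for feats, y in zip(docs, labels):
        # Count document frequency, so each feature counts once per e-mail.
        (spam_df if y == 1 else ham_df).update(set(feats))
    scores = {}
    for f in set(spam_df) | set(ham_df):
        a, b = spam_df[f], ham_df[f]
        scores[f] = chi_square(a, b, n_spam - a, n_ham - b)
    return sorted(scores, key=lambda f: (-scores[f], f))[:k]
```

The selected features would then be weighted (TF, TF-IDF, or binary) and passed to an SVM trainer of the reader's choice.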

How have retailers led the HMR industry in Japan and UK?

  • CHO, Young Sang
    • The Journal of Economics, Marketing and Management / v.9 no.6 / pp.25-38 / 2021
  • Purpose: This study aims to provide researchers and practitioners with new insights into how retailers have contributed to the growth of the HMR market in Japan and the UK. Research design: The second section looks at the definition of HMR and then introduces each country's case, analysing how retailers have developed the HMR market and illustrating some implications. Finally, the authors draw a conclusion. Results: Retailers have established retailer brand development departments supported by sophisticated retail information systems, which have made a considerable contribution to the growth of the HMR market. These systems also enable retailers to accumulate retail knowledge about ready-to-eat meals and train top-level experts, while helping them build trustworthy supply chain relationships by sharing POS data with food manufacturers. Consequently, cooperation with food manufacturers has been enhanced in the HMR market in both Japan and the UK, on the basis of sophisticated delivery systems and the introduction of innovation into the HMR sector. Conclusions: Retailers should benchmark Japanese and British retailers' knowledge to grow the ready meal market in Korea and invest their marketing resources in developing a variety of HMR foods, on the basis of innovative thinking.

VOC Summarization and Classification based on Sentence Understanding (구문 의미 이해 기반의 VOC 요약 및 분류)

  • Kim, Moonjong;Lee, Jaean;Han, Kyouyeol;Ahn, Youngmin
    • KIISE Transactions on Computing Practices / v.22 no.1 / pp.50-55 / 2016
  • To understand customers' opinions or demands regarding a company's products or services, it is important to consider VOC (Voice of Customer) data; however, it is difficult to understand context from VOC because of segmented and duplicate sentences and the variety of dialog contexts. In this article, POS (part of speech) tags and morphemes were selected as language resources because of their semantic importance in documents. Based on these, we defined LSPs (Lexico-Semantic Patterns) to understand the structure and semantics of sentences and extracted summaries as key sentences; the LSPs were also used to connect segmented sentences and remove contextual repetition. We further defined LSPs by category and classified documents according to the categories of the main sentences matched by the LSPs. In the experiment, we classified the VOC documents to create summaries and then compared the results with those of previous methods.
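
As a rough illustration of pattern-based classification over POS-tagged text, the sketch below matches a pattern against a contiguous run of `(morpheme, POS)` pairs and returns the first category with a matching pattern. The pattern representation (a list of `(lexical form or None, POS or None)` pairs), the tag names, and the function names are all assumptions made for this example; the abstract does not specify the paper's LSP format.

```python
def matches(pattern, tagged):
    """Return True if `pattern` matches a contiguous run of (morpheme, POS) tokens.

    Each pattern element is (lexical_form, pos); None acts as a wildcard.
    """
    def ok(elem, tok):
        lex, pos = elem
        return (lex is None or lex == tok[0]) and (pos is None or pos == tok[1])

    n = len(pattern)
    return any(
        all(ok(e, t) for e, t in zip(pattern, tagged[i:i + n]))
        for i in range(len(tagged) - n + 1)
    )


def classify(tagged, patterns_by_category):
    """Return the first category with a matching pattern, or None."""
    for category, patterns in patterns_by_category.items():
        if any(matches(p, tagged) for p in patterns):
            return category
    return None
```

For instance, a hypothetical "delivery" category could hold the pattern `[(None, "NOUN"), ("late", "ADJ")]`, which matches any noun immediately followed by the word "late".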

A Study of Pre-trained Language Models for Korean Language Generation (한국어 자연어생성에 적합한 사전훈련 언어모델 특성 연구)

  • Song, Minchae;Shin, Kyung-shik
    • Journal of Intelligence and Information Systems / v.28 no.4 / pp.309-328 / 2022
  • This study empirically analyzed Korean pre-trained language models (PLMs) designed for natural language generation. The performance of two PLMs, BART and GPT, on abstractive text summarization was compared. To investigate how performance depends on the characteristics of the inference data, ten different document types, comprising six types of informational content and creative content, were considered. It was found that BART (which can both generate and understand natural language) performed better than GPT (which can only generate). On closer examination of the effect of inference data characteristics, GPT's performance was found to be proportional to the length of the input text. However, even for the longest documents (where GPT performed best), BART still outperformed GPT, suggesting that the greatest influence on downstream performance is not the size of the training data or the number of PLM parameters but the structural suitability of the PLM for the downstream task. The performance of the PLMs was also compared by analyzing part-of-speech (POS) shares. BART's performance was inversely related to the proportion of prefixes, adjectives, adverbs, and verbs but positively related to that of nouns. This result emphasizes the importance of taking the characteristics of the inference data into account when fine-tuning a PLM for its intended downstream task.
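
The POS share of a document, as used in the analysis above, is simply the fraction of tokens carrying each tag. A minimal sketch, assuming the text has already been tagged into `(token, POS)` pairs by some external tagger (the tag names and function name here are placeholders):

```python
from collections import Counter


def pos_shares(tagged_tokens):
    """Map each POS tag to its share of all tokens (empty dict for empty input)."""
    counts = Counter(pos for _, pos in tagged_tokens)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pos: n / total for pos, n in counts.items()}
```

Shares computed this way per document could then be correlated with a summarization score to look for relationships like the ones the abstract reports.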

A Korean Homonym Disambiguation System Based on Statistical Model Using Weights

  • Kim, Jun-Su;Lee, Wang-Woo;Kim, Chang-Hwan;Ock, Cheol-young
    • Proceedings of the Korean Society for Language and Information Conference / 2002.02a / pp.166-176 / 2002
  • A homonym can be disambiguated by other words in its context, such as the nouns and predicates used with it. This paper uses semantic information (co-occurrence data) obtained from the definitions in the part-of-speech (POS) tagged UMRD-S. We analyze the results of an experiment on a homonym disambiguation system based on a statistical model applying Bayes' theorem, and propose a model that adds a weight for the sense rate and a weight for the distance to adjacent words to improve accuracy. Applying the homonym disambiguation system using semantic information to homonyms appearing in dictionary definition sentences yielded an average accuracy of 98.32% for the 200 most frequent homonyms. We selected 49 of those 200 homonyms (31 substantives and 18 predicates) and ran an experiment on 50,703 sentences containing one of the 49 homonyms, extracted from the 3.5-million-word Sejong Project tagged corpus (i.e., a corpus of morphologically analyzed words). Assigning the weight of the sense rate (prior probability) and the weight of distance for the 5 words before and after the homonym to be disambiguated improved accuracy by 2.93% over disambiguation systems based on existing statistical models.
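
One way to picture a Bayes-style scorer with a prior weight and a distance weight is the sketch below. The specific scoring formula (weighted log prior plus log co-occurrence probabilities scaled by 1/distance), the smoothing constant, and all names are assumptions for illustration only; the abstract does not give the paper's actual weighting functions.

```python
import math


def disambiguate(tokens, target_idx, sense_priors, cooccur,
                 window=5, prior_weight=1.0, smoothing=1e-3):
    """Pick the sense of tokens[target_idx] with the highest weighted log score.

    sense_priors: {sense: P(sense)}  (the "sense rate")
    cooccur: {sense: {context_word: P(word | sense)}}
    Context words within `window` positions contribute log P(word | sense),
    scaled by 1/distance so that nearer words count more (assumed weighting).
    """
    best_sense, best_score = None, float("-inf")
    lo = max(0, target_idx - window)
    hi = min(len(tokens), target_idx + window + 1)
    for sense, prior in sense_priors.items():
        score = prior_weight * math.log(prior)
        for i in range(lo, hi):
            if i == target_idx:
                continue
            distance = abs(i - target_idx)
            p = cooccur.get(sense, {}).get(tokens[i], smoothing)
            score += (1.0 / distance) * math.log(p)
        if score > best_score:
            best_sense, best_score = sense, score
    return best_sense
```

In this sketch, `window=5` mirrors the five words before and after the homonym mentioned in the abstract, while `prior_weight` and the 1/distance decay are tunable placeholders.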

A research on the media player transferring vibrotactile stimulation from digital sound (디지털 음원의 촉각 자극 전이를 위한 미디어 플레이어에 대한 연구)

  • Lim, Young-Hoon;Lee, Su-Jin;Jung, Jong-Hwan;Ha, Ji-Min;Whang, Min-Cheol;Park, Jun-Seok
    • Proceedings of the HCI Society of Korea Conference (한국HCI학회:학술대회논문집) / 2007.02a / pp.881-886 / 2007
  • This study aimed to develop a vibrotactile display system driven by digital audio signals using Windows Media Player. The WMPlayer10SDK, a plug-in toolkit for Microsoft Windows Media Player, provided the video and audio signal information. The audio signal was converted into a vibrotactile display. Audio signals came in four formats: 8-bit, 16-bit, 24-bit, and 32-bit. For each format, the frequency and vibrato scale were computed, and the data were transferred through a 38400 bps serial port (COM1) to drive the vibration. With this system, it was possible to develop a music suit that presents the tactile feeling of music beyond sound. It may therefore provide cross-modal technology for the fusion of human senses.

E-commerce data based Sentiment Analysis Model Implementation using Natural Language Processing Model (자연어처리 모델을 이용한 이커머스 데이터 기반 감성 분석 모델 구축)

  • Choi, Jun-Young;Lim, Heui-Seok
    • Journal of the Korea Convergence Society / v.11 no.11 / pp.33-39 / 2020
  • In the field of natural language processing, research on tasks such as translation, POS tagging, Q&A, and sentiment analysis is being carried out worldwide. Sentiment analysis with pretrained sentence embedding models shows high classification performance on English single-domain datasets. In this paper, classification performance is compared on a Korean e-commerce dataset with diverse domain attributes, using six neural network models: BOW (Bag of Words), LSTM[1], Attention, CNN[2], ELMo[3], and BERT (KoBERT)[4]. The results confirm that pretrained sentence embedding models outperform word embedding models. In addition, a practical neural network model composition is proposed after comparing classification performance on a dataset with 17 categories. Finally, compression of sentence embedding models is mentioned as future work, considering inference time relative to model capacity in real-time services.

A Study on the Attitude of Seafarers Education & Training - A Case Study on S Company - (선원 교육훈련의 인식에 관한 연구 - S사 사례 연구 -)

  • Lee, Won-Geon;Lee, Gyeong-Gu;Lee, Myun-Soo;Nam, Ki-Chan
    • Journal of Navigation and Port Research / v.33 no.8 / pp.531-537 / 2009
  • Since Port State Control inspections have recognized that most marine casualties are caused by human error rather than vessel defects, more emphasis has been placed on the qualifications of ships' crews and the education they require. Accordingly, it is an urgent task for shipping companies to operate an effective crew education system that meets the standards of international agreements and domestic laws. This study therefore aims to identify the attitudes of the crews of shipping company 'S' toward education and to derive implications for effective crew education systems. To this end, a questionnaire survey was carried out and the data were analysed by respondent group.

A Study on the Effect of the IoT Technology on SCM (IoT 기술이 공급사슬관리에 미치는 영향에 관한 연구)

  • Lee, Kangbae;Baek, Daehan;Kim, Doohawn
    • Journal of Information Technology Services / v.15 no.1 / pp.227-243 / 2016
  • To maximize profitability by optimizing the entire supply chain process, enterprises have made efforts to apply information technologies such as POS, MES, and TMS. Academia has likewise worked to verify the effects of introducing such technologies through various studies. Until now, however, there has been almost no research analyzing the relationship between the IoT, a new information technology, and SCM. To study the effect of IoT technology on SCM, this study conducted three rounds of expert Delphi surveys. Through this method, the study analyzed the changes that IoT technology will cause, the priority areas for IoT introduction, and the expected difficulties of introducing IoT into SCM. The Delphi surveys and analyses indicated that when IoT technology is introduced, the level of IT use and partnership in SCM is expected to increase. However, the effect on supply chain performance, which includes inventory management and quality control, was expected to be weaker. The reason is that the development of operation and management skills, as well as the improvement of IT, is also an important element in improving supply chain performance. As for priority areas, the effect was expected to be greater when IoT is introduced in customer service, transportation, and delivery. The difficulties identified in introducing IoT technology include the shortage of IoT platform development personnel, standardization, integration with existing systems, securing professional manpower, expenses, data management, and operation; greater efforts are therefore needed to come up with solutions.

Development of Dynamic Magnetic Field Emulator for Smart Multi-Card (스마트멀티카드를 위한 동적자장모사장치의 개발)

  • Bae, Jae-Ho
    • Journal of Korean Society of Industrial and Systems Engineering / v.40 no.4 / pp.183-190 / 2017
  • This paper proposes a dynamic magnetic field emulator (DMFE), which can electrically emulate the information on the magnetic stripes of the most widely used credit cards. Payment transactions with most common credit cards are performed by reading the card's information, encoded in a magnetic stripe, using the reader head of a point-of-sale (POS) system. A stripe-type permanent magnet is attached to the back of the credit card, and information for payments or value-added services is recorded by exposing it to a strong magnetic field. This process of recording and retrieving data has been pointed out as a major cause of illegal credit card use, because the information on the magnetic stripe is always exposed and is thus vulnerable to forgery or alteration. A dynamic magnetic field emulator presents card information only when necessary, using the principle of solenoidal magnets. The DMFE proposed in this paper can prevent fraudulent use if it is operated together with a device such as a smartphone or with a separate user-authentication procedure. In addition, because it can present different information as needed, it can be used for a smart multi-card application, in which information for multiple cards is stored in one card and selected as needed. This paper introduces the necessity of the DMFE and its manufacturing principles. The study should be helpful for developing various payment applications, a core area of the Fintech (a portmanteau of finance and technology) industry.