• Title/Summary/Keyword: Sentence Importance

The Processing of Thematic Role Information in Korean Verbs (한국어 동사의 의미역정보 처리과정)

  • Kim, Young-Jin;Woo, Jeung-Hee
    • Korean Journal of Cognitive Science / v.18 no.2 / pp.91-112 / 2007
  • Two experiments were conducted to examine the psychological reality and incremental nature of thematic processing in Korean sentence comprehension. Using two different types of verbs (i.e., transitive and causative verbs), we manipulated the necessity of thematic reanalysis (i.e., consistent vs. inconsistent condition) in coordinated sentence structures. In Experiment 1, there was no significant difference in the reading times of the verbs between the consistent and inconsistent conditions. However, there were significant differences in question-answering times between the two conditions. In Experiment 2, in which we changed a noun phrase of the test sentences into an inanimate one, we found significant thematic reanalysis effects in the reading times of the final verbs. Based on these results, we discuss the theoretical importance and universality of thematic processes.

VOC Summarization and Classification based on Sentence Understanding (구문 의미 이해 기반의 VOC 요약 및 분류)

  • Kim, Moonjong;Lee, Jaean;Han, Kyouyeol;Ahn, Youngmin
    • KIISE Transactions on Computing Practices / v.22 no.1 / pp.50-55 / 2016
  • To understand customers' opinions or demands regarding a company's products or services, it is important to consider VOC (Voice of Customer) data; however, it is difficult to understand context from VOC data because of segmented and duplicate sentences and the variety of dialog contexts. In this article, POS (part of speech) tags and morphemes were selected as language resources because of their semantic importance within documents. Based on these, we defined an LSP (Lexico-Semantic-Pattern) to capture the structure and semantics of sentences and extracted a summary from key sentences; the LSP was also used to connect segmented sentences and remove contextual repetition. We further defined LSPs by category and classified documents according to the categories of the main sentences matched by the LSPs. In the experiment, we classified VOC documents and created summaries, then compared the results with those of previous methodologies.

Development of Japanese to Korean Machine Translation System ATOM Using Personal Computer II - Syntactic/Semantic Analysis and Generation Process - (PC를 이용한 일·한 번역 시스템 ATOM의 개발에 관한 연구 (II) - 구문해석과 생성과정을 중심으로 -)

  • Kim, Young-Sum;Kim, Han-Woo;Choi, Byung-Uk
    • Journal of the Korean Institute of Telematics and Electronics / v.25 no.10 / pp.1193-1201 / 1988
  • In this paper, we describe syntactic and semantic parsing methods that use case frames. The case structures are based on the obligatory cases of verbs, and we use a small set of partial-grammar rules based on simple sentences to represent them. Considering the importance of Japanese particle processing in generation, we also improve efficiency by constructing an independent procedure for particle classification and for ambiguity resolution of major particles. In addition, we construct a generation table that considers the possible combinations of verbs and auxiliary verbs for processing the termination phrase. As a result, we can generate more natural translated sentences through unique decisions based on the information from syntactic analysis, and we can simplify the generation process.

Performance Improvement of Web Information Retrieval Using Sentence-Query Similarity (문장-질의 유사성을 이용한 웹 정보 검색의 성능 향상)

  • Park Eui-Kyu;Ra Dong-Yul;Jang Myung-Gil
    • Journal of KIISE:Software and Applications / v.32 no.5 / pp.406-415 / 2005
  • The growth of the Internet has led to a web containing a huge number of documents. Increasing importance is therefore given to web information retrieval technology that can provide users with documents containing the information they want. This paper proposes several techniques that are effective for improving web information retrieval. Similarity between a document and the query is a major source of information exploited by conventional systems. We instead suggest a technique that makes use of similarity between a sentence and the query. We introduce a technique to compute an approximate score of the sentence-query similarity even without mature natural language processing technology. The amount of computation for this task was shown to be linear in the number of documents in the total collection, which implies that practical systems can make use of this technique. The next important technique proposed in this paper is to use stratification of documents when re-ranking the documents to output, which was shown to lead to a significant improvement in performance. We furthermore showed that using hyperlinks, anchor texts, and titles can enhance performance. To justify the proposed techniques, we developed a large-scale web information retrieval system and used it for experiments.
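
The abstract does not give the scoring formula, so the following is only a minimal sketch of the general idea of sentence-query similarity, not the authors' method. It assumes a simple bag-of-words cosine similarity and scores a document by its best-matching sentence; the names `tokenize`, `cosine`, and `best_sentence_score` are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split on non-alphanumeric characters (hypothetical tokenizer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def best_sentence_score(document, query):
    """Score a document by the sentence most similar to the query.

    Assumption: sentences are split on ., ! and ?; a real system would
    need language-aware sentence segmentation.
    """
    q = Counter(tokenize(query))
    sentences = [s for s in re.split(r"[.!?]+", document) if s.strip()]
    return max((cosine(Counter(tokenize(s)), q) for s in sentences), default=0.0)
```

Each document is scanned once, so the cost of this sketch grows linearly with the number of documents, which matches the linearity the abstract reports for its own technique.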

Selective Word Embedding for Sentence Classification by Considering Information Gain and Word Similarity (문장 분류를 위한 정보 이득 및 유사도에 따른 단어 제거와 선택적 단어 임베딩 방안)

  • Lee, Min Seok;Yang, Seok Woo;Lee, Hong Joo
    • Journal of Intelligence and Information Systems / v.25 no.4 / pp.105-122 / 2019
  • Dimensionality reduction is one of the methods for handling big data in text mining. For dimensionality reduction, we should consider the density of the data, which has a significant influence on the performance of sentence classification. Higher-dimensional data requires a large amount of computation, which can lead to high computational cost and overfitting in the model. Thus, a dimension reduction process is necessary to improve the performance of the model. Diverse methods have been proposed, ranging from merely reducing noise in the data, such as misspellings or informal text, to incorporating semantic and syntactic information. In addition, the representation and selection of text features affect the performance of the classifier in sentence classification, which is one of the fields of Natural Language Processing. The common goal of dimension reduction is to find a latent space that is representative of the raw data in the observation space. Existing methods utilize various algorithms for dimensionality reduction, such as feature extraction and feature selection. In addition to these algorithms, word embeddings, which learn low-dimensional vector space representations of words and can capture semantic and syntactic information from data, are also utilized. To improve performance, recent studies have suggested methods in which the word dictionary is modified according to the positive and negative scores of pre-defined words. The basic idea of this study is that similar words have similar vector representations. Once the feature selection algorithm identifies the words that are not important, we assumed that words similar to those words also have no impact on sentence classification. This study proposes two ways to achieve more accurate classification, which conduct selective word elimination under specific rules and construct word embeddings based on Word2Vec embedding.
To select words of low importance from the text, we use the information gain algorithm to measure importance and cosine similarity to search for similar words. First, we eliminate words that have comparatively low information gain values from the raw text and form the word embedding. Second, we additionally select words that are similar to the words with low information gain values and build the word embedding. Finally, the filtered text and word embeddings are applied to two deep learning models: a Convolutional Neural Network and an Attention-Based Bidirectional LSTM. This study uses customer reviews of Kindle on Amazon.com, IMDB, and Yelp as datasets and classifies each dataset using the deep learning models. Reviews that received more than five helpful votes and whose ratio of helpful votes was over 70% were classified as helpful reviews. Yelp, however, only shows the number of helpful votes, so we extracted 100,000 reviews that received more than five helpful votes by random sampling from among 750,000 reviews. Minimal preprocessing, such as removing numbers and special characters from the text data, was applied to each dataset. To evaluate the proposed methods, we compared their performance with that of Word2Vec and GloVe word embeddings that used all the words. We showed that one of the proposed methods performs better than the embeddings that use all the words. Removing unimportant words improves performance; however, removing too many words lowers it. Future research should consider diverse ways of preprocessing and an in-depth analysis of word co-occurrence to measure similarity values among words. Also, we applied the proposed method only with Word2Vec. Other embedding methods such as GloVe, fastText, and ELMo can be combined with the proposed methods, making it possible to identify suitable combinations of word embedding methods and elimination methods.
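
The two selection steps can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes documents are given as sets of tokens, treats each word as a binary "occurs in document" feature when computing information gain, and takes word vectors from a plain dictionary rather than a trained Word2Vec model. The function names and both thresholds are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())


def information_gain(word, docs, labels):
    """Information gain of the binary feature 'word occurs in the document'."""
    with_w = [y for d, y in zip(docs, labels) if word in d]
    without_w = [y for d, y in zip(docs, labels) if word not in d]
    n = len(labels)
    conditional = (len(with_w) / n) * entropy(with_w) + (len(without_w) / n) * entropy(without_w)
    return entropy(labels) - conditional


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def words_to_remove(docs, labels, vectors, ig_threshold, sim_threshold):
    """Return (low-IG words, remaining words similar to some low-IG word).

    docs: list of token sets; vectors: dict word -> vector (assumed input,
    standing in for a trained embedding).
    """
    vocab = set().union(*docs)
    low_ig = {w for w in vocab if information_gain(w, docs, labels) < ig_threshold}
    similar = {
        w
        for w in vocab - low_ig
        for r in low_ig
        if w in vectors and r in vectors and cosine(vectors[w], vectors[r]) >= sim_threshold
    }
    return low_ig, similar
```

The first returned set corresponds to the first step (words with comparatively low information gain), and the second set to the additional words chosen by similarity in the second step. Setting `sim_threshold` too low removes many words, which is the situation in which the abstract reports degraded performance.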

Automatic Speech Style Recognition Through Sentence Sequencing for Speaker Recognition in Bilateral Dialogue Situations (양자 간 대화 상황에서의 화자인식을 위한 문장 시퀀싱 방법을 통한 자동 말투 인식)

  • Kang, Garam;Kwon, Ohbyung
    • Journal of Intelligence and Information Systems / v.27 no.2 / pp.17-32 / 2021
  • Speaker recognition is generally divided into speaker identification and speaker verification. Speaker recognition plays an important role in automatic voice systems, and the importance of speaker recognition technology is becoming more prominent as portable devices, voice technology, and audio content continue to develop. Previous speaker recognition studies have aimed to automatically determine who the speaker is based on voice files and to improve accuracy. Speech is an important sociolinguistic subject: it contains very useful information that reveals the speaker's attitude, conversational intention, and personality, and this can be an important clue for speaker recognition. The final ending used in a speaker's speech determines the type of sentence and carries functions and information such as the speaker's intention, psychological attitude, or relationship to the listener. Because the use of final endings varies in probability with the characteristics of the speaker, the type and distribution of the final endings of a specific unidentified speaker will be helpful in recognizing that speaker. However, few studies have considered speech style in existing text-based speaker recognition, and adding speech style information to speech-signal-based speaker recognition techniques could further improve accuracy. Hence, the purpose of this paper is to propose a novel method that uses speech style, expressed as a sentence-final ending, to improve the accuracy of Korean speaker recognition. To this end, we propose a method called sentence sequencing, which generates vector values from the type and frequency of the sentence-final endings appearing in the utterances of a specific person. To evaluate the performance of the proposed method, training and performance evaluation were conducted with an actual drama script.
The method proposed in this study can be used as a means to improve the performance of Korean speech recognition services.
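
The abstract does not specify the ending inventory or the exact vectorization, so the following is only a rough sketch of turning sentence-final endings into a frequency vector. It assumes a small, hypothetical list of endings and plain suffix matching; a real system would rely on a Korean morphological analyzer. `ENDINGS`, `final_ending`, and `ending_vector` are illustrative names.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny inventory of sentence-final endings.
ENDINGS = ["습니다", "습니까", "어요", "아요", "네요", "지요", "다", "니"]


def final_ending(sentence):
    """Return the longest listed ending the sentence finishes with, or None."""
    stripped = re.sub(r"[\s.!?~…]+$", "", sentence)  # drop trailing punctuation
    for ending in sorted(ENDINGS, key=len, reverse=True):
        if stripped.endswith(ending):
            return ending
    return None


def ending_vector(utterances):
    """Relative frequency of each ending in ENDINGS over a speaker's utterances."""
    counts = Counter(e for e in map(final_ending, utterances) if e is not None)
    total = sum(counts.values())
    return [counts[e] / total if total else 0.0 for e in ENDINGS]
```

A vector of this kind, computed per speaker, could then be passed to any standard classifier; which model the authors trained is not stated in the abstract.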

Design of Learning Achievement Evaluation Module of Intelligent Computer Assisted Instruction with Various Fuzzy Environment (다양한 퍼지 환경을 갖는 지능형 교수 시스템의 학습 성취도 평가 모듈 설계)

  • Won Sung-Hyun
    • Management & Information Systems Review / v.2 / pp.311-334 / 1998
  • With falling CPU prices and advances in computer assembly technology, the personal computer has had a good chance to accelerate its spread. Recently, with the introduction of the new computing technology called multimedia, learning assistance systems based on a single medium, such as textbooks, cassette tapes, or video tapes, are rapidly being replaced by new multimedia-based educational assistance systems operated on personal computers. A computer-assisted education system contains an evaluation module that appraises the learner's level of study and feeds it into the next study strategy, which makes this module very important. The module involves factors such as importance, complexity, and difficulty, which commonly carry fuzziness. However, existing studies have not yet handled the evaluation module adequately. In this study, we propose a new module that evaluates learning achievement in ICAI under a variety of fuzzy environments. We combine independent fuzzy environments such as importance, complexity, and difficulty into a total evaluation of the learner's achievement. Because the result is expressed in linguistic form, this study can provide a theoretical basis for carrying out sentence-based evaluation in elementary schools.

Relationships Among Language Ability, Foreign Language Learning Experience, and Metalinguistic Ability in Korean Preschool Children (유아의 모국어 능력, 외국어 경험 정도와 상위언어 능력간의 관계)

  • Han, You Me;Cho, Bok Hee
    • Korean Journal of Child Studies / v.20 no.3 / pp.199-216 / 1999
  • The 121 five-year-old Korean subjects of this study were divided into 3 groups based on their experience in learning a foreign language (English). A battery of tests was administered to measure spoken and written language ability and the 3 metalinguistic domains of phonological, semantic, and syntactic awareness. Spoken language ability was positively correlated with semantic and syntactic awareness. The relative importance of each metalinguistic domain varied with the level of written language development. Phonological awareness was the only predictor of decoding. Syntactic awareness and phonological awareness were significant variables in sentence comprehension. Metalinguistic ability was a better predictor of written language development than spoken language ability. Foreign language learning experience had an effect on syntactic awareness: low experience was superior to no experience, but high experience was not superior to low experience.

An Evaluation Framework for Defense Informatization Policy

  • Jung, Hosang;Lee, Sangho
    • Journal of Multimedia Information System / v.7 no.1 / pp.73-86 / 2020
  • The well-known sentence, "You can't manage what you don't measure," suggests the importance of measurement. The Ministry of National Defense (MND) in Korea measures its informatization effort along various dimensions, such as validity, adequacy, and effectiveness, using the MND evaluation system, in order to obtain positive and significant effects from informatization. MND views the defense informatization domain as divided into the defense informatization policy, the defense informatization project, and the defense informatization level, which can measure the informatization capability of the MND and the armed forces or organizations. However, MND recognizes that the system has some limitations, such as those related to ambiguity and reliability. To overcome the limitations of the existing system for evaluating the defense informatization policy, this study proposes a revised evaluation framework for the policy of defense informatization, together with its indicators and measurement methods.

Development of a C-Language Learning Tool using Console Wrapper (Console Wrapper를 활용한 C언어 학습도구 개발)

  • Hwang, Giu-Duck;Choi, Sook-Young
    • Journal of Digital Convergence / v.7 no.3 / pp.113-122 / 2009
  • Most programming education in the classroom places more importance on grammar, memorization of imperative statements, and explanation of the programming language itself than on specific ways to use the target language. In addition, it mainly teaches theoretical knowledge based on text. Consequently, current programming education has neither interested learners in learning to program nor improved their ability to solve real-world programming problems. We therefore developed a C-language learning tool based on the Console Wrapper. The purpose of the learning tool was to move programming education away from typical theoretical learning and to interest learners in programming education. By using a dynamic screen instead of the static console screen, learners could enjoy learning to program. As a result of this study, we found that programming education using our learning tool is more effective than typical C-language programming education.
