• Title/Summary/Keyword: syntactic weight


Ordering a Left-branching Language: Heaviness vs. Givenness

  • Choi, Hye-Won
    • Language and Information
    • /
    • v.13 no.1
    • /
    • pp.39-56
    • /
    • 2009
  • This paper investigates ordering alternation phenomena in Korean using dative construction data from the Sejong Corpus of Modern Korean (Kim, 2000). The paper first shows that syntactic weight and information structure are distinct and independent factors that influence word order in Korean. Moreover, it reveals that heaviness and givenness compete with each other and exert diverging effects on word order, which contrasts with the converging effects of these factors in the word orders of right-branching languages like English. The typological variation in the effect of syntactic weight poses interesting theoretical and empirical questions, which are discussed in relation to processing efficiency in ordering.


Discriminator of Similar Documents Using Syntactic and Semantic Analysis (구문의미분석를 이용한 유사문서 판별기)

  • Kang, Won-Seog;Hwang, Do-Sam;Kim, Jung H.
    • The Journal of the Korea Contents Association
    • /
    • v.14 no.3
    • /
    • pp.40-51
    • /
    • 2014
  • Owing to the importance of document copyright, the need to detect document duplication and plagiarism is increasing. Many studies have sought to meet this need, but document duplication detection remains difficult because of technological limitations in natural language processing. This paper designs and implements a discriminator of similar documents using natural language processing techniques. The system discriminates similar documents using morphological analysis, syntactic analysis, and weights on low-frequency words and idioms. To evaluate the system, we analyze the correlation between human discrimination and term-based discrimination, and between human discrimination and the proposed discrimination. This analysis shows that the proposed discrimination needs improvement. Future research should work to define document types and improve the processing technique appropriate for each type.

An Implementation of Syntactic Constituent Recognizer Using Connectionism (Connectionism을 이용한 부분 구문 인식기의 구현)

  • Jung, Han-Min;Yuh, Sang-Hwa;Kim, Tae-Wan;Park, Dong-In
    • Annual Conference on Human and Language Technology
    • /
    • 1996.10a
    • /
    • pp.479-483
    • /
    • 1996
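  • This paper describes the design and implementation of a partial syntactic constituent recognizer using connectionism, with the aim of improving parser performance by reducing the search space of syntactic analysis. The recognizer identifies the noun-subject part and the predicate part of a morphologically analyzed sentence, thereby dividing the overall search space into several parts and reducing the parsing problem. The connectionist model is an improved perceptron structure consisting of an input layer and an output layer, with connection weights linking nodes between the input and output layers as well as nodes within the input layer. Noun-subject and predicate syntactic tags are applied to the connectionist model, and an improved backpropagation algorithm is used for learning. In partial-parsing experiments on a training corpus of 112 sentences and a test corpus of 46 sentences, the recognizer achieved 85.7% and 80.4% accuracy, respectively, in recognizing the noun-subject and predicate parts, and 94.6% and 95.7% accuracy in identifying the boundary between the noun-subject part and the predicate part.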


A Study on the Design and Structural Analysis of the Unmanned Underwater Vehicle (심해 무인 잠수정 프레임의 설계 및 구조해석에 관한 연구)

  • JOUNG TAE-HWAN;NHO IN-SIK;CHUN IL-YONG;LEE JONG-MOO
    • Proceedings of the Korea Committee for Ocean Resources and Engineering Conference
    • /
    • 2004.05a
    • /
    • pp.172-177
    • /
    • 2004
  • This paper presents the results of the structural analysis and optimal design of the frames of a UUV (Unmanned Underwater Vehicle) to be operated at a depth of 6000 m in the ocean. The structure of the UUV system can be classified into two structures, the launcher and the ROV. The frame of the launcher will be made of galvanized steel, which has high strength and corrosion resistance, but this material has a high specific gravity, which makes the object heavy in water. Similarly, the ROV will be made of Al6061-T6, and the frame of the ROV will hold many instruments and syntactic buoyancy materials. Before fabrication of the frame, we performed a sensitivity analysis (the change in weight due to a $\pm1\%$ change in the design variables) to make it easy to choose the frame dimensions.
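
The sensitivity analysis mentioned in this abstract (change in weight for a $\pm1\%$ change in each design variable) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the hollow rectangular tube weight model, the function names `frame_weight` and `weight_sensitivity`, the baseline dimensions, and the density value are hypothetical and are not taken from the paper.

```python
# Hypothetical sketch: sensitivity of frame weight to a +/-1% change in each
# design variable. The weight model and all numbers are illustrative only.

def frame_weight(width, height, thickness, length, density):
    """Mass (kg) of one hollow rectangular tube member (dimensions in m)."""
    outer_area = width * height
    inner_area = (width - 2 * thickness) * (height - 2 * thickness)
    return (outer_area - inner_area) * length * density


def weight_sensitivity(design, density, rel_step=0.01):
    """Return {variable: (change at -1%, change at +1%)} around the baseline."""
    base = frame_weight(density=density, **design)
    result = {}
    for name, value in design.items():
        # Perturb one variable at a time, keeping the others at baseline.
        lo = dict(design, **{name: value * (1 - rel_step)})
        hi = dict(design, **{name: value * (1 + rel_step)})
        result[name] = (
            frame_weight(density=density, **lo) - base,
            frame_weight(density=density, **hi) - base,
        )
    return result


# Assumed baseline member; 2700 kg/m^3 is an approximate density for aluminium.
baseline = {"width": 0.10, "height": 0.05, "thickness": 0.005, "length": 2.0}
sensitivity = weight_sensitivity(baseline, density=2700.0)
for name, (minus, plus) in sensitivity.items():
    print(f"{name}: {minus:+.4f} kg / {plus:+.4f} kg")
```

Perturbing one variable at a time keeps the sketch simple; it does not capture interactions between variables, which a full structural optimization would need to consider.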


Prosodic Break Index Estimation using LDA and Tri-tone Model (LDA와 tri-tone 모델을 이용한 운율경계강도 예측)

  • 강평수;엄기완;김진영
    • The Journal of the Acoustical Society of Korea
    • /
    • v.18 no.7
    • /
    • pp.17-22
    • /
    • 1999
  • In this paper, we propose a new mixed method of LDA and a tri-tone model to predict Korean prosodic break indices (PBI) for a given utterance. PBI can be used as an important cue to syntactic discontinuity in continuous speech recognition (CSR). The model consists of three steps. In the first step, PBI is predicted from syllable and pause duration information using linear discriminant analysis (LDA). In the second step, syllable tone information is used to estimate PBI: vector quantization (VQ) is used to code the syllable tones, and PBI is estimated by the tri-tone model. In the last step, the two PBI predictors are integrated by a weight factor. The proposed method was tested on 200 literal style spoken sentences. The experimental results showed 72% accuracy.
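
The last step of this abstract, integrating two predictors by a weight factor, could look roughly like the sketch below. The abstract does not say how the integration is done, so linear interpolation of per-class scores is an assumption here, as are the function names `integrate` and `predict`, the number of break-index classes, and the score values.

```python
# Hypothetical sketch: combine two break-index predictors with a weight factor.
# The class scores below are made-up numbers, not results from the paper.

def integrate(scores_a, scores_b, weight):
    """Linearly interpolate two per-class score lists; weight applies to scores_a."""
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be in [0, 1]")
    if len(scores_a) != len(scores_b):
        raise ValueError("score lists must have the same length")
    return [weight * a + (1.0 - weight) * b for a, b in zip(scores_a, scores_b)]


def predict(scores_a, scores_b, weight):
    """Return the index of the class with the highest combined score."""
    combined = integrate(scores_a, scores_b, weight)
    return max(range(len(combined)), key=combined.__getitem__)


# Scores over four assumed break-index classes (0..3) for one boundary.
duration_scores = [0.10, 0.60, 0.20, 0.10]  # e.g. from a duration-based predictor
tone_scores = [0.05, 0.25, 0.55, 0.15]      # e.g. from a tone-based predictor

label = predict(duration_scores, tone_scores, weight=0.4)
print(label)
```

In a setup like this, the weight factor would normally be tuned on held-out data, since it controls which predictor dominates when the two disagree.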


A Research for Web Documents Genre Classification using STW (STW를 이용한 웹 문서 장르 분류에 관한 연구)

  • Ko, Byeong-Kyu;Oh, Kun-Seok;Kim, Pan-Koo
    • Journal of Information Technology and Architecture
    • /
    • v.9 no.4
    • /
    • pp.413-422
    • /
    • 2012
  • Many researchers have studied how to let machines understand the meaning of human natural language, using text-based, page-rank-based, and other approaches. In particular, URL and HTML tag information in web documents is attracting attention again as a means of automatically analyzing huge numbers of web documents. In this paper, we propose an STW (Semantic Term Weight) approach based on the syntactic and linguistic structure of web documents in order to classify their genres. For the evaluation, we analyzed more than 1,000 documents from the 20-Genre-collection corpus for training based on the SVM algorithm. Afterwards, we tested on the KI-04 corpus to evaluate the performance of the proposed method. We measured accuracy in two experiments, one using STW and one without STW. As a result, the proposed STW-based approach showed accuracy approximately 10.2% higher than the approach without STW.

A Study of Efficiency Information Filtering System using One-Hot Long Short-Term Memory

  • Kim, Hee sook;Lee, Min Hi
    • International Journal of Advanced Culture Technology
    • /
    • v.5 no.1
    • /
    • pp.83-89
    • /
    • 2017
  • In this paper, we propose an extended method of one-hot Long Short-Term Memory (LSTM) and evaluate its performance on a spam filtering task. Most traditional methods proposed for spam filtering use word occurrences to represent spam or non-spam messages, and all syntactic and semantic information is ignored. A major issue arises when spam and non-spam messages share many common words and noise words, which makes it challenging for the system to assign the correct label. Unlike previous studies on information filtering, which use only word occurrence and word context as in probabilistic models, we apply a neural network-based approach to train the filter for better performance. In addition to the one-hot representation, using term weights with an attention mechanism allows the classifier to focus on the words most likely to appear in the spam and non-spam collections. As a result, we obtained some improvement over the performance of previous methods. We find that using region embedding and pooling features on top of the LSTM, together with the attention mechanism, allows the system to learn a better document representation for filtering tasks in general.

Linguistic Description and Theory

  • Nakajima, Heizo
    • Korean Journal of English Language and Linguistics
    • /
    • v.1 no.3
    • /
    • pp.349-368
    • /
    • 2001
  • We have brought up several distinct types of English clausal constructions and have been led to the descriptive generalization in (14), repeated here as (33): (33) Reduced clauses cannot occur in non-complement positions. The generalization in (33) refers to two theory-internal notions, reduced clauses and non-complement positions. Both notions concern the composition of syntactic structures as defined by X-bar theory. Without these theoretical notions, it would be difficult to describe in a general form the fact that certain types of complement clauses (namely, null-that clauses, if-clauses, Acc-ing gerunds, ECM complement clauses, and Raising complement clauses) cannot occur in particular syntactic positions. Instead, one would have to describe this fact separately for each clause type, stating that null-that clauses cannot occur in such and such positions, that if-clauses cannot occur in such and such positions, that Acc-ing gerunds cannot occur in such and such positions, and so on, even though the positions in which they cannot occur are exactly the same. Given the terminology of X-bar theory, however, it turns out that these types of complement clauses are all reduced clauses, and the positions where they cannot occur are all non-complement positions. The generalization then follows that reduced clauses cannot occur in non-complement positions. How to explain why such a descriptive generalization holds at all is a theoretical issue, and the answer differs from theory to theory. Hopefully, the demonstration here provides a piece of evidence that a theory or a particular theoretical notion plays an important role in the description of linguistic facts. Moreover, I have made a crucial prediction on the basis of the well-accepted theoretical assumption that ECM complement clauses and Raising complement clauses are reduced clauses; namely, the prediction that these types of clauses cannot occur in non-complement positions. The prediction based upon the theoretical assumption is actually borne out, as illustrated earlier. The illustration of the prediction, I hope, shows that a theory or a particular theoretical assumption, coupled with another theoretical assumption, allows us to make some interesting predictions. Predictions serve to widen the range of linguistic facts to be described. A theory plays a crucial part in discovering interesting facts as well as in describing them in general forms. Finally, let me say a few words about recent generative theory in connection with linguistic description. Recent generative theory is becoming more and more abstract. I think it is moving in a good direction as cognitive science. It will contribute, among other things, to the inquiry into what knowledge is specific to the language faculty and how it interacts with other cognitive faculties. However, I am doubtful about how much the abstract generative theory will contribute to the description of linguistic facts in a particular language. While generative theory is claimed to aim for both descriptive adequacy and explanatory adequacy, recent generative theory tends to put much more weight on explanatory adequacy. In my view, a less abstract theory is sufficient, or even more useful, for the purpose of linguistic description. Of course, how abstract a theory one should adopt as a framework depends on what aspect of language one attempts to describe. What I would like to emphasize here is that linguistic theory does not conflict with linguistic description, and that a linguistic theory with an appropriate degree of abstractness serves as a tool for discovering new and interesting facts, as well as for describing them in general, elegant forms.
