• Title/Abstract/Keywords: representative words

Feature Expansion based on LDA Word Distribution for Performance Improvement of Informal Document Classification

  • 이호경;양선;고영중
    • 정보과학회 논문지 / Vol. 43, No. 9 / pp.1008-1014 / 2016
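  • Twitter, Facebook, and online customer reviews are informal text documents: they are written freely rather than edited like newspaper articles. Finding consistent rules or patterns in such informal documents is harder than in formal documents, so analyzing them calls for additional approaches to improve performance. In this study, we classify Twitter data, a representative kind of informal document, into ten categories, using the LDA (Latent Dirichlet Allocation) word distribution to correct and expand features. Based on the top-ranked word features for each topic, other word features are decomposed and merged, so that the set of useful features is expanded iteratively. Classifying documents with the resulting features improved the micro-averaged F1-score by 7.11 percentage points over the result before feature expansion.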
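The abstract above reports its gain as a micro-averaged F1-score. As a minimal sketch of how that metric is computed, the function below pools true positives, false positives, and false negatives over all categories before taking precision and recall. The function name `micro_f1` and the toy labels are hypothetical and are not taken from the paper.

```python
def micro_f1(gold, predicted):
    """Micro-averaged F1: pool TP/FP/FN over all labels, then compute F1."""
    labels = set(gold) | set(predicted)
    tp = fp = fn = 0
    for label in labels:
        for g, p in zip(gold, predicted):
            if p == label and g == label:
                tp += 1  # predicted this label and it was correct
            elif p == label and g != label:
                fp += 1  # predicted this label but gold differs
            elif p != label and g == label:
                fn += 1  # gold is this label but it was missed
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical tweet categories, for illustration only.
gold = ["sports", "politics", "sports", "tech"]
predicted = ["sports", "sports", "sports", "tech"]
score = micro_f1(gold, predicted)
```

A gain stated in percentage points, as in the abstract, is the plain difference between two such scores expressed as percentages.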

Reviews Analysis of Korean Clinics Using LDA Topic Modeling

  • 김초명;조아람;김양균
    • 대한한의학회지 / Vol. 43, No. 1 / pp.73-86 / 2022
  • Objectives: In the health care industry, the influence of online reviews is growing. Because medical services are delivered mainly by providers, they have been managed by hospitals and clinics. However, providers are legally forbidden from directly promoting medical services. For this reason, consumers such as patients and clients search many online reviews for information about hospitals, treatments, prices, and so on. Online reviews can therefore be taken as an indicator of hospital quality, and they should be analyzed for sustainable hospital marketing. Method: Using a Python-based crawler, we collected more than 14,000 reviews written by real patients who had experienced Korean medicine. To extract the most representative words, the reviews were divided into positive and negative groups and then pre-processed to keep only nouns and adjectives, from which TF (Term Frequency), DF (Document Frequency), and TF-IDF (Term Frequency - Inverse Document Frequency) were computed. Finally, to identify topics in the reviews, the extracted words were analyzed using LDA (Latent Dirichlet Allocation). To avoid overlap, the number of topics was set by Davis visualization. Results and Conclusions: LDA topic modeling extracted 6 topics from the positive reviews and 3 from the negative reviews. The main factors making up the topics were 1) response to patients and customers, 2) customized treatment (consultation) and management, and 3) the hospital or clinic environment.
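As a rough illustration of the TF, DF, and TF-IDF quantities named in the abstract above, the sketch below computes them over a toy tokenized corpus. The function name `tf_idf_table`, the sample reviews, and the IDF variant log(N / df) are assumptions made for illustration; the abstract does not say which weighting scheme the authors used.

```python
import math
from collections import Counter


def tf_idf_table(docs):
    """Return (tf, df, tfidf) for a list of tokenized documents.

    tf    : list of Counters with raw term counts per document
    df    : Counter with the number of documents containing each term
    tfidf : list of dicts with tf * log(N / df) per document (assumed variant)
    """
    n_docs = len(docs)
    tf = [Counter(doc) for doc in docs]
    df = Counter()
    for counts in tf:
        df.update(counts.keys())  # each document counts a term once
    tfidf = [
        {term: count * math.log(n_docs / df[term]) for term, count in counts.items()}
        for counts in tf
    ]
    return tf, df, tfidf


# Hypothetical reviews, already reduced to nouns and adjectives.
reviews = [
    ["kind", "doctor", "clean", "clinic"],
    ["kind", "staff", "long", "wait"],
    ["clean", "clinic", "kind", "kind"],
]
tf, df, tfidf = tf_idf_table(reviews)
```

Under this variant, a word that occurs in every review (here "kind") gets a TF-IDF of zero, which is why TF-IDF tends to surface words that distinguish one group of reviews from another.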

Topography of Religion and National, Social & Economic Movements in Chonnam Yeonggwang before and after the 1910s

  • 김민영
    • 한일민족문제연구 / No. 34 / pp.5-40 / 2018
  • This paper examines the national, social, and economic movements, and the social and economic publicness of religion, in Yeonggwang, Jeonnam around the 1910s. We look at this period because regional society was undergoing a major transition before and after Japan's forced occupation of Korea in the 1910s and the March 1st Independence Movement of 1919. We focus on Yeonggwang in Jeonnam because the area is known as the place where Buddhism first arrived and also has a unique regional culture and ideological topology in which Donghak, Protestantism, Catholicism, Won Buddhism, and other traditions coexist. By shedding light on these points, the paper is expected to offer a clue to several questions: what values Korean religions sought to internalize in the national, social, and economic movements around 1910, what public goods they pursued in the strategies and tactics they chose, and what publicness and practice lay in their awareness and behavior. In particular, it adds to the accumulated case studies of Yeonggwang in Jeonnam, a place with a representative 'place-ness' in this respect. A further task is to restore and re-examine the common foundation of the forms of existence and the publicness of the various religions amid the national, social, and economic movements in Yeonggwang. In other words, we expect that religions will continue their individual efforts and common practices to call for social justice, that is, for historic and public value grounded in a common good that encompasses individual responsibility and social justice, under the social and economic conditions that originated in the Japanese colonial era.

A Study on the Origin and Clothing Composition of the Yemou

  • 장인우
    • 복식 / Vol. 63, No. 7 / pp.164-175 / 2013
  • This study examined the Yemou (a hat for a dead woman) among the women's clothes excavated from Lady Lee's tomb in order to trace the significance of its clothing composition and its social origin in the Chosun dynasty. The Yemou is composed of the body of the hat, which is not connected with the cover; the Wonsal, a round piece that covers the face of the dead body; and two Gae (ribbons on the back of the hat). Seongho Lee-ik (a representative Confucian scholar of the Chosun dynasty) stated in his book "Seongho Notes" that the structural elements of the Yemou originated in the Yum (a wrapping cloth for the head of a dead body). According to Seongho, the body of the Yemou came from the scarf used to cover the head, while the Wonsal and the Gae were derived from the Yum, which was made of two ends of a long cloth for covering and binding the head of a dead body. Yongjae Kim-kunhang (a Confucian scholar of the late Chosun dynasty) described in his "Yongjae Collection" the social background of the emergence of the Yemou. The Yemou was a hat produced in the process of nationalizing Chinese clothing etiquette. In other words, the Bokgun (a man's hat) replaced the Chinese Yum in the Chosun dynasty. Unlike in the Chinese custom, men and women in the Chosun dynasty wore different clothes: according to the clothing custom of the dynasty, a woman wore a female hat, the Yemou, instead of the men's Bokgun.

Modern Fashion Design Development using Morphological Characteristics of Hanbok

  • 박명희;심상보
    • 복식 / Vol. 66, No. 2 / pp.134-147 / 2016
  • The mainstay of modern fashion design has always been Western costume. Asian costumes are featured in collections at times, but most such instances merely reflect Western curiosity about non-mainstream costumes. Until recently Japan, which has been the most active in cultural exchange, has been the main object of this curiosity, and its style and culture have been treated as representative of East Asia. What needs to be made known is that Korea has its own costume style and culture, developed according to its own traditions and beliefs. Hanbok, the representative traditional costume of Korea, has existed since the beginning of the Kochosun dynasties. I started this study to identify the design sources of Hanbok's shape and develop them into modern costume. In the fashion industry, "Mandarin Collar" and "Kimono Sleeve" are common terms, and I hope that terms like 'Korean Collar' and 'Hanbok Sleeve' will one day become household terms. Hanbok carries the image of Korea, and its shape has been formed by how Koreans have treated objects over many years. If my study can identify and express the uniquely Korean approach to patterns and to clothing, which is clearly different from those of China and Japan, I will be able to establish a concept of 'Korean style' that people around the world can come to recognize.

Context-Weighted Metrics for Example Matching

  • 김동주;김한우
    • 전자공학회논문지CI / Vol. 43, No. 6 / pp.43-51 / 2006
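  • This paper concerns a metric for comparing example sentences in example-based machine translation from English to Korean, and proposes a similarity metric for finding the example sentence most similar to a given query sentence. The proposed metric is based on the edit distance algorithm; for words whose surface forms do not match, it computes similarity from the words' lemma and part-of-speech information. The edit distance metric depends on the order of the compared units, but it assigns the same similarity contribution whenever the order matches, so it does not fully reflect context. To reflect context fully, this paper additionally proposes a context weight that captures continuity information for consecutive words with matching unit information. The edit distance metric, which measures dissimilarity, is also converted into a similarity metric, and the context-weighted metric is normalized so that it can be applied to sentence comparison and used to rank sentences by similarity. We also attempt to generalize existing methods that use linguistic information, and run comparative experiments against these generalized methods to demonstrate the superiority of the context-weighted metric.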
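The base idea in the abstract above, an edit distance whose substitution cost is softened by lemma and part-of-speech matches and then turned into a normalized similarity, can be sketched as follows. This is only an illustration under stated assumptions: the cost values 0.3 and 0.6, the normalization by the longer sentence length, and the names `token_cost` and `similarity` are hypothetical, and the paper's context weight for runs of consecutive matches is not implemented here.

```python
def token_cost(a, b):
    """Substitution cost between two (surface, lemma, pos) tokens.

    The numeric costs are assumed for illustration, not taken from the paper.
    """
    if a[0] == b[0]:
        return 0.0  # identical surface form
    if a[1] == b[1]:
        return 0.3  # same lemma, different surface form
    if a[2] == b[2]:
        return 0.6  # same part of speech only
    return 1.0      # nothing in common


def similarity(query, example):
    """Edit-distance-based similarity in [0, 1] between two token sequences."""
    m, n = len(query), len(example)
    if m == 0 and n == 0:
        return 1.0
    dist = [[0.0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        dist[i][0] = float(i)  # delete all query tokens
    for j in range(1, n + 1):
        dist[0][j] = float(j)  # insert all example tokens
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            dist[i][j] = min(
                dist[i - 1][j] + 1.0,  # deletion
                dist[i][j - 1] + 1.0,  # insertion
                dist[i - 1][j - 1] + token_cost(query[i - 1], example[j - 1]),
            )
    # Convert the dissimilarity into a similarity normalized by length.
    return 1.0 - dist[m][n] / max(m, n)


# Hypothetical tagged sentences: (surface, lemma, part of speech).
query = [("I", "I", "PRP"), ("ate", "eat", "VBD"), ("apples", "apple", "NNS")]
example = [("I", "I", "PRP"), ("eat", "eat", "VBP"), ("apples", "apple", "NNS")]
score = similarity(query, example)
```

Ranking candidate example sentences then amounts to sorting them by this score in descending order.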

A Comparative Study on the Image Characteristics in Traditional Palaces of Korea, China and Japan

  • 조은숙;박영순
    • 디자인학연구 / Vol. 18, No. 1 / pp.27-38 / 2005
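  • The purpose of this study is to identify the image characteristics unique to Korea by comparing and analyzing the images of Korea, China, and Japan using image-expression vocabulary for their palace architecture. The methods used were a literature survey for selecting the survey instruments and for collecting and extracting image-expression vocabulary, a free-association measurement, and a questionnaire survey. The photographs used as survey instruments to represent the palace architecture of the three countries were of Changdeokgung in Korea, the Forbidden City in China, and Nijo Castle in Japan, with five exterior and two interior photographs of each; the questionnaire consisted of 47 selected vocabulary items rated on a 5-point scale. The results are as follows. The representative image-expression vocabulary for Korean, Chinese, and Japanese palace architecture fell into a six-factor structure: decorativeness, stability, openness, line characteristics, unfriendliness, and femininity. Based on these results, we synthesized the image characteristics of the three countries' palace architecture to identify commonalities and differences. The image characteristic common to the three countries was the line characteristic; Korea's image characteristics were stability, curvilinear characteristics, and femininity; China's were decorativeness and rectilinear characteristics; and Japan's were simplicity, unfriendliness, and openness. In identifying Korea's unique image characteristics on the basis of these commonalities and differences, we found stability and curvilinear characteristics, which appeared in both the exterior and the interior of the palace, to be the main image characteristics of Korea. Through this research we were able to identify the commonalities and differences in the image characteristics of the three East Asian countries, and to identify the distinctive characteristics of the image of Korea, which, because of long-standing geographical and cultural influences, had been regarded as an intermediate, mediating culture between the cultural spheres of China and Japan.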

The classified method for overlapping data

  • Kruatrachue, Boontee;Warunsin, Kulwarun;Siriboon, Kritawan
    • 제어로봇시스템학회: Conference Proceedings / 제어로봇시스템학회 ICCAS 2004 / pp.2037-2040 / 2004
  • In this paper we introduce a new prototype-based classifier for overlapping data, where training patterns can overlap in the feature space. The proposed classifier is based on the prototype from the neural network classifier (NNC) [1] for overlapping data. The method automatically chooses the initial center and two radii for each class. The center serves as a mean representative of the training data for each class. An unclassified pattern is classified by measuring its distance from the class center. If the distance is within the lower (shorter) radius, the unknown pattern has a high probability of being in this class. If the distance is between the lower and upper (longer) radius, the pattern may be in this class or in others. If the distance is beyond the upper radius, the pattern is not in this class. We borrow the words upper and lower from rough set theory to represent the region of certainty [3]. A training algorithm to find the number of clusters and their parameters (center, lower, upper) is presented. The clustering is tested using patterns from Thai handwritten letters, and the result is very similar to clustering done by the human eye.
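The decision rule described in the abstract above (one center and two radii per class) can be sketched as below. This covers only the classification step under assumed details: Euclidean distance, inclusive boundaries, and the names `region` and `classify` are hypothetical, and the training algorithm that finds the centers and radii is not shown.

```python
import math


def region(point, center, lower, upper):
    """Place a point relative to one class prototype (center, lower, upper)."""
    d = math.dist(point, center)  # Euclidean distance (assumed metric)
    if d <= lower:
        return "certain"   # inside the lower radius: very likely this class
    if d <= upper:
        return "possible"  # between the radii: this class or another
    return "outside"       # beyond the upper radius: not this class


def classify(point, prototypes):
    """Map each class label to the region the point falls into.

    prototypes: dict of label -> (center, lower_radius, upper_radius)
    """
    return {
        label: region(point, center, lower, upper)
        for label, (center, lower, upper) in prototypes.items()
    }


# Hypothetical two-class setup whose upper regions overlap.
prototypes = {
    "A": ((0.0, 0.0), 1.0, 2.0),
    "B": ((3.0, 0.0), 1.0, 2.0),
}
near_a = classify((0.5, 0.0), prototypes)
between = classify((1.5, 0.0), prototypes)
```

A point that falls in the "possible" region of more than one class, like the second example, is exactly the overlapping case the abstract is concerned with.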

R&D Perspective Social Issue Packaging using Text Analysis

  • Wong, William Xiu Shun;Kim, Namgyu
    • 한국IT서비스학회지 / Vol. 15, No. 3 / pp.71-95 / 2016
  • In recent years, text mining has been used to extract meaningful insights from the large volume of unstructured text data sets of various domains. As one of the most representative text mining applications, topic modeling has been widely used to extract main topics in the form of a set of keywords extracted from a large collection of documents. In general, topic modeling is performed according to the weighted frequency of words in a document corpus. However, general topic modeling cannot discover the relation between documents if the documents share only a few terms, although the documents are in fact strongly related from a particular perspective. For instance, a document about "sexual offense" and another document about "silver industry for aged persons" might not be classified into the same topic because they may not share many key terms. However, these two documents can be strongly related from the R&D perspective because some technologies, such as "RF Tag," "CCTV," and "Heart Rate Sensor," are core components of both "sexual offense" and "silver industry." Thus, in this study, we attempted to discover the differences between the results of general topic modeling and R&D perspective topic modeling. Furthermore, we package social issues from the R&D perspective and present a prototype system, which provides a package of news articles for each R&D issue. Finally, we analyze the quality of R&D perspective topic modeling and provide the results of inter- and intra-topic analysis.

A Study on the Life Space of UNJORU through the Testimony of Residents

  • 김병진
    • 한국주거학회논문집 / Vol. 27, No. 1 / pp.21-30 / 2016
  • This study examines how housing was used and how residents lived over time in "UNJORU," a representative traditional house. In other words, it explains how traditional life has changed today compared with the late Joseon dynasty. It also explains how the meaning of each place has changed with changes in lifestyle, and how these aspects have changed from the women's perspective. The aim is to reconstruct the period after the one recorded in the life diary. We rely on the testimony of Mrs. Lee, the eldest of the current residents of UNJORU. The study was conducted through interviews. The interviews were organized around daily life, such as meal places and folk beliefs, and non-daily life, such as funeral rites, in order to understand the residents' consciousness and way of life at the time. The results show that, as time and society changed, the Ryu family came to prefer a more modern lifestyle over the traditional one. Through this research, it was possible to analyze how the external form of the traditional house has been kept while its internal form has changed over time.