Conservation of Excavated Lacquer-wares for using artificially water-soaked Lacquer-wares (인공수침 칠기를 이용한 고대칠기 보존연구)

  • Kim, Soo-Chul
    • Journal of Conservation Science / v.21 / pp.49-58 / 2007
  • Among the treatment results for the antique lacquer-ware test samples, treatment with a 40% PEG#3,350 solution performed well, with a low shrinkage ratio; in terms of weight gain, treatment with a Sucrose 19% + Glycerin 1% solution (t-butanol 5% in water) showed a consistent increase. However, during sucrose impregnation the weight of the test samples decreased through dehydration, because of the concentration gradient between the interior of the samples and the treatment solution. We therefore concluded that a longer impregnation period is needed to prevent dehydration. Since both higher and lower molecular weight treatment chemicals could penetrate the wood of the lacquer-ware, air drying and conditioning after impregnation with high-concentration chemicals would be possible, in addition to vacuum freeze-drying.

A Study on the Deduction of Social Issues Applying Word Embedding: With an Emphasis on News Articles related to the Disables (단어 임베딩(Word Embedding) 기법을 적용한 키워드 중심의 사회적 이슈 도출 연구: 장애인 관련 뉴스 기사를 중심으로)

  • Choi, Garam;Choi, Sung-Pil
    • Journal of the Korean Society for Information Management / v.35 no.1 / pp.231-250 / 2018
  • In this paper, we propose a new methodology for extracting and formalizing subjective topics at a specific time using a set of keywords extracted automatically from online news articles. We first extracted a set of keywords by applying TF-IDF methods, which were selected through a series of comparative experiments on statistical weighting schemes that measure the importance of individual words in a large set of texts. To calculate the semantic relations between the extracted keywords effectively, a set of word embedding vectors was constructed from about 1,000,000 separately collected news articles. The extracted keywords were represented as numerical vectors and clustered with the K-means algorithm. A qualitative in-depth analysis of the resulting keyword clusters showed that most clusters were judged to be appropriate topics, with enough semantic concentration for labels to be assigned to them easily.
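
The pipeline this abstract describes (TF-IDF keyword extraction, then K-means clustering of keyword vectors) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the toy documents, the two-dimensional "embeddings", and the fixed initial centroids are all hypothetical, and real word embeddings would be trained on a large corpus.

```python
import math
from collections import Counter

def tfidf_keywords(docs, top_n=2):
    """Return the top_n highest-scoring TF-IDF terms of each tokenized document."""
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    results = []
    for doc in docs:
        tf = Counter(doc)
        scores = {t: (c / len(doc)) * math.log(n_docs / df[t]) for t, c in tf.items()}
        # Sort by descending score; break ties alphabetically for determinism.
        results.append(sorted(scores, key=lambda t: (-scores[t], t))[:top_n])
    return results

def kmeans(vectors, initial_centroids, n_iter=10):
    """Plain Lloyd's k-means; returns one cluster index per input vector."""
    centroids = [list(c) for c in initial_centroids]  # do not mutate the caller's list
    assignment = []
    for _ in range(n_iter):
        assignment = [
            min(range(len(centroids)), key=lambda k: math.dist(v, centroids[k]))
            for v in vectors
        ]
        for k in range(len(centroids)):
            members = [v for v, a in zip(vectors, assignment) if a == k]
            if members:  # keep the old centroid if a cluster is empty
                centroids[k] = [sum(col) / len(members) for col in zip(*members)]
    return assignment

# Toy tokenized "articles" (hypothetical data).
docs = [
    ["welfare", "budget", "welfare", "policy"],
    ["employment", "quota", "employment", "policy"],
    ["welfare", "employment", "policy"],
]
keywords_per_doc = tfidf_keywords(docs, top_n=2)

# Hypothetical 2-D vectors standing in for trained word embeddings.
embeddings = {
    "budget": [0.8, 0.2],
    "employment": [0.1, 0.9],
    "quota": [0.2, 0.8],
    "welfare": [0.9, 0.1],
}
keywords = sorted({k for ks in keywords_per_doc for k in ks})
vectors = [embeddings[k] for k in keywords]
labels = kmeans(vectors, initial_centroids=vectors[:2])
clusters = dict(zip(keywords, labels))
```

Because "policy" occurs in every toy document, its IDF is zero and it never surfaces as a keyword, which is the behaviour a TF-IDF weighting is meant to provide.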

A New Optical Media API for Real-Time Recording (실시간 기록을 위한 광매체 API)

  • Lee, Min-Suk;Song, Jin-Seok;Yun, Chan-Hee
    • Journal of KIISE:Computing Practices and Letters / v.13 no.2 / pp.75-85 / 2007
  • Many embedded systems store and play multimedia streams on optical media such as recordable CDs and DVDs; examples include PVRs, DVRs, and camcorders. In this paper we describe the design and implementation of a new, well-structured, fully documented, operating-system-independent, open source optical media API that can be used in various applications and embedded systems. We also design an ISO-9660 compliant optical media layout, an API set, and a scenario for real-time recording. To demonstrate usability, we develop a text-based application to replace the well-known CD-burning software cdrecord, as well as a graphical burning application. All implementations were first done in a Linux PC environment and then ported to a commercial embedded system that uses pSOS as its operating system.

A Study on Implementation of Emotional Speech Synthesis System using Variable Prosody Model (가변 운율 모델링을 이용한 고음질 감정 음성합성기 구현에 관한 연구)

  • Min, So-Yeon;Na, Deok-Su
    • Journal of the Korea Academia-Industrial cooperation Society / v.14 no.8 / pp.3992-3998 / 2013
  • This paper presents a method for adding an emotional speech corpus to a high-quality, large-corpus-based speech synthesizer in order to generate varied synthesized speech. We built the emotional speech corpus in a form usable by a waveform-concatenation speech synthesizer, and implemented a synthesizer that can generate varied synthesized speech through the same synthesis-unit selection process as a normal speech synthesizer. A markup language is used for emotional input text. Emotional speech is generated when the input text matches an intonation phrase in the emotional speech corpus over its full length; otherwise normal speech is generated. The break indices (BIs) of emotional speech are more irregular than those of normal speech, so the BIs generated by the synthesizer are difficult to use as they are. To solve this problem we applied Variable Break[3] modeling. A Japanese speech synthesizer was used for the experiments. As a result, we obtained natural emotional synthesized speech using the break prediction module for normal speech synthesis.

Design and Implementation of a Retrieval Server for Virtual Documents in the MIRAGE-III Digital Library (MIRAGE-III 디지털도서관에서 가상문서 검색 서버의 설계 및 구현)

  • Lee, Yong-Bae;Maeng, Sung-Hyon
    • Journal of KIISE:Computing Practices and Letters / v.8 no.2 / pp.219-230 / 2002
  • One of the most important functions a digital library needs to offer is helping users find the information they need in a distributed environment as efficiently and effectively as possible. To meet this goal, it is desirable to link scattered pieces of information and present them as a logically coherent whole on demand, so that the user does not need to know their physical location. A virtual document is an integrated document in which all or part of the physical documents stored in a specific repository are linked dynamically. Our MIRAGE-III digital library system provides content-based retrieval of both physical documents and virtual documents in XML. It supports retrieval of partial documents, attributes, hierarchical structures, and linked documents over structured documents such as XML or SGML. In this paper we describe the design and implementation of the query processor and retrieval server in the MIRAGE-III digital library system.

A Study on Contents-based Retrieval using Wavelet (Wavelet을 이용한 내용기반 검색에 관한 연구)

  • 강진석;박재필;나인호;최연성;김장형
    • Journal of the Korea Institute of Information and Communication Engineering / v.4 no.5 / pp.1051-1066 / 2000
  • With recent advances in digital encoding technologies and computing power, large amounts of multimedia information such as images, graphics, audio, and video are widely used in multimedia systems over the Internet. As a result, users need diverse retrieval mechanisms to search for specific information stored in multimedia systems, and content-based retrieval is preferred over text-based keyword retrieval. In this paper, we propose a new content-based indexing and searching algorithm that aims for both high efficiency and high retrieval performance. To achieve this, the proposed algorithm first classifies images through a pre-processing stage of edge extraction, range division, and multiple filtering, and then searches for target images using the spatial and textural characteristics of colors in an image, which are extracted during pre-processing. In addition, we describe simulation results of search requests and retrieval outputs for several company trademark images using the proposed wavelet-based retrieval algorithm.

Representation of Women in Early 1970's Korean Films : focusing on the relationship with social contexts (1970년대 초 한국영화의 여성 재현 : 사회적 콘텍스트와의 연관성을 중심으로)

An Analytical Study on Performance Factors of Automatic Classification based on Machine Learning (기계학습에 기초한 자동분류의 성능 요소에 관한 연구)

  • Kim, Pan Jun
    • Journal of the Korean Society for Information Management / v.33 no.2 / pp.33-59 / 2016
  • This study examined the factors affecting the performance of machine-learning-based automatic classification of domestic conference papers. In particular, with respect to the performance of automatically assigning class labels to papers in the Proceedings of the Conference of Korean Society for Information Management using the Rocchio algorithm, I investigated the characteristics of the key factors (classifier formation methods, training set size, weighting schemes, label assigning methods) through a range of experiments. The results show that it is more effective to apply proper parameters ($\beta$, $\lambda$) and training set size (more than 5 years) according to the classification environment and the properties of the document set. Where performance is equivalent, simpler methods (single weighting schemes) are the more efficient choice. Also, because the classification of domestic papers is a multi-label classification task, in which more than one label is assigned to an article, it is necessary to develop an optimal classification model based on the characteristics of the key factors in this setting.
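
As a rough illustration of the kind of classifier this abstract refers to, the sketch below builds Rocchio class prototypes and assigns one or more labels by cosine similarity. It assumes the common textbook formulation (positive centroid weighted by `beta`, negative centroid weighted by `gamma`, negative components clipped to zero); how these map onto the paper's ($\beta$, $\lambda$) parameters is not stated in the abstract, so the parameter names, default values, threshold rule, and toy vectors are all hypothetical.

```python
import math

def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def rocchio_prototypes(training, beta=16.0, gamma=4.0):
    """Build one prototype per label from {label: [document vectors]}."""
    prototypes = {}
    for label, positives in training.items():
        negatives = [v for other, vs in training.items() if other != label for v in vs]
        pos = centroid(positives)
        neg = centroid(negatives) if negatives else [0.0] * len(pos)
        # Weighted difference of centroids, with negative weights clipped to zero.
        prototypes[label] = [max(0.0, beta * p - gamma * n) for p, n in zip(pos, neg)]
    return prototypes

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def assign_labels(vector, prototypes, threshold=0.5):
    """Multi-label assignment: every label at or above the threshold,
    falling back to the single best label if none qualifies."""
    sims = {label: cosine(vector, proto) for label, proto in prototypes.items()}
    chosen = sorted(label for label, s in sims.items() if s >= threshold)
    return chosen or [max(sims, key=sims.get)]

# Toy term-weight vectors for two hypothetical subject classes.
training = {
    "retrieval": [[1.0, 0.0, 0.0], [0.8, 0.2, 0.0]],
    "classification": [[0.0, 1.0, 0.0], [0.0, 0.9, 0.1]],
}
prototypes = rocchio_prototypes(training)
single = assign_labels([0.9, 0.1, 0.0], prototypes)
multi = assign_labels([0.7, 0.7, 0.0], prototypes)
```

The threshold-based rule is one of several possible label assigning methods; a rank-based rule (for example, always take the top two labels) would be an equally plausible choice for a multi-label setting.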

Extracting Technical Vocabulary List for Early Childhood Education Using EAP Specialized Corpus (EAP 전문 코퍼스를 활용한 유아교육 전문 어휘 추출)

  • Lee, Je-Young;Ahn, Jongki;Lee, Jee Eun
    • The Journal of the Korea Contents Association / v.17 no.1 / pp.475-484 / 2017
  • The aim of this research is to develop and evaluate a technical vocabulary list for early childhood education. The list was compiled from a corpus of 500,000 running words of written academic text from 7 books about early childhood education. The coverage of GSL[1] and AWL[2] was 81.86% and 9.78% respectively, which means that academic texts on early childhood education are very similar to those in other disciplines. The technical vocabulary list for early childhood education (TV4ECE), extracted on the basis of frequency and range, contains 224 types. This word list can be used to teach early childhood education in English, especially in preparing students to read English texts in the field.
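
Selecting words by frequency and range, and measuring how much of a corpus a word list covers, can be sketched as below. This is only an illustration: the thresholds, the function names, and the tiny corpus are hypothetical, and the abstract does not state the cut-offs the authors actually used.

```python
from collections import Counter

def technical_vocabulary(texts, general_words, min_freq=3, min_range=2):
    """Words outside the general list that are frequent enough (total count)
    and spread widely enough (number of texts they occur in)."""
    freq = Counter(w for text in texts for w in text)
    text_range = Counter(w for text in texts for w in set(text))
    return sorted(
        w for w in freq
        if w not in general_words and freq[w] >= min_freq and text_range[w] >= min_range
    )

def coverage(texts, word_list):
    """Share of running words (tokens) that belong to word_list."""
    tokens = [w for text in texts for w in text]
    return sum(w in word_list for w in tokens) / len(tokens)

# Toy corpus of three tokenized "books" (hypothetical data).
texts = [
    ["the", "child", "scaffolding", "play", "scaffolding"],
    ["the", "curriculum", "scaffolding", "play"],
    ["the", "play", "curriculum", "curriculum"],
]
general_words = {"the", "child", "play"}  # stand-in for a general service list

tech_list = technical_vocabulary(texts, general_words)
general_coverage = coverage(texts, general_words)
```

Range is counted per text rather than per token, so a word repeated many times in a single book does not qualify on frequency alone.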

A Study on the Semiautomatic Construction of Domain-Specific Relation Extraction Datasets from Biomedical Abstracts - Mainly Focusing on a Genic Interaction Dataset in Alzheimer's Disease Domain - (바이오 분야 학술 문헌에서의 분야별 관계 추출 데이터셋 반자동 구축에 관한 연구 - 알츠하이머병 유관 유전자 간 상호 작용 중심으로 -)

  • Choi, Sung-Pil;Yoo, Suk-Jong;Cho, Hyun-Yang
    • Journal of Korean Library and Information Science Society / v.47 no.4 / pp.289-307 / 2016
  • This paper introduces a software system and process model for constructing domain-specific relation extraction datasets semi-automatically. The system takes a set of terms such as genes, proteins, and diseases as input and, by exploiting a massive biological interaction database, generates a set of term pairs that are used as queries to retrieve sentences containing those pairs from scientific databases. To assess the usefulness of the proposed system, this paper applies it to constructing a genic interaction dataset for the Alzheimer's disease domain, extracting 3,510 interaction-related sentences using 140 gene names in the area. The results of this case study indicate that the system and process could substantially improve the efficiency of dataset construction in various subfields of biomedical research.
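
The core idea described above (filter term pairs through a set of known interactions, then use each pair as a query for sentences that mention both terms) can be sketched as follows. The gene names, the in-memory "interaction database", and the sentence list are hypothetical placeholders; the actual system queries external biological and scientific databases that the abstract does not name.

```python
import re
from itertools import combinations

def candidate_pairs(terms, known_interactions):
    """Keep only the term pairs recorded in the interaction set."""
    return [
        pair for pair in combinations(sorted(terms), 2)
        if frozenset(pair) in known_interactions
    ]

def sentences_with_pair(sentences, pair):
    """Return the sentences that mention both terms of the pair as whole tokens."""
    first, second = pair
    hits = []
    for sentence in sentences:
        tokens = set(re.findall(r"[A-Za-z0-9]+", sentence))
        if first in tokens and second in tokens:
            hits.append(sentence)
    return hits

# Hypothetical inputs standing in for gene names and database content.
terms = {"GENEA", "GENEB", "GENEC"}
known_interactions = {frozenset({"GENEA", "GENEB"})}
sentences = [
    "GENEA binds GENEB in neurons.",
    "GENEC is expressed in liver.",
    "GENEB was studied alone.",
]

pairs = candidate_pairs(terms, known_interactions)
dataset = {pair: sentences_with_pair(sentences, pair) for pair in pairs}
```

Sentences retrieved this way are only candidates; the semi-automatic process in the abstract implies a later review step before they become labelled examples.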