• Title/Summary/Keyword: 문헌비교 (literature comparison)

Korean Patent ELECTRA : a pre-trained Korean Patent language representation model for the study of Korean Patent natural language processing(KorPatELECTRA) (Korean Patent ELECTRA : 한국 특허문헌 자연어처리 연구를 위한 사전 학습된 언어모델(KorPatELECTRA))

  • Min, Jae-Ok;Jang, Ji-Mo;Jo, Yu-Jeong;Noh, Han-Sung
    • Proceedings of the Korean Society of Computer Information Conference / 2021.07a / pp.69-71 / 2021
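  • Natural language processing tasks in the patent domain are difficult to solve because of the linguistic peculiarities of patent documents, so research on a language model optimized for Korean patent documents is urgently needed. This paper proposes the Korean Patent ELECTRA model, pre-trained on a large volume of Korean patent document data, together with a tokenization method, and confirms through comparative experiments against existing general-purpose pre-trained models the potential for advancing natural language processing of Korean patent documents.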

A bibliographical analysis on the studies of history of Korean women (우리나라 여성사연구의 서지적 고찰)

  • 유소영
    • Journal of Korean Library and Information Science Society / v.35 no.2 / pp.115-133 / 2004
  • The literature on Korean women's history from ancient times to the end of the Japanese occupation is collected and analyzed. The total number of documents treated in this paper is 369. The documents, all written by 2003, include books, master's and Ph.D. theses, journal articles, and digital papers on Internet home pages. The results show that the favored themes of the researchers were women's status, education, and arts. Other subjects include women's activities, daily life, biographies, and religion. The most preferred period was the Chosun Dynasty and the least preferred period was the Koryo Dynasty. The usual method of these documents is narrative. There are few attempts to compare the women's history of Korea with that of other countries.

Literature survey, Modeling by Measured unit load Quality Assurance (문헌조사, 단일지목 모델링을 통한 실측 원단위 신뢰성검토)

  • Park, Jae-Beom;Kang, Du-Kee;Kal, Byung-Seok;Yoon, Young-Sam;Lee, Je-Guan
    • Proceedings of the Korea Water Resources Association Conference / 2011.05a / pp.284-288 / 2011
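  • Nonpoint source pollution unit loads are simple to apply, so many countries use them to estimate nonpoint source runoff loads. However, nonpoint source runoff shows diverse discharge patterns depending on conditions such as region, topography, climate, and land use, so estimating reliable unit loads requires long-term measured data. In Korea, research on nonpoint pollution unit loads began in the early 1980s, and numerous publications and studies present nonpoint source runoff characteristics. This study therefore built the SWMM (Storm Water Management Model) runoff model from nonpoint source data monitored in the Nakdong River basin over the past three years, and used the model to compare EMC values derived from measured data with EMC values reported in the existing literature, in order to analyze the adequacy of the EMC values presented in this study and the range of values presented in prior data. By identifying nonpoint source runoff characteristics through modeling while comparing measured and literature EMC values, the study is expected to help identify an appropriate range of nonpoint source EMC values and thus greatly aid the estimation of nonpoint source unit loads by land-use category.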

The Relationship between Primary and Secondary Literature (학술정보에 있어서 1차 자료와 2차 자료의 관계에 대한 연구)

  • Oh Dong-Woo
    • Journal of the Korean Society for Library and Information Science / v.33 no.1 / pp.145-160 / 1999
  • The literature may be divided into two broad categories: primary and secondary literature. Primary literature comprises documents that contain the full text of the author's work. Secondary literature, such as indexes, abstracts, bibliographies, reviews, and surveys, provides signposts to the primary literature. This paper examines the relationship between primary and secondary literature through previous research and theory, and through an empirical comparison of the journal lists of three secondary services: Library and Information Science Abstracts (LISA), 1998; Library Literature (LL), 1998; and Information Science Abstracts (ISA), 1998. Some of the problems involved in studying journal overlap and coverage patterns of secondary services are discussed. Conclusions are drawn about the impact on library and information science selection policies and the organization of library collections.

Comparison of Performance Factors for Automatic Classification of Records Utilizing Metadata (메타데이터를 활용한 기록물 자동분류 성능 요소 비교)

  • Young Bum Gim;Woo Kwon Chang
    • Journal of the Korean Society for information Management / v.40 no.3 / pp.99-118 / 2023
  • The objective of this study is to identify performance factors in the automatic classification of records by utilizing metadata that contains the contextual information of records. For this study, we collected 97,064 records of original textual information from Korean central administrative agencies in 2022. Various classification algorithms, data selection methods, and feature extraction techniques are applied and compared with the intent to discern the optimal performance-inducing technique. The study results demonstrated that among classification algorithms, Random Forest displayed higher performance, and among feature extraction techniques, the TF method proved to be the most effective. The minimum data quantity of unit tasks had a minimal influence on performance, and the addition of features positively affected performance, while their removal had a discernible negative impact.
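As a rough illustration of the TF feature extraction mentioned above, the following minimal sketch assumes that "TF" means raw term frequency and that metadata fields are joined into one string and split on whitespace. The record fields and the `tf_vector` helper are hypothetical and are not taken from the paper.

```python
from collections import Counter

def tf_vector(text):
    """Return raw term-frequency counts for lowercased, whitespace-separated tokens."""
    return Counter(text.lower().split())

# Hypothetical metadata record (illustrative only, not from the study's data).
record = {"title": "Budget report budget plan", "department": "Finance"}

# Join the metadata fields into one string and count each term.
features = tf_vector(" ".join(record.values()))
print(features)
```

Vectors like this could then be passed to a classifier; the study reports that Random Forest performed best among the algorithms it compared.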

Comparative analysis of Korean universities' research performance evaluation standards: based on journal publications (국내대학의 연구업적평가기준 비교 분석 - 학술논문업적을 중심으로 -)

  • Lee, Hye-Kyung;Yang, Ki-Duk
    • Proceedings of the Korean Society for Information Management Conference / 2015.08a / pp.17-22 / 2015
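  • To serve as scaffolding for developing rational and objective methods of evaluating the journal article output of faculty at Korean universities, this study compared and analyzed the journal article evaluation criteria within the faculty performance evaluations currently used at Korean universities. It first collected and compared the journal article evaluation criteria of 27 Korean universities that have a department of library and information science. Then, using bibliographic data on journal articles published from 2001 to 2014 by library and information science faculty nationwide, it derived rankings by school and by author, one based on journal impact factors and citation counts and one based on the 27 universities' evaluation criteria, and compared them. The study aims to contribute to fair evaluation methods for faculty and to help raise faculty morale in many respects, from enhancing research and scholarly activity to improving the research environment within universities.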

A Study on Tools to Develop Electronic Documents (전자문헌 개발도구에 관한 고찰 - SGML, HTML과 PDF를 중심으로 -)

  • Kim, Yong;NamKoong, Hwang
    • Journal of Information Management / v.29 no.1 / pp.1-19 / 1998
  • With advances in computing and networking technologies, national support for and attention to building digital libraries, which aim to overcome the limits of time and location in using information resources, are increasing. To accomplish the main goal of a digital library, which is to freely share and transfer information over the network, standardization in developing electronic documents is becoming more important. Several tools for developing electronic documents for the WWW, which will be used in digital libraries, have now been developed, but none of the formats has an absolute advantage over the others; each has comparative advantages and disadvantages for making electronic documents. By reviewing the features of SGML, HTML, and PDF and analyzing their comparative advantages and disadvantages, this study proposes electronic document formats suited to each type of information resource.

Performance Evaluation of Commercial Document Delivery Suppliers Accessible through the Internet (인터넷을 통하여 접속가능한 상업적 문헌전달서비스의 성능 평가)

  • 장혜란
    • Journal of the Korean Society for information Management / v.17 no.1 / pp.89-101 / 2000
  • In recent years commercial document delivery services have matured in the library and information field. The aim of this paper is to evaluate the performance of the major document delivery suppliers accessible through the Internet. Actual article requests in science and technology were sampled systematically and forwarded to the selected suppliers. Processing data were collected and analyzed in terms of fill rate, delivery time, and cost. Method of delivery was also a criterion of analysis. Results show that the performance of the commercial suppliers varies according to the criteria, and that knowledge about vendor services and citation verification are important factors affecting document delivery capability. Finally, recommendations based on the results are briefly examined.

A Comparative Study on Metadata Formats of Digital Contents (디지털콘텐츠 메타데이터 포맷의 비교 연구)

  • Cho, Yoon-Hee
    • Journal of the Korean Society for Library and Information Science / v.37 no.2 / pp.135-152 / 2003
  • With the rapid growth of the Internet, digital contents have increased geometrically and their types have become far more varied. To make it easier to identify and search digital contents on the Internet, which is basically a distributed network environment, it is essential to organize and manage metadata. In this study, we comparatively analyzed the data elements of metadata formats currently approached from different aspects in diverse fields, so as to provide basic materials for securing interoperability among metadata formats. We selected Dublin Core, Semantic Header, MARC, IAFA Templates, and TEI Header as the general metadata formats of digital contents used widely in all areas, and we carried out comparisons and analyses based on the literature.

Optimization of Number of Training Documents in Text Categorization (문헌범주화에서 학습문헌수 최적화에 관한 연구)

  • Shim, Kyung
    • Journal of the Korean Society for information Management / v.23 no.4 s.62 / pp.277-294 / 2006
  • This paper examines the level of categorization performance in a real-life collection of abstract articles in the fields of science and technology, and tests the optimal number of documents per category in a training set using a kNN classifier. The corpus is built by first choosing categories that hold more than 2,556 documents and then randomly selecting 2,556 documents per category. It is further divided into eight subsets with different numbers of training documents: each set is randomly selected to build training sets ranging from 20 documents (Tr-20) to 2,000 documents (Tr-2000) per category. The categorization performances of the eight subsets are compared. The average performance of the eight subsets is 30% in $F_1$ measure, which is relatively poor compared to the findings of previous studies. The experimental results suggest that, among the eight subsets, Tr-100 appears to be the optimal size for training a kNN classifier. In addition, the correctness of the subject categories assigned to the training sets is probed by manually reclassifying the training sets, in order to support the above conclusion by establishing a relation between that correctness and categorization performance.
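To make the kNN classifier and the $F_1$ measure in the abstract above concrete, here is a minimal sketch. It assumes cosine similarity over raw term counts and a simple majority vote, neither of which the abstract specifies. The toy documents, labels, and function names are hypothetical.

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def knn_predict(train, query, k=3):
    """Label a query by majority vote among its k most similar training documents."""
    ranked = sorted(train, key=lambda item: cosine(item[0], query), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

def f1_score(gold, predicted, category):
    """F1 for one category: harmonic mean of precision and recall."""
    tp = sum(g == category and p == category for g, p in zip(gold, predicted))
    fp = sum(g != category and p == category for g, p in zip(gold, predicted))
    fn = sum(g == category and p != category for g, p in zip(gold, predicted))
    if tp == 0:
        return 0.0
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Toy training set (hypothetical data, not from the paper).
train = [
    (Counter("protein gene cell".split()), "bio"),
    (Counter("gene dna cell".split()), "bio"),
    (Counter("circuit voltage current".split()), "ee"),
    (Counter("voltage signal circuit".split()), "ee"),
]
queries = [Counter("gene cell".split()), Counter("circuit signal".split())]
gold = ["bio", "ee"]
predicted = [knn_predict(train, q, k=3) for q in queries]
print(predicted, f1_score(gold, predicted, "bio"))
```

Varying the number of training documents per category and recomputing $F_1$ for each size would mirror, in miniature, the Tr-20 to Tr-2000 comparison the paper describes.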