• Title/Summary/Keyword: science and technology data (과학기술 데이터)

A Study on Metadata Elements for Digital Course Resources in Universities (디지털 강의자원 관리를 위한 메타데이터 요소에 관한 연구)

  • Choi, Yoon-Kyung;Chung, Yeon-Kyoung
    • Journal of Information Management / v.39 no.3 / pp.23-48 / 2008
  • This study proposes mandatory and extensible elements of the Korea Education Metadata (KEM) for digital course resources at universities in Korea. A literature review, case studies, and interviews were conducted. The mandatory KEM elements comprise sixty-nine elements, proposed in two phases. In addition, based on the interviews, twelve new items were recommended to supplement the KEM elements.

The Current Status of the Electronic Journal Usage Statistics at the Academic Libraries (대학도서관에서의 전자저널 이용 통계 제공 및 활용 현황)

  • Hwang, Ok-Gyung
    • Journal of Information Management / v.38 no.4 / pp.68-87 / 2007
  • This study examines how electronic journal usage statistics are currently used in practice at academic libraries. To this end, an online questionnaire survey was conducted of 63 academic libraries located in Seoul and Gyeonggi Province. Based on the 48 responses, the study found that satisfaction with the present usage data was low. Dissatisfaction was especially high with the absence of comparable data and of the average usage rate of all subscribing libraries. The study also examined 5 types of statistics for the evaluation of electronic journals.

An Automated Production System Design for Natural Language Processing Models Using Korean Pre-trained Model (한국어 사전학습 모델을 활용한 자연어 처리 모델 자동 산출 시스템 설계)

  • Jihyoung Jang;Hoyoon Choi;Gun-woo Lee;Myung-seok Choi;Charmgil Hong
    • Annual Conference on Human and Language Technology / 2022.10a / pp.613-618 / 2022
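  • Since the advent of the Transformer architecture, proposed for effective natural language processing, large-scale pre-trained language models built on it, such as BERT, GPT, and OPT, have been released, followed by pre-trained models more specialized for Korean, such as KoBERT and KoGPT. Although the training resources available for obtaining natural language processing models are growing, applying a pre-trained model to a downstream task still requires complex steps such as data preparation, code writing, fine-tuning, and saving the model. This remains challenging for many application users, and obtaining correct results is not easy. To ease this difficulty and make it easier to apply various machine learning models to user data, techniques collectively referred to as AutoML, such as automated hyperparameter search and model architecture search, are being devised. This study formalizes and proceduralizes the process of producing natural language processing models from Korean pre-trained models and Korean text data, and introduces the design of a system that automatically produces the target prediction model.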

Research on Text Classification of Research Reports using Korea National Science and Technology Standards Classification Codes (국가 과학기술 표준분류 체계 기반 연구보고서 문서의 자동 분류 연구)

  • Choi, Jong-Yun;Hahn, Hyuk;Jung, Yuchul
    • Journal of the Korea Academia-Industrial cooperation Society / v.21 no.1 / pp.169-177 / 2020
  • In South Korea, the results of R&D in science and technology are submitted to the National Science and Technology Information Service (NTIS) in reports that carry Korea national science and technology standard classification codes (K-NSCC). However, because there are more than 2,000 sub-categories, it is non-trivial to choose correct classification codes without a clear understanding of the K-NSCC. In addition, there are few cases of automatic document classification research based on the K-NSCC, and there are no training data in the public domain. To the best of our knowledge, this study is the first attempt to build a high-performing K-NSCC classification system based on NTIS report meta-information from the last five years (2013-2017). To this end, about 210 mid-level categories were selected, and we conducted preprocessing that takes the characteristics of research report metadata into account. More specifically, we propose a convolutional neural network (CNN) technique using only task names and keywords, which are the most influential fields. The proposed model is compared with several machine learning methods that show good performance in text classification (e.g., the linear support vector classifier, CNN, and gated recurrent unit), and it has a performance advantage of 1% to 7% based on a top-three F1 score.

A Study on the Appraisal of Research Records in Science and Technology : Focusing on Foreign Cases (과학기술분야 연구기록의 평가에 관한 연구)

  • Lee, Mi-Young
    • The Korean Journal of Archival Studies / no.41 / pp.75-111 / 2014
  • With the quantitative growth of research data, rising preservation costs, and the expansion of data sharing, organizations need to prioritize their collections and select the data that are worth preserving. It is therefore important today for organizations to appraise the continuing value of the records they produce. Universities and public institutions such as government-funded research institutes are heavy producers of such data, yet records management does not go beyond the framework of "administrative records" and "public records", which is becoming a growing problem. In this study, I examined the background of the argument that research records must be managed from a different perspective and reviewed the characteristics of research records in terms of their producers, research activities, and the records themselves. Based on this analysis, I suggested the main issues and considerations regarding the subjects, criteria, and methods of research records appraisal.

Automatic Classification of Department Types and Analysis of Co-Authorship Network: Focusing on Korean Journals in the Computer Field

  • Byungkyu Kim;Beom-Jong You;Min-Woo Park
    • Journal of the Korea Society of Computer and Information / v.28 no.4 / pp.53-63 / 2023
  • Department information is highly useful in bibliometric analysis of scientific and technological literature. In this paper, a department information dataset was built by screening, refining, and classifying the department types of university-affiliated authors appearing in science and technology journals published in Korea, and a deep learning-based automatic classification model was developed using the dataset as training and validation data. In addition, we analyzed the co-authorship structure and network in the field of computer science using the department information dataset and the affiliation information of authors from domestic academic journals. The automatic classification model using Korean department information achieved an accuracy of 98.6%. Moreover, the co-authorship patterns of Korean researchers in the computer science and engineering field, along with the characteristics and centralities of the co-author network by institution type, region, institution, and department type, were identified in detail and presented visually on a map.

Derivation of Inherent Optical Properties Based on Deep Neural Network (심층신경망 기반의 해수 고유광특성 도출)

  • Hyeong-Tak Lee;Hey-Min Choi;Min-Kyu Kim;Suk Yoon;Kwang-Seok Kim;Jeong-Eon Moon;Hee-Jeong Han;Young-Je Park
    • Korean Journal of Remote Sensing / v.39 no.5_1 / pp.695-713 / 2023
  • In coastal waters, phytoplankton, suspended particulate matter, and dissolved organic matter alter the reflectivity of seawater in intricate, nonlinear ways. Neural network technology, which has advanced rapidly in recent years, has the advantage of representing complex nonlinear relationships effectively. Previous studies constructed a three-stage neural network to extract the inherent optical properties of each component, whereas this study proposes an algorithm that employs a deep neural network directly. The dataset consists of synthetic data provided by the International Ocean Color Coordination Group, with above-surface remote-sensing reflectance at nine wavelengths as the input. We derived inherent optical properties from this dataset using a deep neural network. To evaluate performance, we compared the algorithm with a quasi-analytical algorithm and analyzed how log transformation affects its performance in relation to the data distribution. We found that the deep neural network algorithm accurately estimated the inherent optical properties (R² ≥ 0.9) except for the absorption coefficient of suspended particulate matter, and that it successfully separated the combined absorption coefficient of suspended particulate matter and dissolved organic matter into its two components. We also observed that applying the algorithm directly, without log transformation of the data, made little difference in performance. To apply these findings effectively to ocean color data processing, further research is needed to train on field data and additional datasets from various marine regions, to compare and analyze empirical and semi-analytical methods, and to assess the strengths and weaknesses of each algorithm appropriately.

Construction of Advanced Science Information Databases Using a Supercomputer (슈퍼컴퓨터를 이용한 첨단과학정보 DB구축)

  • Hwang, Il-Seon
    • Journal of Scientific & Technological Knowledge Infrastructure / s.1 / pp.96-101 / 2000
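  • The advanced-science databases being built in this project either must process data volumes so large that a supercomputer is required or include computational algorithms. They also include an information service system that connects the databases to general users. Accordingly, the Research and Development Information Center is considering a plan to deliver information services over the Internet, related public networks, and HPCNat, the nationwide network of the supercomputer center, so that general users can obtain the information they seek in one stop.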

A Data Migration Model and Case Study for Building Management System of Science and Technology Contents (과학기술정보콘텐츠 통합관리시스템 구축을 위한 데이터 마이그레이션 모델 수립 및 적용 사례)

  • Shin, Sung-Ho;Lee, Min-Ho;Lee, Won-Goo;Yoon, Hwa-Mook;Sung, Won-Kyung;Kim, Kwang-Young
    • Journal of the Korea Society of Computer and Information / v.16 no.11 / pp.123-135 / 2011
  • The domestic database market in Korea is estimated at over 3.663 trillion won. Data migration is becoming more important with the continued growth of the database industry. g-CRM and personal recommendation functions are examples of services made possible by coupling customer, product, geographic information, and other databases. The core infrastructure for such services is a database that is integrated, complete, and reliable. Despite these trends in the IT environment, there has been little research on efficient data migration and integration processes or on the verification of migrated data. To address this, we built a data migration model for scientific and technological content and present the results of a migration process that applies the model. In addition, we verified the exhaustiveness, consistency, and coherence of the migration in order to validate the migrated data and database. From the results, we conclude that data migration based on a proper model has a significant influence on database consistency and the correctness of data values, and is essential for maintaining a high-quality database.

Study about Research Data Citation Based on DCI (Data Citation Index) (Data Citation Index를 기반으로 한 연구데이터 인용에 관한 연구)

  • Cho, Jane
    • Journal of the Korean Society for Library and Information Science / v.50 no.1 / pp.189-207 / 2016
  • Sharing and reusing research data can not only enhance the efficiency and transparency of the research process but also create new science through data integration and reinterpretation. Diverse policies on research data sharing and reuse have been developed, along with a broadening of research evaluation to span research data citation rates and the social impact of research output. This study used the Data Citation Index and Kruskal-Wallis H analysis to examine the scale and citation counts of research data, which had not previously been analyzed in Korea. Genetics and biotechnology were identified as the subject areas with the largest amount of research data, whereas the most highly cited subject areas were in economics and social studies, such as demographics and employment. The UK Data Archive and the Inter-university Consortium for Political and Social Research were identified as the data repositories with the most highly cited research data. Data studies, which describe the methodology, type, and other aspects of a data survey, show a higher citation rate than other data types. In the altmetrics results for research data, social science data studies show relatively higher impact than those in other areas.