• Title/Summary/Keyword: science and technology data (과학기술 데이터)

Managing Data Set in Administrative Information Systems as Records (행정정보 데이터세트의 기록관리 방안)

  • Oh, Seh-La;Rieh, Hae-young
    • Journal of Korean Society of Archives and Records Management / v.19 no.2 / pp.51-76 / 2019
  • Records management professionals and scholars have long emphasized the need to manage data sets in administrative information systems as records, but this has not been put into practice in the field. Applying paper-based records management standards and guidelines to data sets has proved difficult because of their technology dependence, vast scale, and varied operating environments. Data sets therefore require a management system that accommodates the inherent characteristics of records and can be applied in practice. This study develops and presents data set management methods and procedures based on an analysis of data sets in administrative information systems operated by public institutions.

DART: Data Augmentation using Retrieval Technique (DART: 검색 모델 기술을 사용한 데이터 증강 방법론 연구)

  • Seungjun Lee;Jaehyung Seo;Jungseob Lee;Myunghoon Kang;Hyeonseok Moon;Chanjun Park;Dahyun Jung;Jaewook Lee;Kinam Park;Heuiseok Lim
    • Annual Conference on Human and Language Technology / 2022.10a / pp.313-319 / 2022
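  • Transformer-based models such as BERT have recently performed well on many natural language processing tasks, including natural language understanding (NLU). These models still require large amounts of training data. Data augmentation techniques generally help in low-resource settings. Recent work has attempted to augment data by generating synthetic data with generative models. Such methods aim to increase lexical and structural diversity without damaging semantic similarity to the original sentences. This paper proposes a data augmentation method that takes task-oriented vocabulary and structure into account, using a retrieval model and a pretrained generative model. The retrieval model retrieves sentence pairs similar to the input sentences of the training dataset. The retrieved sentence pairs are then used to train the generative model, which produces synthetic data. In low-resource settings, the proposed method improved baseline performance by as much as 4% or more and yielded larger gains than existing data augmentation methods.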
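The retrieval step described in this abstract can be sketched as follows. This is a minimal illustration only: it assumes a bag-of-words cosine similarity in place of the paper's retrieval model, which the abstract does not specify, and the names `bow`, `cosine`, and `retrieve_pairs` and the sample sentences are hypothetical. Training the generative model on the retrieved pairs is not shown.

```python
import math
from collections import Counter


def bow(sentence):
    """Lowercased bag-of-words vector (assumed stand-in for a real encoder)."""
    return Counter(sentence.lower().split())


def cosine(a, b):
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve_pairs(train_sentences, pool, k=1):
    """For each training sentence, pair it with its k most similar pool sentences."""
    pool_vecs = [(s, bow(s)) for s in pool]
    pairs = []
    for src in train_sentences:
        query = bow(src)
        ranked = sorted(pool_vecs, key=lambda sv: cosine(query, sv[1]), reverse=True)
        pairs.extend((src, s) for s, _ in ranked[:k] if s != src)
    return pairs


# Hypothetical task-oriented sentences, for illustration only.
train = ["book a flight to seoul"]
pool = ["book a flight to busan", "play some jazz music", "reserve a table for two"]
pairs = retrieve_pairs(train, pool, k=1)
print(pairs)
```

In this sketch the retrieved pairs would serve as (source, target) examples for fine-tuning a generative model. A dense retriever would likely replace the bag-of-words similarity in practice.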

A Study on Development of Digital Curation Maturity Models and Indicators: Focusing on KISTI (디지털 큐레이션 성숙도 모델 및 지표 개발에 관한 연구: 한국과학기술정보연구원 디지털큐레이션센터를 중심으로)

  • Seonghun, Kim;Suelki, Do;Sangeun, Han;Jayhoon, Kim;Seokjong, Lim;Jinho, Park
    • Journal of the Korean Society for Information Management / v.39 no.4 / pp.269-306 / 2022
  • This study aimed to develop indicators that measure the digital transformation performance of systems for building and sharing science and technology information, using digital curation maturity models. Digital transformation requires considering not only service improvement but also organizational and business change. We developed a model for measuring the digital transformation of KISTI, Korea's representative science and technology information service organization. KISTI has already carried out business process reengineering (BPR) for digital transformation and borrowed the concept of a maturity model, but BPR offers no method for measuring the result. We therefore developed indicators for measuring digital transformation based on the maturity model. Indicator development proceeded in two steps: model development and evaluation. The model was built from a comprehensive review of existing KISTI cases and various domestic and international cases. Before verification, the model comprised, by major category, technology (37 indicators), data (45), strategy (18), organization (36), and (social) influence (14). After verification using confirmatory factor analysis, it comprised technology (20; 17 indicators dropped), data (36; 9 dropped), strategy (18; unchanged), organization (30; 6 dropped), and (social) influence (13; 1 dropped).
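The indicator counts reported in this abstract can be cross-checked with a few lines of arithmetic. The sketch below uses only the figures quoted above. The variable names are illustrative, and the totals are derived here, not stated in the abstract.

```python
# Indicator counts per major category, as quoted in the abstract.
before = {"technology": 37, "data": 45, "strategy": 18,
          "organization": 36, "(social) influence": 14}
dropped = {"technology": 17, "data": 9, "strategy": 0,
           "organization": 6, "(social) influence": 1}

# Indicators retained after confirmatory factor analysis.
after = {category: before[category] - dropped[category] for category in before}

for category in before:
    print(f"{category}: {before[category]} -> {after[category]}")
print("total:", sum(before.values()), "->", sum(after.values()))
```

The per-category results agree with the post-verification counts in the abstract (20, 36, 18, 30, and 13).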

The Fourth Industrial Revolution and the Deregulation of Data Protection (4차 산업혁명과 개인정보 규제완화론)

  • Chang, Yeo-Kyung
    • Journal of Science and Technology Studies / v.17 no.2 / pp.41-79 / 2017
  • The fourth industrial revolution, which has been all the rage in South Korea in recent years, originates from Klaus Schwab's book. Schwab claims that recent rapid technological innovation has inevitably determined the future of our society and that regulations in related policy areas need to be relaxed. The debate on the Fourth Industrial Revolution in Korean society is likewise centered on deregulation policies. In particular, it is strongly argued that personal data protection regulation should be relaxed in a big data environment. Science and technology studies has long criticized technological determinism: the future of technology can be changed by the will of regulatory authorities and the intervention of civil society. In this article, the author examines various discussions at home and abroad on the deregulation of data protection, including de-identification of personal data. On this basis, the author criticizes the way the fourth industrial revolution theory has been accepted and draws out its implications for Korean society.

A Study on Data Cleansing Techniques for Word Cloud Analysis of Text Data (텍스트 데이터 워드클라우드 분석을 위한 데이터 정제기법에 관한 연구)

  • Lee, Won-Jo
    • The Journal of the Convergence on Culture Technology / v.7 no.4 / pp.745-750 / 2021
  • In big data visualization analysis of unstructured text data, the raw data is mostly large in volume, and analysis techniques cannot be applied until it has been cleansed. Therefore, unnecessary data is removed from the collected raw data through a first, heuristic cleansing process, and stopwords are removed through a second, machine cleansing process. The frequency of each word is then calculated and visualized using the word cloud technique, key issues are extracted and turned into information, and the results are analyzed. In this study, we propose a new stopword cleansing technique that uses an external stopword set (DB) with a Python word cloud, and we derive the problems and effectiveness of this technique through a practical case analysis. Based on this verification, we present the practical utility of word cloud analysis that applies the proposed cleansing technique.
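The two-stage cleansing and frequency count described in this abstract might look like the sketch below. It is an assumption-laden illustration, not the author's implementation: the function names, the regular expressions, the in-memory stopword "DB", and the sample text are all hypothetical. The tokenization handles ASCII letters only, so Korean text would need a morphological analyzer instead.

```python
import io
import re
from collections import Counter


def load_stopwords(fileobj):
    """Read an external stopword set, one word per line (assumed file layout)."""
    return {line.strip().lower() for line in fileobj if line.strip()}


def heuristic_clean(text):
    """First stage: strip URLs and non-letter characters, then lowercase."""
    text = re.sub(r"https?://\S+", " ", text)
    text = re.sub(r"[^A-Za-z\s]", " ", text)
    return text.lower()


def word_frequencies(text, stopwords):
    """Second stage: drop stopwords, then count the remaining tokens."""
    tokens = heuristic_clean(text).split()
    return Counter(t for t in tokens if t not in stopwords)


# Stand-in for an external stopword DB; a real run would open a file instead.
stopword_db = io.StringIO("the\nof\nand\nis\na\n")
raw = ("The analysis of big data is a key issue: see https://example.com. "
       "Data, data and MORE data!")
freqs = word_frequencies(raw, load_stopwords(stopword_db))
print(freqs.most_common(3))
```

The resulting mapping of words to counts is the kind of input a word cloud renderer takes. Keeping the stopword set in an external file allows it to be revised without touching the code.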

Implementation of marine static data collection and DB storage algorithms (해양 정적 데이터 수집 및 DB 저장 알고리즘 구현)

  • Seung-Hwan Choi;Gi-Jo Park;Ki-Sook Chung;Woo-Sug Jung;Kyung-Seok Kim
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.23 no.2 / pp.95-101 / 2023
  • Globally, the utilization and management of marine spatial information is growing in importance, and analysis of such data is emerging as a major driving force for R&D. In Korea, collecting marine data from the past to the present and extracting its value is expected to play an important role in the country's future scientific development. In particular, marine static data constitutes a very large database, and because collecting it requires high costs and advanced observation techniques, the collected data must be stored without loss. In addition, the Disaster Safety Intelligence Convergence Center's task "Marine Digital Twin Establishment and Utilization-Based Technology Research" requires the collection and analysis of marine data, so this paper surveys the current status of static marine data and presents a series of algorithms that collect it and store it in a database.
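A collect-and-store step of the kind this abstract describes could be sketched as below. The abstract gives no schema or database engine, so everything here is assumed for illustration: SQLite, the `marine_static` table, its columns, and the sample records. The composite primary key is one simple way to keep repeated collection runs from duplicating rows.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open the database and create the (hypothetical) table if it is missing."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS marine_static (
               station     TEXT NOT NULL,
               observed_at TEXT NOT NULL,
               variable    TEXT NOT NULL,
               value       REAL NOT NULL,
               PRIMARY KEY (station, observed_at, variable)
           )"""
    )
    return conn


def count_rows(conn):
    return conn.execute("SELECT COUNT(*) FROM marine_static").fetchone()[0]


def store(conn, records):
    """Insert records in one transaction; return how many rows were new."""
    before = count_rows(conn)
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT OR IGNORE INTO marine_static VALUES (?, ?, ?, ?)", records
        )
    return count_rows(conn) - before


# Hypothetical observations, for illustration only.
records = [
    ("ST01", "2023-01-01T00:00:00", "water_temp_c", 8.4),
    ("ST01", "2023-01-01T00:00:00", "salinity_psu", 33.9),
]
conn = open_store()
first = store(conn, records)
second = store(conn, records)  # same batch again: nothing new is added
print(first, second)
```

Wrapping each batch in a single transaction means a failed insert rolls back the whole batch, which is one way to approach the "without loss" requirement.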

Effect on self-enhancement of deep-learning inference by repeated training of false detection cases in tunnel accident image detection (터널 내 돌발상황 오탐지 영상의 반복 학습을 통한 딥러닝 추론 성능의 자가 성장 효과)

  • Lee, Kyu Beom;Shin, Hyu Soung
    • Journal of Korean Tunnelling and Underground Space Association / v.21 no.3 / pp.419-432 / 2019
  • Most deep learning models are trained by supervised learning, that is, on labeled data composed of inputs and corresponding outputs. Because the labels are produced manually, labeling accuracy is relatively high, but securing the data takes considerable cost and time. In addition, the main goal of supervised learning is to improve detection performance on 'True Positive' data, not to reduce the occurrence of 'False Positive' data. This paper reports that unpredictable 'False Positives' were produced by models trained on labeled and 'True Positive' data during monitoring with a deep learning-based CCTV accident detection system in operation at a tunnel monitoring center. 'False Positives' for 'fire' or 'person' objects were frequently triggered by the lights of working vehicles, sunlight reflected at the tunnel entrance, long black features on part of a lane or car, and so on. To solve this problem, a deep learning model was developed by simultaneously training on the 'False Positive' data generated in the field and the labeled data. Compared with the model trained only on the existing labeled data, re-inference performance on the labeled data improved. In addition, re-inference on the 'False Positive' data showed that the number of 'False Positives' for persons fell further when the training set included more 'False Positive' data. Training on 'False Positive' data thus automatically improved the field applicability of the deep learning model.

A Study on the Trend Analysis Based on Personal Information Threats Using Text Mining (텍스트 마이닝을 활용한 개인정보 위협기반의 트렌드 분석 연구)

  • Kim, Young-Hee;Lee, Taek-Hyun;Kim, Jong-Myoung;Park, Won-Hyung;Koo, Kwang-Ho
    • Convergence Security Journal / v.19 no.2 / pp.29-38 / 2019
  • For that reason, trend research has been actively conducted to identify and analyze the key topics in large amounts of data and information. The personal information protection field is likewise stepping up efforts to identify prospects and trends in advance so that it can respond preemptively. However, most existing research is technology-based, covering trends in the information security field and personal information protection solutions. In this study, threat-based trends in the personal information protection field are analyzed using text mining. This is expected to help uncover undiscovered issues and provide visibility into current and future trends. The results can support policy formulation at companies that handle personal information and are therefore expected to be used in setting the direction of strategies for effective response.

A Transdisciplinary and Humanistic Approach on the Impacts by Artificial Intelligence Technology (인공지능과 디지털 기술 발달에 따른 트랜스/포스트휴머니즘에 관한 학제적 연구)

  • Kim, Dong-Yoon;Bae, Sang-Joon
    • Journal of Broadcast Engineering / v.24 no.3 / pp.411-419 / 2019
  • Nowadays we can hardly consider or imagine anything without taking into account what is called artificial intelligence (A.I.). Even broadcasting media technologies cannot be thought of outside this newly emerging technology. Since the latter part of the 20th century, its development has seemingly been accelerating thanks to an enormous computational capacity for processing data. In conjunction with firmly established worldwide platform companies such as GAFA (Google, Amazon, Facebook, Apple), the key cutting-edge technologies dubbed NBIC (nanotechnology, biotechnology, information technology, cognitive science) are converging to change the map of current civilization by affecting the human relationship with the world and hence modifying what is essential in humans. Under the sign of these converging technologies, relatively recently coined concepts such as 'trans(post)humanism' are emerging in academia in North America and the major European regions. Even though so-called trans(post)human movements prevail in the major technological centers, these terms have not yet gained unanimous acceptance among experts from diverse fields. Indeed, trans(post)humanism, a somewhat obscure term, has been a highly controversial trend, because opinions differ widely among scientific, philosophical, medical, and engineering scholars such as Peter Sloterdijk, K. N. Hayles, Neil Badmington, Raymond Kurzweil, Hans Moravec, Laurent Alexandre, and Gilbert Hottois, to name just a few. However, considering the dazzling development of artificial intelligence technology, which functions in conjunction with the cybernetic communication system first conceived by the MIT mathematician Norbert Wiener, we cannot avoid asking what A.I. signifies and how it will affect the current media communication environment.

A Visualization Method for the Ocean Forecast Data using WMS System (WMS 시스템을 이용한 해양예측모델 데이터의 가시화 기법)

  • Kwon, Taejung;Lee, Jaeryoung;Park, Jaepyo
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.18 no.6 / pp.11-19 / 2018
  • Many companies now offer web-based maps built on GIS (Geographic Information System) information; Google Map, OpenStreetMap, Bing Map, Naver Map, Daum Map, and Vworld Map are a few examples. In this paper, we propose a method for visualizing ocean forecasting model data that takes into account a tidal current flow diagram, a streamline rendering algorithm, and user convenience, using the vector field data currently being served. We confirm that the proposed tidal current flow diagram and streamline rendering algorithm is more than twice as fast as the conventional method for visualizing ocean prediction model data.
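For context on the WMS side of this abstract, the sketch below builds a standard WMS 1.3.0 `GetMap` request URL. The parameter names follow the OGC WMS specification, but the endpoint `https://example.org/wms`, the layer name `tidal_current`, and the bounding box are hypothetical. The paper's streamline rendering algorithm is not reproduced here.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def getmap_url(base_url, layer, bbox, width, height):
    """Build a WMS 1.3.0 GetMap URL.

    With CRS EPSG:4326 in WMS 1.3.0, bbox is given in latitude/longitude
    axis order: (min_lat, min_lon, max_lat, max_lon).
    """
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.3.0",
        "REQUEST": "GetMap",
        "LAYERS": layer,
        "STYLES": "",  # empty value requests the server's default style
        "CRS": "EPSG:4326",
        "BBOX": ",".join(str(v) for v in bbox),
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": "image/png",
        "TRANSPARENT": "TRUE",
    }
    return f"{base_url}?{urlencode(params)}"


# Hypothetical endpoint and layer covering waters around the Korean peninsula.
url = getmap_url("https://example.org/wms", "tidal_current",
                 (33.0, 124.0, 39.0, 132.0), 800, 600)
query = parse_qs(urlsplit(url).query, keep_blank_values=True)
print(url)
```

A map client would request such images tile by tile or per viewport and overlay them on a base map. The transparent PNG format lets the forecast layer sit on top of other layers.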