• Title/Summary/Keyword: Public dataset

A Study on the Establishment of Standard Elements of Infrastructure Master Data: Focused on Infrastructure Standard Dataset (기반시설 마스터데이터 표준요소 구축에 관한 연구 - 기반시설 표준데이터를 중심으로 -)

  • Sohn, Hyein;Nam, Young Joon
    • Journal of the Korean BIBLIA Society for library and Information Science / v.28 no.4 / pp.35-55 / 2017
  • Master data is constructed for wide use within an institution and is mainly used in enterprises. This study aims to build infrastructure master data that public institutions in the country can use. To do this, we analyzed the individual attributes of the standard datasets provided by the public data portal and extracted the standard elements that match the characteristics of master data. Finally, the standardized elements were verified against the standardization system used in the country.

A Study on Managing Dataset Records in Government Information Systems (행정정보 데이터세트 기록의 관리방안)

  • Wang, Ho-sung;Seol, Moon-won
    • Journal of Korean Society of Archives and Records Management / v.17 no.3 / pp.23-47 / 2017
  • According to a recent survey, over 18,000 government information systems have numerous different functions and characteristics. Although every dataset that is created and maintained in government information systems is declared as a collection of records according to the Public Records Management Act, current electronic records management policies cannot cover dataset records management. This study suggests the policy directions for dataset records management at the national level. It emphasizes the necessity to preserve the appearance and behavior (function) of database systems to ensure the authenticity of dataset records. In addition, this study investigates "emulation" as a representation and long-term preservation methodology for dataset-type records. It also suggests a dataset records model.

Modern Face Recognition using New Masked Face Dataset Generated by Deep Learning (딥러닝 기반의 새로운 마스크 얼굴 데이터 세트를 사용한 최신 얼굴 인식)

  • Pann, Vandet;Lee, Hyo Jong
    • Annual Conference of KIPS / 2021.11a / pp.647-650 / 2021
  • The most powerful modern face recognition techniques use deep learning methods, which have delivered impressive performance. As COVID-19 spread worldwide, people began wearing face masks to prevent the spread of the virus, which has caused existing face recognition methods to fail to identify people. As a result, masked face recognition has become one of the most challenging problems in the face recognition domain. However, deep learning methods require numerous data samples, and publicly available benchmark masked face datasets are hard to find. In this work, we develop a new simulated masked face dataset that can be used for masked face recognition tasks. To evaluate the usability of the proposed dataset, we also retrained an ArcFace-based system, one of the most popular state-of-the-art face recognition methods, on it.

A Study on Educational Data Mining for Public Data Portal through Topic Modeling Method with Latent Dirichlet Allocation (LDA기반 토픽모델링을 활용한 공공데이터 기반의 교육용 데이터마이닝 연구)

  • Seungki Shin
    • Journal of The Korean Association of Information Education / v.26 no.5 / pp.439-448 / 2022
  • This study searches for education-related datasets provided by public data portals and uses topic modeling to classify them and examine what types of data they contain. From the public data portal, 3,072 file datasets in the education field were collected based on the portal's classification system. Text mining analysis was performed using LDA-based topic modeling, with stopword processing and data pre-processing applied to each dataset. The datasets pre-classified as education by the data portal mostly provided program information and student-support notifications. In contrast, the datasets collected by searching for "education" generally described educational programs and support information for the disabled, parents, the elderly, and children from the perspective of lifelong education. The results of this analysis suggest that providing sufficient educational information through the public data portal would better support students' data science-based decision-making and problem-solving skills.
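
The stopword processing and pre-processing step mentioned in the abstract above can be illustrated with a minimal sketch. It only builds the per-document term counts that an LDA implementation would typically take as input; it does not fit a topic model. The stopword list, the function names, and the two sample descriptions are hypothetical and are not taken from the study.

```python
import re
from collections import Counter

# Hypothetical stopword list; the study's actual list is not given.
STOPWORDS = {"the", "of", "and", "for", "in", "a", "to"}


def tokenize(text):
    """Lowercase the text, keep alphabetic tokens, and drop stopwords."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]


def document_term_counts(docs):
    """Return one Counter of term frequencies per document."""
    return [Counter(tokenize(d)) for d in docs]


def vocabulary(counts):
    """Sorted list of every term that occurs in at least one document."""
    return sorted(set().union(*counts)) if counts else []


# Hypothetical dataset descriptions, for illustration only.
docs = [
    "Lifelong education programs for the elderly",
    "Notification of student support programs",
]
counts = document_term_counts(docs)
vocab = vocabulary(counts)
```

The resulting counts form a bag-of-words representation; a topic modeling library would consume a matrix built from `counts` and `vocab`.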

Study on Public Institution Dataset Identification and Evaluation Process : Focusing on the Case of KR Electronic Procurement System (공공기관 데이터세트 식별과 평가 절차 연구 국가철도공단 전자조달시스템 사례를 중심으로)

  • Hwang, Jin Hyun;Baek, Young Mi;Yim, Jin Hee
    • The Korean Journal of Archival Studies / no.70 / pp.41-83 / 2021
  • After the revision of the Enforcement Decree of the Public Records Act, archives have been required to create a management standard table for dataset records and to carry out management and control. In this study, therefore, a dataset record identification procedure and an evaluation index were developed for the systematic management of dataset records by archives. Applying these, the records of 8 datasets in KR's electronic procurement system were identified, a management standard table was prepared, the datasets were evaluated according to the evaluation index, and the retention period, transfer, and collection were determined. It is hoped that this case study will be of practical use to archives at a time when concrete examples of procedures for managing dataset records are lacking.

A Study on Improvement of Evaluation Indicators for Archival Appraisal of Administrative Information Dataset (행정정보 데이터세트 평가선별을 위한 평가지표 개선방안 연구)

  • HanYeok Jeon;Byongu Kang;ChaeEun Song;Dongmin Yang
    • Journal of Korean Society of Archives and Records Management / v.23 no.2 / pp.27-48 / 2023
  • In domestic public institutions, administrative information datasets are recognized as electronic records that require systematic management. In this regard, concrete measures for the execution of records management have been discussed recently in the National Archives of Korea and the academic field. This study seeks to derive a plan to improve evaluation indicators that can effectively grasp the value of administrative information datasets and the matters to be considered when evaluating and selecting datasets in the records management of public institutions. This paper analyzes the theoretical background and current status of dataset evaluation and selection, derives considerations necessary for this process, and proposes improvement measures for evaluation indicators presented in previous studies. The results of this study are expected to lead to the revitalization of discussions on maintaining the public institutions' dataset management system and supplementing the management process in the future.

Classification of Network Traffic using Machine Learning for Software Defined Networks

  • Muhammad Shahzad Haroon;Husnain Mansoor
    • International Journal of Computer Science & Network Security / v.23 no.12 / pp.91-100 / 2023
  • As SDN devices and systems reach the market, security in SDN must be raised on the agenda. SDN has become an area of interest in both academia and industry, and its promised benefits motivate many IT managers and leading IT companies to switch to it. Over the last three decades, network attacks have become more sophisticated and harder to detect. The goal of this work is to study how traffic information can be extracted from an SDN controller and Open vSwitch (OVS) instances using SDN mechanisms. The testbed environment is created using the RYU controller and Mininet. The extracted information is then used to detect these attacks efficiently with a machine learning approach, which requires a dataset. Currently, no public SDN-based dataset is available, so this paper creates an SDN-based dataset that includes legitimate and non-legitimate traffic. Classification is divided into two categories: binary and multiclass. Traffic is classified both with and without dimension reduction techniques such as PCA and LDA. Our approach achieves 98.58% accuracy using a random forest algorithm.

Human Activity Classification Using Deep Transfer Learning (딥 전이 학습을 이용한 인간 행동 분류)

  • Nindam, Somsawut;Manmai, Thong-oon;Sung, Thaileang;Wu, Jiahua;Lee, Hyo Jong
    • Annual Conference of KIPS / 2022.11a / pp.478-480 / 2022
  • This paper studies human activity image classification using deep transfer learning, focusing on the Inception convolutional neural network (InceptionV3) model. For this, we used the public UCF-101 dataset together with our own dataset of students' behaviors in mathematics classrooms at a school in Thailand. The video data covers four classes: Play Sitar, Tai Chi, Walking with Dog, and Student Study (our dataset). The experiment was conducted in three phases. First, image frames are extracted from the videos and each frame is labeled with a tag. Second, the dataset is loaded into InceptionV3 with transfer learning for four-class image classification. Lastly, we evaluate the model using precision, recall, F1-score, and a confusion matrix. The classification outcomes for the public dataset and our dataset are 1) Play Sitar (precision = 1.0, recall = 1.0, F1 = 1.0), 2) Tai Chi (precision = 1.0, recall = 1.0, F1 = 1.0), 3) Walking with Dog (precision = 1.0, recall = 1.0, F1 = 1.0), and 4) Student Study (precision = 1.0, recall = 1.0, F1 = 1.0), respectively. The results show an overall classification accuracy of 100%, indicating that the model learns both UCF-101 and our dataset with high accuracy.
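
As a rough illustration of the evaluation step in the abstract above, the sketch below derives per-class precision, recall, and F1 from a confusion matrix using the standard definitions. The function names and the sample counts are hypothetical; the paper's actual confusion matrix is not given here. A purely diagonal matrix is used because it is the only pattern that yields 1.0 for every metric.

```python
def per_class_metrics(confusion, labels):
    """Compute (precision, recall, F1) per class.

    confusion[i][j] is the number of samples whose true class is i
    and whose predicted class is j.
    """
    n = len(labels)
    metrics = {}
    for k, label in enumerate(labels):
        tp = confusion[k][k]
        fp = sum(confusion[i][k] for i in range(n)) - tp  # predicted k, true class differs
        fn = sum(confusion[k]) - tp                       # true k, predicted otherwise
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        metrics[label] = (precision, recall, f1)
    return metrics


def accuracy(confusion):
    """Fraction of all samples that lie on the diagonal."""
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    return correct / total if total else 0.0


labels = ["Play Sitar", "Tai Chi", "Walking with Dog", "Student Study"]
# Hypothetical counts (10 test frames per class), not the paper's data.
perfect = [
    [10, 0, 0, 0],
    [0, 10, 0, 0],
    [0, 0, 10, 0],
    [0, 0, 0, 10],
]
scores = per_class_metrics(perfect, labels)
overall = accuracy(perfect)
```

Any off-diagonal count lowers the precision of the predicted class and the recall of the true class, so the same functions can be used to check less clean results.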

A Novel Transfer Learning-Based Algorithm for Detecting Violence Images

  • Meng, Yuyan;Yuan, Deyu;Su, Shaofan;Ming, Yang
    • KSII Transactions on Internet and Information Systems (TIIS) / v.16 no.6 / pp.1818-1832 / 2022
  • Violence in the Internet era poses a new challenge to the current counter-riot work, and according to research and analysis, most of the violent incidents occurring are related to the dissemination of violence images. The use of the popular deep learning neural network to automatically analyze the massive amount of images on the Internet has become one of the important tools in the current counter-violence work. This paper focuses on the use of transfer learning techniques and the introduction of an attention mechanism to the residual network (ResNet) model for the classification and identification of violence images. Firstly, the feature elements of the violence images are identified and a targeted dataset is constructed; secondly, due to the small number of positive samples of violence images, pre-training and attention mechanisms are introduced to suggest improvements to the traditional residual network; finally, the improved model is trained and tested on the constructed dedicated dataset. The research results show that the improved network model can quickly and accurately identify violence images with an average accuracy rate of 92.20%, thus effectively reducing the cost of manual identification and providing decision support for combating rebel organization activities.

Construction of LiDAR Dataset for Autonomous Driving Considering Domestic Environments and Design of Effective 3D Object Detection Model (국내 주행환경을 고려한 자율주행 라이다 데이터 셋 구축 및 효과적인 3D 객체 검출 모델 설계)

  • Jin-Hee Lee;Jae-Keun Lee;Joohyun Lee;Je-Seok Kim;Soon Kwon
    • IEMEK Journal of Embedded Systems and Applications / v.18 no.5 / pp.203-208 / 2023
  • Recently, with the growing interest in autonomous driving, many researchers have been focusing on developing autonomous driving software platforms. In particular, we have concentrated on developing 3D object detection models that can improve real-time performance. In this paper, we introduce a self-constructed 3D LiDAR dataset specific to domestic environments and propose a VariFocal-based CenterPoint as the 3D object detection model, with improved performance over previous models. Furthermore, we present experimental results comparing the performance of the 3D object detection modules on our self-built dataset and a public dataset. As the results show, our model, which was trained on a large amount of self-constructed data, successfully detects large vehicles and small objects such as motorcycles and pedestrians, which previous models had difficulty detecting. Consequently, the proposed model shows a performance improvement of about 1.0 mAP over the previous model.