• Title/Summary/Keyword: BIG4

Trends of South Korea's Informatization and Libraries' Role Based on Newspaper Big Data (신문 빅데이터를 바탕으로 본 국내 정보화의 경향과 도서관의 역할)

  • Na, Kyoungsik; Lee, Jisu
    • The Journal of the Korea Contents Association / v.18 no.9 / pp.14-33 / 2018
  • The purpose of this study is to analyze informatization trends in Korea using objective newspaper data on informatization and libraries from 1998 to 2017, drawn from four newspapers: KyoungHyang Newspaper, Kookmin Ilbo, Hankyoreh and Hankookilbo. Based on an analysis of metadata and related words using BIGKinds, a news big data system, this study presents the simple frequency and classification of the keywords 'information', 'informatization' and 'library'. We then compared the tendency of informatization in the media with the 'Information White Paper', a publication of government agencies, and with studies of the research topics of four academic journals in the Library and Information Science field. This study interprets informatization trends as reflected in the media, and it is meaningful in that it analyzes newspaper articles as long-term, time-series big data. Based on the results, implications for the growth and development of libraries alongside domestic informatization are suggested. Further studies are expected to yield a basic framework for developing library informatization policy.

A Quality Evaluation Model for Distributed Processing Systems of Big Data (빅데이터 분산처리시스템의 품질평가모델)

  • Choi, Seung-Jun; Park, Jea-Won; Kim, Jong-Bae; Choi, Jae-Hyun
    • Journal of Digital Contents Society / v.15 no.4 / pp.533-545 / 2014
  • As IT evolves, the amount of data we face is increasing exponentially. Distributed processing systems for big data have emerged as a technique for managing and analyzing this vast amount of data. Quality evaluation of existing distributed processing systems, however, has been carried out on the assumption of a structured data environment. If that approach is applied to distributed processing systems for big data, which must focus on the analysis of unstructured data, a precise quality assessment cannot be made. A quality evaluation model for distributed processing systems that considers the big data analysis environment is therefore needed. In this paper, we propose a new quality evaluation model by deriving quality evaluation elements from ISO/IEC 9126, the international standard on software quality, and defining metrics for validating those elements.

Study on Big Data Utilization Plans in Mathematics Education (수학교육에서 빅데이터 활용 방안에 대한 소고)

  • Ko, Ho Kyoung; Choi, Youngwoo; Park, Seonjeong
    • Communications of Mathematical Education / v.28 no.4 / pp.573-588 / 2014
  • How will the field of education react to the big data craze that has recently seeped into every aspect of society? To search for ways to use big data in mathematics education, this study first examined the concept of big data and examples of its application, and then pursued directions for future research in two ways. First, changes in the representation and acceptance of data are required because of changes in technology and the environment. In other words, the learning content and methodology of data treatment need to change, either by describing vast amounts of data visually or by 'analyzing and inferring' from data, so that data can be presented efficiently and clearly. Additionally, the mathematics education field needs to foster changes in curricula to improve students' learning capacity in the 21st century. Second, it is necessary to collect data more actively on general education, and not merely on teaching or learning, in order to identify new information, pursue positive changes in the teaching and learning of mathematics, and stimulate interest and research in the field so that the data can inform policy decisions regarding mathematics education.

For Improving Security Log Big Data Analysis Efficiency, A Firewall Log Data Standard Format Proposed (보안로그 빅데이터 분석 효율성 향상을 위한 방화벽 로그 데이터 표준 포맷 제안)

  • Bae, Chun-sock; Goh, Sung-cheol
    • Journal of the Korea Institute of Information Security & Cryptology / v.30 no.1 / pp.157-167 / 2020
  • Big data and artificial intelligence technologies, which have provided the foundation for the recent 4th industrial revolution, have become a major driving force in business innovation across industries. In the field of information security, efforts are under way to develop and improve intelligent security systems by applying these techniques to large-scale log data, for which effective uses had previously been difficult to find. The quality of security log big data, which is the basis of information security AI learning, is an important input factor that determines the performance of an intelligent security system. However, because log data differ across products and are complex, preprocessing big data of poor quality requires excessive time and effort. In this study, we research and analyze cases related to log data collection from various firewalls. By proposing a standard format for firewall log data collection, we hope to contribute to the development of intelligent security systems based on security log big data.

Big Data Meets Telcos: A Proactive Caching Perspective

  • Bastug, Ejder; Bennis, Mehdi; Zeydan, Engin; Kader, Manhal Abdel; Karatepe, Ilyas Alper; Er, Ahmet Salih; Debbah, Merouane
    • Journal of Communications and Networks / v.17 no.6 / pp.549-557 / 2015
  • Mobile cellular networks are becoming increasingly complex to manage, while classical deployment/optimization techniques and current solutions (e.g., cell densification, acquiring more spectrum) are cost-ineffective and thus seen as stopgaps. This calls for the development of novel approaches that leverage recent advances in storage/memory, context-awareness, and edge/cloud computing, and that fall into the framework of big data. However, big data by itself is yet another complex phenomenon to handle and comes with its notorious 4V: velocity, voracity, volume, and variety. In this work, we address these issues in the optimization of 5G wireless networks via the notion of proactive caching at the base stations. In particular, we investigate the gains of proactive caching in terms of backhaul offloading and request satisfaction, while tackling the large amount of available data for content popularity estimation. To estimate content popularity, we first collect users' mobile traffic data from several base stations of a Turkish telecom operator over a time interval of hours. Then, an analysis is carried out locally on a big data platform, and the gains of proactive caching at the base stations are investigated via numerical simulations. It turns out that several gains are possible depending on the level of available information and the storage size. For instance, with 10% of content ratings and 15.4 Gbyte of storage (87% of the total catalog size), proactive caching achieves 100% request satisfaction and offloads 98% of the backhaul when considering 16 base stations.
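
The idea of proactive caching in the abstract above can be illustrated with a minimal sketch. It assumes a plain request-count popularity estimate and unit-size content items, which is a simplification and not the estimator or simulation setup of the paper. The names `proactive_cache` and `hit_ratio` and the toy traces are hypothetical.

```python
from collections import Counter


def proactive_cache(past_requests, capacity):
    """Pre-fill a cache with the `capacity` most requested items in past traffic.

    Assumption: popularity is estimated by raw request counts.
    """
    popularity = Counter(past_requests)
    return {item for item, _ in popularity.most_common(capacity)}


def hit_ratio(cache, new_requests):
    """Fraction of new requests served from the cache.

    With unit-size items this is also the share of traffic kept off the backhaul.
    """
    hits = sum(1 for item in new_requests if item in cache)
    return hits / len(new_requests)


# Toy traces (hypothetical): past traffic is used to fill the cache,
# later traffic is used to evaluate it.
past = ["a", "b", "a", "c", "a", "b", "d"]
later = ["a", "b", "e", "a"]

cache = proactive_cache(past, capacity=2)
ratio = hit_ratio(cache, later)
print(sorted(cache), ratio)
```

Raising `capacity` or improving the popularity estimate should raise the hit ratio, which mirrors the abstract's point that gains depend on storage size and available information.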

The Status and Suggestions for Big Data Adaptation in the Government and the Public Agency (정부 및 공공기관에서의 빅데이터 활용에 대한 현황 및 실행방안 제안)

  • Byeon, Hyeon-Su
    • Journal of Digital Convergence / v.15 no.4 / pp.13-25 / 2017
  • The volume of stored data is growing faster than ever before. This phenomenon is driven by governments and firms as well as by general users. As for big data, governments and public agencies are likely to play important roles in its application since they can access and operate personal data for public purposes. In this study, the author examined the status of big data and the countermeasures adopted in different countries and drew out some common ground. The suggestions are as follows. First of all, securing manpower and technology has to take precedence. In addition, sharing and joint development between the government and the private sector are required. Organizations should also come up with long-term strategies along with the development of data loading and analysis. In conclusion, the author proposes recognizing the importance of data management and privacy protection, and expanding the possibilities of field application for the political use of big data.

Validation of Administrative Big Database for Colorectal Cancer Searched by International Classification of Disease 10th Codes in Korean: A Retrospective Big-cohort Study

  • Hwang, Young-Jae; Kim, Nayoung; Yun, Chang Yong; Yoon, Hyuk; Shin, Cheol Min; Park, Young Soo; Son, Il Tae; Oh, Heung-Kwon; Kim, Duck-Woo; Kang, Sung-Bum; Lee, Hye Seung; Park, Seon Mee; Lee, Dong Ho
    • Journal of Cancer Prevention / v.23 no.4 / pp.183-190 / 2018
  • Background: As the number of big-cohort studies increases, validation becomes increasingly important. We aimed to validate an administrative database categorized as colorectal cancer (CRC) by the International Classification of Disease (ICD) 10th code. Methods: The big cohort was collected from the Clinical Data Warehouse using ICD 10th codes from May 1, 2003 to November 30, 2016 at Seoul National University Bundang Hospital. The patients in the study group had been diagnosed with cancer and were recorded under the ICD 10th code of CRC by the National Health Insurance Service. Subjects with codes of inflammatory bowel disease or tuberculosis colitis were selected for the control group. To assess the accuracy of the registered CRC codes (C18-21), the chart, imaging results, and pathologic findings were examined by two reviewers. Sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) for CRC were calculated. Results: A total of 6,780 subjects with CRC and 1,899 control subjects were enrolled. Of these patients, 22 subjects did not have evidence of CRC by colonoscopy, computed tomography, magnetic resonance imaging, or positron emission tomography. The sensitivity and specificity of hospitalization data for identifying CRC were 100.00% and 98.86%, respectively. PPV and NPV were 99.68% and 100.00%, respectively. Conclusions: The big-cohort database using the ICD 10th code for CRC appears to be accurate.
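
The four accuracy measures named in the abstract above follow from a 2x2 confusion table, sketched below. The counts are one assumed reading of the abstract, not figures confirmed by the paper: the 22 coded subjects without evidence of CRC are treated as false positives, and no CRC case is assumed to be missing a CRC code. The function name `diagnostic_metrics` is hypothetical.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard accuracy measures from a 2x2 confusion table."""
    return {
        "sensitivity": tp / (tp + fn),  # true cases that carry the code
        "specificity": tn / (tn + fp),  # non-cases that do not carry the code
        "ppv": tp / (tp + fp),          # coded subjects that are true cases
        "npv": tn / (tn + fn),          # uncoded subjects that are non-cases
    }


# Assumed reconstruction from the abstract (not confirmed by the paper):
# 6,780 subjects coded as CRC, of whom 22 had no evidence of CRC;
# 1,899 controls, none of whom is assumed to have CRC.
tp, fp, fn, tn = 6780 - 22, 22, 0, 1899

metrics = diagnostic_metrics(tp, fp, fn, tn)
for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
```

Under this reading the values land close to those reported. Any small difference in the last digit would depend on how the study itself tabulated its counts.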

A Study on Big-5 based Personality Analysis through Analysis and Comparison of Machine Learning Algorithm (머신러닝 알고리즘 분석 및 비교를 통한 Big-5 기반 성격 분석 연구)

  • Kim, Yong-Jun
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.19 no.4 / pp.169-174 / 2019
  • In this study, I collect data by survey, group it with a clustering method, and use supervised learning to judge similarity. The aim is to use feature extraction algorithms and supervised learning to analyze how suitable the resulting correlations are for describing personality. After conducting the questionnaire survey, I refine the collected data, classify the data sets with the clustering techniques of WEKA, an open source data mining tool, and judge similarity using supervised learning. I then use feature extraction algorithms and supervised learning to determine the suitability of the results for personality. The highest degree of similarity classification was obtained with EM classification and supervised learning by Naïve Bayes. The results of feature classification and supervised learning were found to be useful for judging fitness. The accuracy for each Big-5 personality trait changed as items were added or deleted, and I analyzed the differences for each trait.

A Study on Personal Information Protection System for Big Data Utilization in Industrial Sectors (산업 영역에서 빅데이터 개인정보 보호체계에 관한 연구)

  • Kim, Jin Soo; Choi, Bang Ho; Cho, Gi Hwan
    • Smart Media Journal / v.8 no.1 / pp.9-18 / 2019
  • In the era of the 4th industrial revolution, the big data industry is attracting attention as a source of new business models in the public and private sectors, using the varied information collected through the internet and mobile devices. However, even when big data integration and analysis are performed with de-identification techniques, there is still a risk that personal privacy can be exposed. Many recent studies seek effective methods to preserve the value of data without disclosing personal information. In this paper, a personal information protection system is investigated to boost big data utilization in industrial sectors such as healthcare and agriculture. The criteria for evaluating the adequacy of de-identification and the scope of protection for personal information should be applied differently for each industry. In the healthcare sector, which centers on sensitive personal information, the minimum value of k-anonymity should be set to 5 or more, which is the average value of other industrial sectors. For the agricultural sector, the paper suggests including companion dog or farmland information as sensitive information. It is also desirable to apply demonstration steps to each region-specific industry.
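
The k-anonymity threshold mentioned in the abstract above can be checked with a short sketch. It assumes the usual definition: k is the size of the smallest group of records that share the same quasi-identifier values. The field names, the sample records, and the function `k_anonymity` are hypothetical and only serve the illustration.

```python
from collections import Counter


def k_anonymity(records, quasi_identifiers):
    """Return k: the size of the smallest group of records that share
    identical values on all quasi-identifier fields (0 for no records)."""
    groups = Counter(
        tuple(record[field] for field in quasi_identifiers) for record in records
    )
    return min(groups.values()) if groups else 0


# Hypothetical de-identified healthcare records: age band and region are
# treated as quasi-identifiers, diagnosis as the sensitive attribute.
records = (
    [{"age_band": "30-39", "region": "Seoul", "diagnosis": d} for d in "ABCDE"]
    + [{"age_band": "40-49", "region": "Busan", "diagnosis": d} for d in "ABCDEF"]
)

k = k_anonymity(records, ["age_band", "region"])
meets_threshold = k >= 5  # threshold of 5 taken from the abstract
print(k, meets_threshold)
```

A data set that fails the check would typically need coarser quasi-identifier values (for example, wider age bands) before release.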

An Analysis of the Social Phenomena and Perceptions of the Special Case of Military Service System in Korean Sports Field Using Big Data (빅데이터분석을 통한 체육계 병역특례제도의 사회적 현상 및 인식분석)

  • Lee, Hyun-Jeong; Han, Hae-Won
    • Journal of the Korea Convergence Society / v.10 no.4 / pp.229-236 / 2019
  • The purpose of this paper is to analyze social phenomena and perceptions by collecting and analyzing data on public opinion, views and trends related to the special case of military service in the sports community through Big KINDS, operated by the Korea Press Promotion Foundation. To this end, related keywords were derived and visualized with an LDA (latent Dirichlet allocation) technique in order to identify problems in social phenomena based on big data analysis. The topics derived include "re-lighting special case on military service," "military service corruption controversy," "special case of military service for athletes," "alternative military service system for artists" and "parliamentary inspection of the administration." The results could serve as basic data for identifying accurate information on social controversies related to the special case of military service in the sports community and for drawing up practical measures in line with the principle of a just and equal burden.