• Title/Summary/Keyword: Big data collection

Forecasting the Future Korean Society: A Big Data Analysis on 'Future Society'-related Keywords in News Articles and Academic Papers (빅데이터를 통해 본 한국사회의 미래: 언론사 뉴스기사와 사회과학 학술논문의 '미래사회' 관련 키워드 분석)

  • Kim, Mun-Cho;Lee, Wang-Won;Lee, Hye-Soo;Suh, Byung-Jo
    • Informatization Policy
    • /
    • v.25 no.4
    • /
    • pp.37-64
    • /
    • 2018
  • This study aims to forecast the future of Korean society via a big data analysis. From two databases - a collection of 46,000,000 news articles from 127 media outlets on the Naver portal operated by Naver Corporation, and a collection of 70,000 social science papers registered in the KCI (Korea Citation Index of the National Research Foundation) between 2005 and 2017 - the 40 most frequently occurring keywords were selected. Their temporal variations were then traced and compared in terms of frequency counts and patterns, and core issues of the future were identified through keyword network analysis. In the media news database, issues such as economy, polity, and technology ranked highest, whereas in the academic paper database the top-ranking issues concerned feeling, working, and living. In terms of the system/life-world conceptual framework suggested by Jürgen Habermas, public interest in the future inclines toward matters of the 'system,' while professional interest leans toward the 'life-world.' Given this disparity of future interest, a 'mismatch paradigm' is proposed as an alternative approach to social forecasting, substituting for the existing paradigms based on ideas of deficiency or deprivation.
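
The keyword network analysis the study describes can be sketched as a keyword co-occurrence network, ranking keywords by weighted degree; the corpus below is a toy stand-in, not the study's data:

```python
from collections import Counter
from itertools import combinations

def cooccurrence_edges(documents):
    # Count an edge for every pair of keywords appearing in the same document.
    edges = Counter()
    for keywords in documents:
        for pair in combinations(sorted(set(keywords)), 2):
            edges[pair] += 1
    return edges

def degree_centrality(edges):
    # Rank keywords by total co-occurrence weight (weighted degree).
    degree = Counter()
    for (a, b), w in edges.items():
        degree[a] += w
        degree[b] += w
    return degree

# Toy corpus: each "document" is the keyword list extracted from one article.
docs = [
    ["economy", "technology", "polity"],
    ["economy", "technology"],
    ["feeling", "working", "living"],
]
edges = cooccurrence_edges(docs)
degree = degree_centrality(edges)
```

Keywords with high weighted degree are the "core issues" in network terms: they co-occur often with many other keywords.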

Collection and Analysis of Electricity Consumption Data in POSTECH Campus (포스텍 캠퍼스의 전력 사용 데이터 수집 및 분석)

  • Ryu, Do-Hyeon;Kim, Kwang-Jae;Ko, YoungMyoung;Kim, Young-Jin;Song, Minseok
    • Journal of Korean Society for Quality Management
    • /
    • v.50 no.3
    • /
    • pp.617-634
    • /
    • 2022
  • Purpose: This paper introduces the Pohang University of Science and Technology (POSTECH) advanced metering infrastructure (AMI) and Open Innovation Big Data Center (OIBC) platform, along with analysis results of electricity consumption data collected via the AMI on the POSTECH campus. Methods: We installed 248 sensors in seven buildings at POSTECH for the AMI and collected electricity consumption data from those buildings. To identify the amounts and trends of electricity consumption of the seven buildings, data collected from March to June 2019 were analyzed. In addition, the study compared the amounts and trends of electricity consumption of the seven buildings before and after the COVID-19 outbreak, using data collected from March to June of 2019 and 2020. Results: Users can monitor, visualize, and download electricity consumption data collected via the AMI on the OIBC platform. The analysis shows that the seven buildings consume different amounts of electricity and follow different consumption trends, and that consumption in most buildings was significantly reduced after the COVID-19 outbreak. Conclusion: The POSTECH AMI and OIBC platform can serve as a good reference for other universities preparing their own microgrids. The analysis results provide evidence that POSTECH needs to establish customized electricity-reduction strategies for each building. Such results would also be useful for energy-efficient operation and for preparing for unusual energy consumption caused by unexpected situations like the COVID-19 pandemic.
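
The before/after comparison described in Methods reduces, at its simplest, to a change in mean consumption per building between matched periods; the readings below are hypothetical, not POSTECH data:

```python
def mean_kwh(readings):
    # Average of a list of kWh readings.
    return sum(readings) / len(readings)

def consumption_change_pct(before, after):
    # Percentage change in mean consumption between two periods,
    # e.g. March-June 2019 vs March-June 2020 for one building.
    return (mean_kwh(after) - mean_kwh(before)) / mean_kwh(before) * 100

# Hypothetical daily kWh readings for one building.
before_covid = [120.0, 130.0, 125.0, 135.0]
after_covid = [90.0, 95.0, 100.0, 99.0]
change = consumption_change_pct(before_covid, after_covid)
```

A negative value indicates reduced consumption after the outbreak; comparing this figure across buildings is what motivates per-building strategies.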

Artificial Intelligence-based Security Control Construction and Countermeasures (인공지능기반 보안관제 구축 및 대응 방안)

  • Hong, Jun-Hyeok;Lee, Byoung Yup
    • The Journal of the Korea Contents Association
    • /
    • v.21 no.1
    • /
    • pp.531-540
    • /
    • 2021
  • As cyber attacks and crimes increase exponentially and hacking becomes more intelligent and advanced, attack methods and routes are evolving unpredictably and in real time. To reinforce responsiveness to such threats, this study proposes a method for developing an artificial intelligence-based security control platform: a next-generation security system that responds through self-learning, monitors abnormal signs, and blocks attacks. The platform should be built on four foundations: data collection, data analysis, next-generation security system operation, and security system management. The data collection step gathers external threat information on top of a big data base and control system. The data analysis step pre-processes and formalizes the collected data, then performs true/false-positive detection and abnormal behavior analysis with deep learning-based algorithms. The next-generation security system operates an organically circulating structure of prevention, control, response, and analysis to widen the scope and increase the speed of handling new threats and to reinforce the identification of normal and abnormal behaviors. Security system management covers the security threat response system, harmful IP management, detection policy management, and the legal framework governing security work. Through this, the study seeks a way to comprehensively analyze vast amounts of data and respond preemptively within a short time.
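
As a minimal stand-in for the abnormal behavior analysis step (the paper itself envisions deep learning-based algorithms), a z-score detector illustrates the normal/abnormal split on a hypothetical traffic series:

```python
def zscore_alerts(values, threshold=3.0):
    # Flag indices whose deviation from the mean exceeds `threshold`
    # standard deviations - a crude normal/abnormal classifier.
    m = sum(values) / len(values)
    sd = (sum((v - m) ** 2 for v in values) / len(values)) ** 0.5 or 1.0
    return [i for i, v in enumerate(values) if abs(v - m) / sd > threshold]

# Hypothetical per-minute connection counts from a monitored host.
traffic = [10] * 20 + [100]
alerts = zscore_alerts(traffic)
```

In a real control platform, flagged indices would feed the response stage (blocking, harmful IP management) rather than a simple list.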

Refresh Cycle Optimization for Web Crawlers (웹크롤러의 수집주기 최적화)

  • Cho, Wan-Sup;Lee, Jeong-Eun;Choi, Chi-Hwan
    • The Journal of the Korea Contents Association
    • /
    • v.13 no.6
    • /
    • pp.30-39
    • /
    • 2013
  • A web crawler should keep its copy of large web sites fresh with minimal server overhead. That overhead grows rapidly as data volumes explode in the big data era: the amount of web information is increasing quickly with advanced wireless networks and the emergence of diverse smart devices, and information is continuously produced and updated anywhere, anytime, through easy-to-use web platforms and smart devices. How frequently updated web data should be refreshed has thus become a hot issue in data collection and integration. In this paper, we propose dynamic web-data crawling methods that include sensitive checking of web site changes and dynamic retrieval of web pages from target sites based on historical update patterns. We implemented a Java-based web crawling application and compared the efficiency of conventional static approaches against our dynamic one. Our experiments showed a 46.2% overhead benefit, with fresher data, compared to the static crawling methods.
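
A dynamic refresh policy of the kind the paper proposes can be sketched as a crawl interval that adapts to a page's observed change rate; the interval bounds below are assumptions for illustration, not the paper's parameters:

```python
def refresh_interval(change_history, base=3600.0, min_iv=600.0, max_iv=86400.0):
    # change_history: 1 if the page had changed at a visit, else 0.
    # Shrink the interval for frequently changing pages, grow it for static ones.
    if not change_history:
        return base  # no history yet: use the default interval
    change_rate = sum(change_history) / len(change_history)
    if change_rate == 0:
        return max_iv  # never observed to change: crawl rarely
    return max(min_iv, min(max_iv, base / change_rate))
```

A page that changed on every visit keeps the base interval (here one hour), while a page that changed on one visit in four is revisited four times less often, saving server overhead.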

Text Data Analysis Model Based on Web Application (웹 애플리케이션 기반의 텍스트 데이터 분석 모델)

  • Jin, Go-Whan
    • The Journal of the Korea Contents Association
    • /
    • v.21 no.11
    • /
    • pp.785-792
    • /
    • 2021
  • Since the Fourth Industrial Revolution, advances in technologies such as artificial intelligence and big data have changed society as a whole, and the amount of data that can be collected when applying these technologies tends to increase rapidly. In academia especially, existing literature is analyzed to grasp research trends: such analyses organize the flow of research, summarize methodologies and themes, and, by identifying the subjects currently discussed in a field, contribute greatly to setting the direction of future research. However, without programming expertise it is difficult even to approach the data collection required for literature analysis. This paper proposes a text mining-based topic modeling web application model. With the proposed model, researchers who lack specialized knowledge of data analysis methods can still collect, store, and text-analyze research papers, and thereby analyze previous research and research trends. The model is expected to reduce the time and effort required for data analysis.
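
As a simplified stand-in for the text analysis such a web application would run behind the scenes (the paper uses topic modeling; TF-IDF keyword extraction is substituted here for brevity), the following shows the collect-then-analyze idea on a toy corpus:

```python
import math
from collections import Counter

def tfidf_keywords(docs, top_k=2):
    # Score each term by term frequency x inverse document frequency,
    # and return the top-k most distinctive terms per document.
    n = len(docs)
    df = Counter()
    for doc in docs:
        df.update(set(doc))  # document frequency: in how many docs a term occurs
    keywords = []
    for doc in docs:
        tf = Counter(doc)
        scores = {t: (tf[t] / len(doc)) * math.log(n / df[t]) for t in tf}
        keywords.append(sorted(scores, key=scores.get, reverse=True)[:top_k])
    return keywords

# Toy corpus of tokenized paper abstracts (hypothetical tokens).
docs = [
    ["crawl", "topic", "model", "model"],
    ["crawl", "trend", "analysis"],
    ["crawl", "topic", "trend"],
]
keywords = tfidf_keywords(docs)
```

Terms appearing in every document (here "crawl") score zero and drop out, which is the same intuition topic modeling exploits when separating corpus-wide vocabulary from topic-specific vocabulary.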

A Study on Condition Analysis of Revised Project Level of Gravity Port facility using Big Data (빅데이터 분석을 통한 중력식 항만시설 수정프로젝트 레벨의 상태변화 특성 분석)

  • Na, Yong Hyoun;Park, Mi Yeon;Jang, Shinwoo
    • Journal of the Society of Disaster Information
    • /
    • v.17 no.2
    • /
    • pp.254-265
    • /
    • 2021
  • Purpose: Inspection and diagnosis of the performance and safety of domestic port facilities have been conducted for over 20 years. However, long-term development strategies and directions for facility renewal and performance improvement that draw on this diagnosis history are not working realistically. In particular, port structures with long service lives face many problems in safety and functionality owing to the increasing size of ships, higher port use frequency, and the effects of natural disasters driven by climate change. Method: In this study, element-level maintenance history data of gravity-type quays were collected and defined as big data, and a predictive approximation model was derived to estimate project-level deterioration and aging patterns of the facility from those data. In particular, we examined the validity of the state-based deterioration patterns and deterioration approximation models generated through the GP and SGP machine learning algorithms, and compared and proposed models suitable for use with big data. Result: In the suitability review, the RMSE and R² of the GP technique were 0.9854 and 0.0721, and those of the SGP technique were 0.7246 and 0.2518. Conclusion: If port facility data collection continues, this machine learning-based research is expected to play an important role in future decision-making on port facility investment.
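
The RMSE and R² metrics used in Result to compare the GP and SGP models are computed as follows; the observed/predicted condition scores below are hypothetical, not the paper's data:

```python
def rmse(y_true, y_pred):
    # Root-mean-square error between observed and predicted condition scores.
    n = len(y_true)
    return (sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n) ** 0.5

def r2(y_true, y_pred):
    # Coefficient of determination: 1 - SS_res / SS_tot.
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Hypothetical deterioration scores for one facility element.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
```

Lower RMSE and R² closer to 1 indicate a better-fitting approximation model, which is the basis on which the two techniques are compared.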

Issue Analysis on Gas Safety Based on a Distributed Web Crawler Using Amazon Web Services (AWS를 활용한 분산 웹 크롤러 기반 가스 안전 이슈 분석)

  • Kim, Yong-Young;Kim, Yong-Ki;Kim, Dae-Sik;Kim, Mi-Hye
    • Journal of Digital Convergence
    • /
    • v.16 no.12
    • /
    • pp.317-325
    • /
    • 2018
  • With the aim of creating new economic value and strengthening national competitiveness, governments and major private companies around the world maintain a strong interest in big data and are making bold investments. To collect objective data such as news, securing data integrity and quality is a prerequisite. For researchers or practitioners who wish to make decisions or analyze trends based on objective, massive data such as portal news, the problem with existing crawler methods is that data collection itself gets blocked. In this study, we implemented a web data collection method that addresses these crawler-style problems using the cloud service platform provided by Amazon Web Services (AWS). We then collected 'gas safety' articles and analyzed issues related to gas safety. The research confirmed that, to ensure gas safety, strategies should be established and systematically operated around five categories: accident/occurrence, prevention, maintenance/management, government/policy, and target.
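
The core of a distributed crawler of this kind is spreading the target URL list across instances so that no single source sends requests fast enough to be blocked; a minimal round-robin sketch (the AWS instance provisioning itself is omitted):

```python
def partition(urls, n_workers):
    # Round-robin assignment of target URLs to crawler instances, so each
    # instance fetches from its own IP at a modest, unblockable rate.
    return [urls[i::n_workers] for i in range(n_workers)]

# Hypothetical article URLs to be split across two crawler instances.
urls = ["u1", "u2", "u3", "u4", "u5"]
shards = partition(urls, 2)
```

Each shard would then be handed to a separate cloud instance; results are merged downstream before the issue analysis step.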

IoT Data Processing Model of Smart Farm Based on Machine Learning (머신러닝 기반 스마트팜의 IoT 데이터 처리 모델)

  • Yoon-Su, Jeong
    • Advanced Industrial Science
    • /
    • v.1 no.2
    • /
    • pp.24-29
    • /
    • 2022
  • Recently, smart farm research that applies IoT technology to various farms has been actively conducted to improve agricultural competitiveness and minimize costs. In particular, methods for automatically and remotely controlling environmental information data around smart farms through IoT devices are being studied. This paper proposes a machine learning-based processing model that can maintain an optimal growth environment by monitoring environmental information data collected from smart farms in real time. Because the proposed model uses machine learning, environmental information is grouped into multiple blockchains to enable continuous data collection through rich big data securing measures. In addition, the proposed model selects (or binds) the collected environmental information data according to priority, using weights and correlation indices. Finally, the proposed model minimizes the cost of processing environmental information extended to n layers, so that environmental information can be processed in real time.
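
The weight-and-correlation selection step can be sketched as a scoring rule over sensors; the sensor names, priority weights, and correlation indices below are hypothetical, not from the paper:

```python
def select_sensors(weights, correlations, k=2):
    # Score each sensor by priority weight x |correlation index|
    # and keep the top-k sensors for processing.
    scores = {s: weights[s] * abs(correlations.get(s, 0.0)) for s in weights}
    return sorted(scores, key=scores.get, reverse=True)[:k]

# Hypothetical priority weights and growth-correlation indices for farm sensors.
weights = {"temperature": 0.5, "humidity": 0.3, "co2": 0.2}
correlations = {"temperature": 0.9, "humidity": 0.4, "co2": 0.8}
selected = select_sensors(weights, correlations)
```

Processing only the highest-scoring sensor streams in real time is one way to keep the processing cost low while still tracking the variables most relevant to growth.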

A Study on the Prediction Method of Voice Phishing Damage Using Big Data and FDS (빅데이터와 FDS를 활용한 보이스피싱 피해 예측 방법 연구)

  • Lee, Seoungyong;Lee, Julak
    • Korean Security Journal
    • /
    • no.62
    • /
    • pp.185-203
    • /
    • 2020
  • While overall crime has been declining since 2009, voice phishing has been on the rise. The government and academia have presented various countermeasures and conducted research to eradicate it, but not enough to catch up with evolving voice phishing. This study focuses on catching criminals and preventing damage from voice phishing, from which recovery is difficult. In particular, because victims engage in financial transaction activities (such as account transfers), a voice phishing prediction method was studied using the Fraud Detection System (FDS) already employed to detect financial fraud. As a result, a concept was derived of combining big data related to voice phishing - call details, messenger logs, abnormal accounts, voice phishing types, and 112 reports - in a machine learning-based FDS. The research relied mainly on government measures and a literature review on the use of big data, and because of limitations in data collection and security concerns around FDS, no concrete model is provided. Nevertheless, it is meaningful that, in the absence of prior research, the concept of a voice phishing response converging FDS with the data types needed for machine learning was presented for the first time. Based on this research, it is hoped that a 'Voice Phishing Damage Prediction System' will be developed to prevent damage from voice phishing.
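
The FDS combination the study derives conceptually can be illustrated as a weighted score over binary risk indicators drawn from the listed data sources; the indicator names, weights, and threshold below are hypothetical, not from the study:

```python
def fraud_score(indicators, weights):
    # Weighted sum of the binary risk indicators present on one transaction.
    return sum(w for name, w in weights.items() if indicators.get(name))

def is_suspicious(indicators, weights, threshold=0.5):
    # Flag the transaction for review when the score reaches the threshold.
    return fraud_score(indicators, weights) >= threshold

# Hypothetical indicators derived from call records, messenger logs,
# and flagged-account data (the data sources the study lists).
weights = {"recent_phishing_call": 0.4, "flagged_account": 0.5, "new_payee": 0.2}
txn = {"recent_phishing_call": True, "flagged_account": True, "new_payee": False}
```

In a machine learning-based FDS, the fixed weights here would instead be learned from labeled transaction histories.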

Estimation of Mass Rapid Transit Passenger's Train Choice Using a Mixture Distribution Analysis (통행시간 기반 혼합분포모형 분석을 통한 도시철도 승객의 급행 탑승 여부 추정 연구)

  • Jang, Jinwon;Yoon, Hosang;Park, Dongjoo
    • The Journal of The Korea Institute of Intelligent Transport Systems
    • /
    • v.20 no.5
    • /
    • pp.1-17
    • /
    • 2021
  • Identifying the exact train, and the type of train, boarded by each passenger is practically cumbersome. Previous studies identified boarded trains by matching Automated Fare Collection (AFC) data against the train schedule diagram. However, this approach is inefficient because the exact train boarded by a considerable number of passengers cannot be accurately determined. In this study, we show that the AFC data-diagram matching technique could not estimate the train type chosen by 28% of passengers on Seoul Metro Line 9. To obtain more accurate results, this paper develops a two-step method for estimating the train type boarded: first the AFC data-diagram matching method is applied, followed by a mixture distribution analysis. The analysis yielded reasonable classification points between express-train and non-express passengers for 298 origin-destination pairs that satisfied the verification criteria of this study.
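
A mixture distribution analysis of travel times of this kind can be sketched as a two-component Gaussian mixture fitted by EM, with trips assigned to the express or non-express component; the travel times below are toy values, not Line 9 data:

```python
import math

def gauss_pdf(x, mu, sigma):
    # Density of a 1-D normal distribution.
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

def em_two_gaussians(xs, iters=100):
    # Fit a two-component 1-D Gaussian mixture by EM and return
    # (mu1, sigma1), (mu2, sigma2), and the mixing weight of component 1.
    mu1, mu2 = min(xs), max(xs)
    sigma1 = sigma2 = (max(xs) - min(xs)) / 4 or 1.0
    pi1 = 0.5
    for _ in range(iters):
        # E-step: responsibility of component 1 for each travel time.
        r = []
        for x in xs:
            p1 = pi1 * gauss_pdf(x, mu1, sigma1)
            p2 = (1 - pi1) * gauss_pdf(x, mu2, sigma2)
            r.append(p1 / (p1 + p2))
        # M-step: re-estimate means, standard deviations, and mixing weight.
        n1 = sum(r)
        n2 = len(xs) - n1
        mu1 = sum(ri * x for ri, x in zip(r, xs)) / n1
        mu2 = sum((1 - ri) * x for ri, x in zip(r, xs)) / n2
        sigma1 = max(1e-6, (sum(ri * (x - mu1) ** 2 for ri, x in zip(r, xs)) / n1) ** 0.5)
        sigma2 = max(1e-6, (sum((1 - ri) * (x - mu2) ** 2 for ri, x in zip(r, xs)) / n2) ** 0.5)
        pi1 = n1 / len(xs)
    return (mu1, sigma1), (mu2, sigma2), pi1

# Hypothetical O-D travel times in minutes: express riders ~20, local riders ~35.
times = [19.0, 20.0, 21.0, 20.5, 19.5, 34.0, 35.0, 36.0, 35.5, 34.5]
(mu_express, _), (mu_local, _), w = em_two_gaussians(times)
```

The point where the two fitted components have equal posterior probability serves as the express/non-express classification threshold for that origin-destination pair.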