• Title/Summary/Keyword: big data tasks

Search Results: 98

Relationship between Big Data and Analysis Prediction (빅데이터와 분석예측의 관계)

  • Kang, Sun-Kyoung;Lee, Hyun-Chang;Shin, Seong-Yoon
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2017.05a / pp.167-168 / 2017
  • In this paper, we discuss the importance of deciding what to analyze and what to predict using big data. How and where to apply the large amounts of data that accumulate unnoticed in daily life is a very important question. There are many kinds of tasks that specify what to predict and how to use these data, and finding the most appropriate one is the way to improve prediction accuracy. In addition, the data that are analyzed and predicted should be useful in real life in order to be meaningful.


Changes and Strategies of the Government Service Paradigm through Using Big Data -Focused on Disaster Safety Management in Seoul City- (빅데이터활용을 통한 정부서비스 패러다임의 변화와 전략 -서울시 재난안전관리를 중심으로-)

  • Kim, Young-mi
    • Journal of Digital Convergence / v.15 no.2 / pp.59-65 / 2017
  • The basic goal of urban safety is to support citizens' quality of life and city competitiveness, and its importance is increasing. As the risk of disasters grows, society increasingly demands that damage be minimized through advance prevention and response. For city governments, securing safety has emerged as one of the most important policy tasks, owing to natural disasters such as heavy rain and heavy snow and human-made disasters such as various accidents. Recently, the need to increase the preventive effect through disaster analysis using big data has been emphasized. This study examines the paradigm change in disaster safety management using big data, centering on the city of Seoul. In particular, the study attempts a case analysis from the viewpoint of maximizing effective government services for disaster safety management and seeks the strategic implications in connection with the relevant ordinance.

Survey of Temporal Information Extraction

  • Lim, Chae-Gyun;Jeong, Young-Seob;Choi, Ho-Jin
    • Journal of Information Processing Systems / v.15 no.4 / pp.931-956 / 2019
  • Documents contain information that can be used for various applications, such as question answering (QA), information retrieval (IR), and recommendation systems. To use this information, methods are needed to extract it from documents written in natural language. There are several kinds of information (e.g., temporal information, spatial information, semantic role information), and different kinds are extracted with different methods. In this paper, existing studies on methods of extracting temporal information are surveyed and several related issues are discussed: the task boundary of temporal information extraction, the history of annotation languages and shared tasks, open research issues, applications that use temporal information, and evaluation metrics. Although the history of temporal information extraction tasks is not long, many studies have tried various methods. This paper indicates which approaches are known to be better at extracting particular parts of the temporal information and also provides future research directions.
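  As a minimal illustration of the task this survey covers (not any specific system it reviews), a rule-based extractor can pull simple temporal expressions out of English text with regular expressions; the patterns and function name below are illustrative assumptions, not a reviewed method.

```python
import re

# Illustrative patterns for a few common temporal expression types.
PATTERNS = [
    r"\b\d{4}-\d{2}-\d{2}\b",                      # ISO dates: 2019-05-01
    r"\b(?:January|February|March|April|May|June|July|August|"
    r"September|October|November|December)\s+\d{1,2},\s+\d{4}\b",
    r"\b(?:yesterday|today|tomorrow)\b",            # relative expressions
]

def extract_temporal(text):
    """Return all substrings matching the temporal patterns, in order."""
    found = []
    for pat in PATTERNS:
        found.extend(re.findall(pat, text, flags=re.IGNORECASE))
    return found

print(extract_temporal("The shared task was held on 2019-05-01, "
                       "and results were due tomorrow."))
# → ['2019-05-01', 'tomorrow']
```

  Real systems reviewed in such surveys go well beyond this: they normalize expressions to a calendar value and link them to events, which pure pattern matching cannot do.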

A Study on the Procedure of Using Big Data to Solve Smart City Problems Based on Citizens' Needs and Participation (시민 니즈와 참여 기반의 스마트시티 문제해결을 위한 빅 데이터 활용 절차에 관한 연구)

  • Chang, Hye-Jung
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology / v.13 no.2 / pp.102-112 / 2020
  • The goal of a smart city is to solve urban problems through its component technologies, thereby developing an eco-friendly, sustainable economy and improving citizens' quality of life. Until now, smart cities have evolved around component technologies, but it is time to focus attention on citizens' needs and participation. In this paper, we present a big data procedure for solving smart city problems based on citizens' needs and participation. To this end, we examine the smart city project market by region and major industry, as well as the development stages of the smart city market by sector. We also clarify the definition and necessity of citizen participation in each sector and propose a seven-step big data process for problem solving. This seven-step process derives tasks after analyzing structured and unstructured data in each smart city sector and then derives policy programs accordingly. To attract citizen participation in this procedure, the empathy stage of the design thinking methodology is used in the unstructured data collection process; likewise, as a way of identifying citizens' needs for solving urban problems, the problem definition stage of the design thinking methodology is incorporated into the unstructured data analysis process.

An Exploration on Personal Information Regulation Factors and Data Combination Factors Affecting Big Data Utilization (빅데이터 활용에 영향을 미치는 개인정보 규제요인과 데이터 결합요인의 탐색)

  • Kim, Sang-Gwang;Kim, Sun-Kyung
    • Journal of the Korea Institute of Information Security & Cryptology / v.30 no.2 / pp.287-304 / 2020
  • There have been a number of legal and policy studies on the factors affecting big data utilization, but empirical research on the constituent factors of personal information regulation and data combination, which act as constraints, has hardly been done due to the lack of relevant statistics. This study therefore empirically explores, through Delphi analysis, the priority of the personal information regulation factors and data combination factors that influence big data utilization. The Delphi analysis ranked the personal information regulation factors in the following order: introduction of pseudonymous information, clarity of evidence for personal information de-identification, clarity of data combination regulation, clarity of the definition of personal information, ease of personal information consent, integration of personal information supervisory authorities, consistency among personal information protection acts, adequacy of punishment intensity for violations of the law, and appropriateness of the penalty level compared with the EU GDPR. Next, the data combination factors were ranked in the order of de-identification of data combination, standardization of combined data, responsibility for data combination, type of data combination institute, data combination experience, and technical value of data combination. These findings indicate which policy tasks should be prioritized when designing personal information regulations and data combination policies to utilize big data.

A Study on Factors Affecting BigData Acceptance Intention of Agricultural Enterprises (농업 관련 기업의 빅데이터 수용 의도에 미치는 영향요인 연구)

  • Ryu, GaHyun;Heo, Chul-Moo
    • Asia-Pacific Journal of Business Venturing and Entrepreneurship / v.17 no.1 / pp.157-175 / 2022
  • A paradigm shift toward the digital economy is currently taking place across all sectors of society. Various movements are under way in the global agricultural industry to achieve innovative growth using big data, a key resource of the 4th industrial revolution. Although the government is making various attempts to promote the use of big data, the agricultural industry, a key player in its use, still lags behind. Therefore, this study analyzed the effects of performance expectancy, effort expectancy, social influence, and facilitating conditions, based on the Unified Theory of Acceptance and Use of Technology (UTAUT), together with innovation tendency, on the intention of agriculture-related companies to accept big data, using the economic and practical benefits obtainable from big data as moderating variables. 333 questionnaires collected from agriculture-related companies were used for the empirical analysis. The analysis, conducted with SPSS v22.0 and Process macro v3.4, found that effort expectancy, social influence, facilitating conditions, and innovation tendency had significant positive (+) effects on the intention to accept big data. However, the effect of performance expectancy on acceptance intention was insignificant; social influence had the greatest influence on acceptance intention and innovation tendency the least. The moderating effects of economic and practical benefit between effort expectancy and acceptance intention, of practical benefit between social influence and acceptance intention, and of economic and practical benefit between facilitating conditions and acceptance intention were found to be significant.
On the other hand, economic and practical benefits did not moderate the influence of performance expectancy or innovation tendency on acceptance intention. These results suggest the following implications. First, to promote the use of big data by companies, the government needs to establish support policies tailored to companies; significant results can be achieved only when corporate members form a correct understanding of, and consensus on, the use of big data. Second, it is necessary to establish and operate a platform specialized for agricultural data that supports standardized linkage between diverse agricultural big data sources and a unified path for data access. Building such a platform will advance the industry by fostering independent cooperative relationships between companies. Finally, the limitations of this study and follow-up tasks are presented.

Parallelism point selection in nested parallelism situations with focus on the bandwidth selection problem (평활량 선택문제 측면에서 본 중첩병렬화 상황에서 병렬처리 포인트선택)

  • Cho, Gayoung;Noh, Hohsuk
    • The Korean Journal of Applied Statistics / v.31 no.3 / pp.383-396 / 2018
  • Various parallel processing R packages are used for the fast processing and analysis of big data. Parallel processing is used when the work can be decomposed into mutually independent tasks. In some cases, each task decomposed for parallel processing can itself be decomposed into independent subtasks. In such nested parallelism situations, we must choose whether to parallelize the first-step tasks or the second-step subtasks. This choice has a significant impact on computation speed; consequently, it is important to understand the nature of the work when deciding where to apply parallel processing. In this paper, we show how to apply parallel computing effectively by illustrating how to select the parallelism point for bandwidth selection in nonparametric regression.
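  The paper works in R; as a language-neutral sketch of the decision it describes, the Python code below parallelizes only the outer loop (one task per candidate bandwidth) and leaves the inner subtasks serial. All function names and the toy "score" are assumptions for illustration, not the paper's method.

```python
from concurrent.futures import ThreadPoolExecutor

def inner_subtask(x):
    # Stand-in for one independent per-observation/per-fold computation.
    return x * x

def outer_task(h, data):
    # Stand-in for evaluating one candidate bandwidth h over all data;
    # the inner subtasks run serially inside this outer task.
    return h + sum(inner_subtask(x) for x in data)

def select_bandwidth_outer_parallel(bandwidths, data, workers=4):
    """Parallelism point = outer loop: when outer tasks are numerous and
    similarly sized, parallelizing here usually beats nesting parallelism
    in the subtasks, which would multiply scheduling overhead."""
    with ThreadPoolExecutor(max_workers=workers) as ex:
        scores = list(ex.map(lambda h: outer_task(h, data), bandwidths))
    return min(zip(scores, bandwidths))[1]  # bandwidth with smallest score

print(select_bandwidth_outer_parallel([0.1, 0.5, 1.0], [1, 2, 3]))
# → 0.1
```

  The opposite choice (serial outer loop, parallel subtasks) can win when there are few outer tasks but each contains many heavy subtasks, which is exactly the trade-off the paper examines.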

Design of Spark SQL Based Framework for Advanced Analytics (Spark SQL 기반 고도 분석 지원 프레임워크 설계)

  • Chung, Jaehwa
    • KIPS Transactions on Software and Data Engineering / v.5 no.10 / pp.477-482 / 2016
  • As advanced analytics over big data becomes indispensable for agile decision-making and tactical planning in enterprises, distributed processing platforms such as Hadoop and Spark, which distribute and handle large volumes of data on multiple nodes, have received great attention in the field. In the Spark platform stack, Spark SQL was recently unveiled to give Spark a distributed processing framework based on SQL. However, Spark SQL cannot effectively handle advanced analytics involving machine learning and graph processing, in terms of iterative tasks and task allocation. Motivated by these issues, this paper proposes the design of an SQL-based big data optimal processing engine and a processing framework to support advanced analytics in Spark environments. The optimal processing engine copes with complex SQL queries involving multiple parameters and join, aggregation, and sorting operations in a distributed/parallel manner, and the proposed framework optimizes the machine learning process in terms of relational operations.
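  Spark SQL itself requires a cluster runtime, so as a self-contained stand-in the sketch below uses Python's built-in sqlite3 to show the shape of query (join + aggregation + sort) that such an engine must plan and execute in a distributed/parallel manner; the table names and data are invented for illustration, not from the paper.

```python
import sqlite3

# In-memory database standing in for distributed tables.
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE events(user_id INTEGER, amount REAL);
    CREATE TABLE users(user_id INTEGER, region TEXT);
    INSERT INTO events VALUES (1, 10.0), (1, 5.0), (2, 7.5);
    INSERT INTO users  VALUES (1, 'Seoul'), (2, 'Busan');
""")

# A join + aggregation + sort: the kind of relational query a
# SQL-based analytics engine decomposes across worker nodes.
rows = con.execute("""
    SELECT u.region, SUM(e.amount) AS total
    FROM events e JOIN users u ON e.user_id = u.user_id
    GROUP BY u.region
    ORDER BY total DESC
""").fetchall()
print(rows)  # → [('Seoul', 15.0), ('Busan', 7.5)]
```

  In Spark the same SQL would run unchanged via `spark.sql(...)`, with the engine handling partitioning and shuffles; the point here is only the query shape, not the execution model.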

Design of a ParamHub for Machine Learning in a Distributed Cloud Environment

  • Su-Yeon Kim;Seok-Jae Moon
    • International Journal of Internet, Broadcasting and Communication / v.16 no.2 / pp.161-168 / 2024
  • As big data models grow in size, distributed training is emerging as an essential element of large-scale machine learning tasks. In this paper, we propose ParamHub for distributed data training. During training, this agent uses the provided data to adjust various aspects of the model's parameters, such as the model structure, learning algorithm, hyperparameters, and bias, aiming to minimize the error between the model's predictions and the actual values. Furthermore, it operates autonomously, collecting and updating data in a distributed environment, thereby reducing the load-balancing burden that occurs in a centralized system. Through communication between agents, resource management and learning processes can be coordinated, enabling efficient management of distributed data and resources. This approach enhances the scalability and stability of distributed machine learning systems while providing the flexibility to be applied in various learning environments.
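  The ParamHub design itself is not given as code in this listing; the sketch below only illustrates the general pattern the abstract describes, where agents train locally and a hub-style coordination step averages their parameters. All class/function names and the averaging rule are assumptions for illustration.

```python
class Agent:
    """A worker holding a local copy of model parameters."""
    def __init__(self, params):
        self.params = dict(params)

    def local_update(self, grads, lr=0.1):
        # One gradient step on local data (grads stand in for training).
        for k, g in grads.items():
            self.params[k] -= lr * g

def hub_average(agents):
    """A hub-style coordination step: average parameters across agents
    and push the result back, so no single node carries the full load."""
    keys = agents[0].params.keys()
    avg = {k: sum(a.params[k] for a in agents) / len(agents) for k in keys}
    for a in agents:
        a.params = dict(avg)
    return avg

a1, a2 = Agent({"w": 1.0}), Agent({"w": 3.0})
a1.local_update({"w": 2.0})   # w: 1.0 - 0.1*2.0  -> 0.8
a2.local_update({"w": -2.0})  # w: 3.0 - 0.1*(-2) -> 3.2
print(hub_average([a1, a2]))  # → {'w': 2.0}
```

  Real systems add asynchronous communication and fault tolerance on top of this synchronization primitive; the sketch shows only the parameter-exchange idea.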

LSTM-based Anomaly Detection on Big Data for Smart Factory Monitoring (스마트 팩토리 모니터링을 위한 빅 데이터의 LSTM 기반 이상 탐지)

  • Nguyen, Van Quan;Van Ma, Linh;Kim, Jinsul
    • Journal of Digital Contents Society / v.19 no.4 / pp.789-799 / 2018
  • This article presents a machine learning based approach on big data to analyzing time-series data for anomaly detection in complex industrial systems. Long Short-Term Memory (LSTM) networks have been demonstrated to be an improved version of RNNs and have become a useful aid for many tasks. The LSTM-based model learns higher-level temporal features as well as temporal patterns, and the resulting predictor is then used in the prediction stage to estimate future data. The prediction error is the difference between the predictor's output and the actual incoming values. An error-distribution estimation model is built using a Gaussian distribution to calculate the anomaly score of an observation. In this manner, we move from the concept of a single anomaly to that of a collective anomaly. This work can assist the monitoring and management of smart factories by minimizing failure and improving manufacturing quality.
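  The LSTM predictor itself needs a deep-learning stack, but the scoring step described above can be sketched independently: fit a Gaussian to prediction errors on normal data, then flag observations whose error is unlikely under it. The error values, function names, and 3-sigma threshold below are illustrative assumptions, not the paper's settings.

```python
import math

def fit_gaussian(errors):
    """Estimate mean and standard deviation of prediction errors."""
    n = len(errors)
    mu = sum(errors) / n
    var = sum((e - mu) ** 2 for e in errors) / n
    return mu, math.sqrt(var)

def anomaly_scores(errors, mu, sigma):
    """Score each error by its distance from the mean in sigmas;
    larger scores mean less likely under the fitted Gaussian."""
    return [abs(e - mu) / sigma for e in errors]

# Errors from a (hypothetical) predictor: small on normal data.
train_errors = [0.1, -0.2, 0.0, 0.15, -0.05]
mu, sigma = fit_gaussian(train_errors)

scores = anomaly_scores([0.05, 2.0], mu, sigma)
flags = [s > 3.0 for s in scores]  # 3-sigma rule as the threshold
print(flags)  # → [False, True]: only the large error 2.0 is flagged
```

  A collective-anomaly variant, as the abstract suggests, would aggregate these per-point scores over a window rather than flagging single observations.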