• Title/Summary/Keyword: Big Data Pattern Analysis

Emotion Prediction of Paragraph using Big Data Analysis (빅데이터 분석을 이용한 문단 내의 감정 예측)

  • Kim, Jin-su
    • Journal of Digital Convergence / v.14 no.11 / pp.267-273 / 2016
  • The creation and sharing of information, both structured and unstructured, is advancing rapidly with the spread of mobile devices. Recently, big data techniques such as data mining have been used to extract semantic information from SNS. In particular, general emotion analysis, which expresses the collective intelligence of the masses, draws on large and varied materials. In this paper, we propose an emotion prediction system architecture that extracts significant keywords from social network paragraphs using n-grams and a Korean morphological analyzer, and predicts emotion using an SVM and the extracted emotion features. The proposed system showed an improvement of 82.25% in average recall rate over previous systems, and it will help extract semantic keywords using morphological analysis.
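
The abstract does not describe the feature pipeline in detail, so the following is only a minimal sketch of the n-gram step, assuming character n-grams counted per whitespace-separated token. The function names (`char_ngrams`, `ngram_features`, `to_vector`) are hypothetical, and both the Korean morphological analyzer and the SVM classifier are deliberately left out.

```python
from collections import Counter


def char_ngrams(token, n=2):
    """Return all contiguous character n-grams of a single token."""
    return [token[i:i + n] for i in range(len(token) - n + 1)]


def ngram_features(paragraph, n=2):
    """Count character n-grams over every whitespace-separated token."""
    counts = Counter()
    for token in paragraph.split():
        counts.update(char_ngrams(token, n))
    return counts


def to_vector(counts, vocabulary):
    """Project n-gram counts onto a fixed vocabulary order."""
    return [counts.get(gram, 0) for gram in vocabulary]


# Illustrative input only; not data from the paper.
features = ngram_features("good good day")
vocabulary = sorted(features)
vector = to_vector(features, vocabulary)
```

A vector of this kind is what a classifier such as an SVM would take as input; in the paper's setting the tokens would presumably come from the morphological analyzer rather than from whitespace splitting.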

Clustering of Smart Meter Big Data Based on KNIME Analytic Platform (KNIME 분석 플랫폼 기반 스마트 미터 빅 데이터 클러스터링)

  • Kim, Yong-Gil;Moon, Kyung-Il
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.20 no.2 / pp.13-20 / 2020
  • One of the major issues surrounding big data is the availability of massive time-based or telemetry data. With the appearance of low-cost capture and storage devices, it has become possible to collect very detailed time data for further analysis. Such data can be used to learn more about the underlying system or to predict future events with higher accuracy. In particular, it is very important to define custom-tailored contract offers for the many households and businesses that have smart meter records, and to predict future electricity usage in order to protect electricity companies from power shortage or surplus. A few groups with common electricity behavior must be identified to make the creation of customized contract offers worthwhile. This study suggests big data transformation as a side effect and a clustering technique for understanding electricity usage patterns, using open smart meter data and KNIME, an open source platform for data analytics that provides a user-friendly graphical workbench for the entire analysis process. While the big data components are not open source, they are available for trial if required. After importing, cleaning, and transforming the smart meter big data, each meter's data can be interpreted in terms of electricity usage behavior through a dynamic time warping method.
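
As an illustration of the dynamic time warping step mentioned above, here is a minimal pure-Python sketch rather than the KNIME workflow used in the study. The function name `dtw_distance`, the absolute-difference local cost, and the sample readings are all assumptions made for the example.

```python
def dtw_distance(a, b):
    """Dynamic time warping distance between two numeric sequences,
    using the absolute difference as the local cost."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j],      # a[i-1] repeats the match with b[j-1]
                cost[i][j - 1],      # b[j-1] repeats the match with a[i-1]
                cost[i - 1][j - 1],  # both sequences advance
            )
    return cost[n][m]


# Hypothetical hourly readings (kWh) for three meters.
meter_a = [0.2, 0.2, 1.0, 1.2, 0.3]
meter_b = [0.2, 1.0, 1.2, 1.2, 0.3]  # same shape as meter_a, shifted in time
meter_c = [1.0, 1.0, 1.0, 1.0, 1.0]  # flat usage

d_ab = dtw_distance(meter_a, meter_b)
d_ac = dtw_distance(meter_a, meter_c)
```

Because DTW tolerates shifts in time, `meter_a` and `meter_b` come out much closer to each other than either is to the flat profile, which is the property that makes the distance useful for grouping meters by usage behavior.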

Automatic Algorithm for Cleaning Asset Data of Overhead Transmission Line (가공송전 전선 자산데이터의 정제 자동화 알고리즘 개발 연구)

  • Mun, Sung-Duk;Kim, Tae-Joon;Kim, Kang-Sik;Hwang, Jae-Sang
    • KEPCO Journal on Electric Power and Energy / v.7 no.1 / pp.73-77 / 2021
  • As big data analysis technologies have developed worldwide, the importance of data-driven asset management for electric power facilities is increasing. It is essential to secure the quality of the data, which determines the performance of the risk evaluation algorithm for asset management. To improve the reliability of asset management, asset data must be preprocessed. In particular, dirty data must be cleaned, and an algorithm that reduces the time and improves the accuracy of data treatment is urgently needed. This paper presents the development of an automatic cleaning algorithm specialized for overhead transmission asset data. The algorithm cleans the data by analyzing the quality and overall pattern of the raw data.

A Study on Big Data Analysis of Public Library in Busan: Based on the Library Collection/Circulation Data (부산지역 공공도서관의 빅데이터 분석 연구 - 도서관 정보나루 장서/대출데이터를 중심으로 -)

  • Lee, Soon-Young;Lee, Soo-Sang
    • Journal of the Korean Society for Library and Information Science / v.55 no.4 / pp.89-114 / 2021
  • This study reviewed previous studies and use cases of library big data and, on that basis, analyzed the collection/circulation data of the library big data platform to derive meaningful results. Five analysis indicators were selected: the annual increase rate of collections, the composition of collections by subject, the composition of unborrowed collections by subject, the rate of borrowed collections, and the use factor by subject. The data analyzed comprise 6,722,603 collection/circulation records from 33 public libraries in Busan. The main results are as follows. First, the gap among the 33 public libraries was larger in the number of circulations than in the number of collections. Second, the annual increase rate of collections showed a clear decline. Third, the libraries showed a similar pattern in the composition of both collections and unborrowed collections by subject. Fourth, users' circulation differed greatly by subject and library. Fifth, in most libraries, the circulation rate of collections and the use factor were highest in the natural science field.
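
The abstract does not define the use factor, so the sketch below assumes the definition commonly used in collection evaluation: a subject's share of loans divided by its share of holdings, with values above 1 indicating heavier use than the holdings alone would suggest. The function name and the sample counts are hypothetical.

```python
def use_factor(holdings, loans):
    """Use factor per subject: share of all loans divided by share of all holdings.

    Assumed definition; subjects with no holdings are skipped.
    """
    total_holdings = sum(holdings.values())
    total_loans = sum(loans.values())
    return {
        subject: (loans.get(subject, 0) / total_loans) / (count / total_holdings)
        for subject, count in holdings.items()
        if count > 0
    }


# Hypothetical counts, not taken from the Busan data set.
holdings = {"natural science": 100, "literature": 300}
loans = {"natural science": 200, "literature": 200}

factors = use_factor(holdings, loans)
```

In this made-up example natural science holds a quarter of the collection but accounts for half of the loans, so its use factor is 2.0.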

Development of integrated management solution through log analysis based on Big Data (빅데이터기반의 로그분석을 통한 통합 관리 솔루션 개발)

  • Kang, Sun-Kyoung;Lee, Hyun-Chang;Shin, Seong-Yoon
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2017.10a / pp.541-542 / 2017
  • In this paper, we aim to develop an integrated management solution that is easy to operate and that integrates complex and varied cloud environments. Its advantage is that users and administrators can conveniently solve problems by collecting and analyzing fixed log data and unstructured log data on a big data basis and by monitoring everything in real time in an integrated way. Hypervisor log pattern analysis technology will make it possible to manage existing complex and varied cloud environments more efficiently.

Developing electric railway load pattern inspection program (전철 변전소 전력부하패턴 점검 프로그램 개발)

  • Jeon, Yong-Joo;Lee, Gi-Chun
    • Proceedings of the KSR Conference / 2007.05a / pp.893-898 / 2007
  • At present, a defining characteristic of the electric power market in Korea is that it has a single seller, but competition is expected in the near future, and with it additional electric power services. With advances in IT, remote inspection of power usage has become possible, as has consumption pattern analysis. KORAIL is one of the biggest consumers in the electric power market, so it needs to investigate its power consumption patterns. This paper presents electric load consumption patterns for representative substations, such as urban subway, high-speed train, and industrial lines, and a simulation program for defining electric power rates based on the billing system database. Based on the substations' annual power usage data, the characteristics of substation power consumption are investigated and effective electrical billing systems are compared with each other. The program verifies that more than several hundred million won can be saved per year.

Development of Short-Term Load Forecasting Method by Analysis of Load Characteristics during Chuseok Holiday (추석 연휴 전력수요 특성 분석을 통한 단기전력 수요예측 기법 개발)

  • Kwon, Oh-Sung;Song, Kyung-Bin
    • The Transactions of The Korean Institute of Electrical Engineers / v.60 no.12 / pp.2215-2220 / 2011
  • Accurate short-term load forecasting is essential for efficient power system operation and for deciding the system marginal price in the electricity market. So far, load forecasting errors for the Chuseok holiday have been very large compared with those for other special days. To improve the accuracy of load forecasting for the Chuseok holiday, the selection of input data, the daily normalized load patterns, and the load forecasting model are investigated. An efficient data selection method and a daily normalized load pattern based on a fuzzy linear regression model are proposed. The proposed method was tested on the five years from 2006 to 2010 and improved load forecasting accuracy compared with previous research.

SNA Pattern Analysis on the Public Software Industry based on Open API Big Data from Korea Public Procurement Service (조달청 OPEN API 빅데이터를 활용한 공공 소프트웨어 산업의 SNA 패턴 분석)

  • KIM, Sojung lucia;Shim, Seon-Young;Seo, Yong-Won
    • Informatization Policy / v.24 no.3 / pp.42-66 / 2017
  • This study investigated the ecological change of the public software industry by comparing the structure of the industry network before and after the regulation restricting large companies' participation in the public software market was applied. For this purpose, we used big data on the software market from the Korea Public Procurement Service and the SNA (Social Network Analysis) methodology, which has recently been used actively in the social sciences. Finally, we highlighted the contribution of open public data. By analyzing order and contract data of the public software industry for the three years from 2013 to 2015, we found two main things. First, a power-law distribution persisted in the public software industry regardless of the external impact of the regulation. Second, despite this power-law distribution, the industry structure changed ecologically from year to year. We present the implications of these findings and discuss the advantage of open public data as the original motivator of this study.
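
To make the power-law observation concrete, the following sketch computes a degree distribution from a contract edge list and fits a straight line in log-log space. The edge list, the function names, and the least-squares fit are illustrative assumptions; the abstract does not say how the authors tested for a power law, and a log-log regression is only a rough diagnostic.

```python
import math
from collections import Counter


def degree_distribution(edges):
    """Map each degree value to the number of nodes having that degree."""
    degree = Counter()
    for buyer, vendor in edges:
        degree[buyer] += 1
        degree[vendor] += 1
    return Counter(degree.values())


def loglog_slope(distribution):
    """Least-squares slope of log(node count) against log(degree)."""
    xs = [math.log(k) for k in distribution]
    ys = [math.log(c) for c in distribution.values()]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    return covariance / variance


# Hypothetical (agency, vendor) contract pairs.
edges = [("A", "x"), ("A", "y"), ("A", "z"), ("B", "x")]
distribution = degree_distribution(edges)

# Synthetic distribution in which the node count halves as the degree doubles.
slope = loglog_slope({1: 4, 2: 2, 4: 1})
```

A clearly negative slope on real data would be consistent with a few heavily connected firms and many lightly connected ones, which is the pattern the study reports.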

A Trip Mobility Analysis using Big Data (빅데이터 기반의 모빌리티 분석)

  • Cho, Bumchul;Kim, Juyoung;Kim, Dong-ho
    • The Journal of Bigdata / v.5 no.2 / pp.85-95 / 2020
  • In this study, a mobility analysis method is suggested for estimating O/D trip demand using mobile phone signaling data. Using mobile data based on base station location information, a trip chain database was established for each person, and daily traffic patterns were analyzed. In addition, a new algorithm was developed to determine the traffic characteristics of each person's mobility. To correct the ping-pong handover problem inherent in the communication data, a methodology was developed, and a stay-time criterion was set to distinguish passing by from staying within the influence area. The big-data-based method is applied to analyze mobility patterns of inter-regional and intra-regional trips in both an urban area and a rural city. Compared with the results of traditional methods, the new methodology appears applicable to national survey projects in the future.
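
The two cleaning steps described above, removing ping-pong handovers and separating stays from pass-bys, might look roughly like the sketch below. The visit format `(cell_id, enter_time, exit_time)` in seconds, the function names, and both thresholds are assumptions for illustration; the abstract does not give the paper's actual criteria.

```python
def remove_ping_pong(visits, max_bounce_seconds=60):
    """Merge short A -> B -> A bounces in a list of (cell_id, enter, exit) visits.

    A visit to B counts as a ping-pong artifact when it sits between two visits
    to the same cell A and lasts at most max_bounce_seconds.
    """
    cleaned = []
    i = 0
    while i < len(visits):
        cell, start, end = visits[i]
        is_bounce = (
            cleaned
            and i + 1 < len(visits)
            and cleaned[-1][0] == visits[i + 1][0]
            and cell != cleaned[-1][0]
            and end - start <= max_bounce_seconds
        )
        if is_bounce:
            # Extend the previous visit over the bounce and the return visit.
            prev_cell, prev_start, _ = cleaned[-1]
            cleaned[-1] = (prev_cell, prev_start, visits[i + 1][2])
            i += 2
        else:
            cleaned.append((cell, start, end))
            i += 1
    return cleaned


def classify_visits(visits, min_stay_seconds=600):
    """Label each visit as a stay or a pass-by using a dwell-time threshold."""
    return [
        (cell, "stay" if end - start >= min_stay_seconds else "pass-by")
        for cell, start, end in visits
    ]


# Hypothetical trace for one phone; times are seconds since midnight.
trace = [("A", 0, 600), ("B", 600, 620), ("A", 620, 1500), ("C", 1500, 1560)]
cleaned = remove_ping_pong(trace)
labels = classify_visits(cleaned)
```

Here the 20-second hop to cell B is absorbed into one long visit to A, which is then labeled a stay, while the one-minute visit to C is labeled a pass-by.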

Clustering Algorithm using the DFP-Tree based on the MapReduce (맵리듀스 기반 DFP-Tree를 이용한 클러스터링 알고리즘)

  • Seo, Young-Won;Kim, Chang-soo
    • Journal of Internet Computing and Services / v.16 no.6 / pp.23-30 / 2015
  • As big data has become a prominent issue, many applications that operate on the results of data analysis have been developed; typical examples are product recommendation services in e-commerce systems, search services in search engines, and friend recommendation in social network services. In this paper, we suggest a decision frequent pattern tree that combines the original frequent pattern tree, an existing data mining technique that mines similar patterns appearing in a data set, with the decision tree from computer science theory. The decision frequent pattern tree algorithm addresses a problem of the frequent pattern tree, which generates so many patterns that the data become hard to analyze. We also propose a model for the MapReduce framework, a programming model that supports operation in a distributed environment.
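
For orientation, the sketch below builds an ordinary frequent pattern tree, with the item-counting pass written as separate map and reduce functions in the spirit of MapReduce. It is not the paper's decision frequent pattern tree, which the abstract does not specify, and it runs in a single process; the names `map_items`, `reduce_counts`, `Node`, and `build_fp_tree` are hypothetical.

```python
from collections import Counter
from itertools import chain


def map_items(transaction):
    """Map step: emit (item, 1) for each distinct item in a transaction."""
    return [(item, 1) for item in set(transaction)]


def reduce_counts(pairs):
    """Reduce step: sum the emitted counts per item."""
    totals = Counter()
    for item, count in pairs:
        totals[item] += count
    return totals


class Node:
    """One FP-tree node: an item, its path count, and its children by item."""

    def __init__(self, item):
        self.item = item
        self.count = 0
        self.children = {}


def build_fp_tree(transactions, min_support):
    """Build a plain FP-tree from the items that meet min_support."""
    counts = reduce_counts(chain.from_iterable(map_items(t) for t in transactions))
    frequent = {item for item, c in counts.items() if c >= min_support}
    root = Node(None)
    for transaction in transactions:
        # Keep frequent items, ordered by descending support (ties by name).
        items = sorted(set(transaction) & frequent, key=lambda it: (-counts[it], it))
        node = root
        for item in items:
            node = node.children.setdefault(item, Node(item))
            node.count += 1
    return root, counts


# Hypothetical shopping baskets.
baskets = [["a", "b"], ["a", "c"], ["a", "b", "d"]]
root, counts = build_fp_tree(baskets, min_support=2)
```

Shared prefixes collapse into a single path, so the three baskets above produce one branch, `a` followed by `b`, while the infrequent items `c` and `d` are dropped before insertion.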