• Title/Summary/Keyword: Data analysis

Big Data Analysis Using Principal Component Analysis (주성분 분석을 이용한 빅데이터 분석)

  • Lee, Seung-Joo
    • Journal of the Korean Institute of Intelligent Systems / v.25 no.6 / pp.592-599 / 2015
  • In a big data environment, we need a new approach to big data analysis, because the characteristics of big data, such as volume, variety, and velocity, make it possible to analyze the entire data set when making inferences about a population. Traditional statistical methods, however, focus on small data, that is, a random sample drawn from the population, so classical statistical analyses are not well suited to big data analysis. To solve this problem, we propose an approach to efficient big data analysis. In this paper, we consider big data analysis using principal component analysis, a popular method in multivariate statistics. To verify the performance of our approach, we carry out diverse simulation studies.
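
As an illustration of the principal component analysis mentioned in the abstract above, here is a minimal sketch in plain Python. It is not the author's method: the function name `first_principal_component`, the power-iteration solver, and the synthetic two-column data are all assumptions made for this example. It extracts only the first component and reports the share of total variance it explains.

```python
import math
import random


def first_principal_component(rows, iterations=200):
    """Return (unit direction, explained-variance ratio) of the first principal component."""
    n, d = len(rows), len(rows[0])
    # Center each column so the covariance is measured about the mean.
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    # Power iteration: repeated multiplication by the covariance matrix converges
    # to the eigenvector with the largest eigenvalue. Assumption: the start vector
    # is not orthogonal to that eigenvector (true for the demo data below).
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient gives the eigenvalue, i.e. the variance along v.
    eigenvalue = sum(v[i] * sum(cov[i][j] * v[j] for j in range(d)) for i in range(d))
    total_variance = sum(cov[i][i] for i in range(d))
    return v, eigenvalue / total_variance


# Hypothetical data: the second column is about twice the first, plus small noise.
random.seed(0)
rows = []
for _ in range(500):
    t = random.uniform(-1.0, 1.0)
    rows.append([t, 2.0 * t + random.gauss(0.0, 0.1)])

direction, ratio = first_principal_component(rows)
print("direction:", direction)
print("explained variance ratio:", ratio)
```

For real workloads a library routine based on singular value decomposition would normally replace the hand-written loop; the sketch only shows the idea of reducing many correlated variables to a dominant direction.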

Research on the Analysis System based on the Big Data for Matlab (Matlab을 활용한 빅데이터 기반 분석 시스템 연구)

  • Joo, Moon-il;Kim, Hee-cheol
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2016.10a / pp.96-98 / 2016
  • Recently, big data technology has developed because of rapid data generation, and tools for analyzing big data have been developed accordingly. Typical big data tools include R, Hive, and Tajo. However, Matlab-based data analysis is still commonly used, including for big data analysis. This paper studies a Matlab-based big data analysis system for analyzing vital signals.

Big Data Patent Analysis Using Social Network Analysis (키워드 네트워크 분석을 이용한 빅데이터 특허 분석)

  • Choi, Ju-Choel
    • Journal of the Korea Convergence Society / v.9 no.2 / pp.251-257 / 2018
  • As the use of big data becomes necessary for increasing business value, the big data market is growing. Accordingly, it is important to file competitive patent applications in order to gain a share of the big data market. In this study, we conducted a patent analysis based on a keyword network to analyze trends in big data patents. The analysis procedure consists of big data collection and preprocessing, network construction, and network analysis. The results are as follows. Most big data patents are related to data processing and analysis, and the keywords with high degree centrality and betweenness centrality are "analysis", "process", "information", "data", "prediction", "server", "service", and "construction". We expect that the results of this study will offer useful information for big data patent applications.
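
The network construction and degree centrality steps named in the abstract above can be sketched as follows. This is a simplified illustration, not the author's procedure: the helper names `build_keyword_network` and `degree_centrality` and the three keyword lists are hypothetical, keywords are linked whenever they co-occur in one document, and betweenness centrality is not computed here.

```python
from collections import defaultdict
from itertools import combinations


def build_keyword_network(documents):
    """Link two keywords whenever they appear together in the same document."""
    neighbors = defaultdict(set)
    for keywords in documents:
        # Deduplicate within a document, then connect every pair of keywords.
        for a, b in combinations(sorted(set(keywords)), 2):
            neighbors[a].add(b)
            neighbors[b].add(a)
    return neighbors


def degree_centrality(neighbors):
    """Degree of each node divided by (n - 1), the largest possible degree."""
    n = len(neighbors)
    if n < 2:
        return {node: 0.0 for node in neighbors}
    return {node: len(adjacent) / (n - 1) for node, adjacent in neighbors.items()}


# Hypothetical keyword lists, one per patent document.
patents = [
    ["data", "analysis", "server"],
    ["data", "prediction"],
    ["analysis", "service", "data"],
]

network = build_keyword_network(patents)
centrality = degree_centrality(network)
for keyword, score in sorted(centrality.items(), key=lambda item: -item[1]):
    print(keyword, score)
```

In this toy network "data" co-occurs with every other keyword, so it receives the maximum degree centrality of 1.0.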

Problems of Big Data Analysis Education and Their Solutions (빅데이터 분석 교육의 문제점과 개선 방안 -학생 과제 보고서를 중심으로)

  • Choi, Do-Sik
    • Journal of the Korea Convergence Society / v.8 no.12 / pp.265-274 / 2017
  • This paper examines the problems of big data analysis education and suggests ways to solve them. The characteristics of big data are evolving from V3 to V5, so big data analysis education must take V5 into account. Because increased uncertainty can increase the risk of data analysis, internal and external structured and semi-structured data, as well as disturbance factors, should be analyzed to improve the reliability of the data. When opinion mining is used, the errors that are easy to perceive concern variability and veracity. The veracity of the data can be increased when data analysis is performed against uncertain situations created by various variables and options. For association network analysis, students and researchers mainly use the node analysis of Textom (텍스톰) and NodeXL. Social network analysis should be able to produce meaningful results and predict the future by analyzing the current situation based on the dark data obtained.

Web-Based Data Analysis Service for Smart Farms (스마트팜을 위한 웹 기반 데이터 분석 서비스)

  • Jung, Jimin;Lee, Jihyun;Noh, Hyemin
    • KIPS Transactions on Software and Data Engineering / v.11 no.9 / pp.355-362 / 2022
  • The smart farm, which combines information and communication technologies with agriculture, is moving from simple monitoring of the growth environment toward discovering the optimal environment for crop growth and toward self-regulating agriculture. To this end, collecting related data is important, but it is more important that farmers with cultivation know-how analyze the collected data from various perspectives and derive useful information for regulating the crop growth environment. In this study, we developed a web service that lets farmers who want to obtain information from crop growth data analyze that data easily. The web-based data analysis service uses the R language for data analysis and the Express web application framework for Node.js. When we applied the service together with a growth environment monitoring system already in operation, we could perform the desired data analysis simply by uploading a CSV file or by entering raw data directly. We confirmed that a service provider could easily provide various data analysis services and could add a new one by adding a new R script.

A Big Data Preprocessing using Statistical Text Mining (통계적 텍스트 마이닝을 이용한 빅 데이터 전처리)

  • Jun, Sunghae
    • Journal of the Korean Institute of Intelligent Systems / v.25 no.5 / pp.470-476 / 2015
  • Big data has been used in diverse areas. For example, computer science and sociology differ in the issues they bring to big data, but both analyze big data and draw implications from the results. Meaningful analysis and interpretation of big data are therefore needed in most areas. Statistics and machine learning provide various methods for big data analysis. In this paper, we study a process for big data analysis and propose an efficient methodology covering the entire process, from collecting big data to drawing implications from the analysis results. In addition, because patent documents have the characteristics of big data, we propose an approach for applying big data analysis to patent data and use the results to build R&D strategy. To illustrate how to use the proposed methodology on a real problem, we perform a case study using applied and registered patent documents retrieved from patent databases around the world.

Data Literacy, Organizational Culture, and Data Analytics Maturity: Moderating Effect of Organizational Culture (데이터 리터러시와 데이터 분석 성숙도의 관계에서 조직문화의 조절효과)

  • Park, Chong-Nam;Cho, Yee-Un
    • Informatization Policy / v.28 no.1 / pp.43-63 / 2021
  • The purpose of this research is to examine the relationships among data literacy, organizational culture, and data analytics maturity and the moderating effects of organizational culture. Analysis of the relationship between data literacy and data analytics maturity shows that the higher the data literacy competency of employees, the higher the organization's data analytics maturity. In examining the relationship between organizational culture and data analytics maturity, it is found that relationship culture and innovation culture are positively related to data analytics maturity. In addition, relationship culture and hierarchy culture show significant moderating effects. Relationship culture shows a synergistic effect, whereas hierarchy culture has a buffer effect between data literacy and data analytics maturity.

Clustering and classification to characterize daily electricity demand (시간단위 전력사용량 시계열 패턴의 군집 및 분류분석)

  • Park, Dain;Yoon, Sanghoo
    • Journal of the Korean Data and Information Science Society / v.28 no.2 / pp.395-406 / 2017
  • The purpose of this study is to identify patterns of daily electricity demand through clustering and classification. The hourly data were collected by KPX (Korea Power Exchange) between 2008 and 2012. Because electricity demand data are time series data, the time trend was removed before the daily demand patterns were derived. We considered k-means clustering, Gaussian mixture model clustering, and functional clustering in order to find the optimal clustering method. Classification analysis was conducted to understand the relationship between the demand patterns and the external factors, namely day of the week, holidays, and weather. Data were divided into training data and test data. The training data consisted of the external factors and cluster labels from 2008 to 2011; the test data consisted of the daily external factors for 2012. Decision tree, random forest, support vector machine, and naive Bayes classifiers were used. As a result, Gaussian model-based clustering and random forest showed the best prediction performance when the number of clusters was 8.
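
Of the clustering methods listed in the abstract above, k-means is the simplest to show in a few lines. The sketch below is an assumption-laden illustration rather than the authors' implementation: the `kmeans` function, the seed, and the six four-value "daily profiles" are hypothetical, whereas the study worked with hourly demand data and also compared Gaussian mixture and functional clustering.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length profiles."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iterations=50, seed=0):
    """Plain Lloyd's algorithm: returns (centroids, label for each point)."""
    rng = random.Random(seed)
    # Start from k distinct observations chosen at random.
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iterations):
        # Assignment step: each profile joins its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members
        # (an empty cluster keeps its previous centroid).
        updated = [
            [sum(column) / len(members) for column in zip(*members)] if members
            else centroids[i]
            for i, members in enumerate(clusters)
        ]
        if updated == centroids:
            break
        centroids = updated
    labels = [min(range(k), key=lambda c: squared_distance(p, centroids[c]))
              for p in points]
    return centroids, labels


# Hypothetical daily load profiles (four readings per day instead of 24).
profiles = [
    [1.0, 3.0, 3.0, 2.0], [1.1, 3.1, 2.9, 2.0], [0.9, 2.9, 3.1, 2.1],  # high daytime load
    [1.0, 1.5, 1.5, 1.0], [1.1, 1.4, 1.6, 1.0], [0.9, 1.6, 1.4, 1.1],  # low daytime load
]

centroids, labels = kmeans(profiles, k=2)
print(labels)
```

The resulting cluster labels could then serve as the target of a classifier trained on external factors, which is the second stage the abstract describes.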

Text Mining and Visualization of Unstructured Data Using Big Data Analytical Tool R (빅데이터 분석 도구 R을 이용한 비정형 데이터 텍스트 마이닝과 시각화)

  • Nam, Soo-Tai;Shin, Seong-Yoon;Jin, Chan-Yong
    • Journal of the Korea Institute of Information and Communication Engineering / v.25 no.9 / pp.1199-1205 / 2021
  • In the era of big data, it is very important to effectively analyze not only structured data well organized in databases but also unstructured big data, such as web documents, e-mails, and social data generated in real time on the Internet, on social network services, and in mobile environments. Big data analysis is the process of creating new value by discovering meaningful new correlations, patterns, and trends in big data held in data storage. We summarize and visualize analysis results obtained through frequency analysis of unstructured article data using the R language, a big data analysis tool. The data analyzed in this study were a total of 104 papers published in Mon-May 2021 in the journals of the Korea Institute of Information and Communication Engineering. In the final analysis results, the most frequently mentioned keyword was "Data", which ranked first with 1,538 occurrences. Based on the results of the analysis, the limitations of the study and its theoretical implications are suggested.
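
The keyword frequency analysis described in the abstract above was done in R; the sketch below shows the same idea in Python for consistency with the other examples in this listing. The function `keyword_frequencies`, the stopword list, and the two sample sentences are assumptions for illustration and do not reproduce the study's data or its R scripts.

```python
import re
from collections import Counter

# Hypothetical stopword list; a real analysis would use a fuller one.
STOPWORDS = {"a", "an", "and", "for", "in", "is", "of", "the", "to"}


def keyword_frequencies(texts, stopwords=STOPWORDS):
    """Count lowercase alphabetic tokens across all texts, skipping stopwords."""
    counts = Counter()
    for text in texts:
        tokens = re.findall(r"[a-z]+", text.lower())
        counts.update(token for token in tokens if token not in stopwords)
    return counts


# Hypothetical stand-ins for article text.
articles = [
    "Big data analysis of sensor data",
    "Data mining and data visualization",
]

frequencies = keyword_frequencies(articles)
for keyword, count in frequencies.most_common(3):
    print(keyword, count)
```

The `Counter` returned here can be passed to any plotting or word-cloud tool for the visualization step.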

Automatic Generation of Issue Analysis Report Based on Social Big Data Mining (소셜 빅데이터 마이닝 기반 이슈 분석보고서 자동 생성)

  • Heo, Jeong;Lee, Chung Hee;Oh, Hyo Jung;Yoon, Yeo Chan;Kim, Hyun Ki;Jo, Yo Han;Ock, Cheol Young
    • KIPS Transactions on Software and Data Engineering / v.3 no.12 / pp.553-564 / 2014
  • In this paper, we propose a system for automatically generating issue analysis reports based on social big data mining, with the aim of resolving three problems of previous technologies for social media analysis and analytic report generation: the isolation of analysis, the subjectivity of experts, and the closure of information attributable to high prices. The system comprises natural language query analysis, issue analysis, social big data analysis, social big data correlation analysis, and automatic report generation. To evaluate the usefulness of the reports, we had two big data analysis experts rate them on a Likert scale. The results show that the reports are comparatively useful and reliable. Because of the low cost of report generation, the correlation analysis of social big data, and the objectivity of social big data analysis, the proposed system will help popularize social big data analysis.