• Title/Summary/Keyword: Big Data Analytics Process

Research Trends of Health Recommender Systems (HRS): Applying Citation Network Analysis and GraphSAGE (건강추천시스템(HRS) 연구 동향: 인용네트워크 분석과 GraphSAGE를 활용하여)

  • Haryeom Jang;Jeesoo You;Sung-Byung Yang
    • Journal of Intelligence and Information Systems / v.29 no.2 / pp.57-84 / 2023
  • With the development of information and communications technology (ICT) and big data technology, anyone can easily obtain and use vast amounts of data through the Internet. As a result, the ability to select high-quality data from a large pool of information is becoming more important than the ability to simply collect it. This trend extends to academia: literature reviews, both systematic and non-systematic, have been conducted in various research fields to build a sound knowledge structure by selecting high-quality studies from the accumulated literature. Meanwhile, since the COVID-19 pandemic, remote healthcare services, on which no consensus had been reached, have been permitted to a limited extent, and new healthcare services such as health recommender systems (HRS) equipped with artificial intelligence (AI) and big data technologies are in the spotlight. Although HRS are regarded in practice as one of the most important technologies for the future healthcare industry, literature reviews on HRS are relatively rare compared to other fields. In addition, although HRS is a convergence field with a strong interdisciplinary nature, prior literature reviews have mainly applied either systematic or non-systematic review methods, which limits their ability to analyze interactions or dynamic relationships with other research fields. Therefore, this study identified the overall network structure of HRS and its surrounding research fields using citation network analysis (CNA). In addition, the GraphSAGE algorithm was applied to address the problem that the latest papers are underestimated in citation relationships. As a result, this study identified 'recommender system', 'wireless & IoT', 'computer vision', and 'text mining' as increasingly important research fields related to HRS research, and confirmed that 'personalization' and 'privacy' are emerging issues in HRS research. The findings provide both academic and practical insights for identifying the structure of the HRS research community, examining related research trends, and designing future HRS research directions.
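The abstract's point about recent papers can be illustrated with a rough sketch. Citation counts alone give a new paper a score of zero, whereas a GraphSAGE-style step lets it borrow information from the papers it cites. The paper names, feature vectors, and helper functions below are hypothetical, and the sketch shows only a mean neighbor aggregation; an actual GraphSAGE layer would also sample neighbors and pass the concatenated vector through learned weights and a nonlinearity. It is not the authors' implementation.

```python
def in_degree(citations):
    """Count how often each paper is cited (citing paper -> list of cited papers)."""
    counts = {paper: 0 for paper in citations}
    for cited_list in citations.values():
        for cited in cited_list:
            counts[cited] = counts.get(cited, 0) + 1
    return counts


def mean_aggregate(features, neighbors):
    """Concatenate each node's own features with the mean of its neighbors' features.

    This is only the aggregation part of a GraphSAGE-style layer (no learned weights).
    """
    out = {}
    for node, own in features.items():
        nbrs = neighbors.get(node, [])
        if nbrs:
            mean = [sum(features[n][k] for n in nbrs) / len(nbrs) for k in range(len(own))]
        else:
            mean = [0.0] * len(own)
        out[node] = own + mean
    return out


# Hypothetical citation graph: the 2023 paper cites two older papers.
citations = {"new_2023": ["a_2015", "b_2018"], "b_2018": ["a_2015"], "a_2015": []}
# Hypothetical 2-dimensional topic features per paper.
features = {"a_2015": [1.0, 0.0], "b_2018": [0.0, 1.0], "new_2023": [0.5, 0.5]}

cited_counts = in_degree(citations)
embeddings = mean_aggregate(features, citations)
print(cited_counts)
print(embeddings["new_2023"])
```

Here the newest paper has no incoming citations, yet its aggregated vector still reflects the topics of the work it builds on.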

Clustering of Smart Meter Big Data Based on KNIME Analytic Platform (KNIME 분석 플랫폼 기반 스마트 미터 빅 데이터 클러스터링)

  • Kim, Yong-Gil;Moon, Kyung-Il
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.20 no.2 / pp.13-20 / 2020
  • One of the major issues surrounding big data is the availability of massive time-based or telemetry data. The appearance of low-cost capture and storage devices has made it possible to obtain very detailed time data for further analysis. Such data can be used to learn more about the underlying system or to predict future events with higher accuracy. In particular, it is very important to define custom-tailored contract offers for the many households and businesses that have smart meter records, and to predict future electricity usage in order to protect electricity companies from power shortages or surpluses. To make customized contract offers worthwhile, a few groups with common electricity usage behavior must be identified. This study suggests big data transformation as a side effect and a clustering technique to understand electricity usage patterns, using open smart meter data and KNIME, an open source platform for data analytics that provides a user-friendly graphical workbench for the entire analysis process. While the big data components are not open source, they are available for trial if required. After importing, cleaning, and transforming the smart meter big data, each meter's data can be interpreted in terms of electricity usage behavior through a dynamic time warping method.
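As a minimal sketch of the dynamic time warping idea mentioned in the abstract, the function below computes the classic DTW distance between two load profiles. The study itself works in KNIME; this Python version, the function name, and the hourly readings are assumptions made purely for illustration.

```python
from math import inf


def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) dynamic time warping with absolute-difference cost."""
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed predecessor alignments.
            cost[i][j] = step + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


# Hypothetical meter readings (kWh per period), not taken from the study's data.
morning_peak = [1, 3, 5, 3, 1, 1]
shifted_peak = [1, 1, 3, 5, 3, 1]   # same shape, one period later
evening_peak = [1, 1, 1, 1, 3, 5]   # usage still rising at the end

print(dtw_distance(morning_peak, shifted_peak))
print(dtw_distance(morning_peak, evening_peak))
```

Because DTW tolerates shifts in time, the two profiles with the same shape end up close together, which is the property that makes it useful as a distance for clustering usage behavior.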

A Study on Policy Priorities for Implementing Big Data Analytics in the Social Security Sector : Adopting AHP Methodology (AHP분석을 활용한 사회보장부문 빅 데이터 활용가능 영역 탐색 연구)

  • Ham, Young-Jin;Ahn, Chang-Won;Kim, Ki-Ho;Park, Gyu-Beom;Kim, Kyoung-June;Lee, Dae-Young;Park, Sun-Mi
    • Journal of Digital Convergence / v.12 no.8 / pp.49-60 / 2014
  • The primary purpose of this paper is to identify which issues are important in the social security sector and then, using the AHP methodology, to analyze which big data methodologies and projects can be implemented to solve these issues. To this end, the paper first identified 8 big data projects by reviewing issues across the social security sector, such as administrative work and social policies. The pairwise comparisons showed that policy validity is a more important factor than effectiveness and practicability. With regard to the priorities among the big data sub-projects, the project on preventing improper recipients emerged as the most important in terms of validity, effectiveness, and practicability. The results also showed that the project on outreach and reducing blind spots in the welfare sector is weighted as a significant project. The results of this paper, in particular the 8 big data sub-projects, will be useful to anyone interested in using big data and its methodologies for the social welfare sector.
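To make the pairwise-comparison step concrete, here is a small sketch of how AHP priority weights can be derived from a comparison matrix. It uses the geometric-mean approximation rather than the principal eigenvector, and the judgment values are hypothetical; the abstract does not report the actual matrix or the computation the authors used.

```python
from math import prod


def ahp_weights(matrix):
    """Approximate AHP priorities: normalized geometric mean of each row."""
    n = len(matrix)
    geo_means = [prod(row) ** (1.0 / n) for row in matrix]
    total = sum(geo_means)
    return [g / total for g in geo_means]


criteria = ["validity", "effectiveness", "practicability"]
# Hypothetical judgments: entry [i][j] says how much more important
# criterion i is than criterion j; entry [j][i] is its reciprocal.
pairwise = [
    [1.0, 2.0, 4.0],
    [0.5, 1.0, 2.0],
    [0.25, 0.5, 1.0],
]

weights = dict(zip(criteria, ahp_weights(pairwise)))
for name, weight in weights.items():
    print(f"{name}: {weight:.3f}")
```

A full AHP analysis would also check the consistency ratio of the judgments, which this sketch omits.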

Optimization Model for the Mixing Ratio of Coatings Based on the Design of Experiments Using Big Data Analysis (빅데이터 분석을 활용한 실험계획법 기반의 코팅제 배합비율 최적화 모형)

  • Noh, Seong Yeo;Kim, Young-Jin
    • KIPS Transactions on Computer and Communication Systems / v.3 no.10 / pp.383-392 / 2014
  • Coatings are one of the most popular and active research areas in the polymer industry, and they are growing more important in the electronics industry and in the medical and optical fields. In particular, technical requirements for the performance and accuracy of coatings are increasing with the development of automotive and electronic parts. In addition, the industry's need for more intelligent and automated systems is increasing with the introduction of the IoT and of big data analysis based on environmental and context information. In this paper, we propose an optimization model for coating formulation data objects based on the design of experiments, using Internet technologies and big data analytics. The optimal coating formulation is calculated through data analysis based on the experimental design, and the resulting error is corrected using the formulation data actually used at the production site, as modified by operators, together with the corrected result data. Furthermore, where previously only the existing coating formulation was applied as reference data, an optimization model was derived that corrects the reference values by leveraging big data analysis and Internet of Things technology, using manufacturing environment and context information to maintain color and quality, the most important factors. The data obtained from experiments and analysis improve the accuracy of the mixing data and make it possible to shorten the working hours per LOT. The data also shorten production time by reducing the delivery time per treatment, and can contribute to cost reduction through, for example, a lower defect rate. Further, standard data can be obtained for the manufacturing process of various models.

Auto Configuration Module for Logstash in Elasticsearch Ecosystem

  • Ahmed, Hammad;Park, Yoosang;Choi, Jongsun;Choi, Jaeyoung
    • Proceedings of the Korea Information Processing Society Conference / 2018.10a / pp.39-42 / 2018
  • Log analysis and monitoring are important in most systems. Log management is of core importance in distributed applications, cloud-based applications, and applications designed for big data. These applications produce a large number of log files that contain essential information, which can be used in log analytics to identify relevant patterns in varying log data. However, this requires tools for parsing, storing, and visualizing log information. "Elasticsearch, Logstash, and Kibana" (ELK Stack) is one of the most popular analysis tools for log management. Configuration files are key to ingesting log files, as they cover all the services needed to input, process, and output the log files. However, creating configuration files is sometimes very complicated and time consuming, as it requires domain expertise and manual work. In this paper, an auto configuration module for Logstash is proposed that automatically generates the configuration files for Logstash. The primary purpose of this paper is to provide a mechanism that generates the configuration files for the corresponding log files in less time. The proposed module aims to improve the overall efficiency of the log management system.
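For readers unfamiliar with Logstash pipelines, the sketch below shows what generating a configuration file from a few parameters might look like. It is not the module proposed in the paper: the function name, log path, host, and index name are hypothetical. The rendered text uses the standard `input`/`filter`/`output` sections with the `file`, `grok`, and `elasticsearch` plugins.

```python
def render_logstash_config(log_path, grok_pattern, es_host, index):
    """Render a minimal Logstash pipeline: file input, grok filter, Elasticsearch output."""
    return f"""input {{
  file {{
    path => "{log_path}"
    start_position => "beginning"
  }}
}}
filter {{
  grok {{
    match => {{ "message" => "{grok_pattern}" }}
  }}
}}
output {{
  elasticsearch {{
    hosts => ["{es_host}"]
    index => "{index}"
  }}
}}
"""


# Hypothetical values for an Apache access log.
config = render_logstash_config(
    log_path="/var/log/app/access.log",
    grok_pattern="%{COMBINEDAPACHELOG}",
    es_host="localhost:9200",
    index="app-logs",
)
print(config)
```

A real auto-configuration tool would additionally need to infer the grok pattern from sample log lines, which is the hard part this sketch leaves out.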

Investigating Factors Contributing to Inadequate Facility Safety Inspections and Diagnosis Services: A Machine Learning Approach (머신러닝 기반 시설물 안전 점검·진단용역 부실 판정 요인에 대한 연구)

  • Junyong Park;Chie Hoon Song
    • Journal of the Korean Society of Industry Convergence / v.27 no.4_2 / pp.897-908 / 2024
  • Evaluating the adequacy of facility safety inspection and diagnosis services performed by private enterprises is a time-consuming and administratively complex process. This study aims to analyze the determinants that could influence the rating of these safety inspection and diagnosis services using a data analytics approach. Through a comparative analysis of several machine learning algorithms suitable for multi-class classification, we selected the model with the best performance (Random Forest) and identified the main determinants using the permutation importance technique. Among the variables examined, "contract value," "days of service performed," and "adherence to fair market value" were found to be strongly correlated with the rating assessments. Furthermore, we discovered that the skills and expertise of service-performing personnel significantly impacted the rating. The results of this study can contribute to the enhancement of the current post-evaluation administrative processes and offer valuable insights into rating assessments by incorporating previously unexplored variables pertaining to both service providers and the services themselves.
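The permutation importance technique named in the abstract can be sketched in a few lines: shuffle one feature at a time and measure how much the model's accuracy drops. The toy classifier and data below are hypothetical stand-ins for the study's Random Forest and its variables. In practice a library routine such as scikit-learn's `sklearn.inspection.permutation_importance` would normally be used instead.

```python
import random


def accuracy(predict, rows, labels):
    """Fraction of rows whose predicted label matches the true label."""
    return sum(predict(r) == y for r, y in zip(rows, labels)) / len(rows)


def permutation_importance(predict, rows, labels, n_features, n_repeats=10, seed=0):
    """Mean accuracy drop when each feature column is shuffled independently."""
    rng = random.Random(seed)
    baseline = accuracy(predict, rows, labels)
    importances = []
    for j in range(n_features):
        drops = []
        for _ in range(n_repeats):
            column = [r[j] for r in rows]
            rng.shuffle(column)
            shuffled = [r[:j] + [v] + r[j + 1:] for r, v in zip(rows, column)]
            drops.append(baseline - accuracy(predict, shuffled, labels))
        importances.append(sum(drops) / n_repeats)
    return importances


def toy_model(row):
    """Hypothetical stand-in for a trained classifier; it looks only at feature 0."""
    return "adequate" if row[0] >= 50 else "inadequate"


# Feature 0 drives the label; feature 1 is unrelated noise.
rows = [[v, (v * 7) % 13] for v in range(0, 100, 5)]
labels = [toy_model(r) for r in rows]
importances = permutation_importance(toy_model, rows, labels, n_features=2)
print(importances)
```

Shuffling the feature the model ignores leaves accuracy unchanged, so its importance is zero, while shuffling the informative feature causes a clear drop.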

The Correlation between Social Media and the Behaviors of the Supreme Court in Korea (소셜미디어와 대법원 판결의 상관 관계에 대한 분석)

  • Heo, Junhong;Seo, Yeeun;Lee, Seoyeong;Lee, Sang-Yong Tom
    • Knowledge Management Research / v.22 no.3 / pp.31-53 / 2021
  • As a communication channel for individuals, social media is affecting various areas such as business, the economy, politics, and society. One of the less-studied areas is the law. Therefore, this study collected various information from social media and analyzed its impact on legal decisions, especially Supreme Court decisions in Korea. The study was conducted by compiling information from Internet news articles and public responses. We found that when negative reactions from the public increased, the time taken for the Supreme Court to reach its final decision became shorter. However, we were not able to find a significant relationship between social media reactions and the dismissal of an appeal or annulment. Our study contributes to information systems and knowledge management research in that social analytics is applied to the area of legal decisions, instead of a conventional qualitative methodology. Our study is also meaningful to practitioners because big data analytics businesses can be applied to the field of law by creating a new database for emerging legal technology. Finally, lawmakers can consider better ways to standardize the legal decision process to minimize adverse effects from social media.

A Study on Establishing a Market Entry Strategy for the Satellite Industry Using Future Signal Detection Techniques (미래신호 탐지 기법을 활용한 위성산업 시장의 진입 전략 수립 연구)

  • Sehyoung Kim;Jaehyeong Park;Hansol Lee;Juyoung Kang
    • Journal of Intelligence and Information Systems / v.29 no.3 / pp.249-265 / 2023
  • Recently, the satellite industry has been paying attention to the private-led 'New Space' paradigm, a departure from the traditional government-led industry. The space industry, which is regarded as a next-generation growth industry, still receives relatively little attention in Korea compared to the global market. Therefore, the purpose of this study is to explore future signals that can help determine the market entry strategies of private companies in the domestic satellite industry. To this end, this study draws on future signal theory and the Keyword Portfolio Map method to analyze keyword potential in patent document data based on keyword growth rate and keyword occurrence frequency. In addition, news data was collected to categorize future signals into first symptom and early information, respectively. This serves as an interpretive indicator of how the keywords reveal their actual potential outside of patent documents. This study describes the process of data collection and analysis used to explore future signals, and traces the evolution of each keyword in the collected documents from a weak signal to a strong signal by visualizing it on keyword maps. The research process can contribute methodologically to, and expand the scope of, existing research on future signals, and the results can contribute to the establishment of new industry planning and research directions in the satellite industry.
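A keyword map of this kind places each keyword by its occurrence frequency and growth rate and reads the quadrant as a signal type. The sketch below assumes a median split on both axes and uses quadrant names common in the weak-signal literature; the keywords, numbers, thresholds, and labels are all hypothetical and may differ from the paper's actual procedure.

```python
from statistics import median


def classify_keywords(stats):
    """Assign each keyword to a quadrant; stats maps keyword -> (frequency, growth rate)."""
    freq_cut = median(f for f, _ in stats.values())
    growth_cut = median(g for _, g in stats.values())
    labels = {}
    for keyword, (freq, growth) in stats.items():
        high_freq = freq > freq_cut
        high_growth = growth > growth_cut
        if high_freq and high_growth:
            labels[keyword] = "strong signal"
        elif high_growth:
            labels[keyword] = "weak signal"
        elif high_freq:
            labels[keyword] = "well-known but not strong signal"
        else:
            labels[keyword] = "latent signal"
    return labels


# Hypothetical patent keyword statistics: (average frequency, average growth rate).
keyword_stats = {
    "launch vehicle": (120, 0.05),
    "satellite internet": (90, 0.40),
    "in-orbit servicing": (10, 0.55),
    "ground antenna": (15, 0.02),
}

signal_labels = classify_keywords(keyword_stats)
for keyword, label in signal_labels.items():
    print(f"{keyword}: {label}")
```

Low-frequency, fast-growing keywords are the ones flagged as weak signals, that is, candidates that may later become strong signals.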

Genetic Programming based Manufacturing Big Data Analytics (유전 프로그래밍을 활용한 제조 빅데이터 분석 방법 연구)

  • Oh, Sanghoun;Ahn, Chang Wook
    • Smart Media Journal / v.9 no.3 / pp.31-40 / 2020
  • Currently, black-box machine learning algorithms are used to analyze big data in manufacturing. These algorithms offer high analytical consistency, but their results are difficult to interpret. In the manufacturing industry, however, it is important to verify the basis of the results and the validity of the derived analysis algorithms through analysis grounded in manufacturing process principles. To overcome the limited explanatory power of such machine learning algorithms, we propose a manufacturing big data analysis method using genetic programming. Genetic programming is a well-known evolutionary algorithm that repeatedly applies evolutionary operators such as selection, crossover, and mutation, which mimic biological evolution, to find the optimal solution. The solution is expressed as a relationship between variables using mathematical symbols, and the solution with the highest explanatory power is finally selected. Because the relations between input and output variables are derived as formulas, the manufacturing mechanism can be interpreted intuitively, and manufacturing principles that could not previously be interpreted can also be derived from the relationships between variables that the formulas represent. In a comparative performance analysis, the proposed technique performed as well as or better than a typical machine learning algorithm. The results also confirmed the technique's potential for use in various manufacturing fields.

Predicting the Number of People for Meals of an Institutional Foodservice by Applying Machine Learning Methods: S City Hall Case (기계학습방법을 활용한 대형 집단급식소의 식수 예측: S시청 구내직원식당의 실데이터를 기반으로)

  • Jeon, Jongshik;Park, Eunju;Kwon, Ohbyung
    • Journal of the Korean Dietetic Association / v.25 no.1 / pp.44-58 / 2019
  • Predicting the number of meals in a foodservice organization is an important decision-making process that is essential for successful food production, for example by reducing the amount of residue, preventing deterioration in menu quality, and preventing rising costs. Compared to other demand forecasts, meal forecasting is very difficult because menus are diverse and include a range of side dishes, and because many factors other than the menu also affect demand. Therefore, the purpose of this study was to establish a method for predicting the number of meals that includes predictive modeling and considers the various factors, in addition to menus, that are actually used in the field. For this purpose, 63 variables in eight categories were identified as decision variables: the daily available number of people for the meals, the number of people in the time series, daily menu details, weekdays or seasons, days before or after holidays, weather and temperature, holidays or year-end, and events. An ensemble model using six prediction models was then constructed to predict the number of meals. As a result, the prediction error rate was reduced from 10~11% to approximately 6~7%, which was expected to reduce the residual amount by approximately 40%.
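The ensemble idea can be sketched as follows. Simple averaging is assumed here, since the abstract does not state how the six models were combined, and only two hypothetical base models are shown. Predictions are averaged and the error rate is measured as mean absolute percentage error (MAPE). All figures are invented, and the two models are deliberately chosen to err in opposite directions; averaging is not guaranteed to beat every base model in general.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    return 100.0 * sum(abs(a - p) / a for a, p in zip(actual, predicted)) / len(actual)


def mean_ensemble(predictions):
    """Average the per-day predictions of several models."""
    return [sum(values) / len(values) for values in zip(*predictions)]


# Hypothetical daily meal counts and two hypothetical base models.
actual = [500, 520, 480, 450]
model_a = [550, 560, 530, 500]   # tends to overestimate
model_b = [460, 470, 440, 410]   # tends to underestimate

ensemble = mean_ensemble([model_a, model_b])
print(f"model A error:  {mape(actual, model_a):.1f}%")
print(f"model B error:  {mape(actual, model_b):.1f}%")
print(f"ensemble error: {mape(actual, ensemble):.1f}%")
```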