• Title/Summary/Keyword: Analytics Results

Techniques to Guarantee Real-Time Fault Recovery in Spark Streaming Based Cloud System (Spark Streaming 기반 클라우드 시스템에서 실시간 고장 복구를 지원하기 위한 기법들)

  • Kim, Jungho;Park, Daedong;Kim, Sangwook;Moon, Yongshik;Hong, Seongsoo
    • Journal of KIISE / v.44 no.5 / pp.460-468 / 2017
  • In a real-time cloud environment, the data analysis framework plays a pivotal role. Among existing frameworks, Spark Streaming meets most real-time requirements. However, it does not meet the requirement of second-scale real-time fault recovery. Its fault recovery time increases in proportion to the length of the transformation history, called the lineage, because it recovers the last state data from the cumulative lineage recorded during normal operation. Fault recovery time is therefore unbounded. In addition, second-scale fault recovery is impossible because reading the initial state data from fault-tolerant storage takes tens of seconds. In this paper, we propose two techniques to solve these problems and apply them to Spark Streaming 1.6.2. Experimental results show that the fault recovery time is bounded and that the average fault recovery time is reduced by up to 41.57%.
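
The relationship described in this abstract, recovery time growing with lineage length on top of a fixed cost for reading the initial state, can be illustrated with a toy cost model. This is only a sketch under stated assumptions: the function names, the cost figures, and the periodic checkpoint used to bound the lineage are all hypothetical, and the abstract does not say what the paper's two techniques are.

```python
def recovery_time(lineage_length, per_step_cost, state_read_cost):
    """Toy cost model: reload the initial state, then replay every
    transformation recorded in the lineage."""
    return state_read_cost + lineage_length * per_step_cost


def lineage_length(batches_since_start, checkpoint_interval=None):
    """Number of transformations to replay after a failure.

    Without checkpoints the lineage grows with every batch. With a
    hypothetical periodic checkpoint, only the batches processed since
    the last checkpoint must be replayed.
    """
    if checkpoint_interval is None:
        return batches_since_start
    return batches_since_start % checkpoint_interval


# Assumed costs, invented for illustration: 0.05 s per replayed step and
# 20 s to read the initial state from fault-tolerant storage.
for batches in (10, 130, 1000):
    unbounded = recovery_time(lineage_length(batches), 0.05, 20.0)
    bounded = recovery_time(lineage_length(batches, 50), 0.05, 20.0)
    print(batches, unbounded, bounded)
```

In this model the unbounded case grows linearly with the number of batches, while the bounded case never replays more than 49 steps. The fixed state-read cost remains in both cases, which is the second problem the abstract points out.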

Problems of Big Data Analysis Education and Their Solutions (빅데이터 분석 교육의 문제점과 개선 방안 -학생 과제 보고서를 중심으로)

  • Choi, Do-Sik
    • Journal of the Korea Convergence Society / v.8 no.12 / pp.265-274 / 2017
  • This paper examines the problems of big data analysis education and suggests ways to solve them. The characteristics of big data are evolving from V3 to V5, so big data analysis education must take V5 into account. Because increased uncertainty can increase the risk of data analysis, internal and external structured and semi-structured data, as well as disturbance factors, should be analyzed to improve the reliability of the data. When opinion mining is used, the errors that are easy to perceive concern variability and veracity. The veracity of the data can be increased when data analysis is performed against uncertain situations created by various variables and options. For association network analysis, students and researchers mainly use the node analysis of textom (텍스톰) and NodeXL. Social network analysis should yield meaningful results and predict the future by analyzing the current situation based on the dark data obtained.

Analysis of Adverse Drug Reaction Reports using Text Mining (텍스트마이닝을 이용한 약물유해반응 보고자료 분석)

  • Kim, Hyon Hee;Rhew, Kiyon
    • Korean Journal of Clinical Pharmacy / v.27 no.4 / pp.221-227 / 2017
  • Background: As the personalized healthcare industry has attracted much attention, big data analysis of healthcare data has become essential. Much healthcare data, such as product labeling, biomedical literature, and social media data, is unstructured, so extracting meaningful information from unstructured text is becoming important. In particular, text mining of adverse drug reaction (ADR) reports can provide signal information to predict and detect adverse drug reactions. There has been no study on text analysis of expert opinion in the Korea Adverse Event Reporting System (KAERS) database in Korea. Methods: The expert opinion text of the KAERS database provided by the Korea Institute of Drug Safety & Risk Management (KIDS-KD) was analyzed. Word frequency analysis was performed to understand the whole text, and TF-IDF weight analysis was performed to find important keywords in the text. Keywords related to the important keywords were also presented by calculating correlation coefficients. Results: Of a total of 90,522 reports, 120 insulin ADR reports and 858 tramadol ADR reports were analyzed. As the most frequently used keywords, ADRs such as dizziness, headache, vomiting, dyspepsia, and shock were ranked in that order in the insulin data, while ADR symptoms such as vomiting, 어지러움 (Korean for dizziness), dizziness, dyspepsia, and constipation were ranked in that order in the tramadol data. Conclusion: Using text mining of the expert opinion in KIDS-KD, frequently mentioned ADRs and medications are easily recovered. Text mining in ADR research can play an important role in detecting signal information and predicting ADRs.
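
The TF-IDF weighting mentioned in the Methods can be sketched as follows. This is a minimal illustration of one common TF-IDF variant, not the authors' implementation: the abstract does not state the exact formula or tooling used, and the `tf_idf` helper and the report snippets below are hypothetical.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} dict per document.

    tf  = term count / number of terms in the document
    idf = log(N / number of documents containing the term)
    """
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        weights.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical report snippets, invented for illustration only.
reports = [
    "patient reported dizziness and vomiting",
    "patient reported headache",
    "patient reported vomiting and constipation",
]
for doc_weights in tf_idf(reports):
    top = sorted(doc_weights.items(), key=lambda kv: kv[1], reverse=True)[:2]
    print(top)
```

Terms that occur in every report, such as "patient" here, receive a weight of zero, which is why TF-IDF surfaces different keywords than plain word frequency does.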

Applying the AHP approach to evaluate Mobile Commerce Environment (AHP 기법을 이용한 모바일 상거래 환경의 평가)

  • Oh, Gi-Oug
    • Journal of the Korea Institute of Information and Communication Engineering / v.10 no.12 / pp.2308-2314 / 2006
  • Internet technology has allowed general commerce to proceed actively, and it is now rapidly evolving into Mobile Commerce. Mobile Commerce inherits from Electronic Commerce, though it also differs from it in various features. Many studies have analyzed the features of Electronic Commerce, but each was limited to its own position and field. This study extracted and analyzed factors that consider several positions in Mobile Commerce so that it can reflect the diverse demands of those involved. It sought new characteristics that differ from those of Electronic Commerce and took account of success factors for Mobile Commerce from the positions of the user, the developer, and the operator. In addition, AHP (Analytic Hierarchy Process) was used to evaluate the characteristics relevant to each position by more objective methods. The analysis results suggest that factors such as quality trust and understandability for an information duality may be among the chief elements of success in Mobile Commerce at present.
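
The AHP step can be illustrated by deriving priority weights from a pairwise comparison matrix. The sketch below uses the row geometric mean method, a common approximation; the abstract does not say how the author derived the weights, and the `ahp_weights` helper, the criteria names, and the judgments are hypothetical.

```python
import math


def ahp_weights(pairwise):
    """Approximate AHP priority weights with the row geometric mean method.

    pairwise[i][j] states how much more important criterion i is than
    criterion j; a reciprocal matrix has pairwise[j][i] == 1 / pairwise[i][j].
    """
    n = len(pairwise)
    geo_means = [math.prod(row) ** (1.0 / n) for row in pairwise]
    total = sum(geo_means)
    # Normalize so the weights sum to 1.
    return [g / total for g in geo_means]


# Hypothetical criteria and judgments, invented for illustration only.
criteria = ["trust", "understandability", "speed"]
judgments = [
    [1.0, 2.0, 4.0],
    [1 / 2, 1.0, 2.0],
    [1 / 4, 1 / 2, 1.0],
]
for name, weight in zip(criteria, ahp_weights(judgments)):
    print(f"{name}: {weight:.3f}")
```

A full AHP study would also check the consistency of the judgments before trusting the weights; that step is omitted here for brevity.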

Fire Accident Analysis of Hazardous Materials Using Data Analytics (Data Analytics를 활용한 위험물 화재사고 분석)

  • Shin, Eun-Ji;Koh, Moon-Soo;Shin, Dongil
    • Journal of the Korean Institute of Gas / v.24 no.5 / pp.47-55 / 2020
  • Hazardous materials accidents are not limited to leakage of the material; if the early response is inappropriate, they can lead to a fire or an explosion, which increases the scale of the damage. However, even as the 4th industrial revolution and the rise of the big data era are being discussed, systematic analysis of hazardous materials accidents based on new techniques has not been attempted, and only simple statistics are being collected. In this study, we use machine learning to systematically analyze the fire accident data for the past 11 years (2008-2018) accumulated by the National Fire Service. The analysis results are visualized and presented through text mining analysis, and the possibility of developing a damage-scale prediction model is explored by applying regression analysis to the main factors present in the hazardous materials fire accident data.

Improved Long-term Survival with Contralateral Prophylactic Mastectomy among Young Women

  • Zeichner, Simon Blechman;Ruiz, Ana Lourdes;Markward, Nathan Joseph;Rodriguez, Estelamari
    • Asian Pacific Journal of Cancer Prevention / v.15 no.3 / pp.1155-1162 / 2014
  • Background: Despite mixed survival data, the utilization of contralateral prophylactic mastectomy (CPM) for the prevention of contralateral breast cancer (CBC) has increased significantly over the last 15 years, especially among women younger than 40. We set out to look at our own experience with CPM, focusing on outcomes in women younger than 40, the sub-population with the highest cumulative lifetime risk of developing CBC. With an extended follow-up, we hoped to demonstrate differences in long-term disease-free survival (DFS) and overall survival (OS) between those who underwent the procedure (CPM) and those who did not (NCPM). Materials and Methods: We performed a retrospective review of all breast cancer patients younger than 40 diagnosed at Mount Sinai Medical Center between January 1, 1980 and December 31, 2010 (n=481). Among these patients, 42 were identified as having undergone CPM, while 195 were confirmed as being CPM-free during the observation period. Univariate and multivariate analyses were performed. Results: The CPM group had a significantly higher percentage of patients who were diagnosed between 2000 and 2010 (95.2% vs 40%, p=0.0001). The CPM group had significantly smaller tumors (0-2 cm: 41.7% vs 24.8%, p=0.04). Among the entire group of patients, the overall five- and 10-year DFS were 81.3% and 73.3%, respectively. CPM was significantly associated [HR 2.35 (1.02, 5.41); p=0.046] with 10-year OS, although a similar effect was not observed for five-year OS. Conclusions: We found that CPM has increased dramatically over the last 15 years, especially among white women with locally advanced disease. In patients younger than 40, who are thought to be at greatest cumulative risk of secondary breast cancer, CPM provided an OS advantage regardless of genetics, tumor, or patient characteristics, although this advantage was seen only after 10 years of follow-up.

The National "Smoking Cessation Clinics" Program in the Republic of Korea: Socioeconomic Status and Age Matter

  • Kim, Hyoshin;Oh, Jin-Kyoung;Lim, Min Kyung;Jeong, Bo Yoon;Yun, E Hwa;Park, Eun Young
    • Asian Pacific Journal of Cancer Prevention / v.14 no.11 / pp.6919-6924 / 2013
  • Background: Between 1998 and 2009, South Korea made significant progress in reducing the male smoking rate, from 66.3% to 46.9%. As part of a significant government effort in smoking cessation intervention, the Korean government implemented the national "Smoking Cessation Clinics (SCC)" program in 2004. Materials and Methods: Data covered 804,334 adult male smokers participating in the SCC program at 253 public health centers between 2006 and 2009. We examined participant cessation rates in the SCC program, participant characteristics, and program intervention components, using health insurance status as a socioeconomic status (SES) indicator. Multivariate logistic regression analyses were performed, correcting for intra-class correlations within public health centers. Results: The overall 6-month quit rate was high (46.8%). Higher odds of smoking cessation were associated with more behavioral counseling sessions, but not with nicotine replacement therapy (NRT). Cessation rates were lower for Medicaid participants than for regular health insurance participants. Disadvantaged younger smokers were less likely to participate in the program. Older smokers were more likely to quit regardless of SES. Stress was cited as a major reason for failure. Conclusions: SES inequalities across different age groups exist in smoking cessation among Korean adult male smokers. There is a need for intervention programs specifically targeting SES sub-populations in different age groups.

A Study on the Performance Evaluation of Machine Learning for Predicting the Number of Movie Audiences (영화 관객 수 예측을 위한 기계학습 기법의 성능 평가 연구)

  • Jeong, Chan-Mi;Min, Daiki
    • The Journal of Society for e-Business Studies / v.25 no.2 / pp.49-63 / 2020
  • Accurate early-stage prediction of box office performance is crucial for the film industry to make better managerial decisions. To improve prediction performance, this paper evaluates the use of machine learning methods. We tested both classification- and regression-based methods, including k-NN, SVM, and Random Forest. We first evaluate the input variables, which shows that reputation-related information generated during the first two weeks after release is significant. Prediction test results show that regression-based methods yield lower prediction error and that Random Forest in particular outperforms the other machine learning methods. Regression-based methods have better predictive power when films have small box office earnings. Classification-based methods, on the other hand, work better for predicting large box office earnings.

A Case Study on Energy focused Smart City, London of the UK: Based on the Framework of 'Business Model Innovation'

  • Song, Minzheong
    • International journal of advanced smart convergence / v.9 no.2 / pp.8-19 / 2020
  • We look at the evolution of the UK's energy-focused smart city along with the "Smart London Plan (SLP)" project. The theoretical logic of business model innovation is discussed, and a research framework for the evolving energy-focused smart city is formulated. The starting point is the silo system. In the second stage, private investment in smart meters lays a foundation for the next stages. As a result, the UK's smart energy sector has evolved from smart meter installation through the smart grid to new business models such as the water-energy nexus and the microgrid. Before the government's smart meter installation, the electricity system was centralized. However, after a consumer engagement plan was set up to help consumers understand the benefits they can secure through smart meters, customer behavior changed. The data analytics firm enables a greater understanding of consumer behavior and helps the energy industry become smart by controlling, securing, and using that data to improve the energy system. In the third stage, distribution network operators (DNOs) are allowed access to smart meter data and segmentation starts. In the fourth stage, collaboration between Ofwat and Ofgem makes it possible to eliminate unnecessary duplication of work and reduce conflicts of interest between water and electricity. In the fifth stage, the smart meter and grid are integrated as an "adaptive" system, and the transition from DNO to DSO is accomplished for integrated operation. The microgrid is a prototype for an "adaptive" smart grid. The previous steps enable London to achieve platform leadership in supporting the increasing electrification of the heating and transport sectors and the smart home.

A Future Prospect for Change in each Step of Six Sigma DMAIC under the 4th Industrial Revolution (4차 산업혁명 하에서의 6 시그마 DMAIC 단계별 변화에 대한 전망)

  • Kwon, Hyuck Moo;Hong, Sung Hoon;Lee, Min Koo
    • Journal of Korean Society for Quality Management / v.46 no.1 / pp.1-10 / 2018
  • Purpose: This paper offers a view of the prospects for change in each step of the six sigma DMAIC project under the environment of the 4th industrial revolution. Methods: First, the purpose and activities required in each step of DMAIC are reviewed. Next, the activities are reviewed together with tools and techniques, considering the purpose and the environmental changes of the 4th industrial revolution. Finally, the best approaches for achieving the purpose are projected to give an idea of future change. Results: The purpose of each phase of DMAIC is expected to remain unchanged, but activities, techniques, and methods will be replaced with more effective and efficient ones. Also, many activities may be executed by a system instead of people such as BB, GB, or team members. Moreover, DMAIC may no longer be a project but a routine job of the system in the future. Conclusion: Under the environment of the 4th industrial revolution, many activities, including analyzing various types of data and extracting valuable information, will be executed by a system with proper algorithms instead of people. Six sigma improvement projects may become intrinsic parts of the system and may no longer exist as separate projects.