• Title/Abstract/Keywords: Classification and Prediction

1,110 results (processing time: 0.027 s)

Response Modeling for the Marketing Promotion with Weighted Case Based Reasoning Under Imbalanced Data Distribution

  • 김은미;홍태호
    • 지능정보연구
    • /
    • Vol. 21, No. 1
    • /
    • pp.29-45
    • /
    • 2015
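  • A customer response prediction model helps select the target customers for a marketing promotion effectively and thereby maximizes the effect of the promotion. In today's big data environment, such models are built with data mining techniques, and this study proposes a customer response prediction model based on case-based reasoning (CBR). CBR-based prediction models are generally known to perform worse than other artificial intelligence techniques, but their predictive performance can be improved by weighting the input variables differently according to their importance. In this study, input variable weights were computed according to how strongly each variable influences whether a customer responds to a promotion, and the resulting model was compared with a model that applies equal weights. The customer response prediction model was developed with bath detergent sales data, and the weights were derived from the coefficients of a logit model according to the importance of each input variable. The empirical analysis showed that the model weighted by variable importance achieved higher predictive performance than the equally weighted model. In addition, most real-world classification problems, such as customer response prediction, involve imbalanced data in which the numbers of cases in the two classes differ markedly. Such imbalance degrades the performance of machine learning algorithms, so the study verified whether the proposed Weighted CBR can be applied stably in an imbalanced environment. When predictive performance was compared over 100 repetitions in an imbalanced setting in which 100 cases were randomly drawn from the full data set, the proposed Weighted CBR showed consistently superior performance.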
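The weighting idea in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes that CBR retrieval is a k-nearest-neighbor lookup with a weighted Euclidean distance, that weights are the normalized absolute logit coefficients, and that the logit model is fitted by plain gradient descent. All function names and parameters (`fit_logit`, `weights_from_coef`, `weighted_cbr_predict`, `k`) are hypothetical.

```python
import numpy as np

def fit_logit(X, y, lr=0.1, steps=2000):
    """Fit a logistic regression by gradient descent; returns (coef, intercept)."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))   # predicted response probability
        w -= lr * (X.T @ (p - y)) / len(y)        # gradient of the mean log-loss
        b -= lr * np.mean(p - y)
    return w, b

def weights_from_coef(coef):
    """Assumed weighting rule: normalized absolute logit coefficients."""
    a = np.abs(coef)
    return a / a.sum()

def weighted_cbr_predict(X_cases, y_cases, query, weights, k=3):
    """Retrieve the k most similar stored cases under a weighted Euclidean
    distance and predict the majority response (1 = responds, 0 = does not)."""
    d = np.sqrt((((X_cases - query) ** 2) * weights).sum(axis=1))
    nearest = np.argsort(d)[:k]
    return int(y_cases[nearest].mean() >= 0.5)
```

Passing a uniform vector such as `np.full(n_features, 1 / n_features)` as `weights` gives the equally weighted baseline that the abstract compares against.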

Development of a Safety and Health Expense Prediction Model in the Construction Industry: Focused on General Construction (갑) Projects Costing Less Than 5 Billion KRW

  • 염동준;이미영;오세욱;한승우;김영석
    • 한국건설관리학회논문집
    • /
    • Vol. 16, No. 6
    • /
    • pp.63-72
    • /
    • 2015
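  • As construction projects have recently become taller, larger, and more complex, securing and using an adequate level of occupational safety and health management expenses in the construction industry has become increasingly important. However, the current rates for calculating these expenses are presented uniformly, classified only by the type and scale of construction, and therefore cannot reflect the environment and characteristics of each construction project. The purpose of this study is thus to develop a prediction model for construction occupational safety and health management expenses that takes project-specific construction environments and characteristics into account when the expenses are calculated. To this end, the study surveyed site conditions and the actual use of safety and health management expenses at general construction (갑) sites costing less than 5 billion KRW and applied multiple regression analysis, a statistical technique, to the collected data. For the validation group, the prediction model showed a lower error rate (4.38%) than the error rate obtained with the existing rates (18.48%), indicating higher prediction accuracy than the existing method. Using the developed model is expected to secure more realistic safety and health management expenses that reflect each project's construction environment and characteristics. Securing such an adequate level of expenses is expected not only to provide construction projects with high-quality safety management but also to help minimize construction accidents and reduce the construction accident rate. This is an early-stage study toward such a prediction model and was limited to general construction (갑) projects under 5 billion KRW; a model usable across the construction industry needs to be developed through additional data collection.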
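The comparison described in this abstract, a multiple regression model evaluated by its error rate on a validation group, can be sketched as below. The abstract does not define the error rate, so the mean absolute percentage error used here is an assumption, and the function names are hypothetical. The sketch uses ordinary least squares via `numpy.linalg.lstsq`.

```python
import numpy as np

def fit_multiple_regression(X, y):
    """Ordinary least squares with an intercept; returns [b0, b1, ..., bp]."""
    A = np.column_stack([np.ones(len(X)), X])   # prepend the intercept column
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return beta

def predict(beta, X):
    """Predicted expense for each row of project characteristics in X."""
    return beta[0] + X @ beta[1:]

def mean_abs_pct_error(y_true, y_pred):
    """Assumed error-rate definition: mean absolute percentage error, in %."""
    return float(np.mean(np.abs((y_true - y_pred) / y_true)) * 100.0)
```

In use, the model would be fitted on a training group of sites and `mean_abs_pct_error` computed on a held-out validation group, once for the regression predictions and once for expenses calculated with the fixed rates.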

Development of 1ST-Model for 1 hour-heavy rain damage scale prediction based on AI models

  • 이준학;이하늘;강나래;황석환;김형수;김수전
    • 한국수자원학회논문집
    • /
    • Vol. 56, No. 5
    • /
    • pp.311-323
    • /
    • 2023
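  • To mitigate disasters such as torrential rain, floods, and urban inundation, it is important to determine in advance whether damage from natural hazards will occur. In Korea, the Korea Meteorological Administration currently issues heavy rain advisories and heavy rain warnings, but these apply a uniform nationwide criterion, so damage from heavy rain cannot be clearly recognized in advance. This study therefore reset the uniform criterion to heavy rain warning criteria that reflect regional characteristics and sought to predict the scale of damage that rainfall may cause one hour later. Gyeonggi-do, where heavy rain damage has occurred most frequently, was selected as the study area, and hourly disaster-triggering rainfall that accounts for regional characteristics was established from rainfall and heavy rain damage cost data. To develop a model that predicts whether rainfall will cause heavy rain damage, disaster-triggering rainfall and rainfall data were used, and two machine learning techniques, a decision tree model and a random forest model, were applied and compared. To predict rainfall one hour ahead, long short-term memory (LSTM) and deep neural network models were applied and compared. Finally, the rainfall predicted by the forecasting model was fed into the trained classification model to predict whether damage of each scale would occur from heavy rain one hour later, and this combination was defined as the 1ST-Model. Using the 1ST-Model developed in this study for preventive and preparedness-oriented disaster management is expected to help reduce damage from heavy rain.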

Using ICT in the HEIs in the Study of the Philological Sciences

  • Iryna, Kominiarska;Roman, Dubrovskyi;Inna, Volianiuk;Natalya, Yanus;Oleksandr, Hryshchenko
    • International Journal of Computer Science & Network Security
    • /
    • Vol. 22, No. 5
    • /
    • pp.31-38
    • /
    • 2022
  • The article highlights the educational potential of information and communication technologies (ICT) in the study of philological disciplines in higher education institutions (HEIs). The study aims to analyze the didactic potential of ICT in the study of philological disciplines and to test the hypothesis that using ICT in HEIs will intensify the learning process and enhance its effectiveness. To test the hypothesis, an experiment was carried out, and its results are illustrated in the article. The study used theoretical and empirical methods: analysis, synthesis, generalization, and systematization of pedagogical and scientific-methodological literature, to clarify the state of research on the problem and to identify the pedagogical foundations on which ICT use is based; comparison and prediction; and questioning and testing of educational process participants, to understand the effectiveness of ICT use in their training in HEIs. The results showed positive changes in all analyzed criteria in the experimental group, which the authors attribute to the introduction of additional ICT tools into that group's educational process. The scientific novelty of the study lies in highlighting the main characteristics and didactic functions of ICT in the learning process of philology students, and in covering the classification of ICT, ICT tools, and the typology of training sessions that use ICT in the study of philological disciplines. The article concludes that introducing modern ICT into the educational process intensifies learning, enables a variety of ideas to be implemented, increases the pace of classes and of material assimilation, influences motivation for learning, and increases the amount of students' independent work.

Application of Terahertz Spectroscopy and Imaging in the Diagnosis of Prostate Cancer

  • Zhang, Ping;Zhong, Shuncong;Zhang, Junxi;Ding, Jian;Liu, Zhenxiang;Huang, Yi;Zhou, Ning;Nsengiyumva, Walter;Zhang, Tianfu
    • Current Optics and Photonics
    • /
    • Vol. 4, No. 1
    • /
    • pp.31-43
    • /
    • 2020
  • The feasibility of the application of terahertz electromagnetic waves in the diagnosis of prostate cancer was examined. Four samples of incomplete cancerous prostatic paraffin-embedded tissues were examined using terahertz spectral imaging (TPI) system and the results obtained by comparing the absorption coefficient and refractive index of prostate tumor, normal prostate tissue and smooth muscle from one of the paraffin tissue masses examined were reported. Three hundred and sixty cases of absorption coefficients from one of the paraffin tissues examined were used as raw data to classify these three tissues using the Principal Component Analysis (PCA) and Least Squares Support Vector Machine (LS-SVM). An excellent classification with an accuracy of 92.22% in the prediction set was achieved. Using the distribution information of THz reflection signal intensity from sample surface and absorption coefficient of the sample, an attempt was made to use the TPI system to identify the boundaries of the different tissues involved (prostate tumors, normal and smooth muscles). The location of three identified regions in the terahertz images (frequency domain slice absorption coefficient imaging, 1.2 THz) were compared with those obtained from the histopathologic examination. The tissue tumor region had a distinctively visible color and could well be distinguished from other tissue regions in terahertz images. Results indicate that a THz spectroscopy imaging system can be efficiently used in conjunction with the proposed advanced computer-based mathematical analysis method to identify tumor regions in the paraffin tissue mass of prostate cancer.

Modern Paradigm of Organization of the Management Mechanism by Innovative Development in Higher Education Institutions

  • Kubitsky, Serhii;Domina, Viktoriia;Mykhalchenko, Nataliia;Terenko, Olena;Mironets, Liudmyla;Kanishevska, Lyubov;Marszałek, Lidia
    • International Journal of Computer Science & Network Security
    • /
    • Vol. 22, No. 11
    • /
    • pp.141-148
    • /
    • 2022
  • The development of the education system and the labor market today requires new conditions for unification and functioning, the introduction of an innovative culture in the field of Education. The construction of modern management of innovative development of a higher education institution requires consideration of the existing theoretical, methodological and practical planes on which its formation is based. The purpose of the article is to substantiate the modern paradigm of organizing the mechanism of managing the innovative development of higher education institutions. Innovation in education is represented not only by the final product of applying novelty in educational and managerial processes in order to qualitatively improve the subject and objects of management and obtain economic, social, scientific, technical, environmental and other effects, but also by the procedure for their constant updating. The classification of innovations in education is presented. Despite the positive developments in the development of Education, numerous problems remain in this area, which is discussed in the article. The concept of innovative development of higher education institutions is described, which defines the prerequisites, goals, principles, tasks and mechanisms of university development for a long-term period and should be based on the following principles: scientific, flexible, efficient and comprehensive. The role of the motivational component of the mechanism of innovative development of higher education institutions is clarified, which allows at the strategic level to create an innovative culture and motivation of innovative activity of each individual, to make a choice of rational directions for solving problems, at the tactical level - to form motives for innovative activity in the most effective directions, at the operational level - to monitor the formation of a system of motives and incentives, to adjust the directions of motivation. 
The necessity of the functional component of the mechanism, which consists in determining a set of steps and management decisions aimed at achieving certain goals of innovative development of higher education institutions, is proved. The monitoring component of the mechanism is aimed at developing a special system for collecting, processing, storing and distributing information about the stages of development of higher education institutions, prediction based on the objective data on the dynamics and main trends of its development, and elaboration of recommendations.

BEEF MEAT TRACEABILITY. CAN NIRS COULD HELP?

  • Cozzolino, D.
    • 한국근적외분광분석학회: Conference Proceedings
    • /
    • 한국근적외분광분석학회, 2001 (NIR-2001)
    • /
    • pp.1246-1246
    • /
    • 2001
  • The quality of meat is highly variable in many properties. This variability originates from both animal production and meat processing. At the pre-slaughter stage, animal factors such as breed, sex, and age contribute to this variability. Environmental factors include feeding, rearing, transport and conditions just before slaughter (Hildrum et al., 1995). Meat can be presented in a variety of forms, each offering different opportunities for adulteration and contamination. This has imposed great pressure on the food manufacturing industry to guarantee the safety of meat. Tissue and muscle speciation of flesh foods, as well as speciation of animal-derived by-products fed to all classes of domestic animals, are now perhaps the most important uncertainty which the food industry must resolve to allay consumer concern. Recently, there has been demand for rapid and low-cost methods of direct quality measurement in both food and food ingredients (including high performance liquid chromatography (HPLC), thin layer chromatography (TLC), enzymatic and immunological tests (e.g. ELISA test) and physical tests) to establish their authenticity and hence guarantee the quality of products manufactured for consumers (Holland et al., 1998). The use of Near Infrared Reflectance Spectroscopy (NIRS) for the rapid, precise and non-destructive analysis of a wide range of organic materials has been comprehensively documented (Osborne et al., 1993). Most of the established methods have involved the development of NIRS calibrations for the quantitative prediction of composition in meat (Ben-Gera and Norris, 1968; Lanza, 1983; Clark and Short, 1994). This was a rational strategy to pursue during the initial stages of its application, given the type of equipment available, the state of development of the emerging discipline of chemometrics and the overwhelming commercial interest in solving such problems (Downey, 1994). 
One of the advantages of NIRS technology is not only to assess chemical structures through the analysis of the molecular bonds in the near infrared spectrum, but also to build an optical model characteristic of the sample which behaves like the “finger print” of the sample. This opens the possibility of using spectra to determine complex attributes of organic structures, which are related to molecular chromophores, organoleptic scores and sensory characteristics (Hildrum et al., 1994, 1995; Park et al., 1998). In addition, the application of statistical techniques like principal component or discriminant analysis provides the possibility to understand the optical properties of the sample and make a classification without the chemical information. The objectives of the present work were: (1) to examine two methods of sample presentation to the instrument (intact and minced) and (2) to explore the use of principal component analysis (PCA) and Soft Independent Modelling of Class Analogy (SIMCA) to classify muscles by quality attributes. Seventy-eight (n = 78) beef muscles (m. longissimus dorsi) from Hereford cattle were used. The samples were scanned in a NIRS monochromator instrument (NIR Systems 6500, Silver Spring, MD, USA) in reflectance mode (log 1/R). Both intact and minced presentations to the instrument were explored. Qualitative analysis of optical information through PCA and SIMCA showed differences in muscles resulting from two different feeding systems.

Chatbot Design Method Using Hybrid Word Vector Expression Model Based on Real Telemarketing Data

  • Zhang, Jie;Zhang, Jianing;Ma, Shuhao;Yang, Jie;Gui, Guan
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • Vol. 14, No. 4
    • /
    • pp.1400-1418
    • /
    • 2020
  • In commercial promotion, the chatbot is known as one of the significant applications of natural language processing (NLP). Conventional design methods use the bag-of-words (BOW) model alone, based on the Google database and other online corpora. For one thing, in the bag-of-words model, the vectors are unrelated to one another. Although this method suits discrete features, it does not help the machine understand continuous statements, because the connection between words is lost in the encoded word vector. For another, existing methods are tested on state-of-the-art online corpora but are hard to apply to real applications such as telemarketing data. In this paper, we propose an improved chatbot design method using a hybrid of the bag-of-words model and the skip-gram model, based on real telemarketing data. Specifically, we first collect real data in the telemarketing field and perform data cleaning and data classification on the constructed corpus. Second, the word representation adopts the hybrid bag-of-words and skip-gram model. The skip-gram model maps synonyms to nearby points in vector space. Because the correlation between words is expressed, the amount of information contained in the word vector increases, making up for the shortcomings of using the bag-of-words model alone. Third, we use the term frequency-inverse document frequency (TF-IDF) weighting method to increase the weight of key words, then output the final word expression. Finally, the answer is produced using a hybrid of a retrieval model and a generative model. The retrieval model can accurately answer questions in the field. The generative model supplements it by answering open-domain questions, where the final reply is produced by long short-term memory (LSTM) training and prediction. 
Experimental results show that the hybrid word vector expression model can improve the accuracy of the response and that the whole system can communicate with humans.
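The TF-IDF weighting step mentioned in this abstract can be illustrated with a small sketch. The abstract does not say which TF-IDF variant the authors use, so the definition here (term frequency as count divided by document length, inverse document frequency as the natural log of N/df, no smoothing) is an assumption, and the function name `tf_idf` is hypothetical.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Compute TF-IDF weights for tokenized documents.

    docs: list of token lists. Returns one {term: weight} dict per document,
    with tf = count / document length and idf = ln(N / document frequency).
    """
    n_docs = len(docs)
    # Document frequency: number of documents in which each term appears.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights
```

Under this definition a term that occurs in every document receives weight zero, while a term concentrated in few documents is boosted, which is the sense in which the weighting raises the weight of key words.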

Study on the Meaning of Four Subjects and Four Species as a Disease-Prediction Data and Diagnostic Value on Ante-Disease

  • 김종원;전수형;이인선;김규곤;이용태;김경철;엄현섭;지규용
    • 동의생리병리학회지
    • /
    • Vol. 23, No. 2
    • /
    • pp.325-330
    • /
    • 2009
  • In Korea, medical diagnostic equipment and biochemical examinations cannot be used in oriental medicine clinics to diagnose a sub-healthy or ante-disease state. The morphic analogical method used in oriental medicine can therefore be a good tool, providing disease-predictive signs that enable preventive diagnosis and therapy. Accordingly, the four geometrical subjects, Essence, Pneuma, Spirit, Blood (四科;精氣神血), and the four taxonomical species, Pisces, Quadruped, Aves, Carapaces (四類;魚走鳥甲), are chosen as morphic models in this paper. The differences between the two classifying methods, four subjects and four species, were as follows. The diagnostic category was meta-medical and synthetic against medical specific. The diagnostic object was body in contrast with face. They were applicable to psychology and classification of characteristics against diagnostics and therapeutics directly in oriental medicine. The theoretical basis was basic diagrams of four unit-fluids of body and morphological analogy with four animal species respectively. And the therapeutic aims were systemic pathogenesis following five phase theory against congestion and deficiency of Essence, Pneuma, Spirit, Blood. In clinical practice, the four subjects and four species are mixed with each other. However, they should be used in a limited way for the reasons described above, and the principal and secondary factors must be distinguished, following the pathology of the principal shape factor. To improve the diagnostic value for the ante-disease state, the discriminable standards, measurement methods, limits of interrelating interpretation, and criteria of abnormal disproportion need to be defined more clearly in advance.

Development of algorithm for work intensity evaluation using excess overwork index of construction workers with real-time heart rate measurement device

  • Jae-young Park;Jung Hwan Lee;Mo-Yeol Kang;Tae-Won Jang;Hyoung-Ryoul Kim;Se-Yeong Kim;Jongin Lee
    • Annals of Occupational and Environmental Medicine
    • /
    • Vol. 35
    • /
    • pp.24.1-24.15
    • /
    • 2023
  • Background: Construction workers are vulnerable to fatigue due to high physical workload. This study aimed to investigate the relationship between overwork and heart rate in construction workers and propose a scheme to prevent overwork in advance. Methods: We measured the heart rates of construction workers at a construction site of a residential and commercial complex in Seoul from August to October 2021 and developed an index that monitors overwork in real-time. A total of 66 Korean workers participated in the study, wearing real-time heart rate monitoring equipment. The relative heart rate (RHR) was calculated using the minimum and maximum heart rates, and the maximum acceptable working time (MAWT) was estimated using RHR to calculate the workload. The overwork index (OI) was defined as the cumulative workload evaluated with the MAWT. An appropriate scenario line (PSL) was set as an index that can be compared to the OI to evaluate the degree of overwork in real-time. The excess overwork index (EOI) was evaluated in real-time during work performance using the difference between the OI and the PSL. The EOI value was used to perform receiver operating characteristic (ROC) curve analysis to find the optimal cut-off value for classification of overwork state. Results: Of the 60 participants analyzed, 28 (46.7%) were classified as the overwork group based on their RHR. ROC curve analysis showed that the EOI was a good predictor of overwork, with an area under the curve of 0.824. The optimal cut-off values ranged from 21.8% to 24.0% depending on the method used to determine the cut-off point. Conclusion: The EOI showed promising results as a predictive tool to assess overwork in real-time using heart rate monitoring and calculation through MAWT. Further research is needed to assess physical workload accurately and determine cut-off values across industries.
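Two of the computations named in this abstract can be sketched under stated assumptions. The abstract says only that RHR is calculated from the minimum and maximum heart rates, so the percentage-of-range formula below is an assumed form. Likewise, Youden's J statistic is just one common way to pick an ROC cut-off, and the abstract notes that the cut-off depends on the method used. The function names are hypothetical, and the MAWT, OI, and EOI formulas are not given in the abstract, so they are not reproduced here.

```python
def relative_heart_rate(hr, hr_min, hr_max):
    """Assumed RHR form: position of hr within the [hr_min, hr_max] range, in %."""
    return 100.0 * (hr - hr_min) / (hr_max - hr_min)

def youden_cutoff(scores, labels):
    """Return (cut-off, J) maximizing Youden's J = sensitivity + specificity - 1.

    scores: index values (for example an overwork index per worker).
    labels: 1 for the overwork group, 0 otherwise.
    A worker is classified as overworked when score >= cut-off.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best_cut, best_j = None, -1.0
    for cut in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= cut and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= cut and y == 0)
        j = tp / n_pos - fp / n_neg   # true positive rate minus false positive rate
        if j > best_j:
            best_cut, best_j = cut, j
    return best_cut, best_j
```

Other cut-off criteria, such as the point closest to the top-left corner of the ROC curve, can select a different threshold from the same data.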