• Title/Summary/Keyword: Statistics technique


The development of symmetrically and attributably pure confidence in association rule mining (연관성 규칙에서 활용 가능한 대칭적 기여 순수 신뢰도의 개발)

  • Park, Hee Chang
    • Journal of the Korean Data and Information Science Society / v.25 no.3 / pp.601-609 / 2014
  • Generating meaningful association rules is the most widely used data mining technique for big data analysis. This method finds relationships between sets of items based on association criteria such as support, confidence, and lift. Among them, confidence is the most frequently used, but it has the drawback that it cannot reveal the direction of association. The attributably pure confidence was developed to compensate for this drawback, but its value changes with the position of the two item sets. In this paper, we propose four symmetrically and attributably pure confidence measures to compensate for the shortcomings of confidence and the attributably pure confidence. We then check the three conditions of an interestingness measure given by Piatetsky-Shapiro, and compare confidence, attributably pure confidence, and the four symmetrically and attributably pure confidence measures through numerical examples. The results show that the symmetrically and attributably pure confidence measures are better than confidence and the attributably pure confidence. In addition, the measure NSAP is found to be the best among the four symmetrically and attributably pure confidence measures.
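
The three standard criteria named in this abstract (support, confidence, lift) can be illustrated with a minimal Python sketch. The basket data and function names are hypothetical, and the sketch does not implement the attributably pure confidence or the four proposed measures, whose formulas are not given in the abstract. It only shows that confidence depends on which item set is the antecedent, whereas lift does not.

```python
from typing import FrozenSet, Iterable, List


def support(transactions: List[FrozenSet[str]], itemset: Iterable[str]) -> float:
    """Fraction of transactions that contain every item in `itemset`."""
    items = frozenset(itemset)
    hits = sum(1 for t in transactions if items <= t)
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent) -> float:
    """Estimate of P(consequent | antecedent)."""
    both = support(transactions, set(antecedent) | set(consequent))
    return both / support(transactions, antecedent)


def lift(transactions, antecedent, consequent) -> float:
    """Confidence divided by the consequent's support (1.0 under independence)."""
    return confidence(transactions, antecedent, consequent) / support(
        transactions, consequent
    )


# Hypothetical toy baskets, for illustration only.
transactions = [
    frozenset(t)
    for t in (
        {"milk", "bread"},
        {"milk", "bread", "butter"},
        {"milk"},
        {"milk"},
        {"butter"},
    )
]

conf_ab = confidence(transactions, {"milk"}, {"bread"})
conf_ba = confidence(transactions, {"bread"}, {"milk"})
lift_ab = lift(transactions, {"milk"}, {"bread"})
lift_ba = lift(transactions, {"bread"}, {"milk"})

print("confidence(milk -> bread):", conf_ab)
print("confidence(bread -> milk):", conf_ba)
print("lift in both directions:", lift_ab, lift_ba)
```

Swapping antecedent and consequent changes the confidence here but leaves the lift unchanged.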

Study on Local Wireless Network Data Structure for Sludge Multimeter (슬러지 멀티미터를 위한 근거리무선네트워크 데이터구조 설계 연구)

  • Jung, Soonho;Kim, Younggi;Lee, Sijin;Lee, Sunghwa;Park, Taejun;Byun, Doogyoon;Cha, Jaesang
    • Journal of Satellite, Information and Communications / v.9 no.2 / pp.96-100 / 2014
  • Recently, the management of wastewater treatment facilities has grown in importance due to stringent environmental protection regulations, and research and development on sewage treatment plant efficiency is active in large facilities and industrial parks. However, the reliability of existing sewage disposal systems and specific water quality monitoring networks for real-time transmission is insufficient. In this paper, we propose a local wireless network design for sludge multimeter data collection and control, so that sludge concentration can be measured efficiently. The data collected over the local wireless network are transmitted to the central monitoring system and accumulated in real time, so that statistics can be calculated and the status of the sewage treatment facilities can be monitored. The proposed system uses an IEEE 802.15.4 short-range wireless network and configures an IEEE 802.11 network that allows the central system to monitor status in real time. We also install a sludge multimeter and communication network in sewage treatment facilities and confirm the usefulness of the proposed technique by demonstrating its effectiveness.
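
The abstract does not say which statistics the central monitoring system computes or how. As one possible illustration of accumulating readings in real time, the following Python sketch keeps a running mean and variance with Welford's online algorithm. The class name and the sample readings are assumptions, not part of the paper.

```python
import math
from dataclasses import dataclass


@dataclass
class RunningStats:
    """Running mean/variance via Welford's algorithm (no readings stored)."""

    count: int = 0
    mean: float = 0.0
    m2: float = 0.0  # sum of squared deviations from the current mean

    def update(self, reading: float) -> None:
        """Fold one new sensor reading into the running statistics."""
        self.count += 1
        delta = reading - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (reading - self.mean)

    @property
    def variance(self) -> float:
        """Sample variance (n - 1 denominator); 0.0 until two readings arrive."""
        return self.m2 / (self.count - 1) if self.count > 1 else 0.0

    @property
    def std(self) -> float:
        return math.sqrt(self.variance)


# Hypothetical sludge concentration readings arriving one at a time.
stats = RunningStats()
for reading in (2.0, 4.0, 4.0, 6.0):
    stats.update(reading)

print(stats.count, stats.mean, stats.variance)
```

Because each update needs only three stored numbers, this style of accumulation suits a central node that receives a continuous stream from many sensors.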

Estimating Forest Site Productivity and Productive Areas of Quercus acutissima and Quercus mongolica Using Environmental Variables (환경요인에 의한 상수리나무와 신갈나무의 임지생산력 및 적지 추정)

  • Shin, Man-Yong;Sung, Joo-Han;Chun, Jung-Hwa
    • Korean Journal of Agricultural and Forest Meteorology / v.14 no.2 / pp.89-97 / 2012
  • This study was conducted to estimate forest site productivity and productive areas of Quercus acutissima and Quercus mongolica using environmental factors, including climatic variables. Using the data set from a digital forest site map and a forest climatic map, a total of 42 environmental variables were regressed on site index to develop the best site index equations for Quercus acutissima and Quercus mongolica. Five to six environmental factors per species were selected as independent variables in the best site index equations. To validate the results, three evaluation statistics (mean difference, standard deviation of difference, and standard error of difference) were applied to the test data set. The site index equations fitted the test data set well, with relatively low bias and variation. It was therefore concluded that the site index equations by species were well capable of estimating site quality. Finally, based on the site index equations, the productive areas by species were estimated by applying GIS techniques to the digital forest maps. In addition, the distribution of productive areas by species was illustrated.
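
The three evaluation statistics are named in the abstract but not defined there. The Python sketch below uses their common definitions (differences taken as observed minus predicted, sample standard deviation, and standard error as the standard deviation divided by the square root of n). These definitions, the sign convention, and the sample values are all assumptions.

```python
import math
from statistics import mean, stdev
from typing import Sequence, Tuple


def difference_statistics(
    observed: Sequence[float], predicted: Sequence[float]
) -> Tuple[float, float, float]:
    """Return (mean difference, SD of difference, SE of difference)."""
    diffs = [o - p for o, p in zip(observed, predicted)]
    mean_diff = mean(diffs)                   # bias of the equation
    sd_diff = stdev(diffs)                    # spread of the errors (n - 1)
    se_diff = sd_diff / math.sqrt(len(diffs))
    return mean_diff, sd_diff, se_diff


# Hypothetical site index values (observed vs. predicted by an equation).
observed = [10.0, 12.0, 14.0, 16.0]
predicted = [9.0, 13.0, 13.0, 17.0]

md, sdd, sed = difference_statistics(observed, predicted)
print(md, sdd, sed)
```

A mean difference near zero indicates low bias, while the other two statistics describe the variation of the prediction errors.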

Android Malware Analysis Technology Research Based on Naive Bayes (Naive Bayes 기반 안드로이드 악성코드 분석 기술 연구)

  • Hwang, Jun-ho;Lee, Tae-jin
    • Journal of the Korea Institute of Information Security & Cryptology / v.27 no.5 / pp.1087-1097 / 2017
  • As the penetration rate of smartphones increases, the number of malicious codes targeting smartphones is also increasing. 360 Security's smartphone malware statistics show that malicious code increased by 437 percent in the first quarter of 2016 compared to the fourth quarter of 2015. In particular, malicious applications, which are the main means of distributing malicious code on smartphones, aim at leaking user information, destroying data, and withdrawing money. They often operate through APIs, which are interfaces for controlling the functions provided by the operating system or programming language. In this paper, we propose a mechanism that detects malicious applications based on the similarity of API patterns in normal and malicious applications, by learning the API patterns that static analysis extracts from applications. In addition, we show a technique for improving the detection rate, including the detection rate for each label, obtained by applying the mechanism to the sample data. In particular, the proposed mechanism can detect a new malicious application when its API pattern is similar, at a certain level, to previously learned patterns. Future research on various application features and their use in this mechanism is expected to enable anti-malware systems to detect new malicious applications.
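
The abstract does not give the exact model or features, so the following is only a minimal sketch of the general idea: a Bernoulli naive Bayes classifier over the set of API names found in each application, with Laplace smoothing. The API names, labels, and training samples are illustrative assumptions rather than data from the paper.

```python
import math
from collections import defaultdict
from typing import Dict, List, Set, Tuple

Model = Dict[str, Tuple[float, Dict[str, float]]]


def train(samples: List[Tuple[Set[str], str]]) -> Tuple[List[str], Model]:
    """Fit a Bernoulli naive Bayes model on (api_set, label) pairs."""
    vocab = sorted(set().union(*(apis for apis, _ in samples)))
    label_counts: Dict[str, int] = defaultdict(int)
    api_counts: Dict[str, Dict[str, int]] = defaultdict(lambda: defaultdict(int))
    for apis, label in samples:
        label_counts[label] += 1
        for api in apis:
            api_counts[label][api] += 1
    model: Model = {}
    for label, n in label_counts.items():
        log_prior = math.log(n / len(samples))
        # Laplace smoothing keeps every probability strictly between 0 and 1.
        probs = {api: (api_counts[label][api] + 1) / (n + 2) for api in vocab}
        model[label] = (log_prior, probs)
    return vocab, model


def predict(vocab: List[str], model: Model, apis: Set[str]) -> str:
    """Return the label with the highest posterior log-probability."""
    best_label, best_score = "", float("-inf")
    for label, (log_prior, probs) in model.items():
        score = log_prior
        for api in vocab:
            p = probs[api]
            score += math.log(p if api in apis else 1.0 - p)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical training data: API names seen in each app by static analysis.
samples = [
    ({"sendTextMessage", "getDeviceId"}, "malicious"),
    ({"sendTextMessage", "getDeviceId", "openConnection"}, "malicious"),
    ({"setContentView", "startActivity"}, "benign"),
    ({"setContentView", "openConnection"}, "benign"),
]
vocab, model = train(samples)

new_app = {"sendTextMessage", "getDeviceId", "startActivity"}
print(predict(vocab, model, new_app))
```

The new application is not identical to any training sample, yet it is classified by how closely its API pattern resembles each class, which mirrors the similarity-based detection the abstract describes.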

GIS-based Spatial Integration and Statistical Analysis using Multiple Geoscience Data Sets : A Case Study for Mineral Potential Mapping (다중 지구과학자료를 이용한 GIS 기반 공간통합과 통계량 분석 : 광물 부존 예상도 작성을 위한 사례 연구)

  • 이기원;박노욱;권병두;지광훈
    • Korean Journal of Remote Sensing / v.15 no.2 / pp.91-105 / 1999
  • Spatial data integration using multiple geo-based data sets has been regarded as one of the primary GIS application issues. For this issue, several integration schemes have been developed from the perspectives of mathematical geology or geo-mathematics. However, research-based approaches to statistical/quantitative assessment between the integrated layer and the input layers have not yet been fully considered. To address this gap, this study first performed spatial data integration of multiple geoscientific data sets using known integration algorithms. For spatial integration using raster-based GIS functionality, geological, geochemical, and geophysical data sets, DEM-derived data sets, and remotely sensed imagery data sets from the Ogdong area were used for geological thematic mapping related to mineral potential mapping. In addition, statistical/quantitative information on the relationships among the data sets used, and between each data set and the integrated layer, was extracted within the scope of multiple data fusion and a schematic statistical assessment methodology. For the spatial integration scheme, certainty factor (CF) estimation and principal component analysis (PCA) were applied. However, this study did not aim at a direct comparison of the two methodologies; rather, for the statistical/quantitative assessment between the integrated layer and the input layers, it focused on statistical methodologies based on contingency tables. In particular, the jackknife technique was also applied in PCA-based spatial integration for bias reduction. Through the statistical analyses of the integration information in this case study, new information on the relationships between the integrated layer and the input layers was extracted. In addition, the influence of the input data sets on the integrated layer was assessed. This kind of approach provides decision-making information from the viewpoint of GIS and also serves as exploratory data analysis in conjunction with GIS and geoscientific applications, especially when handling spatial integration or data fusion with complex variable data sets.
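
How the jackknife was combined with PCA is not described in the abstract. The generic bias-reduction idea can still be sketched in Python on a much simpler estimator: the estimate is recomputed with each observation left out in turn, and the leave-one-out average is used to correct the bias. The estimator (a variance with denominator n) and the data are assumptions chosen so that the correction is easy to verify.

```python
from typing import Callable, Sequence


def biased_variance(values: Sequence[float]) -> float:
    """Plug-in variance with denominator n (biased downward)."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / len(values)


def jackknife_bias_corrected(
    values: Sequence[float], estimator: Callable[[Sequence[float]], float]
) -> float:
    """Jackknife estimate: n * theta_hat - (n - 1) * mean(leave-one-out)."""
    n = len(values)
    theta_hat = estimator(values)
    leave_one_out = [
        estimator([v for j, v in enumerate(values) if j != i]) for i in range(n)
    ]
    theta_dot = sum(leave_one_out) / n
    return n * theta_hat - (n - 1) * theta_dot


data = [1.0, 2.0, 3.0, 4.0, 5.0]  # hypothetical sample
plain = biased_variance(data)
corrected = jackknife_bias_corrected(data, biased_variance)
print(plain, corrected)
```

For this particular estimator the jackknife correction reproduces the usual unbiased sample variance (denominator n - 1), which makes the bias reduction visible.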

Estimation of Duck House Litter Evaporation Rate Using Machine Learning (기계학습을 활용한 오리사 바닥재 수분 발생량 분석)

  • Kim, Dain;Lee, In-bok;Yeo, Uk-hyeon;Lee, Sang-yeon;Park, Sejun;Decano, Cristina;Kim, Jun-gyu;Choi, Young-bae;Cho, Jeong-hwa;Jeong, Hyo-hyeog;Kang, Solmoe
    • Journal of The Korean Society of Agricultural Engineers / v.63 no.6 / pp.77-88 / 2021
  • The duck industry has grown rapidly in recent years. Nevertheless, research on improving the duck house environment is still insufficient. Moisture generation from duck house litter is an important factor because it may cause severe illness and low productivity. However, it is difficult to measure because the measurement can be disturbed by animal excrement and other factors. It therefore has to be calculated from environmental data around the duck house litter. To simplify these procedures, we built several machine learning regression models that forecast litter moisture generation from measured environmental data (air temperature, relative humidity, wind velocity, and water content). Five models (Multiple Linear Regression, k-Nearest Neighbors, Support Vector Regression, Random Forest, and Deep Neural Network) were selected for regression. Using R-squared, RMSE, and MAE as evaluation metrics, the most accurate configuration was estimated according to the variables of each machine learning model. In addition, to address the small amount of data acquired through lab experiments, bootstrapping, a technique used in statistics, was applied. As a result, the most accurate model was Random Forest with n-estimator set to 200, obtained by bootstrapping the original data nine times.
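
The three evaluation metrics and the bootstrap step can be sketched in plain Python as follows. The abstract does not say how "bootstrapping the original data nine times" was carried out, so the reading used here (nine resamples with replacement, each the size of the original data, concatenated into one larger data set) is an assumption, as are the function names and sample values.

```python
import math
import random
from typing import Dict, List, Sequence


def regression_metrics(
    y_true: Sequence[float], y_pred: Sequence[float]
) -> Dict[str, float]:
    """Compute R-squared, RMSE, and MAE for paired observations."""
    n = len(y_true)
    mean_y = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n,
    }


def bootstrap_augment(rows: List[tuple], times: int, seed: int = 0) -> List[tuple]:
    """Concatenate `times` resamples drawn with replacement from `rows`."""
    rng = random.Random(seed)
    augmented: List[tuple] = []
    for _ in range(times):
        augmented.extend(rng.choices(rows, k=len(rows)))
    return augmented


# Hypothetical measurements and predictions, for illustration only.
metrics = regression_metrics([1.0, 2.0, 3.0, 4.0], [1.5, 2.0, 2.5, 4.0])
print(metrics)

# Hypothetical rows: (air temperature, relative humidity, evaporation rate).
rows = [(20.0, 60.0, 0.8), (25.0, 55.0, 1.1), (30.0, 50.0, 1.5)]
augmented = bootstrap_augment(rows, times=9)
print(len(augmented))
```

Resampled rows are copies of the original measurements, so any evaluation should hold out test data before resampling to avoid leaking duplicates into the test set.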

Comparison of Fornix and Stria Terminalis Connectivity among First-Episode Schizophrenia, Chronic Schizophrenia and Healthy Controls (초발 조현병, 만성 조현병과 건강 대조군의 뇌활과 분계섬유줄 연결성 비교)

  • Lee, Arira;Yun, Mirim;Yook, Ki Hwan;Choi, Tai Kiu;Lee, Kang Soo;Bang, Minji;Lee, Sang-Hyuk
    • Korean Journal of Biological Psychiatry / v.26 no.1 / pp.8-13 / 2019
  • Objectives: Disrupted integrity of the fornix and stria terminalis has been suggested in schizophrenia. However, very few studies have used the diffusion-tensor imaging (DTI) technique to compare the fornix and stria terminalis among first-episode schizophrenia (FESZ), chronic schizophrenia (CS), and healthy controls (HCs). The objective of this study is to compare the connectivity of the fornix and stria terminalis among FESZ, CS, and HCs. Methods: We included 44 FESZ patients, 39 CS patients, and 20 HCs in this study. Voxel-wise statistical analysis of the fractional anisotropy (FA) data was performed using Tract-Based Spatial Statistics to analyze the connectivity of the fornix and stria terminalis. In addition, the Scale for the Assessment of Positive Symptoms (SAPS) and the Scale for the Assessment of Negative Symptoms (SANS) were used to evaluate clinical symptom severity. Results: There were no significant differences among the FESZ, CS, and HC groups in age, sex, or years of education. The SAPS and SANS scores of the schizophrenia groups showed no significant differences. FA values of the right fornix cres/stria terminalis in the CS group were significantly lower than those in the FESZ and HC groups. There were no significant differences in FA values of the right fornix cres/stria terminalis between the FESZ and HC groups. Pearson correlation analyses revealed significant correlations between FA values of the right fornix cres/stria terminalis and both positive and negative symptom scales in the FESZ group, and between FA values of the right fornix cres/stria terminalis and negative symptom scales in the CS group. Conclusions: This study shows that FA values of the fornix and stria terminalis in the CS group were lower than in the FESZ and HC groups. These results suggest that the fornix and stria terminalis may play a role in the pathophysiology of schizophrenia. The current study can thus broaden our understanding of the pathophysiology of schizophrenia.

Development of Design Space Exploration for Warship using the Concept of Negative Design (네거티브 설계 개념을 이용한 함정 설계영역탐색법 개발)

  • Park, Jin-Won
    • Journal of the Korea Academia-Industrial cooperation Society / v.20 no.9 / pp.412-419 / 2019
  • Negative space in the discipline of art is the space around and between the subjects of an image. The use of negative space is an element of artistic composition, since it is occasionally used to artistic effect as the "real" subject of an image. In painting, it is a technique that works on the background of the object to be expressed, touching the unnecessary parts while leaving the necessary parts, so that it gives a feeling of unique texture and silhouette. As in art, negative space in a design can also be useful for identifying an image of infeasible design ranges with a straightforward view. The similarity between the two disciplines leads to the introduction of the negative space concept for design space exploration. A rough design space exploration using statistics and visual analytics may support more efficient decision-making and can provide meaningful insights into the direction of early-phase system design. For this, the approach guarantees dynamic interactions between visualized information and human cognitive systems. Visual analytics is useful for summarizing complex and large-scale data. It is useful for identifying feasible design spaces, as well as for avoiding infeasible or highly risky spaces. This paper investigates the possible use of the negative space concept through an application example.

The Factors Influence upon Employment Volition in Alcohol Use Disorder (알코올사용장애 환자의 취업의지에 미치는 영향요인)

  • Rho, In-Suk;Cho, Kyong-Ah
    • Journal of the Korea Academia-Industrial cooperation Society / v.20 no.7 / pp.272-280 / 2019
  • This descriptive research study investigated the relationships among family support, ego resilience, and employment volition in patients with alcohol use disorder, and identified the factors that influence employment volition. This study used a survey with structured questionnaires. The data were collected from 128 males aged 20 years or older who had been diagnosed with alcohol use disorder and had undergone either inpatient hospital care or outpatient treatment. The data were analyzed using descriptive statistics, one-way ANOVA, Pearson's correlation coefficient, and multiple regression analysis. The results of the study showed that family support had a value of 4.30, ego resilience had a value of 2.37, and employment volition had a value of 4.06. The results of the multiple regression analysis showed that there was statistically significant positive correlation between employment volition and ego resilience ($\beta = -.314$, p<.01) and age ($\beta = -.253$, p<.01), and the total explanatory power of these two factors was 16.3%. According to the results of this study, age-based approaches are needed to improve the employment volition of patients with alcohol use disorder. Additionally, the results suggest that an ego resilience enhancement program should be developed and implemented to help these patients.

A Study on Research Trends in the Smart Farm Field using Topic Modeling and Semantic Network Analysis (토픽모델링과 언어네트워크분석을 활용한 스마트팜 연구 동향 분석)

  • Oh, Juyeon;Lee, Joonmyeong;Hong, Euiki
    • Journal of Digital Convergence / v.20 no.2 / pp.203-215 / 2022
  • This study investigates research trends and knowledge structures in the Smart Farm field. To this end, keywords and the relationships among keywords were analyzed for 104 Korean academic journals related to Smart Farm in the KCI (Korea Citation Index), and topics were analyzed using the LDA topic modeling technique. The analysis showed that the main keywords in Korean Smart Farm-related research were 'environment', 'system', 'use', 'technology', 'cultivation', etc. The results for degree, betweenness, and eigenvector centrality were presented. Topic modeling yielded seven topics: 'Introduction analysis of Smart Farm', 'Eco-friendly Smart Farm and economic efficiency of Smart Farm', 'Smart Farm platform design', 'Smart Farm production optimization', 'Smart Farm ecosystem', 'Smart Farm system implementation', and 'Government policy for Smart Farm'. By examining research trends related to Korean Smart Farm, this study is expected to serve as basic data for the policy development needed to advance Korean Smart Farm research.
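
Of the three centrality measures mentioned, degree centrality is the simplest to illustrate. The Python sketch below builds a keyword co-occurrence network and computes normalized degree centrality (number of neighbors divided by the number of other nodes). The network construction rule and the keyword lists are assumptions, since the abstract does not describe how the network was built.

```python
from collections import defaultdict
from itertools import combinations
from typing import Dict, List, Set


def degree_centrality(documents: List[List[str]]) -> Dict[str, float]:
    """Link keywords that co-occur in a document; return normalized degrees."""
    neighbors: Dict[str, Set[str]] = defaultdict(set)
    for keywords in documents:
        for a, b in combinations(sorted(set(keywords)), 2):
            neighbors[a].add(b)
            neighbors[b].add(a)
    n = len(neighbors)
    if n < 2:
        return {}
    # Keywords that never co-occur with another keyword are not in the network.
    return {keyword: len(linked) / (n - 1) for keyword, linked in neighbors.items()}


# Hypothetical keyword lists, one per article.
documents = [
    ["environment", "system", "technology"],
    ["system", "cultivation"],
    ["environment", "system"],
]
centrality = degree_centrality(documents)
for keyword, score in sorted(centrality.items(), key=lambda kv: -kv[1]):
    print(keyword, round(score, 3))
```

In this toy network 'system' co-occurs with every other keyword and therefore has the highest degree centrality.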