• Title/Summary/Keyword: Outlier Analysis

Pupil Data Measurement and Social Emotion Inference Technology by using Smart Glasses (스마트 글래스를 활용한 동공 데이터 수집과 사회 감성 추정 기술)

  • Lee, Dong Won;Mun, Sungchul;Park, Sangin;Kim, Hwan-jin;Whang, Mincheol
    • Journal of Broadcast Engineering / v.25 no.6 / pp.973-979 / 2020
  • This study aims to objectively and quantitatively determine the social emotion of empathy from pupillary responses. Fifty-two subjects (26 men and 26 women) voluntarily participated in the experiment. After a 30-second reference measurement, the experiment was divided into an imitation task and a spontaneous self-expression task. Two subjects interacted through facial expressions while their pupil images were recorded. The pupil data were processed with binarization and a circular edge detection algorithm, and an outlier detection and removal technique was used to reject eye blinks. The difference in pupil size by empathy condition was tested for statistical significance with a test of normality and an independent-samples t-test. The pupil size differed significantly between the empathy (M ± SD = 0.050 ± 1.817) and non-empathy (M ± SD = 1.659 ± 1.514) conditions (t(92) = -4.629, p < 0.001). A rule for classifying empathy from pupil size was defined through discriminant analysis and verified (estimation accuracy: 75%) on 12 new subjects (6 men and 6 women, mean age ± SD = 22.84 ± 1.57 years). The proposed method is a non-contact camera technology and is expected to be used in various virtual reality applications with smart glasses.

Comparative Analysis of Anomaly Detection Models using AE and Suggestion of Criteria for Determining Outliers

  • Kang, Gun-Ha;Sohn, Jung-Mo;Sim, Gun-Wu
    • Journal of the Korea Society of Computer and Information / v.26 no.8 / pp.23-30 / 2021
  • In this study, we present a comparative analysis of major autoencoder (AE)-based anomaly detection methods for quality determination in the manufacturing process, together with a new anomaly discrimination criterion. At manufacturing sites, anomalous instances are few and their types vary greatly. These properties degrade the performance of AI-based anomaly detection models trained on both normal and anomalous cases, and obtaining additional data to improve performance costs considerable time and money. To solve this problem, AE-based models such as AE and VAE, which perform anomaly detection using only normal data, are being studied. In this work, statistics on residual images, MSE, and information entropy were selected as outlier discrimination criteria for Convolutional AE, VAE, and Dilated VAE models, and the performance of each model was compared and analyzed. In particular, the range value applied to the Convolutional AE model showed the best performance, with an AUC PRC of 0.9570, an F1 score of 0.8812, an AUC ROC of 0.9548, and an accuracy of 87.60%. This is an accuracy improvement of about 20 percentage points over MSE, which has frequently been used as the criterion for determining outliers, and confirms that model performance can be improved by the choice of outlier criterion.
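
The contrast between an MSE criterion and a range criterion on residual images can be illustrated with a small sketch. It assumes that "range value" means the maximum minus the minimum of the absolute residual, which the abstract does not spell out; `residual_scores` and the toy pixel values are hypothetical and are not taken from the paper.

```python
def residual_scores(image, reconstruction):
    """Score one sample from its residual image (flat lists of pixel values).

    Assumption: "range" is max - min of the absolute residual.
    """
    residual = [abs(a - b) for a, b in zip(image, reconstruction)]
    mse = sum(r * r for r in residual) / len(residual)
    value_range = max(residual) - min(residual)
    return {"mse": mse, "range": value_range}


# Normal sample: the reconstruction is uniformly slightly off (toy values).
normal = residual_scores([0.5] * 100, [0.56] * 100)

# Defective sample: one pixel deviates strongly, the rest is reconstructed exactly.
defect = residual_scores([0.5] * 99 + [1.0], [0.5] * 100)

print(normal)  # larger MSE, zero range
print(defect)  # smaller MSE, large range
```

In this toy case the MSE ranks the normal sample as more anomalous than the defective one, while the range singles out the localized defect. Real residual images are noisier, so this only shows why the choice of criterion can matter.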

A Review of Statistical Methods in the Korean Journal of Orthodontics and the American Journal of Orthodontics and Dentofacial Orthopedics (대한치과교정학회지(KJO)와 미국교정학회지(AJODO)에서 사용된 통계기법의 비교분석 및 고찰(1999-2003))

  • Lim, Hoi-Jeong
    • The Korean Journal of Orthodontics / v.34 no.5 s.106 / pp.371-379 / 2004
  • The purpose of this study was to investigate the changes in and types of statistical methods used in the Korean Journal of Orthodontics (KJO) and the American Journal of Orthodontics and Dentofacial Orthopedics (AJODO) from 1999 to 2003. The frequency of use, transitions, and assumption checks of statistical methods, as well as the types of advanced statistical methods, were examined for each journal. The study consisted of 247 articles published in the KJO and 50 randomly chosen articles per year, all of which were original articles that used statistical methods. In the KJO, the most frequently used statistical methods were, in order, t-test, analysis of variance (ANOVA), correlation analysis, nonparametric analysis, regression analysis, chi-square test, and factor analysis, while in the AJODO the order was t-test, ANOVA, nonparametric analysis, correlation analysis, regression analysis, chi-square test, and factor analysis. The changes in statistical methods observed in the KJO were not significant $(\chi^2=17.4,\;p=0.5881)$, but the changes observed in the AJODO were significant $(\chi^2=42.4,\;p=0.0397)$. Some of the studies examined had overlooked the assumptions of the statistical methods employed. Data screening, such as checking for outliers, should be performed before analysis, and alternative statistical approaches should be applied for small sample sizes. The advanced statistical methods were factor analysis and discriminant analysis in the KJO, and intention-to-treat (ITT) analysis in multi-center clinical trials, survival analysis, and generalized estimating equations (GEE) in the AJODO. Appropriate analysis approaches and interpretations should be applied to the correlated and repeated measurements of orthodontic data sets.

Precision Improvement Methodology of Geotechnical Information through Outlier Analysis (이상치 분석을 통한 3차원 지반정보 정밀도 향상 방안)

  • Lee, Boyoung;Hwang, Bumsik;Kim, Hansaem;Cho, Wanjei
    • Journal of the Korean GEO-environmental Society / v.19 no.2 / pp.23-35 / 2018
  • Recently, ground disasters such as road collapses and cavities have occurred frequently in Seoul and downtown areas. In response, the government has begun studies on an integrated underground space map. Meanwhile, the geotechnical information underlying the integrated underground space map has been built up from more than 220,000 borehole records in the database of the Integrated DB Center of National Geotechnical Information. To build a three-dimensional integrated underground space map from this geotechnical information, its reliability should be verified by analyzing and evaluating its precision. Accordingly, studies were conducted on verifying and evaluating the precision of the constructed geotechnical information. Subsequently, beyond analyzing precision, ways of utilizing the geotechnical information for three-dimensional visualization were reviewed. As a further step toward practical use of the database, this study proposes, based on the previous research results, a module that improves the precision of geotechnical information for establishing reliable three-dimensional integrated underground space maps.

Analysis of Riding Quality Acceptability and Characteristics of Expressway Users and Evaluation of MRI Thresholds using Receiver Operating Characteristic curves (고속도로 이용자의 승차감 평가특성 및 만족도 분석과 ROC 곡선을 이용한 평탄성 관리기준 적정성 검토)

  • Lee, Jaehoon;Sohn, Ducksu;Ryu, SungWoo;Kim, Youngwon;Park, Junyoung
    • International Journal of Highway Engineering / v.20 no.2 / pp.35-44 / 2018
  • PURPOSES: The purpose of this research is to analyze the panel characteristics that affect riding quality evaluation results and to evaluate the appropriateness of roughness management criteria based on ride comfort satisfaction. METHODS: To analyze the influence of panel characteristics on riding quality ratings, 33 panel members, consisting of civilians and experts, were selected. In addition, considering the roughness distribution of the expressway, 35 sections with MRI ranging from 1.17 m/km to 4.65 m/km were selected. Each panel member rode in a passenger car, rated the riding quality on a scale from 0 to 10, and indicated whether or not they were satisfied. After removing outlier results using a box plot technique, 964 results were analyzed. An ANOVA was conducted to evaluate the effects of panel expertise, age, driving experience, vehicle ownership, and gender on the evaluation results. In addition, the receiver operating characteristic (ROC) curve was used to derive the MRI value that most accurately predicts satisfaction with riding quality. Then, the suitability of MRI as a criterion for assessing whether the riding quality was satisfactory was evaluated using the AUC. RESULTS: Only the age of the panel participants was found to have an effect on riding quality satisfaction. Satisfaction with riding quality and MRI were found to be strongly correlated. The satisfaction rates at the roughness management criteria for new (MRI 1.6 m/km) and maintenance (MRI 3.0 m/km) expressways were 95% and 53%, respectively. Evaluating the roughness management criteria with the ROC curve showed that the accuracy of predicting satisfaction was highest at MRI 3.1-3.2 m/km. In addition, the AUC of the MRI was about 0.8, indicating that the MRI is an appropriate index for evaluating riding quality satisfaction. CONCLUSIONS: Based on the results, the age distribution of the panel should be considered when panel ratings are conducted. From the results of the ROC curve, an MRI of 3.0 m/km, the roughness management criterion for maintenance expressways, is considered appropriate.
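
The box plot screening step mentioned under METHODS can be sketched as below. This is a minimal illustration that assumes the common Tukey rule (values beyond 1.5 times the interquartile range from the quartiles are outliers). The abstract does not state which whisker length or quartile method the authors used, and `boxplot_filter` and the sample ratings are hypothetical.

```python
from statistics import quantiles


def boxplot_filter(scores, whisker=1.5):
    """Split scores into (kept, removed) using box plot fences.

    Assumption: Tukey fences at Q1 - whisker*IQR and Q3 + whisker*IQR.
    """
    q1, _, q3 = quantiles(scores, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - whisker * iqr, q3 + whisker * iqr
    kept = [s for s in scores if low <= s <= high]
    removed = [s for s in scores if not (low <= s <= high)]
    return kept, removed


# Hypothetical 0-10 riding quality ratings for one road section.
ratings = [6, 7, 7, 8, 6, 7, 8, 7, 0, 7]
kept, removed = boxplot_filter(ratings)
print(removed)  # ratings that fall outside the fences
```

With these made-up ratings the single rating of 0 falls below the lower fence and is dropped, while the other nine ratings are kept.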

Performance Enhancement of Algorithms based on Error Distributions under Impulsive Noise (충격성 잡음하에서 오차 분포에 기반한 알고리듬의 성능향상)

  • Kim, Namyong;Lee, Gyoo-yeong
    • Journal of Internet Computing and Services / v.19 no.3 / pp.49-56 / 2018
  • The Euclidean distance (ED) between the error distribution and the Dirac delta function has been used as an efficient performance criterion in impulsive noise environments due to the outlier-cutting effect of the Gaussian kernel applied to the error signal. The gradient of the ED used for its minimization has two components: $A_k$ for the kernel function of error pairs and $B_k$ for the kernel function of errors. This paper shows that the first component, $A_k$, governs the gathering of error samples close together, while the second, $B_k$, concentrates the error samples on zero. Based on this analysis, it is proposed to normalize $A_k$ and $B_k$ with the power of inputs modified by kernelled error pairs or errors, in order to reinforce their roles of narrowing the error gap and drawing error samples to zero. By comparing the fluctuation of the steady-state MSE and the minimum MSE in simulations of multipath equalization under impulsive noise, the roles of the two components and the efficiency of the proposed normalization method are verified.
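
The outlier-cutting effect of the Gaussian kernel can be made concrete with a small numerical sketch. It assumes one common kernel-density form of the ED criterion, in which the cost (up to a constant) is a pairwise kernel term minus twice the kernel density at zero error. The abstract does not give the exact cost or the definitions of $A_k$ and $B_k$, so the function names, the kernel size `sigma`, and the error values below are hypothetical.

```python
import math


def gauss(x, sigma):
    """Zero-mean Gaussian kernel evaluated at x."""
    return math.exp(-x * x / (2 * sigma * sigma)) / (sigma * math.sqrt(2 * math.pi))


def ed_cost(errors, sigma=1.0):
    """Assumed ED-type cost, up to a constant: pair term minus 2 * density at zero."""
    n = len(errors)
    pair_term = sum(
        gauss(ej - ei, sigma * math.sqrt(2)) for ei in errors for ej in errors
    ) / (n * n)
    zero_term = 2 * sum(gauss(e, sigma) for e in errors) / n
    return pair_term - zero_term


def mse(errors):
    return sum(e * e for e in errors) / len(errors)


clean = [0.1, -0.2, 0.05, 0.0, -0.1]
impulsive = clean[:-1] + [50.0]  # one sample hit by an impulse

print(mse(clean), mse(impulsive))          # MSE is dominated by the impulse
print(ed_cost(clean), ed_cost(impulsive))  # kernel cost changes by a bounded amount
print(gauss(50.0, 1.0))                    # kernel value of the impulsive error
```

Because the kernel value of a very large error is essentially zero, a single impulse can shift this cost only by a bounded amount, whereas it inflates the MSE by orders of magnitude.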

Genetic signature of strong recent positive selection at interleukin-32 gene in goat

  • Asif, Akhtar Rasool;Qadri, Sumayyah;Ijaz, Nabeel;Javed, Ruheena;Ansari, Abdur Rahman;Awais, Muhammd;Younus, Muhammad;Riaz, Hasan;Du, Xiaoyong
    • Asian-Australasian Journal of Animal Sciences / v.30 no.7 / pp.912-919 / 2017
  • Objective: Identification of the candidate genes that play key roles in phenotypic variation can provide new information about evolution and positive selection. Interleukin (IL)-32 is involved in many biological processes; however, its role in the immune response against various diseases in mammals is poorly understood. Therefore, the current investigation was performed to better understand the molecular evolution and the positive selection of single nucleotide polymorphisms in the IL-32 gene. Methods: Using a fixation index ($F_{ST}$)-based method, the IL-32 (9375) gene was found to be an outlier and under significant positive selection with the provisional combined allocation of mean heterozygosity and $F_{ST}$. Using nucleotide sequences of 11 mammalian species from the National Center for Biotechnology Information database, the evolutionary selection of the IL-32 gene was determined with the maximum likelihood method, through four models (M1a, M2a, M7, and M8) in the Codeml program of Phylogenetic Analysis by Maximum Likelihood. Results: IL-32 was detected as being under positive selection using the $F_{ST}$ simulation method. The phylogenetic tree revealed that goat IL-32 closely resembles sheep IL-32. The coding nucleotide sequences were compared among 11 species, and the goat IL-32 gene was found to share identity with sheep (96.54%), bison (91.97%), camel (58.39%), cat (56.59%), buffalo (56.50%), human (56.13%), dog (50.97%), horse (54.04%), and rabbit (53.41%). Conclusion: This study provides evidence that the IL-32 gene is under significant positive selection in goat.

Impacts and Tasks of Teacher Education Programs Revealed by Preservice Teachers: Students' Intact Beliefs (예비교사들을 통해 알아본 교사양성 프로그램의 효과 및 과제: 학생들의 변하지 않는 신념들)

  • Kwak, Young-Sun
    • Journal of the Korean Earth Science Society / v.23 no.4 / pp.309-323 / 2002
  • This qualitative study investigated preservice teachers' understandings of the ontology and epistemology underlying constructivist notions of learning through four in-depth interviews. Of the sixteen participants in a larger study, five significantly changed their ontological and epistemological beliefs and eleven did not. This study focused on the eleven teachers who hardly changed their philosophical beliefs throughout the teacher education program. Ten teachers who consistently maintained scientific realist beliefs were presented as a composite case (Young's case). Among the eleven teachers, there was one outlier who had consistently maintained an idealist and relativist epistemological position from the beginning of the study and was treated in a separate case analysis (Ben's case). These cases corroborated the assertion that each individual's deeply entrenched ontological and epistemological beliefs are very hard to change. For researchers, this study offers insights into the reasons that preservice teachers give for the lack of change in their thinking about learning to teach. The study also examines preservice teachers' perceived constraints in implementing their ideal pedagogies and the influence of the teacher education program on changes in their pedagogical beliefs. The benefits and influences of the M.Ed. program's theoretical coursework and field experiences on these teachers' learning-to-teach experiences are addressed with rich data. The implications for teacher educators as well as for the instructional practices of preservice teacher education programs are discussed. This research emphasizes the necessity of field-based teacher education programs and the need to empower experienced school teachers as teacher educators in teacher preparation and professional development.

Analysis of the Optimal Window Size of Hampel Filter for Calibration of Real-time Water Level in Agricultural Reservoirs (농업용저수지의 실시간 수위 보정을 위한 Hampel Filter의 최적 Window Size 분석)

  • Joo, Dong-Hyuk;Na, Ra;Kim, Ha-Young;Choi, Gyu-Hoon;Kwon, Jae-Hwan;Yoo, Seung-Hwan
    • Journal of The Korean Society of Agricultural Engineers / v.64 no.3 / pp.9-24 / 2022
  • Currently, a vast amount of hydrologic data is accumulated in real time through automatic water level measuring instruments in agricultural reservoirs. At the same time, false and missing data points are also increasing. The applicability and reliability of hydrological data quality control must be secured for efficient agricultural water management, including the calculation of water supply and disaster management. Considering the irregularities in hydrological data caused by irrigation water usage and rainfall patterns, the Korea Rural Community Corporation currently applies the Hampel filter as its water level data quality management method. This method uses the window size as a key parameter: if the window size is too large, the data may be distorted, and if it is too small, many outliers are not removed, which reduces the reliability of the corrected data. Thus, the optimal window size must be selected for each individual reservoir. To ensure reliability, we compared and analyzed the RMSE (Root Mean Square Error) and NSE (Nash-Sutcliffe model efficiency coefficient) of the corrected data against the daily water level of the RIMS (Rural Infrastructure Management System) data and the automatic outlier detection standards used by the Ministry of Environment. To select the optimal window size, we used the classification performance evaluation index of the error matrix and the rainfall data of the irrigation period, which showed the optimal values at 3 h. The efficient automatic reservoir calibration technique can reduce the manpower and time required for manual calibration, and is expected to improve the reliability of water level data and the value of water resources.
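
The role of the window size can be seen in a minimal Hampel filter sketch. It uses the textbook form of the filter (sliding median, median absolute deviation scaled by 1.4826, three-sigma threshold). The abstract does not give the corporation's exact settings or the sampling interval, so `half_window` is expressed in samples and, like the water levels below, is a hypothetical value.

```python
from statistics import median


def hampel_filter(values, half_window=3, n_sigmas=3.0):
    """Return (filtered, outlier_indices) for a 1-D sequence.

    Each point is compared with the median of its window; points further
    than n_sigmas robust standard deviations away are replaced by that median.
    """
    scale = 1.4826  # makes the MAD a consistent sigma estimate for Gaussian data
    filtered = list(values)
    outliers = []
    for i in range(len(values)):
        lo = max(0, i - half_window)
        hi = min(len(values), i + half_window + 1)
        window = values[lo:hi]
        med = median(window)
        mad = median([abs(v - med) for v in window])
        if abs(values[i] - med) > n_sigmas * scale * mad:
            filtered[i] = med
            outliers.append(i)
    return filtered, outliers


# Hypothetical water levels (m) with one spurious spike at index 4.
levels = [10.0, 10.1, 10.2, 10.1, 15.0, 10.3, 10.2, 10.4, 10.3]
corrected, flagged = hampel_filter(levels, half_window=2)
print(flagged)       # indices flagged as outliers
print(corrected[4])  # spike replaced by the local median
```

A wider window smooths over genuine level changes from irrigation or rainfall, while a narrower one can let clustered errors dominate the local median, which is the trade-off the abstract describes.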

Estimation of Moisture Content in Cucumber and Watermelon Seedlings Using Hyperspectral Imagery (초분광영상 이용 오이 및 수박 묘의 수분함량 추정)

  • Kim, Seong-Heon;Kang, Jeong-Gyun;Ryu, Chan-Seok;Kang, Ye-Seong;Sarkar, Tapash Kumar;Kang, Dong Hyeon;Ku, Yang-Gyu;Kim, Dong-Eok
    • Journal of Bio-Environment Control / v.27 no.1 / pp.34-39 / 2018
  • This research was conducted to estimate the moisture content of Cucurbitaceae seedlings, such as cucumber and watermelon, using hyperspectral imagery. Using a hyperspectral image acquisition system, the reflectance of the leaf area of cucumber and watermelon seedlings was calculated after applying water stress. Then, the moisture content of each seedling was measured using a dry oven. Finally, moisture content estimation models were developed from the reflectance and moisture content by PLSR analysis. The cucumber model achieved an $R^2$ of 0.73, an RMSE of 1.45%, and an RE of 1.58%. The watermelon model achieved an $R^2$ of 0.66, an RMSE of 1.06%, and an RE of 1.14%. The cucumber model performed slightly better after one sample was removed from the cucumber seedlings as an outlier: the new model achieved an $R^2$ of 0.79, an RMSE of 1.10%, and an RE of 1.20%. The model built from all samples combined achieved an $R^2$ of 0.67, an RMSE of 1.26%, and an RE of 1.36%. The cucumber model performed better than the watermelon model because the cucumber variables had widely distributed variation, which affected performance. Further, the accuracy and precision of the cucumber model increased when an insignificant sample was eliminated from the dataset. Finally, both models are considered usable for estimating moisture content, as the gradients of their trend lines are almost the same and the lines intersect. The accuracy and precision of the estimation models can possibly be improved if the models are constructed using variables with widely distributed variation. The improved models will be used as the basis for developing low-priced sensors.
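
The reported $R^2$ and RMSE figures can be related to data with a short sketch. It assumes $R^2$ is the coefficient of determination (one minus the ratio of residual to total sum of squares) and RMSE is the root mean square of the prediction errors. The abstract does not define RE, so it is not computed here. `r2_and_rmse` and the moisture values are hypothetical.

```python
import math


def r2_and_rmse(measured, predicted):
    """Return (R^2, RMSE) for paired measured and predicted values."""
    mean_y = sum(measured) / len(measured)
    ss_res = sum((y - p) ** 2 for y, p in zip(measured, predicted))
    ss_tot = sum((y - mean_y) ** 2 for y in measured)
    r2 = 1.0 - ss_res / ss_tot
    rmse = math.sqrt(ss_res / len(measured))
    return r2, rmse


# Hypothetical oven-dried moisture contents (%) and model estimates (%).
measured = [90.0, 92.0, 94.0, 96.0]
predicted = [91.0, 91.0, 95.0, 95.0]
r2, rmse = r2_and_rmse(measured, predicted)
print(r2, rmse)
```

Because the total sum of squares sits in the denominator of $R^2$, a dataset with a wider spread of moisture contents can yield a higher $R^2$ at the same RMSE, which is consistent with the abstract's remark about widely distributed variation.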