• Title/Summary/Keyword: suspicious data

Validation of Quality Control Algorithms for Temperature Data of the Republic of Korea (한국의 기온자료 품질관리 알고리즘의 검증)

  • Park, Changyong;Choi, Youngeun
    • Atmosphere / v.22 no.3 / pp.299-307 / 2012
  • This study aims to validate errors in suspicious temperature data detected by various quality control procedures at 61 weather stations in the Republic of Korea. The quality control algorithms for temperature data consist of four main procedures (high-low extreme check, internal consistency check, temporal outlier check, and spatial outlier check). Whether the detected suspicious temperature data are errors is judged by examining temperature data from nearby stations, surface weather charts, hourly temperature data, daily precipitation, and daily maximum wind direction. The internal consistency check and the spatial outlier check detected errors on 4 days (3 stations) and 7 days (5 stations), respectively. The effective and objective error validation methods presented in this study will help reduce the manpower and time needed for quality management of temperature data.
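
The four procedures named in the abstract can be sketched roughly as follows. This is a minimal illustration only: the abstract gives no thresholds or formulas, so every limit (`lo`, `hi`, `z`, `max_diff`) and every function name here is an assumption, not the authors' algorithm.

```python
from statistics import mean, stdev


def high_low_extreme_check(t, lo=-40.0, hi=45.0):
    """Flag a temperature outside assumed plausible limits (degrees C)."""
    return t < lo or t > hi


def internal_consistency_check(tmin, tmean, tmax):
    """Flag a day whose daily minimum, mean, and maximum are out of order."""
    return not (tmin <= tmean <= tmax)


def temporal_outlier_check(series, index, z=3.0):
    """Flag series[index] if it lies more than z sample standard
    deviations from the mean of the other values in the series."""
    others = series[:index] + series[index + 1:]
    if len(others) < 2:
        return False  # not enough data to estimate spread
    sd = stdev(others)
    if sd == 0:
        return series[index] != others[0]
    return abs(series[index] - mean(others)) > z * sd


def spatial_outlier_check(t, neighbor_values, max_diff=10.0):
    """Flag a value that differs from the mean of nearby stations
    by more than an assumed tolerance (degrees C)."""
    return abs(t - mean(neighbor_values)) > max_diff
```

A value flagged by any of these checks would only be marked as suspicious; per the abstract, deciding whether it is a real error still requires comparison with nearby stations, weather charts, and hourly data.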

Bayesian Outlier Detection in Regression Model

  • Younshik Chung;Kim, Hyungsoon
    • Journal of the Korean Statistical Society / v.28 no.3 / pp.311-324 / 1999
  • The problem of 'outliers', observations which look suspicious in some way, has long been one of the greatest concerns for experimenters and data analysts. We propose a model for the outlier problem and analyze it in a linear regression model using a Bayesian approach. We use the mean-shift model and the SSVS idea of George and McCulloch (1993), which is based on the data augmentation method. The advantage of the proposed method is that it finds the subset of data that is most suspicious under the given model by posterior probability. An MCMC method (the Gibbs sampler) can be used to overcome the complicated Bayesian computation. Finally, the proposed method is applied to simulated data and real data.
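
For orientation, one common way to write a mean-shift outlier model with an SSVS-style mixture prior is sketched below. The abstract does not give the paper's exact specification, so the symbols ($\delta_i$, $\gamma_i$, $\tau$, $c$) and the form of the prior are assumptions.

```latex
% Mean-shift regression: observation i may carry an extra shift \delta_i
y_i = \mathbf{x}_i^{\top}\boldsymbol{\beta} + \delta_i + \varepsilon_i,
\qquad \varepsilon_i \sim N(0, \sigma^2), \qquad i = 1, \dots, n

% SSVS-style prior: a narrow component (no real shift) and a wide one (outlier)
\delta_i \mid \gamma_i \sim (1 - \gamma_i)\, N(0, \tau^2) + \gamma_i\, N(0, c^2 \tau^2),
\qquad \gamma_i \in \{0, 1\}, \quad c > 1
```

Under this sketch, observation $i$ would be judged suspicious when the posterior probability $P(\gamma_i = 1 \mid \mathbf{y})$, estimated from the Gibbs sampler draws, is large.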

Clinical Value of Dividing False Positive Urine Cytology Findings into Three Categories: Atypical, Indeterminate, and Suspicious of Malignancy

  • Matsumoto, Kazumasa;Ikeda, Masaomi;Hirayama, Takahiro;Nishi, Morihiro;Fujita, Tetsuo;Hattori, Manabu;Sato, Yuichi;Ohbu, Makoto;Iwam, Masatsugu
    • Asian Pacific Journal of Cancer Prevention / v.15 no.5 / pp.2251-2255 / 2014
  • Background: The aim of this study was to evaluate 10 years of false positive urine cytology records, along with follow-up histologic and cytologic data, to determine the significance of suspicious urine cytology findings. Materials and Methods: We retrospectively reviewed records of urine samples harvested between January 2002 and December 2012 from voided and catheterized urine from the bladder. Among the 21,283 urine samples obtained during this period, we located 1,090 eligible false positive findings for patients being evaluated for the purpose of confirming urothelial carcinoma (UC). These findings were divided into three categories: atypical, indeterminate, and suspicious of malignancy. Results: Of the 1,090 samples classified as false positive, 444 (40.7%) were categorized as atypical, 367 (33.7%) as indeterminate, and 279 (25.6%) as suspicious of malignancy. Patients with concomitant UC accounted for 105 (23.6%) of the atypical samples, 147 (40.1%) of the indeterminate samples, and 139 (49.8%) of the suspicious of malignancy samples (p<0.0001). The rate of subsequent diagnosis of UC during a 1-year follow-up period after harvesting of a sample with false positive urine cytology initially diagnosed as benign was significantly higher in the suspicious of malignancy category than in the other categories (p<0.001). The total numbers of UCs were 150 (33.8%) for atypical samples, 213 (58.0%) for indeterminate samples, and 199 (71.3%) for samples categorized as suspicious of malignancy. Conclusions: Urine cytology remains the most specific adjunctive method for the surveillance of UC. We demonstrated the clinical value of dividing false positive urine cytology findings into three categories, and our results may help clinicians better manage patients with suspicious findings.

A Bayesian Approach to Detecting Outliers Using Variance-Inflation Model

  • Lee, Sangjeen;Chung, Younshik
    • Communications for Statistical Applications and Methods / v.8 no.3 / pp.805-814 / 2001
  • The problem of 'outliers', observations which look suspicious in some way, has long been one of the greatest concerns for experimenters and data analysts. We propose a model for the outlier problem and analyze it in a linear regression model using a Bayesian approach with the variance-inflation model. We use the ideas of Geweke (1996), which are based on the data augmentation method, to detect outliers in the linear regression model. The advantage of the proposed method is that it finds the subset of data that is most suspicious under the given model by posterior probability. A sampling-based approach can be used to handle the complicated Bayesian computation. Finally, the proposed methodology is applied to simulated data and real data.
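
As a rough illustration, a variance-inflation outlier model is often written as below. The abstract does not state the paper's exact form, so the inflation factor $k$, the indicators $\gamma_i$, and the prior probability $p$ are assumed notation.

```latex
% Regression with an inflated error variance for outlying observations
y_i = \mathbf{x}_i^{\top}\boldsymbol{\beta} + \varepsilon_i,
\qquad \varepsilon_i \mid \gamma_i \sim N\!\left(0,\; k^{\gamma_i} \sigma^2\right),
\qquad k > 1

% Latent indicator: \gamma_i = 1 marks observation i as an outlier
\gamma_i \sim \mathrm{Bernoulli}(p), \qquad i = 1, \dots, n
```

In this sketch the error variance is $\sigma^2$ for regular observations and $k\sigma^2$ for outliers, and a large posterior probability $P(\gamma_i = 1 \mid \mathbf{y})$ marks observation $i$ as suspicious.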

Malignancy Risk Stratification of Thyroid Nodules with Macrocalcification and Rim Calcification Based on Ultrasound Patterns

  • Hwa Seon Shin;Dong Gyu Na;Wooyul Paik;So Jin Yoon;Hye Yun Gwon;Byeong-Joo Noh;Won Jun Kim
    • Korean Journal of Radiology / v.22 no.4 / pp.663-671 / 2021
  • Objective: To determine the association of macrocalcification and rim calcification with malignancy and to stratify the malignancy risk of thyroid nodules with macrocalcification and rim calcification based on ultrasound (US) patterns. Materials and Methods: The study included a total of 3603 consecutive nodules (≥ 1 cm) with final diagnoses. The associations of macrocalcification and rim calcification with malignancy and malignancy risk of the nodules were assessed overall and in subgroups based on the US patterns of the nodules. The malignancy risk of the thyroid nodules was categorized as high (> 50%), intermediate (upper-intermediate: > 30%, ≤ 50%; lower-intermediate: > 10%, ≤ 30%), and low (≤ 10%). Results: Macrocalcification was independently associated with malignancy in all nodules and solid hypoechoic (SH) nodules (p < 0.001). Rim calcification was not associated with malignancy in all nodules (p = 0.802); however, it was independently associated with malignancy in partially cystic or isoechoic and hyperechoic (PCIH) nodules (p = 0.010). The malignancy risks of nodules with macrocalcification were classified as upper-intermediate and high in SH nodules, and as low and lower-intermediate in PCIH nodules based on suspicious US features. The malignancy risks of nodules with rim calcification were stratified as low and lower-intermediate based on suspicious US features. Conclusion: Macrocalcification increased the malignancy risk in all and SH nodules with or without suspicious US features, with low to high malignancy risks depending on the US patterns. Rim calcification increased the malignancy risk in PCIH nodules, with low and lower-intermediate malignancy risks based on suspicious US features. However, the role of rim calcification in risk stratification of thyroid nodules remains uncertain.
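
The risk categories defined in the Materials and Methods can be expressed as a small lookup. The cut-offs below come directly from the abstract; the function name and the percentage input format are assumptions for illustration.

```python
def malignancy_risk_category(risk_percent: float) -> str:
    """Map a malignancy risk (in percent) to the category used in the abstract:
    high (> 50%), upper-intermediate (> 30%, <= 50%),
    lower-intermediate (> 10%, <= 30%), low (<= 10%)."""
    if not 0.0 <= risk_percent <= 100.0:
        raise ValueError("risk_percent must be between 0 and 100")
    if risk_percent > 50.0:
        return "high"
    if risk_percent > 30.0:
        return "upper-intermediate"
    if risk_percent > 10.0:
        return "lower-intermediate"
    return "low"
```

Note that each boundary value belongs to the lower category, so a risk of exactly 50% is upper-intermediate and exactly 10% is low.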

A Realtime Malware Detection Technique Using Multiple Filter (다중 필터를 이용한 실시간 악성코드 탐지 기법)

  • Park, Jae-Kyung
    • Journal of the Korea Society of Computer and Information / v.19 no.7 / pp.77-85 / 2014
  • Recently, damage caused by malicious or suspicious code has been increasing across several environments. We actively study a comprehensive response system for malware detection. Suspicious code is installed on a user's PC without consent, and users are unaware of the damage. There is also a need for technology for real-time processing of big data. Advanced technology for malware detection must be developed. For fundamental malware detection, the static and dynamic behavior of executable files must be analyzed and the result verified by reputation. Similarity judgment is needed for real-time response with big data. In this paper, we propose a real-time detection and verification technology using multiple filters. Our malware study suggests a new direction for real-time malware detection.

A Study for Quality Improvement of Three-dimensional Body Measurement Data (3차원 인체치수 조사 자료의 품질 개선을 위한 연구)

  • Park, Sun-Mi;Nam, Yun-Ja;Park, Jin-Woo
    • Journal of the Ergonomics Society of Korea / v.28 no.4 / pp.117-124 / 2009
  • To inspect the quality of data collected in a large-scale body measurement survey, a proper data editing process must be established. Three-dimensional body measurement may contain measurement errors caused by the measurer's proficiency or by changes in the subject's posture. Errors may also arise in the algorithm that converts the information obtained from the three-dimensional scanner into numerical values, and during data processing that handles large amounts of data for individuals. Such errors degrade the quality of the measured data and consequently reduce the quality of the statistics based on them. This study therefore suggests a new way to improve the quality of data collected from three-dimensional body measurement by proposing a working procedure that identifies and corrects data errors across the whole data processing procedure (collecting, processing, and analyzing) of the 2004 Size Korea Three-dimensional Body Measurement Project. The study was carried out in three stages. First, we detected erroneous data by examining the logical relations among variables under each edit rule. Second, we detected suspicious data by independently examining individual variable values by sex and age. Finally, we examined a scatter-plot matrix of many variables to consider the relationships among them. This simple graphical tool helps reveal whether suspicious data exist in the data set. As a result, we detected some erroneous data in the raw data. We found that the main errors stem not from system errors of the three-dimensional body measurement system but from the subject's original three-dimensional shape data. By correcting the erroneous data, we enhanced data quality.
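
The first two stages described above might look something like the sketch below. The abstract does not list the actual edit rules, variables, or thresholds, so the rule (stature must exceed sitting height), the field names, and the cut-off `z` are all hypothetical.

```python
from collections import defaultdict
from statistics import mean, stdev


def violates_edit_rule(record):
    """Stage 1 (hypothetical rule): a logical relation among variables.
    Here, stature must be greater than sitting height."""
    return record["stature_cm"] <= record["sitting_height_cm"]


def suspicious_by_group(records, variable, z=3.0):
    """Stage 2: examine one variable independently within each
    (sex, age group) and flag values far from the rest of the group."""
    groups = defaultdict(list)
    for r in records:
        groups[(r["sex"], r["age_group"])].append(r)

    flagged = []
    for members in groups.values():
        if len(members) < 3:
            continue  # too few records to estimate spread
        for m in members:
            # Compare each record with the other members of its group.
            others = [x[variable] for x in members if x is not m]
            sd = stdev(others)
            if sd > 0 and abs(m[variable] - mean(others)) > z * sd:
                flagged.append(m)
    return flagged
```

The third stage, inspecting a scatter-plot matrix, is a visual check and is not sketched here.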

Simulated Dynamic C&C Server Based Activated Evidence Aggregation of Evasive Server-Side Polymorphic Mobile Malware on Android

  • Lee, Han Seong;Lee, Hyung-Woo
    • International journal of advanced smart convergence / v.6 no.1 / pp.1-8 / 2017
  • Diverse types of malicious code, such as evasive server-side polymorphic malware, are developed and distributed in third-party open markets. This suspicious new type of polymorphic malware can actively change and morph its internal data dynamically. As a result, it is very hard to detect this type of suspicious transaction as evidence of server-side polymorphic mobile malware, because its C&C server has been shut down or the IP address of the remote-controlling C&C server changes irregularly. Therefore, we implemented a Simulated C&C Server to fully aggregate activated events from various server-side polymorphic mobile malware. Using the proposed Simulated C&C Server, we can fully prove and more clearly classify veiled server-side polymorphic malicious code.

Fuzzy Darwinian Detection of Credit Card Fraud (퍼지-다윈의 불량 신용 탐지 시스템)

  • Bentley, Peter J.;Kim, Jung-Won;Jung, Gil-Ho;Choi, Jong-Uk
    • Proceedings of the Korea Information Processing Society Conference / 2000.10a / pp.277-280 / 2000
  • Credit evaluation is one of the most important and difficult tasks for credit card companies, mortgage companies, banks, and other financial institutions. Incorrect credit judgement causes huge financial losses. This work describes the use of an evolutionary-fuzzy system capable of classifying suspicious and non-suspicious credit card transactions. The paper starts with the details of the system used in this work. A series of experiments is described, showing that the complete system is capable of attaining good accuracy and intelligibility levels for real data.

A Study on the Factors Influencing Depression among Elderly People with, and without, Dementia (치매노인, 치매의심노인 및 일반노인의 우울에 영향을 미치는 요인)

  • Lee Keum-Jae;Lee Shin-Young
    • Journal of Korean Academy of Fundamentals of Nursing / v.11 no.2 / pp.166-176 / 2004
  • Purpose: The purpose of this study was to identify the factors that affect depression among elderly people with, and without, dementia. Method: The participants were 903 people who were 65 or older and resided in Sungnam City. Data were collected from April to July 2002 using a questionnaire. The collected data were analyzed using descriptive statistics and hierarchical multiple regression aided by SPSS/PC. Result: The variables at the final step of the regression equation accounted for 28.2% of variance in the dementia group, 21.4% in the group with suspicious dementia, and 18.9% in the normal group. The multiple regression analysis revealed that ADL and instrumental support were related significantly to depression in the dementia group. Self-rated health, IADL, social activity support, and instrumental support were significantly related to depression in the group with suspicious dementia. In the normal group, education, self-rated health, and living arrangement with family were significantly related to depression. Conclusion: Social support and health condition are important to decrease depression in elderly people with dementia.
