• Title/Summary/Keyword: Spatial Scan Statistic

Optimizing the maximum reported cluster size for normal-based spatial scan statistics

  • Yoo, Haerin;Jung, Inkyung
    • Communications for Statistical Applications and Methods / v.25 no.4 / pp.373-383 / 2018
  • The spatial scan statistic is a widely used method to detect spatial clusters. The method imposes a large number of scanning windows with pre-defined shapes and varying sizes on the entire study region. The likelihood ratio test statistic comparing inside versus outside each window is then calculated, and the window with the maximum value of the test statistic becomes the most likely cluster. The results of cluster detection are sensitive to the shape and the maximum size of the scanning windows. The shape of the scanning window has been extensively studied; however, relatively little attention has been paid to the maximum scanning window size (MSWS) or the maximum reported cluster size (MRCS). The Gini coefficient has recently been proposed by Han et al. (International Journal of Health Geographics, 15, 27, 2016) as a powerful tool to determine the optimal value of MRCS for the Poisson-based spatial scan statistic. In this paper, we apply the Gini coefficient to normal-based spatial scan statistics. Through a simulation study, we evaluate the performance of the proposed method. We illustrate the method using a real data example of female colorectal cancer incidence rates in South Korea for the year 2009.
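
The scanning procedure described in this abstract can be sketched in a few lines. The sketch below is only an illustration under stated assumptions: it uses the Poisson-based log likelihood ratio (Kulldorff's formulation) rather than the normal-based model the paper studies, it grows circular windows by nearest neighbours, and it omits the Monte Carlo significance test and the Gini-based MRCS selection. The function names `poisson_llr` and `most_likely_cluster` and the `max_frac` parameter are hypothetical.

```python
import math


def poisson_llr(c, e, total):
    """Poisson log likelihood ratio for one window (high-rate clusters only).

    c: observed cases inside the window
    e: expected cases inside the window under the null hypothesis
    total: total observed cases in the study region
    """
    if c <= e:
        return 0.0  # no excess risk inside the window
    inside = c * math.log(c / e)
    # When every case is inside the window the outside term vanishes.
    outside = 0.0 if c == total else (total - c) * math.log((total - c) / (total - e))
    return inside + outside


def most_likely_cluster(points, cases, pops, max_frac=0.5):
    """Scan circular windows centred on each point; return (llr, member indices)."""
    total_c, total_p = sum(cases), sum(pops)
    best = (0.0, [])
    for center in points:
        # Grow the window by adding locations in order of distance from the centre.
        order = sorted(range(len(points)), key=lambda j: math.dist(center, points[j]))
        c = p = 0
        members = []
        for j in order:
            c += cases[j]
            p += pops[j]
            if p > max_frac * total_p:  # maximum scanning window size reached
                break
            members.append(j)
            llr = poisson_llr(c, total_c * p / total_p, total_c)
            if llr > best[0]:
                best = (llr, sorted(members))
    return best


# Toy data (hypothetical): three high-count locations near the origin.
points = [(0, 0), (1, 0), (0, 1), (5, 5), (6, 5), (5, 6)]
cases = [10, 9, 11, 2, 3, 1]
pops = [100] * 6
llr, members = most_likely_cluster(points, cases, pops)
```

Here `max_frac` plays the role of the maximum scanning window size, expressed as a fraction of the total population; changing it changes which windows are evaluated, which is why the abstract treats its choice as a tuning problem.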

A Study on Spatial Statistical Perspective for Analyzing Spatial Phenomena in the Framework of GIS: an Empirical Example using Spatial Scan Statistic for Detecting Spatial Clusters of Breast Cancer Incidents (공간현상 분석을 위한 GIS 기반의 공간통계적 접근방법에 관한 고찰: 공간 군집지역 탐색을 위한 공간검색통계량의 실증적 사례분석)

  • Lee, Gyoung-Ju;Kweon, Ihl
    • Spatial Information Research / v.20 no.1 / pp.81-90 / 2012
  • When analyzing geographical phenomena, two properties need to be considered. One is the spatial dependence structure and the other is the variation or uncertainty inherent in a geographic space. These properties give rise to two problems. First, the spatial dependence structure, which is conceptualized as spatial autocorrelation, generates a heterogeneous geographic landscape in a spatial process. Second, generic statistics, although suitable for dealing with stochastic uncertainty, tacitly ignore the location information implicit in spatial data. GIS is a versatile tool for manipulating locational information, while spatial statistics are suitable for investigating spatial uncertainty. Therefore, integrating spatial statistics into GIS is considered a plausible strategy for appropriately understanding geographic phenomena of interest. Geographic hot-spot analysis is a key tool for identifying abnormal locations in many domains (e.g., criminology and epidemiology) and is one of the most prominent applications of this integration strategy. The article reviews the spatial statistical perspective for analyzing spatial processes in the framework of GIS by carrying out an empirical analysis. It illustrates the procedure for using the spatial scan statistic to detect clusters in the framework of GIS. The empirical analysis aims to identify spatial clusters of breast cancer incidents in Erie and Niagara counties, New York.

Cancer cluster detection using scan statistic (스캔 통계량을 이용한 암 클러스터 탐색)

  • Han, Junhee;Lee, Minjung
    • Journal of the Korean Data and Information Science Society / v.27 no.5 / pp.1193-1201 / 2016
  • In epidemiology or etiology, we are often interested in identifying areas of elevated risk, so-called hot spots or clusters. Many existing clustering methods only indicate whether any clustering pattern exists in the study area. Recently, however, many newly introduced clustering methods can identify the location, size, and shape of clusters and also test whether the clusters are statistically significant. In this paper, we introduce one of the most commonly used clustering methods, the scan statistic, and its freely available implementation, the SaTScan software. To exemplify the usage of SaTScan, we used cancer data from the SEER program of the National Cancer Institute of the U.S.A. We aim to help researchers and practitioners who are interested in spatial cluster detection, using female lung cancer mortality data from the SEER program.

Cluster of Parasite Infections by the Spatial Scan Analysis in Korea

  • Bae, Kyoung-Eun;Chang, Yoon Kyung;Kim, Tong-Soo;Hong, Sung-Jong;Ahn, Hye-Jin;Nam, Ho-Woo;Kim, Dongjae
    • Parasites, Hosts and Diseases / v.58 no.6 / pp.603-608 / 2020
  • This study was performed to identify clusters with high parasite infection risk and to discuss their geographical pattern. Clusters were detected using SaTScan software, a spatial scan program based on Kulldorff's scan statistic. Information on parasitic infection cases in Korea from 2011 to 2019 was collected from the Korea Centers for Disease Control and Prevention. Clusters of Ascaris lumbricoides infection were detected in Jeollabuk-do, and T. trichiura in Ulsan, Busan, and Gyeongsangnam-do. C. sinensis clusters were detected in Ulsan, Daegu, Busan, Gyeongsangnam-do, and Gyeongsangbuk-do. Clusters of intestinal trematodes were detected in Ulsan, Busan, and Gyeongsangnam-do. A P. westermani cluster was found in Jeollabuk-do. E. vermicularis clusters were distributed in Gangwon-do, Jeju-do, Daegu, Daejeon, and Gwangju. This clustering information can serve as a reference for surveillance and control of parasitic infection outbreaks in infection-prone areas.

Spatial Cluster Analysis for Earthquake on the Korean Peninsula

  • Kang, Chang-Wan;Moon, Sung-Ho;Cho, Jang-Sik;Lee, Jeong-Hyeong;Choi, Seung-Bae;Beum, Soo-Gyun
    • Journal of the Korean Data and Information Science Society / v.17 no.4 / pp.1141-1150 / 2006
  • In this study, we performed a spatial cluster analysis that takes spatial information into account, using data on earthquakes that occurred on the Korean peninsula from 1978 to 2005. We also examined how regions cluster by earthquake magnitude and frequency based on the spatial scan statistic. On the basis of the results, we constructed an earthquake map by earthquake outbreak risk and gave a possible explanation for the results of the spatial cluster analysis.

Spatial analysis of $PM_{10}$ and cardiovascular mortality in the Seoul metropolitan area

  • Lim, Yu-Ra;Bae, Hyun-Joo;Lim, Youn-Hee;Yu, Seungdo;Kim, Geun-Bae;Cho, Yong-Sung
    • Environmental Analysis Health and Toxicology / v.29 / pp.5.1-5.7 / 2014
  • Objectives: Numerous studies have revealed the adverse health effects of acute and chronic exposure to particulate matter less than $10{\mu}m$ in aerodynamic diameter ($PM_{10}$). The aim of the present study was to examine the spatial distribution of $PM_{10}$ concentrations and cardiovascular mortality and to investigate the spatial correlation between $PM_{10}$ and cardiovascular mortality using the spatial scan statistic (SaTScan) and a regression model. Methods: From 2008 to 2010, the spatial distribution of $PM_{10}$ in the Seoul metropolitan area was examined via kriging. In addition, a group of cardiovascular mortality cases was analyzed using SaTScan-based cluster exploration. Geographically weighted regression (GWR) was applied to investigate the correlation between $PM_{10}$ concentrations and cardiovascular mortality. Results: An examination of the regional distribution showed that cardiovascular mortality was higher in provincial districts (gu) belonging to Incheon and the northern part of Gyeonggi-do than in other regions. In a comparison of $PM_{10}$ concentrations and mortality cluster (MC) regions, all those belonging to MC 1 and MC 2 were found to belong to particulate matter (PM) 1 and PM 2 with high concentrations of air pollutants. In addition, the GWR showed that $PM_{10}$ has a statistically significant relation to cardiovascular mortality. Conclusions: To investigate the relation between air pollution and health impact, spatial analyses can be utilized based on kriging, cluster exploration, and GWR for a more systematic and quantitative analysis. It has been proven that cardiovascular mortality is spatially related to the concentration of $PM_{10}$.

Cluster exploration of water pipe leak and complaints surveillance using a spatio-temporal statistical analysis (스캔통계량 분석을 통한 상수도 누수 및 수질 민원 발생 클러스터 탐색)

  • Juwon Lee;Eunju Kim;Sookhyun Nam;Tae-Mun Hwang
    • Journal of Korean Society of Water and Wastewater / v.37 no.5 / pp.261-269 / 2023
  • In light of recent social concerns about water supply pipe deterioration leading to problems such as leaks and degraded water quality, the significance of maintenance efforts to enhance water source quality and ensure a stable water supply has grown substantially. In this study, the scan statistic was applied to analyze water quality complaints and water leakage accidents from 2015 to 2021 in order to present a reasonable method for identifying areas requiring improvement in water management. SaTScan, a spatio-temporal statistical analysis program, and ArcGIS were used for spatial information analysis, and clusters with high relative risk (RR) were determined using the maximum log-likelihood ratio, relative risk, and Monte Carlo hypothesis test for I city, the target area. Specifically, in the case of water quality complaints, the analysis results were compared by distinguishing cases occurring before and after the onset of "red water." The period between 2015 and 2019 revealed that preceding the occurrence of red water, the leak cluster at location L2 posed a significantly higher risk (RR: 2.45) than other regions. As for water quality complaints, cluster C2 exhibited a notably elevated RR (RR: 2.21) and appeared concentrated in areas D and S, respectively. On the other hand, post-red water incidents of water quality complaints were predominantly concentrated in area S. The analysis found that the locations of complaint clusters were similar to those of red water incidents. Of these, cluster C7 exhibited a substantial RR of 4.58, signifying more than a twofold increase compared to pre-incident levels. A kernel density map analysis was performed using GIS to identify priority areas for waterworks management based on the central location of clusters and complaint cluster RR data.
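
For readers unfamiliar with the RR values quoted in this abstract, the sketch below shows how a cluster's relative risk is commonly computed for a Poisson scan: the ratio of observed to expected cases inside the cluster, divided by the same ratio outside it. This is an illustration only, not the authors' computation; the function name `relative_risk` and the toy counts are hypothetical, and the counts are not taken from the paper.

```python
import math


def relative_risk(c, expected, total):
    """Relative risk of a cluster: (c / E[c]) / ((C - c) / (C - E[c])).

    c: observed cases inside the cluster
    expected: expected cases inside the cluster under the null hypothesis
    total: total observed cases in the study region (C)
    """
    outside_obs = total - c
    outside_exp = total - expected
    if outside_obs == 0:
        return math.inf  # every case falls inside the cluster
    return (c / expected) / (outside_obs / outside_exp)


# Toy counts (hypothetical): 30 of 36 cases inside a cluster where 18 were expected.
rr = relative_risk(30, 18, 36)
```

An RR above 1 means the rate inside the cluster exceeds the rate outside it, which is the sense in which the abstract compares clusters L2, C2, and C7.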

A Study on the Regional Characteristics of Broadband Internet Termination by Coupling Type using Spatial Information based Clustering (공간정보기반 클러스터링을 이용한 초고속인터넷 결합유형별 해지의 지역별 특성연구)

  • Park, Janghyuk;Park, Sangun;Kim, Wooju
    • Journal of Intelligence and Information Systems / v.23 no.3 / pp.45-67 / 2017
  • According to the Internet Usage Research performed in 2016, the number of internet users and the amount of internet usage have been increasing. The smartphone, compared to the computer, is taking a more dominant role as an internet access device. As the number of smart devices has increased, some expect that the demand for high-speed internet will decrease; however, despite the increase in smart devices, the high-speed internet market is expected to grow slightly for a while due to the speedup of Giga Internet and the growth of the IoT market. As the broadband internet market saturates, telecom operators are competing excessively to win new customers, but if they knew the causes of customer exit, they could be expected to reduce marketing costs through more effective marketing. In this study, we analyzed the relationship between the cancellation rates of telecommunication products and the factors affecting them by combining data for three cities (Anyang, Gunpo, and Uiwang) owned by a telecommunication company with regional data from KOSIS (Korean Statistical Information Service). In particular, we focused on the assumption that neighboring areas affect the distribution of the cancellation rates by coupling type, so we conducted spatial cluster analysis on the three types of cancellation rates of each region using the spatial analysis tool SaTScan, and analyzed the various relationships between the cancellation rates and the regional data. In the analysis phase, we first summarized the characteristics of the clusters derived by combining spatial information and the cancellation data. Next, based on the results of the cluster analysis, analysis of variance, correlation analysis, and regression analysis were used to analyze the relationship between the cancellation rate data and the regional data. Based on the results of the analysis, we proposed appropriate marketing methods for each region.
Unlike previous studies on regional characteristics analysis, this study is academically distinctive in that it performs clustering based on spatial information so that regions with similar cancellation types are grouped with adjacent regions. In addition, few previous studies on the determinants of subscription to high-speed internet services have considered regional characteristics; in this study, we tried to analyze the relationship between the clusters and the regional characteristics data, assuming that different factors operate depending on the region. We also sought a more efficient marketing method that considers the characteristics of each region in new subscriptions and customer management for high-speed internet. The analysis of variance confirmed that there were significant differences in regional characteristics among the clusters, the correlation analysis showed a stronger correlation within the clusters than across all regions, and regression analysis was used to analyze the relationship between the cancellation rate and the regional characteristics. As a result, we found that the cancellation rate differs depending on the regional characteristics, and that differentiated marketing can be targeted to each region. The biggest limitation of this study is that it was difficult to obtain enough data to carry out the analysis. In particular, it is difficult to find variables that represent regional characteristics at the Dong unit. In other words, most of the data were disclosed at the city level rather than the Dong unit, so detailed analysis was limited. Data such as income, card usage information, and telecommunications company policies or characteristics that could affect cancellation were not available at the time. The most urgent need for a more sophisticated analysis is to obtain Dong-unit data on regional characteristics.
Future studies should pursue target marketing based on these results. It would also be meaningful to analyze the effect of marketing by comparing results before and after target marketing. It would also be effective to use clusters based on new subscription data as well as cancellation data.