• Title/Abstract/Keywords: location-based clustering

The Study of Land Surface Change Detection Using Long-Term SPOT/VEGETATION (장기간 SPOT/VEGETATION 정규화 식생지수를 이용한 지면 변화 탐지 개선에 관한 연구)

  • Yeom, Jong-Min;Han, Kyung-Soo;Kim, In-Hwan
    • Journal of the Korean Association of Geographic Information Studies / v.13 no.4 / pp.111-124 / 2010
  • Monitoring land surface change is an important research field because it is related to land use, climate change, meteorology, agriculture, surface energy balance, and the surface environment system. Many change detection methods have been proposed, using tools that range from ground-based measurements to multispectral satellite sensors. Recently, high-resolution satellite data have come to be regarded as the most efficient way to monitor extensive land environments at higher spatial and temporal resolution. In this study, we use two satellites with different spatial resolutions: SPOT/VEGETATION, with 1 km spatial resolution, to detect change at coarse resolution and to determine an objective threshold, and Landsat, with high resolution, to identify detailed land environmental change. Because of their different spatial resolutions, the two satellites have different observation characteristics, such as repeat cycle and global coverage. By combining the two, we can detect land surface change from medium to high resolution. The K-means clustering algorithm is applied to two images from different dates to detect changed areas. In the solar spectral bands, complicated surface reflectance scattering characteristics make surface change detection difficult and can cause serious problems when interpreting surface characteristics. For example, even when the reflectance of the surface itself is constant, the observed value can change with the relative positions of the sun and the sensor. To reduce these effects, this study uses long-term Normalized Difference Vegetation Index (NDVI) data, derived from SPOT/VEGETATION solar spectral channels after atmospheric and bidirectional correction, to provide an objective threshold for detecting land surface change, since NDVI is less sensitive to solar geometry than the individual solar channels. Surface change detection based on long-term NDVI gives better results than detection using Landsat alone.
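The idea of deriving a change threshold from a long-term NDVI record can be sketched as follows. NDVI itself is the standard ratio (NIR - RED) / (NIR + RED); the thresholding rule (k standard deviations of the long-term series) and all function names and sample values are assumptions made for illustration, since the abstract does not state how the threshold is computed.

```python
from statistics import stdev


def ndvi(nir, red):
    """Standard NDVI for one pixel: (NIR - RED) / (NIR + RED)."""
    return (nir - red) / (nir + red)


def change_threshold(long_term_ndvi, k=2.0):
    """Hypothetical rule: k sample standard deviations of the long-term series."""
    return k * stdev(long_term_ndvi)


def is_changed(ndvi_before, ndvi_after, threshold):
    """Flag a pixel whose NDVI moved by more than the threshold."""
    return abs(ndvi_after - ndvi_before) > threshold


# Made-up long-term NDVI history for one pixel (not data from the paper).
history = [0.60, 0.62, 0.58, 0.61, 0.59]
threshold = change_threshold(history)

before = ndvi(nir=0.40, red=0.10)
after = ndvi(nir=0.25, red=0.20)
changed = is_changed(before, after, threshold)
```

A threshold taken from the pixel's own history keeps small seasonal or geometric fluctuations from being flagged, which matches the stated motivation for using NDVI rather than raw solar channels.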

A study on the estimation of AADT by short-term traffic volume survey (단기조사 교통량을 이용한 AADT 추정연구)

  • 이승재;백남철;권희정
    • Journal of Korean Society of Transportation / v.20 no.6 / pp.59-68 / 2002
  • AADT (Annual Average Daily Traffic) can be estimated from short-term traffic counts rather than from traffic data collected over all 365 days. The estimation process is therefore very important, and there have been many studies on estimating AADT. In this paper, we try to improve the AADT estimation process, building on earlier AADT estimation research. First, we identified the factor that shows differences among groups. To do so, we examined hourly variables (divided into total hours, weekday hours, Saturday hours, Sunday hours, weekday and Sunday hours, and weekday and Saturday hours), changing the number of groups each time. As a result, we selected the weekday and Sunday hourly variables as the factor showing differences among groups. Second, we classified 200 locations into 10 groups through cluster analysis using only monthly variables. The rule for deciding the number of groups is to maximize the deviation among the hourly variables of each group. Third, we classified the 200 locations used in the second step into the 10 groups by applying statistical techniques such as discriminant analysis and neural networks. This step tests how often each location is assigned to its correct group rather than to a wrong one. In conclusion, the results of this method were closer to the real AADT values than those of the former method, and this study contributes significantly to improving AADT estimation.
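Once a short-term count location has been assigned to a group, a common way to expand its count to an annual figure is to multiply by a group adjustment factor. The abstract does not spell out this step, so the sketch below assumes the usual factor approach; the function names and numbers are hypothetical.

```python
def group_factor(station_aadt, station_month_adt):
    """Mean ratio of AADT to the survey month's average daily traffic,
    taken over the permanent count stations of one group."""
    ratios = [aadt / adt for aadt, adt in zip(station_aadt, station_month_adt)]
    return sum(ratios) / len(ratios)


def estimate_aadt(short_term_daily_counts, factor):
    """Expand the mean of a short-term count with the group's factor."""
    adt = sum(short_term_daily_counts) / len(short_term_daily_counts)
    return adt * factor


# Two permanent stations in the group (illustrative values only).
factor = group_factor(station_aadt=[10000, 20000],
                      station_month_adt=[12500, 25000])

# A two-day count at a location assigned to that group.
aadt = estimate_aadt([11800, 12200], factor)
```

The quality of the estimate depends on assigning the location to the right group, which is why the third step of the study tests the classification rate.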

Analysis of public library book loan demand according to weather conditions using machine learning (머신러닝을 활용한 기상조건에 따른 공공도서관 도서대출 수요분석)

  • Oh, Min-Ki;Kim, Keun-Wook;Shin, Se-Young;Lee, Jin-Myeong;Jang, Won-Jun
    • Journal of Digital Convergence / v.20 no.3 / pp.41-52 / 2022
  • Although domestic public libraries have achieved quantitative growth under the 1st and 2nd comprehensive library development plans, some qualitative shortcomings remain, and various studies have been conducted to address them. Most previous studies are limited to social and economic factors and to statistical analysis. This study therefore applies a spatiotemporal concept to quantify the decrease in public library loan demand due to rainfall and heatwaves, clusters areas where book loan demand is high under weather changes and areas where it is not, and then combines factors inside and outside public libraries to analyze changes in loan demand according to weather. The analysis showed that the weather-related decrease differed from library to library and varied with the characteristics and spatial location of the library. In addition, when the temperature exceeded 35℃, the decrease in book loan demand grew significantly. The number of seats, the number of books, and area were derived as internal factors. As external factors, the public library access ramp, cafe, reading room, floating population in their teens, and floating population of women in their 30s/40s were identified as important variables. These results are expected to contribute to policies that promote public library use in light of the weather in a specific season. The limitations of the study are also discussed.

Location Classification and Its Utilization for Illegal Parking Enforcement: Focusing on the Case of Gyeonggi (불법주정차 단속을 위한 지역(장소) 분류 및 활용 방안: 경기도를 중심으로)

  • Hyeon Han;So-yeon Choe;So-Hyun Lee
    • Information Systems Review / v.25 no.4 / pp.113-130 / 2023
  • With economic development and rising gross national income, the number of automobiles continues to grow, and limited road conditions and insufficient parking facilities have made illegal parking a serious issue. Illegal parking causes significant inconvenience and displeasure and can even result in accidents and loss of life. The severity of accidents related to the growing number of vehicles and illegal parking is escalating, particularly in metropolitan areas. Consequently, efforts are being made to treat illegal parking as a cause of social problems and to devise measures to reduce it. In particular, half of the public complaints in the metropolitan area concern illegal parking, and the highest physical and human damage occurs in Gyeonggi. This study therefore applies machine learning techniques to data on illegal parking in Suwon city, Gyeonggi, to categorize regional characteristics and propose effective enforcement measures. It also suggests practical, social, policy, and legal measures to decrease illegal parking in the metropolitan area. The study has academic significance in that it addresses illegal parking, one of the social problems that cause traffic congestion, by classifying regional characteristics with K-prototype, a machine learning algorithm. The results also contribute in practical and social terms by providing measures to decrease illegal parking in the metropolitan area.
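K-prototype clustering handles data with both numeric and categorical attributes by combining squared Euclidean distance on the numeric part with a weighted count of categorical mismatches. The sketch below shows only that dissimilarity and the nearest-prototype assignment step. The feature layout, the weight `gamma`, and the sample values are hypothetical and are not taken from the study.

```python
def mixed_dissimilarity(a_num, a_cat, b_num, b_cat, gamma=1.0):
    """Squared Euclidean distance on numeric features plus
    gamma times the number of mismatching categorical features."""
    numeric = sum((x - y) ** 2 for x, y in zip(a_num, b_num))
    categorical = sum(1 for x, y in zip(a_cat, b_cat) if x != y)
    return numeric + gamma * categorical


def nearest_prototype(point_num, point_cat, prototypes, gamma=1.0):
    """Index of the prototype with the smallest mixed dissimilarity."""
    dists = [mixed_dissimilarity(point_num, point_cat, p_num, p_cat, gamma)
             for p_num, p_cat in prototypes]
    return dists.index(min(dists))


# Hypothetical prototypes: (scaled numeric features, land-use category).
prototypes = [
    ([0.2, 0.1], ["residential"]),
    ([0.9, 0.8], ["commercial"]),
]
assigned = nearest_prototype([0.8, 0.7], ["commercial"], prototypes)
```

The weight `gamma` controls how much a categorical mismatch counts relative to numeric distance, so numeric features are usually scaled before clustering.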

The Behavior Analysis of Exhibition Visitors using Data Mining Technique at the KIDS & EDU EXPO for Children (유아교육 박람회에서 데이터마이닝 기법을 이용한 전시 관람 행동 패턴 분석)

  • Jung, Min-Kyu;Kim, Hyea-Kyeong;Choi, Il-Young;Lee, Kyoung-Jun;Kim, Jae-Kyeong
    • Journal of Intelligence and Information Systems / v.17 no.2 / pp.77-96 / 2011
  • An exhibition is a market event held for a specific duration to present exhibitors' main products to business or private visitors, and it plays a key role as an effective marketing channel. As exhibitions have grown in importance, the domestic exhibition industry has achieved great quantitative growth, but its qualitative growth has not kept pace. To improve the quality of exhibitions, we need to understand visitors' preferences and behavioral characteristics and use that understanding to raise their attention and satisfaction. In this paper, we therefore use the observation survey method, a kind of field research, to understand visitors and collect real data for analyzing behavior patterns. We propose a methodology framework consisting of three steps. The first step is to select a suitable exhibition to which our method can be applied. The second step is to carry out the observation survey and collect real data for further analysis. We conducted the observation survey at the KIDS & EDU EXPO for Children at SETEC, covering 160 visitors and 78 booths from November 4th to 6th, 2010. The last step is to analyze the data recorded through observation. In this step, we first analyze the features of the exhibition using the demographic characteristics collected by the observation survey, and then analyze individual booth features from the records of visited booths. Through the analysis of individual booth features, we can find out what kinds of events attract visitors' attention and what kinds of marketing activities affect visitors' behavior patterns. However, because previous research considered only individual features influenced by the exhibition, there has been little research on the correlation among features. In this research, we therefore carry out additional analysis with data mining techniques to supplement the existing research, analyzing the relations among booths to understand visitors' behavior patterns. We use two data mining techniques: clustering analysis and ARM (Association Rule Mining) analysis. In the clustering analysis, we use the K-means algorithm to find the correlation among booths. Through these techniques, we find two important features that affect visitors' behavior patterns at an exhibition: the geographical features of booths and the exhibit contents of booths. Organizers can consider these features when planning the next exhibition. The results of our analysis are therefore expected to provide guidelines for understanding visitors and valuable insights from the early phases of exhibition planning, and the research could help increase visitor satisfaction. Visitors' movement paths, booth locations, and the distances between booths can be considered in advance when planning the next exhibition. This research was conducted at the KIDS & EDU EXPO for Children at SETEC (Seoul Trade Exhibition & Convention), so there are some constraints on applying it directly to other exhibitions. In addition, the results were derived from a limited number of data samples. To obtain more accurate and reliable results, more experiments are needed with larger data samples and exhibitions of a variety of genres.
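Association rule mining over booth visits rests on two basic quantities: the support of a set of booths (the share of visitors who visited all of them) and the confidence of a rule such as "visited A, so also visited B". A minimal sketch with made-up visit records follows; the booth labels are hypothetical and no figures come from the paper.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= set(t)) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """support(antecedent and consequent) / support(antecedent)."""
    base = support(transactions, antecedent)
    if base == 0:
        return 0.0  # rule is undefined when nobody matches the antecedent
    return support(transactions, set(antecedent) | set(consequent)) / base


# Each set is the booths one visitor stopped at (illustrative data).
visits = [{"A", "B"}, {"A", "B", "C"}, {"A", "C"}, {"B"}]

sup_ab = support(visits, {"A", "B"})
conf_a_to_b = confidence(visits, {"A"}, {"B"})
```

Rules with high confidence between neighbouring booths would point to the geographical effect the abstract describes, while rules between distant booths with similar exhibits would point to the content effect.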

Derivation of Digital Music's Ranking Change Through Time Series Clustering (시계열 군집분석을 통한 디지털 음원의 순위 변화 패턴 분류)

  • Yoo, In-Jin;Park, Do-Hyung
    • Journal of Intelligence and Information Systems / v.26 no.3 / pp.171-191 / 2020
  • This study focuses on digital music, which is the most valuable cultural asset in modern society and occupies a particularly important position in the Korean Wave. Digital music data were collected from the "Gaon Chart," a well-established music chart in Korea, covering the ranking changes of music that entered the chart over 73 weeks. Patterns with similar characteristics were then derived through time series cluster analysis, and a descriptive analysis was performed on the notable features of each pattern. The research process is as follows. In the data collection stage, time series data were collected to track the ranking changes of digital music. In the data processing stage, the collected data were matched with the rankings over time, and the music titles and artist names were processed. The analysis was then performed sequentially in two stages, exploratory analysis and explanatory analysis. The data collection period was limited to the period before 'the music bulk buying phenomenon', a reliability issue related to music rankings in Korea. Specifically, it covers 73 weeks, from the first week (December 31, 2017 to January 6, 2018) to the last week (May 19, 2019 to May 25, 2019). The analysis targets were limited to digital music released in Korea. Unlike the private music charts serviced in Korea, the Gaon Chart is approved by government agencies and has basic reliability, so it can be considered to have more public confidence than the ranking information provided by other services. The collected data comprise the period and ranking, the name of the music, the name of the artist, the name of the album, the Gaon index, the production company, and the distribution company for music that entered the top 100 of the chart within the collection period. In total, 7,300 top-100 entries were identified over the 73 weeks. Because digital music frequently stays on the chart for more than two weeks, duplicate entries were removed during pre-processing: the number and location of the duplicates were checked with a duplicate check function, and the duplicates were then deleted to form the data for analysis. This yielded a list of 742 unique songs from the 7,300 entries. A total of 16 patterns were derived through time series cluster analysis of the ranking changes. From these, two representative patterns were identified: 'Steady Seller' and 'One-Hit Wonder'. The two patterns were further subdivided into five patterns according to the survival period of the music and its ranking. The important characteristics of each pattern are as follows. First, the artist's superstar effect and the bandwagon effect were strong in the one-hit wonder pattern, so when consumers choose digital music, they are strongly influenced by these two effects. Second, the Steady Seller pattern identified music that consumers have chosen for a very long time. We also examined the patterns of the music most selected by consumers. Contrary to popular belief, the 'steady seller: mid-term' pattern, not the one-hit wonder pattern, received the most choices from consumers. Particularly noteworthy is that the 'Climbing the Chart' phenomenon, which runs contrary to the existing pattern, was confirmed within the steady-seller pattern. This study focuses on the change in music rankings over time, a topic that has been relatively neglected in research on digital music. It also attempts a new approach to music research by subdividing patterns of ranking change rather than predicting the success and ranking of music.
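The pre-processing described above turns 73 weekly top-100 lists (73 × 100 = 7,300 rows) into one rank series per unique song before clustering. A minimal sketch of that reshaping is shown below. The row layout, the use of `None` for weeks off the chart, and the sample rows are assumptions for illustration.

```python
from collections import defaultdict


def rank_series(chart_rows, n_weeks, absent_rank=None):
    """Collapse weekly chart rows into one rank series per (title, artist).

    chart_rows: iterable of (week_index, rank, title, artist) tuples.
    Weeks in which a song is not on the chart keep absent_rank.
    """
    series = defaultdict(lambda: [absent_rank] * n_weeks)
    for week, rank, title, artist in chart_rows:
        series[(title, artist)][week] = rank
    return dict(series)


# A toy 3-week, top-2 chart (placeholder titles, not real chart data).
rows = [
    (0, 1, "Song X", "Artist P"), (0, 2, "Song Y", "Artist Q"),
    (1, 1, "Song Y", "Artist Q"), (1, 2, "Song X", "Artist P"),
    (2, 1, "Song Z", "Artist R"), (2, 2, "Song Y", "Artist Q"),
]
series = rank_series(rows, n_weeks=3)
```

Here six chart rows collapse to three songs, in the same way that the 7,300 rows in the study collapse to 742 songs. Each resulting series is one input to the time series clustering.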

Comparison of Association Rule Learning and Subgroup Discovery for Mining Traffic Accident Data (교통사고 데이터의 마이닝을 위한 연관규칙 학습기법과 서브그룹 발견기법의 비교)

  • Kim, Jeongmin;Ryu, Kwang Ryel
    • Journal of Intelligence and Information Systems / v.21 no.4 / pp.1-16 / 2015
  • Traffic accidents have been one of the major causes of death worldwide for the last several decades. According to World Health Organization statistics, approximately 1.24 million deaths occurred on the world's roads in 2010. To reduce future traffic accidents, multipronged approaches have been adopted, including traffic regulations, injury-reducing technologies, and driver training programs. Records of traffic accidents are generated and maintained for this purpose. To make these records meaningful and effective, it is necessary to analyze the relationships between traffic accidents and related factors such as vehicle design, road design, weather, and driver behavior. Insights derived from such analyses can be used in accident prevention. Traffic accident data mining is the activity of finding useful knowledge about these relationships that is not well known and that users may be interested in. Many studies on mining accident data have been reported over the past two decades. Most of them focus on predicting accident risk from accident-related factors, using supervised learning methods such as decision trees, logistic regression, k-nearest neighbors, and neural networks. However, the prediction models derived from these algorithms are too complex for humans to understand, because the main purpose of the algorithms is prediction, not explanation of the data. Some studies use unsupervised clustering algorithms to divide the data into several groups, but the derived groups are still not easy for humans to understand, so additional analysis is needed. Rule-based learning methods are appropriate when we want to derive a comprehensible form of knowledge about the target domain. They derive a set of if-then rules that represent the relationship between the target feature and other features. Rules are fairly easy for humans to understand, so they can provide insight and comprehensible results. Association rule learning and subgroup discovery are representative rule-based learning methods for descriptive tasks. These two kinds of algorithms have been used in a wide range of areas, including transaction analysis, accident data analysis, detection of statistically significant patient risk groups, and discovery of key persons in social communities. We use both the association rule learning method and the subgroup discovery method to discover useful patterns in a traffic accident dataset with many features, including driver profile, accident location, accident type, vehicle information, and regulation violations. The association rule learning method, an unsupervised learning method, searches for frequent item sets in the data and translates them into rules. In contrast, the subgroup discovery method is a supervised learning method that discovers rules for user-specified concepts that satisfy a certain degree of generality and unusualness. Depending on which aspect of the data we focus on, we may combine multiple relevant features of interest into a synthetic target feature and give it to the rule learning algorithms. After a set of rules is derived, postprocessing steps remove uninteresting or redundant rules to make the rule set more compact and easier to understand. We conducted a set of experiments mining our traffic accident data in both unsupervised and supervised modes to compare these rule-based learning algorithms. The experiments reveal that association rule learning, in its pure unsupervised mode, can discover some hidden relationships among the features. In the supervised setting with a combinatorial target feature, however, the subgroup discovery method finds good rules much more easily than the association rule learning method, which requires considerable effort to tune its parameters.
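Subgroup discovery scores a candidate rule by trading off generality (how many records it covers) against unusualness (how far the target rate inside the subgroup departs from the overall rate). One widely used score with exactly this form is weighted relative accuracy (WRAcc). The abstract does not say which quality measure the authors used, so WRAcc is an assumption here, and the records and field names are made up.

```python
def wracc(records, condition, target):
    """Weighted relative accuracy: P(cond) * (P(target | cond) - P(target))."""
    n = len(records)
    covered = [r for r in records if condition(r)]
    if not covered:
        return 0.0  # an empty subgroup carries no information
    p_cond = len(covered) / n                       # generality
    p_target = sum(1 for r in records if target(r)) / n
    p_target_given_cond = sum(1 for r in covered if target(r)) / len(covered)
    return p_cond * (p_target_given_cond - p_target)  # generality * unusualness


# Toy accident records (hypothetical fields and values).
records = [
    {"weather": "rain", "severe": True},
    {"weather": "rain", "severe": True},
    {"weather": "clear", "severe": False},
    {"weather": "clear", "severe": True},
]
score = wracc(records,
              condition=lambda r: r["weather"] == "rain",
              target=lambda r: r["severe"])
```

A positive score means the target is over-represented in the subgroup. A synthetic target of the kind the abstract mentions would simply be a `target` function that combines several fields.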

Determination of Tumor Boundaries on CT Images Using Unsupervised Clustering Algorithm (비교사적 군집화 알고리즘을 이용한 전산화 단층영상의 병소부위 결정에 관한 연구)

  • Lee, Kyung-Hoo;Ji, Young-Hoon;Lee, Dong-Han;Yoo, Seoung-Yul;Cho, Chul-Koo;Kim, Mi-Sook;Yoo, Hyung-Jun;Kwon, Soo-Il;Chun, Jun-Chul
    • Journal of Radiation Protection and Research / v.26 no.2 / pp.59-66 / 2001
  • Determining the spatial location and shape of the tumor boundary is a topical issue in fractionated stereotactic radiotherapy (FSRT). We obtained consecutive transaxial images of a paraffin phantom and of 4 patients with brain tumors using helical computed tomography (HCT). The K-means classification algorithm was applied to convert the raw pixel values in the CT images into classified average pixel values. The classified images consist of 5 regions: tumor region (TR), normal region (NR), combination region (CR), uncommitted region (UR), and artifact region (AR). The major concern was how to separate the normal region from the tumor region in the combination region. Relative average deviation analysis was applied to reduce the average pixel values of the 5 regions to 2 regions, normal and tumor, by locating the maximum point among the average deviation pixel values. We then drew the gross tumor volume (GTV) boundary by connecting the maximum points in the images using a semi-automatic contour method written in IDL (Interactive Data Language). The error of the ROI boundary in the homogeneous phantom is estimated to be within ±1%. For the 4 patients, we confirmed that the tumor lesions delineated by the physician were similar to those delineated automatically by the K-means classification algorithm and relative average deviation analysis. These methods can turn an uncertain boundary between the normal and tumor regions into a clear one. The procedure should therefore be useful in CT-based treatment planning, especially when CT images intermittently fail to visualize the tumor volume as well as MRI images do.
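The first step described above, replacing each raw pixel value with the average value of its K-means class, can be sketched in one dimension as follows. This is a plain K-means on intensity values with evenly spaced initial centres. The initialisation, iteration count, and sample values are assumptions, and the example uses 3 classes for brevity where the study uses 5.

```python
def kmeans_1d(values, k, iterations=20):
    """Plain 1-D K-means (k >= 2); returns the final class centres."""
    lo, hi = min(values), max(values)
    # Assumed initialisation: centres spread evenly over the value range.
    centres = [lo + (hi - lo) * i / (k - 1) for i in range(k)]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centres[i]))
            groups[nearest].append(v)
        # Move each centre to the mean of its group; keep empty ones in place.
        centres = [sum(g) / len(g) if g else c
                   for g, c in zip(groups, centres)]
    return centres


def classify(values, centres):
    """Replace every value with the centre of its nearest class."""
    return [min(centres, key=lambda c: abs(v - c)) for v in values]


# Illustrative pixel intensities (not CT data from the paper).
pixels = [10, 12, 11, 50, 52, 51, 90, 91, 92]
centres = kmeans_1d(pixels, k=3)
classified = classify(pixels, centres)
```

On a real image the same routine would run over all pixel values of a slice. The later deviation analysis and contouring steps operate on the resulting class averages.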
