• Title/Summary/Keyword: datasets


An Enhanced Density and Grid based Spatial Clustering Algorithm for Large Spatial Database (대용량 공간데이터베이스를 위한 확장된 밀도-격자 기반의 공간 클러스터링 알고리즘)

  • Gao, Song;Kim, Ho-Seok;Xia, Ying;Kim, Gyoung-Bae;Bae, Hae-Young
    • The KIPS Transactions:PartD
    • /
    • v.13D no.5 s.108
    • /
    • pp.633-640
    • /
    • 2006
  • Spatial clustering, which groups similar objects based on their distance, connectivity, or relative density in space, is an important component of spatial data mining. Density-based and grid-based clustering are the two main approaches. The former is known for discovering clusters of various shapes and eliminating noise, while the latter is known for its high speed. Clustering large data sets has always been a serious challenge, because a huge data set makes the clustering process extremely costly. In this paper, we propose an enhanced density-grid based clustering algorithm for large spatial databases that sets a default number of intervals and removes outliers effectively, using a proper measurement to identify areas of high density in the input data space. We use a density threshold DT to recognize dense cells before neighboring dense cells are combined to form clusters. When the proposed algorithm is run on a large dataset, a proper granularity for each dimension of the data space and a density threshold for recognizing dense areas can improve its performance. We combine grid-based and density-based methods not only to increase efficiency but also to find clusters of arbitrary shape. Experimental evaluation on synthetic datasets shows that the proposed method achieves high performance and accuracy.
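
The idea of recognizing dense cells with a threshold and then merging neighboring dense cells can be sketched as follows. This is a minimal illustration, not the authors' algorithm: the function name `grid_density_clusters`, the equal-width binning, and the 8-neighborhood (in 2-D) adjacency rule are assumptions made for the example.

```python
from collections import defaultdict, deque
from itertools import product


def grid_density_clusters(points, intervals, density_threshold):
    """Group points by merging adjacent grid cells that hold enough points.

    points: list of equal-length coordinate tuples
    intervals: number of equal-width bins per dimension (assumed uniform)
    density_threshold: minimum point count for a cell to count as dense (DT)
    """
    dims = len(points[0])
    lows = [min(p[d] for p in points) for d in range(dims)]
    highs = [max(p[d] for p in points) for d in range(dims)]

    def cell_of(p):
        idx = []
        for d in range(dims):
            span = (highs[d] - lows[d]) or 1.0  # avoid division by zero
            i = int((p[d] - lows[d]) / span * intervals)
            idx.append(min(i, intervals - 1))  # clamp the maximum into the last bin
        return tuple(idx)

    # Assign every point to its grid cell.
    cells = defaultdict(list)
    for p in points:
        cells[cell_of(p)].append(p)

    # A cell is dense when it holds at least DT points.
    dense = {c for c, members in cells.items() if len(members) >= density_threshold}

    # Breadth-first merge of dense cells that touch (including diagonals).
    offsets = [o for o in product((-1, 0, 1), repeat=dims) if any(o)]
    clusters, seen = [], set()
    for start in sorted(dense):
        if start in seen:
            continue
        seen.add(start)
        queue, group = deque([start]), []
        while queue:
            c = queue.popleft()
            group.extend(cells[c])
            for o in offsets:
                n = tuple(a + b for a, b in zip(c, o))
                if n in dense and n not in seen:
                    seen.add(n)
                    queue.append(n)
        clusters.append(group)

    # Points in sparse cells are treated as outliers.
    noise = [p for c, m in cells.items() if c not in dense for p in m]
    return clusters, noise


sample = [(0.0, 0.0), (0.1, 0.1), (0.2, 0.0),
          (9.8, 9.9), (9.9, 9.8), (10.0, 10.0),
          (5.0, 5.0)]
clusters, noise = grid_density_clusters(sample, intervals=5, density_threshold=2)
```

Because the work is done per cell rather than per point pair, the cost of the merge step depends on the number of dense cells, which is why the choice of `intervals` and `density_threshold` matters for large inputs.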

The Image of Ruralism in Korea through a Text Mining for Online News Media analysis (인터넷 뉴스 데이터 텍스트 분석을 통해 본 우리나라 농촌다움에 대한 이미지 연구)

  • Son, Yong-hoon;Kim, Young-jin
    • Journal of Korean Society of Rural Planning
    • /
    • v.25 no.4
    • /
    • pp.13-26
    • /
    • 2019
  • Rural areas in South Korea have changed rapidly in the course of national land development. Rural landscapes have become discoloured, and their attractiveness has decreased as cities have expanded. Yet the attractiveness and multifunctional value of rural areas have become more important in contemporary society around the world. In response to this social demand, conserving the rural landscape is a high priority and the recovery of ruralism is required. This study tried to understand how the public image of ruralism in South Korea has been influenced by the news media. The study retrieved news articles from a web search portal using six keywords commonly used to refer to ruralism: 'rural landscape', 'rural community', 'rural tourism', 'rural life', 'rural amenity', and 'rural environment'. News data for each of the six keywords were collected for four periods: 2004-05, 2007-08, 2012-13, and 2016-17. In the text mining analysis, the nouns with high degree centrality were identified, along with their changes across periods. Then, LDA topic analysis was performed on the text datasets of the six keywords. As a result, the study found that the news articles focused on only a handful of issues such as 'poor rural living condition', 'regional or village improvement projects', 'rural tourism promotion projects', and 'other government support projects'. On the other hand, nouns related to virtues and values in the rural landscape appeared less often in news articles. These tendencies have become more apparent in recent years. In the topic analysis, 35 topics were identified. 'Village development projects', 'rural tourism', and 'urban-rural exchange projects' appeared repeatedly across several keywords. Among the topics, there are also topics closely related to ruralism such as 'rural landscape conservation', 'eco-friendly rural areas', 'local amenity resources', 'public interest values of agriculture', and 'rural life and communities'. The study presented an image map showing ruralism in South Korea using a network map between all topics and keywords. Finally, implications for Korean rural area policy and research directions were discussed.
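
Degree centrality of nouns, as mentioned in the abstract, can be illustrated with a small co-occurrence network. The sketch below assumes that nouns have already been extracted from each article and that two nouns are linked when they occur in the same article; the abstract does not state how the study built its network, so both choices and the name `degree_centrality` are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def degree_centrality(documents):
    """Normalized degree centrality of terms in a co-occurrence network.

    documents: list of noun lists, one per article (extraction assumed done).
    Two nouns are neighbors if they share at least one document.
    Nouns that never co-occur with another noun are not part of the network.
    """
    neighbors = defaultdict(set)
    for nouns in documents:
        for a, b in combinations(sorted(set(nouns)), 2):
            neighbors[a].add(b)
            neighbors[b].add(a)
    n = len(neighbors)
    if n < 2:
        return {term: 0.0 for term in neighbors}
    # Degree divided by the maximum possible degree (n - 1).
    return {term: len(adj) / (n - 1) for term, adj in neighbors.items()}


articles = [
    ["village", "project", "tourism"],
    ["village", "landscape"],
]
scores = degree_centrality(articles)
top = sorted(scores, key=scores.get, reverse=True)
```

Ranking terms by this score per period is one simple way to compare which nouns dominate coverage over time.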

Performance Analysis of Sliding Window based Stream High Utility Pattern Mining Methods (슬라이딩 윈도우 기반의 스트림 하이 유틸리티 패턴 마이닝 기법 성능분석)

  • Ryang, Heungmo;Yun, Unil
    • Journal of Internet Computing and Services
    • /
    • v.17 no.6
    • /
    • pp.53-59
    • /
    • 2016
  • Recently, huge volumes of stream data have been generated in real time by applications such as wireless sensor networks, Internet of Things services, and social network services. Developing efficient methods that process and analyze such data, discover useful information, and use it for better decision making has therefore become a significant issue. Since stream data are generated continuously and rapidly, they need to be handled with minimal access. In addition, an appropriate method is required to analyze stream data in resource-limited environments where fast processing with low power consumption is necessary. To address this issue, the sliding window model has been proposed and researched. Meanwhile, pattern mining, one of the data mining techniques for finding meaningful information in huge data, extracts such information in the form of patterns. Traditional frequency-based pattern mining can process only binary databases and treats all items in a database as equally important. As a result, although frequent pattern mining has played an essential role in the data mining field, it cannot reflect the characteristics of real databases. High utility pattern mining has therefore been suggested for discovering more meaningful information from non-binary databases by considering the characteristics and relative importance of items. General high utility pattern mining methods for static databases, however, are not suitable for handling stream data. To address this issue, sliding window based high utility pattern mining has been proposed for finding significant information in stream data in resource-limited environments by considering their characteristics and processing them efficiently. In this paper, we conduct various experiments with datasets to evaluate the performance of sliding window based high utility pattern mining algorithms and analyze the experimental results, through which we study their characteristics and directions for improvement.
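
To make the combination of a sliding window with utility-based mining concrete, here is a brute-force sketch. It is not one of the algorithms evaluated in the paper: the class name `SlidingWindowUtility`, the transaction format (item to quantity), and the exhaustive enumeration of small itemsets are all assumptions chosen for clarity rather than efficiency.

```python
from collections import deque
from itertools import combinations


class SlidingWindowUtility:
    """Keeps the most recent transactions and scores itemsets by utility."""

    def __init__(self, window_size, unit_profit):
        # deque(maxlen=...) discards the oldest transaction automatically.
        self.window = deque(maxlen=window_size)
        self.unit_profit = unit_profit  # item -> profit per unit (relative importance)

    def add(self, transaction):
        """transaction maps item -> quantity (a non-binary database row)."""
        self.window.append(dict(transaction))

    def utility(self, itemset):
        """Sum of quantity * unit profit over window transactions holding every item."""
        total = 0
        for t in self.window:
            if all(i in t for i in itemset):
                total += sum(t[i] * self.unit_profit[i] for i in itemset)
        return total

    def high_utility_itemsets(self, min_utility, max_size=2):
        """Exhaustively check itemsets up to max_size against the threshold."""
        items = sorted({i for t in self.window for i in t})
        result = {}
        for k in range(1, max_size + 1):
            for itemset in combinations(items, k):
                u = self.utility(itemset)
                if u >= min_utility:
                    result[itemset] = u
        return result


miner = SlidingWindowUtility(window_size=2, unit_profit={"a": 5, "b": 1, "c": 2})
miner.add({"a": 1, "b": 2})
miner.add({"b": 4, "c": 1})
miner.add({"a": 2, "c": 3})  # the first transaction has now left the window
patterns = miner.high_utility_itemsets(min_utility=8)
```

The exhaustive search grows exponentially with the number of items, which is the cost that dedicated high utility pattern mining algorithms are designed to avoid.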

Comparative Analysis of Focal Length Bias for Three Different Line Scanners (초점거리 편의가 지상 정확도에 미치는 영향 비교 연구 - 세가지 라인 스캐너를 대상으로 -)

  • Kim, Changjae
    • Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography
    • /
    • v.32 no.4_1
    • /
    • pp.363-371
    • /
    • 2014
  • Most space-borne optical scanning systems adopt linear array configurations. The three well-known types of space-borne sensors are the along-track line scanner, the across-track line scanner, and the three-line scanner. To acquire accurate location information of an object on the ground with those sensors, the exterior and interior orientation parameters are critical factors for both space-borne and airborne missions. Since the imaging geometry of sensors might change from time to time due to thermal influence, vibration, and wind, it is very important to analyze the effects of the Interior Orientation Parameters (IOP) on the ground. Experiments based on synthetic datasets were carried out while the focal length biases were varied. Both high and low altitudes of the imaging sensor were also applied. In the case of the along-track line scanner, the focal length bias caused errors along the scan line direction. In the case of the across-track line scanner, the focal length bias caused errors along the scan line and vertical directions. Lastly, vertical errors were observed in the case of the three-line scanner. The results of this study can provide guidelines for developing new linear sensors, as well as for improving the accuracy of laboratory or in-flight sensor calibrations.

Design of an Arm Gesture Recognition System Using Feature Transformation and Hidden Markov Models (특징 변환과 은닉 마코프 모델을 이용한 팔 제스처 인식 시스템의 설계)

  • Heo, Se-Kyeong;Shin, Ye-Seul;Kim, Hye-Suk;Kim, In-Cheol
    • KIPS Transactions on Software and Data Engineering
    • /
    • v.2 no.10
    • /
    • pp.723-730
    • /
    • 2013
  • This paper presents the design of an arm gesture recognition system using a Kinect sensor. A variety of methods have been proposed for gesture recognition, ranging from Dynamic Time Warping (DTW) to Hidden Markov Models (HMM). Our system learns a unique HMM corresponding to each arm gesture from a set of sequential skeleton data. Whenever the same gesture is performed, the trajectory of each joint captured by the Kinect sensor may differ considerably from previous performances, depending on the length and/or the orientation of the subject's arm. To obtain robust performance independent of these conditions, the proposed system applies a feature transformation in which the feature vectors of joint positions are transformed into those of angles between joints. To improve the computational efficiency of learning and using HMMs, our system also performs k-means clustering to obtain one-dimensional integer sequences as inputs for discrete HMMs from high-dimensional real-number observation vectors. The dimension reduction and discretization help our system use HMMs efficiently to recognize gestures in real-time environments. Finally, we demonstrate the recognition performance of our system through experiments using two different datasets.
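
The two preprocessing steps described above, converting joint positions to angles and then mapping each real-valued feature vector to an integer symbol, can be sketched as below. The helper names `joint_angle` and `quantize` are hypothetical, the centroids are assumed to come from a k-means run that is not shown, and the exact angle definition used by the authors is not given in the abstract.

```python
import math


def joint_angle(a, b, c):
    """Angle in radians at joint b formed by the segments b->a and b->c.

    Assumes the three joints are at distinct positions.
    """
    u = [ai - bi for ai, bi in zip(a, b)]
    v = [ci - bi for ci, bi in zip(c, b)]
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.acos(max(-1.0, min(1.0, dot / norm)))


def quantize(feature, centroids):
    """Index of the nearest centroid, usable as a discrete HMM observation symbol."""
    return min(range(len(centroids)), key=lambda k: math.dist(feature, centroids[k]))


# Same pose performed by a short arm and by an arm twice as long.
short_arm = joint_angle((0, 0, 0), (1, 0, 0), (1, 1, 0))
long_arm = joint_angle((0, 0, 0), (2, 0, 0), (2, 2, 0))

# Hypothetical centroids over a one-dimensional angle feature.
symbol = quantize([short_arm], [[0.0], [1.57], [3.14]])
```

Both arms yield the same elbow angle, which illustrates why angle features are less sensitive to arm length than raw joint positions.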

Understanding of Structural Changes of Keyword Networks in the Computer Engineering Field (컴퓨터공학 분야 키워드네트워크의 구조적 변화 이해)

  • Kwon, Yung-Keun
    • KIPS Transactions on Software and Data Engineering
    • /
    • v.2 no.3
    • /
    • pp.187-194
    • /
    • 2013
  • Recently, there have been many attempts to analyze the characteristics of research trends through structural analysis of keyword networks in various fields. However, most previous studies have focused on the structural analysis of static networks, and there is a lack of research on how the structure of such networks changes over time. In this paper, we constructed annual keyword networks using a database of papers published in international computer engineering journals from 2002 through 2011, and examined how they changed. The results show that most keywords in a network are preserved in the network of the next year, and that their degree of connectivity is higher and the average weight of their connections lower than those of keywords that are not preserved. In addition, when a keyword network shifted to that of the next year, the connections between keywords were more likely to be removed than preserved, and the average weight of the removed connections was higher than that of the preserved ones. These results imply that keywords tend to persist over time while their connections are very likely to change, and that there are apparent differences between the preserved and removed groups of keywords and connections with respect to degree and connection weight. All these results are consistently observed over the ten-year datasets, and they can be important principles for understanding the structural changes of keyword networks.
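
The year-to-year comparison described above amounts to set operations on keywords and weighted connections. The following sketch assumes each annual network is given as a dictionary from keyword pairs to weights; the function name `preservation_stats` and the chosen summary measures are illustrative, not the paper's exact metrics.

```python
def preservation_stats(prev_edges, next_edges):
    """Compare two yearly keyword networks given as {(kw1, kw2): weight}.

    Connections are undirected, so each pair is normalized by sorting.
    Weights are taken from the earlier year's network.
    """
    def normalize(edges):
        return {tuple(sorted(e)): w for e, w in edges.items()}

    prev, nxt = normalize(prev_edges), normalize(next_edges)
    prev_kw = {k for e in prev for k in e}
    next_kw = {k for e in nxt for k in e}
    kept = [e for e in prev if e in nxt]
    removed = [e for e in prev if e not in nxt]

    def mean_weight(edges):
        return sum(prev[e] for e in edges) / len(edges) if edges else 0.0

    return {
        "keyword_preserved_ratio": len(prev_kw & next_kw) / len(prev_kw) if prev_kw else 0.0,
        "edge_preserved_ratio": len(kept) / len(prev) if prev else 0.0,
        "mean_weight_preserved": mean_weight(kept),
        "mean_weight_removed": mean_weight(removed),
    }


# Toy networks for two consecutive years (hypothetical data).
year_a = {("ml", "svm"): 1, ("ml", "nn"): 3, ("db", "sql"): 2}
year_b = {("svm", "ml"): 2, ("nn", "db"): 1, ("sql", "ml"): 1}
stats = preservation_stats(year_a, year_b)
```

In this toy case every keyword survives into the next year while two of the three connections are dropped, mirroring the qualitative pattern the abstract reports.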

Basin modelling with a MATLAB-based program, BasinVis 2.0: A case study on the southern Vienna Basin, Austria (MATLAB 기반의 프로그램 BasinVis 2.0을 이용한 분지 모델링: 오스트리아 비엔나 분지의 남부 지역에 대한 사례 연구)

  • Lee, Eun Young;Wagreich, Michael
    • Journal of the Geological Society of Korea
    • /
    • v.54 no.6
    • /
    • pp.615-630
    • /
    • 2018
  • Basin analysis is a research field to understand the formation and evolution of sedimentary basins. This task requires various geoscientific datasets as well as numerical and graphical modelling techniques to synthesize results dimensionally in time and space. For basin analysis and modelling in a comprehensive workflow, BasinVis 1.0 was released as a MATLAB-based program in 2016, and recently the software has been extended to BasinVis 2.0, with new functions and revised user-interface. As a case study, this work analyses the southern Vienna Basin and visualizes the sedimentation setting and subsidence evolution to introduce the basin modelling functions of BasinVis 2.0. This is a preliminary study for a basin-scale modelling of the Vienna Basin, together with our previous studies using BasinVis 1.0. In the study area, during the late Early Miocene, sedimentation and subsidence are significant along strike-slip and en-echelon listric normal faults. From the Middle Miocene onwards, however, subsidence decreases abruptly over the area and this situation continues until the Late Miocene. This is related to the development of the pull-apart system and corresponds to the episodic tectonic subsidence in strike-slip basins. The subsidence of the Middle Miocene is confined mainly to areas along the strike-slip faults, while, from the late Middle Miocene, the depocenter shifts to a depression along the N-S trending listric normal faults. This corresponds to the regional paleostress regime transitioning from NE-SW trending transtension to E-W trending extension. This study applies various functions and techniques to this case study, and the modelled results demonstrate that BasinVis 2.0 is effective and applicable to the basin modelling.

Study on Correlation Between Timber Age, Image Bands and Vegetation Indices for Timber Age Estimation Using Landsat TM Image (Landsat TM 영상을 이용한 교목연령 추정에 영창을 주는 영상 밴드 및 식생지수에 관한 연구)

  • Lee, Jung-Bin;Heo, Joon;Sohn, Hong-Gyoo
    • Korean Journal of Remote Sensing
    • /
    • v.24 no.6
    • /
    • pp.583-590
    • /
    • 2008
  • This study examines the correlation between timber age, image bands, and vegetation indices for timber age estimation. The study used Landsat TM images from three different years (1994, 1994, 1998) and the difference between the Shuttle Radar Topography Mission (SRTM) and the National Elevation Dataset (NED). Bands 4, 5, and 7, the Normalized Difference Vegetation Index (NDVI), Infrared Index (II), Vegetation Condition Index (VCI), and Soil Adjusted Vegetation Index (SAVI) were obtained from the Landsat TM images. Tasseled cap greenness and wetness images were also produced by the Tasseled cap transformation. Finally, a regression model was used to analyze the correlation between timber age and each of the following: the difference between SRTM and NED, the individual TM bands (4, 5, 7), NDVI, Tasseled cap greenness and wetness, II, VCI, and SAVI. About 1,992 datasets were analyzed. Tasseled cap wetness, the Infrared Index (II), and the Vegetation Condition Index (VCI) showed close correlation with timber age.
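
Two of the indices named above have standard closed forms that are easy to show in code, together with a plain Pearson correlation of the kind such a study might compute. This is a generic sketch: the sample reflectances and ages are made up for illustration, the soil factor of 0.5 is the commonly used default rather than a value from the paper, and the paper's regression model is not reproduced. For Landsat TM, red is band 3 and near-infrared is band 4.

```python
import math


def ndvi(nir, red):
    """Normalized Difference Vegetation Index: (NIR - Red) / (NIR + Red)."""
    return (nir - red) / (nir + red)


def savi(nir, red, soil_factor=0.5):
    """Soil Adjusted Vegetation Index with soil brightness factor L."""
    return (1 + soil_factor) * (nir - red) / (nir + red + soil_factor)


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Hypothetical stands: (NIR reflectance, red reflectance, timber age in years).
stands = [(0.50, 0.10, 30), (0.45, 0.12, 22), (0.38, 0.15, 12), (0.30, 0.18, 5)]
index_values = [ndvi(nir, red) for nir, red, _ in stands]
ages = [age for _, _, age in stands]
r = pearson(index_values, ages)
```

The same `pearson` call can be repeated for each band or index to rank candidate predictors of timber age.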

Development $K_d({\lambda})$ and Visibility Algorithm for Ocean Color Sensor Around the Central Coasts of the Yellow Sea (황해 중부 연안 해역에서의 해색센서용 하향 확산 감쇠계수 및 수중시계 추정 알고리즘 개발)

  • Min, Jee-Eun;Ahn, Yu-Hwan;Lee, Kyu-Sung;Ryu, Joo-Hyung
    • Korean Journal of Remote Sensing
    • /
    • v.23 no.4
    • /
    • pp.311-321
    • /
    • 2007
  • The diffuse attenuation coefficient for down-welling irradiance $K_d({\lambda})$, which describes how down-welling irradiance at wavelength ${\lambda}$ attenuates from the surface to a depth (z) in the ocean, and underwater visibility are important optical parameters for ocean studies. There have been several studies on $K_d({\lambda})$ and underwater visibility around the world, but only a few have focused on these properties in Korean seas. Therefore, in the present study, we studied $K_d({\lambda})$ and underwater visibility around the coastal area of the Yellow Sea, and developed $K_d({\lambda})$ and underwater visibility algorithms for ocean color satellite sensors. For this research we conducted a field campaign around the Yellow Sea from 19 to 22 September 2006 and obtained a set of ocean optical and environmental data. From these datasets the $K_d({\lambda})$ and underwater visibility algorithms were empirically derived and compared with the existing NASA SeaWiFS $K_d({\lambda})$ algorithm and the NRL (Naval Research Laboratory) underwater visibility algorithm. These comparisons over a turbid area showed a small difference for the $K_d({\lambda})$ algorithm, while the constants of our underwater visibility algorithm were slightly higher.
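
As background for the quantity being estimated, $K_d$ is conventionally defined through the exponential decay of down-welling irradiance with depth, $E_d(z) = E_d(0)\,e^{-K_d z}$ at a fixed wavelength. The sketch below recovers $K_d$ from an in-water irradiance profile by a log-linear least-squares fit. It illustrates the definition only and is not the satellite algorithm derived in the paper; the function name `estimate_kd` and the sample profile are assumptions.

```python
import math


def estimate_kd(depths, irradiances):
    """Estimate K_d (1/m) by fitting ln E_d(z) = ln E_d(0) - K_d * z.

    depths: depths in metres (at least two distinct values)
    irradiances: positive down-welling irradiance at those depths
    """
    n = len(depths)
    logs = [math.log(e) for e in irradiances]
    mean_z = sum(depths) / n
    mean_log = sum(logs) / n
    sxx = sum((z - mean_z) ** 2 for z in depths)
    sxy = sum((z - mean_z) * (l - mean_log) for z, l in zip(depths, logs))
    return -sxy / sxx  # K_d is the negative slope of the fitted line


# Synthetic profile generated with K_d = 0.3 per metre (hypothetical values).
profile_depths = [0.0, 1.0, 2.0, 4.0]
profile_ed = [100.0 * math.exp(-0.3 * z) for z in profile_depths]
kd = estimate_kd(profile_depths, profile_ed)
```

In turbid coastal water the profile may not follow a single exponential, so a fit like this is usually restricted to a chosen depth range.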

Improvement of Small Baseline Subset (SBAS) Algorithm for Measuring Time-series Surface Deformations from Differential SAR Interferograms (차분 간섭도로부터 지표변위의 시계열 관측을 위한 개선된 Small Baseline Subset (SBAS) 알고리즘)

  • Jung, Hyung-Sup;Lee, Chang-Wook;Park, Jung-Won;Kim, Ki-Dong;Won, Joong-Sun
    • Korean Journal of Remote Sensing
    • /
    • v.24 no.2
    • /
    • pp.165-177
    • /
    • 2008
  • The small baseline subset (SBAS) algorithm was recently developed using an appropriate combination of differential interferograms characterized by a small baseline in order to minimize spatial decorrelation. The algorithm uses singular value decomposition (SVD) to measure the time-series surface deformation from differential interferograms that are not temporally connected. It also mitigates the atmospheric effect in the time-series surface deformation by using a spatially low-pass and temporally high-pass filter. Nevertheless, because the algorithm assumes linear surface deformation at the beginning of the observation, it is not easy to correct the phase unwrapping error of each interferogram or to mitigate the time-varying noise component of the surface deformation. In this paper, we present an improved SBAS technique to address these problems. Our improved SBAS algorithm uses an iterative approach to minimize the phase unwrapping error of each differential interferogram. It also uses a finite difference method to suppress the time-varying noise component of the surface deformation. We tested our improved SBAS algorithm and evaluated its performance using 26 images of ERS-1/2 data and 21 images of RADARSAT-1 fine beam (F5) data acquired at different locations. A maximum deformation of 40 cm in the radar line of sight (LOS) was estimated from the ERS-1/2 datasets over about 13 years, whereas a deformation of 3 cm was estimated from the RADARSAT-1 datasets over about two years.