• Title/Summary/Keyword: time series clustering

Identification of Research Areas and Evolution of 2D Materials by the Keyword Mapping Methodology (키워드 매핑 기반 2차원 물질 연구 영역 탐지와 발전 과정 분석)

  • Ahn, Sejung;Lee, June Young
    • Journal of the Korean Institute of Electrical and Electronic Material Engineers / v.31 no.1 / pp.11-18 / 2018
  • Two-dimensional (2D) materials such as transition metal dichalcogenides have attracted tremendous scientific interest owing to their potential to solve the zero band-gap issue of graphene. In this work, the research areas and technology evolutionary dynamics of 2D materials were identified using a scientometric method focusing on keyword mapping and clustering. The time-series analysis showed that the technological progress of 2D materials is in the early growth period. Overlay mapping analysis was carried out to investigate the technology evolution of 2D materials over time. A strategic diagram from co-word analysis, classifying the topological positions of keywords, was derived to support the analysis results. It is conjectured that extensive research will be conducted on the application of 2D materials not only in electronic and optoelectronic devices but also in various other fields such as biomedical applications, and that their development will be more rapid based on the accumulated results of extant graphene research.

Types and Characteristics Analysis of Human Dynamics in Seoul Using Location-Based Big Data (위치기반 빅데이터를 활용한 서울시 활동인구 유형 및 유형별 지역 특성 분석)

  • Jung, Jae-Hoon;Nam, Jin
    • Journal of Korea Planning Association / v.54 no.3 / pp.75-90 / 2019
  • As the 24-hour society arrives, human activities in daytime and nighttime urban spaces are changing drastically, and the need for new urban management policies is steadily increasing. This study analyzes the types and characteristics of Seoul's human dynamics using location-based big data; the results are summarized as follows. First, the pattern of human dynamics in Seoul repeats itself every 7 days. Second, human dynamics in Seoul can be classified into five types, each with its own time-series and local characteristics. Third, the degree of match between human dynamics and the zoning system in urban planning legislation was highest in the 'Type 1' residence pattern and low in the other types. The following implications can be drawn from these results. First, this paper examined a methodology for analyzing the regional characteristics of Seoul through human dynamics and obtained meaningful results. Second, this paper derives reliable and objective pattern analysis results using big data that reflect overall population characteristics. Third, the scale of night-time activity in the urban space of Seoul was understood, and its distribution, patterns, and characteristics were identified.

Multi-FNN Identification by Means of HCM Clustering and Its Optimization Using Genetic Algorithms (HCM 클러스터링에 의한 다중 퍼지-뉴럴 네트워크 동정과 유전자 알고리즘을 이용한 이의 최적화)

  • Oh, Sung-Kwun;Park, Ho-Sung
    • Journal of the Korean Institute of Intelligent Systems / v.10 no.5 / pp.487-496 / 2000
  • In this paper, the Multi-FNN (Fuzzy-Neural Networks) model is identified and optimized using the HCM (Hard C-Means) clustering method and genetic algorithms. The proposed Multi-FNN is based on Yamakawa's FNN and uses simplified inference as the fuzzy inference method and the error back-propagation algorithm as the learning rule. We use HCM clustering and genetic algorithms (GAs) to identify both the structure and the parameters of a Multi-FNN model. Here, the HCM clustering method, which is carried out to preprocess the process data for system modeling, is used to determine the structure of the Multi-FNN according to the divisions of the input-output space using I/O process data. Also, the parameters of the Multi-FNN model, such as the apexes of membership functions, learning rates, and momentum coefficients, are adjusted using genetic algorithms. An aggregate performance index with a weighting factor is used to achieve a sound balance between the approximation and generalization abilities of the model. The aggregate performance index stands for an aggregate objective function with a weighting factor that accounts for the mutual balance and dependency between approximation and predictive abilities. By selecting and adjusting the weighting factor of this aggregate objective function, which depends on the number of data and the degree of nonlinearity, we show that it is feasible and effective to design an optimal Multi-FNN model. To evaluate the performance of the proposed model, we use the time series data for a gas furnace and the numerical data of a nonlinear function.
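
The HCM step named in this abstract can be illustrated with a minimal sketch of hard c-means, in which every point belongs to exactly one cluster and each center is the mean of its members. The function name `hard_c_means`, the explicit initial centers, and the sample data are assumptions made for illustration; the abstract does not describe how the authors initialize or configure the algorithm.

```python
def hard_c_means(points, centers, max_iter=100):
    """Hard c-means: alternate nearest-center assignment and mean update."""
    centers = [tuple(c) for c in centers]
    labels = None
    for _ in range(max_iter):
        # Assignment step: index of the nearest center (squared Euclidean).
        new_labels = [
            min(
                range(len(centers)),
                key=lambda k: sum((a - b) ** 2 for a, b in zip(p, centers[k])),
            )
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centers are too
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for k in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:  # an empty cluster keeps its previous center
                centers[k] = tuple(sum(col) / len(members) for col in zip(*members))
    return centers, labels


# Hypothetical input-output samples forming two well-separated groups.
data = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.0), (5.0, 5.0), (5.1, 4.9), (4.9, 5.2)]
hcm_centers, hcm_labels = hard_c_means(data, centers=[(0.0, 0.0), (5.0, 5.0)])
```

The result depends on the initial centers, so in practice several starting points are usually tried.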

Design of Data-centroid Radial Basis Function Neural Network with Extended Polynomial Type and Its Optimization (데이터 중심 다항식 확장형 RBF 신경회로망의 설계 및 최적화)

  • Oh, Sung-Kwun;Kim, Young-Hoon;Park, Ho-Sung;Kim, Jeong-Tae
    • The Transactions of The Korean Institute of Electrical Engineers / v.60 no.3 / pp.639-647 / 2011
  • In this paper, we introduce a design methodology for data-centroid Radial Basis Function (RBF) neural networks with extended polynomial functions. The two underlying design mechanisms of such networks are the K-means clustering method and Particle Swarm Optimization (PSO). The proposed algorithm uses K-means clustering for efficient processing of data, and the model is optimized using PSO. As the connection weights of the RBF neural networks, four types of polynomials can be used: simplified, linear, quadratic, and modified quadratic. K-means clustering selects the center values of the Gaussian functions used as activation functions. The PSO-based RBF neural networks result in a structurally optimized network and come with a higher level of flexibility than conventional RBF neural networks. The PSO-based design procedure, applied at each node of the RBF neural networks, leads to the selection of preferred parameters with specific local characteristics (such as the number of input variables, a specific set of input variables, and the distribution constant of the activation function) available within the RBF neural networks. To evaluate the performance of the proposed data-centroid RBF neural network with extended polynomial functions, the model is tested on nonlinear process data (2-dimensional synthetic data and Mackey-Glass time series process data) and machine learning datasets (NOx emission process data from a gas turbine plant, Automobile Miles per Gallon (MPG) data, and Boston housing data). For the characteristic analysis of the nonlinear datasets as well as the efficient construction and evaluation of the dynamic network model, the datasets are partitioned in two ways: Division I (training and testing datasets) and Division II (training, validation, and testing datasets). A comparative analysis shows that the proposed RBF neural networks produce models with higher accuracy and better predictive capability than previously presented intelligent models.

Clustering of Web Objects with Similar Popularity Trends (유사한 인기도 추세를 갖는 웹 객체들의 클러스터링)

  • Loh, Woong-Kee
    • The KIPS Transactions:PartD / v.15D no.4 / pp.485-494 / 2008
  • Huge amounts of various web items such as keywords, images, and web pages are being made widely available on the Web. The popularities of such web items change continuously over time, and mining temporal patterns in the popularities of web items is an important problem that is useful for several web applications. For example, the temporal patterns in the popularities of search keywords help web search enterprises predict future popular keywords, enabling them to make price decisions when marketing search keywords to advertisers. However, the presence of millions of web items makes it difficult to scale up previous techniques for this problem. This paper proposes an efficient method for mining temporal patterns in the popularities of web items. We treat the popularities of web items as time series and propose a gap measure to quantify the similarity between the popularities of two web items. To reduce the computation overhead of this measure, an efficient method using the Fast Fourier Transform (FFT) is presented. We assume that the popularities of web items do not necessarily follow any probability distribution and are not necessarily periodic. To find clusters of web items with similar popularity trends, we propose to use a density-based clustering algorithm based on the gap measure. Our experiments using the popularity trends of search keywords obtained from the Google Trends web site illustrate the scalability and usefulness of the proposed approach in real-world applications.
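
As a rough illustration of density-based clustering over time series with a pluggable distance, the sketch below uses DBSCAN, chosen here only because it is a familiar density-based algorithm; the abstract does not name the algorithm used. The abstract also does not define the gap measure, so `mean_abs_gap` is a hypothetical stand-in and not the authors' measure, and the popularity series are made up.

```python
def dbscan(items, dist, eps, min_pts):
    """Return a cluster id (0, 1, ...) per item, or -1 for noise."""
    n = len(items)
    labels = [None] * n  # None means "not visited yet"

    def neighbors(i):
        # Indices within eps of item i (the item itself is included).
        return [j for j in range(n) if dist(items[i], items[j]) <= eps]

    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = -1  # provisionally noise; may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point: border
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reach = neighbors(j)
            if len(reach) >= min_pts:  # j is a core point, keep expanding
                queue.extend(reach)
    return labels


def mean_abs_gap(x, y):
    """Hypothetical stand-in distance: mean absolute difference."""
    return sum(abs(a - b) for a, b in zip(x, y)) / len(x)


# Made-up popularity series: three rising, three falling, one erratic.
trends = [
    [1, 2, 3, 4], [1, 2, 3, 5], [1, 2, 4, 4],
    [9, 7, 5, 3], [9, 7, 5, 2], [9, 8, 5, 3],
    [0, 9, 0, 9],
]
trend_labels = dbscan(trends, mean_abs_gap, eps=0.3, min_pts=3)
```

This version compares every pair of items, so it is quadratic in the number of series; the scalability the abstract claims would need the faster distance computation it describes.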

Analysis of the differences in living population changes and regional responses by COVID-19 outbreak in Seoul (코로나-19에 따른 서울시 생활인구 변화와 동별 반응 차이 분석)

  • Jin, Juhae;Seong, Byeongchan
    • The Korean Journal of Applied Statistics / v.33 no.6 / pp.697-712 / 2020
  • New infectious diseases have broken out repeatedly across the world over the last 20 years, and COVID-19 is causing drastic changes and damage to daily lives. Furthermore, as new epidemics will undeniably appear in the future, there is a continuing need to develop measures for responding to economic damage. Against this backdrop, the living population is an important indicator of changes in citizens' life patterns. This study analyzes time-based and socio-environmental characteristics by detecting and classifying changes in everyday life caused by COVID-19 from the perspective of the floating population. k-shape clustering is used to classify the hourly living population data of each of the 424 dongs in Seoul; then, by applying intervention analysis and one-way ANOVA, each cluster's characteristics and the changes in the living population in the aftermath of COVID-19 are examined. In conclusion, this study confirms each cluster's distinct characteristics in the changes of population flows before and after the confirmation of coronavirus patients, and distinguishes groups that reacted sensitively to the intervention times of COVID-related incidents from those that did not.
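
k-shape clustering compares series with the shape-based distance (SBD), defined as one minus the maximum normalized cross-correlation over all shifts. The sketch below computes SBD directly in quadratic time for two equal-length series. The hourly profiles are made up, and the z-normalization that k-shape typically applies beforehand is omitted for brevity, as is the FFT speed-up used in practical implementations.

```python
import math


def shape_based_distance(x, y):
    """SBD(x, y) = 1 - max over shifts s of CC_s(x, y) / (||x|| * ||y||)."""
    n = len(x)
    norm = math.sqrt(sum(v * v for v in x) * sum(v * v for v in y))
    if norm == 0.0:
        return 1.0  # convention chosen here for an all-zero series
    best = max(
        # Cross-correlation at shift s pairs x[i] with y[i - s].
        sum(x[i] * y[i - s] for i in range(max(0, s), min(n, n + s)))
        for s in range(-(n - 1), n)
    )
    return 1.0 - best / norm


# Made-up hourly profiles: the same peak one hour apart, and a flat profile.
peak_early = [0, 0, 1, 3, 1, 0, 0, 0]
peak_late = [0, 0, 0, 1, 3, 1, 0, 0]
flat = [1, 1, 1, 1, 1, 1, 1, 1]

sbd_shifted = shape_based_distance(peak_early, peak_late)
sbd_flat = shape_based_distance(peak_early, flat)
```

Because the distance searches over shifts, two areas whose population peaks have the same shape but occur at slightly different hours come out as close, which is why the measure suits shape-oriented grouping.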

Analysis of Relative Settlement Behavior of Retaining Wall Backside Ground Using Clustering (군집분류를 이용한 흙막이 벽체 배면 지반의 상대적 침하거동 분석)

  • Young-Jun Kwack;Heui-Soo Han
    • The Journal of Engineering Geology / v.33 no.1 / pp.189-200 / 2023
  • As urbanization and industrialization increase development in downtown areas, damage due to ground settlement continues to occur. Building collapse in urban areas carries a high risk of large-scale damage to life and property. However, methods for analyzing measurement data have rarely been studied for cases where uneven loads are applied to the excavated ground and there is no prior knowledge of the ground. Accordingly, this study analyzes relative settlement behavior and correlation by processing time-series surface settlement data from urban construction sites. In this paper, the average index of difference in settlement and the average of relative difference in settlement are defined and calculated, then plotted in a coordinate system to analyze the relative settlement behavior over time. In addition, since there was no prior knowledge of the ground, a standard for classifying the clusters was needed, so the observation points were classified using k-means clustering and the Dunn index. The analysis confirmed that all the clusters moved to the stable region as the settlement amount converged. The clusters were segmented. Based on the analysis results, it was possible to distinguish between the independent displacement area and the same behavior area by analyzing the correlation between measurement points. Analyzing the relative settlement behavior between stations and classifying the behavior areas can help in settlement and stability management, such as uplift of the surrounding area, prediction of the ground failure area, and prevention of activity failure.
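
The Dunn index mentioned above scores a partition as the smallest distance between points in different clusters divided by the largest within-cluster diameter, so higher values indicate compact, well-separated clusters. The sketch below is one common formulation. The 2D coordinates are made up and merely stand in for the two settlement indices the abstract plots, and how the authors combine the index with k-means is not stated.

```python
import math
from itertools import combinations


def dunn_index(points, labels):
    """Minimum inter-cluster distance over maximum intra-cluster diameter."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        raise ValueError("the Dunn index needs at least two clusters")
    # Largest distance between two points of the same cluster.
    max_diameter = max(
        (math.dist(p, q)
         for members in clusters.values()
         for p, q in combinations(members, 2)),
        default=0.0,
    )
    # Smallest distance between two points of different clusters.
    min_separation = min(
        math.dist(p, q)
        for a, b in combinations(list(clusters.values()), 2)
        for p in a
        for q in b
    )
    return math.inf if max_diameter == 0.0 else min_separation / max_diameter


# Made-up observation points in a 2D index plane.
stations = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
dunn_good = dunn_index(stations, [0, 0, 0, 1, 1, 1])
dunn_bad = dunn_index(stations, [0, 0, 1, 1, 1, 1])
```

A typical use is to run k-means for several candidate cluster counts and keep the partition with the highest index.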

Derivation of Digital Music's Ranking Change Through Time Series Clustering (시계열 군집분석을 통한 디지털 음원의 순위 변화 패턴 분류)

  • Yoo, In-Jin;Park, Do-Hyung
    • Journal of Intelligence and Information Systems / v.26 no.3 / pp.171-191 / 2020
  • This study focuses on digital music, which is the most valuable cultural asset in modern society and occupies a particularly important position in the flow of the Korean Wave. Digital music data were collected from the "Gaon Chart," a well-established music chart in Korea, covering the ranking changes of music that entered the chart over 73 weeks. Patterns with similar characteristics were then derived through time series cluster analysis, and a descriptive analysis was performed on the notable features of each pattern. The research process is as follows. First, in the data collection stage, time series data were collected to track the ranking changes of digital music. Subsequently, in the data processing stage, the collected data were matched with the rankings over time, and the music titles and artist names were processed. The analysis was then performed sequentially in two stages, exploratory analysis and explanatory analysis. The data collection period was limited to the period before 'the music bulk buying phenomenon', a reliability issue related to music rankings in Korea. Specifically, it spans 73 weeks, with December 31, 2017 to January 6, 2018 as the first week and May 19, 2019 to May 25, 2019 as the last. The analysis targets were limited to digital music released in Korea. Unlike the private music charts serviced in Korea, the Gaon Chart is approved by government agencies and has basic reliability; it can therefore be considered to have more public confidence than the ranking information provided by other services. The collected data are as follows. Data on the period and ranking, the name of the music, the name of the artist, the name of the album, the Gaon index, the production company, and the distribution company were collected for the music that entered the top 100 on the chart within the collection period. In total, 7,300 chart entries in the top 100 were identified over the 73 weeks. Because music frequently stays on the chart for more than two weeks, duplicates were removed during pre-processing: the number and location of duplicated entries were checked with a duplicate check function and then deleted to form the data for analysis. This yielded a list of 742 unique songs out of the 7,300 entries. A total of 16 patterns were then derived through time series cluster analysis of the ranking changes. Based on these patterns, two representative patterns were identified: 'Steady Seller' and 'One-Hit Wonder'. The two patterns were further subdivided into five patterns in consideration of the survival period of the music and the music ranking. The important characteristics of each pattern are as follows. First, the artist's superstar effect and the bandwagon effect were strong in the one-hit wonder pattern. Therefore, when consumers choose digital music, they are strongly influenced by the superstar effect and the bandwagon effect. Second, through the Steady Seller pattern, we confirmed the music that has been chosen by consumers for a very long time. In addition, we examined the patterns of the most selected music through consumer needs. Contrary to popular belief, the 'steady seller: mid-term' pattern, not the one-hit wonder pattern, received the most choices from consumers. Particularly noteworthy is that the 'Climbing the Chart' phenomenon, which is contrary to the existing pattern, was confirmed through the steady seller pattern. This study focuses on the change in the ranking of music over time, a field that has been relatively neglected in research on digital music. In addition, a new approach to music research was attempted by subdividing the patterns of ranking change rather than predicting the success and ranking of music.
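
The pre-processing described in this abstract, collapsing weekly chart entries into one ranking series per unique song, might look roughly like the sketch below. The row layout `(week, rank, title, artist)`, the function name `build_rank_series`, and the use of `None` for weeks outside the chart are assumptions for illustration; the abstract only says that a duplicate check function was used.

```python
def build_rank_series(rows, n_weeks):
    """Collapse chart rows into one weekly rank series per unique song.

    rows: iterable of (week, rank, title, artist), with week counted from 1.
    Weeks in which a song is absent from the chart are left as None.
    """
    series = {}
    for week, rank, title, artist in rows:
        key = (title, artist)  # a song is identified by title plus artist
        series.setdefault(key, [None] * n_weeks)[week - 1] = rank
    return series


# Made-up top-2 chart over 3 weeks: 6 entries but only 3 distinct songs.
chart_rows = [
    (1, 1, "Song A", "Artist X"), (1, 2, "Song B", "Artist Y"),
    (2, 1, "Song B", "Artist Y"), (2, 2, "Song A", "Artist X"),
    (3, 1, "Song C", "Artist Z"), (3, 2, "Song B", "Artist Y"),
]
rank_series = build_rank_series(chart_rows, n_weeks=3)
```

Applied to a 73-week top-100 chart, the same grouping turns 7,300 entries into one series per distinct song, which is the form a time series clustering step would take as input.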

Optimal Design of Fuzzy-Neural Network Structure Using HCM and Hybrid Identification Algorithm (HCM과 하이브리드 동정 알고리즘을 이용한 퍼지-뉴럴 네트워크 구조의 최적 설계)

  • Oh, Sung-Kwun;Park, Ho-Sung;Kim, Hyun-Ki
    • The Transactions of the Korean Institute of Electrical Engineers D / v.50 no.7 / pp.339-349 / 2001
  • This paper suggests an optimal identification method for complex and nonlinear system modeling that is based on Fuzzy-Neural Networks (FNN). The proposed hybrid identification algorithm is based on Yamakawa's FNN and uses simplified inference as the fuzzy inference method and the error back-propagation algorithm as the learning rule. In this paper, the FNN modeling implements parameter identification using the HCM algorithm and a hybrid structure that combines two types of optimization theories for nonlinear systems. We use the HCM (Hard C-Means) clustering algorithm to find the initial apexes of the membership functions. Parameters such as the apexes of membership functions, learning rates, and momentum coefficients are adjusted using the hybrid algorithm. The proposed hybrid identification algorithm is carried out using both a genetic algorithm and the improved complex method. Also, an aggregate objective function (performance index) with a weighting factor is introduced to achieve a sound balance between the approximation and generalization abilities of the model. By selecting and adjusting the weighting factor of the aggregate objective function, which depends on the number of data and the degree of nonlinearity (distribution of I/O data), we show that it is feasible and effective to design an optimal FNN model structure with a mutual balance and dependency between approximation and generalization abilities. To evaluate the performance of the proposed model, we use the time series data for a gas furnace, the data of a sewage treatment process, and the data of a traffic route choice process.
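
The abstract does not give the form of the aggregate objective function. One simple reading, assumed here purely for illustration, is a convex combination of the training error (approximation) and the testing error (generalization) controlled by a weighting factor `theta`. The names `mse`, `aggregate_index`, and `theta`, and the sample values, are hypothetical.

```python
def mse(actual, predicted):
    """Mean squared error between two equal-length sequences."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def aggregate_index(pi_train, pi_test, theta):
    """Assumed form: theta * training error + (1 - theta) * testing error."""
    if not 0.0 <= theta <= 1.0:
        raise ValueError("theta must lie in [0, 1]")
    return theta * pi_train + (1.0 - theta) * pi_test


# Made-up model outputs on a training split and a testing split.
pi = mse([1.0, 2.0, 3.0], [1.0, 2.0, 4.0])  # approximation error
e_pi = mse([4.0, 5.0], [4.0, 7.0])          # generalization error
balanced = aggregate_index(pi, e_pi, theta=0.5)
```

Under this assumed form, `theta = 1` optimizes the fit to training data only, `theta = 0` optimizes testing error only, and intermediate values trade one against the other.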

A New Approach of Self-Organizing Fuzzy Polynomial Neural Networks Based on Information Granulation and Genetic Algorithms (정보 입자화와 유전자 알고리즘에 기반한 자기구성 퍼지 다항식 뉴럴네트워크의 새로운 접근)

  • Park Ho-Sung;Oh Sung-Kwun;Kim Hyun-Ki
    • The Transactions of the Korean Institute of Electrical Engineers D / v.55 no.2 / pp.45-51 / 2006
  • In this paper, we propose a new architecture of Information Granulation based genetically optimized Self-Organizing Fuzzy Polynomial Neural Networks (IG_gSOFPNN) that is based on a genetically optimized multilayer perceptron with fuzzy polynomial neurons (FPNs), and we discuss its comprehensive design methodology involving mechanisms of genetic optimization, especially information granulation and genetic algorithms. The proposed IG_gSOFPNN gives rise to a structurally optimized network and comes with a substantial level of flexibility in comparison to conventional SOFPNNs. The design procedure applied in the construction of each layer of a SOFPNN deals with its structural optimization, involving the selection of preferred nodes (or FPNs) with specific local characteristics (such as the number of input variables, the order of the polynomial of the consequent part of fuzzy rules, and a collection of the specific subset of input variables), and addresses specific aspects of parametric optimization. In addition, the fuzzy rules used in the networks exploit the notion of information granules defined over the system's variables and formed through the process of information granulation. That is, we determine the initial locations (apexes) of the membership functions and the initial values of the polynomial functions used in the premise and consequence parts of the fuzzy rules, respectively. This granulation is realized with the aid of the hard c-means (HCM) clustering method. To evaluate the performance of the IG_gSOFPNN, the model is tested on two time series datasets (gas furnace process data and NOx process data).