• Title/Summary/Keyword: K means clustering

A Statistical Approach for Improving the Embedding Capacity of Block Matching based Image Steganography (블록 매칭 기반 영상 스테가노그래피의 삽입 용량 개선을 위한 통계적 접근 방법)

  • Kim, Jaeyoung;Park, Hanhoon;Park, Jong-Il
    • Journal of Broadcast Engineering / v.22 no.5 / pp.643-651 / 2017
  • Steganography is an information hiding technology that differs from cryptography in that it focuses on preventing third parties from detecting the existence of hidden information, rather than on protecting that information from being decoded. In this paper, as an image steganography method that uses images as media, we propose a new block matching method that embeds information in the discrete wavelet transform (DWT) domain. Based on a statistical analysis, the proposed method reduces the loss of embedding capacity caused by uneven use of candidate blocks. It computes the variance of each candidate block, preserves candidate blocks with high-frequency components, and reduces the number of candidate blocks with low-frequency components by compressing them with the k-means clustering algorithm. Compared with the previous block matching method, the proposed method can reconstruct secret images with similar PSNRs while embedding higher-capacity information.
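
The variance-then-cluster step described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the block representation (flattened tuples of coefficients), the names `kmeans` and `reduce_candidates`, and the parameters `var_threshold` and `k` are assumptions, since the abstract gives no block size, threshold, or cluster count.

```python
import random
from statistics import pvariance


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means; returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # k distinct data points as initial centroids
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c]))
                  for p in points]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(col) / len(members)
                                     for col in zip(*members))
    return centroids, labels


def reduce_candidates(blocks, var_threshold, k):
    """Keep high-variance blocks as they are and replace the low-variance
    blocks with k cluster centroids (hypothetical threshold and k)."""
    high, low = [], []
    for b in blocks:
        (high if pvariance(b) >= var_threshold else low).append(b)
    if len(low) > k:
        low, _ = kmeans(low, k)
    return high + list(low)
```

With two high-variance blocks and five low-variance blocks, `reduce_candidates(blocks, 5.0, 2)` returns the two high-variance blocks unchanged followed by two centroids that stand in for the low-variance group.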

Design and Assessment of an Ozone Potential Forecasting Model using Multi-regression Equations in Ulsan Metropolitan Area (중회귀 모형을 이용한 울산지역 오존 포텐셜 모형의 설계 및 평가)

  • Kim, Yoo-Keun;Lee, So-Young;Lim, Yun-Kyu;Song, Sang-Keun
    • Journal of Korean Society for Atmospheric Environment / v.23 no.1 / pp.14-28 / 2007
  • This study selected ozone ($O_3$) potential factors and designed and assessed an $O_3$ potential prediction model using multiple linear regression equations for the Ulsan area during springtime (April to June) of $2000{\sim}2004$. $O_3$ potential factors were selected by analyzing the relationship between meteorological parameters and surface $O_3$ concentrations. In addition, cluster analysis (e.g., average linkage and K-means clustering techniques) was performed to identify three major synoptic patterns ($P1{\sim}P3$) for the $O_3$ potential prediction model. P1 is characterized by the presence of a low-pressure system over northeastern Korea, with Ulsan influenced by a northwesterly synoptic flow that retards sea breeze development. P2 is characterized by a weakening high-pressure system over Korea, and P3 is clearly associated with a migratory anticyclone. Stepwise linear regression was performed to develop models for predicting the highest 1-h $O_3$ concentration in Ulsan. The results of the models were rather satisfactory: the high-$O_3$ simulation accuracy for the $P1{\sim}P3$ synoptic patterns was 79, 85, and 95%, respectively ($2000{\sim}2004$). The $O_3$ potential prediction model for $P1{\sim}P3$, driven by predicted meteorological data for 2005, showed good high-$O_3$ prediction performance, with 78, 75, and 70%, respectively. The regression models can therefore be a useful tool for forecasting local $O_3$ concentrations.

A change of the public's emotion depending on Temperature & Humidity index (온습도에 따른 대중의 감성(감정+감각) 활동 변화)

  • Yang, Junggi;Kim, Geunyoung;Lee, Youngho;Kang, Un-Gu
    • Journal of Digital Convergence / v.12 no.10 / pp.243-252 / 2014
  • Much research is in progress on how social media affects political, economic, and sociocultural phenomena. The authors used NAVER Trend, the best-known web browsing service in Korea, the NAVER Blog social media service, the NAVER Cafe service, and Open Data (API), together with temperature and humidity index data from the Korea Meteorological Administration. This study analyzed changes in the public's emotion in Korea using cluster analysis of taste-related vocabulary among words expressing feelings and senses. The number of groups was determined using the chi-square goodness-of-fit test and Ward analysis, and K-means clustering was then performed. Eight groups were formed, each representing sensibility vocabulary. Discriminant analysis showed that the eight groups determined by cluster analysis were classified with 98.9% accuracy. Changes in the public's emotion can be used to predict people's activity, so that people can share sensibility and develop a bond of sympathy.

The Habitat Classification of mammals in Korea based on the National Ecosystem Survey (전국자연환경조사를 활용한 포유류 서식지 유형의 분류)

  • Lee, Hwajin;Ha, Jeongwook;Cha, Jinyeol;Lee, Junghyo;Yoon, Heenam;Chung, Chulun;Oh, Hongshik;Bae, Soyeon
    • Journal of Environmental Impact Assessment / v.26 no.2 / pp.160-170 / 2017
  • The purpose of this study is to cluster habitat types and to identify the characteristics of the species in each habitat type using the mammal data (70,562) of the 3rd National Ecosystem Survey, conducted from 2006 to 2012. The 15 habitat types recorded on the field sheets of the 3rd National Ecosystem Survey were reclassified, and the mammal habitat types were then analyzed statistically. For the cluster analysis of habitat types, non-hierarchical cluster analysis (k-means cluster analysis), hierarchical cluster analysis, and non-metric multidimensional scaling were applied to the 14 habitat types recorded more than 30 times. A total of 7 orders, 16 families, and 39 species of mammals were identified in the nationwide data of the 3rd National Ecosystem Survey. The simple structure index was highest (ssi = 0.07) when the habitat types were classified into 11 clusters. According to the similarities and hierarchies between habitat types suggested by the hierarchical cluster analysis, residential areas were the most distinct habitat type for mammals, followed by a cluster comprising rivers and coasts. The non-metric multidimensional scaling analysis showed that both Mus musculus and Rattus norvegicus appeared only in residential areas, the most distinct habitat type, while Lutra lutra appeared only in coastal and river areas. In summary, according to our results, mammalian habitats can be divided into the following four types: (1) the forest type (using forest as the main habitat and migration route); (2) the river type (using water as the main habitat); (3) the residential type (living near residential areas); and (4) the lowland type (consuming grain or seeds as the main food resource).

Analysis method of patent document to Forecast Patent Registration (특허 등록 예측을 위한 특허 문서 분석 방법)

  • Koo, Jung-Min;Park, Sang-Sung;Shin, Young-Geun;Jung, Won-Kyo;Jang, Dong-Sik
    • Journal of the Korea Academia-Industrial cooperation Society / v.11 no.4 / pp.1458-1467 / 2010
  • Recently, imitation and infringement of intellectual property rights have come to be recognized as impediments to a nation's industrial growth. To prevent the huge losses that result from these impediments, many researchers are studying the protection and efficient management of intellectual property in various ways. In particular, predicting patent registration is a very important part of protecting and asserting intellectual property rights. In this study, we propose a patent document analysis method that uses text mining to predict whether a patent will be registered or rejected. The proposed method first builds a database from the word frequencies of rejected patent documents. It then compares this database with other patent documents to obtain a similarity value between each patent document and the database. We used k-means, a partitioning clustering algorithm, to select the criterion value for patent rejection. As a result, we concluded that patents similar to rejected patents have a strong possibility of rejection. We used U.S. patent documents on Bluetooth technology, solar battery technology, and display technology as experimental data.
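
The pipeline in this abstract (a word-frequency database built from rejected documents, a similarity score per document, then k-means to pick a rejection criterion) might look roughly like the sketch below. The abstract does not name the similarity measure or the number of clusters, so cosine similarity and a one-dimensional k-means with k = 2 are assumptions here, as are all function names.

```python
from collections import Counter
from math import sqrt


def word_freq(text):
    """Word-frequency vector of a document (whitespace tokenisation)."""
    return Counter(text.lower().split())


def build_database(rejected_docs):
    """Sum the word frequencies of all rejected documents."""
    db = Counter()
    for doc in rejected_docs:
        db.update(word_freq(doc))
    return db


def cosine(a, b):
    """Cosine similarity between two frequency vectors (assumed measure)."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def two_means_threshold(values, iters=50):
    """1-D k-means with k=2; returns the midpoint of the two cluster means."""
    lo, hi = min(values), max(values)
    for _ in range(iters):
        low = [v for v in values if abs(v - lo) <= abs(v - hi)]
        high = [v for v in values if abs(v - lo) > abs(v - hi)]
        if not high:  # all values fell into one cluster
            break
        new_lo, new_hi = sum(low) / len(low), sum(high) / len(high)
        if (new_lo, new_hi) == (lo, hi):  # converged
            break
        lo, hi = new_lo, new_hi
    return (lo + hi) / 2
```

Under these assumptions, a document whose similarity to the database exceeds the returned threshold would be flagged as likely to be rejected.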

Analyzing K-POP idol popularity factors using music charts and new media data using machine learning (머신러닝을 활용한 음원 차트와 뉴미디어 데이터를 활용한 K-POP 아이돌 인기 요인 분석)

  • Jiwon Choi;Dayeon Jung;Kangkyu Choi;Taein Lim;Daehoon Kim;Jongkyn Jung;Seunmin Rho
    • Journal of Platform Technology / v.12 no.1 / pp.55-66 / 2024
  • The K-POP market has become influential not only in culture but also in society as a whole, including diplomacy and environmental movements. As a result, various machine learning studies have been conducted to identify the success factors of idols using traditional data such as music and recordings. However, previous studies are limited in that they do not reflect the influence on idol popularity of new media platforms such as Instagram releases, YouTube shorts, TikTok, and Twitter. Because existing studies do not consider daily changing media trends, it is difficult to clarify the causal relationships behind recent idol success factors. To solve these problems, this paper proposes a data collection system and analysis methodology for idol-related data. By developing a container-based, automated real-time data collection system that reflects the specific characteristics of idol data, we secure the stability and scalability of idol data collection, and we compare and analyze the clusters of successful idols through a K-Means clustering-based outlier detection model. As a result, we were able to identify commonalities among successful idols, such as gender, time to success after album release, and association with new media. These findings are expected to support planning optimal comeback promotions for each idol, album type, and comeback period to improve the chances of idol success.
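
One common way to build an outlier detector on top of k-means is to flag points that lie unusually far from their nearest centroid. The sketch below illustrates that general idea only. The abstract does not describe the authors' model, so the cutoff rule (mean plus `z` standard deviations of the centroid distances) and the name `centroid_outliers` are assumptions, and the centroids are taken as already fitted.

```python
from math import dist
from statistics import mean, pstdev


def centroid_outliers(points, centroids, z=2.0):
    """Return indices of points whose distance to the nearest centroid
    exceeds mean + z * std of all such distances (assumed cutoff rule)."""
    d = [min(dist(p, c) for c in centroids) for p in points]
    cutoff = mean(d) + z * pstdev(d)
    return [i for i, di in enumerate(d) if di > cutoff]
```

For two tight groups around (0, 0) and (10, 10) plus one distant point at (30, 30), only the distant point is flagged.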

Transcriptome Analyses for the Anti-Adipogenic Mechanism of an Herbal Composition (생약복합물의 지방세포형성억제 기전규명을 위한 전사체 분석)

  • Lee, Hae-Yong;Kang, Ryun-Hwa;Bae, Sung-Min;Chae, Soo-Ahn;Lee, Jung-Ju;Oh, Dong-Jin;Park, Suk-Won;Cho, Soo-Hyun;Shim, Yae-Jie;Yoon, Yoo-Sik
    • Journal of Life Science / v.20 no.7 / pp.1054-1065 / 2010
  • SH21B is a natural composition composed of seven herbs: Scutellaria baicalensis Georgi, Prunus armeniaca Maxim, Ephedra sinica Stapf, Acorus gramineus Soland, Typha orientalis Presl, Polygala tenuifolia Willd and Nelumbo nucifera Gaertner (ratio 3:3:3:3:3:2:2). In our previous study, we reported that SH21B inhibited adipogenesis and fat accumulation in 3T3-L1 cells through modulation of various regulators in the adipogenesis pathway. The aim of this study was to analyze the transcriptome profiles underlying the anti-adipogenic effects of SH21B in 3T3-L1 cells. Total RNAs from SH21B-treated 3T3-L1 cells were reverse-transcribed into cDNAs and hybridized to the Affymetrix Mouse Gene 1.0 ST array. From the microarray analyses, we identified 2,568 genes whose expression changed more than two-fold in response to SH21B, and clustering analysis of these genes yielded 9 clusters. Three of the 9 clusters were down-regulated by SH21B (clusters 4, 6 and 9), and two clusters were up-regulated by SH21B (clusters 7 and 8) during the adipogenesis of 3T3-L1 cells. Many genes related to cell proliferation and adipogenesis were included in these clusters. Clusters 4, 6 and 9 included genes related to adipogenesis induction and cell cycle arrest. Clusters 7 and 8 included genes related to cell proliferation as well as adipogenesis inhibition. These results suggest that the anti-adipogenic effects of SH21B may arise from the modulation of genes involved in cell proliferation and adipogenesis.

Distribution Analysis of Optimal Equipment Assignment Using a Genetic Algorithm (유전알고리즘을 이용하여 최적화된 방제 자원 배치안의 분포도 분석)

  • Kim, Hye-Jin;Kim, Yong-Hyuk
    • Journal of the Korea Convergence Society / v.11 no.4 / pp.11-16 / 2020
  • In planning for oil spill accidents, research that collects and analyzes optimal equipment assignments is essential. However, studies that diversify and analyze the optimal equipment assignments for responding to oil spill accidents have not previously been conducted. In response to this need, we devised a genetic algorithm for optimal equipment assignment. The designed genetic algorithm yielded 10,000 optimal equipment assignments, which we clustered using the k-means algorithm. As a result, the two clusters for Yeosu, Daesan, and Ulsan, where the largest spills are expected, were clearly identified. We also projected the 16-dimensional data into two dimensions via Sammon's mapping and analyzed the distribution of the projected data. We confirmed that the results of the simulation were better than those of the optimal equipment assignments included in the cluster. In the future, it will be possible to implement an approximate model with excellent performance based on this study.
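
For reference, Sammon's mapping chooses the low-dimensional coordinates by minimizing the standard Sammon stress shown below. The notation is an assumption for illustration: $d^{*}_{ij}$ is the distance between assignments $i$ and $j$ in the original 16-dimensional space and $d_{ij}$ is their distance in the 2-dimensional projection.

```latex
E = \frac{1}{\sum_{i<j} d^{*}_{ij}}
    \sum_{i<j} \frac{\left(d^{*}_{ij} - d_{ij}\right)^{2}}{d^{*}_{ij}}
```

Dividing each squared error by $d^{*}_{ij}$ weights small original distances more heavily, so local neighborhood structure tends to be preserved in the projection.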

Design of Robust Face Recognition System Realized with the Aid of Automatic Pose Estimation-based Classification and Preprocessing Networks Structure

  • Kim, Eun-Hu;Kim, Bong-Youn;Oh, Sung-Kwun;Kim, Jin-Yul
    • Journal of Electrical Engineering and Technology / v.12 no.6 / pp.2388-2398 / 2017
  • In this study, we propose a face recognition system that is robust to pose variations, based on automatic pose estimation. A radial basis function neural network (RBFNN) is applied as one of the functional components of the overall face recognition system. The proposed system consists of preprocessing and recognition modules that address pose variation and high-dimensional pattern recognition problems. In the preprocessing part, principal component analysis (PCA) and 2-dimensional 2-directional PCA ($(2D)^2$ PCA) are applied. These functional modules are useful for reducing the dimensionality of the feature space. The proposed RBFNN architecture consists of three functional modules, the condition, conclusion, and inference phases, realized in terms of fuzzy "if-then" rules. In the condition phase of the fuzzy rules, the input space is partitioned by fuzzy clustering realized with the Fuzzy C-Means (FCM) algorithm. In the conclusion phase of the rules, the connections (weights) are realized through four types of polynomials: constant, linear, quadratic, and modified quadratic. The coefficients of the RBFNN model are obtained by the fuzzy inference method that constitutes the inference phase of the fuzzy rules. The essential design parameters of the networks (such as the number of nodes and the fuzzification coefficient) are optimized with the aid of Particle Swarm Optimization (PSO). Experimental results on standard face databases (Honda/UCSD, Cambridge Head pose, and IC&CI) demonstrate the effectiveness and efficiency of the face recognition system compared with other studies.
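
The fuzzy partition used in the condition phase can be illustrated with the standard FCM membership formula, in which the membership of a sample in cluster i is the reciprocal of the sum over all clusters j of (d_i / d_j) raised to the power 2 / (m - 1). The sketch below is a generic illustration rather than the paper's network: the function name and the default fuzzification coefficient `m = 2.0` are assumptions.

```python
from math import dist


def fcm_memberships(x, centers, m=2.0):
    """Fuzzy C-Means membership of sample x in each cluster center.

    m > 1 is the fuzzification coefficient; memberships sum to 1.
    """
    d = [dist(x, c) for c in centers]
    if 0.0 in d:
        # x coincides with one or more centers: split membership among them.
        hits = [1.0 if di == 0.0 else 0.0 for di in d]
        return [h / sum(hits) for h in hits]
    p = 2.0 / (m - 1.0)
    return [1.0 / sum((di / dj) ** p for dj in d) for di in d]
```

For example, a sample at distance 1 from one center and distance 2 from another receives memberships of 0.8 and 0.2 when m = 2.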

A Study on Economic Analysis Algorithm for Energy Storage System Considering Peak Reduction and a Special Tariff (피크저감과 특례요금제를 고려한 ESS 경제성 분석 알고리즘에 관한 연구)

  • Son, Joon-Ho
    • The Transactions of The Korean Institute of Electrical Engineers / v.67 no.10 / pp.1278-1285 / 2018
  • To save on electricity bills, energy storage systems (ESS) are being installed in factories, public buildings, and commercial buildings under a Time-of-Use (TOU) tariff, which consists of a demand charge (KRW/kW) and an energy charge (KRW/kWh). However, neither peak reduction nor the ESS special tariff is considered in existing analyses of the initial cost payback period (ICPP) of ESS, because it is difficult to reflect the base rate given the uncertain amount of peak demand reduction during mid-peak and on-peak periods on future days. As a result, the ICPP of ESS can be increased. Against this background, this paper presents an advanced analysis method for the ICPP of ESS. In the proposed algorithm, the representative days of the monthly electricity consumption pattern used for the amount of peak reduction are found by the k-means clustering algorithm. The total expected energy costs of the representative days are then minimized by optimal daily ESS operation that considers both peak reduction and the special tariff through mixed-integer linear programming (MILP). The amount of peak reduction is then set to the value at which the sum of the expected energy costs over 12 months is maximum. The annual benefit is determined by the amount of annual peak reduction. Two simulation cases are considered in this study: one considers only the special tariff, and the other considers both the special tariff and the amount of peak reduction. The ICPP in the proposed method is shortened by 18 months compared to the conventional method.
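
To make the two-part tariff concrete, the sketch below computes a bill from a demand charge on the peak load plus an hourly energy charge, and a simple undiscounted payback period. It is a simplified illustration under stated assumptions: the ESS special tariff, the MILP scheduling, and the k-means selection of representative days are not modelled, and all function names and figures are hypothetical.

```python
def monthly_bill(hourly_load_kw, hourly_rate_krw_per_kwh, demand_charge_krw_per_kw):
    """Two-part bill: demand charge on peak load + energy charge per hour.

    Loads are hourly averages, so one hour at x kW equals x kWh.
    """
    peak_kw = max(hourly_load_kw)
    energy_cost = sum(load * rate for load, rate
                      in zip(hourly_load_kw, hourly_rate_krw_per_kwh))
    return demand_charge_krw_per_kw * peak_kw + energy_cost


def simple_payback_months(initial_cost_krw, monthly_saving_krw):
    """Undiscounted payback period in months (infinite if nothing is saved)."""
    if monthly_saving_krw <= 0:
        return float("inf")
    return initial_cost_krw / monthly_saving_krw


# Hypothetical three-hour example: the ESS shifts 5 kWh out of the peak hour.
rates = [50, 100, 200]  # off-peak, mid-peak, on-peak (KRW/kWh)
before = monthly_bill([10, 20, 30], rates, 1000)
after = monthly_bill([10, 25, 25], rates, 1000)
saving = before - after
```

In this toy case most of the saving comes from the lower peak (30 kW down to 25 kW), which is consistent with the abstract's point that ignoring peak reduction can lengthen the estimated payback period.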