• Title/Summary/Keyword: partitioning approach

Hybrid Movie Recommendation System Using Clustering Technique (클러스터링 기법을 이용한 하이브리드 영화 추천 시스템)

  • Sophort Siet;Sony Peng;Yixuan Yang;Sadriddinov Ilkhomjon;DaeYoung Kim;Doo-Soon Park
    • Annual Conference of KIPS
    • /
    • 2023.05a
    • /
    • pp.357-359
    • /
    • 2023
  • This paper proposes a hybrid recommendation system (RS) model that overcomes the limitations of traditional approaches, such as data sparsity, cold start, and scalability, by combining collaborative filtering and context-aware techniques. The objective of this model is to enhance the accuracy of recommendations and provide personalized suggestions by leveraging the strengths of collaborative filtering and incorporating user context features to capture preferences and behavior more effectively. The approach utilizes a novel method that combines contextual attributes with the original user-item rating matrix of CF-based algorithms. Furthermore, we integrate k-means++ clustering to group users with similar preferences and finally recommend items that have been highly rated by other users in the same cluster. Partitioning the rating matrix into clusters based on contextual information offers several advantages. First, it bypasses computation over the entire dataset, reducing runtime and improving scalability. Second, the partitioned clusters hold similar ratings, which exert greater influence on each other, leading to more accurate recommendations and providing flexibility in the clustering process. keywords: Context-aware Recommendation, Collaborative Filtering, K-means++ Clustering.
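
The clustering step the abstract describes can be pictured with a short sketch. This is a minimal illustration under stated assumptions: a toy random rating matrix and made-up context features stand in for the paper's dataset, and scikit-learn's KMeans with `init="k-means++"` stands in for the k-means++ step.

```python
# Minimal sketch: append context features to each user's rating vector,
# cluster users with k-means++, then recommend the items rated highest
# within the user's cluster. Data and feature names are hypothetical.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
ratings = rng.integers(0, 6, size=(100, 20)).astype(float)  # user-item matrix, 0 = unrated
context = rng.random((100, 3))                              # e.g. time-of-day, device, mood

features = np.hstack([ratings, context])                    # contextual attributes + ratings
clusters = KMeans(n_clusters=5, init="k-means++", n_init=10,
                  random_state=0).fit_predict(features)

def recommend(user, k=3):
    """Suggest the k items rated highest on average within the user's cluster."""
    peers = ratings[clusters == clusters[user]]
    counts = (peers > 0).sum(axis=0)                 # raters per item (0 = unrated)
    scores = peers.sum(axis=0) / np.maximum(counts, 1)
    scores[counts == 0] = -np.inf                    # nobody in the cluster rated it
    scores[ratings[user] > 0] = -np.inf              # skip items the user already rated
    return np.argsort(scores)[::-1][:k]

print(recommend(user=0))
```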

Multi-Threaded Parallel H.264/AVC Decoder for Multi-Core Systems (멀티코어 시스템을 위한 멀티스레드 H.264/AVC 병렬 디코더)

  • Kim, Won-Jin;Cho, Keol;Chung, Ki-Seok
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.47 no.11
    • /
    • pp.43-53
    • /
    • 2010
  • Wide deployment of high-resolution video services has led to active studies on high-speed video processing. In particular, the prevalence of multi-core systems accelerates research on high-resolution video processing based on the parallelization of multimedia software. In this paper, we propose a novel parallel H.264/AVC decoding scheme on a multi-core platform. Parallel H.264/AVC decoding is challenging not only because parallelization may incur significant synchronization overhead but also because the software may have complicated dependencies. To overcome these issues, we propose a novel approach called Multi-Threaded Parallelization (MTP). In MTP, to reduce synchronization overhead, a separate thread is allocated to each stage in the pipeline. In addition, an efficient memory-reuse technique is used to reduce the memory requirement. To verify the effectiveness of the proposed approach, we parallelized the FFmpeg H.264/AVC decoder with the proposed technique using OpenMP and carried out experiments on an Intel quad-core platform. The proposed design performs 53% better than the FFmpeg H.264/AVC decoder before parallelization. We also reduced memory usage by 65% and 81% for high-definition (HD) and full high-definition (FHD) video, respectively, compared with the popular existing method called 2D-Wave.
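
The one-thread-per-pipeline-stage structure can be illustrated with a small sketch. This is only an illustration of the threading pattern under stated assumptions (dummy string "frames", two stand-in stages); it is not the paper's decoder, which was parallelized in C with OpenMP.

```python
# Minimal sketch of a stage-per-thread pipeline: each stage owns a thread,
# stages communicate through bounded queues (which cap buffered frames,
# mirroring the memory-reuse concern), and a None "poison pill" shuts the
# pipeline down. The stages here are dummies, not real decoding steps.
import threading
import queue

def run_stage(inbox, outbox, work):
    while True:
        item = inbox.get()
        if item is None:              # poison pill: propagate and stop
            if outbox is not None:
                outbox.put(None)
            break
        result = work(item)
        if outbox is not None:
            outbox.put(result)

q_in, q_mid = queue.Queue(maxsize=4), queue.Queue(maxsize=4)
threads = [
    threading.Thread(target=run_stage, args=(q_in, q_mid, lambda f: f + " -> parsed")),
    threading.Thread(target=run_stage, args=(q_mid, None, print)),
]
for t in threads:
    t.start()
for i in range(8):
    q_in.put(f"frame {i}")
q_in.put(None)                        # begin shutdown
for t in threads:
    t.join()
```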

Application of Spatial Data Integration Based on the Likelihood Ratio Function and Bayesian Rule for Landslide Hazard Mapping (우도비 함수와 베이지안 결합을 이용한 공간통합의 산사태 취약성 분석에의 적용)

  • Chi, Kwang-Hoon;Chung, Chang-Jo F.;Kwon, Byung-Doo;Park, No-Wook
    • Journal of the Korean earth science society
    • /
    • v.24 no.5
    • /
    • pp.428-439
    • /
    • 2003
  • Landslides, as a geological hazard, have caused extensive damage to property and have sometimes resulted in loss of life. It is therefore necessary to assess areas vulnerable to future landslides in order to mitigate the damage they cause. For this purpose, spatial data integration has been developed and applied to landslide hazard mapping. Among various models, this paper investigates and discusses the effectiveness of the Bayesian spatial data integration approach to landslide hazard mapping. In this study, several data sets related to landslide occurrences in Jangheung, Korea were constructed using GIS and then digitally represented using the likelihood ratio function. By computing the likelihood ratio, we obtained quantitative relationships between the input data and landslide occurrences. The likelihood ratio functions were combined using the Bayesian combination rule. In order for the predicted results to provide meaningful interpretations with respect to future landslides, we carried out validation based on spatial partitioning of the landslide distribution. As a result, the Bayesian approach based on a likelihood ratio function can effectively integrate various spatial data for landslide hazard mapping, and the suggestions in this study are expected to be helpful in further applications, including the integration and interpretation stages needed to obtain a decision-support layer.
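
The combination rule can be made concrete with a small numeric sketch: under conditional independence, the posterior log-odds at a map cell are the prior log-odds plus the sum of the log likelihood ratios observed there. The prior and the per-layer ratio values below are invented for illustration, not the Jangheung study's values.

```python
# Minimal sketch of Bayesian combination of likelihood ratios across
# evidence layers. Layer names and numbers are hypothetical.
import math

prior_odds = 0.01 / 0.99        # assumed prior probability of a landslide cell

# likelihood ratio P(evidence | landslide) / P(evidence | no landslide)
# observed at one map cell, one value per input layer
layer_ratios = {"slope": 3.2, "geology": 1.5, "land_cover": 0.8}

log_odds = math.log(prior_odds) + sum(math.log(r) for r in layer_ratios.values())
posterior = 1.0 / (1.0 + math.exp(-log_odds))
print(f"posterior landslide probability: {posterior:.3f}")
```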

An Energy-Efficient Clustering Using Division of Cluster in Wireless Sensor Network (무선 센서 네트워크에서 클러스터의 분할을 이용한 에너지 효율적 클러스터링)

  • Kim, Jong-Ki;Kim, Yoeng-Won
    • Journal of Internet Computing and Services
    • /
    • v.9 no.4
    • /
    • pp.43-50
    • /
    • 2008
  • Various studies are being conducted to achieve efficient routing and reduce energy consumption in wireless sensor networks, where replenishing energy is difficult. Among routing mechanisms, the clustering technique is known to be the most efficient. The clustering technique consists of cluster construction and data transmission. The steps that construct a cluster are repeated at regular intervals in order to equalize energy consumption among the sensor nodes in a cluster. The algorithms for selecting a cluster head node and arranging the cluster member nodes optimized for that head node are complex and consume considerable energy. Furthermore, energy consumption for data transmission is proportional to $d^2$ or $d^4$, depending on which side of the crossover region the transmission distance falls. This paper proposes a means of reducing energy consumption by increasing the efficiency of the cluster construction steps that are regularly repeated in the clustering technique. The proposed approach keeps the number of sensor nodes in a cluster constant by equally partitioning the deployment region with node density taken into account during cluster construction, and it reduces energy consumption by selecting head nodes near the center of the cluster. Simulation experiments confirmed that the proposed approach consumes less energy than the LEACH algorithm.
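
The $d^2$/$d^4$ behavior mentioned above matches the widely used first-order radio model, sketched below together with the pick-the-node-nearest-the-center head selection the abstract describes. The constants are the textbook values of that model, an assumption on my part, not the paper's parameters.

```python
# Minimal sketch: transmission energy grows as d^2 below the crossover
# distance and d^4 above it (first-order radio model, textbook constants),
# so placing the cluster head near the centroid shortens member-to-head
# distances and saves energy. Node coordinates are hypothetical.
import math

E_ELEC = 50e-9          # J/bit, electronics energy
EPS_FS = 10e-12         # J/bit/m^2, free-space amplifier
EPS_MP = 0.0013e-12     # J/bit/m^4, multipath amplifier
D0 = math.sqrt(EPS_FS / EPS_MP)   # crossover distance (~87.7 m)

def tx_energy(bits, d):
    if d < D0:
        return bits * (E_ELEC + EPS_FS * d ** 2)
    return bits * (E_ELEC + EPS_MP * d ** 4)

nodes = [(10, 12), (40, 8), (25, 30), (22, 18)]
cx = sum(x for x, _ in nodes) / len(nodes)
cy = sum(y for _, y in nodes) / len(nodes)
head = min(nodes, key=lambda p: (p[0] - cx) ** 2 + (p[1] - cy) ** 2)
print(f"crossover d0 = {D0:.1f} m, head node = {head}, "
      f"cost of a 30 m hop = {tx_energy(4000, 30):.2e} J")
```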

Service Identification of Component-Based System for Service-Oriented Architecture (서비스 지향 아키텍처를 위한 컴포넌트기반 시스템의 서비스 식별)

  • Lee, Hyeon-Joo;Choi, Byoung-Ju;Lee, Jung-Won
    • Journal of KIISE:Software and Applications
    • /
    • v.35 no.2
    • /
    • pp.70-80
    • /
    • 2008
  • Today, businesses have to respond with flexibility and speed to ever-changing customer demands and market opportunities. Service-oriented architecture (SOA) is the best methodology for minimizing the complexity and cost of enterprise-level infrastructure and for maximizing the productivity and flexibility of an enterprise. Most enterprise-level SOA delivery strategies take a top-down approach, in which the organization has to define the business processes, model the business services, and find the required services or develop new ones. However, many organizations also want to maximally reuse their legacy component-based systems while delivering SOA. In this paper, we propose a bottom-up approach for identifying business services with proper granularity. It can improve the reusability and maintainability of services by considering GUI event patterns rather than the data I/O of legacy application components. Our proposed method was applied to an MIS with 129 GUIs and 13 components. As a result, the variance of the component coupling values increased fivefold and three business services were distinctly exposed. The method also provides a 49% improvement in reducing inter-service relationship problems over a service identification method that uses only component partitioning information.
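
As a rough illustration of grouping by GUI event patterns rather than data I/O, the sketch below unions components invoked by the same GUI event into one candidate service boundary. The event and component names are hypothetical, and the paper's actual identification method is more elaborate than this.

```python
# Minimal sketch: components that co-occur in the same GUI events are
# merged (union-find) into candidate business services.
from collections import defaultdict

# which components each GUI event invokes (hypothetical log)
event_log = {
    "btnSaveOrder":   {"OrderMgr", "InventoryMgr"},
    "btnCancelOrder": {"OrderMgr"},
    "btnPrintReport": {"ReportGen"},
}

parent = {}

def find(c):
    # simple union-find without path compression
    while parent.setdefault(c, c) != c:
        c = parent[c]
    return c

for comps in event_log.values():
    comps = list(comps)
    for other in comps[1:]:
        parent[find(other)] = find(comps[0])

services = defaultdict(set)
for comps in event_log.values():
    for c in comps:
        services[find(c)].add(c)
print([sorted(s) for s in services.values()])
# -> [['InventoryMgr', 'OrderMgr'], ['ReportGen']]
```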

An Implementation of an Edge-based Algorithm for Separating and Intersecting Spherical Polygons (구 볼록 다각형 들의 분리 및 교차를 위한 간선 기반 알고리즘의 구현)

  • Ha, Jong-Seong;Cheon, Eun-Hong
    • Journal of KIISE:Computer Systems and Theory
    • /
    • v.28 no.9
    • /
    • pp.479-490
    • /
    • 2001
  • In this paper, we consider the method of partitioning a sphere into faces with a set of spherical convex polygons $\Gamma = \{P_1, \ldots, P_n\}$ for determining the maximum or minimum intersection. This problem is closely related to five geometric problems: finding the densest hemisphere containing the maximum subset of $\Gamma$, a great circle separating $\Gamma$, a great circle bisecting $\Gamma$, and a great circle intersecting the minimum or maximum subset of $\Gamma$. In order to efficiently compute the minimum or maximum intersection of spherical polygons, we take the approach of edge-based partitioning, in which the ownerships of edges rather than faces are manipulated as the sphere is incrementally partitioned by each of the polygons. Finally, by gathering the unordered split edges with the maximum number of ownerships, we approximately obtain the centroids of the solution faces without constructing their boundaries. Our algorithm for finding the maximum intersection is analyzed to have an efficient time complexity of $O(nv)$, where $n$ and $v$, respectively, are the numbers of polygons and of all vertices. Furthermore, it is practical from an implementation standpoint, since it computes numerical values robustly and deals with all the degenerate cases. Using a similar approach, the boundary of a general intersection can be constructed in $O(nv + L \log L)$ time, where $L$ is the output-sensitive number of solution edges.
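
The geometric primitive such incremental partitioning rests on can be sketched briefly: a great circle is the set of unit vectors orthogonal to a normal $n$, and the sign of $p \cdot n$ tells which hemisphere a point $p$ falls in. The points below are random stand-ins; the paper's edge-ownership bookkeeping is of course much richer than this.

```python
# Minimal sketch of the side-of-great-circle test used when a sphere is
# incrementally cut by polygon edges. Test data is random, not the paper's.
import numpy as np

def side(point, normal):
    """+1 / -1 / 0: which side of the great circle with this normal."""
    return int(np.sign(np.dot(point, normal)))

def hemisphere_count(points, normal):
    """How many of the given unit vectors lie in the positive hemisphere."""
    return sum(side(p, normal) > 0 for p in points)

rng = np.random.default_rng(1)
pts = rng.normal(size=(6, 3))
pts /= np.linalg.norm(pts, axis=1, keepdims=True)   # project onto the unit sphere
print(hemisphere_count(pts, normal=np.array([0.0, 0.0, 1.0])))
```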

Accelerated Learning of Latent Topic Models by Incremental EM Algorithm (점진적 EM 알고리즘에 의한 잠재토픽모델의 학습 속도 향상)

  • Chang, Jeong-Ho;Lee, Jong-Woo;Eom, Jae-Hong
    • Journal of KIISE:Software and Applications
    • /
    • v.34 no.12
    • /
    • pp.1045-1055
    • /
    • 2007
  • Latent topic models are statistical models that automatically capture, in a probabilistic way, salient patterns or correlations among the features underlying a data collection. They are gaining popularity as an effective tool for automatic semantic feature extraction from text corpora, multimedia data analysis including image data, and bioinformatics. Among the important issues for applying latent topic models effectively to massive data sets is efficient learning of the model. This paper proposes an accelerated learning technique for the PLSA model, one of the popular latent topic models, based on an incremental EM algorithm instead of the conventional EM algorithm. The incremental EM algorithm is characterized by a series of partial E-steps performed on corresponding subsets of the entire data collection, unlike the conventional EM algorithm, where one batch E-step is done for the whole data set. By replacing the single batch E-step and M-step with a series of partial E-steps and M-steps, the inference result for the previous data subset is directly reflected in the next inference step, which speeds up learning over the entire data set. The algorithm is also advantageous in that it is guaranteed to converge to a local maximum solution and can be implemented with only a slight modification of the existing algorithm based on conventional EM. We present the basic application of the incremental EM algorithm to the learning of PLSA and empirically evaluate the acceleration performance with several possible data partitioning methods for practical application. Experimental results on a real-world news data set show that the proposed approach achieves a meaningful enhancement of the convergence rate in learning the latent topic model. Additionally, we present an interesting result that supports a possible synergistic effect of combining the incremental EM algorithm with parallel computing.
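
The partial E-step pattern generalizes beyond PLSA, so a compact way to see it is on a toy model. The sketch below is my own construction, not the paper's code: it runs incremental EM on a 1-D two-component Gaussian mixture, where each block's cached sufficient statistics are retracted and replaced after its partial E-step and an M-step follows immediately, so earlier blocks' inferences feed the next block's update.

```python
# Minimal sketch of incremental EM (partial E-step per data block, then an
# immediate M-step) on a toy 1-D Gaussian mixture. The mechanism mirrors
# the one described for PLSA; the data is synthetic.
import numpy as np

rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(-2, 1, 300), rng.normal(3, 1, 300)])
rng.shuffle(x)
blocks = np.array_split(x, 6)

mu = np.array([-1.0, 1.0]); var = np.array([1.0, 1.0]); pi = np.array([0.5, 0.5])
stats = [None] * len(blocks)          # cached per-block sufficient statistics
total = np.zeros((3, 2))              # rows: sum r, sum r*x, sum r*x^2

def responsibilities(xb):
    dens = pi * np.exp(-0.5 * (xb[:, None] - mu) ** 2 / var) / np.sqrt(var)
    return dens / dens.sum(axis=1, keepdims=True)

for sweep in range(5):
    for b, xb in enumerate(blocks):
        r = responsibilities(xb)                      # partial E-step on one block
        new = np.stack([r.sum(0), (r * xb[:, None]).sum(0),
                        (r * xb[:, None] ** 2).sum(0)])
        if stats[b] is not None:
            total -= stats[b]                         # retract the block's old stats
        stats[b] = new
        total += new
        n, s1, s2 = total                             # immediate M-step
        mu = s1 / n
        var = s2 / n - mu ** 2
        pi = n / n.sum()
print(mu, var, pi)
```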

Further Evidence of Linkage at the tva and tvc Loci in the Layer Lines and a Possibility of Polyallelism at the tvc Locus

  • Ghosh, A.K.;Pani, P.K.
    • Asian-Australasian Journal of Animal Sciences
    • /
    • v.18 no.5
    • /
    • pp.601-605
    • /
    • 2005
  • Three lines of White Leghorn (WL) chickens (IWJ, IWG and IWC) maintained at the Central Avian Research Institute, Izatnagar (UP), were used for chorioallantoic membrane (CAM) and liver tumour (LT) assays. Eleven-day-old embryos of each line were partitioned into three groups and inoculated with 0.2 ml of subgroup A, subgroup C, or an equal mixture of subgroup A and C Rous sarcoma virus (RSV). The subgroup virus receptor on the cell surface membrane is coded for by the tumour virus a (tva) locus for subgroup A and by the tumour virus c (tvc) locus for subgroup C. The random association of the genes at the tva and tvc loci in the IWJ and IWC lines was assessed, and the $\chi^2$-values for the phenotypic classes were found to be significant, indicating linkage between the tva and tvc loci. The linkage value was estimated to be 0.09 on a pooled-sex and pooled-line basis. On the basis of four subclass tumour phenotypes, a 4-allele model was proposed for the tva locus having $a^{s1}$, $a^{s2}$, $a^{r1}$ and $a^{r2}$ alleles, with frequencies calculated as 0.47, 0.13, 0.13 and 0.27 for the IWJ line, 0.31, 0.33, 0.14 and 0.22 for the IWG line, and 0.44, 0.11, 0.21 and 0.24 for the IWC line, respectively. Similarly, for the tvc locus the frequencies of the four alleles, i.e. $c^{s1}$, $c^{s2}$, $c^{r1}$ and $c^{r2}$, were calculated as 0.42, 0.20, 0.21 and 0.17 for the IWJ line, 0.42, 0.17, 0.27 and 0.14 for the IWG line, and 0.30, 0.21, 0.16 and 0.33 for the IWC line, respectively. The $\chi^2$-values for all classes of observations were not significant (p>0.05), indicating a good fit of the 4-allele model to the occurrence of the 4-subclass tumour phenotypes for the tva and tvc loci. On the basis of the 2-allele model, the tva and tvc loci each carry three genotypes, whereas on the basis of the 4-allele model they each carry 10 genotypes. The interaction between A-resistance and C-resistance (both CAM and LT death) was ascertained by taking the 10 genotypes of the tva locus and 3 genotypes of the tvc locus, pooling the lines, and partitioning the observations into 3 classes. The $\chi^2$-values for the genotypic classes of CAM(-)LT(+) and CAM(-)LT(-) phenotypes under mixed virus (A+C) infection were found to be highly significant (p<0.01), indicating increased resistance and joint segregation of the $a^r$ and $c^r$ genes, and suggesting close linkage between the tva and tvc loci. Therefore, an indirect selection approach using subgroup C viruses can be employed to generate stocks resistant to subgroup A LLV, obviating contamination with the most common agent causing LL under field conditions.
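
The linkage tests above rest on a standard chi-square test of independence between the phenotypes at the two loci: a significant deviation of the joint phenotype counts from the product of the marginals indicates non-random association, i.e. linkage. The sketch below shows the computation on made-up counts (not the paper's data) using SciPy.

```python
# Minimal sketch of a chi-square test of independence for two-locus
# phenotype data. The contingency counts are hypothetical.
from scipy.stats import chi2_contingency

# rows: subgroup-A phenotype (susceptible / resistant)
# cols: subgroup-C phenotype (susceptible / resistant)
counts = [[52, 8],
          [9, 31]]
chi2, p, dof, expected = chi2_contingency(counts)
print(f"chi2 = {chi2:.2f}, p = {p:.4f}  (p < 0.05 suggests linkage)")
```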

User-Perspective Issue Clustering Using Multi-Layered Two-Mode Network Analysis (다계층 이원 네트워크를 활용한 사용자 관점의 이슈 클러스터링)

  • Kim, Jieun;Kim, Namgyu;Cho, Yoonho
    • Journal of Intelligence and Information Systems
    • /
    • v.20 no.2
    • /
    • pp.93-107
    • /
    • 2014
  • In this paper, we report what we have observed with regard to user-perspective issue clustering based on multi-layered two-mode network analysis. This work is significant in the context of companies' efforts to collect data about customer needs. Most companies have failed to uncover customers' needs for products or services properly using demographic data such as age, income level, and purchase history. Because of excessive reliance on limited internal data, most recommendation systems do not provide decision makers with business information appropriate to current circumstances. Part of the problem is the increasing regulation of personal data gathering and privacy, which makes demographic or transaction data collection more difficult and is a significant hurdle for traditional recommendation approaches, because these systems demand a great deal of personal data or transaction logs. Our motivation for presenting this paper is our strong belief, and evidence, that most customers' requirements for products can be effectively and efficiently analyzed from unstructured textual data such as Internet news text. In order to derive users' requirements from textual data obtained online, the approach proposed in this paper constructs double two-mode networks, a user-news network and a news-issue network, and integrates them into one quasi-network as the input for issue clustering. One contribution of this research is a methodology that utilizes enormous amounts of unstructured textual data for user-oriented issue clustering by leveraging existing text mining and social network analysis. Building multi-layered two-mode networks from news logs requires tools for text mining and topic analysis: we used SAS Enterprise Miner 12.1, which provides text-miner and cluster modules for textual data analysis, as well as NetMiner 4 for network visualization and analysis. Our approach for user-perspective issue clustering is composed of six main phases: crawling, topic analysis, access pattern analysis, network merging, network conversion, and clustering. In the first phase, we collect visit logs for news sites with a crawler. After gathering the unstructured news article data, the topic analysis phase extracts issues from each news article in order to build an article-issue network. For simplicity, 100 topics are extracted from 13,652 articles. In the third phase, a user-article network is constructed from access patterns derived from web transaction logs. The double two-mode networks are then merged into a user-issue quasi-network. Finally, in the user-oriented issue-clustering phase, we classify issues through structural equivalence and compare the outcome with the clustering results from statistical tools and network analysis. An experiment with a large dataset was performed to build the multi-layered two-mode network, after which we compared the results of issue clustering from SAS with those of the network analysis. The experimental dataset came from a web-site-ranking service and the biggest portal site in Korea; the sample contains 150 million transaction logs and 13,652 news articles from 5,000 panel users over one year. User-article and article-issue networks were constructed and merged into a user-issue quasi-network using NetMiner. Our issue-clustering results, which applied the Partitioning Around Medoids (PAM) algorithm and Multidimensional Scaling (MDS), are consistent with the results from SAS clustering.
In spite of extensive efforts to provide user information through recommendation systems, most projects succeed only when companies have sufficient data about users and transactions. Our proposed methodology, user-perspective issue clustering, can provide practical support for decision-making in companies because it enriches user-related data from unstructured textual data. To overcome the insufficient-data problem of traditional approaches, our methodology infers customers' real interests from web transaction logs. In addition, we suggest topic analysis and issue clustering as a practical means of issue identification.
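
The network-merging step lends itself to a short sketch: multiplying the user-article and article-issue biadjacency matrices yields the user-issue quasi-network, whose issue columns can then be grouped by structural equivalence. The matrices below are tiny random stand-ins, and plain agglomerative clustering stands in for the paper's PAM step.

```python
# Minimal sketch: merge two two-mode networks by matrix multiplication,
# then cluster issues whose user profiles are structurally equivalent.
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import pdist

rng = np.random.default_rng(0)
user_article = (rng.random((50, 200)) < 0.05).astype(float)   # visit logs
article_issue = (rng.random((200, 10)) < 0.10).astype(float)  # topic assignments

user_issue = user_article @ article_issue    # merged quasi-network (users x issues)

# structurally equivalent issues attract the same users, so cluster the
# issue profiles (columns) by the distance between them
dist = pdist(user_issue.T, metric="correlation")
labels = fcluster(linkage(dist, method="average"), t=3, criterion="maxclust")
print(labels)
```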

Antioxidant Activity of Different Parts of Lespedeza bicolor and Isolation of Antioxidant Compound (싸리나무(Lespedeza bicolor) 부위별 추출물의 항산화 활성 및 항산화물질 분리)

  • Lee, Jae-Hak;Jhoo, Jin-Woo
    • Korean Journal of Food Science and Technology
    • /
    • v.44 no.6
    • /
    • pp.763-771
    • /
    • 2012
  • In this study, the total antioxidant properties of extracts from different parts of Lespedeza bicolor were determined by measuring 1,1-diphenyl-2-picrylhydrazyl (DPPH) and 2,2'-azino-bis(3-ethylbenzothiazoline-6-sulfonic acid) (ABTS) radical scavenging activity and total phenolic content. The total antioxidant activities of leaf, stem, and root extracts obtained with various solvents (water, 50%, 70%, and 100% ethanol, and hot water) indicated that the 50% and 70% ethanol extracts have high radical scavenging activities and phenolic contents. A systematic approach was used to determine the total antioxidant activity of different solvent fractions of the Lespedeza bicolor extracts, partitioning with chloroform, ethyl acetate, n-butanol, and water, and the ethyl acetate fraction was found to have the strongest antioxidant activity. Antioxidant assay-guided isolation was then carried out to isolate potential antioxidant compounds. The ethyl acetate fraction of the leaf extract was subjected successively to silica gel, LH-20, and RP-18 column chromatography, affording compound 1, which was identified as eriodictyol by NMR and MS analysis, after which its antioxidant activity was determined.