• Title/Summary/Keyword: model-based cluster

Analysis of Categories of Internationalization Strategy by Korean Ventures and Their Performances (한국 벤처기업 국제화 전략의 유형과 성과 분석)

  • Lee, Gi-Whan;Choi, Bong-Ho
    • Korea Trade Review / v.43 no.4 / pp.177-217 / 2018
  • The purpose of this study is to classify the types of internationalization strategies used by Korean ventures and to examine whether these types are significantly related to internationalization performance. Specifically, we test the feasibility of the study model through empirical analysis. As criteria for typification, a venture's capability for international entrepreneurship and its capability for effectuation were chosen, and a model in which three types exist based on these capabilities is established. The characteristics of each type and the contents of its internationalization strategy are explained, and an empirical analysis is conducted. We also test whether internationalization performance differs significantly across types. The cluster analysis identified three types: pioneer, careful preparation, and passive response. In addition, these three types differ significantly in reputation in foreign markets and in the accumulation of knowledge in international management. This implies that performance differs significantly among venture types according to their internationalization strategy positions. Therefore, the type of venture should be considered when a venture establishes its internationalization strategy and when governments set support policies for venture companies.

Development of Retargetable Hadoop Simulation Environment Based on DEVS Formalism (DEVS 형식론 기반의 재겨냥성 하둡 시뮬레이션 환경 개발)

  • Kim, Byeong Soo;Kang, Bong Gu;Kim, Tag Gon;Song, Hae Sang
    • Journal of the Korea Society for Simulation / v.26 no.4 / pp.51-61 / 2017
  • The Hadoop platform is a representative platform for storing and managing big data. Hadoop consists of a distributed computing system called MapReduce and a distributed file system called HDFS. It is important to analyze how effectiveness changes with the cluster configuration and several parameters. However, since it is hard to construct thousands of clusters and analyze the constructed system, a simulation method is required. This paper proposes a Hadoop simulator based on the DEVS formalism, which provides hierarchical and modular modeling. The simulator provides a retargetable experimental environment in which various parameters, algorithms, and models can be changed. It is also possible to design input models that reflect the characteristics of Hadoop applications. For the user's convenience, a user interface, a real-time model viewer, and an input scenario editor are also provided. We validate the Hadoop simulator by comparing it with Hadoop execution results and perform various experiments.

The Strength of the Relationship between Semantic Similarity and the Subcategorization Frames of the English Verbs: a Stochastic Test based on the ICE-GB and WordNet (영어 동사의 의미적 유사도와 논항 선택 사이의 연관성 : ICE-GB와 WordNet을 이용한 통계적 검증)

  • Song, Sang-Houn;Choe, Jae-Woong
    • Language and Information / v.14 no.1 / pp.113-144 / 2010
  • The primary goal of this paper is to find a feasible way to answer the question: Does the similarity in meaning between verbs relate to the similarity in their subcategorization? To answer this question concretely on the basis of a large set of English verbs, this study made use of various language resources, tools, and statistical methodologies. We first compiled a list of 678 verbs selected from the most frequent and second most frequent word lists of the Collins COBUILD English Dictionary that also appear in WordNet 3.0. We calculated similarity measures between all pairs of these words using the 'jcn' algorithm (Jiang and Conrath, 1997) implemented in the WordNet::Similarity module (Pedersen, Patwardhan, and Michelizzi, 2004). The clustering process followed: first building similarity matrices from the similarity values, next drawing dendrograms on the basis of the matrices, and finally obtaining 177 meaningful clusters (covering 437 verbs) that passed a threshold set by z-score. The subcategorization frames and their frequency values were taken from the ICE-GB. To calculate the Selectional Preference Strength (SPS) of the relationship between a verb and its subcategorizations, we relied on the Kullback-Leibler divergence model (Resnik, 1996). The SPS values of the verbs in the same cluster were compared with each other, which yielded statistical values indicating how much the SPS values overlap between the subcategorization frames of the verbs. Our final analysis shows that the degree of overlap, that is, the relationship between semantic similarity and the subcategorization frames of English verbs, is spread evenly from 'very strongly related' to 'very weakly related'. Some semantically similar verbs share a lot in terms of their subcategorization frames, some show an average degree of strength in the relationship, and others, though still semantically similar, tend to share little in their subcategorization frames.
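
The SPS computation mentioned in this abstract can be sketched as a Kullback-Leibler divergence between a verb's frame distribution and the overall frame distribution. This is a minimal illustration, not the authors' code: the function name, the frame labels, and all counts are hypothetical and are not taken from the ICE-GB.

```python
import math

def selectional_preference_strength(frame_counts, prior):
    """KL divergence D(P(frame | verb) || P(frame)), in bits."""
    total = sum(frame_counts.values())
    sps = 0.0
    for frame, count in frame_counts.items():
        if count == 0:
            continue  # a term with zero probability contributes nothing
        p_cond = count / total
        sps += p_cond * math.log2(p_cond / prior[frame])
    return sps

# Hypothetical prior over subcategorization frames (assumed, not from ICE-GB).
prior = {"intransitive": 0.5, "transitive": 0.25, "ditransitive": 0.25}

# Hypothetical per-verb frame counts.
give = {"intransitive": 0, "transitive": 2, "ditransitive": 6}
run = {"intransitive": 4, "transitive": 2, "ditransitive": 2}

sps_give = selectional_preference_strength(give, prior)
sps_run = selectional_preference_strength(run, prior)
print(f"give: {sps_give:.3f} bits, run: {sps_run:.3f} bits")
```

A verb whose frame distribution matches the prior gets an SPS of zero, while a verb that concentrates on a few frames gets a larger value.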

Software Architecture of the Grid for implementing the Cloud Computing of the High Availability (고가용성 클라우드 컴퓨팅 구축을 위한 그리드 소프트웨어 아키텍처)

  • Lee, Byoung-Yup;Park, Jun-Ho;Yoo, Jae-Soo
    • The Journal of the Korea Contents Association / v.12 no.2 / pp.19-29 / 2012
  • Cloud computing technology is currently supplied in various service forms and is becoming a groundbreaking service that provides storage, data, and software without requiring the user to know technical details such as the physical location of the service or the system environment. Its advantage is that users can freely use as many IT resources as they want, regardless of the hardware required by a variety of systems and the service level required by the infrastructure. Also, because various internet-based technologies let users choose how resources are used for a given business model, provisioning and virtualization are attracting attention as the main technologies. These are among the important technology elements that let web-based users access resources freely and install them according to their environment. Therefore, this paper analyzes trends in cloud computing technology and introduces grid-related software technologies and architectures for building a high-availability cloud computing environment.

A Study on Web Services for Sequence Similarity search in the Workflow Environment (워크플로우 환경에서의 대규모 서열 유사성 검색 웹 서비스에 관한 연구)

  • Jun, Jin-Young
    • Journal of the Korea Society of Computer and Information / v.13 no.6 / pp.41-49 / 2008
  • In recent years, research on life phenomena using workflow management tools has been active in bioinformatics. A workflow management tool is a foundation that enables researchers to collaborate through the reuse and sharing of services, and a variety of such tools, including the MyGrid project's Taverna, Kepler, and BioWMS, have been developed and used as open source. Based on web service technology, these tools can model and automate services located in spatially distant areas within one working space. Many tools and databases used in bioinformatics are provided as web services and used in workflow management tools. In this situation, developing web services for sequence similarity search, a basic operation in bioinformatics, and offering them stably is essential to the field. In this paper, the speed of similarity search over biological sequence data was improved using a Linux cluster, and by developing the search as a web service and linking it with a workflow management tool, sequence similarity search could be completed in a short time.

Identification of Fuzzy-Radial Basis Function Neural Network Based on Mountain Clustering (Mountain Clustering 기반 퍼지 RBF 뉴럴네트워크의 동정)

  • Choi, Jeoung-Nae;Oh, Sung-Kwun;Kim, Hyun-Ki
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology / v.1 no.3 / pp.69-76 / 2008
  • This paper concerns the Fuzzy Radial Basis Function Neural Network (FRBFNN) and the automatic generation of its rules by means of mountain clustering. In the proposed network, the membership functions of the premise part of the fuzzy rules do not assume any explicit functional form such as Gaussian, ellipsoidal, or triangular, so the resulting fitness values (degrees of membership) rely directly on the computation of the relevant distances between data points. Also, we consider high-order polynomials as the consequent part of the fuzzy rules, which represent the input-output characteristics of each sub-space. The number of clusters and the cluster centers are generated automatically by the mountain clustering method, based on the density of the data. The cluster centers obtained by mountain clustering are used to determine the degrees of membership, and a weighted least squares estimator (WLSE) is adopted to estimate the coefficients of the consequent polynomials of the fuzzy rules. The effectiveness of the proposed model has been investigated and analyzed in detail for a representative nonlinear function.
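
The density-based center selection described in this abstract can be illustrated with a small mountain-clustering sketch. The abstract does not give the exact mountain function, so the Gaussian-style kernel, the candidate grid, the parameters `alpha`, `beta`, and `stop_ratio`, and the sample points are all assumptions made for illustration.

```python
import math

def mountain_clustering(points, grid, alpha=1.0, beta=1.0, stop_ratio=0.3):
    """Pick cluster centers from grid candidates by repeatedly taking the
    highest 'mountain' and then flattening the area around it."""
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    # Mountain height of a candidate: summed kernel density of the data around it.
    heights = [sum(math.exp(-alpha * dist2(v, x)) for x in points) for v in grid]
    first_peak = max(heights)
    centers = []
    while True:
        peak = max(heights)
        if peak < stop_ratio * first_peak:
            break  # remaining peaks are too low to count as clusters
        center = grid[heights.index(peak)]
        centers.append(center)
        # Subtract the chosen peak's influence so nearby candidates are suppressed.
        heights = [h - peak * math.exp(-beta * dist2(v, center))
                   for h, v in zip(heights, grid)]
    return centers

# Hypothetical 2-D data with two dense regions.
points = [(0, 0), (0.2, 0), (0, 0.2), (4, 4), (4.2, 4), (4, 4.2)]
grid = [(x, y) for x in range(6) for y in range(6)]
centers = mountain_clustering(points, grid)
print(sorted(centers))
```

Note that the number of clusters is not passed in; it follows from how many peaks exceed the stopping threshold, which is the property the abstract relies on.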

Active Learning based on Hierarchical Clustering (계층적 군집화를 이용한 능동적 학습)

  • Woo, Hoyoung;Park, Cheong Hee
    • KIPS Transactions on Software and Data Engineering / v.2 no.10 / pp.705-712 / 2013
  • Active learning aims to improve the performance of a classification model by repeatedly selecting the most helpful unlabeled data, having an expert label it, and adding it to the training set. In this paper, we propose a method for active learning based on hierarchical agglomerative clustering using Ward's linkage. The proposed method actively constructs a training set that includes at least one sample from each cluster and reflects the overall data distribution by expanding the existing training set. While most existing active learning methods assume that an initial training set is given, the proposed method is applicable whether or not initial training data are given. Experimental results show the superiority of the proposed method.
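
As a rough illustration of the idea in this abstract, the sketch below clusters unlabeled points with a naive Ward-style agglomeration and then picks one point per cluster to send to the expert. The abstract does not say how the sample within a cluster is chosen, so taking the point nearest the centroid is an assumption, as are the function names and the toy data.

```python
def centroid(points, idx):
    dim = len(points[0])
    return [sum(points[i][d] for i in idx) / len(idx) for d in range(dim)]

def ward_clusters(points, k):
    """Naive agglomerative clustering: repeatedly merge the pair of clusters
    whose merge increases the within-cluster sum of squares the least."""
    clusters = [[i] for i in range(len(points))]

    def merge_cost(a, b):
        ca, cb = centroid(points, a), centroid(points, b)
        dist2 = sum((x - y) ** 2 for x, y in zip(ca, cb))
        return len(a) * len(b) / (len(a) + len(b)) * dist2

    while len(clusters) > k:
        pairs = [(i, j) for i in range(len(clusters))
                 for j in range(i + 1, len(clusters))]
        i, j = min(pairs, key=lambda ij: merge_cost(clusters[ij[0]], clusters[ij[1]]))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

def pick_queries(points, clusters):
    """Choose one index per cluster (assumed rule: nearest to the centroid)."""
    queries = []
    for idx in clusters:
        c = centroid(points, idx)
        queries.append(min(
            idx, key=lambda i: sum((x - y) ** 2 for x, y in zip(points[i], c))))
    return queries

# Hypothetical unlabeled pool with two well-separated groups.
pool = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
clusters = ward_clusters(pool, k=2)
queries = pick_queries(pool, clusters)
print(clusters, queries)
```

Because every cluster contributes a query, no initial labeled set is needed to start, which matches the property the abstract emphasizes.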

An Efficient Deep Learning Ensemble Using a Distribution of Label Embedding

  • Park, Saerom
    • Journal of the Korea Society of Computer and Information / v.26 no.1 / pp.27-35 / 2021
  • In this paper, we propose a new stacking ensemble framework for deep learning models that reflects the distribution of label embeddings. Our ensemble framework consists of two phases: training the baseline deep learning classifier, and training the sub-classifiers based on the clustering results of the label embeddings. The framework aims to divide a multi-class classification problem into small sub-problems based on the clustering results. The clustering is conducted on the label embeddings obtained from the weights of the last layer of the baseline classifier. After clustering, sub-classifiers are constructed to classify the sub-classes in each cluster. From the experimental results, we found that the label embeddings reflect the relationships between classification labels well, and that our ensemble framework can improve classification performance on the CIFAR-100 dataset.

A Poisson Regression Analysis of Physician Visits (외래이용빈도 분석의 모형과 기법)

  • 이영조;한달선;배상수
    • Health Policy and Management / v.3 no.2 / pp.159-176 / 1993
  • The utilization of outpatient care services involves two sequential decisions. The first is whether to initiate utilization, and the second is how many more visits to make after initiation. Presumably, the initiation decision is largely made by the patient and his or her family, while the number of additional visits is decided under the strong influence of the physician. The implication is that an analysis of outpatient care utilization must specify each of the two underlying decisions as a distinct stochastic process. This paper is concerned with the number of physician visits, which is, by definition, a discrete variable that can take only non-negative integer values. Since the initial visit is covered by the analysis of whether any physician visit was made, it is sufficient to focus on the number of visits made in addition to the initial one. The number of additional visits, being a kind of count data, could be assumed to follow a Poisson distribution. However, the distribution is likely to be overdispersed, since the number of physician visits tends to cluster around a few values but still varies widely. A recently reported study of outpatient care utilization employed an analysis based on the assumption of a negative binomial distribution, which is a type of overdispersed Poisson distribution. But there is an indication that using a Poisson distribution with adjustments for overdispersion results in less loss of efficiency in parameter estimation than using a specific distribution such as the negative binomial. An analysis of data on outpatient care utilization was performed, focusing on an assessment of the appropriateness of the available techniques. The data used in the analysis were collected by a community survey in Hwachon Gun, Kangwon Do in 1990. It was observed that a Poisson regression with adjustments for overdispersion is superior to either an ordinary regression or a Poisson regression without adjustments for overdispersion. In conclusion, it seems most appropriate to assume that the number of physician visits made in addition to the initial visit follows an overdispersed Poisson distribution when outpatient care utilization is studied with a model that embodies the two-part character of the decision process underlying the utilization.
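
The abstract does not state which overdispersion adjustment was used. One common choice, shown here purely as an assumed illustration, is to estimate a dispersion factor from the Pearson chi-square statistic and scale the Poisson standard errors by its square root. The visit counts below are hypothetical and the model is intercept-only for brevity.

```python
import math

def pearson_dispersion(counts, fitted_means, n_params):
    """Pearson chi-square divided by residual degrees of freedom.
    Values well above 1 suggest overdispersion relative to Poisson."""
    chi2 = sum((y - mu) ** 2 / mu for y, mu in zip(counts, fitted_means))
    return chi2 / (len(counts) - n_params)

# Hypothetical numbers of additional physician visits (not the survey data).
visits = [0, 0, 1, 1, 2, 8]

# Intercept-only Poisson model: the fitted mean is the sample mean.
mean_visits = sum(visits) / len(visits)
fitted = [mean_visits] * len(visits)

phi = pearson_dispersion(visits, fitted, n_params=1)

# Standard error of the log-mean intercept, before and after adjustment.
se_poisson = 1 / math.sqrt(len(visits) * mean_visits)
se_adjusted = se_poisson * math.sqrt(phi)
print(f"dispersion={phi:.2f}, SE {se_poisson:.3f} -> {se_adjusted:.3f}")
```

The point estimates are unchanged by this adjustment; only the uncertainty around them widens when the counts vary more than a Poisson model allows.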

The correction of Lens distortion based on Image division using Artificial Neural Network (영상분할 방법 기반의 인공신경망을 적용한 카메라의 렌즈왜곡 보정)

  • Shin, Ki-Young;Bae, Jang-Han;Mun, Joung-H.
    • Journal of the Korea Society of Computer and Information / v.14 no.4 / pp.31-38 / 2009
  • Lens distortion is an inevitable phenomenon in machine vision systems. Distortion is becoming more common as lenses are chosen to minimize cost and system size, so correcting lens distortion is a critical issue. However, previous correction methods based on a camera model suffer from nonlinearity and complicated operations, and recent correction methods based on neural networks have accuracy and efficiency problems. In this study, we propose new algorithms for correcting lens distortion. The distorted image is divided into regions according to the amount of distortion using k-means, and each region is corrected by a neural network. As a result, the proposed algorithms achieve better accuracy than previous methods that do not divide the image.
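
The division step described in this abstract can be sketched as one-dimensional k-means over per-point distortion magnitudes. The abstract does not specify how distortion is measured or how k-means is initialized, so the magnitudes, the deterministic initialization, and the function name below are assumptions; the per-region neural networks are not shown.

```python
def kmeans_1d(values, k, iterations=50):
    """Plain k-means on scalar values; returns (labels, centers)."""
    ordered = sorted(values)
    # Deterministic start: spread the initial centers across the sorted values.
    centers = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iterations):
        # Assignment step: each value goes to its nearest center.
        labels = [min(range(k), key=lambda c: abs(v - centers[c])) for v in values]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for c in range(k):
            members = [v for v, label in zip(values, labels) if label == c]
            new_centers.append(sum(members) / len(members) if members else centers[c])
        if new_centers == centers:
            break  # converged
        centers = new_centers
    return labels, centers

# Hypothetical distortion magnitudes (in pixels) at sampled image points.
distortion = [0.1, 0.2, 0.15, 2.0, 2.2, 5.0, 5.5, 5.2]
labels, centers = kmeans_1d(distortion, k=3)
print(labels, [round(c, 2) for c in centers])
```

Each label then identifies the image region a point belongs to, so a separate corrector can be trained on the points of each region.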