• Title/Summary/Keyword: model-based cluster

Building an Innovation System for Industrial Development in a Knowledge based Economy (산업의 지식집약화를 위한 혁신체제 구축 방향)

  • 김선배
    • Journal of the Economic Geographical Society of Korea
    • /
    • v.4 no.1
    • /
    • pp.61-76
    • /
    • 2001
  • This research examines the theoretical background and the industrial policy issues involved in building an innovation system that strengthens industrial competitiveness and fosters regional industry in Korea. Knowledge has become the driving force of economic growth and the primary source of competitiveness in the world market. Since the 1990s, innovation systems have therefore been emphasized as a new industrial development strategy for a knowledge-based economy. An innovation system can be understood as comprising a National Innovation System (NIS) and a Regional Innovation System (RIS), and as interrelated with the concepts of clusters and networks, which contribute to industrial development by boosting innovation. In Korean industrial policy, as the formerly centralized decision-making process was decentralized through the implementation of local autonomy, the role of local or state governments in regional industrial promotion intensified. Given the importance of fostering strategic industries in the regions, the following new industrial policy issues need to be addressed in Korea: (1) building a market-oriented support system for industrial clusters by providing innovation resources; (2) establishing agencies for regional industrial development; (3) creating an evolutionary vision for broader regions spanning two or three provinces; and (4) fostering strategic industries selected according to each region's specialization and potential. This paper outlines the RIS model for industrial development, but policy initiatives for building an RIS have to be drawn from further case studies.

A Suggestion for Spatiotemporal Analysis Model of Complaints on Officially Assessed Land Price by Big Data Mining (빅데이터 마이닝에 의한 공시지가 민원의 시공간적 분석모델 제시)

  • Cho, Tae In;Choi, Byoung Gil;Na, Young Woo;Moon, Young Seob;Kim, Se Hun
    • Journal of Cadastre & Land InformatiX
    • /
    • v.48 no.2
    • /
    • pp.79-98
    • /
    • 2018
  • The purpose of this study is to suggest a model, based on big data mining, for analysing the spatio-temporal characteristics of civil complaints about the officially assessed land price. Specifically, the study looks for the underlying reasons for the complaints from a spatio-temporal perspective rather than in institutional factors, and suggests a model for monitoring trends in their occurrence. The official documents of 6,481 civil complaints about the officially assessed land price in Jung-gu, Incheon Metropolitan City, from 2006 to 2015, along with their temporal and spatial properties, were collected and used for the analysis. Frequencies of major key words were examined using a text mining method. Correlations among major key words were studied through social network analysis. By calculating term frequency (TF) and term frequency-inverse document frequency (TF-IDF), which serve as weights for the key words, we identified the major key words associated with the occurrence of the complaints. The spatio-temporal characteristics of the complaints were then examined through hot spot analysis based on the Getis-Ord $G_i^*$ statistic. The characteristics of the complaints were found to change over time, forming clusters that are linked spatio-temporally. Using text mining and social network analysis, the reasons for the complaints could be identified quantitatively from natural language. TF and TF-IDF, the key-word weights, can be used as main explanatory variables for analysing the spatio-temporal characteristics of the complaints, since these statistics differ over time and across regions.
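
The TF and TF-IDF weights mentioned in the abstract above can be illustrated with a minimal sketch. It assumes one common definition (relative term frequency multiplied by the natural log of inverse document frequency); the abstract does not state which variant the authors used. The function name `tf_idf` and the tokenized sample texts are hypothetical, not data from the study.

```python
import math
from collections import Counter

def tf_idf(documents):
    """Return one {term: weight} dict per tokenized, non-empty document."""
    n_docs = len(documents)
    # Document frequency: number of documents that contain each term.
    doc_freq = Counter(term for doc in documents for term in set(doc))
    weights = []
    for doc in documents:
        counts = Counter(doc)
        total = len(doc)
        weights.append({
            # Relative term frequency times log inverse document frequency.
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights

# Hypothetical, already-tokenized complaint texts (illustration only).
complaints = [
    ["land", "price", "increase", "tax"],
    ["land", "price", "decrease", "compensation"],
    ["land", "road", "access", "price"],
]
scores = tf_idf(complaints)
```

Under this definition a term that appears in every document, such as "land" here, gets a weight of zero, which is why TF-IDF tends to surface the words that distinguish one group of complaints from another.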

A Study on Market Size Estimation Method by Product Group Using Word2Vec Algorithm (Word2Vec을 활용한 제품군별 시장규모 추정 방법에 관한 연구)

  • Jung, Ye Lim;Kim, Ji Hui;Yoo, Hyoung Sun
    • Journal of Intelligence and Information Systems
    • /
    • v.26 no.1
    • /
    • pp.1-21
    • /
    • 2020
  • With the rapid development of artificial intelligence technology, various techniques have been developed to extract meaningful information from unstructured text data, which constitutes a large portion of big data. Over the past decades, text mining technologies have been utilized in various industries for practical applications. In the field of business intelligence, text mining has been employed to discover new market and/or technology opportunities and to support rational decision making by business participants. Market information such as market size, market growth rate, and market share is essential for setting companies' business strategies. There has been continuous demand in various fields for market information at the level of specific products. However, such information has generally been provided at the industry level or for broad categories based on classification standards, making it difficult to obtain specific and suitable information. In this regard, we propose a new methodology that can estimate the market sizes of product groups at more detailed levels than previously offered. We applied the Word2Vec algorithm, a neural network based semantic word embedding model, to enable automatic market size estimation from individual companies' product information in a bottom-up manner. The overall process is as follows. First, the data related to product information is collected, refined, and restructured into a form suitable for the Word2Vec model. Next, the preprocessed data is embedded into a vector space by Word2Vec, and product groups are derived by extracting similar product names based on cosine similarity. Finally, the sales data for the extracted products is summed to estimate the market size of each product group. As experimental data, product names from Statistics Korea's microdata (345,103 cases) were mapped into a multidimensional vector space by Word2Vec training. 
We optimized the training parameters and then applied a vector dimension of 300 and a window size of 15 in further experiments. We employed the index words of the Korean Standard Industry Classification (KSIC) as a product name dataset to cluster product groups more efficiently. Product names similar to the KSIC index words were extracted based on cosine similarity. The market size of the extracted products, treated as one product category, was calculated from individual companies' sales data. The market sizes of 11,654 specific product lines were automatically estimated by the proposed model. For performance verification, the results were compared with the actual market sizes of some items. The Pearson correlation coefficient was 0.513. Our approach has several advantages over previous studies. First, text mining and machine learning techniques were applied to market size estimation for the first time, overcoming the limitations of traditional methods that rely on sampling or require multiple assumptions. In addition, the level of market category can be easily and efficiently adjusted to the purpose of information use by changing the cosine similarity threshold. Furthermore, the approach has high potential for practical application since it can meet unmet needs for detailed market size information in the public and private sectors. Specifically, it can be utilized in technology evaluation and technology commercialization support programs conducted by governmental institutions, as well as in business strategy consulting and market analysis report publishing by private firms. The limitation of our study is that the presented model needs to be improved in terms of accuracy and reliability. The semantic word embedding module can be advanced by imposing a proper order on the preprocessed dataset or by combining Word2Vec with another measure such as Jaccard similarity. 
Also, the method of product group clustering can be changed to other types of unsupervised machine learning algorithms. Our group is currently working on follow-up studies, and we expect them to further improve the performance of the basic model conceptually proposed in this study.
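
The grouping-and-summing step described in the abstract above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the embeddings are toy 3-dimensional vectors rather than trained Word2Vec vectors, and the function names, product names, sales figures, and threshold value are all hypothetical.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def estimate_market_size(index_vector, products, threshold):
    """Sum the sales of products whose embedding is close to the index word."""
    members = [
        name for name, (vector, _sales) in products.items()
        if cosine_similarity(index_vector, vector) >= threshold
    ]
    total_sales = sum(products[name][1] for name in members)
    return members, total_sales

# Toy embeddings and sales figures (hypothetical values, illustration only).
products = {
    "led lamp": ([0.9, 0.1, 0.0], 120.0),
    "led bulb": ([0.8, 0.2, 0.1], 80.0),
    "steel pipe": ([0.0, 0.1, 0.9], 300.0),
}
index_vector = [1.0, 0.0, 0.0]  # stands in for the embedding of a KSIC index word
members, market_size = estimate_market_size(index_vector, products, threshold=0.8)
```

Raising or lowering `threshold` narrows or widens the product group, which corresponds to the adjustable level of market category that the abstract describes.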

A Study of Anomaly Detection for ICT Infrastructure using Conditional Multimodal Autoencoder (ICT 인프라 이상탐지를 위한 조건부 멀티모달 오토인코더에 관한 연구)

  • Shin, Byungjin;Lee, Jonghoon;Han, Sangjin;Park, Choong-Shik
    • Journal of Intelligence and Information Systems
    • /
    • v.27 no.3
    • /
    • pp.57-73
    • /
    • 2021
  • Maintaining ICT infrastructure and preventing failures through anomaly detection is becoming increasingly important. System monitoring data is multidimensional time series data, and it is difficult to account for the characteristics of multidimensional data and of time series data at the same time. With multidimensional data, the correlation between variables should be considered. Existing probability-based, linear, and distance-based methods degrade because of the curse of dimensionality. In addition, time series data is usually preprocessed with a sliding window technique and time series decomposition for autocorrelation analysis. These techniques increase the dimension of the data, so they need to be supplemented. Anomaly detection is a long-established research field, and statistical methods and regression analysis were used in its early days. Currently, there are active studies applying machine learning and artificial neural networks to this field. Statistical methods are difficult to apply when data is non-homogeneous, and they do not detect local outliers well. The regression analysis method learns a regression formula based on parametric statistics and then detects abnormality by comparing the predicted value with the actual value. Its performance drops when the model is not solid or when the data contains noise or outliers, so there are restrictions on using training data that contains noise or outliers. An autoencoder built from artificial neural networks is trained to produce output as similar as possible to its input. It has many advantages over existing probability and linear models, cluster analysis, and supervised learning. It can be applied to data that does not satisfy a probability distribution or a linearity assumption. 
In addition, it can be trained in an unsupervised manner, without labeled data. However, it is limited in identifying local outliers in multidimensional data, and the dimension of the data increases greatly because of the characteristics of time series data. In this study, we propose a CMAE (Conditional Multimodal Autoencoder) that enhances anomaly detection performance by considering local outliers and time series characteristics. First, we applied a Multimodal Autoencoder (MAE) to overcome the limitation in identifying local outliers in multidimensional data. Multimodal models are commonly used to learn from different types of inputs, such as voice and image. The different modalities share the autoencoder's bottleneck, through which correlations are learned. In addition, a CAE (Conditional Autoencoder) was used to learn the characteristics of time series data effectively without increasing the dimension of the data. Conditional inputs are generally categorical variables, but in this study time was used as the condition in order to learn periodicity. The proposed CMAE model was verified by comparison with a Unimodal Autoencoder (UAE) and a Multimodal Autoencoder (MAE). The reconstruction performance of the autoencoders for 41 variables was examined in the proposed model and the comparison models. Reconstruction performance differs by variable; reconstruction works well for the Memory, Disk, and Network modalities in all three autoencoder models, as their loss values are small. The Process modality showed no significant difference among the three models, and the CPU modality showed excellent performance in CMAE. ROC curves were prepared to evaluate the anomaly detection performance of the proposed model and the comparison models, and AUC, accuracy, precision, recall, and F1-score were compared. On all indicators, performance ranked in the order CMAE, MAE, and UAE. 
In particular, the recall of CMAE was 0.9828, indicating that it detects almost all anomalies. The accuracy of the model also improved, to 87.12%, and the F1-score was 0.8883, which is considered suitable for anomaly detection. From a practical standpoint, the proposed model has an advantage beyond the performance improvement. Techniques such as time series decomposition and sliding windows add procedures that must be managed, and the dimensional increase they cause can reduce computational speed at inference. The proposed model is easy to apply to practical tasks in terms of inference speed and model management.
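
The evaluation indicators named in the abstract above (accuracy, precision, recall, F1-score) can be computed from reconstruction errors as in the following sketch. It assumes the usual autoencoder decision rule of flagging a sample when its reconstruction error exceeds a threshold; the abstract does not describe the authors' exact rule. The function name, error values, labels, and threshold are hypothetical.

```python
def detection_metrics(errors, labels, threshold):
    """Flag samples whose reconstruction error exceeds the threshold and score them."""
    predictions = [error > threshold for error in errors]
    pairs = list(zip(predictions, labels))
    tp = sum(1 for p, y in pairs if p and y)          # anomalies correctly flagged
    fp = sum(1 for p, y in pairs if p and not y)      # normal samples flagged
    fn = sum(1 for p, y in pairs if not p and y)      # anomalies missed
    tn = sum(1 for p, y in pairs if not p and not y)  # normal samples passed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    accuracy = (tp + tn) / len(pairs)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Hypothetical reconstruction errors and ground-truth anomaly labels.
errors = [0.02, 0.03, 0.40, 0.05, 0.35, 0.30]
labels = [False, False, True, False, True, False]
metrics = detection_metrics(errors, labels, threshold=0.1)
```

In this toy data every true anomaly is flagged (recall of 1.0) at the cost of one false alarm, the same trade-off that a high recall combined with a lower accuracy suggests.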

Development of an Analytical Framework for Dialogic Argumentation in the Context of Socioscientific Issues: Based on Discourse Clusters and Schemes (과학관련 사회쟁점(SSI) 맥락에서의 소집단 논증활동 분석틀 개발: 담화클러스터와 담화요소의 분석)

  • Ko, Yeonjoo;Choi, Yunhee;Lee, Hyunju
    • Journal of The Korean Association For Science Education
    • /
    • v.35 no.3
    • /
    • pp.509-521
    • /
    • 2015
  • Argumentation is a social and collaborative dialogic process. Many researchers have focused on analyzing the structure of students' argumentation in the context of scientific inquiry, using Toulmin's model of argument. Since SSI dialogic argumentation often presents distinctive features (e.g. it is interdisciplinary, controversial, and value-laden), Toulmin's model does not fit this context well. We therefore attempted to develop an analytical framework for SSI dialogic argumentation built on the concepts of 'discourse clusters' and 'discourse schemes.' Discourse clusters are series of utterances produced for a similar dialogical purpose in SSI contexts. Discourse schemes are meaningful discourse units that represent the features of SSI reasoning well. In this study, we present six types of discourse clusters and 19 discourse schemes. We applied the framework to data from students' group discourse on SSIs (e.g. euthanasia and nuclear energy) in order to verify its validity and applicability. The results indicate that the framework explains the overall flow, dynamics, and features of students' discourse on SSI well.

An Efficiency Analysis of Industry-University-Public Research Institute Collaborative Research: Employing the Input-Output Itemization Model (투입 및 산출 분해모형을 활용한 산학연 협력연구의 효율성 분석)

  • Kim, Hong-Young;Chung, Sunyang
    • Journal of the Korea Academia-Industrial cooperation Society
    • /
    • v.18 no.12
    • /
    • pp.473-484
    • /
    • 2017
  • This study analyzed collaborative R&D projects funded by the Korean government from 2013 to 2015. For this analysis, the input and output variables of the projects were considered, and combinations of those variables were itemized. The output-oriented variable returns to scale (VRS) model, an extension of the DEA methodology, was adopted to evaluate the cooperation efficiency of the types of R&D collaboration, which were classified according to the project leader's organization. In addition, hierarchical cluster analysis was conducted using the efficiency results of the scientific, technical, and economic outcome models. The results showed that cooperation efficiency between large companies and public research institutions was relatively high. Conversely, cooperation among medium-sized companies, small businesses, and universities was particularly inefficient. The clustering results demonstrated the various strengths and weaknesses of the types in terms of publications, patents, technical royalties, and the number of commercializations. In conclusion, this study suggests differentiated investment portfolios and strategies based on the efficiency results of diverse cooperation types among industries, universities, and public research institutions.

Design of Summer Very Short-term Precipitation Forecasting Pattern in Metropolitan Area Using Optimized RBFNNs (최적화된 다항식 방사형 기저함수 신경회로망을 이용한 수도권 여름철 초단기 강수예측 패턴 설계)

  • Kim, Hyun-Ki;Choi, Woo-Yong;Oh, Sung-Kwun
    • Journal of the Korean Institute of Intelligent Systems
    • /
    • v.23 no.6
    • /
    • pp.533-538
    • /
    • 2013
  • Damage caused by the localized torrential rains that have recently become frequent is increasing rapidly. In densely populated metropolitan areas, casualties and property damage from landslides, debris flows, and floods are serious. The importance of predicting torrential rain is therefore increasing. Severe-weather precipitation in Korea is divided into typhoons and torrential rains, which vary in duration and area. Rainfall is difficult to predict because regional precipitation is highly volatile and nonlinear. In this paper, a very short-term precipitation forecasting pattern model is implemented using the KLAPS data used by the Korea Meteorological Administration. We designed the model using GA-based RBFNNs. Structural and parametric values such as the number of inputs, the polynomial type, the number of FCM clusters, and the fuzzification coefficient are optimized by a genetic algorithm (GA).
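
To show what the number of FCM clusters and the fuzzification coefficient control, the sketch below evaluates the standard fuzzy c-means membership formula for a single scalar input. It is only an illustration of that textbook formula; the abstract does not say how the memberships enter the RBFNN, and the function name and the sample values are hypothetical.

```python
def fcm_memberships(x, centers, m=2.0):
    """Fuzzy c-means membership of scalar x in each cluster.

    m is the fuzzification coefficient (m > 1); centers are assumed distinct.
    """
    distances = [abs(x - c) for c in centers]
    # A point that coincides with a center belongs fully to that cluster.
    if 0.0 in distances:
        return [1.0 if d == 0.0 else 0.0 for d in distances]
    exponent = 2.0 / (m - 1.0)
    return [
        1.0 / sum((d_i / d_k) ** exponent for d_k in distances)
        for d_i in distances
    ]

# Hypothetical example: one input value and two cluster centers.
memberships = fcm_memberships(1.0, centers=[0.0, 3.0], m=2.0)
```

The memberships always sum to one. A larger `m` makes them more uniform across clusters and a value close to 1 makes them nearly crisp, which is why the coefficient is a natural target for optimization alongside the cluster count.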

Predictors of Intention to Undergo Mammography among Underutilizers (유방암 검진 미수검자의 검진의도 관련 요인)

  • Kye, Su-Yeon;Park, Kee-Ho;Choi, Kui-Son;Bae, Mi-Jin;Moon, In-Ok;Yun, Young-Ok;Lim, Min-Kyung
    • Korean Journal of Health Education and Promotion
    • /
    • v.26 no.2
    • /
    • pp.75-86
    • /
    • 2009
  • Objectives: To identify the factors associated with the intention to undergo mammography among Korean women without prior screening experience. Methods: Among 1,039 women from the general population, we selected 145 women (mean age: 54.2 years, age range: 40-78 years) without any prior experience with mammography. They were recruited for the 'Cancer Information Needs Assessment Survey' using random multi-stage cluster sampling. Data on socio-demographic characteristics, intention to undergo mammography based on the Precaution Adoption Process Model, level of self belief and self efficacy for breast cancer screening, and motivation for deciding to undergo breast cancer screening were obtained through a household survey. Results: Of the study subjects, 49.7% were classified as "unengaged" or "decided not to act" regarding breast cancer screening. Women with the intention to undergo mammography were more likely to be younger (OR 0.11, 95%CI 0.04-0.36), to have been recommended by others to undergo screening (OR 3.27, 95%CI 1.36-7.87), to have a high level of perceived sensitivity (OR 3.15, 95%CI 1.27-7.82), and to have a high level of self efficacy (OR 1.09, 95%CI 0.97-1.23). Exposure to campaigns and information regarding breast cancer screening, the presence of cancer patients among acquaintances, perceived severity, perceived benefit, and perceived cost were not significantly associated with the intention to undergo mammography. Conclusion: It is necessary to develop tailored intervention strategies for women who have never undergone breast cancer screening, based on their demographic characteristics and on the factors that positively influence the intention to undergo mammography.

FORMATION AND EVOLUTION OF SELF-INTERACTING DARK MATTER HALOS

  • AHN KYUNGJIN;SHAPIRO PAUL R.
    • Journal of The Korean Astronomical Society
    • /
    • v.36 no.3
    • /
    • pp.89-95
    • /
    • 2003
  • Observations of dark matter dominated dwarf and low surface brightness disk galaxies favor density profiles with a flat-density core, while cold dark matter (CDM) N-body simulations form halos with central cusps, instead. This apparent discrepancy has motivated a re-examination of the microscopic nature of the dark matter in order to explain the observed halo profiles, including the suggestion that CDM has a non-gravitational self-interaction. We study the formation and evolution of self-interacting dark matter (SIDM) halos. We find analytical, fully cosmological similarity solutions for their dynamics, which take proper account of the collisional interaction of SIDM particles, based on a fluid approximation derived from the Boltzmann equation. The SIDM particles scatter each other elastically, which results in an effective thermal conductivity that heats the halo core and flattens its density profile. These similarity solutions are relevant to galactic and cluster halo formation in the CDM model. We assume that the local density maximum which serves as the progenitor of the halo has an initial mass profile $\delta M / M \propto M^{-\epsilon}$, as in the familiar secondary infall model. If $\epsilon = 1/6$, SIDM halos will evolve self-similarly, with a cold, supersonic infall which is terminated by a strong accretion shock. Different solutions arise for different values of the dimensionless collisionality parameter, $Q \equiv \sigma \rho_b r_s$, where $\sigma$ is the SIDM particle scattering cross section per unit mass, $\rho_b$ is the cosmic mean density, and $r_s$ is the shock radius. For all these solutions, a flat-density, isothermal core is present which grows in size as a fixed fraction of $r_s$. We find two different regimes for these solutions: 1) for $Q < Q_{th}$ ($\simeq 7.35 \times 10^{-4}$), the core density decreases and core size increases as $Q$ increases; 2) for $Q > Q_{th}$, the core density increases and core size decreases as $Q$ increases. 
Our similarity solutions are in good agreement with previous results of N-body simulation of SIDM halos, which correspond to the low-Q regime, for which SIDM halo profiles match the observed galactic rotation curves if $Q \sim [8.4 \times 10^{-4} - 4.9 \times 10^{-2}]\,Q_{th}$, or $\sigma \sim [0.56 - 5.6]\ \mathrm{cm^2\,g^{-1}}$. These similarity solutions also show that, as $Q \to \infty$, the central density acquires a singular profile, in agreement with some earlier simulation results which approximated the effects of SIDM collisionality by considering an ordinary fluid without conductivity, i.e. the limit of mean free path $\lambda_{mfp} \to 0$. The intermediate regime where $Q \sim [18.6 - 231]\,Q_{th}$ or $\sigma \sim [1.2 \times 10^4 - 2.7 \times 10^4]\ \mathrm{cm^2\,g^{-1}}$, for which we find flat-density cores comparable to those of the low-Q solutions preferred to make SIDM halos match halo observations, has not previously been identified. Further study of this regime is warranted.
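
As a reading aid, the collisionality parameter can be related to the mean free path mentioned at the end of the abstract. The relation below is a sketch that assumes the standard kinetic-theory expression for the mean free path when the cross section is given per unit mass, and writes the cosmic mean density as $\rho_b$; it is not quoted from the paper.

```latex
\lambda_{mfp}(\rho) = \frac{1}{\sigma \rho},
\qquad
Q \equiv \sigma \rho_b r_s = \frac{r_s}{\lambda_{mfp}(\rho_b)}
```

Under that assumption, $Q$ is the shock radius measured in units of the mean free path at the cosmic mean density, so $Q \to \infty$ at fixed $r_s$ corresponds to $\lambda_{mfp} \to 0$, the ordinary-fluid limit described above.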

Motives for Writing After-Purchase Consumer Reviews in Online Stores and Classification of Online Store Shoppers (인터넷 점포에서의 구매후기 작성 동기 및 점포 고객 유형화)

  • Hong, Hee-Sook;Ryu, Sung-Min
    • Journal of Distribution Research
    • /
    • v.17 no.3
    • /
    • pp.25-57
    • /
    • 2012
  • This study identified motives for writing apparel product reviews in online stores and determined which motives increase review-writing behavior. It also classified store customers by type of writing motive and clarified each group's internet purchase behavior and demographic profile. Data were collected from 252 women in their 20s and 30s who had experience reading and writing reviews when shopping online. The five types of writing motive were altruistic information sharing, remedying of a grievance and vengeance, economic incentives, helping new product development, and the expression of satisfaction. Of the five motives, altruistic information sharing, economic incentives, and helping new product development stimulated review writing. Store customers who write reviews were classified into three groups based on their writing motive types: other consumer advocates (29.8%), self-interested shoppers (40.5%), and shoppers with moderate motives (29.8%). There were significant differences among the three groups in writing behavior (the frequency of writing reviews, intention to write reviews, duration of writing reviews, and frequency of online shopping) and age. Managerial implications were suggested based on the results. Long Abstract: The purpose of the present study is to identify the types of motive for writing reviews in online shopping and to clarify which motives affect review-writing behavior. The study also classifies online shoppers by motive type and identifies the characteristics of the classified groups in terms of writing behavior, frequency of online shopping, and demographics. Use and Gratification Theory was adopted in this study. Qualitative research (focus group interviews) and quantitative research were used. Korean women (20 to 39 years old) who reported experience with purchasing clothing online and with reading and writing reviews were selected as the sample (n=252). 
Most of the respondents were relatively young (20-34 years, 86.1%), single (61.1%), employed (61.1%), and residents of big cities (50.9%). About 69.8% of respondents read apparel reviews, and 40.5% wrote them, frequently or very frequently. A further 24.6% of the respondents rated their writing frequency as "average". Measurement items for the motives for writing after-purchase reviews were developed from the qualitative results of the focus group interviews and from previous studies on motives for online community activities. All items used a five-point Likert scale with endpoints 1 (strongly disagree) and 5 (strongly agree). The degree of writing behavior was measured by items concerning experience of writing reviews, frequency of writing reviews, amount of writing reviews, and intention to write reviews. A five-point scale (strongly disagree to strongly agree) was employed. SPSS 18.0 was used for exploratory factor analysis, K-means cluster analysis, one-way ANOVA (Scheffe test), and the $\chi^2$-test. Confirmatory factor analysis and path model analysis were conducted with AMOS 18.0. Principal components factor analysis (varimax rotation, extracting factors with eigenvalues above 1.0) of the measurement items identified five factors: altruistic information sharing, remedying of a grievance and vengeance, economic incentives, helping new product development, and expression of satisfaction (see Table 1). The measurement model including these final items was analyzed by confirmatory factor analysis. The measurement model had good fit indices (GFI=.918, AGFI=.884, RMR=.070, RMSEA=.054, TLI=.941) except for the probability value associated with the $\chi^2$ test ($\chi^2$=189.078, df=109, p=.00). The convergent validity of all variables was confirmed using composite reliability. All SMC values were lower than the AVEs, confirming discriminant validity. 
The path model's goodness-of-fit met the recommended thresholds on several indices (GFI=.905, AGFI=.872, RMR=.070, RMSEA=.052, TLI=.935; $\chi^2$=260.433, df=155, p=.00). Table 2 shows that the motives of altruistic information sharing, economic incentives, and helping new product development significantly increased the degree of writing product reviews in online shopping. In particular, the effects of altruistic information sharing and the pursuit of economic incentives on review-writing behavior were larger than the effect of helping new product development. As shown in Table 3, online store shoppers were classified into three groups: other consumer advocates (29.8%), self-interested shoppers (40.5%), and moderate shoppers (29.8%). There were significant differences among the three groups in the degree of writing reviews (experience of writing reviews, frequency of writing reviews, amount of writing reviews, intention to write reviews, duration of writing reviews, and frequency of online shopping) and age. On five aspects of writing behavior, the group of other consumer advocates, which consists mainly of women in their 20s, scored higher than the other two groups. There were no significant differences between the self-interested group and the moderate group in writing behavior or demographics.
