• Title/Summary/Keyword: Co-clustering

Analysis of the genetic diversity and population structure of Lindera obtusiloba (Lauraceae), a dioecious tree in Korea

  • Ho Bang Kim;Hye-Young Lee;Mi Sun Lee;Yi Lee;Youngtae Choi;Sung-Yeol Kim;Jaeyong Choi
    • Journal of Plant Biotechnology / v.50 / pp.207-214 / 2023
  • Lindera obtusiloba (Lauraceae) is a dioecious tree that is widely distributed in the low-altitude montane forests of East Asia, including Korea. Despite its various pharmacological properties and ornamental value, the genetic diversity and population structure of this species in Korea have not been explored. In this study, we selected 6 nuclear and 6 chloroplast microsatellite markers with polymorphism or clean cross-amplification and used these markers to perform genetic diversity and population structure analyses of L. obtusiloba samples collected from 20 geographical regions. Using these 12 markers, we identified a total of 44 alleles, ranging from 1 to 8 per locus, and the average observed and expected heterozygosity values were 0.11 and 0.44, respectively. The average polymorphism information content was 0.39. Genetic relationship and population structure analyses revealed that the natural L. obtusiloba population in Korea is composed of 2 clusters, possibly due to two different plastid genotypes. The same clustering patterns have also been observed in Lindera species in mainland China and Japan.
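The diversity statistics reported above (observed and expected heterozygosity, polymorphism information content) can be illustrated with a minimal sketch. It assumes a single diploid nuclear locus, the usual definitions He = 1 - Σp² and PIC = 1 - Σp² - Σ 2p_i²p_j² over allele pairs, and invented genotype data; the function names are hypothetical and this is not the authors' pipeline.

```python
from collections import Counter
from itertools import combinations


def allele_frequencies(genotypes):
    """Allele frequencies for one diploid locus; genotypes are (allele, allele) pairs."""
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}


def observed_heterozygosity(genotypes):
    """Fraction of individuals carrying two different alleles."""
    return sum(1 for a, b in genotypes if a != b) / len(genotypes)


def expected_heterozygosity(freqs):
    """He = 1 - sum(p_i^2) (no small-sample correction applied)."""
    return 1.0 - sum(p * p for p in freqs.values())


def pic(freqs):
    """PIC = 1 - sum(p_i^2) - sum over allele pairs i<j of 2 * p_i^2 * p_j^2."""
    p = list(freqs.values())
    homozygous = sum(x * x for x in p)
    cross = sum(2 * (x * x) * (y * y) for x, y in combinations(p, 2))
    return 1.0 - homozygous - cross


# Illustrative (made-up) genotypes: allele sizes in base pairs at one locus.
sample = [(150, 150), (150, 154), (154, 154), (150, 150)]
freqs = allele_frequencies(sample)
print(observed_heterozygosity(sample), expected_heterozygosity(freqs), pic(freqs))
```

Chloroplast markers are haploid, so heterozygosity in this sense applies only to the nuclear loci; averaging over loci is left to the caller.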

Research on Function and Policy for e-Government System using Semantic Technology (전자정부내 의미기반 기술 도입에 따른 기능 및 정책 연구)

  • Jang, Young-Cheol
    • Journal of Korea Society of Industrial Information Systems / v.13 no.5 / pp.22-28 / 2008
  • This paper proposes a solution based on semantic document classification to improve the utilization and efficiency of e-Government for people who rely on their own information retrieval systems and linguistic expressions. In general, semantic document classification classifies documents based on the diverse relationships between keywords in a document, without fully describing the hierarchical concepts between keywords. Our approach considers the deep meanings within the context of the document and greatly enhances information retrieval performance. We propose, test, and evaluate the Concept Weight Document Classification (CoWDC) method, which goes beyond existing keyword and simple thesaurus/ontology methods by fully considering the concept hierarchy of various concepts. We find that, in order to verify the superiority of the semantic retrieval technology through the CoWDC test results and to integrate it efficiently into e-Government, the creation of a thesaurus, management of the operating system, expansion of the knowledge base, and improvements in search service and accuracy are needed at the national level.

The Distribution Structure of the Internet Movie and Spatial Clustering of the Internet Movie Industry (인터넷 영화의 유통구조와 인터넷 영화산업의 공간적 집적화)

  • Lee, Hee-Yeon;Lee, Nan-Kyung
    • Journal of the Economic Geographical Society of Korea / v.8 no.1 / pp.107-130 / 2005
  • The purpose of this study was to examine the spatial distribution and locational characteristics of the Internet movie industry, to identify the value chain of the industry and the distribution structure of Internet movies, and to analyze the vertical and horizontal linkages of Internet movie firms and their spatial clustering. Recently, the Internet movie industry has developed rapidly owing to advances in techniques related to movie contents, the wide expansion of broadband Internet and high-speed communication networks, and increasing demand for movie contents. We found that 74% of the Internet movie industry was concentrated in Seoul. In particular, the industry was agglomerated in several dongs of Gangnam-gu, such as Yoeksam, Nonhyeon, Daechi and Samseung. Proximity to the same or similar business firms was the primary locational factor influencing the Internet movie industry, followed by convenience of transportation, the reputation of the place, and proximity to technically supporting firms. The Internet movie industry had a value chain composed of 'contents suppliers → contents distributors → service providers'. However, there was also a complex network of VOD copyright owners, VOD syndicators, and service providers in each category of the value chain. This research revealed that localized clustering has formed among movie contents providers, technically supporting firms, client firms, and cooperative-affiliated business firms related to the Internet movie industry. Additionally, a very close network has been established within the cluster, leading to enlargement of the market, reduced costs, sharing of tacit knowledge, and synergy effects.

Co-author and Keyword Networks and their Clustering Appearance in Preventive Medicine Fields in Korea: Analysis of Papers in the Journal of Preventive Medicine and Public Health, 1991-2006 (국내 예방의학 분야의 공저자.핵심어 네트워크와 군집 양상 - 대한예방의학회지(1991~2006) 게재논문의 분석 -)

  • Jung, Min-Soo;Chung, Dong-Jun
    • Journal of Preventive Medicine and Public Health / v.41 no.1 / pp.1-9 / 2008
  • Objectives: This study evaluated the knowledge structure of Korea's preventive medicine sector, and the factors affecting it, by analyzing co-author and keyword networks. Methods: The data were extracted from 873 papers listed in the Journal of Preventive Medicine and Public Health and transformed into a co-author and keyword matrix, in which the existence of a 'link' was judged by impact factors calculated from the weight of each author's role and rate of participation. Whether research achievement depended on the author's status and networking index was analyzed by neighborhood degree, multidimensional scaling, correspondence analysis, and multiple regression. Results: Co-author networks developed as random networks centered on a few high-productivity researchers. In particular, closeness centrality was more developed than degree centrality. A power-law distribution was also found in impact factor and research productivity by college affiliation. In multiple regression, the effect of the author's role was significant for both the impact factor calculated by the participatory rate and the number of listed articles. However, the number of listed articles varied by sex. Conclusions: This study shows that the small-world phenomenon exists in co-author and keyword networks in a journal, as in citation networks. However, the differentiation of knowledge structure in the field of preventive medicine was relatively restricted by specialization.
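The two centrality measures compared in the results can be sketched as follows. This is a minimal illustration on an unweighted, undirected co-author graph stored as an adjacency dictionary; the graph and author labels are invented, and closeness is computed over the nodes reachable from each author, which is an assumption rather than the paper's stated procedure.

```python
from collections import deque


def degree_centrality(graph):
    """Number of direct co-authors, normalized by the maximum possible (n - 1)."""
    n = len(graph)
    return {v: len(neighbors) / (n - 1) for v, neighbors in graph.items()}


def closeness_centrality(graph):
    """Reachable-node count divided by the sum of shortest-path distances (BFS)."""
    result = {}
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for w in graph[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        total = sum(dist.values())
        result[source] = (len(dist) - 1) / total if total else 0.0
    return result


# Hypothetical star-shaped network: one productive author linked to three others.
coauthors = {
    "A": {"B", "C", "D"},
    "B": {"A"},
    "C": {"A"},
    "D": {"A"},
}
print(degree_centrality(coauthors))
print(closeness_centrality(coauthors))
```

In this toy network the peripheral authors have low degree but still fairly high closeness, because everyone is at most two steps apart through the hub.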

Proposal of Analysis Method for Biota Survey Data Using Co-occurrence Frequency

  • Yong-Ki Kim;Jeong-Boon Lee;Sung Je Lee;Jong-Hyun Kang
    • Proceedings of the National Institute of Ecology of the Republic of Korea / v.5 no.3 / pp.76-85 / 2024
  • The purpose of this study is to propose a new method of analysis focusing on interconnections between species rather than traditional biodiversity analysis, which represents ecosystems in terms of species and individual counts such as species diversity and species richness. This new approach aims to enhance our understanding of ecosystem networks. Utilizing data from the 4th National Natural Environment Survey (2014-2018), the following eight taxonomic groups were targeted for our study: herbaceous plants, woody plants, butterflies, Passeriformes birds, mammals, reptiles & amphibians, freshwater fishes, and benthonic macroinvertebrates. A co-occurrence frequency analysis was conducted using nationwide data collected over five years. As a result, in all eight taxonomic groups, the degree value represented by a linear regression trend line showed a slope of 0.8 and the weighted degree value showed an exponential nonlinear curve trend line with a coefficient of determination (R²) exceeding 0.95. The average value of the clustering coefficient was also around 0.8, reminiscent of well-known social phenomena. Creating a combination set from the species list grouped by temporal information such as survey date and spatial information such as coordinates or grids is an easy approach to discern species distributed regionally and locally. Particularly, grouping by species or taxonomic groups to produce data such as co-occurrence frequency between survey points could allow us to discover spatial similarities based on species present. This analysis could overcome limitations of species data. Since there are no restrictions on time or space, data collected over a short period in a small area and long-term national-scale data can be analyzed through appropriate grouping. The co-occurrence frequency analysis enables us to measure how many species are associated with a single species and the frequency of associations among each species, which will greatly help us understand ecosystems that seem too complex to comprehend. Such connectivity data and graphs generated by the co-occurrence frequency analysis of species are expected to provide a wealth of information and insights not only to researchers, but also to those who observe, manage, and live within ecosystems.
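The "combination set" idea described in the abstract can be sketched in a few lines: group species by a survey key (date, grid cell, or both), enumerate species pairs within each group, and count how often each pair recurs. The record layout, key names, and species labels below are assumptions for illustration, not the survey's actual schema.

```python
from collections import Counter, defaultdict
from itertools import combinations


def cooccurrence_counts(records):
    """Count, for each species pair, the number of groups in which both appear.

    records: iterable of (group_key, species) tuples, where group_key is any
    hashable grouping such as a survey date or grid cell.
    """
    groups = defaultdict(set)
    for key, species in records:
        groups[key].add(species)
    pair_counts = Counter()
    for species_set in groups.values():
        # Sorting gives each unordered pair a single canonical form.
        for a, b in combinations(sorted(species_set), 2):
            pair_counts[(a, b)] += 1
    return pair_counts


def degrees(pair_counts):
    """Degree = number of distinct partners; weighted degree = total co-occurrences."""
    degree, weighted = Counter(), Counter()
    for (a, b), n in pair_counts.items():
        degree[a] += 1
        degree[b] += 1
        weighted[a] += n
        weighted[b] += n
    return degree, weighted


# Made-up observations: (grid cell, species).
observations = [
    ("grid1", "sp_A"), ("grid1", "sp_B"), ("grid1", "sp_C"),
    ("grid2", "sp_A"), ("grid2", "sp_B"),
]
pairs = cooccurrence_counts(observations)
print(pairs)
print(degrees(pairs))
```

Changing the grouping key (for example, from grid cell to survey date) changes the question being asked without changing the counting code, which matches the flexibility the abstract emphasizes.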

Domain Analysis on the Field of Open Access by Co-Word Analysis: Based on Published Journals of Library and Information Science during 2013 to 2018 (동시출현단어 분석을 활용한 오픈액세스 분야의 지적구조 분석: 2013년부터 2018년까지 출판된 문헌정보학 저널을 기반으로)

  • Kim, Sun-Kyum;Kim, Wan-Jong;Seo, Tae-Sul;Choi, Hyun-Jin
    • Journal of Korean Library and Information Science Society / v.50 no.1 / pp.333-356 / 2019
  • Open access has emerged as an alternative for overcoming the crisis in scholarly communication that relies on commercial publishers. The purpose of this study is to present an intellectual structure that reflects the newest research trends in the field of open access, to identify how the subject area is structured by using co-word analysis, and to compare the results with an existing study. To this end, a dataset of 761 papers in information science was collected from Web of Science for the period from January 2012 to November 2018, and 2,321 keywords in the form of noun phrases were extracted from titles and abstracts. To analyze the intellectual structure of open access, 13 topic clusters were extracted by network analysis, and keywords with higher centrality were identified by visualizing the intellectual relationships. In addition, after clustering analysis, the relationships were analyzed by plotting the results on a multidimensional scaling map. We expect this research to help guide future research on open access.

A Comparative Analysis on Multiple Authorship Counting for Author Co-citation Analysis (저자동시인용분석을 위한 복수저자 기여도 산정 방식의 비교 분석)

  • Lee, Jae Yun;Chung, EunKyung
    • Journal of the Korean Society for information Management / v.31 no.2 / pp.57-77 / 2014
  • As co-authorship has become prevalent within science communities, counting the credit of co-authors appropriately is an important consideration, particularly when identifying the knowledge structure of fields with author-based analysis. The purpose of this study is to compare the characteristics of co-author credit counting methods by utilizing correlations, multidimensional scaling, and pathfinder networks. To achieve this purpose, this study analyzed a dataset of 2,014 journal articles and 3,892 cited authors from the Journal of the Architectural Institute of Korea: Planning & Design from 2003 to 2008 in the field of Architecture in Korea. Six different methods of crediting co-authors are selected for comparative analysis: first-author counting (m1), straight full counting (m2), fractional counting (m3), proportional counting with a total score of 1 (m4), proportional counting with a total score between 1 and 2 (m5), and first-author-weighted fractional counting (m6). The data analysis shows that m1 and m2 are extreme opposites, since m1 counts only first authors and m2 assigns every co-author an equal credit score of 1. In the correlation and multidimensional scaling analyses of the five counting methods from m2 to m6, the group consisting of m3, m4, and m5 is found to be relatively similar. When the knowledge structure is visualized with a pathfinder network, the networks produced by different counting methods differ because of the connections of individual links. In addition, the internal validity results suggest that first-author-weighted fractional counting (m6) might be considered a better method for author clustering. The findings demonstrate that different co-author counting methods influence the resulting knowledge structure networks and indicate which counting method is better suited to author clustering.
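The three simplest counting schemes named above can be written down directly. The sketch below covers only m1 to m3, using the definitions given in the abstract for m1 and m2 and the conventional equal-share definition (1/n per author) for m3, which is an assumption here; the weighting formulas for m4 to m6 are not stated in the abstract and are deliberately left out. Function names and the sample data are hypothetical.

```python
from collections import defaultdict


def first_author_counting(n_authors):
    """m1: the first author receives 1, all other authors receive 0."""
    return [1.0] + [0.0] * (n_authors - 1)


def straight_full_counting(n_authors):
    """m2: every co-author receives a full credit of 1."""
    return [1.0] * n_authors


def fractional_counting(n_authors):
    """m3 (assumed equal-share form): each co-author receives 1 / n."""
    return [1.0 / n_authors] * n_authors


def accumulate_credit(papers, scheme):
    """Sum per-author credit over papers; each paper is an ordered author list."""
    totals = defaultdict(float)
    for authors in papers:
        for author, credit in zip(authors, scheme(len(authors))):
            totals[author] += credit
    return dict(totals)


# Made-up byline data: two papers with ordered author lists.
papers = [["Kim", "Lee"], ["Lee"]]
for scheme in (first_author_counting, straight_full_counting, fractional_counting):
    print(scheme.__name__, accumulate_credit(papers, scheme))
```

Because each scheme is just a function from author count to a list of credits, further schemes can be compared by passing another function to `accumulate_credit`.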

Bagged Auto-Associative Kernel Regression-Based Fault Detection and Identification Approach for Steam Boilers in Thermal Power Plants

  • Yu, Jungwon;Jang, Jaeyel;Yoo, Jaeyeong;Park, June Ho;Kim, Sungshin
    • Journal of Electrical Engineering and Technology / v.12 no.4 / pp.1406-1416 / 2017
  • In complex and large-scale industries, properly designed fault detection and identification (FDI) systems considerably improve safety, reliability and availability of target processes. In thermal power plants (TPPs), generating units operate under very dangerous conditions; system failures can cause severe loss of life and property. In this paper, we propose a bagged auto-associative kernel regression (AAKR)-based FDI approach for steam boilers in TPPs. AAKR estimates new query vectors by online local modeling, and is suitable for TPPs operating under various load levels. By combining the bagging method, more stable and reliable estimations can be achieved, since the effects of random fluctuations decrease because of ensemble averaging. To validate performance, the proposed method and comparison methods (i.e., a clustering-based method and principal component analysis) are applied to failure data due to water wall tube leakage gathered from a 250 MW coal-fired TPP. Experimental results show that the proposed method fulfills reasonable false alarm rates and, at the same time, achieves better fault detection performance than the comparison methods. After performing fault detection, contribution analysis is carried out to identify fault variables; this helps operators to confirm the types of faults and efficiently take preventive actions.
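A minimal sketch of the idea behind bagged AAKR follows. It assumes a Gaussian kernel on Euclidean distance, a memory of fault-free historical vectors, and simple bootstrap resampling of that memory; the bandwidth, number of bags, and data are illustrative choices, not values from the paper, and the paper's detection thresholds and contribution analysis are not reproduced.

```python
import math
import random


def aakr_estimate(memory, query, bandwidth):
    """Estimate the query as a kernel-weighted average of the memory vectors."""
    weights = []
    for x in memory:
        d2 = sum((a - b) ** 2 for a, b in zip(x, query))
        weights.append(math.exp(-d2 / (2.0 * bandwidth ** 2)))
    total = sum(weights)
    if total == 0.0:
        raise ValueError("query is too far from every memory vector for this bandwidth")
    return [
        sum(w * x[j] for w, x in zip(weights, memory)) / total
        for j in range(len(query))
    ]


def bagged_aakr_estimate(memory, query, bandwidth, n_bags=25, seed=0):
    """Average AAKR estimates over bootstrap resamples of the memory."""
    rng = random.Random(seed)
    sums = [0.0] * len(query)
    for _ in range(n_bags):
        sample = [rng.choice(memory) for _ in memory]  # sample with replacement
        estimate = aakr_estimate(sample, query, bandwidth)
        sums = [s + e for s, e in zip(sums, estimate)]
    return [s / n_bags for s in sums]


def residual(query, estimate):
    """Per-variable residual; large magnitudes suggest a deviation from normal behavior."""
    return [q - e for q, e in zip(query, estimate)]


# Made-up normal-operation data with two process variables.
normal_data = [[1.0, 2.0], [1.1, 2.1], [0.9, 1.9], [1.0, 2.2]]
observation = [1.0, 3.0]
estimate = bagged_aakr_estimate(normal_data, observation, bandwidth=0.5)
print(residual(observation, estimate))
```

The residual vector is the quantity a detection rule would monitor; how its threshold is set is left open here.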

Improved Multidimensional Scaling Techniques Considering Cluster Analysis: Cluster-oriented Scaling (클러스터링을 고려한 다차원척도법의 개선: 군집 지향 척도법)

  • Lee, Jae-Yun
    • Journal of the Korean Society for information Management / v.29 no.2 / pp.45-70 / 2012
  • Many methods and algorithms have been proposed for multidimensional scaling (MDS), which maps the relationships between data objects into a low-dimensional space. However, traditional techniques such as PROXSCAL and ALSCAL were found to be ineffective for visualizing the proximities between objects and the structure of clusters in large data sets with more than 50 objects. The CLUSCAL (CLUster-oriented SCALing) technique introduced in this paper differs from them chiefly in that it uses the cluster structure of the input data set. The CLUSCAL procedure was tested and evaluated on two data sets: co-citation data for 50 authors and co-occurrence data for 85 words. The results are promising and suggest that the CLUSCAL method is useful, especially for identifying clusters on MDS maps.

Experimental Evaluation of Distance-based and Probability-based Clustering

  • Kwon, Na Yeon;Kim, Jang Il;Dollein, Richard;Seo, Weon Joon;Jung, Yong Gyu
    • International journal of advanced smart convergence / v.2 no.1 / pp.36-41 / 2013
  • Decision-making involves extracting information that can be acted on in the future; it refers to the process of discovering a new model induced from the data. In other words, it uncovers information by digging out the relationships between hidden patterns in the data. This information is found by applying modeling techniques and sophisticated statistical analysis to the data in order to find relationships between useful patterns. This process is called data mining, which is a key technology for database marketing. Research on cluster analysis, which can extract information from large data sets without a clear criterion, is therefore being actively pursued. The EM and K-means methods are used especially often. This paper examines experimentally how their evaluation results vary with the size of the data, comparing distance-based and probability-based analysis.
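The contrast between distance-based and probability-based clustering can be sketched on one-dimensional data. K-means assigns each point to its nearest center, while EM for a Gaussian mixture gives each point a soft responsibility under every component. The data, starting values, and iteration counts below are invented for illustration and do not reproduce the paper's experiments.

```python
import math


def kmeans_1d(data, centers, iters=50):
    """Distance-based: hard-assign each point to the nearest center, then re-average."""
    centers = list(centers)
    for _ in range(iters):
        clusters = [[] for _ in centers]
        for x in data:
            nearest = min(range(len(centers)), key=lambda i: abs(x - centers[i]))
            clusters[nearest].append(x)
        # Keep the old center if a cluster received no points.
        centers = [sum(c) / len(c) if c else centers[i] for i, c in enumerate(clusters)]
    return centers


def em_gmm_1d(data, means, iters=50):
    """Probability-based: EM for a 1-D Gaussian mixture with soft assignments."""
    k = len(means)
    means = list(means)
    variances = [1.0] * k
    weights = [1.0 / k] * k
    for _ in range(iters):
        # E-step: responsibility of each component for each point.
        resp = []
        for x in data:
            dens = [
                weights[j]
                * math.exp(-((x - means[j]) ** 2) / (2.0 * variances[j]))
                / math.sqrt(2.0 * math.pi * variances[j])
                for j in range(k)
            ]
            total = sum(dens)
            resp.append([d / total for d in dens])
        # M-step: re-estimate mixture weights, means, and variances.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var, 1e-6)  # floor to avoid a collapsed component
            weights[j] = nj / len(data)
    return means, variances, weights


# Made-up data with two well-separated groups.
points = [0.8, 1.0, 1.2, 4.8, 5.0, 5.2]
print(kmeans_1d(points, [0.0, 6.0]))
print(em_gmm_1d(points, [0.0, 6.0]))
```

On well-separated data like this the two methods agree closely on the cluster centers; they are more likely to differ when groups overlap, since EM also estimates each component's spread and mixing weight.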