• Title/Summary/Keyword: Number of clusters


Genetic Differences and Variation in Two Largehead Hairtail (Trichiurus lepturus) Populations Determined by RAPD-PCR Analysis (RAPD-PCR 분석에 의해 결정된 갈치 (Trichiurus lepturus) 2 집단의 유전적 차이와 변이)

  • Park, Chang-Yi;Yoon, Jong-Man
    • Korean Journal of Ichthyology / v.17 no.3 / pp.173-186 / 2005
  • Genomic DNA was isolated from two geographic populations of largehead hairtail (Trichiurus lepturus), one from Korea and one from the Atlantic Ocean. Eight arbitrarily selected primers generated common, polymorphic, and specific fragments. The complexity of the banding patterns varied dramatically between primers from the two locations. The size of the DNA fragments also varied widely, from 150 bp (base pairs) to 3,000 bp. In total, 947 fragments were identified in the Korean population and 642 in the Atlantic population; of these, 148 (15.6%) were specific fragments in the Korean population and 61 (9.5%) in the Atlantic population. In the Korean population, 638 common fragments were observed, an average of 79.8 per primer; in the Atlantic population, 429 common fragments were identified, an average of 53.6 per primer. The numbers of polymorphic fragments in the Korean and Atlantic populations were 76 and 27, respectively. Based on the average bandsharing values of all samples, the similarity matrix ranged from 0.784 to 0.922 in the Korean population and from 0.833 to 0.990 in the Atlantic population. The bandsharing value of individuals within the Atlantic population was much higher than in the Korean population. The dendrogram obtained with the eight primers indicated two genetic clusters: cluster 1 (KOREAN 01~KOREAN 11) and cluster 2 (ATLANTIC 12~ATLANTIC 22). Within the Korean population, individual KOREAN no. 10 was genetically most closely related to KOREAN no. 11 (genetic distance = 0.038). Individual KOREAN no. 01 of the Korean population was most distantly related to ATLANTIC no. 16 of the Atlantic population (genetic distance = 0.708).
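
The bandsharing values reported above compare the fragment sets of two individuals. The abstract does not say which formula was used; a common choice in RAPD work is the Dice-style index BS = 2 * N_ab / (N_a + N_b), and the minimal sketch below assumes it. The function name and the fragment sizes are hypothetical and are not data from the study.

```python
def bandsharing(bands_a, bands_b):
    """Dice-style bandsharing value: 2 * shared bands / (bands in A + bands in B).

    Assumed formula; the abstract does not specify the index it used.
    """
    a, b = set(bands_a), set(bands_b)
    if not a and not b:
        raise ValueError("both lanes are empty")
    return 2 * len(a & b) / (len(a) + len(b))


# Hypothetical fragment sizes in bp for two individuals (illustration only).
lane_1 = [150, 400, 850, 1200, 3000]
lane_2 = [150, 400, 900, 1200]

bs = bandsharing(lane_1, lane_2)
print(round(bs, 3))
```

A value of 1 means the two lanes share every band, and 0 means they share none. Real gels would also need a tolerance for matching fragment sizes, which this sketch leaves out.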

Development of Multiplex Microsatellite Marker Set for Identification of Korean Potato Cultivars (국내 감자 품종 판별을 위한 다중 초위성체 마커 세트 개발)

  • Cho, Kwang-Soo;Won, Hong-Sik;Jeong, Hee-Jin;Cho, Ji-Hong;Park, Young-Eun;Hong, Su-Young
    • Horticultural Science & Technology / v.29 no.4 / pp.366-373 / 2011
  • To analyze the genetic relationships among Korean potato cultivars and to develop a cultivar identification method using DNA markers, we carried out genotyping by simple sequence repeat (SSR) analysis and developed a multiplex-SSR set. Initially, we took 92 previously reported SSR primer combinations and applied them to twenty-four Korean potato cultivars. From the 92 SSR markers, we selected 14 based on their polymorphism information content (PIC) values. The PIC values of the 14 selected markers ranged from 0.48 to 0.89, with an average of 0.76; PSSR-29 had the lowest value (0.48) and PSSR-191 the highest (0.89). UPGMA clustering analysis based on genetic distances from the 14 SSR markers classified 21 potato cultivars into 2 clusters. Clusters I and II included 16 and 5 cultivars, respectively; the remaining 3 cultivars were not classified into either major cluster. These 14 SSR markers generated a total of 121 alleles, and the average number of alleles per SSR marker was 10.8, with a range from 3 to 34. Among the selected markers, we combined three SSR markers, PSSR-17, PSSR-24 and PSSR-24, as a multiplex-SSR set. This multiplex-SSR set can distinguish all the cultivars with a single PCR and PAGE (polyacrylamide gel electrophoresis) analysis, and the PIC value of the multiplex-SSR set was 0.95.
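
Marker selection above rests on PIC values. The abstract does not give the formula; the sketch below assumes the widely used definition PIC = 1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2, where p_i are allele frequencies at one marker. The allele frequencies are hypothetical and only illustrate how PIC grows with the number of evenly distributed alleles.

```python
from itertools import combinations


def pic(allele_freqs):
    """Polymorphism information content for one marker (assumed standard formula)."""
    if abs(sum(allele_freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    # Probability that two randomly drawn alleles are identical.
    homozygosity = sum(p * p for p in allele_freqs)
    # Correction for matings whose offspring are uninformative.
    correction = sum(2 * (p * q) ** 2 for p, q in combinations(allele_freqs, 2))
    return 1.0 - homozygosity - correction


# Hypothetical markers: two equally frequent alleles vs. ten equally frequent alleles.
two_alleles = pic([0.5, 0.5])
ten_alleles = pic([0.1] * 10)
print(round(two_alleles, 3), round(ten_alleles, 3))
```

Under this definition a marker with few, unevenly distributed alleles scores low, which is why ranking by PIC favors highly polymorphic markers for cultivar identification.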

Geographic Variation in Pond Smelt (Hypomesus nipponensis) by RAPD Analysis (RAPD 분석에 의한 빙어 (Hypomesus nipponensis)의 지리적 변이)

  • Kim, Yong-Ho;Park, Su-Young;Yoon, Jong-Man
    • Korean Journal of Ichthyology / v.18 no.1 / pp.1-11 / 2006
  • Genomic DNA isolated from two geographical populations of pond smelt (Hypomesus nipponensis) was amplified for RAPD (randomly amplified polymorphic DNA) analysis. The populations were obtained from Chungju (CJ), an inland area, and Dangjin (DJ), in the vicinity of the West Sea in Korea. Seven arbitrarily selected primers, OPB-06, OPB-10, OPB-13, OPB-17, OPC-09, OPC-17 and OPC-20, were used to generate shared, polymorphic, and specific loci. Across the primers, 383 loci were identified in the CJ population and 287 in the DJ population. Of these, 91 loci (23.8%) were polymorphic in the CJ population and 47 (16.4%) in the DJ population. The number of shared loci observed was 198 in the CJ population and 176 in the DJ population. Forty-four and 75 specific loci were detected in the CJ and DJ populations, respectively. Notably, 99 loci were shared by the two pond smelt populations, an average of 14.1 per primer. The average bandsharing value between the two geographical populations was 0.700 ± 0.008, ranging from 0.600 to 0.846. Compared separately, the bandsharing value of individuals within the CJ population was higher than that within the DJ population. The dendrogram obtained using the data from the seven primers indicated three genetic clusters: cluster 1, CJ 01, 02, 03, 04, 05, 06, 07, 08, 09, 10, and 11; cluster 2, DJ 01, 02, 03, 04, 05, 06, 07, 08, and 09; and cluster 3, DJ 10 and 11. The genetic distance between the two geographical populations ranged from 0.040 to 0.545. Thus, RAPD-PCR analysis revealed a significant genetic distance between the two pond smelt populations.
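
A dendrogram like the one described above is built from a pairwise genetic distance matrix. The abstract does not name the tree-building method, so the sketch below assumes UPGMA (average linkage) purely for illustration. The four individuals and their distances are hypothetical, and `upgma` is an illustrative helper, not code from the study.

```python
def upgma(labels, dist):
    """Agglomerate clusters by average linkage; return (left, right, height) merges.

    dist maps a pair of labels to their distance. UPGMA is an assumption here.
    """
    size = {name: 1 for name in labels}
    d = {frozenset(pair): value for pair, value in dist.items()}
    merges = []
    while len(size) > 1:
        pair = min(d, key=d.get)               # closest pair of clusters
        a, b = sorted(pair)
        merged = f"({a},{b})"
        merges.append((a, b, d[pair] / 2))     # branch height is half the distance
        for other in size:
            if other in pair:
                continue
            d_a = d.pop(frozenset((a, other)))
            d_b = d.pop(frozenset((b, other)))
            # Size-weighted average keeps every original individual equally weighted.
            d[frozenset((merged, other))] = (
                size[a] * d_a + size[b] * d_b
            ) / (size[a] + size[b])
        del d[pair]
        size[merged] = size.pop(a) + size.pop(b)
    return merges


# Hypothetical distances between four individuals (not values from the paper).
labels = ["CJ01", "CJ02", "DJ01", "DJ10"]
dist = {
    ("CJ01", "CJ02"): 0.04, ("CJ01", "DJ01"): 0.40, ("CJ02", "DJ01"): 0.44,
    ("CJ01", "DJ10"): 0.50, ("CJ02", "DJ10"): 0.54, ("DJ01", "DJ10"): 0.20,
}
merges = upgma(labels, dist)
for left, right, height in merges:
    print(left, right, round(height, 3))
```

Cutting the resulting tree at a chosen height yields the clusters. The number of clusters therefore depends on where the cut is placed, not only on the data.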

Effects of Seeding Date on Growth, Yield, and Fatty Acid Content of Perilla Inter-cropped with Sesame in Central Korea (중부지역 참깨 간작 들깨 재배시 파종기가 수량 및 품질에 미치는 영향)

  • Kim, Young Sang;Kim, Ki Hyeon;Yun, Cheol Gu;Heo, Yun Seon;Kim, Ik Jei;Kim, Young-Ho;Song, Yong-Sup;Lee, Myoung Hee
    • Korean Journal of Crop Science / v.66 no.2 / pp.138-145 / 2021
  • Perilla contains more than 60% of fatty acids. Linolenic acid is effective in preventing heart disease, improving learning ability, treating allergies, and preventing cancer. This study was carried out to improve cultivation methods for the stable production of perilla by developing a suitable inter-cropping system with sesame in the central region and by identifying a suitable planting time. The results are summarized as follows. As the planting time of perilla in the inter-cropping system with sesame was delayed, the number of clusters and capsules decreased. Perilla yields in this system differed significantly according to the previous crop (sesame variety) and the planting time. The yield of perilla was significantly lower with the characteristic-Type B variety than with the characteristic-Type A variety, and it decreased significantly as the planting time was delayed. With regard to the quality characteristics of perilla, such as crude protein and crude fat, there were no differences between previous perilla crops and those inter-cropped with sesame. The perilla composition did not show any difference during the planting period; however, as the planting time was delayed, crude protein content increased and crude fat content decreased. The yield of perilla inter-cropped with sesame was 38% higher in a two-row (40 x 40 cm) system than in single-row cultivation (110 x 20 cm). These results suggest that the suitable method for inter-cropping perilla with sesame in the central region is to sow the characteristic-Type A variety in early May and to cultivate the perilla in two rows (40 x 40 cm) in mid-June. This was judged to be the best cultivation method in the central region.

Fruit Characteristics of Gaeryangmeoru Grapes According to Gibberellic Acid and Thidiazuron Treatments (Gibberellic acid와 thidiazuron 처리에 의한 개량머루의 과실 특성)

  • Kwon, YongHee
    • Journal of Bio-Environment Control / v.23 no.2 / pp.77-82 / 2014
  • The present study was conducted to establish the effect of, and a proper concentration for, treatment with gibberellic acid ($GA_3$) and thidiazuron (TDZ) to increase berry size and yield in Gaeryangmeoru grapes. Berry size was increased by treatment with $GA_3$, and the fruit clusters obtained from the groups treated with $GA_3$ concentrations of 100 and $200mg{\cdot}L^{-1}$ were bigger. The berry number was also enhanced in the $GA_3$-treated groups, but the soluble solid content and acidity were not significantly different. Damage caused by $GA_3$ treatment, such as peel pollination and berry shatter, was observed in the group treated with $200mg{\cdot}L^{-1}$. In the mixed $GA_3$ and TDZ treatment, berry size was larger in the groups treated with high concentrations of $GA_3$ and TDZ than in those treated with low concentrations; however, fruit with low soluble solid content and high acidity was harvested after $GA_3$ and TDZ treatment because berry ripening was delayed. The number of pericarp tissue layers did not change, but the distance from the epidermis layer to the vascular bundle tissue increased as a result of $GA_3$ and TDZ treatment. Therefore, $GA_3$ and TDZ affected cell size but not cell division, resulting in an enlarged berry size. Because the period of berry set is short, plant growth regulators need to be applied 2~3 times, starting immediately after berry set, to enhance the berry set rate. This study suggests that the proper concentration for enhancing berry size and set is up to $100mg{\cdot}L^{-1}$ $GA_3$ or $50mg{\cdot}L^{-1}GA_3+1.25mg{\cdot}L^{-1}$ TDZ, and that care is needed to harvest mature fruit because TDZ delays ripening.

Analysis of shopping website visit types and shopping pattern (쇼핑 웹사이트 탐색 유형과 방문 패턴 분석)

  • Choi, Kyungbin;Nam, Kihwan
    • Journal of Intelligence and Information Systems / v.25 no.1 / pp.85-107 / 2019
  • Online consumers either browse products belonging to a particular product line or brand with the intention of purchasing, or simply navigate widely and leave without making a purchase. Research on the behavior and purchases of online consumers has progressed steadily, and related services and applications based on consumer behavior data have been developed in practice. In recent years, customization strategies and recommendation systems have been adopted thanks to the development of big data technology, and attempts are being made to optimize users' shopping experience. Even so, it is very unlikely that online consumers who visit a website will actually move on to the purchase stage. This is because online consumers do not visit a website only to purchase products; they use and browse websites differently according to their shopping motives and purposes. It is therefore important, for understanding the behavior of online consumers, to analyze the various types of visits and not only visits that end in a purchase. In this study, we performed a clustering analysis of sessions based on the clickstream data of an e-commerce company in order to explain the diversity and complexity of online consumers' search behavior and to typify that behavior. For the analysis, we converted more than 8 million page-level data points into visit-level sessions, resulting in a total of over 500,000 website visit sessions. For each visit session, 12 characteristics such as page views, duration, search diversity, and page type concentration were extracted for the clustering analysis. Considering the size of the data set, we performed the analysis using the Mini-Batch K-means algorithm, which has advantages in learning speed and efficiency while maintaining clustering performance similar to that of the standard K-means algorithm.
The optimal number of clusters was found to be four, and the differences in session-level characteristics and purchase rates were identified for each cluster. An online consumer visits a website several times, learns about a product, and then decides whether to purchase it. To analyze the purchasing process across several visits, we constructed visit sequence data for each consumer based on the website navigation patterns derived from the clustering analysis. Each visit sequence covers the series of visits leading up to one purchase, and the items that make up a sequence are the cluster labels derived above. We built separate sequence data for consumers who made purchases and for consumers who only explored products without purchasing during the same period. Sequential pattern mining was then applied to extract frequent patterns from each set of sequence data. The minimum support was set to 10%, and each frequent pattern consists of a sequence of cluster labels. While some patterns are common to both sets of sequence data, other frequent patterns are derived from only one of them. A comparative analysis of the extracted frequent patterns showed that consumers who made purchases tended to visit repeatedly while searching for a specific product before deciding to buy it. The implication of this study is that we analyze the search types of online consumers using large-scale clickstream data and analyze their patterns to explain the purchasing process from a data-driven point of view. Most studies on typologies of online consumers have focused on the characteristics of each type and on the key factors that distinguish the types.
In this study, we carried out an analysis to classify the behavior of online consumers into types, and further analyzed the order in which the types follow one another to form a series of search patterns. In addition, online retailers will be able to improve their purchase conversion through marketing strategies and recommendations tailored to the various types of visits, and to evaluate the effect of a strategy through changes in consumers' visit patterns.
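
The sequential pattern mining step described above can be sketched as counting, for each candidate sequence of cluster labels, the share of visit sequences that contain it in order. The sketch below is a brute-force illustration, not the authors' implementation: the cluster labels 0 to 3, the toy sequences, and the `max_len` cap are assumptions. Candidates are grown only from frequent prefixes, since a pattern cannot be more frequent than its prefix.

```python
def contains(sequence, pattern):
    """True if pattern occurs in sequence in order (gaps allowed)."""
    it = iter(sequence)
    return all(label in it for label in pattern)


def support(pattern, sequences):
    """Fraction of sequences that contain the pattern."""
    return sum(contains(s, pattern) for s in sequences) / len(sequences)


def frequent_patterns(sequences, min_support=0.10, max_len=3):
    """Grow patterns one label at a time, keeping those that meet min_support."""
    labels = sorted({label for s in sequences for label in s})
    frequent = {}
    candidates = [(label,) for label in labels]
    for _ in range(max_len):
        kept = []
        for pattern in candidates:
            sup = support(pattern, sequences)
            if sup >= min_support:
                frequent[pattern] = sup
                kept.append(pattern)
        # Only frequent patterns are extended (Apriori-style pruning).
        candidates = [p + (label,) for p in kept for label in labels]
    return frequent


# Hypothetical visit sequences: each item is the cluster label of one session.
visits = [[0, 1, 1, 2], [0, 2], [3, 1, 2], [1, 1, 2], [0, 3]]
patterns = frequent_patterns(visits, min_support=0.5)
for pattern, sup in sorted(patterns.items()):
    print(pattern, sup)
```

Running the same function separately on purchaser and non-purchaser sequences and comparing the two result sets mirrors the comparison the abstract describes.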

SKU recommender system for retail stores that carry identical brands using collaborative filtering and hybrid filtering (협업 필터링 및 하이브리드 필터링을 이용한 동종 브랜드 판매 매장간(間) 취급 SKU 추천 시스템)

  • Joe, Denis Yongmin;Nam, Kihwan
    • Journal of Intelligence and Information Systems / v.23 no.4 / pp.77-110 / 2017
  • Recently, consumption patterns have rapidly diversified and become more individualized through Internet-based web and mobile devices. As this happens, the efficient operation of the offline store, a traditional distribution channel, has become more important. To raise both sales and profits, stores need to supply and sell the products most attractive to consumers in a timely manner. However, there is a lack of research on which SKUs, out of many products, can increase sales probability and reduce inventory costs. In particular, if a company sells products through multiple stores across multiple locations, recommending SKUs that appeal to customers would help increase the sales and profitability of those stores. In this study, recommender system techniques used for personalized recommendation (collaborative filtering and hybrid filtering) are proposed as a store-level SKU recommendation method for a distribution company that handles a homogeneous brand through multiple sales stores by country and region. We calculated the similarity between stores using the purchase data for the items each store handles, applied collaborative filtering to each store's sales history by SKU, and finally recommended individual SKUs to each store. In addition, the stores were classified into four clusters through PCA (Principal Component Analysis) and cluster analysis using store profile data. The recommendation system was also implemented with a hybrid filtering method that applies collaborative filtering within each cluster, and the performance of both methods was measured on actual sales data. Most existing recommendation systems have been studied for recommending items such as movies and music to users, and industrial applications have also become popular in practice.
Meanwhile, there has been little research on recommending SKUs to individual stores by applying these recommendation systems, which have mainly been studied for personalization services, to the store units of distributors handling similar brands. Whereas existing recommendation methods target the individual, this study extends the scope beyond the individual to the store, covering multiple sales stores by country and region, and proposes a recommendation method for the store units of a distribution company handling SKUs of the same brand. In addition, whereas existing recommendation systems are limited to online use, this study applies data mining techniques to develop an algorithm suited to the store level, extending the range of use to offline settings instead of analyzing individuals only. The significance of the results of this study is that a personalization recommendation algorithm is applied to multiple sales outlets handling the same brand. A meaningful result is derived, and a concrete methodology that actual companies can build and use as a system is proposed. It is also meaningful as the first attempt to extend recommendation system research, which has focused on the personalization domain, to the sales stores of a company handling the same brand. From 05 to 03 in 2014, the number of stores' sales volume of the top 100 SKUs are limited to 52 SKUs by collaborative filtering and the hybrid filtering method SKU recommended. We compared the performance of the two recommendation methods by totaling the sales results.
The two recommendation methods are compared because the method of this study is defined as a reference model in which collaborative filtering is applied offline, in order to demonstrate higher performance than the existing recommendation method. The results of this model are compared with the hybrid filtering method, a model that reflects the characteristics of offline stores. The proposed method showed higher performance than the existing recommendation method, as verified with actual sales data from large Korean apparel companies. In this study, we propose a method that extends the recommendation system from the individual level to the group level and approaches it efficiently. This is of great value in addition to the theoretical framework.
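
The store-level collaborative filtering and the cluster-restricted hybrid variant described above can be sketched as follows. This is a minimal illustration, not the authors' system: the store names, sales figures, cluster assignments, and the use of cosine similarity are all assumptions. Each store is a vector of unit sales per SKU, and a SKU the target store does not carry is scored by the similarity-weighted sales of neighboring stores.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two sales vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def recommend_skus(sales, target, clusters=None, top_n=2):
    """Rank SKUs the target store does not sell by similarity-weighted neighbor sales.

    If clusters is given, only stores in the target's cluster count as neighbors
    (a simplified stand-in for the hybrid filtering idea).
    """
    neighbours = [
        store for store in sales
        if store != target
        and (clusters is None or clusters[store] == clusters[target])
    ]
    scores = [0.0] * len(sales[target])
    for store in neighbours:
        weight = cosine(sales[target], sales[store])
        for sku, units in enumerate(sales[store]):
            scores[sku] += weight * units
    missing = [sku for sku, units in enumerate(sales[target]) if units == 0]
    return sorted(missing, key=lambda sku: scores[sku], reverse=True)[:top_n]


# Hypothetical unit sales per store; list index = SKU id.
sales = {
    "store_A": [10, 0, 5, 0],
    "store_B": [8, 6, 4, 0],
    "store_C": [9, 0, 0, 20],
}
clusters = {"store_A": 0, "store_B": 0, "store_C": 1}  # hypothetical store clusters

plain = recommend_skus(sales, "store_A")
hybrid = recommend_skus(sales, "store_A", clusters=clusters)
print(plain, hybrid)
```

In this toy data the two variants rank the missing SKUs differently, because restricting neighbors to the same cluster removes the influence of a dissimilar store with one very high-selling SKU.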

A Proposal of a Keyword Extraction System for Detecting Social Issues (사회문제 해결형 기술수요 발굴을 위한 키워드 추출 시스템 제안)

  • Jeong, Dami;Kim, Jaeseok;Kim, Gi-Nam;Heo, Jong-Uk;On, Byung-Won;Kang, Mijung
    • Journal of Intelligence and Information Systems / v.19 no.3 / pp.1-23 / 2013
  • To discover significant social issues that urgently need to be solved in modern society, such as unemployment, economic crisis, and social welfare, researchers in the existing approach usually collect opinions from professional experts and scholars through online or offline surveys. However, this method is not always effective. Because of the expense, a large number of survey replies is seldom gathered. In some cases, it is also hard to find professionals who deal with a specific social issue. Thus, the sample set is often small and may be biased. Furthermore, several experts may reach totally different conclusions about a social issue because each expert has a subjective point of view and a different background. In this case, it is considerably hard to figure out what the current social issues are and which of them are really important. To overcome the shortcomings of the current approach, in this paper we develop a prototype system that semi-automatically detects keywords representing social issues and problems from about 1.3 million news articles issued by about 10 major domestic news outlets in Korea from June 2009 until July 2012. Our proposed system consists of (1) collecting news articles and extracting their texts, (2) identifying only the news articles related to social issues, (3) analyzing the lexical items of Korean sentences, (4) finding a set of topics regarding social keywords over time based on probabilistic topic modeling, (5) matching relevant paragraphs to a given topic, and (6) visualizing social keywords for easy understanding. In particular, we propose a novel matching algorithm relying on generative models. The goal of our proposed matching algorithm is to best match paragraphs to each topic.
Technically, using a topic model such as Latent Dirichlet Allocation (LDA), we can obtain a set of topics, each of which has relevant terms and their probability values. In our problem, given a set of text documents (e.g., news articles), LDA produces a set of topic clusters, and each topic cluster is then labeled by human annotators, where each topic label stands for a social keyword. For example, suppose there is a topic (e.g., Topic1 = {(unemployment, 0.4), (layoff, 0.3), (business, 0.3)}) and a human annotator labels Topic1 as "Unemployment Problem". In this example, it is non-trivial to understand what happened to the unemployment problem in our society. In other words, by looking only at social keywords, we have no idea of the detailed events occurring in our society. To tackle this matter, we develop a matching algorithm that computes the probability value of a paragraph given a topic, relying on (i) topic terms and (ii) their probability values. For instance, given a set of text documents, we segment each text document into paragraphs. In the meantime, using LDA, we extract a set of topics from the text documents. Based on our matching process, each paragraph is assigned to the topic it best matches. As a result, each topic has several best-matched paragraphs. Suppose, for example, that there is a topic (e.g., Unemployment Problem) and its best-matched paragraph (e.g., Up to 300 workers lost their jobs in XXX company at Seoul). In this case, we can grasp the detailed information behind the social keyword, such as "300 workers", "unemployment", "XXX company", and "Seoul". In addition, our system visualizes social keywords over time. Therefore, through our matching process and keyword visualization, most researchers will be able to detect social issues easily and quickly.
Through this prototype system, we have detected various social issues appearing in our society, and our experimental results show the effectiveness of the proposed methods. Note that our proof-of-concept system is available at http://dslab.snu.ac.kr/demo.html.
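
The paragraph-to-topic matching described above scores a paragraph against each topic using the topic's terms and their probabilities. The abstract does not give the scoring rule, so the sketch below assumes a simple one: the mean log-probability of the paragraph's tokens under the topic, with a small floor for terms the topic does not list. Topic1 is taken from the abstract; Topic2, the floor value, and the example paragraph are hypothetical.

```python
from math import log


def paragraph_score(tokens, topic, floor=1e-6):
    """Mean log-probability of tokens under a topic's term distribution.

    Terms missing from the topic get a small floor probability (an assumption).
    """
    return sum(log(topic.get(token, floor)) for token in tokens) / len(tokens)


def best_topic(tokens, topics):
    """Name of the topic under which the paragraph scores highest."""
    return max(topics, key=lambda name: paragraph_score(tokens, topics[name]))


topics = {
    # Topic1 as given in the abstract.
    "Topic1": {"unemployment": 0.4, "layoff": 0.3, "business": 0.3},
    # Topic2 is hypothetical, added only so there is something to compare against.
    "Topic2": {"welfare": 0.5, "pension": 0.3, "care": 0.2},
}

paragraph = "up to 300 workers lost their jobs in a layoff as unemployment rose".split()
match = best_topic(paragraph, topics)
print(match)
```

Inverting the assignment, that is, collecting for each topic the paragraphs that chose it and sorting them by score, gives the "best-matched paragraphs" per topic mentioned in the abstract.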

Clustering Method based on Genre Interest for Cold-Start Problem in Movie Recommendation (영화 추천 시스템의 초기 사용자 문제를 위한 장르 선호 기반의 클러스터링 기법)

  • You, Tithrottanak;Rosli, Ahmad Nurzid;Ha, Inay;Jo, Geun-Sik
    • Journal of Intelligence and Information Systems / v.19 no.1 / pp.57-77 / 2013
  • Social media has become one of the most popular media in web and mobile applications. In 2011, social networks and blogs were still the top destination of online users, according to a study from Nielsen Company. In that study, nearly 4 in 5 active users visited social networks and blogs. Social network and blog sites rule Americans' Internet time, accounting for 23 percent of time spent online. Facebook is the social network on which U.S. internet users spend more time than on other services such as Yahoo, Google, AOL Media Network, Twitter, LinkedIn and so on. As a recent trend, most companies promote their products on Facebook by creating a "Facebook Page" that refers to a specific product. The "Like" option allows users to subscribe to a page and receive updates on what interests them. Film makers, who produce many films around the world, also market and promote their films by exploiting the advantages of the "Facebook Page". In addition, a great number of streaming service providers allow users to subscribe to their service to watch and enjoy movies and TV programs. Users can instantly watch movies and TV programs over the internet on PCs, Macs and TVs. Netflix alone, as the world's leading subscription service, has more than 30 million streaming members in the United States, Latin America, the United Kingdom and the Nordics. As a matter of fact, a million movies and TV programs of different genres are offered to subscribers. As a consequence, users need to spend a lot of time finding the right movies related to the genres that interest them. In recent years, many researchers have proposed methods to improve the prediction of ratings or preferences, so as to offer the most relevant items, such as books, music or movies, to the target user or to a group of users with the same interest in particular items.
One of the most popular methods for building a recommendation system is traditional Collaborative Filtering (CF). The method computes the similarity between the target user and other users, who are then clustered by shared interest according to the items they have rated. The method then predicts other items from the same group of users to recommend to that group. There are many kinds of items that can be suggested to users, such as books, music, movies, news, videos and so on; in this paper, however, we focus only on movies as the items to recommend. There are also many challenges for the CF task. The first is the "sparsity problem", which occurs when there is not enough user preference information. Recommendation accuracy is then lower than when the neighbors have provided a large number of ratings. The second is the "cold-start problem", which occurs whenever new users or items are added to the system, each with no ratings or only a few ratings. For instance, no personalized predictions can be made for a new user without any ratings on record. In this research, we propose a clustering method based on users' genre interests extracted from a social network service (SNS) and on users' movie rating information to solve the "cold-start problem." Our proposed method clusters the target user together with other users by combining user genre interest and rating information. The Facebook Graph holds a huge amount of interesting and useful user information, and we extract information from the "Facebook Page" entries that users have liked. Moreover, we use the Internet Movie Database (IMDb) as the main dataset. IMDb is an online database that contains a large amount of information related to movies, TV programs and actors.
This dataset is used not only to provide movie information in our Movie Rating System, but also as a resource for the movie genre information associated with the "Facebook Page" entries. First, users must log in to the Movie Rating System with their Facebook account; at the same time, our system collects their genre interests from the "Facebook Page" entries. We conducted many experiments to see how our method performs in comparison with other methods. First, we evaluated our proposed method in the normal recommendation case to see how our system improves the recommendation result. Then we evaluated the method in the cold-start case. Our experiments show that our method outperforms the other methods and produces better results in both cases.
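
The cold-start idea above, placing a new user into a cluster by genre interest when no ratings exist yet, can be sketched as follows. This is a simplified illustration under stated assumptions, not the paper's method: the genre list, cluster centroids, ratings, and the use of cosine similarity and a cluster-mean prediction are all hypothetical.

```python
from math import sqrt

GENRES = ["action", "comedy", "drama"]  # hypothetical genre vocabulary


def genre_vector(liked_genres):
    """Count how often each genre appears among the pages a user has liked."""
    return [liked_genres.count(genre) for genre in GENRES]


def cosine(u, v):
    """Cosine similarity between two genre vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def nearest_cluster(user_vec, centroids):
    """Cluster whose centroid is most similar to the user's genre vector."""
    return max(centroids, key=lambda name: cosine(user_vec, centroids[name]))


def predict_rating(movie, ratings_in_cluster):
    """Mean rating the cluster's members gave the movie, or None if unrated."""
    ratings = ratings_in_cluster.get(movie, [])
    return sum(ratings) / len(ratings) if ratings else None


# Hypothetical clusters built from existing users' genre interests and ratings.
centroids = {"action_fans": [0.8, 0.1, 0.1], "drama_fans": [0.1, 0.2, 0.7]}
cluster_ratings = {
    "action_fans": {"Movie X": [5, 4, 4]},
    "drama_fans": {"Movie X": [2, 3]},
}

# A new user with no ratings, described only by the genres of liked pages.
new_user = genre_vector(["action", "action", "comedy"])
cluster = nearest_cluster(new_user, centroids)
prediction = predict_rating("Movie X", cluster_ratings[cluster])
print(cluster, prediction)
```

Because the new user is matched on genre interest alone, a prediction is available before any rating has been recorded, which is the gap that rating-only collaborative filtering cannot fill.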