• Title/Summary/Keyword: Similarity Criterion

A report of 43 unrecorded bacterial species within the phyla Bacteroidetes and Firmicutes isolated from various sources from Korea in 2019

  • Kang, Heeyoung;Kim, Haneul;Yi, Hana;Kim, Wonyong;Yoon, Jung-Hoon;Im, Wan-Taek;Kim, Myung Kyum;Seong, Chi Nam;Kim, Seung Bum;Cha, Chang-Jun;Jeon, Che Ok;Joh, Kiseong
    • Journal of Species Research / v.10 no.2 / pp.117-133 / 2021
  • In 2019, 43 bacterial strains were isolated from food, soil, marine environments, and human- and animal-related sources in the Republic of Korea. Based on 16S rRNA gene sequence analysis, these isolates were assigned to the phyla Bacteroidetes and Firmicutes as species unrecorded in Korea. The 10 Bacteroidetes strains were classified into the families Bacteroidaceae, Chitinophagaceae, Cytophagaceae, Flavobacteriaceae, and Prolixibacteraceae (of the orders Bacteroidales, Chitinophagales, Cytophagales, Flavobacteriales, and Marinilabiliales, respectively). The 33 Firmicutes strains belonged to the families Bacillaceae, Paenibacillaceae, Planococcaceae, Staphylococcaceae, Clostridiaceae, Lachnospiraceae, Peptostreptococcaceae, Enterococcaceae, Lactobacillaceae, Leuconostocaceae, and Streptococcaceae (of the orders Bacillales, Clostridiales, and Lactobacillales). The isolates were identified as unrecorded species based on the standard taxonomic criterion of >98.7% 16S rRNA gene sequence similarity. In addition, their phylogenetic affiliation, cell and colony morphologies, staining reactions, and physiological and biochemical properties were investigated. We therefore report these 43 isolates as unrecorded species and describe their basic features, isolation sources, and locations.
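
The >98.7% criterion flags an isolate as belonging to an already-described species (here, one previously unrecorded in Korea): if an isolate's 16S rRNA gene sequence is more than 98.7% identical to a type strain's, it is assigned to that species. A minimal sketch of that check, assuming pre-aligned sequences; real pipelines use alignment and identification tools such as BLAST or EzBioCloud, and the sequences below are toy data.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over positions where neither aligned sequence has a gap."""
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != '-' and b != '-']
    matches = sum(a == b for a, b in pairs)
    return 100.0 * matches / len(pairs)

THRESHOLD = 98.7  # species-level 16S rRNA similarity criterion

# Toy aligned fragments (hypothetical); real 16S sequences are ~1,500 bp.
isolate = "ACGTACGTACGTACGTACGT"
type_strain = "ACGTACGTACGTACGTACGA"

sim = percent_identity(isolate, type_strain)
verdict = "same species as type strain" if sim > THRESHOLD else "candidate novel taxon"
print(f"{sim:.1f}% identity -> {verdict}")
```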

Personalized Recommendation System for IPTV using Ontology and K-medoids (IPTV환경에서 온톨로지와 k-medoids기법을 이용한 개인화 시스템)

  • Yun, Byeong-Dae;Kim, Jong-Woo;Cho, Yong-Seok;Kang, Sang-Gil
    • Journal of Intelligence and Information Systems / v.16 no.3 / pp.147-161 / 2010
  • As broadcasting and communication converge, television and the internet have merged, bringing many changes to TV viewing. IPTV (Internet Protocol Television) delivers live broadcasts, movie content, and information services together with VOD (Video on Demand) over the internet, and has become a new business built on communication networks. It has also raised new technical issues: imaging technology for the service, networking without video interruption, security technologies to protect copyright, and so on. Through the IPTV network, users can watch the programs they want whenever they want. However, finding those programs is difficult, whether by menu navigation or by search: menu navigation takes considerable time to reach a desired program; search fails when the title, genre, or actors' names are unknown; and entering text with a remote control is cumbersome. A bigger problem is that users are often unaware of the services available to them. To resolve these difficulties in selecting VOD services on IPTV, a personalized recommendation service is needed, one that improves user satisfaction and saves time. This paper proposes a filtering and recommendation system that presents programs suited to each individual, addressing IPTV's shortcomings. The proposed system collects TV program information and each user's IPTV viewing records: preferred genres and sub-genres, channels, watched programs, and viewing times. Similarity between programs is computed using a TV program ontology, since ontology-based comparison allows the distance between programs to be measured. The ontology is extracted from TV-Anytime metadata, which captures semantic structure, and expresses program contents and features numerically. Vocabulary similarity is determined through WordNet: every word describing a program is expanded into its upper and lower classes (hypernyms and hyponyms) for word-similarity computation, and the average similarity of the descriptive keywords is measured. Using the resulting distances, similar programs are grouped by the K-medoids partitioning method, which divides objects into clusters with similar characteristics. K-medoids selects K representative objects (medoids); each remaining object is assigned to the nearest medoid, and the medoids are iteratively refined until the optimal representatives are found, so that similar programs are clustered together. Recommendations are then weighted using the cluster analysis: within each cluster, programs near the medoid are recommended to users, with the same distance measure providing the base score that ranks the recommended programs. A cluster weight is also applied, proportional to the number of programs in the user's watch list that fall in the cluster; clusters containing more watched programs receive higher weight. From this, representative programs of each cluster are selected and ranked. Because cluster-representative programs alone include errors, weights reflecting the user's program-viewing preference are added to determine the final ranks, and content is recommended accordingly. Experiments carried out in a controlled environment show the superiority of the proposed method over existing approaches.
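
A minimal sketch of the K-medoids partitioning step described above, assuming the ontology/WordNet-based program-to-program distances have already been computed; the distance matrix below is toy data, not the paper's.

```python
import numpy as np

def k_medoids(dist: np.ndarray, k: int, n_iter: int = 100, seed: int = 0):
    """Cluster items given a pairwise distance matrix; return medoid indices and labels."""
    rng = np.random.default_rng(seed)
    n = dist.shape[0]
    medoids = rng.choice(n, size=k, replace=False)
    for _ in range(n_iter):
        labels = np.argmin(dist[:, medoids], axis=1)   # assign each item to nearest medoid
        new_medoids = medoids.copy()
        for c in range(k):
            members = np.where(labels == c)[0]
            if members.size:                            # re-pick the medoid as the member
                costs = dist[np.ix_(members, members)].sum(axis=1)  # minimizing total
                new_medoids[c] = members[np.argmin(costs)]          # within-cluster distance
        if np.array_equal(new_medoids, medoids):
            break
        medoids = new_medoids
    labels = np.argmin(dist[:, medoids], axis=1)        # final assignment
    return medoids, labels

# Toy symmetric distance matrix for 5 "programs" (hypothetical values).
D = np.array([[0, 1, 6, 7, 2],
              [1, 0, 5, 6, 3],
              [6, 5, 0, 1, 7],
              [7, 6, 1, 0, 8],
              [2, 3, 7, 8, 0]], dtype=float)
medoids, labels = k_medoids(D, k=2)
print(medoids, labels)
```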

Disaster Recovery Priority Decision for Credit Bureau Business Information System: Fuzzy-TOPSIS Approach (신용조회업무 정보시스템의 재난복구 우선순위결정: 퍼지 TOPSIS 접근방법)

  • Yang, Dong-Gu;Kim, Ki-Yoon
    • Management & Information Systems Review / v.35 no.3 / pp.173-193 / 2016
  • The aim of this paper is to extend TOPSIS (Technique for Order Preference by Similarity to Ideal Solution) to the fuzzy environment to solve the disaster recovery priority decision problem for credit bureau business information systems. The rating of each information system and the weight of each criterion are described by linguistic terms expressed as trapezoidal fuzzy numbers, and a vertex method is proposed to calculate the distance between two trapezoidal fuzzy numbers. Following the TOPSIS concept, a closeness coefficient is defined to determine the ranking order of all information systems. Combining fuzzy sets with TOPSIS brings several benefits compared with other approaches: fuzzy TOPSIS requires few fuzzy judgments for parameterization, which makes the decision process agile; it does not limit the number of alternatives evaluated simultaneously; and it does not suffer from the rank reversal problem when a new alternative is added to the evaluation. The method is demonstrated with a real case study of a credit rating agency involving 9 evaluation criteria and 9 credit bureau business information systems assessed by 6 evaluators, providing practitioners with a systematic disaster recovery framework for BCP (Business Continuity Planning). Finally, the paper shows that the proposed fuzzy TOPSIS procedure is well suited as a decision-making tool for the disaster recovery priority decision problem in credit bureau business information systems.
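
For reference, the vertex method named in this abstract is commonly written as follows for trapezoidal fuzzy numbers; the abstract does not spell out the formula, so this follows the standard fuzzy TOPSIS formulation.

```latex
% Vertex distance between trapezoidal fuzzy numbers
% \tilde{a} = (a_1, a_2, a_3, a_4) and \tilde{b} = (b_1, b_2, b_3, b_4):
d(\tilde{a}, \tilde{b}) =
\sqrt{\frac{1}{4}\Big[(a_1-b_1)^2 + (a_2-b_2)^2 + (a_3-b_3)^2 + (a_4-b_4)^2\Big]}
```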

Implementation of Policy based In-depth Searching for Identical Entities and Cleansing System in LOD Cloud (LOD 클라우드에서의 연결정책 기반 동일개체 심층검색 및 정제 시스템 구현)

  • Kim, Kwangmin;Sohn, Yonglak
    • Journal of Internet Computing and Services / v.19 no.3 / pp.67-77 / 2018
  • This paper suggests that each LOD establish its own link policy and publish it to the LOD cloud to support identity resolution among entities in different LODs. For specifying link policies, we also propose a vocabulary set based on the RDF model. We implemented a Policy-based In-depth Searching and Cleansing (PISC) system that performs in-depth searches across LODs by referencing the link policies; PISC has been published on GitHub. Since LODs participate in the LOD cloud voluntarily, the degree of entity identity needs to be evaluated. PISC therefore evaluates identity and cleanses the retrieved entities, retaining only those that exceed the user's criterion for entity identity level. As search results, PISC provides detailed entity content collected from diverse LODs, together with an ontology customized to that content. PISC was simulated on five DBpedia LODs. We found that an object similarity of 0.9 between source and target RDF triples provided an appropriate expansion ratio and inclusion ratio in the search results. For sufficient identity of retrieved entities, three or more target LODs should be specified in the link policy.
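
A minimal sketch of the cleansing criterion described above: candidate identical entities are kept only when the similarity of the source and target triples' objects meets the user's threshold (0.9 worked well in the DBpedia simulation). The string-similarity function (difflib's ratio) is a stand-in, since the abstract does not fix a specific measure; the pairs are toy data.

```python
from difflib import SequenceMatcher

def object_similarity(a: str, b: str) -> float:
    """Similarity of two RDF triple objects rendered as strings (stand-in measure)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def cleanse(candidates: list[tuple[str, str]], criterion: float = 0.9):
    """Keep (source_object, target_object) pairs meeting the identity criterion."""
    return [(s, t) for s, t in candidates if object_similarity(s, t) >= criterion]

pairs = [("Korea, Republic of", "korea, republic of"),  # passes: identical up to case
         ("Seoul", "Busan")]                            # fails the 0.9 criterion
print(cleanse(pairs))
```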

Group Decision Making for New Professor Selection Using Fuzzy TOPSIS (퍼지 TOPSIS를 이용한 신임교수선택을 위한 집단의사결정)

  • Kim, Ki-Yoon;Yang, Dong-Gu
    • Journal of Digital Convergence / v.14 no.9 / pp.229-239 / 2016
  • The aim of this paper is to extend TOPSIS (Technique for Order Preference by Similarity to Ideal Solution) to the fuzzy environment to solve the new professor selection problem in a university. To achieve this goal, the rating of each candidate and the weight of each criterion are described by linguistic terms expressed as trapezoidal fuzzy numbers, and a vertex method is proposed to calculate the distance between two trapezoidal fuzzy numbers. Following the TOPSIS concept, a closeness coefficient is defined to determine the ranking order of all candidates. This research derived: 1) four evaluation criteria for new professor selection (research results, education and research competency, personality, and major suitability); 2) a five-step procedure for the proposed fuzzy TOPSIS group decision method; and 3) the priorities of the four candidates in the new professor selection case. The results will be useful to practitioners interested in analyzing fuzzy data and applying multi-criteria decision-making tools to personnel selection problems in human resource management. Finally, the theoretical and practical implications of the findings are discussed and directions for future research are suggested.
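
For reference, the closeness coefficient named in this abstract is commonly defined as follows; the abstract gives the concept but not the formula, so this follows the standard fuzzy TOPSIS form.

```latex
% Distances of alternative A_i to the fuzzy positive ideal solution (FPIS, A^*)
% and the fuzzy negative ideal solution (FNIS, A^-), summed over criteria j:
d_i^* = \sum_j d(\tilde{v}_{ij}, \tilde{v}_j^*), \qquad
d_i^- = \sum_j d(\tilde{v}_{ij}, \tilde{v}_j^-)

% Closeness coefficient; alternatives with CC_i closer to 1 rank higher:
CC_i = \frac{d_i^-}{d_i^* + d_i^-}
```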

A study on extraction of optimized API sequence length and combination for efficient malware classification (효율적인 악성코드 분류를 위한 최적의 API 시퀀스 길이 및 조합 도출에 관한 연구)

  • Choi, Ji-Yeon;Kim, HeeSeok;Kim, Kyu-Il;Park, Hark-Soo;Song, Jung-Suk
    • Journal of the Korea Institute of Information Security & Cryptology / v.24 no.5 / pp.897-909 / 2014
  • With the development of the Internet, the number of cyber threats is continuously increasing, and attack techniques keep evolving to target critical systems. Since attackers can easily produce exploit code, i.e., malware, using dedicated generation tools, the number of malware samples is growing rapidly, and it is infeasible to analyze every sample. For this reason, many researchers have proposed malware classification methods that aim to identify unforeseen malware from its similarity to well-known malware. Existing classification methods use information obtained from static and dynamic malware analysis as the criterion for calculating the similarity between malware samples. Most of them use API functions and their call sequences, divided into segments of a fixed length, so the accuracy of malware classification depends heavily on the chosen length of the divided API sequences. In this paper, we propose a method for extracting the optimized API sequence length and combination, which can be used to improve the performance of malware classification.
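
A minimal sketch of the fixed-length API-sequence (n-gram) features on which such similarity calculations rest, with a simple Jaccard score standing in for the paper's similarity measure; sweeping n as below is one simple way to probe the sequence-length sensitivity the abstract describes. The API call lists are toy data.

```python
def api_ngrams(calls: list[str], n: int) -> set[tuple[str, ...]]:
    """Split an API call trace into the set of its length-n subsequences."""
    return {tuple(calls[i:i + n]) for i in range(len(calls) - n + 1)}

def jaccard(a: set, b: set) -> float:
    """Set-overlap similarity between two n-gram sets."""
    return len(a & b) / len(a | b) if a | b else 0.0

sample1 = ["CreateFileW", "WriteFile", "RegSetValueExW", "CloseHandle"]
sample2 = ["CreateFileW", "WriteFile", "CloseHandle"]

for n in (2, 3):  # candidate sequence lengths
    sim = jaccard(api_ngrams(sample1, n), api_ngrams(sample2, n))
    print(f"n={n}: similarity={sim:.2f}")
```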

Methodology of Prior Art Search Based on Hierarchical Citation Analysis (계층적 인용관계분석을 통한 선행기술 탐색방법론)

  • Kang, Jiho;Kim, Jongchan;Lee, Joonhyuck;Park, Sangsung;Jang, Dongsik
    • Journal of the Korean Institute of Intelligent Systems / v.27 no.1 / pp.72-78 / 2017
  • Prior art search is a core process of technology management performed by inventors and applicants, patent examiners, and employees in the patent industry. Because academic research on systematic prior art search methodology has been insufficient, the process has often depended on the subjective judgment of researchers. Previous studies on exploring prior art based on semantics also risk underestimating the similarity of major prior art, owing to the nature of patent documents, in which the same technical idea is expressed in varying terms. In this study, we propose an effective prior art search methodology based on hierarchical citation analysis, which provides a clear criterion for selecting core prior art by calculating weights according to the relative importance of the collected patents. To verify the feasibility of the proposed methodology, a case study was conducted to explore the core prior art of one patent in the display field. As a result, 10 core prior art candidates were selected out of 206 precedent patents.
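
A minimal sketch of one way to realize hierarchical citation weighting: patents cited directly by the target (level 1) contribute more weight than citations-of-citations (level 2), and so on. The decay factor and scoring rule here are illustrative assumptions; the abstract does not specify the paper's exact weighting scheme, and the patent IDs are hypothetical.

```python
from collections import defaultdict

def hierarchical_weights(citations: dict[str, list[str]], target: str,
                         max_level: int = 3, decay: float = 0.5) -> dict[str, float]:
    """citations maps a patent ID to the patents it cites (backward citations)."""
    weights: dict[str, float] = defaultdict(float)
    frontier = [target]
    for level in range(1, max_level + 1):
        next_frontier = []
        for patent in frontier:
            for cited in citations.get(patent, []):
                weights[cited] += decay ** (level - 1)  # deeper levels weigh less
                next_frontier.append(cited)
        frontier = next_frontier
    return dict(weights)

# Toy citation graph: P0 is the target patent under examination.
graph = {"P0": ["P1", "P2"], "P1": ["P3"], "P2": ["P3", "P4"]}
ranked = sorted(hierarchical_weights(graph, "P0").items(), key=lambda kv: -kv[1])
print(ranked)  # P3 accumulates weight via two level-2 paths
```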

A Desirability Function-Based Multi-Characteristic Robust Design Optimization Technique (호감도 함수 기반 다특성 강건설계 최적화 기법)

  • Jong Pil Park;Jae Hun Jo;Yoon Eui Nahm
    • Journal of Korean Society of Industrial and Systems Engineering / v.46 no.4 / pp.199-208 / 2023
  • The Taguchi method is one of the most popular approaches for design optimization, making performance characteristics robust to uncontrollable noise variables. However, most previous Taguchi applications have addressed single-characteristic problems, whereas problems with multiple characteristics are more common in practice. The multi-criteria decision making (MCDM) problem is to select the optimal alternative among several by integrating a number of criteria that may conflict with one another. Representative MCDM methods include TOPSIS (Technique for Order of Preference by Similarity to Ideal Solution), GRA (Grey Relational Analysis), PCA (Principal Component Analysis), fuzzy logic systems, and so on. Accordingly, numerous approaches have addressed the multi-characteristic design problem by combining the original Taguchi method with MCDM methods. In MCDM problems, the criteria generally have different measurement units, so their physical values may differ widely, which makes it difficult to integrate the measurements across criteria. A normalization technique is therefore usually applied to convert the criteria to a common scale. Four normalization techniques are commonly used in MCDM problems: vector normalization and the linear scale transformations (max-min, max, or sum). However, these normalization techniques have several shortcomings and do not adequately reflect practical concerns. For example, if an alternative has the maximum data value for a criterion, the original process treats it as the solution; but if that maximum value does not satisfy the degree of fulfillment required by the designer or customer, the alternative should not be considered the solution. To address this problem, this paper employs the desirability function proposed in our previous research, which uses upper and lower limits in the normalization process; the threshold points establishing those limits encode the degree of fulfillment required by the designer or customer. This paper proposes a new design optimization technique for the multi-characteristic design problem by integrating the Taguchi method with our desirability functions. The proposed technique obtains an optimal solution that is robust across multiple performance characteristics.
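
The desirability function referenced above is not written out in the abstract; a common one-sided (larger-the-better) form, with the upper and lower limits playing the threshold roles described, is the classical Derringer-Suich formulation (the paper's own variant may differ in detail).

```latex
% One-sided (larger-the-better) desirability with lower limit y^{min} and
% upper limit y^{max}; the shape exponent r and the limits are design choices
% set by the designer's or customer's required degree of fulfillment.
d(y) =
\begin{cases}
0, & y \le y^{\min} \\[4pt]
\left(\dfrac{y - y^{\min}}{y^{\max} - y^{\min}}\right)^{r}, & y^{\min} < y < y^{\max} \\[8pt]
1, & y \ge y^{\max}
\end{cases}
\qquad
D = \left(\prod_{i=1}^{m} d_i\right)^{1/m}
% D: overall desirability aggregating the m characteristics.
```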

A Groundwater Potential Map for the Nakdonggang River Basin (낙동강권역의 지하수 산출 유망도 평가)

  • Soonyoung Yu;Jaehoon Jung;Jize Piao;Hee Sun Moon;Heejun Suk;Yongcheol Kim;Dong-Chan Koh;Kyung-Seok Ko;Hyoung-Chan Kim;Sang-Ho Moon;Jehyun Shin;Byoung Ohan Shim;Hanna Choi;Kyoochul Ha
    • Journal of Soil and Groundwater Environment / v.28 no.6 / pp.71-89 / 2023
  • A groundwater potential map (GPM) was built for the Nakdonggang River Basin based on ten variables: hydrogeologic unit, fault-line density, depth to groundwater, distance to surface water, lineament density, slope, stream drainage density, soil drainage, land cover, and annual rainfall. To integrate the thematic layers into the GPM, the criteria were first weighted using the Analytic Hierarchy Process (AHP) and then overlaid using the Technique for Order of Preference by Similarity to Ideal Solution (TOPSIS) model. Groundwater potential was then categorized into five classes (very high (VH), high (H), moderate (M), low (L), very low (VL)) and verified by examining the specific capacity of individual wells in each class. Wells in areas categorized as VH showed the highest median specific capacity (5.2 m³/day/m), while wells with specific capacity < 1.39 m³/day/m were distributed in areas categorized as L or VL. The accuracy of the GPM appears acceptable, although the specific capacity data were insufficient to fully verify it over such a large watershed. To create GPMs suitable for locating high-yield wells, the resolution and reliability of the thematic maps should be improved, and criterion values for groundwater potential should be established when machine learning or statistical models are used in the GPM evaluation process.
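
A minimal sketch of the AHP weighting step named above: criterion weights derived from a pairwise comparison matrix by the geometric-mean approximation to the principal eigenvector. The 3x3 matrix is a toy stand-in (the study weighted ten thematic layers before the TOPSIS overlay), and the comparison values are hypothetical.

```python
import numpy as np

def ahp_weights(pairwise: np.ndarray) -> np.ndarray:
    """Row geometric means of a reciprocal comparison matrix, normalized to sum to 1."""
    gm = np.prod(pairwise, axis=1) ** (1.0 / pairwise.shape[1])
    return gm / gm.sum()

# Toy comparisons among three criteria on the Saaty 1-9 scale (reciprocal matrix).
A = np.array([[1.0, 3.0, 5.0],
              [1/3, 1.0, 2.0],
              [1/5, 1/2, 1.0]])
print(ahp_weights(A))  # approx. [0.65, 0.23, 0.12]
```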

A Study on Spatial Pattern of Impact Area of Intersection Using Digital Tachograph Data and Traffic Assignment Model (차량 운행기록정보와 통행배정 모형을 이용한 교차로 영향권의 공간적 패턴에 관한 연구)

  • PARK, Seungjun;HONG, Kiman;KIM, Taegyun;SEO, Hyeon;CHO, Joong Rae;HONG, Young Suk
    • Journal of Korean Society of Transportation / v.36 no.2 / pp.155-168 / 2018
  • In this study, we studied the directional patterns of vehicles entering an intersection from its upstream links, as a precursor to predicting short-term (e.g., 5- or 10-minute) directional traffic volumes on interrupted flow, and examined whether a traffic assignment model can support such prediction. The analysis investigates pattern similarity by performing cluster analysis on the ratios of traffic volume by intersection direction, aggregated in two-hour intervals, using one week of taxi DTG (Digital Tachograph) data. To link with the traffic assignment model's results, the study also compares the 5-minute and 10-minute impact areas from the intersection center against the taxi DTG analysis results; for this purpose, we developed an algorithm that sets the impact area of an intersection using the taxi DTG data and the traffic assignment model. The taxi intersection-entry patterns grouped into 12 clusters, with a Cubic Clustering Criterion (a measure of clustering validity) of 6.92. Correlation analysis against the traffic assignment model's impact area yielded a significant correlation coefficient of 0.86 for the 5-minute impact area; the coefficient dropped to 0.69 for the 10-minute impact area, which was attributed to insufficient accuracy of the O/D (Origin/Destination) travel and network data. If the accuracy of the traffic network and of time-of-day O/D volumes improves, traffic volumes computed from the traffic assignment model are expected to become usable for controlling traffic signals at intersections.
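
A minimal sketch of the pattern-similarity step: each observation is the vector of directional entry ratios (left/through/right) for one upstream link in one two-hour window, and similar patterns are grouped by clustering. KMeans stands in for the paper's cluster analysis (which reported a Cubic Clustering Criterion of 6.92 for 12 groups); the counts below are toy data.

```python
import numpy as np
from sklearn.cluster import KMeans

# Rows: (link, 2-hour window); columns: volumes turning left/through/right.
counts = np.array([[30, 60, 10],
                   [28, 64,  8],
                   [ 5, 20, 75],
                   [ 8, 18, 74]], dtype=float)
ratios = counts / counts.sum(axis=1, keepdims=True)  # directional entry ratios

labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(ratios)
print(labels)  # e.g. [0 0 1 1]: two distinct entry patterns
```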