• Title/Summary/Keyword: string algorithms


δ-approximate Periods and γ-approximate Periods of Strings over Integer Alphabets (정수문자집합에 대한 문자열의 δ-근사주기와 γ-근사주기)

  • Kim, Youngho;Sim, Jeong Seop
    • Journal of KIISE
    • /
    • v.43 no.10
    • /
    • pp.1073-1078
    • /
    • 2016
  • (${\delta}$, ${\gamma}$)-matching for strings over integer alphabets can be applied to fields such as musical melodies and share prices on stock markets. In this paper, we define ${\delta}$-approximate periods and ${\gamma}$-approximate periods of strings over integer alphabets. We also present two $O(n^2)$-time algorithms, one that finds minimum ${\delta}$-approximate periods and one that finds minimum ${\gamma}$-approximate periods. We then report experimental execution times for both algorithms.
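
The abstract above builds on (${\delta}$, ${\gamma}$)-matching without restating it. As a minimal sketch, assuming the definitions commonly used in the approximate-matching literature (two equal-length integer strings ${\delta}$-match when every position differs by at most ${\delta}$, and ${\gamma}$-match when the differences sum to at most ${\gamma}$), the two predicates could look like this. The function names are hypothetical, and this does not reproduce the paper's $O(n^2)$ period-finding algorithms.

```python
from typing import Sequence


def delta_match(x: Sequence[int], y: Sequence[int], delta: int) -> bool:
    """Assumed definition: equal length and max |x[i] - y[i]| <= delta."""
    return len(x) == len(y) and all(abs(a - b) <= delta for a, b in zip(x, y))


def gamma_match(x: Sequence[int], y: Sequence[int], gamma: int) -> bool:
    """Assumed definition: equal length and sum |x[i] - y[i]| <= gamma."""
    return len(x) == len(y) and sum(abs(a - b) for a, b in zip(x, y)) <= gamma


# Two short pitch sequences that differ by at most 1 per position.
melody_a = [60, 62, 64]
melody_b = [61, 61, 64]
print(delta_match(melody_a, melody_b, 1))  # every position differs by <= 1
print(gamma_match(melody_a, melody_b, 1))  # total difference is 2, so this fails
```

The ${\delta}$ bound limits the error at each position, while the ${\gamma}$ bound limits the accumulated error over the whole string.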

Effective Scheme for File Search Engine in Mobile Environments (모바일 환경에서 파일 검색 엔진을 위한 효과적인 방식)

  • Cho, Jong-Keun;Ha, Sang-Eun
    • The Journal of the Korea Contents Association
    • /
    • v.8 no.11
    • /
    • pp.41-48
    • /
    • 2008
  • This study models a file search engine and proposes a modified file search scheme based on weight values derived from file contents, in order to improve search accuracy and matching time. Most file search engines have used string matching algorithms such as KMP (Knuth-Morris-Pratt), which may limit portability and search speed. Moreover, such algorithms do not always find exactly the files the user wants. Hence, a file search engine based on weight values derived from file contents is proposed here to optimize performance in mobile environments. A comparison with previous research shows that the proposed scheme performs better.
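
For reference, the KMP baseline named in this abstract can be sketched as follows. This is a textbook version of Knuth-Morris-Pratt exact matching, not the weight-based scheme the paper proposes; the function names are illustrative.

```python
def build_failure(pattern: str) -> list[int]:
    """fail[i] = length of the longest proper prefix of pattern[:i+1]
    that is also a suffix of it."""
    fail = [0] * len(pattern)
    k = 0
    for i in range(1, len(pattern)):
        while k > 0 and pattern[i] != pattern[k]:
            k = fail[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        fail[i] = k
    return fail


def kmp_search(text: str, pattern: str) -> list[int]:
    """Return the start index of every occurrence of pattern in text."""
    if not pattern:
        return []
    fail = build_failure(pattern)
    hits = []
    k = 0  # number of pattern characters currently matched
    for i, ch in enumerate(text):
        while k > 0 and ch != pattern[k]:
            k = fail[k - 1]  # fall back without re-reading the text
        if ch == pattern[k]:
            k += 1
        if k == len(pattern):
            hits.append(i - k + 1)
            k = fail[k - 1]  # keep going to find overlapping matches
    return hits


print(kmp_search("abababca", "aba"))
```

KMP reports only exact occurrences, which is consistent with the abstract's point that plain string matching cannot rank files by how relevant their contents are.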

TG-SPSR: A Systematic Targeted Password Attacking Model

  • Zhang, Mengli;Zhang, Qihui;Liu, Wenfen;Hu, Xuexian;Wei, Jianghong
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.13 no.5
    • /
    • pp.2674-2697
    • /
    • 2019
  • Identity authentication is a crucial line of defense for network security, and passwords are still the mainstream means of identity authentication. Trawling password attacks have been studied extensively, but research that exploits personal information has been sporadic. Probabilistic context-free grammar (PCFG) and Markov chain-based models perform very well in trawling guessing. In this paper we propose a systematic targeted attacking model based on structure partition and string reorganization, denoted TG-SPSR, by migrating these two models to targeted attacking. In the structure partition phase, besides dividing passwords into basic structures as in PCFG, we additionally define a trajectory-based keyboard pattern in the basic grammar and introduce index bits to accurately characterize the position of special characters. Moreover, we construct a BiLSTM recurrent neural network classifier to characterize the behavior of password reuse and modification after defining nine kinds of modification rules. Extensive experimental results indicate that in online attacking, TG-SPSR outperforms traditional trawling attacking algorithms by about 275% on average, and outperforms its foremost counterparts, Personal-PCFG and TarGuess-I, by about 70% and 19%, respectively; in offline attacking, TG-SPSR outperforms traditional trawling attacking algorithms by about 90% on average, and outperforms Personal-PCFG and TarGuess-I by 85% and 30%, respectively.
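
The "basic structure" that PCFG-style models divide a password into can be illustrated with a small sketch. It assumes the usual three character classes from the PCFG password-guessing literature (L for letters, D for digits, S for everything else); the keyboard patterns, index bits, and BiLSTM classifier described in the abstract are not modeled, and the function names are hypothetical.

```python
from itertools import groupby


def char_class(ch: str) -> str:
    """Map a character to an assumed PCFG class: L, D, or S."""
    if ch.isalpha():
        return "L"
    if ch.isdigit():
        return "D"
    return "S"


def base_structure(password: str) -> list[tuple[str, int]]:
    """Split a password into maximal runs of one class, as (class, length)."""
    return [(cls, len(list(run))) for cls, run in groupby(password, key=char_class)]


print(base_structure("password123!"))
```

A grammar-based guesser would then learn how often each structure occurs and which strings fill each segment.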

A Development of Hybrid Genetic Algorithms for Classical Job Shop Scheduling (전통적인 Job Shop 일정계획을 위한 혼합유전 알고리즘의 개발)

  • 정종백;김정자;주철민
    • Proceedings of the Korean Operations and Management Science Society Conference
    • /
    • 2000.04a
    • /
    • pp.609-612
    • /
    • 2000
  • The job-shop scheduling problem (JSSP) is one of the best-known machine scheduling problems and is essentially an ordering problem. A new encoding scheme that always gives a feasible schedule is presented, in which a schedule corresponds directly to an assigned-operation ordering string. The method is initialized with the G&T algorithm and improved using the developed genetic operators, namely the APMX or BPMX crossover operator and a mutation operator, so the problem of infeasibility in genetic generation is naturally overcome. Within the framework of the newly designed genetic algorithm, the NP-hard classical job-shop scheduling problem can be solved efficiently and with high quality. Moreover, the optimal solutions of the famous benchmarks, Fisher and Thompson's 10${\times}$10 and 20${\times}$5 problems, are found.


Correction for Misrecognition of Korean Texts in Signboard Images using Improved Levenshtein Metric

  • Lee, Myung-Hun;Kim, Soo-Hyung;Lee, Guee-Sang;Kim, Sun-Hee;Yang, Hyung-Jeong
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.6 no.2
    • /
    • pp.722-733
    • /
    • 2012
  • Recently, many studies on applications that use images taken by mobile phone cameras have been actively conducted. This study proposes a method for correcting misrecognized Korean text in signboard images using an improved Levenshtein metric. The proposed method calculates the distances of five recognized candidates and finds the best-matching text in a signboard text database. To verify the efficiency of the proposed method, a database dictionary was built from 1.3 million nationwide signboard words after removing duplicates. We compared the proposed method with the Levenshtein metric, one of the representative text string comparison algorithms. As a result, the proposed method based on the improved Levenshtein metric improves recognition rates by 31.5% on average compared to conventional methods.
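
The baseline this abstract compares against can be sketched as follows: the standard Levenshtein distance, plus a lookup that picks the dictionary entry closest to any recognized candidate. The paper's improved metric is not described in the abstract and is not reproduced here; `best_match` and the sample strings are hypothetical.

```python
def levenshtein(a: str, b: str) -> int:
    """Standard edit distance with unit-cost insert, delete, and substitute."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # delete ca
                           cur[j - 1] + 1,             # insert cb
                           prev[j - 1] + (ca != cb)))  # substitute or keep
        prev = cur
    return prev[-1]


def best_match(candidates: list[str], dictionary: list[str]) -> str:
    """Return the dictionary word with the smallest distance to any candidate.
    Ties are broken by string order; an empty input raises ValueError."""
    return min((levenshtein(c, w), w) for c in candidates for w in dictionary)[1]


print(best_match(["cofee shop", "coffe shap"], ["coffee shop", "book store"]))
```

With a dictionary of 1.3 million words, as in the paper, a linear scan like this would be slow; it is shown only to make the matching step concrete.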

Analysis of Evolutionary Optimization Methods for CNN Structures (CNN 구조의 진화 최적화 방식 분석)

  • Seo, Kisung
    • The Transactions of The Korean Institute of Electrical Engineers
    • /
    • v.67 no.6
    • /
    • pp.767-772
    • /
    • 2018
  • Recently, some meta-heuristic algorithms, such as GA (Genetic Algorithm) and GP (Genetic Programming), have been used to optimize CNNs (Convolutional Neural Networks). The CNN, which is one of the deep learning models, has seen much success in a variety of computer vision tasks. However, designing CNN architectures still requires expert knowledge and a lot of trial and error. In this paper, recent attempts to automatically construct CNN architectures are investigated and analyzed. First, two GA-based methods are summarized. One optimizes CNN structures in terms of the number and size of filters, the connections between consecutive layers, and the activation function of each layer. The other is a new encoding method that represents complex convolutional layers as a fixed-length binary string. Second, a CGP (Cartesian Genetic Programming) based method is surveyed for CNN structure optimization with highly functional modules, such as convolutional blocks and tensor concatenation, as the node functions in CGP. The three approaches are compared, and an outlook on potential next steps is suggested.

Analysis of Power Generation Characteristics according to the MPPT Algorithm Period (MPPT 알고리즘의 주기에 따른 발전 영향 분석)

  • Min, Joonki;Choi, Wonseok
    • The Transactions of the Korean Institute of Electrical Engineers P
    • /
    • v.67 no.4
    • /
    • pp.233-237
    • /
    • 2018
  • In this paper, we compare the power generation characteristics of two MPPT algorithms, P&O and InC, applied to grid-connected solar inverters. The MPPT detects the voltage and current of the solar module string and transfers the power to the DC link. Since the grid is connected through the power conversion circuit, the grid connection control and the MPPT control influence each other. The power generation characteristics were analyzed by Psim simulation as a function of the MPPT cycle under weak grid conditions.

GOMS: Large-scale ontology management system using graph databases

  • Lee, Chun-Hee;Kang, Dong-oh
    • ETRI Journal
    • /
    • v.44 no.5
    • /
    • pp.780-793
    • /
    • 2022
  • Large-scale ontology management is one of the main issues when using ontology data in practice. Although many approaches have been proposed in relational database management systems (RDBMSs) or object-oriented DBMSs (OODBMSs) to develop large-scale ontology management systems, they have several limitations because ontology data structures are intrinsically different from traditional data structures in RDBMSs or OODBMSs. In addition, users have difficulty using ontology data because many terminologies (ontology nodes) in large-scale ontology data match a given string keyword. Therefore, in this study, we propose a graph database-based ontology management system (GOMS) to efficiently manage large-scale ontology data. GOMS uses a graph DBMS and provides new query templates to help users find key concepts or instances. Furthermore, to run queries with multiple joins and path conditions efficiently, we propose GOMS encoding as a filtering tool and develop hash-based join processing algorithms in the graph DBMS. Finally, we experimentally show that GOMS can process various types of queries efficiently.

Function of the Korean String Indexing System for the Subject Catalog (주제목록을 위한 한국용어열색인 시스템의 기능)

  • Yoon Kooho
    • Journal of the Korean Society for Library and Information Science
    • /
    • v.15
    • /
    • pp.225-266
    • /
    • 1988
  • Various theories and techniques for the subject catalog have been developed since Charles Ammi Cutter first tried to formulate rules for the construction of subject headings in 1876. However, they do not seem to be appropriate for the Korean language because its syntax and semantics differ from those of English and other European languages. This study therefore attempts to develop a new Korean subject indexing system, namely the Korean String Indexing System (KOSIS), in order to increase the use of subject catalogs. For this purpose, the advantages and disadvantages of the classed subject catalog and the alphabetical subject catalog, which are the typical subject catalogs in libraries, are investigated, and most of the notable subject indexing systems, in particular PRECIS, developed by the British National Bibliography, are reviewed and analysed. KOSIS is a string indexing system based purely on the syntax and semantics of the Korean language, even though a considerable number of PRECIS principles are applied to it. The outlines of KOSIS are as follows: 1) KOSIS is based on the fundamentals of natural language and an ingenious conjunction of human indexing skills and computer capabilities. 2) KOSIS is a string indexing system based on the 'principle of context-dependency.' A string of terms organized according to this principle shows remarkable affinity with certain patterns of words in ordinary discourse. From that point onward, natural language rather than classificatory terms becomes the basic model for indexing schemes. 3) KOSIS uses 24 role operators. One or more operators should be allocated to the index string, which is organized manually by the indexer's intellectual work, in order to establish the most explicit syntactic relationship among index terms. 4) Traditionally, a single-line entry format is used, in which a subject heading or index entry is presented as a single sequence of words consisting of the entry terms plus, in some cases, an extra qualifying term or phrase. KOSIS instead employs a two-line entry format which contains three basic positions for the production of index entries. The 'lead' serves as the user's access point, the 'display' contains those terms which are themselves context-dependent on the lead, and the 'qualifier' sets the lead term into its wider context. 5) Each KOSIS entry is co-extensive with the initial subject statement prepared by the indexer, since it displays all the subject specificities. Compound terms are always presented in their natural language order. Inverted headings are not produced in KOSIS. Consequently, the precision ratio of information retrieval can be increased. 6) KOSIS uses 5 relational codes for the system of references among semantically related terms. Semantically related terms are handled by a different set of routines, leading to the production of 'See' and 'See also' references. 7) KOSIS was originally developed for a classified catalog system which requires a subject index, that is, an index which 'translates' subjects expressed in natural language into the appropriate classification numbers. However, KOSIS can also be used for a dictionary catalog system. Accordingly, KOSIS strings can be manipulated to produce either appropriate subject indexes for a classified catalog system or acceptable subject headings for a dictionary catalog system. 8) KOSIS is able to maintain consistency of index entries and cross references by means of a routine identification of the established index strings and reference system. For this purpose, an individual Subject Indicator Number and Reference Indicator Number are allocated to each new index string and each new index term, respectively. 9) KOSIS can produce all the index entries, cross references, and authority cards by means of either manual or mechanical methods. Thus, detailed algorithms for the machine production of various outputs are provided for institutions which can use computer facilities.


An Adaptive Algorithm for Plagiarism Detection in a Controlled Program Source Set (제한된 프로그램 소스 집합에서 표절 탐색을 위한 적응적 알고리즘)

  • Ji, Jeong-Hoon;Woo, Gyun;Cho, Hwan-Gue
    • Journal of KIISE:Software and Applications
    • /
    • v.33 no.12
    • /
    • pp.1090-1102
    • /
    • 2006
  • This paper suggests a new algorithm for detecting plagiarism among a set of source codes that are constrained to be functionally equivalent, such as those submitted for a programming assignment or a programming contest problem. The typical algorithms used so far are based on Greedy-String Tiling, which seeks perfect matches of substrings, and on similarity analysis based on the local alignment of two strings. This paper introduces a new method for detecting similar intervals of the given programs based on an adaptive similarity matrix, each entry of which is the logarithm of the probability of a keyword, estimated from its frequency in the given set of programs. We tested this method on sets of programs submitted to more than 10 real programming contests. According to the experimental results, this method has several advantages over the previous one, which uses a fixed similarity matrix (+1 for match, -1 for mismatch, -2 for gap), and the adaptive similarity matrix can be used to detect various plagiarism cases.
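
The fixed-matrix baseline mentioned at the end of this abstract can be sketched as a Smith-Waterman style local alignment over token sequences, using the stated scores (+1 for match, -1 for mismatch, -2 for gap). This is an illustration of the baseline only: the paper's adaptive, frequency-based similarity matrix is not reproduced, and the function name and token lists are hypothetical.

```python
from typing import Sequence


def local_alignment_score(a: Sequence[str], b: Sequence[str],
                          match: int = 1, mismatch: int = -1, gap: int = -2) -> int:
    """Best local alignment score between two token sequences.
    Scores never drop below 0, so an alignment may start anywhere."""
    prev = [0] * (len(b) + 1)
    best = 0
    for x in a:
        cur = [0]
        for j, y in enumerate(b, 1):
            score = max(0,
                        prev[j - 1] + (match if x == y else mismatch),  # align x with y
                        prev[j] + gap,                                  # gap in b
                        cur[j - 1] + gap)                               # gap in a
            cur.append(score)
            best = max(best, score)
        prev = cur
    return best


# Two token streams that differ only in the loop keyword.
prog_a = ["for", "(", "id", "=", "num", ")"]
prog_b = ["while", "(", "id", "=", "num", ")"]
print(local_alignment_score(prog_a, prog_b))
```

With a fixed matrix every keyword match counts the same; the adaptive matrix described in the abstract instead derives each entry from keyword frequencies in the submitted programs.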