

Improved SIM Algorithm for Contents-based Image Retrieval (내용 기반 이미지 검색을 위한 개선된 SIM 방법)

  • Kim, Kwang-Baek
    • Journal of Intelligence and Information Systems / v.15 no.2 / pp.49-59 / 2009
  • Content-based image retrieval methods are generally more objective and effective than text-based methods because they search on color and texture and do not require every image to be annotated. SIM (Self-organizing Image browsing Map) is a content-based image retrieval algorithm that uses only the browsable mapping results obtained by an SOM (Self-Organizing Map). However, an SOM may select the wrong BMU (best matching unit) in the learning phase when similar nodes carry color information distorted by light intensity or by the movement of objects in the image. Such images may be mapped to other grouping nodes, which lowers the retrieval rate. In this paper, we propose an improved SIM that extracts image features using the HSV color model with color quantization. To avoid the learning error described above, our SOM consists of two layers. In the learning phase, SOM layer 1 takes the color feature vectors as input. After SOM layer 1 has been trained, its connection weights become the input of SOM layer 2, which is then trained in turn. This multi-layered SOM learning avoids mapping errors among similar nodes with different color information. In the search phase, the query image vector is presented to SOM layer 2, and the nodes of SOM layer 1 connected to the chosen BMU of SOM layer 2 are selected. Experiments verified that the proposed SIM outperforms the original SIM and effectively avoids mapping errors.
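
The feature extraction and two-layer lookup described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the bin counts, the function names (`hsv_histogram`, `best_matching_unit`, `layer1_nodes_for_query`), and the reading of "layer 1 nodes connected to a layer 2 BMU" as "layer 1 nodes whose weight vectors map to that BMU" are all assumptions, and SOM training itself is omitted.

```python
import colorsys
import math

def hsv_histogram(pixels, h_bins=8, s_bins=2, v_bins=2):
    """Quantize RGB pixels (0-255 per channel) into a normalized HSV histogram.

    The bin counts are illustrative assumptions, not values from the paper.
    """
    hist = [0.0] * (h_bins * s_bins * v_bins)
    for r, g, b in pixels:
        h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        hi = min(int(h * h_bins), h_bins - 1)
        si = min(int(s * s_bins), s_bins - 1)
        vi = min(int(v * v_bins), v_bins - 1)
        hist[(hi * s_bins + si) * v_bins + vi] += 1.0
    total = sum(hist) or 1.0
    return [count / total for count in hist]

def best_matching_unit(vector, weights):
    """Return the index of the node whose weight vector is closest (Euclidean)."""
    return min(range(len(weights)), key=lambda i: math.dist(vector, weights[i]))

def layer1_nodes_for_query(query, layer1_weights, layer2_weights):
    """Find the layer-2 BMU for the query, then return the layer-1 nodes
    whose weight vectors map to that same layer-2 node (assumed linkage)."""
    target = best_matching_unit(query, layer2_weights)
    return [i for i, w in enumerate(layer1_weights)
            if best_matching_unit(w, layer2_weights) == target]
```

With already-trained weights, `layer1_nodes_for_query(hsv_histogram(pixels), layer1_weights, layer2_weights)` would return the candidate layer-1 nodes whose images are shown to the user.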


A Multi-speaker Speech Synthesis System Using X-vector (x-vector를 이용한 다화자 음성합성 시스템)

  • Jo, Min Su;Kwon, Chul Hong
    • The Journal of the Convergence on Culture Technology / v.7 no.4 / pp.675-681 / 2021
  • With the recent growth of the AI speaker market, demand is increasing for speech synthesis technology that enables natural conversation with users. A multi-speaker speech synthesis system that can generate voices of various tones is therefore needed. Synthesizing natural speech requires training on a large-capacity, high-quality speech DB. However, collecting a high-quality, large-capacity speech database uttered by many speakers is very difficult in terms of recording time and cost. It is therefore necessary to train the speech synthesis system on a speech DB that covers a very large number of speakers with only a small amount of training data per speaker, and a technique for naturally expressing the tone and prosody of multiple speakers is required. In this paper, we propose a technique that constructs a speaker encoder by applying the deep learning-based x-vector method used in speaker recognition, and that synthesizes a new speaker's tone from a small amount of data through this speaker encoder. In the multi-speaker speech synthesis system, the module that synthesizes a mel-spectrogram from input text is Tacotron2, and the vocoder that generates the synthesized speech is WaveNet with a mixture of logistic distributions. The x-vector extracted from the trained speaker embedding neural network is added to Tacotron2 as an input to express the desired speaker's tone.
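
One common way to feed a fixed speaker embedding into a sequence model is to append it to every encoder time step. The sketch below illustrates only that idea with plain lists; the abstract does not say how the x-vector is combined with Tacotron2, so concatenation, the function name `condition_on_speaker`, and the toy dimensions are assumptions.

```python
def condition_on_speaker(encoder_outputs, x_vector):
    """Append the same speaker embedding to every encoder time step.

    encoder_outputs: list of per-step feature vectors (toy stand-in for
    text encoder outputs); x_vector: fixed-length speaker embedding.
    """
    return [list(frame) + list(x_vector) for frame in encoder_outputs]

# Two time steps with 2-dimensional features and a 3-dimensional embedding.
conditioned = condition_on_speaker([[0.1, 0.2], [0.3, 0.4]], [1.0, 0.0, 0.5])
```

Each conditioned step now carries both text and speaker information, so a decoder attending over these steps can vary its output by speaker.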

A STUDY ON THE PARAMETER ESTIMATION OF SNYDER-TYPE SYNTHETIC UNIT-HYDROGRAPH DEVELOPMENT IN KUM RIVER BASIN

  • Jeong, Sang-man;Park, Seok-Chae;Lee, Joo-Heon
    • Water Engineering Research / v.2 no.4 / pp.219-229 / 2001
  • Synthetic unit hydrograph equations, such as the Snyder and SCS synthetic unit hydrographs, have long been widely used to analyze rainfall-runoff characteristics and to estimate design floods. The major inputs to the Snyder and SCS synthetic unit hydrographs are lag time and peak coefficient. In this study, the methods proposed by Zhao and McEnroe (1999) for estimating lag time and peak coefficient for small watersheds were applied to the Kum river basin in Korea. We investigated the lag times of relatively small watersheds in the basin. For this investigation, recent rainfall and stream flow data were gathered for 10 relatively small watersheds with drainage areas ranging from 134 to 902 square kilometers. A total of 250 flood flow events were identified, and the lag time for each event was determined from the rainfall and stream flow data. Lag time is closely related to the basin characteristics of a given drainage area, such as channel length, channel slope, and drainage area. A regression analysis was conducted to relate lag time to the watershed characteristics. The resulting regression model is as shown below: ※ see full text (equations) In the model, Tlag is the lag time in hours, Lc is the length of the main river in kilometers, and Se is the equivalent channel slope of the main channel. The coefficient of determination (r²) of the regression equation is 0.846. The peak coefficient is not significantly correlated with any of the watershed characteristics. We recommend a peak coefficient of 0.60 as input to the Snyder unit-hydrograph model for the ungauged Kum river watersheds.
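
To show how the two inputs named in this abstract enter the Snyder model, the sketch below evaluates Snyder's standard peak relation in SI units, Qp = 2.75 · Cp · A / Tlag (m³/s per cm of rainfall excess, A in km², Tlag in hours), as given in standard hydrology textbooks. The regression equation for Tlag is not reproduced in the abstract, so the lag time used here is a purely hypothetical value; only the peak coefficient 0.60 and the 134 km² area come from the text.

```python
def snyder_peak_discharge(area_km2, lag_hours, peak_coefficient=0.60):
    """Peak of the standard Snyder unit hydrograph.

    Returns discharge in m^3/s per cm of rainfall excess (SI form of
    Snyder's relation). The default Cp = 0.60 is the value recommended
    in the abstract for ungauged Kum river watersheds.
    """
    si_constant = 2.75  # unit-conversion constant for km^2, hours, cm
    return si_constant * peak_coefficient * area_km2 / lag_hours

# Smallest watershed size in the study (134 km^2) with a HYPOTHETICAL 5-hour lag.
peak = snyder_peak_discharge(area_km2=134.0, lag_hours=5.0)
```

Because the peak is inversely proportional to lag time, any error in the estimated Tlag carries over directly to the design peak.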


Traffic Consideration and Link Capacity Estimation for Integrated Multimedia Network of The Naval Ship (함정용 멀티미디어 통합통신망을 위한 트래픽 및 링크용량 예측)

  • Lee, Chae-Dong;Shin, Woo-Seop;Kim, Suk-Chan
    • Journal of the Institute of Electronics Engineers of Korea TC / v.49 no.5 / pp.99-106 / 2012
  • The Korean navy has been using a voice-oriented ICS to raise the efficiency of naval ship operation. Recently, the Korean navy has been considering a multimedia network that carries voice, video, and text. As basic research toward establishing an integrated multimedia network for a naval ship, this paper classifies the various networks within a naval ship with a view to applying them to an integrated network. We also consider the types and characteristics of the multimedia traffic used within the classified networks. To predict the link capacity of a switch from the number of traffic input sources, we propose a traffic aggregation model. We then calculate the link capacity of the aggregated traffic and analyze the aggregated traffic of a major Korean naval ship.

A Multimedia Bulletin Board System Providing Semantic-based Searching (의미 기반 정보 검색을 제공하는 멀티미디어 게시판 시스템)

  • Jung Eui-Hyun
    • Journal of the Korea Society of Computer and Information / v.10 no.6 s.38 / pp.75-84 / 2005
  • Bulletin board systems have evolved to support diverse multimedia data as well as text. However, current board systems have a weakness: users need considerable time and effort to figure out the contents of articles. Most board systems address this problem with a search function based on lexical-level data access, but it fails to return the results users intended. Moreover, it is nearly impossible to find the right articles if they contain multimedia data. This paper proposes a bulletin board system that adopts the Semantic Web to solve this issue. The proposed system provides users with a new ontology for describing articles' domain knowledge and multimedia features. Users can describe their own board ontology using the proposed ontology. To support semantic-based searching over diverse domain knowledge without modifying the system, the system dynamically generates the input/query interface and the RDF data access module according to the board ontology written by administrators. The proposed board system shows that semantic-based searching is feasible and effective for helping users find their intended articles.
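
The difference between lexical and semantic search can be illustrated with articles described as subject-predicate-object triples, the data model RDF uses. The sketch below is an assumption-laden toy: the vocabulary (`board:topic`, `board:hasMedia`), the article identifiers, and both function names are hypothetical and do not come from the paper's ontology.

```python
def match(triples, subject=None, predicate=None, obj=None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return [
        (s, p, o) for s, p, o in triples
        if subject in (None, s) and predicate in (None, p) and obj in (None, o)
    ]

# Hypothetical article descriptions; all names are illustrative only.
triples = [
    ("article:1", "board:topic", "topic:Concert"),
    ("article:1", "board:hasMedia", "media:Video"),
    ("article:2", "board:topic", "topic:Concert"),
    ("article:2", "board:hasMedia", "media:Image"),
]

def articles_with(triples, topic, media):
    """Find articles on a topic that also carry a given media type."""
    by_topic = {s for s, _, _ in match(triples, predicate="board:topic", obj=topic)}
    by_media = {s for s, _, _ in match(triples, predicate="board:hasMedia", obj=media)}
    return sorted(by_topic & by_media)
```

A query such as `articles_with(triples, "topic:Concert", "media:Video")` can find a video article even when its text never contains the word "concert", which a purely lexical search cannot do.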


Mention Detection with Pointer Networks (포인터 네트워크를 이용한 멘션탐지)

  • Park, Cheoneum;Lee, Changki
    • Journal of KIISE / v.44 no.8 / pp.774-781 / 2017
  • A mention is a meaningful chunk of text built around a noun or noun-phrase head, including its modifiers. Mention detection is the extraction of such mentions from a document. Coreference resolution then determines whether different mentions refer to the same entity. A pointer network is a model based on a recurrent neural network (RNN) encoder-decoder that outputs a list of elements corresponding to positions in the input sequence. In this paper, we propose mention detection using pointer networks. By applying a pointer network to mention detection, the proposed model can detect overlapped mentions, which sequence labeling could not handle. In our experiments, the proposed mention detection model achieved an F1 of 80.07%, which is 7.65%p higher than rule-based mention detection. Coreference resolution using this mention detection model achieved a CoNLL F1 of 52.67% (mention boundary) and a CoNLL F1 of 60.11% (head boundary), which are 7.68%p and 1.5%p higher, respectively, than coreference resolution using rule-based mention detection.
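
The core operation of a pointer network is a decoder step that outputs a distribution over input positions rather than over a fixed vocabulary. The sketch below shows one such step under simplifying assumptions: it uses dot-product scores instead of the learned attention of a trained model, and the function names and toy vectors are hypothetical. It does not reproduce the paper's architecture or its decoding scheme.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def point(decoder_state, encoder_states):
    """One pointer step: score every input position against the decoder
    state and return (most likely position, distribution over positions)."""
    scores = [sum(d * e for d, e in zip(decoder_state, enc))
              for enc in encoder_states]
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return best, probs

# Toy encoder states for three input tokens and one decoder state.
index, probs = point([0.2, 1.0], [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
```

Because each step can point to any position, two decoded spans may share tokens, whereas a sequence labeler that assigns one tag per token cannot represent such overlap.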

A Comparative Analysis of Content-based Music Retrieval Systems (내용기반 음악검색 시스템의 비교 분석)

  • Ro, Jung-Soon
    • Journal of the Korean Society for information Management / v.30 no.3 / pp.23-48 / 2013
  • This study compared and analyzed 15 CBMR (Content-based Music Retrieval) systems accessible on the web in terms of DB size and type, query type, access point, input and output type, and search functions. It also reviewed the features of music information and the techniques used in CBMR systems for transforming or transcribing music sources, extracting and segmenting melodies, extracting and indexing musical features, and matching. The analysis found that text information retrieval techniques (inverted indexing, N-gram indexing, Boolean search, truncation, keyword and phrase search, normalization, filtering, browsing, exact matching, similarity measures using edit distance, sorting, etc.) were applied to enhance CBMR; that efforts were made to increase DB size and usability; and that problems remained in extracting melodies, deleting stop notes in queries, and using solfege as pitch information.
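
Two of the text retrieval techniques listed in this abstract, N-gram indexing and edit-distance similarity, carry over to melodies quite directly. The sketch below is a generic illustration rather than the method of any surveyed system; representing a melody as semitone intervals between consecutive MIDI pitches (so that transposed melodies compare equal) is an assumption.

```python
def intervals(pitches):
    """Convert MIDI pitch numbers to semitone steps between neighbors."""
    return [b - a for a, b in zip(pitches, pitches[1:])]

def ngrams(sequence, n):
    """Return all contiguous n-grams of a sequence as tuples."""
    return [tuple(sequence[i:i + n]) for i in range(len(sequence) - n + 1)]

def edit_distance(a, b):
    """Levenshtein distance between two sequences (insert, delete, substitute)."""
    previous = list(range(len(b) + 1))
    for i, item_a in enumerate(a, start=1):
        current = [i]
        for j, item_b in enumerate(b, start=1):
            cost = 0 if item_a == item_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]

# The same melody in two keys, and a query with one note missing.
original = intervals([60, 62, 64, 65])
transposed = intervals([62, 64, 66, 67])
query = intervals([60, 62, 63])
```

Here `edit_distance(original, transposed)` is 0 because both reduce to the same interval sequence, while the shortened query is one edit away from the original.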

Modeling User Preference based on Bayesian Networks for Office Event Retrieval (사무실 이벤트 검색을 위한 베이지안 네트워크 기반 사용자 선호도 모델링)

  • Lim, Soo-Jung;Park, Han-Saem;Cho, Sung-Bae
    • Journal of KIISE:Computing Practices and Letters / v.14 no.6 / pp.614-618 / 2008
  • As multimedia data grow rapidly with the development of the Internet, efficient retrieval techniques that focus on individual users, based on analysis of such data, are required. However, the user modeling services provided by recent web sites are limited to text-based page configuration and recommendation retrieval. In this paper, we construct a user preference model with a Bayesian network in order to apply user modeling to video retrieval, and we suggest a method that uses probabilistic reasoning. To this end, context information is defined for a real office environment, and video scripts acquired from installed cameras and manually annotated with that context information are used. Personal information obtained from user input is used as the evidence for the constructed Bayesian network, and the user's preference is inferred. The probability produced by Bayesian network reasoning is used for retrieval, so that the system returns results suited to each user's preference. A usability test indicates that the satisfaction level for results selected with the proposed model is higher than for a general retrieval method.
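
Inference in a very small Bayesian network can be written out by direct enumeration. The sketch below assumes a two-layer structure with one preference node and evidence nodes as its children; the structure, node names, and every probability are hypothetical and are not taken from the paper's model.

```python
def posterior(prior, likelihoods, evidence):
    """Compute P(preference | evidence) by enumeration.

    prior: {preference: P(preference)}
    likelihoods: {evidence_name: {preference: {value: P(value | preference)}}}
    evidence: {evidence_name: observed value}
    """
    joint = {}
    for preference, probability in prior.items():
        for name, value in evidence.items():
            probability *= likelihoods[name][preference][value]
        joint[preference] = probability
    total = sum(joint.values())
    return {preference: p / total for preference, p in joint.items()}

# Hypothetical numbers: which event type does this user prefer to retrieve?
prior = {"meeting": 0.5, "presentation": 0.5}
likelihoods = {
    "role": {
        "meeting": {"manager": 0.8, "staff": 0.2},
        "presentation": {"manager": 0.4, "staff": 0.6},
    }
}
result = posterior(prior, likelihoods, {"role": "manager"})
```

The resulting probabilities can then be used to rank retrieved video events, with the most probable preference listed first.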

Mobile Voice Note File Management Service For Improving Accessibility of the Blind (전맹인의 접근성 향상을 위한 모바일 음성 메모 파일 관리 서비스)

  • Lim, Soon-Bum;Lee, Mi Ji;Choi, Yoo Jin;Yook, Juhye;Park, Joo Hyun;Lee, Jongwoo
    • Journal of Korea Multimedia Society / v.22 no.11 / pp.1215-1222 / 2019
  • Recently, people with disabilities have also been searching for and collecting information from the web through smart devices, saving the collected information on those devices, and taking notes. For non-disabled people, various memo applications are available on the market, so they can conveniently choose one according to their preference. However, existing memo services are of limited use to blind people because they rely heavily on visual information. The problem blind people face when using smart devices is that they cannot perceive the screen, so they cannot tell where an application's menu is located. In addition, it is difficult for them to input and manipulate text, and systematic file management and control are not possible. Therefore, in this paper, we propose a voice memo service that blind people can use with only voice and auditory information and whose menu can be operated with a Bluetooth remote controller. We will develop a system that goes beyond simply storing voice files and includes comprehensive voice file management functions for storing, searching, playing, and deleting files.

WeXGene: Web-based XML Data Generator (WeXGene: 웹 기반 XML 데이터 생성기)

  • Shin Sun Mi;Jeong Hoe Jin;Lee Sang Ho
    • The KIPS Transactions:PartD / v.12D no.2 s.98 / pp.199-210 / 2005
  • Evaluating XML database systems requires generating various kinds of XML data. Existing XML data generators were developed to produce XML data suited to particular evaluation methods, and their functionality for generating XML data is limited. This paper introduces a new XML data generator, WeXGene, that not only remedies the drawbacks of existing data generators but also adds new data generation functions. To generate XML data, WeXGene uses user data files and structure definition files, which specify an SDTD (Symbolic DTD) or input parameters. The user data file is a text data file that contains column data or row data. WeXGene can also generate XML data without accessing a user data file. This paper presents the design details, overall system architecture, and data generation process of WeXGene. An analytic comparison with other XML data generators is also presented.
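
The general idea of turning row data from a text file into XML can be sketched with Python's standard `xml.etree.ElementTree` module. This is not WeXGene's design or its SDTD format: the function name `rows_to_xml`, the tag names, and the sample row are hypothetical, and the structure definition is reduced to a list of column names.

```python
import xml.etree.ElementTree as ET

def rows_to_xml(root_tag, row_tag, columns, rows):
    """Build an XML string with one element per row and one child per column."""
    root = ET.Element(root_tag)
    for row in rows:
        item = ET.SubElement(root, row_tag)
        for name, value in zip(columns, row):
            ET.SubElement(item, name).text = str(value)
    # encoding="unicode" returns a str without an XML declaration
    return ET.tostring(root, encoding="unicode")

# Hypothetical user data: one row with two columns.
xml_text = rows_to_xml("books", "book", ["title", "year"], [["XML Basics", 2005]])
```

Generating many rows with varied values in this way yields test documents of controllable size for exercising an XML database.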