• Title/Summary/Keyword: Data Translation


Nonparametric Test for Multivariate Location Translation Alternatives

  • Na, Jong-Hwa
    • Communications for Statistical Applications and Methods / v.7 no.3 / pp.799-809 / 2000
  • In this paper, we propose a nonparametric one-sided test for location parameters in the $p$-variate ($p \geq 2$) location translation model. The exact null distributions of the test statistics are calculated by the permutation principle for relatively small sample sizes, and the asymptotic distributions are also considered. The powers of various tests are compared through computer simulation, and p-values for real data are illustrated through an example.
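
The permutation principle mentioned in this abstract can be illustrated with a minimal sketch. It assumes a two-sample setting and a hypothetical statistic (the sum over coordinates of the difference in sample means), which is not the statistic proposed in the paper; the function names and the data are illustrative only.

```python
from itertools import combinations

def mean_shift_stat(x, y):
    """Sum over coordinates of (mean of y - mean of x); a hypothetical one-sided statistic."""
    p = len(x[0])
    return sum(
        sum(row[j] for row in y) / len(y) - sum(row[j] for row in x) / len(x)
        for j in range(p)
    )

def exact_permutation_pvalue(x, y):
    """Exact one-sided p-value: share of relabelings whose statistic is >= the observed one."""
    pooled = x + y
    n, m = len(pooled), len(y)
    observed = mean_shift_stat(x, y)
    count = total = 0
    # Enumerate every way to choose which pooled observations are labeled "y".
    for idx in combinations(range(n), m):
        chosen = set(idx)
        y_perm = [pooled[i] for i in idx]
        x_perm = [pooled[i] for i in range(n) if i not in chosen]
        if mean_shift_stat(x_perm, y_perm) >= observed - 1e-12:
            count += 1
        total += 1
    return count / total

# Made-up bivariate samples (p = 2); y is shifted upward in both coordinates.
x = [(0.1, 0.3), (0.0, -0.2), (0.4, 0.1)]
y = [(1.2, 0.9), (0.8, 1.5), (1.1, 1.0)]
p_value = exact_permutation_pvalue(x, y)
```

Full enumeration is feasible only for small samples, which matches the abstract's remark that exact null distributions are used when sample sizes are relatively small.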


AFTL: An Efficient Adaptive Flash Translation Layer using Hot Data Identifier for NAND Flash Memory (AFTL: Hot Data 검출기를 이용한 적응형 플래시 전환 계층)

  • Yun, Hyun-Sik;Joo, Young-Do;Lee, Dong-Ho
    • Journal of KIISE: Computer Systems and Theory / v.35 no.1 / pp.18-29 / 2008
  • NAND flash memory has become a popular storage device in recent years because of its low power consumption, fast access speed, shock resistance, and light weight. However, it has distinct characteristics such as an erase-before-write architecture, asymmetric read/write/erase speeds, and a limit on the number of erasures per block. Because of these limitations, various Flash Translation Layers (FTLs) have been proposed to use NAND flash memory effectively. Systems that adopt a conventional FTL may suffer severe performance degradation from hot data, that is, data frequently overwritten at the same logical address. In this paper, we propose a novel FTL algorithm called the Adaptive Flash Translation Layer (AFTL), which uses a sector mapping method for hot data and a log-based block mapping method for cold data. Our system removes redundant write and erase operations by separating hot data from cold data. Moreover, read performance is enhanced because sector translation tends to require only a few read operations. A series of experiments was conducted to evaluate the proposed method, and they show very impressive results.
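
The hot/cold split described in this abstract can be sketched as follows. The abstract does not say how the hot data identifier works, so this toy version assumes a simple per-address write counter with a hypothetical `threshold` parameter; the class and method names are illustrative and are not taken from the paper.

```python
from collections import Counter

class HotDataRouter:
    """Toy classifier: a logical address becomes 'hot' after `threshold` writes."""

    def __init__(self, threshold=3):
        self.threshold = threshold       # hypothetical tuning parameter
        self.write_counts = Counter()    # writes seen so far per logical address

    def route_write(self, logical_addr):
        """Return the name of the mapping scheme that would handle this write."""
        self.write_counts[logical_addr] += 1
        if self.write_counts[logical_addr] >= self.threshold:
            return "sector_mapping"      # hot data: fine-grained sector mapping
        return "log_block_mapping"       # cold data: log-based block mapping

router = HotDataRouter(threshold=3)
trace = [10, 10, 42, 10, 10, 42]         # made-up sequence of logical addresses
decisions = [router.route_write(addr) for addr in trace]
```

In this trace, address 10 is routed to sector mapping from its third write onward, while address 42 stays on the block-mapped path.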

FromTo-Web/EK™: English-to-Korean Machine Translation System for HTML Documents (에서로-웹/EK™: 영한 웹 문서 번역 시스템)

  • Sim, Chul-Min;Yuh, Sang-Wha;Jung, Han-Min;Kim, Tae-Wan;Park, Dong-In;Kwon, Hyuk-Chul
    • Annual Conference on Human and Language Technology / 1997.10a / pp.277-282 / 1997
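  • Recently, translation systems that translate documents on the web have been commercialized. Unlike ordinary documents, web documents contain HTML tags, which makes it difficult for a translation system to split them into sentences. In addition, because the target domain is unrestricted, the system needs a way to cope with unregistered words and parsing failures. As a result, the translation quality for web documents is markedly lower than for ordinary documents. This paper describes FromTo-Web/EK (에서로-웹/EK), a translation system for English web documents that contain HTML tags. FromTo-Web/EK has a separate tag manager that separates and restores tags, taking the characteristics of HTML documents into account. It also handles the various transfer phenomena that arise when English is converted to Korean while tags are preserved, such as word splitting, word merging, and word-order changes. The system is a transfer-based translation system that proceeds through English analysis, English-Korean transfer, and Korean generation. The implemented system translates HTML documents by interoperating with Netscape through DDE (Dynamic Data Exchange).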
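
The idea of separating tags before translation and restoring them afterwards can be sketched in a few lines. This is a simplified illustration and not the tag manager described in the paper: it translates each text segment independently and so ignores the word-order changes the abstract mentions. The `translate_text` callback and the uppercase stand-in are hypothetical.

```python
import re

# A capturing group makes re.split keep the tags in the resulting list.
TAG_PATTERN = re.compile(r"(<[^>]+>)")

def translate_html(html, translate_text):
    """Translate text segments while leaving HTML tags untouched."""
    parts = TAG_PATTERN.split(html)
    out = []
    for part in parts:
        if TAG_PATTERN.fullmatch(part) or not part.strip():
            out.append(part)                  # tag or whitespace: restore as-is
        else:
            out.append(translate_text(part))  # plain text: hand to the translator
    return "".join(out)

def toy_translate(text):
    """Stand-in for a real translation engine: uppercases the text."""
    return text.upper()

result = translate_html("<p>Hello <b>world</b></p>", toy_translate)
```

Keeping the tags as separate list items means the output has exactly the same markup as the input, whatever the translator does to the text between them.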


Understanding recurrent neural network for texts using English-Korean corpora

  • Lee, Hagyeong;Song, Jongwoo
    • Communications for Statistical Applications and Methods / v.27 no.3 / pp.313-326 / 2020
  • Deep learning is the most important key to the development of Artificial Intelligence (AI). There are several distinguishable neural network architectures, such as MLP, CNN, and RNN. Among them, we try to understand one of the main architectures, the Recurrent Neural Network (RNN), which differs from other networks in handling sequential data, including time series and texts. As one of the main recent tasks in Natural Language Processing (NLP), we consider Neural Machine Translation (NMT) using RNNs. We also summarize the fundamental structures of recurrent networks and some topics in representing natural words as reasonable numeric vectors. We organize the topics to explain the estimation procedure, from representing input source sequences to predicting translated target sequences. In addition, we apply multiple translation models with Gated Recurrent Units (GRUs) in Keras to English-Korean sentences comprising about 26,000 sequence pairs in total from two different corpora, colloquialism and news. We verified some crucial factors that influence the quality of training. We found that the loss decreases with more recurrent dimensions and with a bidirectional RNN in the encoder when dealing with short sequences. We also computed BLEU scores, the main measure of translation performance, and compared them with the scores of Google Translate on the same test sentences. We summarize some difficulties in training a proper translation model and in dealing with the Korean language. Using Keras in Python for all tasks, from processing raw texts to evaluating the translation model, also allows us to include some useful functions and vocabulary libraries.

A study on performance improvement considering the balance between corpus in Neural Machine Translation (인공신경망 기계번역에서 말뭉치 간의 균형성을 고려한 성능 향상 연구)

  • Park, Chanjun;Park, Kinam;Moon, Hyeonseok;Eo, Sugyeong;Lim, Heuiseok
    • Journal of the Korea Convergence Society / v.12 no.5 / pp.23-29 / 2021
  • Recent deep learning-based natural language processing studies seek to improve performance by training on large amounts of data from various sources together. However, learning from data combined from various sources into one may hinder performance improvement. In machine translation, data deviation occurs due to differences in translation (liberal, literal), style (colloquial, written, formal, etc.), domain, and so on. Combining these corpora into one for learning can adversely affect performance. In this paper, we propose a new Corpus Weight Balance (CWB) method that considers the balance between parallel corpora in machine translation. In the experiment, the model trained with the balanced corpus performed better than the existing model. In addition, we propose an additional corpus construction process that enables coexistence with the human translation market and can build a high-quality parallel corpus even from a monolingual corpus.

Target Word Selection for English-Korean Machine Translation System using Multiple Knowledge (다양한 지식을 사용한 영한 기계번역에서의 대역어 선택)

  • Lee, Ki-Young;Kim, Han-Woo
    • Journal of the Korea Society of Computer and Information / v.11 no.5 s.43 / pp.75-86 / 2006
  • Target word selection is one of the most important and difficult tasks in English-Korean machine translation, and it affects the translation accuracy of machine translation systems. In this paper, we present a new approach to selecting the Korean target word for an English noun with translation ambiguities using multiple kinds of knowledge: verb frame patterns, sense vectors based on collocations, statistical Korean local context information, and co-occurring POS information. Verb frame patterns constructed from a dictionary and a corpus play an important role in resolving the sparseness problem of collocation data. A sense vector is the set of collocation data observed when an English word with target selection ambiguity is translated to a specific Korean target word. Statistical Korean local context information is N-gram information generated from a Korean corpus. Co-occurring POS information is a statistically significant POS clue that appears with the ambiguous word. The experiment showed promising results for diverse sentences from web documents.


A Translator of MUSS-80 for CYBER-72

  • 이용태;이은구
    • Communications of the Korean Institute of Information Scientists and Engineers / v.1 no.1 / pp.23-35 / 1983
  • In its broadest sense, language translation refers to the process whereby a program that is executable on one computer can be executed directly on another computer to obtain the same result. There are four different approaches to translation: the first is translation by a translator or a compiler, the second is interpretation, the third is simulation, and the last is emulation. This paper introduces the M-C Translator, which was designed following the first approach. The MUSS 80 language (the subsystem of the UNIVAC Solid State 80 S-4 assembly language system), which includes forty-three instructions, was chosen as the source language, and CYBER COMPASS was used as the object language. The M-C Translator is a two-pass translator written in the Fortran Extended language. For this M-C translation, seven COMPASS subroutines and a set of thirty-five macros were prepared. Each executable source instruction corresponds to a macro, so it becomes a macro instruction within the object program. Subroutines are used to retain and handle the source data representation in the object program the same way as in the source system, and to convert the decimal source data into the equivalent binary, and the binary result into the equivalent USS-80 digits, before and after arithmetic operations. The source instructions can be classified into three categories. First, some instructions are meaningless in the object system and therefore need not be translated, while the remaining instructions should be translated. Second, some instructions are required to indicate dual address portions. Third, three instructions have overflow conditions, which the remaining instructions lack. The construction and functions of the M-C Translator are explained, including some of the subroutines and macros. The problems and difficulties encountered, the methods used to solve them, and the features that made this translation easier are analysed. The study of how to save memory and time will be continued.

Real-time and Parallel Semantic Translation Technique for Large-Scale Streaming Sensor Data in an IoT Environment (사물인터넷 환경에서 대용량 스트리밍 센서데이터의 실시간·병렬 시맨틱 변환 기법)

  • Kwon, SoonHyun;Park, Dongwan;Bang, Hyochan;Park, Youngtack
    • Journal of KIISE / v.42 no.1 / pp.54-67 / 2015
  • Studies on the fusion of Semantic Web technologies are currently being carried out to promote the interoperability and value of sensor data in an IoT environment. To accomplish this, the semantic translation of sensor data is essential for convergence with service domain knowledge. The existing semantic translation technique, however, translates static metadata into semantic data (RDF) and cannot properly handle the real-time, large-scale nature of data in an IoT environment. Therefore, in this paper, we propose a technique for translating large-scale streaming sensor data generated in an IoT environment into semantic data using real-time and parallel processing. In this technique, we define rules for semantic translation and store them in the semantic repository. The sensor data is translated in real time with parallel processing using these pre-defined rules and an ontology-based semantic model. To improve performance, we use Apache Storm, a real-time big data analysis framework for parallel processing. For demonstration purposes, the proposed technique was subjected to performance testing with the AWS observation data of the Meteorological Administration, which are large-scale streaming sensor data.
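
Rule-based translation of sensor readings into RDF, applied in parallel, can be sketched as below. This is only an illustration under stated assumptions: the rule table, the `example.org` IRIs, and the reading fields are hypothetical, and a standard-library thread pool stands in for the Apache Storm topology used in the paper.

```python
from concurrent.futures import ThreadPoolExecutor

XSD_DOUBLE = "http://www.w3.org/2001/XMLSchema#double"

# Hypothetical rule table: sensor field -> (predicate IRI, XSD datatype).
RULES = {
    "temp": ("http://example.org/sensor#temperature", XSD_DOUBLE),
    "hum": ("http://example.org/sensor#humidity", XSD_DOUBLE),
}

def to_triples(reading):
    """Apply the rules to one reading and return N-Triples lines."""
    subject = f"<http://example.org/obs/{reading['station']}/{reading['ts']}>"
    lines = []
    for field, (predicate, datatype) in RULES.items():
        if field in reading:
            lines.append(f'{subject} <{predicate}> "{reading[field]}"^^<{datatype}> .')
    return lines

# Made-up readings standing in for a sensor stream.
stream = [
    {"station": "S1", "ts": 1, "temp": 21.5, "hum": 40.0},
    {"station": "S2", "ts": 1, "temp": 19.0},
]

# pool.map translates readings concurrently and returns results in input order.
with ThreadPoolExecutor(max_workers=2) as pool:
    triples = [line for lines in pool.map(to_triples, stream) for line in lines]
```

Because each reading is translated independently of the others, the work can be spread across workers without coordination, which is the property a streaming framework exploits at larger scale.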

A Design and Implementation of the Multilingual RDD Registry (다중언어 RDD 레지스트리의 설계 및 구현)

  • 정상원;오원근;윤기송
    • Journal of Broadcast Engineering / v.8 no.4 / pp.381-391 / 2003
  • This paper deals with the Multilingual Registry for the Rights Data Dictionary (RDD), which will be used for the semantic representation of rights on digital contents in the MPEG-21 framework. The translation of RDD terms for different language populations often lacks the desirable precision. The purpose of this paper is to demonstrate the Multilingual RDD Registry concept, which aims to achieve a more precise and interoperable translation of RDD terms among different DRM systems.

3D Object Recognition Using SOFM

  • Cho, Hyun-Chul;Shon, Ho-Woong
    • Journal of the Korean Geophysical Society / v.9 no.2 / pp.99-103 / 2006
  • This paper presents 3D object recognition that is independent of translation and rotation, using an ultrasonic sensor array, invariant moment vectors, and SOFM (Self-Organizing Feature Map) neural networks. Using invariant moment vectors of the acquired 16×8 pixel data of square, rectangular, cylindrical, and regular triangular blocks, 3D objects could be classified by SOFM neural networks. Invariant moment vectors remain constant under translation and rotation. The recognition rates for the training and testing data were 95.91% and 92.13%, respectively.
