• Title/Summary/Keyword: Data Word length


Psalm Text Generator Comparison Between English and Korean Using LSTM Blocks in a Recurrent Neural Network (순환 신경망에서 LSTM 블록을 사용한 영어와 한국어의 시편 생성기 비교)

  • Snowberger, Aaron Daniel;Lee, Choong Ho
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference
    • /
    • 2022.10a
    • /
    • pp.269-271
    • /
    • 2022
  • In recent years, recurrent neural networks (RNNs) with LSTM blocks have been used extensively in machine learning tasks that process sequential data. These networks have proven particularly effective at sequential language processing because they predict the next most likely word in a sequence more accurately than traditional neural networks. This study trained an RNN/LSTM network on three different translations of the 150 biblical Psalms, in both English and Korean. The resulting model is then given an input word and a target length, from which it automatically generates a new Psalm of the desired length based on the patterns it learned during training. The results of training the network on English text and on Korean text are compared and discussed.

  • PDF
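For readers unfamiliar with LSTM blocks, the following is a minimal sketch of a single LSTM step in its standard formulation. It uses one unit with scalar input and state so that it runs without any library; the function names and the weights in `params` are hypothetical and are not taken from the study's model.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h_prev, c_prev, params):
    """One step of a single-unit LSTM block with scalar input and state.

    params maps each gate name ("i", "f", "o", "g") to a tuple
    (input weight, recurrent weight, bias). All values are illustrative.
    """
    def pre(gate):
        w_x, w_h, b = params[gate]
        return w_x * x + w_h * h_prev + b

    i = sigmoid(pre("i"))    # input gate: how much new information to store
    f = sigmoid(pre("f"))    # forget gate: how much old cell state to keep
    o = sigmoid(pre("o"))    # output gate: how much cell state to expose
    g = math.tanh(pre("g"))  # candidate value for the cell state
    c = f * c_prev + i * g   # updated cell state
    h = o * math.tanh(c)     # updated hidden state
    return h, c


if __name__ == "__main__":
    # Hypothetical weights; a trained model would learn these.
    params = {"i": (0.5, 0.1, 0.0), "f": (0.3, 0.2, 1.0),
              "o": (0.4, 0.1, 0.0), "g": (0.9, 0.3, 0.0)}
    h, c = 0.0, 0.0
    for x in [1.0, 0.0, -1.0]:  # a toy input sequence
        h, c = lstm_step(x, h, c, params)
        print(h, c)
```

A text generator of the kind described would stack many such units, feed word embeddings as input, and sample the next word from a softmax over the hidden state; those parts are omitted here.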

Memory Reduction Method of Radix-$2^2$ MDF IFFT for OFDM Communication Systems (OFDM 통신시스템을 위한 radix-$2^2$ MDF IFFT의 메모리 감소 기법)

  • Cho, Kyung-Ju
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology
    • /
    • v.13 no.1
    • /
    • pp.42-47
    • /
    • 2020
  • In OFDM-based very high-speed communication systems, the FFT/IFFT processor should have a small area and low power consumption as well as high throughput and low processing latency. Radix-$2^k$ MDF (multipath delay feedback) architectures, which adopt pipelining and parallel processing, are therefore suitable. In an MDF architecture, the feedback memory grows in proportion to the input signal word-length and accounts for a large share of area and power consumption. This paper presents a method for reducing the feedback memory size of a radix-$2^2$ MDF IFFT processor for OFDM applications. The proposed method focuses on reducing the feedback memory in the first two stages of the MDF architecture, since these two stages occupy about 75% of the total feedback memory. In OFDM transmission, the IFFT input signals consist of modulated data, pilot signals, and null signals. To reduce the IFFT input word-length, an integer mapping is proposed that generates mapped data composed of two signed integers corresponding to the modulated data and the pilot/null signals. Simulation shows that the proposed method reduces the feedback memory by up to 39% compared with the conventional approach.
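The roughly 75% figure can be checked with a short calculation. This is a minimal sketch that assumes the textbook delay-feedback layout, in which butterfly stage s of an N-point pipeline stores N/2^s words; the function names and the 16-bit word length are illustrative and do not come from the paper.

```python
def feedback_words(n_points):
    """Feedback memory depth (in words) of each stage of a delay-feedback pipeline.

    Assumes the textbook layout: stage s (1-based) stores n_points / 2**s words.
    n_points must be a power of two.
    """
    stages = n_points.bit_length() - 1  # log2(n_points) for a power of two
    return [n_points >> s for s in range(1, stages + 1)]


def first_two_stage_share(n_points):
    """Fraction of the total feedback memory held by the first two stages."""
    words = feedback_words(n_points)
    return sum(words[:2]) / sum(words)


if __name__ == "__main__":
    n = 1024
    word_length = 16  # bits per stored word, hypothetical
    words = feedback_words(n)
    print("words per stage:", words)
    print("total bits:", sum(words) * word_length)
    print("share of first two stages: %.3f" % first_two_stage_share(n))
```

Under this assumption the first two stages hold 3N/4 of the N-1 stored words, and because memory in bits is the word count times the word length, a shorter word length in those stages removes most of the cost. Splitting each delay line across parallel paths would scale every stage equally and leave the ratio unchanged.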

Comparison of System Call Sequence Embedding Approaches for Anomaly Detection (이상 탐지를 위한 시스템콜 시퀀스 임베딩 접근 방식 비교)

  • Lee, Keun-Seop;Park, Kyungseon;Kim, Kangseok
    • Journal of Convergence for Information Technology
    • /
    • v.12 no.2
    • /
    • pp.47-53
    • /
    • 2022
  • Recently, as the intelligent security paradigm has shifted, research on applying the varied information generated by information security systems to AI-based anomaly detection has been increasing. In this study, in order to convert log-like time series data into numerical feature vectors, the published ADFA system call data were transformed using the CBOW and Skip-gram inference methods of the deep learning-based Word2Vec model and a statistical method based on co-occurrence frequency. Experiments were carried out by converting the data into various embedding vectors while varying the vector dimension, the sequence length, and the window size. In addition, the performance of the embedding methods and the resulting detection performance were compared and evaluated with a GRU-based anomaly detection model that takes the vectors generated by the embedding model as input. Compared with the statistical model, Skip-gram was confirmed to maintain more stable performance without being biased toward a specific window size or sequence length, and to be more effective at turning each event of the sequence data into an embedding vector.
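To make the role of the window size concrete, here is a minimal sketch of how a system call sequence could be turned into Skip-gram training pairs and into co-occurrence counts. The function names and the sample trace are hypothetical; the study's actual preprocessing is not described in the abstract.

```python
from collections import Counter


def skipgram_pairs(sequence, window):
    """Return (center, context) pairs for every position in the sequence."""
    pairs = []
    for i, center in enumerate(sequence):
        lo = max(0, i - window)
        hi = min(len(sequence), i + window + 1)
        for j in range(lo, hi):
            if j != i:  # a token is not its own context
                pairs.append((center, sequence[j]))
    return pairs


def cooccurrence_counts(sequence, window):
    """Count how often each (center, context) pair occurs within the window."""
    return Counter(skipgram_pairs(sequence, window))


if __name__ == "__main__":
    trace = [6, 6, 63, 6, 42, 120, 6]  # made-up system call numbers
    print(skipgram_pairs(trace, window=1))
    print(cooccurrence_counts(trace, window=2).most_common(3))
```

The pairs would feed a Skip-gram model as (input, target) examples, while the counts are the raw material for a frequency-based embedding; a larger window yields more pairs per event in both cases.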

A Study on the Input Buffer for Efficient Processing of MPEG Audio Bitstream (MPEG Audio 비트스트림의 효율적 처리를 위한 입력 버퍼에 관한 연구)

  • 임성룡;공진흥
    • Proceedings of the IEEK Conference
    • /
    • 2000.06b
    • /
    • pp.181-184
    • /
    • 2000
  • In this paper, we describe the design of an input buffer system that efficiently handles an MPEG audio bitstream by demultiplexing its header, side information, and audio data. To overcome the limitations of fixed-word manipulation in bitstream demuxing, we propose a new variable-length bit retrieval system consisting of an FSM sequencer that supports the MPEG audio frame format, a serial buffer that demuxes the audio stream, and a FIFO circular buffer that holds the header and side information.

  • PDF
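Variable-length bit retrieval can be illustrated in software. This is a minimal sketch of a reader that returns fields of arbitrary bit width from a byte buffer; the `BitReader` class is hypothetical and does not model the FSM sequencer or the circular buffer of the proposed hardware. The field widths used in the example are those of the MPEG-1 audio frame header.

```python
class BitReader:
    """Reads fields of arbitrary bit width from a byte buffer, MSB first."""

    def __init__(self, data):
        self.data = data
        self.pos = 0  # current read position, in bits

    def read(self, nbits):
        """Return the next nbits as an unsigned integer and advance."""
        if self.pos + nbits > len(self.data) * 8:
            raise EOFError("not enough bits left in buffer")
        value = 0
        for _ in range(nbits):
            byte = self.data[self.pos >> 3]           # byte holding the bit
            bit = (byte >> (7 - (self.pos & 7))) & 1  # MSB-first within a byte
            value = (value << 1) | bit
            self.pos += 1
        return value


if __name__ == "__main__":
    reader = BitReader(bytes([0xFF, 0xFB, 0x90, 0x00]))
    syncword = reader.read(12)      # 12-bit syncword, all ones
    version_id = reader.read(1)
    layer = reader.read(2)
    protection = reader.read(1)
    bitrate_index = reader.read(4)
    print(hex(syncword), version_id, layer, protection, bitrate_index)
```

Because header fields are 1, 2, 4, or 12 bits wide, a reader tied to a fixed word size would need extra shifting and masking at every field boundary, which is the limitation the abstract refers to.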

An Enhanced Feature Selection Method Based on the Impurity of Words Considering Unbalanced Distribution of Documents (문서의 불균등 분포를 고려한 단어 불순도 기반 특징 선택 방법)

  • Kang, Jin-Beom;Yang, Jae-Young;Choi, Joong-Min
    • Journal of KIISE:Software and Applications
    • /
    • v.34 no.9
    • /
    • pp.804-816
    • /
    • 2007
  • Sample training data for machine learning often contain irrelevant information or redundant concepts, and the original data may also include noise. If the information collected for constructing a learning model is not reliable, it is difficult to obtain accurate information. The system therefore attempts to find relations or rules between features and categories in the learning phase. Feature selection removes irrelevant or redundant information before the learning model is constructed in order to improve its performance. Existing feature selection methods assume that the distribution of documents is balanced in terms of the number of documents in each class and the length of each document. In practice, however, it is difficult not only to prepare a set of documents of almost equal length, but also to define classes with a fixed number of documents. In this paper, we propose a new feature selection method that considers the impurity of words and the unbalanced distribution of documents across categories. Feature candidates are obtained using the word impurity, and the final features are selected by taking the unbalanced distribution of documents into account. Experiments demonstrate that our method performs better than existing methods.
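The abstract does not define its impurity measure, so the following is only a sketch of one common choice, Gini impurity, computed over the classes of the documents that contain each word. The function name and the toy corpus are hypothetical, and this is not the paper's method.

```python
from collections import Counter, defaultdict


def word_gini_impurity(documents):
    """Gini impurity of each word's class distribution.

    documents: list of (label, list_of_words) pairs.
    A word that occurs in only one class scores 0.0 (pure); a word spread
    evenly over the classes scores close to 1 - 1/num_classes.
    """
    class_counts = defaultdict(Counter)
    for label, words in documents:
        for word in set(words):  # count documents, not repeated occurrences
            class_counts[word][label] += 1

    impurity = {}
    for word, counts in class_counts.items():
        total = sum(counts.values())
        impurity[word] = 1.0 - sum((c / total) ** 2 for c in counts.values())
    return impurity


if __name__ == "__main__":
    corpus = [
        ("sports", ["goal", "the", "match"]),
        ("sports", ["goal", "team"]),
        ("politics", ["vote", "the"]),
    ]
    for word, score in sorted(word_gini_impurity(corpus).items()):
        print(word, round(score, 3))
```

Low-impurity words are natural feature candidates. With unbalanced classes the raw document counts favor the larger class, so one option would be to divide each count by the size of its class before computing the impurity.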

Finding approximate occurrence of a pattern that contains gaps by the bit-vector approach

  • Lee, In-Bok;Park, Kun-Soo
    • Proceedings of the Korean Society for Bioinformatics Conference
    • /
    • 2003.10a
    • /
    • pp.193-199
    • /
    • 2003
  • Applications of finding occurrences of a pattern that contains gaps include information retrieval, data mining, and computational biology. Because biological sequences may contain errors, it is important to find not only the exact occurrences of a pattern but also approximate ones. In this paper we present an $O(mnk_{\max}/w)$ time algorithm for the approximate gapped pattern matching problem, where $m$ is the length of the text, $n$ is the length of the pattern, $w$ is the word size of the target machine, and $k_{\max}$ is the greatest error bound for subpatterns.

  • PDF
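The division by the machine word size in the time bound comes from bit-parallelism: one word-sized operation updates many pattern positions at once. The sketch below shows the classic Shift-And algorithm for exact matching as an illustration of that idea only; it handles neither gaps nor errors and is not the algorithm of the paper. The function name is hypothetical.

```python
def shift_and_search(text, pattern):
    """Return the start indices of exact occurrences of pattern in text.

    Bit i of `state` is set when pattern[0..i] matches the text ending at the
    current position, so all pattern prefixes are tracked in one integer.
    """
    plen = len(pattern)
    if plen == 0:
        return []

    # masks[ch] has bit i set wherever pattern[i] == ch.
    masks = {}
    for i, ch in enumerate(pattern):
        masks[ch] = masks.get(ch, 0) | (1 << i)

    accept = 1 << (plen - 1)  # bit that signals a full match
    state = 0
    hits = []
    for pos, ch in enumerate(text):
        # Extend every partial match by one character and start a new one.
        state = ((state << 1) | 1) & masks.get(ch, 0)
        if state & accept:
            hits.append(pos - plen + 1)
    return hits


if __name__ == "__main__":
    print(shift_and_search("GATTACAGATTACA", "ATTA"))
```

Python integers are unbounded, so the state always fits in a single value here; on real hardware a pattern longer than the word size needs several words per update, which is where a factor proportional to pattern length over word size arises.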

Early Vocalization and Phonological Developments of Typically Developing Children: A longitudinal study (일반 영유아의 초기 발성과 음운 발달에 관한 종단 연구)

  • Ha, Seunghee;Park, Bora
    • Phonetics and Speech Sciences
    • /
    • v.7 no.2
    • /
    • pp.63-73
    • /
    • 2015
  • This longitudinal study investigated the early vocalization and phonological development of typically developing children. Ten typically developing children participated in the study from 9 to 18 months of age. Spontaneous utterance samples were collected at 9, 12, 15, and 18 months of age, then phonetically transcribed and analyzed. The samples were classified into 5 levels using the Stark Assessment of Early Vocal Development-Revised (SAEVD-R). The data analysis focused on vocalizations classified as levels 4 and 5 by the SAEVD-R and on word productions. The percentage of each vocalization level, vocalization length, syllable structures, and consonant inventory were obtained. The results showed that the percentages of level 4 and 5 vocalizations and of words increased significantly with age, and that the production of syllable structures containing consonants increased significantly around 12 and 15 months of age. On average, the children produced 4 types of syllable structure and 5.4 consonants at 9 months, and 5 types of syllable structure and 9.8 consonants at 18 months. The phonological development patterns in this study were consistent with those found in previous studies that analyzed children's meaningful utterances. The results support the view that babbling and early speech are continuous. This study has clinical implications for early identification and speech-language intervention for young children who have speech delays or are at risk of them.

Parameterized FFT/IFFT Core Generator for OFDM Modulation/Demodulation (OFDM 변복조를 위한 파라메터화된 FFT/IFFT 코어 생성기)

  • Lee, J.W.;Kim, J.H.;Shin, K.W.;Baek, Y.S.;Eo, I.S.
    • Proceedings of the IEEK Conference
    • /
    • 2005.11a
    • /
    • pp.659-662
    • /
    • 2005
  • A parameterized FFT/IFFT core generator (PFFT_CoreGen) is designed that can be used as an essential IP (intellectual property) block in various OFDM modem designs. PFFT_CoreGen generates Verilog-HDL models of FFT cores ranging from 64 to 2048 points. To optimize the performance of the generated FFT cores, the word-lengths of the input data, internal data, and twiddle factors can each be selected in the range of 8 to 24 bits. Low-power design techniques are considered from the algorithm level down to the circuit level.

  • PDF
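The effect of the twiddle-factor word-length can be estimated in software before a core is generated. This is a minimal sketch that assumes signed fixed-point values with one sign bit and saturation at the positive limit; the function names are hypothetical and are not part of PFFT_CoreGen.

```python
import cmath
import math


def quantize(value, word_length):
    """Round a value in [-1, 1] to signed fixed point with word_length bits."""
    scale = 1 << (word_length - 1)
    q = round(value * scale)
    q = max(-scale, min(scale - 1, q))  # saturate to the representable range
    return q / scale


def max_twiddle_error(n_points, word_length):
    """Largest magnitude error over the twiddle factors exp(-2*pi*j*k/N)."""
    worst = 0.0
    for k in range(n_points // 2):
        w = cmath.exp(-2j * math.pi * k / n_points)
        wq = complex(quantize(w.real, word_length),
                     quantize(w.imag, word_length))
        worst = max(worst, abs(w - wq))
    return worst


if __name__ == "__main__":
    for bits in (8, 12, 16, 24):
        print(bits, max_twiddle_error(2048, bits))
```

Each extra bit roughly halves the quantization error, so a sweep of this kind is one way to pick the shortest word-length that still meets an accuracy target.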

Research on Personalized Course Recommendation Algorithm Based on Att-CIN-DNN under Online Education Cloud Platform

  • Xiaoqiang Liu;Feng Hou
    • Journal of Information Processing Systems
    • /
    • v.20 no.3
    • /
    • pp.360-374
    • /
    • 2024
  • A personalized course recommendation algorithm based on deep learning in an online education cloud platform is proposed to address the challenges associated with effective information extraction and insufficient feature extraction. First, the user potential preferences are obtained through the course summary, course review information, user course history, and other data. Second, by embedding, the word vector is turned into a low-dimensional and dense real-valued vector, which is then fed into the compressed interaction network-deep neural network model. Finally, considering that learners and different interactive courses play different roles in the final recommendation and prediction results, an attention mechanism is introduced. The accuracy, recall rate, and F1 value of the proposed method are 0.851, 0.856, and 0.853, respectively, when the length of the recommendation list K is 35. Consequently, the proposed strategy outperforms the comparison model in terms of recommending customized course resources.
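The three reported values can be cross-checked against the definition of the F1 score. This is a minimal sketch that assumes the reported "accuracy" denotes precision over the top-K list, which the abstract does not state explicitly; the function names and the sample lists are hypothetical.

```python
def precision_recall_at_k(recommended, relevant, k):
    """Precision and recall of the first k recommended items (k must be > 0)."""
    top_k = recommended[:k]
    hits = sum(1 for item in top_k if item in relevant)
    precision = hits / k
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Toy example with made-up course identifiers.
    p, r = precision_recall_at_k(["c1", "c2", "c3", "c4"], {"c1", "c3", "c9"}, k=2)
    print(p, r, f1_score(p, r))

    # Values reported in the abstract for K = 35.
    print(round(f1_score(0.851, 0.856), 3))
```

Under that assumption the harmonic mean of 0.851 and 0.856 rounds to 0.853, which agrees with the reported F1 value.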

Design and Implementation of Open-Loop Clock Recovery Circuit for 39.8 Gb/s and 42.8 Gb/s Dual-Mode Operation

  • Lim, Sang-Kyu;Cho, Hyun-Woo;Shin, Jong-Yoon;Ko, Je-Soo
    • ETRI Journal
    • /
    • v.30 no.2
    • /
    • pp.268-274
    • /
    • 2008
  • This paper proposes an open-loop clock recovery circuit (CRC) using two high-Q dielectric resonator (DR) filters for 39.8 Gb/s and 42.8 Gb/s dual-mode operation. The DR filters are fabricated to obtain high Q-values of approximately 950 at the 40 GHz band and to suppress spurious resonant modes up to 45 GHz. The CRC is implemented in a compact module by integrating the DR filters with other circuits in the CRC. The peak-to-peak and RMS jitter values of the clock signals recovered from 39.8 Gb/s and 42.8 Gb/s pseudo-random binary sequence (PRBS) data with a word length of $2^{31}-1$ are less than 2.0 ps and 0.3 ps, respectively. The peak-to-peak amplitudes of the recovered clocks are quite stable and within the range of 2.5 V to 2.7 V, even when the input data signals vary from 150 mV to 500 mV. Error-free operation of the 40 Gb/s-class optical receiver with the dual-mode CRC is confirmed at both 39.8 Gb/s and 42.8 Gb/s data rates.

  • PDF
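A PRBS with a word length of $2^{31}-1$ is typically produced by a 31-bit linear feedback shift register. This is a minimal sketch that assumes the commonly used generator polynomial x^31 + x^28 + 1; the abstract does not say which polynomial or test equipment was used, and the function name is hypothetical.

```python
def prbs(order, tap, nbits, seed=1):
    """Yield nbits from a Fibonacci LFSR for the polynomial x^order + x^tap + 1.

    The register holds `order` bits; the new bit is the XOR of the bits at
    positions `order` and `tap` (counted from 1), and it is also the output.
    """
    mask = (1 << order) - 1
    state = seed & mask
    if state == 0:
        raise ValueError("seed must be non-zero")  # all-zero state never changes
    for _ in range(nbits):
        new_bit = ((state >> (order - 1)) ^ (state >> (tap - 1))) & 1
        state = ((state << 1) | new_bit) & mask
        yield new_bit


if __name__ == "__main__":
    # PRBS7 (x^7 + x^6 + 1) is short enough to verify the period directly.
    bits = list(prbs(7, 6, 254))
    print("PRBS7 repeats after 127 bits:", bits[:127] == bits[127:])

    # First bits of the PRBS31 pattern; its full period is 2**31 - 1 bits.
    print(list(prbs(31, 28, 16)))
```

The long period matters for clock recovery tests because the pattern contains long runs of identical bits, during which the recovery circuit receives no transitions to lock onto.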