Search | Korea Science

Robustness of Differentiable Neural Computer Using Limited Retention Vector-based Memory Deallocation in Language Model

Lee, Donghyun;Park, Hosung;Seo, Soonshin;Son, Hyunsoo;Kim, Gyujin;Kim, Ji-Hwan
- KSII Transactions on Internet and Information Systems (TIIS)
- /
- v.15 no.3
- /
- pp.837-852
- /
- 2021
Recurrent neural network (RNN) architectures have been used for language modeling (LM) tasks that require learning long-range word or character sequences. However, the RNN architecture is still suffered from unstable gradients on long-range sequences. To address the issue of long-range sequences, an attention mechanism has been used, showing state-of-the-art (SOTA) performance in all LM tasks. A differentiable neural computer (DNC) is a deep learning architecture using an attention mechanism. The DNC architecture is a neural network augmented with a content-addressable external memory. However, in the write operation, some information unrelated to the input word remains in memory. Moreover, DNCs have been found to perform poorly with low numbers of weight parameters. Therefore, we propose a robust memory deallocation method using a limited retention vector. The limited retention vector determines whether the network increases or decreases its usage of information in external memory according to a threshold. We experimentally evaluate the robustness of a DNC implementing the proposed approach according to the size of the controller and external memory on the enwik8 LM task. When we decreased the number of weight parameters by 32.47%, the proposed DNC showed a low bits-per-character (BPC) degradation of 4.30%, demonstrating the effectiveness of our approach in language modeling tasks.
https://doi.org/10.3837/tiis.2021.03.002 인용 PDF KSCI HTML

Comparison of System Call Sequence Embedding Approaches for Anomaly Detection (이상 탐지를 위한 시스템콜 시퀀스 임베딩 접근 방식 비교)

Lee, Keun-Seop;Park, Kyungseon;Kim, Kangseok
- Journal of Convergence for Information Technology
- /
- v.12 no.2
- /
- pp.47-53
- /
- 2022
Recently, with the change of the intelligent security paradigm, study to apply various information generated from various information security systems to AI-based anomaly detection is increasing. Therefore, in this study, in order to convert log-like time series data into a vector, which is a numerical feature, the CBOW and Skip-gram inference methods of deep learning-based Word2Vec model and statistical method based on the coincidence frequency were used to transform the published ADFA system call data. In relation to this, an experiment was carried out through conversion into various embedding vectors considering the dimension of vector, the length of sequence, and the window size. In addition, the performance of the embedding methods used as well as the detection performance were compared and evaluated through GRU-based anomaly detection model using vectors generated by the embedding model as an input. Compared to the statistical model, it was confirmed that the Skip-gram maintains more stable performance without biasing a specific window size or sequence length, and is more effective in making each event of sequence data into an embedding vector.
https://doi.org/10.22156/CS4SMB.2022.12.02.047 인용 PDF KSCI

A Morpheme Analyzer based on Transformer using Morpheme Tokens and User Dictionary (사용자 사전과 형태소 토큰을 사용한 트랜스포머 기반 형태소 분석기)

DongHyun Kim;Do-Guk Kim;ChulHui Kim;MyungSun Shin;Young-Duk Seo
- Smart Media Journal
- /
- v.12 no.9
- /
- pp.19-27
- /
- 2023
Since morphemes are the smallest unit of meaning in Korean, it is necessary to develop an accurate morphemes analyzer to improve the performance of the Korean language model. However, most existing analyzers present morpheme analysis results by learning word unit tokens as input values. However, since Korean words are consist of postpositions and affixes that are attached to the root, even if they have the same root, the meaning tends to change due to the postpositions or affixes. Therefore, learning morphemes using word unit tokens can lead to misclassification of postposition or affixes. In this paper, we use morpheme-level tokens to grasp the inherent meaning in Korean sentences and propose a morpheme analyzer based on a sequence generation method using Transformer. In addition, a user dictionary is constructed based on corpus data to solve the out - of-vocabulary problem. During the experiment, the morpheme and morpheme tags printed by each morpheme analyzer were compared with the correct answer data, and the experiment proved that the morpheme analyzer presented in this paper performed better than the existing morpheme analyzer.
https://doi.org/10.30693/SMJ.2023.12.9.19 인용 PDF

The Effect of E-commerce Platform Seller Signals on Revenue: Focusing on the Moderating Effect of Keyword Specificity (e-커머스 플랫폼 판매자 신호가 수익에 미치는 영향: 키워드 구체성의 조절 효과를 중심으로)

Jungwon Lee;Jaehyun You
- Information Systems Review
- /
- v.25 no.2
- /
- pp.103-123
- /
- 2023
One of the valid perspectives in the e-commerce platform literature is the seller signaling strategy in the information asymmetry situation. In this study, a research model was constructed based on signaling theory and shopping goal theory to systematically explore the effects of a seller's signaling strategy on consumer decision-making. Specifically, the study examined whether the signaling effects (i.e., reputation, electronic word-of-mouth, price) provided by the seller differed based on consumers' shopping goals. For the empirical analysis, the Gaussian Copula method was employed, utilizing 26,246 data collected from Amazon, a leading e-commerce platform. The analysis revealed that the signals provided by the seller positively impacted sales, and this effect was moderated by consumers' shopping goals. Drawing on shopping goal theory, this study contributes to signaling theory and e-commerce literature by discovering differences in the effectiveness of a seller's signaling strategy based on the keywords input by consumers.
https://doi.org/10.14329/isr.2023.25.2.103 인용 PDF

Memory Organization for a Fuzzy Controller.

Jee, K.D.S.;Poluzzi, R.;Russo, B.
- Proceedings of the Korean Institute of Intelligent Systems Conference
- /
- 1993.06a
- /
- pp.1041-1043
- /
- 1993
Fuzzy logic based Control Theory has gained much interest in the industrial world, thanks to its ability to formalize and solve in a very natural way many problems that are very difficult to quantify at an analytical level. This paper shows a solution for treating membership function inside hardware circuits. The proposed hardware structure optimizes the memoried size by using particular form of the vectorial representation. The process of memorizing fuzzy sets, i.e. their membership function, has always been one of the more problematic issues for the hardware implementation, due to the quite large memory space that is needed. To simplify such an implementation, it is commonly [1,2,8,9,10,11] used to limit the membership functions either to those having triangular or trapezoidal shape, or pre-definite shape. These kinds of functions are able to cover a large spectrum of applications with a limited usage of memory, since they can be memorized by specifying very few parameters ( ight, base, critical points, etc.). This however results in a loss of computational power due to computation on the medium points. A solution to this problem is obtained by discretizing the universe of discourse U, i.e. by fixing a finite number of points and memorizing the value of the membership functions on such points [3,10,14,15]. Such a solution provides a satisfying computational speed, a very high precision of definitions and gives the users the opportunity to choose membership functions of any shape. However, a significant memory waste can as well be registered. It is indeed possible that for each of the given fuzzy sets many elements of the universe of discourse have a membership value equal to zero. It has also been noticed that almost in all cases common points among fuzzy sets, i.e. points with non null membership values are very few. More specifically, in many applications, for each element u of U, there exists at most three fuzzy sets for which the membership value is ot null [3,5,6,7,12,13]. Our proposal is based on such hypotheses. Moreover, we use a technique that even though it does not restrict the shapes of membership functions, it reduces strongly the computational time for the membership values and optimizes the function memorization. In figure 1 it is represented a term set whose characteristics are common for fuzzy controllers and to which we will refer in the following. The above term set has a universe of discourse with 128 elements (so to have a good resolution), 8 fuzzy sets that describe the term set, 32 levels of discretization for the membership values. Clearly, the number of bits necessary for the given specifications are 5 for 32 truth levels, 3 for 8 membership functions and 7 for 128 levels of resolution. The memory depth is given by the dimension of the universe of the discourse (128 in our case) and it will be represented by the memory rows. The length of a world of memory is defined by: Length = nem (dm(m)＋dm(fm) Where: fm is the maximum number of non null values in every element of the universe of the discourse, dm(m) is the dimension of the values of the membership function m, dm(fm) is the dimension of the word to represent the index of the highest membership function. In our case then Length=24. The memory dimension is therefore 128*24 bits. If we had chosen to memorize all values of the membership functions we would have needed to memorize on each memory row the membership value of each element. Fuzzy sets word dimension is 8*5 bits. Therefore, the dimension of the memory would have been 128*40 bits. Coherently with our hypothesis, in fig. 1 each element of universe of the discourse has a non null membership value on at most three fuzzy sets. Focusing on the elements 32,64,96 of the universe of discourse, they will be memorized as follows: The computation of the rule weights is done by comparing those bits that represent the index of the membership function, with the word of the program memor . The output bus of the Program Memory (μCOD), is given as input a comparator (Combinatory Net). If the index is equal to the bus value then one of the non null weight derives from the rule and it is produced as output, otherwise the output is zero (fig. 2). It is clear, that the memory dimension of the antecedent is in this way reduced since only non null values are memorized. Moreover, the time performance of the system is equivalent to the performance of a system using vectorial memorization of all weights. The dimensioning of the word is influenced by some parameters of the input variable. The most important parameter is the maximum number membership functions (nfm) having a non null value in each element of the universe of discourse. From our study in the field of fuzzy system, we see that typically nfm 3 and there are at most 16 membership function. At any rate, such a value can be increased up to the physical dimensional limit of the antecedent memory. A less important role n the optimization process of the word dimension is played by the number of membership functions defined for each linguistic term. The table below shows the request word dimension as a function of such parameters and compares our proposed method with the method of vectorial memorization[10]. Summing up, the characteristics of our method are: Users are not restricted to membership functions with specific shapes. The number of the fuzzy sets and the resolution of the vertical axis have a very small influence in increasing memory space. Weight computations are done by combinatorial network and therefore the time performance of the system is equivalent to the one of the vectorial method. The number of non null membership values on any element of the universe of discourse is limited. Such a constraint is usually non very restrictive since many controllers obtain a good precision with only three non null weights. The method here briefly described has been adopted by our group in the design of an optimized version of the coprocessor described in [10].
PDF

Korean Dependency Parsing Using Stack-Pointer Networks and Subtree Information (스택-포인터 네트워크와 부분 트리 정보를 이용한 한국어 의존 구문 분석)

Choi, Yong-Seok;Lee, Kong Joo
- KIPS Transactions on Software and Data Engineering
- /
- v.10 no.6
- /
- pp.235-242
- /
- 2021
In this work, we develop a Korean dependency parser based on a stack-pointer network that consists of a pointer network and an internal stack. The parser has an encoder and decoder and builds a dependency tree for an input sentence in a depth-first manner. The encoder of the parser encodes an input sentence, and the decoder selects a child for the word at the top of the stack at each step. Since the parser has the internal stack where a search path is stored, the parser can utilize information of previously derived subtrees when selecting a child node. Previous studies used only a grandparent and the most recently visited sibling without considering a subtree structure. In this paper, we introduce graph attention networks that can represent a previously derived subtree. Then we modify our parser based on the stack-pointer network to utilize subtree information produced by the graph attention networks. After training the dependency parser using Sejong and Everyone's corpus, we evaluate the parser's performance. Experimental results show that the proposed parser achieves better performance than the previous approaches at sentence-level accuracies when adopting 2-depth graph attention networks.
https://doi.org/10.3745/KTSDE.2021.10.6.235 인용 PDF KSCI

Encoding & Decoding of Radix 4 Polar Code (Radix 4 Polar code의 부호 및 복호)

Lee, Moon-Ho;Choi, Eun-Ji;Yang, Jae-Seung;Park, Ju-Yong
- Journal of the Institute of Electronics Engineers of Korea TC
- /
- v.46 no.10
- /
- pp.14-27
- /
- 2009
Polar Code was proposed by Turkish professor Erdal Arikan in 2006 as an idea that splitted input channel is increasing the cutoff rate. The channel polarization consisted of code sequences with symmetric high rate capacity in a given B-DMC(Binary-input Discrete Memoryless Channel) W. The symmetric capacity is the highest rate achievable subject to using the input letters of the channel with equal probability. The channel polarization is said to a set of given N independent outputs of B-DMC W. In other word, N increases when N is a set of binary-input channels {$W^{(i)}_N\;:\;1{\leq}\;i\;{\leq}\;N$}, in I{WN(i)} as the fraction of indices is near to 1, which is approaching to I(W), and it is near to 0, then to 1-I(W), where I(W) presents high rates in reliable wireless communication channel as inputs of W with equal frequences. After all, {WN(i)} is shown to be a state of channel coding. On the based on this Polar codes, this paper analyzes Polar coding and decoding of Arikan and propose Radix4 Polar coding newly.
PDF KSCI

A Study on Development of Patent Information Retrieval Using Textmining (텍스트 마이닝을 이용한 특허정보검색 개발에 관한 연구)

Go, Gwang-Su;Jung, Won-Kyo;Shin, Young-Geun;Park, Sang-Sung;Jang, Dong-Sik
- Journal of the Korea Academia-Industrial cooperation Society
- /
- v.12 no.8
- /
- pp.3677-3688
- /
- 2011
The patent information retrieval system can serve a variety of purposes. In general, the patent information is retrieved using limited key words. To identify earlier technology and priority rights repeated effort is needed. This study proposes a method of content-based retrieval using text mining. Using the proposed algorithm, each of the documents is invested with characteristic value. The characteristic values are used to compare similarities between query documents and database documents. Text analysis is composed of 3 steps: stop-word, keyword analysis and weighted value calculation. In the test results, the general retrieval and the proposed algorithm were compared by using accuracy measurements. As the study arranges the result documents as similarities of the query documents, the surfer can improve the efficiency by reviewing the similar documents first. Also because of being able to input the full-text of patent documents, the users unacquainted with surfing can use it easily and quickly. It can reduce the amount of displayed missing data through the use of content based retrieval instead of keyword based retrieval for extending the scope of the search.
https://doi.org/10.5762/KAIS.2011.12.8.3677 인용 PDF KSCI

A Single-End-Point DTW Algorithm for Keyword Spotting (핵심어 검출을 위한 단일 끝점 DTW알고리즘)

최용선;오상훈;이수영
- Journal of the Institute of Electronics Engineers of Korea SP
- /
- v.41 no.3
- /
- pp.209-219
- /
- 2004
In order to implement a real time hardware for keyword spotting, we propose a Single-End-Point DTW(SEP-DTW) algorithm which is simple and less complex for computation. The SEP-DTW algorithm only needs a single end point which enables efficient applications, and it has a small wont of computations because the global search area is divided into successive local search areas. Also, we adopt new local constraints and a new distance measure for a better performance of the SEP-DTW algorithm. Besides, we make a normalization of feature same vectors so that they have the same variance in each frequency bin, and each frame has the same energy levels. To construct several reference patterns for each keyword, we use a clustering algorithm for all training patterns, and mean vectors in every cluster are taken as reference patterns. In order to detect a key word for input streams of speech, we measure the distances between reference patterns and input pattern, and we make a decision whether the distances are smaller than a pre-defined threshold value. With isolated speech recognition and keyword spotting experiments, we verify that the proposed algorithm has a better performance than other methods.
PDF KSCI

A Study on the Diphone Recognition of Korean Connected Words and Eojeol Reconstruction (한국어 연결단어의 이음소 인식과 어절 형성에 관한 연구)

;Jeong, Hong
- The Journal of the Acoustical Society of Korea
- /
- v.14 no.4
- /
- pp.46-63
- /
- 1995
This thesis described an unlimited vocabulary connected speech recognition system using Time Delay Neural Network(TDNN). The recognition unit is the diphone unit which includes the transition section of two phonemes, and the number of diphone unit is 329. The recognition processing of korean connected speech is composed by three part; the feature extraction section of the input speech signal, the diphone recognition processing and post-processing. In the feature extraction section, the extraction of diphone interval in input speech signal is carried and then the feature vectors of 16th filter-bank coefficients are calculated for each frame in the diphone interval. The diphone recognition processing is comprised by the three stage hierachical structure and is carried using 30 Time Delay Neural Networks. particularly, the structure of TDNN is changed so as to increase the recognition rate. The post-processing section, mis-recognized diphone strings are corrected using the probability of phoneme transition and the probability o phoneme confusion and then the eojeols (Korean word or phrase) are formed by combining the recognized diphones.
PDF

Search Result 227, Processing Time 0.028 seconds

이메일무단수집거부

이용약관

제 1 장 총칙

제 2 장 이용계약의 체결

제 3 장 계약 당사자의 의무

제 4 장 서비스의 이용

제 5 장 계약 해지 및 이용 제한

제 6 장 손해배상 및 기타사항

Detail Search

Image Search (β)