• Title/Summary/Keyword: Embedding method

A Discourse-based Compositional Approach to Overcome Drawbacks of Sequence-based Composition in Text Modeling via Neural Networks (신경망 기반 텍스트 모델링에 있어 순차적 결합 방법의 한계점과 이를 극복하기 위한 담화 기반의 결합 방법)

  • Lee, Kangwook;Han, Sanggyu;Myaeng, Sung-Hyon
    • KIISE Transactions on Computing Practices / v.23 no.12 / pp.698-702 / 2017
  • Since the introduction of deep neural networks to natural language processing, two major approaches to modeling text have been considered. The first learns embeddings, i.e., distributed representations that capture the abstract semantics of words or sentences, from their textual context. The second composes the embeddings learned by the first approach to obtain embeddings of longer texts. However, most studies of composition methods simply adopt word embeddings without considering the optimal embedding unit or the optimal method of composition. In this paper, we conduct experiments to determine the optimal embedding unit and the optimal composition method for modeling longer texts, such as documents. In addition, we propose a new discourse-based composition method to overcome the limitations of sequential composition when composing sentence embeddings.

A Watermarking Method Based on the Informed Coding and Embedding Using Trellis Code and Entropy Masking (Trellis 부호 및 엔트로피 마스킹을 이용한 정보부호화 기반 워터마킹)

  • Lee, Jeong-Hwan
    • Journal of the Korea Institute of Information and Communication Engineering / v.13 no.12 / pp.2677-2684 / 2009
  • In this paper, we study a watermarking method based on informed coding and embedding by means of a trellis code and entropy masking. An image is divided into non-overlapping $8{\times}8$ blocks, and the discrete cosine transform (DCT) is applied to each block. The 16 medium-frequency AC terms of each block are then extracted and compared with Gaussian random vectors having zero mean and unit variance. Through this processing, the embedding vectors that minimize a linear combination of the linear correlation and the Watson distance are obtained by the Viterbi algorithm at each stage of trellis coding. To account for the image characteristics, we use entropy masking to apply different weights to the linear correlation and the Watson distance. To evaluate the performance of the proposed method, the average bit error rate of the watermark message is calculated over several different images. The experiments show that the proposed method improves the average bit error rate.

Fragile Watermarking Scheme Based on Wavelet Edge Features

  • Vaishnavi, D.;Subashini, T.S.
    • Journal of Electrical Engineering and Technology / v.10 no.5 / pp.2149-2154 / 2015
  • This paper proposes a novel watermarking method to detect and localize tampering in digital images. The image used to generate the watermark is first wavelet decomposed, and edge features are retrieved from the high-frequency sub-bands to generate a watermark (the Edge Feature Image), which is then embedded in the cover image. Before the watermark is embedded, the pixels of the cover image are scrambled with the Arnold transform, which improves the security of the watermark. The generated edge feature image is embedded only in the least significant bit (LSB) of the cover image. The invisibility and robustness of the proposed method are measured using the peak signal-to-noise ratio (PSNR) and normalized correlation (NC); the results show that the method performs well and detects and localizes tampering efficiently. A comparison of invisibility with an existing method shows that the proposed method is better.
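
The scramble-then-embed step described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes a square grayscale image stored as a list of rows, uses the common form of the Arnold cat map, (x, y) -> (x + y, x + 2y) mod N, and assumes the scrambling is undone after embedding. The function names and the `rounds` parameter are hypothetical, and the wavelet edge-feature watermark is replaced by an arbitrary bit plane.

```python
def arnold(img, rounds=1):
    """Scramble a square N x N image with the Arnold cat map (assumed form)."""
    n = len(img)
    for _ in range(rounds):
        out = [[0] * n for _ in range(n)]
        for x in range(n):
            for y in range(n):
                # Pixel (x, y) moves to (x + y, x + 2y) mod N.
                out[(x + y) % n][(x + 2 * y) % n] = img[x][y]
        img = out
    return img


def arnold_inverse(img, rounds=1):
    """Undo `rounds` applications of arnold()."""
    n = len(img)
    for _ in range(rounds):
        out = [[0] * n for _ in range(n)]
        for x in range(n):
            for y in range(n):
                # Read each pixel back from where the forward map sent it.
                out[x][y] = img[(x + y) % n][(x + 2 * y) % n]
        img = out
    return img


def embed_lsb(pixels, bits):
    """Overwrite the least significant bit of every pixel with a watermark bit."""
    return [[(p & ~1) | b for p, b in zip(prow, brow)]
            for prow, brow in zip(pixels, bits)]


def extract_lsb(pixels):
    """Read the least significant bit plane."""
    return [[p & 1 for p in row] for row in pixels]


def embed_watermark(cover, bits, rounds=1):
    """Scramble the cover, embed in the LSB plane, then unscramble (assumed order)."""
    return arnold_inverse(embed_lsb(arnold(cover, rounds), bits), rounds)


def extract_watermark(image, rounds=1):
    """Re-scramble the image and read back the LSB plane."""
    return extract_lsb(arnold(image, rounds))
```

Because only the LSB plane changes, each pixel differs from the original by at most one gray level. In a fragile scheme, tampering would be flagged wherever the extracted bits disagree with the expected watermark.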

Reversible Binary Image Watermarking Method Using Overlapping Pattern Substitution

  • Dong, Keming;Kim, Hyoung Joong;Choi, Yong Soo;Joo, Sang Hyun;Chung, Byung Ho
    • ETRI Journal / v.37 no.5 / pp.990-1000 / 2015
  • This paper presents an overlapping pattern substitution (PS) method. The original overlapping PS method as a reversible data hiding scheme works well with only four pattern pairs among fifteen possible such pairs. This paper generalizes the original PS method so that it will work well with an optimal pair from among the fifteen possible pattern pairs. To implement such an overlapping PS method, changeable and embeddable patterns are first defined. A class map is virtually constructed to identify the changeable and embeddable pairs. The run-lengths between consecutive least probable patterns are recorded. Experiments show that an implementation of our overlapping PS method works well with any possible type of pairs. Comparison results show that the proposed method achieves more embedding capacity, a higher PSNR value, and less human visual distortion for a given embedding payload.

Word-Level Embedding to Improve Performance of Representative Spatio-temporal Document Classification

  • Byoungwook Kim;Hong-Jun Jang
    • Journal of Information Processing Systems / v.19 no.6 / pp.830-841 / 2023
  • Tokenization is the process of segmenting input text into smaller units, and it is a preprocessing task performed mainly to improve the efficiency of the machine learning process. Various tokenization methods have been proposed for natural language processing, but studies have primarily focused on segmenting text efficiently. Few studies have explored which tokenization methods suit document classification tasks in Korean. In this paper, an exploratory study was performed to find the most suitable tokenization method for improving the performance of a representative spatio-temporal document classifier in Korean. A convolutional neural network model was used for the experiment, and for the final performance comparison, document classification tasks were selected whose performance depends largely on the tokenization method. The commonly used Jamo, character, and word units were adopted as the tokenization methods for the comparative experiments. The experiment confirmed that word-unit tokenization showed excellent performance on the representative spatio-temporal document classification task, where the semantic embedding ability of the token itself is important.
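
To make the three tokenization units concrete, here is a small sketch of Jamo, character, and word tokenization for Korean text. It is only an illustration under simple assumptions: words are split on whitespace, and Jamo are obtained through Unicode NFD normalization, which decomposes each Hangul syllable into its conjoining consonant and vowel letters. The function names are hypothetical, and the paper's actual preprocessing may differ.

```python
import unicodedata


def word_tokens(text):
    """Word units: split on whitespace (a simplifying assumption)."""
    return text.split()


def char_tokens(text):
    """Character units: one token per non-space character (Hangul syllable)."""
    return [c for c in text if not c.isspace()]


def jamo_tokens(text):
    """Jamo units: decompose each Hangul syllable into its letters via NFD."""
    decomposed = unicodedata.normalize("NFD", text)
    return [c for c in decomposed if not c.isspace()]


sample = "서울 날씨"  # hypothetical sample text ("Seoul weather")
for tokenizer in (word_tokens, char_tokens, jamo_tokens):
    print(tokenizer.__name__, len(tokenizer(sample)))
```

The finer the unit, the longer the token sequence and the less meaning each token carries on its own, which is consistent with the abstract's observation that word units help when the semantics of individual tokens matter.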

Comparison of System Call Sequence Embedding Approaches for Anomaly Detection (이상 탐지를 위한 시스템콜 시퀀스 임베딩 접근 방식 비교)

  • Lee, Keun-Seop;Park, Kyungseon;Kim, Kangseok
    • Journal of Convergence for Information Technology / v.12 no.2 / pp.47-53 / 2022
  • Recently, with the shift toward an intelligent security paradigm, a growing number of studies apply the information generated by various information security systems to AI-based anomaly detection. In this study, to convert log-like time series data into numerical feature vectors, the published ADFA system call data were transformed using the CBOW and Skip-gram inference methods of the deep learning-based Word2Vec model and a statistical method based on co-occurrence frequency. Experiments were carried out by generating various embedding vectors while varying the vector dimension, the sequence length, and the window size. In addition, the embedding methods and their detection performance were compared and evaluated with a GRU-based anomaly detection model that takes the vectors generated by the embedding model as input. Compared with the statistical model, Skip-gram was confirmed to maintain more stable performance without being biased toward a specific window size or sequence length, and to be more effective at turning each event of the sequence data into an embedding vector.

A New Function Embedding Method for the Multiple-Controlled Unitary Gate based on Literal Switch (리터럴 스위치에 의한 다중제어 유니터리 게이트의 새로운 함수 임베딩 방법)

  • Park, Dong-Young
    • The Journal of the Korea institute of electronic communication sciences / v.12 no.1 / pp.101-108 / 2017
  • When the radix is r, the number of control state vectors is n, and the number of target state vectors is one, the quantum gate matrix has dimension $r^{n+1}{\times}r^{n+1}$, so the matrix dimension increases exponentially with n. If the number of control state vectors is $2^n$, then $2^n-1$ unit matrix operations pass the input to the output unchanged, and only one performs the unitary operation on the target state vector. Therefore, this paper proposes a new function embedding method that uses an arithmetic power switch of the unitary gate to replace the $2^n-1$ unit matrix operations, which contribute deterministically to the matrix dimension. The proposed function embedding method uses a binary literal switch with a multivalued threshold, so that a general-purpose hybrid MCU gate can be realized with an $r{\times}r$ unitary matrix.
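
The dimension growth that motivates this paper can be seen by writing out the conventional multiple-controlled gate as a block-diagonal matrix. The sketch below shows that baseline only, not the proposed literal-switch method. It assumes the gate fires when every control holds the value r - 1, so the unitary occupies the last diagonal block and all other blocks are identity matrices. The function names are hypothetical.

```python
def identity(size):
    """Return the size x size identity matrix as a list of rows."""
    return [[1 if i == j else 0 for j in range(size)] for i in range(size)]


def controlled_unitary(u, n_controls):
    """Build the r^(n+1) x r^(n+1) matrix of a multiple-controlled gate.

    `u` is an r x r unitary acting on the target. Assumption: the gate is
    active only when all controls equal r - 1 (the last control state).
    """
    r = len(u)
    blocks = r ** n_controls          # number of control states
    dim = r * blocks                  # r^(n+1)
    matrix = [[0] * dim for _ in range(dim)]
    for b in range(blocks):
        block = u if b == blocks - 1 else identity(r)
        for i in range(r):
            for j in range(r):
                matrix[b * r + i][b * r + j] = block[i][j]
    return matrix


# Binary example: a NOT gate with two controls gives the 8 x 8 Toffoli matrix.
NOT = [[0, 1], [1, 0]]
toffoli = controlled_unitary(NOT, 2)
print(len(toffoli), "x", len(toffoli[0]))
```

In the binary case with n controls there are $2^n$ blocks, of which $2^n-1$ are identity blocks that do nothing. These are the operations the abstract proposes to avoid materializing.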

Digital Watermarking Algorithm for Copyright Protection of JPEG Image (JPEG 영상의 저작권 보호를 위한 Digital Watermarking 알고리즘)

  • Park, Eun-Suk;Woo, Jong-Won;Lee, Seok-Hee;Heo, Yoon-Seok;Cho, Ki-Hyung
    • The Transactions of the Korea Information Processing Society / v.7 no.1 / pp.296-305 / 2000
  • In this paper, we propose a method of embedding an encrypted digital watermark in the quantization coefficients when image data are encoded in the JPEG process. The proposed method is as follows. After the DCT coefficients of each block are quantized, we arrange the quantized coefficients into one dimension with a zigzag scan and replace the blocks. We apply the even-odd feature of the frequency of the encrypted watermark to the quantized coefficients in a fixed region of each replaced block to embed it, then restore the blocks to their order prior to replacement and encode them to obtain the compressed image data. The advantages of the proposed method are as follows. We can embed a large amount of information while keeping it as secret as possible by using the block replacement algorithm. We can also control the amount embedded for each use: because the encrypted information is embedded in a selected fixed region of the quantized coefficients, the embedded data can be fixed regardless of the image and the quantization values. We verified the results by experiments and analyzed their efficiency in comparison with a previous study.
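
The zigzag scan and an even-odd embedding rule can be sketched as below. The zigzag order is the standard JPEG one. The embedding rule is an assumption made for illustration: each watermark bit is stored as the parity of one quantized coefficient in a fixed run of zigzag positions, and a coefficient with the wrong parity is moved one step away from zero. The abstract does not spell out its exact even-odd rule, and the block replacement and encryption steps are omitted here. All function names and the `start` parameter are hypothetical.

```python
def zigzag_indices(n=8):
    """Return (row, col) positions of an n x n block in JPEG zigzag order."""
    order = []
    for s in range(2 * n - 1):
        diagonal = [(i, s - i) for i in range(n) if 0 <= s - i < n]
        # Odd anti-diagonals run top-right to bottom-left, even ones the reverse.
        order.extend(diagonal if s % 2 else reversed(diagonal))
    return order


def zigzag_scan(block):
    """Flatten a square block of quantized coefficients into one dimension."""
    return [block[i][j] for i, j in zigzag_indices(len(block))]


def embed_parity(coeffs, bits, start):
    """Force coeffs[start + k] to have the parity of bits[k] (assumed rule)."""
    out = list(coeffs)
    for k, bit in enumerate(bits):
        c = out[start + k]
        if c % 2 != bit:
            c += 1 if c >= 0 else -1  # step away from zero to flip parity
        out[start + k] = c
    return out


def extract_parity(coeffs, count, start):
    """Read back `count` bits from the coefficient parities."""
    return [c % 2 for c in coeffs[start:start + count]]
```

Choosing a fixed region of the zigzag sequence keeps the payload size the same for every block, which matches the abstract's point that the amount of embedded data does not depend on the image or the quantization values.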

Multi-Document Summarization Method of Reviews Using Word Embedding Clustering (워드 임베딩 클러스터링을 활용한 리뷰 다중문서 요약기법)

  • Lee, Pil Won;Hwang, Yun Young;Choi, Jong Seok;Shin, Young Tae
    • KIPS Transactions on Software and Data Engineering / v.10 no.11 / pp.535-540 / 2021
  • A multi-document is a document that covers various topics rather than a single one; a typical example is online reviews. There have been several attempts to summarize online reviews because of the vast amount of information they contain. However, summarizing reviews collectively with existing summarization models loses the various topics that make up the reviews. Therefore, in this paper, we present a method that summarizes reviews with minimal loss of topics. The proposed method classifies reviews through preprocessing, importance evaluation, embedding substitution using BERT, and embedding clustering. The classified sentences are then passed to a trained Transformer summarization model to generate the final summary. The proposed model was evaluated against an existing summarization model, a seq2seq model, using the ROUGE score and cosine similarity, and it produced higher-quality summaries than the existing summarization model.

Locally Linear Embedding for Face Recognition with Simultaneous Diagonalization (얼굴 인식을 위한 연립 대각화와 국부 선형 임베딩)

  • Kim, Eun-Sol;Noh, Yung-Kyun;Zhang, Byoung-Tak
    • Journal of KIISE / v.42 no.2 / pp.235-241 / 2015
  • Locally linear embedding (LLE) [1] is a manifold learning algorithm that preserves the inner product values between high-dimensional data points when embedding them in a low-dimensional space. LLE embeds data points on the same subspace close together in the low-dimensional space because those points have significant inner product values. On the other hand, if data points are orthogonal to each other, they are embedded separately in the low-dimensional space, even if they are close to each other in the high-dimensional space. Meanwhile, it is well known that facial images of the same person under varying illumination lie in a low-dimensional linear subspace [2]. In this study, we suggest an improved LLE method for the face recognition problem. The method exploits the property of LLE that data points orthogonal to each other are embedded completely separately. To accomplish this, the subspaces formed by the classes are forced to be mutually orthogonal by applying the simultaneous diagonalization (SD) technique. Experimental results show that the suggested method dramatically improves the embedding results and classification performance.