• Title/Summary/Keyword: Data Word length

LSTM Language Model Based Korean Sentence Generation (LSTM 언어모델 기반 한국어 문장 생성)

  • Kim, Yang-hoon;Hwang, Yong-keun;Kang, Tae-gwan;Jung, Kyo-min
    • The Journal of Korean Institute of Communications and Information Sciences / v.41 no.5 / pp.592-601 / 2016
  • The recurrent neural network (RNN) is a deep learning model suited to sequential or variable-length data. Long Short-Term Memory (LSTM) mitigates the vanishing gradient problem of RNNs, so it can maintain long-term dependencies among the elements of an input sequence. In this paper, we propose an LSTM-based language model that predicts the words following a given incomplete sentence in order to generate a complete sentence. To evaluate our method, we trained the model on multiple Korean corpora and then generated the missing parts of incomplete Korean sentences. The results show that our language model was able to generate fluent Korean sentences. We also show that the word-based model generated better sentences than the other settings.
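
The gating idea mentioned in the abstract can be illustrated with a single scalar LSTM step. This is only a minimal sketch of the standard LSTM cell equations, not the model from the paper: the function name `lstm_step`, the scalar state, and the parameter layout are assumptions made for illustration.

```python
import math


def sigmoid(x):
    """Logistic function used by the LSTM gates."""
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h_prev, c_prev, p):
    """One step of a scalar LSTM cell.

    p maps each gate name to a hypothetical (w, u, b) triple:
    input weight, recurrent weight, and bias.
    """
    def pre(gate):
        w, u, b = p[gate]
        return w * x + u * h_prev + b

    i = sigmoid(pre("input"))      # how much new content to write
    f = sigmoid(pre("forget"))     # how much old cell state to keep
    o = sigmoid(pre("output"))     # how much of the cell to expose
    g = math.tanh(pre("cand"))     # candidate content
    c = f * c_prev + i * g         # additive cell-state update
    h = o * math.tanh(c)           # hidden state passed to the next step
    return h, c


# With the forget gate near 1 and the input gate near 0, the cell state
# is carried across many steps almost unchanged.
params = {
    "input": (0.0, 0.0, -20.0),
    "forget": (0.0, 0.0, 20.0),
    "output": (0.0, 0.0, 0.0),
    "cand": (0.0, 0.0, 0.0),
}
h, c = 0.0, 1.0
for _ in range(50):
    h, c = lstm_step(0.3, h, c, params)
```

The additive update `c = f * c_prev + i * g` is the part usually credited with easing the vanishing gradient problem, since the cell state is not repeatedly squashed through a nonlinearity.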

The Performance Evaluation for PHY-LINK Data Transfer using SPI-4.2 (SPI-4.2 프로토콜을 사용한 PHY-LINK 계층간의 데이터 전송 성능평가)

  • 박노식;손승일;최익성;이범철
    • Journal of the Korea Institute of Information and Communication Engineering / v.8 no.3 / pp.577-585 / 2004
  • System Packet Interface Level 4 Phase 2 (SPI-4.2) is an interface for packet and cell transfer between a physical layer (PHY) device and a link layer device, for aggregate bandwidths of OC-192 ATM and Packet over SONET/SDH (POS), as well as 10 Gbps Ethernet applications. In this paper, we study SPI-4.2 and analyze the performance of an SPI-4.2 interface module modeled in the C programming language. The results show that an SPI-4.2 interface module with a FIFO depth of 512 words can accommodate offered loads of up to 97% under random uniform traffic and up to 94% under bursty traffic with a burst length of 32. The SPI-4.2 interface module can suffer performance degradation due to heavy overhead when it receives large numbers of small packets of less than 14 bytes. The SPI-4.2 interface module is suited to line cards in gigabit/terabit routers, optical cross-connect switches, and SONET/SDH-based transmission systems.

A Visual Study of the Quality of English Pronunciation Using the Praat Program (Praat을 활용한 영어발음특성의 시각적 연구)

  • Park, Heesuk
    • Journal of Digital Contents Society / v.14 no.3 / pp.323-331 / 2013
  • This study investigates and compares the diphthongs, words, and sentences produced by two groups of Korean high school students using the Praat program. English words and sentences were uttered and recorded by twenty Korean subjects, ten in each group. All the subjects are female, and their grades range from freshman to sophomore. Acoustic features were measured from sound spectrograms with the Praat software and analyzed statistically. The results showed that the lengths of diphthongs and words differed between the two groups, but the difference was not significant. However, in the length of sentence utterances, the group of grade 5 to 6 students under the current grading system produced longer utterances than the group of grade 1 to 2 students. Especially for the first two sentences with more than five words, the difference was significant. From the data on the overall sum of words for the two subject groups, we found that the differences in the lengths of the words containing diphthongs were not significant, but those in the sentences with more than five words were significant. For the word pair coat and code, the diphthong in coat was shorter than that in code.

The Design of Optimal Filters in Vector-Quantized Subband Codecs (벡터양자화된 부대역 코덱에서 최적필터의 구현)

  • 지인호
    • The Journal of the Acoustical Society of Korea / v.19 no.1 / pp.97-102 / 2000
  • Subband coding divides the signal frequency band into a set of uncorrelated frequency bands by filtering and then encodes each of these subbands using a bit allocation matched to the signal energy in that subband. The subband signals can be coded with waveform encoding techniques such as PCM, DPCM, and vector quantization (VQ) to obtain higher data compression. Most researchers have focused on the error in the quantizer, but not on the overall reconstruction error and its dependence on the filter bank. This paper provides a thorough analysis of subband codecs and further develops optimum filter bank design using a vector quantizer. We compute the mean squared reconstruction error (MSE), which depends on N, the number of entries in each codebook; k, the length of each codeword; and the filter bank coefficients. We form this MSE measure in terms of the equivalent quantization model and find the optimum FIR filter coefficients for each channel in the M-band structure for a given bit rate, filter length, and input signal correlation model. Specific design examples are worked out for a 4-tap filter in a 2-band paraunitary filter bank structure. These optimum paraunitary filter coefficients are obtained by Monte Carlo simulation. We expect that the results of this work will contribute to the study of optimum subband codec design using vector quantizers.
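
To make the roles of N and k concrete, the sketch below quantizes length-k vectors to the nearest of N codewords and measures the resulting per-sample MSE. It covers only the quantizer stage, not the filter bank or the paper's equivalent quantization model, and the function names and toy codebook are hypothetical.

```python
import math


def nearest_codeword(codebook, vector):
    """Return the codeword with the smallest squared Euclidean distance."""
    return min(
        codebook,
        key=lambda cw: sum((a - b) ** 2 for a, b in zip(vector, cw)),
    )


def vq_mse(vectors, codebook):
    """Mean squared error per sample after nearest-codeword quantization."""
    k = len(codebook[0])  # codeword length
    total = 0.0
    for v in vectors:
        cw = nearest_codeword(codebook, v)
        total += sum((a - b) ** 2 for a, b in zip(v, cw))
    return total / (len(vectors) * k)


def bits_per_sample(n_entries, k):
    """Rate of a codebook with n_entries codewords of length k."""
    return math.log2(n_entries) / k


# Toy example: N = 2 codewords of length k = 2.
codebook = [(0.0, 0.0), (1.0, 1.0)]
vectors = [(0.0, 0.0), (1.0, 1.0), (0.2, 0.0)]
mse = vq_mse(vectors, codebook)
rate = bits_per_sample(len(codebook), 2)
```

For a fixed k, a larger N lowers the quantization error at the cost of a higher bit rate, which is why the abstract fixes the bit rate when optimizing the filter coefficients.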

A Study on Regression Class Generation of MLLR Adaptation Using State Level Sharing (상태레벨 공유를 이용한 MLLR 적응화의 회귀클래스 생성에 관한 연구)

  • 오세진;성우창;김광동;노덕규;송민규;정현열
    • The Journal of the Acoustical Society of Korea / v.22 no.8 / pp.727-739 / 2003
  • In this paper, we propose a method of generating regression classes for adaptation in the HM-Net (Hidden Markov Network) system. The MLLR (Maximum Likelihood Linear Regression) adaptation approach is applied to the HM-Net speech recognition system to express speaker characteristics effectively and to allow HM-Net to be used in various tasks. For state-level sharing, the contextual-domain state splitting of the PDT-SSS (Phonetic Decision Tree-based Successive State Splitting) algorithm, which performs clustering in both the contextual and time domains, is adopted. In each contextual-domain state, the desired phoneme classes are determined by splitting the context information (classes) that includes the target speaker's speech data. The number of adaptation parameters, such as means and variances, is controlled automatically by the contextual-domain state splitting of PDT-SSS, depending on the context information and the amount of adaptation utterances from a new speaker. Experiments were performed to verify the effectiveness of the proposed method on the KLE (The center for Korean Language Engineering) 452 data and the YNU (Yeungnam Univ.) 200 data. The experimental results show that the accuracies of phone, word, and sentence recognition increased by 34∼37%, 9%, and 20%, respectively. When performance is compared according to the length of the adaptation utterances, it is also significantly improved even with short adaptation utterances. Therefore, we argue that the proposed regression class method is well suited to the HM-Net speech recognition system employing MLLR speaker adaptation.
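
In MLLR, the Gaussians assigned to one regression class share a single affine transform of their means, mean' = A · mean + b. The sketch below shows only how such transforms would be applied once they are known. Estimating A and b by maximum likelihood, and building the classes with PDT-SSS, are outside its scope, and all names and values are hypothetical.

```python
def mllr_adapt_mean(mean, A, b):
    """Apply the affine MLLR mean transform A @ mean + b using plain lists."""
    return [
        sum(a_ij * m_j for a_ij, m_j in zip(row, mean)) + b_i
        for row, b_i in zip(A, b)
    ]


def adapt_all(means, class_of, transforms):
    """Adapt every Gaussian mean with the transform of its regression class.

    means:      {gaussian_id: mean vector}
    class_of:   {gaussian_id: regression class id}
    transforms: {regression class id: (A, b)}
    """
    return {
        g: mllr_adapt_mean(mu, *transforms[class_of[g]])
        for g, mu in means.items()
    }


# Toy example with two regression classes (illustrative values only).
means = {"g1": [1.0, 2.0], "g2": [0.0, 1.0], "g3": [3.0, 3.0]}
class_of = {"g1": "vowel", "g2": "vowel", "g3": "consonant"}
transforms = {
    "vowel": ([[2.0, 0.0], [0.0, 1.0]], [0.5, -1.0]),
    "consonant": ([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0]),  # identity
}
adapted = adapt_all(means, class_of, transforms)
```

Because every Gaussian in a class shares one transform, fewer classes mean fewer parameters to estimate, which matters when only a small amount of adaptation speech is available.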

The Effect of Health and Environmental Message Framing on Consumer Attitude and WoM: Focused on Vegan Product (건강과 환경 메시지 프레이밍에 따른 소비자 태도와 구전에 미치는 영향: 비건 제품을 중심으로)

  • Park, Seoyoung;Lim, Boram
    • Journal of Service Research and Studies / v.13 no.3 / pp.127-146 / 2023
  • Recently, digital advertising has shifted towards delivering messages through short ads of less than 15 seconds, and on social media, ads need to convey the message within 5 seconds before consumers skip them. Although the length of advertisements has decreased, advancements in artificial intelligence algorithms and big data analysis have made it possible to deliver personalized messages that cater to consumers' interests. In this changing landscape, the importance of delivering tailored messages through short and efficient ads is increasing. In this study, we examined the effects of message framing as part of effective message delivery. Specifically, we examined the differences in the effects of two framings, "health" and "environment," for vegan products. The growing consumer interest in health and the environment has elevated the interest in vegan products, and the vegan market is expanding rapidly. Consumers purchase vegan products not only for personal health benefits but also due to their ethical responsibility towards the environment, which can be considered ethical consumption. Previous research has not shown the differences in the effects between health and environment message framings, and the research has been limited to vegan food products. This study investigates the differences in the effects of health and environment message framings using a dish soap product category. By identifying which advertising messages, either health or environment, are more effective in promoting vegan products, this study provides insights for companies to enhance their message framing strategies effectively.

Impact Analysis of Tributaries and Simulation of Water Pollution Accident Scenarios in the Water Source Section of Han River Using 3-D Hydrodynamic Model (3차원 수리모델을 이용한 한강 상수원구간 지류영향 분석 및 수질오염사고 시나리오 모의)

  • Kim, Eunjung;Park, Changmin;Na, Mijeong;Park, Hyeon;Kim, Bogsoon
    • Journal of Korean Society on Water Environment / v.34 no.4 / pp.363-374 / 2018
  • The Han River serves as an important water resource for the city of Seoul, Korea, and the neighboring metropolitan areas. This study reviewed the section from the Paldang Dam to the Jamsil submerged weir, where four water intake stations serving the Seoul metropolitan population are located. Water quality in this section must therefore be monitored, analyzed, and reviewed to rule out any safety concerns. In this study, a 3-D hydrodynamic model, EFDC (Environmental Fluid Dynamics Code), was applied to the reach downstream of the Paldang Dam in the Han River, which is about 23 km in length, to determine issues related to water resource management. The 3-D grid was composed of 2,168 horizontal grid cells and three vertical layers. The hydrodynamic model was calibrated and verified with observed daily average water surface elevation, water temperature, and flow rate data for 3 years (2013~2015). The developed EFDC model reproduced the hydrodynamics of the Han River well. The composition ratios of the incoming flows at the intake stations over the 3 years and their flow patterns in the river were analyzed using the validated model. The flow of the Wangsuk Stream was found to depend on the Paldang Dam discharge, and the composition ratios of the stream at the intake stations changed accordingly. The Wangsuk Stream moved mainly along the right bank of the Han River under normal dam flow conditions. When the dam discharge rate was low, lateral mixing was often seen. Scenario analyses were also conducted to predict the transport of conservative pollutants, as in the case of a chemical spill accident. For each scenario, the arrival time and concentration of pollutants at each intake station were predicted.

The Effects of Sentiment and Readability on Useful Votes for Customer Reviews with Count Type Review Usefulness Index (온라인 리뷰의 감성과 독해 용이성이 리뷰 유용성에 미치는 영향: 가산형 리뷰 유용성 정보 활용)

  • Cruz, Ruth Angelie;Lee, Hong Joo
    • Journal of Intelligence and Information Systems / v.22 no.1 / pp.43-61 / 2016
  • Customer reviews help potential customers make purchasing decisions. However, the prevalence of reviews on websites pushes the customer to sift through them and changes the focus from a mere search to identifying which of the available reviews are valuable and useful for the purchasing decision at hand. To identify useful reviews, websites have developed different mechanisms to give customers options when evaluating existing reviews. Websites allow users to rate the usefulness of a customer review as helpful or not. Amazon.com uses a ratio-type helpfulness index, while Yelp.com uses a count-type usefulness index. This usefulness index provides helpful reviews to future potential purchasers. This study investigated the effects of sentiment and readability on useful votes for customer reviews. Similar studies on the relationship between sentiment and readability have focused on the ratio-type usefulness index utilized by websites such as Amazon.com. In this study, Yelp.com's count-type usefulness index for restaurant reviews was used to investigate the relationship between sentiment/readability and usefulness votes. Yelp.com's online customer reviews for stores in the beverage and food categories were used for the analysis. In total, 170,294 reviews containing information on a store's reputation and popularity were used. The control variables were the review length, store reputation, and popularity; the independent variables were the sentiment and readability, while the dependent variable was the number of helpful votes. The review rating is the moderating variable for the review sentiment and readability. The length is the number of characters in a review. The popularity is the number of reviews for a store, and the reputation is the general average rating of all reviews for a store. The readability of a review was calculated with the Coleman-Liau index. The sentiment is a positivity score for the review as calculated by SentiWordNet. The review rating is a preference score selected from 1 to 5 (stars) by the review author. The dependent variable (i.e., usefulness votes) used in this study is a count variable. Therefore, the Poisson regression model, which is commonly used to account for the discrete and nonnegative nature of count data, was applied in the analyses. The increase in helpful votes was assumed to follow a Poisson distribution. Because the Poisson model assumes an equal mean and variance and the data were over-dispersed, a negative binomial distribution model that allows for over-dispersion of the count variable was used for the estimation. Zero-inflated negative binomial regression was used to model count variables with excessive zeros and over-dispersed count outcome variables. With this model, the excess zeros were assumed to be generated through a separate process from the count values and therefore should be modeled as independently as possible. The results showed that positive sentiment had a negative effect on gaining useful votes for positive reviews but no significant effect on negative reviews. Poor readability had a negative effect on gaining useful votes and was not moderated by the review star ratings. These findings yield considerable managerial implications. The results are helpful for online websites when analyzing their review guidelines and identifying useful reviews for their business. First, based on this study, positive reviews are not necessarily helpful; therefore, restaurants should consider which type of positive review is helpful for their business. Second, this study is beneficial for businesses and website designers in creating review mechanisms to know which type of reviews to highlight on their websites and which type of reviews can be beneficial to the business. Moreover, this study highlights the review systems employed by websites to allow their customers to post rating reviews.
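
Two quantities from this abstract can be sketched in a few lines: the Coleman-Liau index, CLI = 0.0588 L - 0.296 S - 15.8, where L is letters per 100 words and S is sentences per 100 words, and a variance-to-mean ratio as a quick check for over-dispersion in count data. The tokenization rules, function names, and sample data below are assumptions. The study's own preprocessing is not described here.

```python
import re
import statistics


def coleman_liau(text):
    """Coleman-Liau readability index with a simple regex tokenizer."""
    words = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)*", text)
    if not words:
        raise ValueError("text contains no words")
    letters = sum(ch.isalpha() for ch in text)
    # Count runs of ., ! or ? as sentence ends; assume at least one sentence.
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    L = letters / len(words) * 100    # letters per 100 words
    S = sentences / len(words) * 100  # sentences per 100 words
    return 0.0588 * L - 0.296 * S - 15.8


def dispersion_ratio(counts):
    """Sample variance divided by mean; values well above 1 suggest
    over-dispersion relative to a Poisson model."""
    return statistics.variance(counts) / statistics.mean(counts)


review = "The food was great. I will come back!"
score = coleman_liau(review)

votes = [0, 0, 0, 0, 10]  # made-up helpful-vote counts with many zeros
ratio = dispersion_ratio(votes)
```

A ratio near 1 is consistent with the Poisson assumption of equal mean and variance. A much larger ratio is the kind of evidence that motivates a negative binomial or zero-inflated model.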