• Title/Summary/Keyword: Text Input

Implementation of Interactive Media Content Production Framework based on Gesture Recognition (제스처 인식 기반의 인터랙티브 미디어 콘텐츠 제작 프레임워크 구현)

  • Koh, You-jin;Kim, Tae-Won;Kim, Yong-Goo;Choi, Yoo-Joo
    • Journal of Broadcast Engineering / v.25 no.4 / pp.545-559 / 2020
  • In this paper, we propose a content creation framework that enables users without programming experience to easily create interactive media content that responds to user gestures. In the proposed framework, users assign numbers to the gestures they use and to the media effects that respond to them, and link the two in a text-based configuration file. The resulting interactive media content is linked with a dynamic projection mapping module, which tracks the user's location and projects the media effects onto the user. To reduce the processing time and memory burden of gesture recognition, the user's movement is expressed as a gray scale motion history image. We designed a convolutional neural network model for gesture recognition that takes motion history images as input data. The number of network layers and the hyperparameters of the model were determined through experiments on recognizing five gestures, and the resulting model was applied to the proposed framework. In the gesture recognition experiment, we obtained a recognition accuracy of 97.96% and a processing speed of 12.04 FPS. In the experiment connected with three media effects, we confirmed that the intended media effect was appropriately displayed in real time according to the user's gesture.
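
A motion history image compresses a short frame sequence into a single image in which recently moving pixels are bright and older motion fades out. The abstract does not give the authors' exact formulation, so the following is only a minimal sketch of the commonly used update rule, assuming simple frame differencing for motion detection. The function names, the `tau` duration, the `diff_threshold` value, and the toy frames are all hypothetical.

```python
import numpy as np

def update_mhi(mhi, prev_frame, frame, tau=10, diff_threshold=30):
    """Return the motion history image updated with one new frame."""
    # A pixel counts as "moving" if its intensity changed by more than the threshold.
    diff = np.abs(frame.astype(np.int16) - prev_frame.astype(np.int16))
    moving = diff > diff_threshold
    # Moving pixels are reset to tau; all others decay by one step, floored at 0.
    return np.where(moving, tau, np.maximum(mhi - 1, 0))

def to_grayscale(mhi, tau=10):
    """Scale history values in [0, tau] to an 8-bit gray scale image."""
    return (mhi.astype(np.float32) * 255.0 / tau).astype(np.uint8)

# Toy input (assumption): a bright one-pixel "hand" moving left to right
# across a 1x5 frame, one position per frame.
frames = []
for pos in range(5):
    f = np.zeros((1, 5), dtype=np.uint8)
    f[0, pos] = 255
    frames.append(f)

mhi = np.zeros((1, 5), dtype=np.int32)
for prev, cur in zip(frames, frames[1:]):
    mhi = update_mhi(mhi, prev, cur, tau=4)

gray = to_grayscale(mhi, tau=4)
print(mhi.tolist())   # [[1, 2, 3, 4, 4]]
print(gray.tolist())  # [[63, 127, 191, 255, 255]]
```

The brightness ramp encodes the direction of motion in one gray scale image, which is why a single such image can stand in for a stack of frames as classifier input.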

Multi-Modal Controller Usability for Smart TV Control

  • Yu, Jeongil;Kim, Seongmin;Choe, Jaeho;Jung, Eui S.
    • Journal of the Ergonomics Society of Korea / v.32 no.6 / pp.517-528 / 2013
  • Objective: The objective of this study was to suggest a multi-modal controller type for Smart TV control. Background: As the features of Smart TVs grow more complex, many usability issues are arising. One specific issue is which type of controller should be used to perform given tasks. This study examines the ongoing trend in controllers. Method: The selected participants had experience with Smart TVs and were 20 to 30 years of age. A pre-survey determined the first independent variable, which consisted of five tasks (Live TV, Record, Share, Web, App Store). The second independent variable was the type of controller (Conventional, Mouse, and Voice-Based Remote Controllers). The dependent variables were preference, task completion time, and error rate. The study consisted of a series of three experiments. The first experiment used a uni-modal Controller for the tasks, the second a dual-modal Controller, and the third a triple-modal Controller. Results: The first experiment revealed that the uni-modal Controller (Conventional, Voice Controller) showed the best results for the Live TV task. The second experiment revealed that the dual-modal Controller (Conventional-Voice and Conventional-Mouse combinations) showed the best results for the Share, Web, and App Store tasks. The third experiment revealed that the triple-modal Controller was not more effective than the dual-modal Controller at any level. Conclusion: For simple tasks on a Smart TV, our results showed that a uni-modal Controller was more effective than a dual-modal Controller. However, the control of complex tasks was better suited to the dual-modal Controller. User preference for a controller differs according to the Smart TV function. For instance, users strongly preferred the uni-modal Controller for simple functions and the dual-modal Controller for complex tasks. Additionally, in accordance with task characteristics, users strongly preferred the Voice Controller for channel and volume adjustment and the Conventional Controller for menu selection. In situations where the user had to input text, the Voice Controller had the highest preference among users, while the Mouse-Type and Voice Controllers had the highest user preference for performing a search or selecting items on the menu. Application: The results of this study may be used in the design of a controller that can effectively carry out the various tasks of the Smart TV.

Speech Animation Synthesis based on a Korean Co-articulation Model (한국어 동시조음 모델에 기반한 스피치 애니메이션 생성)

  • Jang, Minjung;Jung, Sunjin;Noh, Junyong
    • Journal of the Korea Computer Graphics Society / v.26 no.3 / pp.49-59 / 2020
  • In this paper, we propose a speech animation synthesis method specialized for Korean, based on a rule-based co-articulation model. Speech animation has been widely used in the cultural industry, in movies, animations, and games that require natural and realistic motion. However, because techniques for audio-driven speech animation have mainly been developed for English, the animation results for domestic content are often visually very unnatural. For example, a voice actor's dubbing is played with no mouth motion at all or, at best, with an unsynchronized loop of simple mouth shapes. Although there are language-independent speech animation models that are not specialized for Korean, they do not yet ensure the quality needed for domestic content production. Therefore, we propose a natural speech animation synthesis method that reflects the linguistic characteristics of Korean and is driven by input audio and text. Reflecting the fact that vowels mostly determine the mouth shape in Korean, we define a co-articulation model that separates the lips and the tongue to solve the earlier problems of lip distortion and occasionally missing phoneme characteristics. Our model also reflects differences in prosodic features to improve the dynamics of the speech animation. Through user studies, we verify that the proposed model can synthesize natural speech animation.

Improvement of Endoscopic Image using De-Interlacing Technique (De-Interlace 기법을 이용한 내시경 영상의 화질 개선)

  • 신동익;조민수;허수진
    • Journal of Biomedical Engineering Research / v.19 no.5 / pp.469-476 / 1998
  • When medical images such as ultrasonography and endoscopy images are acquired and displayed on the VGA monitor of a PC system, tear-drop image degradation appears through scan conversion. In this study, we compare several methods that can resolve this degradation and implement a hardware system that resolves the problem in real time on a PC. With a dedicated de-interlacing device and a PCI bridge, our hardware system supports high-quality image display together with real-time processing and acquisition. Image quality is improved remarkably on our hardware system. Because it is implemented as a PC-based system, acquiring and saving images, adding text comments to those images, and PACS networking can be easily implemented.
  • ... metabolism. All images were spatially normalized to the MNI standard PET template and smoothed with a 16 mm FWHM Gaussian kernel using SPM96. The mean count in the cerebral region was normalized. The VOIs for 34 cerebral regions were previously defined on the standard template, and 17 different counts of regions mirrored about the hemispheric midline were extracted from the spatially normalized images. A three-layer feed-forward error back-propagation neural network classifier with 7 input nodes and 3 output nodes was used. The network was trained to interpret metabolic patterns and produce diagnoses identical to those of expert viewers. The performance of the neural network was optimized by testing with 5~40 nodes in the hidden layer. Forty randomly selected images from each group were used to train the network, and the remainder were used to test the trained network. The optimized neural network gave a maximum agreement rate of 80.3% with expert viewers. It used 20 hidden nodes and was trained for 1508 epochs. The neural network also gave agreement rates of 75~80% with 10 or 30 nodes in the hidden layer. We conclude that the artificial neural network performed as well as human experts and could potentially be useful as a clinical decision support tool for the localization of epileptogenic zones.

Research on Touch Function capable of Real-time Response in Low-end Embedded System (저사양 임베디드 시스템에서의 실시간 응답이 가능한 터치 기능 연구)

  • Lee, Yong-Min;Han, Chang Ho
    • Journal of the Korea Academia-Industrial cooperation Society / v.22 no.4 / pp.37-41 / 2021
  • This paper presents a study on implementing a touch screen capable of real-time response processing in a low-end embedded system. This was done by introducing an interpolation-based algorithm to achieve real-time response characteristics when a touch input is performed. In the experiment, we applied a linear interpolation algorithm that estimates arbitrary intermediate data by deriving a first-order polynomial from 2-point data. We also applied a Lagrange interpolation algorithm that estimates arbitrary intermediate data by deriving a quadratic polynomial from 3-point data. The experiment showed that the Lagrange interpolation method was more complicated than the linear interpolation method and processed more slowly, so the drawn text was not rendered smoothly. With the linear interpolation method, the display speed on the screen was 2.4 times faster than with the Lagrange interpolation method. For real-time response characteristics, a smaller executable file size for the algorithm was found to matter more than the superiority of the algorithm itself. In conclusion, to secure real-time response characteristics in a low-end embedded system, a relatively simple linear interpolation algorithm performs touch operations with better real-time response characteristics than the Lagrange interpolation method.
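
The two interpolation schemes compared in this abstract can be sketched as follows. This is an illustrative sketch only, not the authors' embedded implementation: the function names and the sample touch coordinates are hypothetical, and the code shows only why the quadratic Lagrange form costs more arithmetic per estimated point than the linear form.

```python
def linear_interp(p0, p1, x):
    """Estimate y at x from the first-order polynomial through two points."""
    (x0, y0), (x1, y1) = p0, p1
    t = (x - x0) / (x1 - x0)
    return y0 + t * (y1 - y0)

def lagrange_quadratic(p0, p1, p2, x):
    """Estimate y at x from the quadratic polynomial through three points."""
    (x0, y0), (x1, y1), (x2, y2) = p0, p1, p2
    # Each basis polynomial is 1 at its own sample and 0 at the other two.
    l0 = (x - x1) * (x - x2) / ((x0 - x1) * (x0 - x2))
    l1 = (x - x0) * (x - x2) / ((x1 - x0) * (x1 - x2))
    l2 = (x - x0) * (x - x1) / ((x2 - x0) * (x2 - x1))
    return y0 * l0 + y1 * l1 + y2 * l2

# Hypothetical touch samples lying on the curve y = x**2.
a, b, c = (0.0, 0.0), (2.0, 4.0), (4.0, 16.0)

y_linear = linear_interp(a, b, 1.0)            # straight line between a and b
y_lagrange = lagrange_quadratic(a, b, c, 1.0)  # parabola through a, b, and c
print(y_linear, y_lagrange)  # 2.0 1.0
```

The linear estimate needs one division and a handful of additions and multiplications, while the quadratic form needs three divisions and many more multiplications per point, which is consistent with the slower display speed reported for the Lagrange method.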

A Study of Functional Performance on Smartphone according to Age Difference (나이 차이에 따른 스마트폰 기능 수행도 연구)

  • Yoon, Cheol-Ho
    • Journal of the Korea Academia-Industrial cooperation Society / v.20 no.3 / pp.318-323 / 2019
  • In this study, we examined age-related differences in performing the various functions required for everyday life in a smartphone usage environment. The subjects were 30 young adults and 30 elderly people. We set up 12 tasks to evaluate the performance of smartphone functions. At the same time, a questionnaire on smartphone usage habits was administered. The questionnaire consisted of items related to user history and usage habits. ANOVA was performed using Minitab version 14, and statistically significant differences were found in 10 tasks. The measured values for each task showed that the elderly generally took more time than the young adults to perform all the tasks. This tendency was especially pronounced for tasks requiring many keystrokes. Young adults were found to use all functions fairly uniformly, whereas the elderly mainly used a few functions, such as dialing, text, KakaoTalk, and searching. These results suggest that young people use smartphones more frequently than elderly people, and that as users become accustomed to smartphones, the time required to perform functions may be shortened. We suggest that hardware and software should be designed so that elderly people can enter input easily and conveniently.

A Study on the Automatic Digital DB of Boring Log Using AI (AI를 활용한 시추주상도 자동 디지털 DB화 방안에 관한 연구)

  • Park, Ka-Hyun;Han, Jin-Tae;Yoon, Youngno
    • Journal of the Korean Geotechnical Society / v.37 no.11 / pp.119-129 / 2021
  • Constructing the DB in the current geotechnical information DB system consumes considerable human and time resources. In addition, it frequently causes accuracy problems because the current input method relies on a person viewing the PDF and entering the results directly. Therefore, this study proposes automatically building a digital DB of boring logs using AI (artificial intelligence). To construct the DB automatically for various boring log formats without exception, the boring log forms were classified using the deep learning model ResNet 34 for a total of 6 boring log forms. As a result, the overall accuracy was 99.7, and the ROC_AUC score was 1.0, meaning the boring log forms were separated with very high performance. After that, the text in the PDF is automatically read using a robotic process automation technique fine-tuned for each form. Furthermore, the general information, strata information, and standard penetration test information were extracted, separated, and saved in the same format provided by the geotechnical information DB system. Finally, the information in the boring log was automatically converted into a DB at a speed of 140 pages per second.

Digital Transformation: Using D.N.A.(Data, Network, AI) Keywords Generalized DMR Analysis (디지털 전환: D.N.A.(Data, Network, AI) 키워드를 활용한 토픽 모델링)

  • An, Sehwan;Ko, Kangwook;Kim, Youngmin
    • Knowledge Management Research / v.23 no.3 / pp.129-152 / 2022
  • As a key infrastructure for digital transformation, the spread of the data, network, and artificial intelligence (D.N.A.) fields and the emergence of promising industries are laying the groundwork for active digital innovation throughout the economy. In this study, we applied a text mining methodology and derived major topics using the abstract, publication year, and research field of studies in the SCIE, SSCI, and A&HCI indexes of the WoS database as input variables. First, main keywords were identified through TF and TF-IDF analysis based on word appearance frequency, and then topic modeling was performed using g-DMR. Because this topic model can use various types of variables as meta information, it was possible to explore meaning properly, beyond simply deriving topics. According to the analysis results, topics such as business intelligence, manufacturing production systems, service value creation, telemedicine, and digital education were identified as major research topics in digital transformation. To summarize the results of topic modeling: 1) research on business intelligence has been actively conducted in all areas since COVID-19, and 2) issues such as intelligent manufacturing solutions and metaverses have emerged in the manufacturing field, confirming that the topic of production systems is receiving attention once again. Finally, 3) although the topic itself can be viewed separately in terms of technology and service, it was found undesirable to interpret it separately because many studies deal comprehensively with various services that combine the relevant technologies.
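
The keyword-identification step mentioned in this abstract (TF and TF-IDF based on word appearance frequency) can be illustrated with a small sketch. The abstract does not state which TF-IDF weighting variant the authors used, so this example assumes the plain form, term frequency times log(N / document frequency), and the three short "documents" are hypothetical stand-ins for paper abstracts.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return one {term: score} dict per document (plain TF * log(N/df) variant)."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears at least once.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        scores.append({
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return scores

# Hypothetical mini-corpus (assumption), not data from the study.
docs = [
    "digital transformation data",
    "digital network",
    "digital ai education",
]
scores = tf_idf(docs)

# "digital" occurs in every document, so its IDF is log(1) = 0 and it scores 0.
# Terms unique to one document, such as "data", receive a positive score.
for doc_scores in scores:
    print(sorted(doc_scores.items(), key=lambda kv: -kv[1]))
```

This shows why raw frequency (TF) and TF-IDF can rank keywords differently: a term that appears everywhere has high TF but carries little discriminating weight.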

Prediction of Music Generation on Time Series Using Bi-LSTM Model (Bi-LSTM 모델을 이용한 음악 생성 시계열 예측)

  • Kwangjin, Kim;Chilwoo, Lee
    • Smart Media Journal / v.11 no.10 / pp.65-75 / 2022
  • Deep learning is used as a creative tool that can overcome the limitations of existing analysis models and generate various types of results such as text, images, and music. In this paper, we propose a method for preprocessing audio data, using the Niko's MIDI Pack sound source files as a data set, and for generating music using Bi-LSTM. Based on the generated root note, the hidden layers are stacked in multiple layers to create a new note suited to the musical composition, and an attention mechanism is applied to the output gate of the decoder to weight the factors that affect the data input from the encoder. Settings such as the loss function and optimization method are applied as parameters for improving the LSTM model. The proposed model is a multi-channel Bi-LSTM with attention that uses note pitches obtained by separating the treble clef and bass clef, note lengths, rests, rest lengths, and chords to improve the efficiency and prediction of the MIDI deep learning process. The trained model generates sound that follows the development of a musical scale and is distinct from noise, and we aim to contribute to generating harmonically stable music.

A Blockchain Network Construction Tool and its Electronic Voting Application Case (블록체인 자동화도구 개발과 전자투표 적용사례)

  • AING TECKCHUN;KONG VUNGSOVANREACH;Okki Kim;Kyung-Hee Lee;Wan-Sup Cho
    • The Journal of Bigdata / v.6 no.2 / pp.151-159 / 2021
  • Constructing a blockchain network is a cumbersome and time-consuming activity. To overcome these limitations, global IT companies such as Microsoft are providing cloud-based blockchain services. In this paper, we propose a blockchain construction and management tool that enables blockchain developers, blockchain operators, and enterprises to deploy blockchain more comfortably in their own infrastructure. The tool is implemented using Hyperledger Fabric, a well-known private blockchain platform, and Ansible, an open-source IT automation engine that supports network-wide deployment. Instead of complex and repetitive text commands, the tool provides a user-friendly web dashboard interface that allows users to seamlessly set up, deploy, and interact with a blockchain network. With the proposed solution, blockchain developers, operators, and researchers can build blockchain infrastructure more easily, saving time and cost. To verify the usefulness and convenience of the proposed tool, a blockchain network that conducts electronic voting was built and tested. Constructing a blockchain network, which involves writing more than 10 configuration files and executing hundreds of lines of commands, can be replaced with simple input and click operations in the graphical user interface, improving user convenience and saving time. In the future, the proposed blockchain tool will be used to build trusted data infrastructure in various fields, such as food safety supply chains.