Search | Korea Science

Text-to-speech with linear spectrogram prediction for quality and speed improvement (음질 및 속도 향상을 위한 선형 스펙트로그램 활용 Text-to-speech)

Yoon, Hyebin
Phonetics and Speech Sciences / v.13 no.3 / pp.71-78 / 2021
Most neural-network-based speech synthesis models utilize neural vocoders to convert mel-scaled spectrograms into high-quality, human-like voices. However, neural vocoders combined with mel-scaled spectrogram prediction models demand considerable computer memory and time during the training phase and are subject to slow inference speeds in an environment where GPU is not used. This problem does not arise in linear spectrogram prediction models, as they do not use neural vocoders, but these models suffer from low voice quality. As a solution, this paper proposes a Tacotron 2 and Transformer-based linear spectrogram prediction model that produces high-quality speech and does not use neural vocoders. Experiments suggest that this model can serve as the foundation of a high-quality text-to-speech model with fast inference speed.
https://doi.org/10.13064/KSSS.2021.13.3.071

A Study on the Mobilization Simulation Mode of Government Exercise for Emergency (비상대비 정부연습의 동원 시뮬레이션 모형에 관한 연구)

Joo, Choong-Geun;Lee, Sung-Lyong
The Journal of the Korea Contents Association / v.21 no.10 / pp.476-493 / 2021
This study examines the simulation conditions of a tentative 'mobilization simulation mode' (MOBSM) and the setting options for its major simulation elements. The MOBSM is a training module in which various institutions practice mobilization on a simulation computer under conditions similar to actual situations. The mobilization exercise (Mob-Ex) has so far relied on a message-based simulation method, which has raised many problems, such as fragmentary practice carried out by only some institutions, so a transition to the MOBSM is needed. We therefore reviewed the theoretical background and previous studies on Mob-Ex and simulation to derive the requirements and simulation elements that the MOBSM needs to meet the purpose of a government-level exercise, and to suggest its key concepts and direction of application. The basic requirement is to simulate the main mobilization practices of each institution and to provide information on mobilization execution nationwide. The simulation elements include simulated events and flow charts by mobilization type, simulated range and level by object, simulated contents of material mobilization by institution, key simulated items, DB application, and the simulated period. This study should be useful for policy establishment and for follow-up research on MOBSM technology development, and should help accelerate the transition to practical mobilization exercises using the MOBSM.
https://doi.org/10.5392/JKCA.2021.21.10.476

How to create mixed reality educational contents using Hololens (홀로렌즈를 활용한 혼합현실 교육 콘텐츠 제작 방법)

Song, Eun-Jee
Journal of the Korea Institute of Information and Communication Engineering / v.24 no.3 / pp.391-397 / 2020
Realistic content such as virtual reality, augmented reality, and mixed reality is emerging as an innovative technology in education because it lets people safely and efficiently experience dangerous, expensive, or otherwise impossible situations, such as disaster training or space travel. Recently, with substantial government support for producing virtual and augmented reality educational content, a variety of such content has been developed through the Edutech industry. However, while much virtual and augmented reality-based educational content is being developed, mixed reality-based educational content, which could be more effective for education, remains very limited. This study examines the basic method of producing mixed reality educational content using Hololens and, on that basis, proposes a method for producing scientific experiment content. Hololens makes it possible to share information in real time without a regular desktop PC, and it lets teachers manage and evaluate students effectively in real time.
https://doi.org/10.6109/jkiice.2020.24.3.391

A Design of Similar Video Recommendation System using Extracted Words in Big Data Cluster (빅데이터 클러스터에서의 추출된 형태소를 이용한 유사 동영상 추천 시스템 설계)

Lee, Hyun-Sup;Kim, Jindeog
Journal of the Korea Institute of Information and Communication Engineering / v.24 no.2 / pp.172-178 / 2020
Content recommendation services generally use collaborative filtering, which takes into account both user preferences and video (item) similarities. Such services primarily aim to improve user convenience by leveraging personal preferences such as search keywords and viewing time. Videos are also ranked around the keywords specified for them. However, analyzing video similarity with a limited set of keywords has its limits, and the problem becomes serious when the specified keywords do not properly reflect the item. In this paper, we propose a system that identifies the characteristics of a video as they are, without human intervention, and then analyzes similarities between videos and makes recommendations. The proposed system analyzes similarity by taking into account all words (keywords) with distinct meanings extracted from the training videos; because of the large scale of data and operations involved, the processing is carried out on a big data cluster.
https://doi.org/10.6109/jkiice.2020.24.2.172
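
The abstract above does not say which similarity measure the system uses. As a rough illustration only, the sketch below assumes cosine similarity over word counts; the function names `cosine_similarity` and `recommend` and the sample corpus are hypothetical, and a real deployment would run on a big data cluster rather than in a single process.

```python
import math
from collections import Counter


def cosine_similarity(words_a, words_b):
    """Cosine similarity between two bags of words (lists of strings)."""
    counts_a, counts_b = Counter(words_a), Counter(words_b)
    # Dot product over the words that appear in both videos.
    shared = counts_a.keys() & counts_b.keys()
    dot = sum(counts_a[w] * counts_b[w] for w in shared)
    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty word list is similar to nothing
    return dot / (norm_a * norm_b)


def recommend(target, corpus, top_k=3):
    """Rank the other videos in corpus (id -> word list) by similarity to target."""
    scores = [
        (video_id, cosine_similarity(corpus[target], words))
        for video_id, words in corpus.items()
        if video_id != target
    ]
    return sorted(scores, key=lambda s: s[1], reverse=True)[:top_k]


# Hypothetical word lists extracted from three videos.
corpus = {
    "v1": ["cat", "dog"],
    "v2": ["cat", "dog", "dog"],
    "v3": ["car"],
}
ranking = recommend("v1", corpus, top_k=2)
```

Here `ranking` lists `v2` ahead of `v3`, since `v1` and `v3` share no words.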

A data extension technique to handle incomplete data (불완전한 데이터를 처리하기 위한 데이터 확장기법)

Lee, Jong Chan
Journal of the Korea Convergence Society / v.12 no.2 / pp.7-13 / 2021
This paper introduces an algorithm that compensates for missing values in training data after converting incomplete data into a format that can represent probabilities. In the previous method based on this data conversion, incomplete data was handled by assigning each missing value an equal probability across all values the variable can take. That method was applied to many problems with good results, but it was criticized for losing information, because all information remaining about the missing variable is ignored and a new value is assigned. In the newly proposed method, by contrast, only complete records with no missing values are fed to the well-known classification algorithm C4.5, which builds a decision tree during learning. The probability of each missing value is then obtained from this decision tree and assigned as the estimate for the missing variable. In other words, some of the lost information is recovered using the large amount of information that remains intact in the incomplete training data.
https://doi.org/10.15207/JKCS.2021.12.2.007
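
To make the contrast in the abstract above concrete, the following sketch encodes a categorical value as a probability vector and compares the equal-probability fill with an estimate drawn from observed data. It is a simplification under stated assumptions: instead of building a C4.5 tree, it uses class-conditional frequencies as a stand-in for a leaf distribution, and all names (`one_hot`, `uniform_fill`, `estimated_fill`) and the sample rows are hypothetical.

```python
from collections import Counter


def one_hot(value, domain):
    """Encode an observed categorical value as a probability vector."""
    return [1.0 if value == v else 0.0 for v in domain]


def uniform_fill(domain):
    """Earlier approach: spread a missing value equally over all possible values."""
    return [1.0 / len(domain)] * len(domain)


def estimated_fill(rows, attr, label, target_label, domain):
    """Estimate P(attr | label) from rows where attr is observed.

    Assumption: class-conditional frequencies stand in for the distribution
    a decision tree leaf would supply; this is not C4.5 itself.
    """
    counts = Counter(
        r[attr] for r in rows
        if r[attr] is not None and r[label] == target_label
    )
    total = sum(counts.values())
    if total == 0:
        return uniform_fill(domain)  # nothing observed, fall back to equal probability
    return [counts[v] / total for v in domain]


# Hypothetical training rows; None marks a missing value.
rows = [
    {"color": "red", "cls": "A"},
    {"color": "red", "cls": "A"},
    {"color": "blue", "cls": "A"},
    {"color": None, "cls": "A"},
    {"color": "blue", "cls": "B"},
]
domain = ["red", "blue"]
baseline = uniform_fill(domain)
estimate = estimated_fill(rows, "color", "cls", "A", domain)
```

With these rows the baseline assigns equal weight to `red` and `blue`, while the estimate leans toward `red` because two of the three observed class-A rows are red.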

Real Time Hornet Classification System Based on Deep Learning (딥러닝을 이용한 실시간 말벌 분류 시스템)

Jeong, Yunju;Lee, Yeung-Hak;Ansari, Israfil;Lee, Cheol-Hee
Journal of IKEEE / v.24 no.4 / pp.1141-1147 / 2020
Hornet species are so similar in shape that non-experts have difficulty classifying them, and because the insects are small and move fast, detecting and classifying the species in real time is even harder. In this paper, we developed a system that classifies hornet species in real time using a deep learning algorithm based on bounding boxes. To minimize the background area included in the bounding box when labeling training images, we propose selecting only the head and body of the hornet. We also experimentally compare existing bounding box-based object recognition algorithms to find the one best suited to detecting hornets in real time and classifying their species. In the experiments, when the Mish function was used as the activation function of the convolution layers and hornet images were tested with a YOLOv4 model that applies the Spatial Attention Module (SAM) before the object detection block, the average precision was 97.89% and the average recall was 98.69%.
https://doi.org/10.7471/ikeee.2020.24.4.1141
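
For readers unfamiliar with the Mish activation named in the abstract above, a minimal scalar sketch follows. It uses the standard definition mish(x) = x · tanh(softplus(x)); the helper names are illustrative, and a real detector would apply the function element-wise to tensors inside a deep learning framework.

```python
import math


def softplus(x):
    """Numerically stable softplus: log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def mish(x):
    """Mish activation: x * tanh(softplus(x))."""
    return x * math.tanh(softplus(x))


# Mish is close to the identity for large positive inputs and close to
# zero (slightly negative) for large negative inputs.
samples = [mish(v) for v in (-20.0, 0.0, 1.0, 20.0)]
```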

Evaluation of a multi-stage convolutional neural network-based fully automated landmark identification system using cone-beam computed tomography-synthesized posteroanterior cephalometric images

Kim, Min-Jung;Liu, Yi;Oh, Song Hee;Ahn, Hyo-Won;Kim, Seong-Hun;Nelson, Gerald
The Korean Journal of Orthodontics / v.51 no.2 / pp.77-85 / 2021
Objective: To evaluate the accuracy of a multi-stage convolutional neural network (CNN) model-based automated identification system for posteroanterior (PA) cephalometric landmarks. Methods: The multi-stage CNN model was implemented on a personal computer. A total of 430 PA-cephalograms synthesized from cone-beam computed tomography scans (CBCT-PA) were selected as samples. Twenty-three landmarks used for Tweemac analysis were manually identified on all CBCT-PA images by a single examiner. Intra-examiner reproducibility was confirmed by repeating the identification on 85 randomly selected images, which were subsequently set as test data, with a two-week interval before training. For the initial learning stage of the multi-stage CNN model, the data from 345 of the 430 CBCT-PA images were used, after which the model was tested with the previous 85 images. The first manual identification on these 85 images was set as the ground truth. The mean radial error (MRE) and successful detection rate (SDR) were calculated to evaluate the errors in manual identification and artificial intelligence (AI) prediction. Results: The AI showed an average MRE of 2.23 ± 2.02 mm with an SDR of 60.88% for errors of 2 mm or lower. However, in a comparison of the repetitive task, the AI predicted landmarks at the same position, while the MRE for the repeated manual identification was 1.31 ± 0.94 mm. Conclusions: Automated identification for CBCT-synthesized PA cephalometric landmarks did not sufficiently achieve the clinically favorable error range of less than 2 mm. However, AI landmark identification on PA cephalograms showed better consistency than manual identification.
https://doi.org/10.4041/kjod.2021.51.2.77
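
The two metrics in the abstract above can be sketched as follows, assuming landmarks are given as 2D coordinates in millimetres and that a detection counts as successful when its radial error is 2 mm or lower. The function names and the sample coordinates are hypothetical and are not taken from the study.

```python
import math


def radial_errors(predicted, reference):
    """Euclidean distance (mm) between each predicted and reference landmark."""
    return [math.dist(p, r) for p, r in zip(predicted, reference)]


def mean_radial_error(errors):
    """Mean radial error (MRE): the average of the per-landmark distances."""
    return sum(errors) / len(errors)


def successful_detection_rate(errors, threshold_mm=2.0):
    """SDR: percentage of landmarks whose error is at or below the threshold."""
    hits = sum(1 for e in errors if e <= threshold_mm)
    return 100.0 * hits / len(errors)


# Hypothetical landmark coordinates in millimetres.
predicted = [(0.0, 0.0), (3.0, 4.0)]
reference = [(0.0, 1.0), (0.0, 0.0)]
errors = radial_errors(predicted, reference)
mre = mean_radial_error(errors)
sdr = successful_detection_rate(errors)
```

In this toy case the two errors are 1 mm and 5 mm, so the MRE is 3 mm and half of the landmarks fall within the 2 mm threshold.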

A Study on Pagoda Image Search Using Artificial Intelligence (AI) Technology for Restoration of Cultural Properties

Lee, ByongKwon;Kim, Soo Kyun;Kim, Seokhun
KSII Transactions on Internet and Information Systems (TIIS) / v.15 no.6 / pp.2086-2097 / 2021
Cultural assets are currently restored according to the opinions of experts (craftsmen). We intend to introduce digitalized artificial intelligence techniques that exclude the personal opinions of experts from the reconstruction of such cultural properties. The first step toward restoring digitized cultural properties is separation. Restoration should be organized on the basis of recorded documents, the historical background of the period, and regional characteristics, and photographs or images of cultural properties should be collected with the background separated out. In addition, restoration depends heavily on the tendencies of the person doing the restoring, which often causes problems with the accuracy and reliability of the result. In this study, we propose a search method that learns from stored digital cultural assets using AI technology. The pagoda was selected as the cultural property to restore. Pagoda data was collected from the Internet and various historical records, classified by period and region, and grouped into similar buildings. The collected data was learned with the well-known CNN algorithm. For the pagoda search, Yolo Marker was used to mark the tower shape. In total, about 100 to 10,000 pagoda images were used. We confirmed that the probability of finding a tower varies with the number of pagoda pictures and the number of learning iterations, and that 500 tower images with training for 8,000 epochs gave good results. Beyond 8,000 the model overfits, and we also observed that the recognition rate drops when training is repeated more than 8,000 times. The results of this study are expected to help in gathering data to increase the accuracy of tower restoration.
https://doi.org/10.3837/tiis.2021.06.008

An Analysis Study of SW·AI elements of Primary Textbooks based on the 2015 Revised National Curriculum (2015 개정교육과정에 따른 초등학교 교과서의 SW·AI 요소 분석 연구)

Park, SunJu
Journal of The Korean Association of Information Education / v.25 no.2 / pp.317-325 / 2021
In this paper, we investigated and analyzed how far SW·AI elements and CT elements are reflected in a total of 44 Korean, social studies, moral education, mathematics, and science textbooks based on the 2015 revised curriculum. The analysis showed that most activities involving data collection, data analysis, and data presentation, which are ICT elements, were not reflected; that algorithm and programming elements were absent among the SW·AI content elements; and that abstraction, automation, and generalization were absent among the CT elements. Therefore, to implement SW·AI convergence education effectively in elementary school subjects, ICT utilization activities should be expanded into SW·AI utilization activities. Teachers need training on understanding SW·AI convergence education and on improving teaching and learning methods that use SW·AI. In addition, an information curriculum needs to be established and separate class hours secured for substantial SW·AI education.
https://doi.org/10.14352/jkaie.2021.25.2.317

Video Camera Model Identification System Using Deep Learning (딥 러닝을 이용한 비디오 카메라 모델 판별 시스템)

Kim, Dong-Hyun;Lee, Soo-Hyeon;Lee, Hae-Yeoun
The Journal of Korean Institute of Information Technology / v.17 no.8 / pp.1-9 / 2019
With the development of imaging and information communication technology in modern society, image acquisition and mass production technologies have advanced rapidly. However, crimes that exploit these technologies have increased, and forensic studies are being conducted to prevent them. Identification techniques for image acquisition devices have been studied extensively, but the work has been limited to still images. In this paper, we propose a camera model identification technique for video rather than still images. We analyzed video frames using a model trained on images. Through training and analysis that consider the frame characteristics of video, we showed the superiority of the model that uses P frames. We then presented a video camera model identification system that applies a majority-based decision algorithm. In an experiment with 5 video camera models, we obtained a maximum accuracy of 96.18% for per-frame identification, and the proposed system achieved a 100% identification rate for each camera model.
https://doi.org/10.14801/jkiit.2019.17.8.1
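
The majority-based decision step in the abstract above explains how per-frame accuracy below 100% can still yield a correct video-level result. A minimal sketch, assuming each frame has already been assigned a camera model label by some classifier; the function name `identify_camera` and the labels are hypothetical.

```python
from collections import Counter


def identify_camera(frame_predictions):
    """Return the most frequent per-frame label and its share of the votes.

    Ties are broken in favour of the label that appears first in the input,
    which is how Counter.most_common orders equal counts.
    """
    counts = Counter(frame_predictions)
    model, votes = counts.most_common(1)[0]
    return model, votes / len(frame_predictions)


# Hypothetical per-frame predictions for one video.
frames = ["cam_A", "cam_B", "cam_A", "cam_A"]
model, share = identify_camera(frames)
```

Here one frame is misclassified as `cam_B`, yet the video as a whole is still attributed to `cam_A` with three of the four votes.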

Search Result 2,428, Processing Time 0.025 seconds
