Search | Korea Science

Transfer Learning using Multiple ConvNet Layers Activation Features with Principal Component Analysis for Image Classification (전이학습 기반 다중 컨볼류션 신경망 레이어의 활성화 특징과 주성분 분석을 이용한 이미지 분류 방법)

Byambajav, Batkhuu;Alikhanov, Jumabek;Fang, Yang;Ko, Seunghyun;Jo, Geun Sik
- Journal of Intelligence and Information Systems / v.24 no.1 / pp.205-225 / 2018
A Convolutional Neural Network (ConvNet) is a class of powerful deep neural networks that can analyze and learn hierarchies of visual features. The first such neural network (Neocognitron) was introduced in the 1980s. At that time, neural networks were not widely used in either industry or academia because large-scale datasets were scarce and computational power was low. A few decades later, in 2012, Krizhevsky made a breakthrough in the ILSVRC-12 visual recognition competition using a Convolutional Neural Network, which revived interest in neural networks. The success of ConvNets rests on two main factors. The first is the emergence of advanced hardware (GPUs) that provides sufficient parallel computation. The second is the availability of large-scale datasets, such as the ImageNet (ILSVRC) dataset, for training. Unfortunately, many new domains are bottlenecked by these factors. For most domains, gathering a large-scale dataset to train a ConvNet is difficult and requires a lot of effort. Moreover, even with a large-scale dataset, training a ConvNet from scratch is time-consuming and requires expensive resources. These two obstacles can be addressed with transfer learning, a method for transferring knowledge from a source domain to a new domain. There are two major transfer learning cases: using the ConvNet as a fixed feature extractor, and fine-tuning the ConvNet on a new dataset. In the first case, a pre-trained ConvNet (for example, one trained on ImageNet) computes feed-forward activations for an image, and activation features are extracted from specific layers. In the second case, the ConvNet classifier is replaced and retrained on the new dataset, and then the weights of the pre-trained network are fine-tuned with backpropagation. In this paper, we focus only on using multiple ConvNet layers as a fixed feature extractor.
However, directly applying the high-dimensional features extracted from multiple ConvNet layers is still a challenging problem. We observe that features extracted from different ConvNet layers capture different characteristics of the image, which means that a better representation could be obtained by finding the optimal combination of multiple ConvNet layers. Based on that observation, we propose to employ multiple ConvNet layer representations for transfer learning instead of a single ConvNet layer representation. Overall, our primary pipeline has three steps. First, images from the target task are fed forward through a pre-trained AlexNet, and the activation features from its three fully connected layers are extracted. Second, the activation features of the three layers are concatenated to obtain a multiple-layer representation, because it carries more information about an image. When the three fully connected layer features are concatenated, the resulting image representation has 9192 (4096+4096+1000) dimensions. However, features extracted from multiple ConvNet layers are redundant and noisy since they are extracted from the same ConvNet. Thus, as a third step, we use Principal Component Analysis (PCA) to select salient features before the training phase. With salient features, the classifier can classify images more accurately, and the performance of transfer learning can be improved. To evaluate the proposed method, experiments are conducted on three standard datasets (Caltech-256, VOC07, and SUN397) to compare multiple ConvNet layer representations against single ConvNet layer representations, using PCA for feature selection and dimension reduction. Our experiments demonstrate the importance of feature selection for multiple ConvNet layer representations.
Moreover, our proposed approach achieved 75.6% accuracy on the Caltech-256 dataset compared with 73.9% for the FC7 layer, 73.1% on the VOC07 dataset compared with 69.2% for the FC8 layer, and 52.2% on the SUN397 dataset compared with 48.7% for the FC7 layer. We also showed that our proposed approach achieved superior performance compared with existing work, with accuracy improvements of 2.8%, 2.1%, and 3.1% on the Caltech-256, VOC07, and SUN397 datasets, respectively.
https://doi.org/10.13088/jiis.2018.24.1.205
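The three-step pipeline in the abstract above (extract, concatenate, reduce with PCA) can be sketched roughly as follows. This is only an illustration under stated assumptions: the AlexNet activations are replaced by small random placeholder vectors, the function names `concatenate_layers` and `pca_project` are hypothetical, and PCA is computed by power iteration with deflation simply to avoid external dependencies. The abstract does not say how the authors computed PCA or how many components they kept.

```python
import math
import random


def concatenate_layers(*layer_features):
    """Join the activation vectors of several layers into one feature vector."""
    combined = []
    for features in layer_features:
        combined.extend(features)
    return combined


def pca_project(rows, n_components, n_iter=100, seed=0):
    """Project rows onto their leading principal components.

    Uses power iteration with deflation, chosen here only to stay
    dependency-free; it is not taken from the paper.
    """
    n, d = len(rows), len(rows[0])
    means = [sum(row[j] for row in rows) / n for j in range(d)]
    centered = [[row[j] - means[j] for j in range(d)] for row in rows]
    rng = random.Random(seed)
    components = []
    for _ in range(n_components):
        v = [rng.gauss(0.0, 1.0) for _ in range(d)]
        for _ in range(n_iter):
            # Deflation: remove the directions that were already found.
            for c in components:
                overlap = sum(a * b for a, b in zip(v, c))
                v = [a - overlap * b for a, b in zip(v, c)]
            # Apply X^T X to v without building the d x d covariance matrix.
            scores = [sum(a * b for a, b in zip(row, v)) for row in centered]
            v = [sum(scores[i] * centered[i][j] for i in range(n)) for j in range(d)]
            norm = math.sqrt(sum(a * a for a in v))
            if norm == 0.0:
                break  # no variance left in the remaining directions
            v = [a / norm for a in v]
        components.append(v)
    return [[sum(a * b for a, b in zip(row, c)) for c in components] for row in centered]


# Toy stand-ins for FC6, FC7 and FC8 activations (real sizes: 4096, 4096, 1000).
rng = random.Random(1)
images = [
    concatenate_layers(
        [rng.random() for _ in range(8)],  # placeholder for FC6
        [rng.random() for _ in range(8)],  # placeholder for FC7
        [rng.random() for _ in range(4)],  # placeholder for FC8
    )
    for _ in range(12)
]
reduced = pca_project(images, n_components=3)
print(len(images[0]), "->", len(reduced[0]))
```

With the real layer sizes the concatenated vector would have 4096 + 4096 + 1000 = 9192 entries, and the reduced vectors would then be passed to the classifier.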

Super-resolution Algorithm Using Adaptive Unsharp Masking for Infra-red Images (적외선 영상을 위한 적응적 언샤프 마스킹을 이용한 초고해상도 알고리즘)

Kim, Yong-Jun;Song, Byung Cheol
- Journal of Broadcast Engineering / v.21 no.2 / pp.180-191 / 2016
When up-scaling algorithms designed for visible-light images are applied to infrared (IR) images, they rarely work well because IR images are usually blurred. To solve this problem, this paper proposes an up-scaling algorithm for IR images. We employ adaptive dynamic range encoding (ADRC) as a simple classifier, based on the observation that IR images have weak details. Also, since the human visual system is more sensitive to edges, our algorithm focuses on edges. In addition, we add pre-processing in the learning phase. As a result, we can improve the visibility of IR images without increasing computational cost. Compared with anchored neighborhood regression (A+), the proposed algorithm provides better results. In terms of just noticeable blur, the proposed algorithm scores 0.0201 higher than A+.
https://doi.org/10.5909/JBE.2016.21.2.180
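As a rough illustration of how ADRC can act as a simple patch classifier, the sketch below uses the common 1-bit variant, which thresholds each pixel at the middle of the patch's dynamic range and packs the bits into a class label. The abstract does not state the bit depth, patch size, or flat-patch handling the authors used, so those choices and the names `adrc_1bit` and `class_index` are assumptions.

```python
def adrc_1bit(patch):
    """Encode a patch as bits: 1 where a pixel is at or above the mid-range."""
    lo, hi = min(patch), max(patch)
    if lo == hi:
        # Assumption: a flat patch carries no structure, so encode it as zeros.
        return [0] * len(patch)
    threshold = (lo + hi) / 2
    return [1 if pixel >= threshold else 0 for pixel in patch]


def class_index(bits):
    """Pack the bit pattern into a single integer class label."""
    index = 0
    for bit in bits:
        index = (index << 1) | bit
    return index


patch = [10, 12, 30, 28]  # hypothetical 2x2 patch, row-major
bits = adrc_1bit(patch)
print(bits, class_index(bits))
```

In a learning-based up-scaler, patches that share a class label would typically share one learned filter, which is what keeps the classification step cheap.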

PE Header Characteristics Analysis Technique for Malware Detection (악성프로그램 탐지를 위한 PE헤더 특성 분석 기술)

Choi, Yang-Seo;Kim, Ik-Kyun;Oh, Jin-Tae;Ryu, Jae-Cheol
- Convergence Security Journal / v.8 no.2 / pp.63-70 / 2008
To prevent malware from being analyzed easily, attackers apply various anti-reversing and obfuscation techniques to it. However, the more anti-reversing techniques are applied to malware, the more abnormal characteristics, which do not appear in normal PE files, can be observed in the PE file's header. In this letter, a new malware detection technique is proposed based on this observation. For malware detection, we define the Characteristics Vector (CV), which represents the characteristics of a PE file's header. In the learning phase, we calculate the average CV (ACV) of malware (ACVM) and of normal files (ACVN). To detect malware, we calculate two Weighted Euclidean Distances (WEDs) from a file's CV to the ACVs and use them to decide whether the file is malware. The proposed technique is very fast and its detection rate is fairly high, so it could be applied to network-based attack detection and prevention devices. Moreover, this technique could be used to detect unknown malware because it relies on the malware's characteristics rather than a signature.
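The learning and detection phases described above can be sketched as below. The abstract does not list the header features, the weights, or the exact decision rule, so the three-element vectors, the uniform weights, and the "closer average wins" rule are all assumptions made for illustration.

```python
import math


def average_vector(vectors):
    """Element-wise mean of a list of characteristics vectors (an ACV)."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def weighted_euclidean(u, v, weights):
    """Weighted Euclidean Distance (WED) between two vectors."""
    return math.sqrt(sum(w * (a - b) ** 2 for a, b, w in zip(u, v, weights)))


def is_malware(cv, acvm, acvn, weights):
    """Assumed rule: flag the file if its CV is closer to ACVM than to ACVN."""
    return weighted_euclidean(cv, acvm, weights) < weighted_euclidean(cv, acvn, weights)


# Hypothetical training vectors: each entry marks one abnormal header trait.
malware_cvs = [[1, 1, 0], [1, 0, 1]]
normal_cvs = [[0, 0, 0], [0, 1, 0]]
acvm = average_vector(malware_cvs)
acvn = average_vector(normal_cvs)
weights = [1.0, 1.0, 1.0]  # placeholder weights

print(is_malware([1, 1, 1], acvm, acvn, weights))
print(is_malware([0, 0, 0], acvm, acvn, weights))
```

Because detection needs only two distance computations per file, a scheme of this shape is cheap enough for the network devices the abstract mentions.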

Development of Health Promotion Program through IUHPE : Possibilities of Collaboration in East Asia

Moriyama, Masaki
- Korean Journal of Health Education and Promotion / v.22 no.3 / pp.97-107 / 2005
This paper considers the possibilities of health promotion from the following perspectives: (1) IUHPE, (2) socio-cultural similarities, (3) action research, and (4) learning from our past. 1. The IUHPE values decentralized activities through regions, and countries such as Japan, Korea, Hong Kong, Taiwan and China belong to the NPWP region. After the IUHPE World Conference was held in Japan in 1995, Japan accounted for more than 60% of NPWP membership. Since 2001, membership has been increasing rapidly in the Chinese-speaking sub-region. Transnational collaboration is still in its beginning phase. 2. Confucianism is one of the key points. The Confucian tradition should be seen not only as an obstacle but also as an advantage in seeking a form of health promotion more acceptable in East Asia. 3. Within the new public health framework, people are expected to create and live their health. However, especially in Japan, a tendency toward 'lacking face-to-face explicit interactions' is still common in health-promotion settings as well as academic settings. Therefore, the author tried participatory approaches such as asking WIFY (interactive questions designed for subjects to review their daily life and environment) and introducing round-table interactions. So far, the majority of participants have welcomed the new trials. 4. The following social phenomena in the period after the Japanese invasion and occupation of Korea ended in 1945 are discussed comparatively: the status of oriental medicine, the separation of dispensary services, and the health promotion specialist as a national license. In contrast to the Japanese tendency to maintain the status quo and postpone substantial social change, a trend toward rapid and dynamic social change is more commonly observed in Korea. Although all of the above possibilities are still in their beginning stages, they offer interesting directions that await further challenges and accompanying research.

The Case Study of Elementary School Teachers Who Have Experienced Teacher Participation-oriented Education Program (TPEP) for Elementary School Teachers to Improve Class Expertise in Science Classes - Focusing on Visual Attention - (교사 참여형 교육프로그램(TPEP)을 경험한 초등교사의 과학 수업 전문성 변화 사례 - 시각적 주의를 중심으로 -)

Kim, Jang-Hwan;Shin, Won-Sub;Shin, Dong-Hoon
- Journal of Korean Elementary Science Education / v.39 no.1 / pp.133-144 / 2020
The purpose of this study is to identify the effect of the Teacher Participation-oriented Education Program (TPEP) for Elementary School Teachers to Improve Class Expertise in Science Classes, with a focus on visual attention. The participants were two elementary school teachers in Seoul who taught science. The lesson topics used in this study were 'Structure and Function of Our Body' from the second semester of fifth grade and 'Volcano and Earthquake' from the second semester of fourth grade. The mobile eye tracker used in this study was SMI's ETG 2w, a binocular tracking system. The actual practice time, participants' visual attention, visual intake time average, and visual intake time average were analyzed by class phase. The results of the study are as follows. First, the analysis of actual class execution time showed that it was almost in line with the lesson plan after the TPEP was applied. Second, visual attention in the areas related to teaching and learning activities was high after applying the TPEP. Factors affecting the progress of the class and cognitive burdens were identified quantitatively and objectively through visual attention. Third, the analysis of the participants' visual intake time average showed a statistically significant difference in all classes. Fourth, the analysis of the participants' visual intake time average showed statistically significant results in the introduction (video), activity 1, activity 2, and activity 3 stages of the lecture-type class. The TPEP can extend elementary science class expertise, such as self-class analysis, eye tracking, linguistic, gesture, and class design, beyond traditional class analysis and consulting.
https://doi.org/10.15267/keses.2020.39.1.133

Feature-Strengthened Gesture Recognition Model Based on Dynamic Time Warping for Multi-Users (다중 사용자를 위한 Dynamic Time Warping 기반의 특징 강조형 제스처 인식 모델)

Lee, Suk Kyoon;Um, Hyun Min;Kwon, Hyuck Tae
- KIPS Transactions on Software and Data Engineering / v.5 no.10 / pp.503-510 / 2016
The recently proposed FsGr model is an accelerometer-based gesture recognition approach that applies the DTW algorithm in two steps, which improves the recognition success rate. In the FsGr model, sets of similar gestures are produced during the training phase in order to define the notion of a set of similar gestures. If the result of the first recognition attempt turns out to belong to a set of similar gestures, the model makes a second recognition attempt on the feature-strengthened parts extracted from that set. However, since the same gesture shows drastically different characteristics depending on physical traits such as body size, age, and sex, the FsGr model may not be good enough for multi-user environments. In this paper, we propose the FsGrM model, which extends the FsGr model to multi-user environments, and present a program that controls the channel and volume of a smart TV using the FsGrM model.
https://doi.org/10.3745/KTSDE.2016.5.10.503
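For readers unfamiliar with DTW, the following minimal sketch shows the textbook dynamic-programming distance and a nearest-template recognizer built on it. It covers only a single recognition pass over one-dimensional signals; the second pass on feature-strengthened parts and the multi-user extension of FsGrM are not reproduced, and the template names and values are hypothetical.

```python
def dtw_distance(a, b):
    """Dynamic Time Warping distance between two 1-D sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = best alignment cost of a[:i] against b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j],      # a advances, b repeats
                cost[i][j - 1],      # b advances, a repeats
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


def recognize(sample, templates):
    """Return the name of the template with the smallest DTW distance."""
    return min(templates, key=lambda name: dtw_distance(sample, templates[name]))


# Hypothetical single-axis accelerometer templates.
templates = {
    "swipe_up": [0, 1, 2, 2, 1, 0],
    "hold": [0, 0, 0, 0],
}
print(recognize([0, 1, 2, 1, 0], templates))
```

Because DTW lets one sequence stretch against the other, the five-sample input still aligns with the six-sample `swipe_up` template at zero cost.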

Text-to-speech with linear spectrogram prediction for quality and speed improvement (음질 및 속도 향상을 위한 선형 스펙트로그램 활용 Text-to-speech)

Yoon, Hyebin
- Phonetics and Speech Sciences / v.13 no.3 / pp.71-78 / 2021
Most neural-network-based speech synthesis models utilize neural vocoders to convert mel-scaled spectrograms into high-quality, human-like voices. However, neural vocoders combined with mel-scaled spectrogram prediction models demand considerable computer memory and time during the training phase and are subject to slow inference speeds in an environment where GPU is not used. This problem does not arise in linear spectrogram prediction models, as they do not use neural vocoders, but these models suffer from low voice quality. As a solution, this paper proposes a Tacotron 2 and Transformer-based linear spectrogram prediction model that produces high-quality speech and does not use neural vocoders. Experiments suggest that this model can serve as the foundation of a high-quality text-to-speech model with fast inference speed.
https://doi.org/10.13064/KSSS.2021.13.3.071

Performance comparison evaluation of speech enhancement using various loss functions (다양한 손실 함수를 이용한 음성 향상 성능 비교 평가)

Hwang, Seo-Rim;Byun, Joon;Park, Young-Cheol
- The Journal of the Acoustical Society of Korea / v.40 no.2 / pp.176-182 / 2021
This paper evaluates and compares the performance of Deep Neural Network (DNN)-based speech enhancement models under various loss functions. We used a complex network that can consider the phase information of speech as a baseline model. As loss functions, we consider two types of basic loss functions, the Mean Squared Error (MSE) and the Scale-Invariant Source-to-Noise Ratio (SI-SNR), and two types of perceptual-based loss functions, the Perceptual Metric for Speech Quality Evaluation (PMSQE) and the Log Mel Spectra (LMS). The performance comparison was carried out through objective evaluation and listening tests on outputs obtained using various combinations of the loss functions. Test results show that when a perceptual-based loss function was combined with MSE or SI-SNR, the overall performance improved, and that the perceptual-based loss functions showed better performance in the listening test even when they exhibited lower objective scores.
https://doi.org/10.7776/ASK.2021.40.2.176
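The two basic loss functions named above can be written down compactly. The sketch below uses the commonly cited time-domain definitions of MSE and SI-SNR on plain Python lists; the abstract does not say in which domain the losses are applied or how they are weighted when combined, and the waveform values are made up.

```python
import math


def mse(estimate, target):
    """Mean squared error between two equal-length signals."""
    return sum((e - t) ** 2 for e, t in zip(estimate, target)) / len(target)


def si_snr(estimate, target, eps=1e-8):
    """Scale-invariant SNR in dB (higher is better)."""
    mean_e = sum(estimate) / len(estimate)
    mean_t = sum(target) / len(target)
    e = [x - mean_e for x in estimate]  # remove DC offset
    t = [x - mean_t for x in target]
    dot_et = sum(a * b for a, b in zip(e, t))
    energy_t = sum(b * b for b in t)
    # Project the estimate onto the target to get the "clean" component.
    scale = dot_et / (energy_t + eps)
    s_target = [scale * b for b in t]
    noise = [a - b for a, b in zip(e, s_target)]
    signal_energy = sum(x * x for x in s_target)
    noise_energy = sum(x * x for x in noise)
    return 10 * math.log10((signal_energy + eps) / (noise_energy + eps))


clean = [1.0, -1.0, 1.0, -1.0]      # hypothetical clean speech frame
enhanced = [0.9, -1.1, 1.2, -0.8]   # hypothetical model output
louder = [2 * x for x in enhanced]  # same output at twice the gain

print(mse(enhanced, clean), mse(louder, clean))
print(si_snr(enhanced, clean), si_snr(louder, clean))
```

Doubling the gain changes the MSE but leaves the SI-SNR essentially unchanged, which is the property the "scale-invariant" name refers to. As a training loss, the negative of `si_snr` would be minimized.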

Predicting the CPT-based pile set-up parameters using HHO-RF and PSO-RF hybrid models

Yun Dawei;Zheng Bing;Gu Bingbing;Gao Xibo;Behnaz Razzaghzadeh
- Structural Engineering and Mechanics / v.86 no.5 / pp.673-686 / 2023
Determining the properties of piles from the cone penetration test (CPT) is costly and needs several in-situ tests. In the present study, two novel hybrid learning models, namely PSO-RF and HHO-RF, which combine random forest (RF) with particle swarm optimization (PSO) and Harris hawks optimization (HHO), respectively, were developed and applied to predict the pile set-up parameter "A" from CPT for project design purposes. To forecast "A," CPT data were collected from different sites in Louisiana, where the selected input variables were plasticity index (PI), undrained shear strength (S_u), and over-consolidation ratio (OCR). Results show that both PSO-RF and HHO-RF models have acceptable performance in predicting the set-up parameter "A," with R² larger than 0.9094, indicating an acceptable correlation between observed and predicted values. HHO-RF performs better than the PSO-RF model, with R² and RMSE equal to 0.9328 and 0.0292 for the training phase and 0.9729 and 0.024 for the testing data, respectively. Moreover, the PI and OBJ indices are considered, for which the HHO-RF model has lower values, so this hybrid algorithm outperforms PSO-RF in predicting the pile set-up parameter "A" and is consequently specified as the proposed model. Therefore, the results demonstrate that the HHO algorithm is more capable than PSO of determining the optimal values of the RF hyperparameters.
https://doi.org/10.12989/sem.2023.86.5.673
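The R² and RMSE figures quoted above are standard regression metrics. A minimal sketch of how they are usually computed is given below, assuming the coefficient-of-determination form of R²; the paper may instead use the squared correlation coefficient, and the observed and predicted values here are invented, not taken from the study.

```python
import math


def rmse(observed, predicted):
    """Root mean squared error."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values are constant; R^2 is undefined")
    return 1 - ss_res / ss_tot


observed = [1.0, 2.0, 3.0, 4.0]   # hypothetical measured values
predicted = [1.1, 1.9, 3.2, 3.8]  # hypothetical model outputs
print(r_squared(observed, predicted), rmse(observed, predicted))
```

A higher R² together with a lower RMSE on held-out data is the pattern the abstract reports for HHO-RF relative to PSO-RF.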

Ensembles of neural network with stochastic optimization algorithms in predicting concrete tensile strength

Hu, Juan;Dong, Fenghui;Qiu, Yiqi;Xi, Lei;Majdi, Ali;Ali, H. Elhosiny
- Steel and Composite Structures / v.45 no.2 / pp.205-218 / 2022
Proper calculation of the splitting tensile strength (STS) of concrete is a crucial task, due to the wide use of concrete in the construction sector. Following many recent studies that have proposed various predictive models for this purpose, this study proposes and tests three hybrid models for predicting the STS from the characteristics of the mixture components, including cement compressive strength, cement tensile strength, curing age, the maximum size of the crushed stone, stone powder content, sand fine modulus, water-to-binder ratio, and the ratio of sand. A multi-layer perceptron (MLP) neural network is combined with invasive weed optimization (IWO), the cuttlefish optimization algorithm (CFOA), and the electrostatic discharge algorithm (ESDA), which are among the newest optimization techniques. A dataset from the earlier literature is used for exploring and extrapolating the STS behavior. The results for several accuracy criteria demonstrated good learning capability for all three hybrid models, namely IWO-MLP, CFOA-MLP, and ESDA-MLP. In the prediction phase, the predictions were also in promising agreement (above 88%) with the experimental results. However, a comparison revealed ESDA-MLP to be the most accurate predictor. In terms of the mean absolute percentage error (MAPE) index, the error of ESDA-MLP was 9.05%, while the corresponding values for IWO-MLP and CFOA-MLP were 9.17% and 13.97%, respectively. Since the combination of MLP and ESDA can be an effective tool for optimizing the concrete mixture toward a desirable STS, the last part of this study is dedicated to extracting a predictive formula from this model.
https://doi.org/10.12989/scs.2022.45.2.205
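The MAPE index used above to rank the three hybrids is, in its usual form, the average absolute error expressed as a percentage of each measured value. The sketch below shows that usual form only; the strength values are invented for illustration and are not from the paper's dataset.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Assumes no actual value is zero, since each error is divided by it.
    """
    ratios = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100 * sum(ratios) / len(ratios)


measured_sts = [2.0, 4.0]   # hypothetical measured strengths (MPa)
predicted_sts = [1.8, 4.4]  # hypothetical model predictions (MPa)
print(mape(measured_sts, predicted_sts))
```

Because every error is scaled by its own measured value, a fixed absolute error counts for more on weak specimens than on strong ones.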
