• Title/Summary/Keyword: training parameters

Speech Recognition Model Based on CNN using Spectrogram (스펙트로그램을 이용한 CNN 음성인식 모델)

  • Won-Seog Jeong;Haeng-Woo Lee
    • The Journal of the Korea institute of electronic communication sciences / v.19 no.4 / pp.685-692 / 2024
  • In this paper, we propose a new CNN model to improve the recognition performance of command voice signals. The method applies a short-time Fourier transform (STFT) to each short-time section of the input signal to obtain a spectrogram image, then performs multi-class supervised learning on those images with a CNN deep learning model. Converting the time-domain voice signal to the frequency domain expresses its characteristics well, and training on the resulting spectrogram images classifies commands effectively. To verify the performance of the proposed speech recognition system, a simulation program was written with the TensorFlow and Keras libraries and a simulation experiment was performed. The experiment confirmed that the proposed deep learning algorithm achieves an accuracy of 92.5%.
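
The STFT step described in this abstract can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function name `stft_magnitude`, the Hann window, the frame length of 64 samples, and the hop of 32 samples are all assumptions chosen for the example, and a naive DFT stands in for the FFT a real system would use.

```python
import cmath
import math

def stft_magnitude(signal, frame_len=64, hop=32):
    """Return a magnitude spectrogram: one list of frequency bins per frame."""
    # Hann window (assumed here) reduces spectral leakage at the frame edges.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_len - 1))
              for n in range(frame_len)]
    n_bins = frame_len // 2 + 1  # keep non-negative frequencies only
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(frame_len)]
        # Naive O(N^2) DFT, adequate for a sketch.
        spectrum = [
            abs(sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                    for n in range(frame_len)))
            for k in range(n_bins)
        ]
        frames.append(spectrum)
    return frames

# Usage: a 1 kHz tone sampled at 8 kHz; each bin is 8000 / 64 = 125 Hz wide.
fs = 8000
tone = [math.sin(2 * math.pi * 1000 * t / fs) for t in range(512)]
spec = stft_magnitude(tone)
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
```

Stacking the frames side by side gives the two-dimensional array that would be rendered as the spectrogram image and passed to the CNN.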

Simultaneous Optimization of KNN Ensemble Model for Bankruptcy Prediction (부도예측을 위한 KNN 앙상블 모형의 동시 최적화)

  • Min, Sung-Hwan
    • Journal of Intelligence and Information Systems / v.22 no.1 / pp.139-157 / 2016
  • Bankruptcy involves considerable costs, so it can have significant effects on a country's economy. Thus, bankruptcy prediction is an important issue. Over the past several decades, many researchers have addressed topics associated with bankruptcy prediction. Early research on bankruptcy prediction employed conventional statistical methods such as univariate analysis, discriminant analysis, multiple regression, and logistic regression. Later on, many studies began utilizing artificial intelligence techniques such as inductive learning, neural networks, and case-based reasoning. Currently, ensemble models are being utilized to enhance the accuracy of bankruptcy prediction. Ensemble classification involves combining multiple classifiers to obtain more accurate predictions than those obtained using individual models. Ensemble learning techniques are known to be very useful for improving the generalization ability of the classifier. Base classifiers in the ensemble must be as accurate and diverse as possible in order to enhance the generalization ability of an ensemble model. Commonly used methods for constructing ensemble classifiers include bagging, boosting, and random subspace. The random subspace method selects a random feature subset for each classifier from the original feature space to diversify the base classifiers of an ensemble. Each ensemble member is trained by a randomly chosen feature subspace from the original feature set, and predictions from each ensemble member are combined by an aggregation method. The k-nearest neighbors (KNN) classifier is robust with respect to variations in the dataset but is very sensitive to changes in the feature space. For this reason, KNN is a good classifier for the random subspace method. The KNN random subspace ensemble model has been shown to be very effective for improving an individual KNN model. 
The k parameter of the KNN base classifiers and the feature subsets selected for them play an important role in determining the performance of the KNN ensemble model. However, few studies have focused on optimizing the k parameter and feature subsets of base classifiers in the ensemble. This study proposes a new ensemble method that improves the performance of the KNN ensemble model by optimizing both the k parameters and the feature subsets of the base classifiers. A genetic algorithm was used to optimize the KNN ensemble model and improve its prediction accuracy. The proposed model was applied to a bankruptcy prediction problem using a real dataset from Korean companies. The research data included 1,800 externally non-audited firms, of which 900 filed for bankruptcy and 900 did not. Initially, the dataset consisted of 134 financial ratios. Prior to the experiments, 75 financial ratios were selected based on an independent-sample t-test with each financial ratio as an input variable and bankruptcy or non-bankruptcy as the output variable. Of these, 24 financial ratios were selected using a logistic regression backward feature selection method. The complete dataset was separated into two parts: training and validation. The training dataset was further divided into two portions: one for training the model and the other for avoiding overfitting. The prediction accuracy on this second portion was used as the fitness value in order to avoid overfitting. The validation dataset was used to evaluate the effectiveness of the final model. A 10-fold cross-validation was implemented to compare the classification accuracy of the proposed model with that of other models. The Q-statistic values and average classification accuracies of the base classifiers were also investigated. The experimental results showed that the proposed model outperformed other models, such as the single model and the random subspace ensemble model.
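
A KNN random subspace ensemble of the kind this abstract describes can be sketched as below. The sketch only shows the structure being optimized (one k value and one feature subset per base classifier). It draws them at random, where the paper uses a genetic algorithm, and every function name and the toy data are hypothetical.

```python
import random
from collections import Counter

def knn_predict(train_X, train_y, x, k, features):
    """Classify x by majority vote of its k nearest neighbours on a feature subset."""
    dists = sorted(
        (sum((row[f] - x[f]) ** 2 for f in features), label)
        for row, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]

def build_ensemble(n_features, n_members, subset_size, k_choices, rng):
    """Each member is a (k, feature subset) pair; a GA would search over these."""
    return [
        (rng.choice(k_choices), rng.sample(range(n_features), subset_size))
        for _ in range(n_members)
    ]

def ensemble_predict(ensemble, train_X, train_y, x):
    """Aggregate the members' predictions by majority vote."""
    votes = Counter(knn_predict(train_X, train_y, x, k, feats)
                    for k, feats in ensemble)
    return votes.most_common(1)[0][0]

# Toy data (hypothetical): class 0 near the origin, class 1 near (5, 5, 5, 5).
train_X = [[0, 0, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1],
           [5, 5, 5, 5], [4, 5, 4, 5], [5, 4, 5, 4]]
train_y = [0, 0, 0, 1, 1, 1]
ensemble = build_ensemble(n_features=4, n_members=5, subset_size=2,
                          k_choices=[1, 3], rng=random.Random(0))
low = ensemble_predict(ensemble, train_X, train_y, [0.2, 0.1, 0.3, 0.2])
high = ensemble_predict(ensemble, train_X, train_y, [4.8, 4.9, 5.0, 4.7])
```

In a GA-based version, the list returned by `build_ensemble` would be the chromosome, and accuracy on the held-out portion of the training data would serve as its fitness.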

The Effects of 8-week Ketone Body Supplementation on Endurance Exercise Performance and Autophagy in the Skeletal Muscle of Mice (8주 케톤체 투여가 마우스 지구성 운동수행능력과 골격근의 자가포식에 미치는 영향)

  • Jeong-sun Ju;Min-joo Park;Dal-woo Lee;Dong-won Lee
    • Journal of Life Science / v.33 no.3 / pp.242-251 / 2023
  • The purpose of this study was to investigate the effects of 8-week β-hydroxybutyrate (β-HB) administration with and without endurance exercise training on endurance exercise performance and skeletal muscle protein synthesis and degradation using a mouse model. Forty-eight male wild-type ICR mice (8 weeks old) were randomly divided into four groups: sedentary control (Sed+Con), sedentary β-HB (Sed+β-HB), exercise control (Exe+Con), and exercise β-HB (Exe+β-HB). β-HB was dissolved in PBS (150 mg/ml) and injected subcutaneously daily (250 mg/kg) for 8 weeks. Mice performed a 20 min treadmill running exercise 5 days/week for 8 weeks. The running exercise was carried out at a speed of 10 m/min at a 10° incline for 5 min, and then the speed was increased by 1 m/min every minute for the remaining 15 min. Following 8 weeks of treatments, visceral fat mass and skeletal muscle mass, blood parameters, and the markers for autophagy and protein synthesis were analyzed. The data were analyzed with one-way ANOVA (p<0.05) using the SPSS 21 program. Eight weeks of Exe+β-HB treatment significantly lowered blood lactate levels compared with the other three groups (Sed+Con, Sed+β-HB, and Exe+Con) (p<0.05). Eight weeks of Exe+β-HB significantly increased maximal running time (time to exhaustion) compared with the Sed+Con and Exe+Con groups (p<0.05). Eight weeks of β-HB administration significantly decreased autophagy flux and autophagy-related proteins in the skeletal muscle of mice (p<0.05). Conversely, the combined treatment of β-HB and endurance exercise training increased protein synthesis (mTOR signaling and translation) (p<0.05). The 8-week β-HB treatment and endurance exercise training had synergistic effects on the increase in endurance performance, increase in protein synthesis, and decrease in protein degradation in the skeletal muscle of mice.

Study on the Short-Term Hemodynamic Effects of Experimental Cardiomyoplasty in Heart Failure Model (심부전 모델에서 실험적 심근성형술의 단기 혈역학적 효과에 관한 연구)

  • Jeong, Yoon-Seop;Youm, Wook;Lee, Chang-Ha;Kim, Wook-Seong;Lee, Young-Tak;Kim, Won-Gon
    • Journal of Chest Surgery / v.32 no.3 / pp.224-236 / 1999
  • Background: To evaluate the short-term effect of dynamic cardiomyoplasty on circulatory function and detect the related factors that can affect it, experimental cardiomyoplasties were performed under the state of normal cardiac function and heart failure. Material and Method: A total of 10 mongrel dogs weighing 20 to 30kg were divided arbitrarily into two groups. Five dogs of group A underwent cardiomyoplasty with latissimus dorsi(LD) muscle mobilization followed by a 2-week vascular delay and 6-week muscle training. Then, hemodynamic studies were conducted. In group B, doxorubicin was given to 5 dogs in an IV dose of 1 mg/kg once a week for 8 weeks to induce chronic heart failure, and simultaneous muscle training was given for preconditioning during this period. Then, cardiomyoplasties were performed and hemodynamic studies were conducted immediately after these cardiomyoplasties in group B. Result: In group A, under the state of normal cardiac function, only mean right atrial pressure significantly increased with the pacer-on(p<0.05) and the left ventricular hemodynamic parameters did not change significantly. However, with pacer-on in group B, cardiac output(CO), rate of left ventricular pressure development(dp/dt), stroke volume(SV), and left ventricular stroke work(SW) increased by 16.7${\pm}$7.2%, 9.3${\pm}$3.2%, 16.8${\pm}$8.6%, and 23.1${\pm}$9.7%, respectively, whereas left ventricular end-diastole pressure(LVEDP) and mean pulmonary capillary wedge pressure(mPCWP) decreased by 32.1${\pm}$4.6% and 17.7${\pm}$9.1%, respectively(p<0.05). In group A, imipramine was infused at the rate of 7.5mg/kg/hour for 34${\pm}$2.6 minutes to induce acute heart failure, which resulted in the reduction of cardiac output by 17.5${\pm}$2.7%, systolic left ventricular pressure by 15.8${\pm}$2.5% and the elevation of left ventricular end-diastole pressure by 54.3${\pm}$15.2%(p<0.05). 
With pacer-on under this state of acute heart failure, CO, dp/dt, SV, and SW increased by 4.5${\pm}$1.8%, 3.1${\pm}$1.1%, 5.7${\pm}$3.6%, and 6.9${\pm}$4.4%, respectively, whereas LVEDP decreased by 11.7${\pm}$4.7%(p<0.05). Comparing CO, dp/dt, SV, SW and LVEDP, which changed significantly with pacer-on under both acute and chronic heart failure, the augmentation widths of these left ventricular hemodynamic parameters were significantly larger under chronic heart failure(group B) than under acute heart failure(group A)(p<0.05). On gross inspection, variable degrees of adhesion and inflammation were present in all 5 dogs of group A, including 2 dogs that showed no muscle contraction. However, no adhesion or inflammation was present in any of the 5 dogs of group B, which showed vivid muscle contractions. Considering these differences in gross findings along with the premise that the acute heart failure state was not statistically different from the chronic one in terms of left ventricular parameters(p>0.05), the larger augmentation effect seen in group B is presumed to be mainly attributable to the viability and contractility of the LD muscle. Conclusion: These results indicate that the positive circulatory augmentation effect of cardiomyoplasty is apparent only under the state of heart failure and that the preservation of muscle contractility is important to maximize this effect.

A Time Series Graph based Convolutional Neural Network Model for Effective Input Variable Pattern Learning : Application to the Prediction of Stock Market (효과적인 입력변수 패턴 학습을 위한 시계열 그래프 기반 합성곱 신경망 모형: 주식시장 예측에의 응용)

  • Lee, Mo-Se;Ahn, Hyunchul
    • Journal of Intelligence and Information Systems / v.24 no.1 / pp.167-181 / 2018
  • Over the past decade, deep learning has been in the spotlight among various machine learning algorithms. In particular, CNN(Convolutional Neural Network), which is known as an effective solution for recognizing and classifying images or voices, has been popularly applied to classification and prediction problems. In this study, we investigate how to apply CNN to business problem solving. Specifically, this study proposes to apply CNN to stock market prediction, one of the most challenging tasks in machine learning research. As mentioned, CNN has strength in interpreting images. Thus, the model proposed in this study adopts CNN as a binary classifier that predicts stock market direction (upward or downward) by using time series graphs as its inputs. That is, our proposal is to build a machine learning algorithm that mimics the experts called 'technical analysts', who examine the graph of past price movements and predict future financial price movements. Our proposed model, named 'CNN-FG(Convolutional Neural Network using Fluctuation Graph)', consists of five steps. In the first step, it divides the dataset into intervals of 5 days. In step 2, it creates time series graphs for the divided dataset. The size of the image in which the graph is drawn is $40(pixels){\times}40(pixels)$, and the graph of each independent variable is drawn in a different color. In step 3, the model converts the images into matrices. Each image is converted into a combination of three matrices in order to express the color value on the R(red), G(green), and B(blue) scales. In the next step, it splits the dataset of graph images into training and validation datasets. We used 80% of the total dataset as the training dataset, and the remaining 20% as the validation dataset. In the final step, CNN classifiers are trained using the images of the training dataset.
Regarding the parameters of CNN-FG, we adopted two convolution filters ($5{\times}5{\times}6$ and $5{\times}5{\times}9$) in the convolution layer. In the pooling layer, a $2{\times}2$ max pooling filter was used. The numbers of nodes in the two hidden layers were set to 900 and 32, respectively, and the number of nodes in the output layer was set to 2 (one for the prediction of an upward trend, and the other for a downward trend). The activation functions for the convolution layer and the hidden layers were set to ReLU(Rectified Linear Unit), and the one for the output layer was set to the Softmax function. To validate our model, CNN-FG, we applied it to the prediction of KOSPI200 for 2,026 days in eight years (from 2009 to 2016). To match the proportions of the two groups in the dependent variable (i.e. tomorrow's stock market movement), we selected 1,950 samples by applying random sampling. Finally, we built the training dataset using 80% of the total dataset (1,560 samples), and the validation dataset using 20% (390 samples). The independent variables of the experimental dataset included twelve technical indicators popularly used in previous studies. They include Stochastic %K, Stochastic %D, Momentum, ROC(rate of change), LW %R(Larry William's %R), A/D oscillator(accumulation/distribution oscillator), OSCP(price oscillator), CCI(commodity channel index), and so on. To confirm the superiority of CNN-FG, we compared its prediction accuracy with that of other classification models. Experimental results showed that CNN-FG outperforms LOGIT(logistic regression), ANN(artificial neural network), and SVM(support vector machine) with statistical significance. These empirical results imply that converting time series business data into graphs and building CNN-based classification models using these graphs can be effective from the perspective of prediction accuracy. Thus, this paper sheds light on how to apply deep learning techniques to the domain of business problem solving.
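
The first three steps of CNN-FG (5-day windows, graph drawing, conversion to a matrix) can be sketched roughly as follows. This is an assumption-laden illustration, not the authors' code: `make_windows` and `rasterize` are hypothetical helpers, a single indicator is drawn as a 0/1 matrix instead of several indicators in RGB, and the label rule (next close higher than the last close in the window) is one plausible reading of "tomorrow's stock market movement".

```python
def make_windows(series, closes, window=5):
    """Pair each `window`-day slice of an indicator with the next-day direction."""
    samples = []
    for end in range(window, len(series)):
        label = 1 if closes[end] > closes[end - 1] else 0  # 1 = upward
        samples.append((series[end - window:end], label))
    return samples

def rasterize(values, size=40):
    """Draw a line graph of `values` into a size x size matrix of 0/1 pixels."""
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0  # avoid division by zero for a flat series
    image = [[0] * size for _ in range(size)]
    for col in range(size):
        # Linearly interpolate the series at this pixel column.
        pos = col * (len(values) - 1) / (size - 1)
        i = min(int(pos), len(values) - 2)
        frac = pos - i
        v = values[i] + frac * (values[i + 1] - values[i])
        row = (size - 1) - round((v - lo) / span * (size - 1))  # row 0 is the top
        image[row][col] = 1
    return image

# Usage with made-up numbers: 10 days of one indicator and closing prices.
indicator = [1, 2, 3, 4, 5, 4, 3, 4, 5, 6]
closes = [100, 101, 102, 101, 103, 104, 103, 105, 106, 104]
samples = make_windows(indicator, closes)
images = [(rasterize(win), label) for win, label in samples]
split = int(0.8 * len(images))  # 80% training, 20% validation
train, valid = images[:split], images[split:]
```

A full version would draw each technical indicator in its own color and stack the R, G, and B matrices before feeding them to the convolution layers.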

Optimization of Multiclass Support Vector Machine using Genetic Algorithm: Application to the Prediction of Corporate Credit Rating (유전자 알고리즘을 이용한 다분류 SVM의 최적화: 기업신용등급 예측에의 응용)

  • Ahn, Hyunchul
    • Information Systems Review / v.16 no.3 / pp.161-177 / 2014
  • Corporate credit rating assessment consists of complicated processes in which various factors describing a company are taken into consideration. Such assessment is known to be very expensive since domain experts must be employed to assess the ratings. As a result, data-driven corporate credit rating prediction using statistical and artificial intelligence (AI) techniques has received considerable attention from researchers and practitioners. In particular, statistical methods such as multiple discriminant analysis (MDA) and multinomial logistic regression analysis (MLOGIT), and AI methods including case-based reasoning (CBR), artificial neural network (ANN), and multiclass support vector machine (MSVM) have been applied to corporate credit rating. Among them, MSVM has recently become popular because of its robustness and high prediction accuracy. In this study, we propose a novel optimized MSVM model and apply it to corporate credit rating prediction in order to enhance the accuracy. Our model, named 'GAMSVM (Genetic Algorithm-optimized Multiclass Support Vector Machine),' is designed to simultaneously optimize the kernel parameters and the feature subset selection. Prior studies such as Lorena and de Carvalho (2008) and Chatterjee (2013) show that proper kernel parameters may improve the performance of MSVMs. Also, the results from studies such as Shieh and Yang (2008) and Chatterjee (2013) imply that appropriate feature selection may lead to higher prediction accuracy. Based on these prior studies, we propose to apply GAMSVM to corporate credit rating prediction. As a tool for optimizing the kernel parameters and the feature subset selection, we suggest the genetic algorithm (GA). GA is known as an efficient and effective search method that attempts to simulate the phenomenon of biological evolution. By applying genetic operations such as selection, crossover, and mutation, it is designed to gradually improve the search results.
In particular, the mutation operator prevents GA from falling into local optima, so it can find a globally optimal or near-optimal solution. GA has popularly been applied to search for optimal parameters or feature subset selections of AI techniques including MSVM. For these reasons, we also adopt GA as an optimization tool. To empirically validate the usefulness of GAMSVM, we applied it to a real-world case of credit rating in Korea. Our application is in bond rating, which is the most frequently studied area of credit rating for specific debt issues or other financial obligations. The experimental dataset was collected from a large credit rating company in South Korea. It contained 39 financial ratios of 1,295 companies in the manufacturing industry, and their credit ratings. Using various statistical methods including one-way ANOVA and stepwise MDA, we selected 14 financial ratios as the candidate independent variables. The dependent variable, i.e. credit rating, was labeled as four classes: 1(A1); 2(A2); 3(A3); 4(B and C). 80 percent of the data for each class was used for training, and the remaining 20 percent was used for validation. To overcome the small sample size, we applied five-fold cross validation to our dataset. In order to examine the competitiveness of the proposed model, we also tested several comparative models including MDA, MLOGIT, CBR, ANN and MSVM. In the case of MSVM, we adopted the One-Against-One (OAO) and DAGSVM (Directed Acyclic Graph SVM) approaches because they are known to be the most accurate among the various MSVM approaches. GAMSVM was implemented using LIBSVM, an open-source software package, and Evolver 5.5, a commercial software package that enables GA. The other comparative models were tested using various statistical and AI packages such as SPSS for Windows, Neuroshell, and Microsoft Excel VBA (Visual Basic for Applications). Experimental results showed that the proposed model, GAMSVM, outperformed all the competing models. In addition, the model was found to use fewer independent variables while showing higher accuracy. In our experiments, five variables, namely X7 (total debt), X9 (sales per employee), X13 (years after founded), X15 (accumulated earning to total asset), and X39 (the index related to the cash flows from operating activity), were found to be the most important factors in predicting corporate credit ratings. However, the values of the finally selected kernel parameters were found to be almost the same among the data subsets. To examine whether the predictive performance of GAMSVM was significantly greater than that of the other models, we used the McNemar test. As a result, we found that GAMSVM was better than MDA, MLOGIT, CBR, and ANN at the 1% significance level, and better than OAO and DAGSVM at the 5% significance level.
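
The McNemar test mentioned at the end of this abstract compares two classifiers on the same validation samples using only the cases where they disagree. A minimal sketch, assuming the common continuity-corrected chi-square form (the abstract does not say which variant was used) and with made-up predictions:

```python
import math

def mcnemar(y_true, pred_a, pred_b):
    """Continuity-corrected McNemar test for two classifiers on the same samples."""
    # b: A correct and B wrong; c: A wrong and B correct. Agreements are ignored.
    b = sum(a == t and p != t for t, a, p in zip(y_true, pred_a, pred_b))
    c = sum(a != t and p == t for t, a, p in zip(y_true, pred_a, pred_b))
    if b + c == 0:
        return 0.0, 1.0  # the classifiers never disagree
    chi2 = (abs(b - c) - 1) ** 2 / (b + c)
    # Upper tail of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Made-up example: A alone is right 15 times, B alone is right 3 times.
y_true = [1] * 20
pred_a = [1] * 15 + [0] * 3 + [1] * 2
pred_b = [0] * 15 + [1] * 3 + [1] * 2
chi2, p_value = mcnemar(y_true, pred_a, pred_b)
```

A p-value below 0.01 or 0.05 would correspond to the 1% and 5% significance levels quoted in the abstract.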

Study of Motion-induced Dose Error Caused by Irregular Tumor Motion in Helical Tomotherapy (나선형 토모테라피에서 불규칙적인 호흡으로 발생되는 움직임에 의한 선량 오차에 대한 연구)

  • Cho, Min-Seok;Kim, Tae-Ho;Kang, Seong-Hee;Kim, Dong-Su;Kim, Kyeong-Hyeon;Cheon, Geum Seong;Suh, Tae Suk
    • Progress in Medical Physics / v.26 no.3 / pp.119-126 / 2015
  • The purpose of this study is to analyze the motion-induced dose error generated by each tumor motion parameter of irregular tumor motion in helical tomotherapy. To understand the effect of irregular tumor motion, a simple analytical model was simulated. Moving cases, which have tumor motion, were divided into a slightly irregular tumor motion case, a large irregular tumor motion case, and a patient case. The slightly irregular tumor motion case was simulated with a variability of 10% in the tumor motion parameters of amplitude (amplitude case), period (period case), and baseline (baseline case), while the large irregular tumor motion case was simulated with a variability of 40%. In the phase case, the initial phase of the tumor motion was divided into end inhale, mid exhale, end exhale, and mid inhale; the simulated dose profiles for each case were compared. The patient case was also investigated to verify the motion-induced dose error in 'clinical-like' conditions. The dose profile was calculated according to the simulation process, and each moving case was compared with the static case, which has no tumor motion. In the amplitude, period, and baseline cases, the results show that the motion-induced dose error in the large irregular tumor motion case was larger than that in the slightly irregular tumor motion case or the regular tumor motion case. Because the offset effect was inversely proportional to the irregularity of tumor motion, the offset effect was smaller in the large irregular tumor motion case than in the slightly irregular tumor motion case or the regular tumor motion case. In the phase case, a larger dose discrepancy was observed in the irregular tumor motion case than in the regular tumor motion case. A larger motion-induced dose error was also observed in the patient case than in the regular tumor motion case. This study analyzed motion-induced dose error as a function of each tumor motion parameter of irregular tumor motion during helical tomotherapy. The analysis showed that controlling the variability of irregular tumor motion is important. We believe that the variability of irregular tumor motion can be reduced by using abdominal compression and respiratory training.

Investigation of aerodynamic evaluation in female patients undergoing thyroidectomy (갑상선절제술을 받은 여성 환자의 공기역학 검사변수 조사)

  • Kang, Young Ae;Kwon, In Sun;Won, Ho-Ryun;Chang, Jae Won;Koo, Bon Seok
    • Phonetics and Speech Sciences / v.12 no.2 / pp.73-80 / 2020
  • Breathing is the voice's driving force and also acts as a regulator of larynx function and efficiency. Respiratory distress is a side effect of general anesthesia in thyroid surgery. Therefore, this study's objective was to provide practical and complementary information for voice recovery after thyroid surgery, based on aerodynamic evaluation pre- and post-thyroidectomy. From May 2014 to July 2015, aerodynamic evaluations were performed on 34 female patients diagnosed with thyroid papillary cancer one week before surgery (PRE), one month after surgery (P1), and three months after surgery (P3). The Phonatory Aerodynamic System (model 6600, KayPENTAX, USA) was employed for this purpose, and a total of 29 analysis parameters were selected. The results showed statistically significant differences in peak expiratory airflow (p=0.004), mean pitch (p<0.01), expiration airflow duration (p=0.001), and expiratory volume (p=0.018), based on time factors. In the comparison of time factors, peak expiratory airflow and mean pitch parameters were different in PRE-P1 and PRE-P3. Expiration airflow duration and expiratory volume parameters were different in PRE-P3 and P1-P3. The interaction effect of time and surgical range was significant only for expiratory volume (p=0.024). Female patients who undergo thyroidectomy require post-operative breathing training, and exhalation improvement is considered to reflect a positive lifestyle after surgery.

A Study on Market Size Estimation Method by Product Group Using Word2Vec Algorithm (Word2Vec을 활용한 제품군별 시장규모 추정 방법에 관한 연구)

  • Jung, Ye Lim;Kim, Ji Hui;Yoo, Hyoung Sun
    • Journal of Intelligence and Information Systems / v.26 no.1 / pp.1-21 / 2020
  • With the rapid development of artificial intelligence technology, various techniques have been developed to extract meaningful information from unstructured text data, which constitutes a large portion of big data. Over the past decades, text mining technologies have been utilized in various industries for practical applications. In the field of business intelligence, text mining has been employed to discover new market and/or technology opportunities and to support rational decision making by business participants. Market information such as market size, market growth rate, and market share is essential for setting companies' business strategies. There has been a continuous demand in various fields for market information at the level of specific products. However, the information has generally been provided at the industry level or in broad categories based on classification standards, making it difficult to obtain specific and proper information. In this regard, we propose a new methodology that can estimate the market sizes of product groups at more detailed levels than those previously offered. We applied the Word2Vec algorithm, a neural network based semantic word embedding model, to enable automatic market size estimation from individual companies' product information in a bottom-up manner. The overall process is as follows: First, the data related to product information is collected, refined, and restructured into a form suitable for applying the Word2Vec model. Next, the preprocessed data is embedded into a vector space by Word2Vec, and then the product groups are derived by extracting similar product names based on cosine similarity calculation. Finally, the sales data on the extracted products is summed to estimate the market size of the product groups. As experimental data, text data of product names from Statistics Korea's microdata (345,103 cases) were mapped into a multidimensional vector space by Word2Vec training.
We performed parameter optimization for training and then applied a vector dimension of 300 and a window size of 15 as the optimized parameters for further experiments. We employed index words of the Korean Standard Industry Classification (KSIC) as a product name dataset to cluster product groups more efficiently. The product names that are similar to KSIC indexes were extracted based on cosine similarity. The market size of the extracted products as one product category was calculated from individual companies' sales data. The market sizes of 11,654 specific product lines were automatically estimated by the proposed model. For performance verification, the results were compared with the actual market sizes of some items. The Pearson's correlation coefficient was 0.513. Our approach has several advantages over previous studies. First, text mining and machine learning techniques were applied for the first time to market size estimation, overcoming the limitations of traditional sampling-based methods or methods that require multiple assumptions. In addition, the level of market category can be easily and efficiently adjusted according to the purpose of information use by changing the cosine similarity threshold. Furthermore, it has high potential for practical applications since it can resolve unmet needs for detailed market size information in the public and private sectors. Specifically, it can be utilized in technology evaluation and technology commercialization support programs conducted by governmental institutions, as well as in business strategy consulting and market analysis report publishing by private firms. The limitation of our study is that the presented model needs to be improved in terms of accuracy and reliability. The semantic-based word embedding module can be advanced by giving a proper order to the preprocessed dataset or by combining another algorithm such as Jaccard similarity with Word2Vec. Also, the method of product group clustering can be changed to other types of unsupervised machine learning algorithms. Our group is currently working on subsequent studies, and we expect that they can further improve the performance of the basic model conceptually proposed in this study.
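
The grouping and summation steps of this abstract (extract product names whose embedding is close to a KSIC index word, then add up their sales) can be sketched as below. The two-dimensional vectors, product names, sales figures, and the 0.7 threshold are all invented for illustration; the paper's embeddings come from Word2Vec training with 300 dimensions.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def market_size(index_vec, product_vecs, sales, threshold=0.7):
    """Sum the sales of every product whose vector is close to the index word."""
    members = [name for name, vec in product_vecs.items()
               if cosine(index_vec, vec) >= threshold]
    return members, sum(sales[name] for name in members)

# Hypothetical embeddings and sales (toy values, not from the paper).
product_vecs = {
    "led lamp": [1.0, 0.1],
    "led bulb": [0.9, 0.2],
    "steel pipe": [0.0, 1.0],
}
sales = {"led lamp": 100, "led bulb": 50, "steel pipe": 300}
members, size = market_size([1.0, 0.0], product_vecs, sales)
```

Raising or lowering `threshold` narrows or widens the product group, which is how the abstract describes adjusting the level of the market category.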

True Orthoimage Generation from LiDAR Intensity Using Deep Learning (딥러닝에 의한 라이다 반사강도로부터 엄밀정사영상 생성)

  • Shin, Young Ha;Hyung, Sung Woong;Lee, Dong-Cheon
    • Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography / v.38 no.4 / pp.363-373 / 2020
  • During the last decades, numerous studies on generating orthoimages have been carried out. Traditional methods require exterior orientation parameters of aerial images, precise 3D object modeling data, and a DTM (Digital Terrain Model) to detect and recover occlusion areas. Furthermore, automating this complicated process is a challenging task. In this paper, we propose a new concept of true orthoimage generation using DL (Deep Learning). DL is rapidly being adopted in a wide range of fields. In particular, GAN (Generative Adversarial Network) is one of the DL models used for various tasks in image processing and computer vision. The generator tries to produce results similar to the real images, while the discriminator judges fake and real images until the results are satisfactory. This mutually adversarial mechanism improves the quality of the results. Experiments were performed with the GAN-based Pix2Pix model using IR (Infrared) orthoimages and intensity from LiDAR data provided by the German Society for Photogrammetry, Remote Sensing and Geoinformation (DGPF) through the ISPRS (International Society for Photogrammetry and Remote Sensing). Two approaches were implemented: (1) one-step training with intensity data and high resolution orthoimages, and (2) recursive training with intensity data and color-coded low resolution intensity images for progressive enhancement of the results. The two approaches provided similar quality based on FID (Fréchet Inception Distance) measures. However, if the quality of the input data is close to that of the target image, better results could be obtained by increasing the number of epochs. This paper is an early experimental study on the feasibility of DL-based true orthoimage generation, and further improvement would be necessary.
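
For reference, the FID measure used in this abstract compares the feature statistics of real and generated images. The formula below is the standard definition, not something stated in the abstract. The symbols are notation assumed here: $\mu_r, \Sigma_r$ and $\mu_g, \Sigma_g$ are the mean and covariance of Inception network features for real and generated images, respectively.

$$
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2 + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_g - 2\left( \Sigma_r \Sigma_g \right)^{1/2} \right)
$$

Lower values indicate that the generated orthoimages are statistically closer to the reference images, and identical feature distributions give an FID of 0.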