• Title/Summary/Keyword: Network Parameters

Sensitivity of Aerosol Optical Parameters on the Atmospheric Radiative Heating Rate (에어로졸 광학변수가 대기복사가열률 산정에 미치는 민감도 분석)

  • Kim, Sang-Woo;Choi, In-Jin;Yoon, Soon-Chang;Kim, Yumi
    • Atmosphere
    • /
    • v.23 no.1
    • /
    • pp.85-92
    • /
    • 2013
  • We estimate the atmospheric radiative heating effect of aerosols based on AErosol RObotic NETwork (AERONET) and lidar observations combined with radiative transfer calculations. The column radiation model (CRM) is modified to ingest the AERONET-measured variables (aerosol optical depth, single scattering albedo, and asymmetry parameter) and to calculate the optical parameters at its 19 spectral bands from the data obtained at four wavelengths. The aerosol radiative forcing at the surface and at the top of the atmosphere, as well as the atmospheric absorption, on the pollution day (April 15, 2001) and the dust days (April 17~18, 2001) are 3~4 times greater than those on the clear-sky days (April 14 and 16, 2001). The atmospheric radiative heating rate (ΔH) and the heating rate attributable to aerosols (ΔH_aerosol) are estimated to be about 3 K day⁻¹ and 1~3 K day⁻¹, respectively, for the pollution and dust aerosol layers. A sensitivity test showed that a 10% uncertainty in the single scattering albedo results in 30% uncertainty in the aerosol radiative forcing at the surface and at the top of the atmosphere and 60% uncertainty in the atmospheric forcing, which translates to about 35% uncertainty in ΔH. This result suggests that atmospheric radiative heating is largely determined by the amount of light-absorbing aerosols.
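
For reference, the heating-rate quantities above follow the standard radiative-transfer relation between layer heating and the divergence of the net radiative flux; the formulation below is textbook background, not an equation quoted from the paper:

$$\frac{\partial T}{\partial t} = -\frac{1}{\rho c_{p}}\frac{\partial F_{net}}{\partial z}$$

where $\rho$ is the air density, $c_{p}$ the specific heat of air at constant pressure, and $F_{net}$ the net radiative flux; ΔH_aerosol is then typically obtained as the difference between the heating rates computed with and without the aerosol layer.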

Corporate Bond Rating Using Various Multiclass Support Vector Machines (다양한 다분류 SVM을 적용한 기업채권평가)

  • Ahn, Hyun-Chul;Kim, Kyoung-Jae
    • Asia pacific journal of information systems
    • /
    • v.19 no.2
    • /
    • pp.157-178
    • /
    • 2009
  • Corporate credit rating is a very important factor in the market for corporate debt. Information concerning corporate operations is often disseminated to market participants through the changes in credit ratings published by professional rating agencies such as Standard and Poor's (S&P) and Moody's Investors Service. Since these agencies generally charge a large fee for the service, and the periodically provided ratings sometimes do not reflect the default risk of the company at the time, it may be advantageous for bond-market participants to be able to classify credit ratings before the agencies actually publish them. As a result, it is very important for companies (especially financial companies) to develop a proper model of credit rating. From a technical perspective, credit rating constitutes a typical multiclass classification problem, because rating agencies generally have ten or more rating categories. For example, S&P's ratings range from AAA for the highest-quality bonds to D for the lowest-quality bonds. The professional rating agencies emphasize the importance of analysts' subjective judgments in determining credit ratings. In practice, however, a mathematical model that uses the financial variables of companies plays an important role in determining credit ratings, since it is convenient to apply and cost-efficient. These financial variables include ratios that represent a company's leverage, liquidity, and profitability. Several statistical and artificial intelligence (AI) techniques have been applied as tools for predicting credit ratings. Among them, artificial neural networks are the most prevalent in finance because of their broad applicability to many business problems and their preeminent ability to adapt. However, artificial neural networks also have many defects, including the difficulty of determining the values of the control parameters and the number of processing elements in each layer, as well as the risk of over-fitting. Of late, because of their robustness and high accuracy, support vector machines (SVMs) have become popular as a solution to problems requiring accurate prediction. An SVM's solution may be globally optimal because SVMs seek to minimize structural risk, whereas artificial neural network models may tend to find locally optimal solutions because they seek to minimize empirical risk. In addition, no parameters need to be tuned in SVMs, barring the upper bound for the non-separable cases in linear SVMs. However, since SVMs were originally devised for binary classification, they are not intrinsically geared toward multiclass classification problems such as credit rating. Researchers have therefore tried to extend the original SVM to multiclass classification, and a variety of techniques for extending standard SVMs to multiclass SVMs (MSVMs) have been proposed in the literature; only a few types of MSVM, however, have been tested in prior studies applying MSVMs to credit rating. In this study, we examined six different MSVM techniques: (1) One-Against-One, (2) One-Against-All, (3) DAGSVM, (4) ECOC, (5) the method of Weston and Watkins, and (6) the method of Crammer and Singer. In addition, we examined the prediction accuracy of modified versions of conventional MSVM techniques. To find the most appropriate MSVM technique for corporate bond rating, we applied all of these techniques to a real-world case of credit rating in Korea.
The best-studied application is corporate bond rating, the most frequently examined area of credit rating for specific debt issues or other financial obligations. For our study, the research data were collected from National Information and Credit Evaluation, Inc., a major bond-rating company in Korea. The data set comprises the bond ratings for the year 2002 and various financial variables for 1,295 companies from the Korean manufacturing industry. We compared the results of these techniques with one another and with those of traditional methods for credit rating, such as multiple discriminant analysis (MDA), multinomial logistic regression (MLOGIT), and artificial neural networks (ANNs). As a result, we found that DAGSVM with an ordered list was the best approach for predicting bond ratings. In addition, we found that the modified version of the ECOC approach can yield higher prediction accuracy for cases showing clear patterns.
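
As a minimal illustration of the multiclass decompositions examined above (not the authors' implementation, and using synthetic stand-ins for the financial ratios rather than the proprietary Korean bond-rating data), the one-against-one and one-against-all strategies can be sketched with scikit-learn:

```python
# Sketch: one-against-one vs. one-against-all multiclass SVMs on synthetic
# stand-ins for financial ratios (features) and rating classes (labels).
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.multiclass import OneVsOneClassifier, OneVsRestClassifier
from sklearn.svm import SVC

# Hypothetical data: 1,295 firms, 10 financial ratios, 5 rating classes.
X, y = make_classification(n_samples=1295, n_features=10, n_informative=8,
                           n_classes=5, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

base = SVC(kernel="rbf", C=1.0)  # the binary SVM used as the building block
for name, clf in [("one-against-one", OneVsOneClassifier(base)),
                  ("one-against-all", OneVsRestClassifier(base))]:
    clf.fit(X_tr, y_tr)
    print(name, "test accuracy:", round(clf.score(X_te, y_te), 3))
```

DAGSVM, ECOC, and the Weston-Watkins and Crammer-Singer formulations would replace the loop body with the corresponding decomposition or a joint optimization.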

A New Bias Scheduling Method for Improving Both Classification Performance and Precision on the Classification and Regression Problems (분류 및 회귀문제에서의 분류 성능과 정확도를 동시에 향상시키기 위한 새로운 바이어스 스케줄링 방법)

  • Kim Eun-Mi;Park Seong-Mi;Kim Kwang-Hee;Lee Bae-Ho
    • Journal of KIISE:Software and Applications
    • /
    • v.32 no.11
    • /
    • pp.1021-1028
    • /
    • 2005
  • The general solution for classification and regression problems can be found by encoding real-world information into matrices and then learning these matrices in neural networks. This paper treats the primary space as the real world, and the dual space as the space to which the primary space is mapped using a kernel. In practice, there are two kinds of problems: complete systems, whose answer can be obtained using the inverse matrix, and ill-posed or singular systems, whose answer cannot be obtained directly from the inverse of the given matrix. Moreover, problems are often of the latter kind; it is therefore necessary to find a regularization parameter that turns ill-posed or singular problems into complete systems. This paper compares the performance, on both classification and regression problems, of GCV and L-Curve, which are well-known methods for obtaining the regularization parameter, with kernel methods. Both GCV and L-Curve perform excellently in obtaining regularization parameters, and their performances are similar, although they show slightly different results under different problem conditions. However, these methods are two-step solutions: the regularization parameter must first be calculated, and only then can the problem be passed to another solving method. Compared with GCV and L-Curve, kernel methods are a one-step solution, learning the regularization parameter simultaneously within the learning process of the pattern weights. This paper also suggests a dynamic momentum, learned under a limited proportional condition between the learning epoch and the performance on the given problem, to increase the performance and precision of the regularization. Finally, this paper shows through experiments that the suggested solution achieves better or equivalent results compared with GCV and L-Curve, using the Iris data, a standard data set for classification, Gaussian data, typical of singular systems, and the Shaw data, a one-dimensional image restoration problem.
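
As background on the two-step methods compared above (a generic numpy sketch on toy data, not the paper's experiments on the Iris, Gaussian, or Shaw problems), Tikhonov regularization with a GCV-selected parameter can be written as:

```python
# Sketch: choose a Tikhonov regularization parameter for a nearly singular
# linear system A x = b by minimizing the GCV score.
import numpy as np

def tikhonov_solve(A, b, lam):
    """Solve min ||A x - b||^2 + lam ||x||^2 via the normal equations."""
    n = A.shape[1]
    return np.linalg.solve(A.T @ A + lam * np.eye(n), A.T @ b)

def gcv_score(A, b, lam):
    """GCV(lam) ~ ||A x_lam - b||^2 / (m - trace(H_lam))^2."""
    m, n = A.shape
    # Influence matrix H_lam = A (A^T A + lam I)^{-1} A^T
    H = A @ np.linalg.solve(A.T @ A + lam * np.eye(n), A.T)
    resid = A @ tikhonov_solve(A, b, lam) - b
    return (resid @ resid) / (m - np.trace(H)) ** 2

rng = np.random.default_rng(0)
A = rng.standard_normal((50, 30))
A[:, 1] = A[:, 0] + 1e-8 * rng.standard_normal(50)  # make A nearly singular
b = A @ rng.standard_normal(30) + 0.01 * rng.standard_normal(50)

lams = np.logspace(-8, 2, 60)
best = min(lams, key=lambda lam: gcv_score(A, b, lam))
print("GCV-selected lambda:", best)
```

The L-Curve alternative would instead locate the corner of the curve of solution norm versus residual norm over the same grid of lambda values.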

Measurement of Backscattering Coefficients of Rice Canopy Using a Ground Polarimetric Scatterometer System (지상관측 레이다 산란계를 이용한 벼 군락의 후방산란계수 측정)

  • Hong, Jin-Young;Kim, Yi-Hyun;Oh, Yi-Sok;Hong, Suk-Young
    • Korean Journal of Remote Sensing
    • /
    • v.23 no.2
    • /
    • pp.145-152
    • /
    • 2007
  • The polarimetric backscattering coefficients of a wet-land rice field, an experimental plot belonging to the National Institute of Agricultural Science and Technology in Suwon, were measured using ground-based polarimetric scatterometers at 1.8 and 5.3 GHz throughout a growing season, from transplanting to harvest (May to October 2006). The polarimetric scatterometers consist of a vector network analyzer with a time-gating function and a polarimetric antenna set, and are calibrated to obtain VV-, HV-, VH-, and HH-polarized backscattering coefficients from the measurements, based on a single-target calibration technique using a trihedral corner reflector. The polarimetric backscattering coefficients were measured at incidence angles of 30°, 40°, 50°, and 60°, with 30 independent samples for each incidence angle at each frequency. During the measurement periods, ground truth data including fresh and dry biomass, plant height, stem density, leaf area, specific leaf area, and moisture content were also collected for each measurement. The temporal variations of the measured backscattering coefficients, as well as the measured plant height, LAI (leaf area index), and biomass, were analyzed, and the measured polarimetric backscattering coefficients were compared with the rice growth parameters. The measured plant height increases monotonically, while the measured LAI increases only until the ripening period and decreases afterward. The measured backscattering coefficients were fitted with polynomial expressions as functions of growth age, LAI, and plant height for each polarization, frequency, and incidence angle. At larger incidence angles, the correlation of the L-band signatures with rice growth was higher than that of the C-band signatures. The HH-polarized backscattering coefficients were found to be more sensitive than the VV-polarized ones to growth age and the other input parameters. To derive functions for estimating rice growth, it is necessary to divide the data according to growth periods marked by qualitative changes such as panicle initiation, flowering, or heading.
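
The polynomial fitting of backscattering coefficients to growth variables described above can be sketched as follows; the LAI and σ⁰ values are hypothetical placeholders, not the measured Suwon data:

```python
# Sketch: fit a backscattering coefficient (sigma0, dB) to a growth variable
# such as LAI with a polynomial, as in the paper's regression step.
# The sample values below are hypothetical, not the measured rice-field data.
import numpy as np

lai    = np.array([0.5, 1.2, 2.4, 3.8, 4.6, 4.1, 3.2])       # hypothetical LAI
sigma0 = np.array([-18., -15., -12., -10., -9., -9.5, -11.])  # hypothetical dB

coeffs = np.polyfit(lai, sigma0, deg=2)   # second-order polynomial fit
fit    = np.poly1d(coeffs)
r = np.corrcoef(sigma0, fit(lai))[0, 1]   # correlation of fit vs. data
print("sigma0(LAI) ≈", fit)
print("correlation:", round(r, 3))
```

One such fit would be derived per polarization, frequency, incidence angle, and growth period.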

Evaluation of Water Quality Characteristics of Saemangeum Lake Using Statistical Analysis (통계분석을 이용한 새만금호의 수질특성 평가)

  • Jong Gu Kim
    • Journal of the Korean Society of Marine Environment & Safety
    • /
    • v.29 no.4
    • /
    • pp.297-306
    • /
    • 2023
  • Saemangeum Lake is the largest artificial lake in Korea. The continuous deterioration of the lake's water quality necessitates the introduction of novel water quality management strategies. This study therefore aims to identify the spatiotemporal water quality characteristics of Saemangeum Lake using data from the National Water Quality Measurement Network and to provide basic information for water quality management. Among the water quality parameters of Saemangeum Lake, water temperature and total phosphorus were correlated, and salinity, total nitrogen, pH, and chemical oxygen demand were significantly correlated; the other parameters showed low correlations. A spatial principal component analysis of Saemangeum Lake revealed four characteristic zones: the mid-to-downstream river sections affected by freshwater inflow showed high nutrient concentrations, and the deep-water sections near the drainage gates and the part of the lake affected by seawater showed high salinity. Two types of water quality were observed in the intermediate area where river water and outer seawater mix: waters with relatively low salinity and high chemical oxygen demand, and waters with relatively low salinity and high pH. In the principal component analysis over time, the water quality was divided into four groups by observation month: Group I occurred in late spring and early summer (May-June), Group II in early spring (March-April) and late autumn (November-December), Group III in winter (January-February), and Group IV in the high-temperature summer months (July-October). The water quality characteristics of Saemangeum Lake were found to be affected by the inflow from the upper Mangyeong and Dongjin rivers and by seawater entering through the Garuk and Shinshi gates installed in the Saemangeum Embankment. To achieve the target water quality of Saemangeum Lake, it is necessary to establish water quality management measures for the lake along with pollution source management measures in the upper basin.
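
The principal component analysis pattern used above can be sketched with scikit-learn; the monthly water-quality matrix below is synthetic, standing in for the National Water Quality Measurement Network observations:

```python
# Sketch: PCA of water-quality parameters to group samples by zone or month.
# The data matrix here is synthetic, not the Saemangeum monitoring data.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
params = ["temp", "salinity", "pH", "COD", "TN", "TP"]
X = rng.standard_normal((48, len(params)))  # 48 monthly samples (synthetic)

Z = StandardScaler().fit_transform(X)       # standardize before PCA
pca = PCA(n_components=2).fit(Z)
scores = pca.transform(Z)                   # sample scores used for grouping
print("explained variance ratio:", pca.explained_variance_ratio_)
print("PC1 loadings:", dict(zip(params, pca.components_[0].round(2))))
```

Grouping the sample scores by station or by observation month would reproduce the spatial-zone and seasonal-group views described in the abstract.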

Memory Organization for a Fuzzy Controller

  • Jee, K.D.S.;Poluzzi, R.;Russo, B.
    • Proceedings of the Korean Institute of Intelligent Systems Conference
    • /
    • 1993.06a
    • /
    • pp.1041-1043
    • /
    • 1993
  • Fuzzy-logic-based control theory has gained much interest in the industrial world, thanks to its ability to formalize and solve in a very natural way many problems that are very difficult to quantify at an analytical level. This paper shows a solution for treating membership functions inside hardware circuits. The proposed hardware structure optimizes the memory size by using a particular form of vectorial representation. The process of memorizing fuzzy sets, i.e. their membership functions, has always been one of the more problematic issues for hardware implementation, due to the quite large memory space that is needed. To simplify such an implementation, it is common [1,2,8,9,10,11] to limit the membership functions to those having a triangular, trapezoidal, or otherwise pre-defined shape. These kinds of functions can cover a large spectrum of applications with limited memory usage, since they can be memorized by specifying very few parameters (height, base, critical points, etc.). This, however, results in a loss of computational power due to the computation of the intermediate points. A solution to this problem is obtained by discretizing the universe of discourse U, i.e. by fixing a finite number of points and memorizing the values of the membership functions at those points [3,10,14,15]. Such a solution provides satisfying computational speed and a very high precision of definition, and gives users the opportunity to choose membership functions of any shape. However, significant memory waste can also occur: it is indeed possible that, for each of the given fuzzy sets, many elements of the universe of discourse have a membership value equal to zero. It has also been noticed that in almost all cases the points common to several fuzzy sets, i.e. points with non-null membership values, are very few. More specifically, in many applications there exist, for each element u of U, at most three fuzzy sets for which the membership value is not null [3,5,6,7,12,13]. Our proposal is based on these hypotheses. Moreover, we use a technique that, even though it does not restrict the shapes of the membership functions, strongly reduces the computational time for the membership values and optimizes the memorization of the functions. Figure 1 represents a term set whose characteristics are common for fuzzy controllers and to which we refer in the following. This term set has a universe of discourse with 128 elements (so as to have good resolution), 8 fuzzy sets describing the term set, and 32 levels of discretization for the membership values. Clearly, the numbers of bits necessary for the given specifications are 5 for the 32 truth levels, 3 for the 8 membership functions, and 7 for the 128 levels of resolution. The memory depth is given by the dimension of the universe of discourse (128 in our case) and is represented by the memory rows. The length of a word of memory is defined by Length = nfm × (dm(m) + dm(fm)), where nfm is the maximum number of non-null membership values on any element of the universe of discourse, dm(m) is the dimension (in bits) of a membership value m, and dm(fm) is the dimension of the word representing the index of the corresponding membership function. In our case, then, Length = 3 × (5 + 3) = 24 and the memory dimension is 128 × 24 bits. Had we chosen to memorize all values of the membership functions, each memory row would have to store the membership value of every fuzzy set, giving a word dimension of 8 × 5 bits and therefore a memory dimension of 128 × 40 bits.
Coherently with our hypothesis, in Fig. 1 each element of the universe of discourse has a non-null membership value on at most three fuzzy sets; elements 32, 64, and 96 of the universe of discourse, for instance, are memorized in this reduced form. The computation of the rule weights is done by comparing the bits that represent the index of the membership function with the word from the program memory: the output bus of the program memory (μCOD) is given as input to a comparator (combinatory net); if the index equals the bus value, one of the non-null weights derived from the rule is produced as output, otherwise the output is zero (Fig. 2). The memory dimension of the antecedent is thus reduced, since only non-null values are memorized, while the time performance of the system remains equivalent to that of a system using vectorial memorization of all weights. The dimensioning of the word is influenced by some parameters of the input variable. The most important parameter is the maximum number of membership functions (nfm) having a non-null value on each element of the universe of discourse. From our studies of fuzzy systems, we see that typically nfm ≤ 3 and there are at most 16 membership functions; at any rate, this value can be increased up to the physical dimensional limit of the antecedent memory. A less important role in the optimization of the word dimension is played by the number of membership functions defined for each linguistic term. The table below shows the required word dimension as a function of these parameters and compares our proposed method with the method of vectorial memorization [10]. Summing up, the characteristics of our method are: users are not restricted to membership functions with specific shapes; the number of fuzzy sets and the resolution of the vertical axis have a very small influence on memory space; weight computations are done by a combinatorial network, so the time performance of the system is equivalent to that of the vectorial method; and the number of non-null membership values on any element of the universe of discourse is limited, a constraint that is usually not very restrictive, since many controllers obtain good precision with only three non-null weights. The method briefly described here has been adopted by our group in the design of an optimized version of the coprocessor described in [10].
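
The memory sizing worked through above can be reproduced directly from the stated specifications (128-element universe, 8 fuzzy sets, 32 truth levels, nfm = 3); the short sketch below recomputes the 24-bit word and the 128 × 24 versus 128 × 40 bit comparison:

```python
# Recompute the memory sizing from the abstract's example term set:
# 128-element universe, 8 fuzzy sets, 32 discretization levels, and at most
# nfm = 3 non-null membership values per element of the universe.
from math import ceil, log2

U, n_sets, levels, nfm = 128, 8, 32, 3
dm_m  = ceil(log2(levels))   # bits per membership value -> 5
dm_fm = ceil(log2(n_sets))   # bits per fuzzy-set index  -> 3

word_proposed  = nfm * (dm_m + dm_fm)  # Length = nfm * (dm(m) + dm(fm)) = 24
word_vectorial = n_sets * dm_m         # all 8 values stored -> 40

print("proposed :", U, "x", word_proposed, "=", U * word_proposed, "bits")
print("vectorial:", U, "x", word_vectorial, "=", U * word_vectorial, "bits")
```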

Predicting Forest Gross Primary Production Using Machine Learning Algorithms (머신러닝 기법의 산림 총일차생산성 예측 모델 비교)

  • Lee, Bora;Jang, Keunchang;Kim, Eunsook;Kang, Minseok;Chun, Jung-Hwa;Lim, Jong-Hwan
    • Korean Journal of Agricultural and Forest Meteorology
    • /
    • v.21 no.1
    • /
    • pp.29-41
    • /
    • 2019
  • Terrestrial Gross Primary Production (GPP) is the largest global carbon flux, and forest ecosystems are important because of their ability to store significantly more carbon than other terrestrial ecosystems. There have been several attempts to estimate GPP using mechanism-based models. However, mechanism-based models, which incorporate biological, chemical, and physical processes, are limited by a lack of flexibility in predicting the non-stationary ecological processes caused by local and global change. Instead, mechanism-free methods are strongly recommended for estimating the nonlinear dynamics that occur in nature, such as GPP. We therefore used mechanism-free machine learning techniques to estimate daily GPP. In this study, a support vector machine (SVM), random forest (RF), and artificial neural network (ANN) were used and compared with a traditional multiple linear regression model (LM). MODIS products and meteorological parameters from eddy covariance data were employed to train the machine learning and LM models on data from 2006 to 2013. The GPP prediction models were then compared with daily GPP from eddy covariance measurements in a deciduous forest in South Korea in 2014 and 2015. Statistical measures including the correlation coefficient (R), root mean square error (RMSE), and mean squared error (MSE) were used to evaluate model performance. In general, the models from the machine-learning algorithms (R = 0.85 - 0.93, MSE = 1.00 - 2.05, p < 0.001) showed better performance than the linear regression model (R = 0.82 - 0.92, MSE = 1.24 - 2.45, p < 0.001). These results provide insight into the high predictability, and the potential wider application, of mechanism-free machine-learning models combined with remote sensing for predicting non-stationary ecological processes such as seasonal GPP.
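
A minimal sketch of the model comparison described above, using synthetic stand-ins for the MODIS and meteorological predictors rather than the actual eddy covariance data:

```python
# Sketch: compare SVM, random forest, and a neural network against multiple
# linear regression for a daily GPP-like target. Inputs are synthetic
# stand-ins for the MODIS/meteorological predictors used in the paper.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.svm import SVR

rng = np.random.default_rng(0)
X = rng.standard_normal((2000, 6))  # e.g. vegetation index, radiation, temp...
y = 2.0 * np.tanh(X[:, 0]) + X[:, 1] ** 2 + 0.1 * rng.standard_normal(2000)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

models = {"LM": LinearRegression(), "SVM": SVR(),
          "RF": RandomForestRegressor(random_state=0),
          "ANN": MLPRegressor(hidden_layer_sizes=(32,), max_iter=2000,
                              random_state=0)}
for name, m in models.items():
    pred = m.fit(X_tr, y_tr).predict(X_te)
    mse = np.mean((pred - y_te) ** 2)
    r = np.corrcoef(pred, y_te)[0, 1]
    print(f"{name}: R = {r:.2f}, MSE = {mse:.2f}")
```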

Comparison of rainfall-runoff performance based on various gridded precipitation datasets in the Mekong River basin (메콩강 유역의 격자형 강수 자료에 의한 강우-유출 모의 성능 비교·분석)

  • Kim, Younghun;Le, Xuan-Hien;Jung, Sungho;Yeon, Minho;Lee, Gihae
    • Journal of Korea Water Resources Association
    • /
    • v.56 no.2
    • /
    • pp.75-89
    • /
    • 2023
  • As the Mekong River basin is a river shared among several nations, it is difficult to collect precipitation data, and the quantitative and qualitative quality of the data sets differs from country to country, which may increase the uncertainty of hydrological analysis results. Recently, with the development of remote sensing technology, it has become easier to obtain grid-based precipitation products (GPPs), and various hydrological studies have been conducted in ungauged or large watersheds using GPPs. In this study, rainfall-runoff simulation in the Mekong River basin was conducted using the SWAT model, a semi-distributed model, with three satellite GPPs (TRMM, GSMaP, PERSIANN-CDR) and two gauge-based GPPs (APHRODITE, GPCC). Four water level stations, Luang Prabang, Pakse, Stung Treng, and Kratie, which are major outlets of the Mekong main stem, were selected; the parameters of the SWAT model were calibrated using APHRODITE as the observation reference for the period from 2001 to 2011, and the runoff simulations were verified for the period from 2012 to 2013. In addition, using ConvAE, a convolutional neural network model, spatio-temporal correction of the original satellite precipitation products was performed, and rainfall-runoff performance was compared before and after the correction. The original satellite precipitation products and GPCC showed quantitatively under- or over-estimated values, or spatially very different patterns, compared to APHRODITE, whereas the satellite precipitation products corrected using ConvAE showed dramatically improved spatial correlation. In the runoff simulations, the results using the ConvAE-corrected satellite precipitation products had significantly better accuracy at all outlets than those using the original satellite products. Therefore, the bias correction technique using ConvAE presented in this study can be applied to various hydrological analyses for large watersheds where the rain gauge network is not dense.
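
The ConvAE correction step can be sketched as a small convolutional autoencoder trained to map a satellite precipitation grid to the reference grid; the architecture, grid size, and PyTorch framing below are illustrative assumptions, not the authors' configuration:

```python
# Minimal convolutional autoencoder sketch in the spirit of the ConvAE bias
# correction: learn a mapping from a satellite precipitation grid to the
# reference (APHRODITE-like) grid. Shapes and layers are assumptions.
import torch
import torch.nn as nn

class ConvAE(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),   # 64 -> 32
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU())  # 32 -> 16
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(32, 16, 2, stride=2), nn.ReLU(),    # 16 -> 32
            nn.ConvTranspose2d(16, 1, 2, stride=2))                # 32 -> 64

    def forward(self, x):
        return self.decoder(self.encoder(x))

model = ConvAE()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
satellite = torch.rand(8, 1, 64, 64)  # dummy batch of daily satellite grids
reference = torch.rand(8, 1, 64, 64)  # dummy batch of reference grids
for _ in range(5):
    opt.zero_grad()
    loss = nn.functional.mse_loss(model(satellite), reference)
    loss.backward()
    opt.step()
print("final training loss:", loss.item())
```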

Investigation on the water quality challenges and benefits of buffer zone application to Yongdam reservoir, Republic of Korea (용담호의 홍수터 적용을 위한 문제점 및 이점 조사 연구)

  • Franz Kevin Geronimo;Hyeseon Choi;Minsu Jeon;Lee-Hyung Kim
    • Journal of Wetlands Research
    • /
    • v.25 no.4
    • /
    • pp.274-283
    • /
    • 2023
  • Buffer zones, an example of nature-based solutions, offer a wide range of environmental, social, and economic benefits owing to their multifunctionality when applied to watershed areas, promoting blue-green connectivity. This study evaluated the effects of buffer zone application on the water quality of the Yongdam reservoir tributaries. In particular, challenges in the buffer zone design were identified and improvements suggested. Water and soil samples were collected from a total of six sites in the Yongdam reservoir from September 2021 to April 2022. Water quality analyses revealed that among the monitored sites, the site downstream of Sangjeonmyeon Galhyeonri (SG_W_D2) had the highest concentrations of the water quality parameters turbidity, total suspended solids (TSS), chemical oxygen demand (COD), total phosphorus (TP), and total nitrogen (TN). This finding was attributed to the algal bloom observed during the sampling conducted in September and October 2021. The soil analyses showed that high TN and TP concentrations also occurred in all the agricultural land uses, implying that nutrient accumulation in agricultural areas is high. The highest TN concentration was found in the agricultural area of Jeongcheonmyeon Wolpyeongri (JW_S_A), followed by Jucheonmyeon Sinyangri (JS_S_A), while the lowest TN concentration was found in the original soil of Sangjeonmyeon Galhyeonri (SG_S_O). Among the types of buffer zones identified in this study, Type II-A, Type II-B, and Type III were found to have blue-green connectivity. Initially, however, blue-green connectivity was not considered in these buffer zone types, leading to poor design and poor performance. As such, design improvements that consider the blue-green network, together with renovation, must be considered to optimize the performance of these buffer zones. The findings of this study are useful for designing buffer zones in the future.

A Study on Market Size Estimation Method by Product Group Using Word2Vec Algorithm (Word2Vec을 활용한 제품군별 시장규모 추정 방법에 관한 연구)

  • Jung, Ye Lim;Kim, Ji Hui;Yoo, Hyoung Sun
    • Journal of Intelligence and Information Systems
    • /
    • v.26 no.1
    • /
    • pp.1-21
    • /
    • 2020
  • With the rapid development of artificial intelligence technology, various techniques have been developed to extract meaningful information from unstructured text data, which constitutes a large portion of big data. Over the past decades, text mining technologies have been utilized in various industries for practical applications. In the field of business intelligence, text mining has been employed to discover new market and/or technology opportunities and to support the rational decision making of business participants. Market information such as market size, market growth rate, and market share is essential for setting companies' business strategies. There has been continuous demand in various fields for product-level market information. However, such information has generally been provided at the industry level, or for broad categories based on classification standards, making it difficult to obtain specific and appropriate information. In this regard, we propose a new methodology that can estimate the market sizes of product groups at more detailed levels than previously offered. We applied the Word2Vec algorithm, a neural-network-based semantic word embedding model, to enable automatic market size estimation from individual companies' product information in a bottom-up manner. The overall process is as follows: first, the data related to product information are collected, refined, and restructured into a form suitable for applying the Word2Vec model. Next, the preprocessed data are embedded into a vector space by Word2Vec, and the product groups are derived by extracting similar product names based on cosine similarity. Finally, the sales data on the extracted products are summed to estimate the market size of the product groups. As experimental data, text data of product names from Statistics Korea's microdata (345,103 cases) were mapped into a multidimensional vector space by Word2Vec training. We performed parameter optimization for training and then applied a vector dimension of 300 and a window size of 15 as the optimized parameters for further experiments. We employed the index words of the Korean Standard Industry Classification (KSIC) as a product name dataset to cluster product groups more efficiently. Product names similar to the KSIC indexes were extracted based on cosine similarity, and the market size of the extracted products, treated as one product category, was calculated from individual companies' sales data. The market sizes of 11,654 specific product lines were automatically estimated by the proposed model. For performance verification, the results were compared with the actual market sizes of some items; the Pearson correlation coefficient was 0.513. Our approach has several advantages over previous studies. First, text mining and machine learning techniques were applied for the first time to market size estimation, overcoming the limitations of traditional methods based on sampling or multiple assumptions. In addition, the level of market category can be easily and efficiently adjusted, according to the purpose of the information use, by changing the cosine similarity threshold. Furthermore, the approach has high potential for practical application, since it can resolve unmet needs for detailed market size information in the public and private sectors.
Specifically, it can be utilized in technology evaluation and technology commercialization support programs conducted by governmental institutions, as well as in business strategy consulting and market analysis reports published by private firms. The limitation of our study is that the presented model needs to be improved in terms of accuracy and reliability. The semantics-based word embedding module could be advanced by imposing a proper ordering on the preprocessed dataset or by combining another measure, such as Jaccard similarity, with Word2Vec. The product group clustering could also be replaced with other types of unsupervised machine learning algorithms. Our group is currently working on subsequent studies, and we expect that they will further improve the performance of the basic model conceptually proposed here.
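
The bottom-up pipeline described above (embed product names, group them by cosine similarity to a category keyword, and sum the matched sales) can be sketched with gensim; the corpus and sales figures are toy stand-ins, not the Statistics Korea microdata:

```python
# Sketch: Word2Vec-based product grouping and market size estimation.
# The records and sales figures are toy stand-ins for illustration only.
from gensim.models import Word2Vec

# Each record is one company's tokenized product name.
records = [["stainless", "kitchen", "knife"],
           ["kitchen", "knife", "set"],
           ["chef", "knife"],
           ["office", "chair"],
           ["mesh", "office", "chair"]]
sales = [120, 80, 60, 200, 150]  # hypothetical sales per record

# vector_size=300 and window=15 follow the optimized parameters reported above.
model = Word2Vec(records, vector_size=300, window=15, min_count=1, sg=1)

# Build a product group around a category keyword via cosine similarity;
# the threshold tunes the granularity of the market category.
keyword, threshold = "knife", 0.0
similar = {w for w, s in model.wv.most_similar(keyword, topn=10)
           if s > threshold}
group = similar | {keyword}

# Market size = summed sales of records whose tokens hit the product group.
market_size = sum(s for rec, s in zip(records, sales) if group & set(rec))
print("estimated market size for", keyword, "group:", market_size)
```

On a real corpus the keyword set would come from the KSIC index words, and the similarity threshold would be tuned against known market sizes, as the abstract describes.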