• Title/Summary/Keyword: Size Prediction


Characteristics Analysis of Snow Particle Size Distribution in Gangwon Region according to Topography (지형에 따른 강원지역의 강설입자 크기 분포 특성 분석)

  • Bang, Wonbae;Kim, Kwonil;Yeom, Daejin;Cho, Su-jeong;Lee, Choeng-lyong;Lee, Daehyung;Ye, Bo-Young;Lee, GyuWon
    • Journal of the Korean earth science society / v.40 no.3 / pp.227-239 / 2019
  • Heavy snowfall events frequently occur in the Gangwon province, and the snowfall amount varies significantly in space due to the complex terrain and topographical modulation of precipitation. Understanding the spatial characteristics of heavy snowfall and predicting it are particularly challenging during snowfall events under easterly winds. The easterly wind produces significantly different atmospheric conditions and hence brings different precipitation characteristics. In this study, we investigated the microphysical characteristics of snowfall on the windward and leeward sides of the Taebaek mountain range under easterly conditions. Two snowfall events in easterly winds were selected, and the snow particle size distributions (SSDs) were observed at four sites (two windward and two leeward) with PARSIVEL disdrometers. We compared the characteristic parameters of the SSDs from the leeward sites with those from the windward sites. The results show that SSDs at the windward sites have relatively broad distributions with many small snow particles compared to those at the leeward sites. This characteristic is clearly reflected in the larger characteristic number concentration and characteristic diameter at the windward sites. The snowfall rate and ice water content at the windward sites are also larger than those at the leeward sites. These results indicate that new generation of snowfall particles dominates at the windward sites, likely due to orographic lifting. In addition, the windward sites show heavily aggregated particles associated with near-zero ground temperatures, likely driven by the wet and warm conditions near the ocean.
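
The study above derives bulk quantities from observed snow particle size distributions. As a rough illustration of that step only, the sketch below computes total number concentration, mean diameter, and ice water content from a synthetic PARSIVEL-style N(D); the bin layout and the mass-diameter relation m = a*D^b are assumptions, not values from the paper.

```python
import numpy as np

# Synthetic size distribution N(D): diameter bin centers (mm) and concentrations (m^-3 mm^-1).
diameters = np.arange(0.25, 10.0, 0.25)        # assumed PARSIVEL-like bin centers, mm
bin_width = 0.25                               # assumed bin width, mm
n_D = 1.0e3 * np.exp(-0.8 * diameters)         # placeholder exponential distribution

# Assumed mass-diameter relation m = a * D^b (grams, with D in mm).
a, b = 0.0053, 2.1

total_number = np.sum(n_D * bin_width)                                # m^-3
mean_diameter = np.sum(diameters * n_D * bin_width) / total_number    # mm
iwc = np.sum(a * diameters ** b * n_D * bin_width)                    # g m^-3

print(f"N_t = {total_number:.1f} m^-3, Dm = {mean_diameter:.2f} mm, IWC = {iwc:.3f} g m^-3")
```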

Feasibility of Deep Learning Algorithms for Binary Classification Problems (이진 분류문제에서의 딥러닝 알고리즘의 활용 가능성 평가)

  • Kim, Kitae;Lee, Bomi;Kim, Jong Woo
    • Journal of Intelligence and Information Systems / v.23 no.1 / pp.95-108 / 2017
  • Recently, AlphaGo, Google DeepMind's artificial intelligence program for Baduk (Go), won a decisive victory over Lee Sedol. Many people thought that machines would not be able to beat a human at Go because, unlike chess, the number of possible moves exceeds the number of atoms in the universe, but the result was the opposite of what people predicted. After the match, artificial intelligence was highlighted as a core technology of the fourth industrial revolution and attracted attention from various application domains. In particular, deep learning drew attention as the core artificial intelligence technique used in the AlphaGo algorithm. Deep learning is already being applied to many problems and shows especially good performance in image recognition. It also performs well on high-dimensional data such as voice, images, and natural language, where it was difficult to obtain good performance with existing machine learning techniques. In contrast, it is difficult to find deep learning research on traditional business data and structured data analysis. In this study, we investigated whether deep learning techniques can be used not only for recognizing high-dimensional data but also for binary classification problems in traditional business data analysis, such as customer churn analysis, marketing response prediction, and default prediction, and we compared the performance of deep learning techniques with that of traditional artificial neural network models. The experimental data are the telemarketing response data of a bank in Portugal. The data include input variables such as age, occupation, loan status, and the number of previous telemarketing contacts, along with a binary target variable recording whether the customer opened an account. To evaluate the applicability of deep learning algorithms to binary classification, we compared models using the CNN and LSTM algorithms and the dropout technique, which are widely used in deep learning, with MLP models, a traditional artificial neural network. Since not all network design alternatives can be tested, the experiment was conducted with restricted settings for the number of hidden layers, the number of neurons per hidden layer, the number of output filters, and the application of dropout. The F1 score was used to evaluate the models because it shows how well a model classifies the class of interest, rather than overall accuracy. The deep learning techniques were applied as follows. The CNN algorithm reads adjacent values around a specific value and recognizes features, but because business data fields are usually independent, the distance between fields does not matter. In this experiment, we therefore set the filter size of the CNN to the number of fields so that the whole record is read at once, and added a hidden layer to make decisions based on the extracted features. For the model with two LSTM layers, the input direction of the second layer was reversed relative to the first layer to reduce the influence of field position. For the dropout technique, neurons in each hidden layer were dropped with probability 0.5. The experimental results show that the model with the highest F1 score was the CNN model with dropout, followed by the MLP model with two hidden layers and dropout. Several findings emerged from the experiment. First, models using dropout make slightly more conservative predictions than those without it and generally classify better. Second, CNN models classify better than MLP models; this is interesting because CNNs have rarely been applied to binary classification of business data, in contrast to the fields where their effectiveness has already been proven. Third, the LSTM algorithm appears unsuitable for these binary classification problems because training time is too long relative to the performance improvement. From these results, we confirm that some deep learning algorithms can be applied to business binary classification problems.
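
As a rough sketch of the CNN-with-dropout setup the abstract describes (a Conv1D filter spanning all input fields at once, an extra hidden layer, dropout of 0.5, and F1 evaluation), the following Keras code may help; the field count, filter count, and data are placeholders, not the study's bank telemarketing dataset.

```python
import numpy as np
from tensorflow import keras
from sklearn.metrics import f1_score

n_fields = 16                                   # assumed number of input variables
X = np.random.rand(1000, n_fields, 1)           # placeholder records reshaped for Conv1D
y = np.random.randint(0, 2, 1000)               # binary target, e.g. account opened or not

model = keras.Sequential([
    keras.Input(shape=(n_fields, 1)),
    keras.layers.Conv1D(32, kernel_size=n_fields, activation="relu"),  # filter reads all fields at once
    keras.layers.Flatten(),
    keras.layers.Dense(32, activation="relu"),  # additional hidden layer for the decision
    keras.layers.Dropout(0.5),                  # dropout probability used in the study
    keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy")
model.fit(X, y, epochs=5, batch_size=32, verbose=0)

pred = (model.predict(X, verbose=0).ravel() > 0.5).astype(int)
print("F1 score:", f1_score(y, pred))
```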

The Pattern Analysis of Financial Distress for Non-audited Firms using Data Mining (데이터마이닝 기법을 활용한 비외감기업의 부실화 유형 분석)

  • Lee, Su Hyun;Park, Jung Min;Lee, Hyoung Yong
    • Journal of Intelligence and Information Systems / v.21 no.4 / pp.111-131 / 2015
  • Only a handful of studies have been conducted on pattern analysis of corporate distress, compared with research on bankruptcy prediction. The few that exist mainly focus on audited firms because financial data are easier to collect for these firms. In reality, however, corporate financial distress is a far more common and critical phenomenon for non-audited firms, which are mainly small and medium-sized. The purpose of this paper is to classify non-audited firms under distress according to their financial ratios using a data mining technique, the Self-Organizing Map (SOM). The SOM is a type of artificial neural network trained by unsupervised learning to produce a lower-dimensional, discretized representation of the input space of the training samples, called a map. It differs from other artificial neural networks in that it applies competitive learning rather than error-correction learning such as backpropagation with gradient descent, and it uses a neighborhood function to preserve the topological properties of the input space. It is one of the popular and successful clustering algorithms. In this study, we classify types of financially distressed firms, specifically non-audited firms. In the empirical test, we collected 10 financial ratios of 100 non-audited firms that were under distress in 2004, for the previous two years (2002 and 2003). Using these financial ratios and the SOM algorithm, five distinct patterns were distinguished. In pattern 1, financial distress was very serious in almost all financial ratios; 12% of the firms belong to this pattern. In pattern 2, financial distress was weak in almost all financial ratios; 14% of the firms belong to this pattern. In pattern 3, the growth ratio was the worst among all patterns; these firms may be under distress due to severe competition in their industries, and approximately 30% of the firms fell into this group. In pattern 4, the growth ratio was higher than in any other pattern, but the cash ratio and profitability ratio were not at the level of the growth ratio; these firms were likely under distress while pursuing business expansion, and about 25% of the firms were in this pattern. Finally, pattern 5 comprised very solvent firms, which were perhaps distressed due to a bad short-term strategic decision or problems with the entrepreneurs running them; approximately 18% of the firms fell under this pattern. This study makes both academic and empirical contributions. Academically, non-audited companies, which tend to go bankrupt easily and have unstructured or easily manipulated financial data, are classified with a data mining technique (the Self-Organizing Map), rather than large audited firms with well-prepared and reliable financial data. Empirically, even though only the financial data of non-audited firms are analyzed, the analysis is useful for detecting the first symptoms of financial distress, which helps to forecast bankruptcy and to manage early warning and alert signals. A limitation of this research is that only 100 companies were analyzed, owing to the difficulty of collecting financial data for non-audited firms, which made it hard to analyze differences by category or size. Non-financial qualitative data are also crucial for the analysis of bankruptcy, and such factors should be taken into account in future work. This study sheds some light on distress prediction for non-audited small and medium-sized firms.
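
A minimal sketch of the SOM-based clustering described above, using the MiniSom library and synthetic standardized financial ratios (both are assumptions, not the authors' implementation): firms are mapped to best-matching units, which play the role of the distress patterns.

```python
import numpy as np
from minisom import MiniSom

n_firms, n_ratios = 100, 10
ratios = np.random.randn(n_firms, n_ratios)     # placeholder standardized financial ratios

# A small 5x1 map so that each unit can act as one distress "pattern".
som = MiniSom(x=5, y=1, input_len=n_ratios, sigma=1.0, learning_rate=0.5, random_seed=0)
som.random_weights_init(ratios)
som.train_random(ratios, num_iteration=1000)    # competitive, unsupervised learning

patterns = [som.winner(r) for r in ratios]      # best-matching unit per firm
for unit in sorted(set(patterns)):
    share = patterns.count(unit) / n_firms
    print(f"pattern {unit}: {share:.0%} of firms")
```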

Bankruptcy Forecasting Model using AdaBoost: A Focus on Construction Companies (적응형 부스팅을 이용한 파산 예측 모형: 건설업을 중심으로)

  • Heo, Junyoung;Yang, Jin Yong
    • Journal of Intelligence and Information Systems / v.20 no.1 / pp.35-48 / 2014
  • According to the 2013 construction market outlook report, liquidations of construction companies are expected to continue due to the ongoing residential construction recession. Bankruptcies of construction companies have a greater social impact than those in other industries. However, because of the different nature of their capital structure and debt-to-equity ratio, it is more difficult to forecast bankruptcies of construction companies than of companies in other industries. The construction industry operates on greater leverage, with high debt-to-equity ratios and project cash flows concentrated in the second half of a project. The economic cycle strongly influences construction companies, so downturns tend to rapidly increase their bankruptcy rates. High leverage, coupled with increased bankruptcy rates, can place a greater burden on banks providing loans to construction companies. Nevertheless, bankruptcy prediction models have concentrated mainly on financial institutions, and construction-specific studies are rare. Bankruptcy forecasting models based on corporate financial statements have been studied for many years in various ways, but these models are intended for companies in general and may not be appropriate for forecasting bankruptcies of construction companies, which typically have disproportionately large liquidity risks. The construction industry is capital-intensive, operates on long timelines with large-scale investment projects, and takes comparatively longer to realize returns on investment than other industries. Because of this unique capital structure, criteria used to judge the financial risk of companies in general are difficult to apply to the construction industry. The Altman Z-score, first published in 1968, is commonly used as a bankruptcy forecasting model. It forecasts the likelihood of a company going bankrupt using a simple formula, classifies the results into three categories, and evaluates the corporate status as dangerous, moderate, or safe. A company in the "dangerous" category has a high likelihood of bankruptcy within two years, while one in the "safe" category has a low likelihood of bankruptcy; for companies in the "moderate" category, the risk is difficult to forecast. Many of the construction firms in this study fell into the "moderate" category, which made it difficult to forecast their risk. Along with the development of machine learning, recent studies of corporate bankruptcy forecasting have used this technology. Pattern recognition, a representative application area of machine learning, is applied to forecasting corporate bankruptcy: patterns are analyzed based on a company's financial information and then judged as belonging to the bankruptcy-risk group or the safe group. The representative machine learning models previously used in bankruptcy forecasting are artificial neural networks, adaptive boosting (AdaBoost), and the support vector machine (SVM), and there are many hybrid studies combining these models. Existing studies using the traditional Z-score technique or machine learning focus on companies in non-specific industries, so industry-specific characteristics are not considered. In this paper, we confirm that AdaBoost is the most appropriate forecasting model for construction companies by analyzing them by company size. We classified construction companies into three groups, large, medium, and small, based on their capital, and analyzed the predictive ability of AdaBoost for each group. The experimental results showed that AdaBoost has greater predictive ability than the other models, especially for the group of large companies with capital of more than 50 billion won.
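
A minimal sketch of the size-grouped AdaBoost evaluation described above, using scikit-learn on synthetic data; the features, labels, and capital-size groups below are placeholders, not the paper's dataset.

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
X = rng.normal(size=(600, 8))                                  # placeholder financial ratios
y = rng.integers(0, 2, 600)                                    # 1 = bankrupt, 0 = solvent (synthetic)
capital_group = rng.choice(["large", "medium", "small"], 600)  # assumed grouping by capital size

for group in ("large", "medium", "small"):
    mask = capital_group == group
    X_tr, X_te, y_tr, y_te = train_test_split(X[mask], y[mask], random_state=0)
    clf = AdaBoostClassifier(n_estimators=100, random_state=0).fit(X_tr, y_tr)
    print(group, "accuracy:", round(accuracy_score(y_te, clf.predict(X_te)), 3))
```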

Prediction on the Quality of Total Mixed Ration for Dairy Cows by Near Infrared Reflectance Spectroscopy (근적외선 분광법에 의한 국내 축우용 TMR의 성분추정)

  • Ki, Kwang-Seok;Kim, Sang-Bum;Lee, Hyun-June;Yang, Seung-Hak;Lee, Jae-Sik;Jin, Ze-Lin;Kim, Hyeon-Shup;Jeo, Joon-Mo;Koo, Jae-Yeon;Cho, Jong-Ku
    • Journal of The Korean Society of Grassland and Forage Science / v.29 no.3 / pp.253-262 / 2009
  • The present study was conducted to develop a rapid and accurate method for evaluating the chemical composition of total mixed ration (TMR) for dairy cows using near-infrared reflectance spectroscopy (NIRS). A total of 253 TMR samples were collected from TMR manufacturers and dairy farms in Korea. Prior to NIR analysis, the TMR samples were dried at 65℃ for 48 hours and then ground to 2 mm. The samples were scanned at 2 nm intervals over the wavelength range of 400-2500 nm on a FOSS NIRSystems Model 6500. The values obtained by NIR analysis and by conventional chemical methods were compared. In general, the relationship between chemical analysis and NIR analysis was linear: R² and the standard error of calibration (SEC) were 0.701 (SEC 0.407), 0.965 (SEC 0.315), 0.796 (SEC 0.406), 0.889 (SEC 0.987), 0.894 (SEC 0.311), 0.933 (SEC 0.885), and 0.889 (SEC 1.490) for moisture, crude protein, ether extract, crude fiber, crude ash, acid detergent fiber (ADF), and neutral detergent fiber (NDF), respectively. In addition, the standard error of prediction (SEP) was 0.371, 0.290, 0.321, 0.380, 0.960, 0.859, and 1.446 for moisture, crude protein, ether extract, crude fiber, crude ash, ADF, and NDF, respectively. These results show that NIR analysis of unknown TMR samples can be relatively accurate. Using the developed NIR calibration curves, fast and reliable data on the chemical composition of TMR can be obtained. Collecting and analyzing more TMR samples will further increase the accuracy and precision of NIR analysis of TMR.
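
The abstract does not state how the calibration curves were built beyond the instrument software used. As an illustrative sketch only, the code below fits a PLS regression (a common choice for NIR calibration) to synthetic spectra and reports R², SEC, and SEP in the same spirit.

```python
import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score

wavelengths = np.arange(400, 2502, 2)               # 400-2500 nm at 2 nm intervals
spectra = np.random.rand(253, wavelengths.size)     # placeholder reflectance spectra
crude_protein = np.random.rand(253) * 10 + 10       # placeholder laboratory reference values (%)

X_cal, X_val, y_cal, y_val = train_test_split(spectra, crude_protein, random_state=0)
pls = PLSRegression(n_components=10).fit(X_cal, y_cal)

sec = np.sqrt(np.mean((pls.predict(X_cal).ravel() - y_cal) ** 2))  # standard error of calibration
sep = np.sqrt(np.mean((pls.predict(X_val).ravel() - y_val) ** 2))  # standard error of prediction
print("R2:", round(r2_score(y_cal, pls.predict(X_cal).ravel()), 3),
      "SEC:", round(sec, 3), "SEP:", round(sep, 3))
```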

An Intelligent Intrusion Detection Model Based on Support Vector Machines and the Classification Threshold Optimization for Considering the Asymmetric Error Cost (비대칭 오류비용을 고려한 분류기준값 최적화와 SVM에 기반한 지능형 침입탐지모형)

  • Lee, Hyeon-Uk;Ahn, Hyun-Chul
    • Journal of Intelligence and Information Systems / v.17 no.4 / pp.157-173 / 2011
  • As Internet use has exploded in recent years, malicious attacks and hacking against networked systems occur frequently, which means intrusions can cause fatal damage to government agencies, public offices, and companies operating various systems. For this reason, there is growing interest in and demand for intrusion detection systems (IDS), security systems for detecting, identifying, and responding appropriately to unauthorized or abnormal activities. The intrusion detection models applied in conventional IDS are generally designed by modeling experts' implicit knowledge of network intrusions or hackers' abnormal behaviors. Such models perform well under normal conditions but poorly when they encounter new or unknown patterns of network attack. For this reason, several recent studies have tried to adopt artificial intelligence techniques that can respond proactively to unknown threats. In particular, artificial neural networks (ANNs) have been popular in prior studies because of their superior prediction accuracy. However, ANNs have intrinsic limitations such as the risk of overfitting, the requirement of a large sample size, and the lack of interpretability of the prediction process (the black-box problem). As a result, the most recent studies on IDS have started to adopt the support vector machine (SVM), a classification technique that is more stable and powerful than ANNs and is known for relatively high predictive power and generalization capability. Against this background, this study proposes a novel intelligent intrusion detection model that uses SVM as the classification model to improve the predictive ability of IDS. Our model is also designed to account for asymmetric error costs by optimizing the classification threshold. Generally, there are two common types of error in intrusion detection. The first is the false-positive error (FPE), where the wrong judgment may result in unnecessary corrective action. The second is the false-negative error (FNE), where malicious activity is misjudged as normal. Compared with FPE, FNE is more fatal, so when considering the total cost of misclassification in IDS, it is more reasonable to assign heavier weights to FNE than to FPE. Therefore, our proposed intrusion detection model optimizes the classification threshold to minimize the total misclassification cost. Conventional SVM cannot be applied directly in this case because it is designed to generate a discrete output (a class). To resolve this problem, we used the revised SVM technique proposed by Platt (2000), which is able to generate probability estimates. To validate the practical applicability of our model, we applied it to a real-world network intrusion detection dataset collected from the IDS sensor of an official institution in Korea from January to June 2010. We collected 15,000 log records in total and selected 1,000 samples from them by random sampling. In addition, the SVM model was compared with logistic regression (LOGIT), decision trees (DT), and ANN to confirm the superiority of the proposed model. LOGIT and DT were run using PASW Statistics v18.0, ANN using Neuroshell 4.0, and SVM using LIBSVM v2.90, a freeware package for training SVM classifiers.
Empirical results showed that our proposed model based on SVM outperformed all the other comparative models in detecting network intrusions from the accuracy perspective. They also showed that our model reduced the total misclassification cost compared to the ANN-based intrusion detection model. As a result, it is expected that the intrusion detection model proposed in this paper would not only enhance the performance of IDS, but also lead to better management of FNE.
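
A minimal sketch of the central idea above, assuming synthetic data and illustrative cost weights: an SVM with Platt-scaled probability outputs (scikit-learn's SVC with probability=True) and a sweep of the classification threshold to minimize an asymmetric misclassification cost that penalizes false negatives more heavily than false positives.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 10))                          # placeholder log-derived features
y = (X[:, 0] + rng.normal(size=1000) > 1).astype(int)    # 1 = intrusion (synthetic)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
svm = SVC(probability=True, random_state=0).fit(X_tr, y_tr)   # Platt scaling gives probability estimates
p = svm.predict_proba(X_te)[:, 1]

COST_FN, COST_FP = 5.0, 1.0                              # assumed asymmetric error costs (FNE > FPE)
thresholds = np.linspace(0.05, 0.95, 19)
costs = [COST_FN * np.sum((p < t) & (y_te == 1)) +
         COST_FP * np.sum((p >= t) & (y_te == 0)) for t in thresholds]
print("cost-minimizing threshold:", thresholds[int(np.argmin(costs))])
```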

Prognostic Usefulness of Maximum Standardized Uptake Value on FDG-PET in Surgically Resected Non-small-cell Lung Cancer (수술로 제거된 비소세포폐암의 예후 예측에 있어 FDG-PET 최대 표준화 섭취계수의 유용성)

  • Nguyen Xuan Canh;Lee Won-Woo;Sung Sook-Whan;Jheon Sang-Hoon;Kim Yu-Kyeong;Lee Dong-Soo;Chung June-Key;Lee Myung-Chul;Kim Sang-Eun
    • Nuclear Medicine and Molecular Imaging / v.40 no.4 / pp.205-210 / 2006
  • Purpose: FDG uptake on positron emission tomography (PET) has been considered a prognostic indicator in non-small cell lung cancer (NSCLC). The aim of this study was to assess the clinical significance of the maximum standardized uptake value (maxSUV) for recurrence prediction in patients with surgically resected NSCLC. Materials & Methods: NSCLC patients (n=42, F:M=14:28, age 62.3 ± 12.3 y) who underwent curative resection after FDG-PET were enrolled. Twenty-nine patients had pathologic stage I disease and 13 had pathologic stage II. Thirty-one patients were additionally treated with adjuvant oral chemotherapy. The maxSUVs of the primary tumors were analyzed for correlation with tumor recurrence and compared with pathologic or clinical prognostic indicators. The median follow-up duration was 16 mo (range, 3-26 mo). Results: Ten (23.8%) of the 42 patients experienced recurrence during a median follow-up of 7.5 mo (range, 3-13 mo). Univariate analysis revealed that disease-free survival (DFS) was significantly correlated with maxSUV (<7 vs. ≥7, p=0.006), tumor size (<3 cm vs. ≥3 cm, p=0.024), and tumor cell differentiation (well/moderate vs. poor, p=0.044). However, multivariate Cox proportional-hazards analysis identified maxSUV as the single determinant of DFS (p=0.014). Patients with a maxSUV of ≥7 (n=10) had a significantly lower 1-year DFS rate (50.0%) than those with a maxSUV of <7 (n=32, 87.5%). Conclusion: MaxSUV is a significant independent predictor of recurrence in surgically resected NSCLC. FDG uptake can be added to other well-known factors in the prognosis prediction of NSCLC.
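
As an illustrative sketch only (synthetic data and the lifelines library; the paper does not specify its statistical software), the code below runs the kind of multivariate Cox proportional-hazards analysis described above, testing maxSUV ≥ 7 against tumor size and differentiation as predictors of disease-free survival.

```python
import numpy as np
import pandas as pd
from lifelines import CoxPHFitter

rng = np.random.default_rng(0)
df = pd.DataFrame({
    "months_dfs": rng.exponential(16, 42),   # placeholder follow-up durations (months)
    "recurred": rng.integers(0, 2, 42),      # 1 = recurrence observed
    "maxsuv_ge7": rng.integers(0, 2, 42),    # maxSUV >= 7
    "size_ge3cm": rng.integers(0, 2, 42),    # tumor size >= 3 cm
    "poor_diff": rng.integers(0, 2, 42),     # poor tumor cell differentiation
})

cph = CoxPHFitter().fit(df, duration_col="months_dfs", event_col="recurred")
cph.print_summary()                          # hazard ratios and p-values per covariate
```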

Investigation of Fatigue Strength and Prediction of Remaining Life in the Butt Welds Containing Penetration Defects (블완전용입 맞대기 용접재의 용입깊이에 따른 피로강도특성 및 잔류수명의 산출)

  • Han, Seung Ho;Han, Jeong Woo;Shin, Byung Chun
    • Journal of Korean Society of Steel Construction / v.10 no.3 s.36 / pp.423-435 / 1998
  • In this paper, the fatigue strength reduction of butt welds with penetration defects, which are frequently found in steel bridges, was assessed quantitatively. S-N curves were derived and investigated through constant-amplitude fatigue tests of fully and partially penetrated welded specimens made of SWS490 steel. A fracture mechanics method was applied to calculate the remaining fatigue life of the partially penetrated butt welds. The fatigue limit of the fully penetrated butt welds was higher than that of category A in the AASHTO fatigue design curves, and the slope of the S-N curve, 5.57, was stiffer than the value of 3 generally accepted for welded details. The fatigue strength of the partially penetrated butt welds was strongly influenced by the size of the lack of penetration, D; it decreased drastically as D increased from 3.9 to 14.7 mm. The fracture behaviour of the partially penetrated butt welds can be clearly explained by the beach-mark tests: a semi-elliptical surface crack with a small a/c ratio initiates at an internal weld root and propagates through the weld metal. To estimate the fatigue life of the partially penetrated butt welds with fracture mechanics, stress intensity factors K of a three-dimensional semi-elliptical crack were calculated by the finite element method, and fracture mechanics parameters such as C and m were derived from fatigue tests of CT specimens. As a result, the fatigue lives obtained by the fracture mechanics method agreed well with the experimental results. The results were applied to the Sung-Su bridge, which collapsed due to penetration defects in the butt welds of a vertical member.
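
A minimal sketch of the fracture-mechanics life estimate described above: integrating the Paris law da/dN = C(ΔK)^m for a crack growing from an initial lack-of-penetration depth. The material constants, geometry factor, stress range, and crack sizes below are illustrative assumptions, not the paper's measured values.

```python
import numpy as np

C, m = 3.0e-12, 3.0          # assumed Paris-law constants (a in metres, dK in MPa*sqrt(m))
Y = 1.12                     # assumed geometry factor for a shallow semi-elliptical surface crack
stress_range = 100.0         # assumed applied stress range, MPa
a, a_final = 0.004, 0.015    # assumed initial (lack-of-penetration) and final crack depths, m

cycles, da = 0.0, 1.0e-5
while a < a_final:
    dK = Y * stress_range * np.sqrt(np.pi * a)   # stress intensity factor range
    cycles += da / (C * dK ** m)                 # cycles needed to grow the crack by da
    a += da

print(f"estimated remaining life: {cycles:.3e} cycles")
```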


A Numerical Study on the Effects of the Wind Velocity and Height of Grassland on the Flame Spread Rate of Forest Fires (초지화재 발생시 바람의 속도 및 초본의 높이가 화염전파에 미치는 영향에 대한 수치해석적 연구)

  • Bae, Sung-Yong;Kim, Dong-Hyun;Ryou, Hong-Sun;Lee, Sung-Hyuk
    • Fire Science and Engineering / v.22 no.3 / pp.252-257 / 2008
  • With the rapid and exuberant growth of forests, the number and size of forest fires and the cost of wildland fires have increased. The flame spread rate of a forest fire depends on environmental variables such as wind velocity and the moisture of the grassland. Knowing the effects of these environmental variables on fire growth is useful for wildland fire suppression, but analyses of the spread rate of wildland fires with respect to these effects have not been established. In this study, the effects of wind velocity and grassland fuel height were investigated using WFDS, which was developed at NIST for predicting the spread of wildland fires. The results show a relation between fuel height and the spread rate of head fires, and the spread rates predicted for different wind velocities are about 17% lower than the experimental results from Australia. When the wind velocity exceeds 7.5 m/s, the concentration of pyrolyzed gas-phase fuel decreases because the pyrolyzed gas is carried away quickly, and the flame spread rate slows.

An attempt at soil profiling on a river embankment using geophysical data (물리탐사 자료를 이용한 강둑 토양 종단면도 작성)

  • Takahashi, Toru;Yamamoto, Tsuyoshi
    • Geophysics and Geophysical Exploration / v.13 no.1 / pp.102-108 / 2010
  • The internal structure of a river embankment must be delineated as part of investigations to evaluate its safety. Geophysical methods can be the most effective means for that purpose if they are used together with geotechnical methods such as the cone penetration test (CPT) and drilling. Since the dyke body and subsoil generally consist of material with a wide range of grain sizes, the properties and stratification of the soil must be accurately estimated to predict the mechanical stability of, and water infiltration into, the river embankment. The strength and water content of the levee soil are also parameters required for such prediction. These parameters are usually estimated from CPT data, drilled core samples, and laboratory tests. In this study we attempt to use geophysical data to estimate these parameters more efficiently for very long river embankments. The S-wave velocity and resistivity of the levee soils obtained from geophysical surveys are used to classify the soils. The classification is based on a physical soil model called the unconsolidated sand model. Using this model, a soil profile along the river embankment is constructed from the S-wave velocity and resistivity profiles. The soil profile thus obtained has been verified against geotechnical logs, which demonstrates its usefulness for the investigation of river embankments.