• Title/Summary/Keyword: cluster method

Proposing the Method for Improving the Forecast Accuracy of Loan Underwriting (대출심사의 예측 정확도 향상을 위한 방법 제안)

  • Yang, Yu-Young;Park, Sang-Sung;Shin, Young-Geun;Jang, Dong-Sik
    • Journal of the Korea Academia-Industrial cooperation Society / v.11 no.4 / pp.1419-1429 / 2010
  • The industry structure and environment of domestic banks changed with the influx of large foreign banks and advanced financial products when the currency crisis erupted in Korea. In a competitive environment, accurate forecasts of changes and trends are essential for survival and development. Forecasting whether to approve a customer's loan application is important because it bears on the bank's profit generation and risk management. This paper therefore proposes a method to improve the forecast accuracy of loan underwriting. The experimental process is as follows. First, we select the predictor variables that significantly affect the loan underwriting result using correlation analysis and a feature selection technique, and then cluster the customers with the 2-Step clustering technique based on the selected variables. Second, we find the most accurate forecasting model for each cluster by applying LR, NN and SVM. Finally, we compare the forecasting accuracy of the proposed method with that of the existing approach.
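
The "most accurate model for each cluster" step in this abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the toy rows, the `income_band` grouping (a stand-in for 2-Step clustering) and the two threshold rules (stand-ins for LR, NN and SVM) do not come from the paper, and a real study would score the models on held-out data.

```python
from collections import defaultdict

def accuracy(model, rows):
    """Fraction of (features, label) rows that the model predicts correctly."""
    return sum(model(x) == y for x, y in rows) / len(rows)

def best_model_per_cluster(rows, cluster_of, models):
    """Group rows by cluster, then keep the most accurate candidate per cluster."""
    by_cluster = defaultdict(list)
    for x, y in rows:
        by_cluster[cluster_of(x)].append((x, y))
    best = {}
    for cluster, members in by_cluster.items():
        best[cluster] = max(models, key=lambda name: accuracy(models[name], members))
    return best

# Hypothetical toy data: (income, debt) -> 1 if the loan was approved.
rows = [
    ((30, 5), 0), ((35, 2), 1), ((40, 8), 0),
    ((70, 30), 0), ((80, 10), 1), ((90, 20), 1),
]

def income_band(x):
    """Hypothetical stand-in for a clustering step."""
    return "high" if x[0] >= 50 else "low"

# Hypothetical candidate models; real candidates would be trained classifiers.
models = {
    "income_rule": lambda x: int(x[0] >= 60),
    "debt_rule": lambda x: int(x[1] <= 4),
}

best = best_model_per_cluster(rows, income_band, models)
print(best)
```

The point of the design is that no single model has to win everywhere: each customer segment keeps whichever candidate scores best on that segment.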

A Visualization of Traffic Accidents Hotspot along the Road Network (도로 네트워크를 따른 교통사고 핫스팟의 시각화)

  • Cho, Nahye;Jun, Chulmin;Kang, Youngok
    • Journal of Cadastre & Land InformatiX / v.48 no.1 / pp.201-213 / 2018
  • In recent years, the number of traffic accidents in Korea has been decreasing steadily thanks to accident prevention activities. However, the number of accidents in Seoul is higher than in other regions. Various studies have been conducted to prevent traffic accidents, which are man-made disasters. In particular, previous studies have analyzed traffic accidents spatially by counting them per administrative district or by estimating their density with the kernel density method in order to identify accident cluster areas. However, since traffic accidents take place along roads, it is more meaningful to investigate how they concentrate on the road network. In this study, traffic accidents were assigned to the nearest road network in two ways and analyzed by hotspot analysis using the Getis-Ord Gi* statistic. The first method used fixed road links of 10 m each, and the second computed the average number of traffic accidents per unit length of each road section. The first method made it possible to identify the specific road sections where traffic accidents are concentrated. The second method, on the other hand, showed that the areas of concentrated accidents can extend depending on the characteristics of the road links. The methods proposed here provide different approaches to visualizing traffic accidents and thus make it possible to identify clearly the sections whose traffic environment needs improvement.
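
For readers unfamiliar with the Getis-Ord Gi* statistic named in this abstract, the following is a minimal sketch of the standard z-score form. The accident counts and the binary neighbor weights along a single chain of road links are made-up assumptions, not data or settings from the study.

```python
import math

def getis_ord_gi_star(values, weights):
    """Return the Gi* z-score for every location.

    values[j]      accident count (or rate) on road link j
    weights[i][j]  spatial weight between links i and j; Gi* includes the
                   link itself, so weights[i][i] is nonzero
    """
    n = len(values)
    mean = sum(values) / n
    s = math.sqrt(sum(v * v for v in values) / n - mean * mean)
    scores = []
    for i in range(n):
        w = weights[i]
        sum_w = sum(w)
        sum_w2 = sum(wij * wij for wij in w)
        numerator = sum(wij * v for wij, v in zip(w, values)) - mean * sum_w
        denominator = s * math.sqrt((n * sum_w2 - sum_w ** 2) / (n - 1))
        scores.append(numerator / denominator)
    return scores

# Hypothetical accident counts on seven consecutive road links.
counts = [1, 0, 1, 9, 8, 1, 0]
n = len(counts)
# Assumed weights: each link is a neighbor of itself and of the adjacent links.
weights = [[1 if abs(i - j) <= 1 else 0 for j in range(n)] for i in range(n)]

scores = getis_ord_gi_star(counts, weights)
for link, z in enumerate(scores):
    print(link, round(z, 2))
```

Large positive scores mark links whose neighborhood holds more accidents than the overall mean would suggest (hotspots), and large negative scores mark coldspots. The choice of neighborhood, here one link on each side, is the main modeling decision.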

A Comparative Study on Physique and Health status of Elementary School Children between Ethnic Koreans in the People's Republic of China and Kojae Area in Korea (중국 연변지역 조선족 아동의 보건의료 및 건강상태 비교 - 한국 경남 거제지역과의 비교 -)

  • Nam, Eun-Woo;Lee, Kyu-Sik;Li, Zhao-Cheng;Ryu, Hwang-Gun;Bae, Sung-Kwon;Park, Kum-Ok
    • Journal of agricultural medicine and community health / v.21 no.1 / pp.21-45 / 1996
  • The purpose of this study is to compare the health status of Korean and Chosun-Jok elementary school children. To this end, a survey was conducted in the Yanbian area of China and in Kojae, Korea, from June 15 to July 1995. Two survey methods were used. The first was a parents' survey, which asked structured questions about their children. The second focused on the children's actual health and drew on the physical records kept at school. Guided by the school teachers, each child delivered the questionnaire to their parents. Subjects were selected by stratified cluster sampling. Of 1,749 questionnaires, 1,083 were analyzed (666 were incomplete and were excluded). Each questionnaire was matched to the child's physical record: body weight, body height, chest circumference, eyesight and dental health. Using these data, we compared the BMI (Body Mass Index) of the Koreans and the Chosun-Jok in China. The results were as follows. In general physique (body height, body weight and chest circumference), the Chosun-Jok were inferior to the Koreans regardless of age and sex. In physical constitution (i.e. eyesight and dental hygiene), however, the Chosun-Jok were superior to the Koreans regardless of age and sex. The average BMI of the Chosun-Jok was lower than that of the Koreans, but most students in both groups appeared to maintain an adequate health level. Among children aged 10 to 12, females exceeded males in body weight, chest circumference and body height, which suggests that females and males reach maximum growth at different ages. Most parents regarded a good physique as a sign of good health for their children. Each child's physique was affected by several variables, including the number of family members and the parents' educational level. According to these results, the physique of students in Korea is superior to that of the Chosun-Jok in China, while Koreans are inferior to the Chosun-Jok in physical constitution. In conclusion, we consider that the Chosun-Jok in China maintain an adequate health level in both physique and physical constitution.

Effect of Different Milling Methods on Distribution of Particle Size of Rice Flours (제분방법이 쌀가루의 입자크기에 미치는 영향)

  • Kum, Jun-Seok;Lee, Sang-Hyo;Lee, Hyun-Yu;Kim, Kil-Hwan;Kim, Young-In
    • Korean Journal of Food Science and Technology / v.25 no.5 / pp.541-545 / 1993
  • Two methods (a sieve shaker and an Elzone particle size analyzer) were used to investigate the particle size of rice flours obtained by various milling methods. The Elzone particle size analyzer was more effective than the sieve shaker in determining particle size, and the particle size distribution of the rice flours was affected by the milling method used. Rice flour prepared in a Pin mill had a particle size range of $60\sim500$ mesh, and 30.38% of the sample was in the $200\sim270$ mesh range. Rice flour prepared in a Colloid mill had a particle size range of $40\sim500$ mesh, with more flour particles in the $140\sim200$ mesh range than in any other. Rice flour prepared in a Micro mill had a particle size range of $140\sim500$ mesh, and 41.62% of the sample was in the particle size range over 500 mesh. Rice flour prepared in a Jet mill was finer, with particle sizes over the 500 mesh range. The finer rice flour gave the highest L value and the lowest a value. The wet-milled flour particles were observed as clusters of starch granules, whereas the dry-milled rice flour particles were observed as fragments of rice grains. Scanning electron photomicrographs revealed visual differences in structure between the milling methods and gave particle size results similar to those of the Elzone particle size analyzer.

A study on the estimation of AADT by short-term traffic volume survey (단기조사 교통량을 이용한 AADT 추정연구)

  • 이승재;백남철;권희정
    • Journal of Korean Society of Transportation / v.20 no.6 / pp.59-68 / 2002
  • AADT (Annual Average Daily Traffic) can be estimated from short-term traffic counts rather than from traffic data collected over 365 days. The estimation process is therefore very important, and there have been many studies on estimating AADT. In this paper, we tried to improve the AADT estimation process based on earlier AADT estimation research. First, we looked for the factor that shows differences among groups. To do so, we examined hourly variables (divided into total hours, weekday hours, Saturday hours, Sunday hours, weekday and Sunday hours, and weekday and Saturday hours) while changing the number of groups each time. In the end, we selected the hourly variables of Sunday and weekday as the factor showing differences among groups. Second, we classified 200 locations into 10 groups through cluster analysis using only monthly variables. The rule for deciding the number of groups is to maximize the deviation among the hourly variables of each group. Third, we classified the 200 locations used in the second step into the 10 groups by applying statistical techniques such as discriminant analysis and a neural network. This step tests how well each location is assigned to its correct group rather than a wrong one. In conclusion, the result of this study's method was closer to the real AADT value than that of the former method, and this study contributes significantly to improving AADT estimation.

Effect of LED Light Sources and Their Installation Method on the Growth of Strawberry Plants (LED 광원 및 설치조건에 따른 딸기의 생육 변화)

  • Lee, Ji Eun;Shin, Yong Seub;Cheung, Joung Do;Do, Han Woo;Kang, Young Hwa
    • Journal of Bio-Environment Control / v.24 no.2 / pp.106-112 / 2015
  • The objective of this study was to examine the growth response of strawberry plants to mixed red and blue LED sources and their installation method. The artificial light sources were an LED PAR lamp (PPFD $2\sim4\ \mu\mathrm{mol}\cdot\mathrm{m}^{-2}\cdot\mathrm{s}^{-1}$), an LED BAR lamp (PPFD $100\sim120\ \mu\mathrm{mol}\cdot\mathrm{m}^{-2}\cdot\mathrm{s}^{-1}$) and an incandescent lamp (PPFD $2\sim4\ \mu\mathrm{mol}\cdot\mathrm{m}^{-2}\cdot\mathrm{s}^{-1}$). The lighting treatment started at the flowering period of the first cluster as night-break lighting and was applied for 3 hours every day, between 22:00 and 01:00. Plant height and leafstalk length were greater in plants treated with the incandescent lamp, whereas shoot fresh and dry weights were heavier under LED PAR than under the incandescent lamp. LED PAR treatment also resulted in the largest leaf area, and chlorophyll content increased by $0.36\ \mathrm{mg}\cdot\mathrm{g}^{-1}$ 60 days after the start of artificial lighting. According to the experimental results, applying 16 W LED PAR lamps with the W-type installation method can improve the light environment in strawberry lighting culture.

Analysis and Prediction of Power Consumption Pattern Using Spatiotemporal Data Mining Techniques in GIS-AMR System (GIS-AMR 시스템에서 시공간 데이터마이닝 기법을 이용한 전력 소비 패턴의 분석 및 예측)

  • Park, Jin-Hyoung;Lee, Heon-Gyu;Shin, Jin-Ho;Ryu, Keun-Ho
    • The KIPS Transactions:PartD / v.16D no.3 / pp.307-316 / 2009
  • In this paper, a spatiotemporal data mining methodology for detecting cycles in power consumption patterns that change over time and space is proposed and applied to power consumption data collected by the GIS-AMR system, with the aim of using the resulting knowledge in real-world applications. First, a partial clustering method was applied for cluster analysis according to the purpose of customers' power consumption. Second, patterns in customers' power consumption data, which contain temporal and spatial attributes, were detected by a 3D cube mining method. Third, using a calendar pattern mining method to detect cyclic patterns in various time domains, the meanings and relationships of the time attributes of the previously detected patterns were analyzed and predicted. To evaluate the proposed spatiotemporal data mining, we analyzed and predicted power consumption patterns, including their temporal cycles and spatial features, from a total of 266,426 records of 3,256 high-consumption customers from Jan. 2007 to Apr. 2007, provided by the GIS-AMR system at KEPRI. As a result of applying the proposed methodology, cyclic patterns of each group's representative profiles were identified by time and location.

Discovering Association Rules using Item Clustering on Frequent Pattern Network (빈발 패턴 네트워크에서 아이템 클러스터링을 통한 연관규칙 발견)

  • Oh, Kyeong-Jin;Jung, Jin-Guk;Ha, In-Ay;Jo, Geun-Sik
    • Journal of Intelligence and Information Systems / v.14 no.1 / pp.1-17 / 2008
  • Data mining is defined as the process of discovering meaningful and useful patterns in large volumes of data. In particular, finding association rules between items in a database of customer transactions has become an important task. Since the Apriori algorithm, several data structures and algorithms have been proposed for storing meaningful information compressed from an original database in order to find frequent itemsets. Although existing methods find all association rules, analyzing them takes a lot of processing because there are too many rules. In this paper, we propose a new data structure, called a Frequent Pattern Network (FPN), which represents items as vertices and 2-itemsets as edges of the network. To utilize the FPN, we construct it using item frequencies. We then use a clustering method to group the vertices of the network into clusters so that intracluster similarity is maximized and intercluster similarity is minimized, and we generate association rules based on the clusters. Our experiments showed the accuracy of clustering items on the network using confidence, correlation and edge weight similarity methods. We also generated association rules using the clusters and compared the traditional method with ours. The results show that confidence similarity had a stronger influence than the others on the frequent pattern network, and that the FPN was flexible with respect to the minimum support value.
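
The network described in this abstract, with items as vertices and 2-itemsets as edges, can be sketched as follows. The abstract does not spell out the weighting, so this sketch assumes vertex weights are item frequencies and edge weights are co-occurrence counts; the function names and the toy transactions are hypothetical.

```python
from collections import Counter
from itertools import combinations

def build_fpn(transactions):
    """Return (vertex_counts, edge_counts) for a list of transactions.

    Vertices are items weighted by how many transactions contain them;
    edges are 2-itemsets weighted by how many transactions contain both items.
    """
    vertices = Counter()
    edges = Counter()
    for transaction in transactions:
        items = sorted(set(transaction))  # ignore duplicates, fix pair order
        vertices.update(items)
        edges.update(combinations(items, 2))
    return vertices, edges

def confidence(vertices, edges, a, b):
    """confidence(a -> b) = count({a, b}) / count({a})."""
    pair = tuple(sorted((a, b)))
    return edges[pair] / vertices[a]

# Hypothetical toy transactions.
transactions = [
    ["bread", "milk"],
    ["bread", "milk", "butter"],
    ["bread", "milk"],
    ["butter", "jam"],
]

vertices, edges = build_fpn(transactions)
print(vertices["bread"], edges[("bread", "milk")])
print(confidence(vertices, edges, "butter", "bread"))
```

Confidence is asymmetric, so an undirected edge yields two similarity values, one per direction. A clustering step on top of this network would have to decide how to combine them.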

A method for learning users' preference on fuzzy values using neural networks and k-means clustering (신경망과 k-means 클러스터링을 이용한 사용자의 퍼지값 선호도 학습 방법)

  • Yoon, Tae-Bok;Na, Hyun-Jong;Park, Doo-Kyung;Lee, Jee-Hyong
    • Journal of the Korean Institute of Intelligent Systems / v.16 no.6 / pp.716-720 / 2006
  • Fuzzy sets are good for abstracting and unifying information using natural-language-like terms. However, fuzzy sets embody vagueness, and because users may have different attitudes to that vagueness, each user may choose a different one as the best among several fuzzy values. In this paper, we develop a method for learning a user's preference on fuzzy values and selecting the one that fits that preference. Users' preferences are modeled with artificial neural networks. We gather learning data from users by asking them to choose the better of two fuzzy values in several representative cases of comparing two fuzzy sets. To establish the representative comparison cases, we enumerate more than 600 cases and cluster them into several groups. The neural networks are trained with the users' answers and the two given fuzzy values in each case. Experiments show that the proposed method produces outputs closer to users' preferences than other methods.
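
The clustering step named in this paper's title, k-means, can be sketched in plain Python. The sketch assumes each comparison case has already been encoded as a numeric feature vector; the encoding, the toy points and the choice of initial centroids are illustrative assumptions rather than details from the paper.

```python
import math

def kmeans(points, centroids, iterations=20):
    """Lloyd's algorithm: alternate nearest-centroid assignment and mean update."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(iterations):
        # Assignment step: index of the nearest centroid for every point.
        labels = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for k, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                new_centroids.append(old)  # an empty cluster keeps its centroid
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    return labels, centroids

# Hypothetical 2-D feature vectors for six comparison cases.
points = [(0.1, 0.2), (0.2, 0.1), (0.15, 0.15),
          (0.8, 0.9), (0.9, 0.8), (0.85, 0.85)]

labels, centroids = kmeans(points, centroids=[points[0], points[3]])
print(labels)
```

One representative per group, for example the case closest to each centroid, could then stand in for the whole group, which keeps the number of questions put to a user small.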

Frequent Origin-Destination Sequence Pattern Analysis from Taxi Trajectories (택시 기종점 빈번 순차 패턴 분석)

  • Lee, Tae Young;Jeon, Seung Bae;Jeong, Myeong Hun;Choi, Yun Woong
    • KSCE Journal of Civil and Environmental Engineering Research / v.39 no.3 / pp.461-467 / 2019
  • Advances in location-aware and IoT (Internet of Things) technology are accelerating the generation of massive movement data. Knowledge discovery from massive movement data helps us understand urban flow and manage traffic. This paper proposes a method to analyze frequent origin-destination sequence patterns from irregular spatiotemporal taxi pick-up locations. The proposed method first conducts cluster analysis and then runs a frequent sequence pattern analysis that uses the identified clusters as the base unit. The experimental data are Seoul taxi trajectories between 7 a.m. and 9 a.m. over one week. The experimental results show that significant frequent sequence patterns occur within Gangnam. Significant frequent sequence patterns between different regions are identified between Gangnam and the Seoul City Hall area. Further, this study also uses administrative boundaries as a base unit; the results based on administrative boundaries fail to detect frequent sequence patterns between different regions. The proposed method can be applied not only to decrease taxis' empty-loaded rate but also to improve urban flow management.
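
The two-stage idea in this abstract, grouping locations first and then counting frequent origin-destination patterns over the groups, can be sketched in a much simplified form. The sketch makes loud assumptions: coordinates are snapped to a rounding grid as a stand-in for the cluster analysis, only single origin-destination pairs are counted rather than longer sequences, and the trip coordinates are made up.

```python
from collections import Counter

def grid_cell(lon, lat, digits=2):
    """Snap a coordinate to a grid cell (a stand-in for a real clustering step)."""
    return (round(lon, digits), round(lat, digits))

def frequent_od_pairs(trips, min_support):
    """Count (origin cell, destination cell) pairs and keep the frequent ones."""
    counts = Counter(
        (grid_cell(*origin), grid_cell(*destination))
        for origin, destination in trips
    )
    return {pair: n for pair, n in counts.items() if n >= min_support}

# Hypothetical trips as ((lon, lat) origin, (lon, lat) destination).
trips = [
    ((127.028, 37.498), (127.031, 37.502)),
    ((127.027, 37.497), (127.033, 37.503)),
    ((127.029, 37.501), (126.978, 37.566)),
    ((127.031, 37.499), (127.029, 37.501)),
]

frequent = frequent_od_pairs(trips, min_support=2)
print(frequent)
```

The base unit matters: a coarser or differently shaped unit changes which pairs reach the support threshold, which is the effect the abstract reports when it compares clusters with administrative boundaries.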