• Title/Summary/Keyword: 불균형비율 (imbalance ratio)


A Study of Optimal Distance between Signalized Intersections and Roundabouts Considering Unbalanced Traffic Conditions (교통량 불균형을 고려한 회전교차로와 신호교차로간 최적거리 산정에 관한 연구)

  • An, Hong Ki;Kim, Dong Sun;Bae, Gi Mok
    • KSCE Journal of Civil and Environmental Engineering Research / v.41 no.6 / pp.707-714 / 2021
  • An important factor that causes delays at roundabouts is unbalanced traffic conditions. The majority of domestic studies have tended to focus on the proportion of entering traffic rather than conflicting traffic in order to explain unbalanced traffic conditions. Also, there is scant research on the proper distance between a signalized intersection and a roundabout. In this study, therefore, unbalanced traffic conditions and the optimal distance between the two intersections were analyzed using SIDRA software, based on the Gajwa-ro roundabout in Icheon, which experiences unbalanced traffic conditions during afternoon peak hours and is located 295 m from a signalized intersection. It was found that unbalanced traffic conditions in the form of conflicting traffic caused delays at the Gajwa-ro roundabout, and that the queues on each approach were eliminated when the two intersections were 850 m apart.

U.S. Defense Imports (미국의 방산 수입)

  • Sin, Myeong-Ho
    • Defense and Technology / no.6 s.124 / pp.50-55 / 1989
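  • The proportion of U.S. defense-materiel imports is examined in order to study the phenomenon of "severe imbalance." The United States purchases about 5 billion dollars' worth annually from some ten countries, while its exports amount to roughly twice the value of those purchases. Some economic analysts claim that this is a successful financial achievement, but others say that this phenomenon has harmed the United States.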


Comparison and Analysis of Women Faces in 20s' and Women Faces in 60s Through Women faces's Measured value (여성 얼굴의 측정치를 통한 20대와 60대의 비교 분석)

  • Kim, Ae-Kyung;Lee, Kyung-Hee
    • Science of Emotion and Sensibility / v.13 no.3 / pp.485-492 / 2010
  • This study analyzes the proportion and disproportion of women's faces in their 20s and 60s through visual analysis and measured values. The ratio of bizygion breadth to face height is 1 : 1.34 in the 20s and 1 : 1.39 in the 60s, which shows that face height is longer in the 60s. The ratio of upper, middle, and lower face length is 0.85 : 1 : 1 in the 20s, which shows that the proportions of upper and lower face length are long, whereas it is 0.84 : 1 : 1.06 in the 60s, which shows that the lower face is long and the upper face is short. When an angle of more than $2^{\circ}$ is treated as severe imbalance, the eye angle is severely imbalanced in 8% of faces in the 20s and 13% in the 60s, the nasal angle in 11% in the 20s and 29% in the 60s, and the mouth angle in 11% in the 20s and 40% in the 60s, showing that imbalance is severe in the 60s. These results show that face height is longer in the 60s than in the 20s, and that the lower face in particular is long, because the face changes with aging. They also show that facial imbalance is more severe in the 60s than in the 20s.


Prediction of Good Seller in Overseas sales of Domestic Books Using Big Data (빅데이터를 활용한 국내 도서의 해외 판매시 굿셀러 예측)

  • Kim, Nayeon;Kim, Doyoung;Kim, Miryeo;Jung, Jiyeong;Kim, Hyon Hee
    • Annual Conference of KIPS / 2022.05a / pp.401-404 / 2022
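  • As Korean literature spreads around the world, establishing a foothold in overseas markets has become important. This study sought to predict whether books published overseas during the five years from 2016 to 2020 reached cumulative sales of 5,000 copies or more, the level at which a book is classified as a good seller. Good sellers account for only a small share of all translated books, which produced a data imbalance; this study addressed the imbalance by applying the SMOTE technique and ensemble algorithms. As a result, performance improved as the class ratio approached 1:1, and the LightGBM model obtained an AUC of 99.83%, the best predictive performance among the ensemble algorithms compared. In addition, the author was the most important variable in predicting whether cumulative sales reached 5,000 copies, and the country of publication, together with online factors such as the average rating and the number of raters, were also significant variables for sales prediction.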
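
The SMOTE step mentioned in the abstract above can be illustrated with a minimal sketch. This is not the authors' pipeline and not the API of any library: `smote_sketch`, its parameters, and the toy points are hypothetical names and data chosen for illustration. It shows only the core idea of SMOTE, which is to create a synthetic minority sample at a random position on the segment between a minority sample and one of its nearest minority neighbours.

```python
import math
import random


def smote_sketch(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between minority neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point (excluding the base itself)
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: math.dist(p, base))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # random position along the segment base -> neighbour
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbour)))
    return synthetic


# Toy data (assumed): 4 minority points against 10 majority points.
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
majority_count = 10
new_points = smote_sketch(minority, n_new=majority_count - len(minority))
print(len(minority) + len(new_points), "minority samples vs", majority_count, "majority")
```

Generating `majority_count - len(minority)` points brings the class ratio to 1:1, the setting the abstract reports as most beneficial; a smaller `n_new` would give an intermediate ratio.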

Optimization of Uneven Margin SVM to Solve Class Imbalance in Bankruptcy Prediction (비대칭 마진 SVM 최적화 모델을 이용한 기업부실 예측모형의 범주 불균형 문제 해결)

  • Sung Yim Jo;Myoung Jong Kim
    • Information Systems Review / v.24 no.4 / pp.23-40 / 2022
  • Although Support Vector Machine (SVM) has been used in various fields such as bankruptcy prediction, the hyperplane learned by SVM under class imbalance can be severely skewed toward the minority class, which harms performance because the region of the majority class expands while the region of the minority class is invaded. This study proposes an optimized uneven margin SVM (OPT-UMSVM) that combines a threshold moving or post scaling method with UMSVM to cope with the limitation of the traditional even margin SVM (EMSVM) under class imbalance. OPT-UMSVM readjusted the skewed hyperplane toward the majority class and had better generalization ability than EMSVM, improving the sensitivity of the minority class and yielding optimized performance. To validate OPT-UMSVM, 10-fold cross validation was performed on five sub-datasets with different imbalance ratios. The empirical results showed two main findings. First, UMSVM had only a weak effect on improving the performance of EMSVM on balanced datasets, but it greatly outperformed EMSVM on severely imbalanced datasets. Second, compared with EMSVM and conventional UMSVM, OPT-UMSVM performed better on both balanced and imbalanced datasets and showed a significant performance difference, especially on severely imbalanced datasets.

Classification Analysis for Unbalanced Data (불균형 자료에 대한 분류분석)

  • Kim, Dongah;Kang, Suyeon;Song, Jongwoo
    • The Korean Journal of Applied Statistics / v.28 no.3 / pp.495-509 / 2015
  • We study the unbalanced classification problem, in which the proportions of the two groups differ significantly. It is usually more difficult to classify classes accurately in unbalanced data than in balanced data. If classification methods are applied directly to unbalanced data, most observations are likely to be assigned to the bigger group because this minimizes the misclassification loss. However, misclassifying the smaller group as the larger group can cause a bigger loss in most real applications. We compare several classification methods for unbalanced data using sampling techniques (up and down sampling). We also check the total loss of the different classification methods when an asymmetric loss is applied to simulated and real data. We use the misclassification rate, G-mean, ROC and AUC (area under the curve) for the performance comparison.
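
As a minimal sketch of the points in the abstract above (not the authors' code), the following assumes binary labels with 1 as the smaller group and both groups present; `rates`, `up_sample`, `down_sample`, and the 95:5 toy data are hypothetical. It shows that a classifier assigning everything to the bigger group gets a low misclassification rate but a G-mean of zero, and how random up and down sampling equalize the group sizes. G-mean is computed here as the square root of sensitivity times specificity.

```python
import math
import random


def rates(labels, preds):
    """Return (misclassification rate, G-mean); label 1 is the smaller group."""
    tp = sum(y == 1 and p == 1 for y, p in zip(labels, preds))
    tn = sum(y == 0 and p == 0 for y, p in zip(labels, preds))
    n_pos = sum(y == 1 for y in labels)
    n_neg = len(labels) - n_pos
    error = 1 - (tp + tn) / len(labels)
    g_mean = math.sqrt((tp / n_pos) * (tn / n_neg))  # sqrt(sensitivity * specificity)
    return error, g_mean


def split(data, labels):
    """Separate rows into (majority, minority) by label."""
    majority = [x for x, y in zip(data, labels) if y == 0]
    minority = [x for x, y in zip(data, labels) if y == 1]
    return majority, minority


def up_sample(data, labels, seed=0):
    """Duplicate random minority rows until both groups have the same size."""
    rng = random.Random(seed)
    majority, minority = split(data, labels)
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    rows = majority + minority + extra
    return rows, [0] * len(majority) + [1] * (len(minority) + len(extra))


def down_sample(data, labels, seed=0):
    """Keep a random subset of majority rows equal in size to the minority."""
    rng = random.Random(seed)
    majority, minority = split(data, labels)
    kept = rng.sample(majority, len(minority))
    return kept + minority, [0] * len(kept) + [1] * len(minority)


# Toy data (assumed): 95 majority rows and 5 minority rows.
labels = [0] * 95 + [1] * 5
data = list(range(100))
error, g = rates(labels, [0] * 100)  # classifier that always predicts the bigger group
print(f"always-majority: error={error:.2f}, G-mean={g:.2f}")
up_rows, up_labels = up_sample(data, labels)
down_rows, down_labels = down_sample(data, labels)
print(len(up_rows), len(down_rows))
```

Up sampling keeps every original row at the cost of duplicates, while down sampling discards majority rows; which trade-off is preferable depends on the data and is the kind of comparison the paper reports.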

Churn Prediction Model using Logistic Regression (Logistic Regression을 이용한 이탈고객예측모형)

  • Jeong, Han-Na;Park, Hye-Jin;Kim, Nam-Hyeong;Jeon, Chi-Hyeok;Lee, Jae-Uk
    • Proceedings of the Korean Operations and Management Science Society Conference / 2008.10a / pp.324-328 / 2008
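  • In the financial industry, the customer churn rate needs to be predicted because it affects expected revenue, and as cost management based on accurate prediction has recently taken hold, predicting customer churn has emerged as an important problem. However, because insurance customer data are large in volume and have imbalanced output values, existing methods are not suitable for building a prediction model. This study focuses on predicting churning customers through logistic regression with the trust-region Newton method, which is known to be effective for processing large-scale data, and, to improve prediction accuracy on imbalanced data, aims to present a churn prediction model suited to customer data using oversampling, clustering, boosting, and related techniques.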


Migration to the Capital Region in Korea: Assessing the Relative Importance of Place Characteristics and Migrant Selectivity (우리나라 수도권으로의 인구이동: 시기별 유출지역 특성과 이주자 선별성의 상대적 중요도 평가)

  • Kwon, Sang-Cheol
    • Journal of the Korean association of regional geographers / v.11 no.6 / pp.571-584 / 2005
  • Population concentration in the Capital region of Korea has become an important issue in the pursuit of balanced regional human capital development. Considering migration as both a geographic and a social movement, migration to the Capital region can be examined in terms of push factors and the selective characteristics of migrants from the out-migration regions. Their relative importance reveals that age and education level are important in almost all years, but that the importance has recently shifted from the percentage of the manufacturing sector and rural/urban region to years of education, the percentage of unskilled occupations and the manufacturing sector, and the unemployment ratio. Since brain drain has been occurring under highly unbalanced regional development in Korea, the results suggest that regional human capital investment should be accompanied by expanding quality employment opportunities in order to reap the benefits.


On sampling algorithms for imbalanced binary data: performance comparison and some caveats (불균형적인 이항 자료 분석을 위한 샘플링 알고리즘들: 성능비교 및 주의점)

  • Kim, HanYong;Lee, Woojoo
    • The Korean Journal of Applied Statistics / v.30 no.5 / pp.681-690 / 2017
  • Various imbalanced binary classification problems exist, such as fraud detection in banking operations, spam mail detection, and prediction of defective products. Several sampling methods, such as over-sampling, under-sampling, and SMOTE, have been developed to overcome the poor prediction performance of binary classifiers when the proportion of one group is dominant. In this study, we investigate the prediction performance of logistic regression, Lasso, random forest, boosting, and support vector machine in combination with these sampling methods for imbalanced binary data. Four real data sets are analyzed to see whether there is a substantial improvement in prediction performance. We also emphasize some precautions to take when the sampling methods are implemented.

A Deep Learning Based Over-Sampling Scheme for Imbalanced Data Classification (불균형 데이터 분류를 위한 딥러닝 기반 오버샘플링 기법)

  • Son, Min Jae;Jung, Seung Won;Hwang, Een Jun
    • KIPS Transactions on Software and Data Engineering / v.8 no.7 / pp.311-316 / 2019
  • Classification is the problem of predicting the class to which an input belongs. One of the most popular ways to do this is to train a machine learning algorithm on a given dataset. In this case, the dataset should have a well-balanced class distribution for the best performance. However, when the dataset has an imbalanced class distribution, classification performance can be very poor. To overcome this problem, we propose an over-sampling scheme that balances the number of samples per class by using Conditional Generative Adversarial Networks (CGAN). CGAN is a generative model developed from Generative Adversarial Networks (GAN), which can learn data characteristics and generate data similar to real data. Therefore, CGAN can generate data for a class that has few samples, so that the problem induced by the imbalanced class distribution can be mitigated and classification performance can be improved. Experiments using actual collected data show that the over-sampling technique using CGAN is effective and that it is superior to existing over-sampling techniques.