• Title/Summary/Keyword: data bias (데이터편향)

An Energy-Efficient Concurrency Control Method for Mobile Transactions with Skewed Data Access Patterns in Wireless Broadcast Environments (무선 브로드캐스트 환경에서 편향된 엑세스 패턴을 가진 모바일 트랜잭션을 위한 효과적인 동시성 제어 기법)

  • Jung, Sung-Won;Park, Sung-Geun;Choi, Keun-Ha
    • Journal of KIISE:Databases
    • /
    • v.33 no.1
    • /
    • pp.69-85
    • /
    • 2006
  • Broadcast is often used to disseminate frequently requested data efficiently to a large number of mobile clients over a single channel or multiple channels. Conventional concurrency control protocols for mobile transactions are not suitable for wireless broadcast environments because of the limited bandwidth of the up-link communication channel. In wireless broadcast environments, the server often broadcasts different data items at different frequencies to reflect the data access patterns of mobile transactions. Previously proposed concurrency control protocols for mobile transactions in wireless broadcast environments focus on mobile transactions with uniform data access patterns. However, these protocols perform poorly when the data access patterns of update mobile transactions are not uniform but skewed. Update mobile transactions with skewed data access patterns are frequently aborted and restarted due to update conflicts on the same data items with a high access frequency. In this paper, we propose an energy-efficient concurrency control protocol for mobile transactions with skewed as well as uniform data access patterns. Our protocol uses a random back-off technique to avoid the frequent abort and restart of update mobile transactions. We present an in-depth experimental analysis of our method by comparing it with existing concurrency control protocols. Our performance analysis shows that it significantly decreases the average response time and the amount of upstream and downstream bandwidth usage compared with existing protocols.
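
The abstract above names a random back-off technique but does not specify it. As a rough illustration only, the sketch below draws a random wait before an aborted update transaction restarts, so that transactions contending for the same hot data items are less likely to collide again. The function name `backoff_delay`, the doubling window, and the `base` and `cap` parameters are assumptions made for this example, not details taken from the paper.

```python
import random

def backoff_delay(attempt, base=1.0, cap=32.0, rng=random):
    """Return a random wait (in arbitrary time units) before restart number `attempt`.

    Assumption for illustration: the window doubles with each failed
    attempt and is capped, as in classic exponential back-off.
    """
    window = min(cap, base * 2 ** attempt)
    return rng.uniform(0.0, window)

# Hypothetical usage: delays drawn for the first four restarts of one transaction.
rng = random.Random(42)
for attempt in range(4):
    print(attempt, backoff_delay(attempt, rng=rng))
```

Randomizing the wait spreads restarts over time; whether the paper uses a fixed or growing window is not stated in the abstract.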

Reliability Updates of Driven Piles Based on Bayesian Theory Using Proof Pile Load Test Results (베이지안 이론을 이용한 타입강관말뚝의 신뢰성 평가)

  • Park, Jae-Hyun;Kim, Dong-Wook;Kwak, Ki-Seok;Chung, Moon-Kyung;Kim, Jun-Young;Chung, Choong-Ki
    • Journal of the Korean Geotechnical Society
    • /
    • v.26 no.7
    • /
    • pp.161-170
    • /
    • 2010
  • For the development of load and resistance factor design, reliability analysis is required to calibrate resistance factors in the framework of reliability theory. In conventional reliability analysis, the distribution of the measured-to-predicted pile resistance ratio, used to assess the uncertainty of pile resistance, is obtained only from load tests conducted to failure. In other words, the results of successful pile load tests (piles that resisted twice their design loads without failure) are discarded and therefore not reflected in the reliability analysis. In this paper, a new systematic method based on Bayesian theory is used to update the reliability indices of driven steel pipe piles by adding proof pile load test results, including those not conducted to failure, to the prior distribution of the pile resistance ratio. Fifty-seven static pile load tests performed to failure in Korea were compiled to construct the prior distribution of the pile resistance ratio. The empirical method proposed by Meyerhof is used to calculate the predicted pile resistance. Reliability analyses were performed using the updated distribution of the pile resistance ratio. The significance of this study is that the distribution of the pile resistance ratio can be updated using load test results even when the tests were not conducted to failure, and that Bayesian updates are most effective when limited data are available for reliability analysis.

Approximate Variance of Least Square Estimators for Regression Coefficient under Inclusion Probability Proportional to Size Sampling (포함확률비례추출에서 회귀계수 최소제곱추정량의 근사분산)

  • Kim, Kyu-Seong
    • Communications for Statistical Applications and Methods
    • /
    • v.19 no.1
    • /
    • pp.23-32
    • /
    • 2012
  • This paper deals with the bias and variance of regression coefficient estimators in a finite population. We derive approximate formulas for the bias, variance and mean square error of two estimators when we select a fixed-size sample with inclusion probability proportional to size and then estimate the regression coefficients by the ordinary least squares estimator and by the weighted least squares estimator based on the selected sample data. Necessary and sufficient conditions for comparing the two estimators in terms of variance and mean square error are suggested. In addition, a simple example is introduced to compare the variance and mean square error of the two estimators numerically.

The Effects of Cognitive Bias on Entrepreneurial Opportunity Evaluations through Perceived Risks in Entrepreneurial Self-Efficacy (창업가의 인지편향이 지각된 위험과 조절된 창업효능감에 따라 창업기회평가에 미치는 영향)

  • Kim, Daeyop;Park, Jaehwan
    • Asia-Pacific Journal of Business Venturing and Entrepreneurship
    • /
    • v.15 no.1
    • /
    • pp.95-112
    • /
    • 2020
  • This paper investigates how the cognitive biases of college students and entrepreneurs relate to perceived risks and to entrepreneurial opportunities that represent uncertainty, and how various cognitive biases and entrepreneurial efficacy bear on those relationships. The purpose of this study is to find points of improvement in entrepreneurship education for college students and to identify problems and possible improvements in the decision-making process of current entrepreneurs. Such empirical work is needed to improve the decision-making of individuals who want to start a business at a time when various attempts are being made to stimulate start-ups and to increase the sustainability of existing SME management. The study also seeks to understand differences in opportunity evaluation and suggests that good opportunities need to be provided alongside the training of entrepreneurs. To achieve the purpose of the study, questionnaires were administered to college students and entrepreneurs. A total of 363 questionnaire responses were obtained and analyzed using structural equation modeling. First, this study confirms that there is some relationship between perceived risk and cognitive bias. Among the cognitive biases, overconfidence and the illusion of control have a significant relationship between perceived risk and wealth. In particular, the illusion of control among college students has a significant relationship with perceived risk. Second, cognitive bias showed some significant relationship with opportunity evaluation. Although we did not find evidence that overconfidence is related to opportunity evaluation, we verified that the illusion of control and status quo bias are related to opportunity evaluation. The illusion of control was significant for both college students and entrepreneurs. Third, perceived risk has a negative relationship with opportunity evaluation. All respondents, whether college students or entrepreneurs, judge opportunities positively if they perceive low risk. Fourth, in the college students' group, entrepreneurial efficacy has a moderating effect between perceived risk and opportunity evaluation, but no significant results were found in the entrepreneurs' group. Fifth, college students and entrepreneurs have different cognitive biases, and the relationship between entrepreneurial opportunity evaluation and perceived risk differs between the two groups. On the whole, college students and entrepreneurs who have to make judgments about uncertain opportunities are subject to various cognitive biases caused by time pressure or stress, and awareness of these biases can improve their judgment in the future. At the same time, university students can have a positive view of new opportunities based on high entrepreneurial efficacy, but they will make better opportunity assessments if they fully understand the intrinsic risks of entrepreneurship through entrepreneurship education and the cognitive biases present in direct entrepreneurial experience. This study is limited in that it pools university students and entrepreneurs and in that the survey respondents were selected by a limited random sampling method. More systematic research based on more reliable data is needed, given the lack of accumulated entrepreneurship research data. Second, the measurement tools used in previous studies were translated, and their meaning might not have been conveyed fully because of language differences. Therefore, a more precise scale needs to be constructed to improve the accuracy of the study. Finally, complementary research should be done to identify what competitive opportunities are and what opportunities are appropriate for entrepreneurs.

Hate Speech Detection in Chatbot Data Using KoELECTRA (KoELECTRA를 활용한 챗봇 데이터의 혐오 표현 탐지)

  • Shin, Mingi;Chin, Hyojin;Song, Hyeonho;Choi, Jeonghoi;Lim, Hyeonseung;Cha, Meeyoung
    • Annual Conference on Human and Language Technology
    • /
    • 2021.10a
    • /
    • pp.518-523
    • /
    • 2021
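  • As the use of conversational agents such as chatbots grows, the use of hate speech in chats is growing with it. Although many attempts have been made to detect hate speech automatically, research on hate speech detection in chatbot data remains scarce. This study applies a KoELECTRA-based hate speech detection model, trained on Korean corpora, to 350,000 chatbot-user conversation records containing hate speech, and examines the performance and limitations of hate speech detection on a chatbot-human dataset. The KoELECTRA hate speech classification model achieved a weighted-average F1-score of 0.66 on the chatbot dataset, and the main limitations identified were vulnerability to typos, reinforcement of bias caused by ignoring context, and accuracy problems in the available data. Based on the experimental results, this study suggests directions for improving performance.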
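
For readers unfamiliar with the metric in the abstract above, a weighted-average F1-score is the mean of the per-class F1 scores, with each class weighted by its share of the true labels. The sketch below is a minimal pure-Python version; the function name `weighted_f1` and the toy labels are invented for illustration and are unrelated to the study's data.

```python
from collections import Counter

def weighted_f1(y_true, y_pred):
    """Per-class F1, averaged with weights equal to each class's support."""
    support = Counter(y_true)
    total = len(y_true)
    score = 0.0
    for label, n in support.items():
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = n - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        score += (n / total) * f1
    return score

# Toy labels (invented): 1 = hate speech, 0 = not hate speech.
y_true = [0, 0, 0, 1, 1]
y_pred = [0, 0, 1, 1, 0]
print(round(weighted_f1(y_true, y_pred), 3))  # 0.6
```

Because the weights follow class frequency, a large majority class can dominate the score, which matters for imbalanced hate speech data.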

FubaoLM : Automatic Evaluation based on Chain-of-Thought Distillation with Ensemble Learning (FubaoLM : 연쇄적 사고 증류와 앙상블 학습에 의한 대규모 언어 모델 자동 평가)

  • Huiju Kim;Donghyeon Jeon;Ohjoon Kwon;Soonhwan Kwon;Hansu Kim;Inkwon Lee;Dohyeon Kim;Inho Kang
    • Annual Conference on Human and Language Technology
    • /
    • 2023.10a
    • /
    • pp.448-453
    • /
    • 2023
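  • Evaluating large language models (LLMs) from the standpoint of human preference is a challenging task that differs from conventional benchmark evaluation. Prior work approached it by using a strong LLM as the evaluator, but this raised the problem of high cost. In addition, the subjective scoring criteria an LLM applies as an evaluator are ambiguous, which undermines the reliability of the results, and evaluations by a single model may be biased. In this paper, we propose an evaluation framework and an evaluator model, 'FubaoLM', that can perform unbiased evaluation using strict criteria. Our evaluation framework performs Chain-of-Thought-based evaluation with multiple strong Korean LLMs under in-depth evaluation criteria. These evaluation results are combined by majority vote to derive an unbiased result, and through instruction tuning FubaoLM is distilled with evaluation knowledge from the multiple LLMs. Furthermore, we construct an expert-based evaluation dataset to demonstrate the effectiveness of FubaoLM. In our experiments, the ensembled FubaoLM shows 16% to 23% higher absolute-evaluation performance than GPT-3.5 and, in binary evaluation, produces preference judgments similar to those of humans. This shows that FubaoLM can perform unbiased evaluation while maintaining high reliability at relatively low cost.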
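
The majority-vote step mentioned in the abstract above can be pictured with a few lines of Python. This is a generic sketch, not the FubaoLM implementation: the function name `majority_vote`, the verdict labels, and the choice to return `None` on a tie are all assumptions made for illustration.

```python
from collections import Counter

def majority_vote(verdicts):
    """Return the most common verdict, or None if there is no clear winner."""
    if not verdicts:
        return None
    counts = Counter(verdicts).most_common()
    # A tie between the top two verdicts is treated as undecided (an assumption).
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return None
    return counts[0][0]

# Hypothetical verdicts from three evaluator models comparing responses A and B.
print(majority_vote(["A", "B", "A"]))  # A
```

Using an odd number of evaluators avoids ties in two-way comparisons; how the paper handles ties is not stated in the abstract.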

An Enhanced DBSCAN Algorithm to Consider Various Density Distributions for Educational Data (교육데이터 정제를 위한 다양한 밀도분포를 고려한 개선된 DBSCAN 알고리즘)

  • Kim, Jeong-Hun;Nasridinov, Aziz
    • Proceedings of The KACE
    • /
    • 2018.01a
    • /
    • pp.41-44
    • /
    • 2018
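  • Educational data mining aims to use the vast amounts of data generated in diverse educational settings to analyze and predict learners' learning types and progress, and to improve educational achievement effectively. Obtaining effective educational data mining results requires cleaning the educational data. With DBSCAN clustering, noise in the educational data can be removed and data can be drawn from each resulting cluster at the same rate, producing an unbiased sample. However, because DBSCAN relies on two global parameters, it cannot produce clusters with varying density distributions, which can be a critical problem when cleaning educational data. In this paper, to address this problem and improve clustering accuracy, we propose C-DBSCAN, which effectively generates clusters with varying density distributions by determining optimal input parameters for each density distribution instead of using fixed parameters.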
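
The sampling idea in the abstract above (drop noise, then draw from every cluster at the same rate) can be sketched as follows. The example assumes cluster labels are already available from some DBSCAN run, with `-1` marking noise as in scikit-learn's convention. The function name `sample_per_cluster`, the `rate` parameter, and the toy labels are hypothetical, and the clustering step itself (including C-DBSCAN) is not shown.

```python
import random
from collections import defaultdict

def sample_per_cluster(items, labels, rate, seed=0):
    """Draw the same fraction of items from each cluster, skipping noise (-1)."""
    if not 0 < rate <= 1:
        raise ValueError("rate must be in (0, 1]")
    rng = random.Random(seed)
    clusters = defaultdict(list)
    for item, label in zip(items, labels):
        if label != -1:  # noise points are discarded
            clusters[label].append(item)
    sample = []
    for label in sorted(clusters):
        members = clusters[label]
        k = max(1, round(rate * len(members)))  # at least one item per cluster
        sample.extend(rng.sample(members, k))
    return sample

# Toy data: six items in cluster 0, two in cluster 1, two noise points.
items = list(range(10))
labels = [0] * 6 + [1] * 2 + [-1] * 2
print(len(sample_per_cluster(items, labels, rate=0.5)))  # 4
```

Sampling at an equal rate keeps each cluster's share of the sample proportional to its size while excluding noise.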

Clustered Hash Index-based Skyline Query (해시 색인 군집화 기반 스카이라인 질의)

  • Choi, Jong-Hyeok;Nasridinov, Aziz
    • Proceedings of The KACE
    • /
    • 2018.01a
    • /
    • pp.45-48
    • /
    • 2018
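  • Skyline queries use the concept of dominance to find, within a given dataset, the items that can represent it, and are therefore widely used, for example to find the optimal results matching a user's request or to support corporate decision-making. However, as the dimensionality of the data increases, overall performance degrades and the number of items selected as the skyline grows sharply, so the query fails to return results that are useful to the user. To address this, skyline queries based on Top-k queries or on clustering techniques have recently been proposed, but they are strongly affected by data skew or by the user-supplied k, so their results neither represent the data sufficiently nor satisfy diversity. To solve these problems, this paper proposes the theoretical background of CHI-SQ, a new kind of skyline query that uses a hash index technique and the DBSCAN clustering technique to represent the given data sufficiently while also satisfying diversity.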
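
To make the notion of dominance in the abstract above concrete, here is a naive skyline computation in Python. It is the textbook quadratic-time definition, not CHI-SQ. The function names, the convention that smaller values are better, and the toy hotel data are assumptions made for illustration.

```python
def dominates(a, b):
    """a dominates b if a is no worse in every dimension and strictly better in one.

    Smaller values are treated as better (an assumption for this example).
    """
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def skyline(points):
    """Return the points that no other point dominates (naive O(n^2) scan)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

# Toy data: (price, distance to the beach) for five hypothetical hotels.
hotels = [(50, 8), (60, 2), (80, 1), (70, 5), (90, 9)]
print(skyline(hotels))  # [(50, 8), (60, 2), (80, 1)]
```

With more dimensions, fewer points are dominated, so the skyline grows, which is the problem the abstract describes.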

Unsupervised Abstractive Summarization Method that Suitable for Documents with Flows (흐름이 있는 문서에 적합한 비지도학습 추상 요약 방법)

  • Lee, Hoon-suk;An, Soon-hong;Kim, Seung-hoon
    • KIPS Transactions on Software and Data Engineering
    • /
    • v.10 no.11
    • /
    • pp.501-512
    • /
    • 2021
  • Recently, a breakthrough has been made in the NLP area by Transformer techniques based on the encoder-decoder architecture. However, these techniques can be used only in mainstream languages, such as English and Chinese, for which datasets with millions of examples are available; they cannot be used in non-mainstream languages for which such datasets have not been established. In addition, mechanical summarization suffers from a bias ("deflection") problem in that it focuses on the beginning of the document. Therefore, these methods are not suitable for documents with flows, such as fairy tales and novels. In this paper, we propose a hybrid summarization method that does not require a dataset and mitigates the deflection problem using a GAN with two adaptive discriminators. We evaluate our model on the CNN/Daily Mail dataset to verify its objective validity. We also show that the model has valid performance in Korean, one of the non-mainstream languages.