• Title/Summary/Keyword: Data Bias

Assessment of Frequency Analysis using Daily Rainfall Data of HadGEM3-RA Climate Model (HadGEM3-RA 기후모델 일강우자료를 이용한 빈도해석 성능 평가)

  • Kim, Sunghun;Kim, Hanbeen;Jung, Younghun;Heo, Jun-Haeng
    • Journal of Wetlands Research
    • /
    • v.21 no.spc
    • /
    • pp.51-60
    • /
    • 2019
  • In this study, we performed at-site frequency analysis (AFA) and regional frequency analysis (RFA) using observed and climate change scenario data, and compared the relative root mean squared error (RRMSE) of both approaches through Monte Carlo simulation. To evaluate the rainfall quantiles, daily rainfall data were extracted for 615 points in Korea from HadGEM3-RA (12.5 km) climate model data, one of the regional climate model (RCM) datasets provided by the Korea Meteorological Administration (KMA). Quantile mapping (QM) and the inverse distance squared method (IDSM) were applied for bias correction and spatial disaggregation. The results show that RFA estimates rainfall quantiles more accurately than AFA, and RFA is expected to be a reasonable choice when estimating rainfall quantiles based on climate change scenarios.
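As a rough illustration of the two preprocessing steps named in this abstract, the sketch below shows an empirical quantile mapping and an inverse-distance-squared interpolation in plain Python. It is not the authors' implementation: the function names, the plotting-position formula, and the sample values are all assumptions made for the example.

```python
from bisect import bisect_right


def empirical_prob(sorted_sample, x):
    """Non-exceedance probability of x, scaled so min -> 0 and max -> 1."""
    k = bisect_right(sorted_sample, x)          # number of values <= x
    return max(k - 1, 0) / (len(sorted_sample) - 1)


def quantile(sorted_sample, p):
    """Linearly interpolated quantile of a sorted sample for p in [0, 1]."""
    pos = p * (len(sorted_sample) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_sample) - 1)
    frac = pos - lo
    return sorted_sample[lo] * (1 - frac) + sorted_sample[hi] * frac


def quantile_map(model_value, model_hist, obs_hist):
    """Bias-correct one model value: look up its probability in the model's
    historical distribution, then read the observed value at that probability."""
    p = empirical_prob(sorted(model_hist), model_value)
    return quantile(sorted(obs_hist), p)


def idw_squared(target, stations):
    """Inverse-distance-squared estimate at `target` (x, y).
    `stations` is a list of ((x, y), value) pairs."""
    num = den = 0.0
    for (x, y), value in stations:
        d2 = (x - target[0]) ** 2 + (y - target[1]) ** 2
        if d2 == 0:
            return value                        # target coincides with a station
        num += value / d2
        den += 1.0 / d2
    return num / den


# Made-up numbers, for illustration only.
model_hist = [0, 2, 4, 6, 8]
obs_hist = [0, 3, 6, 9, 12]
corrected = quantile_map(4, model_hist, obs_hist)
estimate = idw_squared((0, 0), [((1, 0), 10.0), ((2, 0), 20.0)])
```

In practice the mapping is usually fitted per season or per month and with far longer records; the sketch only shows the mechanics.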

A Study on Impacts of De-identification on Machine Learning's Biased Knowledge (머신러닝 편향성 관점에서 비식별화의 영향분석에 대한 연구)

  • Soohyeon Ha;Jinsong Kim;Yeeun Son;Gaeun Won;Yujin Choi;Soyeon Park;Hyung-Jong Kim;Eunsung Kang
    • Journal of the Korea Society for Simulation
    • /
    • v.33 no.2
    • /
    • pp.27-35
    • /
    • 2024
  • We aimed to shed light on the issue of perpetuating societal disparities by analyzing the impact of inherent biases present in datasets used for training artificial intelligence models on the predictions generated by Artificial Intelligence(AI). Therefore, to examine the influence of data bias on AI models, we constructed an original dataset containing biases related to gender wage gaps and subsequently created a de-identified dataset. Additionally, by utilizing the decision tree algorithm, we compared the outputs of AI models trained on both the original and de-identified datasets, aiming to analyze how data de-identification affects the biases in the results produced by artificial intelligence models. Through this, our goal was to highlight the significant role of data de-identification not only in safeguarding individual privacy but also in addressing biases within the data.

TOVS retrieved data with the real time synoptic surface data (종관 지상 자료를 이용한 TOVS수치 해석 산출 자료)

  • 주상원;정효상;김금란
    • Korean Journal of Remote Sensing
    • /
    • v.10 no.1
    • /
    • pp.55-67
    • /
    • 1994
  • The International TOVS (TIROS Operational Vertical Sounder) Processing Package (ITPP-VI) is designed for global use and needs surface data to generate atmospheric soundings. Unless the initial input process in ITPP-VI is modified, it generally uses climatic surface data to produce sounding data. The Korea Meteorological Administration (KMA) is trying to improve the quality of TOVS sounding data using real-time synoptic observations and to use the data for weather prediction and analysis in various ways. Several cases in this study show that TOVS-retrieved meteorological parameters such as atmospheric temperature, dew point depression, and geopotential height, when retrieved with synoptic surface observations, delineate more detailed atmospheric features than those retrieved with climatic surface data. In addition, collocated comparisons of the TOVS synoptic retrieved parameters with radiosonde observations are performed statistically. TOVS fields retrieved with the synoptic surface analysis data show a relatively smaller bias than those retrieved with the climatic data, and also reduced root mean square differences below 700 hPa, as expected.
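The collocated comparison described in this abstract rests on two statistics, bias and root mean square difference. A minimal sketch of how they are typically computed follows; the function name and the temperature values are hypothetical and are not taken from the paper.

```python
import math


def bias_and_rmsd(retrieved, reference):
    """Mean difference (bias) and root mean square difference between
    collocated retrieved values and reference observations."""
    diffs = [r - o for r, o in zip(retrieved, reference)]
    bias = sum(diffs) / len(diffs)
    rmsd = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    return bias, rmsd


# Hypothetical collocated temperatures in kelvin (retrieval vs. radiosonde).
retrieved = [271.0, 274.0, 280.0]
radiosonde = [270.0, 275.0, 278.0]
bias, rmsd = bias_and_rmsd(retrieved, radiosonde)
```

A positive bias here would mean the retrieval runs warmer than the radiosonde on average, while the RMSD also captures scatter around that mean.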

Cross platform classification of microarrays by rank comparison

  • Lee, Sunho
    • Journal of the Korean Data and Information Science Society
    • /
    • v.26 no.2
    • /
    • pp.475-486
    • /
    • 2015
  • Mining the microarray data accumulated in public data repositories can save experimental cost and time and provide valuable biomedical information. Big data analysis that pools multiple data sets increases statistical power, improves the reliability of the results, and reduces the study-specific bias of individual studies. However, integrating several data sets from different studies requires dealing with many problems. In this study, I limited the focus to cross platform classification, in which the platform of a testing sample differs from the platform of the training set, and suggested a simple classification method based on rank. This method is compared with diagonal linear discriminant analysis, the k nearest neighbor method, and the support vector machine using real cross platform example data sets for two cancers.
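The abstract does not spell out the rank-based method, so the following is only one plausible sketch of the general idea: replace each sample's expression values with within-sample ranks, which are unaffected by platform-specific scaling, and assign a test sample to the class whose mean rank profile is closest. The nearest-centroid rule, the function names, and the toy data are assumptions.

```python
def ranks(values):
    """Within-sample ranks (1 = smallest); tied values share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result


def centroid(rows):
    """Column-wise mean of equally long rows."""
    return [sum(col) / len(col) for col in zip(*rows)]


def classify(sample, training):
    """Return the label whose mean rank profile is nearest to the sample's ranks.
    `training` maps each label to a list of raw expression vectors."""
    sample_ranks = ranks(sample)
    best_label, best_dist = None, None
    for label, rows in training.items():
        center = centroid([ranks(row) for row in rows])
        dist = sum((a - b) ** 2 for a, b in zip(sample_ranks, center))
        if best_dist is None or dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


# Toy data: training values on one scale, test samples on very different scales.
training = {"A": [[1, 5, 9], [2, 6, 8]], "B": [[9, 5, 1], [8, 7, 2]]}
label_1 = classify([100, 300, 900], training)
label_2 = classify([0.9, 0.5, 0.1], training)
```

Because only the ordering of genes within a sample matters, the test samples need no cross-platform normalization before classification.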

FDI and the Evolution of Directed Technological Progress Bias: New Evidence from Korean Outward Investment

  • Boye Li;Xiang Li;Yaokun Wu
    • Journal of Korea Trade
    • /
    • v.27 no.5
    • /
    • pp.1-22
    • /
    • 2023
  • Purpose - Southeast Asia has been the focus of Korea's foreign investment. Korea has been helping developing countries in Southeast Asia achieve economic growth and win-win cooperation through capital exports. FDI is an important channel for technology diffusion. However, the impact of FDI on the bias of technological progress in the host country depends on the host country's own endowment structure and capital-labor factor substitution elasticity. Therefore, the central issue of this paper is to accurately evaluate the impact of Korea's FDI in various industries in the four Southeast Asian countries on their bias of technological progress. Design/methodology - The paper uses macroeconomic data for Korea and four East Asian countries to estimate capital-labor factor elasticities of substitution using nonlinear seemingly unrelated regressions (NLSUR). Then, the biased technological change index (BTCI) is calculated for each country. Finally, panel data analysis is used to explore the impact of Korean FDI in various industries in the four Southeast Asian countries on their own directed technological progress, and a robustness test is conducted. Findings - There is a substitution relationship between capital and labor factors based on their elasticity in Korea, Singapore and the Philippines. There is a complementary relationship between capital and labor factors in Indonesia and Malaysia. According to the BTCI, there is a trend toward labor-biased technological progress in all countries. Korean investments in manufacturing, wholesale and retail trade in the host country trigger capital-biased technological change in the host country; investments in the finance, insurance and information and communication sectors trigger labor-biased technological change. In addition, this paper also confirms that directed technological progress can be transmitted across countries. Originality/value - The innovation of this paper lies in four aspects. First, we estimate the BTCI for five countries and explore the trend and situation of directed technological progress in each country from each country's own perspective. Second, we explore the impact of Korean FDI in the host country on the bias of its technological progress at the industry level. Third, we explore the impact of Korean FDI in various industries in the four Southeast Asian countries on the four countries' own directed technological progress from a national perspective. Finally, we propose corresponding countermeasures for technological progress from the perspective of inverse factor endowment. These innovative points not only expand the understanding of technological progress and cross-country technology transfer in East Asia but also provide practical references for policy-makers and business operators.

The Use and Abuse of Climate Scenarios in Agriculture (농업부문 기후시나리오 활용의 주의점)

  • Kim, Jin-Hee;Yun, Jin I.
    • Korean Journal of Agricultural and Forest Meteorology
    • /
    • v.18 no.3
    • /
    • pp.170-178
    • /
    • 2016
  • It is not clear how to apply a climate scenario to assess the impact of climate change in the agricultural sector. Even if the same scenario is applied, the result can vary depending on the temporal-spatial downscaling, the post-processing used to adjust the bias of a model, and the choice of prediction model used for the impact assessment. The end user of the scenario climate data should select climate factors, a spatial extent, and a temporal range appropriate for the objectives of the analysis. It is important to derive impact assessment results with minimum uncertainty by evaluating the suitability of the data, including how well it reproduces the past climate, and by determining the optimum future climate change scenario. This study introduced data processing methods for reducing the uncertainties that arise when users in the agricultural sector apply future climate change scenarios, and tried to provide basic information for using the scenario data appropriately in accordance with the study objectives.

Development of Land Surface Temperature Retrieval Algorithm from the MTSAT-2 Data

  • Kim, Ji-Hyun;Suh, Myoung-Seok
    • Korean Journal of Remote Sensing
    • /
    • v.27 no.6
    • /
    • pp.653-662
    • /
    • 2011
  • Land surface temperature (LST) is one of the key land surface variables that can be estimated from geostationary meteorological satellites. In this study, we developed three sets of LST retrieval algorithms for MTSAT-2 data through radiative transfer simulations with MODTRAN 4 under various atmospheric profiles (TIGR data), satellite zenith angles, spectral emissivities, and surface lapse rate conditions. The three LST algorithms are the daytime, nighttime, and total LST algorithms. A weighting method based on the solar zenith angle was developed for consistent retrieval of LST in the early morning and evening. The spectral emissivity of the two thermal infrared channels is estimated using the vegetation coverage method with a land cover map and 15-day normalized vegetation index data. In general, the three LST algorithms estimated the LST well regardless of the satellite zenith angle, water vapour amount, and surface lapse rate. However, the daytime LST algorithm shows a large bias, especially for warm LST (> 300 K) under daytime conditions. The nighttime LST algorithm shows a relatively large error for LST of 260~280 K under nighttime conditions. The sensitivity analysis showed that the weighting method clearly improves performance regardless of the impacting conditions, although the improvement of the weighted LST over the total LST differs considerably with the atmospheric and surface lapse rate conditions. Validation of the daytime (nighttime) LST against MODIS LST showed correlation coefficients, bias, and RMSE of about 0.62~0.93 (0.44~0.83), -1.47~1.53 (-1.80~0.17), and 2.25~4.77 (2.15~4.27), respectively. However, the performance of the daytime/nighttime LST algorithms is slightly degraded compared to that of the total LST algorithm.

Design of a Whitening Block Module for Minimizing DC Bias in Wireless Communications (무선 통신에서 DC 바이어스를 최소화하는 화이트닝 블록 설계)

  • Moon, San-Gook
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference
    • /
    • 2008.10a
    • /
    • pp.673-676
    • /
    • 2008
  • In wireless communications such as Bluetooth, the baseband should minimize the DC bias of the data passing through the modem interface of either the transmitter or the receiver, for the reliability of the circuit and the integrity of the data. The transmitter scrambles the data before sending it to the error correction block, and the receiver restores the scrambled data to its original form. To design the whitening block, it is important to select the prime polynomial for the filtering. In this paper, we designed an optimal whitening block using the prime polynomial $g(D)=D^7+D^4+1$ for hardware and area efficiency. The proposed hardware whitening block was described and verified in Verilog HDL, to be synthesized automatically later. The synthesized whitening block operated at 40 MHz, the normal clock speed of the target baseband microcontroller.
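To make the role of $g(D)=D^7+D^4+1$ concrete, here is a small software sketch of a 7-bit linear feedback shift register that generates a whitening sequence and XORs it onto the data. It is not the paper's Verilog design: the Galois-style register layout, the bit ordering, and the seed value are assumptions, and the real Bluetooth seed derivation is not shown.

```python
def whitening_sequence(seed, n):
    """Yield n bits from a 7-bit Galois LFSR for g(D) = D^7 + D^4 + 1.
    `seed` is the initial register value and must be non-zero."""
    if not 1 <= seed <= 0x7F:
        raise ValueError("seed must be a non-zero 7-bit value")
    state = seed
    for _ in range(n):
        out = (state >> 6) & 1          # bit leaving the top of the register
        state = (state << 1) & 0x7F     # shift towards higher positions
        if out:
            state ^= 0b0010001          # feed back into positions 4 and 0
        yield out


def whiten(bits, seed):
    """XOR data bits with the whitening sequence.
    Applying it twice with the same seed restores the original bits."""
    return [b ^ w for b, w in zip(bits, whitening_sequence(seed, len(bits)))]


# A long run of zeros has a strong DC component; whitening breaks it up.
data = [0] * 32
scrambled = whiten(data, seed=0x5A)     # seed chosen arbitrarily for the demo
restored = whiten(scrambled, seed=0x5A)
one_period = list(whitening_sequence(1, 127))
```

Since the polynomial is primitive, the sequence repeats only every 127 bits and contains 64 ones and 63 zeros per period, which is what keeps the whitened stream close to DC-balanced.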

Comparative Analysis of Observation and NWP Data of Downslope Windstorm Cases during 3-Dimensional Meteorological Observation Project in Yeongdong Region of Gangwon province, South Korea in 2020 (2020 강원영동 공동 입체기상관측 기간 강풍 사례에 대한 관측자료와 수치모델 비교 분석)

  • Kwon, Soon-Beom;Park, Se-Taek
    • Atmosphere
    • /
    • v.31 no.4
    • /
    • pp.395-404
    • /
    • 2021
  • To investigate downslope windstorms using more detailed observations, we observed 6 cases at 3 sites - Inje, Yongpyeong, and Bukgangneung - during the "3-D Meteorological Observation Project in Yeongdong region of Gangwon province, South Korea in 2020." The results from the analysis of the project data were as follows. First, AWS data showed that a subsidence inversion layer appeared at 800~700 hPa on the windward side and at 900~850 hPa on the leeward side. Second, before strong wind occurred, the inversion layer had descended to about 880~800 hPa. Third, with mountain wave breaking, the downslope wind intensified at a height of 2~3 km above sea level. After the downslope wind began to descend, the subsidence inversion layer developed. When the subsidence inversion layer got close to the ground, the wind peak occurred. In general, the UM (Unified Model) GDAPS (Global Data Assimilation Prediction System) has had a negative bias in wind speed around the peak area of the Taebaek mountain range and a positive bias along the East Sea coast. The stronger the wind blew, the larger the gap between the observed wind speed and that predicted by GDAPS became. GDAPS predicted strong p-velocity at 0600 LST 25 Apr 2020 (4th case) and weak p-velocity at 2100 LST 01 Jun 2020 (6th case) on the lee side of the Taebaek mountain range near Yangyang. With hydraulic jump theory, known as a mechanism of downslope windstorms in the Yeongdong region, borne out by these cases, it was confirmed that there is a relationship between the lee-side p-velocity and the wind speed on the eastern slope of the Taebaek mountain range.

Learning Domain Invariant Representation via Self-Regularization (자기 정규화를 통한 도메인 불변 특징 학습)

  • Hyun, Jaeguk;Lee, ChanYong;Kim, Hoseong;Yoo, Hyunjung;Koh, Eunjin
    • Journal of the Korea Institute of Military Science and Technology
    • /
    • v.24 no.4
    • /
    • pp.382-391
    • /
    • 2021
  • Unsupervised domain adaptation often gives impressive solutions for handling domain shift in data. Most current approaches assume that unlabeled target data for training is abundant. This assumption is not always true in practice. To tackle this issue, we propose a general solution to the domain gap minimization problem that requires no target data. Our method consists of two regularization steps. The first step is pixel regularization by arbitrary style transfer. Recently, some methods have brought style transfer algorithms into the domain adaptation and domain generalization process. They use style transfer algorithms to remove texture bias in source domain data. We also use style transfer algorithms to remove texture bias, but our method depends on neither the domain adaptation nor the domain generalization paradigm. The second step is feature regularization by feature alignment. By adding a feature alignment loss term to the model loss, the model learns domain invariant representations more efficiently. We evaluate our regularization methods in several experiments on both small and large datasets. The experiments show that our model can learn domain invariant representations as well as unsupervised domain adaptation methods do.