• Title/Summary/Keyword: Data 누락

Estimate method of missing data using Similarity in AMI system (AMI시스템에서 유사도를 활용한 누락데이터 보정 방법)

  • Kwon, Hyuk-Rok;Hong, Taek-Eun;Kim, Pan-Koo
    • Smart Media Journal / v.8 no.4 / pp.80-84 / 2019
  • As AMI deployment rapidly expands, a growing variety of services make use of electricity-usage data. To make these services more effective, missing metering data must be corrected. We propose a correction method that uses Euclidean similarity to find customers with similar usage patterns, and compare it with preceding methods.
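
The similarity-based correction described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' method: the function name `impute_by_similarity`, the parameter `k`, the restriction of donors to customers with complete readings, and the sample data are all assumptions made for the example.

```python
import numpy as np

def impute_by_similarity(usage, target, k=3):
    """Fill NaNs in usage[target] from the k most similar customers.

    usage: 2-D float array, rows = customers, columns = time slots, NaN = missing.
    """
    row = usage[target]
    observed = ~np.isnan(row)
    # Candidate donors: other customers with no gaps at all (simplifying assumption).
    complete = ~np.isnan(usage).any(axis=1)
    complete[target] = False
    donors = np.flatnonzero(complete)
    if donors.size == 0:
        raise ValueError("no complete donor rows available")
    # Euclidean distance over the slots the target customer actually observed.
    dists = np.linalg.norm(usage[donors][:, observed] - row[observed], axis=1)
    nearest = donors[np.argsort(dists)[:k]]
    # Replace each missing slot with the mean of the nearest donors' readings.
    filled = row.copy()
    filled[~observed] = usage[nearest][:, ~observed].mean(axis=0)
    return filled

# Hypothetical usage data: customer 1 is missing its third reading.
usage = np.array([
    [1.0, 2.0, 3.0, 4.0],
    [1.0, 2.0, np.nan, 4.0],
    [1.1, 2.1, 3.2, 4.1],
    [9.0, 9.0, 9.0, 9.0],
])
print(impute_by_similarity(usage, target=1, k=2))
```

Computing the distance only over observed slots lets a customer with gaps still be matched against others; a production version would also need to handle donors that have gaps of their own.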

Considering of the Rainfall Effect in Missing Traffic Volume Data Imputation Method (누락교통량자료 보정방법에서 강우의 영향 고려)

  • Kim, Min-Heon;Oh, Ju-Sam
    • The Journal of The Korea Institute of Intelligent Transport Systems / v.14 no.2 / pp.1-13 / 2015
  • Traffic volume data is basic information used in a wide variety of fields. Existing imputation methods for missing traffic volume data do not take the effect of rainfall into account. This research examined how to consider the rainfall effect when imputing missing traffic volume data. To do so, the following assumption was made: when traffic volume data are missing on a rainy day, it is more accurate to impute them using only traffic volume data from past rainy days. To test this assumption, the accuracy of the imputed results was compared across three imputation methods (unconditional mean, autoregression, and the expectation-maximization algorithm). The analysis showed that errors were lower when the rainfall effect was considered.
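
The rainy-day assumption can be illustrated with the simplest of the three methods, the unconditional mean. The sketch below is an assumption-laden example rather than the paper's procedure: the record layout `(hour, rainy, volume)`, the function name `impute_volume`, the fallback to all days when no matching day exists, and the sample values are hypothetical.

```python
from statistics import mean

def impute_volume(history, hour, rainy):
    """Mean volume at `hour` over past days with the same rain condition.

    history: list of (hour, rainy, volume) tuples from past days.
    Falls back to all past days at that hour if none match the condition.
    """
    same_hour = [(r, v) for h, r, v in history if h == hour]
    matched = [v for r, v in same_hour if r == rainy]
    pool = matched or [v for _, v in same_hour]
    if not pool:
        raise ValueError("no historical data for this hour")
    return mean(pool)

# Hypothetical hourly volumes from past days.
history = [
    (8, False, 1000.0),
    (8, False, 1100.0),
    (8, True, 800.0),
    (8, True, 820.0),
    (9, True, 700.0),
]
print(impute_volume(history, hour=8, rainy=True))   # uses rainy days only
print(impute_volume(history, hour=8, rainy=False))  # uses dry days only
```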

Development and Application of Imputation Technique Based on NPR for Missing Traffic Data (NPR기반 누락 교통자료 추정기법 개발 및 적용)

  • Jang, Hyeon-Ho;Han, Dong-Hui;Lee, Tae-Gyeong;Lee, Yeong-In;Won, Je-Mu
    • Journal of Korean Society of Transportation / v.28 no.3 / pp.61-74 / 2010
  • Intelligent transportation systems (ITS) collect real-time traffic data and accumulate vast historical data, but this enormous archive has not been managed or used efficiently. With the introduction of data management systems such as ADMS (Archived Data Management System), the potential of large historical datasets has grown dramatically. However, traffic data in any data management system inherently include missing values, which have been a major obstacle to using the data because they often render an entire dataset useless. Imputation techniques therefore play a key role in data management systems. This paper presents an imputation technique that can be mounted in data management systems and robustly estimates the missing values in historical data. The model, based on the non-parametric regression (NPR) approach, exploits various traffic data patterns in historical data and is designed for practical requirements such as minimizing the number of parameters, computational speed, imputation of various types of missing data, and multiple imputation. The model was tested under various missing-data conditions. The results showed that it outperforms previously reported approaches in prediction accuracy and meets the computational speed required for deployment in traffic data management systems.
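
The abstract does not spell out the model, so the following is only a generic k-nearest-neighbour form of non-parametric regression, shown to make the idea concrete. The function name `npr_estimate`, the inverse-distance weighting, the parameter `k`, and the sample series are assumptions for illustration and should not be read as the paper's design.

```python
import numpy as np

def npr_estimate(history, recent, k=3):
    """Estimate the value that follows `recent` from similar windows in `history`."""
    history = np.asarray(history, dtype=float)
    recent = np.asarray(recent, dtype=float)
    lag = recent.size
    # Every length-`lag` window in the history that has a following value.
    windows = np.lib.stride_tricks.sliding_window_view(history[:-1], lag)
    followers = history[lag:]
    # Distance between the recent pattern and each historical pattern.
    dists = np.linalg.norm(windows - recent, axis=1)
    nearest = np.argsort(dists)[:k]
    # Closer historical patterns get larger weights (assumed weighting scheme).
    weights = 1.0 / (dists[nearest] + 1e-9)
    return float(np.average(followers[nearest], weights=weights))

# Hypothetical repeating traffic pattern; estimate the value after [10, 20].
history = [10, 20, 30, 10, 20, 30, 10, 20, 30]
print(npr_estimate(history, recent=[10, 20], k=2))
```

No parameters are fitted here beyond `k` and the window length, which is in line with the "minimization of parameters" requirement named in the abstract.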

Missing Data Modeling based on Matrix Factorization of Implicit Feedback Dataset (암시적 피드백 데이터의 행렬 분해 기반 누락 데이터 모델링)

  • Ji, JiaQi;Chung, Yeongjee
    • Journal of the Korea Institute of Information and Communication Engineering / v.23 no.5 / pp.495-507 / 2019
  • Data sparsity is one of the main challenges for recommender systems. A recommender system holds massive data of which only a small part is observed; the rest is missing. Most studies assume that data are missing at random, so they train the recommendation model on observed data only and then recommend items to users. In practice, however, data are not missing at random. In this research, we treat missing data as negative examples of users' interest. Three sampling methods are integrated into the SVD++ algorithm, yielding the SVD++_W, SVD++_R and SVD++_KNN algorithms. Experimental results show that the proposed sampling methods improve precision in Top-N recommendation over the baseline algorithms. Among the three, SVD++_KNN performs best, which indicates that the KNN sampling method is a more effective way to extract negative examples of users' interest.
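
Treating missing entries as negative examples can be sketched with plain uniform random sampling. This is an illustrative assumption and not necessarily how any of the three proposed sampling methods works; the function name `sample_negatives`, the `ratio` parameter, and the sample interactions are hypothetical.

```python
import random

def sample_negatives(observed, n_items, ratio=1, seed=0):
    """Pick unobserved items per user to serve as negative examples.

    observed: dict mapping user -> set of item ids the user interacted with.
    ratio: negatives drawn per observed (positive) item.
    """
    rng = random.Random(seed)
    negatives = {}
    for user, items in observed.items():
        # Only items the user never interacted with are candidates.
        candidates = [i for i in range(n_items) if i not in items]
        n = min(len(candidates), ratio * len(items))
        negatives[user] = set(rng.sample(candidates, n))
    return negatives

# Hypothetical implicit-feedback data over 5 items.
observed = {"u1": {0, 1}, "u2": {2}}
negatives = sample_negatives(observed, n_items=5)
print(negatives)
```

The sampled negatives would then be added to the training set alongside the observed positives before fitting the factorization model.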

Methods for screening time series data according to data quality and statistical status (품질 및 조건 기반 시계열 데이터 선별 활용 방법)

  • Moon, JaeWon;Yu, MiSeon;Oh, SeungTaek;Kum, SeungWoo;Hwang, JiSoo;Lee, JiHoon
    • Proceedings of the Korean Society of Computer Information Conference / 2022.01a / pp.399-402 / 2022
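  • This paper introduces a method for screening incomplete time series data before use. The quality of time series data depends on variable conditions such as temporal changes in the collection network and collection devices, so anomalous or missing data occur irregularly. Discarding data wholesale simply because they contain errors, or restoring intervals of missing data unconditionally, can lead to undesired results. The proposed method extracts statistical information about the missing data in each interval of the time series and, on that basis, judges data that do not meet the purpose of use and the required quality criteria to be unusable, automatically excluding them in advance from analysis and other uses; this structure was proposed and tested experimentally. The proposed method enables a fast decision, adapted to the purpose and situation of use, on whether data containing missing values can be used, and can yield better analysis results.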
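
The screening step can be sketched as a check of per-segment missing-data statistics against quality thresholds. The statistics chosen here (missing ratio and longest consecutive gap), the threshold values, and the function names `missing_stats` and `is_usable` are assumptions for illustration; the paper's actual criteria are not given in the abstract.

```python
import math

def missing_stats(values):
    """Return (missing ratio, longest consecutive missing run) for one segment."""
    longest = run = missing = 0
    for v in values:
        if v is None or (isinstance(v, float) and math.isnan(v)):
            missing += 1
            run += 1
            longest = max(longest, run)
        else:
            run = 0
    # An empty segment is treated as entirely missing.
    ratio = missing / len(values) if values else 1.0
    return ratio, longest

def is_usable(values, max_ratio=0.2, max_gap=3):
    """Accept a segment only if both statistics are within the thresholds."""
    ratio, longest = missing_stats(values)
    return ratio <= max_ratio and longest <= max_gap

# Hypothetical segments: one with a single gap, one with a long outage.
seg_ok = [1.0, None, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
seg_bad = [1.0, None, None, None, None, 2.0]
print(is_usable(seg_ok), is_usable(seg_bad))
```

Making the thresholds arguments rather than constants reflects the abstract's point that the acceptable quality depends on the purpose of use.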

A Cosideration on Physical Aspects in Teleradiotherapy Chart QA (원격방사선치료 기록부의 QA 에서 물리적 측면의 고찰)

  • 강위생;허순녕
    • Progress in Medical Physics / v.10 no.2 / pp.95-101 / 1999
  • The aims of this report are to classify the incorrect patient data and the errors of dose and dose distribution observed in QA activities on teleradiotherapy charts, and to analyze their frequency. In our department, radiation physicists check several sheets of the patient chart (history, port diagram, MU calculation or treatment planning summary, and daily treatment sheet) before starting radiation therapy and at least once a week, to reduce numeric errors. The observed errors are classified as follows: 1) identity of patient; 2) omitted or unrecorded history sheet, even where it does not include items related to dose; 3) omission of port diagram, or omitted or erroneous data in it; 4) erroneous calculation of MU and point dose, and important causes; 5) loss of the treatment planning summary sheet, and erroneous patient data in the sheet; 6) erroneous record of radiation therapy, and errors of daily dose, port setup, MU and accumulated dose in the daily treatment sheet; 7) errors leading to inexact dose or dose distribution, errors that could have done so but were not administered, and simple recording errors; 8) omission of signature. The number of errors was counted rather than the number of patients. In radiotherapy chart QA from Jun 17, 1996 to Jul 31, 1999, no error of patient identity was observed. A total of 431 errors were observed in 399 patient charts: 405 physical errors, 9 cases of omitted or unrecorded history sheet, and 17 unsigned. There were 23 cases (5.7%) of omitted port diagram, 21 cases (5.2%) of omitted data and 73 cases (18.0%) of erroneous data in port diagram, 13 cases (3.2%) treated without MU calculation, 68 cases (16.3%) of erroneous MU, 8 cases (2.0%) of erroneous point dose, 1 case (0.2%) of omitted treatment planning summary, 11 cases (2.7%) of erroneous input of patient data, 13 cases (3.2%) of uncorrected record of treatment, 20 cases (4.9%) of discordant daily doses between the MU calculation sheet and the daily treatment sheet, 33 cases (8.1%) of erroneous setup, 52 cases (12.8%) of MU setting error, and 61 cases (15.1%) of erroneous accumulated dose. Errors leading to inexact dose or dose distribution numbered 239 (59.0%), errors that could have done so but were not administered numbered 142 (35.1%), and simple recording errors numbered 24 (5.9%). The numeric errors observed in radiotherapy charts ranged over various items. Because the observed errors can actually contribute to erroneous dose or dose distribution, or have the potential to lead to such errors, thorough QA of the physical aspects of radiotherapy charts is required.

A STUDY ON THE ROLL-ALONG TECHNIQUE USED IN 2D ELECTRICAL RESISTIVITY SURVEYS (2차원 전기비저항 탐사에 사용되는 ROLL-ALONG 기법에 대한 고찰)

  • Han, Won-Seok;Yoon, Jong-Ryeol
    • Journal of the Korean Geophysical Society / v.6 no.3 / pp.155-164 / 2003
  • The validity and efficiency of the roll-along technique widely used in 2-D electrical resistivity surveys are analyzed for the dipole-dipole and Wenner-Schlumberger arrays by numerical modelling. Shallow anomalous resistivity bodies are successfully inverted with both arrays because the roll-along technique does not omit the shallow data of the pseudosection. Deep anomalous resistivity bodies, however, cannot be well resolved because observed data are skipped; this is more significant for the Wenner-Schlumberger array, whose horizontal data coverage is relatively poor. In a survey with the dipole-dipole array, the skipped data are insignificant because the low S/N ratio makes it unfeasible to expand the electrodes to the maximum electrode separation coefficient ($n_{max}$). With the Wenner-Schlumberger array, however, the high S/N ratio generally makes it feasible to expand the electrodes to $n_{max}$, so data skipped by the roll-along technique are highly likely to distort the inversion results significantly. Therefore, when adopting the Wenner-Schlumberger array, which has a deeper median depth (Edwards, 1977) than the dipole-dipole array for the same unit electrode spacing ($a$) and $n_{max}$, it is recommended to determine $a$ based not on $n_{max}$ but on $n_{prob}$, which is free from skipped data, and to move the electrodes forward while keeping an overlap of 3/4 of the survey line length, in order to reduce distortion of the resistivity structure and perform the survey efficiently. These results are confirmed by numerical modelling.

A Study on the Development of a Technique to Predict Missing Travel Speed Collected by Taxi Probe (결측 택시 Probe 통행속도 예측기법 개발에 관한 연구)

  • Yoon, Byoung Jo
    • KSCE Journal of Civil and Environmental Engineering Research / v.31 no.1D / pp.43-50 / 2011
  • The monitoring system for link travel speed using taxi probes is one of the key sub-systems of ITS. Link travel speed collected by taxi probes has been widely employed both for monitoring the traffic state of urban road networks and for providing real-time travel time information. When the taxi probe sample is small and the link travel time is longer than the data collection interval, a missing state is inevitable. In a missing state, link travel speed data are not collected in real time. A missing state can span from a single time interval to multiple intervals, and existing single-interval prediction techniques cannot generate multiple future states. It is therefore necessary to replace multiple missing states with estimates generated by a multi-interval prediction method. This study introduces a multi-interval prediction method that generates speed estimates for single and multiple future time steps, overcoming the shortcomings of short-term techniques. The model is based on non-parametric regression (NPR) and outperformed single-interval prediction methods in prediction accuracy despite its multi-interval prediction scheme.

A Study on the Technique of Real-time Process for the Sections with Missed GPS Traffic Data (GPS 교통 정보 누락 구간의 실시간 처리 기법에 관한 연구)

  • Choi, Jin-Woo;Kim, Tae-Min;Park, Won-Sik;Yang, Young-Kyu
    • 한국공간정보시스템학회:학술대회논문집 / 2007.06a / pp.177-182 / 2007
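  • In telematics, methods of collecting traffic information through probe cars equipped with GPS receivers have recently been actively studied. Compared with the fixed detectors previously used to collect traffic information, this approach yields more reliable information, is less sensitive to road conditions, and can be operated at low maintenance cost. However, because a probe car must transmit its location to the traffic information center, privacy may be exposed, and it cannot send travel information while parked. For these reasons, public transit vehicles and commercial vehicles are mainly used as probe cars, but they are few in number and their routes are not evenly distributed, so there may be sections that no probe car passes through, that is, sections with missing traffic information. This paper describes methods for handling such sections, including substitution with historical records, prediction from information on neighboring road sections, and prediction through regression analysis. Each method is tested on data collected by actual probe cars on the Gangnam-daero section in Seoul, and the results are compared and analyzed.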

Development of data supplementation algorithm of sewerage system for urban inundation modelling (도시홍수 모의를 위한 하수관망 자료 보정 알고리즘 개발)

  • Lee, Seung Soo;An, Hyun UK
    • Proceedings of the Korea Water Resources Association Conference / 2019.05a / pp.63-63 / 2019
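  • Various studies have recently been conducted to reduce inundation damage in urban areas caused by climate change, and GIS-based sewer network data are increasingly important as basic data for inundation analysis. However, most such sewer network data are produced and managed by administrative units at the local-government level and were created with a focus on sewer maintenance, so in many cases the attribute data needed for inundation analysis are missing. Because the original purpose of the data does not match their use in inundation analysis, attribute values are absent or essential information for inundation modelling is missing, and individual researchers currently carry out their own supplementation before using the data. Such simplification or supplementation of the sewer network based on researchers' subjective judgment can increase data uncertainty, and the results of inundation analysis are strongly affected by the researcher's skill and background knowledge. A basic algorithm is therefore urgently needed that converts GIS-based sewer network data into input usable for inundation simulation while excluding individual researchers' subjective intervention as far as possible. This study examined the format of the GIS-based sewer network data for the basin near Sadang Station in Seoul and the Oncheoncheon basin in Busan, and developed an algorithm to supplement the missing data. Sewer network data supplemented with the developed algorithm are expected to help minimize the uncertainty of sewer network data in urban inundation analysis by excluding the subjective judgment of individual researchers.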
