• Title/Summary/Keyword: machine data


Learning data preprocessing technique for improving indoor positioning performance based on machine learning (기계학습 기반의 실내 측위 성능 향상을 위한 학습 데이터 전처리 기법)

  • Kim, Dae-Jin;Hwang, Chi-Gon;Yoon, Chang-Pyo
    • Journal of the Korea Institute of Information and Communication Engineering / v.24 no.11 / pp.1528-1533 / 2020
  • Recently, indoor location recognition technology using Wi-Fi fingerprints has been applied in various industrial fields and public services. With growing interest in machine learning, location recognition based on machine learning using wireless signal data around a terminal is developing rapidly. However, when the radio signal data required for machine learning are collected, distorted data or data unsuitable for learning lower the accuracy of location recognition. In addition, when location recognition is based on data collected at a specific location, problems occur at surrounding locations that were not included in the learning. In this paper, we propose a learning data preprocessing technique that yields improved position recognition results by preprocessing the collected learning data.

A Reconstruction of Classification for Iris Species Using Euclidean Distance Based on a Machine Learning (머신러닝 기반 유클리드 거리를 이용한 붓꽃 품종 분류 재구성)

  • Nam, Soo-Tai;Shin, Seong-Yoon;Jin, Chan-Yong
    • Journal of the Korea Institute of Information and Communication Engineering / v.24 no.2 / pp.225-230 / 2020
  • Machine learning is a class of algorithms that trains a computer on data so that the computer can identify trends in the data and predict the output for new input data. Machine learning can be classified into supervised learning, unsupervised learning, and reinforcement learning. Supervised learning trains a machine on data with given labels. In other words, a function of the system is inferred from pairs of data and labels, and the inferred function is used to predict the result for new input data. If the predicted value is continuous, the task is regression; if it is discrete, the task is classification. As a result of the analysis, no. 8 (5, 3.4, setosa), 27 (5, 3.4, setosa), 41 (5, 3.5, setosa), 44 (5, 3.5, setosa) and 40 (5.1, 3.4, setosa) in Table 3 were classified as the most similar Iris flowers. Accordingly, theoretical and practical implications are suggested.
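
As a rough illustration of the Euclidean-distance comparison this abstract describes, the following sketch ranks samples by their distance to a query point. The five rows are the ones quoted in the abstract; treating the two numbers as a pair of measurements, the query point, and the function names `euclidean` and `rank_by_distance` are assumptions made for this example and are not taken from the paper.

```python
import math

# Rows quoted in the abstract: (sample no., measurement 1, measurement 2, species).
SAMPLES = [
    (8, 5.0, 3.4, "setosa"),
    (27, 5.0, 3.4, "setosa"),
    (41, 5.0, 3.5, "setosa"),
    (44, 5.0, 3.5, "setosa"),
    (40, 5.1, 3.4, "setosa"),
]


def euclidean(a, b):
    """Euclidean distance between two equal-length numeric sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def rank_by_distance(query, samples):
    """Return (distance, sample no., species) tuples sorted by distance to query."""
    ranked = [
        (euclidean(query, (m1, m2)), no, species)
        for no, m1, m2, species in samples
    ]
    return sorted(ranked)


# Hypothetical query point, chosen only for illustration.
query = (5.0, 3.4)
for dist, no, species in rank_by_distance(query, SAMPLES):
    print(f"no. {no}: distance {dist:.3f} ({species})")
```

Samples at distance zero from the query come first; the remaining rows differ from it by about 0.1 in one measurement.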

Machine learning in survival analysis (생존분석에서의 기계학습)

  • Baik, Jaiwook
    • Industry Promotion Research / v.7 no.1 / pp.1-8 / 2022
  • We investigated various machine learning methods that can be applied to censored data. Exploratory data analysis reveals the distribution of each feature and the relationships among features. Next, a classification problem was set up in which the dependent variable is death_event and the remaining features are independent variables. After applying various machine learning methods to the data, we found that, as in many other reports from the artificial intelligence field, random forest performs better than logistic regression. However, artificial neural networks and gradient boosting, which have performed well recently, did not perform as expected because of the lack of data. Finally, the Kaplan-Meier estimator and the Cox proportional hazards model were employed to explore the relationship of the dependent variable (t_i, δ_i) with the independent variables. Random forest, which is used in machine learning, was also applied to survival analysis with censored data.
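
To make the (t_i, δ_i) notation concrete, here is a minimal sketch of the Kaplan-Meier product-limit estimator in plain Python. It is not the authors' code: the function name `kaplan_meier` and the toy data are hypothetical, with δ_i = 1 meaning the event was observed and δ_i = 0 meaning the observation was censored.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time where an event occurred.

    times  : observed time t_i for each subject
    events : delta_i, 1 if the event was observed, 0 if censored
    """
    pairs = sorted(zip(times, events))
    n_at_risk = len(pairs)
    survival = 1.0
    curve = []
    i = 0
    while i < len(pairs):
        t = pairs[i][0]
        deaths = 0
        removed = 0
        # Group all subjects that share the same observed time.
        while i < len(pairs) and pairs[i][0] == t:
            deaths += pairs[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            # Product-limit step: multiply by the fraction surviving time t.
            survival *= 1.0 - deaths / n_at_risk
            curve.append((t, survival))
        # Events and censored subjects both leave the risk set after t.
        n_at_risk -= removed
    return curve


# Toy data, invented for illustration only.
toy_times = [2, 3, 3, 5, 8]
toy_events = [1, 1, 0, 1, 0]
for t, s in kaplan_meier(toy_times, toy_events):
    print(f"t = {t}: S(t) = {s:.3f}")
```

Censored subjects never lower the curve directly; they only shrink the risk set used at later event times.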

The Use of Unsupervised Machine Learning for the Attenuation of Seismic Noise (탄성파 자료 잡음 제거를 위한 비지도 학습 연구)

  • Kim, Sujeong;Jun, Hyunggu
    • Geophysics and Geophysical Exploration / v.25 no.2 / pp.71-84 / 2022
  • When seismic data are acquired, various types of simultaneously recorded seismic noise hinder accurate interpretation. It is therefore essential to attenuate this noise during seismic data processing and to conduct research on seismic noise attenuation. Machine learning is extensively used for this purpose. This study attempts to attenuate noise in prestack seismic data using unsupervised machine learning. Three unsupervised machine learning models, N2NUNET, PATCHUNET, and DDUL, are trained and applied to synthetic and field prestack seismic data to attenuate the noise and leave clean seismic data. The results were analyzed qualitatively and quantitatively and demonstrate that all three unsupervised learning models succeeded in removing seismic noise from both synthetic and field data. Of the three, the N2NUNET model performed the worst, and the PATCHUNET and DDUL models produced almost identical results, although the DDUL model performed slightly better.

Vacant House Prediction and Important Features Exploration through Artificial Intelligence: In Case of Gunsan (인공지능 기반 빈집 추정 및 주요 특성 분석)

  • Lim, Gyoo Gun;Noh, Jong Hwa;Lee, Hyun Tae;Ahn, Jae Ik
    • Journal of Information Technology Services / v.21 no.3 / pp.63-72 / 2022
  • The extinction crisis of local cities, caused by rising population density in the capital region, directly increases the number of vacant houses in local cities. According to the population and housing census, the number of vacant houses in Gunsan-si increased continuously from 2015 to 2019. In particular, because Gunsan-si suffers from the doughnut effect and industrial decline, its vacant house problems appear to be worsening. This study aims to provide the foundation for a system that can predict and manage buildings at high risk of becoming vacant by implementing a data-driven machine learning model for vacant house prediction. Methodologically, this study analyzes three types of machine learning models that differ in their data components. The first model is trained on building register, individually declared land value, house price, and socioeconomic data; the second model is trained on the same data as the first with additional POI (Point of Interest) data. The third model is trained on the same data as the second but excludes water usage and electricity usage data. The second model shows the best performance in terms of F1-score. Random Forest, Gradient Boosting Machine, XGBoost, and LightGBM, which are tree ensemble methods, show the best performance overall. Additionally, model complexity can be reduced by eliminating independent variables whose correlation coefficient with vacant house status is lower than 0.1 in absolute value. Finally, this study suggests XGBoost- and LightGBM-based machine learning models, which can handle missing values, as the final vacant house prediction models.
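
The correlation-based variable elimination mentioned in this abstract can be sketched as follows. This is an assumption-laden illustration, not the study's pipeline: Pearson correlation is assumed, the feature names and values are invented, and `pearson` and `select_features` are hypothetical helper names. Only the 0.1 absolute-value threshold comes from the abstract.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def select_features(features, target, threshold=0.1):
    """Keep features whose |correlation| with the target is at least threshold."""
    return [
        name
        for name, values in features.items()
        if abs(pearson(values, target)) >= threshold
    ]


# Invented toy data: 1 = vacant, 0 = occupied.
vacant = [1, 0, 1, 0, 1, 0]
features = {
    "building_age": [40, 10, 35, 12, 50, 8],
    "water_usage": [1, 9, 2, 8, 0, 10],
    "unrelated_code": [3, 3, 4, 4, 5, 5],
}
print(select_features(features, vacant))
```

In this toy data `unrelated_code` has zero correlation with the target, so it is the only feature dropped.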

Water level forecasting for extended lead times using preprocessed data with variational mode decomposition: A case study in Bangladesh

  • Shabbir Ahmed Osmani;Roya Narimani;Hoyoung Cha;Changhyun Jun;Md Asaduzzaman Sayef
    • Proceedings of the Korea Water Resources Association Conference / 2023.05a / pp.179-179 / 2023
  • This study suggests a new approach to water level forecasting for extended lead times that preprocesses the original data with variational mode decomposition (VMD). Two machine learning algorithms, light gradient boosting machine (LGBM) and random forest (RF), were considered for forecasting water levels at extended lead times (i.e., 5, 10, 15, 20, 25, 30, 40, and 50 days). First, the original data at two water level stations (i.e., SW173 and SW269 in Bangladesh) and their decomposed data from VMD were prepared with antecedent lag times for analysis in datasets of different lead times. Mean absolute error (MAE), root mean squared error (RMSE), and mean squared error (MSE) were used to evaluate the performance of the machine learning models in water level forecasting. The results show that the errors were minimized when the decomposed datasets were used to predict water levels rather than the original data alone. LGBM also produced lower MAE, RMSE, and MSE values than RF, indicating better performance. For instance, at the SW173 station with a 30-day lead time, LGBM outperformed RF on both decomposed and original data, with MAE values of 0.511 and 1.566 compared to RF's MAE values of 0.719 and 1.644, respectively. The models' performance decreased with increasing lead time. In summary, preprocessing the original data and using machine learning models with decomposition techniques showed promising results for water level forecasting at longer lead times. The approach of this study is expected to assist water management authorities in taking precautionary measures based on forecasted water levels, which is crucial for sustainable water resource utilization.
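
The three error metrics named in this abstract follow their standard definitions, sketched below in plain Python. The observed and predicted water levels are invented for illustration and are unrelated to the SW173 or SW269 stations.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mse(y_true, y_pred):
    """Mean squared error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error, the square root of the MSE."""
    return math.sqrt(mse(y_true, y_pred))


# Invented toy values, not data from the study.
observed = [5.0, 6.0, 7.0, 8.0]
predicted = [5.5, 5.5, 7.0, 9.0]
print(f"MAE  = {mae(observed, predicted):.3f}")
print(f"MSE  = {mse(observed, predicted):.3f}")
print(f"RMSE = {rmse(observed, predicted):.3f}")
```

Because RMSE is a monotone function of MSE, the two always rank models identically; MAE can rank them differently when a few large errors dominate.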


Design of Virtual Machine for Vertex Shader (정점 셰이더의 가상 기계 구현)

  • Ha, Chang-Soo;Kim, Ju-Hong;Choi, Byeong-Yoon
    • Proceedings of the IEEK Conference / 2005.11a / pp.1003-1006 / 2005
  • The vertex shader of a GPU in a personal computer has advanced in function to the point of accounting for half of the traditional fixed T&L functions, and the memory capacity for storing the resources needed to process instructions is unlimited. A GPU that can be programmed by the programmer is needed for mobile systems as well as personal computers. In this paper, we implement a software virtual machine for the vertex shader in C++. Our goal is to design a hardware GPU that can be applied to mobile systems. The virtual machine consists of nVidia GPU instructions. The input data to the virtual machine are generated by the Microsoft fxc compiler; that is, the input data are compiled shader programs written in HLSL, Cg, or ASM. The virtual machine will serve as a reference model for designing the hardware GPU and can be used as a testbed for added or modified instructions.


Wearable Sensor-Based Biometric Gait Classification Algorithm Using WEKA

  • Youn, Ik-Hyun;Won, Kwanghee;Youn, Jong-Hoon;Scheffler, Jeremy
    • Journal of information and communication convergence engineering / v.14 no.1 / pp.45-50 / 2016
  • Gait-based classification has gained much interest as a possible authentication method because it incorporates an intrinsic personal signature that is difficult to mimic. The study investigates machine learning techniques to mitigate the natural variations in gait among different subjects. We incorporated several machine learning algorithms into this study using the data mining package called Waikato Environment for Knowledge Analysis (WEKA). WEKA's convenient interface enabled us to apply various sets of machine learning algorithms to understand whether each algorithm can capture certain distinctive gait features. First, we defined 24 gait features by analyzing three-axis acceleration data, and then selectively used them for distinguishing subjects 10 years of age or younger from those aged 20 to 40. We also applied a machine learning voting scheme to improve the accuracy of the classification. The classification accuracy of the proposed system was about 81% on average.
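
The abstract does not specify which voting scheme was used, so the sketch below assumes the simplest variant, a plain majority vote over per-classifier predictions. The function name `majority_vote`, the labels, and the three prediction lists are hypothetical and do not come from the study or from WEKA.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-classifier label lists into one list by majority vote.

    predictions: list of label lists, one per classifier, all the same length.
    """
    voted = []
    # zip(*predictions) yields, for each sample, the labels from all classifiers.
    for labels in zip(*predictions):
        voted.append(Counter(labels).most_common(1)[0][0])
    return voted


# Invented predictions from three hypothetical classifiers on four samples.
clf_a = ["child", "adult", "adult", "child"]
clf_b = ["child", "child", "adult", "adult"]
clf_c = ["adult", "adult", "adult", "child"]
print(majority_vote([clf_a, clf_b, clf_c]))
```

Using an odd number of classifiers with two classes avoids ties; with an even number, a tie-breaking rule would have to be chosen explicitly.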

Error Prediction Considering the Measurement Direction in OMM System (OMM 시스템에서 측정방향을 고려한 가공물의 오차평가)

  • 최진필;이상조;권혁동
    • Proceedings of the Korean Society of Precision Engineering Conference / 2002.05a / pp.632-635 / 2002
  • In this paper, a general procedure to determine machine tool errors from on-machine measurement (OMM) data is described. First, a parameterized error model of a machine tool is illustrated by approximating the error components as linear functions of axis positions, and a modified error model that includes backlash effects is proposed. To determine the unknown model coefficient vectors of the forward and backward error models, an artifact with 8 cubes is made and calibrated on a CMM. Then, the lower-left and upper-right cube corners are measured with a touch-trigger probe mounted on the machine tool spindle. The measured error data are used to determine the coefficient vectors. The positioning errors in the XY plane at a fixed z position are simulated for the forward and backward error models.


Part-Machine Grouping Using Production Data-based Part-Machine Incidence Matrix: Neural Network Approach - Part 2 (생산자료기반 부품-기계 행렬을 이용한 부품-기계 그룹핑 : 인공신경망 접근법 - Part 2)

  • Won, Yu-Gyeong
    • Proceedings of the Korean Operations and Management Science Society Conference / 2006.11a / pp.656-658 / 2006
  • This study deals with part-machine grouping (PMG) that considers realistic manufacturing factors, such as machine duplication, operation sequences with multiple visits to the same machine, and production volumes of parts. Basically, this study is an extension of Won (2006), which adopted a fuzzy ART neural network to group parts and machines. The proposed fuzzy ART neural network algorithm is implemented with an ancillary procedure that enhances the block diagonal solution by rearranging the order of input presentation. Computational experiments applied to large-size PMG data sets with a pseudo-replicated clustering procedure show the effectiveness of the proposed approach.
