• Title/Summary/Keyword: split data

Tree-Structured Nonlinear Regression

  • Chang, Young-Jae;Kim, Hyeon-Soo
    • The Korean Journal of Applied Statistics / v.24 no.5 / pp.759-768 / 2011
  • Tree algorithms have been widely developed for regression problems. One attractive feature of a regression tree is its flexibility of fit, which lets it capture nonlinearity in the data. In particular, data with sudden structural breaks, such as oil prices and exchange rates, can be fitted well with a simple mixture of a few piecewise linear regression models. Because split points are determined by chi-squared statistics computed from the residuals of the fitted piecewise linear models, and the split variable is chosen by an objective criterion, the resulting fit is reasonable and agrees with a visual interpretation of the data. Piecewise linear regression by a regression tree is therefore a useful fitting method and can be applied to datasets with large fluctuations.
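
The idea of fitting separate linear models on either side of a split point can be illustrated with a minimal sketch. It is not the authors' method: the abstract selects splits with chi-squared statistics on residuals, whereas this sketch simply tries every candidate cut and keeps the one with the lowest total squared error. The function names `fit_line` and `best_split`, the `min_leaf` parameter, and the toy series are all hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b, sse)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx if sxx > 0 else 0.0
    a = my - b * mx
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    return a, b, sse


def best_split(xs, ys, min_leaf=3):
    """Try each cut between sorted x values; keep the lowest total SSE.

    Assumption: exhaustive SSE search stands in for the chi-squared
    residual test described in the abstract.
    """
    pairs = sorted(zip(xs, ys))
    xs = [p[0] for p in pairs]
    ys = [p[1] for p in pairs]
    best = None
    for i in range(min_leaf, len(xs) - min_leaf + 1):
        if xs[i - 1] == xs[i]:
            continue  # no cut point exists between equal x values
        left = fit_line(xs[:i], ys[:i])
        right = fit_line(xs[i:], ys[i:])
        total = left[2] + right[2]
        if best is None or total < best[0]:
            cut = (xs[i - 1] + xs[i]) / 2
            best = (total, cut, left[:2], right[:2])
    return best


# Toy series with a structural break at x = 10 (level jump and slope change).
x = list(range(20))
y = [1.0 + 0.5 * t if t < 10 else 20.0 - 0.3 * t for t in x]
sse, cut, left_fit, right_fit = best_split(x, y)
print(cut, left_fit, right_fit)
```

A full regression tree would apply the same search recursively to each side and over several candidate split variables.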

A Study on Multi-Object Data Split Technique for Deep Learning Model Efficiency (딥러닝 효율화를 위한 다중 객체 데이터 분할 학습 기법)

  • Jong-Ho Na;Jun-Ho Gong;Hyu-Soung Shin;Il-Dong Yun
    • Tunnel and Underground Space / v.34 no.3 / pp.218-230 / 2024
  • Recently, many studies have applied computer vision to safety management at construction sites. Anchor box parameters are used in state-of-the-art deep learning-based object detection and segmentation, and optimizing them is critical for consistent accuracy during training. These parameters are generally tuned heuristically by the user, who fixes the shape and size, and a single parameter controls the training rate in the model. However, anchor box parameters are sensitive to the type and size of the objects, and this sensitivity grows as the amount of training data increases, so a single parameter cannot reflect all the characteristics of the training data. This paper therefore proposes applying multiple parameters optimized through data split. Criteria for efficiently segmenting the integrated training data by object size, number of objects, and object shape were established, and the effectiveness of the proposed data split method was verified through a comparative study of the conventional scheme and the proposed methods.

A Study on Comparison of Satellite-Tracked Drifter Temperature with Satellite-Derived Sea Surface Temperature of NOAA/NESDIS

  • Park, Kyung-Ae;Chung, Joug-Yul;Kim, Kuh;Choi, Byung-Ho
    • Korean Journal of Remote Sensing / v.10 no.2 / pp.83-107 / 1994
  • Sea surface temperatures (SSTs) estimated with the operational SST derivation equations of NOAA/NESDIS were compared with satellite-tracked drifter temperatures. After cloud-filled or contaminated pixels were eliminated through several cloud tests, 69 matchup points between the drifter temperatures and the SSTs estimated from NOAA-9, -10, -11, and -12 data from August 1993 to July 1994 were collected. Multi-channel sea surface temperature (MCSST) using a split window technique showed an rms error of approximately $1.0^{\circ}C$ relative to the drifting buoy temperatures for the 69 coincidences. The accuracy of satellite-derived SSTs was evaluated only for NOAA-11 AVHRR data, which had a relatively large number of matchups (35 points) compared with the other satellites. To compare the observed temperatures with the calculated SSTs, the linear MCSST and nonlinear cross product sea surface temperature (CPSST) algorithms were each applied with the split, dual, and triple window techniques. The split window CPSSTs showed the smallest rms error, $0.72^{\circ}C$. Differences between the split window SSTs and the drifter temperatures appeared to have a linear tendency against the drifter temperatures and also against the differences between AVHRR channel 4 and 5 brightness temperatures. This suggests that satellite-derived SSTs operationally calculated from the NOAA/NESDIS equations in the seas around Korea may have been underestimated relative to actual SSTs when the sea water temperature is relatively low or the atmosphere over the sea surface is very dry, as in winter, and overestimated when the temperature is high or the atmosphere is very moist. SST derivation equations based on local sea measurements around Korea, rather than global measurements, should therefore be derived.
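
The matchup comparison described above can be sketched in a few lines, assuming the generic linear split window form SST = a·T4 + b·(T4 − T5) + c, where T4 and T5 are the AVHRR channel 4 and 5 brightness temperatures. The coefficients and matchup values below are made up for illustration; they are not the NOAA/NESDIS operational coefficients, and operational equations may include further terms that are omitted here.

```python
import math


def split_window_sst(t4, t5, a, b, c):
    """Generic linear split window form: SST = a*T4 + b*(T4 - T5) + c."""
    return a * t4 + b * (t4 - t5) + c


def rms_error(estimates, references):
    """Root-mean-square difference between paired estimates and references."""
    diffs = [e - r for e, r in zip(estimates, references)]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))


def mean_bias(estimates, references):
    """Mean of (estimate - reference); negative means underestimation."""
    return sum(e - r for e, r in zip(estimates, references)) / len(estimates)


# Hypothetical coefficients and matchups, for illustration only.
A, B, C = 1.0, 2.5, 0.5
matchups = [  # (T4, T5, drifter temperature), all in deg C
    (14.0, 13.2, 16.9),
    (18.5, 17.3, 21.8),
    (22.0, 20.4, 26.9),
]
sst = [split_window_sst(t4, t5, A, B, C) for t4, t5, _ in matchups]
drifter = [d for _, _, d in matchups]
rms = rms_error(sst, drifter)
bias = mean_bias(sst, drifter)
print(rms, bias)
```

Regressing the per-matchup differences against the drifter temperature or against T4 − T5 would be the natural next step for checking the linear tendency the abstract reports.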

Spatial Statistic Data Release Based on Differential Privacy

  • Cai, Sujin;Lyu, Xin;Ban, Duohan
    • KSII Transactions on Internet and Information Systems (TIIS) / v.13 no.10 / pp.5244-5259 / 2019
  • With the continuous development of LBS (Location Based Service) applications, privacy protection has become an urgent problem. Differential privacy rests on strict mathematical theory and provides strong privacy guarantees under the assumption that the attacker has worst-case background knowledge; it has been applied to research directions such as data query, release, and mining. The difficulty lies in ensuring data availability while protecting privacy. Spatial multidimensional data are usually released by partitioning the domain into disjoint subsets and then generating a hierarchical index. Traditional data-dependent partition methods must allocate part of the privacy budget to the partitioning process and split that budget among all the steps, which is inefficient. To address this issue, a novel two-step partition algorithm is proposed. First, we partition the original dataset into fixed grids, inject noise, and synthesize a dataset according to the noisy counts. Second, we perform an IH-Tree (Improved H-Tree) partition on the synthetic dataset and use the resulting partition keys to split the original dataset. The algorithm saves the privacy budget otherwise allocated to the partitioning process and obtains a more accurate release. It was tested on three real-world datasets, and its accuracy was compared with that of state-of-the-art algorithms. The experimental results show that the relative errors of range queries are considerably reduced, especially on the large-scale dataset.
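
The first step of the two-step algorithm (fixed grid, noise injection, synthesis from noisy counts) can be sketched as follows. This is a minimal illustration under assumptions, not the paper's implementation: it assumes points in the unit square, uses the standard Laplace mechanism with sensitivity 1 for cell counts, and leaves out the IH-Tree partition of the second step entirely. All function names and parameter values are hypothetical.

```python
import random


def grid_counts(points, g):
    """Count points of the unit square in a fixed g x g grid."""
    counts = [[0] * g for _ in range(g)]
    for x, y in points:
        col = min(int(x * g), g - 1)
        row = min(int(y * g), g - 1)
        counts[row][col] += 1
    return counts


def laplace_noise(scale, rng):
    """Laplace(0, scale) sample as the difference of two exponential samples."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_grid(counts, epsilon, rng):
    """Add Laplace(1/epsilon) noise per cell (a cell count has sensitivity 1)."""
    return [[c + laplace_noise(1.0 / epsilon, rng) for c in row] for row in counts]


def synthesize(noisy, rng):
    """Draw round(max(noisy count, 0)) uniform points inside each cell."""
    g = len(noisy)
    synthetic = []
    for row in range(g):
        for col in range(g):
            n = max(0, round(noisy[row][col]))
            for _ in range(n):
                synthetic.append(((col + rng.random()) / g,
                                  (row + rng.random()) / g))
    return synthetic


rng = random.Random(0)  # fixed seed so the sketch is repeatable
points = [(rng.random(), rng.random()) for _ in range(500)]
counts = grid_counts(points, 4)
noisy = noisy_grid(counts, epsilon=1.0, rng=rng)
synthetic = synthesize(noisy, rng)
print(len(points), len(synthetic))
```

Clamping, rounding, and sampling operate only on the noisy counts, so any data-dependent partition built on `synthetic` reads nothing further from the original data; that is the property the abstract relies on to save budget.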

The Dynamic Split Policy of the KDB-Tree in Moving Objects Databases (이동 객체 데이타베이스에서 KDB-tree의 동적 분할 정책)

  • Lim Duk-Sung;Lee Chang-Heun;Hong Bong-Hee
    • Journal of KIISE:Databases / v.33 no.4 / pp.396-408 / 2006
  • Moving object databases manage a large amount of past location data, which accumulates over time. To retrieve the past locations of moving objects quickly, index structures that consider the features of moving objects are needed. The KDB-tree performs well in processing range queries. However, when the KDB-tree is used as an index structure for moving object databases, an over-split problem arises in the spatial domain, because in such databases the time domain keeps growing. Because the over-split problem shrinks the spatial regions in the MBRs of nodes in inverse proportion to the number of splits, the cost of processing spatial-temporal range queries increases. In this paper, we propose a dynamic split strategy for the KDB-tree to process spatial-temporal range queries efficiently. The dynamic split strategy uses the space priority splitting method for choosing the split domain, the recent time splitting policy for splitting a point page to maximize space utilization, and the last division policy for splitting a region page. We compare the performance of the proposed dynamic split strategy with the 3DR-tree, the MV3R-tree, and the KDB-tree. In our performance study for range queries, the number of node accesses in the MKDB-tree is on average 30% lower than in the compared index structures.

Random Channel Allocation Scheme Based on Split Algorithm in HIPERLAN 2 (HIPERLAN Type 2에서 Split 알고리즘에 기반한 랜덤채널 할당 기법)

  • 황의석;고유창;이승규;윤철식;이형우;조충호
    • The Journal of Korean Institute of Communications and Information Sciences / v.28 no.9A / pp.717-727 / 2003
  • HIPERLAN/2 (HIgh PERformance Local Area Network Type 2) is one of the wireless LAN standards, providing raw data rates of up to 54 Mbps. The MAC protocol of HIPERLAN/2 is based on TDMA/TDD, and resources in one MAC frame can be allocated dynamically by the Access Point (AP). The random channel (RCH) is defined to give a mobile terminal the opportunity to request transmission resources in the uplink MAC frames. It is desirable that the AP adapt the number of RCHs dynamically to the current traffic situation: allocating excessive RCHs may waste radio resources, while too few RCHs relative to the traffic load may result in many collisions in access attempts. We propose an RCH allocation scheme based on a split algorithm for HIPERLAN/2. The simulation and analytic results show that the proposed scheme achieves higher channel throughput and lower access delay and delay jitter than previously proposed RCH allocation schemes.
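
The abstract does not spell out its split algorithm, so the sketch below shows only the generic idea that splitting algorithms for random access share: when several terminals collide in a slot, the colliding group is divided at random and each subgroup retries in turn until every terminal has transmitted alone. It is a textbook-style binary tree splitting simulation, not the proposed RCH allocation scheme, and the names `resolve_collision` and `terminals` are hypothetical.

```python
import random


def resolve_collision(terminals, rng):
    """Resolve contention by binary splitting; returns (slots used, success order).

    Each loop iteration models one slot: an empty group is an idle slot,
    a single terminal succeeds, and two or more terminals collide and
    are split by fair coin flips into two subgroups that retry in turn.
    """
    slots = 0
    order = []
    stack = [list(terminals)]
    while stack:
        group = stack.pop()
        slots += 1
        if len(group) == 1:
            order.append(group[0])  # successful transmission
        elif len(group) > 1:
            first, second = [], []
            for t in group:
                (first if rng.random() < 0.5 else second).append(t)
            stack.append(second)  # retried after the first subgroup
            stack.append(first)
    return slots, order


rng = random.Random(1)  # fixed seed so the sketch is repeatable
slots, order = resolve_collision(["MT1", "MT2", "MT3", "MT4"], rng)
print(slots, order)
```

The slot count per resolved collision is the kind of quantity an AP could weigh when deciding how many RCHs to offer in the next frame.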

Study on the Split Hopkinson Pressure Bar Apparatus for Measuring High-strain Rate Tensile Properties of Plastic Material (플라스틱 소재의 고 변형률 인장특성 평가를 위한 홉킨스바(Split Hopkinson Pressure Bar) 측정 장비에 관한 연구)

  • Han, In-Soo;Lee, Se-Min;Kim, Kyu-Won;Kim, Hak-Sung
    • Composites Research / v.35 no.3 / pp.196-200 / 2022
  • The Split Hopkinson Pressure Bar (SHPB) is a common test apparatus for measuring the mechanical properties of high-modulus metals and composite materials at high strain rates. For soft plastic materials, however, it is difficult to hold the specimen and to achieve dynamic stress equilibrium because of the weak transmitted signals. In this study, an SHPB test apparatus was designed to measure accurately the high strain rate stress-strain curve of soft plastic materials by changing the incident bar material and the shape of the specimen holder parts. In addition, to verify the high strain rate tensile strain data obtained from the SHPB, the strain distribution of the specimen was measured and analyzed with a high-speed camera and digital image correlation (DIC), and compared with the strain history measured from the SHPB.

Retrieval of emissivity and land surface temperature from MODIS

  • Suh Myoung-Seok;Kang Jeon-Ho;Kim So-Hee;Kwak Chong-Heum
    • Proceedings of the KSRS Conference / 2005.10a / pp.165-168 / 2005
  • In this study, emissivity and land surface temperature (LST) were retrieved using previously developed algorithms and Aqua/MODIS data. The sensitivity of the estimated emissivity and LST to predefined values, such as land cover, normalized difference vegetation index (NDVI), and spectral emissivity, was also investigated. The methods used for emissivity and LST were the vegetation cover method (VCM) and four different split-window algorithms. The spectral emissivity retrieved by VCM was not sensitive to NDVI error but was more sensitive to land cover error. The comparison showed that the LSTs from the algorithms differed systematically regardless of land cover and season. The LST was very sensitive to emissivity error in all algorithms except that of Uliveri et al. This preliminary result indicates that more work is needed to retrieve reliable LST from satellite data.

Vehicle Trajectory Control using Fuzzy Logic Controller (퍼지논리제어기를 이용한 차량의 궤적제어)

  • 이승종;조현욱
    • Journal of the Korean Society for Precision Engineering / v.20 no.11 / pp.91-99 / 2003
  • When the driver suddenly depresses the brake pedal under critical conditions, the vehicle can deviate from its desired trajectory. In this study, a vehicle dynamics model and a fuzzy logic controller are used to control the vehicle trajectory. The dynamic vehicle model consists of the engine, the rotating wheels, the chassis, the tires, and the brakes. The engine model is derived from experimental engine data. The engine torque rotates the wheel and generates its angular velocity and acceleration. The dynamic equation of the vehicle model is derived from the top-view vehicle model using Newton's second law. The Pacejka tire model, formulated from experimental data, is used. The fuzzy logic controller is developed to compensate for the trajectory error of the vehicle. It acts individually on the front right, front left, rear right, and rear left brakes and regulates each brake torque. With the fuzzy logic controller regulating each brake to compensate for the trajectory error, the vehicle follows the desired trajectory under split-$\mu$ road conditions.

A Study on Photolysis of Aromatic Diazonium Salt (방향족 디아조늄염의 광분해에 관한 연구)

  • 이형관
    • Journal of the Korean Graphic Arts Communication Society / v.12 no.1 / pp.93-105 / 1994
  • A new ink transfer model based on the physical mechanism for the maximum ink transfer rate is proposed and examined against the experimental data of P. J. Mangin et al. on the relations of the maximum ink transfer rate to the printing pressure, the printing speed, and the roughness of paper substrates. The free ink split coefficient and the immobilized ink at the maximum ink transfer rate are calculated from the new model and the experimental data. It is concluded that the new model is very useful, that the free ink split coefficient is inversely proportional and the immobilized ink proportional to the paper roughness, and that both eventually saturate at critical values.
