• Title/Summary/Keyword: Generate Data


Class 1·3 Vehicle Classification Using Deep Learning and Thermal Image (열화상 카메라를 활용한 딥러닝 기반의 1·3종 차량 분류)

  • Jung, Yoo Seok;Jung, Do Young
    • The Journal of The Korea Institute of Intelligent Transport Systems
    • /
    • v.19 no.6
    • /
    • pp.96-106
    • /
    • 2020
  • To overcome the limitations of traffic monitoring with embedded sensors such as loop and piezo sensors, a thermal imaging camera was installed on the roadside. As Class 1 vehicles (passenger cars) become longer, it is increasingly difficult to distinguish them from Class 3 vehicles (2-axle trucks) with embedded sensors alone. The collected images were labeled to generate training data, yielding a total of 17,536 vehicle images (640x480 pixels). A CNN (Convolutional Neural Network) was used to classify vehicles from the thermal images. Despite the limited data volume and quality, a classification accuracy of 97.7% was achieved, demonstrating the feasibility of AI-based traffic monitoring. If more training data are collected in the future, 12-class classification will become possible. Moreover, AI-based traffic monitoring could classify not only the 12 standard classes but also new categories such as eco-friendly vehicles, vehicles in violation, and motorcycles, which can serve as statistical data for national policy, research, and industry.
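
A minimal sketch, in PyTorch, of the kind of two-class CNN the abstract describes for separating Class 1 from Class 3 thermal images; the architecture, layer sizes, and input handling below are illustrative assumptions rather than the authors' model.

```python
# Minimal sketch (not the authors' model): a small CNN that separates
# Class 1 (passenger car) from Class 3 (2-axle truck) thermal images.
# Layer sizes and settings are illustrative assumptions.
import torch
import torch.nn as nn

class ThermalVehicleCNN(nn.Module):
    def __init__(self, num_classes: int = 2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),            # collapse spatial dims regardless of input size
        )
        self.classifier = nn.Linear(64, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.classifier(self.features(x).flatten(1))

# A batch of four single-channel 640x480 thermal frames
model = ThermalVehicleCNN()
logits = model(torch.randn(4, 1, 480, 640))
print(logits.shape)  # torch.Size([4, 2])
```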

Generalization of the Extreme Floods for Various Sizes of Ungauged Watersheds Using Generated Streamflow Data (생성된 유량자료를 활용한 미계측유역 극한 홍수 범위 일반화)

  • Yang, Zhipeng;Jung, Yong
    • KSCE Journal of Civil and Environmental Engineering Research
    • /
    • v.42 no.5
    • /
    • pp.627-637
    • /
    • 2022
  • Knowing the magnitudes of extreme floods for watersheds of various sizes fundamentally requires large amounts of streamflow data. However, small and medium-sized watersheds often lack streamflow records because of missing gauge stations. In this study, the Streamflow Propagation Method (SPM) was applied to generate streamflow data for small and medium-sized watersheds with no measurements. Based on the streamflow data generated for ungauged watersheds at three locations (the Chungju Dam (CJD), Seomjin Dam (SJD), and Andong Dam (ADD) watersheds), the scale ranges of extreme floods were evaluated for ungauged watersheds of different sizes using specific flood distribution analysis. In general, the range of specific floods decreases with increasing watershed size. The distribution of specific floods among watersheds of the same size likely depends on the size and topography of the watershed area. The derived equations were compared to show the relationship between specific flood and watershed size. In this comparison, the Creager envelope curve showed higher potential to represent the maximum flood distribution for each watershed. To generalize the maximum flood distribution for the three watersheds, optimized envelope curves were obtained with a lower RMSE than that of the Creager envelope curve.
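
As one way to picture the envelope comparison, here is a minimal sketch of fitting a Creager-type envelope curve, q(A) = c1·A^(c2·A^(-c3) - 1), to specific flood versus watershed area and reporting an RMSE as the study does; the coefficients, units, and synthetic data are assumptions, not the paper's calibration.

```python
# Minimal sketch (illustrative data, not the paper's): fitting a Creager-type
# envelope relating specific flood q (m^3/s/km^2) to watershed area A (km^2)
# and evaluating the RMSE of the fit.
import numpy as np
from scipy.optimize import curve_fit

def creager_type(A, c1, c2, c3):
    """Specific flood as a function of watershed area (decreases with area)."""
    return c1 * A ** (c2 * A ** (-c3) - 1.0)

rng = np.random.default_rng(0)
A = np.linspace(50, 5000, 80)                                  # hypothetical watershed areas
q_obs = creager_type(A, 30.0, 0.9, 0.05) * (1 + 0.1 * rng.standard_normal(A.size))

popt, _ = curve_fit(creager_type, A, q_obs, p0=(20.0, 0.8, 0.05),
                    bounds=([1.0, 0.1, 0.001], [1000.0, 2.0, 0.5]))
rmse = np.sqrt(np.mean((creager_type(A, *popt) - q_obs) ** 2))
print("fitted (c1, c2, c3):", popt, "RMSE:", rmse)
```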

Automatic Drawing and Structural Editing of Road Lane Markings for High-Definition Road Maps (정밀도로지도 제작을 위한 도로 노면선 표시의 자동 도화 및 구조화)

  • Choi, In Ha;Kim, Eui Myoung
    • Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography
    • /
    • v.39 no.6
    • /
    • pp.363-369
    • /
    • 2021
  • High-definition road maps are used as basic infrastructure for autonomous vehicles, so the latest road information must be reflected quickly. However, the drawing and structural editing of high-definition road maps are currently performed manually, and generating road lane markings, the main construction target, takes the longest time. In this study, a point cloud of road lane markings, whose color types (white, blue, and yellow) were predicted with the PointNet model pre-trained in previous studies, was used as input data. Based on this point cloud, the study proposes a methodology for automatic drawing and structural editing of the road lane marking layer. To verify the usability of the 3D vector data constructed with the proposed methodology, accuracy was analyzed according to the quality inspection criteria for high-definition road maps. In the positional accuracy test of the vector data, the RMSE (Root Mean Square Error) of the horizontal and vertical errors was within 0.1 m, verifying suitability. In the structural editing accuracy test, the accuracies for road lane marking type and kind were each 88.235%, verifying usability. Therefore, the proposed methodology can efficiently construct road lane vector data for high-definition road maps.
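
The abstract does not spell out the drawing step; the sketch below shows one plausible way to turn color-classified lane-marking points into a 3D vector polyline by ordering them along their principal direction and fitting a smoothing spline. The function, parameters, and synthetic points are assumptions, not the paper's algorithm.

```python
# Minimal sketch (not the paper's pipeline): vectorizing one lane-marking
# segment from its classified point cloud.
import numpy as np
from scipy.interpolate import splprep, splev

def points_to_polyline(points: np.ndarray, n_vertices: int = 50, smooth: float = 0.5) -> np.ndarray:
    """points: (N, 3) x/y/z of one color-classified segment -> (n_vertices, 3) polyline."""
    centered = points - points.mean(axis=0)
    direction = np.linalg.svd(centered, full_matrices=False)[2][0]  # principal axis of the segment
    ordered = points[np.argsort(centered @ direction)]              # order points along the lane
    tck, _ = splprep(ordered.T, s=smooth)                           # smoothing spline through the points
    return np.vstack(splev(np.linspace(0.0, 1.0, n_vertices), tck)).T

# Hypothetical noisy points of one white lane marking along a gentle curve
rng = np.random.default_rng(1)
t = np.linspace(0, 30, 300)
pts = np.c_[t, 0.02 * t ** 2, np.zeros_like(t)] + rng.normal(0, 0.03, (300, 3))
print(points_to_polyline(pts).shape)   # (50, 3)
```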

Performance Analysis of Trading Strategy using Gradient Boosting Machine Learning and Genetic Algorithm

  • Jang, Phil-Sik
    • Journal of the Korea Society of Computer and Information
    • /
    • v.27 no.11
    • /
    • pp.147-155
    • /
    • 2022
  • In this study, we developed a system to dynamically balance a daily stock portfolio and performed trading simulations using gradient boosting and genetic algorithms. We collected various stock market data from stocks listed on the KOSPI and KOSDAQ markets, including investor-specific transaction data. We then indexed the data as a preprocessing step and used feature engineering to modify and generate variables for training. First, we experimentally compared the performance of three popular gradient boosting algorithms, XGBoost, LightGBM, and CatBoost, in terms of accuracy, precision, recall, and F1-score. Based on the results, in a second experiment we used a LightGBM model trained on the collected data, together with genetic algorithms, to predict and select stocks with a high daily probability of profit. We also simulated trading over the test period and compared the performance of the proposed approach with the KOSPI and KOSDAQ indices in terms of CAGR (Compound Annual Growth Rate), MDD (Maximum Draw Down), Sharpe ratio, and volatility. The results showed that the proposed strategies outperformed the Korean stock market indices on all performance metrics. Moreover, the proposed LightGBM model with a genetic algorithm exhibited competitive performance in predicting stock price movements.
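
A minimal sketch of the two-stage idea on synthetic data: train a LightGBM classifier to estimate a daily probability of profit, then let a tiny genetic algorithm tune the buy threshold on a validation split. The features, labels, and GA design are assumptions, not the paper's configuration.

```python
# Minimal sketch (synthetic data, not the paper's features or GA design).
import numpy as np
import lightgbm as lgb

rng = np.random.default_rng(42)
X = rng.standard_normal((5000, 20))                         # engineered daily features per stock
next_ret = 0.01 * (X[:, 0] - X[:, 1]) + 0.02 * rng.standard_normal(5000)
y = (next_ret > 0).astype(int)                              # 1 if the next-day return is positive

split = 4000                                                # time-ordered train/validation split
model = lgb.LGBMClassifier(n_estimators=200, learning_rate=0.05)
model.fit(X[:split], y[:split])
proba = model.predict_proba(X[split:])[:, 1]                # stage 1: daily probability of profit
ret_val = next_ret[split:]

def fitness(th: float) -> float:
    """Mean realized return of the stocks whose predicted probability exceeds th."""
    picked = proba >= th
    return ret_val[picked].mean() if picked.any() else -1.0

# Stage 2: a tiny genetic algorithm (selection + Gaussian mutation) over the buy threshold.
pop = rng.uniform(0.5, 0.9, 20)
for _ in range(30):
    scores = np.array([fitness(t) for t in pop])
    parents = pop[np.argsort(scores)[-10:]]                 # keep the fittest half
    children = np.clip(parents + rng.normal(0, 0.02, 10), 0.5, 0.95)
    pop = np.concatenate([parents, children])
print("best buy threshold:", pop[np.argmax([fitness(t) for t in pop])])
```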

Card Transaction Data-based Deep Tourism Recommendation Study (카드 데이터 기반 심층 관광 추천 연구)

  • Hong, Minsung;Kim, Taekyung;Chung, Namho
    • Knowledge Management Research
    • /
    • v.23 no.2
    • /
    • pp.277-299
    • /
    • 2022
  • The massive card transaction data generated in the tourism industry has become an important resource that reflects tourist consumption behaviors and patterns. Developing a smart service system based on such transaction data has become a major goal in both tourism businesses and knowledge management system developer communities. However, the lack of rating scores, the basis of traditional recommendation techniques, makes it hard for system designers to evaluate a learning process. In addition, auxiliary factors such as temporal, spatial, and demographic information are needed to increase the performance of a recommendation system, but gathering them is not easy in the card transaction context. In this paper, we introduce CTDDTR, a novel approach that uses card transaction data to recommend tourism services. It consists of two main components: i) Temporal preference Embedding (TE), which represents tourist groups and services as vectors through Doc2Vec, and ii) Deep tourism Recommendation (DR), which integrates the vectors and the auxiliary factors from a tourism RDF (resource description framework) through an MLP (multi-layer perceptron) to recommend services to tourist groups. In addition, we adopt RFM analysis from the field of knowledge management to generate the explicit feedback (i.e., rating scores) used in the DR component. To evaluate CTDDTR, card transaction data collected over eight years on Jeju Island are used. Experimental results demonstrate the effectiveness and efficiency of the proposed method.
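
As a sketch of the RFM step that CTDDTR uses to create explicit feedback from card transactions, the snippet below scores recency, frequency, and monetary value per tourist group and service and combines them into a pseudo rating; the column names, data, and 1-5 scoring scheme are assumptions.

```python
# Minimal sketch (illustrative data, not the paper's pipeline): deriving
# explicit feedback from card transactions with RFM analysis.
import pandas as pd

tx = pd.DataFrame({
    "group_id":   ["g1", "g1", "g2", "g2", "g2", "g3"],
    "service_id": ["s1", "s1", "s1", "s2", "s2", "s1"],
    "date":       pd.to_datetime(["2022-01-03", "2022-02-10", "2022-02-20",
                                  "2022-01-15", "2022-03-01", "2021-12-05"]),
    "amount":     [120.0, 80.0, 45.0, 300.0, 150.0, 60.0],
})

now = tx["date"].max() + pd.Timedelta(days=1)
rfm = tx.groupby(["group_id", "service_id"]).agg(
    recency=("date", lambda d: (now - d.max()).days),
    frequency=("date", "size"),
    monetary=("amount", "sum"),
)

# Rank each dimension into quantile scores (recent = good), then average them
# into a pseudo rating in [1, 5] used as explicit feedback for the DR part.
r = pd.qcut(rfm["recency"].rank(method="first"), 5, labels=False, duplicates="drop")
f = pd.qcut(rfm["frequency"].rank(method="first"), 5, labels=False, duplicates="drop")
m = pd.qcut(rfm["monetary"].rank(method="first"), 5, labels=False, duplicates="drop")
rfm["rating"] = ((4 - r) + f + m) / 3 + 1
print(rfm)
```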

Development of Homogenization Data-based Transfer Learning Framework to Predict Effective Mechanical Properties and Thermal Conductivity of Foam Structures (폼 구조의 유효 기계적 물성 및 열전도율 예측을 위한 균질화 데이터 기반 전이학습 프레임워크의 개발)

  • Wonjoo Lee;Suhan Kim;Hyun Jong Sim;Ju Ho Lee;Byeong Hyeok An;Yu Jung Kim;Sang Yung Jeong;Hyunseong Shin
    • Composites Research
    • /
    • v.36 no.3
    • /
    • pp.205-210
    • /
    • 2023
  • In this study, we developed a transfer learning framework based on homogenization data for efficient prediction of the effective mechanical properties and thermal conductivity of cellular foam structures. Mean-field homogenization (MFH) based on Eshelby's tensor allows efficient prediction of the properties of porous structures containing ellipsoidal inclusions, but accurately predicting the properties of cellular foam structures remains challenging. Finite element homogenization (FEH), on the other hand, is more accurate but comes at a relatively high computational cost. In this paper, we propose a data-driven transfer learning framework that combines the advantages of both: we generate a large amount of mean-field homogenization data to build a pre-trained model and then fine-tune it with a relatively small amount of finite element homogenization data. Numerical examples were conducted to validate the proposed framework and verify the accuracy of the analysis. The results of this study are expected to be applicable to the analysis of materials with various foam structures.
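
A minimal sketch of the transfer learning idea on synthetic data: pre-train a property regressor on abundant, cheap MFH samples, then fine-tune it on a small, expensive FEH set. The network size, learning rates, and the toy porosity-to-property mapping are assumptions, not the authors' framework.

```python
# Minimal sketch (synthetic data): pre-train on MFH-like data, fine-tune on FEH-like data.
import torch
import torch.nn as nn

def make_net():
    return nn.Sequential(nn.Linear(3, 64), nn.ReLU(),
                         nn.Linear(64, 64), nn.ReLU(),
                         nn.Linear(64, 2))      # e.g., effective modulus and conductivity

def train(net, x, y, lr, epochs):
    opt = torch.optim.Adam(net.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss = loss_fn(net(x), y)
        loss.backward()
        opt.step()
    return loss.item()

torch.manual_seed(0)
x_mfh = torch.rand(5000, 3)                     # e.g., porosity, cell size, wall thickness
y_mfh = torch.stack([1 - x_mfh[:, 0], 0.5 * (1 - x_mfh[:, 0])], dim=1)          # cheap MFH labels
x_feh = torch.rand(100, 3)                      # scarce, expensive FEH labels
y_feh = torch.stack([(1 - x_feh[:, 0]) ** 2, 0.4 * (1 - x_feh[:, 0]) ** 2], dim=1)

net = make_net()
print("pre-train loss:", train(net, x_mfh, y_mfh, lr=1e-3, epochs=300))
# Fine-tune: keep the learned representation, adapt with a smaller learning rate.
print("fine-tune loss:", train(net, x_feh, y_feh, lr=1e-4, epochs=300))
```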

Soil Moisture Estimation Using KOMPSAT-3 and KOMPSAT-5 SAR Images and Its Validation: A Case Study of Western Area in Jeju Island (KOMPSAT-3와 KOMPSAT-5 SAR 영상을 이용한 토양수분 산정과 결과 검증: 제주 서부지역 사례 연구)

  • Jihyun Lee;Hayoung Lee;Kwangseob Kim;Kiwon Lee
    • Korean Journal of Remote Sensing
    • /
    • v.39 no.6_1
    • /
    • pp.1185-1193
    • /
    • 2023
  • The increasing interest in soil moisture data from satellite imagery for applications in hydrology, meteorology, and agriculture has led to the development of methods for producing variable-resolution soil moisture maps, and research on accurate soil moisture estimation from satellite imagery is essential for remote sensing applications. The purpose of this study is to generate a soil moisture estimation map for a test area using KOMPSAT-3/3A and KOMPSAT-5 SAR imagery and to quantitatively compare the results with soil moisture data from NASA's Soil Moisture Active Passive (SMAP) mission, with a focus on accuracy validation. In addition, the Korean Environmental Geographic Information Service (EGIS) land cover map was used to determine soil moisture, especially in agricultural and forested regions. The selected test area is the western part of Jeju Island, South Korea, where input data were available for the soil moisture estimation algorithm based on the Water Cloud Model (WCM). Synthetic Aperture Radar (SAR) imagery from KOMPSAT-5 HV and Sentinel-1 VV was used for soil moisture estimation, while vegetation indices were calculated from the surface reflectance of KOMPSAT-3 imagery. Comparison of the derived soil moisture results with SMAP (L-3) and SMAP (L-4) data by differencing showed mean differences of 4.13±3.60 p% and 14.24±2.10 p%, respectively, indicating the level of agreement with each product. This research suggests the potential for producing highly accurate and precise soil moisture maps using future South Korean satellite imagery and publicly available data sources.
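
A minimal sketch of a Water Cloud Model (WCM) inversion of the kind the abstract refers to: backscatter is split into a vegetation term and an attenuated soil term, and the soil term is mapped linearly to soil moisture in dB. The coefficients A, B, C, D and the vegetation descriptor are placeholder assumptions, not the paper's calibration.

```python
# Minimal sketch (illustrative coefficients): inverting the Water Cloud Model
# sigma0 = A*V*cos(t)*(1 - tau2) + tau2*(C + D*mv), tau2 = exp(-2*B*V/cos(t)).
import numpy as np

def invert_wcm(sigma0_db, V, theta_deg, A=0.12, B=0.09, C=-20.0, D=40.0):
    """Return volumetric soil moisture mv from observed backscatter (dB)."""
    cos_t = np.cos(np.radians(theta_deg))
    tau2 = np.exp(-2.0 * B * V / cos_t)                    # two-way canopy attenuation
    sigma0 = 10 ** (sigma0_db / 10.0)                      # dB -> linear power
    sigma_veg = A * V * cos_t * (1.0 - tau2)               # vegetation contribution
    sigma_soil = (sigma0 - sigma_veg) / tau2               # bare-soil contribution (linear)
    sigma_soil_db = 10 * np.log10(np.maximum(sigma_soil, 1e-6))
    return (sigma_soil_db - C) / D                         # linear soil model in dB

# One SAR pixel: -14 dB backscatter, NDVI-derived vegetation descriptor 0.4, 35 deg incidence
print(invert_wcm(-14.0, V=0.4, theta_deg=35.0))            # ~0.15 m3/m3
```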

An Analysis of Big Video Data with Cloud Computing in Ubiquitous City (클라우드 컴퓨팅을 이용한 유시티 비디오 빅데이터 분석)

  • Lee, Hak Geon;Yun, Chang Ho;Park, Jong Won;Lee, Yong Woo
    • Journal of Internet Computing and Services
    • /
    • v.15 no.3
    • /
    • pp.45-52
    • /
    • 2014
  • The Ubiquitous-City (U-City) is a smart or intelligent city that satisfies human beings' desire to enjoy IT services with any device, anytime, anywhere. It is a future city model based on the Internet of Everything or Things (IoE or IoT) and includes many video cameras networked together. Together with sensors, the networked video cameras provide one of the main input data sources for many U-City services, and they continuously generate a huge amount of video information, true big data for the U-City. The U-City is usually required to process this big data in real time, which is not easy at all. In addition, the accumulated video data must often be analyzed to detect an event or find a person, which requires a lot of computational power and usually takes a long time. Current research tries to reduce the processing time of big video data, and cloud computing can be a good solution to this problem. Among the many cloud computing methodologies, MapReduce is interesting and attractive: it has many advantages and is gaining popularity in many areas. Video cameras evolve day by day and their resolution improves sharply, which leads to exponential growth of the data produced by networked video cameras; video image data produced by good-quality cameras is genuinely big data. Video surveillance systems were of limited use before cloud computing, but they are now spreading widely in U-Cities as useful methodologies have emerged. Video data are unstructured, so good research results on analyzing them with MapReduce are hard to find. This paper presents an analysis system for video surveillance, a cloud-computing-based video data management system that is easy to deploy, flexible, and reliable. It consists of the video manager, the video monitors, the storage for the video images, the storage client, and the streaming-IN component. The "video monitor" consists of the "video translator" and the "protocol manager", and the "storage" contains the MapReduce analyzer. All components were designed according to the functional requirements of a video surveillance system. The "streaming IN" component receives the video data from the networked video cameras and delivers them to the "storage client"; it also manages network bottlenecks to smooth the data stream. The "storage client" receives the video data from the "streaming IN" component, stores them in the storage, and helps other components access the storage. The "video monitor" component streams the video data smoothly and manages the protocols. The "video translator" sub-component lets users manage the resolution, codec, and frame rate of the video images, and the "protocol" sub-component manages the Real Time Streaming Protocol (RTSP) and the Real Time Messaging Protocol (RTMP). We use the Hadoop Distributed File System (HDFS) as the cloud storage; Hadoop stores the data in HDFS and provides a platform that can process the data with the simple MapReduce programming model. We propose our own methodology for analyzing the video images using MapReduce: the workflow of the video analysis is presented and explained in detail in this paper. The performance evaluation was carried out experimentally, and we found that the proposed system worked well. The performance evaluation results are presented with analysis. On our cluster system, we used compressed 1920×1080 (FHD) video data, the H.264 codec, and HDFS as video storage. We measured the processing time according to the number of frames per mapper. Tracing the optimal splitting size of the input data and the processing time according to the number of nodes, we found that the system performance scales linearly.
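
To illustrate the frame-split idea, here is a Hadoop-Streaming-style mapper/reducer pair, run locally for brevity: each mapper would receive a split of frame records from HDFS and emit an event flag per minute, and the reducer aggregates counts. The record format and threshold are assumptions, not the paper's implementation.

```python
# Minimal sketch (not the paper's implementation): map/reduce over video frame records.
from itertools import groupby

def mapper(records, threshold=0.5):
    """records: 'frame_id,unix_ts,motion_score' lines -> (minute, 1) for event frames."""
    for line in records:
        frame_id, ts, score = line.strip().split(",")
        if float(score) >= threshold:                  # crude per-frame event detector
            yield int(float(ts)) // 60, 1

def reducer(pairs):
    """Sum event counts per minute (pairs sorted by key, as Hadoop guarantees)."""
    for minute, group in groupby(sorted(pairs), key=lambda kv: kv[0]):
        yield minute, sum(v for _, v in group)

frames = ["1,1400000000,0.1", "2,1400000001,0.7", "3,1400000002,0.9", "4,1400000061,0.6"]
print(list(reducer(mapper(frames))))   # [(23333333, 2), (23333334, 1)]
```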

Rough Set Analysis for Stock Market Timing (러프집합분석을 이용한 매매시점 결정)

  • Huh, Jin-Nyung;Kim, Kyoung-Jae;Han, In-Goo
    • Journal of Intelligence and Information Systems
    • /
    • v.16 no.3
    • /
    • pp.77-97
    • /
    • 2010
  • Market timing is an investment strategy used to obtain excess returns from the financial market. In general, market timing means determining when to buy and sell in order to earn excess returns from trading. In many market timing systems, trading rules have been used as an engine to generate trade signals. Some researchers have proposed rough set analysis as a proper tool for market timing because, through its control function, it does not generate a trade signal when the market pattern is uncertain. Numeric data for rough set analysis must be discretized because rough sets accept only categorical data. Discretization searches for proper "cuts" in numeric data that determine intervals, and all values that lie within an interval are transformed into the same value. In general, there are four discretization methods in rough set analysis: equal frequency scaling, expert's knowledge-based discretization, minimum entropy scaling, and naïve and Boolean reasoning-based discretization. Equal frequency scaling fixes the number of intervals, examines the histogram of each variable, and then determines cuts so that approximately the same number of samples fall into each interval. Expert's knowledge-based discretization determines cuts according to the knowledge of domain experts obtained through literature review or expert interviews. Minimum entropy scaling recursively partitions the value set of each variable so that a local measure of entropy is optimized. Naïve and Boolean reasoning-based discretization first applies naïve scaling to the data and then finds optimized discretization thresholds through Boolean reasoning. Although rough set analysis is promising for market timing, there is little research on how the various discretization methods affect trading performance with rough set analysis. In this study, we compare stock market timing models using rough set analysis with various discretization methods. The research data are the KOSPI 200 from May 1996 to October 1998. The KOSPI 200 is the underlying index of the KOSPI 200 futures, the first derivative instrument in the Korean stock market; it is a market-value-weighted index of 200 stocks selected by criteria on liquidity and their status in the corresponding industries, including manufacturing, construction, communication, electricity and gas, distribution and services, and financing. The total number of samples is 660 trading days. This study uses popular technical indicators as independent variables. The experimental results show that the most profitable method for the training sample is naïve and Boolean reasoning-based discretization, but expert's knowledge-based discretization is the most profitable for the validation sample. In addition, expert's knowledge-based discretization produced robust performance on both the training and validation samples. We also compared rough set analysis with a decision tree, using C4.5 for the comparison. The results show that rough set analysis with expert's knowledge-based discretization produced more profitable rules than C4.5.
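
As a concrete example of one of the four methods compared, the sketch below implements equal frequency scaling: cuts are chosen so that roughly the same number of samples fall into each interval before the data enter the rough set analysis. The indicator values are made up.

```python
# Minimal sketch: equal frequency discretization of a numeric technical indicator.
import numpy as np

def equal_frequency_cuts(values: np.ndarray, n_intervals: int) -> np.ndarray:
    """Return the interior cut points for equal-frequency discretization."""
    quantiles = np.linspace(0, 1, n_intervals + 1)[1:-1]
    return np.quantile(values, quantiles)

def discretize(values: np.ndarray, cuts: np.ndarray) -> np.ndarray:
    """Map each value to the index of the interval it falls in."""
    return np.searchsorted(cuts, values, side="right")

rsi = np.array([12.0, 35.5, 48.2, 50.1, 63.7, 71.4, 80.9, 91.3])  # hypothetical indicator values
cuts = equal_frequency_cuts(rsi, 4)
print(cuts, discretize(rsi, cuts))   # 8 samples -> roughly 2 per interval
```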

The Efficiency Analysis of CRM System in the Hotel Industry Using DEA (DEA를 이용한 호텔 관광 서비스 업계의 CRM 도입 효율성 분석)

  • Kim, Tai-Young;Seol, Kyung-Jin;Kwak, Young-Dai
    • Journal of Intelligence and Information Systems
    • /
    • v.17 no.1
    • /
    • pp.91-110
    • /
    • 2011
  • This paper analyzes cases in which hotels have expanded their services and enhanced their work processes through IT solutions to cope with computerization and globalization. It also examines cases in which national hotels use CRM solutions internally to respond effectively to customer requests, strengthen customer analysis, and build marketing strategies. In particular, this study evaluates the efficiency of CRM adoption for sales and marketing services by applying DEA (Data Envelopment Analysis). First, the relative efficiency of L Company's sites was compared using the CCR model, and then the effectiveness of L Company's restaurants and facilities was compared using the BCC model. L Company concluded that it is important to create and manage precise sales data, the preliminary data for CRM, and therefore made it possible to save sales data generated by POS systems in each sales performance database. To do so, it newly established an Oracle POS system and a LORIS POS system for rooms and for food and beverage restaurants, enabling sales data to be generated and managed stably. Moreover, it set up a composite database that comprehensively controls the results of work processes over a given period by collecting customer registration information, making it possible to manage sales performance information systematically. By unifying the databases and managing them comprehensively, data integrity was greatly enhanced and the problem of inconsistent data could be solved. Using the data accumulated in the comprehensive database, sales data can be analyzed, categorized, and classified through the data mining engine embedded in Polaris CRM, and the results can be organized in a data mart and provided as CRM application data. By transforming the original sales data into easy-to-handle forms and saving them separately in the data mart, well-organized data could easily be obtained for various marketing operations, morning meetings, and decision-making. Using the summarized data in the data mart, marketing operations such as telemarketing, direct mailing, internet marketing services, and service product development for identified customers could be carried out; moreover, customer insight information, one of CRM's end products, could be fed back into the comprehensive database. This research was undertaken to find out how effectively CRM has been employed by comparing and analyzing the management performance of each enterprise site and store after CRM was introduced to hotel enterprises, using the DEA technique. According to the results, efficiency was evaluated for each site through input and output factors to compare the CRM system usage efficiency of L Company's four sites; among stores, workforce size and budget allocation differ greatly, and so does the efficiency of each store. Furthermore, by applying the DEA technique with the CCR model to each site, it was possible to assess which sites have comparatively high efficiency by comparing and evaluating the outcomes of hotel IT projects such as CRM introduction. Using the BCC model, the outcome of CRM usage at each store of site A, which is representative of L Company, could be comparatively evaluated to determine which stores use CRM with high efficiency and which do not. The study thus analyzed the cases of CRM introduction at L Company, a hotel enterprise, and evaluated them precisely through DEA. By introducing CRM, L Company built a customer analysis system and was able to provide one-to-one tailored services to customers identified through customer analysis data. Moreover, by assessing the customer identification rate, it could plan differentiated services for returning customers. As future work, research is needed on process analysis that can lead to concrete outcomes, such as increased sales through test marketing and target marketing with CRM. Furthermore, research is also needed on efficiency evaluation with respect to linkages between the CRM system and other IT solutions such as ERP.
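
A minimal sketch of the input-oriented CCR efficiency score the study computes per site, solved as the standard DEA envelopment linear program with scipy; the inputs (staff, CRM budget), output (sales), and figures are made-up assumptions, not L Company's data.

```python
# Minimal sketch (made-up data): input-oriented CCR efficiency for each DMU (site/store).
import numpy as np
from scipy.optimize import linprog

def ccr_efficiency(X, Y, k):
    """X: (m, n) inputs, Y: (s, n) outputs for n DMUs; returns theta for DMU k."""
    m, n = X.shape
    s = Y.shape[0]
    c = np.r_[1.0, np.zeros(n)]                      # minimize theta
    A_in = np.c_[-X[:, [k]], X]                      # sum_j lam_j * x_ij <= theta * x_ik
    A_out = np.c_[np.zeros((s, 1)), -Y]              # sum_j lam_j * y_rj >= y_rk
    b = np.r_[np.zeros(m), -Y[:, k]]
    res = linprog(c, A_ub=np.vstack([A_in, A_out]), b_ub=b,
                  bounds=[(0, None)] * (n + 1), method="highs")
    return res.x[0]

X = np.array([[8.0, 12.0, 20.0, 10.0],               # staff per site (input 1)
              [5.0, 9.0, 15.0, 6.0]])                # CRM budget per site (input 2)
Y = np.array([[40.0, 50.0, 70.0, 55.0]])             # sales per site (output)
for k in range(X.shape[1]):
    print(f"DMU {k}: efficiency = {ccr_efficiency(X, Y, k):.3f}")
```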