• Title/Summary/Keyword: Generate Data


Anomaly Detection Methodology Based on Multimodal Deep Learning (멀티모달 딥 러닝 기반 이상 상황 탐지 방법론)

  • Lee, DongHoon;Kim, Namgyu
    • Journal of Intelligence and Information Systems / v.28 no.2 / pp.101-125 / 2022
  • Recently, advances in computing technology and cloud environments have driven the development of deep learning, and attempts to apply deep learning to various fields are increasing. A typical example is anomaly detection, a technique for identifying values or patterns that deviate from normal data. Among the representative types of anomaly detection, contextual anomalies are very difficult to detect because they require an understanding of the overall situation. In general, anomalies in image data are detected using a model pre-trained on large data. However, since such a pre-trained model was built with a focus on object classification of images, it is of limited use for anomaly detection that needs to understand complex situations created by various objects. Therefore, in this study, we propose a two-step pre-trained model for detecting abnormal situations. Our methodology performs additional learning from image captioning to understand not only individual objects but also the complicated situations they create. Specifically, the proposed methodology transfers the knowledge of a pre-trained model that has learned object classification from ImageNet data to an image captioning model, and uses the captions that describe the situation represented by each image. Afterwards, the weights obtained by learning the situational characteristics through images and captions are extracted, and fine-tuning is performed to generate an anomaly detection model. To evaluate the performance of the proposed methodology, an anomaly detection experiment was performed on 400 situational images, and the experimental results showed that the proposed methodology was superior in anomaly detection accuracy and F1-score to the existing traditional pre-trained model.

A Study for Generation of Artificial Lunar Topography Image Dataset Using a Deep Learning Based Style Transfer Technique (딥러닝 기반 스타일 변환 기법을 활용한 인공 달 지형 영상 데이터 생성 방안에 관한 연구)

  • Na, Jong-Ho;Lee, Su-Deuk;Shin, Hyu-Soung
    • Tunnel and Underground Space / v.32 no.2 / pp.131-143 / 2022
  • A lunar exploration autonomous vehicle operates based on lunar topography information obtained from real-time image characterization. Highly accurate topography characterization requires a large number of training images with various background conditions. Since real lunar topography images are difficult to obtain, it would be helpful to generate imitation lunar image data artificially from the planetary analog site images and real lunar images that are available. In this study, we aim to artificially create lunar topography images by using the location information-based style transfer algorithm known as Wavelet Correct Transform (WCT2). To assess the efficacy of our suggested approach, we conducted comparative experiments using lunar analog site images and real lunar topography images taken during China's and America's lunar exploration projects (i.e., Chang'e and Apollo). The results show that the proposed technique can create realistic images that preserve the topography information of the analog site image while showing the same conditions as an image taken on the lunar surface. The proposed algorithm also outperforms a conventional algorithm, Deep Photo Style Transfer (DPST), in terms of temporal and visual aspects. For future work, we intend to use the generated styled image data in combination with real image data to train models for the detection and segmentation of lunar topography objects. This approach is expected to significantly improve the performance of detection and segmentation models on real lunar topography images.

Descent Dataset Generation and Landmark Extraction for Terrain Relative Navigation on Mars (화성 지형상대항법을 위한 하강 데이터셋 생성과 랜드마크 추출 방법)

  • Kim, Jae-In
    • Korean Journal of Remote Sensing / v.38 no.6_1 / pp.1015-1023 / 2022
  • The Entry-Descent-Landing process of a lander involves many environmental and technical challenges. To address these challenges, terrain relative navigation (TRN) technology has recently become essential for landers. TRN is a technology for estimating the position and attitude of a lander by comparing Inertial Measurement Unit (IMU) data and image data collected from a descending lander with pre-built reference data. In this paper, we present a method for generating a descent dataset and extracting landmarks, which are key elements for developing TRN technologies to be used on Mars. The proposed method generates IMU data of a descending lander using a simulated Mars landing trajectory and generates descent images from a high-resolution ortho-map and digital elevation map through a ray tracing technique. Landmark extraction is performed by an area-based extraction method because of the low-textured surfaces on Mars. In addition, search area reduction is carried out to improve matching accuracy and speed. The performance evaluation of the descent dataset generation method showed that the proposed method can generate images that satisfy the imaging geometry. The performance evaluation of the landmark extraction method showed that the proposed method achieves positioning accuracy of several meters while ensuring processing speed as fast as that of feature-based methods.

A Comparison of Analysis Methods for Work Environment Measurement Databases Including Left-censored Data (불검출 자료를 포함한 작업환경측정 자료의 분석 방법 비교)

  • Park, Ju-Hyun;Choi, Sangjun;Koh, Dong-Hee;Park, Donguk;Sung, Yeji
    • Journal of Korean Society of Occupational and Environmental Hygiene / v.32 no.1 / pp.21-30 / 2022
  • Objectives: The purpose of this study is to suggest an optimal method by comparing analysis methods for work environment measurement datasets that include left-censored data, where one or more measurements are below the limit of detection (LOD). Methods: A computer program was used to generate left-censored datasets for various combinations of censoring rate (1% to 90%) and sample size (30 to 300). For the analysis of the censored data, the simple substitution method (LOD/2), β-substitution method, maximum likelihood estimation (MLE) method, Bayesian method, and regression on order statistics (ROS) were compared. Each method was used to estimate four parameters of the log-normal distribution for the censored dataset: (1) geometric mean (GM), (2) geometric standard deviation (GSD), (3) 95th percentile (X95), and (4) arithmetic mean (AM). The performance of each method was evaluated using relative bias and relative root mean squared error (rMSE). Results: For the largest sample size (n=300), when the censoring rate was less than 40%, the relative bias and rMSE were small for all five methods. When the censoring rate was large (70%, 90%), the simple substitution method was inappropriate because its relative bias was the largest, regardless of the sample size. When the sample size was small and the censoring rate was large, the Bayesian method, the β-substitution method, and the MLE method showed the smallest relative bias. Conclusions: The accuracy and precision of all methods tended to improve as the sample size increased and the censoring rate decreased. The simple substitution method was inappropriate when the censoring rate was high, whereas the β-substitution method, MLE method, and Bayesian method can be widely applied.
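The simulation design described in this abstract can be illustrated with a minimal sketch. It is not the authors' program: the function names, the way the LOD is chosen (an empirical quantile of the simulated sample), and the example parameters are all assumptions made for illustration. Only the simple substitution method (LOD/2) and the relative bias measure are shown.

```python
import math
import random


def simulate_censored(n, gm, gsd, censor_rate, seed=0):
    """Draw n log-normal values and left-censor roughly censor_rate of them.

    Assumption: the LOD is set to an empirical quantile of the sample so that
    the requested share of values falls below it.
    """
    rng = random.Random(seed)
    values = [rng.lognormvariate(math.log(gm), math.log(gsd)) for _ in range(n)]
    lod = sorted(values)[round(censor_rate * n)]
    # Non-detects (values below the LOD) are recorded as None.
    observed = [v if v >= lod else None for v in values]
    return observed, lod


def substitute_half_lod(observed, lod):
    """Simple substitution: replace every non-detect with LOD/2."""
    return [lod / 2 if v is None else v for v in observed]


def gm_gsd(values):
    """Geometric mean and geometric standard deviation of positive values."""
    logs = [math.log(v) for v in values]
    mean = sum(logs) / len(logs)
    var = sum((x - mean) ** 2 for x in logs) / (len(logs) - 1)
    return math.exp(mean), math.exp(math.sqrt(var))


def relative_bias(estimate, truth):
    """Signed error of an estimate relative to the true parameter value."""
    return (estimate - truth) / truth


if __name__ == "__main__":
    true_gm, true_gsd = 1.0, 2.0  # hypothetical true parameters
    observed, lod = simulate_censored(300, true_gm, true_gsd, censor_rate=0.7)
    gm_hat, gsd_hat = gm_gsd(substitute_half_lod(observed, lod))
    print("relative bias of GM:", relative_bias(gm_hat, true_gm))
    print("relative bias of GSD:", relative_bias(gsd_hat, true_gsd))
```

Repeating the run over many seeds, censoring rates, and sample sizes, and averaging the squared relative errors, would give a comparison in the spirit of the one the abstract reports.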

A study on the effect of tax evasion controversy on corporate values in internet news portals through big data analysis (빅데이터 분석을 통한 인터넷 뉴스 포털에서의 탈세 논란이 기업 가치에 미치는 영향 연구)

  • Lee, Sang-Min;Park, Myung-Ho;Kim, Byung-Jun;Park, Dae-Keun
    • Journal of Internet Computing and Services / v.22 no.6 / pp.51-57 / 2021
  • If the tax authorities judge a company's actions to save or avoid taxes to be tax evasion rather than legal tax action, the company bears not only the tax itself but also non-tax costs, such as damage to its corporate image and a decline in its stock price caused by a series of tax evasion-related news articles. Therefore, this study measures the frequency of tax evasion controversy keywords in an internet news portal as an indicator of the severity of a case, and analyzes the effect of this frequency on corporate value. For the Korean stock market, we crawl related articles from an internet news portal using tax evasion controversy keywords for the top companies by market capitalization, generate a time series of the frequency of tax evasion keywords for each company, and analyze the effect of that frequency on book value versus market capitalization. Panel regression and impulse response analysis show that the keyword frequency has a negative effect on market capitalization and that the effect gradually decreases over 12 months. This study examines whether tax evasion issues affect the corporate value of Korean companies and suggests that entrepreneurs need to take these effects into account when setting up tax-planning schemes.
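The keyword frequency time series mentioned in this abstract can be sketched roughly as follows. The article representation (a publication date paired with the article text), the monthly aggregation, and the function name are assumptions for illustration; the crawling step and the panel regression are outside the scope of this sketch.

```python
from collections import Counter
from datetime import date


def monthly_keyword_frequency(articles, keywords):
    """Count keyword occurrences per (year, month).

    articles: iterable of (publication_date, text) pairs (hypothetical format).
    keywords: list of keyword strings to count in each article text.
    """
    counts = Counter()
    for published, text in articles:
        hits = sum(text.count(keyword) for keyword in keywords)
        if hits:
            counts[(published.year, published.month)] += hits
    # Return the months in chronological order.
    return dict(sorted(counts.items()))


if __name__ == "__main__":
    sample = [
        (date(2020, 1, 5), "tax evasion probe widens; tax evasion fine expected"),
        (date(2020, 1, 20), "quarterly earnings released"),
        (date(2020, 3, 1), "court rules on tax evasion case"),
    ]
    print(monthly_keyword_frequency(sample, ["tax evasion"]))
```

A series like this, built per company, is the kind of input a panel regression against market capitalization would need.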

General Relation Extraction Using Probabilistic Crossover (확률적 교차 연산을 이용한 보편적 관계 추출)

  • Je-Seung Lee;Jae-Hoon Kim
    • KIPS Transactions on Software and Data Engineering / v.12 no.8 / pp.371-380 / 2023
  • Relation extraction is the task of extracting relationships between named entities from text. Traditionally, relation extraction methods only extract relations between predetermined subject and object entities. However, in end-to-end relation extraction, all possible relations must be extracted by considering the positions of the subject and object for each pair of entities, and so this method uses time and resources inefficiently. To alleviate this problem, this paper proposes a method that sets directions based on the positions of the subject and object, and extracts relations according to the directions. The proposed method utilizes existing relation extraction data to generate direction labels indicating the direction in which the subject points to the object in the sentence, adds entity position tokens and entity types to sentences to predict the directions using a pre-trained language model (KLUE-RoBERTa-base, RoBERTa-base), and generates representations of subject and object entities through a probabilistic crossover operation. These representations are then used to extract relations. Experimental results show that the proposed model performs about 3 to 4%p better than a method for predicting integrated labels. In addition, when learning Korean and English data using the proposed model, the performance was 1.7%p higher in English than in Korean due to the number of data and language disorder, and the values of the parameters that produce the best performance were different. By excluding the number of directional cases, the proposed model can reduce the waste of resources in end-to-end relation extraction.
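The data preparation step in this abstract (direction labels plus entity position tokens and entity types) might look roughly like the sketch below. The label names, the marker format such as `<S:PER>`, and the span convention (start inclusive, end exclusive, non-overlapping) are hypothetical choices for illustration, not the paper's actual scheme, and the probabilistic crossover operation itself is not shown.

```python
def direction_label(subject_span, object_span):
    """Label the pair by which entity appears first in the sentence."""
    return "subject_first" if subject_span[0] < object_span[0] else "object_first"


def mark_entities(tokens, subject_span, object_span, subject_type, object_type):
    """Insert position and type markers around the subject and object tokens.

    Spans are (start, end) token indices with end exclusive and are assumed
    not to overlap.
    """
    marked = []
    for index, token in enumerate(tokens):
        # Close any entity that ends here before opening the next one.
        if index == subject_span[1]:
            marked.append("</S>")
        if index == object_span[1]:
            marked.append("</O>")
        if index == subject_span[0]:
            marked.append(f"<S:{subject_type}>")
        if index == object_span[0]:
            marked.append(f"<O:{object_type}>")
        marked.append(token)
    # Handle entities that run to the end of the sentence.
    if subject_span[1] == len(tokens):
        marked.append("</S>")
    if object_span[1] == len(tokens):
        marked.append("</O>")
    return marked


if __name__ == "__main__":
    tokens = ["Marie", "Curie", "was", "born", "in", "Warsaw"]
    print(direction_label((0, 2), (5, 6)))
    print(" ".join(mark_entities(tokens, (0, 2), (5, 6), "PER", "LOC")))
```

The marked token sequence and the direction label would then serve as the input and target for a direction classifier built on a pre-trained language model.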

How to automatically extract 2D deliverables from BIM?

  • Kim, Yije;Chin, Sangyoon
    • International conference on construction engineering and project management / 2022.06a / pp.1253-1253 / 2022
  • Although the construction industry is changing from a 2D-based to a 3D BIM-based management process, 2D drawings are still used as standards for permits and construction. For this reason, 2D deliverables extracted from 3D BIM are one of the essential achievements of BIM projects. However, due to technical and institutional problems that exist in practice, the process of extracting 2D deliverables from BIM requires additional work beyond generating 3D BIM models. In addition, the consistency of data between 3D BIM models and 2D deliverables is low, which is a major factor hindering work productivity in practice. To solve this problem, it is necessary to build BIM data that meets information requirements (IRs) for extracting 2D deliverables, so as to minimize the amount of user work and maximize the utilization of BIM data. Even so, the additional work that occurs in the BIM process for drawing creation remains a burden on BIM users. The purpose of this study is therefore to increase the productivity of the BIM process by automating the process of extracting 2D deliverables from BIM and securing data consistency between the BIM model and 2D deliverables. For this, an expert interview was conducted, and the requirements for automating the process of extracting 2D deliverables from BIM were analyzed. Based on the requirements, the types of drawings and drawing expression elements that require automated drawing generation in the design development stage were derived. Finally, the methods for developing automation technology targeting elements that require automation were classified and analyzed, and a process for automatically extracting BIM-based 2D deliverables through templates and rule-based automation modules was derived. The automation module was developed as an add-on to Revit software, a representative BIM authoring tool, together with 120 rule-based automation rulesets, and combinations of these rulesets were used to automatically generate 2D deliverables from BIM. As a result, about 80% of drawing expression elements could be created automatically, and the user's work process could be simplified compared to the existing work. The automation process proposed in this study is expected to increase the productivity of extracting 2D deliverables from BIM, thereby increasing the practical value of BIM utilization.


Analysis on Looped Stage-Discharge Relation and Its Simulation using the Numerical Model (수치모형을 이용한 고리형 수위-유량 관계 분석)

  • Kim, Ji Sung;Kim, Won;Kim, Dong Gu;Kim, Chi Young
    • KSCE Journal of Civil and Environmental Engineering Research / v.29 no.1B / pp.1-9 / 2009
  • This study focuses on analyzing the loop characteristics of the stage-discharge relation, which is widely used for the production of discharge data, and on simulating the looped stage-discharge relation using a numerical model. Analysis of consecutive stage and discharge data at 3 points revealed that the loop in the stage-discharge relationship is very strong. This means that the existing single stage-discharge relation may include a large amount of error. Various flood events are simulated in the main stream of the Han River with a one-dimensional numerical model, and the calculated stage data are compared with measured data. In particular, continuous field-flow measurements collected concurrently with an Acoustic Doppler Velocity Meter (ADVM) on Hangang bridge during the 2007 flood event are used to verify the applicability of the model for estimating flows in open channels. This comparison shows that the numerical model is an accurate and reliable alternative for producing the real stage-discharge relation. The stage-discharge relation simulated by the numerical model at Paldang and Hangang bridge showed good agreement with the measured one, so it may be possible to generate a real looped stage-discharge relation with a properly calibrated and verified numerical model. It can be concluded that the results of this study can contribute to the error analysis of the conventional single stage-discharge relation and to the development of a looped stage-discharge relation with a numerical model.

Utilization of Weather, Satellite and Drone Data to Detect Rice Blast Disease and Track its Propagation (벼 도열병 발생 탐지 및 확산 모니터링을 위한 기상자료, 위성영상, 드론영상의 공동 활용)

  • Jae-Hyun Ryu;Hoyong Ahn;Kyung-Do Lee
    • Korean Journal of Agricultural and Forest Meteorology / v.25 no.4 / pp.245-257 / 2023
  • Rice, the representative crop in the Republic of Korea, is cultivated over extensive areas every year, which results in reduced resistance to pests and diseases. One of the major rice diseases, rice blast disease, can lead to a significant decrease in yields when it occurs on a large scale, necessitating early detection and effective control. Drone-based crop monitoring techniques are valuable for detecting abnormal growth, but frequent image capture for potential rice blast disease occurrences can consume significant labor and resources. The purpose of this study is to detect rice blast disease early using remote sensing data, such as drone and satellite images, along with weather data. Satellite images were helpful in identifying rice cultivation fields. Effective detection of paddy fields was achieved by utilizing vegetation and water indices. Subsequently, air temperature, relative humidity, and the number of rainy days were used to calculate the risk of rice blast disease occurrence. An increase in the risk of disease occurrence implies a higher likelihood of disease development, and drone measurements are performed at this time. Spectral reflectance changes in the red and near-infrared wavelength regions were observed at the locations where rice blast disease occurred. Clusters with low vegetation index values were observed at these locations, and the time series data for drone images allowed for tracking the spread of the disease from these points. Finally, drone images captured before harvesting were used to generate spatial information on the incidence of rice blast disease in each field.
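The abstract refers to vegetation indices computed from red and near-infrared reflectance without naming a specific index. As a minimal sketch, the widely used NDVI, (NIR - Red) / (NIR + Red), is assumed here; the threshold value, the band layout (nested lists of reflectance values), and the function names are hypothetical and are not taken from the study.

```python
def ndvi(nir, red):
    """Normalized difference vegetation index for one pixel."""
    total = nir + red
    # Guard against division by zero for pixels with no signal.
    return 0.0 if total == 0 else (nir - red) / total


def low_vigor_mask(nir_band, red_band, threshold):
    """Flag pixels whose NDVI falls below a threshold.

    nir_band and red_band are equally shaped lists of rows of reflectance
    values; the threshold is an assumed, field-specific cut-off.
    """
    return [
        [ndvi(nir, red) < threshold for nir, red in zip(nir_row, red_row)]
        for nir_row, red_row in zip(nir_band, red_band)
    ]


if __name__ == "__main__":
    nir_band = [[0.50, 0.20]]
    red_band = [[0.10, 0.18]]
    print(low_vigor_mask(nir_band, red_band, threshold=0.3))
```

Comparing such masks across successive drone flights is one simple way to follow how clusters of low index values spread over time.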

Event Log Analysis Framework Based on the ATT&CK Matrix in Cloud Environments (클라우드 환경에서의 ATT&CK 매트릭스 기반 이벤트 로그 분석 프레임워크)

  • Yeeun Kim;Junga Kim;Siyun Chae;Jiwon Hong;Seongmin Kim
    • Journal of the Korea Institute of Information Security & Cryptology / v.34 no.2 / pp.263-279 / 2024
  • With the increasing trend of Cloud migration, security threats in the Cloud computing environment have also increased significantly. Consequently, the importance of efficient incident investigation through log data analysis is being emphasized. In Cloud environments, the diversity of services and ease of resource creation generate a large volume of log data. It remains difficult to determine which events to investigate when an incident occurs, and examining all the extensive log data requires considerable time and effort. Therefore, a systematic approach for efficient data investigation is necessary. CloudTrail, the Amazon Web Services (AWS) logging service, collects logs of all API call events occurring in an account. However, CloudTrail lacks insights into which logs to analyze in the event of an incident. This paper proposes an automated analysis framework that integrates Cloud Matrix and event information for efficient incident investigation. The framework enables simultaneous examination of user behavior log events, event frequency, and attack information. We believe the proposed framework contributes to Cloud incident investigations by efficiently identifying critical events based on the ATT&CK Framework.
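The idea of examining event frequency alongside attack information can be sketched as follows. CloudTrail log files are JSON documents with a top-level `Records` array whose entries carry an `eventName` field; everything else here is an assumption for illustration. In particular, the small event-to-tactic table is a hypothetical example and is neither the paper's mapping nor an official ATT&CK mapping.

```python
import json
from collections import Counter

# Hypothetical, illustrative mapping from API event names to ATT&CK tactics.
EXAMPLE_TACTICS = {
    "CreateUser": "Persistence",
    "StopLogging": "Defense Evasion",
    "ListBuckets": "Discovery",
}


def summarize_events(log_text):
    """Count events by name and flag those present in the example mapping."""
    records = json.loads(log_text).get("Records", [])
    frequency = Counter(record.get("eventName", "Unknown") for record in records)
    flagged = {
        name: (EXAMPLE_TACTICS[name], count)
        for name, count in frequency.items()
        if name in EXAMPLE_TACTICS
    }
    return frequency, flagged


if __name__ == "__main__":
    sample_log = json.dumps({
        "Records": [
            {"eventName": "ListBuckets", "eventSource": "s3.amazonaws.com"},
            {"eventName": "ListBuckets", "eventSource": "s3.amazonaws.com"},
            {"eventName": "StopLogging", "eventSource": "cloudtrail.amazonaws.com"},
            {"eventName": "DescribeInstances", "eventSource": "ec2.amazonaws.com"},
        ]
    })
    frequency, flagged = summarize_events(sample_log)
    print(frequency.most_common())
    print(flagged)
```

Sorting the flagged events by tactic or by frequency gives an investigator a short list to start from instead of the full log volume.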