• Title/Summary/Keyword: Generate Data

Search Results: 3,084

A Study on the Decryption Method for Volume Encryption and Backup Applications (볼륨 암호화 및 백업 응용프로그램에 대한 복호화 방안 연구)

  • Gwui-eun Park;Min-jeong Lee;Soo-jin Kang;Gi-yoon Kim;Jong-sung Kim
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.33 no.3
    • /
    • pp.511-525
    • /
    • 2023
  • As awareness of personal data protection increases, various Full Disk Encryption (FDE)-based applications are being developed that encrypt data in real time or use virtual drive volumes to protect data on the user's PC. FDE-based applications encrypt and protect the volume containing the user's data. However, as disk encryption technology advances, some users abuse FDE-based applications to encrypt evidence associated with criminal activities, which creates difficulties for digital forensic investigations. Thus, it is necessary to analyze the encryption process used in FDE-based applications and decrypt the encrypted data. In this paper, we analyze Cryptomator and Norton Ghost, which provide volume encryption and backup functions. We analyze the encrypted data structure and encryption process to classify the main data of each application and identify the encryption algorithm used for data decryption. The encryption algorithms of these applications are recently emerging or customized algorithms, which we analyze in order to decrypt the data. Because the user password is essential for generating the data encryption key used in decryption, we suggest a password acquisition method that uses the functions of each application. This supplements the limitations of password investigation and identifies user data by decrypting the encrypted data with the acquired password.
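
A password-derived key is central to the decryption approach described above. As a minimal sketch, assuming an scrypt-based key derivation (Cryptomator publicly documents scrypt for deriving its key-encryption key; the parameters and salt below are illustrative, not those of any specific application):

```python
import hashlib

# Hedged sketch: the user password never encrypts data directly; it
# derives a key-encryption key (KEK) via a KDF, which in turn unwraps
# the data encryption key. All parameters below are illustrative only.
def derive_kek(password: str, salt: bytes) -> bytes:
    """Derive a 256-bit key-encryption key from a user password."""
    return hashlib.scrypt(
        password.encode("utf-8"),
        salt=salt,
        n=2**14,   # CPU/memory cost factor (illustrative)
        r=8,       # block size
        p=1,       # parallelization factor
        dklen=32,  # 32 bytes = a 256-bit key
    )

kek = derive_kek("user-password", salt=b"\x00" * 16)
print(kek.hex())
```

This is why the password acquisition step matters: without the password, the KEK, and hence the data encryption key, cannot be regenerated.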

Bridge Safety Determination Edge AI Model Based on Acceleration Data (가속도 데이터 기반 교량 안전 판단을 위한 Edge AI 모델)

  • Jinhyo Park;Yong-Geun Hong;Joosang Youn
    • Journal of Korea Society of Industrial Information Systems
    • /
    • v.29 no.4
    • /
    • pp.1-11
    • /
    • 2024
  • Bridges crack and become damaged due to age and external factors such as earthquakes, lack of maintenance, and weather conditions. With the number of aging bridges on the rise, a lack of maintenance can lead to decreased safety, resulting in structural defects and collapse. To prevent these problems and reduce maintenance costs, a system that can monitor the condition of bridges and respond quickly is needed. To this end, existing research has proposed artificial intelligence models that use sensor data to identify the location and extent of cracks. However, existing research does not use data from actual bridges to evaluate model performance; instead, it simulates the shape of a bridge to acquire training data, which does not reflect the actual bridge environment. In this paper, we propose a bridge safety determination edge AI model that detects bridge abnormalities based on artificial intelligence by utilizing acceleration data collected from bridges in the field. To this end, we newly defined filtering rules for extracting valid data from the acceleration data and constructed a model that applies them. We also evaluated the performance of the proposed bridge safety determination edge AI model on data collected in the field. The results showed an F1-score of up to 0.9565, confirming that safety can be determined using data from real bridges, and that rules generating data patterns similar to real impact data perform better.
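
The listing does not specify the filtering rules themselves; as a minimal sketch of the general idea, assuming a simple amplitude threshold over fixed windows (window size and threshold factor are assumptions):

```python
import numpy as np

# Hedged sketch of a validity filter: keep only windows whose peak
# amplitude stands out from the sensor's noise floor, so the model
# trains on impact events rather than idle noise.
def extract_valid_windows(acc: np.ndarray, win: int = 256, k: float = 4.0):
    """Return fixed-size windows whose peak exceeds k * noise std."""
    noise_std = np.std(acc)                # crude global noise estimate
    windows = []
    for start in range(0, len(acc) - win + 1, win):
        segment = acc[start:start + win]
        if np.max(np.abs(segment)) > k * noise_std:
            windows.append(segment)        # candidate impact event
    return windows

signal = np.random.randn(10_000) * 0.01    # simulated noise floor
signal[5_000:5_050] += 0.5                 # simulated impact response
print(len(extract_valid_windows(signal)), "valid window(s) extracted")
```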

Improving minority prediction performance of support vector machine for imbalanced text data via feature selection and SMOTE (단어선택과 SMOTE 알고리즘을 이용한 불균형 텍스트 데이터의 소수 범주 예측성능 향상 기법)

  • Jongchan Kim;Seong Jun Chang;Won Son
    • The Korean Journal of Applied Statistics
    • /
    • v.37 no.4
    • /
    • pp.395-410
    • /
    • 2024
  • Text data is usually made up of a wide variety of unique words, and even standard text data commonly contains tens of thousands of distinct words. In text data analysis, each unique word is typically treated as a variable, so text data can be regarded as a dataset with a very large number of variables. On the other hand, text data classification often faces class-label imbalance problems, and under substantial imbalance the performance of conventional classification models can be severely degraded. To improve the classification performance of support vector machines (SVM) on imbalanced data, algorithms such as the Synthetic Minority Over-sampling Technique (SMOTE) can be used. The SMOTE algorithm synthetically generates new observations for the minority class based on the k-Nearest Neighbors (kNN) algorithm. However, in datasets with a large number of variables, such as text data, errors may accumulate, which can degrade the performance of the kNN step. In this study, we propose a method for enhancing prediction performance for the minority class of imbalanced text data. Our approach employs variable selection so that new synthetic observations are generated in a reduced space, thereby improving the overall classification performance of SVM.
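
As a minimal sketch of the proposed pipeline, using scikit-learn and imbalanced-learn as stand-ins for the paper's implementation (the number of selected words and the SVM settings are assumptions):

```python
from imblearn.over_sampling import SMOTE
from imblearn.pipeline import Pipeline
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.svm import LinearSVC

# Hedged sketch: select informative words first so SMOTE synthesizes
# minority samples in a reduced space where kNN distances are more
# reliable, then fit the SVM on the rebalanced data.
pipeline = Pipeline([
    ("select", SelectKBest(chi2, k=500)),             # word (feature) selection
    ("smote", SMOTE(k_neighbors=5, random_state=0)),  # minority oversampling
    ("svm", LinearSVC(C=1.0)),                        # final classifier
])
# Usage (X: document-term matrix, y: imbalanced class labels):
#   pipeline.fit(X_train, y_train)
#   y_pred = pipeline.predict(X_test)
```

Because the sampler sits after the selection step, SMOTE's kNN search runs only over the selected words, which is the core of the proposed improvement.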

Classification of Multi-temporal SAR Data by Using Data Transform Based Features and Multiple Classifiers (자료변환 기반 특징과 다중 분류자를 이용한 다중시기 SAR자료의 분류)

  • Yoo, Hee Young;Park, No-Wook;Hong, Sukyoung;Lee, Kyungdo;Kim, Yeseul
    • Korean Journal of Remote Sensing
    • /
    • v.31 no.3
    • /
    • pp.205-214
    • /
    • 2015
  • In this study, a novel land-cover classification framework for multi-temporal SAR data is presented that combines multiple features extracted through data transforms with multiple classifiers. First, data transforms using principal component analysis (PCA) and the 3D wavelet transform are applied to the multi-temporal SAR dataset to extract new features that differ from the original dataset. Then, three classifiers, the maximum likelihood classifier (MLC), a neural network (NN), and a support vector machine (SVM), are applied to three datasets comprising the data-transform-based features and the original backscattering coefficients, generating diverse preliminary classification results. These results are combined via a majority voting rule to produce a final classification result. In an experiment with a multi-temporal ENVISAT ASAR dataset, the preliminary classification results showed very different accuracies depending on the feature and classifier used. The final result, combining nine preliminary classification results, showed the best classification accuracy because each preliminary result provided complementary information on land covers. The improvement in classification accuracy was mainly attributed to the diversity gained by combining not only different data-transform-based features but also different classifiers. Therefore, the land-cover classification framework presented in this study can be effectively applied to the classification of multi-temporal SAR data and extended to multi-sensor remote sensing data fusion.
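
A minimal sketch of the combination step, assuming integer-coded land-cover labels and substituting Gaussian naive Bayes for the MLC; the 3D wavelet features are omitted for brevity:

```python
import numpy as np
from scipy.stats import mode
from sklearn.decomposition import PCA
from sklearn.naive_bayes import GaussianNB
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC

def classify_by_majority_vote(X_train, y_train, X_test):
    """X_*: (pixels x dates) backscatter matrices; y_train: int labels."""
    pca = PCA(n_components=3).fit(X_train)                # data transform
    feature_pairs = [
        (X_train, X_test),                                # original features
        (pca.transform(X_train), pca.transform(X_test)),  # PCA features
    ]
    votes = []
    for F_train, F_test in feature_pairs:
        for clf in (GaussianNB(), MLPClassifier(max_iter=500), SVC()):
            votes.append(clf.fit(F_train, y_train).predict(F_test))
    # Majority vote across all preliminary classification results.
    return mode(np.stack(votes), axis=0).mode.squeeze()
```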

Development of Examination Model of Weather Factors on Garlic Yield Using Big Data Analysis (빅데이터 분석을 활용한 마늘 생산에 미치는 날씨 요인에 관한 영향 조사 모형 개발)

  • Kim, Shinkon
    • Journal of the Korea Academia-Industrial cooperation Society
    • /
    • v.19 no.5
    • /
    • pp.480-488
    • /
    • 2018
  • Information and communication technology is being actively applied in agriculture to generate valuable information from large amounts of data and to exploit it with big data techniques. Crops and their varieties are determined by the influence of natural-environment factors such as temperature, precipitation, and sunshine hours. This paper derives the climatic factors affecting crop production using the garlic growth process and daily meteorological variables, and develops a prediction model for garlic production per unit area using a big data analysis technique that considers the growth stages of garlic. In the exploratory data analysis process, various agricultural production data, such as production volume, wholesale market load, and growth data, were provided by the National Statistical Office, the Rural Development Administration, and the Korea Rural Economic Institute. Various meteorological data, such as AWS, ASOS, and special weather status data, were collected from the Korea Meteorological Administration. The correlation analysis process was designed by comparing the predictive power and fitness of models derived through variable selection, candidate model derivation, model diagnosis, and scenario prediction. The numerous weather variables were reduced in dimension via factor analysis, and the resulting factors were selected as explanatory variables. This made it possible to effectively control the multicollinearity and low degrees of freedom that can occur in regression analysis and to improve the fitness and predictive power of the regression model.
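
As a minimal sketch of this modeling approach, assuming a seasons-by-variables weather matrix and per-area yield figures (the number of latent factors is an assumption):

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis
from sklearn.linear_model import LinearRegression

def fit_yield_model(weather: np.ndarray, yield_per_area: np.ndarray):
    """weather: (seasons x weather variables); yield_per_area: (seasons,)."""
    fa = FactorAnalysis(n_components=4, random_state=0)  # reduce dimensions
    factors = fa.fit_transform(weather)                  # latent factor scores
    model = LinearRegression().fit(factors, yield_per_area)
    return fa, model

# Usage: fa, model = fit_yield_model(weather, yields)
#        forecast = model.predict(fa.transform(new_weather))
```

Regressing on a handful of factor scores instead of the raw weather variables is what keeps the multicollinearity and degrees-of-freedom problems under control.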

A Study of Measuring Traffic Congestion for Urban Network using Average Link Travel Time based on DTG Big Data (DTG 빅데이터 기반의 링크 평균통행시간을 이용한 도심네트워크 혼잡분석 방안 연구)

  • Han, Yohee;Kim, Youngchan
    • The Journal of The Korea Institute of Intelligent Transport Systems
    • /
    • v.16 no.5
    • /
    • pp.72-84
    • /
    • 2017
  • Along with the big data of the fourth industrial revolution, traffic information systems have shifted from point detection to section detection. Using DTG (Digital Tachograph) data based on the Global Navigation Satellite System, we examined the properties of the raw data and of the data at each processing step, identifying the vehicle trajectories, the link travel times of individual vehicles, and the link average travel times generated at each step. In this paper, we propose an application method for traffic management based on the characteristics of the processed data. We selected historical data considering the data management status of the center and its availability at the present time. We propose a method to generate a Travel Time Index from historical link average travel times, which can be collected continuously over a wide area, a method to monitor traffic congestion using the Travel Time Index, and a case analysis of intersections where the traffic operation method changed. At the same time, the current conditions that make it difficult to fully utilize DTG data are presented as limitations.
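
The Travel Time Index divides an observed average travel time by the link's free-flow travel time. A minimal sketch, assuming a table of per-vehicle link travel times and approximating free flow with a low percentile (column names are assumptions):

```python
import pandas as pd

def travel_time_index(df: pd.DataFrame) -> pd.DataFrame:
    """df: one row per (link_id, timestamp) with 'travel_time' in seconds."""
    # Free-flow travel time approximated by a low percentile per link.
    free_flow = df.groupby("link_id")["travel_time"].quantile(0.05)
    # Average travel time per link and hour of day.
    avg = df.groupby(["link_id", df["timestamp"].dt.hour])["travel_time"].mean()
    # TTI near 1 means free flow; a TTI of 2.0 means travel takes twice
    # as long as under free-flow conditions.
    tti = avg.div(free_flow, level="link_id").rename("TTI")
    return tti.reset_index()
```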

Analysis of Defective Causes in Real Time and Prediction of Facility Replacement Cycle based on Big Data (빅데이터 기반 실시간 불량품 발생 원인 분석 및 설비 교체주기 예측)

  • Hwang, Seung-Yeon;Kwak, Kyung-Min;Shin, Dong-Jin;Kwak, Kwang-Jin;Rho, Young-J;Park, Kyung-won;Park, Jeong-Min;Kim, Jeong-Joon
    • The Journal of the Institute of Internet, Broadcasting and Communication
    • /
    • v.19 no.6
    • /
    • pp.203-212
    • /
    • 2019
  • Along with the recent fourth industrial revolution, the world's manufacturing powerhouses are pursuing national strategies to revive the sluggish manufacturing industry. In line with this trend, the Moon Jae-in administration announced a strategy under which the advancement of science and technology leads the fourth industrial revolution. Intelligent information technologies such as IoT, cloud, big data, mobile, and AI, the key technologies driving the fourth industrial revolution, are promoting the emergence of new industries such as robotics and 3D printing and the smart transformation of existing major manufacturing industries. Advances in technologies such as smart factories have enabled IoT-based sensing to measure data that could not be collected before, and the data generated by each process has also exploded. Thus, this paper uses a data generator to produce virtual data that could occur in a smart factory, and uses it to analyze the causes of defects in real time and to predict facility replacement cycles.
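
A minimal sketch of the kind of virtual data generator described, with illustrative sensor fields and a simple defect rule (all field names and thresholds are assumptions):

```python
import datetime
import random

def generate_record(ts: datetime.datetime) -> dict:
    """Emit one virtual sensor reading with an illustrative defect rule."""
    temp = random.gauss(75.0, 5.0)          # degrees Celsius
    vibration = random.gauss(0.2, 0.08)     # mm/s RMS
    pressure = random.gauss(101.3, 2.0)     # kPa
    defective = temp > 85.0 or vibration > 0.4   # simple out-of-tolerance rule
    return {"timestamp": ts.isoformat(), "temp": round(temp, 2),
            "vibration": round(vibration, 3), "pressure": round(pressure, 2),
            "defective": defective}

start = datetime.datetime(2019, 1, 1)
stream = [generate_record(start + datetime.timedelta(seconds=i))
          for i in range(1_000)]
print(sum(r["defective"] for r in stream), "defective records generated")
```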

A Study on the Integration of Airborne LiDAR and UAV Data for High-resolution Topographic Information Construction of Tidal Flat (갯벌지역 고해상도 지형정보 구축을 위한 항공 라이다와 UAV 데이터 통합 활용에 관한 연구)

  • Kim, Hye Jin;Lee, Jae Bin;Kim, Yong Il
    • Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography
    • /
    • v.38 no.4
    • /
    • pp.345-352
    • /
    • 2020
  • To preserve and restore tidal flats and prevent safety accidents, it is necessary to construct tidal flat topographic information that includes the exact location and shape of tidal creeks. In tidal flats, where field surveying is difficult to apply, airborne LiDAR surveying can provide accurate terrain data over a wide area, while UAV (Unmanned Aerial Vehicle) surveying can economically provide relatively high-resolution data. In this study, we proposed a methodology to generate high-resolution topographic information of tidal flats effectively by integrating airborne LiDAR and UAV point clouds. For this purpose, automatic ICP (Iterative Closest Point) registration between the two datasets was conducted, and tidal creeks were extracted by applying the CSF (Cloth Simulation Filtering) algorithm. We then integrated the high-density UAV data for the tidal creeks with the airborne LiDAR data for the flat ground. A DEM (Digital Elevation Model) and tidal flat area and depth were generated from the integrated data to construct high-resolution topographic information for large-scale tidal flat map creation. As a result, the UAV data was registered without GCPs (Ground Control Points), and integrated data containing detailed topographic information on the tidal creeks, with a relatively small data size, was generated.
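
A minimal sketch of the ICP registration step, using Open3D as a stand-in implementation (file names and the correspondence tolerance are assumptions; the CSF filtering and creek extraction steps are not shown):

```python
import numpy as np
import open3d as o3d

# File names are placeholders; both clouds are assumed to be roughly
# georeferenced already, so the identity is a workable initial guess.
lidar = o3d.io.read_point_cloud("airborne_lidar.ply")    # target (wide area)
uav = o3d.io.read_point_cloud("uav_photogrammetry.ply")  # source (high res)

result = o3d.pipelines.registration.registration_icp(
    uav, lidar,
    max_correspondence_distance=1.0,  # metres, an assumed tolerance
    init=np.eye(4),                   # identity initial transform
    estimation_method=o3d.pipelines.registration.TransformationEstimationPointToPoint(),
)
uav.transform(result.transformation)  # bring UAV data into the LiDAR frame
print("fitness:", result.fitness)
```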

Host Interface Design for TCP/IP Hardware Accelerator (TCP/IP Hardware Accelerator를 위한 Host Interface의 설계)

  • Jung, Yeo-Jin;Lim, Hye-Sook
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.30 no.2B
    • /
    • pp.1-10
    • /
    • 2005
  • TCP/IP protocols have traditionally been implemented in software running on the CPU of end systems. With the increased demand for fast protocol processing, it is necessary to implement the protocols in hardware, and the Host Interface is responsible for communication between the external CPU and the hardware blocks of the TCP/IP implementation. The Host Interface follows the AMBA AHB specification for communication with the external world. For control flow, the Host Interface behaves as a slave of the AMBA AHB: using internal Command/Status registers, it receives commands from the CPU and transfers hardware status and header information to the CPU. For data flow, on the other hand, the Host Interface behaves as a master. Data flow has two directions, receive flow and transmit flow. In receive flow, using an internal RxFIFO, the Host Interface reads data from the UDP FIFO or TCP buffer and transfers it to external RAM for the CPU to read. In transmit flow, the Host Interface reads data from external RAM and transfers it to the UDP buffer or TCP buffer through an internal TxFIFO; the TCP/IP hardware blocks then generate packets from the data and transmit them. The Buffer Descriptor is one of the Command/Status registers, and the information stored in it is used for external RAM access. Several test cases were designed to verify the TCP/IP functions. The Host Interface was synthesized using 0.18-micron technology, resulting in 173 K gates including the Command/Status registers and internal FIFOs.
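
As a minimal sketch of how a driver might view the Buffer Descriptor register described above, with an entirely hypothetical field layout (the actual layout is defined by the paper's hardware design and is not reproduced here):

```python
import ctypes

class BufferDescriptor(ctypes.LittleEndianStructure):
    """Hypothetical register image; real field layout is design-specific."""
    _fields_ = [
        ("buf_addr", ctypes.c_uint32),      # external RAM address
        ("length", ctypes.c_uint32, 16),    # payload length in bytes
        ("direction", ctypes.c_uint32, 1),  # 0 = receive, 1 = transmit
        ("owner", ctypes.c_uint32, 1),      # 0 = CPU owns, 1 = hardware owns
        ("reserved", ctypes.c_uint32, 14),
    ]

bd = BufferDescriptor(buf_addr=0x2000_0000, length=1500, direction=1, owner=1)
print(bytes(bd).hex())  # 8-byte image a driver would write over AHB
```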

Data Mining Algorithm Based on Fuzzy Decision Tree for Pattern Classification (퍼지 결정트리를 이용한 패턴분류를 위한 데이터 마이닝 알고리즘)

  • Lee, Jung-Geun;Kim, Myeong-Won
    • Journal of KIISE:Software and Applications
    • /
    • v.26 no.11
    • /
    • pp.1314-1323
    • /
    • 1999
  • With the extended use of computers, it has become easy to generate and collect data, creating a need to acquire useful knowledge from data automatically. In data mining, the acquired knowledge needs to be both accurate and comprehensible. In this paper, we propose an efficient fuzzy rule generation algorithm based on a fuzzy decision tree for data mining. We combine the comprehensibility of rules generated by decision-tree methods such as ID3 and C4.5 with the expressive power of fuzzy sets. In particular, fuzzy rules allow us to effectively classify patterns with non-axis-parallel decision boundaries, which are difficult to handle with attribute-based classification methods. In our algorithm, we first determine an appropriate set of membership functions for each attribute of the data using histogram analysis. Given the set of membership functions, we then construct a fuzzy decision tree in a manner similar to ID3 and C4.5. We also apply a genetic algorithm to tune the initial set of membership functions. We have experimented with several benchmark datasets, including the IRIS data, the Wisconsin breast cancer data, and the credit screening data. The results show that our method is more efficient in performance and rule comprehensibility than other methods, including C4.5.
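
A minimal sketch of the first step, deriving triangular membership functions for one attribute from its histogram (the bin count, number of functions, and widths are assumptions; the paper additionally tunes the functions with a genetic algorithm):

```python
import numpy as np

def histogram_membership_functions(values: np.ndarray, n_funcs: int = 3):
    """Place triangular membership centres at the most populated bins."""
    counts, edges = np.histogram(values, bins=20)
    bin_centres = (edges[:-1] + edges[1:]) / 2
    centres = np.sort(bin_centres[np.argsort(counts)[-n_funcs:]])
    width = (values.max() - values.min()) / n_funcs

    def membership(x: float) -> np.ndarray:
        """Degree of membership of x in each triangular fuzzy set."""
        return np.clip(1.0 - np.abs(x - centres) / width, 0.0, 1.0)

    return centres, membership

data = np.random.normal(5.0, 1.5, size=1_000)  # one attribute of a dataset
centres, mu = histogram_membership_functions(data)
print(centres, mu(5.0))
```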