• Title/Summary/Keyword: label generate

An improved multiple-vertical-line-element model for RC shear walls using ANN

  • Xiaolei Han;Lei Zhang;Yankun Qiu;Jing Ji
    • Earthquakes and Structures / v.25 no.5 / pp.385-398 / 2023
  • The parameters of the multiple-vertical-line-element model (MVLEM) of reinforced concrete (RC) shear walls are often determined empirically, which causes large simulation errors. To improve the simulation accuracy of the MVLEM for RC shear walls, this paper proposes a novel method to determine the MVLEM parameters using an artificial neural network (ANN). First, a comprehensive database containing 193 shear wall specimens with complete parameter information was established, and the shear walls were simulated using the classic MVLEM. The average simulation errors of the lateral force and drift at the peak and ultimate points on the skeleton curves were approximately 18%. Second, the MVLEM parameters were manually optimized to minimize the simulation error, and the optimal MVLEM parameters were used as the label data for training the ANN. Then, the trained ANN was used to generate the MVLEM parameters of the collected shear walls. The results show that the simulation error of the predicted MVLEM was reduced from the original 18% to less than 13%. In particular, the responses generated by the predicted MVLEM are closer to the experimental results for the testing set, which contains both flexure-controlled and shear-controlled shear wall specimens. This indicates that establishing the MVLEM for RC shear walls using an ANN is feasible and promising, and that the predicted MVLEM substantially improves the simulation accuracy.

Malicious Traffic Classification Using Mitre ATT&CK and Machine Learning Based on UNSW-NB15 Dataset (마이터 어택과 머신러닝을 이용한 UNSW-NB15 데이터셋 기반 유해 트래픽 분류)

  • Yoon, Dong Hyun;Koo, Ja Hwan;Won, Dong Ho
    • KIPS Transactions on Software and Data Engineering / v.12 no.2 / pp.99-110 / 2023
  • This study proposes a classification of malicious network traffic using a cyber threat framework (Mitre ATT&CK) and machine learning to solve the real-time traffic detection problems faced by current security monitoring systems. We mapped a network traffic dataset called UNSW-NB15 to the Mitre ATT&CK framework to transform its labels, and generated the final dataset through rare class processing. After training several boosting-based ensemble models on the final dataset, we evaluated how these models classify network traffic using various performance metrics. Based on the F1 score, XGBoost with no rare class processing performed best in the multi-class traffic environment. Machine learning ensemble models built on Mitre ATT&CK label conversion and oversampling differ from existing studies, but they are limited by (1) the inability to match labels perfectly when converting between existing datasets and Mitre ATT&CK, and (2) the presence of excessively sparse classes. Nevertheless, CatBoost with B-SMOTE achieved a classification accuracy of 0.9526, and is therefore expected to be able to automatically detect normal/abnormal network traffic.
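
The abstract ranks models by F1 score in a multi-class setting but does not say how per-class scores are averaged. The sketch below assumes macro averaging (an unweighted mean of one-vs-rest F1 scores), which is one common choice when rare classes matter; the function names and the toy labels are illustrative and not taken from the paper.

```python
from collections import Counter


def per_class_f1(y_true, y_pred):
    """Return {label: F1}, computed one-vs-rest for every label seen."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative
    scores = {}
    for label in set(y_true) | set(y_pred):
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 (assumed averaging scheme)."""
    scores = per_class_f1(y_true, y_pred)
    return sum(scores.values()) / len(scores)


# Toy labels, purely illustrative.
y_true = ["normal", "normal", "dos", "dos", "recon"]
y_pred = ["normal", "dos", "dos", "dos", "normal"]
print(per_class_f1(y_true, y_pred))
print(macro_f1(y_true, y_pred))
```

Because every class counts equally under macro averaging, a model that ignores a sparse class (here `recon`) is penalized even when overall accuracy looks acceptable.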

A Method for Twitter Spam Detection Using N-Gram Dictionary Under Limited Labeling (트레이닝 데이터가 제한된 환경에서 N-Gram 사전을 이용한 트위터 스팸 탐지 방법)

  • Choi, Hyeok-Jun;Park, Cheong Hee
    • KIPS Transactions on Software and Data Engineering / v.6 no.9 / pp.445-456 / 2017
  • In this paper, we propose a method to detect spam tweets containing unhealthy information by using an n-gram dictionary under limited labeling. Spam tweets that contain unhealthy information tend to use similar words and sentences. Based on this characteristic, we show that spam tweets can be detected effectively by applying a Naive Bayes classifier that uses n-gram dictionaries constructed from spam tweets and normal tweets. However, constructing an initial training set is very costly because a large amount of data flows through Twitter in real time. There is therefore a need for a spam detection method that can be applied when the initial training set is very small or nonexistent. To solve this problem, we propose a method that generates pseudo-labels by utilizing Twitter's retweet function and uses them to build the initial training set and to update the n-gram dictionary. Experiments on 1.3 million Korean tweets collected from December 1, 2016 to December 7, 2016 show that the proposed method outperforms the compared spam detection methods.
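
As a rough illustration of the n-gram dictionary idea, the sketch below keeps one n-gram count table per class and scores a tweet with add-one-smoothed Naive Bayes. The use of character bigrams, the smoothing constants, and the class and method names are assumptions made for this example. The abstract does not give the rule for deriving pseudo-labels from retweets, so that step is not implemented; `update` simply accepts whatever label it is given.

```python
import math
from collections import Counter


def char_ngrams(text, n=2):
    """Return the overlapping character n-grams of text."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


class NGramNaiveBayes:
    """Hypothetical two-class classifier backed by n-gram dictionaries."""

    def __init__(self, n=2):
        self.n = n
        self.counts = {"spam": Counter(), "normal": Counter()}
        self.docs = Counter()

    def update(self, text, label):
        """Add one (pseudo-)labeled tweet to that label's dictionary."""
        self.counts[label].update(char_ngrams(text, self.n))
        self.docs[label] += 1

    def predict(self, text):
        vocab = set(self.counts["spam"]) | set(self.counts["normal"])
        total_docs = sum(self.docs.values())
        best_label, best_score = None, -math.inf
        for label, grams in self.counts.items():
            # Smoothed log prior plus smoothed log likelihood per n-gram.
            score = math.log((self.docs[label] + 1) / (total_docs + 2))
            total = sum(grams.values())
            for gram in char_ngrams(text, self.n):
                score += math.log((grams[gram] + 1) / (total + len(vocab) + 1))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


model = NGramNaiveBayes()
model.update("free prize click now", "spam")
model.update("click free prize", "spam")
model.update("lunch with friends today", "normal")
model.update("meeting friends for lunch", "normal")
print(model.predict("free prize"))   # spam
print(model.predict("lunch today"))  # normal
```

Because the dictionaries are plain counters, newly pseudo-labeled tweets can be folded in incrementally without retraining from scratch.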

Automatic Generation of Training Character Samples for OCR Systems

  • Le, Ha;Kim, Soo-Hyung;Na, In-Seop;Do, Yen;Park, Sang-Cheol;Jeong, Sun-Hwa
    • International Journal of Contents / v.8 no.3 / pp.83-93 / 2012
  • In this paper, we propose a novel method that automatically generates real character images to familiarize existing OCR systems with new fonts. First, we generate synthetic character images using a simple degradation model. The synthetic data is used to train an OCR engine, and the trained OCR engine is used to recognize and label real character images segmented from ideal document images. Since the OCR engine cannot accurately recognize all real character images, a substring matching method is employed to fix wrongly labeled characters by comparing two strings: one is the string formed by the characters recognized in an ideal document image, and the other is the ordered string of characters that we intend to train and recognize. Based on our method, we build a system that automatically generates the 2350 most common Korean characters and 117 alphanumeric characters from new fonts. The ideal document images used in the system are postal envelope images with characters printed in ascending order of their codes. The proposed system achieved a labeling accuracy of 99%. We therefore believe that our system is effective in facilitating the generation of numerous character samples to improve the recognition rate of existing OCR systems on fonts for which they have never been trained.
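
The label-correction step compares the recognized string with the known ordered string. The abstract does not describe the substring matching procedure itself, so the sketch below substitutes Python's `difflib.SequenceMatcher` as a stand-in aligner: matching runs anchor the alignment, and a mismatched run is overwritten with the expected characters only when both sides have the same length. The function name and sample strings are hypothetical.

```python
from difflib import SequenceMatcher


def correct_labels(recognized, expected):
    """Fix misrecognized labels using the known character order.

    recognized: string of OCR output labels, one per segmented image.
    expected:   the ordered string the document is known to contain.
    """
    corrected = list(recognized)
    matcher = SequenceMatcher(None, recognized, expected, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        # Only relabel when the mismatched runs align one-to-one;
        # insertions and deletions are left untouched in this sketch.
        if tag == "replace" and (i2 - i1) == (j2 - j1):
            corrected[i1:i2] = expected[j1:j2]
    return "".join(corrected)


print(correct_labels("ABCOEFGH", "ABCDEFGH"))  # ABCDEFGH
```

Leaving unequal-length mismatches alone is a conservative choice: those usually indicate a segmentation error rather than a wrong label, and relabeling them would shift every following character.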

Improvement of colored thread algorithm for network reachability test (칼라 스레드 알고리즘을 이용한 네트워크 도달성 검사)

  • Kim, Han-Kyoung;Lee, Kwang-Hui
    • Journal of Internet Computing and Services / v.10 no.5 / pp.27-32 / 2009
  • The colored thread algorithm, originally proposed for label switching networks, needs to be modified for packet switching networks. This paper recommends adding a merged state to the three existing states (null, colored, and transparent), which result from the extend, rewind, stall, withdraw, and merge events. The original colored thread algorithm generates a new thread and extends it downstream with an unknown hop count when a thread revisits a node it has already visited. This paper also suggests having the source node rewind the thread in the downstream direction, instead of having the revisited node rewind it in the upstream direction. If a node receives multiple threads with the same forwarding equivalence class, it first checks whether the hop counts are in ascending order. If they are, the threads are merged. Otherwise, the later thread is stalled until the former thread's color changes to transparent or the former thread is removed. This idea removes the effort of generating a new thread with an unknown hop count.

Sub Oriented Histograms of Local Binary Patterns for Smoke Detection and Texture Classification

  • Yuan, Feiniu;Shi, Jinting;Xia, Xue;Yang, Yong;Fang, Yuming;Wang, Rui
    • KSII Transactions on Internet and Information Systems (TIIS) / v.10 no.4 / pp.1807-1823 / 2016
  • Local Binary Pattern (LBP) and its variants have powerful discriminative capabilities, but most of them consider each LBP code independently. In this paper, we propose sub oriented histograms of LBP for smoke detection and image classification. We first extract LBP codes from an image, compute the gradient of the LBP codes, and then calculate sub oriented histograms to capture the spatial relations of the LBP codes. Since an LBP code is just a label without any numerical meaning, we use the Hamming distance instead of the Euclidean distance to estimate the gradient of LBP codes. We propose to use two coordinate systems to compute two orientations, which are quantized into discrete bins. For each pair of the two discrete orientations, we generate a sub LBP code map from the original LBP code map, and compute sub oriented histograms for all sub LBP code maps. Finally, all the sub oriented histograms are concatenated to form a robust feature vector, which is fed into an SVM for training and classification. Experiments show that our approach not only outperforms existing methods in smoke detection, but also performs well in texture classification.
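
The point that an LBP code is a label rather than a magnitude can be made concrete with a small sketch. It computes a basic 8-neighbor LBP code for a 3x3 patch and compares codes by Hamming distance. The bit ordering (clockwise from the top-left neighbor) and the `>=` threshold are conventions assumed here; the paper's gradient and histogram steps are not reproduced.

```python
def lbp_code(patch):
    """8-bit LBP code of a 3x3 patch given as a list of three rows."""
    center = patch[1][1]
    # Neighbors clockwise from the top-left corner (assumed ordering).
    neighbors = [patch[0][0], patch[0][1], patch[0][2], patch[1][2],
                 patch[2][2], patch[2][1], patch[2][0], patch[1][0]]
    code = 0
    for bit, value in enumerate(neighbors):
        if value >= center:
            code |= 1 << bit
    return code


def hamming(a, b):
    """Number of bit positions in which two LBP codes differ."""
    return bin(a ^ b).count("1")


checker = [[9, 1, 9], [1, 5, 1], [9, 1, 9]]
bright = [[9, 9, 9], [9, 5, 9], [9, 9, 9]]
print(lbp_code(checker), lbp_code(bright))             # 85 255
print(hamming(lbp_code(checker), lbp_code(bright)))    # 4

# Codes 127 and 128 are numerically adjacent, yet every bit differs.
print(abs(127 - 128), hamming(127, 128))               # 1 8
```

The last line shows why a numerical difference would be misleading here: two codes that differ by 1 as integers can describe completely opposite neighborhood patterns.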

A Study of Rapid Prototyping Based on GOMS Model (GOMS 모델을 기반으로 한 Rapid Prototyping에 관한 연구)

  • Cha, Yeon-Joo;Jo, Sung-Sik;Myung, Ro-Hae
    • IE interfaces / v.24 no.1 / pp.1-7 / 2011
  • The purpose of this research was to develop an integrated interface for usability testing of systems or products during the design process. The interface can automatically create GOMS models, which predict human task performance, and can generate GOMS models that interact with prototype interfaces. It can also effectively manage design information and usability test results so that they can be applied to new product or system designs. As a result, usability tests of products or system prototypes can be performed more effectively, with less time and effort. For usability tests, we built the integrated interface on the GOMS model using LabVIEW, and constructed the system so that it can be linked to the GOMS model. With this integrated interface, the menu structure of a mobile phone can be constructed easily: the user can set the desired depth and breadth, change the size and label of each button, and define the path to the goal. Experiments can then be performed using the designed menu structure, and the GOMS model's predictions are presented together with the actual task times. In addition, the values of the GOMS operators can be set to any value the user wants. The optimal layout and button size can be determined by comparing numerous menu structures, and the user can choose between the GOMS model and empirical data as the usability test method. Using this integrated interface, time and cost can be saved and the optimal menu structure can be found easily.

Deep-learning based SAR Ship Detection with Generative Data Augmentation (영상 생성적 데이터 증강을 이용한 딥러닝 기반 SAR 영상 선박 탐지)

  • Kwon, Hyeongjun;Jeong, Somi;Kim, SungTai;Lee, Jaeseok;Sohn, Kwanghoon
    • Journal of Korea Multimedia Society / v.25 no.1 / pp.1-9 / 2022
  • Ship detection in synthetic aperture radar (SAR) images is an important application in marine monitoring for the military and civilian domains. Over the past decade, object detection has made significant progress with the development of convolutional neural networks (CNNs) and large labeled datasets. However, because SAR images are difficult to collect and label, training CNNs for SAR ship detection remains challenging. To overcome this problem, some methods have employed conventional data augmentation techniques such as flipping, cropping, and affine transformation, but these are insufficient to achieve performance that is robust to the wide variety of ship types. In this paper, we present a novel and effective approach for deep SAR ship detection that exploits label-rich Electro-Optical (EO) images. The proposed method consists of two components: a data augmentation network and a ship detection network. First, we train the data augmentation network, based on a conditional generative adversarial network (cGAN), to generate additional SAR images from EO images. Since it is trained using unpaired EO and SAR images, we impose a cycle-consistency loss to preserve the structural information while translating the characteristics of the images. After training the data augmentation network, we use the augmented dataset, consisting of real and translated SAR images, to train the ship detection network. The experimental results include a qualitative evaluation of the translated SAR images and a comparison of the detection performance of networks trained with the non-augmented and augmented datasets, which demonstrates the effectiveness of the proposed framework.
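
The cycle-consistency idea can be illustrated without any deep learning library. The sketch below assumes the usual two-direction formulation with an L1 penalty: an image translated EO to SAR and back should match the original, and likewise for SAR to EO and back. The abstract names only the loss, so the reverse mapping, the L1 choice, and the toy "translators" (simple scalings standing in for networks) are all assumptions for illustration.

```python
def l1(a, b):
    """Mean absolute difference between two equally sized images."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def cycle_consistency_loss(x_eo, y_sar, eo_to_sar, sar_to_eo):
    """L1 reconstruction error after a round trip in each direction."""
    forward = l1(sar_to_eo(eo_to_sar(x_eo)), x_eo)     # EO -> SAR -> EO
    backward = l1(eo_to_sar(sar_to_eo(y_sar)), y_sar)  # SAR -> EO -> SAR
    return forward + backward


# Toy stand-ins for the two translation networks (flattened images).
def eo_to_sar(img):
    return [2.0 * v for v in img]


def exact_inverse(img):
    return [v / 2.0 for v in img]


def biased_inverse(img):
    return [v / 2.0 + 0.1 for v in img]


x_eo, y_sar = [1.0, 2.0, 3.0], [2.0, 4.0]
print(cycle_consistency_loss(x_eo, y_sar, eo_to_sar, exact_inverse))
print(cycle_consistency_loss(x_eo, y_sar, eo_to_sar, biased_inverse))
```

With an exact inverse the loss is zero; any drift in the round trip raises it, which is what pushes the translators to preserve image structure when no paired EO/SAR examples are available.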

Automatic Training Corpus Generation Method of Named Entity Recognition Using Knowledge-Bases (개체명 인식 코퍼스 생성을 위한 지식베이스 활용 기법)

  • Park, Youngmin;Kim, Yejin;Kang, Sangwoo;Seo, Jungyun
    • Korean Journal of Cognitive Science / v.27 no.1 / pp.27-41 / 2016
  • Named entity recognition classifies elements in text into predefined categories and is used in various fields that receive natural language input. In this paper, we propose a method that automatically generates a named entity training corpus using knowledge bases. We apply two different methods to generate the corpus, depending on the knowledge base. One method attaches named entity labels to text data using Wikipedia. The other crawls data from the web and labels named entities in the web text using Freebase. We conduct two experiments to evaluate the quality of the corpus and the proposed method for automatically generating a named entity recognition corpus. We randomly extract sentences from the two corpora, called the Wikipedia corpus and the Web corpus, and label them to validate both automatically labeled corpora. We also report the performance of a named entity recognizer trained on the corpus generated by the proposed method. The results show that the proposed method adapts well to new corpora that reflect diverse sentence structures and the newest entities.
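
One simple way to picture knowledge-base labeling is gazetteer matching. The sketch below tags tokens with BIO labels by greedy longest-match lookup in a small dictionary. The dictionary stands in for entries that would come from Wikipedia or Freebase; the entity types, the `max_len` limit, and the function name are assumptions, and the paper's actual matching procedure may differ.

```python
def label_with_gazetteer(tokens, gazetteer, max_len=4):
    """Assign BIO labels by greedy longest-match dictionary lookup.

    gazetteer: {"surface form": "TYPE"}, a stand-in for a knowledge base.
    """
    labels = ["O"] * len(tokens)
    i = 0
    while i < len(tokens):
        matched = False
        # Try the longest candidate span first, then shorter ones.
        for span in range(min(max_len, len(tokens) - i), 0, -1):
            phrase = " ".join(tokens[i:i + span])
            if phrase in gazetteer:
                entity_type = gazetteer[phrase]
                labels[i] = "B-" + entity_type
                for k in range(i + 1, i + span):
                    labels[k] = "I-" + entity_type
                i += span
                matched = True
                break
        if not matched:
            i += 1
    return labels


# Hypothetical knowledge-base entries, for illustration only.
kb = {"Barack Obama": "PER", "Seoul": "LOC"}
tokens = "Barack Obama visited Seoul".split()
print(list(zip(tokens, label_with_gazetteer(tokens, kb))))
```

Longest-match-first keeps a multi-token entity from being split into shorter entries that happen to share its first token.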

Generating Rank-Comparison Decision Rules with Variable Number of Genes for Cancer Classification (순위 비교를 기반으로 하는 다양한 유전자 개수로 이루어진 암 분류 결정 규칙의 생성)

  • Yoon, Young-Mi;Bien, Sang-Jay;Park, Sang-Hyun
    • The KIPS Transactions:PartD / v.15D no.6 / pp.767-776 / 2008
  • Microarray technology is used extensively in the field of experimental molecular biology. Microarray experiments generate quantitative expression measurements for thousands of genes simultaneously, which is useful for the phenotype classification of many diseases. One of the two major problems in microarray data classification is that the number of genes exceeds the number of tissue samples. The other is that current methods generate classifiers that are accurate but difficult to interpret. Our paper addresses both problems. We directly integrated individual microarrays with the same biological objectives by transforming each expression value into a rank value within its sample, and generated rank-comparison decision rules with a variable number of genes for cancer classification. Our classifier is an ensemble method that consists of the k top-scoring decision rules. Each rule contains a number of genes, a relationship among the involved genes, and a class label. Current classifiers, which are also ensemble methods, likewise consist of k top-scoring decision rules, but they fix the number of genes in each rule to a pair or a triple. In this paper, we generalize the number of genes involved in each rule, which ranges from 2 to N. Generalizing the number of genes increases the robustness and reliability of the classifier for the class prediction of an independent sample. In addition, our classifier is readily interpretable, is accurate with a small number of genes, and shows potential for use in a clinical setting.
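
A small sketch can make the rank transformation and rule structure concrete. It converts each sample's expression values to within-sample ranks and evaluates rules of the form "these genes appear in strictly increasing rank order, so predict this class". That specific relationship, the majority vote across rules, and all names and values are assumptions; the abstract states only that a rule holds a set of genes, a relationship among them, and a class label.

```python
from collections import Counter


def to_ranks(sample):
    """Map {gene: expression} to {gene: rank}, with rank 1 the lowest."""
    ordered = sorted(sample, key=sample.get)
    return {gene: rank for rank, gene in enumerate(ordered, start=1)}


def rule_fires(ranks, genes):
    """True if the genes occur in strictly increasing rank order."""
    return all(ranks[a] < ranks[b] for a, b in zip(genes, genes[1:]))


def classify(sample, rules, default):
    """Majority vote over the rules that fire (assumed combination)."""
    ranks = to_ranks(sample)
    votes = Counter(label for genes, label in rules if rule_fires(ranks, genes))
    return votes.most_common(1)[0][0] if votes else default


# Hypothetical rules with two or three genes each.
rules = [
    (("g1", "g2", "g3"), "tumor"),
    (("g4", "g1"), "tumor"),
    (("g3", "g2"), "normal"),
]
sample_a = {"g1": 2.0, "g2": 5.5, "g3": 9.1, "g4": 0.3}
sample_b = {"g1": 1.0, "g2": 5.0, "g3": 0.5, "g4": 7.0}
print(classify(sample_a, rules, default="unknown"))  # tumor
print(classify(sample_b, rules, default="unknown"))  # normal
```

Because only the ordering of values within a sample matters, the rules are unaffected by monotonic rescaling between microarray experiments, which is what makes direct integration of separate datasets plausible.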