• Title/Summary/Keyword: Intelligent Data Analysis


Pattern Recognition of Ship Navigational Data Using Support Vector Machine

  • Kim, Joo-Sung;Jeong, Jung Sik
    • International Journal of Fuzzy Logic and Intelligent Systems / v.15 no.4 / pp.268-276 / 2015
  • A ship's sailing route or plan is determined by the master as the decision maker of the vessel, and depends on the characteristics of the navigational environment and the conditions of the ship. The trajectory, which appears as a result of the ship's navigation, is monitored and stored by a Vessel Traffic Service center, and is used for an analysis of the ship's navigational pattern and risk assessment within a particular area. However, such an analysis is performed in the same manner, despite the different navigational environments between coastal areas and the harbor limits. The navigational environment within the harbor limits changes rapidly owing to construction of the port facilities, dredging operations, and so on. In this study, a support vector machine was used for processing and modeling the trajectory data. A K-fold cross-validation and a grid search were used for selecting the optimal parameters. A complicated traffic route similar to the circumstances of the harbor limits was constructed for a validation of the model. A group of vessels was composed, each vessel of which was given various speed and course changes along a specified route. As a result of the machine learning, the optimal route and voyage data model were obtained. Finally, the model was presented to Vessel Traffic Service operators to detect any anomalous vessel behaviors. Using the proposed data modeling method, we intend to support the decision-making of Vessel Traffic Service operators in terms of navigational patterns and their characteristics.
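The parameter selection step mentioned in this abstract (K-fold cross-validation combined with a grid search) can be sketched in plain Python. This is only an illustration of the general technique, not the authors' implementation: the SVM training itself is replaced by a hypothetical `fit_score` callback, and the function names, the contiguous unshuffled folds, and the parameter names `C` and `gamma` are all assumptions.

```python
from itertools import product


def k_fold_splits(n_samples, k):
    """Yield (train_idx, test_idx) index lists for k contiguous folds."""
    # Spread the remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


def grid_search(param_grid, fit_score, n_samples, k=5):
    """Return (best_params, best_mean_score) over all parameter combinations.

    fit_score(params, train_idx, test_idx) is a hypothetical callback that
    trains a model on the training indices and returns a score (higher is
    better) measured on the test indices.
    """
    names = sorted(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        scores = [fit_score(params, tr, te) for tr, te in k_fold_splits(n_samples, k)]
        mean_score = sum(scores) / len(scores)
        if mean_score > best_score:
            best_params, best_score = params, mean_score
    return best_params, best_score
```

In practice `fit_score` would wrap whatever SVM library is in use; the search simply keeps the parameter combination with the highest mean score across folds.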

Analysis of Rear-end Collision Risks Using Weigh-in-Motion Data (고속도로 Weigh-in-Motion(WIM) 이벤트 자료를 활용한 후미추돌 위험도 분석 기법)

  • Oh, Min Soo;Park, Hyeon Jin;Oh, Cheol;Park, Soon Min
    • The Journal of The Korea Institute of Intelligent Transport Systems / v.17 no.2 / pp.152-167 / 2018
  • A high-speed weigh-in-motion (WIM) system collects the travel speed and load of individual vehicles, which can be used in a variety of ways for traffic surveillance. However, high-speed WIM data are difficult to apply directly to safety analysis because the raw data are point measurements. To overcome this limitation, this paper proposes a method for calculating the conflict rate and impulse severity based on surrogate safety measures derived from detection time, detection speed, vehicle length, vehicle type, and vehicle weight. The method makes it possible to analyze and evaluate the risk of rear-end collisions in freeway traffic. In addition, this study is expected to serve as a foundation for identifying crash risks and developing policies to enhance traffic safety on freeways.
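To illustrate how point measurements can yield a surrogate safety measure, the sketch below computes a time-to-collision (TTC) for consecutive vehicles and a simple conflict rate. It is not the paper's formulation: TTC is a commonly used surrogate measure chosen here for illustration, and the constant-speed assumption, the record fields (`t`, `speed`, `length`), the interpretation of detection time as the moment the vehicle front crosses the sensor, and the threshold value are all assumptions. Impulse severity, which would involve vehicle weight, is not modeled.

```python
def time_to_collision(leader, follower):
    """Return TTC in seconds, or None if the follower is not closing in.

    Each record is a dict with 't' (detection time, s), 'speed' (m/s) and
    'length' (m). Both vehicles are assumed to keep a constant speed.
    """
    headway = follower["t"] - leader["t"]
    # Distance from the follower's front to the leader's rear at the moment
    # the follower reaches the sensor.
    gap = leader["speed"] * headway - leader["length"]
    closing_speed = follower["speed"] - leader["speed"]
    if closing_speed <= 0 or gap <= 0:
        # Not closing in, or an implausible overlap in the data.
        return None
    return gap / closing_speed


def conflict_rate(events, ttc_threshold=3.0):
    """Share of consecutive vehicle pairs whose TTC is below the threshold.

    events must be ordered by detection time and belong to the same lane.
    """
    pairs = list(zip(events, events[1:]))
    if not pairs:
        return 0.0
    conflicts = 0
    for leader, follower in pairs:
        ttc = time_to_collision(leader, follower)
        if ttc is not None and ttc < ttc_threshold:
            conflicts += 1
    return conflicts / len(pairs)
```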

Microarray Data Retrieval Using Fuzzy Signature Sets (퍼지 시그너쳐 집합을 이용한 마이크로어레이 데이터 검색)

  • Lee, Sun-A;Lee, Keon-Myung;Ryu, Keun-Ho
    • Journal of the Korean Institute of Intelligent Systems / v.19 no.4 / pp.545-549 / 2009
  • Microarray data sets can contain thousands of gene expression levels and are regarded as an important source from which meaningful patterns can be extracted for further analysis in biological studies. It is sometimes necessary to retrieve, in an effective way, the specific genes or samples that interest an analyst. This paper is concerned with a method that uses fuzzy signature sets to pick out genes or samples that satisfy complicated constraints as well as simple ones. Fuzzy signatures are an extension of vector-valued fuzzy sets in which the elements of the vector may themselves be vectors. Fuzzy signature sets are similar to fuzzy signatures except that their leaf elements are fuzzy sets defined on the interval [0,1]. This paper introduces an extension of fuzzy signature sets that specifies aggregation operators at each internal node and comparison operators for aggregation. It also shows how to use the extended fuzzy signature sets in microarray data retrieval and gives some examples of their usage.

Sparse Document Data Clustering Using Factor Score and Self Organizing Maps (인자점수와 자기조직화지도를 이용한 희소한 문서데이터의 군집화)

  • Jun, Sung-Hae
    • Journal of the Korean Institute of Intelligent Systems / v.22 no.2 / pp.205-211 / 2012
  • Retrieved documents have to be transformed into a proper data structure before statistical and machine learning clustering algorithms can be applied. A popular data structure for document clustering is the document-term matrix, which holds the frequency of each term in each document. This matrix suffers from a sparsity problem because most of its entries are 0, and this sparseness degrades clustering performance. This research therefore uses factor scores obtained by factor analysis to address the sparsity problem in document clustering. The document-term matrix is transformed into a document-factor score matrix, which is then used as the input data for document clustering. To compare clustering performance between the document-term matrix and the document-factor score matrix, this research applies both types of matrix to self-organizing map (SOM) clustering.
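The sparsity problem described in this abstract is easy to see on a toy corpus. The sketch below builds a document-term matrix and measures the share of zero entries; the whitespace tokenization and the function names are assumptions for illustration, and the factor analysis step itself is not shown.

```python
from collections import Counter


def document_term_matrix(documents):
    """Return (vocabulary, matrix) where matrix[i][j] counts term j in document i."""
    tokenized = [doc.lower().split() for doc in documents]
    vocabulary = sorted({term for tokens in tokenized for term in tokens})
    matrix = []
    for tokens in tokenized:
        counts = Counter(tokens)
        # Counter returns 0 for terms that do not occur in this document.
        matrix.append([counts[term] for term in vocabulary])
    return vocabulary, matrix


def sparsity(matrix):
    """Fraction of entries in the matrix that are zero."""
    cells = sum(len(row) for row in matrix)
    if cells == 0:
        return 0.0
    zeros = sum(1 for row in matrix for value in row if value == 0)
    return zeros / cells
```

Even with three short documents, more than half of the entries are zero; with a realistic vocabulary of thousands of terms the proportion is far higher, which is the motivation for replacing term columns with a small number of factor scores.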

Candidate Marker Identification from Gene Expression Data with Attribute Value Discretization and Negation (속성값 이산화 및 부정값 허용을 하는 의사결정트리 기반의 유전자 발현 데이터의 마커 후보 식별)

  • Lee, Kyung-Mi;Lee, Keon-Myung
    • Journal of the Korean Institute of Intelligent Systems / v.21 no.5 / pp.575-580 / 2011
  • With growing expectations for personalized medicine, it is becoming important to analyze medical information from a molecular biology perspective. Gene expression data are one representative kind of data showing the microscopic phenomena of biological activities. In gene expression data analysis, one major concern is to identify markers that can be used to predict disease occurrence, progression, or recurrence at the molecular level. Existing candidate marker identification methods mainly depend on statistical hypothesis tests. This paper proposes a search method based on decision tree induction to identify candidate markers that consist of multiple genes. The proposed method discretizes numeric expression levels into three categorical values and allows a candidate marker's genes to be expressed by the negation of a categorical value as well as by the value itself. It is desirable for a marker to include a limited number of genes; hence, the method is designed to find candidate markers with a restricted number of genes.
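A minimal sketch of the two ideas named in this abstract, three-level discretization and conditions that may be negated, is given below. The category names (`low`, `mid`, `high`), the fixed cut-off values, and the `(gene, category, negated)` condition format are assumptions made for illustration; the abstract does not specify them, and the decision tree search itself is not shown.

```python
def discretize(value, low_cut, high_cut):
    """Map a numeric expression level to one of three categories."""
    if value < low_cut:
        return "low"
    if value > high_cut:
        return "high"
    return "mid"


def matches(sample, conditions):
    """Check a discretized sample against a candidate marker.

    sample maps gene name -> category. conditions is a list of
    (gene, category, negated) tuples; when negated is True the condition
    reads "gene is NOT category".
    """
    for gene, category, negated in conditions:
        is_equal = sample[gene] == category
        # A plain condition needs equality; a negated one needs inequality.
        if is_equal == negated:
            return False
    return True


# Hypothetical expression levels and cut-offs.
raw = {"geneA": 2.3, "geneB": -0.2, "geneC": -1.8}
sample = {gene: discretize(level, -1.0, 1.0) for gene, level in raw.items()}
marker = [("geneA", "high", False), ("geneB", "low", True)]
is_match = matches(sample, marker)
```

Allowing negation lets a two-gene marker such as "geneA is high and geneB is not low" cover both the `mid` and `high` cases of geneB with a single condition.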

Analysis and Detection Method for Line-shaped Echoes using Support Vector Machine (Support Vector Machine을 이용한 선에코 특성 분석 및 탐지 방법)

  • Lee, Hansoo;Kim, Eun Kyeong;Kim, Sungshin
    • Journal of the Korean Institute of Intelligent Systems / v.24 no.6 / pp.665-670 / 2014
  • An SVM is a binary classifier that finds the optimal hyperplane separating training data into two groups. Owing to its remarkable performance, the SVM is applied in various fields such as inductive inference, binary classification, and prediction. It is also a representative black-box model, and methods for analyzing trained SVM classifiers are an active research topic. This paper studies a method that uses the SVM algorithm to automatically detect line-shaped echoes, namely sun strobe echoes and radial interference echoes, because they appear relatively often and disturb the weather forecasting process. Using a spatial clustering method and corrected reflectivity data from the weather radar, the training data are composed of mean reflectivity, size, appearance, centroid altitude, and so forth. The trained SVM classifier is verified with actual occurrences of line-shaped echoes, and its characteristics are analyzed using the decision tree method.

A Selection Method of Implementation Area for Emergency Vehicle Preemption System Using Dispatch Data Analysis (출동현황자료 분석을 통한 재난대비 긴급차량 우선신호제어 시스템 도입지역 선정방안 연구)

  • Sung, Joong Gi;Ha, Dongik
    • The Journal of The Korea Institute of Intelligent Transport Systems / v.15 no.2 / pp.24-35 / 2016
  • Emergency Vehicle Preemption (EVP) is an operation method that helps improve the response conditions of emergency vehicles (EVs); it has not yet been introduced in Korea. Implementing the system requires a step-by-step plan and the selection of a priority area for trial operation. Because a municipality such as Seoul is so large, analyzing the whole area is constrained by time and cost. A quantitative and effective method for selecting the priority area is therefore critical. The aim of this study is to propose a method for selecting the implementation area for an EVP system using dispatch data analysis. This study also determined the priority area for EVP implementation by analyzing dispatch data in Seoul and conducted a simulation to evaluate the effects of implementing EVP.

A Study on Improving Minimum Level of Service for Public Transportation Using Altteul Transport Card Data (알뜰교통카드를 활용한 대중교통 최소서비스 수준 분석 기준 개선 방안 연구)

  • Sangwoo Shim;Junyoung Joung;Kwankyo Oh;Minseok Kim
    • The Journal of The Korea Institute of Intelligent Transport Systems / v.22 no.3 / pp.104-115 / 2023
  • User-centered public transportation services such as DRT and autonomous transit have been provided, but the current minimum level of service for public transportation is evaluated from the operator's perspective because there are no data on users' accessibility to public transportation. This study performed a GRID analysis using Altteul transport card data, which include users' accessibility to public transportation. The analysis showed that accessibility differed within the same dong area. We propose improving the minimum level of service for public transportation so that it is considered from the user's perspective. Applying the proposed method showed that many areas changed to areas that do not satisfy the minimum level of service for public transportation.

A Study of the DSSAD Data Elements Derivation through Autonomous Driving Data Analysis on Expressways (자동차 전용도로 자율주행 데이터 분석을 통한 DSSAD 기록항목 도출)

  • Seunghwa Hyun;Jinwoo Son;Youngchul Oh;Byungyong You
    • The Journal of The Korea Institute of Intelligent Transport Systems / v.23 no.3 / pp.97-106 / 2024
  • The Data Storage System for Automated Driving (DSSAD) is a system that records the driving information of Lv.4 or higher autonomous vehicles; it differs from an EDR, which records vehicle information in emergency situations. Studying DSSAD recordings is important for responding to the various events that may occur in the future commercialization of Lv.4 autonomous vehicles. In this study, we therefore conducted an automated driving demonstration on expressways and analyzed the collected data to derive the recording elements of the DSSAD. During our two-year demonstration of autonomous driving on expressways, we collected and analyzed instances of disengagement. Our findings indicate that 51.6% of disengagements on expressways occurred during lane changes. From the study, we identified DSSAD record elements for analyzing disengagement situations. Furthermore, implications for future research directions in disengagement analysis are presented.

A New Statistical Sampling Method for Reducing Computing time of Machine Learning Algorithms (기계학습 알고리즘의 컴퓨팅시간 단축을 위한 새로운 통계적 샘플링 기법)

  • Jun, Sung-Hae
    • Journal of the Korean Institute of Intelligent Systems / v.21 no.2 / pp.171-177 / 2011
  • Accuracy and computing time are important issues in machine learning. In general, the computing time for data analysis increases in proportion to the size of the given data, so a sampling approach is needed to reduce the size of the training data. However, the accuracy of the constructed model decreases as the data size is reduced. To solve this problem, we propose a new statistical sampling method whose performance is similar to that obtained with the total data. We suggest a rule for selecting the optimal sampling technique according to the given data structure. This paper presents a sampling method that reduces computing time while retaining most of the accuracy, using cluster sampling, stratified sampling, and systematic sampling. We verify the improved performance of the proposed method by comparing accuracy and computing time between the sample data and the total data on objective machine learning data sets.
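The three classical sampling schemes named in this abstract can be sketched as follows. These are textbook versions written for illustration only; the paper's rule for choosing among them according to the data structure is not given in the abstract and is not reproduced here. The function names and parameters are assumptions.

```python
import random


def systematic_sample(items, step, start=0):
    """Take every step-th item, beginning at index start."""
    return items[start::step]


def stratified_sample(items, key, fraction, rng):
    """Sample the same fraction from each stratum defined by key(item)."""
    strata = {}
    for item in items:
        strata.setdefault(key(item), []).append(item)
    sample = []
    for group in strata.values():
        # Keep at least one item per stratum so no class disappears.
        size = max(1, round(len(group) * fraction))
        sample.extend(rng.sample(group, size))
    return sample


def cluster_sample(clusters, n_clusters, rng):
    """Pick n_clusters whole clusters at random and keep all their items."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [item for name in chosen for item in clusters[name]]
```

Passing an explicit `random.Random` instance as `rng` keeps the samples reproducible, which matters when accuracy and computing time are compared between sample data and total data.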