• Title/Summary/Keyword: time domain data

Diagnosis of Valve Internal Leakage for Ship Piping System using Acoustic Emission Signal-based Machine Learning Approach (선박용 밸브의 내부 누설 진단을 위한 음향방출신호의 머신러닝 기법 적용 연구)

  • Lee, Jung-Hyung
    • Journal of the Korean Society of Marine Environment & Safety
    • /
    • v.28 no.1
    • /
    • pp.184-192
    • /
    • 2022
  • Valve internal leakage is caused by damage to the internal parts of the valve, resulting in accidents and shutdowns of the piping system. This study investigated the possibility of a real-time leak detection method using the acoustic emission (AE) signal generated from the piping system during internal leakage of a butterfly valve. Datasets of raw time-domain AE signals were collected and postprocessed for each operation mode of the valve in a systematic manner to develop a data-driven model for the detection and classification of internal leakage by applying machine learning algorithms. The aim of this study was to determine whether leak detection can be treated as a classification problem by applying two classification algorithms: the support vector machine (SVM) and the convolutional neural network (CNN). The results showed different performances across the algorithms and datasets used. The SVM-based binary classification models, based on feature extraction of the data, achieved an overall accuracy of 83% to 90%, while the accuracy of the multi-class model was reduced to 66%. By contrast, the CNN-based classification model achieved an accuracy of 99.85%, superior to that of all the SVM-based models. The results revealed that the SVM classification model requires effective feature extraction of the AE signals to improve multi-class accuracy, whereas CNN-based classification is a promising approach for detecting both leakage and valve opening, provided the processor performance does not degrade.
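
The SVM branch described above can be sketched as follows: simple time-domain features are extracted from windowed AE signals and fed to a binary leak/no-leak classifier. This is a minimal illustration under assumed feature choices (RMS, peak, crest factor, kurtosis, skewness) and placeholder data, not the authors' exact pipeline.

```python
import numpy as np
from scipy.stats import kurtosis, skew
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

def extract_features(window: np.ndarray) -> np.ndarray:
    """Common time-domain AE features for one signal window (assumed set)."""
    rms = np.sqrt(np.mean(window ** 2))
    peak = np.max(np.abs(window))
    crest = peak / rms
    return np.array([rms, peak, crest, kurtosis(window), skew(window)])

# windows: (n_samples, n_points) raw AE segments; labels: 0 = normal, 1 = leak
windows = np.random.randn(200, 4096)        # placeholder data, not measured AE
labels = np.random.randint(0, 2, 200)       # placeholder labels

X = np.vstack([extract_features(w) for w in windows])
X_tr, X_te, y_tr, y_te = train_test_split(X, labels, test_size=0.3, random_state=0)

clf = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
clf.fit(X_tr, y_tr)
print("test accuracy:", clf.score(X_te, y_te))
```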

Online Information Sources of Coronavirus Using Webometric Big Data (코로나19 사태와 온라인 정보의 다양성 연구 - 빅데이터를 활용한 글로벌 접근법)

  • Park, Han Woo;Kim, Ji-Eun;Zhu, Yu-Peng
    • Journal of the Korea Academia-Industrial cooperation Society
    • /
    • v.21 no.11
    • /
    • pp.728-739
    • /
    • 2020
  • Using webometric big data, this study examines the diversity of online information sources about the novel coronavirus causing the COVID-19 pandemic, focusing on the roughly 28 countries with confirmed coronavirus cases in February 2020. The online visibility of Australia, Canada, and Italy was the highest, as they produced the most relevant information. There was a statistically significant correlation between the hit counts per country and the frequency of visits to the domains acting as information channels. Interestingly, Japan, China, and Singapore, which had large numbers of confirmed cases at the time, were providing web data related to the novel coronavirus. Online sources were classified using an N-tuple helix model, which showed that government agencies were the largest suppliers of coronavirus information in cyberspace. Furthermore, the two-mode network technique revealed that media companies, university hospitals, and public healthcare centers had taken a positive attitude toward the online circulation of coronavirus research and epidemic-prevention information. Semantic network analysis showed that health, school, home, and public had high centrality values, meaning that people were concerned not only with personal prevention rules during the outbreak but also with plans for coping with the disruptions to daily life and work that it caused.
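
A minimal sketch of the semantic-network step: build a word co-occurrence graph and rank terms by degree centrality, echoing the finding that health, school, home, and public scored highest. The toy documents below are placeholders for the study's webometric corpus, and degree centrality is one plausible choice among the centrality measures the authors may have used.

```python
import itertools
import networkx as nx

# Each "document" is a tokenized text; these are illustrative placeholders.
docs = [
    ["health", "school", "home", "public"],
    ["health", "public", "prevention"],
    ["school", "home", "closure"],
]

G = nx.Graph()
for words in docs:
    for u, v in itertools.combinations(set(words), 2):
        w = G.get_edge_data(u, v, {"weight": 0})["weight"]
        G.add_edge(u, v, weight=w + 1)  # accumulate co-occurrence counts

centrality = nx.degree_centrality(G)
for term, c in sorted(centrality.items(), key=lambda kv: -kv[1])[:4]:
    print(term, round(c, 3))           # highest-centrality terms
```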

Evaluation of MODIS-derived Evapotranspiration According to the Water Budget Analysis (물 수지 분석에 의한 MODIS 위성 기반의 증발산량 평가)

  • Lee, Yeongil;Lee, Junghun;Choi, Minha;Jung, Sungwon
    • Journal of Korea Water Resources Association
    • /
    • v.48 no.10
    • /
    • pp.831-843
    • /
    • 2015
  • This study evaluates the quality of MODIS-derived evapotranspiration estimated by the revised RS-PM algorithm in the Seolmacheon test basin. Latent heat flux measured by the eddy covariance method was used to evaluate the MODIS-derived spatial evapotranspiration; the flux data were gap-filled by three methods (FAO-PM, MDV, and Kalman filter) to quantify daily evapotranspiration. The gap-filled daily evapotranspiration was then used to evaluate the evapotranspiration computed by the revised RS-PM algorithm from MODIS satellite images. For the water budget analysis, we used soil moisture content quantified as the average of the individual soil moisture rates observed by TDR (Time Domain Reflectometry) sensors at each soil depth; the soil moisture variation was calculated from the difference between the initial and final soil moisture content. The results show that the evapotranspiration computed by the revised RS-PM algorithm is much larger than the eddy covariance data gap-filled by the three methods, and that the water budget is not closed. We therefore conclude that the MODIS-derived spatial evapotranspiration does not represent the actual evapotranspiration in Seolmacheon.
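
The water-budget check underlying this evaluation can be illustrated as a residual computation, ET = P - Q - dS, where dS is the change in soil-moisture storage from the TDR profiles. A minimal sketch with placeholder numbers:

```python
import numpy as np

precip = np.array([12.0, 0.0, 3.5, 8.0])   # daily precipitation (mm), placeholder
runoff = np.array([4.0, 1.0, 0.5, 2.0])    # daily discharge (mm), placeholder
soil_moisture = np.array([210.0, 214.0, 211.0, 212.0, 213.0])  # storage (mm)

dS = np.diff(soil_moisture)                 # daily storage change (mm)
et_residual = precip - runoff - dS          # water-budget ET (mm/day)

# If the satellite ET greatly exceeds this residual, the budget does not
# close, which is the discrepancy reported for the revised RS-PM estimates.
print("residual ET (mm/day):", et_residual)
```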

Development of Landslide Detection Algorithm Using Fully Polarimetric ALOS-2 SAR Data (Fully-Polarimetric ALOS-2 자료를 이용한 산사태 탐지 알고리즘 개발)

  • Kim, Minhwa;Cho, KeunHoo;Park, Sang-Eun;Cho, Jae-Hyoung;Moon, Hyoi;Han, Seung-hoon
    • Economic and Environmental Geology
    • /
    • v.52 no.4
    • /
    • pp.313-322
    • /
    • 2019
  • SAR (Synthetic Aperture Radar) remote sensing data is a very useful tool for near-real-time identification of landslide-affected areas, which can extend over large regions after heavy rains or typhoons. This study aims to develop an effective algorithm for automatically delineating landslide areas from polarimetric SAR data acquired after a landslide event. To detect landslides from SAR observations, reducing speckle effects in the estimation of polarimetric SAR parameters and orthorectifying the geometric distortions on sloping terrain are essential processing steps. Experimental analysis showed that the IDAN filter provides a better estimation of the polarimetric parameters, and that orthorectification is best applied after estimating the polarimetric parameters in the slant-range domain. Furthermore, polarimetric entropy was found to be the most appropriate among the various polarimetric parameters. Based on these analyses, we propose an automatic landslide detection algorithm that applies histogram thresholding to the polarimetric parameters with the aid of terrain slope information. The algorithm was applied to ALOS-2 PALSAR-2 data covering landslide areas in Japan triggered by a typhoon in September 2011. Experimental results showed that the landslide areas were successfully identified, with a detection rate of about 82% and a false alarm rate of about 3%.
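
A minimal sketch of the detection rule: histogram thresholding of a polarimetric entropy image masked by terrain slope. Otsu's method stands in for whatever threshold the authors derived, the assumption that landslide scars lower the entropy (bare ground scatters more deterministically than vegetation) is ours, and the arrays and slope cutoff are synthetic placeholders.

```python
import numpy as np
from skimage.filters import threshold_otsu

entropy = np.random.rand(512, 512)        # polarimetric entropy H in [0, 1], placeholder
slope = np.random.rand(512, 512) * 45.0   # terrain slope (degrees), placeholder

t = threshold_otsu(entropy)               # histogram-based threshold
candidate = entropy < t                   # assumed: scars show lower entropy
steep = slope > 15.0                      # restrict to slopes where slides occur

landslide_mask = candidate & steep
print("flagged pixels:", int(landslide_mask.sum()))
```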

Analysis of Characteristics of Clusters of Middle School Students Using K-Means Cluster Analysis (K-평균 군집분석을 활용한 중학생의 군집화 및 특성 분석)

  • Lee, Jaebong
    • Journal of The Korean Association For Science Education
    • /
    • v.42 no.6
    • /
    • pp.611-619
    • /
    • 2022
  • The purpose of this study is to explore the possibility of applying big data analysis to provide appropriate feedback to students using evaluation data in science education, at a time when interest in educational data mining has recently increased. We use the evaluation data of 2,576 students who answered 24 questions of the national assessment of educational achievement, and apply K-means cluster analysis, an unsupervised machine learning method, for clustering. The students were divided into six clusters. Middle-achieving students were spread across more varied clusters than upper- or lower-achieving students. According to the cluster analysis, the most important factor influencing clustering is academic achievement, and each cluster shows different characteristics in terms of content domains, subject competencies, and affective characteristics. Among the affective domains, learning motivation is important in the lower-achieving cluster, while scientific inquiry and problem-solving competency, as well as scientific communication competency, have a major influence among the subject competencies. In the content domain, achievement in motion and energy and in matter is an important factor distinguishing the clusters. As a result, we can provide students with customized feedback for learning based on the characteristics of each cluster. We discuss the implications of these results for science education, such as the potential uses of the results, balanced learning across content domains, enhancement of subject competencies, and improvement of scientific attitudes.
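
The clustering step itself is standard; a minimal sketch with scikit-learn, matching the study's 24 items and six clusters, might look like this. The random item scores are placeholders for the 2,576 students' assessment data.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Placeholder 0/1 item scores: 2,576 students x 24 assessment items.
scores = np.random.randint(0, 2, size=(2576, 24)).astype(float)
X = StandardScaler().fit_transform(scores)

km = KMeans(n_clusters=6, n_init=10, random_state=42).fit(X)
for k in range(6):
    members = km.labels_ == k
    print(f"cluster {k}: n={members.sum()}, mean item score={scores[members].mean():.2f}")
```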

Corporate Bankruptcy Prediction Model using Explainable AI-based Feature Selection (설명가능 AI 기반의 변수선정을 이용한 기업부실예측모형)

  • Moon, Gundoo;Kim, Kyoung-jae
    • Journal of Intelligence and Information Systems
    • /
    • v.29 no.2
    • /
    • pp.241-265
    • /
    • 2023
  • A corporate insolvency prediction model serves as a vital tool for objectively monitoring the financial condition of companies. It enables timely warnings, facilitates responsive actions, and supports the formulation of effective management strategies to mitigate bankruptcy risks and enhance performance. Investors and financial institutions utilize default prediction models to minimize financial losses. As interest in utilizing artificial intelligence (AI) for corporate insolvency prediction grows, extensive research has been conducted in this domain. However, there is an increasing demand for explainable AI models that emphasize interpretability and reliability. The SHAP (SHapley Additive exPlanations) technique has gained significant popularity and has demonstrated strong performance in various applications, but it has limitations such as computational cost, processing time, and scalability as the number of variables grows. This study introduces a novel approach to variable selection that reduces the number of variables by averaging SHAP values computed on bootstrapped data subsets instead of on the entire dataset, aiming to improve computational efficiency while maintaining excellent predictive performance. Using the selected, highly interpretable variables, we train random forest, XGBoost, and C5.0 models, and compare the classification accuracy of an ensemble model generated through soft voting with that of the individual models. The study leverages data from 1,698 Korean light-industrial companies and employs bootstrapping to create distinct data groups. Logistic regression is employed to calculate SHAP values for each data group, and their averages are computed to derive the final SHAP values. The proposed model enhances interpretability and aims to achieve superior predictive performance.
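
A minimal sketch of the proposed selection scheme, under assumed values for the number of resamples and selected variables: fit logistic regression on bootstrap resamples, compute SHAP values on each, average the mean absolute SHAP value per feature, and keep the top-ranked features for the downstream classifiers. The synthetic dataset stands in for the 1,698 companies' financial variables.

```python
import numpy as np
import shap
from sklearn.linear_model import LogisticRegression
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=1698, n_features=30, random_state=0)
n_boot, rng = 20, np.random.default_rng(0)          # 20 resamples: an assumption
importance = np.zeros(X.shape[1])

for _ in range(n_boot):
    idx = rng.integers(0, len(X), len(X))           # bootstrap resample
    Xb, yb = X[idx], y[idx]
    model = LogisticRegression(max_iter=1000).fit(Xb, yb)
    explainer = shap.LinearExplainer(model, Xb)     # SHAP values for a linear model
    sv = explainer.shap_values(Xb)
    importance += np.abs(sv).mean(axis=0)           # mean |SHAP| per feature

importance /= n_boot                                # average over resamples
top_k = np.argsort(importance)[::-1][:10]           # variables for the final models
print("selected feature indices:", top_k)
```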

Mid Frequency Band Reverberation Model Development Using Ray Theory and Comparison with Experimental Data (음선 기반 중주파수 대역 잔향음 모델 개발 및 실측 데이터 비교)

  • Chu, Young-Min;Seong, Woo-Jae;Yang, In-Sik;Oh, Won-Tchon
    • The Journal of the Acoustical Society of Korea
    • /
    • v.28 no.8
    • /
    • pp.740-754
    • /
    • 2009
  • Sound in the ocean is scattered by inhomogeneities of many different kinds, such as the sea surface, the sea bottom, a randomly distributed bubble layer, or schools of fish. The total sum of the signals scattered from these scatterers is called reverberation. To simulate the reverberation signal precisely, a propagation model must be combined with proper scattering models corresponding to each scattering mechanism. In this article, we develop a reverberation model based on ray theory that is easily combined with existing scattering models. The developed reverberation model uses (1) the Chapman-Harris empirical formula and the APL-UW/SSA models for sea surface scattering, and (2) Lambert's law and the APL-UW/SSA models for sea bottom scattering. To verify the developed reverberation model, we compare our results with those in Ellis' article and the 2006 reverberation workshop. The verified model, SNURM, is then used to simulate mid-frequency reverberation signals for the seas neighboring South Korea, and the model results are compared with experimental data in the time domain. Through this comparison, the features of the reverberation signal that depend on the environment of each sea are investigated, and the analysis leads us to select an appropriate scattering function for each area of interest.
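
One ingredient named above, Lambert's law for bottom backscattering, is compact enough to show directly: scattering strength in decibels is S = mu + 10 log10(sin(theta_i) sin(theta_s)) for incident and scattered grazing angles. A minimal sketch with the commonly used constant mu = -27 dB, an illustrative value rather than necessarily the paper's:

```python
import numpy as np

def lambert_scattering_strength(theta_i_deg, theta_s_deg, mu_db=-27.0):
    """Bottom scattering strength (dB) under Lambert's law."""
    ti = np.radians(theta_i_deg)
    ts = np.radians(theta_s_deg)
    return mu_db + 10.0 * np.log10(np.sin(ti) * np.sin(ts))

# Monostatic case (theta_i = theta_s): strength rises with grazing angle.
for grazing in (5, 15, 30, 60):
    print(grazing, "deg:", round(lambert_scattering_strength(grazing, grazing), 1), "dB")
```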

Case Analysis of the Promotion Methodologies in the Smart Exhibition Environment (스마트 전시 환경에서 프로모션 적용 사례 및 분석)

  • Moon, Hyun Sil;Kim, Nam Hee;Kim, Jae Kyeong
    • Journal of Intelligence and Information Systems
    • /
    • v.18 no.3
    • /
    • pp.171-183
    • /
    • 2012
  • With the development of technologies, the exhibition industry has received much attention from governments and companies as an important vehicle for marketing activities, and exhibitors have come to regard exhibitions as new marketing channels. However, the growing size of exhibitions, in net square feet and in the number of visitors, naturally creates a competitive environment. Therefore, to employ effective marketing tools in this environment, exhibitors have planned and implemented many promotion techniques. In particular, a smart environment that provides real-time information to visitors allows them to implement various kinds of promotions. However, promotions that ignore visitors' varied needs and preferences lose their original purpose and function: indiscriminate promotions feel like spam to visitors and fail to achieve their goals. What is needed is an approach based on the STP strategy, which segments visitors on the right evidence (Segmentation), selects the target visitors (Targeting), and gives them the proper services (Positioning). To use the STP strategy in the smart exhibition environment, its characteristics must be considered. First, an exhibition is defined as a market event of a specific duration, held at intervals. Accordingly, exhibitors planning promotions should prepare different events and promotions for each exhibition; when traditional STP strategies are adopted, the system may have to provide services using insufficient information about existing visitors, and its performance must still be guaranteed. Second, for automatic segmentation, cluster analysis, a widely used data mining technique, can be adopted. In the smart exhibition environment, visitor information can be acquired in real time, and services using this information should also be provided in real time. However, many clustering algorithms have scalability problems: they hardly work on large databases and require domain knowledge to determine input parameters. Therefore, a suitable methodology must be selected and fitted so that real-time services can be provided. Finally, the data available in the smart exhibition environment should be exploited. Since there are useful data such as booth visit records and event participation records, the STP strategy for the smart exhibition rests not only on demographic segmentation but also on behavioral segmentation. In this study, therefore, we analyze a case of a promotion methodology through which exhibitors can provide differentiated services to segmented visitors in the smart exhibition environment. First, considering the characteristics of the smart exhibition environment, we identify the evidence for segmentation and fit the clustering methodology for providing real-time services. There are many studies on classifying visitors, but we adopt a segmentation methodology based on visitors' behavioral traits. Through direct observation, Veron and Levasseur classified visitors into four groups, likening their traits to animals (butterfly, fish, grasshopper, and ant). Because the variables of their classification, such as the number of visits and the average time per visit, can be estimated in the smart exhibition environment, their typology provides a theoretical and practical basis for our system. Next, we construct a pilot system that automatically selects suitable visitors according to the objectives of a promotion and instantly sends promotion messages to them. That is, based on our segmentation methodology, the system automatically selects the visitors suited to the characteristics of each promotion. We applied this system to a real exhibition environment and analyzed the resulting data. Having classified visitors into four types by their behavioral patterns in the exhibition, we offer insights for researchers who build smart exhibition environments and derive promotion strategies fitting each cluster. First, visitors of the ANT type show high response rates for all promotion messages except experience promotions: they are attracted by tangible benefits in the exhibition area and dislike promotions that take a long time. By contrast, visitors of the GRASSHOPPER type show high response rates only for experience promotions. Second, visitors of the FISH type favor coupon and content promotions: although they do not examine exhibits in detail, they prefer to obtain further information such as brochures, so exhibitors who want to convey much information in limited time should pay attention to this type. Consequently, these promotion strategies are expected to give exhibitors useful insights when they plan and organize their activities, and to improve their performance.
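
A minimal sketch of the behavioral segmentation: k-means with four clusters on the two traits the system can measure, the number of booth visits and the average time per visit, echoing the butterfly/fish/grasshopper/ant typology. The visitor log is a random placeholder, and k-means is one plausible choice for the clustering methodology the study fitted for real-time use.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(1)
n_visits = rng.integers(1, 40, 300)           # booths visited per person, placeholder
avg_minutes = rng.uniform(0.5, 20.0, 300)     # mean dwell time per booth, placeholder

X = StandardScaler().fit_transform(np.column_stack([n_visits, avg_minutes]))
labels = KMeans(n_clusters=4, n_init=10, random_state=1).fit_predict(X)

# Each cluster's profile suggests a type: e.g. many short visits ~ butterfly,
# few long visits ~ fish (an interpretation, not the paper's exact mapping).
for k in range(4):
    m = labels == k
    print(f"cluster {k}: visits={n_visits[m].mean():.1f}, "
          f"dwell={avg_minutes[m].mean():.1f} min, n={m.sum()}")
```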

Personal Information Overload and User Resistance in the Big Data Age (빅데이터 시대의 개인정보 과잉이 사용자 저항에 미치는 영향)

  • Lee, Hwansoo;Lim, Dongwon;Zo, Hangjung
    • Journal of Intelligence and Information Systems
    • /
    • v.19 no.1
    • /
    • pp.125-139
    • /
    • 2013
  • Big data refers to data that cannot be processed with conventional data technologies. As smart devices and social network services produce vast amounts of data, big data has attracted much attention from researchers, and there are strong demands from governments and industries, since it can create new value by drawing business insights from data. Since various new technologies for processing big data have been introduced, academic communities also show much interest in the domain, and notable advances related to big data technology have appeared in various fields. Big data technology makes it possible to access, collect, and store individuals' personal data, and to analyze huge amounts of data at lower cost and in less time than traditional methods allow; it can even detect personal information that people do not want to disclose. Therefore, people using information technology such as the Internet or online services have some level of privacy concern, and such feelings can hinder the continued use of information systems. For example, SNS offers various benefits, but users are sometimes highly exposed to privacy intrusions because they post too much personal information on it. Even when users post their personal information on the Internet themselves, the data are sometimes no longer under their control: once private data are posted, they can be transferred anywhere with a few clicks and abused to create a fake identity. In this way, privacy intrusion happens. This study investigates how perceived personal information overload in SNS affects users' risk perception and information privacy concerns, and examines the relationship between those concerns and user resistance behavior. A survey approach and structural equation modeling are employed for data collection and analysis. The study contributes meaningful insights for academic researchers and policy makers who plan to develop guidelines for privacy protection. It shows that information overload on social network services significantly increases users' perceived privacy risk, which in turn raises their privacy concerns. If privacy concerns increase, users may form a negative or resistant attitude toward system use, and this resistance may lead them to discontinue using social network services. Furthermore, information overload affects privacy concerns through the mediation of perceived risk rather than influencing them directly, implying that resistance to system use can be diminished by reducing users' perceived risks. Given that users' resistance becomes salient when privacy concerns are high, measures to alleviate those concerns should be devised. The study makes an academic contribution by integrating traditional information overload theory and user resistance theory to investigate perceived privacy concerns in current IS contexts; little big data research has examined the technology with an empirical, behavioral approach, as the topic has only recently emerged. It also makes practical contributions: information overload leads to higher perceived privacy risk and, eventually, to discontinued use of the information system, so to keep users from departing, organizations should develop systems in which private data are controlled and managed with ease. This study suggests that actions to lower the level of perceived risks and privacy concerns should be taken for information systems continuance.
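
The mediation structure reported above (overload raises perceived risk, risk raises privacy concern, concern drives resistance) can be written in lavaan-style syntax for the semopy package. This is a hypothetical sketch: the column names and simulated data are placeholders for the survey items, and semopy stands in for whatever SEM tool the authors actually used.

```python
import numpy as np
import pandas as pd
from semopy import Model

# Simulate data consistent with the reported mediation chain.
rng = np.random.default_rng(0)
n = 300
overload = rng.normal(size=n)
risk = 0.6 * overload + rng.normal(scale=0.8, size=n)
concern = 0.7 * risk + rng.normal(scale=0.7, size=n)
resistance = 0.5 * concern + rng.normal(scale=0.9, size=n)
df = pd.DataFrame({"overload": overload, "risk": risk,
                   "concern": concern, "resistance": resistance})

spec = """
risk ~ overload
concern ~ risk
resistance ~ concern
"""
model = Model(spec)
model.fit(df)
print(model.inspect())   # path estimates and significance
```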

Predicting the Performance of Recommender Systems through Social Network Analysis and Artificial Neural Network (사회연결망분석과 인공신경망을 이용한 추천시스템 성능 예측)

  • Cho, Yoon-Ho;Kim, In-Hwan
    • Journal of Intelligence and Information Systems
    • /
    • v.16 no.4
    • /
    • pp.159-172
    • /
    • 2010
  • The recommender system is one possible solution for assisting customers in finding the items they would like to purchase. To date, a variety of recommendation techniques have been developed. One of the most successful is Collaborative Filtering (CF), which has been used in a number of different applications such as recommending Web pages, movies, music, articles, and products. CF identifies customers whose tastes are similar to those of a given customer and recommends items those customers have liked in the past. Numerous CF algorithms have been developed to increase the performance of recommender systems: broadly, memory-based CF algorithms, model-based CF algorithms, and hybrid CF algorithms that combine CF with content-based techniques or other recommender systems. While many researchers have focused on improving CF performance, the theoretical justification of CF algorithms is lacking; we do not know much about how CF works, and the relative performance of CF algorithms is known to be domain- and data-dependent. It is very time-consuming and expensive to implement and launch a CF recommender system, and a system unsuited to the given domain provides customers with poor-quality recommendations that easily annoy them. Therefore, predicting the performance of CF algorithms in advance is practically important and needed. In this study, we propose an efficient approach to predicting the performance of CF, applying Social Network Analysis (SNA) and an Artificial Neural Network (ANN) to develop the prediction model. CF can be modeled as a social network in which customers are nodes and purchase relationships between customers are links. SNA facilitates the exploration of topological properties of the network structure that are implicit in the data used for CF recommendations. An ANN model is developed from an analysis of network topology covering network density, inclusiveness, clustering coefficient, network centralization, and Krackhardt's efficiency. Network density, expressed as a proportion of the maximum possible number of links, captures the density of the whole network, while the clustering coefficient captures the degree to which the network contains localized pockets of dense connectivity. Inclusiveness refers to the number of nodes included within the various connected parts of the social network. Centralization reflects the extent to which connections are concentrated in a small number of nodes rather than distributed equally among all nodes. Krackhardt's efficiency characterizes how dense the social network is beyond what is barely needed to keep the social group even indirectly connected. We use these social network measures as input variables of the ANN model, and the recommendation accuracy, measured by the F1-measure, as the output variable. To evaluate the effectiveness of the ANN model, sales transaction data from H department store, one of the well-known department stores in Korea, was used. A total of 396 experimental samples were gathered, of which 40%, 40%, and 20% were used for training, testing, and validation, respectively; 5-fold cross validation was also conducted to enhance the reliability of the experiments. The input variable measuring process consists of three steps: analysis of customer similarities, construction of a social network, and analysis of social network patterns. We used NetMiner 3 and UCINET 6.0 for the SNA, and Clementine 11.1 for the ANN modeling. In the experiments, the ANN model achieved an estimated accuracy of 92.61% and an RMSE of 0.0049, indicating that our prediction model can help decide whether CF is useful for a given application with certain data characteristics.
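
A minimal sketch of the pipeline: compute network-level measures from customer co-purchase graphs with networkx and feed them to a small neural network that predicts recommendation F1. Random graphs and a synthetic target replace the department-store data, the centralization measure is a simplified degree-based proxy rather than Freeman's formula, and scikit-learn's MLPRegressor stands in for the Clementine ANN.

```python
import numpy as np
import networkx as nx
from sklearn.neural_network import MLPRegressor

def network_features(G: nx.Graph) -> list[float]:
    density = nx.density(G)                    # links / maximum possible links
    clustering = nx.average_clustering(G)      # localized pockets of connectivity
    inclusiveness = sum(1 for _, d in G.degree() if d > 0) / G.number_of_nodes()
    # Simplified degree-based proxy for network centralization:
    centralization = max(dict(G.degree()).values()) / (G.number_of_nodes() - 1)
    return [density, clustering, inclusiveness, centralization]

rng = np.random.default_rng(0)
graphs = [nx.gnp_random_graph(60, rng.uniform(0.05, 0.4), seed=i) for i in range(100)]
X = np.array([network_features(G) for G in graphs])
y = 0.3 + 0.5 * X[:, 0] + rng.normal(scale=0.03, size=len(X))  # synthetic F1 target

ann = MLPRegressor(hidden_layer_sizes=(8,), max_iter=2000, random_state=0).fit(X, y)
print("training R^2:", round(ann.score(X, y), 3))
```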