• Title/Summary/Keyword: Multimodal model

Wavelet-based Statistical Noise Detection and Emotion Classification Method for Improving Multimodal Emotion Recognition (멀티모달 감정인식률 향상을 위한 웨이블릿 기반의 통계적 잡음 검출 및 감정분류 방법 연구)

  • Yoon, Jun-Han;Kim, Jin-Heon
    • Journal of IKEEE / v.22 no.4 / pp.1140-1146 / 2018
  • Among studies on recognizing human emotions, methodologies that analyze complex bio-signals with deep learning models have recently emerged. The accuracy of emotion classification can vary with the evaluation method, and its reliability with the kind of data used for training. For biological signals, data reliability is determined by the noise ratio, so noise detection is equally important. Appropriate emotion evaluation methods are also needed, depending on the methodology used to define emotions. In this paper, we propose a wavelet-based noise threshold setting algorithm for verifying the reliability of multimodal bio-signal data labeled with Valence and Arousal, and a method for improving the emotion recognition rate by weighting the evaluation data. After the wavelet components of the signal are extracted with the wavelet transform, the skewness and kurtosis of the components are computed, noise is detected at the threshold calculated by the Hampel identifier, and the training data are selected considering the noise ratio of the original signal. In addition, when classifying emotional data, weighting based on the Euclidean distance from the median value of the Valence-Arousal plane is applied to the overall evaluation of the emotion recognition rate. To verify the proposed algorithm, we use the ASCERTAIN data set and observe the degree of improvement in the emotion recognition rate.
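
The thresholding step mentioned in the abstract above can be illustrated with a minimal sketch of the Hampel identifier, which flags values lying more than k scaled median absolute deviations (MAD) from the median. The function name `hampel_outliers`, the default `k=3.0`, and the sample values are assumptions made for illustration; the paper's wavelet decomposition and its exact per-component statistics are not reproduced here.

```python
from statistics import median


def hampel_outliers(values, k=3.0):
    """Return indices of values flagged by the Hampel identifier.

    A value is flagged when it lies more than k * 1.4826 * MAD from the
    median. The factor 1.4826 makes the MAD comparable to a standard
    deviation for normally distributed data.
    """
    med = median(values)
    mad = median([abs(v - med) for v in values])
    # Note: if MAD is 0, every value that differs from the median is flagged.
    threshold = k * 1.4826 * mad
    return [i for i, v in enumerate(values) if abs(v - med) > threshold]


# Hypothetical per-segment statistics (for example, kurtosis values).
segment_stats = [1.0, 1.1, 0.9, 1.05, 0.95, 8.0, 1.0]
noisy_segments = hampel_outliers(segment_stats)
print(noisy_segments)  # → [5]
```

Because the median and MAD are robust statistics, a single extreme segment does not inflate the threshold the way a mean and standard deviation would.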

Developing Integrated Transportation Service Index for Encouraging Transit-oriented Development (TOD형 개발 촉진을 위한 통합교통서비스 지표의 개발)

  • Hwang, Kee Yeon;Shin, Sang Young;Cho, Yong Hak;Sohn, Kee Min
    • KSCE Journal of Civil and Environmental Engineering Research / v.28 no.1D / pp.1-10 / 2008
  • Recently, the Seoul Metropolitan Government (SMG) has initiated several urban redevelopment projects to revitalize the downtown, which is well equipped for transit-oriented development (TOD). Since TOD entails higher-density development in our context, it has negative impacts on travel patterns, congestion, and urban environmental quality. The purpose of this study is to develop a new transportation service index that can facilitate higher-density TOD. The study includes relevant foreign case studies, the development of a multimodal transportation index, and an impact analysis of TOD when it is applied in downtown Seoul. Chapter III develops ITLOS, a new multimodal transportation service index that shows the possibility of accommodating further development by integrating a roadway service index with a public transportation service index. The study sets ten policy scenarios by varying densities and runs the Seoul Congestion Management Model (SECOMM) to estimate the sustainable transportation impacts of TOD in the downtown. The travel speed index, which represents only the availability of road capacity for development, reveals that higher-density development in the downtown can worsen traffic congestion there while improving the region-wide transportation level of service in Seoul. The study also shows that higher-density development is more feasible when ITLOS is used as the index, because it considers not only the available road capacity but also the subway capacity in the analysis area.

Human Action Recognition Using Pyramid Histograms of Oriented Gradients and Collaborative Multi-task Learning

  • Gao, Zan;Zhang, Hua;Liu, An-An;Xue, Yan-Bing;Xu, Guang-Ping
    • KSII Transactions on Internet and Information Systems (TIIS) / v.8 no.2 / pp.483-503 / 2014
  • In this paper, human action recognition using pyramid histograms of oriented gradients and collaborative multi-task learning is proposed. First, we accumulate global activities and construct a motion history image (MHI) for the RGB and depth channels respectively, to encode the dynamics of one action in different modalities. Different action descriptors are then extracted from the depth and RGB MHIs to represent the global textural and structural characteristics of these actions. Specifically, the average value in hierarchical blocks, GIST, and pyramid histograms of oriented gradients descriptors are employed to represent human motion. To demonstrate the superiority of the proposed method, we evaluate the descriptors with KNN, SVM with linear and RBF kernels, and SRC and CRC models on the DHA dataset, a well-known dataset for human action recognition. Large-scale experimental results show that our descriptors are robust, stable, and efficient, and outperform state-of-the-art methods. In addition, we further investigate the performance of our descriptors by combining them on the DHA dataset, and observe that the combined descriptors perform much better than any single descriptor alone. With multimodal features, we also propose a collaborative multi-task learning method for model learning and inference based on transfer learning theory. The main contributions lie in four aspects: 1) the proposed encoding scheme can filter out the stationary part of the human body and reduce noise interference; 2) different kinds of features and models are assessed, and the neighbor gradient information and pyramid layers are very helpful for representing these actions; 3) the proposed model can fuse features from different modalities regardless of the sensor types, the value ranges, and the dimensions of the different features; 4) the latent common knowledge among different modalities can be discovered by transfer learning to boost performance.
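
The motion history image mentioned in the abstract above can be sketched with the commonly used update rule: a pixel is set to a maximum value `tau` when motion is detected there, and otherwise decays by one per frame until it reaches zero. This is a minimal illustration only. Detecting motion by thresholded frame differencing, the parameter names `tau` and `diff_threshold`, and the toy frames are assumptions, not details taken from the paper.

```python
def update_mhi(mhi, prev_frame, frame, tau=5, diff_threshold=10):
    """Apply one MHI update step from a pair of consecutive grayscale frames."""
    updated = []
    for r, row in enumerate(frame):
        new_row = []
        for c, pixel in enumerate(row):
            # Assumed motion test: absolute intensity change above a threshold.
            moving = abs(pixel - prev_frame[r][c]) > diff_threshold
            # Recent motion gets the maximum value; older motion fades out.
            new_row.append(tau if moving else max(0, mhi[r][c] - 1))
        updated.append(new_row)
    return updated


def motion_history_image(frames, tau=5, diff_threshold=10):
    """Accumulate an MHI over a sequence of equally sized frames."""
    mhi = [[0] * len(frames[0][0]) for _ in frames[0]]
    for prev_frame, frame in zip(frames, frames[1:]):
        mhi = update_mhi(mhi, prev_frame, frame, tau, diff_threshold)
    return mhi


# Toy 1x3 sequence: pixel 0 changes first, pixel 1 changes next,
# pixel 2 never changes.
frames = [[[0, 0, 0]], [[100, 0, 0]], [[100, 100, 0]]]
print(motion_history_image(frames))  # → [[4, 5, 0]]
```

Pixels that never move stay at zero, which is consistent with the abstract's remark that the encoding filters out the stationary part of the body; more recent motion carries a larger value than older motion.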

Video Highlight Prediction Using GAN and Multiple Time-Interval Information of Audio and Image (오디오와 이미지의 다중 시구간 정보와 GAN을 이용한 영상의 하이라이트 예측 알고리즘)

  • Lee, Hansol;Lee, Gyemin
    • Journal of Broadcast Engineering / v.25 no.2 / pp.143-150 / 2020
  • A huge amount of content is uploaded every day to various streaming platforms. Among those videos, game and sports videos account for a large portion. Broadcasting companies sometimes create and provide highlight videos, but these tasks are time-consuming and costly. In this paper, we propose models that automatically predict highlights in games and sports matches. While most previous approaches use visual information exclusively, our models use both audio and visual information and present a way to understand the short-term and long-term flows of videos. We also describe models that incorporate a GAN to find better highlight features. The proposed models are evaluated on e-sports and baseball videos.

A Study on the Fitness of Korea's Hub-Port Strategy in Northeast Asia by SCM (공급사슬관리에 의한 동북아 거점항만전략의 적합성에 관한 연구)

  • Lee In-Soo;Ahn Ki-Myung;Kim Hyun-Duk
    • Journal of Navigation and Port Research / v.29 no.8 s.104 / pp.709-714 / 2005
  • The purpose of this research is to verify the strategic fitness and relevance of the hub-port strategy based on SCM in Northeast Asia and to find a way to become a hub port with a competitive edge. The fitness of the hub-port development strategy is analysed with a structural equation model. The essential results show that minimizing the lead time from ship arrival to inland transport and maximizing the logistics services at each stage are important for providing optimal logistics service. The value-added port supply chain strategy is also highly correlated with all parts of the port operation system, port transport system, distribution park, and port information system. The results show that various value-added logistics service activities are more important than lowering cost; the inland multimodal system should be properly connected; the distribution park should be connected to an industrial park to form a port cluster; and the port information system should be developed.

Improvement of Environment Recognition using Multimodal Signal (멀티 신호를 이용한 환경 인식 성능 개선)

  • Park, Jun-Qyu;Baek, Seong-Joon
    • The Journal of the Korea Contents Association / v.10 no.12 / pp.27-33 / 2010
  • In this study, we conducted classification experiments with a GMM (Gaussian Mixture Model) by combining features extracted from a microphone, a gyro sensor, and an acceleration sensor in 9 different environment types. Existing context-awareness studies sought to recognize the environmental situation mainly from environmental sound data captured with a microphone, but recognition was limited by the structural characteristics of environmental sound, which is composed of combinations of various noises. Hence, we propose additionally using gyro sensor and acceleration sensor data in order to reflect the movement features of the recognition agent. According to the experimental results, combining acceleration sensor data with the existing environmental sound features improves recognition performance by more than 5% compared with existing methods that use only environmental sound features from the microphone.

U.S. Port Investment Strategies and the Corresponding Economic Impacts Stemming from the Panama Canal Expansion

  • Park, ChangKeun
    • Asian Journal of Innovation and Policy / v.10 no.2 / pp.195-211 / 2021
  • This paper measures the economic impacts of the U.S. port investment strategies for coping with the Panama Canal expansion. Using secondary import data, the study presents negative and positive estimates of the impacts. Reduced port activities in the West Coast Customs Districts negatively affect the transportation and warehousing industries, among other effects. At the same time, there are positive effects in other states from increased imports resulting from modal shifts and changes in the entry port to ports located on the South and East coasts. This study applied the supply-driven National Interstate Economic Model, which measures all interstate trade among the U.S. states, to divert foreign imports from 15 Pacific Rim countries. For this purpose, the following assumption was adopted: larger ships using the canal will lead to a redirection of seaborne trade among U.S. (and other) ports and result in secondary effects, e.g., the use of different freight modes and regional growth spillovers. This study also accounted for the entry point change and significant port investments for foreign trade under alternative scenarios. The choice of ports for international trade depends on decisions about how to minimize multimodal delivery costs. The total direct reduction of transportation and warehousing activities associated with foreign imports in the West Coast ports was estimated at $3.3 billion, leading to total negative effects of $5.8 billion. Total positive impacts from the shift of transportation modes with the choice of an entry port and new warehousing activities for foreign imports in the selected 12 states varied. As expected, states that contain an entry port had the most prominent benefits, but Texas, New York, and New Jersey may benefit from all the port enhancement projects in the U.S. Besides the Transportation and Postal, and Warehousing industries, Construction is another industry that is strongly and positively affected by the Canal expansion in the U.S.

Analytic Hierarchy Process Modelling of Location Competitiveness for a Regional Logistics Distribution Center Serving Northeast Asia

  • Kim, Si-Hyun;Lee, Kwang-Ho;Kang, Dal-Won
    • Journal of Korea Trade / v.24 no.3 / pp.20-36 / 2020
  • Purpose - As the global product network expands through both internationalization and diversification of the multimodal transportation system, corporate strategies have shifted to emphasize the importance of a high value-added international logistics system. To guide policies and strategies to attract relevant industries, this study aims to analyze the location competitiveness of a regional logistics distribution center serving Northeast Asia. Design/methodology - Multi-criteria techniques are considered to offer a promising framework for evaluating decision-making factors. This paper employed an analytic hierarchy process to analyze the hierarchical structure of determinants for selecting the location of a regional logistics distribution center. Adopting both qualitative and quantitative evaluations, this study suggests political implications for the development of a regional logistics distribution center, such as the direction of political support, service differentiation, and infrastructure development. Findings - This study developed a location competitiveness evaluation model based on a case study of the major port cities in Northeast Asia. The evaluation model incorporates five factors underpinning 17 components extracted using factor analysis. The results revealed that the logistics factor is the most significant factor for evaluating the competitiveness of a regional logistics distribution center. The remaining factors were market, costs, and services environment. By comparing qualitative and quantitative evaluations, the results provide useful insights for the development of a regional logistics distribution center in Northeast Asia. Originality/value - This study revealed differences between qualitative and quantitative evaluations. The finding implies that prior work on competitiveness evaluation models has not successfully measured the gap between quantitative data and experts' evaluations. To overcome this limitation, this paper considered both actual data, such as actual distance, cost, and the number of companies located, and expert opinions.
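
The analytic hierarchy process named in the abstract above derives priority weights for criteria from a matrix of pairwise comparisons. The following is a minimal sketch using the row geometric mean method, one common way of approximating those weights. The function name `ahp_weights`, the three criterion labels, and the comparison values are hypothetical; the paper's actual matrices and weighting procedure are not given in the abstract.

```python
import math


def ahp_weights(matrix):
    """Approximate AHP priority weights with the row geometric mean method.

    matrix[i][j] states how strongly criterion i is preferred over
    criterion j, so matrix[j][i] is expected to be 1 / matrix[i][j].
    """
    n = len(matrix)
    geometric_means = [math.prod(row) ** (1.0 / n) for row in matrix]
    total = sum(geometric_means)
    # Normalize so that the weights sum to 1.
    return [g / total for g in geometric_means]


# Hypothetical comparisons among three criteria: logistics, market, costs.
comparisons = [
    [1.0, 2.0, 4.0],    # logistics compared with (logistics, market, costs)
    [0.5, 1.0, 2.0],    # market
    [0.25, 0.5, 1.0],   # costs
]
weights = ahp_weights(comparisons)
print([round(w, 3) for w in weights])  # → [0.571, 0.286, 0.143]
```

A full analysis would normally also check the consistency of the comparison matrix before using the weights; that step is omitted here to keep the sketch short.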

Multi-Emotion Regression Model for Recognizing Inherent Emotions in Speech Data (음성 데이터의 내재된 감정인식을 위한 다중 감정 회귀 모델)

  • Moung Ho Yi;Myung Jin Lim;Ju Hyun Shin
    • Smart Media Journal / v.12 no.9 / pp.81-88 / 2023
  • Recently, online communication has been increasing as non-face-to-face services have spread due to COVID-19. In non-face-to-face situations, the other person's opinions and emotions are recognized through modalities such as text, speech, and images. Research on multimodal emotion recognition that combines various modalities is currently active. Within this research, emotion recognition using speech data is attracting attention as a means of understanding emotions through sound and language information, but in most cases emotions are recognized using a single speech feature value. However, because a variety of emotions coexist in a complex manner in a conversation, a method for recognizing multiple emotions is needed. Therefore, in this paper, we propose a multi-emotion regression model that extracts feature vectors after preprocessing speech data in order to recognize complex, inherent emotions, and that takes the passage of time into account.

Enhancing Acute Kidney Injury Prediction through Integration of Drug Features in Intensive Care Units

  • Gabriel D. M. Manalu;Mulomba Mukendi Christian;Songhee You;Hyebong Choi
    • International journal of advanced smart convergence / v.12 no.4 / pp.434-442 / 2023
  • The relationship between acute kidney injury (AKI) prediction and nephrotoxic drugs, or drugs that adversely affect kidney function, has yet to be explored in the critical care setting. One contributing factor to this gap in research is the limited investigation of drug modalities in the intensive care unit (ICU) context, due to the challenges of processing prescription data into the corresponding drug representations and a lack of comprehensive understanding of these drug representations. This study addresses this gap by proposing a novel approach that leverages patient prescription data as a modality to improve existing models for AKI prediction. We base our research on Electronic Health Record (EHR) data, extracting the relevant patient prescription information and converting it into the drug representation selected for our research, the extended-connectivity fingerprint (ECFP). Furthermore, we adopt a unique multimodal approach, developing machine learning models and 1D Convolutional Neural Networks (CNN) applied to clinical drug representations, establishing a procedure that has not been used by any previous study predicting AKI. The findings show a notable improvement in AKI prediction through the integration of drug embeddings and other patient cohort features. By using drug features represented as ECFP molecular fingerprints along with common cohort features such as demographics and lab test values, we achieved a considerable improvement in model performance for the AKI prediction task over the baseline model, which does not include the drug representations as features. This indicates that our approach enhances existing baseline techniques and highlights the relevance of drug data in predicting AKI in the ICU setting.