• Title/Summary/Keyword: pooling layer

51 search results

CNN Based 2D and 2.5D Face Recognition For Home Security System (홈보안 시스템을 위한 CNN 기반 2D와 2.5D 얼굴 인식)

  • Ma, Ying;Kim, Kang-Chul
    • The Journal of the Korea institute of electronic communication sciences
    • /
    • v.14 no.6
    • /
    • pp.1207-1214
    • /
    • 2019
  • Technologies of the 4th industrial revolution have been seeping into our lives almost unnoticed. Many IoT-based home security systems use convolutional neural networks (CNNs) as a biometric tool to recognize faces and protect the home and family from intruders, since CNNs have demonstrated excellent ability in image recognition. In this paper, three CNN layouts for 2D and 2.5D images of a small dataset, with various input image sizes and filter sizes, are explored. The simulation results show that the CNN layout with a 50×50 input size, two convolution and max-pooling layers, and a 3×3 filter size is optimal for the small 2.5D image dataset, reaching a recognition accuracy of 0.966 for the home security system. In addition, the longest CPU time for one input image is 0.057 s. The proposed CNN layout for face recognition is suitable for controlling the actuators in a home security system, because such a system requires both accurate face recognition and a short recognition time.
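The following is a minimal, hypothetical sketch of the layout the abstract describes (a 50×50 single-channel 2.5D input, two convolution plus max-pooling stages with 3×3 filters, and a softmax classifier); the filter counts, dense-layer size, and number of identities are assumptions, not values from the paper.

```python
# Hypothetical sketch of the reported optimal CNN layout (not the authors' code).
import tensorflow as tf
from tensorflow.keras import layers, models

NUM_CLASSES = 10  # assumed number of registered household identities

model = models.Sequential([
    layers.Input(shape=(50, 50, 1)),                 # 50x50 single-channel 2.5D (depth) image
    layers.Conv2D(32, (3, 3), activation="relu", padding="same"),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation="relu", padding="same"),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(128, activation="relu"),
    layers.Dense(NUM_CLASSES, activation="softmax"), # one output per identity
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```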

Quality grading of Hanwoo (Korean native cattle breed) sub-images using convolutional neural network

  • Kwon, Kyung-Do;Lee, Ahyeong;Lim, Jongkuk;Cho, Soohyun;Lee, Wanghee;Cho, Byoung-Kwan;Seo, Youngwook
    • Korean Journal of Agricultural Science
    • /
    • v.47 no.4
    • /
    • pp.1109-1122
    • /
    • 2020
  • The aim of this study was to develop a marbling classification and prediction model using small parts of sirloin images based on a deep learning algorithm, namely, a convolutional neural network (CNN). Samples were purchased from a commercial slaughterhouse in Korea, images for each grade were acquired, and the total images (n = 500) were assigned according to their grade number: 1++, 1+, 1, and both 2 & 3. The image acquisition system consists of a DSLR camera with a polarization filter to remove diffusive reflectance and two light sources (55 W). To correct the distorted original images, a radial correction algorithm was implemented. Color images of Hanwoo sirloins (mixed with feeder cattle, steers, and calves) were divided, and sub-images of size 161 × 161 were made to train the marbling prediction model. In this study, the convolutional neural network (CNN) has four convolution layers and yields prediction results according to the marbling grades (1++, 1+, 1, and 2&3). Every layer uses a rectified linear unit (ReLU) as its activation function, and max-pooling is used for extracting the edge between fat and muscle and for reducing the variance of the data. Prediction accuracy was measured using accuracy and the kappa coefficient from a confusion matrix. We summed the predictions of the sub-images and determined the total average prediction accuracy. Training accuracy was 100% and test accuracy was 86%, indicating comparably good performance of the CNN. This study shows the classification potential for predicting the marbling grade using color images and a convolutional neural network algorithm.
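As a purely illustrative sketch of the architecture described above (four convolution layers with ReLU activations and max-pooling over 161 × 161 RGB sub-images, ending in four marbling-grade outputs), the filter counts and dense-layer size below are assumptions rather than the paper's settings.

```python
# Illustrative four-convolution-layer CNN for 161x161 RGB sub-images (assumed filter counts).
import tensorflow as tf
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Input(shape=(161, 161, 3)),
    layers.Conv2D(16, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),   # max-pooling emphasizes fat/muscle edges and reduces variance
    layers.Conv2D(32, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(128, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(64, activation="relu"),
    layers.Dense(4, activation="softmax"),   # marbling grades: 1++, 1+, 1, 2&3
])
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
```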

MLCNN-COV: A multilabel convolutional neural network-based framework to identify negative COVID medicine responses from the chemical three-dimensional conformer

  • Pranab Das;Dilwar Hussain Mazumder
    • ETRI Journal
    • /
    • v.46 no.2
    • /
    • pp.290-306
    • /
    • 2024
  • To treat the novel COronaVIrus Disease (COVID), comparatively few medicines have been approved. Due to the global pandemic status of COVID, several medicines are being developed to treat patients. The modern COVID medicine development process faces various challenges, including predicting and detecting hazardous COVID medicine responses. Moreover, correctly predicting harmful COVID medicine reactions is essential for health safety. Significant developments in computational models for medicine development can make it possible to identify adverse COVID medicine reactions. Since the beginning of the COVID pandemic, there has been significant demand for developing COVID medicines. Therefore, this paper presents a transfer-learning methodology and a multilabel convolutional neural network for COVID (MLCNN-COV) medicine development model to identify negative responses of COVID medicines. For analysis, a framework is proposed with five multilabel transfer-learning models, namely, MobileNetv2, ResNet50, VGG19, DenseNet201, and Inceptionv3, and an MLCNN-COV model is designed with an image augmentation (IA) technique and validated through experiments on images of the three-dimensional chemical conformers of 17 COVID medicines. The RGB color channels are used to represent the image features, and image features are extracted by employing Convolution2D and MaxPooling2D layers. The findings of the current MLCNN-COV are promising, and it can identify individual adverse reactions of medicines with accuracy ranging from 88.24% to 100%, outperforming the transfer-learning models. This shows that three-dimensional conformers can adequately identify negative COVID medicine responses.
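Below is a hedged sketch in the spirit of such a multilabel CNN: Conv2D and MaxPooling2D layers extract features from RGB conformer images, simple image augmentation is applied, and independent sigmoid outputs score each adverse-reaction label. The input size, layer sizes, and number of labels are assumptions, not values from the paper.

```python
# Hedged sketch of a multilabel CNN with image augmentation (all sizes assumed).
import tensorflow as tf
from tensorflow.keras import layers, models

NUM_LABELS = 5  # hypothetical number of adverse-reaction labels

model = models.Sequential([
    layers.Input(shape=(128, 128, 3)),
    layers.RandomFlip("horizontal"),                 # image augmentation (IA)
    layers.RandomRotation(0.1),
    layers.Conv2D(32, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(128, activation="relu"),
    layers.Dense(NUM_LABELS, activation="sigmoid"),  # multilabel: one independent probability per reaction
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
```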

Morphological Changes on the Development of Ductus Deferens of Meat Type Cockerels (육닭 정관의 발육에 따른 형태학적 변화)

  • 한방근;김우권;이재홍
    • Korean Journal of Animal Reproduction
    • /
    • v.8 no.1
    • /
    • pp.46-55
    • /
    • 1984
  • The purpose of the experiment was to clarify morphologically the normal growth pattern of the ductus deferens in accordance with the sexual maturity of meat-type cockerels. 1. Among the lumen diameters of the upper, mid, and lower parts of the ductus deferens, the most conspicuous enlargement was observed in the lower part. The heights of the epithelial layers of the ductus deferens showed abrupt growth at 12 weeks of age with subsequent gradual growth in the upper, mid, and lower parts; at 30 weeks of age the heights were approximately 4 times as large in the upper and mid parts and 5 times as large in the lower part as those at 4 weeks of age. The thickness of the muscular layer of the ductus showed gradual growth in contrast with the lumen diameter and epithelial layer height, being 1.3 times as large in the upper part, 1.6 times in the mid part, and 1.9 times in the lower part at 30 weeks of age compared with the thickness at 4 weeks of age. 2. Within 10 weeks after hatching, the lining cells of the ductus deferens were mainly composed of round cells and columnar cells in simple columnar epithelium. During the 10th to 20th week, the lining cells were mainly composed of high columnar cells and round cells in pseudostratified epithelium. From the 22nd week, the lining cells were composed of pseudostratified columnar cells, whereas the round cells disappeared gradually. Enlargement of the lumen and pooling of sperm in the ductus deferens coincided with the maturation of the seminiferous tubules. 3. In the simple correlations between testis weight and the various measurements of the ductus deferens, there were significant correlation coefficients. 4. In the India ink absorption test, India ink granules were not absorbed by the epithelium of the ductus deferens, but granules reactive to acid phosphatase appeared in a line on the free border of each part of the ductus deferens. Granules reactive to alkaline phosphatase were noted mainly on the luminal border of the ductus deferens, but the reaction was weaker than that of acid phosphatase. Granules reactive to PAS appeared mostly near the free border of the epithelial cells of the ductus deferens. 5. The number of sperm, index of sperm vitality, and MRT in the different parts of the ductus deferens tended to be somewhat higher in the mid and lower parts than in the upper part, although the differences were not statistically significant. The ratio of sperm abnormality also tended to be relatively high in the upper part, and among abnormal sperm, blunted heads were significantly fewer in the mid and lower parts than in the upper part.


Histological and Histochemical Studies on the Epididymal Region and Deferent Ducts of the Drakes by the Age in Weeks (오리 부고환(副睾丸) 및 정관(精管)의 주령별(週齡別) 조직학적(組織學的) 및 조직화학적(組織化學的) 연구(硏究))

  • Lee, Jae-Hong;Ha, Chang-Su
    • Korean Journal of Veterinary Research
    • /
    • v.23 no.2
    • /
    • pp.137-148
    • /
    • 1983
  • This study was conducted to provide better information on the male reproductive system of the meat-type drake, Cherry Belly × White Golden. The epithelia of the ductules of the epididymal region and of the deferent duct were observed histologically and histochemically over the course of their development. India-ink absorbability of the luminal epithelium was also investigated after administration of India ink. The results are as follows. 1. The rete testis and various round ductules in immature form appeared in the epididymis within 6 weeks after hatching, and simple cuboidal and simple columnar epithelia were found in the ductules within 8 weeks after hatching. Larger ductules, in a developing stage close to the immature efferent ductules, were found on the epididymal surface. From the 10th to the 20th week, various ductules appeared in the epididymis, and developing efferent ductules increased markedly on the epididymal surface. The luminal epithelium of the ductules was composed of ciliated simple columnar and pseudostratified ciliated columnar cells. At the same time, the deferent duct appeared. From the 21st week, the various ductules in the epididymis matured abruptly. The lumen of the rete testis was lined by simple squamous or simple cuboidal epithelium, and that of the efferent ductules, which had many folds and were larger than any others, was lined by pseudostratified ciliated columnar epithelium in which ciliated columnar cells, non-ciliated cells (clear cells), and basal cells were noted. Connecting tubules with a star-shaped lumen were composed of pseudostratified ciliated columnar epithelium in which ciliated columnar cells, non-ciliated cells, and basal cells were observed. The luminal surface of the epididymal ducts was smooth and had a thick pseudostratified columnar epithelium composed of high columnar cells and basal cells. From the 26th week after hatching, sperm pooling started in the various ductules. 2. From the 4th to the 10th week, the simple cuboidal epithelium of the deferent duct transformed into simple columnar epithelium with aging. At the base of the epithelium, clear round cells were noted. From the 12th to the 20th week, high columnar cells with elongated nuclei were noted on the luminal border of the deferent ducts, forming folds of pseudostratified columnar epithelium. From the 20th week, the deferent duct started to have septa in its lumen and was composed mainly of pseudostratified columnar epithelium, and the round cells disappeared. From the 20th week, the lumen diameter of the deferent duct became wider with aging, but there was no difference among the lumen diameters of the upper, middle, and lower parts of the deferent ducts. At the 26th week, when sperm pooling begins in the deferent ducts, the lumen diameter widened rapidly, especially in the lower part of the deferent ducts. The thickness of the muscular layer of the ductus deferens showed gradual growth until 24 weeks but thickened abruptly from the 26th week. 3. Saliva-resistant PAS granules were dotted above the nuclei in the efferent ductule epithelium, but the amount of granules was small in the connecting ductules' epithelium. Granules reactive to acid phosphatase were abundant in some epithelial cells of the efferent ductules and connecting ductules, especially above the nuclei of the cells. Granules reactive to alkaline phosphatase were noted on the luminal border of the efferent ductules. Parts of the free border of the efferent ductules and the middle portion of the deferent ducts were stained slightly by the alcian blue technique.
India ink granules were found mainly in the epithelium of the efferent ductules but were few in that of the connecting ductules.


Further Evidence of Linkage at the tva and tvc Loci in the Layer Lines and a Possibility of Polyallelism at the tvc Locus

  • Ghosh, A.K.;Pani, P.K.
    • Asian-Australasian Journal of Animal Sciences
    • /
    • v.18 no.5
    • /
    • pp.601-605
    • /
    • 2005
  • Three lines of White Leghorn (WL) chickens (IWJ, IWG and IWC) maintained at the Central Avian Research Institute, Izatnagar (UP), were used for chorioallantoic membrane (CAM) and liver tumour (LT) assays. Eleven-day-old embryos of each line were partitioned into three groups and inoculated with 0.2 ml of subgroup A, subgroup C, or an equal mixture of subgroup A and C Rous sarcoma virus (RSV). The subgroup virus receptor on the cell surface membrane is coded for by the tumour virus a (tva) locus for subgroup A and by the tumour virus c (tvc) locus for subgroup C. The random association of the genes at the tva and tvc loci in the IWJ and IWC lines was assessed, and the $\chi^2$-values for the phenotypic classes were found to be significant, indicating linkage between the tva and tvc loci. The linkage value was estimated to be 0.09 on a pooled-sex and pooled-line basis. On the basis of the four subclass tumour phenotypes, a 4-allele model was proposed for the tva locus having $a^{s1}$, $a^{s2}$, $a^{r1}$ and $a^{r2}$ alleles, and the frequencies were calculated as 0.47, 0.13, 0.13 and 0.27 for the IWJ line, 0.31, 0.33, 0.14 and 0.22 for the IWG line, and 0.44, 0.11, 0.21 and 0.24 for the IWC line, respectively. Similarly, for the tvc locus the frequencies of the four alleles, i.e. $c^{s1}$, $c^{s2}$, $c^{r1}$ and $c^{r2}$, were calculated as 0.42, 0.20, 0.21 and 0.17 for the IWJ line, 0.42, 0.17, 0.27 and 0.14 for the IWG line, and 0.30, 0.21, 0.16 and 0.33 for the IWC line, respectively. The $\chi^2$-values for all classes of observations were not significant (p>0.05), indicating a good fit of the 4-allele model to the occurrence of the four subclass tumour phenotypes for the tva and tvc loci. On the basis of the 2-allele model, both the tva and tvc loci carry three genotypes each, whereas on the basis of the 4-allele model they carry 10 genotypes each. The interaction between A-resistance and C-resistance (both CAM and LT death) was ascertained by taking the 10 genotypes of the tva locus and the 3 genotypes of the tvc locus, pooling the lines, and partitioning the observations into 3 classes. The $\chi^2$-values for the genotypic classes of CAM (-) LT (+) and CAM (-) LT (-) phenotypes under mixed virus (A+C) infection were found to be highly significant (p<0.01), indicating increased resistance and joint segregation of the $a^r$ and $c^r$ genes, which suggests close linkage between the tva and tvc loci. Therefore, an indirect selection approach using subgroup C viruses can be employed to generate stocks resistant to subgroup A LLV, obviating contamination with the most common agent causing LL in field conditions.
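For readers unfamiliar with the statistic used above, the sketch below shows how a $\chi^2$ test on a contingency table of phenotypic classes can flag non-random association (linkage) between two loci. The counts are invented placeholders for illustration only and are not the paper's data.

```python
# Illustrative chi-square test of independence between tva and tvc phenotypic classes.
# The observed counts are made-up placeholders, not data from the paper.
from scipy.stats import chi2_contingency

# rows: susceptible / resistant at tva; columns: susceptible / resistant at tvc
observed = [[120, 30],
            [25, 75]]
chi2, p, dof, expected = chi2_contingency(observed)
print(f"chi2 = {chi2:.2f}, dof = {dof}, p = {p:.4g}")
# A small p-value rejects independent assortment of the two loci, consistent with linkage.
```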

A Study on Person Re-Identification System using Enhanced RNN (확장된 RNN을 활용한 사람재인식 시스템에 관한 연구)

  • Choi, Seok-Gyu;Xu, Wenjie
    • The Journal of the Institute of Internet, Broadcasting and Communication
    • /
    • v.17 no.2
    • /
    • pp.15-23
    • /
    • 2017
  • Person re-identification is one of the most challenging problems in computer vision due to significant changes in human pose and background clutter with occlusions. Images from non-overlapping cameras make it harder to distinguish one person from another. To achieve better matching performance, most methods use feature selection and distance metrics separately to obtain discriminative representations and a proper distance for describing the similarity between persons, which tends to ignore some significant features. This situation has encouraged us to consider a novel method to deal with this problem. In this paper, we propose an enhanced recurrent neural network with a three-tier hierarchical network for person re-identification. Specifically, the proposed recurrent neural network (RNN) model contains an iterative expectation-maximization (EM) algorithm and a three-tier hierarchical network to jointly learn both discriminative features and the distance metric. The iterative EM algorithm can make full use of the feature extraction ability of the convolutional neural network (CNN) placed in series before the RNN. Through unsupervised learning, the EM framework can relabel the patches and train on larger datasets. In the three-tier hierarchical network, the convolutional neural network, recurrent network, and pooling layer jointly act as a feature extractor to better train the network. The experimental results show that, compared with other approaches in this field, this method achieves competitive accuracy. The influence of the different components of this method will be analyzed and evaluated in future research.
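A loose sketch of the three-tier idea follows: a small CNN extracts per-region features from a pedestrian crop, a recurrent layer relates them, and a pooling layer aggregates them into a single embedding used for re-identification. The crop size, layer widths, and embedding dimension are assumptions, and the iterative EM relabeling stage is omitted.

```python
# Sketch of a CNN -> RNN -> pooling feature extractor for re-identification (shapes assumed).
import tensorflow as tf
from tensorflow.keras import layers, models

inputs = layers.Input(shape=(128, 64, 3))                 # assumed pedestrian crop size
x = layers.Conv2D(32, (3, 3), activation="relu", padding="same")(inputs)
x = layers.MaxPooling2D((2, 2))(x)
x = layers.Conv2D(64, (3, 3), activation="relu", padding="same")(x)
x = layers.MaxPooling2D((2, 2))(x)                        # feature map: (32, 16, 64)
x = layers.Reshape((32, 16 * 64))(x)                      # treat the 32 rows as a sequence
x = layers.LSTM(128, return_sequences=True)(x)            # recurrent tier relates the regions
x = layers.GlobalAveragePooling1D()(x)                    # pooling tier -> one vector
embedding = layers.Dense(128, name="reid_embedding")(x)   # embedding compared across cameras
model = models.Model(inputs, embedding)
model.summary()
```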

Business Application of Convolutional Neural Networks for Apparel Classification Using Runway Image (합성곱 신경망의 비지니스 응용: 런웨이 이미지를 사용한 의류 분류를 중심으로)

  • Seo, Yian;Shin, Kyung-shik
    • Journal of Intelligence and Information Systems
    • /
    • v.24 no.3
    • /
    • pp.1-19
    • /
    • 2018
  • A large amount of data is now available for the research and business sectors to extract knowledge from. This data can be in the form of unstructured data such as audio, text, and image data and can be analyzed by deep learning methodology. Deep learning is now widely used for various estimation, classification, and prediction problems. In particular, the fashion business adopts deep learning techniques for apparel recognition, apparel search and retrieval engines, and automatic product recommendation. The core model of these applications is image classification using Convolutional Neural Networks (CNN). A CNN is made up of neurons that learn parameters such as weights as inputs pass through toward the outputs. A CNN has a layer structure that is well suited to image classification, as it is comprised of convolutional layers for generating feature maps, pooling layers for reducing the dimensionality of feature maps, and fully-connected layers for classifying the extracted features. However, most classification models have been trained using online product images, which are taken under controlled conditions, such as images of the apparel itself or of professional models wearing the apparel. Such images may not be an effective way to train a classification model when one wants to classify street fashion or walking images, which are taken in uncontrolled situations and involve people's movement and unexpected poses. Therefore, we propose to train the model with a runway apparel image dataset that captures mobility. This allows the classification model to be trained with far more variable data and enhances adaptation to diverse query images. To achieve both convergence and generalization of the model, we apply Transfer Learning to our training network. As Transfer Learning in CNN is composed of pre-training and fine-tuning stages, we divide the training step into two. First, we pre-train our architecture with a large-scale dataset, the ImageNet dataset, which consists of 1.2 million images in 1,000 categories including animals, plants, activities, materials, instrumentations, scenes, and foods. We use GoogLeNet as our main architecture, as it has achieved great accuracy with efficiency in the ImageNet Large Scale Visual Recognition Challenge (ILSVRC). Second, we fine-tune the network with our own runway image dataset. For the runway image dataset, we could not find any previously and publicly available dataset, so we collected the dataset from Google Image Search, attaining 2,426 images of 32 major fashion brands including Anna Molinari, Balenciaga, Balmain, Brioni, Burberry, Celine, Chanel, Chloe, Christian Dior, Cividini, Dolce and Gabbana, Emilio Pucci, Ermenegildo, Fendi, Giuliana Teso, Gucci, Issey Miyake, Kenzo, Leonard, Louis Vuitton, Marc Jacobs, Marni, Max Mara, Missoni, Moschino, Ralph Lauren, Roberto Cavalli, Sonia Rykiel, Stella McCartney, Valentino, Versace, and Yves Saint Laurent. We perform 10-fold experiments to account for the random generation of training data, and our proposed model achieved an accuracy of 67.2% on the final test. Our research suggests several advantages over previous related studies: to the best of our knowledge, there have been no previous studies that trained a network for apparel image classification on a runway image dataset. We suggest the idea of training the model with images capturing all possible postures, denoted as mobility, by using our own runway apparel image dataset.
Moreover, by applying Transfer Learning and using the checkpoint and parameters provided by TensorFlow Slim, we could save time spent on training the classification model, taking 6 minutes per experiment to train the classifier. This model can be used in many business applications where the query image can be a runway image, a product image, or a street fashion image. Specifically, runway query images can be used in a mobile application service during fashion week to facilitate brand search, street-style query images can be classified during fashion editorial tasks to label the brand or style, and website query images can be processed by a multi-complex e-commerce service providing item information or recommending similar items.
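As a hedged illustration of the two-stage transfer-learning recipe described above (pre-train on ImageNet, replace the classifier head, then fine-tune on the 32-brand runway data): the paper uses GoogLeNet through TensorFlow Slim, but Keras does not ship GoogLeNet, so InceptionV3 stands in here purely for illustration; all other settings are assumptions.

```python
# Two-stage transfer-learning sketch; InceptionV3 substitutes for GoogLeNet (illustration only).
import tensorflow as tf
from tensorflow.keras import layers, models
from tensorflow.keras.applications import InceptionV3

base = InceptionV3(weights="imagenet", include_top=False, input_shape=(299, 299, 3))
base.trainable = False                                 # stage 1: freeze ImageNet-pretrained features

x = layers.GlobalAveragePooling2D()(base.output)
outputs = layers.Dense(32, activation="softmax")(x)    # 32 fashion brands
model = models.Model(base.input, outputs)
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
# ... train the new head on the runway images, then unfreeze and fine-tune ...

base.trainable = True                                  # stage 2: fine-tune with a small learning rate
model.compile(optimizer=tf.keras.optimizers.Adam(1e-5),
              loss="categorical_crossentropy", metrics=["accuracy"])
```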

Sentiment Analysis of Movie Review Using Integrated CNN-LSTM Model (CNN-LSTM 조합모델을 이용한 영화리뷰 감성분석)

  • Park, Ho-yeon;Kim, Kyoung-jae
    • Journal of Intelligence and Information Systems
    • /
    • v.25 no.4
    • /
    • pp.141-154
    • /
    • 2019
  • Internet technology and social media are growing rapidly. Data mining technology has evolved to enable unstructured document representations in a variety of applications. Sentiment analysis is an important technology that can distinguish poor from high-quality content through the text data of products, and it has proliferated within text mining. Sentiment analysis mainly analyzes people's opinions in text data by assigning predefined categories such as positive and negative. This has been studied in various directions in terms of accuracy, from simple rule-based approaches to dictionary-based approaches using predefined labels. In fact, sentiment analysis is one of the most active research areas in natural language processing and is widely studied in text mining. Real online reviews are easy to collect openly and directly affect businesses. In marketing, real-world information from customers is gathered on websites rather than through surveys. Depending on whether the posts on a website are positive or negative, customer responses are reflected in sales, so businesses try to identify this information. However, many reviews on a website are not well formed and are difficult to classify. Earlier studies in this research area used review data from the Amazon.com shopping mall, while more recent studies use data on stock market trends, blogs, news articles, weather forecasts, IMDB, and Facebook. However, a lack of accuracy is recognized because sentiment calculations change according to the subject, paragraph, sentiment lexicon direction, and sentence strength. This study aims to classify the polarity of sentiment analysis into positive and negative categories and to increase the prediction accuracy of the polarity analysis using the IMDB review data set. First, for text classification related to sentiment analysis, popular machine learning algorithms such as naive Bayes (NB), support vector machines (SVM), XGBoost, random forests (RF), and gradient boosting are adopted as comparative models. Second, deep learning has demonstrated the ability to extract complex, discriminative features from data. Representative algorithms are convolutional neural networks (CNN), recurrent neural networks (RNN), and long short-term memory (LSTM). A CNN can be used similarly to BoW when processing a sentence in vector format, but it does not consider sequential data attributes. An RNN handles ordering well because it takes the temporal information of the data into account, but it suffers from long-term dependency problems. To solve the problem of long-term dependence, LSTM is used. For the comparison, CNN and LSTM were chosen as simple deep learning models. In addition to classical machine learning algorithms, CNN, LSTM, and the integrated models were analyzed. Although the algorithms have many parameters, we examined the relationship between parameter values and precision to find the optimal combination. We also tried to understand how well these models work for sentiment analysis and how they operate. This study proposes integrated CNN and LSTM algorithms to extract the positive and negative features of text analysis. The reasons for combining these two algorithms are as follows. CNN can automatically extract features for classification by applying convolution layers and massively parallel processing. LSTM is not capable of highly parallel processing.
Like faucets, the LSTM has input, output, and forget gates that can be opened and controlled at the desired time. These gates have the advantage of placing memory blocks on hidden nodes. The memory block of the LSTM may not store all the data, but it can solve the CNN's long-term dependency problem. Furthermore, when the LSTM is used with the CNN's pooling layer, the model has an end-to-end structure, so that spatial and temporal features can be designed simultaneously. The combined CNN-LSTM achieved 90.33% accuracy. This is slower than CNN, but faster than LSTM. The presented model was more accurate than the other models. In addition, each word embedding layer can be improved by training the kernel step by step. CNN-LSTM can compensate for the weaknesses of each model, and the end-to-end structure of the LSTM has the advantage of improving learning layer by layer. For these reasons, this study tries to enhance the classification accuracy of movie reviews using the integrated CNN-LSTM model.
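The following is a minimal sketch of an integrated CNN-LSTM sentiment classifier in the spirit of the description above, using the IMDB review set that ships with Keras; the vocabulary size, sequence length, embedding dimension, and layer sizes are assumptions rather than the paper's settings.

```python
# Minimal CNN-LSTM sentiment classifier on the Keras IMDB dataset (all sizes assumed).
import tensorflow as tf
from tensorflow.keras import layers, models
from tensorflow.keras.datasets import imdb
from tensorflow.keras.utils import pad_sequences

VOCAB, MAXLEN = 10000, 200
(x_train, y_train), (x_test, y_test) = imdb.load_data(num_words=VOCAB)
x_train = pad_sequences(x_train, maxlen=MAXLEN)
x_test = pad_sequences(x_test, maxlen=MAXLEN)

model = models.Sequential([
    layers.Embedding(VOCAB, 64),
    layers.Conv1D(64, 5, activation="relu"),   # CNN extracts local n-gram features in parallel
    layers.MaxPooling1D(4),                    # pooling layer feeding the LSTM
    layers.LSTM(64),                           # LSTM keeps longer-range order information
    layers.Dense(1, activation="sigmoid"),     # positive vs. negative polarity
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(x_train, y_train, epochs=2, batch_size=128,
          validation_data=(x_test, y_test))
```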

A Deep Learning Based Approach to Recognizing Accompanying Status of Smartphone Users Using Multimodal Data (스마트폰 다종 데이터를 활용한 딥러닝 기반의 사용자 동행 상태 인식)

  • Kim, Kilho;Choi, Sangwoo;Chae, Moon-jung;Park, Heewoong;Lee, Jaehong;Park, Jonghun
    • Journal of Intelligence and Information Systems
    • /
    • v.25 no.1
    • /
    • pp.163-177
    • /
    • 2019
  • As smartphones have become widely used, human activity recognition (HAR) tasks that recognize the personal activities of smartphone users with multimodal data have been actively studied recently. The research area is expanding from the recognition of an individual user's simple body movements to the recognition of low-level and high-level behavior. However, HAR tasks for recognizing interaction behavior with other people, such as whether the user is accompanying or communicating with someone else, have received less attention so far. Previous research on recognizing interaction behavior has usually depended on audio, Bluetooth, and Wi-Fi sensors, which are vulnerable to privacy issues and require much time to collect enough data. In contrast, physical sensors, including accelerometer, magnetic field, and gyroscope sensors, are less vulnerable to privacy issues and can collect a large amount of data within a short time. In this paper, a deep learning-based method for detecting accompanying status using only multimodal physical sensor data, such as accelerometer, magnetic field, and gyroscope data, is proposed. The accompanying status is defined as a redefined subset of user interaction behavior, covering whether the user is accompanying an acquaintance at close distance and whether the user is actively communicating with that acquaintance. A framework based on convolutional neural networks (CNN) and long short-term memory (LSTM) recurrent networks for classifying accompanying and conversation is proposed. First, a data preprocessing method is introduced, consisting of time synchronization of multimodal data from different physical sensors, data normalization, and sequence data generation. We applied nearest-neighbor interpolation to synchronize the timestamps of data collected from different sensors. Normalization was performed for each x, y, and z axis value of the sensor data, and the sequence data were generated using the sliding window method. The sequence data then became the input for the CNN, where feature maps representing local dependencies of the original sequence are extracted. The CNN consisted of 3 convolutional layers and did not have a pooling layer, in order to maintain the temporal information of the sequence data. Next, LSTM recurrent networks received the feature maps, learned long-term dependencies from them, and extracted features. The LSTM recurrent networks consisted of two layers, each with 128 cells. Finally, the extracted features were used for classification by a softmax classifier. The loss function of the model was the cross-entropy function, and the weights of the model were randomly initialized from a normal distribution with a mean of 0 and a standard deviation of 0.1. The model was trained using the adaptive moment estimation (ADAM) optimization algorithm, and the mini-batch size was set to 128. We applied dropout to the input values of the LSTM recurrent networks to prevent overfitting. The initial learning rate was set to 0.001, and it decayed exponentially by a factor of 0.99 at the end of each training epoch. An Android smartphone application was developed and released to collect data. We collected smartphone data from a total of 18 subjects. Using the data, the model classified accompanying and conversation with 98.74% and 98.83% accuracy, respectively. Both the F1 score and the accuracy of the model were higher than those of the majority vote classifier, support vector machine, and deep recurrent neural network.
In future research, we will focus on more rigorous multimodal sensor data synchronization methods that minimize timestamp differences. In addition, we will further study transfer learning methods that enable models trained on the training data to transfer to evaluation data that follows a different distribution. We expect to obtain a model that exhibits robust recognition performance against changes in the data that were not considered during training.
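The sketch below follows the hyperparameters stated in the abstract (three convolutional layers with no pooling, two LSTM layers of 128 cells, dropout before the LSTM, a softmax output, cross-entropy loss, ADAM with an initial learning rate of 0.001, and a mini-batch size of 128); the sliding-window length, number of channels, and convolution filter settings are assumptions.

```python
# CNN + 2-layer LSTM for accompanying-status classification (window length and filters assumed).
import tensorflow as tf
from tensorflow.keras import layers, models

WINDOW, CHANNELS = 128, 9   # assumed window length; accelerometer + magnetic field + gyroscope (x, y, z each)

model = models.Sequential([
    layers.Input(shape=(WINDOW, CHANNELS)),
    layers.Conv1D(64, 5, activation="relu", padding="same"),
    layers.Conv1D(64, 5, activation="relu", padding="same"),
    layers.Conv1D(64, 5, activation="relu", padding="same"),  # no pooling, to keep temporal detail
    layers.Dropout(0.5),                      # dropout applied to the LSTM inputs
    layers.LSTM(128, return_sequences=True),
    layers.LSTM(128),
    layers.Dense(2, activation="softmax"),    # accompanying vs. not (conversation trained analogously)
])
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
              loss="categorical_crossentropy",
              metrics=["accuracy"])
```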