• Title/Summary/Keyword: audiovisual data

Search Result 34

KMSAV: Korean multi-speaker spontaneous audiovisual dataset

  • Kiyoung Park;Changhan Oh;Sunghee Dong
    • ETRI Journal / v.46 no.1 / pp.71-81 / 2024
  • Recent advances in deep learning for speech and visual recognition have accelerated the development of multimodal speech recognition, yielding many innovative results. We introduce a Korean audiovisual speech recognition corpus. This dataset comprises approximately 150 h of manually transcribed and annotated audiovisual data, supplemented with an additional 2000 h of untranscribed videos collected from YouTube under the Creative Commons License. The dataset is intended to be freely accessible for unrestricted research purposes. Along with the corpus, we propose an open-source framework for automatic speech recognition (ASR) and audiovisual speech recognition (AVSR). We validate the effectiveness of the corpus with evaluations using state-of-the-art ASR and AVSR techniques, capitalizing on both pretrained models and fine-tuning processes. After fine-tuning, ASR and AVSR achieve character error rates of 11.1% and 18.9%, respectively. This error difference highlights the need for improvement in AVSR techniques. We expect that our corpus will be an instrumental resource to support improvements in AVSR.

Incomplete Cholesky Decomposition based Kernel Cross Modal Factor Analysis for Audiovisual Continuous Dimensional Emotion Recognition

  • Li, Xia;Lu, Guanming;Yan, Jingjie;Li, Haibo;Zhang, Zhengyan;Sun, Ning;Xie, Shipeng
    • KSII Transactions on Internet and Information Systems (TIIS) / v.13 no.2 / pp.810-831 / 2019
  • Recently, continuous dimensional emotion recognition from audiovisual cues has attracted increasing attention in both theory and practice. The large amount of data involved in recognition processing decreases the efficiency of most bimodal information fusion algorithms. In this paper, a novel algorithm, namely the incomplete Cholesky decomposition based kernel cross factor analysis (ICDKCFA), is presented and employed for continuous dimensional audiovisual emotion recognition. After the ICDKCFA feature transformation, two basic fusion strategies, namely feature-level fusion and decision-level fusion, are explored to combine the transformed visual and audio features for emotion recognition. Finally, extensive experiments are conducted to evaluate the ICDKCFA approach on the AVEC 2016 Multimodal Affect Recognition Sub-Challenge dataset. The experimental results show that the ICDKCFA method is faster than the original kernel cross factor analysis while achieving comparable performance. Moreover, the ICDKCFA method performs better than other common information fusion methods, such as fusion based on canonical correlation analysis, kernel canonical correlation analysis, and cross-modal factor analysis.

Condition and Effect of Sex Education Program for Korean Middle School Students (중학생의 성교육 실태 및 프로그램 효과분석)

  • Moon, In-Ok;Youn, Young-Ok;Kim, Ro-Eul
    • The Journal of Korean Society for School & Community Health Education / v.8 no.1 / pp.1-11 / 2007
  • Objectives: Schools must provide proper sex education so that students develop sound standards of sexuality and are protected from sexual crimes. This study was conducted to identify the effectiveness of, and students' satisfaction with, the sex education program for middle school students prepared by the Ministry of Education and Human Resources. Methods: The sample comprised 644 middle school students (458 female and 186 male). A self-report questionnaire survey was conducted from May 2 through May 27, 2005. Collected data were processed using SPSSwin 12.0 and analysed with t-tests and stepwise multiple regression analysis. Results: Lectures and audiovisual materials were the methods most often used for sex education. Many students were satisfied with the program content on physical and sexual organ development, pregnancy, contraceptive methods, and sexual abuse. Many students wanted to study more about courtship, love, and marriage. The topics that students did not understand well were sexually transmitted diseases, pregnancy, and mass media and sex. Forty-six percent of the students reported that they were satisfied with the education program, and thirty-three percent said that they were not. Students who had experienced menstruation earlier and students with higher academic achievement were more satisfied with the sex education program (P<.05). Students who were satisfied with the sex education CD prepared by the Ministry of Education were more satisfied with the sex education program (P<.001). When the CDs were used appropriately, students were more satisfied with the education program (P<.05). The sound and pictures in the CD did not much affect the students. Audiovisual programs were more effective than lectures.

The effect of preparatory audiovisual information with videotape influencing on sleep and anxiety of abdominal surgical patients (비디오테잎을 이용한 간호정보 제공이 수술전 수면 및 불안에 미치는 영향 -위수술환자를 중심으로-)

  • Kim Keum-Soon;Kang Jiy-Eon
    • Journal of Korean Academy of Fundamentals of Nursing / v.1 no.1 / pp.19-35 / 1994
  • To test the effectiveness of preparatory audiovisual information delivered by videotape, 34 patients with gastric cancer who were scheduled for subtotal gastrectomy were studied using a quasi-experimental research design. The subjects were selected from the 4 general surgical wards of one university hospital in Seoul and assigned to the experimental and control groups by convenience. The videotaped information on preparation for and recovery from surgery was shown to the experimental subjects once before the operation. Data on sleep and state anxiety level before and after the treatment day were collected with the VSH sleep scale and the STAI. The data were analyzed with a t-test to test the effect of the preparatory information and with Pearson's correlation to identify the correlation between anxiety and sleep. The results are summarized as follows: 1. After receiving the preparatory information, the anxiety level of the experimental group remained at its initial level, whereas that of the control group increased markedly. However, no significant difference in anxiety between the two groups was found. 2. There was a significant difference in sleep score between the experimental and control groups. 3. There was a significant negative correlation between the state anxiety score and the sleep score. Based on these findings, this study concludes that preparatory information is effective in enhancing sleep just prior to surgery.

Research on Audiovisual Type Preservation Format Selection Criteria and Recommended Formats: Focusing on Audio Types (시청각 유형 보존포맷 선정기준 및 권고포맷 연구 - 오디오 유형을 중심으로 -)

  • Hanyeok Jeon;Dongmin Yang
    • Journal of the Korean BIBLIA Society for library and Information Science / v.35 no.1 / pp.273-300 / 2024
  • In the electronic records environment, along with discussions on ways to digitize analog records, it is important to prepare preservation strategies for each type of record produced and received electronically. In the same context, there is a need to discuss applying a preservation format selection system, with the goal of long-term preservation, to data sets and audiovisual electronic records in addition to document types. Audiovisual records require preservation strategies appropriate to the characteristics of each medium, such as images, audio, and video. This study establishes unique standards for selecting a preservation format for audiovisual electronic records through an analysis of Significant Properties based on a literature review, composes suitability evaluation items for audio-type preservation formats, and proposes a recommended format based on the results of applying them.

A Study on Real-Time Multimedia Service Considering Network Performance in ATM Networks (ATM망에서 Network Performance를 고려한 Real-Time Multimedia Service에 관한 연구)

  • 김영준;이병호
    • Proceedings of the IEEK Conference / 1998.10a / pp.91-94 / 1998
  • ATM technology is reaching a level of maturity that allows for its deployment in local as well as wide area networks. Concurrently, audiovisual applications are foreseen as one of the major users of such broadband networks. In this paper, we present the requirements of real-time multimedia service on B-ISDN networks and simulate the transport of MPEG-2 encoded multimedia data over ATM networks using the CBR, VBR, and ABR ATM traffic service categories. We compare the delay of each, considering network performance, and identify what is needed for real-time multimedia service.

T-DMB Hybrid Data Service Part 1: Hybrid BIFS Technology (T-DMB 하이브리드 데이터 서비스 Part 1: 하이브리드 BIFS 기술)

  • Lim, Young-Kwon;Kim, Kyu-Heon;Jeong, Je-Chang
    • Journal of Broadcast Engineering / v.16 no.2 / pp.350-359 / 2011
  • Rapid developments in broadcasting technologies since the 1990s have enabled not only High Definition Television service, which provides high-quality audiovisual content at home, but also mobile broadcasting service, which provides audiovisual content to vehicles moving at high speed. Terrestrial Digital Multimedia Broadcasting (T-DMB) is one of the technologies developed for mobile broadcasting service and has been successfully commercialized. One of the major technical breakthroughs achieved by T-DMB, in addition to robust vehicular reception, is the adoption of a framework based on MPEG-4 System. It naturally enables integrated interactive data services by using Binary Format for Scene (BIFS) technology for scene description and representation of graphics objects, and the Object Descriptor Framework for representing multimedia service components as objects. T-DMB interactive data service has two fundamental limitations. First, graphic data for interactive service must always be overlaid on top of a video and cannot be rendered outside it. Second, data for interactive service is received only through the broadcasting channel. These limitations were considered normal in broadcasting systems. However, they are now regarded as hard limitations for personalized data services that use location information and user characteristics, which are becoming widely used in data services for smart devices. In this paper, an architecture for T-DMB hybrid data service is proposed that uses the broadcasting network, wireless internet, and local storage to deliver BIFS data and overcome these limitations. This paper also presents hybrid BIFS technology to implement T-DMB hybrid data service while maintaining backward compatibility with legacy T-DMB players.

A Study on the Construction of a Linked Database for an Integrated Service Platform of Local Culture and Arts Resources

  • Younghee Noh;Woojeong Kwak
    • International Journal of Knowledge Content Development & Technology / v.13 no.4 / pp.119-137 / 2023
  • This study explored a way to build a database (DB) that links the resources and the areas and regions already registered as cultural assets, in connection with a project that is newly building local culture and arts resources. To this end, the study first identified the type and scale of existing local culture and arts resources that could be linked. Then, to link the local cultural resources and the collected cultural assets, it investigated websites such as the Cultural Heritage Administration's National Cultural Heritage Portal, municipal and provincial tangible cultural festivals, municipal and provincial intangible cultural assets, and Gyeonggi Memory. Furthermore, the study identified the amount of information sources to be built and the current status of each information source to identify detailed information sources. Finally, the metadata of local culture and arts resources were presented by classifying them into material and publication data metadata, document metadata, audiovisual metadata, oral recording metadata, village information metadata, and personal information village information metadata.

Role of Print and Audiovisual Media in Cervical Cancer Prevention in Bangladesh

  • Nessa, Ashrafun;Hussain, Muhammad Anwar;Ur Rashid, Mohammad Harun;Akhter, Nargis;Roy, Joya Shree;Afroz, Romena
    • Asian Pacific Journal of Cancer Prevention / v.14 no.5 / pp.3131-3137 / 2013
  • Background: Visual inspection of cervix with acetic acid (VIA) is offered at 252 centers in 64 districts of Bangladesh. VIA+ve women are managed at colposcopy clinics of Bangabandhu Sheikh Mujib Medical University (BSMMU) and 14 Medical College Hospitals (MCHs). This research work has been supported by the 'UICC Cancer Prevention Campaign' programme. Objectives: This study explored the role of print materials and electronic media in improving cervical cancer screening in the present socio-cultural context of Bangladesh. Methods: This study was performed from January to August 2011 at two upazilas of Bangladesh (Singair with screening facility and Sonargaon without screening facility). Data were collected by focus group discussion (FGD) with women, husbands and community people before and after intervention. Information on cervical cancer screening and VIA camps was disseminated through advertisements on the local cable television line, microphone announcements, service providers and leaflets throughout the week prior to a VIA camp. Three-day VIA camps were organized at the upazila health complex (UHC) of both upazilas. Quantitative data were gathered from women at the camps on their source of information on VIA and the best method of awareness creation. Results: The population was aware of "cancer" and a notable number knew about cervical cancer. Baseline awareness of prevention and VIA was low, and it was negligible where screening services were unavailable. Awareness increased fourfold in both upazilas after the interventions, and half of the women and the majority of the community people became aware of screening and available facilities. Cable line advertisement (25.5%), microphone announcement (21.4%), and discussion sessions (20.4%) were effective for awareness creation on VIA. Television was mentioned as the best method (37.4%) of awareness creation. Conclusion: Television should be used for nation-wide awareness creation. For local awareness creation, cable line advertisement, microphone announcements and health education at Uthan Baithaks/EPI sessions can easily be adopted by the government.

Elementary, Middle, and High School Students' Perception of Polar Region (초·중·고등학생들의 극지에 대한 인식)

  • Chung, Sueim;Choi, Haneul;Kim, Minjee;Shin, Donghee
    • Journal of the Korean earth science society / v.42 no.6 / pp.717-733 / 2021
  • This study aims to provide basic data for setting the direction of polar literacy education and to raise awareness of the importance of polar research. Elementary, middle, and high school students' perception of the polar region was examined in terms of the current status of polar information, impressions of polar regions, and awareness of related issues. The study included 975 students from nine elementary, middle, and high schools, who responded to 16 questions, including close-ended and open-ended items. The results suggest that students had more experience of the polar region through audiovisual media but relatively limited learning experiences in school education. Their impression of the polar region was confined to the monotonous image of a polar bear in crisis following the melting of glaciers due to global warming. The students formed powerful images by combining scenes they saw in audiovisual media with emotions. In terms of recognizing problems in the polar region, the students were generally interested in creatures, the natural environment, and climate change, but their interests varied depending on their school level and their own career path. The students highly valued scientists' status as agents for addressing the problems facing the region and gave priority to global citizenship values rather than practical standards. Based on the results, we suggest the following: introducing and systematizing content focusing on the polar region in the school curriculum, providing a differentiated learning experience through cooperation between scientists and educators, establishing polar literacy based on concepts that are relevant to various subjects, adopting an earth system-centered learning approach, setting the direction for follow-up studies, and recognizing the need for science education that incorporates diverse values.