With the recent spread of the internet and the development of related application programs, multimedia content (text, images, video, audio, etc.) has become very easy to distribute and use. Digital signals can be duplicated easily, and the copies have the same quality as the original data, which makes it difficult to establish who the original owner is. Encryption and watermarking are the copyright protection methods used to address this problem. Digital watermarking is used to protect intellectual property (IP) and to authenticate the owner of multimedia content. This paper proposes a watermarking algorithm that embeds a watermark into multiple lower bitplanes of a digital image. In the proposed algorithm, the original image and the watermark image are each decomposed into bitplanes, and the watermarking operation is performed in the corresponding bitplanes. The position at which the watermark image is embedded in each bitplane serves as the watermarking key, and embedding is carried out in multiple lower bitplanes, which have no influence on human visual perception. The algorithm can therefore represent the watermark image as multiple inherent patterns and requires only a small amount of embedded watermark data. Experiments confirmed that, when the reference PSNR of the watermarked image is 40 dB, the algorithm is highly robust against the JPEG, MEDIAN and PSNR attacks but weak against the NOISE, RNDDIST, ROT, SCALE and SS attacks in the spatial domain.
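The bitplane embedding described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes an 8-bit grayscale cover image, a binary watermark, and a key that maps each lower bitplane to the (row, column) position of the watermark in that plane. The names `embed`, `extract`, `cover`, `mark` and `key` are hypothetical.

```python
def embed(cover, mark, key):
    """Write the binary watermark into the chosen bitplanes of the cover image.

    cover: 2-D list of 8-bit pixel values
    mark:  2-D list of 0/1 watermark bits
    key:   {bitplane: (row, col)} placement of the mark (assumed key format)
    """
    out = [row[:] for row in cover]  # leave the cover image untouched
    for plane, (r0, c0) in key.items():
        mask = 1 << plane
        for i, mark_row in enumerate(mark):
            for j, bit in enumerate(mark_row):
                pixel = out[r0 + i][c0 + j]
                # Clear the target bit, then set it to the watermark bit.
                out[r0 + i][c0 + j] = (pixel & ~mask) | (bit << plane)
    return out


def extract(image, mark_shape, key):
    """Read the watermark back from every bitplane named in the key."""
    rows, cols = mark_shape
    return {
        plane: [[(image[r0 + i][c0 + j] >> plane) & 1 for j in range(cols)]
                for i in range(rows)]
        for plane, (r0, c0) in key.items()
    }


cover = [[200, 201, 202, 203],
         [120, 121, 122, 123],
         [60, 61, 62, 63],
         [10, 11, 12, 13]]
mark = [[1, 0],
        [0, 1]]
key = {0: (0, 0), 1: (2, 2)}  # plane 0 at the top-left, plane 1 at the bottom-right

marked = embed(cover, mark, key)
recovered = extract(marked, (2, 2), key)
```

Because only bitplanes 0 and 1 are modified in this example, no pixel changes by more than 2 gray levels, which is why embedding in the lower planes has little visible effect.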
Place awareness is essential for the location-based services that are widely provided to smartphone users. However, traditional GPS-based methods are valid only outdoors, where the GPS signal is strong, and they also require symbolic place information for each physical location. In this paper, environmental sounds and images are used to recognize important aspects of each place. The proposed method extracts feature vectors from visual, auditory and location data recorded by a smartphone with built-in camera, microphone and GPS sensor modules. The heterogeneous feature vectors are then learned by an ensemble learning method in which each classifier learns one group of feature vectors and the classifiers vote to produce the highest-weighted result. The proposed method is evaluated for place recognition on a dataset of 3000 samples from six places. The experimental results show that recognition accuracy improves markedly when all kinds of sensory data are used, compared with using data from a single sensor or integrated audio-visual data only.
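The voting step of the ensemble can be sketched as follows. This is only an illustration under assumptions: the abstract does not say how the weights are obtained, so the per-classifier weights, the modality names and the function `weighted_vote` are hypothetical.

```python
from collections import defaultdict


def weighted_vote(predictions, weights):
    """Return the label with the largest total weight.

    predictions: {classifier_name: predicted_label}
    weights:     {classifier_name: weight} (assumed to be given, e.g. from validation)
    """
    scores = defaultdict(float)
    for name, label in predictions.items():
        scores[label] += weights[name]
    return max(scores, key=scores.get)


# One classifier per feature group (visual, auditory, location).
predictions = {"visual": "cafe", "audio": "street", "location": "street"}
weights = {"visual": 0.5, "audio": 0.3, "location": 0.3}
place = weighted_vote(predictions, weights)
```

Here the two weaker classifiers together outweigh the visual classifier, so the combined decision is "street".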
Sound event detection is a research area that models human auditory cognition by recognizing events in an environment containing multiple acoustic events and determining the onset and offset time of each event. DCASE, a research community on acoustic scene classification and sound event detection, runs challenges to encourage researchers to participate and to stimulate sound event detection research. However, the dataset provided by the DCASE Challenge is relatively small compared with ImageNet, a representative dataset for visual object recognition, and few open acoustic datasets are available. In this study, sound events that can occur indoors and outdoors are collected on a larger scale and annotated to construct a dataset. Furthermore, to improve performance on the sound event detection task, we developed a dual-CNN sound event detection system by adding a supplementary neural network, which determines the presence of sound events, to a convolutional neural network. Finally, we conducted a comparative experiment against the baseline systems of both DCASE 2016 and DCASE 2017.
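Determining onset and offset times from a detector's output can be sketched as a simple post-processing step. The sketch below is not the system described in the abstract: it assumes the network has already produced one activity probability per frame for a single event class, and the function name `frames_to_events`, the threshold and the hop size are hypothetical.

```python
def frames_to_events(probs, threshold=0.5, hop_s=0.02):
    """Turn frame-level probabilities into (onset, offset) pairs in seconds.

    probs:     per-frame probabilities for one event class
    threshold: frames at or above this value count as active (assumed value)
    hop_s:     time between consecutive frames in seconds (assumed value)
    """
    events = []
    start = None
    for i, p in enumerate(probs):
        active = p >= threshold
        if active and start is None:
            start = i  # event begins at this frame
        elif not active and start is not None:
            events.append((start * hop_s, i * hop_s))  # event ended before frame i
            start = None
    if start is not None:
        # The event is still active when the clip ends.
        events.append((start * hop_s, len(probs) * hop_s))
    return events


events = frames_to_events([0.1, 0.8, 0.9, 0.2, 0.7], threshold=0.5, hop_s=1.0)
```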
In this study, we comprehensively analyze the generalization performance of various deep learning-based active sonar target classifiers when applied to small and imbalanced active sonar datasets. To generate the active sonar datasets, we use data from two oceanic experiments conducted at different times and in different ocean areas. Each sample in the datasets is a time-frequency domain image extracted from the audio signal of a contact after the detection process. For the comprehensive analysis, we use 22 convolutional neural network (CNN) models. The two datasets are used alternately as the train/validation dataset and the test dataset. To calculate the variance in the output of the target classifiers, each train/validation/test run is repeated 10 times. Training hyperparameters are tuned using Bayesian optimization. The results demonstrate that shallow CNN models show better robustness and generalization performance than most deep CNN models. These results can serve as a valuable reference for future research on deep learning-based active sonar target classification.
In this paper, a new audio reproduction system is developed in which the cross-talk signals are cancelled at an arbitrary listener position. To remove the cross-talk signals adaptively according to the listener's position, a listener tracking method is employed. Tracking uses two microphones, and the listener direction is estimated from the time delay between the two microphone signals. Room reverberation is taken into account by means of linear prediction analysis. To remove the cross-talk signals at the left and right ears, the paths between the sources and the ears are represented using KEMAR head-related transfer functions (HRTFs), which were measured with an artificial dummy head. To evaluate the usefulness of the proposed listener tracking system, cross-talk cancellation performance was measured at the estimated listener positions in terms of the channel separation ratio (CSR). A CSR of -10 dB was achieved experimentally even when the listener position deviated somewhat. A real-time system was implemented on a floating-point digital signal processor (DSP). The average error in the estimated listener direction was 5 degrees, and the subjects perceived 80% of the stimuli in the correct direction.
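Estimating a direction from the time delay between two microphone signals can be sketched as below. This is a simplified illustration, not the system in the abstract: it assumes a far-field source, uses plain cross-correlation, and omits the linear prediction step used there against reverberation. The function names, the microphone spacing and the speed of sound are assumptions.

```python
import math


def estimate_delay(x, y, max_lag):
    """Return the lag (in samples) that maximizes sum(x[n] * y[n + lag]).

    A positive lag means the signal reaches microphone y later than microphone x.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = sum(x[n] * y[n + lag]
                    for n in range(len(x)) if 0 <= n + lag < len(y))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def direction_deg(lag, fs, mic_distance_m, c=343.0):
    """Convert a delay in samples to an angle from broadside (far-field assumption)."""
    s = c * lag / (fs * mic_distance_m)
    s = max(-1.0, min(1.0, s))  # guard against values just outside [-1, 1]
    return math.degrees(math.asin(s))


# An impulse that arrives two samples later at the second microphone.
x = [0, 0, 1, 0, 0, 0, 0, 0]
y = [0, 0, 0, 0, 1, 0, 0, 0]
lag = estimate_delay(x, y, max_lag=3)
angle = direction_deg(lag, fs=16000, mic_distance_m=0.2)
```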
This paper describes a sound engine for two Korean traditional instruments, the Gayageum and the Taepyeongso, implemented on a TMS320F2812. Gayageum and Taepyeongso models based on commuted waveguide synthesis (CWS) are used to synthesize each sound. The proposed sound engine has an instrument selection button, and the model of the selected instrument produces the corresponding sound at regular intervals. Each synthesized sound sample is transmitted to a DAC (TLV5638) over SPI and played through a speaker via an audio interface. The length of the delay line determines the fundamental frequency of the desired sound. To determine the length of the delay line, the time needed to synthesize one sound sample must be measured using a GPIO. It takes
This work proposes a 10b 25MS/s
Speaker recognition is generally divided into speaker identification and speaker verification. It plays an important role in automatic voice systems, and its importance is growing as portable devices, voice technology, and audio content continue to expand. Previous speaker recognition studies have aimed to determine automatically who the speaker is from voice files and to improve accuracy. Speech style is an important sociolinguistic subject: it contains very useful information that reveals the speaker's attitude, conversational intention, and personality, and this can be an important clue for speaker recognition. The sentence-final ending used in a speaker's utterance determines the sentence type and carries information such as the speaker's intention, psychological attitude, or relationship to the listener. Because the use of sentence-final endings varies with the characteristics of the speaker, the type and distribution of the endings used by an unidentified speaker can help in recognizing that speaker. However, few studies have considered speech style in existing text-based speaker recognition, and adding speech style information to speech signal-based speaker recognition techniques could further improve accuracy. The purpose of this paper is therefore to propose a novel method that uses speech style, expressed as sentence-final endings, to improve the accuracy of Korean speaker recognition. To this end, we propose a method called sentence sequencing, which generates vector values from the type and frequency of the sentence-final endings that appear in the utterances of a specific person. To evaluate the proposed method, training and performance evaluation were conducted with an actual drama script.
The method proposed in this study can be used to improve the performance of Korean speech recognition services.
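The idea of turning the type and frequency of sentence-final endings into a vector can be sketched as follows. This is not the sentence sequencing method itself, whose details the abstract does not give: the ending inventory, the suffix-matching rule and the function name `ending_vector` are all assumptions made for illustration.

```python
# Hypothetical inventory of sentence-final endings; a real system would need
# a much larger list or a morphological analyzer.
ENDINGS = ["습니다", "어요", "네요", "지"]


def ending_vector(sentences, endings=ENDINGS):
    """Return the relative frequency of each ending over one speaker's sentences."""
    counts = [0] * len(endings)
    # Try longer endings first so a short ending does not shadow a longer one.
    order = sorted(range(len(endings)), key=lambda k: -len(endings[k]))
    for sentence in sentences:
        core = sentence.strip().rstrip(".!?~")  # drop trailing punctuation
        for k in order:
            if core.endswith(endings[k]):
                counts[k] += 1
                break
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * len(endings)


speaker_lines = ["고맙습니다.", "반갑습니다.", "좋네요!", "맞지?"]
vector = ending_vector(speaker_lines)
```

Vectors of this kind, one per speaker, could then be compared or passed to a classifier alongside other features.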
The wall shear stress in the vicinity of end-to-end anastomoses under steady flow conditions was measured using a flush-mounted hot-film anemometer (FMHFA) probe. The experimental measurements were in good agreement with numerical results except in flows with low Reynolds numbers. The wall shear stress increased proximal to the anastomosis in flow from the Penrose tubing (simulating an artery) to the PTFE graft. In flow from the PTFE graft to the Penrose tubing, low wall shear stress was observed distal to the anastomosis. Abnormal distributions of wall shear stress in the vicinity of the anastomosis, resulting from the compliance mismatch between the graft and the host artery, might be an important factor in ANFH formation and graft failure. The present study suggests a correlation between regions of low wall shear stress and the development of anastomotic neointimal fibrous hyperplasia (ANFH) in end-to-end anastomoses.
Air pressure decay (APD) rate and ultrafiltration rate (UFR) tests were performed on new and saline-rinsed dialyzers as well as on dialyzers that had been reused in patients several times. Reused C-DAK 4000 (Cordis Dow) and CF IS-11 (Baxter Travenol) dialyzers obtained from the dialysis clinic were used in the present study. The new dialyzers exhibited a relatively flat APD, whereas the saline-rinsed and reused dialyzers showed a considerable amount of decay. C-DAK dialyzers had a larger APD (11.70