Search | Korea Science

Integrated receptive field diversification method for improving speaker verification performance for variable-length utterances (가변 길이 입력 발성에서의 화자 인증 성능 향상을 위한 통합된 수용 영역 다양화 기법)

Shin, Hyun-seo;Kim, Ju-ho;Heo, Jungwoo;Shim, Hye-jin;Yu, Ha-Jin
- The Journal of the Acoustical Society of Korea / v.41 no.3 / pp.319-325 / 2022
Variation in utterance length is a representative factor that can degrade the performance of speaker verification systems. To handle this issue, previous studies have attempted either to extract speaker features from various branches or to use convolution layers with different receptive fields. Combining the advantages of these two approaches for variable-length input, this paper proposes integrated receptive field diversification, which extracts speaker features through more diverse receptive fields. The proposed method processes the input features with convolutional layers that have different receptive fields at multiple time-axis branches, and extracts the speaker embedding by dynamically aggregating the processed features according to the length of the input utterance. The deep neural networks in this study were trained on the VoxCeleb2 dataset and tested on the VoxCeleb1 evaluation dataset, which was divided into 1 s, 2 s, 5 s, and full-length utterances. Experimental results demonstrated that the proposed method reduces the equal error rate by 19.7 % compared to the baseline.
https://doi.org/10.7776/ASK.2022.41.3.319
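
The equal error rate (EER) quoted in the abstract above is the operating point at which the false-acceptance and false-rejection rates coincide. The following is a minimal sketch of how an EER and a relative reduction can be computed from trial scores. The function names and the threshold sweep are illustrative assumptions, not the authors' evaluation code, and the abstract does not state explicitly whether its 19.7 % figure is a relative reduction.

```python
def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER by sweeping every observed score as a threshold."""
    thresholds = sorted(set(target_scores) | set(nontarget_scores))
    best_gap, eer = float("inf"), 1.0
    for t in thresholds:
        # False-acceptance rate: impostor trials scored at or above the threshold.
        far = sum(s >= t for s in nontarget_scores) / len(nontarget_scores)
        # False-rejection rate: genuine trials scored below the threshold.
        frr = sum(s < t for s in target_scores) / len(target_scores)
        if abs(far - frr) < best_gap:
            best_gap, eer = abs(far - frr), (far + frr) / 2
    return eer


def relative_reduction(baseline_eer, proposed_eer):
    """Relative EER reduction, assuming the percentage is measured against the baseline."""
    return (baseline_eer - proposed_eer) / baseline_eer
```

The sweep is quadratic in the number of trials, which is acceptable for a toy example; large trial lists are usually handled by sorting the scores once.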

Line Tracer Modeling for Educational Virtual Experiment (교육용 가상실험 라인 트레이서 모델링)

Ki, Jang-Geun;Kwon, Kee-Young
- Journal of Software Assessment and Valuation / v.17 no.2 / pp.109-116 / 2021
Traditionally, engineering education has been dominated by face-to-face instruction centered on experimental practice, but demand for online learning has soared with the rapid development of IT and Internet communication networks and with recent changes in the social environment such as COVID-19. For online education to be effective in engineering, where the proportion of experimental practice is high compared with other fields, virtual laboratory content that can replace actual experimental practice is essential. In this study, we developed a line tracer model and virtual experiment software that simulates it, for efficient online learning of microprocessor applications, which are essential not only in electrical and electronic engineering but across engineering fields where IT convergence takes place. In the developed line tracer model, the user can set various hardware parameter values as desired and write software in assembly language or C to test its operation on a computer. The virtual experiment software has been used in actual classes to verify its operation and is expected to be an efficient tool for virtual experimental practice in online, non-face-to-face classes.
https://doi.org/10.29056/jsav.2021.12.12

Low Power ADC Design for Mixed Signal Convolutional Neural Network Accelerator (혼성신호 컨볼루션 뉴럴 네트워크 가속기를 위한 저전력 ADC설계)

Lee, Jung Yeon;Asghar, Malik Summair;Arslan, Saad;Kim, HyungWon
- Journal of the Korea Institute of Information and Communication Engineering / v.25 no.11 / pp.1627-1634 / 2021
This paper introduces a low-power, compact ADC circuit for an analog convolutional filter in a low-power neural network accelerator SoC. While convolutional neural network accelerators can speed up learning and inference, they have the drawback of consuming excessive power and occupying a large chip area because of the large number of multiply-and-accumulate operators required when they are implemented in complex digital circuits. To overcome these drawbacks, we implemented an analog convolutional filter that consists of an analog multiply-and-accumulate arithmetic circuit along with an ADC. This paper focuses on the design optimization of a low-power 8-bit SAR ADC for the analog convolutional filter accelerator. We demonstrate how to minimize the capacitor-array DAC, an important component of the SAR ADC, making it three times smaller than the conventional circuit. The proposed ADC has been fabricated in a 65 nm CMOS process. It achieves an overall size of 1355.7 µm², a power consumption of 2.6 µW at a frequency of 100 MHz, an SNDR of 44.19 dB, and an ENOB of 7.04 bits.
https://doi.org/10.6109/jkiice.2021.25.11.1627
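
The SNDR and ENOB figures in the abstract above are linked by the standard relation ENOB = (SNDR − 1.76) / 6.02, and a SAR ADC resolves its output code by a bit-by-bit binary search against a DAC level. The sketch below illustrates both under idealized assumptions (a noiseless comparator and a perfect DAC); `enob_from_sndr` and `sar_convert` are hypothetical names and do not describe the fabricated circuit.

```python
def enob_from_sndr(sndr_db):
    """Effective number of bits from SNDR in dB (standard sine-wave relation)."""
    return (sndr_db - 1.76) / 6.02


def sar_convert(vin, vref, bits=8):
    """Idealized successive-approximation conversion of vin in the range [0, vref)."""
    code = 0
    for bit in reversed(range(bits)):
        trial = code | (1 << bit)            # tentatively set the next bit, MSB first
        dac_level = trial * vref / (1 << bits)
        if vin >= dac_level:                 # comparator decision: keep the bit
            code = trial
    return code
```

With the reported SNDR of 44.19 dB, the relation gives roughly 7.05 bits, which is in line with the 7.04 bits stated in the abstract.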

One-to-All and All-to-all Broadcasting Algorithms of Matrix Hypercube (매트릭스 하이퍼큐브의 일-대-다 방송과 다-대-다 방송 알고리즘)

Kim, Jongseok;Lee, Heongok
- Asia-pacific Journal of Multimedia Services Convergent with Art, Humanities, and Sociology / v.8 no.8 / pp.825-834 / 2018
Broadcasting is a basic data communication method for interconnection networks. There are two types of broadcasting: one-to-all broadcasting transmits a message from one node to all other nodes, and all-to-all broadcasting transmits messages from every node that holds a message to all other nodes. There are also two schemes, distinguished by how transmission ports are used per unit time: single port telecommunication (SLA), in which a node holding a message transmits it to only one adjacent node, and all port telecommunication (MLA), in which a node transmits messages to all adjacent nodes within one unit of time. The matrix hypercube is an interconnection network with an improved network cost compared with a hypercube with the same number of nodes. In this paper, we analyze broadcasting schemes for the matrix hypercube. First, we propose one-to-all and all-to-all broadcasting algorithms for the matrix hypercube. We then prove that the one-to-all broadcasting times are $2n+1$ and $2\lceil \frac{n}{2} \rceil + 1$ under the SLA and MLA models, respectively. We also show that the all-to-all broadcasting time under the SLA model is $5 \times 2^{\frac{n}{2}} - 2$ when $n$ is even and $5 \times 2^{\frac{n-1}{2}} + 2$ when $n$ is odd.
https://doi.org/10.21742/AJMAHS.2018.08.29
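
The closed-form broadcasting times stated in the abstract above can be tabulated directly. The sketch below only evaluates those formulas; it does not implement the broadcasting algorithms themselves. The function names are illustrative, and `n` is assumed to be the parameter of the matrix hypercube, which the abstract does not define.

```python
import math


def one_to_all_time(n, model):
    """One-to-all broadcasting time from the abstract, for the SLA or MLA model."""
    if model == "SLA":
        return 2 * n + 1
    if model == "MLA":
        return 2 * math.ceil(n / 2) + 1
    raise ValueError("model must be 'SLA' or 'MLA'")


def all_to_all_time_sla(n):
    """All-to-all broadcasting time under the SLA model, split by the parity of n."""
    if n % 2 == 0:
        return 5 * 2 ** (n // 2) - 2
    return 5 * 2 ** ((n - 1) // 2) + 2
```

For example, `n = 4` gives one-to-all times of 9 (SLA) and 5 (MLA) and an all-to-all SLA time of 18.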

Automated Story Generation with Image Captions and Recursive Calls (이미지 캡션 및 재귀호출을 통한 스토리 생성 방법)

Isle Jeon;Dongha Jo;Mikyeong Moon
- Journal of the Institute of Convergence Signal Processing / v.24 no.1 / pp.42-50 / 2023
Technological development has brought digital innovation to the entire media industry, including production and editing techniques, and the era of OTT services and streaming has diversified how consumers view content. The convergence of big data and deep learning networks has made it possible to automatically generate text in formats such as news articles, novels, and scripts, but few studies have reflected the author's intention while generating contextually coherent stories. In this paper, we describe the flow of pictures in a storyboard using image caption generation techniques and automatically generate story-tailored scenarios with language models. Using image captioning based on a CNN and an attention mechanism, we generate sentences that describe the pictures on the storyboard and feed the generated sentences into KoGPT-2, an artificial intelligence natural language processing model, to automatically generate scenarios that meet the planning intention. With this approach, scenarios customized to the author's intention and story can be created in large quantities to ease the burden of content creation, and artificial intelligence can take part in the overall process of digital content production to promote media intelligence.
https://doi.org/10.23087/jkicsp.2023.24.1.006

Network Coding Technologies for Wireless Bidirectional Asymmetric Relay (무선 양방향 비대칭 상호중계를 위한 네트워크 코딩 기법)

Bongseop Song;Sangpill Lee;Choong-Hee Lee;Inho Lee;In-Joong Nam
- Journal of Korea Society of Industrial Information Systems / v.29 no.5 / pp.1-9 / 2024
With the emergence of various next-generation wireless networks, the traditional store-and-forward (SF) method at network nodes has faced limitations in efficiently utilizing network capacity. To overcome these limitations, various network coding techniques based on the decode-and-forward (DF) method have been proposed. However, these techniques have primarily focused on traffic environments with asymmetric packet lengths between relay nodes, limiting their applicability when different modulation and coding schemes (MCS) are applied to relay nodes. This paper proposes a relay network coding scheme that supports high frequency efficiency while simultaneously enabling bidirectional relaying using DF, considering asymmetric MCS traffic that reflects different transmission data and wireless channel conditions between individual nodes for efficient utilization of wireless network capacity. Additionally, this paper demonstrates the possibility of cooperative communication at the relay and examines the effect of increased communication distance. Computer simulations are then conducted to verify the performance gains of the proposed technique in terms of network coding for each source node with asymmetric information lengths. The proposed technique shows additional bit error rate (BER) performance gains by adopting an incremental redundancy (IR) scheme that follows network coding, even in mobile node environments where direct link transmission between source nodes is possible.
https://doi.org/10.9723/jksiis.2024.29.5.001
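
As background for the abstract above, the textbook form of network coding at a two-way relay is a bitwise XOR of the two decoded packets, with the shorter packet zero-padded when the lengths are asymmetric. The sketch below shows only that generic idea. It is not the paper's scheme (it ignores MCS, channel coding, and incremental redundancy), all function names are illustrative, and it assumes each node learns the peer's packet length from a header.

```python
def xor_pad(a: bytes, b: bytes) -> bytes:
    """XOR two byte strings after zero-padding the shorter one to equal length."""
    size = max(len(a), len(b))
    a, b = a.ljust(size, b"\x00"), b.ljust(size, b"\x00")
    return bytes(x ^ y for x, y in zip(a, b))


def relay_encode(pkt_a: bytes, pkt_b: bytes) -> bytes:
    """Relay combines both decoded packets into one broadcast packet."""
    return xor_pad(pkt_a, pkt_b)


def node_decode(coded: bytes, own_pkt: bytes, peer_len: int) -> bytes:
    """A source node removes its own packet to recover the peer's packet."""
    return xor_pad(coded, own_pkt)[:peer_len]
```

A single relay broadcast thus serves both directions, which is where the capacity gain over store-and-forward comes from in this simplified picture.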

Development of Forest Road Network Model Using Digital Terrain Model (수치지형(數値地形)모델을 이용(利用)한 임도망(林道網) 배치(配置)모델의 개발(開發))

Lee, Jun Woo
- Journal of Korean Society of Forest Science / v.81 no.4 / pp.363-371 / 1992
This study was aimed at developing a computer model to determine rational road networks in mountainous forests. The computer model is composed of two major subroutines, for digital terrain analysis and route selection. The digital terrain model (DTM) provides various information on the topographic and vegetative characteristics of forest stands. The DTM also evaluates the effectiveness of road construction based on slope gradients. Using the results of the digital terrain analysis, the route selection subroutine heuristically determines the optimal road layout satisfying the predefined road densities. The route selection subroutine uses the area-partitioning method, which leads to unbiased road layouts in forest areas. The size of the unit partitioned area can be calculated as a function of the predefined road density. In addition, the user-defined road density of the area-partitioning method provides flexibility in applying the model to real situations. A rational road network can easily be achieved for varying road densities, which is an essential element in the network design of forest roads. The optimality conditions are evaluated in conjunction with longitudinal gradients, investment efficiency, earthwork quantity, or a mixed criterion of these three. The performance of the model was measured and then compared with that of conventional models in terms of average skidding distance, accessibility of stands, development index, and circulated road network index. The results of the performance analysis indicate that selecting road routes for network design using digital terrain analysis and the area-partitioning method improves the performance of the network design model.

Three Dimensional Measurements of Pore Morphological and Hydraulic Properties (토양 공극 형태와 수문학적 특성에 대한 3 차원적 측정)

Chun, Hyen-Chung;Gimenez, Daniel;Yoon, Sung-Won;Heck, Richard;Elliot, Tom;Ziska, Laise;Geaorge, Kate;Sonn, Yeon-Kyu;Ha, Sang-Keun
- Korean Journal of Soil Science and Fertilizer / v.43 no.4 / pp.415-423 / 2010
Pore network models are useful tools for investigating soil pore geometry. These models provide quantitative information on pore geometry from 3D images. This study presents a pore network model to quantify pore structure and hydraulic characteristics. The objectives of this work were to apply the pore network model to characterize and quantify pore structure from large images, to calculate water retention and hydraulic conductivity properties from a three-dimensional soil image, and to combine hydraulic properties measured in experiments with hydraulic properties calculated from the image. Soil samples were taken from a site at the Baltimore science center, which is located inside the city. Undisturbed columns were taken from the site and scanned with a computed tomography scanner at a resolution of 22 $\mu$m. Pore networks were extracted by medial-axis transformation and were used to measure pore geometry from one of the scanned samples. Water retention and unsaturated hydraulic conductivity values were calculated from the soil image. Soil bulk density, water retention, and unsaturated hydraulic conductivity were measured on three replicates of the scanned soil samples. 3D image analysis provided accurate, detailed pore properties such as individual pore volume, pore length, and tortuosity for all pores. These data made it possible to calculate accurate estimates of water retention and hydraulic conductivity. Combining the calculated and measured hydraulic properties gave more accurate information on pore sizes over a wider range than measured or calculated data alone. We conclude that hydraulic properties computed from soil images together with laboratory measurements can describe the full structure of intra- and inter-aggregate pores in soil.

Electronic Roll Book using Electronic Bracelet.Child Safe-Guarding Device System (전자 팔찌를 이용한 전자 출석부.어린이 보호 장치 시스템)

Moon, Seung-Jin;Kim, Tae-Nam;Kim, Pan-Su
- Journal of Intelligence and Information Systems / v.17 no.4 / pp.143-155 / 2011
Electronic tagging of sexual offenders was recently introduced to reduce and prevent sexual offences. However, most of the sexual offences against children occurring these days are committed by tagged offenders whose identities have been released. Crime prevention therefore requires measures that minimize harm more promptly and actively. This paper proposes a new system to relieve children's anxiety about sexual abuse and to solve the problems of the existing electronic bracelet. Existing bracelets are worn only by serious criminals and serve only risk management and positioning; they offer no way to protect children, who are the potential victims of sexual abuse, and such offences have in fact occurred. We therefore propose that students (children) also wear electronic bracelets based on LBS (Location Based Service) and USN (Ubiquitous Sensor Network) technology, so that dangerous situations can be monitored and identified intelligently, sexual offences against children can be prevented beforehand, and, while a crime is in progress, the situation can be judged intelligently and swift action taken to minimize harm. By checking students' attendance and position, guardians can know where their children are in real time and can protect them not only from sexual offences but also from violent crimes against children such as kidnapping. The overall system works as follows. The RFID tag for children monitors the approach of offenders. When an offender's RFID tag approaches, it transmits the situation and position as a first warning message to the control center and the guardians.
When the offender moves away, the system returns to monitoring mode. If the tag of the child or the offender is taken off, or if the child and the offender stay in one position for 3~5 minutes or longer, the system treats this as a dangerous situation, transmits the emergency situation and position as a second warning message to the control center and the guardians, and requests the dispatch of police to stop the crime at the initial stage. The RFID module of the criminals' electronic bracelets is an RFID tag and the RFID module for the children is an RFID receiver (reader), so whenever an offender is within 20 m of a child, the child's RFID module periodically transmits the situation to the control center through the automatic response of the receiver. For positioning, a GPS or mobile communications module (CELL module) is used outdoors, and a UWB or Wi-Fi based module is used indoors. The sensor is designed to measure position coordinates even indoors, so that the wearer's real-time situation and position can be sent to the server of the central control center. Children's position and situation can be checked through the RFID electronic roll book system of educational institutions and the safety system installed at home. When the child goes to school, attendance is checked through the electronic roll book, and when school is over the information is sent to the guardians. With RFID access control turnstiles installed at the apartment or at the entrance of the house, the child's arrival can be checked and the information transmitted to the guardians.
If the student is absent or has not arrived home, the child's information is sent from the electronic roll book or the access control turnstiles to the central control center, which locates the child's electronic bracelet using the GPS or mobile communications module and sends the information to the guardians and teacher so that they can report to the police immediately if necessary. The central management and control system is built for monitoring dangerous situations and for checks by guardians. It stores warning and pattern data to identify areas where dangerous situations occur, which can help prioritize the introduction of crime prevention systems such as CCTV. A database stores personal data and records the frequency of first and second warnings, the terminal IDs of the specific child and offender, the position where a warning was issued, and the situation (such as an approach, removal of the electronic bracelet, or staying in the same position for a certain time), and these data will be used to prevent crimes. Although electronic tagging has already been introduced to prevent the recurrence of child sexual offences, the crimes continue to occur. We therefore propose this system to prevent crimes beforehand for the sake of children's safety. If electronic bracelets are made easy to use and carry and are priced reasonably so that many children can use them, many crimes could be prevented and children could be protected easily. By preventing crimes before they happen, the system should contribute to a safer life.
https://doi.org/10.13088/jiis.2011.17.4.143
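
The two-stage warning rule described in the abstract above can be written as a small decision function. This is a minimal sketch under stated assumptions: the `Reading` fields and state names are hypothetical, the 20 m range comes from the abstract, and the dwell threshold is set to the lower bound of the abstract's "3~5 minutes" because no single value is given.

```python
from dataclasses import dataclass

PROXIMITY_M = 20.0   # detection range stated in the abstract
DWELL_S = 180.0      # assumed: lower bound of the "3~5 minutes" dwell time


@dataclass
class Reading:
    distance_m: float       # child-to-offender distance
    child_tag_on: bool      # False if the child's bracelet was taken off
    offender_tag_on: bool   # False if the offender's bracelet was taken off
    co_located_s: float     # time both have stayed at the same position


def classify(reading: Reading) -> str:
    """Map one sensor reading to a monitoring state."""
    if not reading.child_tag_on or not reading.offender_tag_on:
        return "second_warning"        # a tag was removed
    if reading.co_located_s >= DWELL_S:
        return "second_warning"        # prolonged stay at one position
    if reading.distance_m <= PROXIMITY_M:
        return "first_warning"         # offender is approaching
    return "monitoring"
```

Checking the second-warning conditions first means a removed tag is escalated regardless of the last known distance.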

Automatic gasometer reading system using selective optical character recognition (관심 문자열 인식 기술을 이용한 가스계량기 자동 검침 시스템)

Lee, Kyohyuk;Kim, Taeyeon;Kim, Wooju
- Journal of Intelligence and Information Systems / v.26 no.2 / pp.1-25 / 2020
In this paper, we propose an application system architecture that provides accurate, fast, and efficient automatic gasometer reading. The system captures a gasometer image with a mobile device camera, transmits the image to a cloud server over a private LTE network, and analyzes the image to extract the device ID and gas usage amount by selective optical character recognition based on deep learning. In general, an image contains many types of characters, and optical character recognition extracts all character information in the image. Some applications, however, need to ignore characters that are not of interest and focus only on specific types of characters. For example, an automatic gasometer reading system only needs to extract the device ID and gas usage amount from gasometer images in order to bill users. Character strings that are not of interest, such as device type, manufacturer, manufacturing date, and specification, are not valuable to the application. The application therefore has to analyze only the region of interest and specific types of characters to extract the valuable information. We adopted CNN (Convolutional Neural Network) based object detection and CRNN (Convolutional Recurrent Neural Network) technology for selective optical character recognition, which analyzes only the region of interest for selective character information extraction. We built three neural networks for the application system.
The first is a convolutional neural network that detects the regions of interest containing the gas usage amount and device ID character strings; the second is another convolutional neural network that transforms the spatial information of a region of interest into spatial sequential feature vectors; and the third is a bi-directional long short-term memory network that converts the spatial sequential information into character strings using time-series analysis, mapping feature vectors to character strings. In this research, the character strings of interest are the device ID and the gas usage amount. The device ID consists of 12 Arabic numerals and the gas usage amount consists of 4 ~ 5 Arabic numerals. All system components are implemented in the Amazon Web Services cloud with an Intel Xeon E5-2686 v4 CPU and an NVIDIA Tesla V100 GPU. The system architecture adopts a master-slave processing structure for efficient and fast parallel processing, coping with about 700,000 requests per day. The mobile device captures a gasometer image and transmits it to the master process in the AWS cloud. The master process runs on the Intel Xeon CPU and pushes reading requests from mobile devices to an input queue with a FIFO (First In First Out) structure. The slave process consists of the three types of deep neural networks that perform character recognition and runs on the NVIDIA GPU module. The slave process continuously polls the input queue for recognition requests. When there are requests from the master process in the input queue, the slave process converts the image in the input queue into the device ID character string, the gas usage amount character string, and the position information of the strings, returns the information to an output queue, and switches back to idle mode to poll the input queue. The master process gets the final information from the output queue and delivers it to the mobile device. We used a total of 27,120 gasometer images for training, validation, and testing of the three types of deep neural networks.
Of these, 22,985 images were used for training and validation and 4,135 images were used for testing. For each training epoch, we randomly split the 22,985 images in an 8:2 ratio for training and validation, respectively. The 4,135 test images were categorized into five types (normal, noise, reflex, scale, and slant). Normal means clean image data, noise means images with a noise signal, reflex means images with light reflection in the gasometer region, scale means images with a small object size due to long-distance capturing, and slant means images that are not horizontally level. The final character string recognition accuracies for the device ID and gas usage amount on normal data are 0.960 and 0.864, respectively.
https://doi.org/10.13088/jiis.2020.26.2.001
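
The request flow in the abstract above (a master process pushing requests into a FIFO input queue, a recognition process consuming them and writing to an output queue) can be sketched with Python's standard `queue` and `threading` modules. This is an illustrative, single-machine sketch only: `recognize_stub` is a hypothetical stand-in for the three neural networks, a thread replaces the GPU process, and a blocking `get()` replaces the polling loop the abstract describes.

```python
import queue
import threading


def recognize_stub(image):
    """Hypothetical placeholder for detection + CRNN recognition."""
    return {"image": image, "device_id": "0" * 12, "usage": "0000"}


def worker(in_q, out_q, recognize):
    """Recognition side: take requests in FIFO order until a None sentinel arrives."""
    while True:
        image = in_q.get()          # blocks until the master pushes a request
        if image is None:
            break
        out_q.put(recognize(image))


def run_batch(images, recognize=recognize_stub):
    """Master side: enqueue requests, then collect results from the output queue."""
    in_q, out_q = queue.Queue(), queue.Queue()
    thread = threading.Thread(target=worker, args=(in_q, out_q, recognize))
    thread.start()
    for image in images:
        in_q.put(image)
    in_q.put(None)                  # tell the worker to stop
    thread.join()
    return [out_q.get() for _ in images]
```

With a single worker, results come back in the order the requests were queued; with several workers, each result would need a request ID so the master can match it to the right device.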

Search Result 5,261, Processing Time 0.049 seconds
