• Title/Summary/Keyword: in-memory computing

A new lightweight network based on MobileNetV3

  • Zhao, Liquan;Wang, Leilei
    • KSII Transactions on Internet and Information Systems (TIIS) / v.16 no.1 / pp.1-15 / 2022
  • MobileNetV3 is designed specifically for mobile devices with limited memory and computing power. To reduce the number of network parameters and improve inference speed, a new lightweight network based on MobileNetV3 is proposed. First, to reduce the computation of residual blocks, a partial residual structure is designed by dividing the input feature maps into two parts; this structure replaces the residual block in MobileNetV3. Second, a dual-path feature extraction structure is designed to further reduce the computation of MobileNetV3. The two paths use different convolution kernel sizes to extract feature maps of different sizes. In addition, a transition layer is designed to fuse features and reduce the influence of the new structure on accuracy. The CIFAR-100 and ImageNet datasets are used to test the performance of the proposed partial residual structure: a ResNet built on it has fewer parameters and FLOPs than the original ResNet. The improved MobileNetV3 is tested on the CIFAR-10, CIFAR-100 and ImageNet image classification datasets. Compared with MobileNetV3, GhostNet and MobileNetV2, the improved MobileNetV3 has fewer parameters and FLOPs. The improved MobileNetV3 is also tested on a CPU and a Raspberry Pi, where it is faster than the other networks.
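
The idea of splitting the input feature maps into two parts can be sketched as follows. This is only an illustration under stated assumptions, not the authors' layer: the abstract does not say how the two parts are treated, so the sketch assumes that the first part receives a residual update and the second part is passed through unchanged and concatenated back. The names `partial_residual`, `transform` and `halve` are hypothetical.

```python
from typing import Callable, List, Optional

Channel = List[float]  # one flattened feature map


def partial_residual(
    channels: List[Channel],
    transform: Callable[[Channel], Channel],
    split: Optional[int] = None,
) -> List[Channel]:
    """Apply a residual update to the first part of the channels only.

    Assumption (not stated in the abstract): the second part bypasses
    the transform and is concatenated back unchanged.
    """
    k = len(channels) // 2 if split is None else split
    processed, bypass = channels[:k], channels[k:]
    # Residual update: output = input + transform(input), on the first part.
    updated = [[x + t for x, t in zip(ch, transform(ch))] for ch in processed]
    return updated + bypass


def halve(ch: Channel) -> Channel:
    """Hypothetical stand-in for the convolutional branch."""
    return [0.5 * x for x in ch]


features = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
out = partial_residual(features, halve)
```

Because `transform` runs on only half of the channels in this sketch, its cost is roughly halved compared with a residual block over all channels, which is the kind of saving the abstract describes.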

Development of Information Technology Infrastructures through Construction of Big Data Platform for Road Driving Environment Analysis (도로 주행환경 분석을 위한 빅데이터 플랫폼 구축 정보기술 인프라 개발)

  • Jung, In-taek;Chong, Kyu-soo
    • Journal of the Korea Academia-Industrial cooperation Society / v.19 no.3 / pp.669-678 / 2018
  • This study developed information technology infrastructure for building a driving environment analysis platform that uses various kinds of big data, such as vehicle sensing data and public data. First, as the hardware (H/W) component, a small platform server with a parallel structure for distributed big data processing was developed. Next, as the software (S/W) component, programs for big data collection/storage, processing/analysis, and information visualization were developed. The collection S/W was developed as a collection interface using Kafka, Flume, and Sqoop. The storage S/W was divided between the Hadoop Distributed File System and a Cassandra DB according to how the data are used. The processing S/W performs spatial unit matching and time interval interpolation/aggregation of the collected data by applying the grid index method. The analysis S/W was developed as an analytical tool based on the Zeppelin notebook for applying and evaluating the developed algorithm. Finally, the information visualization S/W was developed as a Web GIS engine program that provides and visualizes various driving environment information. In the performance evaluation, the number of executors, the optimal memory capacity, and the number of cores for the development server were derived, and the computation performance was superior to that of other cloud computing environments.
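
A rough sketch of grid-index matching with time-interval aggregation is given below. It is a minimal illustration, not the platform's processing S/W: the cell size, the interval length, the record layout `(lon, lat, ts, value)` and the function names are all assumptions, and interpolation of empty intervals is left out.

```python
from collections import defaultdict
from math import floor


def grid_key(lon, lat, ts, cell_deg, interval_s):
    """Map a point and timestamp to (grid column, grid row, time bucket)."""
    return (floor(lon / cell_deg), floor(lat / cell_deg), int(ts // interval_s))


def aggregate(records, cell_deg=0.1, interval_s=300):
    """Average the values that fall into the same grid cell and time bucket.

    cell_deg and interval_s are illustrative defaults, not values from the study.
    """
    buckets = defaultdict(list)
    for lon, lat, ts, value in records:
        buckets[grid_key(lon, lat, ts, cell_deg, interval_s)].append(value)
    return {key: sum(vals) / len(vals) for key, vals in buckets.items()}


# Hypothetical records: (longitude, latitude, seconds since start, speed)
records = [
    (127.02, 37.51, 10, 42.0),
    (127.03, 37.52, 200, 38.0),
    (127.02, 37.51, 400, 30.0),
    (127.31, 37.51, 20, 55.0),
]
cells = aggregate(records)
```

The first two records share a cell and a five-minute bucket and are averaged together; the third falls into the next bucket of the same cell, and the fourth into a different cell.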

Design and Implementation of User Authentication Protocol for Wireless Devices based on Java Card (자바카드 기반 무선단말기용 사용자 인증 프로토콜의 설계 및 구현)

  • Lee, Ju-Hwa;Seol, Kyoung-Su;Jung, Min-Soo
    • The KIPS Transactions:PartC / v.10C no.5 / pp.585-594 / 2003
  • Java Card is one of the promising smart card platforms based on Java technology. Java Card defines the packages and classes needed for embedded devices with small memory, such as smart cards. Java Card is compatible with EMV, an industry specification standard, and with ISO-7816, an international standard. However, Java Card does not offer a user authentication protocol. In this paper, we design and implement a user authentication protocol applicable to wireless devices based on Java Card, using the standard 3GPP Specification (SMS), the Java Card Specification (APDU), cryptography and so on. Our Java Card user authentication techniques can be applied to areas such as M-Commerce, Wireless Security, E-Payment System, Mobile Internet, Global Position Service, and Ubiquitous Computing.

An Efficient Join Algorithm for Data Streams with Overlapping Window (중첩 윈도우를 가진 데이터 스트링을 위한 효율적인 조인 알고리즘)

  • Kim, Hyeon-Gyu;Kang, Woo-Lam;Kim, Myoung-Ho
    • Journal of KIISE:Computing Practices and Letters / v.15 no.5 / pp.365-369 / 2009
  • Overlapping windows are generally used in queries that process continuous data streams. Nevertheless, existing approaches have discussed join algorithms only for basic types of windows such as tumbling windows and tuple-driven windows. In this paper, we propose an efficient join algorithm for overlapping windows, which are a more general type of window. The proposed algorithm is based on an incremental window join. It focuses on producing join results continuously when memory overflow occurs frequently. It consists of (1) a method to use incremental and full joins selectively, (2) a victim selection algorithm to minimize the latency of join processing and (3) an idle time processing algorithm. We show through experiments that the selective use of incremental and full joins provides better performance than using only one of them.
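
To make the notion of an incremental window join concrete, here is a minimal sketch of a generic symmetric join over two time-based sliding windows. It is not the proposed algorithm: the selective switch to full joins, victim selection and idle time processing are all omitted. The class name, the `window_range` parameter and the tuple layout are assumptions, and timestamps are assumed to arrive in non-decreasing order.

```python
from collections import deque


class IncrementalWindowJoin:
    """Symmetric equi-join over two time-based sliding windows.

    Each arriving tuple is joined only against the tuples currently in the
    opposite window, so results are produced incrementally instead of
    recomputing the full join whenever the windows slide.
    """

    def __init__(self, window_range):
        self.window_range = window_range
        # Tuples are stored as (timestamp, key, value) in arrival order.
        self.windows = {"L": deque(), "R": deque()}

    def _expire(self, now):
        # Drop tuples that have fallen out of the window ending at `now`.
        for window in self.windows.values():
            while window and window[0][0] <= now - self.window_range:
                window.popleft()

    def insert(self, side, ts, key, value):
        """Add a tuple to side 'L' or 'R' and return the new join results."""
        self._expire(ts)
        other = "R" if side == "L" else "L"
        partners = [v for (_, k, v) in self.windows[other] if k == key]
        self.windows[side].append((ts, key, value))
        if side == "L":
            return [(value, v) for v in partners]
        return [(v, value) for v in partners]


join = IncrementalWindowJoin(window_range=10)
results = [
    join.insert("L", 1, "a", "l1"),
    join.insert("R", 3, "a", "r1"),
    join.insert("L", 5, "a", "l2"),
    join.insert("R", 12, "a", "r2"),
]
```

By the time `r2` arrives at timestamp 12, `l1` has expired from the left window, so `r2` is paired with `l2` only.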

Design of Systolic Array for High Speed Processing of Block Matching Motion Estimation Algorithm (블록 정합 움직임추정 알고리즘의 고속처리를 위한 시스토릭 어레이의 설계)

  • 추봉조;김혁진;이수진
    • Journal of the Korea Society of Computer and Information / v.3 no.2 / pp.119-124 / 1998
  • The Block Matching Motion Estimation (BMME) algorithm demands a very large amount of computing power, and many fast algorithms have been proposed. These algorithms, however, lead to a larger VLSI size because search block data are not localized and because input data are not reused at each processing step. In this paper, we design a systolic array with high processing capacity that constrains the input/output pin count and reuses input data to keep the VLSI size small. The proposed systolic array optimizes memory access time through iterative reuse of the search block input data, becomes independent of the problem size through the increased parallelism of the algorithm, and keeps all processing element connections spatially and temporally localized. The designed systolic array reduces the time complexity of motion vector computation from O(N^6) to O(N^3) and has an input/output pin count of O(N).
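
For reference, the computation that such hardware accelerates can be sketched as a plain software full search. This is a generic sum-of-absolute-differences (SAD) block matcher, not the systolic array design; the SAD criterion, the function names and the tiny test frames are assumptions made for illustration.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between the n x n block of `cur` at
    (bx, by) and the block of `ref` displaced by (dx, dy)."""
    return sum(
        abs(cur[by + i][bx + j] - ref[by + dy + i][bx + dx + j])
        for i in range(n)
        for j in range(n)
    )


def full_search(cur, ref, bx, by, n, p):
    """Try every displacement in [-p, p] x [-p, p] and return
    (best_cost, dx, dy) for the block at (bx, by)."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-p, p + 1):
        for dx in range(-p, p + 1):
            y0, x0 = by + dy, bx + dx
            # Skip candidates that fall outside the reference frame.
            if y0 < 0 or x0 < 0 or y0 + n > height or x0 + n > width:
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return best


# Hypothetical 6 x 6 frames: `cur` is `ref` shifted by 2 columns and 1 row.
ref = [[10 * y + x * x for x in range(6)] for y in range(6)]
cur = [
    [ref[y + 1][x + 2] if y + 1 < 6 and x + 2 < 6 else 0 for x in range(6)]
    for y in range(6)
]
best = full_search(cur, ref, bx=1, by=1, n=2, p=2)
```

Every candidate displacement re-reads the same search-area pixels, which is the input data reuse opportunity the abstract refers to.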

A Bayesian Inference Model for Landmarks Detection on Mobile Devices (모바일 디바이스 상에서의 특이성 탐지를 위한 베이지안 추론 모델)

  • Hwang, Keum-Sung;Cho, Sung-Bae;Lea, Jong-Ho
    • Journal of KIISE:Computing Practices and Letters / v.13 no.1 / pp.35-45 / 2007
  • The log data collected from mobile devices contain diverse, meaningful and practical personal information. However, this information is usually ignored because of the devices' limitations in memory capacity, computation power and analysis. To overcome these problems of the mobile environment, we propose a novel method that detects landmarks, that is, information meaningful to users, by analyzing the log data in distributed modules. The proposed method adopts a Bayesian probabilistic approach to enhance inference accuracy in uncertain environments. The new cooperative modularization technique divides the Bayesian network into modules so that it can be computed efficiently with limited resources. Experiments with artificial data and real data show about 84% precision and about 76% recall on the artificial data, and a hit rate of about 89% on the real data when partial matches are included.

Circle Detection and Approximation for Inspecting a Fiber Optic Connector Endface (광섬유 연결 종단면 검사를 위한 원형 검출과 근사화 방법)

  • Kim, Jin-Soo
    • Journal of the Korea Institute of Information and Communication Engineering / v.18 no.12 / pp.2953-2960 / 2014
  • In the field of image recognition, circle detection is one of the most widely used techniques. Conventional algorithms are mainly based on the Hough transform, which is the most straightforward algorithm for detecting circles and is sufficiently robust. However, it suffers from large memory requirements and high computational loads, and sometimes detects incorrect circles. This paper proposes an optimal circle detection and approximation method that is applicable to inspecting a fiber optic connector endface. The proposed method finds the initial center coordinates and radius based on the initial edge lines. Then, by introducing a simplified K-means algorithm, it searches for a substitute circle by minimizing the area of the non-overlapped regions. Extensive simulations show that, compared to the Hough transform provided by the OpenCV library, the proposed method can improve the error rate by as much as 67% and reduce the computing time by as much as 80%.

A Resource-Aware Mapping Algorithm for Coarse-Grained Reconfigurable Architecture Using List Scheduling (리스트 스케줄링을 통한 Coarse-Grained 재구성 구조의 맵핑 알고리즘 개발)

  • Kim, Hyun-Jin;Hong, Hye-Jeong;Kim, Hong-Sik;Kang, Sung-Ho
    • Journal of the Institute of Electronics Engineers of Korea SD / v.46 no.6 / pp.58-64 / 2009
  • For reconfigurable computing to succeed, the algorithm that maps operations onto a coarse-grained reconfigurable architecture is very important. This paper proposes a resource-aware mapping system for coarse-grained reconfigurable architectures and its underlying heuristic algorithm. Operation assignment and routing path allocation are performed simultaneously with a cycle-accurate, time-exclusive resource model. The proposed algorithm minimizes communication resource usage and global memory access using the list scheduling heuristic. The operations to be mapped are prioritized by general properties of the data flow. Evaluations of the proposed algorithm show that performance is significantly enhanced in several benchmark applications.
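
The list scheduling heuristic mentioned above can be illustrated with a small generic scheduler. This is a sketch under assumptions, not the proposed mapping algorithm: it ignores routing and memory access, treats every operation as taking one cycle, and uses the longest path to a sink as the priority, which is only one possible data flow property. The function name and the example graph are hypothetical, and the input is assumed to be an acyclic graph in which every operation appears as a key.

```python
def list_schedule(deps, num_pes):
    """Schedule unit-latency operations on `num_pes` processing elements.

    deps maps each operation to the list of operations it depends on.
    Returns a dict mapping each operation to its start cycle.
    """
    succs = {op: [] for op in deps}
    for op, preds in deps.items():
        for pred in preds:
            succs[pred].append(op)

    height = {}

    def longest_path(op):
        # Priority: number of operations on the longest path to a sink.
        if op not in height:
            height[op] = 1 + max((longest_path(s) for s in succs[op]), default=0)
        return height[op]

    schedule = {}
    cycle = 0
    while len(schedule) < len(deps):
        # An operation is ready once all predecessors finished in earlier cycles.
        ready = [
            op for op in deps
            if op not in schedule
            and all(p in schedule and schedule[p] < cycle for p in deps[op])
        ]
        # Highest priority first; ties are broken by name for determinism.
        ready.sort(key=lambda op: (-longest_path(op), op))
        for op in ready[:num_pes]:
            schedule[op] = cycle
        cycle += 1
    return schedule


deps = {"a": [], "b": [], "c": [], "d": ["a", "b"], "e": ["d"], "f": ["c"]}
schedule = list_schedule(deps, num_pes=2)
```

With two processing elements, `a` and `b` are chosen ahead of `c` in the first cycle because they lie on the longest dependency chain.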

Design, Implementation, and Performance Evaluation of File System on a Chip (파일시스템을 내장한 저장장치의 설계, 구현 및 성능분석)

  • Ahn Seongiun;Choi Jongmoo;Lee Donghee;Noh Sam H.;Min Sang Lyul;Cho Yookun
    • Journal of KIISE:Computing Practices and Letters / v.10 no.6 / pp.448-459 / 2004
  • Interoperability is an important requirement of portable storage devices that are used to exchange and share data among diverse hosts. However, the required interoperability cannot be provided if different host systems use different file systems. To address this problem, we propose a new type of storage device called FSOC (File System On a Chip) that contains the file system within the storage device. In this paper, we give an example of the design and implementation of a flash memory-based FSOC and propose performance models for the conventional storage device and the FSOC. We also analyze the performance characteristics of the conventional storage device and the FSOC based on the proposed performance models, and provide several experimental results from real applications that validate the models.

Design and Implementation of Flash Translation Layer with O(1) Crash Recovery Time (O(1) 크래시 복구 수행시간을 갖는 FTL의 설계와 구현)

  • Park, Joon Young;Park, Hyunchan;Yoo, Chuck
    • KIISE Transactions on Computing Practices / v.21 no.10 / pp.639-644 / 2015
  • The capacity of flash-based storage such as Solid State Drives (SSD) and embedded Multi Media Cards (eMMC) is ever-increasing because of end-user demand. However, when flash-based storage crashes, for example during a power failure, the flash translation layer (FTL) is responsible for crash recovery, which is based on the entire flash memory. The recovery time therefore increases as the capacity of the flash-based storage increases. We propose O1FTL, an FTL with O(1) crash recovery time that is independent of the flash capacity. O1FTL adopts the working area technique suggested for flash file systems, and the design is evaluated on a real hardware platform. The results show that O1FTL achieves a crash recovery time that is independent of the capacity, with low overhead in terms of I/O performance and P/E cycles.