• Title/Summary/Keyword: Kernel Memory

Search Result 179, Processing Time 0.026 seconds

A Software VIA based PC Cluster System on SCI Network (SCI 네트워크 상의 소프트웨어 VIA기반 PC글러스터 시스템)

  • Shin, Jeong-Hee;Chung, Sang-Hwa;Park, Se-Jin
    • Journal of KIISE:Computer Systems and Theory
    • /
    • v.29 no.4
    • /
    • pp.192-200
    • /
    • 2002
  • The performance of a PC cluster system is limited by the use of traditional communication protocols, such as TCP/IP because these protocols are accompanied with significant software overheads. To overcome the problem, systems based on user-level interface for message passing without intervention of kernel have been developed. The VIA(Virtual Interface Architecture) is one of the representative user-level interfaces which provide low latency and high bandwidth. In this paper, a VIA system is implemented on an SCI(Scalable Coherent Interface) network based PC cluster. The system provides both message-passing and shared-memory programming environments and shows the maximum bandwidth of 84MB/s and the latency of $8{\mu}s$. The system also shows better performance in comparison with other comparable computer systems in carrying out parallel benchmark programs.

A Study on the Application of Zero Copy Technology to Improve the Transmission Efficiency and Recording Performance of Massive Data (대용량 데이터의 전송 효율 및 기록 성능 향상을 위한 Zero Copy 기술 적용에 관한 연구)

  • Song, Min-Gyu;Kim, Hyo-Ryoung;Kang, Yong-Woo;Je, Do-Heung;Wi, Seog-Oh;Lee, Sung-Mo;Kim, Seung-Rae
    • The Journal of the Korea institute of electronic communication sciences
    • /
    • v.16 no.6
    • /
    • pp.1133-1144
    • /
    • 2021
  • Zero-copy is a technology that is also called no-memory copy, and through its use, context switching between the user space and the kernel space can be reduced to minimize the load on the CPU. However, this technology is only used to transmit small random files, and has not yet been widely used for large file transfers. This paper intends to discuss the practical application of zero-copy in processing large files via a network. To this end, we first developed a small test bed and program that can transmit and store data based on zero-copy. Afterwards, we intend to verify the usefulness of the applied technology in detail through detailed performance evaluation

A new lightweight network based on MobileNetV3

  • Zhao, Liquan;Wang, Leilei
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.16 no.1
    • /
    • pp.1-15
    • /
    • 2022
  • The MobileNetV3 is specially designed for mobile devices with limited memory and computing power. To reduce the network parameters and improve the network inference speed, a new lightweight network is proposed based on MobileNetV3. Firstly, to reduce the computation of residual blocks, a partial residual structure is designed by dividing the input feature maps into two parts. The designed partial residual structure is used to replace the residual block in MobileNetV3. Secondly, a dual-path feature extraction structure is designed to further reduce the computation of MobileNetV3. Different convolution kernel sizes are used in the two paths to extract feature maps with different sizes. Besides, a transition layer is also designed for fusing features to reduce the influence of the new structure on accuracy. The CIFAR-100 dataset and Image Net dataset are used to test the performance of the proposed partial residual structure. The ResNet based on the proposed partial residual structure has smaller parameters and FLOPs than the original ResNet. The performance of improved MobileNetV3 is tested on CIFAR-10, CIFAR-100 and ImageNet image classification task dataset. Comparing MobileNetV3, GhostNet and MobileNetV2, the improved MobileNetV3 has smaller parameters and FLOPs. Besides, the improved MobileNetV3 is also tested on CPU and Raspberry Pi. It is faster than other networks

Booting Process Profiling Tool for Baseboard Management Controllers (베이스보드 매니지먼트 컨트롤러를 위한 부팅 과정 프로파일링 도구)

  • Jaeseop Kim;Minho Park;Jiman Hong
    • Smart Media Journal
    • /
    • v.11 no.11
    • /
    • pp.84-91
    • /
    • 2022
  • Baseboard Management Controller(BMC) supports server monitoring, maintenance, and control functions using various communication interfaces. However, if an unexpected problem occurs during the device driver initialization process, the BMC may not operate normally. Therefore, a boot process profiling tool that accurately analyzes the device driver initialization process and provides a function to check the analysis result is essential. Existing boot process profiling tools do not specifically provide the device driver initialization process and results required for BMC boot process analysis, forcing developers to use a combination of tools to analyze the boot process in detail. In this paper, we propose an integrated profiling tool for BMC's booting process. The proposed tool provides device driver initialization process analysis, CPU and memory usage analysis, and kernel version management functions. Users can easily analyze the booting process using the proposed tool, and the analysis result can be used to shorten the booting time. Also, the proposed tool is implemented in Linux-based BMC, and it is shown that the proposed tool is more efficient than the existing profiling tool.

Analysis of the Impact of Host Resource Exhaustion Attacks in a Container Environment (컨테이너 환경에서의 호스트 자원 고갈 공격 영향 분석)

  • Jun-hee Lee;Jae-hyun Nam;Jin-woo Kim
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.33 no.1
    • /
    • pp.87-97
    • /
    • 2023
  • Containers are an emerging virtualization technology that can build an isolated environment more lightweight and faster than existing virtual machines. For that reason, many organizations have recently adopted them for their services. Yet, the container architecture has also exposed many security problems since all containers share the same OS kernel. In this work, we focus on the fact that an attacker can abuse host resources to make them unavailable to benign containers-also known as host resource exhaustion attacks. Then, we analyze the impact of host resource exhaustion attacks through real attack scenarios exhausting critical host resources, such as CPU, memory, disk space, process ID, and sockets in Docker, the most popular container platform. We propose five attack scenarios performed in several different host environments and container images. The result shows that three of them put other containers in denial of service.

Sentiment Analysis of Movie Review Using Integrated CNN-LSTM Mode (CNN-LSTM 조합모델을 이용한 영화리뷰 감성분석)

  • Park, Ho-yeon;Kim, Kyoung-jae
    • Journal of Intelligence and Information Systems
    • /
    • v.25 no.4
    • /
    • pp.141-154
    • /
    • 2019
  • Rapid growth of internet technology and social media is progressing. Data mining technology has evolved to enable unstructured document representations in a variety of applications. Sentiment analysis is an important technology that can distinguish poor or high-quality content through text data of products, and it has proliferated during text mining. Sentiment analysis mainly analyzes people's opinions in text data by assigning predefined data categories as positive and negative. This has been studied in various directions in terms of accuracy from simple rule-based to dictionary-based approaches using predefined labels. In fact, sentiment analysis is one of the most active researches in natural language processing and is widely studied in text mining. When real online reviews aren't available for others, it's not only easy to openly collect information, but it also affects your business. In marketing, real-world information from customers is gathered on websites, not surveys. Depending on whether the website's posts are positive or negative, the customer response is reflected in the sales and tries to identify the information. However, many reviews on a website are not always good, and difficult to identify. The earlier studies in this research area used the reviews data of the Amazon.com shopping mal, but the research data used in the recent studies uses the data for stock market trends, blogs, news articles, weather forecasts, IMDB, and facebook etc. However, the lack of accuracy is recognized because sentiment calculations are changed according to the subject, paragraph, sentiment lexicon direction, and sentence strength. This study aims to classify the polarity analysis of sentiment analysis into positive and negative categories and increase the prediction accuracy of the polarity analysis using the pretrained IMDB review data set. First, the text classification algorithm related to sentiment analysis adopts the popular machine learning algorithms such as NB (naive bayes), SVM (support vector machines), XGboost, RF (random forests), and Gradient Boost as comparative models. Second, deep learning has demonstrated discriminative features that can extract complex features of data. Representative algorithms are CNN (convolution neural networks), RNN (recurrent neural networks), LSTM (long-short term memory). CNN can be used similarly to BoW when processing a sentence in vector format, but does not consider sequential data attributes. RNN can handle well in order because it takes into account the time information of the data, but there is a long-term dependency on memory. To solve the problem of long-term dependence, LSTM is used. For the comparison, CNN and LSTM were chosen as simple deep learning models. In addition to classical machine learning algorithms, CNN, LSTM, and the integrated models were analyzed. Although there are many parameters for the algorithms, we examined the relationship between numerical value and precision to find the optimal combination. And, we tried to figure out how the models work well for sentiment analysis and how these models work. This study proposes integrated CNN and LSTM algorithms to extract the positive and negative features of text analysis. The reasons for mixing these two algorithms are as follows. CNN can extract features for the classification automatically by applying convolution layer and massively parallel processing. LSTM is not capable of highly parallel processing. Like faucets, the LSTM has input, output, and forget gates that can be moved and controlled at a desired time. These gates have the advantage of placing memory blocks on hidden nodes. The memory block of the LSTM may not store all the data, but it can solve the CNN's long-term dependency problem. Furthermore, when LSTM is used in CNN's pooling layer, it has an end-to-end structure, so that spatial and temporal features can be designed simultaneously. In combination with CNN-LSTM, 90.33% accuracy was measured. This is slower than CNN, but faster than LSTM. The presented model was more accurate than other models. In addition, each word embedding layer can be improved when training the kernel step by step. CNN-LSTM can improve the weakness of each model, and there is an advantage of improving the learning by layer using the end-to-end structure of LSTM. Based on these reasons, this study tries to enhance the classification accuracy of movie reviews using the integrated CNN-LSTM model.

FPGA-based One-Chip Architecture and Design of Real-time Video CODEC with Embedded Blind Watermarking (블라인드 워터마킹을 내장한 실시간 비디오 코덱의 FPGA기반 단일 칩 구조 및 설계)

  • 서영호;김대경;유지상;김동욱
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.29 no.8C
    • /
    • pp.1113-1124
    • /
    • 2004
  • In this paper, we proposed a hardware(H/W) structure which can compress and recontruct the input image in real time operation and implemented it into a FPGA platform using VHDL(VHSIC Hardware Description Language). All the image processing element to process both compression and reconstruction in a FPGA were considered each of them was mapped into H/W with the efficient structure for FPGA. We used the DWT(discrete wavelet transform) which transforms the data from spatial domain to the frequency domain, because use considered the motion JPEG2000 as the application. The implemented H/W is separated to both the data path part and the control part. The data path part consisted of the image processing blocks and the data processing blocks. The image processing blocks consisted of the DWT Kernel fur the filtering by DWT, Quantizer/Huffman Encoder, Inverse Adder/Buffer for adding the low frequency coefficient to the high frequency one in the inverse DWT operation, and Huffman Decoder. Also there existed the interface blocks for communicating with the external application environments and the timing blocks for buffering between the internal blocks The global operations of the designed H/W are the image compression and the reconstruction, and it is operated by the unit of a field synchronized with the A/D converter. The implemented H/W used the 69%(16980) LAB(Logic Array Block) and 9%(28352) ESB(Embedded System Block) in the APEX20KC EP20K600CB652-7 FPGA chip of ALTERA, and stably operated in the 70MHz clock frequency. So we verified the real time operation of 60 fields/sec(30 frames/sec).

Massive Fluid Simulation Using a Responsive Interaction Between Surface and Wave Foams (수면거품과 웨이브거품의 미세한 상호작용을 이용한 대규모 유체 시뮬레이션)

  • Kim, Jong-Hyun
    • Journal of the Korea Computer Graphics Society
    • /
    • v.23 no.2
    • /
    • pp.29-39
    • /
    • 2017
  • This paper presents a unified framework to efficiently and realistically simulate surface and wave foams. The framework is designed to first project 3D water particles from an underlying water solver onto 2D screen space in order to reduce the computational complexity of determining where foam particles should be generated. Because foam effects are often created primarily in fast and complicated water flows, we analyze the acceleration and curvature values to identify the areas exhibiting such flow patterns. Foam particles are emitted from the identified areas in 3D space, and each foam particle is advected according to its type, which is classified on the basis of velocity, thereby capturing the essential characteristics of foam wave motions. We improve the realism of the resulting foam by classifying it into two types: surface foam and wave foam. Wave foam is characterized by the sharp wave patterns of torrential flow s, and surface foam is characterized by a cloudy foam shape even in water with reduced motion. Based on these features, we propose a technique to correct the velocity and position of a foam particle. In addition, we propose a kernel technique using the screen space density to efficiently reduce redundant foam particles, resulting in improved overall memory efficiency without loss of visual detail in terms of foam effects. Experiments convincingly demonstrate that the proposed approach is efficient and easy to use while delivering high-quality results.

HVIA-GE: A Hardware Implementation of Virtual Interface Architecture Based On Gigabit Ethernet (HVIA-GE: 기가비트 이더넷에 기반한 Virtual Interface Architecture의 하드웨어 구현)

  • 박세진;정상화;윤인수
    • Journal of KIISE:Computer Systems and Theory
    • /
    • v.31 no.5_6
    • /
    • pp.371-378
    • /
    • 2004
  • This paper presents the implementation and performance of the HVIA-GE card, which is a hardware implementation of the Virtual Interface Architecture (VIA) based on Gigabit Ethernet. The HVIA-GE card is a 32-bit/33MHz PCI adapter containing an FPGA for the VIA protocol engine and a Gigabit Ethernet chip set to construct a high performance physical network. HVIA-GE performs virtual-to-physical address translation, Doorbell, and send/receive completion operations in hardware without kernel intervention. In particular, the Address Translation Table (ATT) is stored on the local memory of the HVIA-GE card, and the VIA protocol engine efficiently controls the address translation process by directly accessing the ATT. As a result, the communication overhead during send/receive transactions is greatly reduced. Our experimental results show the maximum bandwidth of 93.7MB/s and the minimum latency of 11.9${\mu}\textrm{s}$. In terms of minimum latency HVIA-GE performs 4.8 times and 9.9 times faster than M-VIA and TCP/IP, respectively, over Gigabit Ethernet. In addition, the maximum bandwidth of HVIA-GE is 50.4% and 65% higher than M-VIA and TCP/IP respectively.