Title/Summary/Keyword: Write Performance

Robustness of Differentiable Neural Computer Using Limited Retention Vector-based Memory Deallocation in Language Model

  • Lee, Donghyun;Park, Hosung;Seo, Soonshin;Son, Hyunsoo;Kim, Gyujin;Kim, Ji-Hwan
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.15 no.3
    • /
    • pp.837-852
    • /
    • 2021
  • Recurrent neural network (RNN) architectures have been used for language modeling (LM) tasks that require learning long-range word or character sequences. However, RNN architectures still suffer from unstable gradients on long-range sequences. To address this issue, attention mechanisms have been adopted, showing state-of-the-art (SOTA) performance on LM tasks. A differentiable neural computer (DNC) is a deep learning architecture that uses an attention mechanism: a neural network augmented with a content-addressable external memory. However, in the write operation, some information unrelated to the input word remains in memory. Moreover, DNCs have been found to perform poorly with low numbers of weight parameters. Therefore, we propose a robust memory deallocation method using a limited retention vector. The limited retention vector determines, according to a threshold, whether the network increases or decreases its usage of the information in external memory. We experimentally evaluate the robustness of a DNC implementing the proposed approach with respect to the size of the controller and external memory on the enwik8 LM task. When we decreased the number of weight parameters by 32.47%, the proposed DNC showed a bits-per-character (BPC) degradation of only 4.30%, demonstrating the effectiveness of our approach in language modeling tasks.
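
The abstract does not spell out the deallocation rule, so the following is a minimal NumPy sketch of the idea, assuming the standard DNC retention vector psi = prod_i (1 - f_i * w_i) plus a threshold that frees weakly retained slots outright; the paper's exact rule may differ, and the write-weighting term of the full usage update is omitted for brevity.

```python
import numpy as np

def limited_retention_usage(usage, read_weights, free_gates, threshold=0.5):
    """One simplified usage-update step for DNC memory deallocation.

    usage:        (N,) previous usage of each memory slot
    read_weights: (R, N) read weightings from the previous time step
    free_gates:   (R,) controller free gates in [0, 1]
    threshold:    retention cut-off below which a slot is fully
                  deallocated (the 'limited' part; assumed here)
    """
    # Standard DNC retention vector: psi = prod_i (1 - f_i * w_i)
    psi = np.prod(1.0 - free_gates[:, None] * read_weights, axis=0)

    # Limited retention: clamp weakly retained slots to zero so stale
    # write information cannot linger in external memory.
    psi = np.where(psi < threshold, 0.0, psi)

    return usage * psi  # thresholded slots end the step with zero usage
```

Slots whose retention falls below the threshold are returned to the free pool in one step rather than decaying gradually, which is one plausible reading of how the method drops write residue.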

An Automated Industry and Occupation Coding System using Deep Learning (딥러닝 기법을 활용한 산업/직업 자동코딩 시스템)

  • Lim, Jungwoo;Moon, Hyeonseok;Lee, Chanhee;Woo, Chankyun;Lim, Heuiseok
    • Journal of the Korea Convergence Society
    • /
    • v.12 no.4
    • /
    • pp.23-30
    • /
    • 2021
  • An automated industry and occupation coding system assigns statistical classification codes to the enormous amount of natural language data collected from people who describe their own industry and occupation. Unlike previous studies that applied information retrieval, we propose a system that needs no index database and assigns the proper code regardless of the classification level. We also show that our model, which utilizes KoBERT, a deep learning model that achieves high performance on natural language downstream tasks, outperforms the baseline. Our method achieves 95.65% and 91.51% on occupation and industry code classification for the Population and Housing Census, respectively, and 97.66% on industry code classification for the Census on Basic Characteristics of Establishments. Moreover, we suggest future improvements through an error analysis with respect to both data and modeling.
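
A minimal sketch of the classification setup. Since the exact checkpoint and label set are not given here, "bert-base-multilingual-cased" serves only as a runnable stand-in for KoBERT (real KoBERT checkpoints typically ship their own tokenizer), and NUM_CODES is hypothetical.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

NUM_CODES = 200  # hypothetical size of the statistical code set
MODEL = "bert-base-multilingual-cased"  # stand-in for a KoBERT checkpoint

tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForSequenceClassification.from_pretrained(
    MODEL, num_labels=NUM_CODES)
model.eval()

def predict_code(description: str) -> int:
    """Map a free-text industry/occupation description to a code index."""
    inputs = tokenizer(description, truncation=True, max_length=128,
                       return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    return int(logits.argmax(dim=-1))

# The classification head is untrained here, so the output is arbitrary
# until the model is fine-tuned on labeled census records.
print(predict_code("편의점에서 상품 판매 및 계산 업무"))
```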

User Transparent File Encryption Mechanisms at Kernel Level (사용자 투명성을 갖는 커널 수준의 파일 암호화 메카니즘)

  • Kim, Jae-Hwan;Park, Tae-Kyou;Cho, Gi-Hwan
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.16 no.3
    • /
    • pp.3-16
    • /
    • 2006
  • File encryption in existing operating systems has typically relied on techniques that encrypt and decrypt an entire secret file at the application level with keys chosen by the user, which imposes considerable performance overhead. In contrast, when a security-classified user process writes a secret file, our proposed mechanism encrypts and stores the file automatically and efficiently at the kernel level of Linux, providing transparency to the user. When the user modifies the encrypted secret file, the mechanism decrypts only the affected part of the file and re-encrypts it when storing; likewise, when the user reads only part of the encrypted file, the mechanism automatically decrypts just that part. Therefore, our proposed mechanism provides much faster enciphering than existing application-level techniques.
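
The paper's kernel code is not reproduced here, but the property it relies on, partial decryption and re-encryption at block granularity, can be sketched in user space. A minimal sketch with the `cryptography` package, using AES-CTR because its counter can be advanced to any 16-byte block (the cipher choice is an assumption of this illustration, not the paper's):

```python
# Block-addressable encryption supporting partial reads/writes: with
# AES-CTR, seeking means recomputing the counter for the target block,
# so only the touched region is ever decrypted or re-encrypted.
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

BLOCK = 16  # AES block size in bytes

def _ctr_cipher(key: bytes, nonce: bytes, offset: int) -> Cipher:
    # Initial counter block for the AES block containing `offset`
    # (128-bit counter wraparound is ignored in this sketch).
    counter = int.from_bytes(nonce, "big") + offset // BLOCK
    return Cipher(algorithms.AES(key), modes.CTR(counter.to_bytes(BLOCK, "big")))

def read_partial(key, nonce, ciphertext: bytes, offset: int, length: int) -> bytes:
    """Decrypt only ciphertext[offset:offset + length]."""
    start = (offset // BLOCK) * BLOCK  # round down to a block boundary
    dec = _ctr_cipher(key, nonce, start).decryptor()
    plain = dec.update(ciphertext[start:offset + length]) + dec.finalize()
    return plain[offset - start:]
```

Partial writes work the same way in reverse: only the blocks covering the modified byte range are re-encrypted, which is what makes in-place updates of a large encrypted file cheap.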

Hazelcast Vs. Ignite: Opportunities for Java Programmers

  • Bartkov, Maxim;Katkova, Tetiana;Kruglyk, Vladyslav S.;Murtaziev, Ernest G.;Kotova, Olha V.
    • International Journal of Computer Science & Network Security
    • /
    • v.22 no.2
    • /
    • pp.406-412
    • /
    • 2022
  • Storing large amounts of data has been a major problem since the beginning of computing history, and Big Data has driven huge advances in improving business processes, for example by finding customers' needs with prediction models built on web and social media search. Different kinds of databases are commonly used to store such data, but with today's large-scale distributed applications handling enormous volumes, traditional databases are no longer viable; consequently, Big Data technologies were introduced to store, process, and analyze data at speed and to cope with day-by-day growth in users and data. Data streaming technologies were developed to process such data continuously in real time. The main purpose of Big Data stream processing frameworks is to let programmers query a continuous stream directly without dealing with the lower-level mechanisms: programmers write code against these runtime libraries (also called stream processing engines), which take large volumes of data and analyze them. Several streaming platforms for Big Data are freely available on the Internet, but selecting the most appropriate one is not easy for programmers. In this paper, we present a detailed description of two of the state-of-the-art and most popular frameworks, Apache Ignite and Hazelcast, and compare their performance using selected attributes.
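
The paper addresses Java programmers, but the two frameworks' near-identical key-value surface is quickest to show with their official Python clients (`hazelcast-python-client` and `pyignite`). A minimal sketch, assuming locally running clusters on the default ports:

```python
import hazelcast
from pyignite import Client as IgniteClient

# --- Hazelcast: distributed map via the official Python client ---
hz = hazelcast.HazelcastClient(cluster_members=["127.0.0.1:5701"])
users_hz = hz.get_map("users").blocking()  # blocking proxy: synchronous calls
users_hz.put("u1", "Alice")
print(users_hz.get("u1"))  # -> Alice
hz.shutdown()

# --- Apache Ignite: key-value cache via the thin client ---
ig = IgniteClient()
ig.connect("127.0.0.1", 10800)  # default thin-client port
users_ig = ig.get_or_create_cache("users")
users_ig.put("u1", "Alice")
print(users_ig.get("u1"))  # -> Alice
ig.close()
```

Both calls resolve to a partitioned in-memory store on the cluster side; the attribute-level differences the paper compares sit behind this common surface.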

A Proposal for Partial Automation Preparation System of BIM-based Energy Conservation Plan - Case Study on Automation Process Using BIM Software and Excel VBA - (BIM기반 에너지절약계획서 건축부문 부분자동화 작성 시스템 제안 - BIM 소프트웨어와 EXCEL VBA를 이용한 자동화과정을 중심으로 -)

  • Ryu, Jea-Ho;Hwang, Jong-Min;Kim, Sol-Yee;Seo, Hwa-Yeong;Lee, Ji-Hyun
    • Journal of KIBIM
    • /
    • v.12 no.2
    • /
    • pp.49-59
    • /
    • 2022
  • The main idea of this study is to propose a BIM-based automation system for drawing up the architecture section of an energy conservation plan report. To obtain a building permit under current law, an energy conservation plan must be prepared for buildings with a total floor area of 500 m² or more. The common practice today is to gather the data and drawings required for the plan manually and enter them directly into the verification system. This takes considerable effort and time in the design phase, which ultimately raises the initial cost of the project, including the services of companies specialized in the environmental field. In preparation for a mandatory BIM work process in the future, however, it is necessary to introduce a BIM-based automatic authoring system, which shortens the whole process and enables rapid permitting of energy-saving building designs. Among the many possible automation approaches, this study shows how to build an application with Revit's Dynamo and to complete the energy conservation plan report automatically through Dynamo and an Excel VBA algorithm, saving time and cost compared with the manual process. We also argue that digital transformation of the architectural process is necessary for efficient use of our automation system in the current energy conservation plan workflow.
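
The authors implement the report auto-fill with Dynamo and Excel VBA; as an illustrative analogue only (not the authors' code), the same step can be sketched in Python with openpyxl, with hypothetical sheet and field names:

```python
# Fill an energy-report worksheet from quantities exported out of the
# BIM model (as a Dynamo export might provide them). Illustrative only.
from openpyxl import load_workbook

bim_rows = [  # hypothetical extraction from the BIM model
    {"element": "Exterior Wall W1", "area_m2": 182.4, "u_value": 0.24},
    {"element": "Roof R1",          "area_m2": 95.0,  "u_value": 0.15},
]

wb = load_workbook("energy_conservation_plan.xlsx")
ws = wb["Architecture"]  # hypothetical report sheet name

for i, row in enumerate(bim_rows, start=2):  # row 1 holds the headers
    ws.cell(row=i, column=1, value=row["element"])
    ws.cell(row=i, column=2, value=row["area_m2"])
    ws.cell(row=i, column=3, value=row["u_value"])

wb.save("energy_conservation_plan_filled.xlsx")
```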

MATERIAL MATCHING PROCESS FOR ENERGY PERFORMANCE ANALYSIS

  • Yu, Jung-Ho;Kim, Ka-Ram;Jeon, Me-Yeon
    • International conference on construction engineering and project management
    • /
    • 2011.02a
    • /
    • pp.213-220
    • /
    • 2011
  • In the current construction industry, where various stakeholders take part, BIM data exchange using a standard format can provide a more efficient working environment for the staff involved throughout the life cycle of a building. Currently, the formats used to exchange data from 3D CAD applications to building energy analysis at the design stage are IFC, the international standard format provided by IAI, and gbXML, developed by Autodesk. However, because of insufficient data compatibility, BIM data produced in a 3D CAD application cannot be used directly in energy analysis, so additional data entry is required. The reasons are as follows. First, an IFC file cannot contain all the data required for energy simulation. Second, architects sometimes write material names on the drawings that do not match those in the standard material libraries used by energy analysis tools. DOE-2.2 and EnergyPlus are the most popular energy analysis engines, and each has its own material library; our investigation revealed that the two libraries are not compatible. First, the types and units of the properties differ. Second, the material names and material codes used in the libraries differ. Furthermore, there is no material library in the Korean language. Thus, by comparing the basic construction-material libraries of DOE-2, the most widely used energy analysis engine worldwide, and EnergyPlus, this study analyzes the material data required for energy analysis and proposes a way to enter these effectively using a semantic web ontology. This study is meaningful both for enhancing the objective credibility of energy analysis results and as a conceptual study on the use of ontology in the construction industry.
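
The core matching step, reconciling names and units between two material libraries, can be sketched as below. The alias table is hypothetical and the unit rule is one example (IP to SI thermal conductivity); the engines' actual libraries and unit conventions must be checked against their documentation.

```python
# Name-and-unit matching between two material libraries (sketch).

# IP -> SI thermal conductivity: 1 Btu/(h*ft*degF) = 1.730735 W/(m*K)
BTU_H_FT_F_TO_W_M_K = 1.730735

# Hypothetical alias table: DOE-2 / drawing names -> EnergyPlus names
ALIASES = {
    "conc hw 140lb": "Concrete, Heavyweight",
    "gyp bd 1/2in": "Gypsum Board 12.7mm",
}

def normalize(name: str) -> str:
    return " ".join(name.lower().split())

def match_material(doe2_name: str, conductivity_ip: float):
    """Return (EnergyPlus name, SI conductivity), or None if unmatched."""
    ep_name = ALIASES.get(normalize(doe2_name))
    if ep_name is None:
        return None  # flag for manual or ontology-based resolution
    return ep_name, conductivity_ip * BTU_H_FT_F_TO_W_M_K

print(match_material("CONC HW 140lb", 0.98))
# -> ('Concrete, Heavyweight', 1.696...)
```

An ontology, as the paper proposes, generalizes this flat alias table: synonym and unit relations live in a queryable graph rather than in hand-written dictionaries.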

An Effective Method Guaranteeing Mutual Exclusion of Lock Waiting Information for Deadlock Detection in Main Memory Databases (주기억장치 데이타베이스에서 교착 상태의 검출을 위한 락 대기 정보의 효과적인 상호 배제 기법)

  • Kim, Sang-Wook;Lee, Seung-Sun;Choi, Wan
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.24 no.7B
    • /
    • pp.1313-1321
    • /
    • 1999
  • The two-phase locking (2PL) protocol is the most widely used concurrency control mechanism for guaranteeing the logical consistency of data in a database environment where many transactions run concurrently. The problem inherent in 2PL is deadlock, where a set of transactions holding some locks waits indefinitely for additional locks that are already held by other transactions in the set. The deadlock detector is a DBMS sub-component that periodically examines, based on the lock waiting information of transactions, whether the system is in a deadlock state. The deadlock detector and the transactions execute concurrently in a DBMS and read and/or write the lock waiting information simultaneously. Since the lock waiting information is shared, an efficient method is needed to guarantee its physical consistency through mutual exclusion. The efficiency of the mutual exclusion method is especially crucial in a high-performance main memory DBMS, since it seriously affects the performance of the entire system. In this paper, we propose a new method that effectively guarantees the physical consistency of lock waiting information. The two primary goals of our method are to minimize processing overhead and to maximize system concurrency.
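
For reference, the shared structure in question can be pictured as a wait-for graph updated by transaction threads and scanned by a detector thread. A minimal sketch of the coarse single-mutex baseline that such work improves on (the paper's own scheme is finer-grained than this):

```python
import threading
from collections import defaultdict

class WaitForGraph:
    """Lock waiting info guarded by one coarse mutex (baseline design)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._edges = defaultdict(set)  # waiter txn -> {holder txns}

    def add_wait(self, waiter, holder):
        with self._lock:
            self._edges[waiter].add(holder)

    def remove_wait(self, waiter, holder):
        with self._lock:
            self._edges[waiter].discard(holder)

    def find_deadlock(self) -> bool:
        """DFS cycle check; runs under the same mutex as the updates,
        which is exactly the contention the paper wants to reduce."""
        with self._lock:
            WHITE, GRAY, BLACK = 0, 1, 2
            color = defaultdict(int)  # every node starts WHITE

            def dfs(node) -> bool:
                color[node] = GRAY
                for nxt in self._edges[node]:
                    if color[nxt] == GRAY:  # back edge closes a cycle
                        return True
                    if color[nxt] == WHITE and dfs(nxt):
                        return True
                color[node] = BLACK
                return False

            return any(color[n] == WHITE and dfs(n)
                       for n in list(self._edges))
```

Every graph update and the whole detection pass serialize on one lock here; minimizing that serialization without losing physical consistency is the paper's stated goal.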

Data Deduplication Method using Locality-based Chunking policy for SSD-based Server Storages (SSD 기반 서버급 스토리지를 위한 지역성 기반 청킹 정책을 이용한 데이터 중복 제거 기법)

  • Lee, Seung-Kyu;Kim, Ju-Kyeong;Kim, Deok-Hwan
    • Journal of the Institute of Electronics and Information Engineers
    • /
    • v.50 no.2
    • /
    • pp.143-151
    • /
    • 2013
  • NAND flash-based SSDs (Solid State Drives) offer fast input/output performance and low power consumption, so they are widely used as storage in tablets, desktop PCs, smartphones, and servers. However, SSD cells wear out as the number of writes increases. To improve SSD lifespan, a variety of data deduplication techniques have been introduced. The general fixed-size splitting method allocates chunks of fixed size without considering data locality, so it may perform unnecessary chunking and hash key generation, while the variable-size splitting method incurs excessive computation because it compares data byte by byte for deduplication. This paper proposes an adaptive chunking method based on the application locality and file name locality of data written to SSD-based server storage. The proposed method splits data into 4KB or 64KB chunks adaptively according to the application locality and file name locality of duplicated data, so it can reduce the overhead of chunking and hash key generation and prevent duplicated data from being written. The experimental results show that the proposed method improves write performance and reduces power consumption and operation time compared with the existing variable-size splitting method and the fixed-size splitting method using 4KB.
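
A toy sketch of the adaptive split plus the hash-based duplicate check. The paper's locality classifiers are not reproduced; a simple file-name predicate stands in for them, and the chunk sizes are the 4KB/64KB values from the abstract:

```python
import hashlib

SMALL, LARGE = 4 * 1024, 64 * 1024  # 4KB and 64KB chunk sizes

def chunk_size_for(filename: str) -> int:
    # Hypothetical locality rule standing in for the paper's classifier:
    # bulky media/image files dedup well with coarse 64KB chunks,
    # everything else gets fine-grained 4KB chunks.
    return LARGE if filename.endswith((".vmdk", ".iso", ".mp4")) else SMALL

def dedup_write(filename: str, data: bytes, store: dict) -> int:
    """Chunk `data`, keep only unique chunks; return bytes actually written."""
    size, written = chunk_size_for(filename), 0
    for off in range(0, len(data), size):
        chunk = data[off:off + size]
        key = hashlib.sha1(chunk).hexdigest()  # chunk fingerprint
        if key not in store:  # only unique chunks reach the SSD
            store[key] = chunk
            written += len(chunk)
    return written

store = {}
print(dedup_write("vm.vmdk", b"A" * (128 * 1024), store))  # 65536: one unique 64KB chunk
```

Coarser chunks mean fewer hash computations per megabyte, while finer chunks catch more duplicates; the adaptive rule tries to pick the cheaper side per file.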

The Accelerated Life Test of 2.5 Inch Hard Disks in the PC Usage Environment (PC 사용 환경의 2.5 인치 하드디스크의 가속 수명 시험)

  • Cho, Euy-Hyun;Park, Jeong-Kyu;Seo, Hui-Don
    • Journal of Digital Contents Society
    • /
    • v.15 no.1
    • /
    • pp.19-27
    • /
    • 2014
  • To estimate the life of 2.5 inch HDDs used in the PC environment, we made a test plan reflecting the failure modes observed in the market and an accelerated life test model reflecting temperature stress. After analyzing the PC usage environment, the test procedure was set to 50% write and 50% read operations, with 50% sequential and 50% random access. The accelerated life test was executed at temperatures of 50°C and 60°C, at 95% of maximum performance, for 1000 hours. An Anderson-Darling goodness-of-fit test on the failure data collected during the test confirmed that the failures follow a Weibull distribution. By Weibull-Arrhenius modeling, equality tests for the shape and scale parameters showed no statistical difference; the shape parameter was 0.7177, and the characteristic life at the normal user condition (30°C) was 429,434 hours. The activation energy was 0.2775 eV. Comparing the failed samples from the accelerated test with samples returned from the market, the ranking of failure-mode shares was almost the same, despite small differences in the individual shares. This study suggests an accelerated test procedure for 2.5 inch HDDs in the PC usage environment and supports life estimation by both manufacturers and users.
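
The figures quoted above are enough for a small worked check: the Arrhenius acceleration factor implied by the 0.2775 eV activation energy, and the characteristic life scaled from the 30°C use condition to the 60°C stress condition. A sketch using only the abstract's numbers:

```python
import math

K_BOLTZ_EV = 8.617e-5        # Boltzmann constant, eV/K
EA = 0.2775                  # activation energy from the abstract, eV
ETA_USE = 429_434            # characteristic life at 30 degC, hours
T_USE, T_STRESS = 30 + 273.15, 60 + 273.15  # kelvin

# Arrhenius acceleration factor between use and stress temperatures:
# AF = exp( Ea/k * (1/T_use - 1/T_stress) )
af = math.exp(EA / K_BOLTZ_EV * (1 / T_USE - 1 / T_STRESS))
print(f"acceleration factor 30->60 degC: {af:.2f}")          # ~2.60
print(f"implied characteristic life at 60 degC: {ETA_USE / af:,.0f} h")
```

With a Weibull shape parameter below 1 (0.7177), the hazard rate decreases over time, consistent with early-failure-dominated behavior.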

A Distributed Real-Time Concurrency Control Scheme using Transaction the Rise of Priority (트랜잭션 우선 순위 상승을 이용한 분산 실시간 병행수행제어 기법)

  • Lee, Jong-Sul;Shin, Jae-Ryong;Cho, Ki-Hyung;Yoo, Jae-Soo
    • Journal of KIISE:Databases
    • /
    • v.28 no.3
    • /
    • pp.484-493
    • /
    • 2001
  • As real-time database systems are extended to the distributed computing environment, the need has arisen to apply existing real-time concurrency control schemes to that environment. In this paper, we propose an efficient concurrency control scheme for distributed real-time database systems. The proposed scheme lets a transaction commit whenever possible, reduces restarts of transactions in the prepared-commit phase, and minimizes lock holding time, because it raises the priority of a transaction that has reached the prepared-commit phase in the distributed real-time environment. In addition, by lending the data held by the priority-raised transaction, it reduces the waiting time of transactions that borrow the data and improves system performance. We compare the proposed scheme with the DO2PL_PA (Distributed Optimistic Two-Phase Locking) and MIRROR (Managing Isolation in Replicated Real-time Object Repositories) protocols in terms of transaction arrival rate, transaction size, transaction write probability, and data replication degree in a firm-deadline real-time database system based on the two-phase commit protocol. The performance evaluation shows that our scheme outperforms the existing schemes.
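
A toy illustration of the priority-raising rule described above: once a transaction reaches the prepared-commit phase, its priority is raised above every active transaction, so a conflicting request waits (or borrows the data) rather than forcing a restart. All scheduling and two-phase-commit detail is elided, and the specific rule shown is an assumption drawn from the abstract:

```python
import itertools

_seq = itertools.count(1)

class Txn:
    def __init__(self, priority: int):
        self.id = next(_seq)
        self.priority = priority
        self.prepared = False  # set once the 2PC prepare vote succeeds

    def enter_prepared_phase(self, active_txns):
        """Raise this transaction above all currently active ones."""
        self.prepared = True
        top = max((t.priority for t in active_txns), default=self.priority)
        self.priority = top + 1

def resolve_conflict(holder: Txn, requester: Txn) -> str:
    """Pick the loser of a lock conflict; a prepared holder never restarts."""
    if holder.prepared or holder.priority >= requester.priority:
        return f"T{requester.id} waits (or borrows the data)"
    return f"T{holder.id} restarts"

a, b = Txn(priority=5), Txn(priority=9)
a.enter_prepared_phase(active_txns=[b])
print(resolve_conflict(a, b))  # -> T2 waits (or borrows the data)
```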
