• Title/Summary/Keyword: code optimization techniques

A novel radioactive particle tracking algorithm based on deep rectifier neural network

  • Dam, Roos Sophia de Freitas;dos Santos, Marcelo Carvalho;do Desterro, Filipe Santana Moreira;Salgado, William Luna;Schirru, Roberto;Salgado, Cesar Marques
    • Nuclear Engineering and Technology / v.53 no.7 / pp.2334-2340 / 2021
  • Radioactive particle tracking (RPT) is a minimally invasive nuclear technique that tracks a radioactive particle inside a volume of interest by means of a mathematical location algorithm. Over the past decades, many such algorithms have been developed, including ones based on artificial intelligence techniques. In this study, the RPT technique is applied to a simulated test section consisting of a simplified mixer filled with concrete, six scintillator detectors, and a $^{137}$Cs radioactive particle emitting 662 keV gamma rays. The test section was modeled with the MCNPX code, a Monte Carlo simulation code, and 3516 different radioactive particle positions (x, y, z) were simulated. The novelty of this paper is a location algorithm based on a deep learning model, specifically a 6-layer deep rectifier neural network (DRNN) whose hyperparameters were set using a Bayesian optimization method. A DRNN is a deep feedforward neural network that replaces the sigmoid-based activation functions traditionally used in vanilla multilayer perceptron networks with rectified activation functions. The results show that the DRNN is highly accurate in an RPT system: the root mean squared errors for the x, y, and z coordinates of the radioactive particle are 0.03064, 0.02523, and 0.07653, respectively.
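
The two ingredients named in this abstract, a rectified activation in place of a sigmoid and a per-coordinate root mean squared error, can be sketched in a few lines of Python. This is only an illustration: the function names, layer weights, and sample positions are hypothetical and are not taken from the paper.

```python
import math

def relu(x):
    """Rectified linear activation: max(0, x)."""
    return max(0.0, x)

def sigmoid(x):
    """Classic sigmoid activation, shown only for comparison."""
    return 1.0 / (1.0 + math.exp(-x))

def dense_relu(inputs, weights, biases):
    """One fully connected layer followed by ReLU (weights are hypothetical)."""
    return [
        relu(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]

def rmse_per_axis(predicted, actual):
    """RMSE computed separately for the x, y, and z coordinates."""
    n = len(actual)
    return tuple(
        math.sqrt(sum((p[axis] - t[axis]) ** 2 for p, t in zip(predicted, actual)) / n)
        for axis in range(3)
    )

# Made-up positions, for illustration only.
predicted = [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0)]
actual = [(0.0, 0.0, 0.0), (1.0, 1.0, 3.0)]
errors = rmse_per_axis(predicted, actual)
```

A deep rectifier network stacks several such `dense_relu` layers; the training procedure and the Bayesian hyperparameter search are outside the scope of this sketch.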

Analysis of Programming Techniques for Creating Optimized CUDA Software (최적화된 CUDA 소프트웨어 제작을 위한 프로그래밍 기법 분석)

  • Kim, Sung-Soo;Kim, Dong-Heon;Woo, Sang-Kyu;Ihm, In-Sung
    • Journal of KIISE:Computing Practices and Letters / v.16 no.7 / pp.775-787 / 2010
  • Unlike general-purpose CPUs, GPUs are specialized as many-core streaming processors and, thanks to their outstanding parallel computing capacity, are replacing CPUs in an increasing range of computations. In response to this trend, NVIDIA has recently released a new parallel computing architecture called CUDA (Compute Unified Device Architecture), which offers a flexible programming environment for GPGPU (General-Purpose GPU) computing. In general, programmers who use the CUDA API need a clear understanding of many aspects of the GPU's computing architecture to produce efficient parallel software. In this article, we explain several optimization techniques for CUDA programming that we have verified through extensive experimentation and trial and error, and review how those techniques affect the performance of code execution. In particular, we use a specific problem as an example to analyze several factors that affect performance, such as effective access to the hierarchical memory system, processor occupancy, and latency hiding. In conclusion, we present several guidelines that can be applied effectively in CUDA-based parallel programming.
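
As a rough illustration of why "effective access to the hierarchical memory system" matters, the following Python sketch counts how many memory segments one warp touches for a given access stride. It is a simplified model, not the article's analysis: the 32-thread warp, 4-byte elements, and 128-byte segments are assumed parameters, and the function name is hypothetical.

```python
def segments_touched(stride, warp_size=32, elem_bytes=4, segment_bytes=128):
    """Count distinct memory segments read when thread t accesses element t * stride.

    Fewer segments per warp means fewer memory transactions (better coalescing).
    All default parameters are assumptions of this simplified model.
    """
    addresses = (t * stride * elem_bytes for t in range(warp_size))
    return len({addr // segment_bytes for addr in addresses})

contiguous = segments_touched(1)   # neighbouring threads read neighbouring elements
strided = segments_touched(32)     # each thread lands in its own segment
```

Under these assumptions, contiguous access is served from a single segment, while a stride of 32 elements spreads the same warp over 32 segments.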

Development of Cr cold spray-coated fuel cladding with enhanced accident tolerance

  • Sevecek, Martin;Gurgen, Anil;Seshadri, Arunkumar;Che, Yifeng;Wagih, Malik;Phillips, Bren;Champagne, Victor;Shirvan, Koroush
    • Nuclear Engineering and Technology / v.50 no.2 / pp.229-236 / 2018
  • Accident-tolerant fuels (ATFs) are currently of high interest to researchers in the nuclear industry and in governmental and international organizations. One widely studied accident-tolerant fuel concept is multilayer cladding (also known as coated cladding). This concept is based on a traditional Zr-based alloy (Zircaloy-4, M5, E110, ZIRLO, etc.) serving as a substrate. Different protective materials are applied to the substrate surface by various techniques, thus enhancing the accident tolerance of the fuel. This study focuses on the results of testing Zircaloy-4 coated with pure chromium metal using the cold spray (CS) technique. In comparison with other deposition methods, e.g., physical vapor deposition (PVD), laser coating, or chemical vapor deposition (CVD), the CS technique is more cost efficient due to lower energy consumption and high deposition rates, making it more suitable for industry-scale production. The Cr-coated samples were tested under different conditions ($500^{\circ}C$ steam, $1200^{\circ}C$ steam, and a pressurized water reactor (PWR) pressurization test) and were precharacterized and postcharacterized by various techniques, such as scanning electron microscopy, energy-dispersive X-ray spectroscopy (EDX), and nanoindentation; the results are discussed. Steady-state fuel performance simulations using the Bison code predicted the concept's feasibility. It is concluded that CS Cr coating has high potential benefits but requires further optimization and out-of-pile and in-pile testing.

A Design of Parameterized Viterbi Decoder using Hardware Sharing (하드웨어 공유를 이용한 파라미터화된 비터비 복호기 설계)

  • Park, Sang-Deok;Jeon, Heung-Woo;Shin, Kyung-Wook
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2008.05a / pp.93-96 / 2008
  • This paper describes an efficient design of a multi-standard Viterbi decoder that supports multiple constraint lengths and code rates. The Viterbi decoder is parameterized for code rates 1/2 and 1/3 and constraint lengths 7 and 9, giving four operation modes. To achieve low hardware complexity and low power, an efficient architecture based on hardware sharing techniques is devised. In addition, optimizing the ACCS (Accumulate-Subtract) circuit for the one-point trace-back algorithm reduces its area by about 35% compared to the fully parallel ACCS circuit. The parameterized Viterbi decoder core has 79,818 gates and 25,600 bits of memory, and its estimated throughput is about 105 Mbps at a 70 MHz clock frequency.
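
To make the idea of a decoder parameterized by constraint length and code rate concrete, here is a minimal software sketch of a textbook hard-decision Viterbi decoder in Python. It is not the paper's hardware architecture: the small default code (constraint length 3, rate 1/2, generators 7 and 5 in octal) and all function names are assumptions chosen to keep the example short.

```python
def parity(x):
    """Parity (XOR of all bits) of a non-negative integer."""
    return bin(x).count("1") % 2

def step(state, bit, gens, k):
    """Shift one input bit into the encoder; return (next_state, output_bits)."""
    reg = (bit << (k - 1)) | state
    return reg >> 1, tuple(parity(reg & g) for g in gens)

def encode(bits, gens=(0b111, 0b101), k=3):
    """Convolutionally encode bits, appending k-1 zero tail bits to end in state 0."""
    state, coded = 0, []
    for b in list(bits) + [0] * (k - 1):
        state, out = step(state, b, gens, k)
        coded.extend(out)
    return coded

def viterbi_decode(coded, gens=(0b111, 0b101), k=3):
    """Hard-decision Viterbi decoding of a zero-terminated code word."""
    n, n_states, inf = len(gens), 1 << (k - 1), float("inf")
    metrics = [0] + [inf] * (n_states - 1)   # encoder starts in state 0
    history = []
    for i in range(0, len(coded), n):
        rx = coded[i:i + n]
        new_metrics = [inf] * n_states
        survivors = [None] * n_states
        for s in range(n_states):
            if metrics[s] == inf:
                continue
            for b in (0, 1):
                ns, out = step(s, b, gens, k)
                # Add (branch metric = Hamming distance), compare, select.
                cand = metrics[s] + sum(x != y for x, y in zip(rx, out))
                if cand < new_metrics[ns]:
                    new_metrics[ns] = cand
                    survivors[ns] = (s, b)
        metrics = new_metrics
        history.append(survivors)
    # Trace back from state 0, where the tail bits force the encoder to end.
    state, decoded = 0, []
    for survivors in reversed(history):
        state, b = survivors[state]
        decoded.append(b)
    decoded.reverse()
    return decoded[:len(decoded) - (k - 1)]

message = [1, 0, 1, 1, 0, 0, 1]
received = encode(message)
received[3] ^= 1                      # inject a single channel bit error
recovered = viterbi_decode(received)
```

Changing `gens` and `k` selects a different code rate and constraint length, which is the software analogue of the operation modes described in the abstract; the hardware-sharing and trace-back optimizations themselves are not modeled here.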

Efficient Motion Information Representation in Splitting Region of HEVC (HEVC의 분할 영역에서 효율적인 움직임 정보 표현)

  • Lee, Dong-Shik;Kim, Young-Mo
    • Journal of Korea Multimedia Society / v.15 no.4 / pp.485-491 / 2012
  • This paper proposes a quadtree-based 'Coding Unit Tree' that uses motion vectors to efficiently represent the splitting information of a Coding Unit (CU) in HEVC. The new international video coding standard, High Efficiency Video Coding (HEVC), adopts various techniques and new unit concepts: the CU, the Prediction Unit (PU), and the Transform Unit (TU). The basic coding unit, the CU, is larger than the macroblock of H.264/AVC and is split according to an image-based quadtree with a hierarchical structure. However, when a CU contains complex motion, more signaling bits carrying motion information need to be transmitted. This structure provides flexibility and a basis for optimization, but the splitting information adds overhead. This paper analyzes those signals and proposes a new algorithm that removes the redundancy. The proposed algorithm uses a type code, a dominant value, and residue values at each quadtree node to remove the additional bits. The type code represents the structure of an image tree, and the two values represent a node value. The results show that the proposed algorithm achieves a 13.6% bit-rate reduction over HM-1.0.

Embryo transfer of dorper breed to Mongolian sheep

  • Chuluunbayar Uuganbayar;Tsolmonbaatar Boldsaikhan;Byambasaikhan Danzan-Osor;Ho-Jun Lee;Sang-Hwan Kim;Enkhbolor Barsuren
    • Journal of Animal Reproduction and Biotechnology / v.37 no.4 / pp.226-230 / 2022
  • Sheep can be reproduced by natural mating as well as by an applied reproductive biotechnology, embryo transfer (ET). However, ET in sheep is influenced by several factors, such as season, photoperiod, latitude, temperature, nutrition, and breed. In addition, there is still less research on assisted reproductive technologies in small ruminants than in other livestock species such as cattle and pigs. Because ET techniques in small ruminants need optimization and continuous improvement, the main objective of this study was to evaluate the conception rate obtained after ET in Mongolian sheep (Dorper breed). After embryo recovery, code 1 and code 2 embryos (morula or blastocyst stage) used for ET in the present study were 63% (63/100) and 24% (24/100), respectively. Each single embryo was then transferred to a recipient synchronized by an estrous synchronization protocol with fluorogestone acetate-cloprostenol sodium. The results showed that the average conception rate and lambing rate were 35.6% (31/87) and 33.3% (29/87), respectively. Further study is still necessary, but these results indicate that transferring a single embryo with the present protocol is sufficient for conducting ET in Mongolian sheep when genetically superior sheep need to be multiplied.

Mechanical Characteristic Test of Architectural ETFE Film Membrane (크기최적화 이후에 나타나는 공간구조물의 후 좌굴 거동 변화에 대한 연구)

  • Lee, Sang-Jin;Jung, Ji-Myoung
    • Journal of Korean Association for Spatial Structures / v.9 no.3 / pp.75-82 / 2009
  • This paper investigates how the post-buckling behaviour of spatial structures changes after sizing optimization performed under linear assumptions. A mathematical programming technique is used to produce the optimum member sizes of spatial structures under external load. The total weight of the structure is the objective function to be minimized, and the displacement at the loading point and the member stresses are used as constraint functions. The finite difference method is used to calculate the design sensitivity of the objective function with respect to the design variables. The post-buckling analysis is carried out with the geometrically nonlinear finite element analysis code ISADO-GN. A large difference is found between the post-buckling behaviours of the initial and optimized structures. Therefore, the stability of spatial structures optimized under linear assumptions should be thoroughly checked with appropriate nonlinear analysis techniques. Finally, the present numerical results are provided as a benchmark test suite for future studies of large spatial structures.
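
The finite-difference design sensitivity mentioned in this abstract can be sketched as follows in Python. This is an illustration under stated assumptions rather than the paper's implementation: the two-member structure, the steel density of 7850 kg/m^3, the step size, and the function names are all hypothetical.

```python
def total_weight(areas, lengths, density=7850.0):
    """Objective: total structural mass for given member cross-section areas."""
    return density * sum(a * l for a, l in zip(areas, lengths))

def forward_difference_sensitivity(objective, design, rel_step=1e-6):
    """Approximate d(objective)/d(design[i]) by perturbing one variable at a time."""
    base = objective(design)
    grads = []
    for i, x in enumerate(design):
        h = rel_step * max(abs(x), 1.0)       # step size for variable i
        perturbed = list(design)
        perturbed[i] = x + h
        grads.append((objective(perturbed) - base) / h)
    return grads

lengths = [2.0, 3.0]                          # member lengths in metres (made up)
areas = [0.01, 0.02]                          # design variables: areas in m^2 (made up)
sensitivity = forward_difference_sensitivity(
    lambda a: total_weight(a, lengths), areas
)
```

For this linear weight function the exact sensitivity with respect to each area is density times member length, so the finite-difference result can be checked directly; constraint sensitivities (displacements and stresses) would need a structural analysis per perturbation.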

Storage Assignment for Variables Considering Efficient Memory Access in Embedded System Design (임베디드 시스템 설계에서 효율적인 메모리 접근을 고려한 변수 저장 방법)

  • Choi Yoonseo;Kim Taewhan
    • Journal of KIISE:Computer Systems and Theory / v.32 no.2 / pp.85-94 / 2005
  • Many design experiences have reported and verified that judicious use of the page and burst access modes supported by DRAMs greatly reduces not only DRAM access latency but also DRAM energy consumption. Recently, researchers showed that a careful arrangement of data variables in memory directly leads to maximum utilization of the page and burst access modes for variable accesses, but found that the problems are intractable and consequently resorted to simple (e.g., greedy) heuristic solutions. In this paper, to improve the quality of existing solutions, we propose 0-1 ILP-based techniques that produce optimal or near-optimal solutions depending on the formulation parameters. The proposed techniques are shown to use on average 32.2%, 15.1%, and 3.5% more page accesses, and 84.0%, 113.5%, and 10.1% more burst accesses, than OFU (the order of first use), the technique in [1, 2], and the technique in [3], respectively.
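
The effect of variable placement on page-mode accesses can be illustrated with a toy Python model. It assumes that a page-mode access occurs whenever two consecutive accesses fall in the same DRAM page, ignores burst mode, and uses exhaustive search in place of the paper's 0-1 ILP formulation; the access sequence, page size, and function names are hypothetical.

```python
from itertools import permutations

def page_hits(accesses, page_of):
    """Count consecutive access pairs that fall in the same page."""
    return sum(page_of[a] == page_of[b] for a, b in zip(accesses, accesses[1:]))

def assign_in_order(order, page_size):
    """Fill pages with variables in the given order, page_size variables per page."""
    return {var: i // page_size for i, var in enumerate(order)}

def order_of_first_use(accesses):
    """Variables listed in the order they first appear (the OFU baseline)."""
    return list(dict.fromkeys(accesses))

def best_assignment(accesses, page_size):
    """Exhaustive search over placements; only feasible for tiny examples."""
    variables = order_of_first_use(accesses)
    candidates = (assign_in_order(p, page_size) for p in permutations(variables))
    return max(candidates, key=lambda m: page_hits(accesses, m))

accesses = ["a", "b", "c", "a", "c", "b", "d", "a", "d"]   # made-up access trace
ofu_hits = page_hits(accesses, assign_in_order(order_of_first_use(accesses), 2))
best_hits = page_hits(accesses, best_assignment(accesses, 2))
```

In this example the OFU placement yields 1 page-mode access while the best placement yields 4. The search space grows factorially with the number of variables, which is consistent with the abstract's remark that earlier work fell back on greedy heuristics.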

A Study on Scalability of Profiling Method Based on Hardware Performance Counter for Optimal Execution of Supercomputer (슈퍼컴퓨터 최적 실행 지원을 위한 하드웨어 성능 카운터 기반 프로파일링 기법의 확장성 연구)

  • Choi, Jieun;Park, Guenchul;Rho, Seungwoo;Park, Chan-Yeol
    • KIPS Transactions on Computer and Communication Systems / v.9 no.10 / pp.221-230 / 2020
  • A supercomputer that shares limited resources among multiple users needs a way to optimize the execution of applications. For this purpose, it is useful for system administrators to obtain prior information and hints about the applications to be executed. In most high-performance computing system operations, system administrators strive to increase system productivity by collecting information about execution duration and resource requirements from users when tasks are executed. They also use profiling techniques that generate the necessary information from statistics such as system usage to increase system utilization. In a previous study, we proposed a scheduling optimization technique built on a hardware performance counter-based profiling technique that characterizes applications without requiring further understanding of the source code. In this paper, we constructed a profiling testbed cluster to support optimal execution on the supercomputer and tested the scalability of the profiling method for analyzing application characteristics in that cluster environment. We also verified experimentally that the profiling method remains usable for actual scheduling optimization even when the application class is reduced or the number of nodes used for profiling is minimized. Even when the number of nodes used for profiling was reduced to 1/4, the execution time of the application increased by 1.08% compared to profiling using all nodes, and the scheduling optimization performance improved by up to 37% compared to sequential execution. In addition, profiling with a reduced problem size cut the cost of collecting profiling data to a quarter and yielded a performance improvement of up to 35%.

3D Printing in Modular Construction: Opportunities and Challenges

  • Li, Mingkai;Li, Dezhi;Zhang, Jiansong;Cheng, Jack C.P.;Gan, Vincent J.L.
    • International conference on construction engineering and project management / 2020.12a / pp.75-84 / 2020
  • Modular construction is a construction method whereby prefabricated volumetric units are produced in a factory and installed on site to form a building block. Construction productivity can be substantially improved by the manufacturing and assembly of standardized modular units. 3D printing is a computer-controlled fabrication method that was first adopted in the manufacturing industry and has been used in recent years for the automated construction of small-scale houses. Implementing 3D printing in the fabrication of modular units brings substantial benefits to modular construction, including increased customization, lower material waste, and reduced labor. Such implementation also supports the large-scale and wider adoption of 3D printing in engineering practice. However, a critical issue for 3D printed modules is their loading capacity, particularly in response to horizontal forces such as wind load, which requires a deeper understanding of building structural behavior and the design of load-bearing modules. Therefore, this paper presents the state-of-the-art literature on recent achievements in 3D printing for buildings, followed by a discussion of the opportunities and challenges of 3D printing in modular construction. Promising 3D printing techniques are critically reviewed and discussed with regard to their advantages and limitations in construction. The appropriate structural form needs to be determined at the design stage, taking into consideration the overall building structural behavior, site environmental conditions (e.g., wind), and the load-carrying capacity of the 3D printed modules. Detailed finite element modelling of the entire modular building needs to be conducted to verify the structural performance, considering the code-stipulated lateral drift, strength criteria, and other design requirements. Moreover, integrating building information modelling (BIM) is beneficial for generating the material and geometric details of the 3D printed modules, which can then be used for fabrication.