• Title/Summary/Keyword: BLAS

Performance Improvements of SCAM Climate Model using LAPACK BLAS Library (SCAM 기상모델의 성능향상을 위한 LAPACK BLAS 라이브러리의 활용)

  • Dae-Yeong Shin;Ye-Rin Cho;Sung-Wook Chung
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology / v.16 no.1 / pp.33-40 / 2023
  • Advances in supercomputing and hardware technology have driven advances in numerical computation methods, which in turn make improved weather prediction possible. In this paper, we propose applying the LAPACK (Linear Algebra PACKage) BLAS (Basic Linear Algebra Subprograms) library to the linear algebra computations in the source code of Unicon (A Unified Convection Scheme), the cumulus parameterization code that is included in SCAM (Single-Column Atmospheric Model, a simplified version of CESM (Community Earth System Model)) and performs atmospheric computations. To analyze this, we present an overall execution structure diagram of SCAM and run tests in the corresponding execution environment. Compared with the existing source code, the SCOPY function achieved a 0.4053% performance improvement, the DSCAL function 0.7812%, and the DDOT function 0.0469%; applying all of them together yielded a 0.8537% improvement. This indicates that the proposed application of the LAPACK BLAS library for dense linear algebra operations can improve performance in the same CPU environment without additional hardware.
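
    The three routines named in this abstract are level-1 BLAS operations: a vector copy, a vector scaling, and a dot product. As a rough illustration of what each one computes, the sketch below gives pure-Python reference semantics. It is not the BLAS library, it does not reproduce the paper's code, and the function names `copy`, `scal`, and `dot` are illustrative; only positive strides are handled, which is an assumption made for brevity.

    ```python
    def copy(n, x, incx, y, incy):
        """y := x over n elements (the operation SCOPY performs in single precision)."""
        for i in range(n):
            y[i * incy] = x[i * incx]


    def scal(n, alpha, x, incx):
        """x := alpha * x over n elements (the operation DSCAL performs in double precision)."""
        for i in range(n):
            x[i * incx] *= alpha


    def dot(n, x, incx, y, incy):
        """Return the sum of x[i] * y[i] over n elements (the operation DDOT performs)."""
        return sum(x[i * incx] * y[i * incy] for i in range(n))


    # Small usage example with unit strides.
    x = [1.0, 2.0, 3.0]
    y = [0.0, 0.0, 0.0]
    copy(3, x, 1, y, 1)      # y now holds a copy of x
    scal(3, 2.0, y, 1)       # y is doubled in place
    result = dot(3, x, 1, y, 1)
    ```

    An optimized BLAS implements the same operations with vectorized, cache-aware kernels, which is where a speedup over hand-written loops would come from.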

Design Considerations on Large-scale Parallel Finite Element Code in Shared Memory Architecture with Multi-Core CPU (멀티코어 CPU를 갖는 공유 메모리 구조의 대규모 병렬 유한요소 코드에 대한 설계 고려 사항)

  • Cho, Jeong-Rae;Cho, Keunhee
    • Journal of the Computational Structural Engineering Institute of Korea / v.30 no.2 / pp.127-135 / 2017
  • The computing environment has changed rapidly, so that large-scale finite element models can now be analyzed on a PC or workstation thanks to multi-core CPUs, optimized math kernel libraries implementing BLAS and LAPACK, and the popularization of direct sparse solvers. In this paper, design considerations for a parallel finite element code on shared-memory multi-core CPU systems are proposed: (1) the use of optimized numerical libraries, (2) the use of the latest direct sparse solvers, (3) parallelism using OpenMP for computing element stiffness matrices, and (4) assembly techniques using triplets, a type of sparse matrix storage. In addition, the effect of parallelization on the time-consuming tasks is examined using a large-scale finite element model.
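
    Item (4) refers to triplet (coordinate) storage, in which each matrix contribution is recorded as a (row, column, value) entry and duplicates are summed afterwards. The sketch below shows the idea in serial pure Python. The function names and the two-element bar example are hypothetical and are not taken from the paper, which computes element matrices in parallel with OpenMP.

    ```python
    from collections import defaultdict


    def assemble_triplets(elements):
        """Collect (row, col, value) triplets from element matrices.

        `elements` is an iterable of (dofs, ke) pairs, where `dofs` lists the
        global degree-of-freedom indices of one element and `ke` is its dense
        element stiffness matrix as a list of rows.
        """
        rows, cols, vals = [], [], []
        for dofs, ke in elements:
            for a, i in enumerate(dofs):
                for b, j in enumerate(dofs):
                    rows.append(i)
                    cols.append(j)
                    vals.append(ke[a][b])
        return rows, cols, vals


    def sum_duplicates(rows, cols, vals):
        """Sum triplets that share the same (row, col) position."""
        acc = defaultdict(float)
        for i, j, v in zip(rows, cols, vals):
            acc[(i, j)] += v
        return dict(acc)


    # Two 1D bar elements of unit stiffness sharing node 1 (illustrative data).
    ke = [[1.0, -1.0], [-1.0, 1.0]]
    elements = [([0, 1], ke), ([1, 2], ke)]
    rows, cols, vals = assemble_triplets(elements)
    K = sum_duplicates(rows, cols, vals)
    ```

    Because each element only appends to the triplet lists, the element loop has no write conflicts on a shared global matrix, which is one reason this layout is convenient when element matrices are computed in parallel.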

Novel Low-Power High-dB Range CMOS Pseudo-Exponential Cells

  • De La Cruz Blas, Carlos A.;Lopez-Martin, Antonio
    • ETRI Journal / v.28 no.6 / pp.732-738 / 2006
  • In this paper, novel CMOS pseudo-exponential circuits operating in a class-AB mode are presented. The pseudo-exponential approximation employed is based on second order equations. Such terms are derived in a straightforward way from the inherent nonlinear currents of class-AB transconductors. The cells are appropriate to be integrated in portable equipment due to their compactness and very low power consumption. Measurement results from a fabricated prototype in a 0.5 $\mu$m technology reveal a range of 45 dB with errors lower than $\pm 0.5$ dB, a power consumption of 100 $\mu$W, and an area of 0.01 mm$^2$.

Direct Methods for Linear System on Distributed Memory Parallel Computers

  • Nishimura, S.;Shigehara, T.;Mizoguchi, H.;Mishima, T.;Kobayashi, H.
    • Proceedings of the IEEK Conference / 2000.07a / pp.333-336 / 2000
  • We discuss direct methods (Gauss-Jordan and Gaussian elimination) for solving linear systems on distributed memory parallel computers. We show that the so-called row-cyclic storage gives the best performance among the three standard data storage schemes (row-cyclic, column-cyclic, and cyclic-cyclic). We also show that Gauss-Jordan elimination, rather than Gaussian elimination, is highly efficient for the direct solution of linear systems in parallel processing, even though Gauss-Jordan elimination requires more arithmetic operations than Gaussian elimination. Numerical experiments are performed on a HITACHI SR12201 with the standard libraries MPI and BLAS.
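
    For reference, the sketch below shows a serial Gauss-Jordan elimination with partial pivoting, together with the usual row-cyclic ownership rule in which row i belongs to process i mod P. It is a minimal illustration under those assumptions, not the authors' MPI implementation; the function names are hypothetical. In Gauss-Jordan elimination every row other than the pivot row is updated at each step and no back substitution is required.

    ```python
    def row_owner(i, nprocs):
        """Row-cyclic storage: row i is held by process i mod nprocs."""
        return i % nprocs


    def gauss_jordan_solve(a, b):
        """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
        n = len(a)
        # Build the augmented matrix [a | b] as floats.
        m = [[float(v) for v in row] + [float(rhs)] for row, rhs in zip(a, b)]
        for k in range(n):
            # Choose the row with the largest pivot candidate in column k.
            p = max(range(k, n), key=lambda r: abs(m[r][k]))
            if m[p][k] == 0.0:
                raise ValueError("matrix is singular")
            m[k], m[p] = m[p], m[k]
            # Normalize the pivot row.
            pivot = m[k][k]
            m[k] = [v / pivot for v in m[k]]
            # Eliminate column k from every other row, above and below.
            for r in range(n):
                if r != k and m[r][k] != 0.0:
                    f = m[r][k]
                    m[r] = [vr - f * vk for vr, vk in zip(m[r], m[k])]
        # The last column of the reduced matrix is the solution.
        return [m[i][n] for i in range(n)]


    solution = gauss_jordan_solve([[2.0, 1.0], [1.0, 3.0]], [3.0, 5.0])
    owners = [row_owner(i, 3) for i in range(5)]
    ```

    With row-cyclic ownership, the rows updated in the inner loop are spread evenly over the processes at every elimination step.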

Phylogenetic study of Penicillium chrysogenum based on the amino acid sequence analysis of chitin synthase

  • Park, Bum-Chan;Lee, Dong-Hun;Sook, Bae-Kyung;Park, Hee-Moon
    • Journal of Microbiology / v.35 no.3 / pp.159-164 / 1997
  • A phylogenetic study of Penicillium chrysogenum was performed based on amino acid sequence comparison of chitin synthase. Phylogenetic trees were constructed with the deduced amino acid sequences of the highly conserved region of chitin synthase gene fragments amplified by PCR. The BLASTP similarity search and the bootstrap analysis of the deduced amino acid sequences of chitin synthase from P. chrysogenum with those from other fungi showed a close evolutionary relationship of Penicillium to ascomycetous fungi, especially to the genus Aspergillus. The result of the bootstrap analysis of the deduced amino acid sequences of the Class II chitin synthase from ascomycetous fungi supported the usefulness of the Class II chitin synthase for phylogenetic study of filamentous fungi.

A Study on the Stage of Embryos Non-Surgically Recovered from Heifers and Cows in Natural Heat (자연배란된 처녀우와 경산으로부터 비외과적으로 회수한 수정란의 발육단계에 관한 연구)

  • 정구민;김종국;임경순
    • Journal of Embryo Transfer / v.4 no.1 / pp.41-45 / 1989
  • A total of 30 nonsurgical flushings were attempted on days 4 to 15 of the estrous cycle with S heifers and 9 cows. The flushing success rate was 86.7% (26/30), and the recovery rate among successful flushings was 88.5% (23/26). There was no difference in the recovery rate between heifers (85.7%, 6/7) and cows (89.5%, 17/19). Embryos were recovered on days 4 to 15 of the estrous cycle from donors in natural heat without any technical difficulty. The 12FG Foley catheter used for pubertal heifers sometimes became plugged with uterine mucus during flushing of the uterine horn, but this problem could be overcome by pumping the catheter with flushing solution or by changing the catheter. Three normal embryos were recovered from 3 pubertal (10-11 month old) heifers. The rates of normal and abnormal eggs were 60.9% (14/23) and 39.1% (9/23), respectively. The abnormal eggs were degenerating, except for one unfertilized egg, and were mostly recovered from heifers or cows flushed consecutively during the estrous cycle. The developmental stages of normal embryos were 16-cell on day 5, 32-cell on day 6, compacted morula on day 7, early to expanded blastocyst on days 8 to 9, and hatching to hatched blastocyst on days 10 to 11 of the estrous cycle. The stage of embryos on days 8 to 10 varied among donors. On days 8 to 9 of the estrous cycle, hatching blastocysts were recovered from some donors.

Coherent motion of microwave-induced fluxons in intrinsic Josephson junctions of HgI$_2$-intercalated Bi$_2$Sr$_2$CaCu$_2$O$_{8+x}$ single crystals

  • Kim, Jin-Hee;Doh, Yong-Joo;Chang, Sung-Ho;Lee, Hu-Jong;Chang, Hyun-Sik;Kim, Kyu-Tae;Jang, Eue-Soon;Choy, Jin-Ho
    • 한국초전도학회:학술대회논문집 / v.10 / pp.65-65 / 2000
  • The microwave response of intrinsic Josephson junctions in a mesa structure formed on HgI$_2$-intercalated Bi$_2$Sr$_2$CaCu$_2$O$_{8+x}$ single crystals was studied over a wide range of microwave frequencies. Under irradiation with 73 to 76 GHz microwaves, the supercurrent branch becomes resistive above a certain onset microwave power. At low current bias, the current-voltage characteristics show linear behavior, while at high current bias, the resistive branch splits into multiple sub-branches. The voltage spacing between neighboring sub-branches increases with the microwave power, and the total number of sub-branches is almost identical to the number of intrinsic Josephson junctions in the mesa. All the experimental results suggest that each sub-branch represents a specific mode of collective motion of Josephson vortices generated by the microwave irradiation. Under irradiation with microwaves of frequency lower than 20 GHz, on the other hand, no branch splitting was observed, and the current-voltage characteristics exhibited complex behavior at high bias currents. This result can be explained in terms of incoherent motion of Josephson vortices generated by non-uniform microwave irradiation.

Pulse Width Modulation by Tunnel Diode Pair Circuit (쌍턴넬다이오드회로를 이용한 펄스폭변조)

  • 오현위
    • Journal of the Korean Institute of Telematics and Electronics / v.9 no.3 / pp.1-8 / 1972
  • The characteristic of a tunnel diode pair circuit biased within the negative resistance region also has a voltage-controlled negative resistance region, and the voltage at the center point of this region exhibits square-wave relaxation oscillation. In this paper, the period T, positive duration T1, and negative duration T2 of the pulse are obtained from the characteristic curve and observed experimentally, considering that the pulse width and the period of the square wave at the center point of the negative resistance region can be controlled by the bias voltage. Moreover, the relationship between T, T1, or T2 and the circuit parameters is investigated, and the circuit parameters are determined that make T1-T2 proportional to the variation of the bias voltage with T constant. Thereafter, the bias voltage and the signal voltage are applied in series to the PWM circuit, and the characteristics of that circuit are analyzed.
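
    The stated design condition can be turned into explicit durations. Assuming the period is the sum of the two durations (T = T1 + T2) and writing the proportionality as T1 - T2 = k * dv, where the constant `k` and the bias variation `dv` are hypothetical symbols not given in the abstract, it follows that T1 = (T + k * dv) / 2 and T2 = (T - k * dv) / 2. A minimal sketch of that arithmetic:

    ```python
    def pulse_durations(period, k, delta_v):
        """Return (t1, t2) for a fixed period when t1 - t2 = k * delta_v.

        `k` is an assumed proportionality constant (time per volt) and
        `delta_v` an assumed change in bias voltage; neither value comes
        from the abstract.
        """
        diff = k * delta_v
        if abs(diff) >= period:
            raise ValueError("requested modulation exceeds the period")
        t1 = (period + diff) / 2.0  # positive duration
        t2 = (period - diff) / 2.0  # negative duration
        return t1, t2


    t1, t2 = pulse_durations(10.0, 2.0, 1.0)
    ```

    Under these assumptions the pulse width shifts linearly with the bias voltage while the period stays fixed, which is the behavior a pulse width modulator needs.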
