• Title/Summary/Keyword: computing speed

Search Result 898, Processing Time 0.028 seconds

Gesture Control Gaming for Motoric Post-Stroke Rehabilitation

  • Andi Bese Firdausiah Mansur
    • International Journal of Computer Science & Network Security
    • /
    • v.23 no.10
    • /
    • pp.37-43
    • /
    • 2023
  • The hospital situation, timing, and patient restrictions have become obstacles to an optimum therapy session. The crowdedness of the hospital might lead to a tight schedule and a shorter period of therapy. This condition might strike a post-stroke patient in a dilemma where they need regular treatment to recover their nervous system. In this work, we propose an in-house and uncomplex serious game system that can be used for physical therapy. The Kinect camera is used to capture the depth image stream of a human skeleton. Afterwards, the user might use their hand gesture to control the game. Voice recognition is deployed to ease them with play. Users must complete the given challenge to obtain a more significant outcome from this therapy system. Subjects will use their upper limb and hands to capture the 3D objects with different speeds and positions. The more substantial challenge, speed, and location will be increased and random. Each delegated entity will raise the scores. Afterwards, the scores will be further evaluated to correlate with therapy progress. Users are delighted with the system and eager to use it as their daily exercise. The experimental studies show a comparison between score and difficulty that represent characteristics of user and game. Users tend to quickly adapt to easy and medium levels, while high level requires better focus and proper synchronization between hand and eye to capture the 3D objects. The statistical analysis with a confidence rate(α:0.05) of the usability test shows that the proposed gaming is accessible, even without specialized training. It is not only for therapy but also for fitness because it can be used for body exercise. The result of the experiment is very satisfying. Most users enjoy and familiarize themselves quickly. The evaluation study demonstrates user satisfaction and perception during testing. Future work of the proposed serious game might involve haptic devices to stimulate their physical sensation.

Comparing MCMC algorithms for the horseshoe prior (Horseshoe 사전분포에 대한 MCMC 알고리듬 비교 연구)

  • Miru Ma;Mingi Kang;Kyoungjae Lee
    • The Korean Journal of Applied Statistics
    • /
    • v.37 no.1
    • /
    • pp.103-118
    • /
    • 2024
  • The horseshoe prior is notably one of the most popular priors in sparse regression models, where only a small fraction of coefficients are nonzero. The parameter space of the horseshoe prior is much smaller than that of the spike and slab prior, so it enables us to efficiently explore the parameter space even in high-dimensions. However, on the other hand, the horseshoe prior has a high computational cost for each iteration in the Gibbs sampler. To overcome this issue, various MCMC algorithms for the horseshoe prior have been proposed to reduce the computational burden. Especially, Johndrow et al. (2020) recently proposes an approximate algorithm that can significantly improve the mixing and speed of the MCMC algorithm. In this paper, we compare (1) the traditional MCMC algorithm, (2) the approximate MCMC algorithm proposed by Johndrow et al. (2020) and (3) its variant in terms of computing times, estimation and variable selection performance. For the variable selection, we adopt the sequential clustering-based method suggested by Li and Pati (2017). Practical performances of the MCMC methods are demonstrated via numerical studies.

Goal-oriented Movement Reality-based Skeleton Animation Using Machine Learning

  • Yu-Won JEONG
    • International Journal of Internet, Broadcasting and Communication
    • /
    • v.16 no.2
    • /
    • pp.267-277
    • /
    • 2024
  • This paper explores the use of machine learning in game production to create goal-oriented, realistic animations for skeleton monsters. The purpose of this research is to enhance realism by implementing intelligent movements in monsters within game development. To achieve this, we designed and implemented a learning model for skeleton monsters using reinforcement learning algorithms. During the machine learning process, various reward conditions were established, including the monster's speed, direction, leg movements, and goal contact. The use of configurable joints introduced physical constraints. The experimental method validated performance through seven statistical graphs generated using machine learning methods. The results demonstrated that the developed model allows skeleton monsters to move to their target points efficiently and with natural animation. This paper has implemented a method for creating game monster animations using machine learning, which can be applied in various gaming environments in the future. The year 2024 is expected to bring expanded innovation in the gaming industry. Currently, advancements in technology such as virtual reality, AI, and cloud computing are redefining the sector, providing new experiences and various opportunities. Innovative content optimized for this period is needed to offer new gaming experiences. A high level of interaction and realism, along with the immersion and fun it induces, must be established as the foundation for the environment in which these can be implemented. Recent advancements in AI technology are significantly impacting the gaming industry. By applying many elements necessary for game development, AI can efficiently optimize the game production environment. Through this research, We demonstrate that the application of machine learning to Unity and game engines in game development can contribute to creating more dynamic and realistic game environments. To ensure that VR gaming does not end as a mere craze, we propose new methods in this study to enhance realism and immersion, thereby increasing enjoyment for continuous user engagement.

ONNX-based Runtime Performance Analysis: YOLO and ResNet (ONNX 기반 런타임 성능 분석: YOLO와 ResNet)

  • Jeong-Hyeon Kim;Da-Eun Lee;Su-Been Choi;Kyung-Koo Jun
    • The Journal of Bigdata
    • /
    • v.9 no.1
    • /
    • pp.89-100
    • /
    • 2024
  • In the field of computer vision, models such as You Look Only Once (YOLO) and ResNet are widely used due to their real-time performance and high accuracy. However, to apply these models in real-world environments, factors such as runtime compatibility, memory usage, computing resources, and real-time conditions must be considered. This study compares the characteristics of three deep model runtimes: ONNX Runtime, TensorRT, and OpenCV DNN, and analyzes their performance on two models. The aim of this paper is to provide criteria for runtime selection for practical applications. The experiments compare runtimes based on the evaluation metrics of time, memory usage, and accuracy for vehicle license plate recognition and classification tasks. The experimental results show that ONNX Runtime excels in complex object detection performance, OpenCV DNN is suitable for environments with limited memory, and TensorRT offers superior execution speed for complex models.

A Fast Algorithm for Computing Multiplicative Inverses in GF(2$^{m}$) using Factorization Formula and Normal Basis (인수분해 공식과 정규기저를 이용한 GF(2$^{m}$ ) 상의 고속 곱셈 역원 연산 알고리즘)

  • 장용희;권용진
    • Journal of KIISE:Computer Systems and Theory
    • /
    • v.30 no.5_6
    • /
    • pp.324-329
    • /
    • 2003
  • The public-key cryptosystems such as Diffie-Hellman Key Distribution and Elliptical Curve Cryptosystems are built on the basis of the operations defined in GF(2$^{m}$ ):addition, subtraction, multiplication and multiplicative inversion. It is important that these operations should be computed at high speed in order to implement these cryptosystems efficiently. Among those operations, as being the most time-consuming, multiplicative inversion has become the object of lots of investigation Formant's theorem says $\beta$$^{-1}$ =$\beta$$^{2}$sup m/-2/, where $\beta$$^{-1}$ is the multiplicative inverse of $\beta$$\in$GF(2$^{m}$ ). Therefore, to compute the multiplicative inverse of arbitrary elements of GF(2$^{m}$ ), it is most important to reduce the number of times of multiplication by decomposing 2$^{m}$ -2 efficiently. Among many algorithms relevant to the subject, the algorithm proposed by Itoh and Tsujii[2] has reduced the required number of times of multiplication to O(log m) by using normal basis. Furthermore, a few papers have presented algorithms improving the Itoh and Tsujii's. However they have some demerits such as complicated decomposition processes[3,5]. In this paper, in the case of 2$^{m}$ -2, which is mainly used in practical applications, an efficient algorithm is proposed for computing the multiplicative inverse at high speed by using both the factorization formula x$^3$-y$^3$=(x-y)(x$^2$+xy+y$^2$) and normal basis. The number of times of multiplication of the algorithm is smaller than that of the algorithm proposed by Itoh and Tsujii. Also the algorithm decomposes 2$^{m}$ -2 more simply than other proposed algorithms.

An Iterative, Interactive and Unified Seismic Velocity Analysis (반복적 대화식 통합 탄성파 속도분석)

  • Suh Sayng-Yong;Chung Bu-Heung;Jang Seong-Hyung
    • Geophysics and Geophysical Exploration
    • /
    • v.2 no.1
    • /
    • pp.26-32
    • /
    • 1999
  • Among the various seismic data processing sequences, the velocity analysis is the most time consuming and man-hour intensive processing steps. For the production seismic data processing, a good velocity analysis tool as well as the high performance computer is required. The tool must give fast and accurate velocity analysis. There are two different approches in the velocity analysis, batch and interactive. In the batch processing, a velocity plot is made at every analysis point. Generally, the plot consisted of a semblance contour, super gather, and a stack pannel. The interpreter chooses the velocity function by analyzing the velocity plot. The technique is highly dependent on the interpreters skill and requires human efforts. As the high speed graphic workstations are becoming more popular, various interactive velocity analysis programs are developed. Although, the programs enabled faster picking of the velocity nodes using mouse, the main improvement of these programs is simply the replacement of the paper plot by the graphic screen. The velocity spectrum is highly sensitive to the presence of the noise, especially the coherent noise often found in the shallow region of the marine seismic data. For the accurate velocity analysis, these noise must be removed before the spectrum is computed. Also, the velocity analysis must be carried out by carefully choosing the location of the analysis point and accuarate computation of the spectrum. The analyzed velocity function must be verified by the mute and stack, and the sequence must be repeated most time. Therefore an iterative, interactive, and unified velocity analysis tool is highly required. An interactive velocity analysis program, xva(X-Window based Velocity Analysis) was invented. The program handles all processes required in the velocity analysis such as composing the super gather, computing the velocity spectrum, NMO correction, mute, and stack. Most of the parameter changes give the final stack via a few mouse clicks thereby enabling the iterative and interactive processing. A simple trace indexing scheme is introduced and a program to nike the index of the Geobit seismic disk file was invented. The index is used to reference the original input, i.e., CDP sort, directly A transformation techinique of the mute function between the T-X domain and NMOC domain is introduced and adopted to the program. The result of the transform is simliar to the remove-NMO technique in suppressing the shallow noise such as direct wave and refracted wave. However, it has two improvements, i.e., no interpolation error and very high speed computing time. By the introduction of the technique, the mute times can be easily designed from the NMOC domain and applied to the super gather in the T-X domain, thereby producing more accurate velocity spectrum interactively. The xva program consists of 28 files, 12,029 lines, 34,990 words and 304,073 characters. The program references Geobit utility libraries and can be installed under Geobit preinstalled environment. The program runs on X-Window/Motif environment. The program menu is designed according to the Motif style guide. A brief usage of the program has been discussed. The program allows fast and accurate seismic velocity analysis, which is necessary computing the AVO (Amplitude Versus Offset) based DHI (Direct Hydrocarn Indicator), and making the high quality seismic sections.

  • PDF

Benchmark Results of a Monte Carlo Treatment Planning system (몬데카를로 기반 치료계획시스템의 성능평가)

  • Cho, Byung-Chul
    • Progress in Medical Physics
    • /
    • v.13 no.3
    • /
    • pp.149-155
    • /
    • 2002
  • Recent advances in radiation transport algorithms, computer hardware performance, and parallel computing make the clinical use of Monte Carlo based dose calculations possible. To compare the speed and accuracies of dose calculations between different developed codes, a benchmark tests were proposed at the XIIth ICCR (International Conference on the use of Computers in Radiation Therapy, Heidelberg, Germany 2000). A Monte Carlo treatment planning comprised of 28 various Intel Pentium CPUs was implemented for routine clinical use. The purpose of this study was to evaluate the performance of our system using the above benchmark tests. The benchmark procedures are comprised of three parts. a) speed of photon beams dose calculation inside a given phantom of 30.5 cm$\times$39.5 cm $\times$ 30 cm deep and filled with 5 ㎣ voxels within 2% statistical uncertainty. b) speed of electron beams dose calculation inside the same phantom as that of the photon beams. c) accuracy of photon and electron beam calculation inside heterogeneous slab phantom compared with the reference results of EGS4/PRESTA calculation. As results of the speed benchmark tests, it took 5.5 minutes to achieve less than 2% statistical uncertainty for 18 MV photon beams. Though the net calculation for electron beams was an order of faster than the photon beam, the overall calculation time was similar to that of photon beam case due to the overhead time to maintain parallel processing. Since our Monte Carlo code is EGSnrc, which is an improved version of EGS4, the accuracy tests of our system showed, as expected, very good agreement with the reference data. In conclusion, our Monte Carlo treatment planning system shows clinically meaningful results. Though other more efficient codes are developed such like MCDOSE and VMC++, BEAMnrc based on EGSnrc code system may be used for routine clinical Monte Carlo treatment planning in conjunction with clustering technique.

  • PDF

Acceleration of computation speed for elastic wave simulation using a Graphic Processing Unit (그래픽 프로세서를 이용한 탄성파 수치모사의 계산속도 향상)

  • Nakata, Norimitsu;Tsuji, Takeshi;Matsuoka, Toshifumi
    • Geophysics and Geophysical Exploration
    • /
    • v.14 no.1
    • /
    • pp.98-104
    • /
    • 2011
  • Numerical simulation in exploration geophysics provides important insights into subsurface wave propagation phenomena. Although elastic wave simulations take longer to compute than acoustic simulations, an elastic simulator can construct more realistic wavefields including shear components. Therefore, it is suitable for exploration of the responses of elastic bodies. To overcome the long duration of the calculations, we use a Graphic Processing Unit (GPU) to accelerate the elastic wave simulation. Because a GPU has many processors and a wide memory bandwidth, we can use it in a parallelised computing architecture. The GPU board used in this study is an NVIDIA Tesla C1060, which has 240 processors and a 102 GB/s memory bandwidth. Despite the availability of a parallel computing architecture (CUDA), developed by NVIDIA, we must optimise the usage of the different types of memory on the GPU device, and the sequence of calculations, to obtain a significant speedup of the computation. In this study, we simulate two- (2D) and threedimensional (3D) elastic wave propagation using the Finite-Difference Time-Domain (FDTD) method on GPUs. In the wave propagation simulation, we adopt the staggered-grid method, which is one of the conventional FD schemes, since this method can achieve sufficient accuracy for use in numerical modelling in geophysics. Our simulator optimises the usage of memory on the GPU device to reduce data access times, and uses faster memory as much as possible. This is a key factor in GPU computing. By using one GPU device and optimising its memory usage, we improved the computation time by more than 14 times in the 2D simulation, and over six times in the 3D simulation, compared with one CPU. Furthermore, by using three GPUs, we succeeded in accelerating the 3D simulation 10 times.

Design of a Bit-Serial Divider in GF(2$^{m}$ ) for Elliptic Curve Cryptosystem (타원곡선 암호시스템을 위한 GF(2$^{m}$ )상의 비트-시리얼 나눗셈기 설계)

  • 김창훈;홍춘표;김남식;권순학
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.27 no.12C
    • /
    • pp.1288-1298
    • /
    • 2002
  • To implement elliptic curve cryptosystem in GF(2$\^$m/) at high speed, a fast divider is required. Although bit-parallel architecture is well suited for high speed division operations, elliptic curve cryptosystem requires large m(at least 163) to support a sufficient security. In other words, since the bit-parallel architecture has an area complexity of 0(m$\^$m/), it is not suited for this application. In this paper, we propose a new serial-in serial-out systolic array for computing division operations in GF(2$\^$m/) using the standard basis representation. Based on a modified version of tile binary extended greatest common divisor algorithm, we obtain a new data dependence graph and design an efficient bit-serial systolic divider. The proposed divider has 0(m) time complexity and 0(m) area complexity. If input data come in continuously, the proposed divider can produce division results at a rate of one per m clock cycles, after an initial delay of 5m-2 cycles. Analysis shows that the proposed divider provides a significant reduction in both chip area and computational delay time compared to previously proposed systolic dividers with the same I/O format. Since the proposed divider can perform division operations at high speed with the reduced chip area, it is well suited for division circuit of elliptic curve cryptosystem. Furthermore, since the proposed architecture does not restrict the choice of irreducible polynomial, and has a unidirectional data flow and regularity, it provides a high flexibility and scalability with respect to the field size m.

Hardware Approach to Fuzzy Inference―ASIC and RISC―

  • Watanabe, Hiroyuki
    • Proceedings of the Korean Institute of Intelligent Systems Conference
    • /
    • 1993.06a
    • /
    • pp.975-976
    • /
    • 1993
  • This talk presents the overview of the author's research and development activities on fuzzy inference hardware. We involved it with two distinct approaches. The first approach is to use application specific integrated circuits (ASIC) technology. The fuzzy inference method is directly implemented in silicon. The second approach, which is in its preliminary stage, is to use more conventional microprocessor architecture. Here, we use a quantitative technique used by designer of reduced instruction set computer (RISC) to modify an architecture of a microprocessor. In the ASIC approach, we implemented the most widely used fuzzy inference mechanism directly on silicon. The mechanism is beaded on a max-min compositional rule of inference, and Mandami's method of fuzzy implication. The two VLSI fuzzy inference chips are designed, fabricated, and fully tested. Both used a full-custom CMOS technology. The second and more claborate chip was designed at the University of North Carolina(U C) in cooperation with MCNC. Both VLSI chips had muliple datapaths for rule digital fuzzy inference chips had multiple datapaths for rule evaluation, and they executed multiple fuzzy if-then rules in parallel. The AT & T chip is the first digital fuzzy inference chip in the world. It ran with a 20 MHz clock cycle and achieved an approximately 80.000 Fuzzy Logical inferences Per Second (FLIPS). It stored and executed 16 fuzzy if-then rules. Since it was designed as a proof of concept prototype chip, it had minimal amount of peripheral logic for system integration. UNC/MCNC chip consists of 688,131 transistors of which 476,160 are used for RAM memory. It ran with a 10 MHz clock cycle. The chip has a 3-staged pipeline and initiates a computation of new inference every 64 cycle. This chip achieved an approximately 160,000 FLIPS. The new architecture have the following important improvements from the AT & T chip: Programmable rule set memory (RAM). On-chip fuzzification operation by a table lookup method. On-chip defuzzification operation by a centroid method. Reconfigurable architecture for processing two rule formats. RAM/datapath redundancy for higher yield It can store and execute 51 if-then rule of the following format: IF A and B and C and D Then Do E, and Then Do F. With this format, the chip takes four inputs and produces two outputs. By software reconfiguration, it can store and execute 102 if-then rules of the following simpler format using the same datapath: IF A and B Then Do E. With this format the chip takes two inputs and produces one outputs. We have built two VME-bus board systems based on this chip for Oak Ridge National Laboratory (ORNL). The board is now installed in a robot at ORNL. Researchers uses this board for experiment in autonomous robot navigation. The Fuzzy Logic system board places the Fuzzy chip into a VMEbus environment. High level C language functions hide the operational details of the board from the applications programme . The programmer treats rule memories and fuzzification function memories as local structures passed as parameters to the C functions. ASIC fuzzy inference hardware is extremely fast, but they are limited in generality. Many aspects of the design are limited or fixed. We have proposed to designing a are limited or fixed. We have proposed to designing a fuzzy information processor as an application specific processor using a quantitative approach. The quantitative approach was developed by RISC designers. In effect, we are interested in evaluating the effectiveness of a specialized RISC processor for fuzzy information processing. As the first step, we measured the possible speed-up of a fuzzy inference program based on if-then rules by an introduction of specialized instructions, i.e., min and max instructions. The minimum and maximum operations are heavily used in fuzzy logic applications as fuzzy intersection and union. We performed measurements using a MIPS R3000 as a base micropro essor. The initial result is encouraging. We can achieve as high as a 2.5 increase in inference speed if the R3000 had min and max instructions. Also, they are useful for speeding up other fuzzy operations such as bounded product and bounded sum. The embedded processor's main task is to control some device or process. It usually runs a single or a embedded processer to create an embedded processor for fuzzy control is very effective. Table I shows the measured speed of the inference by a MIPS R3000 microprocessor, a fictitious MIPS R3000 microprocessor with min and max instructions, and a UNC/MCNC ASIC fuzzy inference chip. The software that used on microprocessors is a simulator of the ASIC chip. The first row is the computation time in seconds of 6000 inferences using 51 rules where each fuzzy set is represented by an array of 64 elements. The second row is the time required to perform a single inference. The last row is the fuzzy logical inferences per second (FLIPS) measured for ach device. There is a large gap in run time between the ASIC and software approaches even if we resort to a specialized fuzzy microprocessor. As for design time and cost, these two approaches represent two extremes. An ASIC approach is extremely expensive. It is, therefore, an important research topic to design a specialized computing architecture for fuzzy applications that falls between these two extremes both in run time and design time/cost. TABLEI INFERENCE TIME BY 51 RULES {{{{Time }}{{MIPS R3000 }}{{ASIC }}{{Regular }}{{With min/mix }}{{6000 inference 1 inference FLIPS }}{{125s 20.8ms 48 }}{{49s 8.2ms 122 }}{{0.0038s 6.4㎲ 156,250 }} }}

  • PDF