• Title/Summary/Keyword: computing speed


The Performance Analysis of GPU-based Cloth simulation according to the Change of Work Group Configuration (워크 그룹 구성 변화에 따른 GPU 기반 천 시뮬레이션의 성능 분석)

  • Choi, Young-Hwan;Hong, Min;Lee, Seung-Hyun;Choi, Yoo-Joo
    • Journal of Internet Computing and Services / v.18 no.3 / pp.29-36 / 2017
  • Nowadays, 3D dynamic simulation is closely related to many industries. In the past, physically-based 3D simulation was used mainly in car-crash or construction-related fields, but today it also plays an important role in movies and games. Many mathematical computations are needed to represent a 3D object realistically, but it is difficult for CPU-based applications to process such a large amount of calculation in real time. Recently, with advanced graphics hardware and improved architectures, the GPU can be utilized for general-purpose computation as well as graphics computation, and GPU-based approaches have been applied in various research fields. In this paper, we analyze the performance variation of two GPU-based cloth simulation algorithms according to changes in the execution properties of GPU shaders, in order to optimize the performance of GPU-based cloth simulation. Cloth simulation is implemented by a spring-centric algorithm and a node-centric algorithm with GPU parallel computing using the compute shader of GLSL 4.3. We compare the performance of these two algorithms according to changes in the size and dimension of the work group. Each test is repeated 10 times over 5,000 frames, and the experimental results are reported as the average FPS. The experimental results show that the node-centric algorithm executes faster than the spring-centric algorithm.
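
The paper's own kernels are GLSL 4.3 compute shaders; purely as an illustration of the two work decompositions being compared, here is a minimal CPU-side sketch in Java (the class names and the simple Hooke spring force are assumptions, not the authors' code). In the node-centric layout each work item owns one mass point and gathers forces from its incident springs, so there are no write conflicts; in the spring-centric layout each work item owns one spring and must scatter its force to both endpoints, which on a GPU requires atomics or an extra reduction pass.

```java
import java.util.List;

// Hypothetical mass-spring sketch; each loop iteration stands in for one GPU work item.
class Spring {
    int a, b;          // endpoint node indices
    float rest, k;     // rest length and stiffness
    Spring(int a, int b, float rest, float k) { this.a = a; this.b = b; this.rest = rest; this.k = k; }
}

class ClothSketch {
    float[][] pos;                 // node positions, pos[i] = {x, y, z}
    float[][] force;               // accumulated force per node
    List<Spring> springs;          // all springs (spring-centric layout)
    List<List<Spring>> incident;   // springs incident to each node (node-centric layout)

    // Node-centric: one work item per node; gather-only, no conflicting writes.
    void nodeCentricStep() {
        for (int i = 0; i < pos.length; i++) {
            float[] f = {0, 0, 0};
            for (Spring s : incident.get(i)) {
                int j = (s.a == i) ? s.b : s.a;
                addHooke(f, pos[i], pos[j], s);
            }
            force[i] = f;
        }
    }

    // Spring-centric: one work item per spring; each spring scatters to two nodes,
    // which needs atomic adds (or a separate reduction pass) when run in parallel.
    void springCentricStep() {
        for (float[] f : force) { f[0] = f[1] = f[2] = 0; }
        for (Spring s : springs) {
            float[] f = {0, 0, 0};
            addHooke(f, pos[s.a], pos[s.b], s);
            for (int d = 0; d < 3; d++) { force[s.a][d] += f[d]; force[s.b][d] -= f[d]; }
        }
    }

    // Hooke's law along the normalized spring direction.
    static void addHooke(float[] out, float[] pi, float[] pj, Spring s) {
        float dx = pj[0] - pi[0], dy = pj[1] - pi[1], dz = pj[2] - pi[2];
        float len = (float) Math.sqrt(dx * dx + dy * dy + dz * dz);
        float c = s.k * (len - s.rest) / Math.max(len, 1e-6f);
        out[0] += c * dx; out[1] += c * dy; out[2] += c * dz;
    }
}
```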

A Scalable OWL Horst Lite Ontology Reasoning Approach based on Distributed Cluster Memories (분산 클러스터 메모리 기반 대용량 OWL Horst Lite 온톨로지 추론 기법)

  • Kim, Je-Min;Park, Young-Tack
    • Journal of KIISE / v.42 no.3 / pp.307-319 / 2015
  • Current ontology studies use the Hadoop distributed storage framework to perform map-reduce algorithm-based reasoning for scalable ontologies. In this paper, however, we propose a novel approach for scalable Web Ontology Language (OWL) Horst Lite ontology reasoning, based on distributed cluster memories. Rule-based reasoning, which is frequently used for scalable ontologies, iteratively executes triple-format ontology rules until no new data is inferred. Therefore, when scalable ontology reasoning is performed on computer hard drives, the ontology reasoner suffers from performance limitations. In order to overcome this drawback, we propose an approach that loads the ontologies into distributed cluster memories, using Spark (a memory-based distributed computing framework), which executes the ontology reasoning. In order to implement an appropriate OWL Horst Lite ontology reasoning system on Spark, our method divides the scalable ontologies into blocks, loads each block into the cluster nodes, and subsequently handles the data in the distributed memories. We used the Lehigh University Benchmark, which is used to evaluate ontology inference and search speed, to experimentally evaluate the methods suggested in this paper, which we applied to LUBM8000 (1.1 billion triples, 155 gigabytes). When compared with WebPIE, a representative map-reduce algorithm-based scalable ontology reasoner, the proposed approach showed a throughput improvement of 320% (62k/s) over WebPIE (19k/s).
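
The fixpoint loop described above can be pictured in Spark's Java API roughly as follows. This is a minimal sketch under assumptions (a single rdfs11-style subClassOf transitivity rule stands in for the full OWL Horst Lite rule set, and all identifiers are invented), not the authors' implementation; the point is that the intermediate triples stay cached in cluster memory between iterations rather than being written back to disk.

```java
import org.apache.spark.api.java.JavaPairRDD;
import scala.Tuple2;

public class FixpointSketch {
    // subClassOf pairs (sub, sup): apply (A,B) ∧ (B,C) ⇒ (A,C) until no new pairs appear,
    // keeping everything cached in the cluster's distributed memory.
    static JavaPairRDD<String, String> closure(JavaPairRDD<String, String> subClassOf) {
        JavaPairRDD<String, String> all = subClassOf.cache();
        long before, after = all.count();
        do {
            before = after;
            JavaPairRDD<String, String> derived = all
                    .mapToPair(t -> new Tuple2<>(t._2(), t._1()))              // key by superclass B
                    .join(all)                                                  // (B, (A, C))
                    .mapToPair(t -> new Tuple2<>(t._2()._1(), t._2()._2()));    // emit (A, C)
            all = all.union(derived).distinct().cache();
            after = all.count();
        } while (after > before);   // fixpoint: no triples were added in this round
        return all;
    }
}
```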

SPQUSAR : A Large-Scale Qualitative Spatial Reasoner Using Apache Spark (SPQUSAR : Apache Spark를 이용한 대용량의 정성적 공간 추론기)

  • Kim, Jongwhan;Kim, Jonghoon;Kim, Incheol
    • KIISE Transactions on Computing Practices / v.21 no.12 / pp.774-779 / 2015
  • In this paper, we present the design and implementation of a large-scale qualitative spatial reasoner using Apache Spark, an in-memory high speed cluster computing environment, which is effective for sequencing and iterating component reasoning jobs. The proposed reasoner can not only check the integrity of a large-scale spatial knowledge base representing topological and directional relationships between spatial objects, but also expand the given knowledge base by deriving new facts in highly efficient ways. In general, qualitative reasoning on topological and directional relationships between spatial objects includes a number of composition operations on every possible pair of disjunctive relations. The proposed reasoner enhances computational efficiency by determining the minimal set of disjunctive relations for spatial reasoning and then reducing the size of the composition table to include only that set. Additionally, in order to improve performance, the proposed reasoner is designed to minimize disk I/Os during distributed reasoning jobs, which are performed on a Hadoop cluster system. In experiments with both artificial and real spatial knowledge bases, the proposed Spark-based spatial reasoner showed higher performance than the existing MapReduce-based one.
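
To make the composition-table idea concrete, here is a small Java sketch (a sketch only, with placeholder relation names; it is not SPQUSAR's calculus or code): composing two disjunctive relations unions the table entries of every base-relation pair, so restricting the table to the minimal set of base relations actually used directly shrinks the work each composition operation has to do.

```java
import java.util.*;

class CompositionSketch {
    // table.get(r1).get(r2) = base relations possible for "r1 then r2" (contents are placeholders)
    static Map<String, Map<String, Set<String>>> table = new HashMap<>();

    // Compose two disjunctive relations: union over all base-relation pairs.
    static Set<String> compose(Set<String> r1, Set<String> r2) {
        Set<String> out = new HashSet<>();
        for (String a : r1)
            for (String b : r2)
                out.addAll(table.get(a).get(b));
        return out;
    }

    // Keep only the minimal set of base relations the knowledge base actually needs,
    // so every later composition scans a much smaller table.
    static void restrictTo(Set<String> minimal) {
        table.keySet().retainAll(minimal);
        for (Map<String, Set<String>> row : table.values())
            row.keySet().retainAll(minimal);
    }
}
```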

Multi-Level Sequence Alignment : An Adaptive Control Method Between Speed and Accuracy for Document Comparison (계산속도 및 정확도의 적응적 제어가 가능한 다단계 문서 비교 시스템)

  • Seo, Jong-Kyu;Tak, Haesung;Cho, Hwan-Gue
    • Journal of KIISE / v.41 no.9 / pp.728-743 / 2014
  • Fingerprinting and sequence alignment are well-known approaches to document similarity comparison. A fingerprinting method is simple and fast, but it cannot locate specific similar regions. A string alignment method identifies regions of similarity by arranging the sequences of a string; it has the advantage of finding specific similar regions, but the disadvantage of requiring more computing time. The Multi-Level Alignment (MLA) is a new method designed to combine the advantages of both. The MLA divides the input documents into uniform-length blocks, extracts fingerprints from each block, and calculates the similarity of block pairs by comparing the fingerprints, building a similarity table in the process. Finally, sequence alignment over the similarity table is used to identify the longest similar regions. The MLA allows users to change the block size to control the relative contributions of the fingerprint algorithm and the sequence alignment. Because a document is divided into several blocks, similar regions can also be fragmented across two or more blocks. To solve this fragmentation problem, we propose a united-block method. Experimentally, we show that computing document similarity with united blocks is more accurate than the original MLA method, with only a minor loss in speed.
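
A minimal Java sketch of the MLA's first stage may help fix the idea; the block size, k-gram length, and Jaccard measure below are illustrative assumptions rather than the paper's exact parameters, and the final alignment stage is only indicated in a comment.

```java
import java.util.*;

class MlaSketch {
    // Split a document into uniform-length blocks and fingerprint each block with k-gram hashes.
    static List<Set<Integer>> fingerprints(String doc, int blockSize, int k) {
        List<Set<Integer>> fps = new ArrayList<>();
        for (int s = 0; s < doc.length(); s += blockSize) {
            String block = doc.substring(s, Math.min(s + blockSize, doc.length()));
            Set<Integer> fp = new HashSet<>();
            for (int i = 0; i + k <= block.length(); i++)
                fp.add(block.substring(i, i + k).hashCode());
            fps.add(fp);
        }
        return fps;
    }

    // Similarity table: sim[i][j] = Jaccard similarity between block i of doc A and block j of doc B.
    // A block-level sequence alignment is then run over this table to find the longest similar regions.
    static double[][] similarityTable(List<Set<Integer>> a, List<Set<Integer>> b) {
        double[][] sim = new double[a.size()][b.size()];
        for (int i = 0; i < a.size(); i++)
            for (int j = 0; j < b.size(); j++) {
                Set<Integer> inter = new HashSet<>(a.get(i));
                inter.retainAll(b.get(j));
                Set<Integer> union = new HashSet<>(a.get(i));
                union.addAll(b.get(j));
                sim[i][j] = union.isEmpty() ? 0.0 : (double) inter.size() / union.size();
            }
        return sim;
    }
}
```

Larger blocks push the method toward pure fingerprinting (faster, coarser), while smaller blocks push it toward pure alignment (slower, finer), which is exactly the speed/accuracy knob the abstract describes.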

Design of a Binding for the performance Improvement of 3D Engine based on the Embedded Mobile Java Environment (자바 기반 휴대용 임베디드 기기의 삼차원 엔진 성능 향상을 위한 바인딩 구현)

  • Kim, Young-Ouk;Roh, Young-Sup
    • Journal of Korea Multimedia Society / v.10 no.11 / pp.1460-1471 / 2007
  • A 3D engine in a mobile embedded device follows one of two standards: the C-based OpenGL/ES, and the Java-based JSR184, which interprets and executes bytecode in real time. Of these two, JSR184, which supports Java objects, uses more processor resources than OpenGL/ES and is therefore constrained in an embedded device with limited computing power. On the other hand, 3D content developed for existing personal computers is created by exploiting the advantages of Java, and has secured numerous users in the European market thanks to its quality and the extensive service of the commercial GSM network. For this reason, a mobile embedded device used in a GSM network needs JSR184 so it can run existing Java-based 3D content without extra conversion processes, but the current Java-based 3D engine requires more computing power than the mobile embedded device can offer, which hinders its application to commercial products. This paper proposes a binding technique that keeps the advantages of Java objects while improving the processing speed of 3D content within the limited resources of a mobile embedded device. The technique supports the JSR184 standard interface in the upper layer so that Java-based 3D content can be used, employs a code-conversion language, KNI (Kilo Native Interface), in the middle layer to interface between OpenGL/ES and JSR184, and implements the OpenGL/ES standard in the lower layer. The validity of the binding technique is demonstrated through a simulator and an FPGA embedding an ARM processor.
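
For orientation, the Java-facing upper layer of such a binding could be declared roughly as below. This is only a sketch: the class, method, and library names are hypothetical, and the paper's middle layer uses KNI rather than the plain JNI-style native declarations shown here.

```java
// Illustrative upper (Java) layer of a JSR184-to-OpenGL/ES binding; all names are hypothetical.
public class Graphics3DBinding {
    static {
        System.loadLibrary("m3g_gles_binding");   // hypothetical native library built on OpenGL/ES
    }

    // JSR184-style entry points exposed to Java content ...
    public void bindTarget(Object target) { nativeBindTarget(target); }
    public void render(byte[] vertexBuffer, byte[] indexBuffer, float[] transform) {
        nativeRender(vertexBuffer, indexBuffer, transform);
    }
    public void releaseTarget() { nativeReleaseTarget(); }

    // ... resolved in the lower layer by native code that calls OpenGL/ES directly,
    // so the heavy rendering work stays out of the interpreted Java path.
    private native void nativeBindTarget(Object target);
    private native void nativeRender(byte[] vb, byte[] ib, float[] transform);
    private native void nativeReleaseTarget();
}
```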


System Implementation for Generating High Quality Digital Holographic Video using Vertical Rig based on Depth+RGB Camera (Depth+RGB 카메라 기반의 수직 리그를 이용한 고화질 디지털 홀로그래픽 비디오 생성 시스템의 구현)

  • Koo, Ja-Myung;Lee, Yoon-Hyuk;Seo, Young-Ho;Kim, Dong-Wook
    • Journal of Broadcast Engineering / v.17 no.6 / pp.964-975 / 2012
  • Recently, attention to the digital hologram, regarded as the ultimate goal of 3-dimensional video technology, has been increasing. A digital hologram can be generated from a depth image and an RGB image. We propose a new system to capture RGB and depth images and convert them into digital holograms. First, a new cold mirror was designed and produced; it has different transmittance ratios for different wavelengths and provides the same view and focal point to both cameras. After correcting various distortions of the camera system, the resolution difference between the depth and RGB images was adjusted. The object of interest was extracted using the depth information. Finally, a digital hologram was generated with the computer-generated hologram (CGH) algorithm. All algorithms were implemented with C/C++/CUDA and integrated in a LabVIEW environment. The hologram is calculated with general-purpose computing on graphics processing units (GPGPU) for high-speed operation. We verified that the visual quality of the holograms produced by the proposed system is better than that of the previous one.
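
The CGH step accumulates, at every hologram pixel, the fringe contributed by each object point taken from the depth map. A minimal single-threaded Java sketch of the standard point-source formula under the Fresnel approximation is shown below; the variable names, pixel pitch, and wavelength handling are assumptions, and the actual system performs this accumulation in parallel on the GPU with CUDA.

```java
// Point-source CGH sketch (Fresnel approximation); for illustration only.
class CghSketch {
    // points[j] = {xj, yj, zj, amplitude}; returns the fringe pattern on a width x height plane.
    static double[][] computeHologram(double[][] points, int width, int height,
                                      double lambda, double pitch) {
        double k = 2.0 * Math.PI / lambda;                 // wave number
        double[][] fringe = new double[height][width];
        for (int y = 0; y < height; y++) {
            for (int x = 0; x < width; x++) {
                double sum = 0.0;
                for (double[] p : points) {
                    double dx = x * pitch - p[0];
                    double dy = y * pitch - p[1];
                    // Fresnel-approximated phase of the spherical wave from this object point
                    double phase = k * (dx * dx + dy * dy) / (2.0 * p[2]);
                    sum += p[3] * Math.cos(phase);
                }
                fringe[y][x] = sum;
            }
        }
        return fringe;
    }
}
```

Because every pixel's sum is independent, the triple loop maps naturally onto one GPU thread per hologram pixel, which is what makes the GPGPU implementation worthwhile.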

Implementation of a TCP/IP Offload Engine Using High Performance Lightweight TCP/IP (고성능 경량 TCP/IP를 이용한 소프트웨어 기반 TCP/IP 오프로드 엔진 구현)

  • Jun, Yong-Tae;Chung, Sang-Hwa;Yoon, In-Su
    • Journal of KIISE: Computing Practices and Letters / v.14 no.4 / pp.369-377 / 2008
  • Today, Ethernet technology is rapidly developing beyond 1 Gbps toward a bandwidth of 10 Gbps. In such high-speed networks, the conventional method, in which the host CPU processes TCP/IP in the operating system, causes numerous overheads. As a result of these overheads, user applications cannot get enough computing power from the host CPU. To solve this problem, TCP/IP Offload Engine (TOE) technology emerged. A TOE is a specialized NIC that processes TCP/IP instead of the host CPU. In this paper, we implemented a high-performance, lightweight TCP/IP (HL-TCP) for the TOE and applied it to an embedded system. The HL-TCP supports the fundamental TCP/IP functions: flow control, congestion control, retransmission, delayed ACK, and processing of out-of-order packets. It was also implemented to utilize Ethernet MAC hardware features such as TCP segmentation offload (TSO), checksum offload (CSO), and interrupt coalescing. In addition, we eliminated the copy overhead from host memory to NIC memory when sending data, and we implemented an efficient DMA mechanism for TCP retransmission. The TOE using the HL-TCP achieves a CPU utilization of less than 6% and a bandwidth of 453 Mbps.
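
As a point of reference for what checksum offload (CSO) removes from the host, the computation moved into the MAC hardware is the standard Internet checksum of RFC 1071, sketched here in plain Java. This is not the TOE firmware, only an illustration of the per-packet work the host no longer performs.

```java
// RFC 1071 Internet checksum: 16-bit one's-complement sum over the packet bytes.
class ChecksumSketch {
    static int internetChecksum(byte[] data, int len) {
        long sum = 0;
        for (int i = 0; i + 1 < len; i += 2) {
            sum += ((data[i] & 0xFF) << 8) | (data[i + 1] & 0xFF);   // add 16-bit words
        }
        if ((len & 1) == 1) {
            sum += (data[len - 1] & 0xFF) << 8;                       // pad odd trailing byte
        }
        while ((sum >> 16) != 0) {
            sum = (sum & 0xFFFF) + (sum >> 16);                       // fold carries back in
        }
        return (int) (~sum & 0xFFFF);                                 // one's complement
    }
}
```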

A High Performance Flash Memory Solid State Disk (고성능 플래시 메모리 솔리드 스테이트 디스크)

  • Yoon, Jin-Hyuk;Nam, Eyee-Hyun;Seong, Yoon-Jae;Kim, Hong-Seok;Min, Sang-Lyul;Cho, Yoo-Kun
    • Journal of KIISE: Computing Practices and Letters / v.14 no.4 / pp.378-388 / 2008
  • Flash memory has been attracting attention as the next mass-storage medium for mobile computing systems such as notebook computers and UMPCs (Ultra Mobile PCs), due to its low power consumption, high shock and vibration resistance, and small size. A storage system with flash memory excels in random reads, sequential reads, and sequential writes. However, it falls short in random writes because of flash memory's physical inability to overwrite data unless it is first erased. To overcome this shortcoming, we propose an SSD (Solid State Disk) architecture with two novel features. First, we utilize non-volatile FRAM (Ferroelectric RAM) in conjunction with NAND flash memory, producing a synergy of FRAM's fast access speed and ability to overwrite with NAND flash memory's low, affordable price. Second, the architecture categorizes host write requests into small random writes and large sequential writes, and processes them with two different buffer management schemes, each optimized for one type of write request. This scheme has been implemented in an SSD prototype and evaluated with a standard PC benchmark environment. The results reveal that our architecture outperforms a conventional HDD and other commercial SSDs by more than three times in throughput for random-access workloads.
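
The write path described above hinges on classifying each host request and routing it to one of two buffers; a minimal sketch of that dispatch is given below, where the size threshold, interface, and field names are assumptions rather than the prototype's actual design.

```java
// Sketch of the two-path write buffering; the 64-sector threshold and names are illustrative.
class SsdWriteDispatchSketch {
    static final int SMALL_WRITE_SECTORS = 64;

    interface Buffer { void accept(long lba, int sectors, byte[] data); }

    Buffer framBuffer;   // small random writes: absorbed in overwritable FRAM
    Buffer flashBuffer;  // large sequential writes: streamed to NAND flash blocks

    void onHostWrite(long lba, int sectors, byte[] data) {
        if (sectors <= SMALL_WRITE_SECTORS) {
            framBuffer.accept(lba, sectors, data);    // avoids NAND's erase-before-write penalty
        } else {
            flashBuffer.accept(lba, sectors, data);   // sequential NAND programming is already fast
        }
    }
}
```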

Scalable RDFS Reasoning Using the Graph Structure of In-Memory based Parallel Computing (인메모리 기반 병렬 컴퓨팅 그래프 구조를 이용한 대용량 RDFS 추론)

  • Jeon, MyungJoong;So, ChiSeoung;Jagvaral, Batselem;Kim, KangPil;Kim, Jin;Hong, JinYoung;Park, YoungTack
    • Journal of KIISE / v.42 no.8 / pp.998-1009 / 2015
  • In recent years, there has been growing interest in RDFS inference for building rich knowledge bases. However, it is difficult to improve inference performance over large data using a single machine, so researchers are investigating RDFS inference engines for distributed computing environments. The existing inference engines, however, cannot process data in real time, are difficult to implement, and are vulnerable to repetitive tasks. To overcome these problems, we propose a method to construct an in-memory distributed inference engine that uses a parallel graph structure. In general, an ontology based on a triple structure naturally possesses a graph structure, so it is intuitive to design a graph-based inference engine. Moreover, the RDFS inference rules can be implemented with the operators of the graph structure, so the inference engine can be designed around the graph structure rather than the structure of a data table. In this study, we evaluate the proposed inference engine using the LUBM1000 and LUBM3000 data sets to test inference speed. The results of our experiments indicate that the proposed in-memory distributed inference engine performs about 10 times faster than an in-storage inference engine.
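
To make the "graph operator" idea concrete, here is a small single-machine Java sketch that applies one RDFS rule (rdfs9: if x has type C and C is a subclass of D, then x has type D) over adjacency maps instead of a triple table. The identifiers are assumptions, and this is not the paper's distributed, Spark-based engine.

```java
import java.util.*;

// In-memory sketch of one RDFS rule expressed over adjacency maps rather than a triple table.
class Rdfs9Sketch {
    Map<String, Set<String>> superClasses = new HashMap<>();   // class -> its direct superclasses
    Map<String, Set<String>> types = new HashMap<>();          // instance -> asserted/derived classes

    // rdfs9: (x type C) ∧ (C subClassOf D) ⇒ (x type D), applied to a fixpoint.
    void applyRdfs9() {
        boolean changed = true;
        while (changed) {
            changed = false;
            for (Map.Entry<String, Set<String>> e : types.entrySet()) {
                Set<String> derived = new HashSet<>();
                for (String c : e.getValue()) {
                    derived.addAll(superClasses.getOrDefault(c, Collections.emptySet()));
                }
                if (e.getValue().addAll(derived)) changed = true;   // new types found this pass
            }
        }
    }
}
```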

Development of a Remotely Sensed Image Processing/Analysis System : GeoPixel Ver. 1.0 (JAVA를 이용한 위성영상처리/분석 시스템 개발 : GeoPixel Ver. 1.0)

  • 안충현;신대혁
    • Korean Journal of Remote Sensing / v.13 no.1 / pp.13-30 / 1997
  • Recent improvements of satellite remote sensing sensors which are represented by hyperspectral imaging sensors and high spatial resolution sensors provide a large amount of data, typically several hundred megabytes per one scene. Moreover, increasing information exchange via internet and information super-highway requires the developments of more active service systems for processing and analysing of remote sensing data in order to provide value-added products. In this sense, an advanced satellite data processing system is being developed to achive high performance in computing speed and efficieney in processing a huge volume of data, and to make possible network computing and easy improving, upgrading and managing of systems. JAVA internet programming language provides several advantages for developing software such as object-oriented programming, multi-threading and robust memory managent. Using these features, a satellite data processing system named as GeoPixel has been developing using JAVA language. The GeoPixel adopted newly developed techniques including object-pipe connect method between each process and multi-threading structure. In other words, this system has characteristics such as independent operating platform and efficient data processing by handling a huge volume of remote sensing data with robustness. In the evaluation of data processing capability, the satisfactory results were shown in utilizing computer resources(CPU and Memory) and processing speeds.