Search | Korea Science

Fast GPU Implementation for the Solution of Tridiagonal Matrix Systems (삼중대각행렬 시스템 풀이의 빠른 GPU 구현)

Kim, Yong-Hee;Lee, Sung-Kee
- Journal of KIISE:Computer Systems and Theory / v.32 no.11_12 / pp.692-704 / 2005
With improvements in computer hardware, GPUs (Graphics Processing Units) now offer tremendous memory bandwidth and computational power, which has led to their use in general-purpose computation. In particular, GPU implementation of compute-intensive physics-based simulation is being actively studied. When solving the differential equations that underlie physics simulations, tridiagonal matrix systems arise repeatedly from finite-difference approximations, so the fast solution of tridiagonal systems is an important research topic for physics-based simulation. We propose a fast GPU implementation for solving tridiagonal matrix systems. In this paper, we implement the cyclic reduction (also known as odd-even reduction) algorithm, a popular choice for vector processors. We obtained a considerable performance improvement over the Thomas method and the conjugate gradient method: the Thomas method is a well-known method for solving tridiagonal systems on the CPU, and the conjugate gradient method has shown good results on the GPU. We tested the proposed method by applying it to heat conduction, advection-diffusion, and shallow water simulations. These simulations ran at over 35 frames per second on a 1024x1024 grid.
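To make the comparison between the two direct solvers concrete, the following is a minimal sequential sketch in Python of the Thomas method and of a recursive odd-even (cyclic) reduction. It is not the authors' GPU implementation: the function names, the list-based storage, and the convention that `a[0]` and `c[-1]` are zero are assumptions made for illustration. Within one reduction level, each odd-indexed equation is updated independently of the others, which is the property a parallel implementation would exploit.

```python
def thomas(a, b, c, d):
    """Thomas method: forward elimination, then back substitution.

    a, b, c are the sub-, main and super-diagonals (same length as d),
    with a[0] == 0 and c[-1] == 0 by convention (an assumption of this sketch).
    """
    n = len(d)
    cp = [0.0] * n
    dp = [0.0] * n
    cp[0] = c[0] / b[0]
    dp[0] = d[0] / b[0]
    for i in range(1, n):
        m = b[i] - a[i] * cp[i - 1]
        cp[i] = c[i] / m
        dp[i] = (d[i] - a[i] * dp[i - 1]) / m
    x = [0.0] * n
    x[-1] = dp[-1]
    for i in range(n - 2, -1, -1):
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x


def cyclic_reduction(a, b, c, d):
    """Odd-even reduction: eliminate even-indexed unknowns, recurse on the rest."""
    n = len(d)
    if n == 1:
        return [d[0] / b[0]]
    ra, rb, rc, rd = [], [], [], []
    # Each odd equation absorbs its two even neighbours; iterations are independent.
    for i in range(1, n, 2):
        alpha = -a[i] / b[i - 1]
        na = alpha * a[i - 1]
        nb = b[i] + alpha * c[i - 1]
        nc = 0.0
        nd = d[i] + alpha * d[i - 1]
        if i + 1 < n:
            gamma = -c[i] / b[i + 1]
            nb += gamma * a[i + 1]
            nc = gamma * c[i + 1]
            nd += gamma * d[i + 1]
        ra.append(na)
        rb.append(nb)
        rc.append(nc)
        rd.append(nd)
    x = [0.0] * n
    x[1::2] = cyclic_reduction(ra, rb, rc, rd)
    # Back substitution for the even-indexed unknowns.
    for i in range(0, n, 2):
        left = a[i] * x[i - 1] if i > 0 else 0.0
        right = c[i] * x[i + 1] if i + 1 < n else 0.0
        x[i] = (d[i] - left - right) / b[i]
    return x
```

Both functions take the same four sequences, so they can be checked against each other on a small diagonally dominant system such as the one produced by an implicit heat-conduction step.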

Compact Field Remapping for Dynamically Allocated Structures (동적으로 할당된 구조체를 위한 압축된 필드 재배치)

Kim, Jeong-Eun;Han, Hwan-Soo
- Journal of KIISE:Software and Applications / v.32 no.10 / pp.1003-1012 / 2005
The most significant difference between embedded systems and general-purpose systems is that embedded systems may use only limited resources, including battery power and memory. In addition, the number of applications that deal with multimedia data is increasing. In such data-intensive systems, memory access delay is one of the major bottlenecks hurting system performance. As a result, many researchers have investigated techniques to reduce memory access cost. Most programs exhibit locality in their memory references. Temporal locality means that a resource accessed at one point is likely to be used again in the near future. Spatial locality means that the likelihood of using a resource increases if resources near it were just accessed. Recent embedded processors usually adopt cache memory to exploit these two types of locality. Processors access cache memory faster than off-chip memory, which reduces latency. In this paper, we propose an enhanced dynamic allocation technique for structure-type data that eliminates unused memory space and reduces both the cache miss rate and application execution time. The proposed approach aggregates fields from multiple dynamically allocated records and remaps them consecutively in the memory space. Experiments on the Olden benchmarks show an average drop of $13.9\%$ in the L1 cache miss rate and $15.9\%$ in the L2 cache miss rate, compared with previously proposed techniques. Execution time is also reduced by $10.9\%$ on average, compared with the previous work.

A Congestion Control Algorithm for the fairness Improvement of TCP Vegas (TCP Vegas의 공정성 향상을 위한 혼잡 제어 알고리즘)

오민철;송병훈;정광수
- Journal of KIISE:Information Networking / v.31 no.3 / pp.269-279 / 2004
The most important factor influencing the robustness of the Internet is end-to-end TCP congestion control. However, the congestion control scheme of TCP Reno, the most popular TCP version on the Internet, relies on passive congestion indication, which worsens network congestion. Brakmo and Peterson proposed a new version of TCP, named TCP Vegas, whose congestion control scheme is fundamentally different from that of Reno. Many studies indicate that Vegas achieves better throughput and higher stability than Reno. However, Vegas has two unfairness problems, and these problems hinder its deployment on the current Internet. In this paper, to solve these unfairness problems, we propose a new congestion control algorithm called TCP PowerVegas. The existing Vegas relies mainly on the round-trip time (RTT), whereas the proposed PowerVegas uses a new congestion control scheme that combines RTT information with packet loss information. As a result, PowerVegas performs congestion control more competitively than Vegas and can effectively solve the unfairness problems that Vegas has experienced. To evaluate the proposed approach, we compare the performance of PowerVegas, Reno, and Vegas under the same network environment. Simulation results show that PowerVegas achieves better throughput and higher stability than Reno, and much better fairness than the existing Vegas.
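For readers unfamiliar with the RTT-based scheme the abstract refers to, the sketch below shows the congestion-avoidance rule of the baseline TCP Vegas as it is commonly described. It is not PowerVegas: the abstract does not say how PowerVegas combines packet loss with RTT, so that part is not shown. The function name, the packet-based window unit, and the threshold values `alpha=1` and `beta=3` are assumptions chosen for illustration.

```python
def vegas_update(cwnd, base_rtt, rtt, alpha=1.0, beta=3.0):
    """One congestion-avoidance step of baseline TCP Vegas (illustrative sketch).

    cwnd     -- congestion window in packets
    base_rtt -- smallest RTT observed so far (seconds)
    rtt      -- most recent RTT sample (seconds)
    alpha, beta -- thresholds in packets (assumed values)
    """
    expected = cwnd / base_rtt           # throughput if nothing were queued
    actual = cwnd / rtt                  # throughput actually achieved
    diff = (expected - actual) * base_rtt  # estimated packets queued in the network
    if diff < alpha:
        return cwnd + 1                  # path under-used: grow linearly
    if diff > beta:
        return max(cwnd - 1, 1)          # queue building up: back off linearly
    return cwnd                          # within the target band: hold
```

Because the window reacts to queueing delay before any loss occurs, a Vegas flow tends to yield to loss-driven flows that share the same bottleneck, which is one commonly cited source of unfairness.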

Color Media Instructions for Embedded Parallel Processors (임베디드 병렬 프로세서를 위한 칼라미디어 명령어 구현)

Kim, Cheol-Hong;Kim, Jong-Myon
- Journal of KIISE:Computer Systems and Theory / v.35 no.7 / pp.305-317 / 2008
As the mobile computing environment changes rapidly, increasing user demand for multimedia-over-wireless capabilities on embedded processors places constraints on performance, power, and size. In this regard, this paper proposes color media instructions (CMI) for single instruction, multiple data (SIMD) parallel processors to meet the computational requirements and cost goals. While existing multimedia extensions store and process 48-bit pixels in a 32-bit register, CMI, which takes into account that color components are perceptually less significant, supports parallel operations on two packed, compressed 16-bit YCbCr values (6 bits for Y and 5 bits each for Cb and Cr) in a 32-bit datapath processor. This provides greater concurrency and efficiency for YCbCr data processing. Moreover, the smaller data format reduces system cost, and the reduction in data bandwidth simplifies system design. Experimental results on a representative SIMD parallel processor architecture show that CMI achieves an average speedup of 6.3x over the baseline SIMD parallel processor. In contrast, MMX (a representative Intel multimedia extension) achieves an average speedup of only 3.7x over the same baseline SIMD architecture. CMI also outperforms MMX in both area efficiency (a 52% increase versus a 13% increase) and energy efficiency (a 50% increase versus an 11% increase). CMI improves performance and efficiency with a mere 3% increase in system area and a 5% increase in system power, while MMX requires a 14% increase in system area and a 16% increase in system power.
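The data format described above can be sketched as follows. The abstract gives only the bit widths (6 bits for Y, 5 bits each for Cb and Cr, two pixels per 32-bit word); the bit order within each 16-bit pixel, the truncation of 8-bit inputs, and all function names below are assumptions made for illustration, not the instruction set's actual encoding.

```python
def pack16(y, cb, cr):
    """Compress 8-bit Y, Cb, Cr into 16 bits: Y in bits 15-10, Cb in 9-5, Cr in 4-0.

    The field order is an assumption; only the 6/5/5 widths come from the abstract.
    """
    return ((y >> 2) << 10) | ((cb >> 3) << 5) | (cr >> 3)


def unpack16(pixel):
    """Return the (Y6, Cb5, Cr5) fields of a 16-bit packed pixel."""
    return (pixel >> 10) & 0x3F, (pixel >> 5) & 0x1F, pixel & 0x1F


def pack_pair(p0, p1):
    """Place two 16-bit pixels side by side in one 32-bit word (p0 in the low half)."""
    return ((p1 & 0xFFFF) << 16) | (p0 & 0xFFFF)


def unpack_pair(word):
    """Split a 32-bit word back into its two 16-bit pixels."""
    return word & 0xFFFF, (word >> 16) & 0xFFFF
```

Keeping more bits for Y than for Cb and Cr reflects the abstract's point that the color components are perceptually less significant than luminance.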

Automatic Recognition and Normalization System of Korean Time Expression using the individual time units (시간의 단위별 처리를 이용한 자동화된 한국어 시간 표현 인식 및 정규화 시스템)

Seon, Choong-Nyoung;Kang, Sang-Woo;Seo, Jung-Yun
- Korean Journal of Cognitive Science / v.21 no.4 / pp.447-458 / 2010
Time expressions are a very important form of information in many types of data, so recognizing them is an important task in information extraction. However, most previously designed systems consider only a specific domain, because time expressions do not have a regular form and frequently involve various kinds of ellipsis. We present a two-level recognition method, consisting of extraction and transformation phases, to achieve generality and portability. In the extraction phase, time expressions are extracted as atomic time units for extensibility. Then, in the transformation phase, omitted information is restored using a basis time and prior knowledge. Finally, every complete atomic time unit is transformed into a normalized form. The proposed system can be used as a general-purpose system because it has a language- and domain-independent architecture. In addition, the system performs robustly on noisy data such as SMS data, which include various errors. For SMS data, the accuracies of time-expression extraction and time-expression normalization with the proposed system are 93.8% and 93.2%, respectively. On the basis of these experimental results, we conclude that the proposed system shows high performance on noisy data.
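The two phases described above can be illustrated with a deliberately small Python sketch. It is not the authors' system: the regular expressions, the handling of 오후 (PM), the rule that units coarser than the first mentioned unit are copied from the basis time, and every function name are assumptions made for illustration.

```python
import re
from datetime import datetime

# Atomic time units and the Korean unit markers that follow a number (assumed patterns).
UNIT_PATTERNS = {
    "year": r"(\d{4})년",
    "month": r"(\d{1,2})월",
    "day": r"(\d{1,2})일",
    "hour": r"(\d{1,2})시",
    "minute": r"(\d{1,2})분",
}
ORDER = ["year", "month", "day", "hour", "minute"]
DEFAULTS = {"month": 1, "day": 1, "hour": 0, "minute": 0}


def extract_units(text):
    """Extraction phase: collect each atomic time unit found in the text."""
    units = {}
    for name, pattern in UNIT_PATTERNS.items():
        match = re.search(pattern, text)
        if match:
            units[name] = int(match.group(1))
    # "오후" means PM; shift afternoon hours onto the 24-hour clock.
    if "오후" in text and units.get("hour", 12) < 12:
        units["hour"] += 12
    return units


def normalize(text, basis):
    """Transformation phase: restore omitted units, then emit an ISO-style string."""
    units = extract_units(text)
    if not units:
        return None
    first = min(ORDER.index(name) for name in units)
    values = {}
    for index, name in enumerate(ORDER):
        if name in units:
            values[name] = units[name]
        elif index < first:
            values[name] = getattr(basis, name)  # omitted coarser unit: take from basis time
        else:
            values[name] = DEFAULTS[name]        # omitted finer unit: smallest value
    return datetime(**values).isoformat(timespec="minutes")
```

For example, with a basis time of 15 January 2010, the expression "3월 5일 오후 2시" (March 5, 2 PM) omits the year, which this sketch fills in from the basis time.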

Influence of Water Depth on Climate Change Impacts on Caisson Sliding of Vertical Breakwater (직립방파제의 케이슨 활동에 미치는 기후변화영향에 대한 수심의 효과)

Kim, Seung-Woo;Kim, So-Yeon;Suh, Kyung-Duck
- Journal of Korean Society of Coastal and Ocean Engineers / v.24 no.3 / pp.179-188 / 2012
Performance analyses were conducted for hypothetical vertical breakwaters designed for various water depths, in order to analyze the influence of climate change on the structures. A performance-based design method that considers sea level rise and wave height increase due to climate change was used for the analysis. One problem with the performance-based design method is the long computation time required for wave transformation. To overcome this problem, the SWAN model was combined with an artificial neural network. The significant wave height and principal wave direction at the breakwater site are calculated quickly by a trained neural network whose inputs are the deepwater significant wave height, the deepwater principal wave direction, and the tidal level. In general, structural stability decreases because of climate change impacts, but the trend in stability differs with water depth. Outside the surf zone, the influence of wave height increase becomes more significant, while that of sea level rise becomes negligible, as water depth increases. Inside the surf zone, the influence of both wave height increase and sea level rise diminishes as water depth decreases, but the influence of wave height increase is greater than that of sea level rise. Reinforcement and maintenance policies for vertical breakwaters should be established with these results in mind.
https://doi.org/10.9765/KSCOE.2012.24.3.179

Large-Scale Slope Stability Analysis Using Climate Change Scenario (1): Methodologies (기후변화 시나리오를 이용한 광역 사면안정 해석(1): 방법론)

Choi, Byoung-Seub;Oh, Sung-Ryul;Lee, Kun-Hyuk;Lee, Gi-Ha;Kwon, Hyun-Han
- Journal of the Korean Association of Geographic Information Studies / v.16 no.3 / pp.193-210 / 2013
This study assesses changes in slope stability across the drainage areas of Jeollabuk-do using RCM outputs under the A1B climate change scenario and an infinite slope stability model based on the specific catchment area concept. To this end, we downscaled the RCM data in space (from watershed scale to rain gauge scale) and in time (from monthly to daily data), and developed a GIS-based infinite slope stability model that uses the specific catchment area concept to calculate a spatially distributed wetness index. For model parameterization, topographic, geologic, and forestry digital maps were used, and model parameters were set up on grid cells ($90\,\mathrm{m} \times 90\,\mathrm{m}$). Finally, we applied the future daily rainfall data to the infinite slope stability model and assessed the variation in slope stability under the climate change scenario. This research is presented in two papers; this first paper focuses on the methodologies for preparing the climate change scenario and developing the infinite slope stability model.
https://doi.org/10.11108/kagis.2013.16.3.193

The Froude Scaling Study on the Ventilation of Non-isothermal Concentrated Fume from the Semi-closed Space (반밀폐형 공간에서 비등온 고농도 연무의 배연산출량 산정을 위한 Froude 상사연구)

Chang, Hyuk-Sang;Choi, Byung-Il;Park, Jae-Cheul;Kim, Myung-Bae
- Journal of Korean Society of Environmental Engineers / v.27 no.8 / pp.877-885 / 2005
Froude scaling between the prototype and the model was applied to estimate the ventilation rate required for non-isothermal concentrated fume in a semi-closed inner space. Based on the non-dimensional similitude equations derived from the Zukoski plume rise analysis, scaling experiments were carried out to verify the relationship between the non-dimensional energy release rate and the non-dimensional mass flow rate using two volume models of different scale, model A ($1\,\mathrm{m} \times 1\,\mathrm{m} \times 1\,\mathrm{m}$) and model B ($0.5\,\mathrm{m} \times 0.5\,\mathrm{m} \times 0.5\,\mathrm{m}$). The experimental results showed that the theoretical similitude between the models is acceptable for predicting the ventilation rate of the concentrated fume. The maximum energy release rate used in the experiments was $20\,\mathrm{kW/m^3}$. Within the experimental range, the similitude between the energy release rate and the ventilation mass flow rate was well defined, and the required ventilation rates were 20-30% higher than the stoichiometric ventilation mass flow rate. Based on the results of this study, the design of local air ventilation systems can be improved by correcting for the effects of buoyancy and diffusion of the non-isothermal concentrated fume.
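As a rough illustration of what Froude scaling implies between two geometrically similar models, the sketch below applies the standard power-law relations in the length scale: energy release rate and mass flow rate scale with $L^{5/2}$, and velocity with $L^{1/2}$. These are the textbook Froude relations, assuming the same ambient fluid and temperature in both models; the abstract does not give the paper's own similitude equations, and the function names and sample values are hypothetical.

```python
def froude_scale(value, length_from, length_to, exponent):
    """Scale a quantity between geometrically similar models by a power of the length ratio."""
    return value * (length_to / length_from) ** exponent


def scale_energy_release_rate(q, length_from, length_to):
    """Energy release rate scales with L^(5/2) under Froude scaling."""
    return froude_scale(q, length_from, length_to, 2.5)


def scale_mass_flow_rate(m_dot, length_from, length_to):
    """Mass flow rate (velocity times area, same density) also scales with L^(5/2)."""
    return froude_scale(m_dot, length_from, length_to, 2.5)


def scale_velocity(v, length_from, length_to):
    """Velocity scales with L^(1/2) under Froude scaling."""
    return froude_scale(v, length_from, length_to, 0.5)
```

Under these relations, halving the side length, as between model A (1 m) and model B (0.5 m), reduces the corresponding energy release rate and mass flow rate by a factor of $0.5^{5/2}$, roughly 0.18.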

Developing a General Recycling Method of FRP Boats (FRP선박의 범용 재활용을 위한 재처리시스템의 연구)

Yoon, Koo-Young
- Journal of the Korean Society for Marine Environment & Energy / v.12 no.1 / pp.29-34 / 2009
For several decades, many researchers have worked on developing recycling methods for FRP boats. The literature covers four basic classes of recycling. Despite its environmental problems (safety hazards), mechanical recycling of FRP boats, which involves shredding and grinding the scrap FRP, is simpler and more technically proven than incineration, reclamation, or chemical methods. Because FRP is reinforced with glass fiber, it is very difficult to break into pieces. This also leads to secondary problems in the recycling process, such as air pollution and unacceptable shredding noise levels. Another serious problem with mechanical FRP recycling is that reuse applications for the residue are very limited. This study proposes a new and efficient waste-FRP regeneration system that supports a wider range of applications and is more environmentally friendly. The new system adds a cyclone sorting machine to handle airborne pollutants and a modified cutting system that produces glass fiber chips of several sizes. The study also shows that using waste FRP boat material in FRP-chip fiber-reinforced concrete and fiber-reinforced secondary concrete products is more suitable than existing recycling methods.

Development of Techniques for Testicular Germ Cell Transplantation in Pigs (돼지에 있어서 정소 생식세포의 이식 기법 개발)

Kim, Byung-Gak;Lee, Yong-An;Kim, Bang-Jin;Kim, Ki-Jung;Min, Kwan-Sik;Lee, Jang-Hee;Ryu, Jae-Weon;Kim, In-Cheul;Ryu, Buom-Yong
- Reproductive and Developmental Biology / v.32 no.3 / pp.193-198 / 2008
The current study was designed to extend the technique of spermatogonial transplantation to the economically important pig model. We evaluated the efficiency of pig-to-pig transplantation. Donor testis cells were harvested from testes obtained at castration of 10- to 14-day-old boars and were labeled with a fluorescent marker (PKH26) before transplantation. The presence of infused dye or labeled pig testicular cells was confirmed in the seminiferous tubules of the recipient pigs. The most effective procedure for intratubular germ cell transfer was to insert a fine needle (21 to 25 gauge) through the cauda epididymis and testis into the rete testis under ultrasound guidance. Infusion of 5 to 7 ml of dye solution or cell suspension could fill the rete and up to 50% of the seminiferous tubules of 14-week-old boars. Testes were examined for the presence and localization of labeled donor cells immediately after transplantation, and labeled donor cells were found in numerous seminiferous tubules of the recipient pig testes. These results indicate that germ cell transplantation is feasible in the recipient pig testis. This study represents successful spermatogonial transplantation between individual animals in a livestock species.
