Integrated Search | Korea Science

Scalable Prediction Models for Airbnb Listing in Spark Big Data Cluster using GPU-accelerated RAPIDS

Muralidharan, Samyuktha;Yadav, Savita;Huh, Jungwoo;Lee, Sanghoon;Woo, Jongwook
- Journal of information and communication convergence engineering / Vol. 20, No. 2 / pp. 96-102 / 2022
We aim to build predictive models for Airbnb prices using GPU-accelerated RAPIDS in a big data cluster. The Airbnb Listings datasets are used for the predictive analysis. Several machine-learning algorithms were adopted to build models that predict the price of Airbnb listings. We compare the results of traditional and big data approaches to machine learning for price prediction and discuss the performance of the models. We built big data models using a Databricks Spark cluster, a distributed parallel computing system. Furthermore, we implemented models on multiple GPUs using RAPIDS in the Spark cluster. The GPU model was developed using the XGBoost algorithm, whereas the other models were developed using traditional central processing unit (CPU)-based algorithms. This study compared all models in terms of accuracy metrics and computing time. We observed that the XGBoost model with RAPIDS using GPUs had the highest accuracy and computing time.
https://doi.org/10.6109/jicce.2022.20.2.96

CNDO/2 MO Calculations for Catalytic Acidity of V-silicalite

김명철
- 공업화학 / Vol. 5, No. 2 / pp. 357-360 / 1994
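CNDO/2 calculations were performed on several molecular cluster models representing the active sites of V-silicalite, yielding Wiberg bond orders, LUMO energies, and total energies. The Brønsted acidity of each proposed model molecule was examined through the bond order of its O-H bond. The LUMO energies obtained from the calculations indicated the Lewis acidity of each active-site model. The structural stability of each cluster model was explained in terms of its total energy.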

Analysis of Cluster-based Truck-Drone Delivery Routing Models

장용식
- Journal of Information Technology Applications and Management / Vol. 26, No. 1 / pp. 53-64 / 2019
The purpose of this study is to find a fast delivery route in which a truck passes through clusters of delivery locations and, at each cluster, several drones depart from the truck, deliver to the locations in that cluster, and return to the truck. The main issue is to reduce the total delivery time, which consists of the travel time of the relatively slow truck between clusters and the sum, over all clusters, of the maximum delivery time of the relatively fast drones in each cluster. To solve this problem, we use a three-step heuristic approach. First, we group nearby delivery locations into the minimal number of clusters that satisfies a constraint on drone flight distance, and set the delivery paths for drones in each cluster. Second, we set an optimal delivery route for the truck through the centers of the clusters using the TSP model. Finally, we move the centers of the clusters within the two-dimensional region to reduce the total delivery time, while maintaining the delivery paths for the truck and drones and satisfying the drone flight distance constraint. To analyze the effect of the proposed model as the number of delivery locations changes, we developed an R-based simulation prototype, compared the relative efficiency, and performed paired t-tests between the TSP model and the cluster-based models. The experiments showed the superiority of the proposed model.
https://doi.org/10.21219/jitam.2019.26.1.053
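The objective described in the abstract above (truck travel time between cluster centers plus, for each cluster, the longest drone round trip) can be sketched roughly as follows. This is a minimal illustration and not the author's R prototype: the function names, the Euclidean distances, the constant speeds, the open truck path, and the assumption of one drone per delivery location are all hypothetical.

```python
import math

def dist(a, b):
    """Euclidean distance between two (x, y) points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])

def total_delivery_time(route_centers, clusters, truck_speed, drone_speed):
    """Total time for a truck route through cluster centers.

    route_centers: cluster centers in the order the truck visits them.
    clusters: delivery locations of each cluster, in the same order.
    Assumes (hypothetically) one drone per location, so all drones of a
    cluster fly in parallel and the truck waits for the slowest one.
    """
    # Truck travel time along the ordered centers (open path, no return leg).
    truck_time = sum(
        dist(a, b) for a, b in zip(route_centers, route_centers[1:])
    ) / truck_speed
    # Per cluster: the longest out-and-back drone flight from the center.
    drone_time = sum(
        max(2 * dist(center, loc) for loc in locs) / drone_speed
        for center, locs in zip(route_centers, clusters)
    )
    return truck_time + drone_time

centers = [(0, 0), (3, 4)]
clusters = [[(0, 1), (0, 2)], [(3, 5)]]
total = total_delivery_time(centers, clusters, truck_speed=1.0, drone_speed=2.0)
```

A function of this shape could serve as the quantity being reduced in the third step, where the cluster centers are moved while the visiting order stays fixed.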

Design and Performance Measurement of a Genetic Algorithm-based Group Classification Method: The Case of Bond Rating

민재형;정철우
- 한국경영과학회지 / Vol. 32, No. 1 / pp. 61-75 / 2007
The purpose of this paper is to develop a new group classification method based on a genetic algorithm and to compare its prediction performance with those of existing methods in the area of bond rating. To serve this purpose, we conduct various experiments with pilot and general models. Specifically, we first conduct experiments employing two pilot models: one searching for the cluster center of each group, and the other searching for both the cluster center and the attribute weights in order to maximize classification accuracy. The results from the pilot experiments show that the latter achieves a higher classification accuracy ratio than the former, which provides the rationale for searching for both the cluster center of each group and the attribute weights to improve classification accuracy. With this lesson in mind, we design two generalized models employing a genetic algorithm: one maximizes the classification accuracy and the other minimizes the total misclassification cost. We compare the performance of these two models with those of existing statistical and artificial intelligence models such as MDA, ANN, and Decision Tree, and conclude that the genetic algorithm-based group classification method proposed in this paper significantly outperforms the other methods with respect to classification accuracy ratio as well as misclassification cost.

Development of a New Cluster Index for Semiconductor Wafer Defects and Simulation-Based Yield Prediction Models

박항엽;전치혁;홍유신;김수영
- 대한산업공학회지 / Vol. 21, No. 3 / pp. 371-385 / 1995
The yield of semiconductor chips depends not only on the average defect density but also on the distribution of defects over a wafer. This dependence on the defect distribution motivates the use of a cluster index. This paper briefly reviews the existing yield prediction models and proposes a new cluster index, which uses information about defect locations on a wafer in terms of the coefficient of variation. An extensive simulation is performed under a variety of defect distributions, and a yield prediction model is derived through regression analysis to relate the yield to the proposed cluster index and the average number of defects per chip. The performance of the proposed simulation-based yield prediction model is compared with that of the well-known negative binomial model.
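For orientation, the two ingredients named in the abstract above can be sketched in a few lines. The coefficient of variation is the standard deviation divided by the mean, and the negative binomial yield model is commonly written as Y = (1 + λ/α)^(-α), with λ the average number of defects per chip and α a clustering parameter. The function names below are hypothetical, and the code does not reproduce the paper's own cluster index or its regression model, neither of which is given in the abstract.

```python
import math
import statistics

def coefficient_of_variation(values):
    """Population standard deviation divided by the mean."""
    return statistics.pstdev(values) / statistics.fmean(values)

def negative_binomial_yield(avg_defects_per_chip, alpha):
    """Negative binomial yield model: Y = (1 + lambda/alpha) ** (-alpha).

    A smaller alpha means stronger defect clustering.
    """
    return (1.0 + avg_defects_per_chip / alpha) ** (-alpha)

def poisson_yield(avg_defects_per_chip):
    """Poisson yield model, the limit of the model above for large alpha."""
    return math.exp(-avg_defects_per_chip)

cv_uniform = coefficient_of_variation([2, 2, 2, 2])
cv_spread = coefficient_of_variation([1, 3])
y_clustered = negative_binomial_yield(1.0, 1.0)
y_nearly_random = negative_binomial_yield(1.0, 1e6)
```

For the same average defect count, the clustered case yields more good chips than the nearly random case, which is why a clustering measure matters for yield prediction.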

Quantum Chemical Calculation of NO Decomposition over Cu-Y Zeolite

김명철
- 공업화학 / Vol. 7, No. 2 / pp. 321-325 / 1996
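The characteristics of the NO decomposition reaction over $Cu^{n+}$-exchanged Y-type zeolite were interpreted through quantum chemical calculations. Theoretical calculations such as CNDO/2 were performed on cluster models representing the cation sites in the zeolite, yielding total energies, LUMO energies, and Wiberg bond orders. The reaction mechanism of NO decomposition at the $Cu^{n+}$ cation sites in the zeolite framework was examined through the total energies and bond orders of each model. The proposed molecular models were considered separately for different Si/Al ratios and for $Cu^+$- and $Cu^{2+}$-exchanged cations. The Lewis acidity of the model molecules was interpreted from the calculated LUMO energies. The results suggested that the NO decomposition mechanism may proceed through successive steps: adsorption of NO, decomposition into $N_2$ and $O_2$, and desorption of $N_2$ and $O_2$. At the cation sites, $Cu^{2+}$ showed stronger Lewis acidity than $Cu^+$.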

Cluster Analysis of Daily Electricity Demand with t-SNE

Min, Yunhong
- 한국컴퓨터정보학회논문지 / Vol. 23, No. 5 / pp. 9-14 / 2018
For efficient management of the electricity market and power systems, accurate forecasts of electricity demand are essential. Since many factors, both known and unknown, determine the realized loads, it is difficult to forecast demand from the past time series alone. In this paper we perform a cluster analysis on electricity demand data collected from Jan. 2000 to Dec. 2017. Our purpose in clustering electricity demand data is that each cluster is expected to consist of data whose latent variables take the same or similar values. If the data are properly clustered, it is then possible to develop an accurate forecasting model for each cluster separately. To validate the feasibility of this approach for building better forecasting models, we clustered the data with t-SNE. To apply t-SNE to time series data effectively, we adopt dynamic time warping as a similarity measure. From the experimental results, we found that several clusters are clearly observed and that each cluster can be interpreted as a mix of well-known factors such as trends, seasonality, and holiday effects, together with other unknown factors. These findings can motivate approaches that build a forecasting model for each cluster independently.
https://doi.org/10.9708/jksci.2018.23.05.009
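The dynamic time warping (DTW) measure mentioned in the abstract above can be illustrated with the textbook dynamic-programming recurrence. This is a generic sketch, not the author's code: the function names are hypothetical, the local cost is assumed to be the absolute difference, and no warping-window constraint is applied.

```python
def dtw_distance(a, b):
    """Classic DTW distance between two numeric sequences."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = best cumulative cost aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three neighboring alignments.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]

def dtw_matrix(series):
    """Symmetric matrix of pairwise DTW distances."""
    k = len(series)
    mat = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(i + 1, k):
            mat[i][j] = mat[j][i] = dtw_distance(series[i], series[j])
    return mat

days = [[1, 2, 3], [1, 2, 2, 3], [0, 0], [1, 1]]
distances = dtw_matrix(days)
```

A pairwise matrix like this could then be handed to a t-SNE implementation that accepts precomputed distances; which implementation the paper used is not stated in the abstract.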

Analytic Study of Acquiring KANSEI Information Regarding the Recognition of Shape Models

Wang, Shao-Chi;Hiroshi Kubo;Hiromitsu Kikita;Takashi Uozumi;Tohru Ifukube
- 한국감성과학회: Conference Proceedings / 한국감성과학회 2002 Spring Conference Proceedings / pp. 266-269 / 2002
This paper presents a fundamental study of acquiring users' KANSEI information regarding the recognition of shape models. Since users differ in many respects, such as background and knowledge, they produce different evaluations based on their KANSEI even when an identical shape model is presented. Cluster analysis proved useful for capturing group tendencies and for constructing a mapping between a description of the shape model and the KANSEI database. In order to investigate analogical relations and mutual influences in our consciousness, we first made a questionnaire that asked subjects to describe images having different colors and shape cones by using 4 pairs of adjectives (KANSEI words). Next, based on a cluster analysis of the questionnaire using fuzzy set theory, we proposed a hypothesis showing how the analogical relation and the mutual influence work in our mind while viewing the shape models. Furthermore, how the properties of KANSEI depend on their descriptions was also investigated by means of the cluster analysis. This work will be valuable for constructing a personal KANSEI database for the Shape Model Processing System.

Some models for rainfall focused on the inner correlation structure

Kim, Sangdan
- 한국수자원학회: Conference Proceedings / 한국수자원학회 2004 Conference / pp. 1290-1294 / 2004
In this study, new stochastic point rainfall models that can consider the correlation structure between rainfall intensity and duration are developed. In order to consider negative and positive correlation simultaneously, Gumbel's type-II bivariate distribution is applied, and for the cluster structure of rainfall events, the Neyman-Scott cluster point process is selected. From a theoretical point of view, it is shown that the models considering the dependence structure between rainfall intensity and duration have slightly heavier-tailed autocorrelation functions than the corresponding independent models. Results from generating long-term rainfall events show that the dependent models reproduce historical rainfall time series better than the corresponding independent models in terms of autocorrelation structure, zero-rainfall probabilities, and extreme rainfall events.
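The cluster structure of a Neyman-Scott process, as named in the abstract above, can be sketched as storm origins arriving as a Poisson process, each spawning a random number of rain cells displaced from the origin. The sketch below simulates cell arrival times only. It does not model the intensity-duration dependence or Gumbel's type-II bivariate distribution, and the function name, the geometric cell count, and the exponential displacements are assumptions made for illustration.

```python
import random

def neyman_scott_cell_times(storm_rate, mean_cells, mean_delay, horizon, rng):
    """Simulate rain-cell arrival times on [0, horizon).

    storm_rate: Poisson rate of storm origins (per unit time).
    mean_cells: mean number of cells per storm (>= 1, geometric; assumed).
    mean_delay: mean exponential delay of a cell after its storm origin.
    """
    cell_times = []
    t = rng.expovariate(storm_rate)  # first storm origin
    while t < horizon:
        # Geometric number of cells with mean `mean_cells`.
        n_cells = 1
        while rng.random() > 1.0 / mean_cells:
            n_cells += 1
        # Each cell is displaced from the origin by an exponential delay.
        for _ in range(n_cells):
            cell_times.append(t + rng.expovariate(1.0 / mean_delay))
        t += rng.expovariate(storm_rate)  # next storm origin
    return sorted(c for c in cell_times if c < horizon)

times = neyman_scott_cell_times(0.1, 3.0, 2.0, 1000.0, random.Random(0))
```

Attaching an intensity and a duration to each cell, drawn from a joint distribution, is the step where the dependent and independent models compared in the abstract would differ.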

The Evaluation of Regional Innovation and Cluster Policies: Theory and Methods

마리아안젤레스디에즈
- 산업클러스터 / Vol. 1, No. 1 / pp. 1-15 / 2007
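Regional innovation and cluster policies form a new agenda in regional policy and have recently attracted interest in many countries and regions. Our main concern here is how to evaluate regional innovation and cluster policies, and which models and methods to use in that evaluation. Since 1990, national and local governments have emphasized policy evaluation as a tool for creating the knowledge needed to design improved policies. The purpose of this paper is to summarize the main discussions concerning the evaluation of regional innovation and cluster policies and to propose methodologies that can contribute to better policy evaluation. First, it gives an overview of regional innovation and cluster policies and analyzes in more detail their main characteristics and the main challenges they raise for policy evaluation. Next, it proposes several methods that can help improve current evaluation practice. Finally, it presents several points to consider when evaluating regional innovation and cluster policies.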

Search Result 355, Processing Time 0.02 seconds
