Tightly Coupled Integration of Ranking SVM and RDBMS

Song, Jae-Hwan; Oh, Jin-Oh; Yang, Eun-Seok; Yu, Hwan-Jo

Journal of KIISE:Databases (한국정보과학회논문지:데이타베이스)

Volume 36, Issue 4 / Pages 247-253 / 2009 / 1229-7739 (pISSN)

Korean Institute of Information Scientists and Engineers (한국정보과학회)

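Song, Jae-Hwan (Technology Services Division, LG CNS Co., Ltd.);
Oh, Jin-Oh (Department of Computer Science and Engineering, Pohang University of Science and Technology);
Yang, Eun-Seok (Department of Computer Science and Engineering, Pohang University of Science and Technology);
Yu, Hwan-Jo (Department of Computer Science and Engineering, Pohang University of Science and Technology)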

Published: 2009.08.15

Abstract

Rank learning and processing have gained much attention in the IR and data mining communities over the last decade. While other data mining techniques such as classification and regression have been actively studied for interoperation with RDBMSs through tightly coupled or loosely coupled approaches, ranking has been studied independently, without integration into an RDBMS. This paper proposes a tightly coupled integration of Ranking SVM into MySQL so that the rank learning task can be performed efficiently within the RDBMS. We implemented new SQL commands for learning ranking functions and predicting ranking scores. We evaluated our tightly coupled integration of Ranking SVM by comparing it with a loosely coupled implementation. The experimental results show that our approach improves performance by $10{\sim}40\%$ in the training phase and by 60% in the prediction phase.

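Over the past ten years, ranking has been an active research area in data mining. However, like other data mining techniques, ranking was developed independently of RDBMSs, and as a result it interoperates poorly with the RDBMSs that are already in wide use. For other data mining techniques, research on interoperating with RDBMSs through loosely coupled or tightly coupled approaches has been actively pursued, and this has produced application systems that are usable in practice. For ranking, however, such efforts have been scarce. In this paper, we integrate Ranking SVM into MySQL so that ranking tasks can be performed efficiently in conjunction with the RDBMS. Our implementation, based on the tightly coupled approach, adds new SQL commands for ranking to MySQL. To verify the efficiency of the ranking tasks, we compared its performance with that of a Ranking SVM based on the loosely coupled approach and observed performance improvements of $10{\sim}40\%$ in the training phase and 60% on average in the prediction phase.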
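For readers unfamiliar with Ranking SVM, the learner that the abstract integrates into MySQL, the following is a minimal Python sketch of the general pairwise idea: each pair of items with different relevance becomes a difference vector, and a linear weight vector is fitted so that preferred items score higher. It is not the authors' implementation and does not reproduce their SQL commands, whose syntax the abstract does not give. The function names, the toy data, and the choice of plain stochastic subgradient descent on a regularised hinge loss are all assumptions made for illustration.

```python
import random


def pairwise_differences(features, relevance):
    """Return x_i - x_j for every pair where item i is more relevant than item j."""
    pairs = []
    for xi, yi in zip(features, relevance):
        for xj, yj in zip(features, relevance):
            if yi > yj:
                pairs.append([a - b for a, b in zip(xi, xj)])
    return pairs


def train_ranking_svm(features, relevance, reg=0.01, lr=0.1, epochs=200, seed=0):
    """Fit a linear ranking function by stochastic subgradient descent
    on the L2-regularised pairwise hinge loss (hypothetical hyperparameters)."""
    rng = random.Random(seed)
    pairs = pairwise_differences(features, relevance)
    w = [0.0] * len(features[0])
    for _ in range(epochs):
        rng.shuffle(pairs)
        for d in pairs:
            margin = sum(wk * dk for wk, dk in zip(w, d))
            for k in range(len(w)):
                grad = reg * w[k]      # gradient of the regulariser
                if margin < 1.0:       # pair violates the margin
                    grad -= d[k]       # subgradient of the hinge loss
                w[k] -= lr * grad
    return w


def score(w, x):
    """Ranking score of one item: the inner product w . x."""
    return sum(wk * xk for wk, xk in zip(w, x))


if __name__ == "__main__":
    # Toy data (assumed): four items, two features each, graded relevance.
    features = [[3.0, 1.0], [2.0, 0.5], [1.0, 2.0], [0.0, 0.0]]
    relevance = [3, 2, 1, 0]
    w = train_ranking_svm(features, relevance)
    order = sorted(range(len(features)), key=lambda i: -score(w, features[i]))
    print("learned weights:", w)
    print("items ordered by predicted score:", order)
```

In this sketch, training corresponds to the "learning ranking functions" step and `score` to the "predicting ranking scores" step named in the abstract. In a tightly coupled design both would run inside the database engine rather than in an external process that first exports the data.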
