Workflow Clustering Methodology Using Structural Similarity Metrics

Workflow Clustering Using Process Similarity

  • Jung, Jae-Yoon (정재윤) (Automation and Systems Research Institute, Seoul National University) ;
  • Bae, Joonsoo (배준수) (Department of Industrial and Information Systems Engineering, Chonbuk National University) ;
  • Kang, Suk-Ho (강석호) (Department of Industrial Engineering, Seoul National University)
  • Published : 2007.03.31

Abstract

To realize process-driven management, many companies have been deploying business process management systems. A business process is a collection of standardized and structured tasks that creates value for a company, and it is recognized as a significant intangible business asset for achieving competitive advantage. This research introduces a novel approach to workflow process analysis, which is becoming increasingly important as process-aware information systems spread across many companies. In this paper, a methodology for workflow clustering based on process similarity is proposed. The purpose of workflow clustering is to analyze accumulated process definitions in order to assist the design of new processes and the improvement of existing ones. The proposed methodology exploits measures of structural similarity between workflow processes. The methodology was tested on synthetic process models to illustrate the implications of workflow clustering.
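
The idea of clustering process definitions by structural similarity can be illustrated with a minimal sketch. The abstract does not state which similarity measure or clustering algorithm the paper uses, so everything here is an assumption made for illustration: processes are represented as adjacency dictionaries, similarity is the Jaccard index over directed edge sets, and clusters are formed by single-link merging above a hypothetical threshold. The function names and the sample models are likewise hypothetical.

```python
from itertools import combinations


def edge_set(process):
    """Return the set of directed edges (task, successor) of a process model."""
    return {(src, dst) for src, successors in process.items() for dst in successors}


def structural_similarity(p, q):
    """Jaccard index of the two edge sets (an assumed stand-in measure)."""
    ep, eq = edge_set(p), edge_set(q)
    union = ep | eq
    return len(ep & eq) / len(union) if union else 1.0


def cluster(processes, threshold=0.5):
    """Single-link clustering: join processes whose similarity >= threshold."""
    names = list(processes)
    parent = {n: n for n in names}

    def find(n):
        # Follow parent links to the cluster representative, compressing the path.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for a, b in combinations(names, 2):
        if structural_similarity(processes[a], processes[b]) >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for n in names:
        groups.setdefault(find(n), []).append(n)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical synthetic process models: task -> list of successor tasks.
models = {
    "order_a": {"receive": ["check"], "check": ["ship"], "ship": []},
    "order_b": {"receive": ["check"], "check": ["ship"],
                "ship": ["invoice"], "invoice": []},
    "hiring": {"post": ["interview"], "interview": ["offer"], "offer": []},
}
clusters = cluster(models)
print(clusters)
```

In this sketch the two order-handling models share two of three distinct edges and fall into one cluster, while the hiring model shares none and stays on its own. A graph-distance measure such as one based on the maximum common subgraph could be substituted for the Jaccard index without changing the clustering step.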
