1. Introduction
Recently, collaboration and cooperation in business activities have become common and have been actively adopted in various companies and organizations. In particular, the collaborative and cooperative activities among individuals acting as employees of those companies and organizations have been represented through the concept of social networks, both intra-organizationally and inter-organizationally. That is, the collaborative behavior of individuals in organizational social networks [2] gives companies an opportunity to learn how much their employees contribute to their products, services, and business operations. Hence, the concept of technology-supported social networks has received considerable attention in the fields of information systems, business intelligence, and knowledge discovery systems. Consequently, the recent workflow literature has also shifted its focus to “people and their work-collaborating relationships,” based on the strong belief that the social relationships and collaborative behaviors among employees who are involved in enacting workflow models affect not only their overall decision-making and organizational performance but also their business success and working productivity. Accordingly, the workflow literature has become interested in a new type of social network, dubbed the workflow-supported social network (WSSN) [3].
In this paper, we are particularly interested in the size of a workflow-supported social network. When its size (the number of nodes) grows large, it is called a large-scale workflow-supported social network [4]. A large-scale workflow-supported social network is graphically represented by an undirected graph showing the relationships among the actors (or employees) who are involved in enacting a group of organization-wide workflow models. In such a network, nodes are the actors who perform the corresponding workflow models, and edges are the work-transfer relationships between the actors. A large-scale workflow-supported social network can be used as a means to analyze the degree of collaborative capability of each actor in enacting a group of organization-wide workflow models. A typical social network analysis method that is well suited to analyzing the degree of collaborative capability is centrality analysis. There are four types of centrality [1]: degree centrality, closeness centrality, betweenness centrality, and eigenvector centrality. These centrality measurements can be used as valuable indicators for human resource management metrics, such as personnel evaluation, work placement, work transfer, and work intimacy.
The authors’ research group has developed a closeness centrality measurement function [5,6] as a component of the workflow-supported social network analysis system. Based upon this function, we also tried to develop a rank calculation function on the closeness centrality measures of all the actors associated with a workflow-supported social network. However, we were confronted with the scalability problem [4] in calculating the ranks of the closeness centrality measures: as the number of nodes (actors) in a workflow-supported social network increases dramatically and the network becomes large-scale, the function’s time complexity increases exponentially. To solve the problem, this paper implements an estimated ranking algorithm for calculating closeness centrality measures in a large-scale workflow-supported social network, named RankCCWSSN, and evaluates the algorithm’s performance.
We organize the paper into six sections including the introduction. The next section surveys the previous literature on analysis methods for workflow-supported social networks and on estimated ranking algorithms. In the three subsequent sections, we describe in detail our estimated ranking algorithm, RankCCWSSN, and its performance evaluation results from a series of experiments on a large-scale workflow-supported social network. Finally, we close the paper with concluding remarks and future work.
2. Related Work
Recently, issues of technology-supported social networks and organizational behavioral analytics [3,5-7,10-20] have been raised in the IT literature. Naturally, the workflow literature has been focusing on social and collaborative work analysis in workflow-supported organizations, because workflow management systems themselves are “human-centered systems,” where workflow procedures must be designed, deployed, and understood within their social and organizational contexts. From these human-centered organizational contexts, a new concept of organizational social networks among those individuals can be formed, which is called the workflow-supported social network [3]. So far, the literature has delivered several research results on modeling [19], discovering [2,3,10,11,15,18], analyzing [5-7,12,13,17], and visualizing [14,16,19,20] workflow-supported social networks.
In particular, S. Park, et al. [5] built a theoretical approach for numerically analyzing closeness centrality measures among the workflow-actors of workflow-supported social network models. The essential part of the proposed approach is a closeness centrality analysis equation and an algorithm that efficiently computes the closeness centrality measures; the developed algorithm can be applied to analyzing the degree of work-intimacy among the workflow-actors who are allocated to perform the corresponding workflow procedure. M.-J. Kim, et al. [13,14] implemented a knowledge visualization framework and system designed for workflow-supported social networking knowledge; the framework forms a pipeline from the XPDL-formatted workflow model to the GraphML-formatted workflow-supported social network. The authors’ research group also developed and implemented closeness centrality measurement functions on a workflow-supported social network. Based upon the closeness centrality measures, J. Kim, et al. [4,8,9] and D. Lee, et al. [21] tried to develop ranking algorithms that calculate the rank of each workflow-actor and to extend these ranking algorithms to large-scale workflow-supported social networks.
However, the traditional ranking algorithms [22-24] faced a fatal problem, the scalability problem: as the number of nodes (actors) in a workflow-supported social network increases dramatically and the network becomes large-scale, the function’s time complexity increases exponentially. In order to resolve the scalability problem, the authors developed a novel ranking algorithm, published in [8], which was derived from the estimation-driven ranking approach [25,26]. It handled only the simple binary type of large-scale workflow-supported social network, which is mathematically represented by a binary sociomatrix. In this paper, we extend the estimated ranking algorithm so as to support the weighted type of large-scale workflow-supported social network, which is mathematically represented by a valued sociomatrix. In summary, this paper extends the RankCCWSSN (Ranks of Closeness Centrality measures for large-scale Workflow-Supported Social Networks) algorithm published in [8] and evaluates its performance in terms of computation time.
3. Design and Implementation of RankCCWSSN
The closeness centrality measurement [1] is a typical analysis method for a social network, and it quantifies the degree of centrality or importance of each node. The authors’ research group rigorously revised the closeness centrality analysis method so that it can be applied to workflow-supported social networks, where nodes are actors and edges are task-transfer relationships between actors. Through the closeness centrality measurement, we can find valuable organizational and behavioral knowledge by answering the following questions:
In this paper, we are particularly interested in ranking each actor based upon the closeness centrality measures calculated on a large-scale workflow-supported social network. In calculating the measures, we recognized that the traditional ranking algorithms are inappropriate for a large-scale workflow-supported social network because of the time complexity problem. Therefore, we develop an estimation-driven ranking algorithm, abbreviated to RankCCWSSN as stated in the previous section.
3.1 A Procedural Framework
Fig. 1 illustrates a procedural framework for ranking the actors’ closeness centralities by the estimation-driven ranking approach. It starts by discovering a large-scale workflow-supported social network from a group of organization-wide workflow models, which is done by the workflow-supported social network discovery algorithm developed by the authors’ research group [10]. Note that the discovered workflow-supported social network is a connected graph with weighted edges among the actors. A connection between two actor nodes is established according to whether a work-transfer relationship exists between them. In the next analysis phase, the framework forms a sample group of randomly selected actors from the discovered workflow-supported social network, which is represented by a sociomatrix. Based upon the sampled actor group, it calculates the approximate closeness centrality measures and produces estimated ranks of the corresponding actors.
Fig. 1. A procedural framework of the estimation-driven closeness centrality ranking approach
In our experience, when we analyze a large-scale workflow-supported social network containing more than about 500 actor nodes, the algorithm’s computation time for ranking their closeness centrality measures increases dramatically. This is why we take the approximation (or estimation-driven) approach, in which a sample actor group is randomly selected from the original actor group and the estimated closeness centralities of all actors are measured by calculating shortest paths to the sample group. As an intermediate result, a candidate actor set is determined from the estimated closeness centrality measures, and the final ranks are made by applying the pure closeness centrality algorithm [1] to the candidates. In the remainder of this section, we design and implement the procedural framework based upon the estimation-driven RankCCWSSN algorithm.
3.2 Design of the Estimated RankCCWSSN Ranking Algorithm
The traditional ranking algorithms calculate the closeness centralities on the whole input network, which causes the scalability problem. By contrast, the proposed ranking algorithm can effectively reduce the computation time by obtaining estimated closeness centralities from the distances between the sampled actors and all the actors in the original large-scale workflow-supported social network.
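To make the cost of the whole-network baseline concrete, the following is a minimal sketch of an exact closeness centrality computation. It assumes the common definition (n - 1) divided by the sum of shortest distances, a connected undirected network stored as a hypothetical adjacency dictionary `adj`, and Dijkstra's algorithm [29] for the shortest paths; the function names are illustrative and are not taken from the authors' system.

```python
import heapq


def dijkstra(adj, source):
    """Shortest distances from source; adj maps actor -> {neighbor: cost}."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in adj[u].items():
            if d + w < dist.get(v, float("inf")):
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return dist


def exact_closeness(adj):
    """Closeness of every actor: one shortest-path run per actor (n runs in total)."""
    n = len(adj)
    scores = {}
    for actor in adj:
        total = sum(dijkstra(adj, actor).values())
        scores[actor] = (n - 1) / total if total > 0 else 0.0
    return scores


# Hypothetical four-actor chain a - b - c - d with unit tie costs.
chain = {
    "a": {"b": 1},
    "b": {"a": 1, "c": 1},
    "c": {"b": 1, "d": 1},
    "d": {"c": 1},
}
scores = exact_closeness(chain)
```

The outer loop runs one shortest-path computation per actor, which is the part that the estimation-driven approach replaces with a smaller number of runs from sampled actors.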
Fig. 2 shows the estimated RankCCWSSN ranking algorithm, which stands for ranking of closeness centrality for large-scale workflow-supported social networks. Suppose that we apply the algorithm to a large-scale workflow-supported social network with 2500 actors. The algorithm takes a workflow-supported social network G(A, E) and a ranking size k as its input. The network G consists of a set of actors (nodes) A and a set of edges E. In the workflow domain, an actor is a human worker who is involved in certain workflow processes. Therefore, the set of edges in a workflow-supported social network is constructed by analyzing the work-transfer relationships between actors within a workflow model [3]. First, our algorithm randomly selects a sample actor group As from the actor set A for calculating the estimated closeness centralities Ĉ. We note that estimating the closeness centrality measures saves considerable computation time in comparison with the pure algorithm [1], because our algorithm only needs to calculate the shortest distances from the sampled actors. As a result, a candidate actor group Ac is determined by selecting the top k actors based on the estimated closeness centrality measures Ĉ. Lastly, the algorithm measures the exact closeness centralities of the candidate group Ac by applying the pure algorithm and produces the exact top-k ranking list. Additionally, the equation for the shortest distance between two actors in the RankCCWSSN algorithm is as follows:
Fig. 2. The estimation-driven ranking algorithm for closeness centrality measures
where h denotes the intermediary actors on paths between actors φi and φj, and x is the value of an entry in the sociomatrix [10] representing a workflow-supported social network.
Fig. 3 shows the RAND algorithm [25], which calculates the estimated centrality measures based on the sum of the shortest distances from each sample vertex to all other vertices in a graph. The fundamental idea of the algorithm is to overcome the inherent computational cost of solving the SSSP (Single Source Shortest Path) problem [27] from every vertex. Accordingly, the RankCCWSSN algorithm makes use of a variant of the RAND algorithm to benefit from estimation.
Fig. 3. The RAND algorithm for estimating centrality
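A form of the shortest-distance equation that is consistent with the symbols defined here can be written as below. This is a sketch under the assumption that the distance is the minimum over paths of the summed sociomatrix entries, following the general shortest-path definition used for weighted networks [28]; the notation d(φi, φj) is illustrative.

```latex
d(\varphi_i, \varphi_j) = \min_{h}\left( x_{ih} + \cdots + x_{hj} \right)
```

For a binary sociomatrix every entry along an existing tie equals 1, so under this form the distance reduces to the minimum number of work-transfer hops between the two actors.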
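The two-phase flow described in this subsection (a sampled estimate, then exact re-ranking of the top-k candidates) can be sketched as follows. This is an illustrative sketch, not the authors' implementation: the estimator follows the RAND-style scaling n / (s(n - 1)) of summed distances to s sampled actors [25], the network is assumed to be connected and undirected, and all function and parameter names (`rank_top_k`, `sample_size`, `seed`) are hypothetical.

```python
import heapq
import random


def shortest_distances(adj, source):
    """Dijkstra: distances from source; adj maps actor -> {neighbor: cost}."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in adj[u].items():
            if d + w < dist.get(v, float("inf")):
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return dist


def estimate_closeness(adj, sample):
    """Phase 1: estimate closeness of all actors from distances to the sample only."""
    n, s = len(adj), len(sample)
    totals = {actor: 0.0 for actor in adj}
    for src in sample:
        for actor, d in shortest_distances(adj, src).items():
            totals[actor] += d
    scale = n / (s * (n - 1))
    # A total of 0 can only occur for a sample of size 1 (the sampled actor itself).
    return {a: 1.0 / (scale * t) if t > 0 else float("inf") for a, t in totals.items()}


def exact_closeness_of(adj, actor):
    """Exact closeness of one actor: (n - 1) / sum of shortest distances."""
    return (len(adj) - 1) / sum(shortest_distances(adj, actor).values())


def rank_top_k(adj, k, sample_size, seed=0):
    """Phase 2: pick the k best estimates as candidates, then rank them exactly."""
    rng = random.Random(seed)
    sample = rng.sample(sorted(adj), sample_size)
    estimated = estimate_closeness(adj, sample)
    candidates = sorted(adj, key=lambda a: estimated[a], reverse=True)[:k]
    exact = {a: exact_closeness_of(adj, a) for a in candidates}
    return sorted(exact.items(), key=lambda item: item[1], reverse=True)


# Hypothetical star network: one hub actor connected to five others.
star = {"hub": {name: 1 for name in "abcde"}}
for name in "abcde":
    star[name] = {"hub": 1}
ranking = rank_top_k(star, k=3, sample_size=6)
```

With a small sample, an actor that happens to be sampled contributes a zero distance to itself, so its estimate is biased upward; re-ranking the candidates exactly limits the effect of this bias on the final order, but the sample size still governs which actors reach the candidate set.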
3.3 Implementation of the Proposed Algorithm
We implement the estimated ranking algorithm designed in the previous subsection and apply it to an imaginary large-scale workflow-supported social network to evaluate and verify its functionality. Note that, instead of actually modeling a large-scale workflow model, we use an imaginary large-scale workflow model enlarged by applying a simple matrix multiplication method. The ranking result of the algorithm on the example model is presented in Table 1. We assume that the number of actors is 1000, the number of sampled actors is 500, and the total ranking size is 500. The column “Ranked group” denotes a group of actors whose closeness centrality measures all have the same value, which results from the simple matrix multiplication method (from 10×10 to 1000×1000).
Table 1. The ranking result of the proposed algorithm
4. Performance Analysis
In this section, we conduct a performance analysis to show that our algorithm is much more efficient than the traditional pure algorithm [1] in terms of computation time. First, we compare the computation time of the pure closeness centrality algorithm with that of the proposed algorithm. Second, we examine the computation time of the proposed algorithm by adjusting two parameters: the number of actors (nodes) and the sampling size.
4.1 Comparing the Performance of the Target Algorithms
The computation times of the RankCCWSSN algorithm and the pure algorithm are depicted in Fig. 4, where the horizontal axis represents the number of actors and the vertical axis represents the average computation time. The sampling rate and the ranking size of the proposed algorithm are fixed at 50% and 10, respectively. The comparison between the proposed algorithm (blue vertical bars) and the pure algorithm (gray vertical bars) shows that the proposed algorithm was 40% to 50% faster than the pure algorithm in all cases. In particular, the proposed algorithm is 50.4% faster than the pure algorithm in the case where the number of actors is 2500. To sum up, we found that the difference in computation time between the proposed algorithm and the pure algorithm becomes significant as the size of a workflow-supported social network increases, and we conclude that the primary reason for this improvement is the estimation-driven approach, which combines the approximation algorithm with the pure algorithm.
Fig. 4. Comparison of the computation times of the two algorithms
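A rough operation count helps interpret the gap shown in Fig. 4. The following estimate assumes, as a simplification not stated in the text, that running time is dominated by the number of single-source shortest-path computations; n denotes the number of actors, s the sample size, and k the ranking size.

```latex
\begin{aligned}
T_{\text{pure}} &\propto n, \qquad T_{\text{est}} \propto s + k, \\
\frac{T_{\text{est}}}{T_{\text{pure}}} &\approx \frac{s + k}{n}
  = \frac{1250 + 10}{2500} = 0.504
  \quad (n = 2500,\ s = 0.5\,n,\ k = 10).
\end{aligned}
```

Under this assumption the proposed algorithm performs about half of the shortest-path work of the pure algorithm, which is of the same order as the reported 40% to 50% improvement.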
4.2 Performance Analysis of the Proposed Algorithm with the Sampling Size
We confirmed that the important advantage of the proposed algorithm is its computation time efficiency in comparison with the pure algorithm. In the second experiment, we analyze the performance of the proposed algorithm by adjusting two parameters: the number of actors and the sampling size. The sampling size is an especially crucial parameter because it has a great effect on both the performance and the accuracy of the ranks. A large sampling size requires a heavy computation load for estimating closeness centrality but yields a higher ranking accuracy, whereas a small sampling size reduces the computation cost at the price of a lower ranking accuracy. Therefore, we focus on how the performance varies with the sampling size, which is the key factor determining the trade-off between performance and accuracy. The relationship between the sampling size and the computation time is depicted in Fig. 5. The result shows that the computation time increases in proportion to the sampling size in all cases, regardless of the number of actors. On that basis, we conclude that the proposed algorithm is capable of effectively addressing the scalability problem, whereas the computation time of the pure algorithm increases exponentially.
Fig. 5. The relationship between the computation time and the sampling size
5. Ranking on Weighted Workflow-supported Social Networks
In order to supplement the contribution of the paper, we present an operational example of the RankCCWSSN algorithm to investigate its applicability to weighted workflow-supported social networks. Essentially, a workflow-supported social network is mathematically represented by a sociomatrix, which can be refined into two groups of subtypes: the binary directed/undirected sociomatrix and the weighted directed/undirected sociomatrix. When a workflow model is transformed into a directed/undirected sociomatrix, the term binary implies the most basic measurement, the presence or absence of a tie, which is a dichotomy indicated by the binary value of 1 or 0, respectively. A sociomatrix may also include valued elements that reflect the intensity of relationships or ties, such as frequency, tie strength, or magnitude of association. The entries of such a sociomatrix, which is called a weighted sociomatrix, can therefore vary from 0 to the maximum level of dyadic interaction. According to this definition, the value of an entry in a weighted sociomatrix indicates the frequency of task transfers between two actors in enacting workflow models. Therefore, analyzing the weighted workflow-supported social network will bring more valuable knowledge for understanding the organizational structure and its context.
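The difference between the two sociomatrix subtypes can be illustrated with a minimal sketch that builds both from a list of observed work transfers. The input format (a list of actor pairs) and the function name `build_sociomatrices` are assumptions made for illustration; the undirected case is shown, so each transfer is counted symmetrically.

```python
def build_sociomatrices(actors, transfers):
    """Return (binary, weighted) undirected sociomatrices as nested lists.

    actors    -- ordered list of actor names (row and column order)
    transfers -- iterable of (from_actor, to_actor) work-transfer records
    """
    index = {actor: i for i, actor in enumerate(actors)}
    n = len(actors)
    weighted = [[0] * n for _ in range(n)]
    for src, dst in transfers:
        i, j = index[src], index[dst]
        if i == j:
            continue  # ignore self-transfers in this sketch
        weighted[i][j] += 1  # frequency of task transfers
        weighted[j][i] += 1  # symmetric entry for the undirected case
    # Binary entries record only the presence (1) or absence (0) of a tie.
    binary = [[1 if value > 0 else 0 for value in row] for row in weighted]
    return binary, weighted


# Hypothetical transfer log among three actors.
actors = ["φ1", "φ2", "φ3"]
transfers = [("φ1", "φ2"), ("φ1", "φ2"), ("φ2", "φ3")]
binary, weighted = build_sociomatrices(actors, transfers)
```

In this example the tie between φ1 and φ2 appears as 1 in the binary sociomatrix and as 2 in the weighted one, which is the additional information that the weighted ranking in this section relies on.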
In general, solving the shortest path problem [29] in a weighted network is not different from solving the same problem in a binary social network. For instance, [30,31] defined the shortest path between two nodes as the least costly path and suggested that costs be based on tie weights. However, in the case of workflow-supported social networks, tie weights should be interpreted from a different point of view. For instance, a tie weight given by the frequency of work transfers between two actors implies the strength of their association, with a positive meaning. Therefore, we simply use the tie weight as the primary criterion for deciding the ranking of actors that have the same closeness centrality value. As shown in the example in Fig. 6, there is a tie group in which four actors φ2, φ4, φ6, and φ7 have the same closeness centrality measure (shown at the bottom of Fig. 6). However, all the members of the tie group are ranked uniquely by using the marginal values Sw, each of which is the sum of the tie weights of an actor in the weighted sociomatrix Xw. In conclusion, this section showed that the ranking result from weighted networks may be more valuable than that from binary networks, which contain only information about the presence of a tie. As future work, the authors’ ranking approach can be complemented by devising new measurements for weighted workflow-supported social networks.
Fig. 6. Example of a ranking result on a weighted workflow-supported social network
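The tie-breaking rule described in this section can be sketched as a two-key sort: closeness centrality first, then the marginal sum Sw of the actor's row in the weighted sociomatrix. The sketch assumes that a larger marginal sum ranks higher, in line with the positive interpretation of tie weights; the data values and the name `rank_with_tie_break` are hypothetical and are not taken from Fig. 6.

```python
def rank_with_tie_break(actors, closeness, weighted):
    """Rank actors by closeness (descending), breaking ties by marginal tie weight.

    actors    -- ordered list of actor names matching the rows of weighted
    closeness -- dict mapping actor -> closeness centrality measure
    weighted  -- weighted sociomatrix as nested lists
    """
    marginals = {actor: sum(weighted[i]) for i, actor in enumerate(actors)}
    ordered = sorted(actors, key=lambda a: (-closeness[a], -marginals[a]))
    return [(rank, a, closeness[a], marginals[a])
            for rank, a in enumerate(ordered, start=1)]


# Hypothetical data: p and q tie on closeness, q has the larger marginal sum.
names = ["p", "q", "r"]
closeness = {"p": 0.75, "q": 0.75, "r": 0.5}
weighted_matrix = [
    [0, 2, 1],  # row sum 3
    [2, 0, 3],  # row sum 5
    [1, 3, 0],  # row sum 4
]
ranked = rank_with_tie_break(names, closeness, weighted_matrix)
```

Actors whose closeness and marginal sums both coincide keep their input order here, so a further criterion would be needed to rank them uniquely.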
6. Conclusion
In this paper, we have proposed an estimated closeness centrality ranking algorithm to be applied to large-scale workflow-supported social networks. The traditional pure algorithm exhibits the scalability problem: its computation time increases exponentially with the network size. We showed that this scalability problem also needs to be solved for large-scale workflow-supported social networks. To solve it, we have proposed and implemented the RankCCWSSN algorithm, which is based on the estimation-driven approach. In conclusion, our ranking algorithm performs considerably better (about 50% improvement in computation time) than the traditional pure algorithm. Additionally, we showed that our algorithm is able to effectively address the scalability problem, because its computation time increased proportionally in a series of experiments in which both the network size (number of actors) and the sampling size were increased. As future work, we need to devise a new measurement approach for obtaining reasonable ranking results for weighted workflow-supported social networks.
References
- D. Knoke, S. Yang, Social Network Analysis - Edition, Series: Quantitative Applications in the Social Sciences, SAGE Publications, 2008.
- W. M. P. van der Aalst, H. A. Reijers, M. Song, “Discovering social networks from event logs,” Computer Supported Cooperative Work, vol. 14, no. 6, pp. 549-593, 2005. https://doi.org/10.1007/s10606-005-9005-9
- J. Song, et al., "A framework: Workflow-based social network discovery and analysis," in Proc. of the 13th International Conference on Computational Science and Engineering, pp. 421-426, 2010.
- J. Kim, H. Ahn, K. Kim, “Performance analysis of an estimated closeness centrality ranking algorithm in large-scale workflow-supported social networks,” Journal of Internet Computing and Services, vol. 16, no. 3, pp. 71-77, 2015. https://doi.org/10.7472/jksii.2015.16.3.71
- S. Park, K. P. Kim, "A closeness centrality analysis algorithm for workflow-supported social networks," in Proc. of the 15th International Conference on Advanced Communication Technology, pp. 158-161, 2013.
- M.-S. Kim, K.-H. Kim, "A disconnectedness determination algorithm on workflow-supported enterprise social networks," Journal of Internet Computing and Services, vol. 16, no. 5, pp. 67-73, 2015. https://doi.org/10.7472/jksii.2015.16.5.67
- M. Kim, H. Kim, K. P. Kim, "Disconnectedness on workflow-supported organizational social networks," in Proc. of the 6th International Conference on Internet, pp. 109-114, 2014.
- J. Kim, et al., "A ranking algorithm of closeness centralities for large scale workflow-supported social networks," in Proc. of KSII Domestic Conference on Internet Information, pp. 55-56, 2014.
- J. Kim, H. Ahn, K. P. Kim, "Performance analysis of an estimated closeness centrality ranking algorithm in large-scale workflow-supported social networks," in Proc. of the 10th Asia-Pacific Conference on Information Science and Technology, pp. 75-80, 2015.
- K. Kim, “A workflow-based social network intelligence discovery algorithm,” Journal of Internet Computing and Services, vol. 13, no. 2, pp. 73-86, 2012. https://doi.org/10.7472/jksii.2012.13.2.73
- K. Kim, “A workflow-based affiliation network knowledge discovery algorithm,” Journal of Internet Computing and Services, vol. 13, no. 2, pp. 109-118, 2012. https://doi.org/10.7472/jksii.2012.13.2.109
- H. Jeong, H. Kim, K. P. Kim, "Betweenness centralization analysis formalisms on workflow-supported org-social networks," in Proc. of the 16th International Conference on Advanced Communications Technology, pp. 1173-1177, 2014.
- M.-J. Kim, H. Ahn, M.-J. Park, “A theoretical framework for closeness centralization measurements in a workflow-supported organization,” KSII Transactions on Internet and Information Systems, vol. 9, no. 9, pp. 3611-3634, 2015. https://doi.org/10.3837/tiis.2015.09.018
- M.-J. Kim, H. Ahn, M. Park, “A GraphML-based visualization framework for workflow-performer's closeness centrality measurements,” KSII Transactions on Internet and Information Systems, vol. 9, no. 8, pp. 3216-3230, 2015. https://doi.org/10.3837/tiis.2015.08.028
- H. Kim, H. Ahn, K. P. Kim, “Discovering workflow performer-application affiliation knowledge and its implications,” ICIC Express Letters, vol. 9, no. 4, pp. 1049-1056, 2015.
- H. Ahn, H. Kim, K. P. Kim, “Visualizing workflow-supported social networks and their degree centrality measures,” International Journal of Advances in Soft Computing and Its Application, vol. 6, no. 3, pp. 82-93, 2014.
- H. Ahn, C. Park, K. Kim, "A BPM activity-performer correspondence analysis method," Journal of Internet Computing and Services, vol. 14, no. 4, pp. 64-72, 2013.
- H. Ahn, K. Kim, "An activity-performer bipartite matrix generation algorithm for analyzing workflow-supported human-resource affiliations," Journal of Internet Computing and Services, vol. 14, no. 2, pp. 25-34, 2013.
- H. Kim, H. Ahn, K. P. Kim, “Modeling, discovering, and visualizing workflow performer-role affiliation networking knowledge,” KSII Transactions on Internet and Information Systems, vol. 8, no. 2, pp. 134-151, 2014.
- M. Jeon, K. P. Kim, “Workload-centrality analysis and visualization of workflow-supported social networks,” ICIC Express Letters, vol. 7, no. 3(B), pp. 1049-1054, 2013.
- D. Lee, et al., "An approximation approach to rank closeness centrality measures in a large-scale workflow affiliation network," in Proc. of the 10th Asia Pacific International Conference on Information Science and Technology, pp. 75-80, 2015.
- A. Graves, A. Sibel, H. Jim, "A method to rank nodes in an RDF graph," in Proc. of International Semantic Web Conference (Posters & Demos), vol. 401, 2008.
- W. de Nooy, M. Andrej, B. Vladimir, "Exploratory Social Network Analysis with Pajek," Cambridge University Press, vol. 27, 2011.
- P. DeScioli, et al., “Best friends alliances, friend ranking, and the myspace social network,” Perspectives on Psychological Science, vol. 6, no. 1, pp. 6-8, 2011. https://doi.org/10.1177/1745691610393979
- D. Eppstein, J. Wang, “Fast approximation of centrality,” Journal of Graph Algorithms and Applications, vol. 8, no. 1, pp. 39-45, 2004. https://doi.org/10.7155/jgaa.00081
- K. Okamoto, W. Chen, X.-Y. Li, "Ranking of closeness centrality for large-scale social networks," Lecture Notes in Computer Science of Frontiers in Algorithmics, vol. 5059, pp. 186-195, 2008.
- Shortest path problem.
- T. Opsahl, A. Filip, S. John, “Node centrality in weighted networks: Generalizing degree and shortest paths,” Social Networks, vol. 32, no. 3, pp. 245-251, 2010. https://doi.org/10.1016/j.socnet.2010.03.006
- E. W. Dijkstra, “A note on two problems in connexion with graphs,” Numerische Mathematik, vol. 1, no. 1, pp. 269-271, 1959. https://doi.org/10.1007/BF01386390
- M. E. J. Newman, “Scientific collaboration networks. Ⅱ. Shortest paths, weighted networks, and centrality,” Physical Review E, vol. 64, 016132, 2001. https://doi.org/10.1103/PhysRevE.64.016132
- U. Brandes, “A faster algorithm for betweenness centrality,” Journal of Mathematical Sociology, vol. 25, pp. 163-177, 2001. https://doi.org/10.1080/0022250X.2001.9990249