# 1. Introduction

Vehicular networks are one of the most promising extensions in future wireless and mobile communication systems, aiming to provide efficient V2V (Vehicle-to-Vehicle) and V2I (Vehicle-to-Infrastructure) communications for road safety, traffic efficiency, and infotainment applications [1][2]. Vehicles can use direct, multi-hop, or cluster-based transmission strategies to exchange data with other vehicles or the network infrastructure access points [3].

In a cluster-based transmission strategy, vehicles are organized into multiple clusters. A representative (i.e., a cluster head) is selected for each cluster. The cluster head receives data packets from its cluster members and then relays the packets outside (and vice versa) [3]. For V2I communication, the clustering strategy can efficiently decrease request/data congestion at the access points in the network infrastructure [4]. For V2V communication, the clustering strategy can provide a layered architecture to enhance the network scalability [5][6].

Clustering for MANETs (Mobile Ad-hoc Networks) has attracted considerable research interest [7]. However, clustering schemes designed for traditional MANETs may not be suitable for VANETs (Vehicular Ad-hoc Networks) because of VANETs’ specific requirements [6]. Notable VANET features that should be considered include high mobility, variable node density, and temporal and spatial dependency in mobility metrics [1][6]. Constructing and maintaining stable clusters in high-mobility environments is crucial for communication continuity, which makes clustering an important aspect of a vehicular network.

The motivation of this paper originates from the following two aspects. First, most existing schemes focus on 1-hop clustering; however, d-hop clustering is capable of providing improved and reliable performance for VANETs with better cluster stability and lower cluster dynamics [6]. Second, mobility-based cluster head selection methods are considered crucial for clustering in VANETs [6] because of their resilience in handling the nodes’ mobility variation [8].

Based on the above observations, a d-hop dominating set based clustering algorithm, DWCM (Distributed and Weighted Clustering based on Mobility Metrics), is proposed in this paper. The goal of DWCM is to provide an optimal clustering scheme that adapts to the group mobility of vehicles and offers enhanced stability. To capture group mobility, each vehicle is assigned a priority that describes the cluster relationship between this vehicle and its neighbors.

Based on the d-hop dominating set problem in graph theory, a distributed approach for cluster formation and cluster head selection is designed, in which vehicles in the d-hop dominating set are selected as the cluster head nodes. In addition, cluster maintenance handles the cluster structure changes caused by node mobility, thereby maintaining cluster stability without incurring excessive overhead. Simulations are conducted in the NS-2 and VanetMobiSim integrated environment. DWCM presents high stability in cluster head lifetime and re-affiliation times, as well as high scalability in the number of clusters.

The rest of this paper is organized as follows. Related work is reviewed in Section 2. Section 3 describes the network model with assumptions. Section 4 introduces the proposed DWCM clustering approach in detail. Section 5 illustrates the performance using simulation results, and is followed by conclusions in Section 6.

# 2. Related Work

Clustering schemes in traditional MANETs might not be suitable for VANETs because of VANETs’ unique characteristics [6]. The following features should be considered when designing clustering schemes [1][6]:

(1) Vehicular networks exhibit high mobility, which causes frequent topology changes and makes stability the primary design objective in clustering.

(2) Traffic in VANETs presents unique temporal and spatial distributions, rather than the random distributions often used in MANETs. Different traffic states result in a variable vehicle density, e.g., a sparse distribution of vehicles in suburban areas and a dense distribution in congested areas.

(3) Adjacent vehicles show spatial dependency in their mobility metrics, i.e., consistency in the moving direction and similarity in velocity and acceleration. This is because a vehicle’s movement pattern can be influenced by, and thus correlated with, vehicles in its neighborhood.

In recent years, more effort has been applied toward efficient clustering in vehicular networks. Based on the distance in hops from an ordinary cluster member node to its cluster head, clustering approaches can be classified as 1-hop clustering or d-hop clustering. Most existing approaches are designed to form 1-hop clusters [9–11]. Only a few considered d-hop clustering algorithms [12–16]. In fact, multi-hop communication is undoubtedly a prevalent scenario in vehicular networks, which makes d-hop clustering a natural choice, with better cluster stability and low cluster dynamics [6].

Cluster size is another factor that should be considered. In [9], the authors defined the cluster size measured in geographical distance between a cluster member node and the cluster head node for better radio efficiency and throughput. In d-hop clustering, the cluster size can be defined in hops from a cluster member node to its cluster head. Such a cluster size should be determined by considering two factors. On one hand, a larger cluster size means fewer clusters, which can reduce network maintenance and control overhead. On the other hand, a network diameter limitation in hops [17] should be considered for efficient multi-hop communication. In addition, clustering that is adaptive to node mobility patterns is believed to have better stability.

Several d-hop clustering schemes in MANETs have also been proposed [12][18–21]. However, the schemes in [18][19] did not consider node mobility patterns. In [20], a load-balancing clustering scheme that focuses on cluster maintenance was proposed, with the goal of limiting the number of mobile nodes in each cluster so the clusters have similar sizes. Although [12][21] had mobility-aware features, the special mobility characteristics in VANETs were not considered. In VANETs—aside from simple mobility metrics, such as moving direction, velocity, and acceleration—more abstract metrics, e.g., mobility dependency, should be defined for an efficient clustering approach [22]. However, existing works lack conformance with the inherent nature of vehicular networks, such as multi-hop communications and mobility dependency between vehicles. Designing a d-hop clustering algorithm that explores the natural group mobility patterns presented by spatial dependency is a meaningful objective.

Existing schemes use different methods for selecting cluster heads [12][23–28]. Methods based on node characteristics use particular node features for cluster head selection, e.g., node ID [23], node degree [24], or node type (using a bus node as the cluster head). In mobility-based methods, mobility and position information is used for cluster head selection. In prediction-based methods, cluster head selection is often performed based on a prediction of link duration.

Among these different schemes, mobility-based methods are considered key for clustering in VANETs [6]. For example, relative mobility deduced from the received signal strength [12][25], vehicle velocity and position acquired from GPS (Global Positioning System) devices [26][27], and destination information acquired from GPS devices [28] can be used in cluster head selection. Mobility-based methods are considered to be most appropriate for VANETs because of their resilience in handling node mobility variation [8].

# 3. Network Model and Assumptions

In this paper, the topology of a vehicular network is modeled as an undirected graph G = (V,E), where V is the set of vertexes, and each vertex represents a vehicle in the network. If two vehicles are in direct transmission range of each other, there is an edge between them. Thus, E ⊆ V × V is the set of edges, and each edge represents a link between vehicles. DWCM depends on the d-hop dominating set to form d-hop clusters. Each vertex is either in the dominating set, or has a path of at most d hops to a vertex in the dominating set. The vertexes in the d-hop dominating set act as cluster heads, whereas the others behave as cluster member nodes.
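The graph model above can be sketched in a few lines. This is only an illustration, assuming that every vehicle has the same circular transmission range and that positions are known as planar coordinates; the function name `build_topology` and the sample positions are hypothetical and do not come from the paper.

```python
from itertools import combinations
from math import hypot


def build_topology(positions, tx_range):
    """Return an adjacency dict for the undirected graph G = (V, E).

    positions: dict mapping vehicle id -> (x, y) in metres (hypothetical input).
    tx_range: transmission range in metres, assumed identical for all vehicles.
    """
    adjacency = {v: set() for v in positions}
    for u, v in combinations(positions, 2):
        (ux, uy), (vx, vy) = positions[u], positions[v]
        # An edge exists when the two vehicles are within direct range.
        if hypot(ux - vx, uy - vy) <= tx_range:
            adjacency[u].add(v)
            adjacency[v].add(u)
    return adjacency


# Four vehicles on a straight road; "d" is out of range of everyone.
positions = {"a": (0, 0), "b": (150, 0), "c": (300, 0), "d": (900, 0)}
graph = build_topology(positions, tx_range=200)
print(graph)
```

Storing the neighbors of each vertex as a set keeps the graph symmetric by construction, which matches the undirected-link assumption.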

In this study, the application of the graph theory DS (Dominating Set) problem is motivated by [29–31], where the DS problem provides a promising solution for topology control in MANETs for constructing an efficient network backbone. However, these approaches cannot be applied directly to clustering in vehicular networks because of their different requirements and assumptions.

Topology control in MANETs has unique requirements. For example, the dominating set is often required to be connected and have minimum nodes in the backbone or the shortest path between dominators; thus, the minimum or approximate minimal CDS (Connected Dominating Set) problem is a good choice for describing such approaches. If d-hop clusters are required, the problem evolves to d-hop CDS. In addition, when expecting load balancing or path redundancy features, a k-DS (k-Dominating Set) should be found, where every vertex not in DS has at least k neighboring vertexes in DS.
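As a small illustration of the k-DS condition stated above, the following sketch checks whether a candidate set is a k-dominating set of a graph given as an adjacency dict. The function name `is_k_dominating` and the example graph are assumptions made for illustration only.

```python
def is_k_dominating(adjacency, candidate, k):
    """True if every vertex outside `candidate` has at least k neighbours inside it."""
    candidate = set(candidate)
    return all(
        len(adjacency[v] & candidate) >= k
        for v in adjacency
        if v not in candidate
    )


# A 4-cycle 0-1-2-3-0: the opposite corners {0, 2} dominate vertices 1 and 3 twice.
ring = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}
print(is_k_dominating(ring, {0, 2}, k=2))  # True
print(is_k_dominating(ring, {0}, k=1))     # False: vertex 2 has no neighbour in {0}
```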

With regard to optimization objectives, some important factors in MANETs are unsuitable for vehicular networks. For example, energy efficiency is a critical issue for MANETs but meaningless for vehicular networks because the vehicles’ batteries recharge during their journey. In addition, vehicle mobility characteristics, along with location information conveniently acquired from GPS devices, should be considered for cluster head selection.

For convenience, the notations used in the rest of this paper are summarized in Table 1.

**Table 1.** Notations

# 4. Proposed DWCM Clustering Algorithm

## 4.1 d-Hop Dominating Set

As mentioned in Section 3, the network topology is modeled as a graph G = (V,E), and the d-hop dominating set problem in graph theory is adopted here as a basic approach. Formally, the problem can be defined as:

Definition 1 (d-hop neighborhood): The d-hop neighborhood of vertex i is the set of all vertexes within d (d > 1) hops from vertex i.

Definition 2 (d-hop dominating set): For a graph G, a set of vertexes S is called a d-hop dominating set if every vertex is either in S or in the d-hop neighborhood of a vertex in S.
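Definitions 1 and 2 can be made concrete with a breadth-first search. The sketch below is illustrative only: the function names are hypothetical, the graph is an adjacency dict, and the neighborhood is taken to exclude the vertex itself.

```python
from collections import deque


def d_hop_neighborhood(adjacency, source, d):
    """Vertices reachable from `source` in 1..d hops (source itself excluded)."""
    hops = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if hops[u] == d:
            continue  # do not expand beyond d hops
        for v in adjacency[u]:
            if v not in hops:
                hops[v] = hops[u] + 1
                queue.append(v)
    del hops[source]
    return set(hops)


def is_d_hop_dominating(adjacency, candidate, d):
    """True if every vertex is in `candidate` or within d hops of a member of it."""
    covered = set(candidate)
    for s in candidate:
        covered |= d_hop_neighborhood(adjacency, s, d)
    return covered == set(adjacency)


# A chain of seven vehicles 0-1-2-3-4-5-6.
chain = {i: {j for j in (i - 1, i + 1) if 0 <= j <= 6} for i in range(7)}
print(sorted(d_hop_neighborhood(chain, 3, 2)))  # [1, 2, 4, 5]
print(is_d_hop_dominating(chain, {1, 5}, 2))    # True
print(is_d_hop_dominating(chain, {0}, 2))       # False
```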

Previous studies on MANET topology control have proven that finding the minimum d-hop dominating set in such a network topology is an NP-complete problem [29–30,32]. Therefore, a suboptimal solution, namely the calculation of a minimal d-hop dominating set, should be used to approximate the minimum dominating set. In addition, because every node can only acquire local topology information in vehicular networks, a distributed solution is a more natural choice than a centralized one. Based on the above observations, a distributed solution for a d-hop dominating set, based on local topology information, is proposed in this paper. The detailed algorithm is introduced in Section 4.2.

In addition to finding the d-hop dominating set, another important problem is defining the principle in dominating node selection. Each node is assigned a priority, used to evaluate the suitability of a node to act as a cluster head. Depending on different optimization objectives and network environments, the factors involved in the priority calculation could be node identification [19], node degree, residual battery power [32], travel time, node speed or speed deviation [9], and even the integrated result of multiple factors based on a weighted sum or other methods.

As mentioned before, stability is the primary objective in clustering research. Such stability is the result of either low mobility or group mobility (a node and all its neighboring nodes moving in the same direction at approximately the same speed) [32]. For vehicular networks with high mobility, the priority should therefore describe the group mobility relationship between a node and its surrounding neighbors. For this reason, the node priority is defined as a cluster relationship in this paper. A detailed introduction is given in Section 4.2.

## 4.2 Cluster Formation and Cluster Head Selection

Cluster formation is the procedure for finding the d-hop dominating set, where the nodes in the dominating set will be selected as the cluster heads. The rules for cluster head selection are introduced first with a correctness proof. The node priority definition is presented thereafter, followed by a distributed approach for cluster formation and cluster head selection.

### 4.2.1 Rules for cluster head selection

In the proposed DWCM clustering approach, every node communicates with its neighbors to obtain some necessary information. Using this information, a priority value is calculated for each node. Thereafter, a node decides to become a cluster head if either of the following criteria is satisfied:

Rule 1: The node has the highest priority in its d-hop neighborhood;

Rule 2: The node has the highest priority in the d-hop neighborhood of another node in its d-hop neighborhood.

Each node uses the above rules to determine its role in clustering. If the node satisfies either rule, it behaves as the cluster head node. Otherwise, it acts as an ordinary cluster member node. After all nodes perform such determination, the d-hop dominating set is found based on G = (V,E) derived from the network topology. The correctness of this approach can be proven as follows.

Theorem 1. The set of cluster heads selected by the DWCM algorithm constructs a d-hop dominating set.

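Proof: By the definition of a d-hop dominating set, a node is either a dominator itself (i.e., in the dominating set) or a d-hop neighbor of a dominating node. Writing $DS_d$ for the d-hop dominating set and $ND_j$ for the d-hop neighborhood of node j, every node i therefore satisfies either

Condition 1: $i \in DS_d$

or

Condition 2: $\exists\, j \in DS_d$ such that $i \in ND_j$.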

According to the two rules used by a node to determine whether it is in the d-hop dominating set, the following two cases should be considered:

Case 1: When a node i has the highest priority in its d-hop neighborhood (i.e., i satisfies Rule 1), it becomes a cluster head and acts as a dominator, i.e., $i \in DS_d$. Node i therefore belongs to the d-hop dominating set and satisfies Condition 1. The nodes in the d-hop neighborhood of node i, $ND_i$, satisfy Condition 2.

Case 2: Any other node j does not have the highest priority in its own d-hop neighborhood. The node with the highest priority in the d-hop neighborhood of j becomes a cluster head by Rule 2. The selected cluster head belongs to the d-hop dominating set and satisfies Condition 1. Because links are undirected, node j is in the d-hop neighborhood of the selected cluster head and thus satisfies Condition 2. Every node therefore satisfies Condition 1 or Condition 2, so the set of cluster heads found by the cluster formation procedure is a d-hop dominating set. □
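Rules 1 and 2 and the property stated in Theorem 1 can be illustrated with a centralized sketch. It is not the distributed procedure of the paper: it assumes global knowledge of the graph, interprets each rule over the node together with its d-hop neighborhood, and breaks priority ties by node id. The function names and the sample priorities are hypothetical.

```python
from collections import deque


def closed_d_hop(adjacency, source, d):
    """`source` plus every vertex within d hops of it (breadth-first search)."""
    hops = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if hops[u] < d:
            for v in adjacency[u]:
                if v not in hops:
                    hops[v] = hops[u] + 1
                    queue.append(v)
    return set(hops)


def select_cluster_heads(adjacency, priority, d):
    """Apply Rule 1 and Rule 2; ties are broken by node id (an assumption)."""
    def rank(v):
        return (priority[v], v)

    heads = set()
    for u in adjacency:
        best = max(closed_d_hop(adjacency, u, d), key=rank)
        # best == u: u qualifies through Rule 1.
        # best != u: `best` qualifies through Rule 2, with u as the other node.
        heads.add(best)
    return heads


# A chain of seven vehicles with made-up priorities.
chain = {i: {j for j in (i - 1, i + 1) if 0 <= j <= 6} for i in range(7)}
priority = {0: 0.2, 1: 0.9, 2: 0.4, 3: 0.3, 4: 0.5, 5: 0.8, 6: 0.1}
heads = select_cluster_heads(chain, priority, d=2)
print(sorted(heads))  # [1, 5]

# Every node is a head or lies within d hops of one.
print(all(closed_d_hop(chain, u, 2) & heads for u in chain))  # True
```

Each node nominates exactly one head, which is why the resulting set dominates the graph regardless of how the priorities are distributed.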

### 4.2.2 Node priority definition

As mentioned in Section 4.1, when defining node priority, we hope to use the spatial dependency in the mobility metrics between adjacent vehicles. Such a priority describes the group mobility characteristics and natural cluster relationships, and thus becomes a reasonable choice for the criteria in cluster head selection.

Based on our previous work in [33], node priority is defined as the cluster relationship between this node and its neighbors, which is derived from some basic mobility metrics. Vehicles can conveniently obtain location information through GPS devices, and each location is represented by its Cartesian coordinates at every time interval $T_0$. The linear displacement of node i over $T_0$ can then be denoted as:

$$D_i = \sqrt{\Delta x_{T_0}^2 + \Delta y_{T_0}^2} \tag{1}$$

where $\Delta x_{T_0}$ and $\Delta y_{T_0}$ denote the change in the X and Y coordinates over $T_0$.

Thus, the average velocity of node i over $T_0$ can be calculated as:

$$v_i = \frac{D_i}{T_0} \tag{2}$$

Similarly, the average acceleration of node i over $T_0$ can be defined as:

$$a_i = \frac{\Delta v}{T_0} \tag{3}$$

where $\Delta v$ is the velocity variation over $T_0$.
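The three basic metrics can be computed directly from consecutive GPS fixes. The sketch below is a minimal illustration, assuming planar coordinates in metres and a fixed sampling interval; the function names and the sample fixes are hypothetical.

```python
from math import hypot


def displacement(p_prev, p_curr):
    """Linear displacement between two (x, y) fixes taken T0 apart."""
    dx = p_curr[0] - p_prev[0]
    dy = p_curr[1] - p_prev[1]
    return hypot(dx, dy)


def average_velocity(p_prev, p_curr, t0):
    """Average speed over one interval of length t0 seconds."""
    return displacement(p_prev, p_curr) / t0


def average_acceleration(v_prev, v_curr, t0):
    """Average acceleration from the velocity variation over t0 seconds."""
    return (v_curr - v_prev) / t0


# Three fixes of one vehicle, sampled every T0 = 2 s.
fixes = [(0.0, 0.0), (12.0, 16.0), (36.0, 48.0)]
t0 = 2.0
v1 = average_velocity(fixes[0], fixes[1], t0)
v2 = average_velocity(fixes[1], fixes[2], t0)
a = average_acceleration(v1, v2, t0)
print(v1, v2, a)
```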

Based on the above basic mobility metrics, the relative mobility metrics between two adjacent vehicles can be defined. For example, the relative velocity of nodes i and j is defined as:

where $v_{max}$ is the vehicles’ maximum speed or the upper speed limit of the road.

Similarly, the relative acceleration of nodes i and j is defined as:

where $a_{max}$ is the vehicles’ maximum acceleration.

Then, the SD (Spatial Dependency) of nodes i and j can be defined considering both relative velocity and acceleration, as shown in Equation (6). The inclusion of the relative acceleration is necessary for a better description of the future relative position of the two nodes.

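For a vehicle i with n neighbors, its CR (Cluster Relationship) can be defined as its average spatial dependency on all its n neighbors, and calculated as:

$$CR_i = \frac{1}{n} \sum_{j=1}^{n} SD_{i,j} \tag{7}$$

where $SD_{i,j}$ is the spatial dependency of nodes i and j given by Equation (6).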

Based on the above analysis, a node priority is defined using its CR, and then used in the clustering formation and cluster head selection algorithm. A detailed communication procedure for performing the distributed approach is introduced in Section 4.2.3.
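The averaging step that turns pairwise spatial dependency into a node priority can be sketched as follows. The SD values are supplied as plain numbers because the sketch does not attempt to reproduce Equation (6); treating SD as symmetric and giving an isolated vehicle a priority of 0 are both assumptions, and all names are hypothetical.

```python
def cluster_relationship(node, neighbors, sd):
    """Average spatial dependency of `node` on its neighbours.

    sd: dict keyed by frozenset({i, j}) holding a precomputed SD value
        for each pair (assumed symmetric for this illustration).
    """
    if not neighbors:
        return 0.0  # assumption: an isolated vehicle gets the lowest priority
    return sum(sd[frozenset((node, j))] for j in neighbors) / len(neighbors)


# Made-up SD values for three mutually adjacent vehicles.
sd = {
    frozenset(("a", "b")): 0.9,
    frozenset(("a", "c")): 0.7,
    frozenset(("b", "c")): 0.2,
}
neighbors = {"a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b"]}
priority = {v: cluster_relationship(v, nbrs, sd) for v, nbrs in neighbors.items()}
print(max(priority, key=priority.get))  # "a" moves most consistently with its neighbours
```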

### 4.2.3 Distributed approach

The distributed approach for cluster formation and cluster head selection is illustrated in Algorithm 1. For each node u, Algorithm 1 is executed to determine its state in the cluster, with ClusterMaxHop as the parameter to indicate the maximum cluster radius in hops. Each node is initialized to the Non_Clustered state. Each node maintains two lists for its d-hop neighbors: (1) mobilityInfoList, which contains the mobility information of its d-hop neighbors; (2) maxPriIdList, which records

Initially, mobilityInfoList and maxPriIdList are empty. Then, each node exchanges mobility information within its d-hop neighborhood to fill the mobilityInfoList, calculates its priority value accordingly, and exchanges priorities within its d-hop neighborhood. In the period of flooding MaxPriId, each node sends its

## 4.3 Clustering Maintenance

After cluster formation, the cluster structure may suffer frequent changes because of node mobility, e.g., nodes joining or leaving the cluster or a cluster head leaving or failing. Cluster maintenance is another important clustering procedure for handling cluster structure changes, aiming to maintain cluster stability without incurring tremendous overhead.

Some studies have addressed clustering maintenance simply by re-clustering. Obviously, in re-clustering, the cluster head re-election and the necessary information exchange between vehicles result in high computation costs and communication overhead. Therefore, cluster maintenance in DWCM follows a principle similar to [12] and [34], where each node continuously senses the surrounding topology and reacts accordingly with necessary adjustments based on the existing cluster structure.

The detailed cluster maintenance operations are shown in Algorithm 2. Each node executes Algorithm 2 to adjust its state in the cluster to adapt to topology changes. DWCM can address the following maintenance situations:

# 5. Simulation Results

Simulations were conducted in the network simulator NS-2 [35] to verify the effectiveness and performance of the proposed DWCM clustering algorithm. Because there are no predefined mobility models for vehicle movement, VanetMobiSim [36] was used as the microscopic traffic generator to provide traffic simulation data that describe realistic vehicle traffic. A common framework was designed in NS-2 to implement the clustering algorithms. The simulation parameters are defined in Table 2.

**Table 2.** Simulation Parameters

The proposed DWCM clustering algorithm and some classical approaches (including Lowest-ID, Highest-degree, MOBIC, and MobDHop) were implemented in NS-2. Lowest-ID [23] selects the node with the lowest ID as the cluster head and easily causes re-clustering when an un-clustered node with a lower ID reaches a cluster head’s range. Highest-degree [24] uses the local highest node degree as the attribute for cluster head selection. However, it has the same problem as Lowest-ID, where re-clustering occurs frequently.

MOBIC [25] proposes an aggregate local mobility metric for the cluster formation process, such that mobile nodes with a lower speed than their neighbors have the chance to become cluster heads. MobDHop [12] forms variable diameter clusters based on the node mobility pattern in MANETs. It introduces a new metric for measuring the distance variation between nodes over time to estimate the relative mobility of two nodes. The multi-hop clustering schemes of MobDHop and the proposed DWCM are evaluated with 2-hop performance (i.e., d = 2) as a tradeoff between low cluster dynamics and complexity. It is practical to acquire 2-hop information with acceptable complexity, while providing enhanced stability compared with 1-hop clustering.

These clustering approaches were simulated under a series of similar configurations. For performance evaluation, the following performance metrics were chosen:

First, the performance of Lowest-ID based 1-hop clustering, Highest-degree based 1-hop clustering, MOBIC based 1-hop clustering, MobDHop based 2-hop clustering, and DWCM based 2-hop clustering was simulated when the transmission range of the vehicles varied from 50 m to 300 m with a fixed N = 125 and S = 20 m/s. Fig. 1 shows the cluster stability of the above approaches when varying the transmission range. The results show that DWCM outperforms the other approaches in both cluster head lifetime and re-affiliation times.
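The two stability metrics can be extracted from periodic snapshots of the cluster structure. The sketch below is an illustration under assumed definitions: cluster head lifetime is the time a node stays a head without interruption, and a re-affiliation is counted whenever a member node switches from one cluster head to another. The snapshot format and function names are hypothetical and are not taken from the NS-2 framework used in the paper.

```python
def mean_cluster_head_lifetime(snapshots, step):
    """snapshots: list of dicts node -> cluster head id, sampled every `step` seconds.

    A node counts as a cluster head when it maps to itself.
    """
    durations = []
    running = {}  # node -> consecutive snapshots spent as head
    for snap in snapshots:
        heads = {n for n, h in snap.items() if n == h}
        for n in list(running):
            if n not in heads:  # head role ended at this snapshot
                durations.append(running.pop(n) * step)
        for n in heads:
            running[n] = running.get(n, 0) + 1
    durations.extend(count * step for count in running.values())
    return sum(durations) / len(durations) if durations else 0.0


def reaffiliation_count(snapshots):
    """Number of times a member switches from one cluster head to another."""
    changes = 0
    for prev, curr in zip(snapshots, snapshots[1:]):
        for n, head in curr.items():
            old = prev.get(n)
            was_member = old is not None and old != n
            is_member = head != n
            if was_member and is_member and old != head:
                changes += 1
    return changes


snapshots = [
    {1: 1, 2: 1, 3: 3, 4: 3},
    {1: 1, 2: 1, 3: 3, 4: 1},  # member 4 moves from head 3 to head 1
    {1: 1, 2: 2, 3: 1, 4: 1},  # 3 gives up the head role, 2 becomes a head
]
print(mean_cluster_head_lifetime(snapshots, step=1.0))  # 2.0
print(reaffiliation_count(snapshots))                   # 1
```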

**Fig. 1.** Cluster head lifetime and re-affiliation times vs. transmission range

As shown in Fig. 1(a), in terms of mean cluster head lifetime, DWCM, MobDHop, and MOBIC show obvious performance advantages compared with Lowest-ID and Highest-Degree. This is because DWCM, MobDHop, and MOBIC use parameters related to relative mobility metrics in cluster formation and cluster head selection. Such parameters can describe the natural group mobility feature and help select more suitable cluster head nodes, thus improving the cluster lifetime.

Both MobDHop and MOBIC estimate the distance from a node to its neighbor based on the measured received signal strength from that particular neighbor. MobDHop performs better than MOBIC because it uses the standard deviation of the distance variation based on a series of continuous measurements to capture the group mobility, whereas MOBIC only uses a pair of consecutive measurements. DWCM can achieve better performance because more relative mobility metrics, e.g., relative direction, relative velocity, and relative acceleration, are used to define the cluster relationship. Fig. 1(b) shows that DWCM performs obviously better than the other approaches in terms of mean re-affiliation times because it considers the relative moving direction between different nodes in cluster formation and maintenance operations; thus, it avoids joining a cluster that moves in the opposite direction.

Then, the performance of Lowest-ID based 1-hop clustering, Highest-degree based 1-hop clustering, MOBIC based 1-hop clustering, MobDHop based 2-hop clustering, and DWCM based 2-hop clustering was simulated when the maximum speed of the vehicles varied from 10 m/s to 30 m/s with a fixed N = 125 and Tr = 200 m. These simulations investigate the effect of vehicle speed on cluster performance. Figs. 2(a) and 2(b) illustrate the mean cluster head lifetime and mean re-affiliation times, respectively, with varying maximum vehicle speed. The performance results are similar to those shown in Fig. 1.

**Fig. 2.** Cluster head lifetime and re-affiliation times vs. speed

Figs. 3(a) and 3(b) show the stability performance of Lowest-ID based 1-hop clustering, Highest-degree based 1-hop clustering, MOBIC based 1-hop clustering, MobDHop based 2-hop clustering, and DWCM based 2-hop clustering when the number of vehicles varies from 50 to 75 with a fixed S = 20 m/s and Tr = 200 m. The clustering algorithms that consider mobility metrics reveal more obvious performance advantages as the number of vehicles increases. This is because the spatial dependency between the vehicles rises with the number of vehicles, thus improving the cluster stability.

**Fig. 3.** Cluster head lifetime and re-affiliation times vs. the number of vehicles

Figs. 4(a), 4(b), and 4(c) illustrate the cluster numbers of different approaches for various values of transmission range, vehicle speed, and number of vehicles. These figures show that multi-hop clustering approaches (e.g., MobDHop and DWCM) form obviously fewer clusters than 1-hop clustering approaches (e.g., Lowest-ID, Highest-Degree, and MOBIC) in the simulation. A smaller cluster number is desirable because the delay and overhead can be reduced in cluster-based hierarchical routing. Thus, multi-hop clustering is a more reasonable choice for large-scale vehicular networks.

**Fig. 4.** Cluster number when the transmission range, speed, and number of vehicles vary

We also note the difference between the simulation results in this paper and the performance data in other research papers: the same algorithm appears to perform worse in our simulation. This deviation is caused by the mobility models used in the simulation. In this study, VanetMobiSim was used as the microscopic traffic generator to provide more realistic traffic simulation data. The IDM-LC (Intelligent Driver Model with Lane Change) model was used in the simulations. IDM-LC has the functions of IM (Intersection Management) and LC (Lane Change). The IM function induces congestion at intersections. Therefore, the cluster heads and members are more likely to disconnect from each other, resulting in more cluster dismissals, more joins to new clusters, and more re-clustering. The cluster head lifetime decreases, whereas the re-affiliation times increase accordingly. The LC function permits overtaking behaviors, which decrease the mobility dependency between vehicles, and thus degrade the performance of clustering algorithms based on mobility metrics.

Experiments were also conducted to investigate the effects of the cluster radius in hops (i.e., the value of d) on clustering performance. Figs. 5(a), 5(b), and 5(c) show the clustering performance of DWCM for mean cluster head lifetime, mean re-affiliation times, and number of clusters, respectively, when d = 2, 3, and 4 hops with a fixed N = 125 and Tr = 200 m, with the maximum speed varying from 10 m/s to 30 m/s. Although a clustering performance improvement (i.e., increase in cluster head lifetime, decrease in re-affiliation times, and decrease in number of clusters) can be observed when we increase the cluster radius, the performance benefit is rather limited and will shrink with a larger d value.

**Fig. 5.** Effects of d value on clustering performance

As the optimality of a clustering algorithm is defined as being able to form as few stable clusters as possible at a reasonable overhead [12], the overhead introduced by a larger d value should also be considered. For DWCM, the message overhead will increase with a larger number of hops because the vehicles’ mobility information is disseminated within the d-hop range. The message overhead of the cluster formation procedure in the worst case can be as high as O(Nd). When entering the cluster maintenance procedure, the message overhead for each topology change can be described as O(m·d), where m is the average number of members in a cluster. Therefore, for practical feasibility, the choice of d should consider the tradeoff between low cluster dynamics and overhead.
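The growth of dissemination overhead with d can be illustrated with a simple counting model. This is not the overhead analysis of the paper: the sketch assumes plain hop-limited flooding, in which the originating node transmits once and every node within d - 1 hops rebroadcasts once, and it counts one such flood per node. The function names are hypothetical.

```python
from collections import deque


def nodes_within(adjacency, source, hops_limit):
    """Number of vertices (excluding `source`) within `hops_limit` hops."""
    seen = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if seen[u] < hops_limit:
            for v in adjacency[u]:
                if v not in seen:
                    seen[v] = seen[u] + 1
                    queue.append(v)
    return len(seen) - 1


def flooding_transmissions(adjacency, d):
    """Transmissions needed for every node to flood one message over d hops.

    Assumed model: the origin sends once and each node within d - 1 hops
    rebroadcasts once, so that nodes at hop d still receive the message.
    """
    return sum(1 + nodes_within(adjacency, u, d - 1) for u in adjacency)


# A chain of seven vehicles 0-1-2-3-4-5-6.
chain = {i: {j for j in (i - 1, i + 1) if 0 <= j <= 6} for i in range(7)}
for d in (1, 2, 3):
    print(d, flooding_transmissions(chain, d))  # 7, 19, 29 transmissions
```

Even on this sparse chain the count rises with every extra hop, which is consistent with the tradeoff between cluster dynamics and overhead discussed above.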

In addition, when considering practical applications, different city and rural scenarios can affect clustering performance. As observed in the simulation, cluster stability is affected by vehicle density, speed, and wireless transmission range. In city scenarios, the vehicle density is higher, whereas the speed is lower, compared with rural scenarios. In addition, the node priority definition in this paper is based on the group mobility features in VANETs. The spatial dependency in mobility metrics between adjacent vehicles in one road segment is more obvious in city scenarios than in rural ones. Therefore, in general, DWCM will perform better in city scenarios than in rural scenarios.

However, DWCM performance in city scenarios might vary depending on the road layout. For example, in a scenario with a dense deployment of intersections and traffic signals, the dynamics of the clusters increase because vehicles often join or leave the clusters, incurring a large cluster maintenance overhead. Another case is city expressways, where vehicles move within a long segment and therefore maintain a stable group mobility feature. DWCM is expected to have better performance in such a scenario.

# 6. Conclusions and Future Work

In this study, DWCM, a distributed and weighted d-hop clustering method based on mobility metrics, was proposed. The goal of DWCM is to construct and maintain stable multi-hop clusters in vehicular networks. A weighted undirected graph was used as the network model. Each vertex in this graph was assigned a priority that described the group mobility feature in the vehicular network. A d-hop dominating set was found for cluster head nomination and the correctness of this algorithm was proven. In addition, cluster maintenance was used to handle the cluster structure changes, including cluster head contention, cluster gateway discovery, isolated node discovery, and joining a new cluster. The simulation results in the NS-2 and VanetMobiSim integrated environment showed that DWCM outperformed other classical clustering approaches in terms of cluster stability, with longer cluster head lifetimes and fewer re-affiliation times. In addition, d-hop clustering in DWCM forms a smaller number of clusters with high scalability. It is notable that recent work [37] has explored the topology characteristics based on large-scale realistic vehicle mobility traces. A good clustering algorithm should be capable of identifying and adapting to the inherent group mobility patterns of a temporally evolving topology. Designing such a clustering algorithm by employing the complex network theory and statistical physics theory will be our ongoing and future work.