1. Introduction
Cognitive radio (CR), or dynamic spectrum access [1], has recently emerged as a promising solution for improving spectrum utilization by allowing unlicensed secondary users (SUs) to access idle licensed spectrum. In a CR network, SUs periodically sense the licensed spectrum and opportunistically access the spectrum holes, or spectrum opportunities (SOPs), left unoccupied by primary users (PUs). SUs can further form an infrastructure-based CR network or a multi-hop ad hoc network. In a cognitive radio ad hoc network (CRANET) [2], SUs can only access the SOPs by seeking to underlay, overlay, or interweave their signals with those of existing PUs without significantly impacting their communications.
The primary objective of CR networks is to achieve both quality of service (QoS) for SUs and high system throughput while avoiding excessive interference to PUs by dynamically allocating the transmit power of SUs. Power control is therefore essential for CR networks. In conventional cellular wireless networks and ad hoc networks, many power control schemes [3]-[5] and energy-efficient routing schemes [6]-[8] have been proposed to increase the total system throughput or to improve energy efficiency. Another scheme provides the optimal power allocation on subchannels in two-tier femtocell networks based on orthogonal frequency division multiple access (OFDMA), aiming to maximize the total capacity [9]. Recently, there have been active research efforts on power control for CR networks from different perspectives, such as imperfect channel knowledge [10], [11], rate and energy efficiency [12], worst-case robust optimization [13], and joint power control and resource allocation [14]-[16]. As an alternative modeling framework, the game theoretic approach has gained attention as an economic tool for studying resource allocation in OFDMA femtocells [17] and CR networks [18], [19]. Another body of work formulates the problem of distributed channel selection in CR networks as a local interaction game [20] and an exact potential game [21]. From the point of view of the dynamic behavior of SUs, the differential game has been used as an effective method to study dynamic spectrum leasing under a noncooperative model [22] and a noncooperative Stackelberg model [23]. In addition, many game models for power control in CR networks are designed as noncooperative models [24]-[26] and cooperative models [27]-[29]. See Section 2 for a review of related work.
Our work in this paper mainly focuses on the underlay CRANET scenario owing to its simplicity of implementation and its high spectrum utilization. Because a CRANET lacks centralized control and global information, the power control scheme must be designed as a distributed strategy. In CRANETs, SUs are expected to cooperate with each other to enhance their own access opportunities and achieve high spectrum utilization. To this end, a cooperative game theoretic approach is better suited to the problem of power control. Moreover, the mobility and sometimes random behavior of SUs cause the CRANET topology to change over time. Under such a time-varying topology, the transmit power is unlikely to remain unchanged; instead, the transmit power of SUs needs to be regulated dynamically as time evolves. In other words, the dynamic nature of power control in our work refers to this time dependency. Taking into account that a differential game explores interactive decision making over time, in this paper we present a differential game theoretic approach for distributed dynamic cooperative power control in CRANETs. Our main contributions can be outlined as follows:
The rest of this paper is organized as follows. In Section 2, we review related work. Section 3 describes the system model and assumptions. In Section 4, we propose the differential game formulation and cooperative solution. Section 5 presents the numerical results. Finally, Section 6 concludes the paper.
2. Related Work
Many studies on power control for CR networks have been reported from different perspectives, such as imperfect channel knowledge [10], [11], rate and energy efficiency [12], worst-case robust optimization [13], and joint power control and resource allocation [14]-[16]. In the literature, the game theoretic approach to resource allocation [18], [19] and distributed channel selection [20], [21] in CR networks has recently been well investigated. In comparison with the coalitional games [18] and the spectrum auction games [19], the local interaction game [20] and the exact potential game [21] have been used to investigate the problem of distributed channel selection, aiming to establish the existence of Nash equilibrium (NE) solutions under the constraint of local information of SUs. However, these solutions are designed under a noncooperative game formulation, and they do not provide a cooperative solution to the problem of distributed channel selection. In order to deal with the impact of the dynamic behavior of SUs on the design of the game theoretic framework, the differential game, which investigates interactive decision making over time, is leveraged to study the dynamic spectrum leasing problem [22], [23]. In [22], an infinite-horizon noncooperative differential game model is formulated to describe the competition of dynamic spectrum leasing among secondary service providers. The instantaneous profit of a secondary service provider is defined as the difference between the instant revenue and the cost of the provider by devising two weighted cost factors. Both the open-loop NE strategy and the closed-loop NE strategy for the optimal control structure are also obtained. In [23], a noncooperative Stackelberg differential game model in the upper level of the hierarchical dynamic differential game framework is proposed to investigate the incentive mechanism for spectrum sharing between small cell service providers (SSPs) and the macrocell service provider (MSP). Also, an open-loop Stackelberg equilibrium is derived as the solution of the optimal dynamic pricing problem for the MSP and the dynamic open access ratios for the SSPs.
In addition, recent work has investigated noncooperative game models [24]-[26] and cooperative game models [27]-[29] for power control in CR networks. In [24], a payoff function incorporating a utility function and a pricing function is presented in the game model. The utility function is devised from the throughput perspective, and the pricing function is formulated as an exponential interference function. In [25], the cost function in the game model is defined as a logarithmic function in order to guarantee adequate QoS and interference temperature, and a distributed iterative power algorithm is developed. In [26], a game theoretic framework is established to solve distributed power control with multiple secondary source-to-destination pairs and PUs. Also, a distributed algorithm is proposed to achieve a time-average performance as good as that achieved when the NE is chosen in hindsight. In [27], a cooperative Nash bargaining power control game model is formulated, in which interference power constraints and minimum signal-to-interference ratio (SIR) requirements are taken into account. An SIR-based utility function is further designed to comply with all the axioms in the Nash theorem, which guarantees the uniqueness and proportional fairness of the game equilibrium. In [28], a power control problem is modeled as a cooperative game under an interference temperature limit, and a distributed algorithm that converges to the optimal solution of the power control problem based on the Nash bargaining solution (NBS) is presented. Different from the utility function defined in [28], a utility function based on a fairness factor is designed in [29], and a distributed power control algorithm based on the NBS is also introduced.
The differences between this paper and previous works are summarized as follows:
3. System Model and Assumptions
We consider a distributed underlay CRANET scenario as depicted in Fig. 1. In this scenario, PUs send their data to the primary base station (PBS) through the cellular primary networks with a finite set of m cells. Let M={1,2,…,m} denote the set of cells in the cellular primary networks. The spectrum bands of the primary networks are divided into two sections, i.e., the uplink spectrum bands and the downlink spectrum bands. We assume that cell l∈M has licensed access to a given uplink and downlink spectrum band, and that different cells hold different spectrum bands to limit interference. The uplink spectrum band of cell l∈M is divided into C uplink channels used by licensed PUs. We also employ an independent and identically distributed alternating ON-OFF process to model the occupation time of PUs in the uplink channels. Specifically, the OFF state indicates the idle state in which the unoccupied uplink channels, i.e., the SOPs, can be freely occupied by SUs. By means of collaborative spectrum sensing [30], SUs can only leverage the OFF state to access the SOPs over the authorized uplink channels.
Fig. 1. Coexistence of the distributed CRANET and the cellular primary networks.
Let t0 and T denote the starting time and the terminal time of dynamic power control in this scenario, respectively. We assume that n SU transmitter-receiver pairs are randomly distributed in cell l∈M, where the PUs are inactive within the time interval [t0,T]. The SU pairs in cell l∈M can exchange control messages with the help of the common control channels (CCCs). Moreover, the SUs are either fixed or slowly moving in cell l∈M. For simplicity, the terms “SU” and “SU pair” are used interchangeably henceforth. Let N={1,2,…,n} denote the set of SU pairs in cell l∈M. Owing to the randomness of the data traffic of PUs as well as the dynamics of their behavior, the SOPs are available for usage by SU i with a probability of δi, for i ∈ N. According to [31], the SOP usage probability δi of SU i is written as
where αi is the probability that SOP transits from OFF state to ON state, and βi is the probability that SOP transits from ON state to OFF state.
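For an alternating ON-OFF process with these two transition probabilities, a common choice for the availability probability is the stationary probability of the OFF state of the corresponding two-state Markov chain. The sketch below assumes that form; the expression used in [31] may differ, and the function name is hypothetical.

```python
def sop_usage_probability(alpha: float, beta: float) -> float:
    """Stationary OFF-state probability of a two-state ON-OFF chain.

    alpha: probability of moving from OFF to ON
    beta:  probability of moving from ON to OFF
    Assumption: delta_i is taken to be the long-run fraction of time the
    channel is OFF (idle), which is beta / (alpha + beta).
    """
    if alpha < 0 or beta < 0 or alpha + beta == 0:
        raise ValueError("alpha and beta must be non-negative and not both zero")
    return beta / (alpha + beta)


# Values used later in the numerical section: alpha_i = 0.2, beta_i = 0.8.
delta_i = sop_usage_probability(0.2, 0.8)
```

With this assumed form, a channel that leaves the OFF state rarely (small αi) and returns to it quickly (large βi) is available most of the time.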
We assume that the propagation channel of SU pairs is characterized by a slow-fading channel model, in which the channel conditions remain constant throughout time interval [t0,T]. Let ζl and L be the interference caused by the PBS in cell l ∈ M and the normalized spread sequence length, respectively. At time instant s ∈ [t0,T], the SIR of SU i in cell l ∈ M, for i ∈ N, is given as
where pi(s) is the transmit power of SU i, hji is the channel gain from SU transmitter j to SU receiver i, and N0 is the background noise power at the SU receiver. Under the slow-fading channel model, the channel gain is defined as $h_{ji} = A / d_{ji}^{\theta}$, where dji is the distance from SU transmitter j to SU receiver i, A > 0 is the constant gain, and θ is the propagation loss factor for outdoor wireless communications.
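The definitions of A, dji, and θ suggest a distance power-law gain. As a minimal sketch, assuming the form h = A / d^θ and using the values A = 0.097 and θ = 3 from the numerical section as defaults (the function name is hypothetical):

```python
def channel_gain(distance_m: float, A: float = 0.097, theta: float = 3.0) -> float:
    """Power-law channel gain h = A / d**theta (assumed form).

    distance_m: transmitter-to-receiver distance in meters (must be positive)
    A:          constant gain, A > 0
    theta:      propagation loss factor
    """
    if distance_m <= 0:
        raise ValueError("distance must be positive")
    return A / distance_m ** theta


# The gain drops by a factor of 2**theta = 8 when the distance doubles.
h_near = channel_gain(10.0)
h_far = channel_gain(20.0)
```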
The transmit power pi(s) of SU i is bounded above by the maximum transmit power of SU i. Due to the propagation loss of the wireless link, this maximum transmit power can generally be written in terms of ηi, the transmission loss from SU transmitter i to SU receiver i, and the received reference power at SU receiver i. In the analytical derivation, $\eta_i = (c / (4\pi f d_{ii}))^2$, where dii is the distance from SU transmitter i to SU receiver i, f is the carrier frequency of the uplink channel, and c is the speed of light. Here, we further assume that the received reference power is equal for all the SU receivers and define this power as a baseline power factor pb. In this case, taking into account the propagation loss of the wireless link, the maximum transmit power is a monotonically increasing function of the distance from SU transmitter i to SU receiver i. Let dl be the cell radius of cell l. Then we have the constraint 0 < dii ≤ 2dl. We also suppose that dii can be acquired over time by SU i through sensing the surrounding environment with the help of the CCC.
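The transmission loss ηi has the form of the free-space path loss. The sketch below evaluates it for a given link distance, using the carrier frequency f = 890.4 MHz from the numerical section as a default; the function and constant names are hypothetical.

```python
import math

SPEED_OF_LIGHT_M_PER_S = 299_792_458.0


def transmission_loss(distance_m: float, carrier_hz: float = 890.4e6) -> float:
    """Free-space transmission loss eta = (c / (4 * pi * f * d))**2.

    Returns a dimensionless ratio in (0, 1) for typical link distances;
    it shrinks with the square of the distance.
    """
    if distance_m <= 0 or carrier_hz <= 0:
        raise ValueError("distance and carrier frequency must be positive")
    return (SPEED_OF_LIGHT_M_PER_S / (4.0 * math.pi * carrier_hz * distance_m)) ** 2


eta_10m = transmission_loss(10.0)
eta_20m = transmission_loss(20.0)
```

Because ηi falls as dii grows, any maximum power that compensates for this loss grows with the link distance, which matches the monotonicity stated above.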
Let Ji(s) denote the throughput of SU i in cell l ∈ M at time instant s ∈ [t0,T]. According to Shannon’s capacity formula, Ji(s) is approximately formulated as follows
where $k = 1.5 / (-\ln(5\,\mathrm{BER}))$ is a constant for an acceptable bit error rate (BER) requirement [32]. Note that this is a reasonable choice for a slow-fading channel, such as an additive white Gaussian noise environment. Therefore, the total system throughput J of the distributed CRANET is given as
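The role of the constant k can be illustrated numerically. The sketch below assumes the widely used BER-gap form k = 1.5 / (−ln(5·BER)) and a per-unit-bandwidth throughput of log2(1 + k·SIR); both the exact throughput expression and the function names are assumptions made for illustration.

```python
import math


def ber_constant(ber: float) -> float:
    """SIR gap constant k = 1.5 / (-ln(5 * BER)) for a target bit error rate."""
    if not 0.0 < ber < 0.2:
        raise ValueError("BER must lie in (0, 0.2) so that ln(5 * BER) is negative")
    return 1.5 / (-math.log(5.0 * ber))


def throughput_per_hz(sir: float, k: float) -> float:
    """Assumed Shannon-type throughput log2(1 + k * SIR) in bit/s/Hz."""
    if sir < 0 or k <= 0:
        raise ValueError("SIR must be non-negative and k positive")
    return math.log2(1.0 + k * sir)


k_example = ber_constant(1e-3)           # stricter BER targets give a smaller k
rate_example = throughput_per_hz(8.0, 0.2)  # SIR = 8 and k = 0.2 as in Section 5
```

A tighter BER requirement lowers k, so the same SIR yields less throughput under this assumed form.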
With respect to the considered CRANET, there are two constraints that need to be taken into account.
C1) QoS constraint: each SU i has a target SIR. In order to maintain a certain QoS requirement, the SIR of SU i should be subject to the QoS constraint as follows
C2) Total interference constraint: in order to avoid causing excessive interference to the cellular primary networks, the total transmit power of SUs in cell l ∈ M is required not to exceed the average interference power threshold Pth. We assume that the interference measurement point (IMP) is located in the center of cell l. Let di be the distance from SU transmitter i to the IMP of the PBS in cell l ∈ M; the channel gain from SU transmitter i to the IMP of the PBS is defined in terms of di. Further, we obtain the total interference constraint as follows
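A check of the total interference constraint can be sketched as follows. The sketch assumes that the gain to the IMP follows the same power law A / d^θ as the SU-to-SU gain and that the constraint compares the gain-weighted sum of transmit powers against Pth; these forms and all function names are assumptions for illustration.

```python
def dbm_to_mw(dbm: float) -> float:
    """Convert a power level from dBm to milliwatts."""
    return 10.0 ** (dbm / 10.0)


def total_interference_mw(powers_mw, distances_to_imp_m, A=0.097, theta=3.0):
    """Gain-weighted sum of SU transmit powers seen at the IMP (assumed form)."""
    if len(powers_mw) != len(distances_to_imp_m):
        raise ValueError("one distance is required per SU")
    return sum(A / d ** theta * p for p, d in zip(powers_mw, distances_to_imp_m))


def satisfies_interference_constraint(powers_mw, distances_to_imp_m, p_th_dbm=-80.0):
    """True if the aggregate interference at the IMP stays within the threshold."""
    return total_interference_mw(powers_mw, distances_to_imp_m) <= dbm_to_mw(p_th_dbm)


# Two SUs located 20 m and 25 m from the IMP (hypothetical positions).
ok_low_power = satisfies_interference_constraint([1e-4, 1e-4], [20.0, 25.0])
ok_high_power = satisfies_interference_constraint([2.26, 2.26], [20.0, 25.0])
```

Under these assumptions, the same pair of SUs passes or fails the check purely as a function of the transmit powers, which is what the power control has to regulate.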
4. Differential Game Formulation and Cooperative Solution
4.1 Payoff Function and Differential Game Formulation
Recall the SOP usage probability δi in (1). To describe the impact of SOP switching due to the ON-OFF state of PUs on the channel conditions, we characterize the channel stability factor of SU i in cell l ∈ M, which should be defined as a function of parameters such as the SOP usage probability and the channel gain. Therefore, without loss of generality, the channel stability factor Si of SU i in cell l ∈ M is measured by
Remark that the channel stability factor Si provides a tradeoff between the requirement for quickly finding a stable available SOP and the need for better channel conditions. Taking into consideration the impact of the differentiated types of data traffic on the demand for transmission quality, we introduce the traffic sensing factor to quantify the priorities of the different types of data traffic. We assume that the priority factor of the data traffic is denoted by υ. Based on the Enhanced Distributed Channel Access (EDCA) mechanism [33], the traffic sensing factor ρi of SU i in cell l ∈ M is defined as
where Randomi(υ) is a pseudo-random integer of SU i drawn from uniform distribution over the interval [0,CW(υ)], χ(υ) is an arbitration interframe space number, and the contention window CW(υ)∈[CWmin(υ), CWmax(υ)] is an integer within the range of values of the contention window limits CWmin(υ) and CWmax(υ). The parameters measured by the EDCA mechanism [33] used in (8) are defined as in Table 1. It is noteworthy that aCWmin and aCWmax denote the minimum size and the maximum size of CW(υ), respectively. According to the EDCA mechanism [33], we assume aCWmin=31 and aCWmax=1023 in the case of physical layer specification of direct sequence spread spectrum. From (8) and Table 1, we can observe that the higher ρi announced by SU i implies the lower priority with respect to the current type of data traffic.
Table 1. Data Traffic Types and Parameters
In general, SU i in cell l∈M wants to raise its transmit power to achieve better system performance and to cope with channel impairments. However, this higher performance is obtained at the expense of increased interference to PUs and other SUs. Recall that the transmit power of SU i should also satisfy the total interference constraint, so it is also necessary to reduce the transmit power of SU i to avoid generating excessive interference to PUs. Therefore, on the one hand, SU i needs to pay for the SOP used to transfer the prioritized data traffic, and this cost is determined by the tradeoff between the power enhancement and the power decline. On the other hand, SU i needs to pay for the accumulated power interference to PUs. Taking into account both the impact of SOP switching on the channel conditions and the priority of data traffic, we formulate the cost function of the dynamic power reduction via Definition 1.
Definition 1: The cost function of the dynamic power reduction for SU i is given by
Let y(s) and ui denote the stock of accumulated power interference to PUs and the pricing factor announced by SU i, respectively. Notice that the pricing factor ui implies the unit cost that SU i needs to pay due to the accumulated power interference to PUs. Thus, ui⋅y(s) corresponds to the net utility function with pricing for PUs.
Definition 2: The cost function of the accumulated power interference to PUs for SU i is given by ui⋅y(s).
Therefore, based on the cost functions formulated by Definition 1 and Definition 2, the payoff function Φi(pi,y) of SU i at time instant s ∈ [t0,T] is given as follows
It is clear that the payoff function Φi(pi,y) is a continuously differentiable function of pi(s) and y(s). Notice that a lower ρi, i.e., a higher priority for the current type of data traffic, results in a higher payoff borne by SU i. Following the differential game theory framework [34], pi(s) in (9) is viewed as the strategy or control variable, while y(s) in (9) is regarded as the state variable of the differential game. In general, the strategy means the choice of action or behavior by a player in the differential game. The motivation behind using a differential game is that the players need to dynamically adjust the transmit power, whereas traditional game theoretic formulations are mostly suited to static power control. Let SU i be player i of the differential game. We assume that Qi(pi,y) is the terminal payoff of player i at time T. That is, when the game terminates at time T, player i receives a terminal payment of Qi(pi,y). With the underlying structure of the differential game in mind [34], the payoff function of player i at time instant s ∈ [t0,T] has an explicit game structure given as
where 0<σ<1 is the constant discount rate. Note that Φi(pi,y) and Qi(pi,y) are discounted by the factors $e^{-\sigma(s-t_0)}$ and $e^{-\sigma(T-t_0)}$, respectively. For convenience of derivation, we relax the time interval of the game and consider the infinite-horizon differential game (i.e., T→∞), and we also set t0=0. Moreover, it is easy to verify that the discounted terminal payoff vanishes as T→∞. Hence, bearing in mind the game structure in (10), the objective of SU i is to minimize the payoff function
where r>0 is the constant discount rate. According to differential game theory [34], the state variable y(s) in (11) is assumed to satisfy the differential equation as follows
where τ>0 is the penalty factor of the stock of accumulated power interference to the cellular primary networks. More formally, (11) and (12) constitute the differential game model for distributed dynamic power control.
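To give a feel for how a stock of accumulated interference behaves over time, the sketch below integrates a simple linear stock equation with the forward Euler method. The dynamics dy/ds = Σ pi − τ·y are an assumed illustrative form common in stock-accumulation differential games, not necessarily the exact state equation (12); the function name and sample powers are hypothetical.

```python
def simulate_interference_stock(powers, tau, y0=0.0, horizon=5.0, dt=1e-3):
    """Forward-Euler integration of dy/ds = sum(powers) - tau * y.

    powers:  constant transmit powers of the SUs (held fixed here)
    tau:     penalty (decay) factor of the stock, tau > 0
    y0:      initial stock of accumulated interference
    horizon: length of the simulated time interval
    dt:      integration step
    Assumption: the linear form is illustrative only.
    """
    if tau <= 0 or dt <= 0:
        raise ValueError("tau and dt must be positive")
    y = y0
    inflow = sum(powers)
    for _ in range(int(round(horizon / dt))):
        y += dt * (inflow - tau * y)
    return y


# With tau = 5 (the value in Section 5) the stock settles near sum(p) / tau.
y_final = simulate_interference_stock([1.0, 2.0, 3.0], tau=5.0)
```

Under this assumed form, a larger τ pulls the stock toward a lower steady level for the same transmit powers.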
4.2 Cooperative Solution
In this subsection, we proceed to identify the optimal cooperative solution to our proposed differential game model in (11) and (12), and we will solve the dynamic optimization problem. The technique of dynamic programming developed by Bellman will be exploited to obtain the optimal solution to our game model. The technique is given by Lemma 1 [34].
Lemma 1: A set of optimal strategies b∗=p∗(y) constitutes an optimal solution to the differential game model in (11) and (12) if there exists a continuously differentiable value function W(pi,y), defined on Rn→R, satisfying the Bellman equation as follows
where Φi(pi,y) and f(pi,y)
Definition 3: The grand coalition N is the coalition set N={1,2,…,n} containing all players, who agree to cooperate according to an agreed-upon payoff allocation principle.
Clearly, in view of cooperative power control, all SUs in cell l∈M constitute the grand coalition N to cooperatively regulate their transmit power. Hence, we formulate a dynamic programming problem to obtain a set of optimal strategies of SUs to the distributed dynamic cooperative power control as follows
Let a set of optimal strategies be the transmit power of n SUs under the condition of the grand coalition N to the dynamic programming problem in (14). We assume that there exists continuously differentiable function W(pi,y) which satisfies the Bellman equation based on Lemma 1 as follows
Theorem 1: A set of optimal strategies provides the transmit power of n SUs under the condition of grand coalition N, and the continuously differentiable function W(pi,y) is expressed as follows
Proof: Performing the minimization operation on the right side of (15) yields
Substituting in (17) into (15), we obtain
Upon solving the differential equation in (18), we obtain
Thus, the optimal strategy of SU i is formulated as
Hence, the function W(pi,y) can be also obtained by (16).
From Theorem 1, we observe that the value of the optimal strategy of SU i is inversely proportional to both the pricing factor and the traffic sensing factor. Based on Theorem 1, we assume that the pricing factor and the traffic sensing factor of SU i in cell l∈M are given as ui=ul and ρi=ρl, ∀i∈N. This assumption is reasonable because the pricing factor can take the same value for all SUs, and the same data traffic type is used by the SUs in cell l∈M. Under this condition, we can further examine the QoS constraint and the total interference constraint in detail by Theorem 2 and Theorem 3, respectively. For analytical simplicity, we define the notations as follows
Theorem 2: Under the condition of grand coalition N, the following QoS constraint inequality is strictly guaranteed if and only if ui=ul and ρi=ρl, ∀i∈N
Proof: Under the condition of ui=ul and ρi=ρl, we obtain
Substituting (23) and Si in (7) into the QoS constraint in (5), we can easily prove the inequality in (22).
Theorem 3: Under the condition of grand coalition N, the following total interference constraint inequality is strictly guaranteed if and only if ui=ul and ρi=ρl, ∀i∈N
Proof: We substitute (23) and Si in (7) into the total interference constraint in (6). The derivation is very similar to that of Theorem 2 and is therefore omitted for brevity.
Based on the optimal cooperative solution to our proposed differential game model, we propose the distributed dynamic cooperative power control (DDCPC) algorithm in Algorithm 1 to dynamically regulate the transmit power of the n SUs under grand coalition N in cell l∈M. According to the output of Algorithm 1, we obtain the optimal transmit power of the n SUs, denoted by
5. Numerical Results
Consider the distributed CRANET scenario depicted in Fig. 2, involving n=12 SU transmitter (Tx)-receiver (Rx) pairs located randomly in a 50 m × 50 m square area. The IMP is located in the center of the cell. We assume that the SU pairs all carry best-effort data traffic. Thus, we choose υ=3 with aCWmin=31 and aCWmax=1023. The SOP usage probability δi is generated with αi=0.2 and βi=0.8. The channel model parameters are set to θ=3, A=0.097, and N0=−100 dBm for the slow-fading channel model. We assume that the interference caused by the PBS in the cell is ζl=−20 dBm and that the normalized spread sequence length is L=128. The carrier frequency of the uplink channel is f=890.4 MHz. The received reference power at the SU receiver is pb=20 mW, and the initial transmit power is pi(0)=2.26 mW. We assume a target SIR of 8 for SU i and an average interference power threshold of Pth=−80 dBm. The constant in the throughput of SU i is k=0.2. In the differential game model, we choose the pricing factor ui=ul=2.5 and the penalty factor τ=5. For performance comparison, we consider the classical distributed constrained power control (DCPC) algorithm in [14]. The DCPC algorithm distributively and iteratively searches for the power level, which is updated from the ςth iteration to the (ς+1)th iteration based on the current SIR. The iterative function of power adjustment in the DCPC algorithm is given as
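In its classical form, the DCPC update scales each power by the ratio of the target SIR to the current SIR and clips the result at the maximum power. The sketch below assumes that classical update together with a simplified SIR that omits the spreading length L and the PBS interference ζl; the function names and the two-user gain matrix are hypothetical.

```python
def sir(i, powers, gains, noise):
    """Simplified SIR of receiver i; gains[j][i] is the gain from Tx j to Rx i."""
    interference = sum(
        gains[j][i] * powers[j] for j in range(len(powers)) if j != i
    )
    return gains[i][i] * powers[i] / (interference + noise)


def dcpc_step(powers, gains, noise, target_sir, p_max):
    """One synchronous DCPC iteration: p_i <- min(p_max, target / SIR_i * p_i)."""
    return [
        min(p_max, target_sir / sir(i, powers, gains, noise) * powers[i])
        for i in range(len(powers))
    ]


def run_dcpc(powers, gains, noise, target_sir, p_max, iterations=200):
    """Iterate the DCPC update from strictly positive initial powers."""
    for _ in range(iterations):
        powers = dcpc_step(powers, gains, noise, target_sir, p_max)
    return powers


# Hypothetical two-user example with weak cross gains, so the target is feasible.
example_gains = [[1.0, 0.01], [0.01, 1.0]]
final_powers = run_dcpc([0.1, 0.1], example_gains, noise=1e-3,
                        target_sir=8.0, p_max=10.0)
```

When the target SIR is feasible within the power limit, the iteration settles at the smallest powers that meet the target; otherwise some users saturate at the maximum power.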
Fig. 2. The simulation scenario.
Fig. 3 compares the transmit power of the proposed DDCPC algorithm under different values of the discount rate r with that of the DCPC algorithm. From Fig. 3, it is apparent that an increase in the distance from SU Tx to SU Rx results in a higher transmit power. In general, a longer transmission distance requires a higher average power. Moreover, the transmit power of the proposed DDCPC algorithm is clearly lower than that of the DCPC algorithm for distances from 10 m to 35 m between SU Tx and SU Rx. This implies that the proposed DDCPC algorithm is better suited to power control over short distances between SUs. In addition, by observing the impact of the discount rate r on the power regulation of the proposed DDCPC algorithm, we find that reducing the discount rate r yields a lower transmit power. This can be explained by the fact that the DCPC algorithm consumes more power to maintain a certain SIR, whereas the transmit power of the proposed DDCPC algorithm relies on the maximum transmit power of SUs, and the power levels can be further reduced by changing the discount rate r. Essentially, this signifies the importance of selecting the discount rate r for transmit power control in the proposed DDCPC algorithm.
Fig. 3. The transmit power comparison between DDCPC algorithm and DCPC algorithm.
Fig. 4 compares the payoff of SUs under different values of the discount rate r in the proposed DDCPC algorithm. From Fig. 4, we can see that the payoff of SUs increases as the distance from SU Tx to SU Rx increases. This is because more power consumption results in a larger payoff that SUs need to pay as the distance grows. Moreover, the payoff of SUs with a lower discount rate r can be less than that with a larger discount rate r. The payoff function of each SU is discounted by the factor $e^{-rs}$ at time instant s under the differential game structure. Since $e^{-rs}$ itself increases as r decreases, the lower payoff cannot be attributed to the discount factor alone; it is consistent with the lower transmit power obtained for lower r in Fig. 3, which in turn lowers the payoff.
Fig. 4. The impact of constant discount rate on the payoff of SUs.
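The behavior of the discount factor e^{-rs} can be checked numerically. The sketch below evaluates the factor and integrates a constant instantaneous payoff c over a long horizon, which should approach the closed form c / r; the constant payoff and the function names are illustrative assumptions, not the payoff function of the game.

```python
import math


def discount_factor(r: float, s: float) -> float:
    """Discount factor exp(-r * s) applied at time instant s."""
    return math.exp(-r * s)


def discounted_constant_payoff(c: float, r: float,
                               horizon: float = 200.0, steps: int = 200_000) -> float:
    """Trapezoidal approximation of the integral of c * exp(-r * s) on [0, horizon]."""
    if r <= 0 or steps < 1:
        raise ValueError("r must be positive and steps at least 1")
    dt = horizon / steps
    total = 0.5 * c * (discount_factor(r, 0.0) + discount_factor(r, horizon))
    for n in range(1, steps):
        total += c * discount_factor(r, n * dt)
    return total * dt


# For a fixed time s, a smaller r gives a larger discount factor.
factor_low_r = discount_factor(0.1, 5.0)
factor_high_r = discount_factor(0.5, 5.0)
approx_total = discounted_constant_payoff(2.0, 0.5)  # closed form: 2.0 / 0.5
```

For a fixed instantaneous payoff, lowering r therefore raises the discounted total, so any reduction in payoff at lower r has to come from a change in the instantaneous payoff itself, for example through a lower transmit power.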
Next, we examine the effect of the number of SUs n on the total throughput of the proposed DDCPC algorithm under different values of the discount rate r and of the DCPC algorithm, as depicted in Fig. 5. As the number of SUs increases, the total throughput of the proposed DDCPC algorithm increases significantly, while the total throughput of the DCPC algorithm grows slowly. This is a direct consequence of the SIR balancing design used in the DCPC algorithm, whereas the proposed DDCPC algorithm only takes the QoS constraint into account in obtaining the total throughput. In particular, the total throughput of the proposed DDCPC algorithm increases as the discount rate r decreases. This emphasizes the importance of selecting the discount rate r for the total throughput in the proposed DDCPC algorithm.
Fig. 5. The total throughput comparison between DDCPC algorithm and DCPC algorithm.
Finally, to evaluate the impact of the number of SUs n on the total payoff of SUs, Fig. 6 shows the total payoff of SUs under different values of the discount rate r. From Fig. 6, we can see that the total payoff of SUs increases as the number of SUs grows. This is because a larger number of SUs results in more power consumption, and more power consumption generates a larger payoff that SUs need to pay. Moreover, similar to the phenomenon in Fig. 4, the payoff of SUs with a lower discount rate r can also be less than that with a larger discount rate r for the same number of SUs. This is again related to the discount factor $e^{-rs}$ applied to the payoff function of each SU, together with the lower transmit power obtained for a lower discount rate r.
Fig. 6. The impact of constant discount rate on the total payoff of SUs.
6. Conclusion
In this paper, we have developed a differential game theoretic approach for distributed dynamic cooperative power control in underlay CRANETs. We devised the payoff function of each SU and proposed a differential game model for distributed dynamic power control. By constructing the grand coalition, we formulated a dynamic programming problem for the proposed distributed dynamic cooperative power control model. Moreover, we obtained a set of optimal strategies of SUs and showed the effect of the channel stability factor and the traffic sensing factor on both the QoS constraint and the total interference constraint. Based on the set of optimal strategies of SUs, we developed a distributed dynamic cooperative power control algorithm, called the DDCPC algorithm, to dynamically adjust the transmit power of SUs under the grand coalition.