1. Introduction
By offering flexible access and powerful data management services, cloud storage plays an indispensable role in creating revenue for enterprises and benefiting individuals’ lives. According to reports from IDG and Gartner, 127 billion USD was spent globally on public cloud in 2017 [1], and approximately 28% of the total market revenue of cloud services will come from infrastructure, application and business processing services in 2021 [2]. In the age of “Big Data”, where the critical issue is “data that is big”, cloud data storage will swell to the trillion-gigabyte scale by 2025 [1]. By managing huge data on different storage clouds, a great number of data owners enjoy customized applications for their business or utilities. When data owners’ mobile devices have limited computation capacity or belong to an organization, their connections to the remote cloud are typically restricted. In such a restricted outsourcing setting, an authorized proxy can help data owners perform data processing tasks before outsourcing the data to remote clouds [3]. However, the integrity of outsourced data remains a major security problem for cloud storage services even when encryption technologies are deployed [4], since outsourced data may be tampered with through cloud system failures and external attacks. To keep a good reputation, cloud storage providers might also hide from data owners that service quality has deteriorated because data was deleted to save cost.
Fortunately, with remote integrity checking technology [5], data owners can turn to a third party auditor (TPA) for a public auditing service. However, the heavy workload arising from the huge data of multiple owners degrades the TPA’s performance, and it is also undesirable for an individual owner’s data content to be visible to the TPA during the auditing process. Therefore, in the delegated proxy processing setting, it is imperative to enable secure and efficient remote integrity auditing for multiple owners. Meanwhile, the data should remain private if integrity auditing is conducted publicly by a third party auditor.
Provable Data Possession (PDP) [5], proposed by Ateniese, is a critical probabilistic remote integrity checking technology that allows efficient data integrity auditing without downloading the entire data copy. With error correcting codes, Shacham designed proofs of retrievability [6] to check whether data can be recovered in polynomial time. In 2010, based on Public Key Infrastructure (PKI), Wang et al. supported public auditing of cloud data integrity for the first time in [7], by employing a third party auditor to perform PDP in a privacy-preserving manner. For auditing scalable storage data, distributed cloud data integrity for a single owner was studied in Zhu et al.’s cooperative PDP [8], and Yang et al. went further by enabling data integrity auditing on multiple clouds for multiple data owners [9]. There are also data auditing schemes with special features, such as multiple data storage replicas [10] and group user data sharing [11] and revocation. In [12][13], PDP schemes are investigated to support auditing for data with dynamic updates. In recent years, continuous progress has been made on cloud data auditing [14-16] and on keyword search over encrypted data for fog computing and crowdsourcing [17-20]. However, these well-known works were all built on PKI, where each owner’s public key certificate must be transferred and checked to confirm that the public key indeed belongs to the owner.
To eliminate the complicated management of public key certificates, Zhao et al. proposed the first identity-based public auditing scheme [21] with PDP, enabling public auditing with identity-based cryptography [22] and privacy-preserving auditing with a TPA. In 2015, Wang et al. designed identity-based distributed PDP [23] to support multi-cloud storage for a single owner. By combining PKI-based PDP and identity-based signatures [24], Liu et al. considered a generic construction of identity-based PDP in 2016 [25]. Later, Yu et al. enabled zero-knowledge privacy in integrity checking for identity-based PDP [26]. For data auditing in the setting of restricted cloud access, Wang et al. proposed for the first time an identity-based PDP scheme in which an authorized proxy processes the data [27]. This work supports a single owner’s data on a single storage cloud, but does not consider privacy preservation for public auditing. There is also a design based on lattice cryptography [28]. Meanwhile, security flaws were found in some classic designs and were repaired in [25][29][30]. The challenging problem therefore remains unsolved, i.e., how to efficiently perform data auditing on multiple clouds with all of the following desirable features: 1) identity-based cryptography, 2) privacy preservation, 3) support for multiple data owners, and 4) proxy data processing.
In 2017, Yu et al. designed an identity-based batch public auditing scheme [32], trying to facilitate secure data integrity auditing and to address the challenges mentioned above. However, our careful analysis of potential malicious behaviors shows that this work is not able to achieve efficiency and security simultaneously.
Contributions: Firstly, for the sake of data security, we demonstrate that Yu et al.’s work [32] is vulnerable to data loss and proxy private key recovery attacks. On the one hand, malicious clouds are able to use masked data in place of the original data to pass integrity auditing. On the other hand, two arbitrary pairs of data and tags are sufficient to recover the private key of the owners’ authorized proxy. In this way, the proxy’s exclusive right to generate proxy tags is undermined by clouds and data owners. Secondly, we propose an improved scheme for this proxy processing setting, which performs identity-based privacy-preserving batch public auditing and resists the above security flaws. Thirdly, we prove the security of our scheme in the random oracle model under the CDH, BDH and DL assumptions. Finally, through extensive overhead analysis and simulation, our improved scheme shows better auditing efficiency than the identity-based proxy-oriented data uploading and remote data integrity checking in public cloud (ID-PUIC) scheme [27], which handles a single owner on a single storage cloud, so that it could contribute to secure big data storage if extrapolated to real applications.
Paper Organization: The rest of the paper starts with notations in Section 2 and reviews the system model of the identity-based batch public auditing with proxy processing scheme (ID-BPAPP), along with its system components and security model, in Section 3. After revisiting Yu et al.’s construction of an ID-BPAPP scheme in Section 4, two security shortcomings are demonstrated in Section 5. We present our improved scheme Sec-ID-BPAPP in Section 6, and formally prove its security in Subsection 6.1 under the random oracle model. In Section 7, we compare our improved scheme with Wang et al.’s ID-PUIC through theoretical overhead analysis and simulation, to study the efficiency trends of computation and communication. Section 8 concludes the paper.
2. Preliminary
2.1 Notations and computational assumption
- \(G_{1}\) and \(G_{2}\) are two cyclic groups of the same large prime order \(q\), an additive group and a multiplicative group respectively. \(e\) is a bilinear pairing \(e: G_{1} \times G_{1} \rightarrow G_{2}\).
- \((mpk, msk)\) is the Private Key Generator (PKG)’s master public and private key pair. \(sk_{i}\) is the \(i\)-th data owner’s corresponding identity-based private key.
- There are \(n_{0}\) data owners, outsourcing a total of \(N\) blocks on \(n_{j}\) clouds. \(\tilde{F}_{ijk}\) is the \(i\)-th data owner’s \(k\)-th block outsourced on the \(j\)-th cloud \(CS_{j}\). \(\sigma_{ijk}\) is the tag of block \(\tilde{F}_{ijk}\).
- \(f\) is a pseudo random function (PRF) \(f: Z_{q} \times\{1, \cdots, N\} \rightarrow Z_{q}\) for generating the challenge coefficients that combine the challenged blocks.
- \(\pi\) is a pseudo random permutation \(\pi: Z_{q} \times\{1, \cdots, N\} \rightarrow\{1, \cdots, N\}\) for generating index of challenged block.
- \(chal\) is the challenge token generated by the auditor, and \(chal_{j}\) is the specific challenge token for \(CS_{j}\). \(c_{ij}\) is the number of challenged blocks for the \(i\)-th owner on \(CS_{j}\), where \(c_{ij} < N\).
- \(a_{ij} \in\left[1, c_{ij}\right]\) indicates the \(a_{ij}\)-th selected block among the \(c_{ij}\) challenged blocks, which further specifies the index of the \(i\)-th data owner’s \(k\)-th block outsourced on the \(j\)-th cloud, i.e., \(k=\pi_{v_{ij,1}}\left(a_{ij}\right)\).
- \(C\) is the index set of challenged clouds picked by the auditor. \(O\) is the index set of data owners of challenged blocks, and \(J\) is the index set of challenged clouds, where \(|O|=n_{1}\) and \(|J|=n_{2}\). \(P_{j}\) is the proof of storage generated by the challenged cloud \(CS_{j}\).
CDH problem on \(G_{1}\): Given \(g, g^{a}, g^{b} \in G_{1}\), to compute \(g^{ab} \in G_{1}\) with a probabilistic polynomial time (PPT) algorithm, without knowing the random \(a, b \in Z_{q}\).
BDH problem on \(G_{2}\): Given \(g, g^{a}, g^{b}, g^{w} \in G_{1}\), to compute \(e(g, g)^{abw} \in G_{2}\) with a PPT algorithm, without knowing the random \(a, b, w \in Z_{q}\).
DL problem on \(G_{2}\): For \(g^{\prime} \in G_{2}\), given \(g^{\prime a}\), to compute \(a\) with a PPT algorithm.
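The way the notation above expands a challenge, i.e., block indexes \(k=\pi_{v_{ij,1}}(a_{ij})\) and coefficients derived from \(f\) under a second key, can be illustrated with a short Python sketch. It is only a sketch under stated assumptions: a seeded shuffle stands in for the pseudo random permutation \(\pi\) (it is not cryptographically secure), HMAC-SHA256 stands in for the PRF \(f\), and the names `Q`, `prp_indices`, `prf_coefficient` and `expand_challenge` are hypothetical helpers, not part of any scheme discussed here.

```python
import hashlib
import hmac
import random

Q = 2**127 - 1  # toy prime standing in for the group order q (assumption)


def prp_indices(v1, c, n_blocks):
    """Return k = pi_{v1}(a) for a = 1..c.

    A seeded shuffle of {1..N} is used as a stand-in for the PRP pi;
    it is deterministic per key but NOT cryptographically secure.
    """
    perm = list(range(1, n_blocks + 1))
    random.Random(v1).shuffle(perm)
    return perm[:c]


def prf_coefficient(v2, i, j, k):
    """Return a coefficient in Z_q for block (i, j, k), keyed by v2.

    HMAC-SHA256 reduced modulo Q is used as a stand-in for the PRF f.
    """
    key = v2.to_bytes(16, "big")
    msg = f"{i}|{j}|{k}".encode()
    digest = hmac.new(key, msg, hashlib.sha256).digest()
    return int.from_bytes(digest, "big") % Q


def expand_challenge(c, v1, v2, i, j, n_blocks):
    """Map a token (c, v1, v2) to {block index k: coefficient h}."""
    return {k: prf_coefficient(v2, i, j, k)
            for k in prp_indices(v1, c, n_blocks)}


challenge = expand_challenge(c=4, v1=12345, v2=67890, i=1, j=2, n_blocks=500)
```

Because both the cloud and the auditor can run the same expansion from the short token, only \((c_{ij}, v_{ij,1}, v_{ij,2})\) needs to be transmitted rather than the full list of indexes and coefficients.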
3. System model and Security model of ID-BPAPP system
In this section, we first present the system model of the Identity-Based Batch Public Auditing scheme with Proxy Processing (ID-BPAPP) from the original paper [32]. The system components are described through the general structure of seven algorithms. We also give the security model of the ID-BPAPP system.
3.1 System Model
As depicted in Fig. 1, there are five kinds of entities in an ID-BPAPP scheme, i.e., the PKG, data Owners, a Proxy, multiple Clouds ({Cloudj}), and a TPA. The PKG initializes the system parameters and extracts private keys for data owners according to their identities. Data Owners delegate the Proxy to process their massive data before storing it in multiple clouds. The Proxy, which has abundant computation and bandwidth resources, helps data owners generate proxy data tags and upload them to the clouds, under the data owners’ special warrants. Multiple Clouds maintain powerful storage and computation resources to provide storage service for data owners. The TPA is a trusted third party auditor that offers batch data integrity verification on multiple clouds for the data owners.
Fig. 1. Architecture of ID-Batch Public Auditing with Proxy Processing.
3.2 System components of an ID-BPAPP scheme
- Setup \((1^{k}) \rightarrow (params, mpk, msk)\) is run by the PKG with the security parameter \(k\) as input. It outputs the public parameters \(params\) and the master key pair \((mpk, msk)\).
- Extract \((params, msk, ID_{i}) \rightarrow sk_{i}\) is executed by the PKG. It takes as input the parameters \(params\), the master private key \(msk\) and a data owner’s identity \(ID_{i}\), and outputs the private key \(sk_{i}\) for this owner. It also extracts the private key \(sk_{p}\) for the proxy with identity \(ID_{p}\).
- ProxyKeyGen \((params, ID_{i}, sk_{i}, ID_{p}, sk_{p}) \rightarrow u_{pi}\) is run by the proxy in interaction with a data owner. With the parameters \(params\) and its private key \(sk_{i}\) as input, the data owner with identity \(ID_{i}\) generates a warrant and the corresponding signature and sends them to the proxy. The proxy with identity \(ID_{p}\) then outputs the proxy secret key \(u_{pi}\) using its private key \(sk_{p}\).
- TagGen \((params, ID_{i}, sk_{p}, u_{pi}, mpk, \{\tilde{F}_{ijk}\}) \rightarrow \{\sigma_{ijk}\}\) is run by the proxy. It takes as input the public parameters \(params\), the owner’s identity \(ID_{i}\), the proxy’s individual private key \(sk_{p}\), the corresponding proxy secret key \(u_{pi}\), the master public key \(mpk\) and the owner’s data blocks \(\{\tilde{F}_{ijk}\}\) to be outsourced to the corresponding clouds. It outputs the proxy tags \(\{\sigma_{ijk}\}\) of these blocks.
- Challenge \((\{(i, j, k)\}) \rightarrow (chal, \{chal_{j}\})\) is executed by the third party auditor (TPA). It takes as input the data index set \(\{(i, j, k)\}\) and randomly selects some indexes as the challenge token \(chal\) for one instance. According to the specified cloud indexes \(\{j\}\), the challenge token \(chal\) is further divided into a set of tokens \(\{chal_{j}\}\), and only \(chal_{j}\) is forwarded to the corresponding \(j\)-th cloud \(CS_{j}\).
- ProofGen \((params, chal_{j}, \{ID_{i}\}, \{\sigma_{ijk}\}, \{\tilde{F}_{ijk}\}) \rightarrow P_{j}\) is run by cloud \(CS_{j}\). It takes as input the parameters \(params\), the received challenge token \(chal_{j}\), the specified set of data owners’ identities \(\{ID_{i}\}\), the set of tags \(\{\sigma_{ijk}\}\), and the blocks \(\{\tilde{F}_{ijk}\}\). The proof \(P_{j}\) is then generated for the challenge token \(chal_{j}\) and sent back to the TPA.
- Verify (𝑝𝑎𝑟𝑎𝑚𝑠, 𝑐ℎ𝑎𝑙,{𝐼𝐷𝑖},{𝑃𝑗}, 𝑚𝑝𝑘) → {0,1} is executed by TPA. It takes as input public parameters 𝑝𝑎𝑟𝑎𝑚𝑠, challenge token 𝑐ℎ𝑎𝑙, specified set of data owners’ identities {𝐼𝐷𝑖}, set of proofs {𝑃𝑗} from all challenged clouds, and the master public key 𝑚𝑝𝑘. 1 will be output if the proofs are valid, otherwise 0 is output.
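The seven algorithms above can be summarized as an abstract interface. The following Python sketch is illustrative only: the class name `IDBPAPPScheme`, the method and argument names, and the `ROLES` table are assumptions chosen to mirror the notation of this section; they are not an API taken from [32].

```python
from abc import ABC, abstractmethod


class IDBPAPPScheme(ABC):
    """Hypothetical interface mirroring the seven ID-BPAPP algorithms."""

    @abstractmethod
    def setup(self, k):
        """Security parameter k -> (params, mpk, msk)."""

    @abstractmethod
    def extract(self, params, msk, identity):
        """Identity ID_i (or ID_p) -> private key sk_i (or sk_p)."""

    @abstractmethod
    def proxy_key_gen(self, params, id_i, sk_i, id_p, sk_p):
        """Owner warrant plus proxy key -> proxy secret key u_pi."""

    @abstractmethod
    def tag_gen(self, params, id_i, sk_p, u_pi, mpk, blocks):
        """Data blocks -> proxy tags {sigma_ijk}."""

    @abstractmethod
    def challenge(self, index_set):
        """Index set {(i, j, k)} -> (chal, {chal_j})."""

    @abstractmethod
    def proof_gen(self, params, chal_j, ids, tags, blocks):
        """Challenge token chal_j -> proof P_j."""

    @abstractmethod
    def verify(self, params, chal, ids, proofs, mpk):
        """Proofs {P_j} -> 1 if valid, 0 otherwise."""


# Which entity runs which algorithm, as described in the text above.
ROLES = {
    "setup": "PKG",
    "extract": "PKG",
    "proxy_key_gen": "Proxy (interacting with the data owner)",
    "tag_gen": "Proxy",
    "challenge": "TPA",
    "proof_gen": "Cloud",
    "verify": "TPA",
}
```

A concrete scheme would subclass this interface and supply pairing-based implementations; the abstract class itself cannot be instantiated.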
3.3 Security Model
In an ID-BPAPP scheme, we assume that the PKG is trusted to execute the scheme, and that the proxy honestly generates tags but may mismanage data before tag generation. Meanwhile, original data owners might try to generate data tags themselves without the delegated proxy. Clouds could also hide data accidents for the sake of reputation and cost saving, and the TPA is trusted but curious about the data content. A secure ID-BPAPP scheme should satisfy three properties:
1) Proxy-protection: Data owners themselves are not able to masquerade as proxy to generate tags. Only proxy with authorization warrant could generate proxy tags.
2) Unforgeability: It is infeasible to fabricate valid data storage proofs to pass the auditing of TPA if any cloud data is modified or deleted.
3) Privacy-preserving: Real data content will not be revealed during the process of auditing.
According to these security requirements, we review the three formal definitions as follows:
- Definition 1 (Proxy-Protection): The scheme is proxy-protected if any probabilistic polynomial time (PPT) data owner wins the proxy Tag-Forge game below with only negligible probability.
- Setup: The challenger \(C_{1}\), playing the roles of PKG and TPA, first generates the master public/private key pair and the system parameters. It runs Extract to generate the private key \(sk_{p}\) for the proxy with identity \(ID_{p}\) and keeps it secret. The public, non-secret parameters are sent to the adversary \(A_{1}\), who acts as a data owner.
- Queries: Besides all hash functions, \(A_{1}\) could adaptively query Extract for the private key \(sk_{i}\) of any identity \(ID_{i}\) except the proxy’s \(ID_{p}\). Denote the index set of queried identities as \(S_{1}\) \((p \notin S_{1})\). It could query proxy tag secret keys \(u_{p^{\prime} i}\) for pairs \((ID_{p^{\prime}}, ID_{i})\), except for pairs having proxy \(ID_{p}\). Denote the index set of these pairs as \(S_{1}^{\prime}\) \(((p, i) \notin S_{1}^{\prime})\). For a data block \(\tilde{F}_{ijk}\), \(A_{1}\) could also adaptively query the proxy tag \(\sigma_{p^{\prime} ijk}\) for such an identity pair, except for pairs having \(ID_{p}\) as proxy. Denote the set of tuples of corresponding indexes and data as \(S_{1}^{\prime \prime}\), where \(\left(p, i, j, k, \tilde{F}_{ijk}\right) \notin S_{1}^{\prime \prime}\).
- Output: \(A_{1}\) wins the game if it creates by itself a valid proxy tag \(\sigma_{i^{*} j^{*} k^{*}}\) for a data block \(\tilde{F}_{i^{*} j^{*} k^{*}}\), for which it has queried neither the private key nor the proxy tag secret key of proxy \(ID_{p}\), i.e., \(p \notin S_{1}\), \(\left(p, i^{*}\right) \notin S_{1}^{\prime}\) and \(\left(p, i^{*}, j^{*}, k^{*}, \tilde{F}_{i^{*} j^{*} k^{*}}\right) \notin S_{1}^{\prime \prime}\).
- Definition 2 (Unforgeability): The scheme is unforgeable if any PPT cloud wins the Proofs-Forge game below with only negligible probability.
- Setup: The challenger C2 playing in the role of PKG and TPA, first generates private key 𝑠𝑘𝑝 for proxy of identity 𝐼𝐷𝑝 and keeps it secret. Those public and not secret parameters could be sent to the adversary A2 , who acts as clouds.
- First phase queries: Besides all hash functions, \(A_{2}\) could adaptively query Extract for the private key \(sk_{i}\) of any identity \(ID_{i}\) except the proxy’s \(ID_{p}\). Denote the index set of queried identities as \(S_{2}\) \((p \notin S_{2})\). It could query proxy tag secret keys \(u_{p^{\prime} i}\) for pairs \((ID_{p^{\prime}}, ID_{i})\), except for pairs having proxy \(ID_{p}\). Denote the index set of these pairs as \(S_{2}^{\prime}\) \(((p, i) \notin S_{2}^{\prime})\). For a data block \(\tilde{F}_{ijk}\), \(A_{2}\) could also adaptively query the proxy tag \(\sigma_{p^{\prime} ijk}\) for such an identity pair, except for pairs having \(ID_{p}\) as proxy. Denote the set of tuples of corresponding indexes and data as \(S_{2}^{\prime \prime}\), where \(\left(p, i, j, k, \tilde{F}_{ijk}\right) \notin S_{2}^{\prime \prime}\).
- Challenge: \(C_{2}\) generates the challenge set \(chal\) with an ordered collection of numbers \(\{c_{i^{*} j^{*}}\}\) to specify every block \(\tilde{F}_{i^{*} j^{*} k^{*}}\) on the \(j^{*}\)-th cloud for the owner with identity \(ID_{i^{*}}\), over the tuples \(\left\{\left(p, i^{*}, j^{*}, k_{n}^{*}\right) \mid 1 \leq n \leq c_{i^{*} j^{*}}\right\}\), where \(i^{*} \neq p\), \(\left(p, i^{*}\right) \notin S_{2}^{\prime}\) and \(\left(p, i^{*}, j^{*}, k_{n}^{*}, \tilde{F}_{i^{*} j^{*} k_{n}^{*}}\right) \notin S_{2}^{\prime \prime}\). \(chal\) is then sent to \(A_{2}\).
- Second phase queries : similar to First phase queries, denote index set of identities for Extract private key queries as 𝑆3, index set of identity pairs for proxy tag secret key queries as 𝑆3′ , tuple set of index and data for proxy tags queries as 𝑆3′′. We require that 𝑝 ∉ 𝑆2 ∪ 𝑆3, (𝑝, 𝑖) ∉ 𝑆2 ′ ∪ 𝑆3 ′ and \(\left(p, i, j, k, \tilde{F}_{i j k}\right) \notin S_{2}^{\prime \prime} \cup S_{3}^{\prime \prime}\)
- Output: \(A_{2}\) wins the game if it fabricates valid proofs \(\{P_{j}\}\) for the same challenge set \(chal\) on the specified set of data blocks.
- Definition 3 (Privacy-Preserving): The ID-BPAPP scheme is privacy-preserving against the TPA if any PPT TPA can extract an original block of the data owners from the “challenge-proof-verify” integrity auditing interactions with clouds with only negligible probability. In this definition, we require that a curious TPA cannot recover data blocks even though it is able to fulfill the task of auditing the integrity of cloud data.
4. Revisiting Yu et al.’s construction of an ID-BPAPP scheme
In this section, we revisit Yu et al.’s construction of an ID-BPAPP scheme, with the concrete designs of the seven algorithms in [32].
- Setup: The PKG uses this algorithm to generate a bilinear map \(e: G_{1} \times G_{1} \rightarrow G_{2}\) with two groups \(G_{1}\) and \(G_{2}\) of the same order \(q > 2^{k}\), where \(g\) is the generator of \(G_{1}\) and \(k\) is the security parameter. It also selects three cryptographic hash functions, \(H_{1}:\{0,1\}^{*} \rightarrow G_{1}\), \(H_{2}:\{0,1\}^{*} \rightarrow Z_{q}\), \(H_{3}: Z_{q} \times\{0,1\}^{*} \rightarrow Z_{q}\), a pseudo random permutation \(\pi: Z_{q} \times\{1, \cdots, N\} \rightarrow\{1, \cdots, N\}\) and a pseudo random function \(f: Z_{q} \times\{1, \cdots, N\} \rightarrow Z_{q}\). It picks a random \(x \in Z_{q}\) as the master private key \(msk\) and computes \(g^{x}\) as the master public key \(mpk\). The global parameters are \(\left(e, G_{1}, G_{2}, g, mpk, H_{1}, H_{2}, H_{3}, \pi, f\right)\).
- Extract: Given identity IDi , PKG extracts the identity-based private key as\(s k_{i}=H_{1}\left(I D_{i}\right)^{x}\) and returns to the data owner. For the proxy of identity IDp, the private key is extracted as \(s k_{p}=H_{1}\left(I D_{p}\right)^{x}\).
- ProxyKeyGen: The data owner with identity \(ID_{i}\) picks a random \(r_{i} \in Z_{q}\) and creates its proxy warrant \(\omega_{i}\) with signature \(U_{i}=sk_{i}^{r_{i} H_{2}\left(\omega_{i} \| R_{i}\right)}\), \(\xi_{i}=g^{r_{i}}\), where \(R_{i}=H_{1}\left(ID_{i}\right)^{r_{i}}\). \(\left(\omega_{i}, U_{i}, R_{i}, \xi_{i}\right)\) are sent to the proxy, the clouds and the TPA. Upon receiving the warrant \(\omega_{i}\), the TPA and the proxy could verify it with the signature as \(e\left(R_{i}, g\right)=e\left(H_{1}\left(ID_{i}\right), \xi_{i}\right)\), \(e\left(U_{i}, g\right)=e\left(R_{i}^{H_{2}\left(\omega_{i} \| R_{i}\right)}, mpk\right)\), and notify the data owner if either equation does not hold. The proxy generates the proxy secret key as \(u_{pi}=U_{i} \cdot sk_{p}^{r_{pi}}=H_{1}\left(ID_{i}\right)^{x \cdot r_{i} H_{2}\left(\omega_{i} \| R_{i}\right)} \cdot H_{1}\left(ID_{p}\right)^{x \cdot r_{pi}}\) by selecting a random \(r_{pi} \in Z_{q}\). It also computes \(R_{pi}=H_{1}\left(ID_{p}\right)^{r_{pi}}\), which is not secret and is sent to the TPA for future verification.
- TagGen: Data owner of 𝐼𝐷𝑖 first divides original data \(\tilde{F}_{i}\) into blocks \(\left\{\tilde{F}_{i j k}\right\}\), and computes each \(F_{i j k}=\tilde{F}_{i j k}+H_{2}\left(\tilde{F}_{i j k}\right)\). Data blocks \(\left\{\tilde{F}_{i j k}\right\}\), are outsourced to corresponding clouds while masked {𝐹𝑖𝑗𝑘} are sent to the proxy. Then the proxy generates proxy tag for each data block as
\(\sigma_{i j k}=s k_{p}^{H_{3}\left(i\|j\| k, n a m e_{i j k} \| t i m e_{i j k}\right)} \cdot u_{p i}^{F_{i j k}}\) (1)
where 𝑛𝑎𝑚𝑒𝑖𝑗𝑘 is the name of block \(\tilde{F}_{i j k}\), and 𝑡𝑖𝑚𝑒𝑖𝑗𝑘 is the time stamp when proxy generates the tag. All the tags {𝜎𝑖𝑗𝑘 } and the not secret 𝑅𝑝𝑖 will be transferred to corresponding clouds, which will not accept them and inform the owner unless the warrant 𝜔𝑖 and the proxy tag 𝜎𝑖𝑗𝑘 could be verified by having the following equations hold as
\(\begin{aligned} e\left(R_{i}, g\right) &= e\left(H_{1}\left(ID_{i}\right), \xi_{i}\right), \quad e\left(U_{i}, g\right)=e\left(R_{i}^{H_{2}\left(\omega_{i} \| R_{i}\right)}, mpk\right) \\ e\left(\sigma_{ijk}, g\right) &= e\left(H_{1}\left(ID_{p}\right)^{H_{3}\left(i\|j\|k,\, name_{ijk} \| time_{ijk}\right)} \cdot\left(R_{i}^{H_{2}\left(\omega_{i} \| R_{i}\right)} \cdot R_{pi}\right)^{F_{ijk}}, mpk\right) \end{aligned}\) (2)
- Challenge: For data owner of 𝐼𝐷𝑖 on 𝑗-th cloud’s data, TPA picks up number of challenged blocks 𝑐𝑖𝑗 < 𝑁 and random 𝑣𝑖𝑗,1 and 𝑣𝑖𝑗,2 ∈ 𝑍𝑞. Denote 𝑂𝑗 as index set of identities for owners having data on 𝑗 -th cloud. It generates the challenge token \(\operatorname{chal}_{j}=\left\{\left(c_{i j}, v_{i j, 1}, v_{i j, 2}\right)\right\}_{i \in O_{j}}\) , and sends it to 𝑗-th cloud.
- ProofGen: According to the challenge token \(chal_{j}=\left\{\left(c_{ij}, v_{ij,1}, v_{ij,2}\right)\right\}_{i \in O_{j}}\), the \(j\)-th challenged cloud first generates the index set \(\delta_{ij}\) of challenged blocks for the data owner with identity \(ID_{i}\), where each index \(k=\pi_{v_{ij,1}}\left(a_{ij}\right)\) \(\left(1 \leq a_{ij} \leq c_{ij}\right)\) according to the individual challenged number \(c_{ij}\) (e.g., assuming \(c_{ij}=4\), \(a_{ij} \in[1,4]\) could be permuted into 4 challenged block indexes \(k \in\{234, 8, 364, 25\}\) with \(\pi_{v_{ij,1}}(\cdot)\)), and then the corresponding coefficient \(h_{ijk}=f_{v_{ij,2}}(i, j, k) \in Z_{q}\). The proof of storage \(P_{j}\) includes the aggregate tag \(T_{j}^{\prime}\) and the masked data proof \(\left\{F_{ij}^{\prime}\right\}\) for its data owners with identities indexed by \(O_{j}\):
\(T_{j}^{\prime}=\prod_{i \in O_{j}} \prod_{k \in \delta_{ij}} \sigma_{ijk}^{h_{ijk}}, \quad F_{ij}^{\prime}=\sum_{k \in \delta_{ij}} h_{ijk} \cdot F_{ijk}\)
where \(F_{ijk}=\tilde{F}_{ijk}+H_{2}\left(\tilde{F}_{ijk}\right)\). \(P_{j}=\left(T_{j}^{\prime},\left\{F_{ij}^{\prime}\right\}_{i \in O_{j}}\right)\) will be sent to the TPA.
- Verify: After receiving all the proofs {𝑃𝑗} from challenged clouds, the TPA denotes 𝑂 = ⋃𝑗∈𝐽 𝑂𝑗 as identity index set of all the challenged owners according to challenge tokens \(\left\{c h a l_{j}\right\}=\left\{\left\{\left(c_{i j}, v_{i j, 1}, v_{i j, 2}\right)\right\}_{i \in O_{j}}\right\}_{j \in J}\) , and computes index set of all challenged blocks by \(\{k\}=\left\{\pi_{v_{i j, 1}}\left(a_{i j}\right) | 1 \leq a_{i j} \leq c_{i j}\right\}\) and co-efficient set \(\left\{h_{i j k}\right\}=\left\{f_{v_{i j, 2}}(i, j, k)\right\}\) , as in ProofGen. With all valid set of warrants {𝜔𝑖} and corresponding signatures \(\left\{\left(U_{i}, R_{i}, \xi_{i}\right)\right\}\) from data owners, together with blocks’ names and time stamps \(\left\{\left(n a m e_{i j k}, t i m e_{i j k}\right)\right\}\), TPA is able to audit data integrity as :
\(\begin{aligned} e\left(\prod_{j \in J} T_{j}^{\prime}, g\right)=& e\left(\prod_{i \in O}\left(R_{i}^{H_{2}\left(\omega_{i} \| R_{i}\right)} \cdot R_{p i}\right)^{\sum_{j \in J} F_{i j}^{\prime}}\right.\\ &\left.\cdot H_{1}\left(I D_{p}\right)^{\sum_{i \in O} \sum_{j \in J} \sum_{k \in \delta_{i j}} h_{i j k} \cdot H_{3}\left(i\|j\| k, n a m e_{i j k} \| t i m e_{i j k}\right)}, m p k\right) \end{aligned}\) (3)
It outputs 1 (valid) if the above equation holds and 0 (invalid) otherwise.
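As a sanity check, the correctness of equation (3) can be traced back to the structure of the keys. The following derivation is a sketch that uses only the definitions of this section and the bilinearity of \(e\), assuming honestly generated tags and proofs. Since \(sk_{p}=H_{1}(ID_{p})^{x}\) and \(u_{pi}=\left(R_{i}^{H_{2}(\omega_{i} \| R_{i})} \cdot R_{pi}\right)^{x}\), every tag of equation (1) is an \(x\)-th power, and so is the product of the aggregate tags:

\(\begin{aligned} \sigma_{ijk} &=\left(H_{1}\left(ID_{p}\right)^{H_{3}\left(i\|j\|k,\, name_{ijk} \| time_{ijk}\right)} \cdot\left(R_{i}^{H_{2}\left(\omega_{i} \| R_{i}\right)} \cdot R_{pi}\right)^{F_{ijk}}\right)^{x} \\ \prod_{j \in J} T_{j}^{\prime} &=\left(H_{1}\left(ID_{p}\right)^{\sum_{i \in O} \sum_{j \in J} \sum_{k \in \delta_{ij}} h_{ijk} \cdot H_{3}\left(i\|j\|k,\, name_{ijk} \| time_{ijk}\right)} \cdot \prod_{i \in O}\left(R_{i}^{H_{2}\left(\omega_{i} \| R_{i}\right)} \cdot R_{pi}\right)^{\sum_{j \in J} F_{ij}^{\prime}}\right)^{x} \end{aligned}\)

Applying \(e\left(A^{x}, g\right)=e\left(A, g^{x}\right)=e(A, mpk)\) to the second line gives exactly the right-hand side of equation (3). Note that the data enters the check only through the masked sums \(F_{ij}^{\prime}\).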
5. On the security of Yu et al.’s construction of an ID-BPAPP scheme
According to its security analysis, Yu et al.’s construction of an ID-BPAPP scheme in [32] is supposed to satisfy the security properties of data proof unforgeability and tag generation proxy-protection. However, the scheme suffers from two security issues, as analyzed in the following.
5.1 First issue: generating valid proof without original data
In Yu et al.’s ID-BPAPP scheme, the TPA utilizes masked data proof to evaluate the original data integrity on the cloud. This design indeed makes original data content invisible to TPA to allow privacy-preserving auditing, but also leaves the room for malicious cloud to launch data attack as follows.
In ProofGen, for the data part \(\left\{F_{ij}^{\prime}\right\}_{i \in O_{j}}\) of proof \(P_{j}\), an honest cloud takes the original data \(\tilde{F}_{ijk}\) as input to get the masked data \(F_{ijk}=\tilde{F}_{ijk}+H_{2}\left(\tilde{F}_{ijk}\right)\), and combines it with the fresh challenge coefficients \(\left\{h_{ijk}\right\}\) as \(F_{ij}^{\prime}=\sum_{k \in \delta_{ij}} h_{ijk} \cdot F_{ijk}\). Clearly, the fresh challenge coefficients are combined with the masked data rather than directly with the original data. Therefore, after generating the tag part \(T_{j}^{\prime}\) from correct tags, a malicious cloud is able to generate a valid integrity proof \(P_{j}=\left(T_{j}^{\prime},\left\{F_{ij}^{\prime}\right\}_{i \in O_{j}}\right)\) without having to store the original data \(\tilde{F}_{ijk}\), simply by combining the pre-computed masked data \(F_{ijk}\) with the challenge coefficients. In this way, a malicious cloud can pass the TPA’s integrity auditing even when the original data \(\tilde{F}_{ijk}\) has been modified to \(\tilde{F}_{ijk}^{*}\) or deleted.
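The argument above can be illustrated with a small Python sketch of the data part of the proof. It is a toy model under stated assumptions: arithmetic is done modulo a toy prime `Q` standing in for \(q\), SHA-256 reduced modulo `Q` stands in for \(H_{2}\), the coefficients are fixed numbers rather than PRF outputs, and the helper names `h2`, `mask` and `data_proof` are hypothetical.

```python
import hashlib

Q = 2**127 - 1  # toy prime standing in for the group order q (assumption)


def h2(block):
    """Stand-in for H_2: hash a block (an int below 2**256) into Z_q."""
    digest = hashlib.sha256(block.to_bytes(32, "big")).digest()
    return int.from_bytes(digest, "big") % Q


def mask(block):
    """Masked block F = F~ + H_2(F~) mod q, as computed in TagGen."""
    return (block + h2(block)) % Q


def data_proof(masked_blocks, coeffs):
    """F' = sum_k h_k * F_k mod q, the data part of the proof P_j."""
    return sum(h * f for h, f in zip(coeffs, masked_blocks)) % Q


originals = [1111, 2222, 3333, 4444]   # original blocks F~
coeffs = [5, 7, 9, 2]                  # fresh challenge coefficients h

# Honest cloud: recomputes the masks from the stored original blocks.
honest = data_proof([mask(b) for b in originals], coeffs)

# Malicious cloud: caches the masked blocks once, then drops the originals.
cached_masks = [mask(b) for b in originals]
del originals
cheating = data_proof(cached_masks, coeffs)
```

Both computations yield the same value for any choice of coefficients, so the fresh challenge gives the auditor no evidence that the original blocks are still stored.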
5.2 Second issue: recovering private key of proxy and proxy tag secret key
With the proxy-protection property, only a proxy with authorization should be able to generate the data tags for integrity auditing. The analysis below shows, however, that anyone who can access the data and tags can feasibly recover the proxy’s private key and thus impersonate the proxy to generate data tags.
In TagGen, for data \(\tilde{F}_{ijk}\), the tag \(\sigma_{ijk}=sk_{p}^{H_{3}\left(i\|j\|k,\, name_{ijk} \| time_{ijk}\right)} \cdot u_{pi}^{F_{ijk}}\) is generated by the proxy with its individual private key \(sk_{p}\) and proxy tag secret key \(u_{pi}\), and the tag is then uploaded to the cloud. Afterwards, a malicious cloud or a curious data owner with identity \(ID_{i}\) can retrieve two arbitrary data blocks \(\left(\tilde{F}_{ijk_{1}}, \tilde{F}_{ijk_{2}}\right)\) with the corresponding tags \(\left(\sigma_{ijk_{1}}, \sigma_{ijk_{2}}\right)\), and perform the computation:
\(sk_{p}=\left(\sigma_{ijk_{1}}^{1 / F_{ijk_{1}}} \Big/ \sigma_{ijk_{2}}^{1 / F_{ijk_{2}}}\right)^{\frac{F_{ijk_{1}} F_{ijk_{2}}}{H_{3}\left(i\|j\|k_{1},\, name_{ijk_{1}} \| time_{ijk_{1}}\right) F_{ijk_{2}}-H_{3}\left(i\|j\|k_{2},\, name_{ijk_{2}} \| time_{ijk_{2}}\right) F_{ijk_{1}}}}\)
\(u_{pi}=\left(\sigma_{ijk_{1}}^{1 / H_{3}\left(i\|j\|k_{1},\, name_{ijk_{1}} \| time_{ijk_{1}}\right)} \Big/ \sigma_{ijk_{2}}^{1 / H_{3}\left(i\|j\|k_{2},\, name_{ijk_{2}} \| time_{ijk_{2}}\right)}\right)^{EX}\)
\(EX=\frac{H_{3}\left(i\|j\|k_{1},\, name_{ijk_{1}} \| time_{ijk_{1}}\right) H_{3}\left(i\|j\|k_{2},\, name_{ijk_{2}} \| time_{ijk_{2}}\right)}{F_{ijk_{1}} H_{3}\left(i\|j\|k_{2},\, name_{ijk_{2}} \| time_{ijk_{2}}\right)-F_{ijk_{2}} H_{3}\left(i\|j\|k_{1},\, name_{ijk_{1}} \| time_{ijk_{1}}\right)}\)
where the masked data are \(\left(F_{ijk_{1}}, F_{ijk_{2}}\right)=\left(\tilde{F}_{ijk_{1}}+H_{2}\left(\tilde{F}_{ijk_{1}}\right), \tilde{F}_{ijk_{2}}+H_{2}\left(\tilde{F}_{ijk_{2}}\right)\right)\).
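The recovery computation only relies on exponent arithmetic modulo the group order, so it can be checked in any cyclic group of prime order. The following Python sketch does this in a toy group, assuming the order-1019 subgroup of squares modulo the safe prime 2039 in place of \(G_{1}\) and small fixed integers in place of the \(H_{3}\) values and masked blocks. The helper names are hypothetical, and the modular inverse via `pow(a, -1, m)` requires Python 3.8 or later.

```python
# Toy group: the order-Q subgroup of squares modulo the safe prime P = 2*Q + 1.
P, Q, G = 2039, 1019, 4


def inv_q(a):
    """Inverse of an exponent modulo the group order Q."""
    return pow(a % Q, -1, Q)


def div(a, b):
    """Group division a / b modulo P."""
    return a * pow(b, -1, P) % P


def make_tag(sk_p, u_pi, h, F):
    """Tag in the form of equation (1): sigma = sk_p^h * u_pi^F."""
    return pow(sk_p, h, P) * pow(u_pi, F, P) % P


def recover_keys(s1, h1, F1, s2, h2, F2):
    """Recover (sk_p, u_pi) from two tags whose exponents are known."""
    # sigma^(1/F) = sk_p^(h/F) * u_pi, so the quotient cancels u_pi.
    base_sk = div(pow(s1, inv_q(F1), P), pow(s2, inv_q(F2), P))
    sk_p = pow(base_sk, F1 * F2 * inv_q(h1 * F2 - h2 * F1) % Q, P)
    # sigma^(1/h) = sk_p * u_pi^(F/h), so the quotient cancels sk_p.
    base_u = div(pow(s1, inv_q(h1), P), pow(s2, inv_q(h2), P))
    u_pi = pow(base_u, h1 * h2 * inv_q(F1 * h2 - F2 * h1) % Q, P)
    return sk_p, u_pi


sk_p, u_pi = pow(G, 123, P), pow(G, 456, P)  # secrets held by the proxy
h1, F1, h2, F2 = 11, 7, 13, 5                # public H_3 values and masked blocks
s1 = make_tag(sk_p, u_pi, h1, F1)
s2 = make_tag(sk_p, u_pi, h2, F2)
recovered = recover_keys(s1, h1, F1, s2, h2, F2)
forged = make_tag(recovered[0], recovered[1], 17, 9)  # tag for a new block
```

The attack needs the two exponent denominators to be nonzero modulo the group order, which holds for all but a negligible fraction of block pairs when the order is large.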
With the recovered proxy private key \(sk_{p}\) and proxy tag secret key \(u_{pi}\), three kinds of security problems could arise. First, for a new block \(\tilde{F}_{ijk_{3}}\), the data owner could generate the proxy tag as \(\sigma_{ijk_{3}}=sk_{p}^{H_{3}\left(i\|j\|k_{3},\, name_{ijk_{3}} \| time_{ijk_{3}}\right)} \cdot u_{pi}^{F_{ijk_{3}}}\) without the proxy’s processing. This tag keeps equations (2) and (3) valid and finally helps the data pass the TPA’s auditing, so the proxy-protection property cannot be guaranteed. Second, if the original block is modified to \(\tilde{F}_{ijk_{3}}^{*}\), the malicious cloud could generate a valid tag as \(\sigma_{ijk_{3}}^{*}=sk_{p}^{H_{3}\left(i\|j\|k_{3},\, name_{ijk_{3}} \| time_{ijk_{3}}\right)} \cdot u_{pi}^{F_{ijk_{3}}^{*}}\), where \(F_{ijk_{3}}^{*}=\tilde{F}_{ijk_{3}}^{*}+H_{2}\left(\tilde{F}_{ijk_{3}}^{*}\right)\), without the data owner or the proxy being aware of it. Such tags also keep equations (2) and (3) valid and help to generate a valid integrity proof, so the unforgeability property cannot be guaranteed because the data modification goes undetected. This leads to a serious data loss situation: the cloud could keep only one block and delete the rest of the data, pretending that all the blocks are equal in value and simply computing valid proxy tags with all their indexes and information. Third, the digital property belonging to the proxy is at great risk of illegal access, because other entities have recovered the proxy’s individual private key.
6. Our improved construction of an ID-BPAPP scheme
- Setup: The PKG uses this algorithm to generate a bilinear map \(e: G_{1} \times G_{1} \rightarrow G_{2}\) with two groups \(G_{1}\) and \(G_{2}\) of the same order \(q > 2^{k}\), where \(g\) is the generator of \(G_{1}\) and \(k\) is the security parameter. It also selects four cryptographic hash functions, \(H_{1}:\{0,1\}^{*} \rightarrow G_{1}\), \(H_{2}:\{0,1\}^{*} \rightarrow Z_{q}\), \(H_{3}: Z_{q} \times\{0,1\}^{*} \rightarrow Z_{q}\), \(H_{4}: Z_{q} \times\{0,1\}^{*} \rightarrow G_{1}\), a pseudo random permutation \(\pi: Z_{q} \times\{1, \cdots, N\} \rightarrow\{1, \cdots, N\}\) and a pseudo random function \(f: Z_{q} \times\{1, \cdots, N\} \rightarrow Z_{q}\). It picks a random \(x \in Z_{q}\) as the master private key \(msk\) and computes \(g^{x}\) as the master public key \(mpk\). The global parameters are \(\left(e, G_{1}, G_{2}, g, mpk, H_{1}, H_{2}, H_{3}, H_{4}, \pi, f\right)\).
- Extract: Given identity \(ID_{i}\), the PKG extracts the identity-based private key as \(sk_{i}=H_{1}\left(ID_{i}\right)^{x}\) and returns it to the data owner. For the proxy with identity \(ID_{p}\), the private key is extracted as \(sk_{p}=H_{1}\left(ID_{p}\right)^{x}\).
- ProxyKeyGen: The data owner with identity \(ID_{i}\) picks a random \(r_{i} \in Z_{q}\) and creates its proxy warrant \(\omega_{i}\) with its signature \(U_{i}=sk_{i}^{r_{i} H_{2}\left(\omega_{i} \| R_{i}\right)}\), \(\xi_{i}=g^{r_{i}}\), where \(R_{i}=H_{1}\left(ID_{i}\right)^{r_{i}}\). \(\left(\omega_{i}, U_{i}, R_{i}, \xi_{i}\right)\) are sent to the proxy, the clouds and the TPA. Upon receiving the warrant \(\omega_{i}\), the TPA and the proxy could verify it with the signature as \(e\left(R_{i}, g\right)=e\left(H_{1}\left(ID_{i}\right), \xi_{i}\right)\), \(e\left(U_{i}, g\right)=e\left(R_{i}^{H_{2}\left(\omega_{i} \| R_{i}\right)}, mpk\right)\), and notify the data owner if either equation does not hold. The proxy generates the proxy secret key as \(u_{pi}=U_{i} \cdot sk_{p}^{r_{pi}}=H_{1}\left(ID_{i}\right)^{x \cdot r_{i} H_{2}\left(\omega_{i} \| R_{i}\right)} \cdot H_{1}\left(ID_{p}\right)^{x \cdot r_{pi}}\) by picking a random \(r_{pi} \in Z_{q}\). It also computes \(R_{pi}=H_{1}\left(ID_{p}\right)^{r_{pi}}\), which is not secret and is sent to the TPA for future verification.
- TagGen: Data owner of 𝐼𝐷𝑖 first divides original data \(\tilde{F}_{i}\) into blocks \(\left\{F_{i j k}\right\}\), where \(F_{i j k} \in Z_{q}\). They are outsourced to corresponding clouds and sent to the proxy. For each data block, proxy generates tag \(\sigma_{i j k}=\left(T_{i j k}, S\right)\) as
\(\begin{aligned} T_{ijk} &=\left(sk_{p} \cdot u_{pi}\right)^{H_{3}\left(i\|j\|k,\ \text{name}_{i} \| \text{time}_{ijk}\right)+F_{ijk}} \cdot H_{4}\left(i\|j\|k,\ \text{name}_{i} \| \text{time}_{ijk} \| S\right)^{\eta} \\ S &=g^{\eta} \end{aligned}\) (4)
where \(\text{name}_{i}\) is the name of file \(\tilde{F}_{i}\), \(\text{time}_{ijk}\) is the time stamp at which the proxy generates the tag, and \(\eta \in Z_{q}\). All the tags \(\{\sigma_{ijk}\}\) and the non-secret \(R_{pi}\) are transferred to the corresponding clouds. A cloud accepts them only if the warrant \(\omega_{i}\) and the proxy tag \(\sigma_{ijk}\) pass the following equations; otherwise it rejects them and informs the owner:
\(\begin{aligned} &e\left(R_{i}, g\right)=e\left(H_{1}\left(ID_{i}\right), \xi_{i}\right), \quad e\left(U_{i}, g\right)=e\left(R_{i}^{H_{2}\left(\omega_{i} \| R_{i}\right)}, mpk\right) \\ &e\left(T_{ijk}, g\right)=e\left(\left(H_{1}\left(ID_{p}\right) \cdot R_{i}^{H_{2}\left(\omega_{i} \| R_{i}\right)} \cdot R_{pi}\right)^{H_{3}\left(i\|j\|k,\ \text{name}_{i} \| \text{time}_{ijk}\right)+F_{ijk}}, mpk\right) \cdot e\left(H_{4}\left(i\|j\|k,\ \text{name}_{i} \| \text{time}_{ijk} \| S\right), S\right) \end{aligned}\) (5)
- Challenge: For the data of owner \(ID_{i}\) on the \(j\)-th cloud, the TPA picks the number of challenged blocks \(c_{ij}<N\), random \(v_{ij,1}, v_{ij,2} \in Z_{q}\) and the masking element \(M=mpk^{w}\) for \(w \in Z_{q}\). Denote by \(O_{j}\) the index set of identities of owners having data on cloud \(CS_{j}\). The TPA generates the challenge token \(chal_{j}=\left(\left\{\left(c_{ij}, v_{ij,1}, v_{ij,2}\right)\right\}_{i \in O_{j}}, M\right)\) and sends it to the cloud.
- ProofGen: According to the challenge token \(chal_{j}=\left(\left\{\left(c_{ij}, v_{ij,1}, v_{ij,2}\right)\right\}_{i \in O_{j}}, M\right)\), the \(j\)-th challenged cloud first generates the index set \(\delta_{ij}\) of challenged blocks for the data owner of \(ID_{i}\), where each index is \(k=\pi_{v_{ij,1}}\left(a_{ij}\right)\) \(\left(1 \leq a_{ij} \leq c_{ij}\right)\) according to the individual challenged number \(c_{ij}\) (e.g., assuming \(c_{ij}=4\), \(a_{ij} \in[1,4]\) could be permuted into the 4 challenged block indexes \(k \in\{234, 8, 364, 25\}\) by \(\pi_{v_{ij,1}}(\cdot)\)), and then the corresponding coefficient \(h_{ijk}=f_{v_{ij,2}}(i, j, k)\). The proof of storage \(P_{j}\) includes the aggregate tag \(T_{j}^{\prime}\), \(S^{\prime}\) and the masked data proof \(M_{j}^{\prime}\) for the data owners whose identity indexes are in \(O_{j}\):
\(T_{j}^{\prime}=\prod_{i \in O_{j}} \Pi_{k \in \delta_{i j}} T_{i j k}^{h_{i j k}}, S^{\prime}=S, M_{j}^{\prime}=e\left(\prod_{i \in O_{j}}\left(H_{1}\left(I D_{p}\right) \cdot\left(R_{i}^{H_{2}\left(\omega_{i} \| R_{i}\right)} \cdot R_{p i}\right)\right)^{F_{i j}^{\prime}}, M\right)\)
where \(F_{ij}^{\prime}=\sum_{k \in \delta_{ij}} h_{ijk} \cdot F_{ijk}\). The proof \(P_{j}=\left(T_{j}^{\prime}, S^{\prime}, M_{j}^{\prime}\right)\) is sent to the TPA. (The cloud could send the proof to the TPA over a secure channel, or prevent modification with an identity-based signature.)
- Verify: After receiving all the proofs \(\{P_{j}\}\) from the challenged clouds, the TPA denotes \(O=\bigcup_{j \in J} O_{j}\) as the identity index set of all challenged owners according to the challenge tokens \(\left\{chal_{j}\right\}=\left\{\left(\left\{\left(c_{ij}, v_{ij,1}, v_{ij,2}\right)\right\}_{i \in O_{j}}, M\right)\right\}_{j \in J}\), where \(M=mpk^{w}\), and computes the index set of all challenged blocks by \(\{k\}=\left\{\pi_{v_{ij,1}}\left(a_{ij}\right) \mid 1 \leq a_{ij} \leq c_{ij}\right\}\) and the coefficient set \(\left\{h_{ijk}\right\}=\left\{f_{v_{ij,2}}(i, j, k)\right\}\), as in ProofGen. With all valid warrants \(\{\omega_{i}\}\) and the corresponding signatures \(\{(U_{i}, R_{i}, \xi_{i})\}\) from the data owners, together with the files' names and the blocks' time stamps \(\{(\text{name}_{i}, \text{time}_{ijk})\}\), the TPA is able to audit data integrity as:
\(\begin{aligned} &e\left(\prod_{j \in J} T_{j}^{\prime}, g^{w}\right)=e\left(\prod_{i \in O}\left(H_{1}\left(ID_{p}\right) \cdot R_{i}^{H_{2}\left(\omega_{i} \| R_{i}\right)} \cdot R_{pi}\right)^{L_{i}}, M\right) \\ &\quad \cdot e\left(\prod_{i \in O} \prod_{j \in J} \prod_{k \in \delta_{ij}} H_{4}\left(i\|j\|k,\ \text{name}_{i} \| \text{time}_{ijk} \| S^{\prime}\right)^{h_{ijk}}, S^{\prime w}\right) \cdot \prod_{j \in J} M_{j}^{\prime} \\ &\text{where } L_{i}=\sum_{j \in J} \sum_{k \in \delta_{ij}} h_{ijk} \cdot H_{3}\left(i\|j\|k,\ \text{name}_{i} \| \text{time}_{ijk}\right) \end{aligned}\) (6)
It outputs 1 (valid) if the above equation holds and 0 (invalid) otherwise. Correctness:
\(\begin{aligned} LHS &= e\left(\prod_{j \in J} \prod_{i \in O_{j}} \prod_{k \in \delta_{ij}}\left(H_{1}\left(ID_{p}\right) \cdot R_{i}^{H_{2}\left(\omega_{i} \| R_{i}\right)} \cdot R_{pi}\right)^{\left(H_{3}\left(i\|j\|k,\ \text{name}_{i} \| \text{time}_{ijk}\right)+F_{ijk}\right) h_{ijk}}, \left(g^{x}\right)^{w}\right) \\ &\quad \cdot e\left(\prod_{j \in J} \prod_{i \in O_{j}} \prod_{k \in \delta_{ij}} H_{4}\left(i\|j\|k,\ \text{name}_{i} \| \text{time}_{ijk} \| S\right)^{h_{ijk}}, \left(g^{\eta}\right)^{w}\right) \end{aligned}\)
\(\begin{aligned} &= e\left(\prod_{i \in O}\left(H_{1}\left(ID_{p}\right) \cdot R_{i}^{H_{2}\left(\omega_{i} \| R_{i}\right)} \cdot R_{pi}\right)^{\sum_{j \in J} \sum_{k \in \delta_{ij}} h_{ijk} \cdot H_{3}\left(i\|j\|k,\ \text{name}_{i} \| \text{time}_{ijk}\right)}, M\right) \\ &\quad \cdot e\left(\prod_{i \in O} \prod_{j \in J} \prod_{k \in \delta_{ij}} H_{4}\left(i\|j\|k,\ \text{name}_{i} \| \text{time}_{ijk} \| S^{\prime}\right)^{h_{ijk}}, S^{\prime w}\right) \\ &\quad \cdot \prod_{j \in J} e\left(\prod_{i \in O_{j}}\left(H_{1}\left(ID_{p}\right) \cdot R_{i}^{H_{2}\left(\omega_{i} \| R_{i}\right)} \cdot R_{pi}\right)^{\sum_{k \in \delta_{ij}} h_{ijk} \cdot F_{ijk}}, M\right)=RHS \end{aligned}\)
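The algebra behind the warrant checks and tag equation (5) can be sanity-checked with a toy model. The sketch below is not a secure or faithful implementation: it assumes a "pairing in the exponent" model in which every element of \(G_1\) is represented by its discrete logarithm modulo a small prime, so that group multiplication becomes addition and \(e(g^{u}, g^{v})\) is tracked as \(u \cdot v\). The modulus `Q`, the SHA-256 stand-ins for \(H_1\) to \(H_4\), and the helper names `make_tag` and `verify_tag` are illustrative assumptions, and only a single owner and a single block are modelled.

```python
import hashlib
import secrets

Q = 2**61 - 1  # toy prime order (assumption); the paper uses a 160-bit order


def h(label, *parts):
    """Stand-in for H1..H4: discrete log of the hash output, as an element of Z_Q."""
    data = "|".join([label, *map(str, parts)]).encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % Q


def pair(u, v):
    """e(g^u, g^v) = e(g, g)^(u*v), tracked by its exponent."""
    return u * v % Q


def rand_zq():
    return secrets.randbelow(Q - 1) + 1


def make_tag(block, idx="1|1|1", meta="name|time"):
    """Run Setup, Extract, ProxyKeyGen and TagGen for one block (logs only)."""
    x = rand_zq()                                   # msk, so mpk = g^x
    h_i, h_p = h("H1", "owner"), h("H1", "proxy")   # logs of H1(ID_i), H1(ID_p)
    sk_i, sk_p = h_i * x % Q, h_p * x % Q           # Extract
    r_i, r_pi, eta = rand_zq(), rand_zq(), rand_zq()
    R_i = h_i * r_i % Q                             # R_i = H1(ID_i)^r_i
    h2 = h("H2", "warrant", R_i)
    U_i = sk_i * r_i * h2 % Q                       # warrant signature
    u_pi = (U_i + sk_p * r_pi) % Q                  # u_pi = U_i * sk_p^r_pi
    R_pi = h_p * r_pi % Q
    S = eta                                         # S = g^eta
    h3, h4 = h("H3", idx, meta), h("H4", idx, meta, S)
    T = ((sk_p + u_pi) * (h3 + block) + h4 * eta) % Q   # equation (4)
    # In this toy model "public" values are logs, so nothing here is actually hidden.
    return dict(mpk=x, h_i=h_i, h_p=h_p, R_i=R_i, xi=r_i, U_i=U_i, h2=h2,
                R_pi=R_pi, S=S, h3=h3, h4=h4, T=T)


def verify_tag(pub, block):
    """Check the two warrant equations and the tag equation (5)."""
    warrant_ok = (pair(pub["R_i"], 1) == pair(pub["h_i"], pub["xi"])
                  and pair(pub["U_i"], 1) == pair(pub["R_i"] * pub["h2"] % Q, pub["mpk"]))
    base = (pub["h_p"] + pub["R_i"] * pub["h2"] + pub["R_pi"]) % Q
    rhs = (pair(base * (pub["h3"] + block) % Q, pub["mpk"])
           + pair(pub["h4"], pub["S"])) % Q
    return warrant_ok and pair(pub["T"], 1) == rhs
```

Calling `verify_tag(make_tag(F), F)` checks that a tag built as in equation (4) passes equation (5), while verifying against a different block value fails except with negligible probability in this model.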
6.1 Security analysis of improved scheme
Based on the system model of an ID-BPAPP scheme (Subsection 3.1) and corresponding system components (Subsection 3.2) and security model (Subsection 3.3), in this section, we prove security of our improved scheme. Compared with [27]’s security analysis, we also utilize Coron [31]’s random oracle model to define the interactions between adversary of our scheme and challenger, but with refined oracles for hash and tag queries. To prevent security flaws in [32], corresponding security reduction methods are also re-designed.
There are |𝑂| owners, \(c_{i^{*} j^{*}}\) challenged blocks on the corresponding cloud for a specified owner, with \(c^{*}=\left(\sum_{i^{*} \in O, j^{*} \in J} c_{i^{*} j^{*}}\right)^{-1}\), and \(\widehat{N}\) selected identities. The adversary makes \(q_{H}\) hash, \(q_{E}\) Extract, \(q_{P}\) ProxyKeyGen and \(q_{T}\) TagGen oracle queries. We assume that each inversion or exponentiation on \(G_{1}\) takes time \(t_{G_{1}}\), likewise \(t_{G_{2}}\) on \(G_{2}\), and that a pairing takes \(t_{e}\). \(\hat{e}\) denotes the base of the natural logarithm.
Our security analysis below shows that CDH problem will be solved if breaking our scheme through forging valid proxy tag, BDH problem will be solved if fabricating storage proof without rejection, and DL problem will be solved if breaking our scheme through retrieving data value during auditing, with non-negligible probability under polynomial time.
Theorem 1 (Proxy-Protection) If there exists Probabilistic Polynomial Time (PPT) (𝑡1, 𝜖1)-adversary A1 who could generate valid proxy tag without proxy individual private key in our Sec-ID-BPAPP, then our scheme is proxy-protective when challenger C1 could solve CDH problem with non-negligibility \(\epsilon_{1}(\widehat{N}-1)^{q_{E}} /\left(\hat{e} \hat{N}^{q_{E}+q_{P}}\left(q_{E}+q_{T}+1\right)\right)\) within PPT time \(t_{1}+t_{G_{1}} \cdot\left(q_{H}+q_{E}+q_{P}+4 q_{T}+5\right)\).
Proof: There are \(\widehat{N}\) number of selected identities \(\left\{I D_{i}\right\}_{i \in O}\) having the proxy 𝐼𝐷𝑝. The original file \(\left\{\tilde{F}_{i}\right\}_{i \in O}\) will be split into blocks \(\left\{F_{i j k}\right\}_{i \in O, j \in J, k \in \delta_{i j}}\) before being outsourced on clouds \(\left\{C S_{j}\right\}_{j \in J}\).
- Setup: C1 plays the role of PKG and chooses a random \(a \in Z_{q}\); the master private/public key pair is then \((msk, mpk)=\left(a, g^{a}\right)\) for generator \(g \in G_{1}\). It also picks a random \(b \in Z_{q}\). The CDH instance is: given \(g^{a}, g^{b} \in G_{1}\), compute \(g^{ab}\). Although A1 is not allowed to query the target proxy tag secret key \(u_{pi}\), it can access \(R_{pi}=H_{1}\left(ID_{p}\right)^{r_{pi}}\), with C1 picking \(r_{pi} \in Z_{q}\).
C1 answers query by maintaining input and output list for every oracle. Especially, output is retrieved from existing record of same input, otherwise is generated as follows and C1 builds new record in the corresponding list.
- Hash function Oracle: \(H_{2}\) and \(H_{3}\) work as normal hash functions.
\(H_{1}\)-oracle: C1 answers with \(g^{y_{i}}\) for \(y_{i} \in Z_{q}\) if \(i \neq p\), and with \(y_{i}=b\) for \(i=p\).
\(H_{4}\)-oracle: C1 answers with \(g^{z_{ijk}}\) for \(z_{ijk} \in Z_{q}\).
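- Extract-oracle: C1 answers \(sk_{i}=\left(g^{a}\right)^{y_{i}}\) using \(y_{i}\) from the \(H_{1}\)-oracle if \(i \neq p\); otherwise it aborts. Denote the index set of identities whose private keys are extracted as \(S_{1}\) \(\left(p \notin S_{1}\right)\).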
- ProxyKeyGen-oracle: C1 answers \(u_{p^{\prime} i}=U_{i} \cdot\left(g^{a}\right)^{y_{p^{\prime}} \cdot r_{p^{\prime} i}}\) using the \(H_{1}\)-oracle and \(r_{p^{\prime} i} \in Z_{q}\) if \(i \neq p\); otherwise it aborts. Denote the set of queried identity index pairs as \(S_{1}^{\prime}\) \(\left((p, i) \notin S_{1}^{\prime}\right)\).
- Tag-oracle: If \(p^{\prime} \neq p\), C1 answers \(T_{ijk}=\left(\left(g^{a}\right)^{y_{p^{\prime}}} \cdot u_{p^{\prime} i}\right)^{H_{3}\left(i\|j\|k,\ \text{name}_{i} \| \text{time}_{ijk}\right)+F_{ijk}} \cdot S_{ijk}^{z_{ijk}}\) with \(S_{ijk} \in G_{1}\), using the \(H_{1}\)-oracle, the \(H_{4}\)-oracle and the ProxyKeyGen-oracle; otherwise it aborts. This tag passes equation (5) and is computationally indistinguishable from a real one in A1's view. Denote the set of query inputs as \(S_{1}^{\prime \prime}\) \(\left(\left(p, i, j, k, F_{ijk}\right) \notin S_{1}^{\prime \prime}\right)\).
Forgery Output: Finally, A1 outputs a valid tag \(\sigma_{i^{*} j^{*} k^{*}}=\left(T_{i^{*} j^{*} k^{*}}, S^{\prime}\right)\) for data block \(F_{i^{*} j^{*} k^{*}}\), generated under proxy \(ID_{p}\) with warrant \(\omega_{i^{*}}\) and its signature \(\left(U_{i^{*}}, R_{i^{*}}, \xi_{i^{*}}\right)\). C1 looks up the lists of all oracles. It proceeds only when none of the corresponding records exists, i.e., \(ID_{i^{*}} \neq ID_{p}\), \(\left(p, i^{*}\right) \notin S_{1}^{\prime}\) and \(\left(p, i^{*}, j^{*}, k^{*}, F_{i^{*} j^{*} k^{*}}\right) \notin S_{1}^{\prime \prime}\); otherwise it aborts and terminates. If the game proceeds, C1 checks all hash function oracles and makes the queries itself if there is no related record in their lists. It uses \(R_{p i^{*}}=H_{1}\left(ID_{p}\right)^{r_{p i^{*}}}\) from Setup and \(U_{i^{*}}=\left(g^{a}\right)^{y_{i^{*}} r_{i^{*}} H_{2}\left(\omega_{i^{*}} \| R_{i^{*}}\right)}\) for the validity of warrant \(\omega_{i^{*}}\).
Since 𝜎𝑖∗𝑗∗𝑘∗ = (𝑇𝑖∗𝑗∗𝑘∗ , 𝑆′ ) satisfies equation (5) as valid tag, with corresponding records of oracles and properties of bilinear mapping:
\(\begin{aligned} e\left(T_{i^{*} j^{*} k^{*}}, g\right) &= e\left(\left(H_{1}\left(ID_{p}\right) \cdot R_{i^{*}}^{H_{2}\left(\omega_{i^{*}} \| R_{i^{*}}\right)} \cdot R_{p i^{*}}\right)^{H_{3}\left(i^{*}\|j^{*}\|k^{*},\ \text{name}_{i^{*}} \| \text{time}_{i^{*} j^{*} k^{*}}\right)+F_{i^{*} j^{*} k^{*}}}, g^{a}\right) \cdot e\left(g^{z_{i^{*} j^{*} k^{*}}}, S^{\prime}\right) \\ &= e\left(\left(g^{ab} \cdot U_{i^{*}} \cdot g^{ab\, r_{p i^{*}}}\right)^{H_{3}\left(i^{*}\|j^{*}\|k^{*},\ \text{name}_{i^{*}} \| \text{time}_{i^{*} j^{*} k^{*}}\right)+F_{i^{*} j^{*} k^{*}}} \cdot S^{\prime z_{i^{*} j^{*} k^{*}}}, g\right) \end{aligned}\)
we will have a solution of CDH problem after simplification
\(g^{ab}=\left(T_{i^{*} j^{*} k^{*}} \cdot S^{\prime\,-z_{i^{*} j^{*} k^{*}}} \cdot U_{i^{*}}^{-H_{3}\left(i^{*}\|j^{*}\|k^{*},\ \text{name}_{i^{*}} \| \text{time}_{i^{*} j^{*} k^{*}}\right)-F_{i^{*} j^{*} k^{*}}}\right)^{\frac{1}{W}}\)
where \(W=\left(1+r_{p i^{*}}\right)\left(H_{3}\left(i^{*}\|j^{*}\|k^{*},\ \text{name}_{i^{*}} \| \text{time}_{i^{*} j^{*} k^{*}}\right)+F_{i^{*} j^{*} k^{*}}\right)\).
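The extraction of \(g^{ab}\) from a forged tag can be checked numerically in the same spirit. The sketch below is only an illustration under stated assumptions: group elements are represented by their discrete logarithms modulo a toy prime `Q`, so the challenger's group operations appear as modular arithmetic on logs, and the function names `forged_tag_log`, `extract_ab` and `demo` are hypothetical. In the real reduction the challenger never sees \(a\) or \(b\); it only raises the known group elements \(T\), \(S'\) and \(U_{i^*}\) to known scalars.

```python
import secrets

Q = 2**61 - 1  # toy prime order (assumption)


def forged_tag_log(a, b, U, r_pi, h3, F, z, s):
    """Log of a valid forged T when H1(ID_p) = g^b, mpk = g^a, S' = g^s, H4(...) = g^z."""
    return ((a * b * (1 + r_pi) + U) * (h3 + F) + z * s) % Q


def extract_ab(T, U, r_pi, h3, F, z, s):
    """Log of (T * S'^(-z) * U^(-(h3+F)))^(1/W), with W = (1 + r_pi)(h3 + F)."""
    W = (1 + r_pi) * (h3 + F) % Q
    return (T - z * s - U * (h3 + F)) * pow(W, -1, Q) % Q


def demo():
    a, b, U, z, s, h3 = (secrets.randbelow(Q - 1) + 1 for _ in range(6))
    r_pi = secrets.randbelow(Q - 2) + 1      # keeps 1 + r_pi nonzero mod Q
    F = 42
    if (h3 + F) % Q == 0:                    # keep W invertible
        h3 += 1
    T = forged_tag_log(a, b, U, r_pi, h3, F, z, s)
    return extract_ab(T, U, r_pi, h3, F, z, s) == a * b % Q
```

The modular inverse `pow(W, -1, Q)` plays the role of the exponent \(1/W\); it exists whenever \(W \neq 0\) modulo the group order.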
Probability and Time Analysis
- We analyze C1’s probability and time of solving CDH problem with the A1’s ability to forge tag of our improved scheme. For the following four events:
- ℰ1: C1 does not abort for any A1’s Extract queries.
- ℰ2: C1 does not abort for any A1’s ProxyKeyGen queries.
- ℰ3: C1 does not abort for any A1’s Tag queries.
- ℰ4: A1 generates a valid tag \(\sigma_{i^{*} j^{*} k^{*}}\) for block \(F_{i^{*} j^{*} k^{*}}\) under proxy \(ID_{p}\) with warrant \(\omega_{i^{*}}\), where \(i^{*} \neq p\), \(\left(p, i^{*}\right) \notin S_{1}^{\prime}\), \(\left(p, i^{*}, j^{*}, k^{*}\right) \notin S_{1}^{\prime \prime}\).
If A1 succeeds in all the above events and \(H_{1}\) answers \(g^{b}\) with probability \((1-\delta)\), then C1's probability of obtaining the CDH solution is \(\operatorname{Pr}\left[\mathcal{E}_{1} \wedge \mathcal{E}_{2} \wedge \mathcal{E}_{3} \wedge \mathcal{E}_{4}\right]=\operatorname{Pr}\left[\mathcal{E}_{1}\right] \operatorname{Pr}\left[\mathcal{E}_{2} \mid \mathcal{E}_{1}\right] \operatorname{Pr}\left[\mathcal{E}_{3} \mid \mathcal{E}_{2} \wedge \mathcal{E}_{1}\right] \operatorname{Pr}\left[\mathcal{E}_{4} \mid \mathcal{E}_{3} \wedge \mathcal{E}_{2} \wedge \mathcal{E}_{1}\right]=\left(\delta(\widehat{N}-1) / \widehat{N}\right)^{q_{E}}\left(1 / \widehat{N}\right)^{q_{P}} \delta^{q_{T}} \epsilon_{1}(1-\delta)\). With \(\delta=\left(q_{E}+q_{T}\right) /\left(q_{E}+q_{T}+1\right)\), the probability is at least \(\epsilon_{1}(\widehat{N}-1)^{q_{E}} /\left(\hat{e} \widehat{N}^{q_{E}+q_{P}}\left(q_{E}+q_{T}+1\right)\right)\), where \(\hat{e}\) is the base of the natural logarithm and \(\widehat{N}\) is the number of selected identities.
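The relation between the exact success probability and the stated lower bound can be checked numerically. The sketch below assumes the choice \(\delta=(q_E+q_T)/(q_E+q_T+1)\) from the text and uses small illustrative query counts; the function names `success_probability` and `lower_bound` are hypothetical. The bound holds because \(\delta^{q_E+q_T} \geq 1/\hat{e}\) and \(1-\delta = 1/(q_E+q_T+1)\).

```python
import math


def success_probability(eps, n_hat, q_e, q_p, q_t):
    """Exact product of the four event probabilities with delta = (qE+qT)/(qE+qT+1)."""
    delta = (q_e + q_t) / (q_e + q_t + 1)
    return ((delta * (n_hat - 1) / n_hat) ** q_e
            * (1 / n_hat) ** q_p
            * delta ** q_t
            * eps * (1 - delta))


def lower_bound(eps, n_hat, q_e, q_p, q_t):
    """Bound stated in Theorem 1, with math.e as the base of the natural logarithm."""
    return eps * (n_hat - 1) ** q_e / (math.e * n_hat ** (q_e + q_p) * (q_e + q_t + 1))
```

For any parameter set with at least one Extract or Tag query, `success_probability(...)` should be no smaller than `lower_bound(...)`.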
The total running time of C1 comprises A1's running time \(t_{1}\) and additional time, in which C1 responds to \(\left(q_{H}+q_{T}\right)\) hash, \(q_{E}\) Extract, \(q_{P}\) ProxyKeyGen and \(q_{T}\) TagGen queries and transforms the forgery into the CDH solution. Each hash, Extract and ProxyKeyGen query requires at most one exponentiation on group \(G_{1}\), while each Tag oracle query takes three exponentiations. \(S^{\prime\,-z_{i^{*} j^{*} k^{*}}}\) is computed by one exponentiation on \(S^{\prime}\) and one inversion, and likewise for the term in \(U_{i^{*}}\). The final \((\cdot)^{1/W}\) requires one exponentiation with \(1/W\). So two inversions and three exponentiations on \(G_{1}\) are required for the final output of the CDH solution. Therefore, the total running time is at most \(t_{1}+t_{G_{1}} \cdot\left(q_{H}+q_{E}+q_{P}+4 q_{T}+5\right)\).
Theorem 2 (Unforgeability) If there exists a PPT \(\left(t_{2}, \epsilon_{2}\right)\)-adversary A2 who could fabricate a valid proof in our Sec-ID-BPAPP, then our scheme is unforgeable when challenger C2 could solve the BDH problem with non-negligibility \(\epsilon_{2}(\widehat{N}-1)^{q_{E}} /\left(\hat{e}^{c^{*}} \widehat{N}^{q_{E}+q_{P}}\left(q_{E}+q_{T}+1\right)\right)\) within PPT time \(t_{2}+t_{G_{1}} \cdot\left(q_{H}+q_{E}+q_{P}+4 q_{T}+2|O|+4\right)+2 t_{G_{2}}+t_{e}\).
Proof: There are \(\widehat{N}\) selected identities \(\left\{ID_{i}\right\}_{i \in O}\) having the proxy \(ID_{p}\). The original data files \(\left\{\tilde{F}_{i}\right\}_{i \in O}\) are divided into blocks \(\left\{F_{ijk}\right\}_{i \in O, j \in J, k \in \delta_{ij}}\) before being outsourced to clouds \(\left\{CS_{j}\right\}_{j \in J}\).
- Setup: As in Theorem 1, C2, in the role of PKG, generates the master key pair \((msk, mpk)=\left(a, g^{a}\right)\) from generator \(g\) with \(a, b, w \in Z_{q}\), and creates the BDH instance: given \(g, g^{a}, g^{b}, g^{w} \in G_{1}\), compute \(e(g, g)^{abw} \in G_{2}\). It also allows A2 to access \(R_{pi}=H_{1}\left(ID_{p}\right)^{r_{pi}}\), where \(r_{pi} \in Z_{q}\).
- 𝐻1 -oracle, 𝐻2 -oracle, 𝐻3 -oracle, 𝐻4 -oracle, Extract-oracle, ProxyKeyGen-oracle, Tag-oracle, remain the same as Theorem 1.
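- First phase queries: A2 could access all the oracles. Denote the index set of identities \(ID_{i}\) with extracted private keys as \(S_{2}\) \(\left(p \notin S_{2}\right)\), the set of index pairs \(\left(ID_{p^{\prime}}, ID_{i}\right)\) for proxy tag secret key queries as \(S_{2}^{\prime}\) \(\left((p, i) \notin S_{2}^{\prime}\right)\), and the set of index-and-data tuples for proxy tag queries as \(S_{2}^{\prime \prime}\) \(\left(\left(p, i, j, k, F_{ijk}\right) \notin S_{2}^{\prime \prime}\right)\).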
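- Challenge phase: C2 generates the challenge set \(chal\) with the ordered number collection \(\left\{c_{i^{*} j^{*}}\right\}\) to specify every block \(F_{i^{*} j^{*} k_{n}^{*}}\) on the \(j^{*}\)-th cloud for the owner of \(ID_{i^{*}}\) (i.e., \(\left\{\left(p, i^{*}, j^{*}, k_{n}^{*}\right) \mid 1 \leq n \leq c_{i^{*} j^{*}}\right\}\), with \(i^{*} \neq p\), \(\left(p, i^{*}\right) \notin S_{2}^{\prime}\) and \(\left(p, i^{*}, j^{*}, k_{n}^{*}, F_{i^{*} j^{*} k_{n}^{*}}\right) \notin S_{2}^{\prime \prime} \cup S_{3}^{\prime \prime}\)), and the masking element \(M=mpk^{w}\) for privacy-preserving auditing. \(chal\) is sent to A2.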
- Second phase queries: A2 makes queries similar to those in the first phase. Denote the index set of identities for Extract private key queries as \(S_{3}\), the index set of identity pairs for proxy tag secret key queries as \(S_{3}^{\prime}\), and the tuple set of index and data for proxy tag queries as \(S_{3}^{\prime \prime}\). We require that \(p \notin S_{2} \cup S_{3}\), \((p, i) \notin S_{2}^{\prime} \cup S_{3}^{\prime}\) and \(\left(p, i, j, k, F_{ijk}\right) \notin S_{2}^{\prime \prime} \cup S_{3}^{\prime \prime}\).
Forgery Output: Finally, A2 outputs a valid proof \(\left\{P_{j^{*}}\right\}_{j^{*} \in J}\) for \(\left\{F_{i^{*} j^{*} k_{n}^{*}}\right\}_{1 \leq n \leq c_{i^{*} j^{*}}}\) and tags generated by proxy \(ID_{p}\) with warrants \(\left\{\omega_{i^{*}}\right\}_{i^{*} \in O}\) and signatures \(\left\{\left(U_{i^{*}}, R_{i^{*}}, \xi_{i^{*}}\right)\right\}_{i^{*} \in O}\). C2 looks up the lists of the Extract-oracle, ProxyKeyGen-oracle and Tag-oracle. It aborts and terminates unless none of the corresponding records exists. If the game proceeds, C2 checks all hash function oracles and makes the queries itself if there is no related record in their lists. It uses \(R_{p i^{*}}=H_{1}\left(ID_{p}\right)^{r_{p i^{*}}}\) and \(g^{w}\) from Setup, and \(U_{i^{*}}=\left(g^{a}\right)^{y_{i^{*}} r_{i^{*}} H_{2}\left(\omega_{i^{*}} \| R_{i^{*}}\right)}\) for the validity of warrant \(\omega_{i^{*}}\). Since the valid proof \(\left\{P_{j^{*}}\right\}_{j^{*} \in J}=\left\{\left(T_{j^{*}}^{\prime}, S^{\prime}, M_{j^{*}}^{\prime}\right)\right\}_{j^{*} \in J}\) satisfies (6), with the corresponding oracle records and the properties of the bilinear map we have:
\(\begin{aligned} e\left(\prod_{j^{*} \in J} T_{j^{*}}^{\prime}, g^{w}\right) &= e\left(\prod_{i^{*} \in O}\left(H_{1}\left(ID_{p}\right) \cdot R_{i^{*}}^{H_{2}\left(\omega_{i^{*}} \| R_{i^{*}}\right)} \cdot R_{p i^{*}}\right)^{\sum_{j^{*} \in J} \sum_{n \in\left[1, c_{i^{*} j^{*}}\right]} h_{i^{*} j^{*} k_{n}^{*}} \cdot H_{3}\left(i^{*}\|j^{*}\|k_{n}^{*},\ \text{name}_{i^{*}} \| \text{time}_{i^{*} j^{*} k_{n}^{*}}\right)}, \left(g^{a}\right)^{w}\right) \\ &\quad \cdot e\left(\prod_{i^{*} \in O} \prod_{j^{*} \in J} \prod_{n \in\left[1, c_{i^{*} j^{*}}\right]}\left(g^{z_{i^{*} j^{*} k_{n}^{*}}}\right)^{h_{i^{*} j^{*} k_{n}^{*}}}, S^{\prime w}\right) \cdot \prod_{j^{*} \in J} M_{j^{*}}^{\prime} \end{aligned}\)
\(\begin{aligned} &= e\left(g^{ab \sum_{i^{*} \in O} \sum_{j^{*} \in J} \sum_{n \in\left[1, c_{i^{*} j^{*}}\right]}\left(1+r_{p i^{*}}\right) h_{i^{*} j^{*} k_{n}^{*}} \cdot H_{3}\left(i^{*}\|j^{*}\|k_{n}^{*},\ \text{name}_{i^{*}} \| \text{time}_{i^{*} j^{*} k_{n}^{*}}\right)}\right. \\ &\quad \cdot \prod_{i^{*} \in O} U_{i^{*}}^{\sum_{j^{*} \in J} \sum_{n \in\left[1, c_{i^{*} j^{*}}\right]} h_{i^{*} j^{*} k_{n}^{*}} \cdot H_{3}\left(i^{*}\|j^{*}\|k_{n}^{*},\ \text{name}_{i^{*}} \| \text{time}_{i^{*} j^{*} k_{n}^{*}}\right)} \\ &\quad \left.\cdot S^{\prime \sum_{i^{*} \in O} \sum_{j^{*} \in J} \sum_{n \in\left[1, c_{i^{*} j^{*}}\right]} z_{i^{*} j^{*} k_{n}^{*}} \cdot h_{i^{*} j^{*} k_{n}^{*}}}, g^{w}\right) \cdot \prod_{j^{*} \in J} M_{j^{*}}^{\prime} \end{aligned}\)
The BDH problem solution is obtained after simplification:
\(e(g, g)^{abw}=\left(e\left(\prod_{j^{*} \in J} T_{j^{*}}^{\prime} \cdot W^{\prime -1}, g^{w}\right) \cdot M^{\prime -1}\right)^{\frac{1}{E}}\)
where
\(\begin{aligned} M^{\prime} &= \prod_{j^{*} \in J} M_{j^{*}}^{\prime}, \qquad h_{i^{*} j^{*} k_{n}^{*}}=f_{v_{i^{*} j^{*}, 2}}\left(i^{*}, j^{*}, k_{n}^{*}\right), \\ W^{\prime} &= \prod_{i^{*} \in O} U_{i^{*}}^{\sum_{j^{*} \in J} \sum_{n \in\left[1, c_{i^{*} j^{*}}\right]} h_{i^{*} j^{*} k_{n}^{*}} \cdot H_{3}\left(i^{*}\|j^{*}\|k_{n}^{*},\ \text{name}_{i^{*}} \| \text{time}_{i^{*} j^{*} k_{n}^{*}}\right)} \cdot S^{\prime \sum_{i^{*} \in O} \sum_{j^{*} \in J} \sum_{n \in\left[1, c_{i^{*} j^{*}}\right]} z_{i^{*} j^{*} k_{n}^{*}} \cdot h_{i^{*} j^{*} k_{n}^{*}}}, \\ E &= \sum_{i^{*} \in O} \sum_{j^{*} \in J} \sum_{n \in\left[1, c_{i^{*} j^{*}}\right]}\left(1+r_{p i^{*}}\right) h_{i^{*} j^{*} k_{n}^{*}} \cdot H_{3}\left(i^{*}\|j^{*}\|k_{n}^{*},\ \text{name}_{i^{*}} \| \text{time}_{i^{*} j^{*} k_{n}^{*}}\right) \end{aligned}\)
Probability and Time Analysis
- We analyze C2’s probability and time of solving BDH problem with the A2’s ability to forge proof of our improved scheme. For the following four events:
- ℰ1: C2 does not abort for any A2’s Extract queries.
- ℰ2: C2 does not abort for any A2’s ProxyKeyGen queries.
- ℰ3: C2 does not abort for any A2’s TagGen queries.
- ℰ4: A2 generates a valid proof \(\left\{P_{j^{*}}\right\}_{j^{*} \in J}\) for challenged blocks \(\left\{F_{i^{*} j^{*} k_{n}^{*}}\right\}_{n \in\left[1, c_{i^{*} j^{*}}\right]}\) under proxy \(ID_{p}\) with warrants \(\left\{\omega_{i^{*}}\right\}_{i^{*} \in O}\), where \(i^{*} \neq p\), \(\left(p, i^{*}\right) \notin S_{2}^{\prime} \cup S_{3}^{\prime}\), \(\left(p, i^{*}, j^{*}, k_{n}^{*}\right) \notin S_{2}^{\prime \prime} \cup S_{3}^{\prime \prime}\).
If A2 succeeds in all the above events and \(H_{1}\) answers \(g^{b}\) with probability \((1-\delta)\), then C2's probability of obtaining the BDH solution is \(\operatorname{Pr}\left[\mathcal{E}_{1} \wedge \mathcal{E}_{2} \wedge \mathcal{E}_{3} \wedge \mathcal{E}_{4}\right]=\operatorname{Pr}\left[\mathcal{E}_{1}\right] \operatorname{Pr}\left[\mathcal{E}_{2} \mid \mathcal{E}_{1}\right] \operatorname{Pr}\left[\mathcal{E}_{3} \mid \mathcal{E}_{2} \wedge \mathcal{E}_{1}\right] \operatorname{Pr}\left[\mathcal{E}_{4} \mid \mathcal{E}_{3} \wedge \mathcal{E}_{2} \wedge \mathcal{E}_{1}\right]=\left(\delta(\widehat{N}-1) / \widehat{N}\right)^{q_{E}}\left(1 / \widehat{N}\right)^{q_{P}} \delta^{q_{T}} \epsilon_{2}\left(1-\delta^{1 / c^{*}}\right)\). With \(\delta=\left(\left(q_{E}+q_{T}\right) /\left(q_{E}+q_{T}+1\right)\right)^{c^{*}}\), the probability is at least \(\epsilon_{2}(\widehat{N}-1)^{q_{E}} /\left(\hat{e}^{c^{*}} \widehat{N}^{q_{E}+q_{P}}\left(q_{E}+q_{T}+1\right)\right)\), where \(\hat{e}\) is the base of the natural logarithm, \(\widehat{N}\) is the number of selected identities, \(c_{i^{*} j^{*}}\) is the number of challenged blocks on the corresponding cloud for a specified owner, and \(c^{*}=\left(\sum_{i^{*} \in O, j^{*} \in J} c_{i^{*} j^{*}}\right)^{-1}\).
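The step from the product of event probabilities to the stated lower bound can be spelled out as follows. This is a sketch rather than the authors' own derivation: it reads the last factor of the product as \(1-\delta^{1/c^{*}}\), which is the reading under which the stated bound follows, and it introduces the shorthand \(x=q_E+q_T \geq 1\), which does not appear in the original.

```latex
\begin{aligned}
\delta &= \left(\frac{x}{x+1}\right)^{c^{*}}, \qquad x = q_E + q_T \\
\delta^{q_E}\,\delta^{q_T} &= \left(\frac{x}{x+1}\right)^{c^{*}x}
   = \left(\left(1+\frac{1}{x}\right)^{x}\right)^{-c^{*}}
   \;\ge\; \hat{e}^{-c^{*}}
   \quad \text{since } \left(1+\tfrac{1}{x}\right)^{x} \le \hat{e} \\
1-\delta^{1/c^{*}} &= 1-\frac{x}{x+1} = \frac{1}{x+1} \\
\Pr\left[\mathcal{E}_1 \wedge \mathcal{E}_2 \wedge \mathcal{E}_3 \wedge \mathcal{E}_4\right]
   &\ge \frac{\epsilon_2\,(\widehat{N}-1)^{q_E}}
             {\hat{e}^{c^{*}}\,\widehat{N}^{q_E+q_P}\,(q_E+q_T+1)}
\end{aligned}
```

The remaining factors \(\left((\widehat{N}-1)/\widehat{N}\right)^{q_E}\), \(\left(1/\widehat{N}\right)^{q_P}\) and \(\epsilon_2\) carry over unchanged into the final line.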
The total running time of C2 comprises A2's running time \(t_{2}\) and additional time, in which, for \(|O|\) owners, C2 responds to \(\left(q_{H}+q_{T}\right)\) hash, \(q_{E}\) Extract, \(q_{P}\) ProxyKeyGen and \(q_{T}\) TagGen queries and transforms the forgery into the BDH solution. Each hash, Extract and ProxyKeyGen query requires at most one exponentiation on group \(G_{1}\), while each Tag oracle query takes three exponentiations. One pairing, \((|O|+2)\) inversions and \((|O|+2)\) exponentiations on \(G_{1}\), and one inversion and one exponentiation on \(G_{2}\) are spent on the final output of the BDH solution. Therefore, the total running time is at most \(t_{2}+t_{G_{1}} \cdot\left(q_{H}+q_{E}+q_{P}+4 q_{T}+2|O|+4\right)+2 t_{G_{2}}+t_{e}\). We complete the proof.
Theorem 3 (Privacy-preserving) If there exists PPT time TPA which could recover original data in our Sec-ID-BPAPP, then our scheme is privacy-preserving when challenger could solve DL problem with non-negligibility with PPT time.
Proof: The TPA receives the masked data proof \(M_{j}^{\prime}=e\left(\prod_{i \in O_{j}}\left(H_{1}\left(ID_{p}\right) \cdot R_{i}^{H_{2}\left(\omega_{i} \| R_{i}\right)} \cdot R_{pi}\right)^{F_{ij}^{\prime}}, M\right)\). Denote \(g^{\prime}=e\left(\prod_{i \in O_{j}} H_{1}\left(ID_{p}\right) \cdot\left(R_{i}^{H_{2}\left(\omega_{i} \| R_{i}\right)} \cdot R_{pi}\right), M\right)\), and thus \(M_{j}^{\prime}=\left(g^{\prime}\right)^{F_{ij}^{\prime}}\). If the TPA retrieves the data combination \(F_{ij}^{\prime}=\sum_{k \in \delta_{ij}} h_{ijk} \cdot F_{ijk}\) in order to further recover the data blocks \(\left\{F_{ijk}\right\}\), then the challenger could solve the DL problem: given \(g^{\prime} \in G_{2}\) and \(\left(g^{\prime}\right)^{F_{ij}^{\prime}} \in G_{2}\), obtain \(F_{ij}^{\prime} \in Z_{q}\). We complete the proof.
7. Efficiency Analysis
In this section, we compare overheads of computation and communication of our improved scheme Sec-ID-BPAPP, with Wang et al.’s ID-PUIC [27], summarized in Table 1 and Table 2, respectively. In addition, the performance comparison on computation is depicted in Fig. 2, based on results from simulation of the two schemes on a laptop, to evaluate efficiency trend when number of data owners, clouds and data amount increases.
Table 1. Computation Cost Comparison for Multiple Owners and Multiple Clouds
Table 2. Communication Cost Comparison for Multiple Owners and Multiple Clouds
- Assume there are \(n_{O}\) data owners storing a total of \(N\) blocks \(\left\{F_{ijk}\right\}\) on \(n_{J}\) clouds, with a one-off TagGen and upload. To prove data integrity, Challenge and Verify are executed periodically between the clouds and the TPA on \(c\) randomly selected data blocks of \(n_{1}\) data owners on \(n_{2}\) clouds, together with their tags. The element size of group \(G_{1}\) is \(\mathcal{G}_{1}\), and that of \(G_{2}\) is \(\mathcal{G}_{2}\). Consequently, the dominant cost of this scheme is mostly contributed by ProofGen and Verify.
- Among all the operations, bilinear pairings \(C_{e}\), exponentiations \(C_{exp}\) on group \(G_{1}\), and hashes \(C_{h}\) on blocks are the most expensive, compared with multiplication on \(G_{1}\) and \(G_{2}\), operations on \(Z_{q}\), and other hash operations, which are efficient or need to be done only once. Additionally, since ID-PUIC only offers auditing of a single owner's data on one cloud, we consider repeating \(n_{1} n_{2}\) loops of ID-PUIC instances, with \(N /\left(n_{1} n_{2}\right)\) outsourced blocks and only \(c /\left(n_{1} n_{2}\right)\) challenged blocks per loop.
Analysis for computation: In order to fully protect the tags \(\left\{\sigma_{ijk}=\left(S_{ijk}, T_{ijk}\right)\right\}\) from being used by adversaries to recover private keys, the proxy requires \(\left(2 N+n_{O}\right) C_{exp}\) operations for the data owners in TagGen. Fortunately, these can be performed offline by the proxy as a one-off task, although they are somewhat expensive. After one exponentiation \(C_{exp}\) for the masking element \(M\) in Challenge, our Sec-ID-BPAPP spends \(\left(c+n_{1} n_{2}\right) C_{exp}+n_{2} C_{e}\) for all \(\{P_{j}\}\) in ProofGen, where the \(n_{2}\) clouds additionally perform \(n_{1} n_{2} C_{exp}+n_{2} C_{e}\) to generate the masked data proofs, in order to realize privacy-preserving auditing on the TPA's side and reduce its computation load. Thus, in Verify, the TPA needs only 3 bilinear pairings for batch auditing at one time, which achieves the enhanced security of proxy private key protection and still outperforms the \(2 n_{1} n_{2}\) pairings of Wang et al.'s ID-PUIC [27] when applied to the multiple-cloud, multiple-owner scenario in Table 1.
Analysis for communication: To enable privacy-preserving auditing, we first require special 𝑛2 𝒢1 size of element from Challenge to mask data in ProofGen, which later successfully outputs masked data in the size of 𝑛2 𝒢2 for final auditing. But for total proof, which includes both aggregate tag and masked data, our Sec-ID-BPAPP of 𝑛2 (2𝒢1 + 𝒢2) is still less than ID-PUIC’s 𝑛1𝑛2 (𝒢1 + 𝑙𝑜𝑔2𝑞), which is linear to both 𝑛1 and 𝑛2. If taking Challenge and ProofGen together, our proposed scheme introduces less bandwidth than ID-PUIC, since 𝑛2 ≪ 𝑛1 in the multiple clouds and multiple owners’ setting in Table 2.
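To make the proof-size comparison concrete, the two expressions can be evaluated directly. The sketch below assumes the element sizes reported in the simulation setup of this paper (64 B for elements of \(G_1\) and \(G_2\), 20 B for an element of \(Z_q\)); the function names are hypothetical and the sample values of \(n_1\) and \(n_2\) are only illustrative.

```python
def proof_bytes_sec_id_bpapp(n2, g1=64, g2=64):
    """Total proof size n2 * (2*G1 + G2): aggregate tag, S' and masked data per cloud."""
    return n2 * (2 * g1 + g2)


def proof_bytes_id_puic(n1, n2, g1=64, zq=20):
    """Total proof size n1 * n2 * (G1 + log2 q) when ID-PUIC is repeated per owner and cloud."""
    return n1 * n2 * (g1 + zq)
```

With these sizes the first expression reduces to \(192 n_2\) bytes and the second to \(84 n_1 n_2\) bytes, so the gap widens linearly with the number of challenged owners.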
Simulation: In order to compare the performance of Wang et al.'s ID-PUIC [27] with that of our Sec-ID-BPAPP, we simulate the data owners, proxy, storage clouds and TPA on a laptop with an Intel Core i5 480M at 2.67 GHz and 4 GB RAM running a Linux operating system (Ubuntu 18.04 64-bit with kernel 4.15.0-23-generic), in the C programming language. Both schemes are based on the Pairing-Based Cryptography Library (PBC 0.5.14) [33], the GNU Multiple Precision Arithmetic Library (GMP 6.1.2) [34] and the OpenSSL Library (OpenSSL-1.1.0) [35].
To achieve an 80-bit security level, the elliptic curve we use has a 160-bit group order with 512-bit finite field elements for \(G_{1}\) and \(G_{2}\), from the Type-A pairing in the PBC library. Therefore, the element size is \(\mathcal{G}_{1}=\mathcal{G}_{2}=64\) bytes, and \(q\) is a 20-byte prime. For generating the challenge coefficients \(\{h_{ijk}\}\), we use HMAC-SHA256 from the OpenSSL library as the pseudo random function \(f\). We set each data block \(F_{ijk}\) to 20 B. The simulation was run for 10 trials and the mean values are reported as results.
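As an illustration of how a coefficient \(h_{ijk}=f_{v_{ij,2}}(i,j,k)\) might be derived with HMAC-SHA256, consider the following sketch. It uses Python's standard `hmac` module rather than the OpenSSL C API used in the simulation, and the key encoding, the message encoding `i|j|k`, the reduction modulo `q` and the placeholder modulus `Q` are all assumptions made for the example, not details taken from the paper.

```python
import hashlib
import hmac

Q = 2**61 - 1  # placeholder prime; the paper's q is a 160-bit prime from PBC Type-A parameters


def prf_coefficient(v2: int, i: int, j: int, k: int, q: int = Q) -> int:
    """Derive h_ijk = f_{v2}(i, j, k) as HMAC-SHA256 keyed by v2, reduced into Z_q."""
    key = v2.to_bytes(32, "big")            # hypothetical fixed-width key encoding
    msg = f"{i}|{j}|{k}".encode()           # hypothetical encoding of the index triple
    digest = hmac.new(key, msg, hashlib.sha256).digest()
    return int.from_bytes(digest, "big") % q
```

Because the cloud and the TPA both know \(v_{ij,2}\) from the challenge token, each side can recompute the same coefficients independently, which is why only the short challenge values need to be transmitted.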
For the TagGen computation of proxy tags for a total of 1000000 blocks of 50 data owners, ID-PUIC requires 6825.433 seconds and Sec-ID-BPAPP requires 6275.664 seconds. To prove a total of 1000000 blocks outsourced on 10 clouds for 50 data owners, the running time of ProofGen is 2745.109 seconds for Sec-ID-BPAPP versus 53.984 seconds for ID-PUIC. Our Sec-ID-BPAPP indeed takes more time to generate the masked data proofs on the clouds. But this provides privacy-preserving public auditing, an advantage over ID-PUIC, and reduces the TPA's computation for the batch integrity auditing of multiple owners and clouds in Verify, as follows.
On the left half of Fig. 2, the computation time on the TPA's side is depicted for Wang et al.'s ID-PUIC [27] (blue bars) and our Sec-ID-BPAPP (yellow bars), as the number of challenged data owners increases from 50 to 250. For fairness of evaluation, we repeat Wang et al.'s scheme to cover the same number of data owners and clouds. Assume there are 10 clouds, each of which stores 2000 blocks for every data owner, so the total number of challenged data blocks ranges from \(1.0 \times 10^{6}\) to \(5.0 \times 10^{6}\) (marked on the top X-axis), based on 100% probability of detecting a 1% modification rate. The results show that our improved scheme has lower computation overhead on the TPA's side than Wang et al.'s scheme.
In the right half of Fig. 2, we present the computation time on the TPA's side as the number of challenged clouds increases from 10 to 50, for ID-PUIC [27] (blue bars) and Sec-ID-BPAPP (yellow bars), based on 100% probability of detecting a 1% modification rate. Assume there are 50 data owners, each of which outsources 2000 blocks to every cloud, so the total number of challenged data blocks ranges from \(1.0 \times 10^{6}\) to \(5.0 \times 10^{6}\) (shown on the top X-axis). We also repeat ID-PUIC for fairness of evaluation. The results show that Sec-ID-BPAPP introduces lower computation overhead on the TPA's side.
Fig. 2. Comparison of computation on TPA: 1) as number of Owners increases: Total 10 Clouds of each stores 2000 blocks per owner; 2) as number of Clouds increases: Total 50 Owners of each outsources 2000 blocks per cloud
The difference illustrated in Fig. 2 indicates the performance trend of the two schemes when extrapolated to real multi-cloud storage systems, which are equipped with powerful CPUs and large memories, even though the measured performance of both schemes is limited by the laptop used for the simulation. Therefore, in terms of storage integrity, our Sec-ID-BPAPP is more efficient than Wang et al.'s ID-PUIC for secure big data storage, which might involve on the order of a billion data owners, a large number of storage clouds and a large volume of data.
For the total communication overheads of Challenge and ProofGen, our Sec-ID-BPAPP at \(\left(\left(n_{1} n_{2} / 8\right) \log_{2} N+40 n_{1} n_{2}+256 n_{2}\right)\) B outperforms ID-PUIC at \(\left(\left(n_{1} n_{2} / 8\right) \log_{2} N+124 n_{1} n_{2}\right)\) B, since the number of challenged clouds \(n_{2}\) is usually much smaller than the number of challenged owners \(n_{1}\). Specifically, Sec-ID-BPAPP requires \(\left(\left(n_{1} n_{2} / 8\right) \log_{2} N+40 n_{1} n_{2}+64 n_{2}\right)\) B and \(192 n_{2}\) B, while ID-PUIC costs \(\left(\left(n_{1} n_{2} / 8\right) \log_{2} N+40 n_{1} n_{2}\right)\) B and \(84 n_{1} n_{2}\) B, for Challenge and ProofGen respectively, with a 64 B element size for groups \(G_{1}\) and \(G_{2}\) and 20 B per block. In real cloud storage, the TPA could employ the sampling technique in [5] for economical auditing, e.g., randomly challenging 460 blocks is sufficient to detect a 1% data error with 99% probability across the entire multi-cloud storage system.
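The sampling figure quoted from [5] can be reproduced with a short calculation. The sketch below assumes the usual approximation that each challenged block is independently corrupted with probability equal to the corrupted fraction (sampling with replacement); the function names are hypothetical.

```python
import math


def detection_probability(c, corrupted_fraction):
    """Probability that at least one of c randomly challenged blocks is corrupted."""
    return 1.0 - (1.0 - corrupted_fraction) ** c


def blocks_needed(corrupted_fraction, target):
    """Smallest c whose detection probability reaches the target under this approximation."""
    return math.ceil(math.log(1.0 - target) / math.log(1.0 - corrupted_fraction))
```

Under this approximation, `detection_probability(460, 0.01)` clears the 99% threshold, and the required sample size depends only on the corruption rate and the target probability, not on the total number of stored blocks.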
8. Conclusions and Open Problem
In this paper, we revisited the identity-based batch public auditing with proxy processing (ID-BPAPP) scheme [32] designed by Yu et al. in KSII Transactions on Internet and Information Systems, October 2017, and demonstrated that any cloud in their scheme could deceive the TPA without holding the original data. In particular, it is also feasible for malicious clouds or the data owners themselves to recover the proxy's private key and generate tags. This inevitably enables impersonation, and might even be leveraged to threaten the digital properties of the proxy. Therefore, we proposed our solution to repair the security flaws and thus enhance the security, at the expense of reasonable overheads while still enjoying better auditing efficiency than ID-PUIC [27].
Despite these security flaws above, it is still of great value for Yu et al. to tackle the batch public data auditing problem with proxy processing, under identity based cryptography infrastructure. As a future work, we will keep on seeking to improve the efficiency of our proposed scheme, to enable practical and secure data integrity auditing on distributed clouds system for multiple owners of restricted access.
Acknowledgement
This work was supported by State Scholarship Fund Program of China Scholarship Council under Grant 201506070077, the National Key R&D Program of China under Grant 2017YFB0802000 and the National Natural Science Foundation of China under Grant 61370203 and 61872060.
References
- Gartner.com, "Gartner Forecasts Worldwide Public Cloud Services Revenue to Reach $260 Billion in 2017," October 12, 2017.
- IDC.com, "Worldwide Public Cloud Services Spending Forecast to Reach $122.5 Billion in 2017, According to IDC," February 20, 2017.
- Y. Wang, Q. Wu, B. Qin, W. Shi, R. H. Deng, J. Hu, "Identity-Based Data Outsourcing with Comprehensive Auditing in Clouds," IEEE Transactions on Information Forensics and Security, 12(4), 940-952, 2017. https://doi.org/10.1109/TIFS.2016.2646913
- Z. Fu, X. Wu, C. Guan, X. Sun, K. Ren, "Toward efficient multi-keyword fuzzy search over encrypted outsourced data with accuracy improvement," IEEE Transactions on Information Forensics and Security, 11(12), 2706-2716, 2016. https://doi.org/10.1109/TIFS.2016.2596138
- G. Ateniese, R. Burns, R. Curtmola, J. Herring, L. Kissner, Z. Peterson, D. Song, "Provable Data Possession at Untrusted Stores," in Proc. of ACM CCS 2007, pp. 598-609, 2007.
- H. Shacham, B. Waters, "Compact proofs of retrievability," In Proceedings of ASIACRYPT 2008, pp. 90-107, 2008.
- C. Wang, S. S. M. Chow, Q. Wang, K. Ren, W. Lou, "Privacy-Preserving Public Auditing for Secure Cloud Storage," IEEE Transactions on Computers, 62(2), 362-375, February, 2013. https://doi.org/10.1109/TC.2011.245
- Y. Zhu, H. Hu, G. J. Ahn, M. Yu, "Cooperative Provable Data Possession for Integrity Verification in MultiCloud Storage," IEEE Transactions on Parallel and Distributed Systems, 23(12), 2231-2244, December, 2012. https://doi.org/10.1109/TPDS.2012.66
- K. Yang, X. Jia, "An efficient and secure dynamic auditing protocol for data storage in cloud computing," IEEE Transactions on Parallel and Distributed Systems, 24(9), 1717-1726, 2013. https://doi.org/10.1109/TPDS.2012.278
- R. Curtmola, O. Khan, R. Burns, G. Ateniese, "MR-PDP: Multiple-replica provable data possession," In Proceedings of ICDCS 2008, pp. 411-420, 2008.
- B. Wang, B. Li, H. Li, "Panda: public auditing for shared data with efficient user revocation in the cloud," IEEE Transactions on Services Computing, 8(1), 92-106, 2015. https://doi.org/10.1109/TSC.2013.2295611
- Q. Wang, C. Wang, K. Ren, W. Lou, J. Li, "Enabling Public Auditability and Data Dynamics for Storage Security in Cloud Computing," IEEE Transactions on Parallel and Distributed Systems, 22(5), 847-859, 2011. https://doi.org/10.1109/TPDS.2010.183
- C. Erway, A. Kupcu, C. Papamanthou, R. Tamassia, "Dynamic Provable Data Possession," ACM Transactions on Information and System Security, 17(4), 2015.
- C. Liu, R. Ranjan, C. Yang, X. Zhang, L. Wang, J. Chen, "MuRDPA: Top-down levelled multi-replica merkle hash tree based secure public auditing for dynamic big data storage on cloud," IEEE Transactions on Computers, 64(9), 2609-2622, 2015. https://doi.org/10.1109/TC.2014.2375190
- A. F. Barsoum, M. A. Hasan, "Provable multicopy dynamic data possession in cloud computing systems," IEEE Transactions on Information Forensics and Security, 10(3), pp. 485-497, 2015. https://doi.org/10.1109/TIFS.2014.2384391
- J. Wang, X. Chen, X. Huang, I. You, and Y. Xiang, "Verifiable auditing for outsourced database in cloud computing," IEEE Transactions on Computers, 64(11), 3293-3303, 2015. https://doi.org/10.1109/TC.2015.2401036
- Y. Miao, J. Ma, X. Liu, X. Li, Q. Jiang, and J. Zhang, "Attribute-based keyword search over hierarchical data in cloud computing," IEEE Transactions on Services Computing, 2018.
- Y. Miao, J. Ma, X. Liu, X. Li, Z. Liu, and H. Li, "Practical attribute based multi-keyword search scheme in mobile crowdsourcing," IEEE Internet of Things Journal, 5 (4), 3008-3018, 2018. https://doi.org/10.1109/JIOT.2017.2779124
- Y. Miao, J. Ma, X. Liu, J. Weng, H. Li, and H. Li, "Lightweight fine-grained search over encrypted data in fog computing," IEEE Transactions on Services Computing, 2018.
- Y. Miao, J. Weng, X. Liu, K.-K. R. Choo, Z. Liu, and H. Li, "Enabling verifiable multiple keywords search over encrypted cloud data," Information Sciences, 465, 21-37, 2018. https://doi.org/10.1016/j.ins.2018.06.066
- J. Zhao, C. Xu, F. Li, W. Zhang, "Identity-based public verification with privacy preserving for data storage security in cloud computing," IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences, 96(12), 2709-2716, 2013.
- D. Boneh, M. Franklin, "Identity-based encryption from the Weil pairing," in Proc. of CRYPTO 2001, LNCS 2139, pp. 213-229, 2001.
- H. Wang, "Identity-based distributed provable data possession in multicloud storage," IEEE Transactions on Services Computing, 8(2), 328-340, 2015. https://doi.org/10.1109/TSC.2014.1
- Y. Yu, Y. Zhang, Y. Mu, W. Susilo, "Provably Secure Identity based Provable Data Possession," in Proc. of ProvSec 2015, LNCS 9451, pp. 1-16, Springer, Heidelberg, 2015.
- H. Liu, Y. Mu, J. Zhao, C. Xu, H. Wang, L. Chen, et al., "Identity-based provable data possession revisited: security analysis and generic construction," Computer Standards & Interfaces, 54(1), 10-19, 2017. https://doi.org/10.1016/j.csi.2016.09.012
- Y. Yu, M. H. A. Au, G. Ateniese, X. Huang, W. Susilo, Y. Dai, G. Min, "Identity-based remote data integrity checking with perfect data privacy preserving for cloud storage," IEEE Transactions on Information Forensics and Security, 12(4), 767-778, April, 2017. https://doi.org/10.1109/TIFS.2016.2615853
- H. Wang, D. He, S. Tang, "Identity-based proxy-oriented data uploading and remote data integrity checking in public cloud," IEEE Transactions on Information Forensics and Security, 11(6), 1165-1176, 2016. https://doi.org/10.1109/TIFS.2016.2520886
- X. Zhang, H. Wang and C. Xu, "Identity-based key-exposure resilient cloud storage public auditing scheme from lattices," Information Sciences, 472, 223-234, 2019. https://doi.org/10.1016/j.ins.2018.09.013
- S. Peng, F. Zhou, J. Xu, Z. Xu, "Comments on 'Identity-Based Distributed Provable Data Possession in Multicloud Storage'," IEEE Transactions on Services Computing, 9(6), 996-998, Nov.-Dec., 2016. https://doi.org/10.1109/TSC.2016.2589248
- J. Zhao, C. Xu and K. Chen, "A Security-Enhanced Identity-Based Batch Provable Data Possession Scheme for Big Data Storage," KSII Transactions on Internet and Information Systems, 12(9), 4576-4598, 2018. https://doi.org/10.3837/tiis.2018.09.025
- J. Coron, "On the exact security of full domain hash," in Proc. of CRYPTO 2000, LNCS 1880, pp. 220-235, Springer, Heidelberg, 2000.
- H. Yu, Y. Cai, S. Kong, et al., "Efficient and Secure Identity-Based Public Auditing for Dynamic Outsourced Data with Proxy," KSII Transactions on Internet and Information Systems, 11(10), 5039-5061, October, 2017. https://doi.org/10.3837/tiis.2017.10.019
- The Pairing-Based Cryptography Library (PBC).
- The GNU Multiple Precision Arithmetic Library (GMP).
- OpenSSL: cryptography and SSL/TLS Toolkit.