1. Introduction
With the rapid development of network technology and digital products, information hiding technologies such as steganography, image hashing [1-3] and digital watermarking have emerged to protect multimedia information security. Digital watermarking is an important branch of information hiding and has become a prevailing solution for protecting multimedia data. Digital watermarking embeds watermark information into a multimedia carrier in a way that does not degrade the perceptual quality and withstands common attacks. The technology must satisfy four requirements: robustness, security, fidelity and watermark capacity [4]. Security demands that the embedded information remain secret and hard to decipher. Robustness refers to the ability of a system to retain an effective watermark after common attacks. Fidelity means that after the watermark is embedded, the visual quality of the watermarked image is unchanged compared with the original image. Watermark capacity refers to the amount of information that can be embedded into the host image. Among these requirements, increasing the watermark strength is a common way to enhance robustness, but it comes at the cost of losing fidelity. Achieving a better tradeoff between robustness and fidelity is therefore a non-negligible issue for researchers.
Spread transform dither modulation (STDM), proposed by Chen and Wornell, is an important extension of quantization index modulation (QIM) watermarking, and it generally achieves a better tradeoff between robustness and fidelity than other existing watermarking schemes [5]. In recent years, to enhance the robustness of watermarking algorithms more effectively, many researchers have integrated perceptual knowledge of the human visual system (HVS) into the watermarking mechanism, so that the watermark strength can be adjusted adaptively by exploiting visual redundancy. Just noticeable distortion (JND) refers to the fact that the HVS cannot perceive distortion below a certain threshold because of various visual masking effects, and it provides an efficient way to model perceptual redundancies [6,7].
The JND model in this paper assists the STDM watermarking framework in maximizing robustness while preserving fidelity. The subband-based JND models previously used in digital watermarking systems to adjust the embedding strength still have limitations, which are mainly reflected in two aspects. On one hand, most existing models do not match the STDM framework well, mainly because the perceptual slacks obtained from the watermarked image are inconsistent with those obtained from the original image. On the other hand, these JND models do not reflect the real perceptual characteristics of the HVS well. Consequently, it is necessary to construct an effective perceptual model that estimates the perceptual slacks accurately and efficiently, so as to enhance the performance of the STDM watermarking scheme.
Research on the primary visual cortex demonstrates that the neuronal cells in the V1 region are distributed in a precise and orderly manner. These neuronal cells are arranged in stripe patterns with particular orientations, so the V1 area can extract structural features. The human brain can adaptively extract structural regularities for signal perception and understanding [8-10]. This indicates that structural regularity is one of the important characteristics of visual perception. In scene perception, humans are more likely to perceive regular regions than irregular ones. Inspired by this mechanism, we consider that the sensitivity of the human eye to a local area is affected by the structure of that area.
In this paper, a novel structural regularity-based JND model that conforms to human visual perception characteristics is proposed to generate the perceptual slack vector in the STDM watermarking scheme, achieving a better tradeoff between robustness and fidelity. The contributions of this paper can be summarized as follows:
1. The HVS is more sensitive to distortion in plain or orderly regions than in textured or disorderly regions. To ensure that the block classification results are consistent with human visual characteristics, we present two new block classification methods that classify DCT blocks effectively and accurately.
2. To prevent the edge density in the JND model from being affected by watermark embedding and to measure the edge density more accurately, a new edge density calculation method based on the un-watermarked AC coefficients is proposed.
3. The new block classification method and the new edge density calculation method are used to measure the contrast masking (CM) effect in the JND model. A new structural regularity-based perceptual JND model is established, which takes into account three factors: the contrast sensitivity function (CSF), the luminance adaptation (LA) effect and the CM effect. Comparisons with the latest JND models verify the effectiveness of the proposed model in the STDM watermarking framework.
4. The proposed JND model is applied in the STDM watermarking framework to obtain an optimal rate-distortion-robustness tradeoff. Experiments are conducted to compare the performance of the proposed STDM scheme with other existing STDM watermarking algorithms, and the results show that the STDM watermarking algorithm based on the proposed JND model performs excellently.
The rest of this paper is organized as follows. The next section reviews related work on existing JND estimation models and traditional perceptual STDM algorithms. Section 3 establishes a robust DCT-based JND model for the STDM watermarking framework and describes the proposed perceptual JND model-based STDM watermarking scheme. Section 4 presents the experimental results and analysis, and Section 5 concludes the paper.
2. Related Work
As is well known, spread transform dither modulation (STDM) is relatively robust to re-quantization. Li et al. [11] proposed robust algorithms that incorporate Watson's model into the STDM framework and showed that perceptual STDM is one of the most effective quantization-based approaches for embedding a multi-bit watermark in an image.
2.1 Spread Transform Dither Modulation
STDM, a classical extension of quantization index modulation (QIM), plays a significant role among quantization-based watermarking schemes [5]. Using a dither modulation (DM) quantizer, STDM modulates the projection of the host vector along a given random direction to embed the watermark information. By combining QIM with the spread spectrum approach, this scheme provides significant improvements in effectiveness and robustness.
In the embedding procedure, the projection Xu is obtained by projecting the host vector X onto a random vector u and is modulated according to the message bit m. The watermarked vector Y is obtained as:
\(Y=X+\left[Q\left(X_{u}, \Delta, m, \delta\right)-X_{u}\right] \cdot u, m \in\{0,1\}\) (1)
where the quantization function Q( ⋅ ) is represented as
\(Q\left(X_{u}, \Delta, m, \delta\right)=\Delta \cdot \operatorname{round}\left(\frac{X_{u}+\delta}{\Delta}\right)-\delta, m \in\{0,1\}\) (2)
where ∆ and δ denote the quantization step and the dither signal, respectively. In the watermark extraction process, the received vector X′ may be distorted by channel transmission; the projection Xu′ is obtained by projecting the distorted vector X′ onto the given vector u. The watermark bit m′ is then extracted using a minimum distance detector as:
\(m^{\prime}=\arg \min _{b \in\{0,1\}}\left|X_{u}^{\prime}-Q\left(X_{u}^{\prime}, \Delta, b, \delta\right)\right|\) (3)
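To make Equations (1)-(3) concrete, the following minimal Python sketch implements DM quantization, STDM embedding and minimum-distance detection. The function names, the toy vector length and the fixed step are illustrative assumptions, and the common convention that the dither for bit 1 is offset by ∆/2 from that for bit 0 is assumed.

```python
import numpy as np

def dm_quantize(x_u, step, m, dither=0.0):
    # Eq. (2): dithered uniform quantizer; the common convention
    # delta_m = delta_0 + m*step/2 is assumed for the two codebooks
    d = dither + m * step / 2.0
    return step * np.round((x_u + d) / step) - d

def stdm_embed(x, u, step, m, dither=0.0):
    # Eq. (1): shift x along u so its projection lies on the lattice of bit m
    x_u = float(np.dot(x, u))
    return x + (dm_quantize(x_u, step, m, dither) - x_u) * u

def stdm_extract(x_recv, u, step, dither=0.0):
    # Eq. (3): minimum-distance detection between the two dithered lattices
    x_u = float(np.dot(x_recv, u))
    return min((0, 1), key=lambda b: abs(x_u - dm_quantize(x_u, step, b, dither)))

rng = np.random.default_rng(1)
u = rng.standard_normal(7)
u /= np.linalg.norm(u)                 # unit-norm projection vector (the key)
x = 10 * rng.standard_normal(7)        # toy host vector of 7 DCT coefficients
y = stdm_embed(x, u, step=4.0, m=1)
assert stdm_extract(y, u, step=4.0) == 1
```

In the noiseless case the embedded projection lies exactly on the lattice of the embedded bit, so the detector recovers it with zero error; robustness comes from the ∆/2 guard distance between the two lattices.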
2.2 Perceptual STDM with JND Model
To prevent the noise introduced by quantization from exceeding the distortion visibility threshold, a perceptual model is applied in the watermarking framework. Watson proposed a JND model to estimate the minimum distortion that the human visual system can perceive in the DCT domain [12].
As mentioned above, the distortion caused by watermark embedding in the STDM scheme actually occurs along the random direction u. Therefore, the slack vector s is also projected onto the random direction u, and the visual redundancy of the host vector X in that direction can then be calculated. The quantization step is set to ∆ = 2su because the maximum quantization error of the quantizer is ∆/2 [5,17]. It is easy to see that the slack vector s calculated from each image is unique, so the slack vector obtained from the host image differs from that obtained from the watermarked image. To avoid this mismatch and to ensure that the image distortion caused by watermark embedding correlates better with human visual characteristics, a more accurate and robust JND model for the STDM framework is necessary.
2.3 JND Estimation
In recent years, the analysis of the human visual system has remained a research focus. Applying this knowledge to digital image analysis can help researchers understand image features. Here, we describe four state-of-the-art DCT-based perceptual JND models.
In [12], the classic DCT-based JND model put forward by Watson consists of a luminance-based masking component, a sensitivity function and contrast masking. A frequency sensitivity table is defined in this model; each value represents the smallest DCT coefficient magnitude that can be perceived in the absence of noise, and a lower value indicates that the human visual system is more sensitive to that frequency. M. Kim et al. [13,16] confirmed through psychophysical experiments that frequency characteristics are also major determinants of the JND threshold, and then proposed a novel DCT-based JND model containing three parts: the base threshold, luminance adaptation and contrast masking. Their model provides a more precise CSF, and spatial frequency is used as one of the determining factors in estimating the LA and CM effects. In [14], the authors put forward a DCT-based JND model that extracts frequency characteristics from the corresponding psychophysical experimental results; this model is the product of three parts, namely the base threshold and the masking factors for LA and CM, and introduces a new approach to calculating edge strength and edge density. More recently, an orientation regularity-based JND model, which uses orientation features as an essential factor in analyzing visual content, was proposed in [15]. The authors adopt frequency texture energy and orientation regularity to classify DCT blocks into five types, and measure the CM effect based on the block types and their HVS sensitivities. Compared with the conventional Watson model, the other models do a better job of simulating human visual characteristics. However, applying Kim's model or the model of [15] to the STDM framework causes a mismatch problem, mainly because the JND value estimated from the watermarked image differs from that estimated from the original image. The model of [14] can be applied in the STDM framework and solves the mismatch problem; nevertheless, it does not consider structural characteristics, so the visual redundancies cannot be estimated well. Therefore, a more precise and robust perceptual model is needed to generate the perceptual slack vector for the STDM watermarking framework.
3. The Proposed Robust Perceptual Model-based Watermarking Scheme
In the STDM watermarking framework, a high-precision JND model is introduced to accurately measure the visual redundancy of images and to improve the quality of the watermarked images. In this section, we first introduce a new robust structural regularity-based JND model in which the sensitivity of the HVS to image structure is treated as an important factor, and then we apply the perceptual JND model to estimate the quantization step size of each sample in the STDM-based watermark embedding and detection procedures.
3.1 The Proposed Perceptual Just Noticeable Distortion Model
A novel JND estimation model based on structural regularity is proposed in this article, modeling the joint effect of the contrast sensitivity function, luminance adaptation and contrast masking. We focus on the JND model for fixed-size 8×8 DCT blocks. For the n-th block in the DCT domain, the JND threshold \(T_{j n d}(n, x, y)\) at position (x, y) is expressed as
\(T_{j n d}(n, x, y)=T_{B} \cdot F_{L A} \cdot F_{M}\) (4)
where FLA is the modulation factor for luminance adaptation, FM is the modulation factor for contrast masking, and TB refers to the base threshold.
3.1.1 Contrast Sensitivity Function
TB reflects the spatial CSF and varies only with spatial frequency [18]. The base threshold TB for each spatial frequency wx,y can be calculated by
\(T_{B}(n, x, y)=\frac{1}{4 \phi_{x} \phi_{y}} \cdot \frac{\exp \left(0.18 w_{x, y}\right) / 1.33+0.11 w_{x, y}}{0.6+0.4 \cos ^{2} \varphi_{x, y}}\) (5)
where \(\varphi_{x, y}\) is the direction angle, given by Equation (6), and \(\phi_{x}\) and \(\phi_{y}\) are normalization factors, calculated as in Equation (7):
\(\varphi_{x, y}=\arcsin \left(\frac{2 w_{x, 0} \cdot w_{0, y}}{w_{x, y}^{2}}\right)\) (6)
\(\phi_{v}=\left\{\begin{array}{l} {\sqrt{1 / 8}, v=0} \\ {\sqrt{2 / 8}, v>0} \end{array}\right.\) (7)
wx,y represents the spatial frequency in cycles per degree for the (x, y)-th DCT coefficient and is defined as
\(w_{x, y}=\frac{1}{2 \times 8} \sqrt{\left(x / \theta_{h}\right)^{2}+\left(y / \theta_{v}\right)^{2}}\) (8)
where θh and θv are the horizontal and vertical visual angles of a pixel, respectively.
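As a concrete illustration, the sketch below evaluates Equations (5)-(8) for one 8×8 block in Python. The default viewing angles θh = θv = 1/32 degree per pixel and the handling of the DC term are our own assumptions, not values from the paper.

```python
import numpy as np

def base_threshold(theta_h=1/32, theta_v=1/32, N=8):
    # Eq. (7): DCT normalization factors phi_v
    phi = np.array([np.sqrt(1 / N)] + [np.sqrt(2 / N)] * (N - 1))
    T = np.zeros((N, N))
    for x in range(N):
        for y in range(N):
            wx = x / (2 * N * theta_h)         # w_{x,0}
            wy = y / (2 * N * theta_v)         # w_{0,y}
            w = np.hypot(wx, wy)               # Eq. (8): cycles per degree
            if w == 0:
                continue                       # DC term: CSF not defined here
            ang = np.arcsin(np.clip(2 * wx * wy / w**2, -1, 1))   # Eq. (6)
            # Eq. (5): base threshold from spatial frequency and direction
            T[x, y] = (np.exp(0.18 * w) / 1.33 + 0.11 * w) / \
                      (4 * phi[x] * phi[y] * (0.6 + 0.4 * np.cos(ang)**2))
    return T
```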
3.1.2 Luminance Adaptation Effect
The luminance masking threshold usually depends on the background brightness of a local image area: the brighter the background, the higher the masking value [19]. The luminance adaptation modulation factor FLA, which follows the model of [19], can be expressed as
\(F_{L A}=\left\{\begin{array}{ll} {(60-\bar{u}) / 150+1,} & {\bar{u} \leq 60} \\ {1,} & {60<\bar{u}<170} \\ {(\bar{u}-170) / 425+1,} & {\bar{u} \geq 170} \end{array}\right.\) (9)
where \(\bar{u}\) is the average intensity of the DCT block.
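Equation (9) transcribes directly into a few lines of Python; a 0-255 intensity range for \(\bar{u}\) is assumed.

```python
import numpy as np

def luminance_adaptation(block):
    # Eq. (9): piecewise modulation factor from the block's mean intensity
    u_bar = float(np.mean(block))
    if u_bar <= 60:
        return (60 - u_bar) / 150 + 1
    if u_bar < 170:
        return 1.0
    return (u_bar - 170) / 425 + 1
```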
3.1.3 Contrast Masking Effect
The CM factor is usually determined by the edge density. However, a phenomenon exists: some areas that contain many structurally regular edges or orderly texture may be classified as blocks with high texture complexity. In fact, the HVS is sensitive to structurally regular information [20], which indicates that structural regularity is one of the important factors for estimating the contrast masking effect. Therefore, we propose a new CM model that analyzes structural features and calculates texture complexity.
A. Direction Energy
The regularity, which indicates how the content structure changes along different directions, can be accurately reflected by AC coefficients [21]. Consequently, we choose the AC coefficients AC0,1 , AC1,0 and AC1,1 (the subscripts indicate the index within the DCT block) to measure the energy along the horizontal, vertical and diagonal directions, respectively. The horizontal energy Ehor of a block is defined as
\(E_{\text {hor}}(i, j)=\left|A C_{0,1}(i, j)\right|\) (10)
where (i, j) denotes the position of the block. Similarly, the vertical energy Ever and the diagonal energy Edia are calculated as follows, respectively:
\(E_{v e r}(i, j)=\left|A C_{1,0}(i, j)\right|\) (11)
\(E_{d i a}(i, j)=\left|A C_{1,1}(i, j)\right|\) (12)
The maximum value Emax and the median value Emed of the block's direction energies are obtained by comparing the horizontal energy Ehor , the vertical energy Ever and the diagonal energy Edia. To determine whether a block has a dominant direction, the ratio Er is defined as
\(E_{r}=\frac{E_{\max }}{E_{m e d}+\gamma}\) (13)
where γ is a small constant given by \(\gamma=(0.03 \times L)^{2}\) , and L denotes the gray level range of the image.
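The direction-energy measure of Equations (10)-(13) reduces to a few lines; in the sketch below, `blk` is assumed to hold a block's 8×8 DCT coefficients and L = 255 gray levels.

```python
import numpy as np

def direction_ratio(blk, L=255):
    e_hor = abs(blk[0, 1])                 # Eq. (10): horizontal energy
    e_ver = abs(blk[1, 0])                 # Eq. (11): vertical energy
    e_dia = abs(blk[1, 1])                 # Eq. (12): diagonal energy
    e_sorted = sorted([e_hor, e_ver, e_dia])
    e_max, e_med = e_sorted[2], e_sorted[1]
    gamma = (0.03 * L) ** 2                # stabilizing constant
    return e_max / (e_med + gamma)         # Eq. (13): dominance ratio E_r
```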
B. Edge Pixel Density
The edge density uedge, which can be obtained with an edge detection operator, goes hand in hand with the CM factor [22]. Among the commonly used edge detectors, the Canny operator is a powerful one that provides good and reliable performance. The uedge detected by the Canny operator is given by
\(u_{\text {edge}}=\left(1 / N^{2}\right) \sum_{i=0}^{N-1} \sum_{j=0}^{N-1} E d g e_{i, j}\) (14)
where Edgei,j denotes the edge value at position (i, j) in the edge map (0 for a non-edge pixel, 1 for an edge pixel) and N denotes the size of the DCT blocks.
However, when the JND model is used within a watermarking scheme, detecting the edge density with the Canny operator results in a mismatch problem: the CM factor obtained from the watermarked image is inconsistent with that obtained from the host image, even without any attacks. To settle this mismatch problem, the model of [14] used two AC coefficients to measure the edge strength of a block. Although this method avoids the edge computation with the Canny operator, it does not consider the texture complexity of a block in the diagonal direction. To calculate the CM factor more accurately while avoiding the mismatch problem, we select the AC coefficients AC0,1 , AC1,0 and AC1,1 to measure the texture complexity of a block in the horizontal, vertical and diagonal directions. The new edge strength of a block is defined as
\(S_{A C}(i, j)=\left|A C_{0,1}(i, j)\right|+\left|A C_{1,0}(i, j)\right|+\left|A C_{1,1}(i, j)\right|\) (15)
where (i, j) denotes the position of the block. The new edge density uEDGE is then given by
\(u_{E D G E}=\ln \left[1+\eta \cdot \frac{S_{A C}-\min \left(S_{A C}\right)}{\max \left(S_{A C}\right)-\min \left(S_{A C}\right)}\right]\) (16)
To bring the value of uEDGE closer to the edge density uedge computed by the Canny operator, we carried out an experiment to obtain the optimal value of the parameter η:
\(\eta=\arg \min _{\eta \in[0.01,0.81]} D\) (17)
\(D=\left|u_{e d g e}-u_{E D G E}\right|^{2}\) (18)
where η varies from 0.01 to 0.81 in steps of 0.02. The optimal η is the value at which D attains its minimum. According to the experimental results, η is set to 0.47, and the corresponding minimum Euclidean distance D is 0.9753. In [14], the parameter was empirically set to 0.57, for which the D obtained in this experiment is 1.1786. This confirms that the edge density calculated by Equation (16) is closer to that obtained by the Canny operator. The curve fitting of parameter η is shown in Fig. 1.
Fig. 1. The curve fitting graph of parameter η.
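A sketch of Equations (15)-(16) over all blocks of an image is given below, using the calibrated η = 0.47 reported above. The image-wide min/max normalization and the small epsilon guard against a zero denominator are our assumptions.

```python
import numpy as np

def edge_density(dct_blocks, eta=0.47):
    # dct_blocks: (num_blocks, 8, 8) array of per-block DCT coefficients
    s_ac = (np.abs(dct_blocks[:, 0, 1]) +
            np.abs(dct_blocks[:, 1, 0]) +
            np.abs(dct_blocks[:, 1, 1]))            # Eq. (15): edge strength
    span = s_ac.max() - s_ac.min()
    s_norm = (s_ac - s_ac.min()) / (span + 1e-12)   # image-wide normalization
    return np.log(1 + eta * s_norm)                 # Eq. (16): edge density
```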
C. CM estimation model based on Block Classification
As mentioned above, to raise the accuracy of block classification, structure is also regarded as one of the significant factors in the CM model. In this section, we propose a new CM model that employs two block classification methods to obtain more accurate classification results.
As shown in the Algorithm below, we first analyze the structural features of the blocks. For the current block, Ehor , Ever and Edia are calculated, and then the maximum value Emax , the median value Emed and their ratio Er are obtained by Equations (10)-(13). If Er does not exceed a certain threshold T1 , the current block has no dominant direction.

Algorithm: Block Classification (I, T1, T2)

If Er does exceed T1 but the average direction energy of the current block is higher than the dominant direction energy, the current block carries almost no direction information. In both of these cases, the current block is classified as disorderly. Otherwise, the direction corresponding to the maximum value Emax is taken as the primary direction of the current block, and the block is classified as orderly. This direction energy-based classification effectively distinguishes orderly-texture blocks from regions of high texture complexity. Next, the edge density uEDGE of the current block is calculated by Equation (16). If uEDGE is less than a certain threshold T2 , the current block has low texture complexity and is classified as a plain block; otherwise, it is classified as a texture block.

Furthermore, if the current block is both a texture block and an orderly block, it belongs to the orderly-texture type; if it is both a texture block and a disorderly block, it belongs to the disorderly-texture type. Finally, by analyzing the direction features and calculating the edge pixel density, all blocks can be classified into three types: plain, disorderly-texture and orderly-texture. In the experiments, the thresholds T1 and T2 are empirically set to 2 and 0.19, respectively.
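A simplified sketch of this classification logic is shown below, reusing the `direction_ratio` and `edge_density` helpers from the earlier sketches with T1 = 2 and T2 = 0.19; the secondary average-energy test for disorderly blocks is omitted for brevity.

```python
def classify_blocks(dct_blocks, T1=2.0, T2=0.19):
    # Reuses edge_density() and direction_ratio() from the earlier sketches
    u_edge = edge_density(dct_blocks)
    labels = []
    for blk, u in zip(dct_blocks, u_edge):
        if u < T2:
            labels.append('plain')        # low texture complexity
        elif direction_ratio(blk) > T1:   # a dominant direction exists
            labels.append('orderly-texture')
        else:
            labels.append('disorderly-texture')
    return labels
```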
Regarding the contrast masking effect, human eyes have different sensitivities to the various block types. The HVS is more sensitive to plain regions than to orderly-texture areas, and least sensitive to disorderly-texture regions. The contrast masking values vary accordingly with HVS sensitivity. Therefore, to reflect these variations of the contrast masking effect, the masking factors for blocks with horizontal, vertical and diagonal dominant directions are given as follows:
\(M_{h o r}(x, y)=0.8-\nabla \cdot y\) (19)
\(M_{v e r}(x, y)=0.8-\nabla \cdot x\) (20)
\(M_{d i a}(x, y)=0.8-\nabla \cdot k, k=\max (x, y)\) (21)
where x and y are the DCT sub-band indices and ∇ is a constant set to 0.1. Finally, by incorporating the sensitivity of the HVS to spatial frequency, the CM factor for each block type is given by
\(F_{M}=\left\{\begin{array}{ll} {1,} & {\text{for plain blocks}} \\ {1+M_{hor},} & {\text{for horizontal orderly-texture blocks}} \\ {1+M_{ver},} & {\text{for vertical orderly-texture blocks}} \\ {1+M_{dia},} & {\text{for diagonal orderly-texture blocks}} \\ {2.25,} & {\text{for } (x^{2}+y^{2}) \leq 16 \text{ in disorderly-texture blocks}} \\ {1.25,} & {\text{for } (x^{2}+y^{2}) > 16 \text{ in disorderly-texture blocks}} \end{array}\right.\) (22)
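The per-block masking factor of Equations (19)-(22) can be sketched as follows; the block type and dominant-direction labels come from the classification above, and the function name is ours.

```python
import numpy as np

def contrast_masking(block_type, direction=None, N=8, grad=0.1):
    x, y = np.meshgrid(np.arange(N), np.arange(N), indexing='ij')
    if block_type == 'plain':
        return np.ones((N, N))
    if block_type == 'orderly-texture':
        if direction == 'horizontal':
            m = 0.8 - grad * y                       # Eq. (19)
        elif direction == 'vertical':
            m = 0.8 - grad * x                       # Eq. (20)
        else:
            m = 0.8 - grad * np.maximum(x, y)        # Eq. (21): diagonal
        return 1.0 + m
    # disorderly-texture: Eq. (22), split by spatial-frequency radius
    return np.where(x**2 + y**2 <= 16, 2.25, 1.25)
```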
3.2 Proposed JND Model-based STDM Watermarking Scheme
The proposed structural regularity-based JND model is applied to modulate the quantization step in the STDM watermarking scheme. The proposed watermarking scheme consists of two parts: the embedding procedure and the extraction procedure. Fig. 2 illustrates the embedding and extraction steps.
Taking the Lena image as an example, the watermark embedding and extraction procedures are as follows.
Watermark embedding procedure
- The Lena image is divided into 8×8 blocks, and the DCT is performed on each block to obtain the DCT coefficients. The coefficients are then scanned in zigzag order, and selected coefficients (the fourth, the sixth to tenth, and the thirteenth in the zigzag sequence) are taken from each block to form a host vector x.
- The perceptual redundancy vector s is obtained according to the proposed perceptual JND model.
- The host vector x and the perceptual redundancy vector s are projected onto the given projection vector u (which can serve as a key) to generate the projections xu and su. The quantization step ∆ is calculated from su and can be multiplied by an embedding strength factor in practice.
- One bit of the watermark message m is embedded into the host vector projection xu according to Equation (1). The inverse DCT is then applied to the modified coefficients to obtain the watermarked image.
Watermark extraction procedure
- A host vector x′ is acquired in the same way as in step 1 of the watermark embedding procedure.
- The perceptual redundancy vector s′ is obtained from the proposed perceptual JND model.
- The host vector x′ and the perceptual redundancy vector s′ are projected onto the given projection vector u to generate the projections xu′ and su′, respectively. The quantization step ∆ is calculated from su′ and can be multiplied by the same embedding strength factor.
- According to Equation (3), the watermark bit m′ is extracted using the DM detector.
Fig. 2. Flowchart of proposed watermarking scheme.
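The sketch below ties the pieces together for one block: the named zigzag coefficients form the host vector, the JND slacks are projected onto u, and the step is set to ∆ = 2su before calling the `stdm_embed` routine from the Section 2.1 sketch. The 1-based counting of the zigzag sequence and the projection of the slacks via |u| are assumptions on our part.

```python
import numpy as np

# 4th, 6th-10th and 13th positions of the zigzag sequence (1-based counting)
SELECTED = [3, 5, 6, 7, 8, 9, 12]

def zigzag_indices(N=8):
    # (row, col) pairs of an NxN block in zigzag scan order
    return sorted(((i, j) for i in range(N) for j in range(N)),
                  key=lambda p: (p[0] + p[1],
                                 p[0] if (p[0] + p[1]) % 2 else p[1]))

def embed_block(dct_block, jnd_block, u, m, alpha=1.0):
    zz = zigzag_indices()
    idx = [zz[k] for k in SELECTED]
    x = np.array([dct_block[i, j] for i, j in idx])   # host vector
    s = np.array([jnd_block[i, j] for i, j in idx])   # perceptual slacks
    s_u = float(np.dot(s, np.abs(u)))                 # projected slack s_u
    step = 2.0 * alpha * s_u                          # quantization step 2*s_u
    y = stdm_embed(x, u, step, m)                     # Eq. (1), Section 2.1
    out = dct_block.copy()
    for (i, j), v in zip(idx, y):
        out[i, j] = v
    return out
```

Because the slacks of the proposed model can be recomputed consistently from the watermarked image, the detector can rebuild the same step ∆ from su′ without side information.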
4. Experimental Results
4.1. Evaluation of The Proposed Structural Regularity-based JND Model
A better JND model can tolerate more noise in the DCT coefficients of an image at the same perceptual quality. Therefore, we evaluate the performance of the proposed JND model by injecting noise into the DCT coefficients according to the corresponding JND values:
\(C^{\prime}(n, x, y)=C(n, x, y)+\lambda \cdot J N D(n, x, y)\) (23)
where \(C^{\prime}(n, x, y)\) is the modified DCT coefficient; λ takes a random value of +1 or −1; and JND(n, x, y) denotes the threshold obtained from Watson's model [12], Kim's model [13], the model of [14], the model of [15] or the proposed model, respectively.
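The noise injection of Equation (23) is a one-liner over the coefficient array; the function name and fixed seed below are illustrative.

```python
import numpy as np

def inject_jnd_noise(dct_coeffs, jnd, seed=0):
    # Eq. (23): perturb every coefficient by its JND value with a random sign
    rng = np.random.default_rng(seed)
    lam = rng.choice([-1.0, 1.0], size=dct_coeffs.shape)
    return dct_coeffs + lam * jnd
```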
To verify the performance of the proposed JND model, eight standard images, 'Camera', 'Columbia', 'Crowd', 'Elaine', 'Lena', 'Peppers', 'Plane' and 'Woman', were used as testing images [23]. Each testing image is of size 256 × 256, as shown in Fig. 3.
Fig. 3. Original cover images: Camera, Columbia, Crowd, Elaine, Lena, Peppers, Plane, Woman.
In this experiment, the structural similarity index measurement (SSIM) was employed to measure the similarity between the original image and the distorted image obtained by injecting noise according to the JND values generated by each model. The SSIM index is calculated on blocks x and y of an image [26]:
\(\operatorname{SSIM}(x, y)=\frac{\left(2 \mu_{x} \mu_{y}+c_{1}\right)\left(2 \sigma_{x y}+c_{2}\right)}{\left(\mu_{x}^{2}+\mu_{y}^{2}+c_{1}\right)\left(\sigma_{x}^{2}+\sigma_{y}^{2}+c_{2}\right)}\) (24)
where µx and µy are the means of x and y, respectively; \(\sigma_{x}^{2}\) and \(\sigma_{y}^{2}\) are the variances of x and y, respectively; σxy is the covariance of x and y; and \(c_{1}=\left(k_{1} L\right)^{2}\) and \(c_{2}=\left(k_{2} L\right)^{2}\) are two variables that stabilize the division when the denominator is weak, where L is the dynamic range of the pixel values and k1 = 0.01 and k2 = 0.03 by default.
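In practice the comparison can rely on an off-the-shelf implementation of Equation (24); the sketch below uses scikit-image, whose defaults match k1 = 0.01 and k2 = 0.03.

```python
from skimage.metrics import structural_similarity

def jnd_score(original, distorted):
    # Lower SSIM at equal perceived quality means a better JND model
    return structural_similarity(original, distorted, data_range=255)
```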
At the same perceived quality, a better JND model yields a lower SSIM value. To describe the comparison results clearly and intuitively, we present them as a histogram in Fig. 4. As shown in Fig. 4, the average SSIM values of Watson's model, Kim's model, the model of [14], the model of [15] and the proposed model are 0.7235, 0.5487, 0.5156, 0.3812 and 0.4811, respectively. The SSIM value of the proposed model is lower than those of Watson's model and Kim's model, which means that the proposed model can tolerate more distortion. The average SSIM value of the proposed model is also lower than that of the model of [14], because the DCT coefficients AC0,1(x, y), AC1,0(x, y) and AC1,1(x, y) are used to calculate the edge intensity of the (x, y)-th block in the proposed model, whereas the model of [14] uses only the two AC coefficients AC0,1(x, y) and AC1,0(x, y). It can also be observed from Fig. 4 that the average SSIM of the proposed model is higher than that of the model of [15], because all the DCT coefficients are used to calculate the frequency texture energy and direction feature energy in [15]. Although three AC coefficients alone cannot fully capture a block's texture complexity, the result can come close to that obtained with the edge detection operator, and the classification results produced by the two proposed classification methods are almost the same as those in [15]. The following results prove that the proposed model is more efficient and applicable in the STDM watermarking framework.
Fig. 4. SSIM comparison of different JND models
4.2. Evaluation of Different JND Models for STDM Watermarking Algorithms
This experiment compares the performance of different JND models used in the same STDM watermarking scheme [12-15]. Each testing image is a standard 256×256 image obtained from the USC-SIPI image database [23]. A random message of 1024 bits is embedded into each original image. The fourth, the sixth to tenth, and the thirteenth coefficients are selected from the DCT zigzag sequence of each block to form a host vector, into which one bit is embedded. The bit error rate (BER) is calculated for comparison when the SSIM value between a testing image and its watermarked version is 0.982. To test the robustness of the various models, several signal processing attacks are applied to the watermarked images.
Table 1. BER of noise attacks in STDM scheme with different JND models
Table 2. BER of JPEG attacks in STDM scheme with different JND models
Table 3. BER of scaling attacks in STDM scheme with different JND models
Table 4. BER of filtering attacks in STDM scheme with different JND models
Noise is easily and inevitably introduced during image transmission. The results for watermarked images attacked by Gaussian noise and Salt-and-Pepper noise are shown in Table 1. The watermarked images were tested against Gaussian noise with zero mean and different variances. Compared with the other JND models, the proposed model always has the lowest BER across the different noise intensities, which indicates that it performs much better than the others.
Robustness against JPEG compression is an important property to evaluate. As shown in Table 2, the five JND models exhibit different performance within the STDM watermarking algorithm under JPEG compression attacks. The average BERs of Watson's model, Kim's model, the model of [14], the model of [15] and the proposed model are 0.0927, 0.0789, 0.0863, 0.2111 and 0.0662, respectively.
Table 3 presents the BER values of the five JND models when the watermarked image undergoes volumetric scaling attacks with different scaling factors. For scaling factors ranging from 0.1 to 1, the proposed model performs better than the other models. For a scaling factor of 1.3, the proposed model is not the best, but its BER is still much lower than that of the model of [15].
Filtering is another classical class of attacks; for example, median filtering and Gaussian filtering are often used to attack watermarked images. Table 4 presents the filtering comparison results. For median filtering with a 3×3 window, the BER of the proposed model is 0.2% higher than that of the model of [14] but lower than those of the remaining models. For Gaussian filtering and median filtering with a 5×5 window, the proposed model has the lowest BER. In summary, the STDM watermarking scheme based on the proposed model performs excellently.
4.3. Evaluation of The Proposed Model-based STDM Watermarking Algorithms
In this section, the superior performance of the proposed scheme is demonstrated by comparing it with popular STDM watermarking schemes. We report results on the test image 'Lena' (size 256×256), into which a random message of 1024 bits was embedded.
Fig. 5. Watermark insertion in Lena image. (a) Original image, (b) Watermarked image
Fig. 5 shows the host image and the watermarked image without any attacks. The similarity between the original image (a) and the watermarked image (b) is measured by Equation (24); in general, a larger SSIM indicates that the watermarked image resembles the host image more closely. The SSIM value between (a) and (b) is 0.9893, which means that the watermark is imperceptible.
The eight images in Fig. 3 are used to measure the execution time required by each STDM watermarking scheme, implemented in the Matlab® environment. Table 5 presents the average time required by the different schemes.
Table 5. Average execution time per image (in seconds) for the compared schemes
To further demonstrate the robustness of the proposed model-based STDM watermarking scheme, it was compared with previous STDM watermarking schemes, namely AdpWM [24], OptiWM [11], RDMWm [17] and RW [25].
Table 6. BER of noise attacks for different STDM watermarking schemes
Table 7. BER of JPEG attacks for different STDM watermarking schemes
Table 8. BER of scaling attacks for different STDM watermarking schemes
Table 9. BER of filtering attacks for different STDM watermarking schemes
Table 6 shows the performance of the five STDM watermarking algorithms under Gaussian noise and Salt-and-Pepper noise attacks. OptiWM and RW perform poorly. The BER of the proposed scheme does not exceed 12.3% for Gaussian noise with a variance of 0.9×10⁻³, and the scheme performs better than the others under Gaussian noise attacks. For Salt-and-Pepper attacks of the same intensity, the proposed scheme has the lowest BER among the compared STDM schemes.
Table 7 shows the sensitivity to JPEG compression. The proposed scheme is clearly more robust against JPEG compression than the other schemes. The average BER of OptiWM exceeds 13.4%, while the other schemes have average BERs below 9%; it is worth noting that the average BER of the proposed scheme is 6.22%. The resistance of the five schemes to volumetric scaling attacks is presented in Table 8. RW shows weak resistance to volumetric scaling. The average BER of OptiWM exceeds 5.6%, and the average BERs of AdpWM and RDMWm are no lower than 1.5%, whereas the proposed scheme has the lowest average BER, 0.64%.
Table 9 shows the responses to median filtering and Gaussian filtering attacks. For median filtering, AdpWM shows the best performance; the average BER of the proposed scheme is 0.51% higher than that of AdpWM but lower than those of the remaining schemes. For Gaussian filtering, RDMWm has the lowest BER, with the proposed scheme only 0.19% higher, while the BERs of the other schemes are higher than that of the proposed scheme.
Taking all the experimental results together, the proposed scheme achieves the best overall performance.
5. Conclusion
In this paper, a new perceptual JND model that conforms to human visual perception characteristics was introduced, which can effectively measure the masking effect. Using this perceptual model in the STDM watermarking scheme not only matches human visual perception but also achieves a better balance between robustness and fidelity. Specifically, new measurement methods for block classification and edge strength evaluation were proposed, which estimate the perceptual slacks more effectively and accurately. Experimental results show that, at uniform fidelity, the proposed scheme is more robust than watermarking algorithms based on existing JND models and outperforms previously proposed perceptual STDM frameworks.
In the proposed scheme, the structural regularity of an image is an important factor in establishing the JND model. For an image with poor structural regularity, the performance of the proposed scheme may not be satisfactory. Therefore, measuring the structural strength of an image is a problem that urgently needs to be solved; based on the structural strength of an image, the scheme could be adjusted accordingly, which is the focus of future research.
References
- C. Qin, X. Chen, D. Ye, J. Wang, X. Sun, "A novel image hashing scheme with perceptual robustness using block truncation coding," Information Sciences, vol. 361-362, pp. 84-99, September 2016. https://doi.org/10.1016/j.ins.2016.04.036
- C. Qin, X. Chen, X. Luo, X. Zhang, X. Sun, "Perceptual image hashing via dual-cross pattern encoding and salient structure detection," Information Sciences, vol. 423, pp. 284-302, January 2018. https://doi.org/10.1016/j.ins.2017.09.060
- V. Monga, M.K. Mihcak, "Robust and secure image hashing via non-negative matrix factorizations," IEEE Transactions on Information Forensics and Security, vol. 2, no. 3, September 2007.
- S. A. Parah, J. A. Sheikh, N. A. Loan, "Robust and blind watermarking technique in DCT domain using inter-block coefficient differencing," Digital Signal Processing, vol. 53, pp. 11-24, 2016. https://doi.org/10.1016/j.dsp.2016.02.005
- B. Chen, G. Wornell, "Quantization index modulation: a class of provably good methods for digital watermarking and information embedding," IEEE Transactions on Information Theory, vol. 47, no. 4, pp. 1423-1443, May 2001. https://doi.org/10.1109/18.923725
- J. Zong, L. Meng, H. Zhang, W. Wan, "JND-based multiple description image coding," KSII Transactions on Internet and Information Systems, vol. 11, no. 8, pp. 3935-3949, August 2017. https://doi.org/10.3837/tiis.2017.08.010
- A. B. Watson, "DCTune: A Technique for Visual Optimization of DCT Quantization Matrices for Individual Images," Society for Information Display Digest of Technical Papers XXIV, pp. 946-949, 1993.
- J. Sun, X. Liu, W. Wan, J. Li, D. Zhao, H. Zhang, "Video hashing based on appearance and attention features fusion via DBN," Neurocomputing, vol. 213, no. 12, pp. 84-94, November 2016. https://doi.org/10.1016/j.neucom.2016.05.098
- N. B. Turkbrowne, J. Junge, B. J. Scholl, "The automaticity of visual statistical learning," Journal of Experimental Psychology: General, vol. 134, no. 4, pp. 552-564, November 2005. https://doi.org/10.1037/0096-3445.134.4.552
- J. R. Saffran, E. D. Thiessen, "Pattern induction by infant language learners," Developmental Psychology, vol. 39, no. 3, pp. 484-494, 2003. https://doi.org/10.1037/0012-1649.39.3.484
- Q. Li, I. J. Cox, "Improved spread transform dither modulation by using a perceptual model: robustness to amplitude scaling and JPEG compression," IEEE Int. Conf. Acoust. Speech Signal Process, vol. 2, pp. 185-188, 2007.
- A. B. Watson, "DCT quantization matrices optimized for individual images," Proc. SPIE, vol. 1913, pp. 202-216, 1993.
- S. H. Bae, M. Kim, "A Novel DCT-based JND model for luminance adaptation effect in DCT frequency," IEEE Signal Processing Letters, vol. 20, no. 9, pp. 893-896, July 2013. https://doi.org/10.1109/LSP.2013.2272193
- W. Wan, J. Liu, J. Sun et al, "Improved logarithmic spread transform dither modulation using a robust perceptual model," Multimedia Tools and Applications, vol. 75, no. 21, pp. 13481-13502, November 2016. https://doi.org/10.1007/s11042-015-2853-5
- W. Wan, J. Wu, X. Xie et al, "A novel just noticeable difference model via orientation regularity in DCT domain," IEEE Access, vol. 5, pp. 22953-22964, April 2017. https://doi.org/10.1109/ACCESS.2017.2699858
- S. H. Bae, M. Kim, "A new DCT-based JND model monochrome images for contrast masking effects with texture complexity and frequency," in Proc. of IEEE Int. Conf. Image Process, pp. 431-434, 2013.
- X. Li, J. Liu, J. Sun et al, "Step-projection-based spread transform dither modulation," IET Information Security, vol. 5, no. 3, pp. 170-180, 2011. https://doi.org/10.1049/iet-ifs.2010.0218
- J. Wu, W. Wan, G. Shi, "Content complexity based just noticeable difference estimation in DCT domain," in Proc. of Signal and Information Processing Association Annual Summit and Conference, pp. 1-5, 2017.
- Z. Wei, K. N. Ngan, "Spatio-temporal just noticeable distortion profile for grey scale image/video in DCT domain," IEEE Transactions on Circuits and Systems for Video Technology, vol. 19, no. 3, pp. 337-346, February 2009. https://doi.org/10.1109/TCSVT.2009.2013518
- J. Wu, G. Shi, W. Lin et al, "Just noticeable difference estimation for images with free-energy principle," IEEE Transactions on Multimedia, vol. 15, no. 7, pp. 1705-1710, June 2013. https://doi.org/10.1109/TMM.2013.2268053
- Y. Zhong, H. Zhang, A. K. Jain, "Automatic caption localization in compressed video," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 22, no. 4, pp. 385-392, 2000. https://doi.org/10.1109/34.845381
- J. Canny, "A computational approach to edge detection," IEEE Transaction on Pattern Analysis and Machine Intelligence, vol. 8, no. 6, pp. 679-698, 1986. https://doi.org/10.1109/TPAMI.1986.4767851
- USC-SIPI Image Database, Available online: http://sipi.usc.edu/database/ (accessed on 11 Mar. 2018)
- L. Ma, "Adaptive spread-transform dither modulation using a new perceptual model for color image," IEICE Trans. Inf. Sys, vol. E93.D, no. 4, pp. 843-857, 2010. https://doi.org/10.1587/transinf.E93.D.843
- Q. Li, G. Doerr, I. J. Cox, "Spread transform dither modulation using a perceptual model," in Proc. of IEEE Workshop on Multimedia Signal Processing, pp. 98-102, 2006.
- X. Z. Xie, C. C. Lin, C. C. Chang, "Data hiding based on a two-layer turtle shell matrix," Symmetry, vol. 10, no. 2, pp. 47, 2018. https://doi.org/10.3390/sym10020047