1. Introduction
Authenticity and identity verification of images have become increasingly important
with the rapid growth of multimedia applications that require the seamless
distribution of digital images. Authenticity usually covers copyright verification
of the original source and intended destination of an image, as well as
verification of the integrity of its contents. Hence, if an image is tampered with,
the authentication test would fail. Additionally, in digital image distribution,
identity information is usually attached so that the image can be traced, as in
fingerprinting applications.
While cryptographic protocols have traditionally been the primary technology for
meeting these security requirements of digital images, digital watermarking has
shown promise in addressing security issues such as authenticity, identity
verification, copyright protection, and fingerprinting [1] (I. Cox, J. Bloom, and
M. Miller, Digital Watermarking: Principles & Practice, Morgan Kaufmann Publishers,
2001). Digital watermarking is the process of embedding a digital message into
digital media. In an authentication watermark, the embedded message is usually a
variant of some sort of natural ‘signature’ of the original image, so that it can
be used to authenticate the validity of the content. A number of approaches have
been proposed for the design of authentication watermarks [2] (T. Liu and Z.-D.
Qiu, “The survey of digital watermarking-based image authentication techniques,”
6th International Conference on Signal Processing 2, 1556–1559 (2002)), with
varying degrees of success. Applications such as copyright protection and
fingerprinting embed a unique identification code that is associated with only that
specific image and can be used to uniquely identify the media.
As an example of the premise of this paper, consider an attack in which classified
information is altered and then leaked. The solution should, in this case, identify
the leaker and also determine whether the leaked image was tampered with. Our
proposed solution is to add a dual watermark for this dual purpose: a robust
watermark embeds identity information into the image, while a semi-robust
authentication watermark is used to determine whether tampering has occurred.
We employ a transform-domain watermark, which is generally found to be more robust
than its spatial-domain counterpart. In particular, the authentication part of the
current work is based on the Fourier phase-based watermark called ‘Phasemark’ [3]
(F. Ahmed and I. S. Moskowitz, “Correlation-based watermarking method for image
authentication applications,” Opt. Eng. 43, 1833–1838 (2004)). In Ref. [4] (F.
Ahmed and I. S. Moskowitz, “Phase Signature-based Image Authentication watermark
robust to compression and coding,” Proc. SPIE 5561, 133–144 (2004)), the authors
showed how Phasemark can be used in the wavelet domain for better compression
tolerance. In this work we extend that approach to improve compression tolerance
further while keeping the quality of the image at an acceptable level. The primary
advantage of using the wavelet domain for watermarking stems from the flexibility
of sub-band decomposition and a predictable compression tolerance [5] (P. Meerwald
and A. Uhl, “A survey of wavelet domain watermarking algorithms,” Proc. SPIE 4314,
505–516 (2001)).
2. Proposed Fourier-Wavelet domain watermark
The proposed method is based on the Fourier transform of selected subbands of the
wavelet decomposition of an image. Using wavelets, an image can be decomposed into a
low-resolution smooth image and a number of detail images [6] (S. Mallat, A Wavelet
Tour of Signal Processing, Academic Press, NY, 1998). Figure 1 shows the sub bands
resulting from a two-level decomposition.
Fig. 1. Level 2 decomposition showing the sub bands.
With one-level decomposition, we denote the smooth sub band as A1
(‘A’ stands for the approximate image and 1 is for level 1
decomposition). The detailed part has three components depending upon the
directional vector used in the decomposition. These are horizontal (H1), vertical
(V1), and diagonal (D1). With a second level of decomposition, the A1 sub band can
be further decomposed into an even smoother version (A2) and three more detailed sub
bands (H2, V2, and D2). The decomposition can go on for more levels of resolution.
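The decomposition described above can be sketched with a hand-written one-level Haar transform. This is a minimal illustration, assuming an orthonormal Haar filter and an image with even height and width; the function name `haar_dwt2` and the assignment of the labels H and V to particular filter orderings are assumptions made here, not details taken from the paper.

```python
import numpy as np

def haar_dwt2(x):
    """One-level 2-D orthonormal Haar DWT; height and width must be even."""
    x = np.asarray(x, dtype=float)
    s = np.sqrt(2.0)
    # Low-pass / high-pass along the horizontal direction (adjacent columns).
    lo = (x[:, 0::2] + x[:, 1::2]) / s
    hi = (x[:, 0::2] - x[:, 1::2]) / s
    # Repeat along the vertical direction (adjacent rows).
    a = (lo[0::2, :] + lo[1::2, :]) / s  # approximation (A)
    h = (lo[0::2, :] - lo[1::2, :]) / s  # horizontal detail (assumed naming)
    v = (hi[0::2, :] + hi[1::2, :]) / s  # vertical detail (assumed naming)
    d = (hi[0::2, :] - hi[1::2, :]) / s  # diagonal detail (D)
    return a, h, v, d

image = np.arange(64, dtype=float).reshape(8, 8)  # stand-in for a grayscale image
a1, h1, v1, d1 = haar_dwt2(image)  # level 1: A1, H1, V1, D1
a2, h2, v2, d2 = haar_dwt2(a1)     # level 2: A1 is decomposed again
```

Each level halves the subband size in both directions, so a 256×256 image gives 128×128 subbands at level 1 and 64×64 at level 2.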
Figure 2 shows the flowchart of the watermark embedding process using one level of
decomposition. Of the four sub bands, A1 and D1 are left unaffected by the
watermarking process. Since perceptual quality is mostly determined by A1, this
helps preserve image quality. In addition, since compression will most likely
affect the D1 sub band the most, keeping the watermark out of D1 makes it less
sensitive to compression. Note that A1 is used for computing the image hash, or
authentication signature, which is subsequently embedded in either the H1 or the V1
subband. The identity information is embedded in the other of these two subbands.
With two levels of decomposition, we choose A2 as the signature subband, and H2 and
V2 for authentication and identity embedding.
Looking at Fig. 2, it might be tempting to use only the Fourier transform and avoid
the wavelet transform altogether, since a DFT-only method would be more efficient.
There are, however, two problems with this. First, ‘conjugate symmetry’ of the
transform-domain coefficients is necessary to transform the image back to a
real-valued spatial-domain image. With the DFT-only method, the complete full-image
transform coefficients satisfy the conjugate symmetry condition, but the one-fourth
shares representing the approximate, horizontal, vertical, and diagonal detail
components do not satisfy it by themselves. With the proposed DWT-DFT method, on the
other hand, the individual subbands of the original image are Fourier transformed
independently, so the symmetry is maintained, admittedly at the cost of more
computation. Second, the DWT-DFT combination gives the flexibility of a wide range
of robustness and perceptual quality.
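The conjugate-symmetry argument can be checked numerically. The sketch below is only an illustration under assumptions: a quarter block of the full-image spectrum stands in for a "one-fourth share", a random array stands in for a real-valued wavelet subband, and the helper name `is_conjugate_symmetric` is hypothetical.

```python
import numpy as np

def is_conjugate_symmetric(F, tol=1e-9):
    """True if F[u, v] == conj(F[-u, -v]) with wrap-around indexing."""
    # Reverse both axes, then roll by one so that index 0 maps to itself.
    mirrored = np.roll(F[::-1, ::-1], shift=(1, 1), axis=(0, 1))
    return bool(np.allclose(F, np.conj(mirrored), atol=tol))

rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(8, 8)).astype(float)

full = np.fft.fft2(image)                      # DFT of the whole real image
quarter = full[:4, :4]                         # one quarter of that spectrum
subband_dft = np.fft.fft2(image[:4, :4])       # DFT of a real "subband" on its own

full_ok = is_conjugate_symmetric(full)             # symmetric
quarter_ok = is_conjugate_symmetric(quarter)       # not symmetric on its own
subband_ok = is_conjugate_symmetric(subband_dft)   # symmetric
```

Inverting `quarter` by itself yields a complex-valued array, whereas any real subband transformed independently keeps the symmetry needed for a real inverse.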
Fig. 2. DWT-DFT Domain Dual Watermark Embedding using level 1 decomposition
Now we shall show how the actual embedding process works, in terms of level 1
decomposition without any loss of generality.
2.1 Embedding authentication signature
A number of different image hashes have been reported in the literature [7] (J.
Fridrich and M. Goljan, “Robust hash functions for digital watermarking,” IEEE
Proc. Int. Conf. on Information Technology: Coding and Computing, 178–183 (2000)).
In Ref. [3], we showed the binary-phase-only-filter (BPOF) [8] (J. L. Horner and
J. R. Leger, “Pattern recognition with binary phase-only filters,” Appl. Opt. 24,
609–611 (1985)) signature to be an effective image authenticator. In the present
work, since the BPOF is extracted from an approximate sub band such as A1 or A2
instead of the original image, the resulting signature is expected to be even more
robust. In Ref. [4] we used the same sub band for signature and embedding. That had
the undesirable effect of changing the signature domain, so that even in the
absence of any degradation, the computed signature and the signature extracted from
the watermarked image would not be the same. We eliminate that problem by keeping
these two spaces separate: we use A1 as the signature space and H1 (or V1) as the
signature-hiding space for authentication verification.
Since both the signature generation and the bit-plane embedding use the Fourier
transform, it is instructive to look at the magnitude and phase of the discrete
Fourier transform (DFT) of an arbitrary image x(m,n). During the signature
generation phase, x(m,n) represents the A1 image. The DFT of x(m,n), denoted
H(u,v), can be represented in polar form with its magnitude and phase as follows:
$$H(u,v) = X(u,v)\,\exp[j\phi(u,v)], \qquad (1)$$
where X(u,v) is the magnitude of the frequency component, given by |H(u,v)|, and
ϕ(u,v) is the phase of H(u,v), given by
$$\phi(u,v) = \tan^{-1}\!\left[\frac{\operatorname{Im}\,H(u,v)}{\operatorname{Re}\,H(u,v)}\right]. \qquad (2)$$
The BPOF is 1 if cos(ϕ) ≥ 0, and 0 otherwise. To thwart any forging attack, the
actual signature is encrypted with a key, giving S(u,v) = E_k(BPOF). While any
general-purpose encryption technique can be used here, security of digital images
often relies on lightweight encryption for efficiency. In our case, we first
generate a random matrix of 1’s and 0’s using a key, then force Fourier symmetry
in it. Finally, we XOR the BPOF with this Fourier-symmetric random matrix. Next we
decide which of the subbands (H1 or V1) this signature will be hidden in. To do
that, we compute the energy content (in terms of the L2 norm) of the two subbands
and embed the authentication signature in the lower-energy subband. With two levels
of decomposition, the choice is between H2 and V2. We then take the Fourier
transform of the selected subband, keep its phase unchanged, and round the
magnitude so that it can be represented by a fixed number of bit planes. We
substitute one of the bit planes of the rounded magnitude with the signature S.
The rounded and modified magnitude spectrum is then combined with the unchanged
phase, and an inverse Fourier transform yields the watermarked wavelet subband
(H1w, if H1 was selected), which carries the authentication signature.
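The signature generation and bit-plane substitution steps can be sketched as below. This is a hedged illustration rather than the authors' implementation: the function names, the particular way symmetry is forced (copying one half of each mirrored pair onto the other), the NumPy random generator used as the keyed source, and the random stand-ins for A1 and H1 are all assumptions. Rounding of the final image after the inverse wavelet transform is not modeled.

```python
import numpy as np

def mirror(b):
    """Return b sampled at (-u, -v) with wrap-around, so index 0 maps to 0."""
    return np.roll(b[::-1, ::-1], shift=(1, 1), axis=(0, 1))

def force_symmetry(b):
    """Enforce b[u, v] == b[-u, -v] by copying one element of each pair."""
    idx = np.arange(b.size).reshape(b.shape)
    later = idx > mirror(idx)          # the later element of each mirrored pair
    out = b.copy()
    out[later] = mirror(b)[later]
    return out

def bpof(subband):
    """Binary phase-only signature: 1 where cos(phase) >= 0, else 0."""
    phase = np.angle(np.fft.fft2(subband))
    return (np.cos(phase) >= 0).astype(np.int64)

def key_matrix(key, shape):
    """Keyed pseudorandom 0/1 matrix with Fourier symmetry forced (assumed scheme)."""
    rng = np.random.default_rng(key)
    return force_symmetry(rng.integers(0, 2, size=shape, dtype=np.int64))

def embed_bitplane(subband, bits, bit_pos):
    """Replace bit plane `bit_pos` of the rounded DFT magnitude with `bits`."""
    F = np.fft.fft2(subband)
    mag = np.rint(np.abs(F)).astype(np.int64)
    mag = (mag & ~(1 << bit_pos)) | (bits << bit_pos)
    return np.fft.ifft2(mag * np.exp(1j * np.angle(F))).real  # phase unchanged

def extract_bitplane(subband, bit_pos):
    """Read bit plane `bit_pos` back from the rounded DFT magnitude."""
    mag = np.rint(np.abs(np.fft.fft2(subband))).astype(np.int64)
    return (mag >> bit_pos) & 1

rng = np.random.default_rng(1)
a1 = rng.uniform(0, 510, size=(16, 16))   # stand-in for the A1 subband
h1 = rng.normal(0, 20, size=(16, 16))     # stand-in for the lower-energy subband

S = bpof(a1) ^ key_matrix(key=1234, shape=a1.shape)   # S = E_k(BPOF)
h1_w = embed_bitplane(h1, S, bit_pos=3)               # watermarked subband
```

Because the BPOF of a real subband and the forced key matrix are both symmetric, the modified magnitude stays symmetric and the inverse transform remains real, which is why the symmetry step matters.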
2.2 Embedding Identity info
As mentioned above, the subband with the higher energy in the mid-frequency range
in wavelet decomposition is used for the spread spectrum watermarking of the
identity information, m. This could be a unique identification
number of an image, or a registration or serial number, or any other tracking
number.
The identity is first encoded using source coding, optionally with the help of
error correction and detection coding, and finally spread; the result is
represented as W_m, which is made the same size as the subband under
consideration. The key k is used as the seed of a pseudorandom number generator to
produce the pattern W_k, which is also the same size as the subband image. These
two binary patterns (W_k and W_m) are then combined, which can be as simple as an
XOR operation:
$$W_a = W_k \oplus W_m. \qquad (3)$$
As in the previous section, we embed this using Fourier-domain bit-plane embedding,
where W_a is the bit plane to be embedded and the rounded Fourier magnitude of the
selected subband is the cover in which it is hidden.
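A minimal sketch of forming the identity bit plane by XORing a keyed pattern with the spread message follows. Spreading by plain replication, the NumPy generator as the keyed source, and all function names are assumptions for illustration. The text does not say how Fourier symmetry of this plane is handled, so that step is not modeled here.

```python
import numpy as np

def spread(code_bits, shape):
    """W_m: replicate a short binary code until it fills a subband-sized matrix."""
    code_bits = np.asarray(code_bits, dtype=np.int64)
    n = shape[0] * shape[1]
    reps = -(-n // code_bits.size)            # ceiling division
    return np.tile(code_bits, reps)[:n].reshape(shape)

def key_pattern(key, shape):
    """W_k: keyed pseudorandom 0/1 pattern of the subband size."""
    rng = np.random.default_rng(key)
    return rng.integers(0, 2, size=shape, dtype=np.int64)

def identity_bitplane(code_bits, key, shape):
    """W_a = W_k XOR W_m."""
    return key_pattern(key, shape) ^ spread(code_bits, shape)

code = np.array([1, 0, 1, 1, 0, 0, 1, 0])    # toy 8-bit identity code
w_a = identity_bitplane(code, key=7, shape=(4, 4))
```

XORing `w_a` with the same keyed pattern recovers the spread message, which is what the detector relies on.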
2.3 Detection: authentication and identification
The detector in turn does the reverse operations of performing the authentication
test and then extracting the identity information. Note that since the two
processes in the embedder were done independently, the order of these two
operations in the detector does not make any difference. Also note that the
detector is a blind one which does not need the original image, but it makes use
of the knowledge of the keys used in the embedder as well as the wavelet
parameters used. It however does not need to know the specific subband used.
The detector performs the desired wavelet decomposition and identifies the
authentication and identification subbands by measuring their entropy (expressed in
this paper as the energy content, in terms of the L2 norm). Specifically, our
selection is based on the fact that authentication is a 1-bit decision, while
identification involves the extraction of multiple bits of embedded information.
Hence, identification needs a more embeddable subband, and we argue that the
subband with more energy (or entropy, in this case) is more embeddable.
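The energy-based choice can be written in a few lines. This is a sketch assuming the L2 norm is taken over all coefficients of each subband; the function name and the tie-breaking rule are assumptions.

```python
import numpy as np

def select_subbands(h, v):
    """Return (authentication, identification) subband names by L2-norm energy."""
    energy_h = np.linalg.norm(h)   # Frobenius norm, i.e. L2 norm of all coefficients
    energy_v = np.linalg.norm(v)
    # Higher energy -> identification; lower energy -> authentication.
    # Ties go to H for identification (arbitrary assumption).
    return ("V", "H") if energy_h >= energy_v else ("H", "V")

h1 = np.full((4, 4), 2.0)   # toy subband with more energy
v1 = np.full((4, 4), 1.0)   # toy subband with less energy
auth_band, ident_band = select_subbands(h1, v1)
```

Since the detector applies the same rule to the watermarked image, it does not need to be told which subband carried which watermark.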
For the authentication test, we take the magnitude spectrum of the Fourier
transform of the authentication subband and round the coefficients to represent
them in a number of bit planes. At this point, the detector needs to know which
bit plane was used for hiding the signature. This can be done either by using a
bit plane agreed upon between the embedder and the detector or, conceivably, by
having the detector select it by correlating each bit plane with the computed
phase-only information, as described below. After this bit-plane selection, the
encrypted embedded signature is extracted. We then use the shared key to generate
a Fourier-symmetric random matrix and XOR it with the extracted encrypted
signature to obtain the decrypted version S’(u,v).
From the phase information ϕ_A(u,v) of the Fourier transform of the approximate
subband (A1 or A2), we then compute the phase-only filter (POF), H_POF^A(u,v). The
next step is the correlation of the extracted signature S’(u,v) with this computed
POF, which is carried out in the Fourier domain [Eq. (5)].
The degree of correlation is used as a measure of the degree of authentication.
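One plausible reading of this Fourier-domain correlation is sketched below. The POF form exp(jϕ_A), the product with the 0/1 signature, and the function names are assumptions; the paper's own expressions for the POF and the correlation are not reproduced here, so this should be taken as an illustration of the idea only.

```python
import numpy as np

def bpof(subband):
    """Binary phase-only signature: 1 where cos(phase) >= 0, else 0."""
    return (np.cos(np.angle(np.fft.fft2(subband))) >= 0).astype(np.int64)

def correlation_plane(s_prime, approx_subband):
    """Inverse DFT of (extracted signature x POF of the approximate subband)."""
    phi_a = np.angle(np.fft.fft2(approx_subband))
    pof = np.exp(1j * phi_a)                 # assumed form of the POF
    return np.fft.ifft2(s_prime * pof)

rng = np.random.default_rng(0)
a1 = rng.uniform(0, 255, size=(64, 64))      # stand-in for the A1 subband

# Matching signature: energy concentrates in a single peak at the origin.
genuine = np.abs(correlation_plane(bpof(a1), a1)) ** 2
# Unrelated random bits: no dominant peak.
forged = np.abs(correlation_plane(rng.integers(0, 2, size=a1.shape), a1)) ** 2
```

With a matching signature, only coefficients whose phase has a non-negative cosine are kept, so their sum adds constructively at the origin; with unrelated bits the terms add incoherently.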
To extract the identity information, the identification subband is selected
first. After the Fourier transform of that subband, we round the magnitude and
then extract the embedded identity bit plane. It is then XORed with the key and
unspread to obtain the hidden identity information. The next section illustrates
the algorithm with a specific choice of parameters and examples.
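The identity recovery step can be sketched as an XOR with the keyed pattern followed by unspreading. Majority voting over the replicas is an assumption made here (the text does not specify how unspreading is done), as are the function names and the NumPy generator used as the keyed source.

```python
import numpy as np

def key_pattern(key, shape):
    """Keyed pseudorandom 0/1 pattern (same generator as assumed at the embedder)."""
    rng = np.random.default_rng(key)
    return rng.integers(0, 2, size=shape, dtype=np.int64)

def unspread(bit_plane, code_len):
    """Majority vote over the replicas of a code tiled across the plane."""
    flat = bit_plane.ravel()
    n_full = (flat.size // code_len) * code_len   # ignore a trailing partial replica
    votes = flat[:n_full].reshape(-1, code_len).mean(axis=0)
    return (votes >= 0.5).astype(np.int64)

def recover_identity(extracted_plane, key, code_len):
    """XOR with the keyed pattern, then unspread."""
    return unspread(extracted_plane ^ key_pattern(key, extracted_plane.shape), code_len)

# Toy round trip: 8-bit code replicated over an 8x8 plane, with three bit errors.
code = np.array([1, 0, 1, 1, 0, 0, 1, 0])
plane = np.tile(code, 8).reshape(8, 8) ^ key_pattern(42, (8, 8))
plane[0, 0] ^= 1
plane[1, 1] ^= 1
plane[2, 2] ^= 1
recovered = recover_identity(plane, key=42, code_len=8)
```

Replication gives each code bit several votes, so isolated extraction errors are outvoted.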
3. Simulation and results
3.1 Which Wavelet?
A small-tap wavelet such as the Haar filter is found to extract more detail
information than a large-tap wavelet such as dB6, as shown in Table 1. Since we
embed in the two high-energy detail subbands, we choose the ‘Haar’ wavelet for the
rest of the simulation.
Table 1. The energy of the subbands (using L2-norm)
| Wavelet | A | H | V | D |
|---|---|---|---|---|
| Haar | 50376 | 265 | 160 | 31 |
| dB6 | 50538 | 222 | 51 | 22 |
3.2 Image Database
We tested the algorithm using images from the SIPI database [9]. Figure 3 shows the
first-level wavelet decomposition of the 256×256 image ‘The Chemical Plant’. As
shown in Table 1, the horizontal subband H1 [Fig. 3(b)] contains more energy than
the vertical subband V1 [Fig. 3(c)]. Hence H1 is used to embed the identification
information, while V1 is used to hide the authentication information. We
arbitrarily used a 9-digit number for the identification. We convert it to binary
form and add some taint bits to obtain a 32-bit code. This is then replicated to
match the size of the subband, which is 128×128 for level 1 and 64×64 for level 2
decomposition. Note that A1 [Fig. 3(a)] is used to calculate the BPOF-based
signature, and D1 [Fig. 3(d)] is not changed in the embedding process.
Fig. 3. 1-level decomposition of the ‘Chemical Plant’ with
Haar Wavelet (a) Approximation, (b) Horizontal Detail, (c) Vertical, (d)
Diagonal Detail
3.3 Detection metrics
Authentication is performed by the correlation operation given by Eq. (5). We
tested different correlation metrics and eventually found the peak-to-secondary
ratio (PSR) to be more useful in this case. The PSR is defined in dB as
10 log10(peak energy / energy in the second peak). Figure 4(a) shows the PSR
performance at different watermarking strengths for 10 different 256×256 images.
In our case, watermark strength is related to the most significant bit plane we
use for embedding the authentication and identification watermarks. This is
explained more clearly in the next subsection.
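The PSR definition can be sketched as follows. Taking the "second peak" to be the second-largest energy value anywhere in the correlation plane is an assumption; the paper may exclude a neighborhood around the main peak. The function name is hypothetical.

```python
import numpy as np

def psr_db(corr_plane):
    """Peak-to-secondary ratio in dB: 10*log10(peak energy / second-peak energy)."""
    energy = np.abs(np.asarray(corr_plane)).ravel() ** 2
    # After partitioning, the last two entries are the two largest, in order.
    second, peak = np.partition(energy, -2)[-2:]
    return 10.0 * np.log10(peak / second)

# Toy correlation planes: one sharp peak versus two comparable peaks.
sharp = np.ones((4, 4))
sharp[0, 0] = 10.0            # energy 100 against a floor of 1
flat = np.ones((4, 4))        # no dominant peak
```

A plane with no dominant peak gives a PSR near 0 dB, which corresponds to the unmarked case, while a sharp peak gives a large positive value.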
We also investigated the use of a computationally less expensive ‘inner product’
metric to benchmark the authentication value. This is done by computing the inner
product of the BPOF computed from the phase information [H_POF^A(u,v)] of the
approximate band and the extracted BPOF. Figure 4(b) shows the result. While both
metrics yield similar performance, the correlation-based metric offers better
discrimination between a marked and an unmarked image. For the rest of the
simulation, we therefore use the PSR metric.
Fig. 4. Authentication Performance for a set of images at different strength a)
PSR value, b) Inner product value
It is clear from Fig. 4(a) that, for the unmarked case, the PSR is close to zero.
This means that the highest and second-highest peak values are similar, indicating
poor correlation and thus an unauthenticated image. To understand the
authentication characteristics, we need to know the sources of error in this
watermarking process. There are three. First, the rounding of the Fourier
magnitude in the embedder introduces some error. Second, the rounding of the image
after the final inverse wavelet transform in the embedder introduces some error.
Third, the detector introduces error when rounding the Fourier magnitude. Since
rounding error affects a lower-order bit plane more adversely than a higher-order
one, the error decreases and saturates as we increase the watermark strength (by
selecting a higher-order Fourier magnitude bit plane), yielding better
authentication. Finally, note that it is possible to define a detection threshold;
for example, a value of 5 dB (giving enough guard band) differentiates marked and
unmarked images very well.
To assess how well the identity verification works, we use the bit error rate
(BER), defined as the ratio of the number of incorrectly identified bits to the
total number of bits. Note that in real applications we do not know the actual
bits, so the identity number needs to be validated by the authenticity metric;
this is one of the interesting contributions of the present work.
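The BER and the idea of gating the identity on the authenticity metric can be sketched as below. The function names are hypothetical, and the 5 dB default simply reuses the example threshold mentioned in the text.

```python
import numpy as np

def bit_error_rate(true_bits, decoded_bits):
    """Fraction of decoded bits that differ from the true bits."""
    true_bits = np.asarray(true_bits)
    decoded_bits = np.asarray(decoded_bits)
    if true_bits.shape != decoded_bits.shape:
        raise ValueError("bit sequences must have the same shape")
    return np.count_nonzero(true_bits != decoded_bits) / true_bits.size

def validated_identity(decoded_bits, psr, threshold_db=5.0):
    """Return the decoded identity only if the authentication metric passes."""
    return decoded_bits if psr >= threshold_db else None

ber = bit_error_rate([1, 0, 1, 1], [1, 1, 1, 0])   # two of four bits differ
```

In a test bench the true bits are known, so the BER can be computed directly; in deployment only the gating function is available to decide whether the decoded identity should be trusted.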
3.4 Quality of the watermarked image
We use the widely accepted objective metric PSNR (peak signal-to-noise ratio) as
the quality indicator. The PSNR of a watermarked image I_w with respect to the
original image I_o (both represented in 8-bit gray scale with a peak intensity of
255) is given by
$$\mathrm{PSNR} = 10\log_{10}\frac{255^2}{\frac{1}{MN}\sum_{m=1}^{M}\sum_{n=1}^{N}\left[I_o(m,n)-I_w(m,n)\right]^2}\ \text{dB}, \qquad (6)$$
where M×N is the image size.
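The standard PSNR for 8-bit images, 10 log10(255² / MSE), can be computed as in the following sketch. The function name is an assumption, and returning infinity for identical images is a convention chosen here.

```python
import numpy as np

def psnr_db(original, watermarked, peak=255.0):
    """PSNR in dB between two images of the same shape."""
    original = np.asarray(original, dtype=float)
    watermarked = np.asarray(watermarked, dtype=float)
    mse = np.mean((original - watermarked) ** 2)
    if mse == 0:
        return float("inf")   # identical images
    return 10.0 * np.log10(peak ** 2 / mse)

i_o = np.zeros((4, 4))
i_w = np.ones((4, 4))          # every pixel off by one gray level, so MSE = 1
quality = psnr_db(i_o, i_w)    # equals 20*log10(255)
```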
In this work, the quality of the watermarked image depends primarily on the
selected subbands and the bit plane selected for Fourier magnitude embedding. For
a given subband, a higher watermark strength is obtained by selecting a
higher-order bit plane. Specifically, in our implementation, we use the following
relationship,
We do this to represent the strength on a sliding scale from 1 to a desired
maximum value. BIT_POS is the most significant bit plane of the Fourier magnitude
that is used in embedding. The maximum strength of 6 used in the simulation means
that we have hidden the signature in bit plane number 12 of the Fourier magnitude.
Typically, an increase in strength by 1 decreases the image quality by 6 dB in the
PSNR sense.
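The 6 dB figure is consistent with a simple amplitude argument, sketched here under the assumption that one strength step corresponds to moving up one bit plane and that replacing bit plane b perturbs the magnitude in proportion to 2^b. By Parseval's relation, the spatial-domain mean squared error then scales in the same way. This is a back-of-the-envelope check, not a derivation taken from the paper.

```latex
% Assumption: replacing bit plane b changes each rounded magnitude by at most 2^b,
% so the perturbation amplitude scales as 2^b and the MSE as 2^{2b}.
\mathrm{MSE}_b \propto 2^{2b}
\quad\Longrightarrow\quad
\mathrm{PSNR}_b - \mathrm{PSNR}_{b+1}
  = 10\log_{10}\frac{\mathrm{MSE}_{b+1}}{\mathrm{MSE}_b}
  = 20\log_{10} 2 \approx 6.02\ \text{dB}
```

Under the same assumption, the quoted pairing of strength 6 with bit plane 12 would correspond to a fixed offset of 6 between BIT_POS and the strength value.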
Fig. 5. Quality and Authentication Metric at a) Level 1 and b) Level 2
decomposition
Figure 5 shows a classical trade-off between the authentication and image quality
metrics for different watermark strengths. As mentioned earlier, PSR is used as
the metric for authentication and PSNR for measuring the quality of the
watermarked image. The figure shows that a stronger watermark makes the
authentication more robust while degrading the quality of the image. As an
example, a strength of 4 results in a quality of more than 42 dB.
Figure 5 also shows that, while the quality of the watermarked image remains
similar for 1-level and 2-level decomposition, the authentication value decreases
at level 2, which is primarily a correlation artifact. On the other hand, level-2
embedding is found to be more robust to compression, as shown below.
3.5 Compression performance
After watermark embedding, the image is compressed with different quality factors
and then the detector is run. We used the JPEG compression engine provided by
Matlab, along with its definition of the compression ‘quality factor’. Assuming a
detection threshold of 5 dB, with a quality of 42 dB (strength 4), the method
tolerates compression down to a quality factor of 85, as shown in Fig. 6(a). As we
increase the watermark strength from 4 to 5, it tolerates compression down to a
quality factor of approximately 60. As we further increase the strength, the image
can be compressed down to a quality factor of 40 with the authentication still
working perfectly. Comparing these results with level 2 decomposition, as in
Fig. 6(b), we see that the compression tolerance has increased significantly. For
a strength of 5 (corresponding to a PSNR of 36 dB), it can now tolerate
compression down to a quality factor as low as 15.
A more important concern is how robust the embedded identity information is
against compression. Figure 7 depicts the result. Our goal is to achieve a BER of
0 for complete compression tolerance. Again, level 2 embedding is generally found
to be more compression tolerant. Specifically, at strength 5, level 1 can tolerate
a quality factor of 75, while level 2 can go down to a quality factor of 30.
Fig. 6. Authentication Performance against Compression a) Level 1, b) Level
2.
Fig. 7. Identity verification Performance against Compression (BER vs compression
QF) (a) Level 1, (b) Level 2.
The above results demonstrate the efficacy of the proposed dual DWT-DFT domain
watermark, which integrates the flexibility of embedding parameter selection
offered by the DWT with the robustness of the signature offered by the DFT. While
the authentication performance is found to be similar to or better than that of
some other contemporary transform-domain watermarking techniques, our work
introduces a couple of unique contributions that were not reported in those works.
These are i) tagging of the authenticity and identity watermarks; and ii) the
energy-based selection of subbands for embedding the authentication signature and
the identification information. In addition, the mutual exclusivity of the
signature subband and the embedding subband results in a signature that is robust
to the watermark itself. In our tests with several images, the authentication
watermark still survived even when we blacked out (replaced pixels with values of
zero) an area of up to 100×100 pixels (out of 256×256). On the other hand, since
the authentication watermark is robust, it may not be very suitable for
tamper-proofing scenarios, although modification in a busy area of the image can
still be detected easily. The identification part is found to be more sensitive to
malicious modification than the authentication part.