Techniques for Data Hiding


tion. Copying in printed form for private use is permitted without

payment of royalty provided that (1) each reproduction is done

without alteration and (2) the Journal reference and IBM copyright

notice are included on the ﬁrst page. The title and abstract, but no

other portions, of this paper may be copied or distributed royalty

free without further permission by computer-based and other infor-

mation-service systems. Permission to republish any other portion

of this paper must be obtained from the Editor.

IBM SYSTEMS JOURNAL, VOL 35, NOS 3&4, 1996 0018-8670/96/$5.00 © 1996 IBM BENDER ET AL.

Data hiding, a form of steganography, embeds

data into digital media for the purpose of

identiﬁcation, annotation, and copyright. Several

constraints affect this process: the quantity of

data to be hidden, the need for invariance of these

data under conditions where a “host” signal is

subject to distortions, e.g., lossy compression,

and the degree to which the data must be immune

to interception, modiﬁcation, or removal by a third

party. We explore both traditional and novel

techniques for addressing the data-hiding process

and evaluate these techniques in light of three

applications: copyright protection, tamper-

prooﬁng, and augmentation data embedding.

Digital representation of media facilitates access

and potentially improves the portability, efﬁ-

ciency, and accuracy of the information presented.

Undesirable effects of facile data access include an

increased opportunity for violation of copyright and

tampering with or modiﬁcation of content. The moti-

vation for this work includes the provision of protec-

tion of intellectual property rights, an indication of

content manipulation, and a means of annotation.

Data hiding represents a class of processes used to

embed data, such as copyright information, into vari-

ous forms of media such as image, audio, or text with

a minimum amount of perceivable degradation to the

“host” signal; i.e., the embedded data should be invis-

ible and inaudible to a human observer. Note that data

hiding, while similar to compression, is distinct from

encryption. Its goal is not to restrict or regulate access

to the host signal, but rather to ensure that embedded

data remain inviolate and recoverable.

Two important uses of data hiding in digital media are

to provide proof of the copyright, and assurance of

content integrity. Therefore, the data should stay hid-

den in a host signal, even if that signal is subjected to

manipulation as degrading as ﬁltering, resampling,

cropping, or lossy data compression. Other applica-

tions of data hiding, such as the inclusion of augmen-

tation data, need not be invariant to detection or

removal, since these data are there for the beneﬁt of

both the author and the content consumer. Thus, the

techniques used for data hiding vary depending on the

quantity of data being hidden and the required invari-

ance of those data to manipulation. Since no one

method is capable of achieving all these goals, a class

of processes is needed to span the range of possible

applications.

The technical challenges of data hiding are formida-

ble. Any “holes” to ﬁll with data in a host signal,

either statistical or perceptual, are likely targets for

removal by lossy signal compression. The key to suc-

cessful data hiding is the ﬁnding of holes that are not

suitable for exploitation by compression algorithms.

A further challenge is to ﬁll these holes with data in a

way that remains invariant to a large class of host sig-

nal transformations.

Techniques for data hiding

by W. Bender, D. Gruhl, N. Morimoto, and A. Lu


Features and applications

Data-hiding techniques should be capable of embed-

ding data in a host signal with the following restric-

tions and features:

1. The host signal should not be objectionably degraded, and the embedded data should be minimally perceptible. (The goal is for the data to

remain hidden. As any magician will tell you, it is

possible for something to be hidden while it

remains in plain sight; you merely keep the person

from looking at it. We will use the words hidden,

inaudible, imperceivable, and invisible to mean

that an observer does not notice the presence of the

data, even if they are perceptible.)

2. The embedded data should be directly encoded

into the media, rather than into a header or wrap-

per, so that the data remain intact across varying

data file formats.

3. The embedded data should be immune to modifi-

cations ranging from intentional and intelligent

attempts at removal to anticipated manipulations,

e.g., channel noise, filtering, resampling, cropping,

encoding, lossy compressing, printing and scan-

ning, digital-to-analog (D/A) conversion, and ana-

log-to-digital (A/D) conversion, etc.

4. Asymmetrical coding of the embedded data is

desirable, since the purpose of data hiding is to

keep the data in the host signal, but not necessarily

to make the data difficult to access.

5. Error correction coding should be used to ensure data integrity. It is inevitable that there will be

some degradation to the embedded data when the

host signal is modified.

6. The embedded data should be self-clocking or

arbitrarily re-entrant. This ensures that the embed-

ded data can be recovered when only fragments of

the host signal are available, e.g., if a sound bite is

extracted from an interview, data embedded in the

audio segment can be recovered. This feature also

facilitates automatic decoding of the hidden data,

since there is no need to refer to the original host

signal.

Applications. Trade-offs exist between the quantity

of embedded data and the degree of immunity to host

signal modiﬁcation. By constraining the degree of

host signal degradation, a data-hiding method can

operate with either high embedded data rate, or high

resistance to modiﬁcation, but not both. As one

increases, the other must decrease. While this can be

shown mathematically for some data-hiding systems

such as spread spectrum, it seems to hold true for all

data-hiding systems. In any system, you can trade

bandwidth for robustness by exploiting redundancy.
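
The trade of bandwidth for robustness through redundancy can be illustrated with the simplest possible scheme, a repetition code with majority-vote decoding. This is only a sketch: the function names and the repetition factor of 3 are assumptions made for illustration, not a coding scheme taken from the paper.

```python
def encode_repetition(bits, r):
    """Repeat each payload bit r times (r odd); the usable data rate drops by a factor of r."""
    return [b for b in bits for _ in range(r)]


def decode_repetition(coded, r):
    """Recover each payload bit by majority vote over its r copies."""
    return [int(sum(coded[i:i + r]) > r // 2) for i in range(0, len(coded), r)]


payload = [1, 0, 1, 1]
coded = encode_repetition(payload, 3)   # 12 embedded bits now carry 4 payload bits

# Simulate host-signal modification: flip one embedded bit in two different groups.
corrupted = list(coded)
corrupted[1] ^= 1
corrupted[9] ^= 1

recovered = decode_repetition(corrupted, 3)
print(recovered == payload)
```

Raising the repetition factor tolerates more flipped bits per group, at the cost of a proportionally lower embedded data rate.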

The quantity of embedded data and the degree of host

signal modiﬁcation vary from application to applica-

tion. Consequently, different techniques are employed

for different applications. Several prospective applica-

tions of data hiding are discussed in this section.

An application that requires a minimal amount of

embedded data is the placement of a digital watermark. The embedded data are used to place an indication of ownership in the host signal, serving the same

purpose as an author’s signature or a company logo.

Since the information is of a critical nature and the

signal may face intelligent and intentional attempts to

destroy or remove it, the coding techniques used must

be immune to a wide variety of possible modiﬁca-

tions.

A second application for data hiding is tamper-proof-

ing. It is used to indicate that the host signal has been

modiﬁed from its authored state. Modiﬁcation to the

embedded data indicates that the host signal has been

changed in some way.

A third application, feature location, requires more

data to be embedded. In this application, the embed-

ded data are hidden in speciﬁc locations within an

image. It enables one to identify individual content

features, e.g., the name of the person on the left versus

the right side of an image. Typically, feature location

data are not subject to intentional removal. However,

it is expected that the host signal might be subjected

to a certain degree of modiﬁcation, e.g., images are

routinely modiﬁed by scaling, cropping, and tone-

scale enhancement. As a result, feature location data-

hiding techniques must be immune to geometrical and

nongeometrical modiﬁcations of a host signal.



Image and audio captions (or annotations) may

require a large amount of data. Annotations often

travel separately from the host signal, thus requiring

additional channels and storage. Annotations stored in

ﬁle headers or resource sections are often lost if the

ﬁle format is changed, e.g., the annotations created in

a Tagged Image File Format (TIFF) may not be present

when the image is transformed to a Graphic Inter-

change Format (GIF). These problems are resolved by

embedding annotations directly into the data structure

of a host signal.

Prior work. Adelson describes a method of data hiding that exploits the human visual system’s varying sensitivity to contrast versus spatial frequency. Adelson replaces high-spatial-frequency image data with hidden data in a pyramid-encoded still image. While he is able to encode a large amount of data efficiently, there is no provision to make the data immune to detection or removal by typical manipulations such as filtering and rescaling. Stego, one of several widely available software packages, simply encodes data in the least-significant bit of the host signal. This technique suffers from all of the same problems as Adelson’s method but creates an additional problem of degrading image or audio quality. Bender modifies Adelson’s technique by using chaos as a means to encrypt the embedded data, deterring detection, but providing no improvement to immunity to host signal manipulation. Lippman hides data in the chrominance channel of the National Television System Committee (NTSC) television signal by exploiting the temporal over-sampling of color in such signals. Typical of Enhanced Definition Television Systems, this method encodes a large amount of data, but the data are lost to most recording, compression, and transcoding processes. Other techniques, such as Hecht’s Data-Glyph, which adds a bar code to images, are engineered in light of a predetermined set of geometric modifications. Spread-spectrum [8-11], a promising technology for data hiding, is difficult to intercept and remove but often introduces perceivable distortion into the host signal.
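
To make the fragility of least-significant-bit encoding concrete, here is a minimal sketch of the general idea. It is not the Stego package's actual implementation; the function names, the sample values, and the requantization step used to stand in for a lossy manipulation are all assumptions for illustration.

```python
def embed_lsb(samples, bits):
    """Overwrite the least-significant bit of the first len(bits) samples."""
    if len(bits) > len(samples):
        raise ValueError("payload is longer than the host signal")
    out = list(samples)
    for i, bit in enumerate(bits):
        out[i] = (out[i] & ~1) | bit
    return out


def extract_lsb(samples, count):
    """Read back the least-significant bit of the first count samples."""
    return [s & 1 for s in samples[:count]]


host = [200, 13, 77, 254, 0, 129, 64, 31]   # hypothetical 8-bit sample values
bits = [1, 0, 1, 1, 0, 0, 1, 0]

stego = embed_lsb(host, bits)
print(extract_lsb(stego, len(bits)) == bits)   # payload survives an untouched copy

# Stand-in for a lossy manipulation: requantize to 7 bits by clearing the low bit.
requantized = [(s >> 1) << 1 for s in stego]
print(extract_lsb(requantized, len(bits)))     # every payload bit now reads as 0
```

Each sample changes by at most one level, but any operation that disturbs the lowest bit erases the payload, which is the lack of immunity described above.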

Problem space. Each application of data hiding

requires a different level of resistance to modiﬁcation

and a different embedded data rate. These form the

theoretical data-hiding problem space (see Figure 1).

There is an inherent trade-off between bandwidth and

“robustness,” or the degree to which the data are

immune to attack or transformations that occur to the

host signal through normal usage, e.g., compression,

resampling, etc. The more data to be hidden, e.g., a

caption for a photograph, the less secure the encoding.

The less data to be hidden, e.g., a watermark, the more

secure the encoding.

Data hiding in still images

Data hiding in still images presents a variety of chal-

lenges that arise due to the way the human visual sys-

tem (HVS) works and the typical modiﬁcations that

images undergo. Additionally, still images provide a

relatively small host signal in which to hide data. A

fairly typical 8-bit picture of 200 × 200 pixels pro-

vides approximately 40 kilobytes (kB) of data space

in which to work. This is equivalent to only around 5

seconds of telephone-quality audio or less than a sin-

gle frame of NTSC television. Also, it is reasonable to

expect that still images will be subject to operations

ranging from simple afﬁne transforms to nonlinear

transforms such as cropping, blurring, ﬁltering, and

lossy compression. Practical data-hiding techniques

need to be resistant to as many of these transforma-

tions as possible.
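
The size comparison above can be checked with a few lines of arithmetic. The audio parameters (8 kHz sampling, one byte per sample) are an assumption about what "telephone-quality" means here; the paper does not state them.

```python
# Host size of a 200 x 200 image at 8 bits (1 byte) per pixel.
width, height, bytes_per_pixel = 200, 200, 1
image_bytes = width * height * bytes_per_pixel

# Assumed telephone-quality audio: 8000 samples per second, 1 byte per sample.
sample_rate_hz = 8000
bytes_per_sample = 1
audio_seconds = image_bytes / (sample_rate_hz * bytes_per_sample)

print(f"image host size : {image_bytes} bytes")
print(f"equivalent audio: {audio_seconds} seconds")
```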

Figure 1 Conceptual data-hiding problem space
(axes: robustness and bandwidth; marked region: extent of current techniques)

Despite these challenges, still images are likely candidates for data hiding. There are many attributes of the HVS that are potential candidates for exploitation in a data-hiding system, including our varying sensitivity to contrast as a function of spatial frequency and the masking effect of edges (both in luminance and chrominance). The HVS has low sensitivity to small

changes in luminance, being able to perceive changes

of no less than one part in 30 for random patterns.

However, in uniform regions of an image, the HVS

is more sensitive to the change of the luminance,

approximately one part in 240. A typical CRT (cathode

ray tube) display or printer has a limited dynamic

range. In an image representation of one part in 256,

e.g., 8-bit gray levels, there is potentially room to hide

data as pseudorandom changes to picture brightness.

Another HVS “hole” is our relative insensitivity to

very low spatial frequencies such as continuous

changes in brightness across an image, i.e., vignett-

ing. An additional advantage of working with still

images is that they are noncausal. Data-hiding tech-

niques can have access to any pixel or block of pixels

at random.

Using these observations, we have developed a variety

of techniques for placing data in still images. Some

techniques are more suited to dealing with small

amounts of data, while others to large amounts. Some

techniques are highly resistant to geometric modiﬁca-

tions, while others are more resistant to nongeometric

modiﬁcations, e.g., ﬁltering. We present methods that

explore both of these areas, as well as their combina-

tion.

Low bit-rate data hiding

With low bit-rate encoding, we expect a high level of

robustness in return for low bandwidth. The emphasis

is on resistance to attempts at data removal by a third

party. Both a statistical and a perceptual technique are

discussed in the next sections on Patchwork, texture,

and applications.

Patchwork: A statistical approach

The statistical approach, which we refer to as Patch-

work, is based on a pseudorandom, statistical process.

Patchwork invisibly embeds in a host image a speciﬁc

statistic, one that has a Gaussian distribution. Figure 2

shows a single iteration in the Patchwork method.

Two patches are chosen pseudorandomly, the ﬁrst A,

the second B. The image data in patch A are lightened

while the data in patch B are darkened (exaggerated

for purposes of this illustration). This unique statistic

indicates the presence or absence of a signature.

Patchwork is independent of the contents of the host

image. It shows reasonably high resistance to most

nongeometric image modiﬁcations.

For the analysis below, we make the following

simplifying assumptions (these assumptions are not

limiting, as is shown later): We are operating in a 256

level, linearly quantized system starting at 0; all

brightness levels are equally likely; all samples are

independent of all other samples.

The Patchwork algorithm proceeds as follows: take any two points, A and B, chosen at random in an image. Let a equal the brightness at point A and b the brightness at point B. Now, let

$$S = a - b \tag{1}$$

The expected value of S is 0, i.e., the average value of S after repeating this procedure a large number of times is expected to be 0.
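
As a quick numerical check that S averages to 0, the following sketch draws many random pairs under the simplifying assumptions stated earlier (256 equally likely brightness levels, independent samples). The seed and the number of trials are arbitrary choices made for illustration.

```python
import random
import statistics

rng = random.Random(35)   # arbitrary seed, for repeatability only
num_trials = 100_000      # arbitrary; large enough for a stable average

# Each trial picks two independent brightness values in 0..255 and forms S = a - b.
samples = [rng.randrange(256) - rng.randrange(256) for _ in range(num_trials)]

mean_s = statistics.fmean(samples)      # average of S over all trials
spread_s = statistics.pstdev(samples)   # typical size of a single S

print(f"mean of S    = {mean_s:.2f}")
print(f"std dev of S = {spread_s:.1f}")
```

The mean lands close to 0 while individual values of S are spread widely, so a single pair of points carries very little information on its own.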

Figure 2 A single iteration in the Patchwork method

(photograph courtesy of Webb Chapel)


Although the expected value is 0, this does not tell us much about what S will be for a specific case. This is because the variance is quite high for this procedure. The variance of S, $\sigma_S^2$, is a measure of how tightly samples of S will cluster around the expected value of 0. To compute this, we make the following observation: Since $S = a - b$ and a and b are assumed independent, $\sigma_S^2$ can be computed as follows (this, and all other probability equations are from Drake):

$$\sigma_S^2 = \sigma_a^2 + \sigma_b^2 \tag{2}$$

where $\sigma_a^2$ for a uniformly distributed a is:

$$\sigma_a^2 \approx 5418 \tag{3}$$

Now, $\sigma_a^2 = \sigma_b^2$, since a and b are samples from the same set, taken with replacement. Thus:

$$\sigma_S^2 = 2 \times \sigma_a^2 \approx 2 \times \frac{(255 - 0)^2}{12} \approx 10836 \tag{4}$$

which yields a standard deviation $\sigma_S \approx 104$. This means that more than half the time, S will be greater than 43 or less than −43. Assuming a Gaussian clustering, a single iteration does not tell us much. However, this is not the case if we perform the procedure many times.

Let us repeat this procedure n times, letting $a_i$ and $b_i$ be the values a and b take on during the ith iteration, $S_i$. Now let $S_n$ be defined as:

$$S_n = \sum_{i=1}^{n} S_i = \sum_{i=1}^{n} (a_i - b_i) \tag{5}$$

The expected value of $S_n$ is:

$$\bar{S}_n = n \times \bar{S} = n \times 0 = 0 \tag{6}$$

This makes intuitive sense, since the number of times $a_i$ is greater than $b_i$ should be offset by the number of times the reverse is true. Now the variance is:

$$\sigma_{S_n}^2 = n \times \sigma_S^2 \tag{7}$$

And the standard deviation is:

$$\sigma_{S_n} = \sqrt{n} \times \sigma_S \approx \sqrt{n} \times 104 \tag{8}$$

Now, we can compute $S_{10000}$ for a picture, and if it differs from 0 by more than a few standard deviations, we can be fairly certain that this did not happen by chance. In fact, since as we will show later $S'_n$ for large n has a Gaussian distribution, a deviation of even a few $\sigma_{S'_n}$ indicates to a high degree of certainty the presence of encoding (see Table 1).

The Patchwork method artificially modifies S for a given picture, such that $S'_n$ is many deviations away from expected. To encode a picture, we:

1. Use a specific key for a known pseudorandom number generator to choose $(a_i, b_i)$. This is important, because the decoder needs to visit the same points that the encoder modified.

2. Raise the brightness in the patch $a_i$ by an amount δ, typically in the range of 1 to 5 parts in 256.

3. Lower the brightness in $b_i$ by this same amount δ (the amounts do not have to be the same, as long as they are in opposite directions).

4. Repeat this for n steps (n typically ~10 000).

Now, when decoded, $S'_n$ will be:

$$S'_n = \sum_{i=1}^{n} \left[ (a_i + \delta) - (b_i - \delta) \right] \tag{9}$$

or:

$$S'_n = 2 \delta n + \sum_{i=1}^{n} (a_i - b_i) \tag{10}$$

So each step of the way we accumulate an expectation of 2 × δ. Thus after n repetitions, we expect $S'_n$ to be:

Table 1 Degree of certainty of encoding given deviation from that expected in a Gaussian distribution (δ = 2)

Standard Deviations Away    Certainty    n
0                           50.00%       0
1                           84.13%       679
2                           97.87%       2713
3                           99.87%       6104
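
The encoding steps and the decoding sum described above can be sketched as follows. This is a simplified illustration under several assumptions, not the authors' implementation: it modifies single pixels rather than patches, it draws all 2n pixel indices without repetition so that no pixel is adjusted twice, it uses a synthetic host with uniformly random brightness (matching the simplifying assumptions of the analysis), and every function name and the key value are hypothetical.

```python
import math
import random

SIGMA_S = 104.0  # per-pair standard deviation of S = a - b, as derived in the text


def choose_pairs(num_pixels, n, key):
    """Use the key to pick n pixel pairs (A_i, B_i); all 2n indices are distinct."""
    rng = random.Random(key)
    idx = rng.sample(range(num_pixels), 2 * n)
    return idx[:n], idx[n:]


def embed(pixels, n, delta, key):
    """Raise each A_i by delta and lower each B_i by delta, clipped to 0..255."""
    a_idx, b_idx = choose_pairs(len(pixels), n, key)
    out = list(pixels)
    for i in a_idx:
        out[i] = min(255, out[i] + delta)
    for i in b_idx:
        out[i] = max(0, out[i] - delta)
    return out


def detect(pixels, n, key):
    """Return the sum of (a_i - b_i) and its size in units of sqrt(n) * SIGMA_S."""
    a_idx, b_idx = choose_pairs(len(pixels), n, key)
    s_n = sum(pixels[i] for i in a_idx) - sum(pixels[i] for i in b_idx)
    return s_n, s_n / (math.sqrt(n) * SIGMA_S)


# Synthetic 256 x 256 host image stored as a flat list of 8-bit values.
host_rng = random.Random(0)
host = [host_rng.randrange(256) for _ in range(256 * 256)]

n, delta, key = 10_000, 2, 1996
marked = embed(host, n, delta, key)

s_plain, z_plain = detect(host, n, key)          # unmarked image, correct key
s_marked, z_marked = detect(marked, n, key)      # marked image, correct key
s_wrong, z_wrong = detect(marked, n, key + 1)    # marked image, wrong key

print(f"unmarked : S_n = {s_plain:7d}  ({z_plain:+.2f} standard deviations)")
print(f"marked   : S_n = {s_marked:7d}  ({z_marked:+.2f} standard deviations)")
print(f"wrong key: S_n = {s_wrong:7d}  ({z_wrong:+.2f} standard deviations)")
```

With the correct key, embedding shifts the sum by close to 2δn (slightly less where clipping at 0 or 255 occurs), while a decoder using a different key visits different points and sees no systematic shift.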
