Content Indexing and Searching using Content Identifiers and

US 20070055689A1
(19) United States
(12) Patent Application Publication (10) Pub. No.: US 2007/0055689 A1
Rhoads et al. (43) Pub. Date: Mar. 8, 2007
(54) CONTENT INDEXING AND SEARCHING USING CONTENT IDENTIFIERS AND ASSOCIATED METADATA
(76) Inventors: Geoffrey B. Rhoads, West Linn, OR (US); Kenneth L. Levy, Stevenson, WA (US)
Correspondence Address:
DIGIMARC CORPORATION
9405 SW GEMINI DRIVE
BEAVERTON, OR 97008 (US)
(21) Appl. No.: 11/466,392
(22) Filed: Aug. 22, 2006
Related U.S. Application Data
(60) Continuation-in-part of application No. 10/118,468, filed on Apr. 5, 2002, now Pat. No. 7,095,871.
Continuation-in-part of application No. 10/869,320, filed on Jun. 15, 2004, now Pat. No. 7,130,087, which is a continuation-in-part of application No. 09/975,739, filed on Oct. 10, 2001, now Pat. No. 6,750,985, which is a division of application No. 09/127,502, filed on Jul. 31, 1998, now Pat. No. 6,345,104.
Continuation-in-part of application No. 09/952,384, filed on Sep. 11, 2001, which is a continuation-in-part of application No. 09/620,019, filed on Jul. 20, 2000.
Continuation-in-part of application No. 09/636,102, filed on Aug. 10, 2000.
Continuation-in-part of application No. 09/840,018, filed on Apr. 20, 2001, which is a continuation-in-part of application No. 09/507,096, filed on Feb. 17, 2000, now abandoned, and which is a continuation-in-part of application No. 09/482,786, filed on Jan. 13, 2000, now Pat. No. 7,010,144.
(60) Provisional application No. 60/282,205, filed on Apr. 6, 2001. Provisional application No. 60/082,228, filed on Apr. 16, 1998. Provisional application No. 60/232,163, filed on Sep. 11, 2000. Provisional application No. 60/257,822, filed on Dec. 21, 2000. Provisional application No. 60/191,778, filed on Mar. 24, 2000.
(30) Foreign Application Priority Data
Jul. 20, 2001 (WO) PCT/US01/22953
Publication Classification
(51) Int. Cl. G06F 7/00 (2006.01)
(52) U.S. Cl. 707/102; 707/3
(57) ABSTRACT
A method of indexing content for network searching comprises identifying media content signals stored at sites distributed over a distributed computer network; extracting content identifiers from the content signals; using the content identifiers to obtain metadata used to classify the media content signals; and creating a searchable index of the media content signals based on the metadata, wherein users access the searchable index on the distributed computer network to submit a search query for the searchable index to retrieve links to the media content signals.
[FIG. 1 (Sheet 1 of 3). Labeled elements: metadata database management system 116; content database 102; searchable database 124; web server 100; reader application 108; embedder application; file/Web server 122; designated directories 138; content files 110, 140; multimedia files 126; document files 128; search agent thread 120.]
[FIG. 2 (Sheet 2 of 3).]
[FIG. 3 (Sheet 3 of 3).]
CONTENT INDEXING AND SEARCHING USING
CONTENT IDENTIFIERS AND ASSOCIATED
METADATA
TECHNICAL FIELD
[0001] This patent application is a continuation in part of U.S. patent application Ser. No. 10/118,468, filed Apr. 5, 2002 (now U.S. Pat. No. 7,095,871), which claims priority to U.S. Provisional Application 60/282,205, filed Apr. 6, 2001.
[0002] U.S. patent application Ser. No. 10/118,468 is also a continuation in part of U.S. patent application Ser. No. 09/612,177, filed Jul. 6, 2000, which is a continuation of U.S. patent application Ser. No. 08/746,613, filed Nov. 12, 1996, which is a continuation in part of U.S. patent application Ser. No. 08/649,419, filed May 16, 1996 (now U.S. Pat. No. 5,862,260) and Ser. No. 08/508,083 filed Jul. 27, 1995 (now U.S. Pat. No. 5,841,978).
[0003] This patent application is a continuation in part of U.S. patent application Ser. No. 09/952,384, filed Sep. 11, 2001, which is a continuation in part of U.S. patent application Ser. No. 09/620,019, filed Jul. 20, 2000. Application Ser. No. 09/952,384 also claims priority to U.S. Provisional Patent Application Nos. 60/232,163, filed Sep. 11, 2000, and 60/257,822, filed Dec. 21, 2000. Application Ser. No. 09/952,384 also claims priority to PCT Application PCT/US01/22953, filed Jul. 20, 2001.
[0004] This patent application is also a continuation in part of U.S. patent application Ser. No. 09/636,102, filed Aug. 10, 2000, which claims priority to U.S. Provisional Application No. 60/191,778, filed Mar. 24, 2000.
[0005] This patent application is also a continuation in part of U.S. patent application Ser. No. 09/840,018, filed Apr. 20, 2001, which is a continuation in part of U.S. patent application Ser. No. 09/507,096, filed Feb. 17, 2000, which is a continuation in part of U.S. patent application Ser. No. 09/482,786, filed Jan. 13, 2000 (now U.S. Pat. No. 7,010,144).
[0006] The above patents and patent applications are hereby incorporated by reference.
BACKGROUND AND SUMMARY
[0007] As digital content continues to proliferate, management of digital assets becomes an increasingly difficult challenge. Enhancements in computer networking and database technology allow companies to manage large collections of images and other media and make the content available to third parties. While network communication provides a powerful tool to enable the manager of the database to share content with others, it makes it more difficult to control and track how the content is being used.
[0008] For example, some companies maintain extensive databases of images and other media content used to promote their products. Customers or service providers such as advertising and marketing firms can access this content remotely via extranet, web site, or other file transfer transactions. Though computer networking telecommunication technology facilitates access, it makes it difficult to ensure that the customers and service providers are getting the latest content, and that they are getting accurate and helpful information relating to the content.
[0009] In these applications, there is a need to enable digital asset management to reliably link media content with additional data about the content. One way to associate content with information about the content is to place the information in a file header or footer. This approach, however, is less effective because the information often does not survive file format changes, conversion to the analog domain, etc. Another way to associate multimedia content with other data is to hide identifying information in the content through data hiding or steganography. Steganography refers to a process of hiding information into a signal. One example of steganography is digital watermarking. Digital watermarking is a process for modifying media content to embed a machine-readable code into the data content. The data may be modified such that the embedded code is imperceptible or nearly imperceptible to the user, yet may be detected through an automated detection process. Most commonly, digital watermarking is applied to media such as images, audio signals, and video signals. However, it may also be applied to other types of data, including documents (e.g., through line, word or character shifting), software, multi-dimensional graphics models, and surface textures of objects.
[0010] Digital watermarking systems have two primary components: an embedding component that embeds the watermark in the media content, and a reading component that detects and reads the embedded watermark. The embedding component embeds a watermark by altering data samples of the media content in the spatial, temporal or some other transform domain (e.g., Fourier, Discrete Cosine, Wavelet Transform domains). The reading component analyzes target content to detect whether a watermark is present. In applications where the watermark encodes information (e.g., a message), the reader extracts this information from the detected watermark.
[0011] The present assignee's work in steganography, data hiding and watermarking is reflected in U.S. Pat. No. 5,862,260; in copending application Ser. Nos. 09/503,881 and 09/452,023; and in published specifications WO 9953428 and WO0007356 (corresponding to U.S. Ser. Nos. 09/074,034 and 09/127,502). A great many other approaches are familiar to those skilled in the art. The artisan is presumed to be familiar with the full range of literature about steganography, data hiding and watermarking. The subject matter of the present application is related to that disclosed in U.S. Pat. Nos. 5,862,260, 6,122,403 and in co-pending application Ser. Nos. 09/503,881 filed Feb. 14, 2000, 60/198,857 filed Apr. 21, 2000, Ser. No. 09/571,422 filed May 15, 2000, Ser. No. 09/620,019 filed Jul. 20, 2000, and Ser. No. 09/636,102 filed Aug. 10, 2000; which are hereby incorporated by reference.
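By way of illustration only, the embedding and reading components described in paragraph [0010] may be sketched as follows. The sketch assumes a simple least-significant-bit scheme applied to integer samples in the spatial or temporal domain. This scheme is chosen here for brevity and is not the method of the referenced patents. It is not robust to format changes, and it omits the step of detecting whether a watermark is present at all. The function names are hypothetical.

```python
def int_to_bits(value, width):
    """Return `value` as a list of `width` bits, most significant bit first."""
    return [(value >> (width - 1 - i)) & 1 for i in range(width)]


def bits_to_int(bits):
    """Inverse of int_to_bits."""
    out = 0
    for bit in bits:
        out = (out << 1) | bit
    return out


def embed(samples, payload_bits):
    """Embedding component (assumed LSB scheme): overwrite the lowest bit
    of each leading sample with one payload bit."""
    if len(payload_bits) > len(samples):
        raise ValueError("payload longer than host signal")
    marked = list(samples)
    for i, bit in enumerate(payload_bits):
        marked[i] = (marked[i] & ~1) | bit
    return marked


def read(samples, n_bits):
    """Reading component: recover the first n_bits payload bits."""
    return [s & 1 for s in samples[:n_bits]]


# Hypothetical host signal (e.g., 8-bit pixel or audio samples) and identifier.
host = [200, 13, 77, 54, 128, 255, 0, 91, 42, 17]
content_id = 0b10110010
marked = embed(host, int_to_bits(content_id, 8))
recovered = bits_to_int(read(marked, 8))
```

Each sample changes by at most one quantization step, which is one simple way to keep the embedded code nearly imperceptible while still machine-readable.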
[0012] The disclosure describes methods and systems for managing digital content using watermarks to link the content to related metadata. In one method, a watermark reader device reads a watermark embedded into media content. The watermark conveys watermark information, such as a content identifier and creator identifier. The reader forwards the watermark information to a router. The router then uses the watermark information to find a metadata database identifier. It then sends a request for metadata along with the watermark information to the metadata database identified by the metadata database identifier. The metadata database uses the watermark information to find related metadata for the media content and sends the related metadata to the reader device.
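The flow of paragraph [0012] may be sketched as follows, under stated assumptions. The class and field names are hypothetical, the registry is assumed to be keyed on the creator identifier, and ordinary in-process method calls stand in for the network requests between reader, router and metadata database.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WatermarkInfo:
    """Watermark information conveyed by the embedded watermark."""
    content_id: int
    creator_id: int


class MetadataDatabase:
    """Stand-in for a metadata database: content identifier -> metadata."""

    def __init__(self, records):
        self._records = dict(records)

    def lookup(self, info):
        return self._records.get(info.content_id)


class Router:
    """Maps watermark information to a metadata database identifier,
    then forwards the request to that database."""

    def __init__(self, registry, databases):
        self._registry = registry      # creator_id -> database identifier (assumed key)
        self._databases = databases    # database identifier -> MetadataDatabase

    def route(self, info):
        db_id = self._registry.get(info.creator_id)
        if db_id is None:
            return None                # no database registered for this creator
        return self._databases[db_id].lookup(info)


# Hypothetical registry contents and metadata records.
db = MetadataDatabase({42: {"title": "Sample image", "owner": "Example Co."}})
router = Router(registry={7: "db-1"}, databases={"db-1": db})
metadata = router.route(WatermarkInfo(content_id=42, creator_id=7))
```

In a deployed system the router would return the metadata (or a redirect) to the reader device over the network; here the return value of `route` plays that role.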
[0013] One aspect of the invention is a method for processing media content on a distributed network. The method comprises: identifying media content signals stored at sites distributed over the distributed computer network; extracting content identifiers from the content signals; using the content identifiers to obtain metadata used to classify the media content signals; and creating a searchable index of the media content signals based on the metadata, wherein users access the searchable index on the distributed computer network to submit a search query for the searchable index to retrieve links to the media content signals.
[0014] Another aspect of the invention is a method for searching for audio or images on a distributed computer network comprising: from a location in the distributed computer network, receiving a query for content signals related to a first content signal, the first content signal being part of the query; receiving a content identifier extracted from audio or image data of the first content signal; using the content identifier to obtain metadata used to classify the first content signal; searching a searchable index of media content signals based on the metadata, which forms search criteria for the first content signal; and returning a set of search results including references to content signals stored in the distributed computer network that correspond to the search criteria.
[0015] Further features will become apparent with reference to the following detailed description and accompanying drawings.
BRIEF DESCRIPTION OF THE DRAWINGS
[0016] FIG. 1 illustrates a system for enhancing digital asset management by linking media content with metadata and actions associated with the content.
[0017] FIG. 2 illustrates a content distribution system according to an embodiment of the present invention.
[0018] FIG. 3 illustrates a verification process according to the FIG. 2 system.
DETAILED DESCRIPTION
[0019] FIG. 1 illustrates a system for enhancing digital asset management by linking media content with metadata and actions associated with the content. The media content is maintained as a collection of media files (e.g., still image, audio, or video), stored or distributed on one or more devices, such as a web site 100, a content database 102, etc. Users of the content files are typically distributed in many locations, but are interconnected via a local area or wide area network 104. Each user accesses content through a network device such as a Personal Computer, set top box, network enabled audio or video player, personal digital assistant, smart phone, etc. The user's computer 106 shown in FIG. 1 is representative of the wide array of these types of devices.
[0020] The user's computer executes a watermark reader application 108 that decodes watermarks from content files 110, such as images, audio or video files. It includes network communication software for establishing a network communication with other systems on a network via TCP/IP. The reader application 108 communicates watermark information extracted from watermarked content to a router application 112 executing on a router system 114. The router application maps the watermark information to a corresponding metadata database management system 116 using a registry 118, which includes data records that include the watermark information and associated metadata database information. The router also includes communication software for receiving requests from reader applications and re-directing requests to the metadata database system 116.
[0021] The metadata database system 116 manages requests for information from router applications and reader applications. It includes a metadata database that stores information about the content files. In some implementations, the content database and metadata database may be integrated.
[0022] There are a variety of application scenarios for using embedded watermark data in digital asset management. In one application scenario, the reader application operates in conjunction with the router and metadata database to dynamically link content files to information and actions. This scenario operates as follows. The user acquires watermarked content, such as images, audio or video from a computer network (e.g., an extranet, web site or e-mail). The user provides the content file as input to a watermark reader application using the user interface of the reader. In a windowing user interface environment, the user drags and drops the content file from the desktop into the reader UI (e.g., a window).
[0023] The reader extracts a watermark message embedded in the content within the file and sends it to a routing application. The routing application is accessible on a network 104 via Internet communication protocols, such as HTTP, XML, and TCP/IP. The routing application maintains a registry database 118 including a number of database records that associate watermark messages with related information. In one implementation, the routing application uses a content identifier extracted from the watermark message to look up a creator identifier. The creator identifier is associated with a metadata database management system. In particular, it is associated with a network address of the database management system to which queries are sent to fetch information and actions linked to the content via the watermark.
[0024] The routing application sends a request for related information or actions to the metadata database along with the content identifier and the network address (e.g., IP address) of the reader application. In response, the metadata database sends content/product specific information from the metadata database to the reader for display in predefined fields within the reader UI. The metadata database looks up the content/product specific information based on the content identifier.
[0025] The metadata may be sent in many different forms. In one implementation, the metadata database sends HTML content back to the reader, which renders it. In another implementation, it sends content in the form of XML. For background on a routing application, see U.S. application Ser. No. 09/571,422 filed May 15, 2000.
[0026] The information returned to the reader may enumerate links to additional actions, such as hyperlinks to web
sites, additional content files, or programs. Some examples of these actions include options to order another version of the watermarked content or products or services depicted in the watermarked content. For example, the user can click an option displayed in the reader UI to go to a URL specified by the metadata database for additional functionality, such as fetching more information from the metadata database or some other database, purchasing related products or services, launching a search for related content, etc.
[0027] In one implementation, a search program is implemented as part of the metadata database management system. When the user selects an action to launch a search for related content, the reader application sends the request to the metadata database management system. The metadata database looks up corresponding content descriptors for the watermarked content file based on the content identifier. It then searches for other content files represented in the metadata database that have matching descriptors, and returns pointers to the related content files to the reader application, which displays a listing of them. The user may then click on a listing to fetch and render the selected content file.
[0028] In another scenario, the functionality of the reader application described above is incorporated into an Internet browser or file browser, such as Windows Explorer in the Windows Operating System. Using a Web or file browser equipped with watermark reader software (e.g., a plug-in, integrated via an Application Programming Interface, or as a shell extension to the operating system), the user browses content files. The user may browse rendered versions of the file, such as a rendering of an image file, a thumbnail of an image, or a file icon representing an audio or video file in a file directory structure. As the user scrolls over rendered content (such as an image displayed on the user's display monitor) or representations of files (e.g., file icons in a directory structure), an application dialog appears notifying the user that the content file has additional information available. From this point forward, the browser operates in a similar fashion as the reader application described above. The browser renders metadata returned from the metadata database in the form of HTML or XML.
[0029] The router system may be implemented within a local area network in which the user's computer resides, or may be located on a wide area network such as the Internet. Similarly, the metadata database may be implemented within a local area network in which the user's computer resides, or may be located on a wide area network such as the Internet.
[0030] In some cases, the metadata returned to the user's computer may be formatted for the type of computer. For example, PDAs, cell phones and other consumer electronic devices may have differing display protocols for which the data needs to be formatted for proper rendering. One way to address this is for the reader application to communicate reader device information to the router, which in turn provides this information to the metadata database. The metadata database may provide data in the proper format, such as a format for display using the Palm Operating system, or may route it through an intermediate data formatting server that converts the data before sending it to the reader application.
[0031] For example, in the diagram of FIG. 1, the data formatting server is connected to the network 104 (e.g., the Internet) and a network for wireless personal digital assistants (e.g., the Palm.net network). The wireless PDA extracts a content identifier from a content item (e.g., from a watermark in the content item). The PDA sends the identifier to the data formatting server in a message, which passes the message to the router 114.
[0032] The router parses the identifier from the message, looks up the network address associated with the content identifier, and returns it to the data formatting server. Next, the data formatting server retrieves the metadata associated with the content identifier from the metadata database located at the network address. Specifically, the data formatting server retrieves a Web page indexed by the network address returned by the router. Next, the data formatting server reformats the metadata for display on the PDA and sends the reformatted data to the PDA for rendering. Specifically, if the metadata is a Web page, the data formatting server reformats the Web page for display on the PDA's monitor. For other types of metadata content, the data formatting server formats the metadata content for delivery to the PDA and rendering on the PDA, such as by converting to a compressed file, or a streaming file format like Microsoft's ASF format. This example is applicable to other portable communication devices like wireless phones.
[0033] The above processes performed within the data formatting server may be performed in whole or in part on router system 114, metadata database 116, and the content database 102. For example, the router can perform the function of fetching the Web page in response to looking up the Web page address in the registry, and then re-formatting the Web page for rendering on the PDA device, wireless phone, or other client device (e.g., set top box, TV, etc.). In addition, the router can send information about the client device, such as a device ID sent by the reader application 108, to the metadata database, which in turn formats the metadata in a format for rendering on the PDA device or wireless phone.
[0034] In particular, the data formatting functions may be performed in a product handler executing in the router system. The product handler refers to a process described in U.S. application Ser. No. 09/571,422, and incorporated by reference into this patent application.
[0035] To improve performance, the reader application can be designed to cache watermark data to avoid repeated read operations on the same content. In particular, the reader application retains watermark message data decoded from some number of most recently used files, along with the name of the files. When the user instructs the reader to fetch related information for a selected file, the reader first checks the cache for watermark message data extracted from the file, and if present, forwards that message data to the router application. Further, the reader application may also cache metadata associated with most recently, or most frequently accessed media files. This may require additional memory, but obviates the need to decode the watermark and fetch the metadata.
[0036] While FIG. 1 shows a single metadata database, the router system may link a watermark message to two or more different metadata databases. The router system can return HTML or XML, for example, giving the user the option to choose which metadata database he or she would like information from. Alternatively, the router can issue multiple requests to each of the metadata databases listed in the registry for a particular watermark message. Each of the metadata databases then returns related information to the reader application in response to the router application's request.
[0037] In one implementation, the metadata is returned to the reader application as XML. This format enables the reader to parse the metadata and format it for display within fields of the reader UI.
[0038] Some content files may have multiple different watermarks in different blocks of the content. Each of these watermarks may link to the same or different metadata, or metadata database.
Enhanced Content and Metadata Searching and Indexing
[0039] The above digital asset management systems and processes may be used advantageously in various combinations with content and metadata searching and indexing systems, such as those described in 60/198,857, Ser. No. 09/571,422, Ser. No. 09/620,019 and Ser. No. 09/636,102. The following section describes systems and processes for content searching and indexing that employ imperceptibly embedded watermark data in combination with other mechanisms for identifying and indexing multimedia content, including still images, video, audio, graphics, and text.
[0040] Peer-to-peer (known as P2P) file sharing is the current rage in the Internet. Examples of such systems include Napster, AIMster, Scour.net, Gnutella, and FreeNet, to name a few. These file-sharing systems allow users to share files directly between their computers, with a central database or a distributed database that is passed from computer to computer. The file sharing is usually restricted to a certain file type, such as music or videos, and to a certain directory. These systems are based upon metadata tags in the file headers or footers, or filenames, and users are concerned about opening their hard drives. For example, most MP3 files have a standard ID3 tag, v2 in their header or v1 in their footer, which includes the song, album and artist names. Current file-sharing systems only search at the beginning, and possibly when the user connects to the file sharing network. This works when you share one small directory and only search for file names and metadata tags. These systems are also usually based upon a proprietary program reporting about one individual computer. These limitations and the fact that the systems work with a restricted file type go hand in hand because it is unknown how to expand the system and remain user friendly.
[0041] Web searching is one of the first booms in the Internet. Examples include AltaVista, Yahoo!, Excite, and Google, to name a few. Web searching allows the user to find information that is distributed on the Internet. However, the searching systems have two major problems. The Web crawlers that find information can only search around 10% (a generous estimate). The Web crawler also only locates surface information, such as HTML (hypertext markup language) Web pages, and ignores deep information, including downloadable files and database information. Inventors are trying to solve the latter problem with search engines that query Web pages and then search, thus potentially finding deep database or downloadable files. However, this is slower than general searching and can never cover the Web.
[0042] The unique combination of these two technologies solves the file-sharing restrictions and user-friendly problems and Web searching limitations. The combination includes running Web crawlers (also known as spiders) locally on numerous remote networks, domains or computers, and having these Web crawlers report back to a central or distributed database. This database can be searched, via a user interface similar to the one used for current search engines, where the user enters keywords or phrases, and desired information is returned. As an extension of this user interface, a watermark detector may be used to extract a watermark bearing a content identifier, and possibly content type tags, that are used as input for a search to find related content or information about the content.
[0043] Currently, only Web pages are returned as links in Web-based search engines. However, with this combined system, Web page links, proprietary filename links, and database links are returned. Another advantage over current Web searching is that rather than the Web crawlers running on the Web and going from link to link, the crawlers run on the local system with the permission and guidelines of the system they are searching. Another advantage is that, since the Web crawlers are running locally in a user-defined (i.e., restricted) environment, they can be designed to look at database entries and non-HTML file formats, such as Word documents, MPEG movies, and MP3 audio files. An additional advantage is that Web crawlers can be running on numerous, potentially every, local network, or within numerous or potentially every domain since they run locally and do not block Internet access by downloading the Web information and then scanning it.
[0044] Advantages over file-sharing systems include searching the whole document for keywords. This novel system also searches for related information, such as metadata and watermarks, and searches all document types. In addition, the local programs are designed for crawling the current computer or local network, and not just a specified directory, although user-defined limitations can exist. Another advantage is that the searching is continuous, allowing the search times to be set so as not to slow the system during peak hours. Thus, this novel system can handle huge amounts of data without network congestion or slow user response.
[0045] Finally, the system can be designed to search documents for out-of-band information, such as header and footer metadata, or in-band information, such as watermarks, so that the files can be classified according to this extra information and not only text. This is extremely useful for non-text media files, such as images, audio and video, since search engines currently do not know how to classify these files. For example, the watermark may contain keyword information (e.g., content type tags) about a scene in an image and whether the image is acceptable for viewing by minors (an adult content flag).
[0046] Having summarized the system and its advantages, we now describe an implementation of a system for searching and indexing multimedia content and metadata related to that content. FIG. 1 shows components of this system. In this system, a Web searching agent (e.g., search agent thread 120) runs locally on a collection of distributed, registered Web servers (e.g., Web server 122) and reports back to a searchable database 124 available for general Web searching. In particular, the agent invokes watermark detectors to extract content identifiers from watermarks imperceptibly
embedded in multimedia content files 126 and fetch related metadata using the metadata linking system described above. Alternatively, the watermarks include content type flags that may be used to index the content type without resorting to a metadata database 116. In addition, the agent invokes text based searching of files and file headers and footers to index text content, such as word processor documents 128, based on key words. The agents (e.g., 120) supply the content type tags from watermarks and key word text to a searchable database (124) that indexes the content type tags and text in a content index 130. The content index has a searchable index of key words and content tags 132 that are associated with file pointers 134 of files that match the description of the key words/content tags. The file pointers provide the location of the corresponding files on the computer network.
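As a non-limiting illustration, the content index 130 of paragraph [0046] may be sketched as an inverted index that maps each key word or content tag to the set of file pointers reported by the agents. The report layout, the field names and the example URLs below are assumptions made for the sketch, as is the choice to fold terms to lower case.

```python
from collections import defaultdict


def build_content_index(agent_reports):
    """Build a mapping: key word or content tag -> set of file pointers.

    Each report is assumed to be a dict with a 'file_pointer' plus optional
    'content_tags' (from watermarks) and 'key_words' (from text search).
    """
    index = defaultdict(set)
    for report in agent_reports:
        terms = set(report.get("content_tags", ())) | set(report.get("key_words", ()))
        for term in terms:
            index[term.lower()].add(report["file_pointer"])
    return index


# Hypothetical reports from two search agents.
reports = [
    {
        "file_pointer": "http://server-a.example/media/sunset.jpg",
        "content_tags": ["Image", "Landscape"],
    },
    {
        "file_pointer": "http://server-b.example/docs/report.doc",
        "key_words": ["landscape", "budget"],
    },
]
content_index = build_content_index(reports)
```

Storing sets of pointers keeps repeated reports for the same file from producing duplicate entries.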
[0047] The searchable database 124 has a search engine 136 that presents a Web based interface enabling users to present key word searches or searches automated by detecting a watermark from a particular content item of interest. In the former case, the user supplies a key word search query, much like the user interfaces of Google or AltaVista, and the searchable database uses the key word query as input to a search of its index for related content. In the latter case, a watermark detector, such as reader application 108, extracts a watermark from a content file, and uses the watermark to derive content type tags for that file. The detector obtains these content type flags either directly from content type tags in the watermark message payload, or indirectly from a database look up of a content identifier from the watermark message to content type tags in the metadata database 116. The watermark detector 108 provides the search engine 136 with one or more content type tags for the content file of interest. The searchable database 124 uses the content type tags and/or the keyword search terms to search the index of content 130, and returns pointers to the content items that match the search request. Since the search engine 136 has a Web interface, it is accessible from remote computers (e.g., user’s computer 106) via a conventional Internet browser application, or other applications with browser capability, such as watermark reader application 108.
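The two ways of deriving content type tags (directly from the watermark payload, or indirectly through a look up keyed by the content identifier) can be sketched as follows. This is only an illustration under stated assumptions: `WatermarkPayload`, `resolve_tags`, `build_query`, and the sample database contents are hypothetical names and values, not part of the described system.

```python
from dataclasses import dataclass


@dataclass
class WatermarkPayload:
    """Hypothetical decoded watermark message."""
    content_id: int
    content_type_tags: tuple = ()  # empty when the payload carries only an ID


# Hypothetical stand-in for metadata database 116: content ID -> tags.
METADATA_DB = {1001: ("image", "beach")}


def resolve_tags(payload, metadata_db):
    """Prefer tags carried in the payload; otherwise look them up by ID."""
    if payload.content_type_tags:
        return list(payload.content_type_tags)
    return list(metadata_db.get(payload.content_id, ()))


def build_query(payload, metadata_db, keywords=()):
    """Combine content type tags with optional key word search terms."""
    return resolve_tags(payload, metadata_db) + [k.lower() for k in keywords]
```

The resulting list of terms is what a reader application would hand to the search engine; how the search engine weighs tags against key words is left open by the text.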
[0048] The search agents 120 run on computers and computer networks that are difficult to access through conventional Web crawler searching. The search agents have a number of parameters that control their operation. In particular, the agents have input parameters that enable a Webmaster to specify the directories, times, and CPU usage for searching (e.g., search designated directories 138 between 1 A.M. and 5 A.M. using no more than x % of CPU time per machine in each thread of execution). In Web servers, the search agent can be programmed to minimize interference with requests for files to be searched, and can be programmed to search redundant copies of content on a Web site so as to not interfere with Web site content that is accessible for downloading by others.
[0049] By running locally on the Web server 122 or user’s machine 106, the search agent can also search non-HTML files, such as Word documents, PowerPoint presentations, spread sheets, databases and watermarked media for deep searching. By running in a distributed architecture, more content can be searched and categorized. The agent preferably runs as a distributed agent on the Web server or local computer network 122, using idle computer processing cycles of computers in the evening or other off-peak hours. In addition, the searching agent is intelligent. The agent can use search agent technology such as RuleSpace for text and Virage for video categorization.
[0050] Images, audio and video in the file directory of the Web server or local network 122 to be searched are watermarked and categorized based on content tags stored in the router system 114 or metadata database 116. In particular, the content identifier in the watermark embedded in the content is associated with usage rules stored in the router’s registry 118 and/or metadata database 116. These usage rules can be used to specify the content type and control how the content is indexed and used by those that access the content via the searchable database 124. Using this approach, more Web content can be better categorized, thus improving consumers’ searches and properly indexing every company’s Web server.
[0051] The above system is intended for enabling wider access to content on Web servers to others on the Internet via the searchable database that indexes the content. However, a similar structure may be used for internal digital asset management (DAM) within a company’s local or wide area computer network. In particular, in this configuration, the digital asset management system runs within the company’s Intranet, and the search agent 120 runs on every employee’s computer. More specifically, each employee marks directories on his computer or network directory that are to be continually searched (e.g., the designated directories 138), categorized and reported to the central Intranet search site (the searchable database having a repeatedly updated index of accessible content on the Intranet). Each employee moves important documents and watermarked content files to that directory when finished, or allows people to search on documents in process. For example, as the user creates content files like images, audio or video 140, she invokes a watermark embedder application 142 to embed a content identifier or content type tags into an imperceptible watermark embedded in the content. These watermarks enable the search agent 120 to find the content to be indexed in the designated directories, and further, enable the system to index the watermarked files in the searchable database 124, which is then searchable by others. The searchable database 124 returns pointers to where content files satisfying a search can be found in the Intranet, and fetched automatically. In summary, the system helps employees of large companies to access and share company information.
[0052] As an alternative to a watermark embedder, a file header inserter may be used to write content type tags into the header or footer of the file. In this case, the search agent is programmed to read the file header/footer for content type tags. Otherwise, operation of the system is similar.
[0053] While the above structure helps locate digital assets and associate usage rules, the system also shows the relationship between content items, like documents, images, audio, etc. For example, when a user finds a document satisfying a search request, the user interface of the search engine 136 returns an interface displaying all of the linked files, such as for HTML, word processor documents, etc., and inserted objects, such as images, audio, video, etc.
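Returning to the agent parameters of paragraph [0048] (designated directories, a search window, and a CPU ceiling), one possible way to represent them is sketched below. The names `AgentConfig` and `may_search`, the example directory, and the 20 % ceiling are assumptions for illustration; the text leaves the CPU limit as an unspecified x %.

```python
from dataclasses import dataclass
from datetime import time


@dataclass(frozen=True)
class AgentConfig:
    """Hypothetical Webmaster-supplied parameters for a search agent."""
    directories: tuple        # designated directories to search
    window_start: time        # earliest time of day the agent may run
    window_end: time          # time of day at which the agent must stop
    max_cpu_percent: float    # ceiling on the agent's CPU usage


def may_search(config, now, agent_cpu_percent):
    """True if `now` is inside the window and CPU use is under the ceiling."""
    if config.window_start <= config.window_end:
        in_window = config.window_start <= now < config.window_end
    else:
        # Window wraps past midnight, e.g. 23:00 to 05:00.
        in_window = now >= config.window_start or now < config.window_end
    return in_window and agent_cpu_percent < config.max_cpu_percent


# 1 A.M. to 5 A.M. as in the text; the 20 % ceiling is an assumed value.
config = AgentConfig(("/srv/www/designated",), time(1, 0), time(5, 0), 20.0)
```

An agent loop would call `may_search` before scanning each file and pause when it returns `False`, which is one simple way to keep the agent from interfering with normal file requests.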
[0054] This system advantageously employs digital watermarks and key word text to index content within company networks. The watermarks carry identifiers that link the content to metadata through the router and metadata database. This metadata, in turn, enables the content to be indexed for searching.
[0055] The systems described above overcome key obstacles to effectively associating content with its metadata. One of the key obstacles with any digital asset management system is the cost of inputting the metadata associated with each digital asset file. By using watermarks to identify and link through the router system, the system overcomes this obstacle.
[0056] To illustrate, consider the following example. I take a picture with my digital camera and store the image in my digital asset management (DAM) system (e.g., content database 102 and metadata database system 116). I enter in associated metadata (maybe the name of the beach it was taken on), which is stored in the metadata database 116. The image is watermarked with an Image ID, establishing a link between the Image ID and the metadata database entry storing the name of the beach. I now distribute the image to my business partners. One partner takes the image and stores it in his DAM system. This system recognizes the watermark and links through the router to the metadata database in my DAM system, which responds by supplying all the metadata. This data is then automatically entered into my partner’s system, improving productivity and accuracy, and gaining metadata that could not be determined from the image itself (the name of the beach). In this manner, the imperceptibly embedded digital watermark in content items enables disparate DAM systems to interoperate and share content items.
[0057] Moreover, the metadata for a content item stores usage rules that govern where the metadata and content file is allowed to be shared (e.g., to a particular authenticated user, to a particular authenticated machine, etc.). This authentication scheme is implemented by requiring the user who wants access to the content or its metadata to supply authentication data, such as a particular computer address, password, etc.
[0058] The system combines two powerful functions: automatically indexing content files through the search agent and searchable database, and automatically indexing the metadata associated with those content files.
[0059] The searchable database 124 may be centralized or distributed over a number of computers interconnected on a network. The content index 124 can be searched from a standard browser as noted above, or searched by agents, as in the Gnutella system. In file sharing networks, the search agent 120 can be programmed to scan files on a user’s computer while the computer is connected to the file sharing network. Alternatively, the search agent can run on the user’s computer in off-peak times and create a local index of content on the user’s machine. Then, whenever the user connects, the search agent shares this locally created index with a central content index maintained by the searchable database 124 or a distributed content index database that is shared among users of the file sharing network.
Content and Asset Management System and Method
[0060] An asset management system 200 is now described with reference to FIG. 2. A content creator 210 develops content (audio, video, images, etc.) for distribution. During (or after) content creation, the content is registered via a registration authority 220 to obtain a unique identifier (ID) for the content. The registration process can be electronically automated, e.g., via the internet or other network system. The registration authority 220 preferably maintains (or communicates with) a database 230, which associates the content (and/or enhanced content) with the unique IDs.
[0061] Once obtained, an identifier is steganographically encoded within the content, e.g., in the form of a digital watermark. (Of course, the content creator, the registration authority or a third party may carry out the actual encoding). In one embodiment, multiple IDs are associated with a single content item. For example, individual identifiers uniquely identify particular audio segments or video sequences. Even objects within a video frame (or still image) can be identified with a unique identifier. Such embedded identifiers may be used to trigger an action or response, or to identify content, distributors, authors, performers, etc.
[0062] The registered, embedded content may be optionally associated with enhanced content. For example, in an interactive television system (“iTV”), the content may be associated with interactive (e.g., enhanced) content, such as Web pages or internet sites, graphics, audio and video, etc. In this case, an embedded identifier may correspond to a specific URL or IP address, which is maintained in database 230. (For audio-based content, the embedded identifiers may be similarly associated with enhanced content, such as a URL or IP address, performer, artist, record label, etc.). Of course, instead of storing the enhanced content, database 230 may include links to the enhanced data. The relationship between unique identifiers and enhanced content is maintained via database 230. (Of course, the registration authority 220 and the enhanced content database 230 may be in communication, and in one embodiment, may even be functionally combined.).
[0063] The embedded media content is packaged. For example, video content is reproduced on video cassettes (e.g., VHS cassettes) or DVDs, and audio content is reproduced on CDs, audio DVD, electronic or magnetic media, or tapes, etc. (The term media package is used to represent both a physical package (e.g., VHS cassettes, DVD, jewel case, etc.) and/or any media content contained therein.).
[0064] The physical package 250 is also encoded, e.g., digitally watermarked. The encoding of the package can encompass artwork or printing on a package, or may include an encoded label, certificate, media documentation, shipping invoice or package container, etc. If a line design or graphic is present, it too can be encoded. (The design and/or text on a DVD or CD face can even be encoded.). A variety of watermarking encoding techniques are detailed in the patent documents discussed herein; a variety of other encoding techniques are known to those skilled in the art. Such techniques may be suitably employed with the present invention.
[0065] The digital watermark embedded within package 250 preferably includes a unique identifier (e.g., as payload bits), similarly obtained from the registration authority 220. The package watermark identifier is associated with the packaged content (or the watermark embedded therein).
[0066] There are many advantages and applications associated with watermarking media content and its respective content package. A few examples are provided below.
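The registration flow of paragraphs [0060] to [0065] (obtain a unique ID for the content, optionally link it to enhanced content, then obtain a package ID associated with the packaged content) might be modeled as in the sketch below. The class and attribute names are hypothetical, and the sequential ID allocation is an assumption; the text only requires that identifiers be unique.

```python
import itertools


class RegistrationAuthority:
    """Hypothetical model of registration authority 220 plus database 230."""

    def __init__(self):
        self._next_id = itertools.count(1)  # assumed ID scheme: 1, 2, 3, ...
        self.enhanced = {}            # content ID -> link to enhanced content
        self.package_to_content = {}  # package ID -> content ID

    def register_content(self, enhanced_link=None):
        """Issue a unique content ID, optionally tied to enhanced content."""
        content_id = next(self._next_id)
        if enhanced_link is not None:
            self.enhanced[content_id] = enhanced_link
        return content_id

    def register_package(self, content_id):
        """Issue a unique package ID associated with the packaged content."""
        package_id = next(self._next_id)
        self.package_to_content[package_id] = content_id
        return package_id
```

For example, `cid = ra.register_content("http://itv.example/show")` followed by `pid = ra.register_package(cid)` yields two distinct identifiers, one to be embedded in the content watermark and one in the package watermark.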
[0067] In one embodiment, possession of the physical package itself is required to facilitate verification, registration and/or authentication. Consider a video distribution example with reference to FIG. 3. A distributor (e.g., broadcaster or cable operator, etc.) 260 receives the packaged content 250 (video in this example). As discussed above, the video content includes at least a first watermark, and the package itself includes at least a second watermark. The broadcaster 260, in order to register the content and/or enable viewer access to enhanced content index database 230, presents the watermarked package to a compliant reading device (e.g., a device that is capable of reading the second watermark). The package identifier is extracted from the second watermark and conveyed to the registration authority 220, preferably along with a user, broadcaster or network ID. Upon receipt, the registration authority 220 permits access of the distributor 260 (or its viewer network) to the enhanced data stored in database 230. (The authority 220 or database 230 can log that a particular distributor or network has registered the package watermark. Then when a database query is received for the enhanced content, e.g., via a media content identifier with the distributor or network ID, the distributor or network ID is checked to determine whether registration has occurred. If so, database access is permitted.). A digital or other reproduction of the video content, without the watermarked package itself, will not allow access to the enhanced or interactive content.
[0068] In another embodiment, both IDs (i.e., package and content) are required to access the media content. In this case, however, the package ID provides a key (e.g., encryption key or watermark orientation/location or decoding key) to read the content or to access the content watermark identifier. The package watermark is initially read and information contained therein enables (e.g., decodes, unscrambles, etc.) the content or the content watermark. In a case where the package watermark identifier provides access to the content watermark, once obtained, the content watermark can then be used to unlock or unscramble the media content. Without physical possession of the package (and the watermark encoded thereon), viewing or listening to the media content is prohibited or impaired.
[0069] In still another embodiment, a compliant device (perhaps a video recorder or audio player) reads both the package watermark and the content watermark. The compliant device determines if the watermarks match (or correspond with one another). The compliant device may even query the registration authority 220 or other database to determine if the watermarks coincide. The device operates to play the content only if the watermarks coincide.
[0070] In yet another embodiment, content is watermarked with a unique identifier as discussed above. The corresponding packaging is also watermarked with a corresponding ID. (In this section, the term “corresponding” implies that the watermarks are the same, match, relate, correspond, are compatible with, or are related to one another via a data record, etc.). The packaged content is placed in a retail distribution system. The package watermark is used to manage the content, e.g., inventory, shelf management, etc. For example, the package can be read (or scanned) by a compliant device to determine a quantity, content, inventory status, etc.
[0071] So-called fragile watermarking may also be utilized to even further enhance security of a package. A fragile watermark is one that does not survive a scan-print or copy process. Accordingly, a package may be encoded with a fragile watermark in any of the above embodiments. Although a fragile watermark is not robust enough to survive duplication, it still provides accurate watermark detection for an original package, e.g., the watermarked package. Accordingly, a would-be pirate may be able to copy the digital content, but would be unable to successfully reproduce the watermarked package itself (e.g., unable to copy the fragile watermark). (Various fragile watermarking techniques are discussed in assignee’s U.S. patent application Ser. No. 09/689,226, filed Oct. 11, 2000, and Ser. No. 09/731,456, filed Dec. 6, 2000, and assignee’s PCT Publication WO 99/36876, published Jul. 22, 1999, each of which are hereby incorporated by reference. Artisans in the field know other fragile watermarking techniques. Of course, such other techniques are suitably interchangeable with the present invention.).
[0072] (As an alternative, to deter use of precision photocopy apparatuses to reproduce a package face (while retaining the associated watermark), the face of the package can be provided with a reflective layer, e.g., in the form of an overlay or varnish. In the bright illumination of a photocopier, such layer mirrors the light back onto the photodetectors, preventing them from accurately reproducing the watermark pattern. In contrast, when presented to a web cam or other such imaging device, no bright illumination is typically present, so the photosensors are not overwhelmed and the document can be used for its intended authentication purpose.).
CONCLUDING REMARKS
[0073] Having described and illustrated the principles of the technology with reference to specific implementations, it will be recognized that the technology can be implemented in many other, different, forms. To provide a comprehensive disclosure without unduly lengthening the specification, applicants incorporate by reference the patents and patent applications referenced above.
[0074] The methods, processes, and systems described above may be implemented in hardware, software or a combination of hardware and software. For example, the watermark data encoding processes may be implemented in a programmable computer or a special purpose digital circuit. Similarly, watermark data decoding may be implemented in software, firmware, hardware, or combinations of software, firmware and hardware. The methods and processes described above may be implemented in programs executed from a system’s memory (a computer readable medium, such as an electronic, optical or magnetic storage device).
[0075] The particular combinations of elements and features in the above-detailed embodiments are exemplary only; the interchanging and substitution of these teachings with other teachings in this and the incorporated-by-reference patents/applications are also contemplated.
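By way of illustration only, the package-gated access of paragraph [0067] and the compliant-device check of paragraph [0069] could be sketched in software as follows. All names here (`EnhancedContentGate`, `device_may_play`, the dictionary layouts) are hypothetical, and treating "coincide" as a simple look up of the content ID recorded for the package is an assumption; the specification allows other notions of correspondence.

```python
class EnhancedContentGate:
    """Hypothetical log of which distributors have registered a package."""

    def __init__(self, package_to_content, enhanced_links):
        self._package_to_content = package_to_content  # package ID -> content ID
        self._enhanced_links = enhanced_links          # content ID -> link
        self._registered = set()                       # (distributor, content)

    def register_package(self, package_id, distributor_id):
        """Log a registration when a known package watermark is presented."""
        content_id = self._package_to_content.get(package_id)
        if content_id is None:
            return False
        self._registered.add((distributor_id, content_id))
        return True

    def query(self, content_id, distributor_id):
        """Return the enhanced-content link only after registration."""
        if (distributor_id, content_id) not in self._registered:
            return None
        return self._enhanced_links.get(content_id)


def device_may_play(package_id, content_id, package_to_content):
    """Compliant-device check: play only if the two watermarks coincide."""
    return (package_id in package_to_content
            and package_to_content[package_id] == content_id)
```

In this sketch a copy of the content alone never reaches the enhanced data, because `query` succeeds only for a distributor that has previously presented the package identifier.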
We claim:
1. A method for processing media content on a distributed network, the method comprising:
identifying media content signals stored at sites distributed over the distributed computer network;
extracting content identifiers from the content signals;
using the content identifiers to obtain metadata used to classify the media content signals; and
creating a searchable index of the media content signals based on the metadata, wherein users access the searchable index on the distributed computer network to submit a search query for the searchable index to retrieve links to the media content signals.
2. The method of claim 1 wherein the content identifiers are extracted from digital watermarks imperceptibly embedded in the content signals by making imperceptible changes to audio or image signals that comprise the content signals.
3. The method of claim 1 wherein the content identifiers reference metadata corresponding to the media content signals that is stored in remote locations from the media content signals.
4. The method of claim 2 wherein the digital watermarks include content flags that are used to classify the media content signals in the searchable index.
5. The method of claim 1 wherein the identifying includes executing search agents within different local computer networks that are each connected to the distributed computer network, the search agents extracting the content identifiers from content signals stored within corresponding local computer networks and providing the metadata for indexing in the searchable index.
6. The method of claim 1 wherein the identifiers are used to obtain usage rules specifying how the content signals from which the identifiers are extracted are to be indexed or used by the users of the searchable index.
7. The method of claim 1 wherein the metadata is stored in a database accessible to the users, and users update the metadata in the database by supplying metadata about corresponding content signals that then becomes subsequently accessible to other users that submit search queries for content signals on the distributed computer network.
8. A method for searching for audio or images on a distributed computer network comprising:
from a location in the distributed computer network, receiving a query for content signals related to a first content signal, the first content signal being part of the query;
receiving a content identifier extracted from audio or image data of the first content signal;
using the content identifier to obtain metadata used to classify the first content signal;
searching a searchable index of media content signals based on the metadata, which forms search criteria for the first content signal; and
returning a set of search results including references to content signals stored in the distributed computer network that correspond to the search criteria.
9. The method of claim 8 wherein the content identifier is extracted from a digital watermark imperceptibly embedded in the first content signals by making imperceptible changes to audio or image signals that comprise the first content signal.
10. The method of claim 8 wherein the content identifier references metadata corresponding to the first content signal that is stored in a remote location from the first content signal.
11. The method of claim 9 wherein the digital watermark include a content flag that is used to classify the first content signal as part of the search criteria used to search for related content signals in the searchable index.
12. The method of claim 8 wherein the searchable index is built by executing search agents within different local computer networks that are each connected to the distributed computer network, the search agents extracting content identifiers from content signals stored within corresponding local computer networks and providing metadata for indexing in the searchable index.
13. The method of claim 8 wherein the identifier extracted from the first content signal is used to obtain a usage rule specifying how the first content signal is to be used by the users of the searchable index.
14. The method of claim 8 wherein the metadata is stored in a database accessible to the users, and users update the metadata in the database by supplying metadata about corresponding content signals that then becomes subsequently accessible to other users that submit search queries for content signals on the distributed computer network.