The Hypertext Transfer Protocol (HTTP) Entity Tag ("ETag") Response Header in Write Operations

greenbytes GmbH
Hafenweg 16
Muenster, NW 48155
Germany
+49 251 2807760
+49 251 2807761
julian.reschke@greenbytes.de
http://greenbytes.de/tech/webdav/
The Hypertext Transfer Protocol (HTTP) specifies a state identifier,
called "Entity Tag", to be returned in the "ETag" response header. However,
the description of this header for write operations
such as PUT is incomplete, and has caused confusion among developers and
protocol designers, and potentially interoperability problems.
This document explains the problem in detail and suggests both a
clarification for a revision to the HTTP/1.1 specification (RFC2616) and a new header for use in responses,
making HTTP entity tags more useful for user agents that want to avoid
round-trips to the server after modifying a resource.
Distribution of this document is unlimited. Please send comments to the
Hypertext Transfer Protocol (HTTP) mailing list at ietf-http-wg@w3.org, which may be joined by sending a message with subject
"subscribe" to ietf-http-wg-request@w3.org.
Discussions of the HTTP working group are archived at
.
XML versions, latest edits and the issues list for this document
are available from .
The Hypertext Transfer Protocol (HTTP) specifies a state identifier,
called "Entity Tag", to be returned in the "ETag" response header (see
). However,
the description of this header for write operations
such as PUT is incomplete, and has caused confusion among developers and
protocol designers, and potentially interoperability problems.
This document explains the problem in detail and suggests both a
clarification for a revision to the HTTP/1.1 specification (RFC2616)
and a new header for use in responses, making HTTP entity tags more useful for user agents that want to avoid
round-trips to the server after modifying a resource.
Note that there is a related problem: modifying content-negotiated
resources. Here the consensus seems to be simply not to do it. Instead,
the origin server should reveal specific URIs of content that is not
content-negotiated in the Content-Location response header (), and user
agents should use this more specific URI for authoring. Thus, the remainder
of this document will focus on resources for which no content negotiation
takes place.
Another related question is the usability of the weak entity tags for
authoring (see ). Although
this document focuses on the usage of strong entity tags, it is believed
that the changes suggested in this document could be applied to weak
entity tags as well.
For a long time, nobody realized that there was a problem at all, or
those who did preferred to ignore it.
Server implementers added code that would return the new value of the
"ETag" header in a response to a successful PUT request. After all, the client
could be interested in it.
User agent developers in turn were happy to get a new "ETag" value, saving
a subsequent HEAD request to retrieve the new entity tag.
However, at some point in time, potentially during a Web Distributed
Authoring and Versioning (WebDAV, ) interoperability
event, client programmers asked server programmers to always return "ETag" headers
upon PUT, never ever to change the entity tag without "good reason", and
- by the way - always to guarantee that the server stores the new content
octet-by-octet.
From the perspective of client software that wants to treat an HTTP
server as a file system replacement, this makes a lot of sense. After all,
when one writes to a file one usually expects the file system to
store what was written, and not to unexpectedly change the contents.
However, in general, an HTTP server is not a file system replacement. There
may be some that have been designed that way, and some that expose some
parts of their namespace that have this quality.
But in general, HTTP server implementers have a lot of freedom in how
resources are implemented. Indeed, this flexibility is one of the reasons
for HTTP's success, allowing it to be used for a wide range of tasks, of which
replacing file systems is just one (and not necessarily the most
interesting one).
In particular:
A server may not store a resource as a binary object - in this case,
the representation returned in a subsequent GET request may just
be similar, but not identical to what was written. Good examples
are servers that use HTTP to access XML data (),
Calendaring data () or newsfeed data
().
A server may change the data being written on purpose, while it's
being written. Examples that immediately come to mind are keyword
substitution in a source control system, or filters that remove
potentially insecure parts from HTML pages.
Furthermore:
An "unrelated" method such as WebDAV's PROPPATCH (see )
may affect the entity body and therefore the entity tag in an
unexpected way, because the server stores some or all of the WebDAV
properties inside the entity body (for instance, GPS information inside a
JPG image file).
As long as servers store the content octet-by-octet, and return exactly
what the client wrote, there is no problem at all.
Things get more interesting when a server does change the content,
such as in the "simple authoring" example given in .
Here, the server does change the content upon writing to the resource,
yet no harm is done, because the final state of the resource on the
server does not depend on the client being aware of that.
All of the content rewriting examples mentioned above have this quality:
the client can safely continue to edit the entity it sent, because
the result of the transformation done by the server will be the same
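in the end. Formally, if we call the server-side transformation "t", the
initial content "c", and the client-side editing steps "e1" and "e2",
then

   t(e2(e1(c))) = t(e2(t(e1(c))))

that is, the server ends up storing the same entity whether the client
edits the copy it sent or the transformed copy held by the server.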
Question: does anybody know a real-world example for server-side content
rewriting where the above is not true?
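The property described above can be checked mechanically for a given transformation. The following is a minimal sketch, assuming a hypothetical keyword-substitution transformation; the function names, the "$Revision$" keyword syntax and the sample content are illustrative assumptions and are not taken from any particular server.

```python
import re


def t(content: str, revision: int) -> str:
    """Hypothetical server-side transformation: expand the Revision keyword."""
    return re.sub(r"\$Revision[^$]*\$", f"$Revision: {revision}$", content)


def e1(content: str) -> str:
    """First client-side editing step (illustrative)."""
    return content + "First edit.\n"


def e2(content: str) -> str:
    """Second client-side editing step (illustrative)."""
    return content + "Second edit.\n"


c = "$Revision$\nInitial text.\n"

# The client keeps editing the entity it sent: t(e2(e1(c))).
kept_local_copy = t(e2(e1(c)), revision=2)

# The client refetches the transformed entity first: t(e2(t(e1(c)))).
refetched_first = t(e2(t(e1(c), revision=1)), revision=2)

assert kept_local_copy == refetched_first
```

A transformation for which the assertion fails would be an example of the kind asked for in the question above.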
Problems will only occur if the client uses the entity body it sent,
and the entity tag it obtained
in return, in subsequent requests that only transfer part of the
entity body, such as GET or PUT requests using the "Range" request header
(see ).
Furthermore, some clients need to expose the actual contents to the end user.
These clients will have to ensure that they really have the current
representation.
Entity bodies (and thus entity tags) changing due to side effects of
seemingly unrelated requests are indeed a problem, as demonstrated in
, and this specification
proposes a way to resolve this in .
There are several places in the HTTP/1.1 specification ()
mentioning the "ETag" response header.
Let us start with the header definition in :
The ETag response-header field provides the current value of the entity
tag for the requested variant. Sections , and
describe the headers used with entity tags. The entity tag MAY be
used for comparison with other entities from the same resource (see
Section ).
The meaning of a "response-header" in turn is defined in :
The response-header fields allow the server to pass additional information
about the response which cannot be placed in the Status-Line. These header
fields give information about the server and about further access to the
resource identified by the Request-URI.
The "ETag" response header itself is mentioned mainly in the context
of cache validation, such as in .
What is missing is a coherent description of how the origin server can notify the
user agent when the entity tag changes as a result of a write operation, such
as PUT.
Indeed, the definition of the 201 Created status code mentions entity tags ():
A 201 response MAY contain an ETag response header field indicating the
current value of the entity tag for the requested variant just created,
see Section .
The "ETag" response header is mentioned again in the definition of
206 Partial Content ()
and 304 Not Modified (),
but notably missing are statements about other 2xx series status codes
that can occur upon a successful PUT operation, such as 200 OK ()
and 204 No Content ().
Summarizing, the specification is a bit vague about what an ETag
response header upon a write operation means,
but this problem is somewhat mitigated by the precise definition
of a response header. A proposal for enhancing
in this regard is made in below.
While working on the revision of , the IETF WebDAV
working group realized that this is a generic problem that needs attention
independently of WebDAV. An initial attempt was made with
in February 2006, but no
progress has been made since.
At the time of this writing in August 2006, two specifications based on HTTP
were under IESG review, taking two opposite approaches:
XCAP makes it a MUST-level
requirement to return an entity tag upon PUT, even though the very nature of
an XCAP server will cause it to rewrite contents (due to its XML-based
storage).
CalDAV explicitly forbids
("MUST NOT") returning an entity tag upon PUT if the content was rewritten.
In essence, this makes it impossible to implement an HTTP resource that
conforms to both specifications. Due to the differing use cases of XCAP and CalDAV,
this may not be a problem in practice, but the disagreement in itself is
scary. Publication of these specifications on the standards track will
make it much harder for future protocols to deal with this topic in a
meaningful way (comments were sent during IETF Last Call for CalDAV,
see ).
Note that of the two specifications above, XCAP profiles
the HTTP/1.1 specification in that it makes a previously optional behavior
required, while CalDAV explicitly forbids a behavior which
was previously allowed.
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD",
"SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be
interpreted as described in .
The terminology used here follows and extends that in the HTTP specification
, notably the augmented Backus-Naur Form (BNF) defined
in of that document.
This section describes a minimal change to , proposed
in .
At the end of , add:
The response MAY contain an ETag response header field indicating the
current value of the entity tag () for the requested
variant. The value SHOULD be the same as the one returned upon a
subsequent HEAD request addressing the same variant.
In , remove:
A 201 response MAY contain an ETag response header field indicating
the current value of the entity tag for the requested variant just
created, see .
In essence, this moves the statement about entity tags in write operations
from the specific case of status 201 Created into the more generic
description of the 2xx series status codes.
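The effect of the proposed wording can be illustrated with a toy model. This is only a sketch under stated assumptions: the class, its method names and the hash-based entity tag are hypothetical and stand in for a real origin server. It shows a server that includes "ETag" in every successful PUT response, whether the status is 201 or another 2xx code, with the same value a subsequent HEAD would return.

```python
import hashlib


class Resource:
    """Hypothetical in-memory resource; not a real HTTP server."""

    def __init__(self):
        self.body = None

    def _etag(self) -> str:
        # Illustrative strong entity tag derived from the stored octets.
        return '"' + hashlib.sha1(self.body).hexdigest()[:8] + '"'

    def put(self, body: bytes):
        created = self.body is None
        self.body = body
        status = 201 if created else 204
        # The entity tag is returned for every 2xx write response.
        return status, {"ETag": self._etag()}

    def head(self):
        return 200, {"ETag": self._etag()}


res = Resource()
status_created, headers_created = res.put(b"first")
status_updated, headers_updated = res.put(b"second")
_, head_headers = res.head()

# The value returned upon PUT matches what a subsequent HEAD reports.
assert headers_updated["ETag"] == head_headers["ETag"]
```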
Should "requested
variant" be clarified in the context of write operations?
The 'Entity-Transform' entity header provides information about whether a
transformation has been applied to an entity body.
When used in an HTTP request, its meaning is undefined. In an HTTP response,
it indicates whether the server applied a transformation
when the entity was last stored.
In general, entity headers may be stored by intermediaries. The main use
of this header, however, applies to the HTTP PUT method, whose results are by default
not cacheable (see ).
In addition, the value format is defined so that a client can reliably
detect whether the information is fresh.
The entity-tag specifies the entity body to which this information applies.
An entity-transform-keyword of "identity" specifies that the origin server has stored the
entity octet-by-octet, thus the user agent MAY use a local copy of the entity body
with the given entity-tag for subsequent requests that rely on octet-by-octet identity (such
as a PUT with "Range" request header).
Both the absence of this response header and any entity-transform-keyword value other than "identity"
specify that the origin server may have transformed the entity before
storage, thus a subsequent retrieval will not necessarily return an exact
copy of the submitted entity.
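A user agent could apply these rules roughly as follows. The grammar of the header value is not reproduced in the text above, so this sketch assumes a value consisting of an entity-tag, whitespace, and an entity-transform-keyword; that layout, the function names, the case-sensitive keyword comparison and the sample keyword "filtered" are all assumptions made for illustration.

```python
import re
from typing import Optional, Tuple

# Assumed value layout: entity-tag, whitespace, keyword (not taken from the grammar).
_VALUE = re.compile(r'^\s*((?:W/)?"[^"]*")\s+(\S+)\s*$')


def parse_entity_transform(value: str) -> Optional[Tuple[str, str]]:
    """Split a header value into (entity-tag, keyword), or None if malformed."""
    match = _VALUE.match(value)
    return (match.group(1), match.group(2)) if match else None


def may_reuse_local_copy(headers: dict) -> bool:
    """True only if the server reports octet-by-octet storage for this entity tag."""
    etag = headers.get("ETag")
    parsed = parse_entity_transform(headers.get("Entity-Transform", ""))
    if etag is None or parsed is None:
        # Absence of the header: the entity may have been transformed.
        return False
    tag, keyword = parsed
    # The information is fresh only if it refers to the current entity tag.
    return tag == etag and keyword == "identity"


identity_response = {"ETag": '"2"', "Entity-Transform": '"2" identity'}
stale_response = {"ETag": '"2"', "Entity-Transform": '"1" identity'}
rewritten_response = {"ETag": '"2"', "Entity-Transform": '"2" filtered'}
plain_response = {"ETag": '"2"'}
```

Only the first of the four sample responses would let the client keep using its local copy for requests that rely on octet-by-octet identity.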
The clarification of (see )
makes it clear that user agents can use "ETag" headers obtained in write
operations, as long as they do not require octet-by-octet identity. In
particular, a new entity tag can be returned for any method, such
as a WebDAV PROPPATCH (see ).
This helps in dealing with the problem described in .
See for details.
The addition of the "Entity-Transform" header (see )
enables origin servers to signal that they stored an exact copy of the
content, thus allowing clients not to refetch the content. Note that by
default (in the absence of the response header), a client cannot make any
assumptions about the server's behavior in this regard. Thus clients will
only benefit from new servers explicitly setting the new header.
This document specifies the new HTTP header listed below.
Header field name: Entity-Transform
Applicable protocol: http
Status: informational
Author/Change controller: IETF
Specification document: this specification
This specification introduces no new security considerations beyond those
discussed in .
Uniform Resource Identifier (URI): Generic Syntax. World Wide Web Consortium (timbl@w3.org); Day Software (fielding@gbiv.com); Adobe Systems Incorporated (LMM@acm.org).
Key words for use in RFCs to Indicate Requirement Levels. Harvard University (sob@harvard.edu).
Hypertext Transfer Protocol -- HTTP/1.1. University of California, Irvine (fielding@ics.uci.edu); W3C (jg@w3.org); Compaq Computer Corporation (mogul@wrl.dec.com); MIT Laboratory for Computer Science (frystyk@w3.org); Xerox Corporation (masinter@parc.xerox.com); Microsoft Corporation (paulle@microsoft.com); W3C (timbl@w3.org).
HTTP Extensions for Distributed Authoring -- WEBDAV. Microsoft Corporation (yarong@microsoft.com); Dept. of Information and Computer Science, University of California, Irvine (ejw@ics.uci.edu); Netscape (asad@netscape.com); Novell (srcarter@novell.com); Novell (dcjensen@novell.com).
Calendaring Extensions to WebDAV (CalDAV). cyrus@daboo.name; Oracle Corporation (bernard.desruisseaux@oracle.com); Open Source Application Foundation (lisa@osafoundation.org).
HTML 4.01 Specification. W3C (dsr@w3.org).
Design Considerations for State Identifiers in HTTP and WebDAV. UC Santa Cruz, Dept. of Computer Science (ejw@cse.ucsc.edu).
The Extensible Markup Language (XML) Configuration Access Protocol (XCAP). Cisco Systems (jdrosen@cisco.com).
The Atom Publishing Protocol. BitWorking, Inc (joe@bitworking.com); Propylon Ltd. (bill.dehora@propylon.com).
Let us consider a server not having the quality of preserving octet-by-octet
identity, for instance because
of SVN-style keyword expansion in text content ().
In this case, the client has previously retrieved the representation for
<http://example.com/test>, and the server has returned the ETag "1":
The client now wants to update the resource. To avoid overwriting
somebody else's changes, it submits the PUT request with the HTTP
"If-Match" request header (see ):
If the resource was modified in the meantime, the server will reject
the request with a 412 Precondition Failed status:
In this case, the client usually has to take care of merging the changes
made locally with those made on the server ("Merge Conflict").
If there was no overlapping update, the server will execute the
request and return a new entity tag:
What seems to be a problem at first may not be a real problem in
practice. Let us assume that the client continues editing the resource,
using the entity tag obtained from the previous request, but editing
the entity it last sent:
Assuming there was no overlapping update, the PUT request will succeed:
Note that the only problem here is that the client doesn't have an exact
copy of the entity it's editing. However, from the server's point of view
this is entirely irrelevant, because the "Revision" keyword will be
automatically updated upon every write anyway.
In any case, the final contents will be:
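The whole exchange, including the final contents, can be replayed against a small in-memory model. This is a sketch under stated assumptions rather than the messages of the original example: the class and method names, the "$Revision: N$" keyword syntax, the sample text and the use of 204 for a successful PUT are all hypothetical choices made for illustration.

```python
import re


class KeywordExpandingResource:
    """Toy stand-in for the server; every name here is hypothetical."""

    def __init__(self, body: str):
        self.revision = 1
        self.body = self._expand(body)

    def _expand(self, body: str) -> str:
        # Rewrite any "$Revision...$" keyword to the current revision.
        return re.sub(r"\$Revision[^$]*\$", f"$Revision: {self.revision}$", body)

    @property
    def etag(self) -> str:
        return f'"{self.revision}"'

    def get(self):
        return 200, {"ETag": self.etag}, self.body

    def put(self, body: str, if_match: str):
        if if_match != self.etag:
            return 412, {}  # Precondition Failed
        self.revision += 1
        self.body = self._expand(body)
        return 204, {"ETag": self.etag}


server = KeywordExpandingResource("$Revision$\nSample text.\n")
_, headers, local_copy = server.get()  # entity tag "1"

local_copy += "First edit.\n"
status_1, headers_1 = server.put(local_copy, if_match=headers["ETag"])

# A request that still carries the outdated entity tag is rejected.
status_stale, _ = server.put("overwrite\n", if_match='"1"')

# The client keeps editing the entity it sent, which still says revision 1.
local_copy += "Second edit.\n"
status_2, headers_2 = server.put(local_copy, if_match=headers_1["ETag"])

final_body = server.body
```

In this model the client's local copy never matches the stored entity octet-by-octet, yet the stored result is what the server would have produced had the client refetched before each edit.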
In this example, the server exposes data extracted from the HTML
<title> element ()
as a custom WebDAV property (),
allowing both read and write access.
In the first step, the client obtains the current representation for
<http://example.com/test.html>:
Next, it adds one paragraph to the <body> element, and gets back
a new entity tag:
Next, the client sets a custom "title" property (see ):
The server knows how to propagate property changes into the HTML
content, so it updates the entity by adding an HTML title element
accordingly. This causes the entity tag to change to "C".
A subsequent attempt by the client to update the entity body will fail,
unless it realizes that changing WebDAV properties may affect the
entity as well. In this case, it would have had to get the current
entity tag before proceeding. Of course, this introduces an additional
round-trip, and a timing window during which overlapping updates by
other clients would go unnoticed.
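The failure described here can be reproduced with a toy model. In this sketch the class, its methods, the HTML snippets and the title value "Test" are hypothetical; the entity tags "A", "B" and "C" are simply successive letters chosen to mirror the narrative, and 207 is used as the PROPPATCH status on the assumption of an ordinary WebDAV multi-status response.

```python
class HtmlResource:
    """Toy resource whose "title" property is stored inside the HTML body."""

    def __init__(self, body: str):
        self.body = body
        self._version = 0  # 0 -> "A", 1 -> "B", 2 -> "C", ...

    @property
    def etag(self) -> str:
        return '"' + chr(ord("A") + self._version) + '"'

    def head(self):
        return 200, {"ETag": self.etag}

    def put(self, body: str, if_match: str):
        if if_match != self.etag:
            return 412, {}  # Precondition Failed
        self.body = body
        self._version += 1
        return 204, {"ETag": self.etag}

    def proppatch_title(self, title: str):
        # The property change is written into the entity body...
        self.body = self.body.replace("<head>", f"<head><title>{title}</title>", 1)
        self._version += 1
        # ...but this server does not report the new entity tag.
        return 207, {}


server = HtmlResource("<html><head></head><body></body></html>")
edited = "<html><head></head><body><p>One</p></body></html>"
_, put_headers = server.put(edited, if_match='"A"')  # entity tag "B"
server.proppatch_title("Test")                       # entity tag is now "C"

# The client still believes the entity tag is "B", so the update fails.
edited_again = edited.replace("</body>", "<p>Two</p></body>")
failed_status, _ = server.put(edited_again, if_match=put_headers["ETag"])

# Recovering requires an extra round-trip to learn the current entity tag.
_, head_headers = server.head()
```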
Below we repeat the example from above (),
but here the origin server returns entity tags for all write operations,
and the user agent knows how to deal with them. That is, both take
advantage of what the HTTP/1.1 specification already allows.
As before, this causes the entity to change, and a new entity tag
to be assigned. But in this case, the origin server actually
notifies the client of the changed state by including the
"ETag" response header.
The client will now be aware that the requested entity changed, and can
use the new entity tag in subsequent requests (potentially after
refreshing the local copy).
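The improved exchange can be sketched in the same toy style. As before, the class, method names, HTML snippets and entity tag letters are hypothetical; the only behavior that differs from a server that stays silent is that the PROPPATCH response carries the new entity tag.

```python
class EtagReportingResource:
    """Toy resource that returns the entity tag for every write operation."""

    def __init__(self, body: str):
        self.body = body
        self._version = 0  # 0 -> "A", 1 -> "B", ...

    @property
    def etag(self) -> str:
        return '"' + chr(ord("A") + self._version) + '"'

    def get(self):
        return 200, {"ETag": self.etag}, self.body

    def put(self, body: str, if_match: str):
        if if_match != self.etag:
            return 412, {}  # Precondition Failed
        self.body = body
        self._version += 1
        return 204, {"ETag": self.etag}

    def proppatch_title(self, title: str):
        self.body = self.body.replace("<head>", f"<head><title>{title}</title>", 1)
        self._version += 1
        # The changed state is reported to the client.
        return 207, {"ETag": self.etag}


server = EtagReportingResource("<html><head></head><body></body></html>")
edited = "<html><head></head><body><p>One</p></body></html>"
_, put_headers = server.put(edited, if_match='"A"')    # entity tag "B"
_, prop_headers = server.proppatch_title("Test")       # entity tag "C", reported

# The new entity tag differs from the one obtained upon PUT, so the client
# refreshes its local copy before editing further.
_, _, fresh = server.get()
edited_again = fresh.replace("</body>", "<p>Two</p></body>")
status, final_headers = server.put(edited_again, if_match=prop_headers["ETag"])
```

Here the refresh is a deliberate choice by the client, triggered by the entity tag it was given, rather than a blind HEAD request before every write.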
Add and resolve issues "entity-header" and "extensibility", by removing
the extension hooks, and by redefining the header so that it can be used as an
entity header.
Update APP and CALDAV references. Remove RFC3986 reference (not needed anymore after
the simplification in draft 01). Fix typo in header description ("submitted entity", not
"stored entity"). Remove comparison about how XCAP and CALDAV profile
RFC2616: after all, both mandate a behaviour that was legal but optional before.
Add "Updates: RFC2616".