<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type='text/xsl' href='rfc2629.xslt' ?>
<!DOCTYPE rfc SYSTEM "rfc2629.dtd">
<?rfc toc="no" ?>
<?rfc symrefs="yes" ?>
<?rfc compact="yes" ?>
<?rfc-ext allow-markup-in-artwork="yes" ?>

<rfc number="2397" category="std">

<front>
  <title>The "data" URL scheme</title>

  <author initials='L.' surname='Masinter' fullname='Larry Masinter'>
    <organization abbrev="Xerox Corporation">Xerox Palo Alto Research Center</organization>
    <address>
      <postal>
        <street>3333 Coyote Hill Road</street>
        <city>Palo Alto</city>
        <region>CA</region>
        <code>94034</code>
      </postal>
      <email>masinter@parc.xerox.com</email>
    </address>
  </author>

  <date month='August' year='1998'></date>

</front>
<middle>
<section title="Abstract">
<t>
   A new URL scheme, "data", is defined. It allows inclusion of small
   data items as "immediate" data, as if they had been included
   externally.
</t>
</section>
<section title="Description">
<t>
   Some applications that use URLs also have a need to embed (small)
   media type data directly inline. This document defines a new URL
   scheme that would work like 'immediate addressing'. The URLs are of
   the form:
</t>
<figure><artwork type="inline">
                 data:[&lt;mediatype>][;base64],&lt;data>
</artwork></figure>
<t>
   The &lt;mediatype> is an Internet media type specification (with
   optional parameters). The appearance of ";base64" means that the data
   is encoded as base64. Without ";base64", the data (as a sequence of
   octets) is represented using ASCII encoding for octets inside the
   range of safe URL characters and using the standard %xx hex encoding
   of URLs for octets outside that range.  If &lt;mediatype> is omitted, it
   defaults to text/plain;charset=US-ASCII.  As a shorthand,
   "text/plain" can be omitted while the charset parameter is still
   supplied.
</t>
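<t>
   As a non-normative illustration, the encoding rules above might be
   sketched in Python as follows. The function name make_data_url and
   its parameters are assumptions made for this sketch and are not
   defined by this document. The sketch percent-encodes every octet
   outside the unreserved characters, which is more conservative than
   strictly required.
</t>
<figure><artwork type="example">
```python
import base64
from urllib.parse import quote


def make_data_url(payload: bytes, mediatype: str = "", use_base64: bool = False) -> str:
    """Build a "data" URL from raw octets (hypothetical helper)."""
    if use_base64:
        # ";base64" follows the media type; the data is the base64 text.
        body = base64.b64encode(payload).decode("ascii")
        return "data:" + mediatype + ";base64," + body
    # Without ";base64", octets outside the safe range become %xx escapes.
    body = quote(payload, safe="")
    return "data:" + mediatype + "," + body
```
</artwork></figure>
<t>
   Under these assumptions, make_data_url(b"A brief note") yields
   "data:,A%20brief%20note", where the empty media type stands for the
   default text/plain;charset=US-ASCII.
</t>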
<t>
   The "data" URL scheme is only useful for short values. Note that
   some applications that use URLs may impose a length limit; for
   example, URLs embedded within &lt;A> anchors in HTML have a length limit
   determined by the SGML declaration for HTML <xref target="RFC1866"/>. The LITLEN
   (1024) limits the number of characters which can appear in a single
   attribute value literal, the ATTSPLEN (2100) limits the sum of all
   lengths of all attribute value specifications which appear in a tag,
   and the TAGLEN (2100) limits the overall length of a tag.
</t>
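<t>
   To give a rough sense of scale, the following non-normative Python
   sketch estimates how many octets fit in a base64 "data" URL under
   such a limit. It assumes that the URL is the entire attribute value
   and that padded base64 is used (4 characters per 3 octets); the
   function name max_base64_payload is hypothetical.
</t>
<figure><artwork type="example">
```python
LITLEN = 1024  # attribute value literal limit quoted in the text above


def max_base64_payload(mediatype: str, limit: int = LITLEN) -> int:
    """Largest octet count whose base64 "data" URL fits in `limit` characters."""
    prefix = "data:" + mediatype + ";base64,"
    room = max(limit - len(prefix), 0)
    # Each complete group of 4 base64 characters carries 3 octets.
    return (room // 4) * 3
```
</artwork></figure>
<t>
   Under these assumptions, max_base64_payload("image/gif") is 750, so
   only very small images fit within a single attribute value literal.
</t>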
<t>
   The "data" URL scheme has no relative URL forms.
</t>
</section>
<section title="Syntax">
<figure><artwork type="inline">
    dataurl    := "data:" [ mediatype ] [ ";base64" ] "," data
    mediatype  := [ type "/" subtype ] *( ";" parameter )
    data       := *urlchar
    parameter  := attribute "=" value
</artwork></figure>
<t>
   where "urlchar" is imported from <xref target="RFC2396"/>, and "type", "subtype",
   "attribute" and "value" are the corresponding tokens from <xref target="RFC2045"/>,
   represented using URL escaped encoding of <xref target="RFC2396"/> as necessary.
</t>
<t>
   Attribute values in <xref target="RFC2045"/> are allowed to be either represented as
   tokens or as quoted strings. However, within a "data" URL, the
   "quoted-string" representation would be awkward, since the quote mark
   is itself not a valid urlchar. For this reason, parameter values
   that contain any "tspecial" should use the URL escaped encoding
   instead of the quoted-string representation.
</t>
<t>
   The ";base64" extension is distinguishable from a content-type
   parameter because it is not followed by an "=" sign.
</t>
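<t>
   The grammar above can be read as the following non-normative Python
   sketch of a parser. The name parse_data_url and its return shape are
   assumptions made for illustration. The sketch does not validate
   tokens against the imported grammars, and applying the default
   charset whenever the type is omitted is likewise an assumption.
</t>
<figure><artwork type="example">
```python
import base64
from urllib.parse import unquote, unquote_to_bytes


def parse_data_url(url: str) -> tuple[str, dict[str, str], bytes]:
    """Split a "data" URL into (media type, parameters, decoded octets)."""
    scheme, colon, rest = url.partition(":")
    if colon != ":" or scheme.lower() != "data":
        raise ValueError("not a data URL")
    # The first "," separates the header from the data.
    header, comma, data = rest.partition(",")
    if comma != ",":
        raise ValueError("missing ',' before data")
    parts = header.split(";")
    is_base64 = False
    # ";base64" is the trailing item that has no "=" sign.
    if len(parts) > 1 and parts[-1] == "base64":
        is_base64 = True
        parts.pop()
    mediatype = unquote(parts[0])
    params = {}
    for item in parts[1:]:
        name, eq, value = item.partition("=")
        if eq != "=":
            raise ValueError("parameter without '=': " + item)
        params[unquote(name).lower()] = unquote(value)
    if not mediatype:
        # Omitted type: default to text/plain, keeping any supplied charset.
        mediatype = "text/plain"
        params.setdefault("charset", "US-ASCII")
    octets = unquote_to_bytes(data)
    if is_base64:
        octets = base64.b64decode(octets)
    return mediatype, params, octets
```
</artwork></figure>
<t>
   For example, parse_data_url("data:,A%20brief%20note") returns the
   media type "text/plain", the parameter charset=US-ASCII, and the
   octets of "A brief note".
</t>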
</section>
<section title="Examples">
<t>
   A data URL might be used for arbitrary types of data. The URL
</t>
<figure><artwork type="example">
                       data:,A%20brief%20note
</artwork></figure>
<t>
   encodes the text/plain string "A brief note", which might be useful
   in a footnote link.
</t>
<t>
   The HTML fragment:
</t>
<figure><artwork type="example">
&lt;IMG
SRC="data:image/gif;base64,R0lGODdhMAAwAPAAAAAAAP///ywAAAAAMAAw
AAAC8IyPqcvt3wCcDkiLc7C0qwyGHhSWpjQu5yqmCYsapyuvUUlvONmOZtfzgFz
ByTB10QgxOR0TqBQejhRNzOfkVJ+5YiUqrXF5Y5lKh/DeuNcP5yLWGsEbtLiOSp
a/TPg7JpJHxyendzWTBfX0cxOnKPjgBzi4diinWGdkF8kjdfnycQZXZeYGejmJl
ZeGl9i2icVqaNVailT6F5iJ90m6mvuTS4OK05M0vDk0Q4XUtwvKOzrcd3iq9uis
F81M1OIcR7lEewwcLp7tuNNkM3uNna3F2JQFo97Vriy/Xl4/f1cf5VWzXyym7PH
hhx4dbgYKAAA7"
ALT="Larry">
</artwork></figure>
<t>
   could be used for a small inline image in an HTML document.  (The
   embedded image is probably near the limit of utility. For anything
   larger, data URLs are likely to be inappropriate.)
</t>
<t>
   A data URL's media type specification can include other
   parameters; for example, one might specify a charset parameter.
</t>
<figure><artwork type="example">
   data:text/plain;charset=iso-8859-7,%be%d3%be
</artwork></figure>
<t>
   can be used for a short sequence of Greek characters.
</t>
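<t>
   As a non-normative illustration of how a charset parameter affects
   interpretation, the following Python sketch decodes a hypothetical
   URL (not one of the examples in this document) whose data is three
   Greek letters in ISO-8859-7. It assumes that charset is the last
   parameter in the header.
</t>
<figure><artwork type="example">
```python
from urllib.parse import unquote_to_bytes

# Hypothetical URL chosen for this sketch.
url = "data:text/plain;charset=iso-8859-7,%e1%e2%e3"

header, _, data = url.partition(",")
charset = header.rpartition("charset=")[2]  # "iso-8859-7"
octets = unquote_to_bytes(data)             # the three octets E1 E2 E3
text = octets.decode(charset)               # alpha, beta, gamma
```
</artwork></figure>
<t>
   The %xx escapes yield octets first; only then is the charset applied
   to turn those octets into characters.
</t>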
<t>
   Some applications may use the "data" URL scheme in order to provide
   setup parameters for other kinds of networking applications. For
   example, one might create a media type
</t>
<figure><artwork type="example">
        application/vnd-xxx-query
</artwork></figure>
<t>
   whose content consists of a query string and a database identifier
   for the "xxx" vendor's databases. A URL of the form:
</t>
<figure><artwork type="example">
data:application/vnd-xxx-query,select_vcount,fcol_from_fieldtable/local
</artwork></figure>
<t>
   could then be used in a local application to launch the "helper" for
   application/vnd-xxx-query and give it the immediate data included.
</t>
</section>
<section title="History">
<t>
   This idea was originally proposed in August 1995. Some versions of the
   data URL scheme have been used in the definition of VRML, and a
   version has appeared as part of a proposal for embedded data in HTML.
   Various changes have been made, based on requests, to elide the media
   type, pack the indication of the base64 encoding more tightly, and
   eliminate "quoted printable" as an encoding since it would not easily
   yield valid URLs without additional %xx encoding, which itself is
   sufficient. The "data" URL scheme is in use in VRML, new applications
   of HTML, and various commercial products. It is being used for object
   parameters in Java and ActiveX applications.
</t>
</section>
<section title="Security">
<t>
   Interpretation of the data within a "data" URL has the same security
   considerations as any implementation of the given media type.  An
   application should not interpret the contents of a data URL which is
   marked with a media type that has been disallowed for processing by
   the application's configuration.
</t>
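<t>
   One way an application might apply such a configuration is sketched
   below in Python. This is a non-normative illustration: the names
   ALLOWED_TYPES, media_type_of and may_interpret are hypothetical, the
   policy shown is only an example, and percent-encoded media type
   names are not handled.
</t>
<figure><artwork type="example">
```python
ALLOWED_TYPES = {"text/plain", "image/gif"}  # hypothetical application policy


def media_type_of(url: str) -> str:
    """Return the lower-cased type/subtype of a "data" URL, applying the default."""
    if not url.lower().startswith("data:") or "," not in url:
        raise ValueError("not a data URL")
    header = url[5:].split(",", 1)[0]
    mediatype = header.split(";", 1)[0].strip().lower()
    # An omitted media type defaults to text/plain.
    return mediatype or "text/plain"


def may_interpret(url: str, allowed=ALLOWED_TYPES) -> bool:
    """True only if the declared media type is permitted by the policy."""
    return media_type_of(url) in allowed
```
</artwork></figure>
<t>
   With this example policy, "data:,hello" would be interpreted, while
   a URL declaring text/html would be refused before its data is
   examined.
</t>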
<t>
   Sites which use firewall proxies to disallow the retrieval of certain
   media types (such as application script languages or types with known
   security problems) will find it difficult to screen against the
   inclusion of such types using the "data" URL scheme.  However, they
   should be aware of the threat and take whatever precautions are
   considered necessary within their domain.
</t>
<t>
   The effect of using long "data" URLs in applications is currently
   unknown; some software packages may exhibit unreasonable behavior
   when confronted with data that exceeds their allocated buffer size.
</t>
</section>
</middle>
<back>
<references>
  <reference anchor='RFC2396'>
    <front>
      <title abbrev='URI Generic Syntax'>Uniform Resource Identifiers (URI): Generic Syntax</title>
      <author initials='T.' surname='Berners-Lee' fullname='Tim Berners-Lee'>
        <organization abbrev='MIT/LCS'>World Wide Web Consortium</organization>
        <address>
          <email>timbl@w3.org</email>
        </address>
      </author>
      <author initials='R.T.' surname='Fielding' fullname='Roy T. Fielding'>
        <organization abbrev='U.C. Irvine'>University of California, Irvine</organization>
        <address>
          <email>fielding@ics.uci.edu</email>
        </address>
      </author>
      <author initials='L.' surname='Masinter' fullname='Larry Masinter'>
        <organization abbrev='Xerox Corporation'>Xerox PARC</organization>
        <address>
          <email>masinter@parc.xerox.com</email>
        </address>
      </author>
      <date month='August' year='1998' />
    </front>
    <seriesInfo name='RFC' value='2396' />
  </reference>
  <reference anchor="RFC1866">
    <front>
      <title>Hypertext Markup Language - 2.0</title>
      <author initials="T." surname="Berners-Lee" fullname="Tim Berners-Lee">
        <organization>MIT Laboratory for Computer Science</organization>
        <address>
          <email>timbl@w3.org</email>
        </address>
      </author>
      <author initials="D." surname="Connolly" fullname="Daniel W. Connolly">
        <organization>MIT Laboratory for Computer Science, W3 Consortium</organization>
        <address>
          <email>connolly@w3.org</email>
        </address>
      </author>
      <date month="November" year="1995"/>
    </front>
    <seriesInfo name="RFC" value="1866"/>
  </reference>
  <reference anchor="RFC2045">
    <front>
      <title abbrev="Internet Message Bodies">Multipurpose Internet Mail Extensions (MIME) Part One: Format of Internet Message Bodies</title>
      <author initials="N." surname="Freed" fullname="Ned Freed">
        <organization>Innosoft International, Inc.</organization>
        <address><email>ned@innosoft.com</email></address>
      </author>
      <author initials="N.S." surname="Borenstein" fullname="Nathaniel S. Borenstein">
        <organization>First Virtual Holdings</organization>
        <address><email>nsb@nsb.fv.com</email></address>
      </author>
      <date month="November" year="1996"/>
    </front>
    <seriesInfo name="RFC" value="2045"/>
  </reference>
</references>
</back>
</rfc>
