<?xml version='1.0' encoding='US-ASCII'?>
<!DOCTYPE rfc SYSTEM "rfc2629.dtd">

<?rfc toc='yes' ?>
<?rfc editing='no' ?>
<?rfc symrefs='yes' ?>
<?rfc sortrefs='no'?>
<?rfc linkmailto='no'?>
<?rfc compact='yes'?>
<?rfc subcompact="no"?>
<?rfc comments='yes'?>
<?rfc inline='yes'?>

<?xml-stylesheet type='text/xsl' href='rfc2629.xslt' ?>

<rfc number='7995' category='info' submissionType="IAB" consensus="yes">

  <front>
    <title abbrev='PDF for RFCs'>PDF Format for RFCs</title>

    <author initials='T.' surname='Hansen' fullname='Tony Hansen' role='editor'>
      <organization>AT&amp;T Laboratories</organization>
      <address>
        <postal>
          <street>200 Laurel Ave. South</street>
          <city>Middletown</city>
          <region>NJ</region>
          <code>07748</code>
          <country>United States of America</country>
        </postal>
        <email>tony@att.com</email>
      </address>
    </author>

    <author initials='L.' surname='Masinter' fullname='Larry Masinter'>
      <organization>Adobe</organization>
      <address>
        <postal>
          <street>345 Park Ave.</street>
          <city>San Jose</city>
          <region>CA</region>
          <code>95110</code>
          <country>United States of America</country>
        </postal>
        <email>masinter@adobe.com</email>
        <uri>http://larrymasinter.net</uri>
      </address>
    </author>

    <author initials='M.' surname='Hardy' fullname='Matthew Hardy'>
      <organization>Adobe</organization>
      <address>
        <postal>
          <street>345 Park Ave.</street>
          <city>San Jose</city>
          <region>CA</region>
          <code>95110</code>
          <country>United States of America</country>
        </postal>
        <email>mahardy@adobe.com</email>
      </address>
    </author>

    <date month="December" year="2016"/>

    <keyword>Requests for Comment</keyword>

    <abstract>
      <t>
        This document discusses options and requirements for the PDF
        rendering of RFCs in the RFC Series, as outlined in RFC
        6949. It also discusses the use of PDF for Internet-Drafts,
        and available or needed software tools for producing and
        working with PDF.
      </t>
    </abstract>
  </front>
  <middle>
    <section title='Introduction'>
      <t> The RFC Series is evolving, as outlined in
        <xref target='RFC6949'/>. Future documents will use a canonical
        format, XML, with renderings in various formats, including PDF.
      </t>

      <t>Because PDF has a wide range of capabilities and
        alternatives, not all PDFs are "equal".  For example, visually
        similar documents could consist of scanned or rasterized images,
        or include text layout options, hyperlinks, embedded fonts, and
        digital signatures. (See <xref target='APP-PDF'/>
        for a history of PDF.)
      </t>

      <t> This document explains some of the relevant options and
        makes recommendations, for both the RFC Series and Internet-Drafts.
      </t>

      <t> The PDF format and the tools to manipulate it are
        not as well known as those for the other RFC formats, at least
        in the IETF community. This document
        discusses some of the processes for creating and working with PDFs
        using both open source and commercial products.
      </t>

      <t>
        The details described in this document are expected to change based on experience
        gained in implementing the new publication toolsets.
        Revised documents will be published capturing those changes as the
        toolsets are completed.
        Other implementers must not expect those changes to remain backwards-compatible with the
        details described in this document.
      </t>

    </section>

    <section title='Choosing PDF Versions and Standards'>
      <t><xref target="PDF">PDF</xref> has gone through several revisions,
        primarily for the addition of features.  PDF
        features have generally been added in a way that lets older viewers
        "fail gracefully", but even so, the older the PDF version
        produced, the more legacy viewers will support that version
        but the fewer features will be enabled.
      </t>

      <t>As PDF has evolved a broad set of capabilities, additional
        standards for PDF files are applicable. These standards
        establish ground rules that are important for specific
        applications.  For example, PDF/X was specifically designed
        for Prepress digital data exchange, with careful attention
        to color management and printing instructions. The PDF/E 
        standard was designed for engineering documents with dynamic 
        workflows (where a document continues to be revised after 
        publication) and allows interactive media (including
        animation and&nbsp;3D).
      </t>

      <t>Two additional standards families are important to the
        RFC format, though: long-term preservation (PDF/A) and user
        accessibility (<xref target="PDFUA">PDF&wj;/UA</xref>). PDF/A in turn
        has sub-profiles (PDF/A-1, <xref target="PDFA2">PDF/A-2</xref>,
        <xref target="PDFA3">PDF/A-3</xref>), each of which has
        conformance levels. These standards are supported
        by various software libraries and tools.
      </t>

      <t>These standards are an effective way to capture the
        requirements for PDF RFCs, and conforming to them makes the PDF
        files useful in workflows that expect them.
      </t>

      <t>Recommendations:
        <list style="symbols">
          <t>Use PDF 1.7; although relatively recent, it is well
            supported by widely available viewers.
          </t>

          <t>For RFCs, require PDF/A-3 with conformance level "U".
            This captures the archivability and
            long-term stability of PDF 1.7 files, mandatory Unicode mapping
            (Sections&nbsp;14.8.2.4.2 ("Unicode Mapping in Tagged PDF")
            and 9.10.2 ("Mapping Character Codes to Unicode Values")
            of <xref target="PDF"/>), and many of the required features.
          </t>

          <t>Use PDF/A-3 for embedding additional data (including
            the XML source file) in RFCs and Internet-Drafts.
          </t>

          <t>Use PDF/UA for user accessibility.
          </t>
        </list>
      </t>
    </section>

    <section title='Options and Requirements for PDF RFCs'
             anchor='requirements'>
      <t>This section lays out options and requirements for PDFs produced
        by the RFC Editor for RFCs. There are two subsections:
        <xref target="visible-req"/> covers "visible" requirements
        related to how the PDF normally appears when it is viewed with
        a PDF viewer; <xref target="invisible-opt-req"/> covers
        "invisible" options and requirements, which primarily affect
        the ability to process PDFs in other ways but do not ordinarily
        control the way the document appears. (Of course, a viewer
        UI might display processing capabilities, such as showing
        whether a document has been digitally signed.)
      </t>

      <t>In many cases, the choice of PDF requirements is heavily
        influenced by the capabilities of available tools to create
        PDFs. Most of the discussion of tooling is to be found in
        <xref target='tooling'/>.
      </t>

      <section title='"Visible" Requirements' anchor="visible-req">
        <t> PDF supports rich visible layout of fixed-sized pages.
        </t>

        <section title='General Visible Requirements'>
          <t> For a consistent "look" of RFCs and good style, the PDFs
            produced by the RFC Editor should have a clear, consistent,
            identifiable, and easy-to-read style. They should print well
            on the widest range of printers and should look good on
            displays of varying resolution.
          </t>
        </section>

        <section title='Page Size and Margins'>
          <t>PDF files are laid out for a particular size of page and
            margins. There are two paper sizes in common use: "US
            Letter" (8.5x11&nbsp;inches, 216x279 mm, in popular use in
            North America) and "A4" (210x297 mm, 8.27x11.7 inches,
            standard for the rest of the world).  Usually, PDF printing
            software is used in a "shrink to fit" mode where the
            printing is adjusted to fit the paper in the printer.
            There is some controversy over which size to target, but the
            argument that A4 is an international standard is compelling.
            Moreover, if the margins and header positioning are chosen
            appropriately, the document can be printed on either paper
            size without any scaling.
          </t>

          <t><list style="hanging"><t hangText="Recommendation:">
            The Internet-Draft and RFC processors should produce A4 size
            by default.
            However, the margins and header positioning need to be chosen
            to look good on both paper sizes without scaling.
            Following the advice found in <xref target='RFC2346'/>, this
            means that we should use A4 portrait mode with left and right
            margins of 20&nbsp;mm, and top and bottom margins of 33 mm.
          </t></list></t>
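          <t>The arithmetic behind this recommendation can be sketched as
            follows. This is only an illustrative check, not part of any
            tool: it assumes that US Letter is converted exactly from
            8.5x11 inches at 25.4 mm per inch, and the function names
            are hypothetical.
          </t>
          <figure><artwork><![CDATA[
```python
# All dimensions are in millimetres. The A4 margins are the ones recommended
# above; the US Letter size is converted from 8.5 x 11 inches (assumption:
# exact conversion at 25.4 mm per inch, not the rounded 216 x 279 mm).
A4 = (210.0, 297.0)
US_LETTER = (8.5 * 25.4, 11 * 25.4)
SIDE_MARGIN = 20.0       # left and right margin on A4
VERTICAL_MARGIN = 33.0   # top and bottom margin on A4


def text_block(page, side, vertical):
    """Return the (width, height) left after removing the margins."""
    width, height = page
    return (width - 2 * side, height - 2 * vertical)


def centered_margins(page, block):
    """Return the (side, vertical) margins when block is centered on page."""
    return (round((page[0] - block[0]) / 2, 2),
            round((page[1] - block[1]) / 2, 2))


block = text_block(A4, SIDE_MARGIN, VERTICAL_MARGIN)
letter_side, letter_vertical = centered_margins(US_LETTER, block)
print("text block: %.1f x %.1f mm" % block)
print("US Letter margins: %.2f mm side, %.2f mm top/bottom"
      % (letter_side, letter_vertical))
```
]]></artwork></figure>
          <t>Under these assumptions, the text block is 170x231 mm, and
            centering it on US Letter paper leaves margins of roughly
            23 mm at the sides and 24 mm at the top and bottom, so the
            same layout fits both paper sizes without scaling.
          </t>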
        </section>

        <section title='Headers and Footers'>
          <t>
            Page headers and footers are part of the page layout.
            There are a variety of options. Note that page headers and
            footers in PDF can be typeset so that the entire
            (longer) title fits.
          </t>

          <t><list style="hanging"><t hangText="Recommendation:">
            Page headers and footers should contain information similar to
            the headings in the current text versions of
            documents, including page numbers, title, author, and date.
            However, the page headers and footers should be typeset
            so as to be unobtrusive.
            The page headers and footers should be placed into the PDF
            in such a way that they do not interfere with screen readers. 
          </t></list></t>
        </section>

        <section title='Paragraph Numbering'>
          <t>One common feature of the Internet-Draft output formats
            is optional visible paragraph numbers, to aid in discussions.
            In the PDF, and thus in the printed rendition, it is possible
            to make paragraph numbers unobtrusive and even to place them in
            the margins.
          </t>
          <t><list style="hanging"><t hangText="Recommendation:">
            When the XML "editing=yes" option has been chosen,
            show paragraph numbers in the right margin, typeset so
            as to be unobtrusive.
            (The right margin instead of the left margin prevents the paragraph
            numbers from being confused with the section numbers.)
            If possible, the paragraph numbers should be coded in such a way
            that they do not interfere with screen readers.
          </t></list></t>
        </section>

        <section title="Paged Content Layout">
          <t>
            By its nature, PDF is paginated, so pagination issues must
            be considered.
            This is reflected in two areas: running headers and footers, 
            and how text is laid out on a page for optimal reading.
          </t>
          <t>
            <xref target="page-layout"/> describes
            the process of creating a paged document from running text
            such that related material is present
            on the same page together and artifacts of pagination
            don't interfere with easy reading of the document.
          </t>
          <t>Layout engines differ in the quality of the algorithms
            used to automate these processes. In some cases, the
            automated processes require some manual assistance
            to ensure, for example, that a text line intended
            as a heading is "kept" with the text for which it is a heading.
          </t>
          <t>Recommendations:
          <list style='symbols'>
            <t>
            Headers and footers should be printed on each page.
            The information should include the RFC number or Internet-Draft
            name, the page number, the category (e.g., Informational), a
            shortened version of the authors' names,
            the date of the RFC or Internet&nbhy;Draft, and
            the short form of the document title.
            </t>
          <t>
            Choose a layout engine so that
            <list style="symbols">
            <t>manual intervention is minimized</t>
            <t>widow and orphan processing is automatic</t>
            <t>headings and titles are automatically kept with the text that follows them</t>
            </list></t>
          </list>
          </t>
        </section>

        <section title="Typeface Choices">
          <t>
            A PDF may refer to a font by name, or it may use an
            embedded font.  When a font is not embedded, a PDF viewer
            will attempt to locate a locally installed font of the same
            name.  If it cannot find an exact match, it will find a
            "close match".  If a close match is not available, it will
            fall back to something implementation dependent and
            usually undesirable.
          </t>
          <t>
            In addition, the PDF/A standards mandate the embedding
            of fonts. Instead of using additional software to embed the fonts, 
            the software generating the PDF files should produce
            PDF/A-conforming files directly, thus ensuring that all
            glyphs include Unicode mappings and embedded fonts from the outset.
          </t>

          <t>If the HTML version of the document is
            being visually mimicked, the font(s) chosen should have
            both variable-width and constant-width components, as well
            as bold and italic representations.
          </t>
          <t> The typefaces used by Internet-Drafts and by RFCs
            need not be identical.
          </t>
          <t> Few fonts have glyphs for the entire repertoire of
            Unicode characters; for this reason, the PDF generation
            tool may need a set of fonts and a way of choosing them.
            The RFC Editor is defining where Unicode characters
            may be used within RFCs <xref target='RFC7997'/>.
          </t>
          <t>Typefaces are typically licensed, and in many cases there is
            a fee for use by PDF creation tools; however, there is usually
            no fee for display or print of the embedded fonts.
          </t>
          <t> Recommendations:
            <list style='symbols'>
              <t>
                For consistent viewing, all fonts should be embedded.
                The fonts used must be available for use by the IETF community.
                Some discussion of available typefaces can be found in 
                <xref target="typefaces"/>.
              </t>

              <t>
                The choice of typefaces with respect to serif, sans-serif,
                monospace, etc., should follow the recommendations for HTML
                and CSS renderings ("CSS" refers to a Cascading Style Sheet)
                <xref target='RFC7992'/> and  <xref target='RFC7993'/>.
              </t>

              <t>
                The range of Unicode characters allowed in
                the XML source for Internet-Drafts and RFCs may
                be bounded by the availability of embeddable fonts with
                appropriate glyphs <xref target='RFC7997'/>.
              </t>
            </list>
          </t>
        </section>

        <section title='Hyphenation and Line Breaks'>
          <t>When laying out running text,
            especially with narrow page width and long words,
            layout processors of English text often have the
            option of either hyphenating words or using existing
            hyphens as a place to introduce line breaks.
            However, inserting line breaks mid-word can be harmful
            when the "word" is actually a sequence of characters
            representing a protocol element or protocol sequence.
          </t>
          <t>
            <list style="hanging"><t hangText="Recommendation:">
            Avoid introducing hyphenated line breaks mid-word into the
            visual display, consistent with requirements for plain&nbsp;text
            and HTML.
          </t></list></t>
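          <t>As a rough illustration of this recommendation, the sketch
            below wraps text at whitespace only, using Python's standard
            textwrap module. The sample text and the function name are
            made up for this example; a real PDF layout engine works on
            glyph widths rather than character counts.
          </t>
          <figure><artwork><![CDATA[
```python
import textwrap

# Hypothetical sample text; the long tokens stand in for protocol elements.
TEXT = ("Senders include the Content-Transfer-Encoding-Extension header "
        "field and the draft-example-protocol-name-00 label verbatim.")


def wrap_without_splitting(text, width):
    """Wrap text at whitespace only, never inside a word or at a hyphen."""
    return textwrap.wrap(
        text,
        width=width,
        break_long_words=False,   # overlong words stay whole on their own line
        break_on_hyphens=False,   # existing hyphens are not break points
    )


for line in wrap_without_splitting(TEXT, 24):
    print(line)
```
]]></artwork></figure>
          <t>A line holding an overlong token may exceed the requested
            width; that is the intended trade-off, since the token remains
            intact and can be copied or searched for as written.
          </t>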
        </section>

        <section title='Hyperlinks'>
          <t>PDF supports hyperlinks to sections of the same
            document and also to sections of other documents.
          </t>
          <t>The conversion to PDF can generate:
            <list style='symbols'>
              <t>hyperlinks within the document
              </t>
              <t>hyperlinks to other RFCs and Internet-Drafts
              </t>
              <t>hyperlinks to external locations
              </t>
              <t>hyperlinks within a table of contents
              </t>
              <t>
                hyperlinks within an index
              </t>
            </list>
          </t>

          <t>Recommendations:
            <list style='symbols'>
              <t>All hyperlinks available in the HTML rendition of the
                RFC should also be visible and active in the PDF
                produced.
                This includes both internal hyperlinks and hyperlinks to
                external resources.
              </t>
              <t>The table of contents, including page numbers, is useful
                when printed. Section numbers and page numbers in the
                table of contents should also be hyperlinked to their
                respective sections in the body of the document.
              </t>
              <t>As specified in Section 4.8.6.2 ("Referencing RFCs") of
                <xref target="RFC7322"/>, hyperlinks to RFCs from the
                references section should point to the RFC "info" page
                (e.g., &lt;https://www.rfc-editor.org/info/rfc7322&gt;),
                which then links to the various formats available.
              </t>
              <t>Hyperlinks to Internet-Drafts from the
                references section should point to the Datatracker entry
                page for the draft, which then links to the various formats
                available.
              </t>
            </list>
          </t>
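          <t>A minimal sketch of how a converter might build the two kinds
            of reference links follows. The RFC "info" URL pattern is taken
            from the example above; the Datatracker URL pattern and both
            function names are assumptions made for illustration and are
            not specified by this document.
          </t>
          <figure><artwork><![CDATA[
```python
def rfc_info_url(number):
    """Return the RFC Editor "info" page URL for an RFC number."""
    return "https://www.rfc-editor.org/info/rfc%d" % number


def draft_datatracker_url(draft_name):
    """Return a Datatracker entry page URL for an Internet-Draft.

    Assumption: this URL pattern is illustrative and is not given in
    this document.
    """
    return "https://datatracker.ietf.org/doc/%s/" % draft_name


print(rfc_info_url(7322))
print(draft_datatracker_url("draft-example-name"))
```
]]></artwork></figure>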
        </section>

        <section title='Similarity to Other Outputs'>
          <t>There is some advantage to having the PDF files look like
            the text or HTML renderings of the same document.
            Even so, there are several options.
            The PDF
            <list style='numbers'>
              <t>could look like the text version of the document, or
              </t>
              <t>could look like the text version of the document but
                with pictures rendered as pictures instead of using their
                ASCII&nbsp;art equivalent, or
              </t>
              <t>could look like the HTML version.
              </t>
            </list>
          </t>

          <t><list style="hanging"><t hangText="Recommendation:">
            The PDF rendition should look like the
            HTML rendition, at least in spirit.  Some
            differences from the HTML rendition would include different
            typeface and size (chosen for printing), page numbers in the
            table of contents and index, and the use of page headers and
            footers.
          </t></list></t>

          <t>
            Most of the choices used for the renderings per
            <xref target='RFC7992'/> and <xref target='RFC7993'/> are
            thus applicable. See those documents for specifics on the
            rendering of the specific XML elements.
            Some notes:
            <list style="symbols">
              <t>
                Every place in the document that would receive an HTML ID
                would be given an identical PDF named destination.
                In addition, a named destination will be created for each
                page with the form "pg-#", as in "pg-35".
              </t>
              <t>
                No pilcrows are generated or made visible.
              </t>
              <t>
                The table of contents (generated if the XML's &lt;rfc&gt;
                element's tocInclude attribute
                has the value "true") <xref target="RFC7991"/>
                will have the section number linked to the section
                start but will also include a page number that
                is linked to the corresponding page.
                The section title and the page number will be separated by a
                visually appropriate separator, and the page numbers will be
                aligned with each other.
              </t>
              <t>
                The index (generated if the XML's &lt;rfc&gt;
                element's indexInclude attribute has the value "true")
                will have the section number linked to that section's
                named destination but will also include a page number that
                is linked to the page's named destination.
              </t>
              <t>
                The running header in one line (on page 2 and all subsequent
                pages) has the RFC number on the left (RFC NNNN), the (possibly
                shortened form) title centered, and the date (Month Year) on
                the right.
                The text is rendered in a way that is visually unobtrusive.
              </t>
              <t>
                The running footer in one line (on all pages) has the author's
                last name on the left, category centered, and the page number
                on the right ([Page N]).
                The text is rendered in a way that is visually unobtrusive.
              </t>
              <t>
                We should not attempt to replicate in PDF the feature of the
                HTML format that includes a dynamic block that displays
                up-to-date information on updates, obsoletions, and errata.
              </t>
            </list>
          </t>
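          <t>The per-page named destinations and the running header and
            footer fields described in these notes can be sketched as
            below. The function names are hypothetical, and the sketch
            returns only the left, center, and right fields; positioning
            and typesetting are left to the layout engine.
          </t>
          <figure><artwork><![CDATA[
```python
def page_destination(page_number):
    """Name of the per-page named destination, for example "pg-35"."""
    return "pg-%d" % page_number


def running_header(rfc_number, short_title, date):
    """(left, center, right) fields of the header on page 2 and later."""
    return ("RFC %d" % rfc_number, short_title, date)


def running_footer(author_last_name, category, page_number):
    """(left, center, right) fields of the footer on every page."""
    return (author_last_name, category, "[Page %d]" % page_number)


# Sample values taken from this document's own front matter.
print(page_destination(35))
print(running_header(7995, "PDF for RFCs", "December 2016"))
print(running_footer("Hansen", "Informational", 2))
```
]]></artwork></figure>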
        </section>
      </section>

      <section title='"Invisible" Options and Requirements' anchor="invisible-opt-req">
        <t>PDF offers a number of features that improve the utility
          of PDF files in a variety of workflows, at the cost of extra
          effort in the xml2rfc conversion process; the trade-offs may be
          different for the RFC&nbsp;Editor production of RFCs and for
          Internet-Drafts.
        </t>

        <section title='Internal Text Representation' anchor='internal'>
          <t> The contents of a PDF file can be represented in many
            ways.  The PDF file could be generated:
            <list style='symbols'>

              <t>as an image of the visual representation, such as a
                JPEG image of the word "IETF". That is, there might be no
                internal representation of letters, words, or paragraphs
                at all.
              </t>

              <t>by placing individual characters in position on the page,
                such as saying "put an 'F' here," then "put a 'T' before
                it," then "put an 'E' before that," then "put an 'I'
                before that" to render the word "IETF". That is, there might be
                no internal representation of words or paragraphs at all.
              </t>

              <t>by placing words in position on the page, such as keeping
                the characters of the word "IETF" together.
                That is, there might be
                no internal representation of paragraphs at all.
              </t>

              <t>by ensuring that the running order of text in the content
                stream matches the logical reading order. That is, a sentence
                such as "The Internet Engineering Task Force (IETF)
                supports the Internet." would be kept together as a sentence,
                and multiple sentences within a paragraph would be kept together.
              </t>
            </list>
          </t>

          <t>All of these end up with essentially the same visual
            representation of the output.  However, each level has
            trade-offs for auxiliary uses, such as searching or indexing,
            commenting and annotation, and accessibility
            (text-to-speech).
            Keeping the running order of text in the content stream in the
            proper order supports all of these auxiliary uses.
          </t>
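          <t>The difference between visual order and content-stream order
            can be illustrated with a toy model. The sketch below is not a
            PDF parser: the glyph coordinates and function names are
            invented for this example, and each glyph is reduced to a
            horizontal position and a character.
          </t>
          <figure><artwork><![CDATA[
```python
# Each glyph is (x_position, character), listed in content-stream order.
# This mirrors the example above: 'F' is placed first, then 'T' before it,
# then 'E', then 'I'. The coordinates are made up for illustration.
PLACED_BACKWARDS = [(30, "F"), (20, "T"), (10, "E"), (0, "I")]
PLACED_IN_READING_ORDER = [(0, "I"), (10, "E"), (20, "T"), (30, "F")]


def text_in_stream_order(glyphs):
    """What a simple extractor sees: characters in content-stream order."""
    return "".join(char for _, char in glyphs)


def text_in_visual_order(glyphs):
    """What a reader sees on the page: characters sorted left to right."""
    return "".join(char for _, char in sorted(glyphs))


for glyphs in (PLACED_BACKWARDS, PLACED_IN_READING_ORDER):
    print(text_in_visual_order(glyphs), text_in_stream_order(glyphs))
```
]]></artwork></figure>
          <t>Both placements look the same on the page, but only the second
            one yields the word "IETF" to software that reads the content
            stream in order, such as a simple search indexer.
          </t>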

          <t>In addition, the "role map" feature of PDF
            (Section&nbsp;14.7.3 ("Structure Types") of <xref target="PDF"/>)
            would allow for the mapping of the logical tags
            found in the original XML into tags in the PDF.
          </t>

          <t>Recommendations:
            <list style='symbols'>

              <t>Text in content streams should follow the XML
                document's logical order (in the order of tags) to the
                extent possible.  This will provide optimal reuse by
                software that does not understand Tagged&nbsp;PDF.
                (PDF/UA requires this.)
              </t>

              <t>
                It might be possible to use the "role map" annotation
                to capture enough of the xml2rfc source structure,
                to the point where it is possible to reconstruct
                the XML source structure completely. However, there is not
                a compelling case to do so over embedding the original
                XML, as described in <xref target='embedXML' />.
              </t>
            </list>
          </t>
        </section>

        <section title='Unicode Support'>
          <t> PDF itself does not require the use of Unicode.  Text is
            represented as a sequence of glyphs that can then be
            mapped to Unicode.
          </t>
          <t>Recommendations:
            <list style="symbols">
              <t>PDF files generated must have the full text,
                as it appears in the original XML.
              </t>
              <t>Unicode normalization may occur.
              </t>
              <t>Text within SVG images should
                also have Unicode mappings.
              </t>
              <t>Alt-text for images should also support Unicode.
              </t>
            </list>
          </t>
        </section>

        <section title='Image Processing (Artwork)'>
          <t>
            The XML allows both ASCII art and SVG to be used for artwork.
          </t>
          <t>Recommendations:
            <list style="symbols">
              <t>
                If both ASCII art and SVG are available for a picture,
                the SVG artwork should be preferred over the ASCII artwork.
              </t>
              <t>
                ASCII artwork must be rendered using a monospace font.
              </t>
            </list>
          </t>
        </section>

        <section title='Text Description of Images (Alt-Text)'>
          <t>
            Guidelines for the accessibility of PDF 
            &lt;http://www.w3.org/TR/WCAG20&nbhy;TECHS/PDF1.html&gt;
            recommend that images, formulas, and other non-text items
            provide textual alternatives, using the "/Alt" Tag in
            PDF to provide human-readable text that can be vocalized
            by text-to-speech technology.
          </t>
          <t>
            <list style="hanging"><t hangText="Recommendation:">
            Any alt-text for artwork and figures
            available in the XML source should be stored using the PDF
            /Alt property. Internet-Draft authors and the RFC Editor
            should ensure that alt&nbhy;text for all SVG and other images is
            included within the XML source.
          </t></list></t>
        </section>

        <section title='Metadata Support'>
          <t>
            Metadata encodes information about the document authors,
            the document series, date created, etc.  Having this
            metadata within the PDF file allows it to be used by
            search engines, viewers, and other reuse tools.
            PDF supports embedded metadata in a variety of
            ways, including using the Extensible Metadata Platform (XMP)
            <xref target='XMP' />. The RFC Editor maintains metadata about
            an RFC on its info page.
          </t>
          <t>
            <list style="hanging"><t hangText="Recommendation:">
            The PDFs generated should have all of the
            metadata from the XML version embedded directly as XMP
            metadata, including the author, date, the
            document series, and a URL for where the document can be
            retrieved. This information should be consistent with
            the RFC Editor info page at the time of publication.
          </t></list></t>
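          <t>As a rough sketch of what such metadata could look like, the
            code below builds a small XMP packet body using Dublin Core
            properties. The choice of properties (dc:title, dc:creator,
            dc:date, and dc:identifier for the retrieval URL) is an
            assumption made for illustration and is not prescribed by this
            document; the function name is hypothetical, and the xpacket
            wrapper and the step of attaching the packet to a PDF are
            omitted.
          </t>
          <figure><artwork><![CDATA[
```python
import xml.etree.ElementTree as ET

NS = {
    "x": "adobe:ns:meta/",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
}
for prefix, uri in NS.items():
    ET.register_namespace(prefix, uri)

XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def q(prefix, local):
    """Return the qualified tag name ElementTree expects."""
    return "{%s}%s" % (NS[prefix], local)


def build_xmp(title, authors, date, url):
    """Build an XMP metadata body (assumed property mapping, see lead-in)."""
    xmpmeta = ET.Element(q("x", "xmpmeta"))
    rdf = ET.SubElement(xmpmeta, q("rdf", "RDF"))
    desc = ET.SubElement(rdf, q("rdf", "Description"), {q("rdf", "about"): ""})
    # dc:title is a language alternative with an "x-default" entry.
    alt = ET.SubElement(ET.SubElement(desc, q("dc", "title")), q("rdf", "Alt"))
    ET.SubElement(alt, q("rdf", "li"), {XML_LANG: "x-default"}).text = title
    # dc:creator is an ordered list of author names.
    seq = ET.SubElement(ET.SubElement(desc, q("dc", "creator")), q("rdf", "Seq"))
    for name in authors:
        ET.SubElement(seq, q("rdf", "li")).text = name
    # dc:date is an ordered list of dates.
    dates = ET.SubElement(ET.SubElement(desc, q("dc", "date")), q("rdf", "Seq"))
    ET.SubElement(dates, q("rdf", "li")).text = date
    # Assumption: the retrieval URL is stored as dc:identifier.
    ET.SubElement(desc, q("dc", "identifier")).text = url
    return ET.tostring(xmpmeta, encoding="unicode")


print(build_xmp("PDF Format for RFCs", ["T. Hansen", "L. Masinter", "M. Hardy"],
                "2016-12", "https://www.rfc-editor.org/info/rfc7995"))
```
]]></artwork></figure>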
        </section>

        <section title='Document Structure Support'>
          <t>PDF supports an "outline" feature where sections of the
            document are marked; this could be used in addition to the
            table of contents as a navigation aid.
          </t>
          <t>
            The section structure of an RFC can be mapped into the PDF
            elements for the document structure.
            This will allow the
            bookmark feature of PDF readers to be used to quickly access
            sections of the document.
          </t>
          <t>
            <list style="hanging"><t hangText="Recommendation:">
            The section structure of an RFC should be
            mapped into the PDF elements for the document structure.
            This would include section headings for the boilerplate
            sections, such as the Abstract, the Status of This Memo section,
            the table of contents, and the Author's Address section, plus
            the obvious section headings that are normally included in
            the table of contents. If possible, this should be done so
            that the same fragment identifiers used for the HTML version of
            the RFC also work for the PDF version.
          </t></list></t>
        </section>

        <section title='Embedded Files' anchor='embedXML'>
          <t>PDF has the capability of including other files;
            the files may be labeled by both a media type and a
            role (the AFRelationship key <xref target='PDFA3'/>).
            In this way, the PDF file also acts as a container.
          </t>
          <t>Embedded content may be compressed.
          </t>
          <t>Many PDF viewers support the ability to
            view and extract embedded files, although this capability
            is not universal.
          </t>
          <t>Embedding content in the PDF file allows the PDF to act as
            a complete package that can be transformed, archived,
            and digitally signed. 
            (Some sample code illustrating how items can be attached to a 
            PDF file and subsequently extracted can be found
            at &lt;https://github.com/Aiybe/xmptest&gt;.)
            Useful possibilities:

            <list style="symbols">
              <t>Embed the source XML input file itself within the PDF.
                If the source SVG and images for illustrations are also
                embedded, this would make the PDF file totally
                self-contained.
              </t>
              <t>Embed directly extractable components that are useful for
                independent processing, including ABNF, MIBs, and source code
                for reference implementations.  This capability might be
                supported through other mechanisms from the XML source
                files but could also be supported within the PDF.
              </t>
              <t>Finding, extracting, and embedding other components
                may require additional markup to clearly identify them
                and additional review to ensure the correctness of
                embedded files that are not visible.
              </t>
            </list>
          </t>
          <t>
            Recommendations:
            <list style="symbols">
              <t>Embed the XML source and all illustrations,
                for RFCs, as a standard
                feature for xml2rfc's PDF output.
              </t>
              <t>If possible, make this a standard feature for
                Internet-Drafts as&nbsp;well.
              </t>
              <t>
                Named &lt;sourcecode&gt; entries should be embedded.
              </t>
              <t>
                Images (SVG sources, JPEGs, PNGs, etc.) should be embedded.
              </t>
            </list>
          </t>
        </section>

      </section>
      <section title='Digital Signatures'>
        <t>
          The RFC Editor and staff are at times called to provide evidence
          that a particular RFC is the "original" and has not been modified;
          digital signatures can provide that verification.
          As signatures also apply to embedded content, embedding the
          XML source will provide a way of signing the source XML
          that was used to produce the PDF file as well.
        </t>
        <t>
          PDF has supported digital signatures since PDF 1.2, and there
          are multiple methods and options available for signing PDF files.
          The method chosen for the signing of Internet&nbhy;Drafts and
          RFCs will be determined by separate policy.
        </t>

        <t>
         If PDF digital signatures are chosen, the authors suggest the
         following:

        <list style="symbols">
          <t>
          PDF documents generated by the Internet-Draft upload tools
          should be signed with no restrictions on what can be done to
          the documents afterwards.
          </t>

          <t>
          If Internet-Drafts are allowed to be uploaded in PDF form
          by an individual, the signature being added should be set in
          the same way as that noted in the previous paragraph. A PDF that
          would not allow the IETF Secretariat to re&nbhy;sign it in that
          fashion should be rejected.
          </t>
          <t>
          PDF documents generated by the RFC Editor should be signed and
          certified, and restrictions placed on them to only allow
          additional signatures and comments (markup) to be added.
          </t>

	</list>
	</t>
      </section>
    </section>

    <section title="Security Considerations">
    <t>The following security considerations apply:</t>

    <t>Threats:
      <list style="symbols">
      <t>There is a risk that user-submitted Internet-Drafts in PDF might
      contain malware that targets a vulnerability in one of the deployed PDF
      consumers (readers, printers, validation tools, etc.) in use.</t>
      <t>There is a small risk that a PDF production toolset might itself have
      some vulnerability by which it could be tricked into producing
      malware-bearing PDF files.</t>
      <t>Section 7 of <xref target="RFC3778"/> describes some additional
      security considerations for PDF, although this specification is intended
      to avoid features (like scripting) that might trigger some of those
      concerns.</t>
      </list></t>

    <t>Mitigations:
      <list style="symbols">
      <t>The toolsets for producing PDFs need careful security reviews before
      being deployed broadly.</t>
      <t>If users are allowed to submit Internet-Drafts in PDF, such PDF files
      should be examined carefully for conformance to this specification, as
      well as for any known exploits of deployed PDF software.</t>
      </list></t>
    </section>

  </middle>
  <back>

    <references title='Normative References'>
      <reference anchor="PDF">
        <front>
          <title>
            Document management -- Portable document format -- Part 1: PDF 1.7
          </title>
          <author>
            <organization>ISO</organization>
          </author>
          <date month="" year="2008"/>
        </front>
        <seriesInfo name="ISO" value="32000-1"/>
        <annotation>Also available free from Adobe.</annotation>
      </reference>

      <reference anchor="XMP">
        <front>
          <title>Graphic technology -- Extensible metadata platform (XMP)
          specification -- Part 1: Data model, serialization and core
          properties
          </title>
          <author><organization>ISO</organization></author>
          <date month="" year="2012"/>
        </front>
        <seriesInfo name="ISO" value="16684-1"/>
        <annotation>Not available free, but there
          are a number of descriptive resources, e.g.,
          &lt;http://en.wikipedia.org/wiki/Extensible_Metadata_Platform&gt;.
        </annotation>
      </reference>

      <reference anchor="PDFA2">
        <front>
          <title>Document management -- Electronic document file format for
          long-term preservation -- Part 2: Use of ISO 32000-1 (PDF/A-2)
          </title>
          <author><organization>ISO</organization></author>
          <date month="" year="2011" />
        </front>
        <seriesInfo name="ISO" value="19005-2" />
      </reference>

      <reference anchor="PDFA3">
        <front>
          <title>Document management -- Electronic document file format for
          long-term preservation -- Part 3: Use of ISO 32000-1 with support for
          embedded files (PDF/A-3)
          </title>
          <author><organization>ISO</organization></author>
          <date month="" year="2012" />
        </front>
        <seriesInfo name="ISO" value="19005-3" />
      </reference>

      <reference anchor="PDFUA">
        <front>
          <title>Document management applications -- Electronic document file
          format enhancement for accessibility -- Part&nbsp;1: Use of ISO
          32000-1 (PDF/UA-1)</title>
          <author><organization>ISO</organization></author>
          <date month="" year="2014" />
        </front>
        <seriesInfo name="ISO" value="14289-1" />
      </reference>

      <?rfc include='https://xml2rfc.tools.ietf.org/public/rfc/bibxml/reference.RFC.3778' ?>
    </references>

    <references title='Informative References'>
      <?rfc include='https://xml2rfc.tools.ietf.org/public/rfc/bibxml/reference.RFC.2346' ?>
      <?rfc include='https://xml2rfc.tools.ietf.org/public/rfc/bibxml/reference.RFC.6949' ?>
      <?rfc include='https://xml2rfc.tools.ietf.org/public/rfc/bibxml/reference.RFC.7322' ?>

<!-- draft-iab-xml2rfc (RFC 7991) -->
<reference anchor='RFC7991' target="http://www.rfc-editor.org/info/rfc7991">
<front>
<title>The "xml2rfc" Version 3 Vocabulary</title>
<author initials='P' surname='Hoffman' fullname='Paul E. Hoffman'>
    <organization />
</author>
<date month='December' year='2016' />
</front>
<seriesInfo name="RFC" value="7991"/>
<seriesInfo name="DOI" value="10.17487/RFC7991"/>
</reference>

<!-- draft-iab-rfc-nonascii (RFC 7997) -->
<reference anchor='RFC7997' target="http://www.rfc-editor.org/info/rfc7997">
<front>
<title>The Use of Non-ASCII Characters in RFCs</title>
<author initials='H' surname='Flanagan' fullname='Heather Flanagan' role="editor">
    <organization />
</author>
<date month='December' year='2016' />
</front>
<seriesInfo name="RFC" value="7997"/>
<seriesInfo name="DOI" value="10.17487/RFC7997"/>
</reference>

<!-- draft-iab-rfc-css (RFC 7993) -->
<reference anchor='RFC7993' target="http://www.rfc-editor.org/info/rfc7993">
<front>
<title>Cascading Style Sheets (CSS) Requirements for RFCs</title>
<author initials='H' surname='Flanagan' fullname='Heather Flanagan'>
    <organization />
</author>
<date month='December' year='2016' />
</front>
<seriesInfo name="RFC" value="7993"/>
<seriesInfo name="DOI" value="10.17487/RFC7993"/>
</reference>

<!-- draft-iab-html-rfc (RFC 7992) -->
<reference anchor='RFC7992' target="http://www.rfc-editor.org/info/rfc7992">
<front>
<title>HTML Format for RFCs</title>
<author initials='J' surname='Hildebrand' fullname='Joe Hildebrand' role="editor">
    <organization />
</author>
<author initials='P' surname='Hoffman' fullname='Paul E. Hoffman'>
    <organization />
</author>
<date month='December' year='2016' />
</front>
<seriesInfo name="RFC" value="7992"/>
<seriesInfo name="DOI" value="10.17487/RFC7992"/>
</reference>

<!-- draft-hardy-pdf-mime (IESG Eval.) -->
<reference anchor='APP-PDF'>
<front>
<title>The application/pdf Media Type</title>
<author initials='M' surname='Hardy' fullname='Matthew Hardy'>
    <organization />
</author>
<author initials='L' surname='Masinter' fullname='Larry Masinter'>
    <organization />
</author>
<author initials='D' surname='Markovic' fullname='Dejan Markovic'>
    <organization />
</author>
<author initials='D' surname='Johnson' fullname='Duff Johnson'>
    <organization />
</author>
<author initials='M' surname='Bailey' fullname='Martin Bailey'>
    <organization />
</author>
<date month='September' year='2016' />
</front>
<seriesInfo name='Work in Progress,' value='draft-hardy-pdf-mime-04' />
</reference>

    </references>

    <section title="History and Current Use of PDF with RFCs and Internet-Drafts">
      <t>NOTE: This section is meant as an overview
        to give some background.
      </t>
      <section title="RFCs">
        <t>The RFC Series has for a long time accepted PostScript
          renderings of RFCs, either in addition to or instead of the
          text renderings of those same RFCs.  These have usually been
          produced when there was a complicated figure or mathematics
          within the document.  For example, consider the figures and
          mathematics found in RFCs 1119 and 1142, and compare the
          figures found in the text version of RFC 3550 with those in
          the PostScript version.  The RFC Editor has also provided PDF
          renderings of RFCs. Usually, such a PDF has been a print of the
          text file that does not take advantage of any of the broader
          PDF functionality, unless there was a PostScript version
          of the RFC, which would then be used by the RFC Editor to
          generate the PDF.
        </t>
      </section>
      <section title='Internet-Drafts'>
        <t>In addition to PDFs generated and published by the RFC
          Editor, the IETF tools community has also long supported PDF
          for Internet-Drafts.  Most RFCs start with Internet-Drafts,
          edited by individual authors.  The Internet-Drafts submission
          tool at &lt;https://datatracker.ietf.org/submit/&gt; accepts PDF
          and PostScript files in addition to the (required) text submission
          and (currently optional) XML.  If a PDF wasn't submitted for a
          particular version of an Internet-Draft, the tools would
          generate one from the PostScript, HTML, or text.
        </t>
      </section>
    </section>
    
    <section title="Paged Content Layout Quality" anchor="page-layout">
      <t>The process of creating a paged document from running text
        typically involves ensuring that related material is present
        on the same page together and that artifacts of pagination
        don't interfere with easy reading of the document. Typical
        high-quality layout processors do several things:
        <list style='hanging'>
          <t hangText='Widow and Orphan Management:'>
            Widows and orphans (&lt;https://en.wikipedia.org/wiki/Widows_and_orphans&gt;)
            should be avoided automatically (unless the entire paragraph is only one line).
            Ensure that a page break does not occur after the first
            line of a paragraph (orphans), using slightly longer
            page sizes if necessary.
            Similarly, ensure that a page break does not occur before
            the last line of a paragraph (widows).
          </t>
          <t hangText='Keep Section Heading Contiguous:'>
            Do not insert a page break immediately after a section heading.
            If there isn't room on a page for the first (two)
            lines of a section after the section heading,
            insert a page break before the heading.
          </t>
          <t hangText='Avoid Splitting Artwork:'>
            Figures should not be split from figure titles.
            If possible, keep the figure on the same page
            as the (first) mention of the figure.
          </t>
          <t hangText='Headers for Long Tables after Page Breaks:'>
            Another common option in producing paginated documents
            is to include the column headings of a table if
            the table cannot be displayed on a single page.
            Similarly, tables should not be split from the table titles.
          </t>
          <t hangText='keepWithNext and keepWithPrevious:'>
            The XML attributes "keepWithNext" and "keepWithPrevious"
            should be used and followed whenever possible.
          </t>
          <t hangText='Whitespace Preservation:'>
            The Unicode code points for XML entities such as
            Non-Breaking Space (nbsp) and Non-Breaking Hyphen (nbhy)
            should be honored whenever possible.
          </t>
        </list>
      </t>
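      <t>As a non-normative illustration, several of the rules above
        correspond to standard properties of XSL Formatting Objects
        (XSL&nbhy;FO). The fragment below is a minimal sketch that assumes
        an XSL&nbhy;FO-based toolchain; the heading, paragraph, figure, and
        table content are hypothetical placeholders, and a given formatter
        may implement these properties only partially.
      </t>
      <figure><artwork><![CDATA[
<fo:flow flow-name="xsl-region-body"
         xmlns:fo="http://www.w3.org/1999/XSL/Format">

  <!-- Section heading: kept on the same page as the text after it -->
  <fo:block font-weight="bold"
            keep-with-next.within-page="always">
    1.  Introduction
  </fo:block>

  <!-- Paragraph: at least two lines on each side of a page break -->
  <fo:block orphans="2" widows="2">
    Paragraph text ...
  </fo:block>

  <!-- Figure and its title kept together on one page -->
  <fo:block keep-together.within-page="always">
    <fo:block font-family="monospace">(artwork)</fo:block>
    <fo:block text-align="center">Figure 1: Example</fo:block>
  </fo:block>

  <!-- Long table: header rows are repeated after a page break -->
  <fo:table table-omit-header-at-break="false">
    <fo:table-header>
      <fo:table-row>
        <fo:table-cell><fo:block>Name</fo:block></fo:table-cell>
      </fo:table-row>
    </fo:table-header>
    <fo:table-body>
      <fo:table-row>
        <fo:table-cell><fo:block>value</fo:block></fo:table-cell>
      </fo:table-row>
    </fo:table-body>
  </fo:table>

</fo:flow>
]]></artwork></figure>
      <t>In such a toolchain, the "keepWithNext" and "keepWithPrevious"
        attributes of the XML source could plausibly be translated into
        the "keep-with-next" and "keep-with-previous" properties shown
        here; that mapping is an assumption of this sketch, not a
        requirement of this document.
      </t>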
    </section>

    <section title='Tooling' anchor='tooling'>

      <t>This section discusses tools for viewing, comparing,
        creating, manipulating, and transforming PDF files, including those
        currently in use by the RFC Editor and Internet-Drafts, as well
        as outlining available PDF tools for various processes.
      </t>

      <section title='PDF Viewers'>
        <t>As with most file formats, PDF files are experienced through
        a reader or viewer of PDF files. For most of the common
        platforms in use (iOS, OS X, Windows, Android, ChromeOS,
        Kindle) and for most browsers (Edge, Safari, Chrome,
        Firefox), PDF viewing is built in. In addition, there are
        many PDF viewers available for download and installation.
        </t>

        <t>PDF viewers vary in capabilities, and it is important to note
          which PDF viewers support the features utilized in PDF RFCs and
          Internet&nbhy;Drafts (features such as links, digital signatures,
          Tagged PDF, and others mentioned in
          <xref target='requirements'/>).
        </t>

      </section>
      <section title='Printers'>
        <t>Almost all viewers support the printing of
          PDF files, and printing is one of the most important use
          cases for PDFs. Some printers also have direct PDF
          support.
        </t>
        </t>
      </section>

      <section title='PDF Generation Libraries'>
        <t>Because the xml2rfc format is a unique format,
          software for converting XML source documents to
          the various formats will be needed, including
          PDF generation.
        </t>
        <t>
          One promising direction is suggested in
          &lt;http://greenbytes.de/tech/webdav/rfc2629xslt/rfc2629xslt.html#output.pdf.fop&gt;:
          using XSLT (Extensible Stylesheet Language Transformations)
          to generate XSL&nbhy;FO (XSL Formatting Objects);
          XSL-FO is then processed by a FOP (Formatting Objects Processor)
          such as Apache FOP.
        </t>
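        <t>
          To make this pipeline concrete, the following is a minimal
          sketch of an XSLT stylesheet that turns the section structure of
          an xml2rfc source file into XSL&nbhy;FO. It is not the stylesheet
          referenced above; the page geometry, the master name "page", and
          the flattening of each paragraph to plain text are simplifying
          assumptions made only for illustration.
        </t>
        <figure><artwork><![CDATA[
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:fo="http://www.w3.org/1999/XSL/Format">

  <!-- Root: one page master and one page sequence for the body -->
  <xsl:template match="/rfc">
    <fo:root>
      <fo:layout-master-set>
        <fo:simple-page-master master-name="page"
            page-width="8.5in" page-height="11in" margin="1in">
          <fo:region-body/>
        </fo:simple-page-master>
      </fo:layout-master-set>
      <fo:page-sequence master-reference="page">
        <fo:flow flow-name="xsl-region-body">
          <fo:block font-size="16pt" font-weight="bold">
            <xsl:value-of select="front/title"/>
          </fo:block>
          <xsl:apply-templates select="middle/section"/>
        </fo:flow>
      </fo:page-sequence>
    </fo:root>
  </xsl:template>

  <!-- Section: heading block, then paragraphs and subsections -->
  <xsl:template match="section">
    <fo:block font-weight="bold" space-before="1em"
              keep-with-next.within-page="always">
      <xsl:value-of select="@title"/>
    </fo:block>
    <xsl:apply-templates select="t | section"/>
  </xsl:template>

  <!-- Paragraph: flattened to text (lists and xrefs are ignored) -->
  <xsl:template match="t">
    <fo:block space-after="1em">
      <xsl:value-of select="normalize-space(.)"/>
    </fo:block>
  </xsl:template>

</xsl:stylesheet>
]]></artwork></figure>
        <t>
          With Apache FOP, a stylesheet of this kind could be applied
          from the command line as "fop -xml draft.xml -xsl sketch.xslt
          -pdf draft.pdf", where the three file names are hypothetical.
          A production stylesheet would also need to handle lists,
          figures, cross-references, and the other features described in
          this document.
        </t>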
        <t>
          Several libraries are also available for generating PDF signatures.
          The choice of library to use for xml2pdf will depend on many factors:
          programming language, quality of implementation, quality of
          PDF generated, support, cost, availability, 
          and so forth. 

        </t>
      </section>

      <section title='Typefaces' anchor='typefaces'>
        <t>Various typefaces are available that might satisfy the
          requirements of this document. Google's Noto typeface family
          &lt;https://www.google.com/get/noto/&gt; supports a significant
          subset of Unicode and includes fixed-width, serif, and
          sans-serif styles. Another potentially useful set of typefaces
          (without extensive Unicode support, however) includes:
          <list style='symbols'>
            <t>Source Sans Pro &lt;https://en.wikipedia.org/wiki/Source_Sans_Pro&gt;</t>
            <t>Source Serif Pro &lt;https://en.wikipedia.org/wiki/Source_Serif_Pro&gt;</t>
            <t>Source Code Pro &lt;https://en.wikipedia.org/wiki/Source_Code_Pro&gt;</t>
          </list>
          Another font that looks promising for its broad Unicode support is
          Skolar &lt;https://www.rosettatype.com/Skolar&gt;,
          but it requires licensing.
        </t>
      </section>

      <section title='Other Tools'>
        <t>In addition to generating and viewing PDF, other categories of
          PDF tools are available and may be useful both during
          specification development and for published RFCs.
          These include tools for comparing two PDFs, checkers that could
          be used to validate the results of conversion, reviewing and commentary
          tools that attach annotations to PDF files, and digital signature
          creation and validation.
        </t>
        <t>Validation of an arbitrary author-generated PDF file
          would be quite difficult; there are few PDF validation
          tools. However, if RFCs and Internet-Drafts are generated by
          conversion from XML via xml2rfc, then explicit validation of PDF
          and adherence to expected profiles would mainly be useful to ensure
          that xml2rfc has functioned properly.
        </t>
        <t><list style="hanging"><t hangText="Recommendation:">
           Discourage (but allow) submission of a PDF
           representation for Internet-Drafts. In most cases, the PDF for
           an Internet-Draft should be produced automatically when XML
           is submitted, with an opportunity to verify the conversion.
        </t></list></t>
      </section>
     </section>

<section title="IAB Members at the Time of Approval" numbered="false">
<t>
     The IAB members at the time this memo was approved
     were (in&nbsp;alphabetical order):

<?rfc subcompact="yes" ?>
<list>
 <t>Jari Arkko</t>
 <t>Ralph Droms</t>
 <t>Ted Hardie</t>
 <t>Joe Hildebrand</t>
 <t>Russ Housley</t>
 <t>Lee Howard</t>
 <t>Erik Nordmark</t>
 <t>Robert Sparks</t>
 <t>Andrew Sullivan</t>
 <t>Dave Thaler</t>
 <t>Martin Thomson</t>
 <t>Brian Trammell</t>
 <t>Suzanne Woolf</t>

</list>
<?rfc subcompact="no" ?>
</t>
</section>

    <section title='Acknowledgements' numbered="false">
      <t>
        The input of the following people is gratefully acknowledged:
        Nevil Brownlee (ISE), 
        Brian Carpenter,
        Chris Dearlove,
        Martin Duerst,
        Heather Flanagan (RSE),
        Joe Hildebrand,
        Paul Hoffman,
        Duff Johnson,
        Ted Lemon,
        Sean Leonard,
        Henrik Levkowetz,
        Julian Reschke, 
        Adam&nbsp;Roach,
        Leonard Rosenthol,
        Alice Russo,
        Robert Sparks,
        Andrew Sullivan,
        and
        Dave Thaler.
      </t>
    </section>
  </back>
</rfc>
