MPEP § 2413.02 — Form and Format of the XML file containing the “Sequence Listing XML” (Annotated Rules)

§2413.02 Form and Format of the XML file containing the “Sequence Listing XML”

USPTO MPEP version: BlueIron's Update: 2025-12-31

This page consolidates and annotates all enforceable requirements under MPEP § 2413.02, including statutory authority, regulatory rules, examiner guidance, and practice notes. It is provided as guidance, with links to the ground truth sources. This is information only; it is not legal advice.

This section addresses Form and Format of the XML file containing the “Sequence Listing XML”. Primary authority: 37 CFR 1.831 and 37 CFR 1.834. Contains: 3 requirements, 2 permissions, and 2 other statements.

Key Rules

Topic

Sequence Listing Format

5 rules
Statutory · Permitted · Always
[mpep-2413-02-1b510cea2962c2987b3811f7]
XML Format for Sequence Listings Required
Note:
All permitted printable characters (including the space character) and nonprintable (control) characters must be those defined in paragraph 40 of WIPO Standard ST.26.

(a) A “Sequence Listing XML” encoded using Unicode UTF–8, created by any means (e.g., text editors, nucleotide/amino acid sequence editors, or other custom computer programs) in accordance with §§ 1.831 through 1.833, must:

(2) Be in XML format, where all permitted printable characters (including the space character) and nonprintable (control) characters are defined in paragraph 40 of WIPO Standard ST.26 (incorporated by reference, see § 1.839).

37 CFR 1.77 · 37 CFR 1.831 · Sequence Listing Format · Sequence Listing Content · Sequence Listing Requirements
Statutory · Informative · Always
[mpep-2413-02-af9e6a0a2fc48d45820cad0d]
Filename Requirement for Sequence Listing XML
Note:
The filename must be one character or a combination of upper- or lowercase letters, numbers, hyphens, and underscores, not exceeding 60 characters excluding the .xml extension.

(a) A “Sequence Listing XML” encoded using Unicode UTF–8, created by any means (e.g., text editors, nucleotide/amino acid sequence editors, or other custom computer programs) in accordance with §§ 1.831 through 1.833, must:

(3) Be named as *.xml, where “*” is one character or a combination of characters limited to upper- or lowercase letters, numbers, hyphens, and underscores, and the name does not exceed 60 characters in total, excluding the extension.

37 CFR 1.77 · 37 CFR 1.831 · Sequence Listing Format · Sequence Listing Content · Sequence Listing Requirements
Statutory · Permitted · Always
[mpep-2413-02-89903e8b8fcb75dbecaa5365]
File Name for Sequence Listing XML Must Contain Only Letters, Numbers, Hyphens, and Underscores
Note:
The file name for the Sequence Listing XML must consist of letters, numbers, hyphens, and underscores, with a maximum length of 60 characters excluding the .xml extension.

(a) A “Sequence Listing XML” encoded using Unicode UTF–8, created by any means (e.g., text editors, nucleotide/amino acid sequence editors, or other custom computer programs) in accordance with §§ 1.831 through 1.833, must:

No spaces or other types of characters are permitted in the file name.
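
As a rough illustration, the filename rule quoted above can be checked mechanically. The following Python sketch is not an official USPTO validator. The function name is hypothetical, and it assumes that "letters" and "numbers" mean ASCII A-Z, a-z, and 0-9 and that the extension is written in lowercase as .xml.

```python
import re

# Stem (the part before ".xml"): 1 to 60 characters, each a letter, digit,
# hyphen, or underscore. Assumption: only ASCII letters and digits count.
_STEM_PATTERN = re.compile(r"[A-Za-z0-9_-]{1,60}")


def is_valid_sequence_listing_filename(filename: str) -> bool:
    """Return True if `filename` has the form *.xml under the quoted rule.

    Hypothetical helper: it checks the extension, the stem length
    (excluding the extension), and the permitted characters.
    """
    if not filename.endswith(".xml"):
        return False
    stem = filename[: -len(".xml")]
    # fullmatch rejects spaces, dots, and any other character in the stem.
    return _STEM_PATTERN.fullmatch(stem) is not None
```

Under these assumptions, `ABC_123-seq.xml` passes, while `my listing.xml` (contains a space) and any name whose stem is 61 characters long fail.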

37 CFR 1.77 · 37 CFR 1.831 · Sequence Listing Format · Sequence Listing Content · Sequence Listing Requirements
Statutory · Required · Always
[mpep-2413-02-e08767396ff439191eb9934f]
Sequence Listing XML Must Use Unicode UTF-8
Note:
The USPTO requires that all characters in the Sequence Listing XML file be encoded using Unicode UTF-8 for processing.

In order for the USPTO to be able to process the “Sequence Listing XML” .xml file, all characters must be encoded using Unicode UTF-8. The file must be compatible with PC or Mac® computers using one of the following operating systems, MS–DOS®, MS-Windows®, Mac OS®, or Unix®/Linux®. The printable and non-printable characters in the .xml file are defined in paragraph 40 and 41 of WIPO Standard ST.26 (see MPEP § 2413.01(a)) where Annex IV of WIPO Standard ST.26 provides a table of the CHARACTER SUBSET FROM THE UNICODE BASIC LATIN CODE TABLE FOR USE IN AN XML INSTANCE OF A SEQUENCE LISTING.
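
The encoding requirement lends itself to a simple pre-submission check. The Python sketch below, with hypothetical names, first verifies that the file bytes decode as strict UTF-8 and then reports characters outside an assumed permitted set. The set used here (printable Basic Latin U+0020 through U+007E plus tab, line feed, and carriage return) is only an approximation chosen for illustration. It is not a transcription of Annex IV, so the authoritative table in WIPO Standard ST.26 should be consulted before relying on it.

```python
# Assumed permitted set for illustration only: printable Basic Latin
# (U+0020..U+007E) plus tab, LF, and CR. Not copied from ST.26 Annex IV.
ASSUMED_ALLOWED = {chr(c) for c in range(0x20, 0x7F)} | {"\t", "\n", "\r"}


def check_encoding(data: bytes, allowed=ASSUMED_ALLOWED):
    """Return (is_utf8, flagged) for the raw bytes of a file.

    is_utf8 is False when the bytes are not valid UTF-8.
    flagged lists (index, character) pairs for decoded characters
    that fall outside `allowed`.
    """
    try:
        text = data.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        return False, []
    flagged = [(i, ch) for i, ch in enumerate(text) if ch not in allowed]
    return True, flagged
```

A typical call would read the file in binary mode, for example `check_encoding(open(path, "rb").read())`. Flagged characters are reported for manual review against the standard; the sketch does not decide whether a given character is actually impermissible.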

37 CFR 1.834 · Sequence Listing Format · Sequence Listing Requirements · Sequence Listing Content
Statutory · Informative · Always
[mpep-2413-02-bd31bd7a1601849d74e7425c]
Character Subset for Sequence Listing XML Required
Note:
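The printable and non-printable characters permitted in the Sequence Listing XML file are those defined in paragraphs 40 and 41 of WIPO Standard ST.26; Annex IV of the standard tabulates the permitted subset of the Unicode Basic Latin code table.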

In order for the USPTO to be able to process the “Sequence Listing XML” .xml file, all characters must be encoded using Unicode UTF-8. The file must be compatible with PC or Mac® computers using one of the following operating systems, MS–DOS®, MS-Windows®, Mac OS®, or Unix®/Linux®. The printable and non-printable characters in the .xml file are defined in paragraph 40 and 41 of WIPO Standard ST.26 (see MPEP § 2413.01(a)) where Annex IV of WIPO Standard ST.26 provides a table of the CHARACTER SUBSET FROM THE UNICODE BASIC LATIN CODE TABLE FOR USE IN AN XML INSTANCE OF A SEQUENCE LISTING.

37 CFR 1.834 · Sequence Listing Format · Sequence Listing Requirements · Sequence Listing Content
Topic

Sequence Listing Content

3 rules
Statutory · Informative · Always
[mpep-2413-02-8e10f4ae504beeea435be177]
XML File Requirement for Sequence Listings
Note:
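Applications with a filing date (or, for national phase applications, an international filing date) on or after July 1, 2022, that disclose one or more nucleotide and/or amino acid sequences as defined in 37 CFR 1.831(b) must include the sequence listing as an XML file in the specified format.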

[Editor Note: This section is applicable to all applications with a filing date, or, for national phase applications, an international filing date, on or after July 1, 2022, having disclosure of one or more nucleotide and/or amino acid sequences as defined in 37 CFR 1.831(b).]
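
The applicability condition in the Editor Note amounts to a date comparison plus a disclosure test. A minimal Python sketch, with hypothetical function and parameter names, might look like the following. Whether an application actually discloses sequences as defined in 37 CFR 1.831(b) is a legal determination that the sketch simply takes as a boolean input.

```python
from datetime import date
from typing import Optional

# Cutoff stated in the Editor Note.
CUTOFF = date(2022, 7, 1)


def section_applies(
    filing_date: date,
    discloses_sequences: bool,
    international_filing_date: Optional[date] = None,
) -> bool:
    """Return True if the Editor Note's applicability condition is met.

    Assumption: when `international_filing_date` is supplied, the
    application is treated as a national phase application and that
    date is the one compared against the cutoff.
    """
    if international_filing_date is not None:
        controlling = international_filing_date
    else:
        controlling = filing_date
    return discloses_sequences and controlling >= CUTOFF
```

The comparison uses `>=` because the note reads "on or after July 1, 2022", so an application dated exactly on the cutoff is covered.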

37 CFR 1.77 · 37 CFR 1.831(b) · Sequence Listing Content · Sequence Listing Requirements · Sequence Listing Format
Statutory · Required · Always
[mpep-2413-02-0b470e67b0911ac71d98b376]
XML Format for Sequence Listings Required
Note:
Sequence listings submitted as XML files must be encoded in Unicode UTF-8 and meet specific format requirements: computer and operating system compatibility, the character set defined in WIPO Standard ST.26, and a .xml filename restricted in length and permitted characters.
(a) A “Sequence Listing XML” encoded using Unicode UTF–8, created by any means (e.g., text editors, nucleotide/amino acid sequence editors, or other custom computer programs) in accordance with §§ 1.831 through 1.833, must:
  • (1) Have the following compatibilities:
    • (i) Computer compatibility: PC or Mac®; and
    • (ii) Operating system compatibility: MS–DOS®, MS-Windows®, Mac OS®, or Unix®/Linux®.
  • (2) Be in XML format, where all permitted printable characters (including the space character) and nonprintable (control) characters are defined in paragraph 40 of WIPO Standard ST.26 (incorporated by reference, see § 1.839).
  • (3) Be named as *.xml, where “*” is one character or a combination of characters limited to upper- or lowercase letters, numbers, hyphens, and underscores, and the name does not exceed 60 characters in total, excluding the extension. No spaces or other types of characters are permitted in the file name.
37 CFR 1.77 · 37 CFR 1.831 · Sequence Listing Content · Sequence Listing Format · Sequence Listing Requirements
Statutory · Required · Always
[mpep-2413-02-e29a367adfd54f114f42f9ce]
XML File Must Be Compatible with PC or Mac Operating Systems
Note:
The “Sequence Listing XML” file must be compatible with PC or Mac computers running MS-DOS, MS-Windows, Mac OS, or Unix/Linux so that the USPTO can process it.

In order for the USPTO to be able to process the “Sequence Listing XML” .xml file, all characters must be encoded using Unicode UTF-8. The file must be compatible with PC or Mac® computers using one of the following operating systems, MS–DOS®, MS-Windows®, Mac OS®, or Unix®/Linux®. The printable and non-printable characters in the .xml file are defined in paragraph 40 and 41 of WIPO Standard ST.26 (see MPEP § 2413.01(a)) where Annex IV of WIPO Standard ST.26 provides a table of the CHARACTER SUBSET FROM THE UNICODE BASIC LATIN CODE TABLE FOR USE IN AN XML INSTANCE OF A SEQUENCE LISTING.

37 CFR 1.834 · Sequence Listing Content · Sequence Listing Format · Sequence Listing Requirements

Citations

Primary topic | Citation
Sequence Listing Content, Sequence Listing Format | 37 CFR § 1.831
Sequence Listing Content | 37 CFR § 1.831(b)
Sequence Listing Content, Sequence Listing Format | 37 CFR § 1.839
Sequence Listing Content, Sequence Listing Format | MPEP § 2413.01(a)

Source Text from USPTO’s MPEP

This is an exact copy of the MPEP from the USPTO. It is here for your reference to see the section in context.
