MPEP § 2412.05(d) — Representation and Symbols of Amino Acid Sequence Data (Annotated Rules)

§2412.05(d) Representation and Symbols of Amino Acid Sequence Data

USPTO MPEP version: BlueIron's Update: 2025-12-31

This page consolidates and annotates all enforceable requirements under MPEP § 2412.05(d), including statutory authority, regulatory rules, examiner guidance, and practice notes. It is provided as guidance, with links to the ground truth sources. This is information only; it is not legal advice.

This section addresses Representation and Symbols of Amino Acid Sequence Data. Primary authority: 37 CFR 1.831(b) and 37 CFR 1.832. Contains: 9 requirements, 3 prohibitions, 2 guidance statements, and 2 permissions.

Key Rules

Topic

Sequence Listing Content

25 rules
StatutoryInformativeAlways
[mpep-2412-05-d-95111f4962c50b9af8d63a09]
Requirement for Sequence Disclosure
Note:
This section applies to applications filed on or after July 1, 2022 that disclose nucleotide and/or amino acid sequences; such disclosures must follow the XML-based format described in this section.

[Editor Note: This section is applicable to all applications with a filing date, or, for national phase applications, an international filing date, on or after July 1, 2022, having disclosure of one or more nucleotide and/or amino acid sequences as defined in 37 CFR 1.831(b). Formatting representations of XML (eXtensible Markup Language) elements in this section appear different than shown in Standard ST.26, which may be accessed at: www.wipo.int/export/sites/www/standards/en/pdf/03-26-01.pdf.]

Jump to MPEP Source · 37 CFR 1.831(b)Sequence Listing ContentSequence Listing RequirementsSequence Listing Format
StatutoryRequiredAlways
[mpep-2412-05-d-33c54a8bb582f955b7e718aa]
Representation and Symbols for Amino Acid Sequences Must Conform to WIPO Standard ST.26
Note:
The amino acid sequence data must be represented using the symbols and methods specified in WIPO Standard ST.26, including modified and unknown amino acids.
(c) The representation and symbols for amino acid sequence data shall conform to the requirements of paragraphs (c)(1) through (4) of this section.
  • (1) The amino acids in an amino acid sequence must be represented in the manner described in paragraphs 24 and 25 of WIPO Standard ST.26.
  • (2) All amino acids, including modified amino acids and “unknown” amino acids, within an amino acid sequence must be represented using the symbols set forth in paragraphs 26–29 and 32 of WIPO Standard ST.26.
  • (3) Modified amino acids within an amino acid sequence must be described in the manner discussed in paragraphs 29 and 30 of WIPO Standard ST.26.
  • (4) A region containing a known number of contiguous “X” residues for which the same description applies may be jointly described in the manner described in paragraph 34 of WIPO Standard ST.26.
Jump to MPEP Source · 37 CFR 1.832Sequence Listing ContentSequence Listing FormatSequence Listing Requirements
StatutoryRequiredAlways
[mpep-2412-05-d-844d4144e305711a5b77cdba]
Amino Acids Must Conform to WIPO Standard ST.26
Note:
The amino acids in an amino acid sequence must be represented according to the requirements specified in paragraphs 24 and 25 of WIPO Standard ST.26.

(c) The representation and symbols for amino acid sequence data shall conform to the requirements of paragraphs (c)(1) through (4) of this section. (1) The amino acids in an amino acid sequence must be represented in the manner described in paragraphs 24 and 25 of WIPO Standard ST.26.

Jump to MPEP Source · 37 CFR 1.832Sequence Listing ContentSequence Listing FormatSequence Listing Requirements
StatutoryRequiredAlways
[mpep-2412-05-d-6786c5f39e09c97c81df8492]
Symbols for Amino Acids Must Be Used
Note:
All amino acids, including modified and unknown ones, must be represented using specific symbols as defined in WIPO Standard ST.26 paragraphs 26-29 and 32.

(c) The representation and symbols for amino acid sequence data shall conform to the requirements of paragraphs (c)(1) through (4) of this section.

(2) All amino acids, including modified amino acids and “unknown” amino acids, within an amino acid sequence must be represented using the symbols set forth in paragraphs 26–29 and 32 of WIPO Standard ST.26.

Jump to MPEP Source · 37 CFR 1.832Sequence Listing ContentSequence Listing FormatSequence Listing Requirements
StatutoryRequiredAlways
[mpep-2412-05-d-b11bbd15dc3c80f86e802800]
Modified Amino Acids Must Be Described According to WIPO Standard ST.26 Paragraphs 29 and 30
Note:
The representation of modified amino acids in an amino acid sequence must follow the guidelines specified in paragraphs 29 and 30 of WIPO Standard ST.26.

(c) The representation and symbols for amino acid sequence data shall conform to the requirements of paragraphs (c)(1) through (4) of this section.

(3) Modified amino acids within an amino acid sequence must be described in the manner discussed in paragraphs 29 and 30 of WIPO Standard ST.26.

Jump to MPEP Source · 37 CFR 1.832Sequence Listing ContentSequence Listing FormatSequence Listing Requirements
StatutoryRequiredAlways
[mpep-2412-05-d-2cd3f2ee6d9abf71ddb41073]
Amino to Carboxy Direction Required for Amino Acid Sequences
Note:
The amino acids in an amino acid sequence must be represented from the amino end to the carboxy end.

WIPO Standard ST.26, paragraph 24, specifies that the amino acids in an amino acid sequence must be represented in the amino to carboxy direction from left to right. The amino and carboxy groups must not be represented in the sequence.

Jump to MPEP Source · 37 CFR 1.832Sequence Listing ContentSequence Listing FormatSequence Listing Requirements
StatutoryInformativeAlways
[mpep-2412-05-d-12dddc4f3c9a4d8394051888]
First Amino Acid as Residue 1
Note:
The first amino acid in the sequence is residue position number 1, counting any amino acids that precede the mature protein, such as pre-sequences and signal sequences.

WIPO Standard ST.26, paragraph 25, indicates that the first amino acid in the sequence is residue position number 1, including amino acids preceding the mature protein, for example, pre-sequences, pro-sequences, pre-pro-sequences and signal sequences. When an amino acid sequence is circular in configuration and the ring consists solely of amino acid residues linked by peptide bonds, i.e., the sequence has no amino and carboxy termini, applicant must choose the amino acid in residue position number 1. Numbering is continuous through the entire sequence in the amino to carboxy direction.
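
The numbering rule above can be illustrated with a minimal sketch. The function names `number_residues` and `rotate_circular` are hypothetical helpers introduced here for illustration; they are not part of ST.26 or any USPTO tool. The sketch assumes the sequence is already written as a string of one-letter symbols in the amino to carboxy direction.

```python
def number_residues(sequence: str) -> list[tuple[int, str]]:
    """Pair each residue with its position, starting at 1.

    Numbering is continuous from the first residue, so a signal
    sequence or pro-sequence at the start is counted too.
    """
    return list(enumerate(sequence, start=1))


def rotate_circular(sequence: str, chosen_index: int) -> str:
    """Rewrite a circular peptide so the chosen residue is position 1.

    chosen_index is the 0-based index, in the string as first written,
    of the residue the applicant chooses as residue position number 1.
    """
    return sequence[chosen_index:] + sequence[:chosen_index]


if __name__ == "__main__":
    for position, residue in number_residues("MKT"):
        print(position, residue)
    print(rotate_circular("ACDEF", 2))
```

For a circular peptide the choice of starting residue is left to the applicant; the rotation helper only shows that once a residue is chosen, numbering proceeds continuously from it.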

Jump to MPEP Source · 37 CFR 1.832Sequence Listing ContentSequence Listing FormatSequence Listing Requirements
StatutoryRequiredAlways
[mpep-2412-05-d-93d1883e99b68f391ca104f3]
Requirement for Circular Amino Acid Residue Positioning
Note:
When an amino acid sequence is circular and has no amino or carboxy terminus, the applicant must choose which amino acid is residue position number 1.

WIPO Standard ST.26, paragraph 25, indicates that the first amino acid in the sequence is residue position number 1, including amino acids preceding the mature protein, for example, pre-sequences, pro-sequences, pre-pro-sequences and signal sequences. When an amino acid sequence is circular in configuration and the ring consists solely of amino acid residues linked by peptide bonds, i.e., the sequence has no amino and carboxy termini, applicant must choose the amino acid in residue position number 1. Numbering is continuous through the entire sequence in the amino to carboxy direction.

Jump to MPEP Source · 37 CFR 1.832Sequence Listing ContentSequence Listing FormatSequence Listing Requirements
StatutoryRequiredAlways
[mpep-2412-05-d-fb716d718cfbd6baab91dfc1]
Amino Acids Must Use Uppercase Letters
Note:
All amino acids in a sequence must be represented using uppercase letters as per the list of symbols provided.

WIPO Standard ST.26, paragraph 26, specifies that all amino acids in a sequence must be represented using the symbols set forth in Table 3: List of Amino Acids Symbols, in MPEP § 2412.03(a) above. Only uppercase letters must be used. Any symbol used to represent an amino acid is the equivalent of only one residue.
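
A minimal sketch of a symbol check follows. It assumes the permitted set consists of the 22 specific symbols listed elsewhere in this section plus the ambiguity symbols B, Z, J and X. B and X are named on this page; Z and J are an assumption about the contents of Table 3, which is not reproduced here, so the table in MPEP § 2412.03(a) remains the authority. The name `find_invalid_symbols` is hypothetical.

```python
# The 22 specific symbols named in this section's discussion of "X".
SPECIFIC_SYMBOLS = set("ARNDCQEGHILKMFPOSUTWYV")

# Ambiguity symbols. "B" and "X" appear on this page; "Z" and "J" are
# assumed to be in Table 3 and should be checked against that table.
AMBIGUITY_SYMBOLS = set("BZJX")

ALLOWED_SYMBOLS = SPECIFIC_SYMBOLS | AMBIGUITY_SYMBOLS


def find_invalid_symbols(sequence: str) -> list[tuple[int, str]]:
    """Return (1-based position, character) for every disallowed character.

    Lowercase letters fail the check because only uppercase is permitted.
    Each character is treated as exactly one residue.
    """
    return [
        (position, char)
        for position, char in enumerate(sequence, start=1)
        if char not in ALLOWED_SYMBOLS
    ]


if __name__ == "__main__":
    print(find_invalid_symbols("MKTa*"))
```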

Jump to MPEP Source · 37 CFR 1.832Sequence Listing ContentSequence Listing FormatSequence Listing Requirements
StatutoryInformativeAlways
[mpep-2412-05-d-d9254740d24a0e146b64d726]
Amino Acid Symbols Represent Single Residues
Note:
Symbols used to represent amino acids must correspond to a single residue in the sequence.

WIPO Standard ST.26, paragraph 26, specifies that all amino acids in a sequence must be represented using the symbols set forth in Table 3: List of Amino Acids Symbols, in MPEP § 2412.03(a) above. Only uppercase letters must be used. Any symbol used to represent an amino acid is the equivalent of only one residue.

Jump to MPEP Source · 37 CFR 1.832Sequence Listing ContentSequence Listing RequirementsSequence Listing Format
StatutoryRecommendedAlways
[mpep-2412-05-d-12bdb387024fd15cf6da1fd6]
Use Most Restrictive Symbol for Ambiguity
Note:
When an amino acid in a given position could be aspartic acid or asparagine, use the symbol ‘B’ instead of ‘X’.

WIPO Standard ST.26, paragraph 27, indicates that where an ambiguity symbol (representing two or more amino acids in the alternative) is appropriate, the most restrictive symbol should be used, as listed in Table 3: List of Amino Acids Symbols (MPEP § 2412.03(a)). For example, if an amino acid in a given position could be aspartic acid or asparagine, the symbol “B” should be used, rather than “X”. The symbol “X” will be construed as any one of “A”, “R”, “N”, “D”, “C”, “Q”, “E”, “G”, “H”, “I”, “L”, “K”, “M”, “F”, “P”, “O”, “S”, “U”, “T”, “W”, “Y”, or “V”, except where it is used with a further description in the feature table. The symbol “X” must not be used to represent anything other than an amino acid. A single modified or “unknown” amino acid may be represented by the symbol “X”, together with a further description in a feature table (see MPEP § 2413.01(g), subsection I or MPEP § 2412.03(c), for more detail regarding a “feature table”). For representation and inclusion of sequence variants, see MPEP § 2412.05(c). For details of how to represent variants in a “Sequence Listing XML,” see MPEP § 2413.01(g), subsection XI.
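
The "most restrictive symbol" guidance can be sketched as a lookup. The B mapping (aspartic acid or asparagine) comes from the text above; the Z and J mappings are an assumption about Table 3 and should be verified against MPEP § 2412.03(a). The function name `most_restrictive_symbol` is hypothetical.

```python
SPECIFIC_SYMBOLS = frozenset("ARNDCQEGHILKMFPOSUTWYV")

# Two-residue ambiguity symbols. Only "B" is confirmed by this page;
# "Z" and "J" are assumed from Table 3.
AMBIGUITY_BY_SET = {
    frozenset("DN"): "B",  # aspartic acid or asparagine (from the source)
    frozenset("EQ"): "Z",  # assumption: glutamic acid or glutamine
    frozenset("IL"): "J",  # assumption: isoleucine or leucine
}


def most_restrictive_symbol(candidates: str) -> str:
    """Pick the narrowest symbol covering every candidate residue."""
    residues = frozenset(candidates)
    if not residues or not residues <= SPECIFIC_SYMBOLS:
        raise ValueError(f"expected specific amino acid symbols, got {candidates!r}")
    if len(residues) == 1:
        return next(iter(residues))
    # Fall back to "X" when no narrower ambiguity symbol fits.
    return AMBIGUITY_BY_SET.get(residues, "X")


if __name__ == "__main__":
    print(most_restrictive_symbol("DN"))
    print(most_restrictive_symbol("DNE"))
```

When the fallback is "X" for a set narrower than all 22 residues, the sketch does not add the further description in the feature table that the text above calls for; "X" on its own is construed as any of the listed residues.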

Jump to MPEP Source · 37 CFR 1.832Sequence Listing ContentSequence Listing RequirementsSequence Listing Format
StatutoryInformativeAlways
[mpep-2412-05-d-bc57a92b0e29d78c92276d04]
Representation of Amino Acid Sequence Data
Note:
The symbol 'X' represents any amino acid except where further described in a feature table.

WIPO Standard ST.26, paragraph 27, indicates that where an ambiguity symbol (representing two or more amino acids in the alternative) is appropriate, the most restrictive symbol should be used, as listed in Table 3: List of Amino Acids Symbols (MPEP § 2412.03(a)). For example, if an amino acid in a given position could be aspartic acid or asparagine, the symbol “B” should be used, rather than “X”. The symbol “X” will be construed as any one of “A”, “R”, “N”, “D”, “C”, “Q”, “E”, “G”, “H”, “I”, “L”, “K”, “M”, “F”, “P”, “O”, “S”, “U”, “T”, “W”, “Y”, or “V”, except where it is used with a further description in the feature table. The symbol “X” must not be used to represent anything other than an amino acid. A single modified or “unknown” amino acid may be represented by the symbol “X”, together with a further description in a feature table (see MPEP § 2413.01(g), subsection I or MPEP § 2412.03(c), for more detail regarding a “feature table”). For representation and inclusion of sequence variants, see MPEP § 2412.05(c). For details of how to represent variants in a “Sequence Listing XML,” see MPEP § 2413.01(g), subsection XI.

Jump to MPEP Source · 37 CFR 1.832Sequence Listing ContentSequence Listing RequirementsSequence Listing Format
StatutoryProhibitedAlways
[mpep-2412-05-d-03a0599f78291ce4a5e04308]
X Must Represent Only Amino Acids
Note:
The symbol 'X' must not be used to represent anything other than an amino acid.

WIPO Standard ST.26, paragraph 27, indicates that where an ambiguity symbol (representing two or more amino acids in the alternative) is appropriate, the most restrictive symbol should be used, as listed in Table 3: List of Amino Acids Symbols (MPEP § 2412.03(a)). For example, if an amino acid in a given position could be aspartic acid or asparagine, the symbol “B” should be used, rather than “X”. The symbol “X” will be construed as any one of “A”, “R”, “N”, “D”, “C”, “Q”, “E”, “G”, “H”, “I”, “L”, “K”, “M”, “F”, “P”, “O”, “S”, “U”, “T”, “W”, “Y”, or “V”, except where it is used with a further description in the feature table. The symbol “X” must not be used to represent anything other than an amino acid. A single modified or “unknown” amino acid may be represented by the symbol “X”, together with a further description in a feature table (see MPEP § 2413.01(g), subsection I or MPEP § 2412.03(c), for more detail regarding a “feature table”). For representation and inclusion of sequence variants, see MPEP § 2412.05(c). For details of how to represent variants in a “Sequence Listing XML,” see MPEP § 2413.01(g), subsection XI.

Jump to MPEP Source · 37 CFR 1.832Sequence Listing ContentSequence Listing RequirementsSequence Listing Format
StatutoryPermittedAlways
[mpep-2412-05-d-adb5da3eeffe91732656d78d]
Unknown Amino Acid Representation
Note:
An unknown amino acid can be represented by 'X' with a feature table description.

WIPO Standard ST.26, paragraph 27, indicates that where an ambiguity symbol (representing two or more amino acids in the alternative) is appropriate, the most restrictive symbol should be used, as listed in Table 3: List of Amino Acids Symbols (MPEP § 2412.03(a)). For example, if an amino acid in a given position could be aspartic acid or asparagine, the symbol “B” should be used, rather than “X”. The symbol “X” will be construed as any one of “A”, “R”, “N”, “D”, “C”, “Q”, “E”, “G”, “H”, “I”, “L”, “K”, “M”, “F”, “P”, “O”, “S”, “U”, “T”, “W”, “Y”, or “V”, except where it is used with a further description in the feature table. The symbol “X” must not be used to represent anything other than an amino acid. A single modified or “unknown” amino acid may be represented by the symbol “X”, together with a further description in a feature table (see MPEP § 2413.01(g), subsection I or MPEP § 2412.03(c), for more detail regarding a “feature table”). For representation and inclusion of sequence variants, see MPEP § 2412.05(c). For details of how to represent variants in a “Sequence Listing XML,” see MPEP § 2413.01(g), subsection XI.

Jump to MPEP Source · 37 CFR 1.832Sequence Listing ContentSequence Listing RequirementsSequence Listing Format
StatutoryRequiredAlways
[mpep-2412-05-d-176d839fa5d1e40c58f014f6]
Terminator Symbols Must Be Separated for Amino Acid Sequences
Note:
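A disclosed amino acid sequence that contains internal terminator symbols must be split at those symbols, and each resulting segment that contains at least four specifically defined amino acids must be included as a separate sequence with its own sequence identifier.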

WIPO Standard ST.26, paragraph 28, specifies that disclosed amino acid sequences separated by internal terminator symbols, represented for example by “Ter” or asterisk “*” or period “.” or a blank space, must be included as separate sequences for each enumerated amino acid sequence that contains at least four specifically defined amino acids and is encompassed by the description of sequences found in MPEP § 2412.03. Each such separate sequence must be assigned its own sequence identifier (see MPEP § 2412.05(a)). Terminator symbols and spaces must not be included in a sequence contained in a “Sequence Listing XML”.
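
The splitting rule can be sketched as follows. Two assumptions are made and should be checked against ST.26: the disclosed sequence is written in one-letter uppercase symbols, and a "specifically defined" amino acid is any symbol other than "X". The function name `split_on_terminators` is hypothetical, and the sketch does not test whether a segment is otherwise encompassed by MPEP § 2412.03.

```python
import re

# Terminators named in the text: "Ter", asterisk, period, blank space.
# Consecutive terminators are treated as a single break.
TERMINATOR_RUN = re.compile(r"(?:Ter|[*.\s])+")


def split_on_terminators(disclosed: str, minimum_defined: int = 4) -> list[str]:
    """Split at terminators and keep segments that meet the threshold.

    Assumption: a residue counts as specifically defined unless it is "X".
    Returned segments contain no terminator symbols or spaces.
    """
    segments = [part for part in TERMINATOR_RUN.split(disclosed) if part]
    return [
        segment
        for segment in segments
        if sum(symbol != "X" for symbol in segment) >= minimum_defined
    ]


if __name__ == "__main__":
    for segment in split_on_terminators("MKTAYIAK*GXXA.ACDEFX"):
        print(segment)
```

Each returned segment would then receive its own sequence identifier; the middle segment in the usage example is dropped because only two of its residues are specifically defined under the stated assumption.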

Jump to MPEP Source · 37 CFR 1.832Sequence Listing ContentSequence Listing FormatSequence Listing Requirements
StatutoryRequiredAlways
[mpep-2412-05-d-b05856bb123153ff88af5d37]
Unknown Amino Acids Must Be Represented as X
Note:
Any unknown amino acid must be represented by the symbol 'X' and further described in a feature table with the key ‘UNSURE’.

Any “unknown” amino acid must be represented by the symbol “X” in the sequence. An “unknown” amino acid designated as “X” must be further described in a feature table (see MPEP § 2413.01(g), subsection I, for more detail regarding a “feature table”) using the feature key “UNSURE” and optionally the qualifier “note.” The symbol “X” is the equivalent of only one residue (WIPO Standard ST.26, paragraph 32).
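
A hedged sketch of an "UNSURE" feature entry is shown below, built with Python's standard `xml.etree.ElementTree`. The element names (`INSDFeature`, `INSDFeature_key`, `INSDFeature_location`, `INSDFeature_quals`, `INSDQualifier`, `INSDQualifier_name`, `INSDQualifier_value`) are an assumption about the ST.26 XML vocabulary and do not appear on this page; MPEP § 2413.01(g) and the ST.26 DTD are the authority. The helper name `unsure_feature` is hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import Optional


def unsure_feature(position: int, note: Optional[str] = None) -> ET.Element:
    """Build a feature describing an unknown residue written as "X".

    position is the 1-based residue position of the "X".
    The "note" qualifier is optional, as stated in the text above.
    Element names are assumed, not taken from this page.
    """
    feature = ET.Element("INSDFeature")
    ET.SubElement(feature, "INSDFeature_key").text = "UNSURE"
    ET.SubElement(feature, "INSDFeature_location").text = str(position)
    if note is not None:
        qualifiers = ET.SubElement(feature, "INSDFeature_quals")
        qualifier = ET.SubElement(qualifiers, "INSDQualifier")
        ET.SubElement(qualifier, "INSDQualifier_name").text = "note"
        ET.SubElement(qualifier, "INSDQualifier_value").text = note
    return feature


if __name__ == "__main__":
    element = unsure_feature(5, note="residue could not be determined")
    print(ET.tostring(element, encoding="unicode"))
```

The sketch takes the position explicitly rather than scanning for every "X", because an "X" may also stand for a modified amino acid, which is described with a different feature key.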

Jump to MPEP Source · 37 CFR 1.832Sequence Listing ContentSequence Listing FormatSequence Listing Requirements
StatutoryRequiredAlways
[mpep-2412-05-d-510c3ea14f2e4ca32ed27250]
Unknown Amino Acid Must Be Described
Note:
An unknown amino acid designated as 'X' must be further described in a feature table using the key ‘UNSURE’ and optionally with a qualifier ‘note’.

Any “unknown” amino acid must be represented by the symbol “X” in the sequence. An “unknown” amino acid designated as “X” must be further described in a feature table (see MPEP § 2413.01(g), subsection I, for more detail regarding a “feature table”) using the feature key “UNSURE” and optionally the qualifier “note.” The symbol “X” is the equivalent of only one residue (WIPO Standard ST.26, paragraph 32).

Jump to MPEP Source · 37 CFR 1.832Sequence Listing ContentSequence Listing RequirementsSequence Listing Format
StatutoryInformativeAlways
[mpep-2412-05-d-59bd2fab408c9f0b4e35dfea]
Unknown Amino Acid Must Be Represented by X
Note:
An unknown amino acid in a sequence must be represented by the symbol 'X' and described using a feature table with the key 'UNSURE'.

Any “unknown” amino acid must be represented by the symbol “X” in the sequence. An “unknown” amino acid designated as “X” must be further described in a feature table (see MPEP § 2413.01(g), subsection I, for more detail regarding a “feature table”) using the feature key “UNSURE” and optionally the qualifier “note.” The symbol “X” is the equivalent of only one residue (WIPO Standard ST.26, paragraph 32).

Jump to MPEP Source · 37 CFR 1.832Sequence Listing ContentSequence Listing FormatSequence Listing Requirements
StatutoryProhibitedAlways
[mpep-2412-05-d-a6f410aeecd6bcdc8abdf1e6]
Other Amino Acid Must Be Represented by X
Note:
Any modified amino acid not represented by other symbols must be denoted by 'X' in sequence listings.

WIPO Standard ST.26, paragraph 29, specifies that modified amino acids, including D-amino acids, should be represented in the sequence as the corresponding unmodified amino acids whenever possible. Any modified amino acid in a sequence that cannot otherwise be represented by any other symbol in Table 3: List of Amino Acids Symbols (see MPEP § 2412.03(a)), i.e., an “other” amino acid, must be represented by “X”. The symbol “X” is the equivalent of only one residue.

Jump to MPEP Source · 37 CFR 1.832Sequence Listing ContentSequence Listing FormatSequence Listing Requirements
StatutoryInformativeAlways
[mpep-2412-05-d-f34dc3b371029b6630a5eb88]
X Symbol Represents Single Residue
Note:
The symbol 'X' must represent a single residue when an amino acid cannot be represented by any other symbol in the list.

WIPO Standard ST.26, paragraph 29, specifies that modified amino acids, including D-amino acids, should be represented in the sequence as the corresponding unmodified amino acids whenever possible. Any modified amino acid in a sequence that cannot otherwise be represented by any other symbol in Table 3: List of Amino Acids Symbols (see MPEP § 2412.03(a)), i.e., an “other” amino acid, must be represented by “X”. The symbol “X” is the equivalent of only one residue.

Jump to MPEP Source · 37 CFR 1.832Sequence Listing ContentSequence Listing FormatSequence Listing Requirements
StatutoryRequiredAlways
[mpep-2412-05-d-4af58648599c46a961ad52b4]
Modified Amino Acid Must Be Described in Feature Table
Note:
A modified amino acid must be further described using a feature table with appropriate qualifiers and abbreviations as specified.

WIPO Standard ST.26, paragraph 30, provides that a modified amino acid must be further described in a feature table (see MPEP § 2413.01(g), subsection I, for more detail regarding a “feature table”). Where applicable, the feature keys “CARBOHYD” or “LIPID” should be used together with the qualifier “note”. The feature key “MOD_RES” should be used for other post-translationally modified amino acids together with the qualifier “note”. The feature key “SITE” together with the qualifier “note” should be used when the modified amino acid is not a post-translationally modified amino acid. The value for the qualifier “note” must either be an abbreviation set forth in Table 4: List of Modified Amino Acids (see MPEP § 2412.03(c)), above, or the complete, unabbreviated name of the modified amino acid. The abbreviations set forth in Table 4, or the complete, unabbreviated names must not be used in the sequence itself.
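
The choice among the four feature keys can be sketched as a small decision function. The parameter names and the function name `choose_feature_key` are hypothetical; the sketch assumes the caller already knows whether "CARBOHYD" or "LIPID" is applicable and whether the modification is post-translational, since this page does not define those terms further.

```python
def choose_feature_key(
    carbohydrate_applies: bool = False,
    lipid_applies: bool = False,
    post_translational: bool = True,
) -> str:
    """Return the feature key suggested by the guidance above.

    CARBOHYD or LIPID where applicable, MOD_RES for other
    post-translational modifications, SITE otherwise. Every key is
    used together with the qualifier "note".
    """
    if carbohydrate_applies:
        return "CARBOHYD"
    if lipid_applies:
        return "LIPID"
    return "MOD_RES" if post_translational else "SITE"


def describe_modified_residue(position: int, note_value: str, **kwargs) -> dict:
    """Bundle key, location and note for one modified residue.

    note_value must be a Table 4 abbreviation or the complete,
    unabbreviated name; it belongs here, never in the sequence itself.
    """
    return {
        "key": choose_feature_key(**kwargs),
        "location": str(position),
        "note": note_value,
    }


if __name__ == "__main__":
    print(describe_modified_residue(7, "hydroxylysine"))
```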

Jump to MPEP Source · 37 CFR 1.832Sequence Listing ContentSequence Listing FormatSequence Listing Requirements
StatutoryRecommendedAlways
[mpep-2412-05-d-a82a2e492042d9872eb5c241]
Feature Keys for Modified Amino Acids Must Be Used with Note Qualifier
Note:
When describing modified amino acids, the feature keys 'CARBOHYD', 'LIPID', 'MOD_RES', or 'SITE' should be used, as applicable, together with the qualifier 'note'.

WIPO Standard ST.26, paragraph 30, provides that a modified amino acid must be further described in a feature table (see MPEP § 2413.01(g), subsection I, for more detail regarding a “feature table”). Where applicable, the feature keys “CARBOHYD” or “LIPID” should be used together with the qualifier “note”. The feature key “MOD_RES” should be used for other post-translationally modified amino acids together with the qualifier “note”. The feature key “SITE” together with the qualifier “note” should be used when the modified amino acid is not a post-translationally modified amino acid. The value for the qualifier “note” must either be an abbreviation set forth in Table 4: List of Modified Amino Acids (see MPEP § 2412.03(c)), above, or the complete, unabbreviated name of the modified amino acid. The abbreviations set forth in Table 4, or the complete, unabbreviated names must not be used in the sequence itself.

Jump to MPEP Source · 37 CFR 1.832Sequence Listing ContentSequence Listing FormatSequence Listing Requirements
StatutoryRecommendedAlways
[mpep-2412-05-d-714c8e60977dd04fb554503f]
Requirement for Post-Translationally Modified Amino Acids
Note:
Use the feature key 'MOD_RES' with the qualifier 'note' to describe post-translationally modified amino acids in sequence listings.

WIPO Standard ST.26, paragraph 30, provides that a modified amino acid must be further described in a feature table (see MPEP § 2413.01(g), subsection I, for more detail regarding a “feature table”). Where applicable, the feature keys “CARBOHYD” or “LIPID” should be used together with the qualifier “note”. The feature key “MOD_RES” should be used for other post-translationally modified amino acids together with the qualifier “note”. The feature key “SITE” together with the qualifier “note” should be used when the modified amino acid is not a post-translationally modified amino acid. The value for the qualifier “note” must either be an abbreviation set forth in Table 4: List of Modified Amino Acids (see MPEP § 2412.03(c)), above, or the complete, unabbreviated name of the modified amino acid. The abbreviations set forth in Table 4, or the complete, unabbreviated names must not be used in the sequence itself.

Jump to MPEP Source · 37 CFR 1.832Sequence Listing ContentSequence Listing FormatSequence Listing Requirements
StatutoryRecommendedAlways
[mpep-2412-05-d-af144d6140cac25857c520be]
Site Note for Non-Post-Translational Modifications
Note:
Use the feature key 'SITE' with 'note' to describe non-post-translationally modified amino acids.

WIPO Standard ST.26, paragraph 30, provides that a modified amino acid must be further described in a feature table (see MPEP § 2413.01(g), subsection I, for more detail regarding a “feature table”). Where applicable, the feature keys “CARBOHYD” or “LIPID” should be used together with the qualifier “note”. The feature key “MOD_RES” should be used for other post-translationally modified amino acids together with the qualifier “note”. The feature key “SITE” together with the qualifier “note” should be used when the modified amino acid is not a post-translationally modified amino acid. The value for the qualifier “note” must either be an abbreviation set forth in Table 4: List of Modified Amino Acids (see MPEP § 2412.03(c)), above, or the complete, unabbreviated name of the modified amino acid. The abbreviations set forth in Table 4, or the complete, unabbreviated names must not be used in the sequence itself.

Jump to MPEP Source · 37 CFR 1.832Sequence Listing ContentSequence Listing RequirementsSequence Listing Format
StatutoryRequiredAlways
[mpep-2412-05-d-ab7c689690ec9c4624768911]
Note Qualifier for Modified Amino Acids Must Be Abbreviation or Full Name
Note:
The value of the qualifier 'note' must be either an abbreviation from Table 4 or the complete, unabbreviated name of the modified amino acid. Neither the abbreviations nor the full names may be used in the sequence itself.

WIPO Standard ST.26, paragraph 30, provides that a modified amino acid must be further described in a feature table (see MPEP § 2413.01(g), subsection I, for more detail regarding a “feature table”). Where applicable, the feature keys “CARBOHYD” or “LIPID” should be used together with the qualifier “note”. The feature key “MOD_RES” should be used for other post-translationally modified amino acids together with the qualifier “note”. The feature key “SITE” together with the qualifier “note” should be used when the modified amino acid is not a post-translationally modified amino acid. The value for the qualifier “note” must either be an abbreviation set forth in Table 4: List of Modified Amino Acids (see MPEP § 2412.03(c)), above, or the complete, unabbreviated name of the modified amino acid. The abbreviations set forth in Table 4, or the complete, unabbreviated names must not be used in the sequence itself.

Jump to MPEP Source · 37 CFR 1.832Sequence Listing ContentSequence Listing RequirementsSequence Listing Format
Topic

Sequence Listing Format

14 rules
StatutoryPermittedAlways
[mpep-2412-05-d-f86a9af7bd6c977ade894f6b]
XML Representation for Sequence Listings Required
Note:
This rule requires the use of XML formatting for representing nucleotide and amino acid sequence data in applications filed on or after July 1, 2022.

[Editor Note: This section is applicable to all applications with a filing date, or, for national phase applications, an international filing date, on or after July 1, 2022, having disclosure of one or more nucleotide and/or amino acid sequences as defined in 37 CFR 1.831(b). Formatting representations of XML (eXtensible Markup Language) elements in this section appear different than shown in Standard ST.26, which may be accessed at: www.wipo.int/export/sites/www/standards/en/pdf/03-26-01.pdf.]

Jump to MPEP Source · 37 CFR 1.831(b)Sequence Listing FormatSequence Listing RequirementsSequence Listing Content
StatutoryPermittedAlways
[mpep-2412-05-d-8f41046d3a969ff30f1b4cb6]
Joint Description for Contiguous X Residues
Note:
A region containing a known number of contiguous 'X' residues can be jointly described using the method specified in WIPO Standard ST.26 paragraph 34.

(c) The representation and symbols for amino acid sequence data shall conform to the requirements of paragraphs (c)(1) through (4) of this section.

(4) A region containing a known number of contiguous “X” residues for which the same description applies may be jointly described in the manner described in paragraph 34 of WIPO Standard ST.26.
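
As a rough illustration of joint description, the sketch below finds each run of contiguous "X" residues and reports it as a single location. The "start..end" location syntax is an assumption about ST.26 location format and is not stated on this page. The function name `x_run_locations` is hypothetical. The code cannot tell whether the same description really applies to every residue in a run; that judgment stays with the applicant.

```python
import re


def x_run_locations(sequence: str) -> list[str]:
    """Return one location string per run of contiguous "X" residues.

    Positions are 1-based. A single "X" yields "n"; a longer run
    yields "start..end" (assumed location syntax).
    """
    locations = []
    for match in re.finditer(r"X+", sequence):
        start = match.start() + 1  # convert 0-based index to 1-based position
        end = match.end()          # exclusive 0-based end equals inclusive 1-based end
        locations.append(str(start) if start == end else f"{start}..{end}")
    return locations


if __name__ == "__main__":
    print(x_run_locations("MKXXXXATXG"))
```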

Jump to MPEP Source · 37 CFR 1.832Sequence Listing FormatSequence Listing ContentSequence Listing Requirements
StatutoryProhibitedAlways
[mpep-2412-05-d-505d7d2c19ce84a077f6bfee]
Amino to Carboxy Direction Required in Sequence Listing
Note:
The amino acids must be represented from the amino to carboxy direction, without including amino or carboxy groups in the sequence listing.

WIPO Standard ST.26, paragraph 24, specifies that the amino acids in an amino acid sequence must be represented in the amino to carboxy direction from left to right. The amino and carboxy groups must not be represented in the sequence.

Jump to MPEP Source · 37 CFR 1.832Sequence Listing FormatSequence Listing RequirementsSequence Listing Content
StatutoryInformativeAlways
[mpep-2412-05-d-8ee4524a41c97d29b969bec0]
Amino Acid Sequence Numbering Must Be Continuous
Note:
The first amino acid is numbered as position 1, and numbering continues in the amino to carboxy direction without interruption.

WIPO Standard ST.26, paragraph 25, indicates that the first amino acid in the sequence is residue position number 1, including amino acids preceding the mature protein, for example, pre-sequences, pro-sequences, pre-pro-sequences and signal sequences. When an amino acid sequence is circular in configuration and the ring consists solely of amino acid residues linked by peptide bonds, i.e., the sequence has no amino and carboxy termini, applicant must choose the amino acid in residue position number 1. Numbering is continuous through the entire sequence in the amino to carboxy direction.

Jump to MPEP Source · 37 CFR 1.832Sequence Listing FormatSequence Listing RequirementsSequence Listing Content
StatutoryRequiredAlways
[mpep-2412-05-d-b4532b7aefdaf4ec4f286fad]
Amino Acid Sequence Must Use Specified Symbols
Note:
All amino acids in a sequence must be represented using the symbols set forth in Table 3 of MPEP § 2412.03(a), and only uppercase letters are allowed.

WIPO Standard ST.26, paragraph 26, specifies that all amino acids in a sequence must be represented using the symbols set forth in Table 3: List of Amino Acids Symbols, in MPEP § 2412.03(a) above. Only uppercase letters must be used. Any symbol used to represent an amino acid is the equivalent of only one residue.

Jump to MPEP Source · 37 CFR 1.832Sequence Listing FormatSequence Listing RequirementsSequence Listing Content
StatutoryRecommendedAlways
[mpep-2412-05-d-909d2b232b6c228bd20aa6be]
Most Restrictive Symbol for Ambiguity in Amino Acids
Note:
When a single symbol stands for two or more alternative amino acids at a position, use the most restrictive ambiguity symbol listed in Table 3: List of Amino Acids Symbols.

WIPO Standard ST.26, paragraph 27, indicates that where an ambiguity symbol (representing two or more amino acids in the alternative) is appropriate, the most restrictive symbol should be used, as listed in Table 3: List of Amino Acids Symbols (MPEP § 2412.03(a)). For example, if an amino acid in a given position could be aspartic acid or asparagine, the symbol “B” should be used, rather than “X”. The symbol “X” will be construed as any one of “A”, “R”, “N”, “D”, “C”, “Q”, “E”, “G”, “H”, “I”, “L”, “K”, “M”, “F”, “P”, “O”, “S”, “U”, “T”, “W”, “Y”, or “V”, except where it is used with a further description in the feature table. The symbol “X” must not be used to represent anything other than an amino acid. A single modified or “unknown” amino acid may be represented by the symbol “X”, together with a further description in a feature table (see MPEP § 2413.01(g), subsection I or MPEP § 2412.03(c), for more detail regarding a “feature table”). For representation and inclusion of sequence variants, see MPEP § 2412.05(c). For details of how to represent variants in a “Sequence Listing XML,” see MPEP § 2413.01(g), subsection XI.

Jump to MPEP Source · 37 CFR 1.832Sequence Listing FormatSequence Listing ContentSequence Listing Requirements
StatutoryInformativeAlways
[mpep-2412-05-d-a2f2753416588396bc9a7248]
Representation of Sequence Variants
Note:
For the representation and inclusion of sequence variants in amino acid sequences, see MPEP § 2412.05(c).

WIPO Standard ST.26, paragraph 27, indicates that where an ambiguity symbol (representing two or more amino acids in the alternative) is appropriate, the most restrictive symbol should be used, as listed in Table 3: List of Amino Acids Symbols (MPEP § 2412.03(a)). For example, if an amino acid in a given position could be aspartic acid or asparagine, the symbol “B” should be used, rather than “X”. The symbol “X” will be construed as any one of “A”, “R”, “N”, “D”, “C”, “Q”, “E”, “G”, “H”, “I”, “L”, “K”, “M”, “F”, “P”, “O”, “S”, “U”, “T”, “W”, “Y”, or “V”, except where it is used with a further description in the feature table. The symbol “X” must not be used to represent anything other than an amino acid. A single modified or “unknown” amino acid may be represented by the symbol “X”, together with a further description in a feature table (see MPEP § 2413.01(g), subsection I or MPEP § 2412.03(c), for more detail regarding a “feature table”). For representation and inclusion of sequence variants, see MPEP § 2412.05(c). For details of how to represent variants in a “Sequence Listing XML,” see MPEP § 2413.01(g), subsection XI.

Jump to MPEP Source · 37 CFR 1.832 · Sequence Listing Format · Sequence Listing Requirements · Sequence Listing Content
Statutory · Informative · Always
[mpep-2412-05-d-3a3359ab9aa4bc26c3a700e5]
Representation of Sequence Variants in Sequence Listing XML
Note:
This rule points to MPEP § 2413.01(g), subsection XI, for details of how to represent sequence variants in the 'Sequence Listing XML' part of a patent application.

WIPO Standard ST.26, paragraph 27, indicates that where an ambiguity symbol (representing two or more amino acids in the alternative) is appropriate, the most restrictive symbol should be used, as listed in Table 3: List of Amino Acids Symbols (MPEP § 2412.03(a)). For example, if an amino acid in a given position could be aspartic acid or asparagine, the symbol “B” should be used, rather than “X”. The symbol “X” will be construed as any one of “A”, “R”, “N”, “D”, “C”, “Q”, “E”, “G”, “H”, “I”, “L”, “K”, “M”, “F”, “P”, “O”, “S”, “U”, “T”, “W”, “Y”, or “V”, except where it is used with a further description in the feature table. The symbol “X” must not be used to represent anything other than an amino acid. A single modified or “unknown” amino acid may be represented by the symbol “X”, together with a further description in a feature table (see MPEP § 2413.01(g), subsection I or MPEP § 2412.03(c), for more detail regarding a “feature table”). For representation and inclusion of sequence variants, see MPEP § 2412.05(c). For details of how to represent variants in a “Sequence Listing XML,” see MPEP § 2413.01(g), subsection XI.

Jump to MPEP Source · 37 CFR 1.832 · Sequence Listing Format · Sequence Listing Requirements · Sequence Listing Content
Statutory · Prohibited · Always
[mpep-2412-05-d-f88c2439d8a3b51fc34ea7da]
Terminator Symbols Prohibited In Sequence Listing XML
Note:
Amino acid sequences in the 'Sequence Listing XML' must not include terminator symbols such as 'Ter', asterisks, periods, or spaces.

WIPO Standard ST.26, paragraph 28, specifies that disclosed amino acid sequences separated by internal terminator symbols, represented for example by “Ter” or asterisk “*” or period “.” or a blank space, must be included as separate sequences for each enumerated amino acid sequence that contains at least four specifically defined amino acids and is encompassed by the description of sequences found in MPEP § 2412.03. Each such separate sequence must be assigned its own sequence identifier (see MPEP § 2412.05(a)). Terminator symbols and spaces must not be included in a sequence contained in a “Sequence Listing XML”.
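The splitting rule in paragraph 28 can be sketched as follows. This is an illustration under stated assumptions, not a validated tool: the function name `split_on_terminators` is hypothetical, "specifically defined" is taken to mean any symbol other than "X", and the sketch does not check whether a fragment is otherwise encompassed by MPEP § 2412.03.

```python
import re

# Internal terminators named in the text: "Ter", "*", ".", or a blank space.
TERMINATOR = re.compile(r"Ter|[*.\s]")


def split_on_terminators(raw, first_id=1, min_defined=4):
    """Split a disclosed sequence at terminators and number the qualifying pieces.

    Returns (sequence_identifier, sequence) pairs. A piece qualifies when it has
    at least `min_defined` symbols other than "X" (assumed meaning of
    "specifically defined").
    """
    listed = []
    next_id = first_id
    for segment in TERMINATOR.split(raw):
        defined = sum(1 for symbol in segment if symbol != "X")
        if defined >= min_defined:
            listed.append((next_id, segment))
            next_id += 1
    return listed
```

For example, the disclosed string "MKTAYIAK*GAX.ACDEFG HIK" would produce two separate sequences under these assumptions, "MKTAYIAK" and "ACDEFG", each with its own identifier; the fragments "GAX" and "HIK" fall below the four-residue threshold, and no terminator symbol or space survives in the output.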

Jump to MPEP Source · 37 CFR 1.832 · Sequence Listing Format · Sequence Listing Requirements · Sequence Listing Content
Statutory · Recommended · Always
[mpep-2412-05-d-b45a9422fa0667230c22aad9]
Modified Amino Acids Should Be Represented as Unmodified
Note:
WIPO Standard ST.26 provides that modified amino acids, including D-amino acids, should be represented in the sequence as the corresponding unmodified amino acids whenever possible; a modified amino acid that no other Table 3 symbol can represent must be represented by "X".

WIPO Standard ST.26, paragraph 29, specifies that modified amino acids, including D-amino acids, should be represented in the sequence as the corresponding unmodified amino acids whenever possible. Any modified amino acid in a sequence that cannot otherwise be represented by any other symbol in Table 3: List of Amino Acids Symbols (see MPEP § 2412.03(a)), i.e., an “other” amino acid, must be represented by “X”. The symbol “X” is the equivalent of only one residue.
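A minimal sketch of the paragraph 29 substitution follows. The helper names and the input convention (each residue supplied as the one-letter symbol of its unmodified counterpart, or `None` when it has none) are assumptions made for illustration only.

```python
# Symbols of Table 3 other than "X", as listed in the paragraph 27 text plus B, Z, J
# (the latter three are an assumption about Table 3; verify against the table).
NON_X_SYMBOLS = frozenset("ARNDCQEGHILKMFPOSUTWYVBZJ")


def symbol_for_residue(unmodified_symbol):
    """Return the symbol to place in the sequence for one (possibly modified) residue."""
    if unmodified_symbol in NON_X_SYMBOLS:
        return unmodified_symbol
    # No other Table 3 symbol applies: an "other" amino acid is written as "X".
    return "X"


def to_sequence(residues):
    """Build the sequence string; every residue, including each "X", is one symbol."""
    return "".join(symbol_for_residue(r) for r in residues)
```

Under this convention a D-alanine residue would be passed as "A" and stay "A" in the sequence, while a residue with no unmodified counterpart would be passed as `None` and become a single "X". The output length always equals the number of residues, which mirrors the statement that "X" is the equivalent of only one residue.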

Jump to MPEP Source · 37 CFR 1.832 · Sequence Listing Format · Sequence Listing Requirements · Sequence Listing Content
Statutory · Prohibited · Always
[mpep-2412-05-d-1a5de59798563dfea80886c3]
Modified Amino Acid Abbreviations Prohibited in Sequence
Note:
The rule prohibits using either the Table 4 abbreviations or the complete, unabbreviated names of modified amino acids in the sequence itself; they belong in the value of the "note" qualifier in the feature table.

WIPO Standard ST.26, paragraph 30, provides that a modified amino acid must be further described in a feature table (see MPEP § 2413.01(g), subsection I, for more detail regarding a “feature table”). Where applicable, the feature keys “CARBOHYD” or “LIPID” should be used together with the qualifier “note”. The feature key “MOD_RES” should be used for other post-translationally modified amino acids together with the qualifier “note”. The feature key “SITE” together with the qualifier “note” should be used when the modified amino acid is not a post-translationally modified amino acid. The value for the qualifier “note” must either be an abbreviation set forth in Table 4: List of Modified Amino Acids (see MPEP § 2412.03(c)), above, or the complete, unabbreviated name of the modified amino acid. The abbreviations set forth in Table 4, or the complete, unabbreviated names must not be used in the sequence itself.
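The feature-key choice in paragraph 30 can be sketched as below. This is a hedged illustration: the `kind` categories ("glycosylation", "lipidation") are hypothetical labels for when "CARBOHYD" or "LIPID" would apply, and every element name other than INSDFeature_location is recalled from the ST.26 DTD and should be checked against MPEP § 2413.01(g) before use.

```python
import xml.etree.ElementTree as ET


def choose_feature_key(kind, post_translational):
    """Pick a feature key for a modified amino acid (hypothetical classification)."""
    if kind == "glycosylation":
        return "CARBOHYD"
    if kind == "lipidation":
        return "LIPID"
    # Other modifications: MOD_RES if post-translational, otherwise SITE.
    return "MOD_RES" if post_translational else "SITE"


def build_feature(key, location, note):
    """Build one feature element carrying a single "note" qualifier."""
    feature = ET.Element("INSDFeature")
    ET.SubElement(feature, "INSDFeature_key").text = key
    ET.SubElement(feature, "INSDFeature_location").text = location
    quals = ET.SubElement(feature, "INSDFeature_quals")
    qualifier = ET.SubElement(quals, "INSDQualifier")
    ET.SubElement(qualifier, "INSDQualifier_name").text = "note"
    # The value is a Table 4 abbreviation or the complete, unabbreviated name.
    ET.SubElement(qualifier, "INSDQualifier_value").text = note
    return feature
```

A call such as `build_feature(choose_feature_key("other", True), "5", "4-hydroxyproline")` would describe the residue at position 5 in the feature table with the key "MOD_RES" (the example residue and position are invented). The name appears only in the qualifier value, never in the sequence string, which is the point of the prohibition in this rule.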

Jump to MPEP Source · 37 CFR 1.832 · Sequence Listing Format · Sequence Listing Requirements · Sequence Listing Content
Statutory · Permitted · Always
[mpep-2412-05-d-538aa82de9c6ff17f0c9ad3a]
Region Description Using x..y Syntax
Note:
This rule permits a region containing a known number of contiguous "X" residues that share the same description to be described jointly in one feature key, using the 'x..y' syntax as the location descriptor in INSDFeature_location.

WIPO Standard ST.26, paragraph 34, provides that a region containing a known number of contiguous “X” residues for which the same description applies may be jointly described in one feature key using the syntax “x..y” as the location descriptor in the element INSDFeature_location (see MPEP § 2413.01(g) subsections II-III for information regarding “feature keys” and subsection IV, for information regarding INSDFeature_location). For representation and inclusion of sequence variants, see MPEP § 2412.05(c). For details of how to represent variants in a “Sequence Listing XML,” see MPEP § 2413.01(g), subsection XII.
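The "x..y" location descriptor from paragraph 34 can be derived mechanically from a sequence string. The following is a minimal sketch with a hypothetical function name; it assumes 1-based residue positions, and it only locates runs of "X". Whether the same description actually applies to every residue in a run is a judgment the code cannot make.

```python
import re


def x_run_locations(sequence):
    """Return location descriptors for each run of contiguous "X" residues.

    Positions are 1-based. A run of length one is reported as a single
    position, longer runs as "x..y".
    """
    locations = []
    for match in re.finditer(r"X+", sequence):
        start = match.start() + 1  # convert the 0-based index to a 1-based position
        end = match.end()          # the exclusive 0-based end equals the 1-based last position
        locations.append(str(start) if start == end else f"{start}..{end}")
    return locations
```

For the sequence "MKXXXXAGXL", this sketch reports "3..6" for the four contiguous "X" residues and "9" for the isolated one; the first value is what would go in INSDFeature_location if a single feature key describes the whole run.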

Jump to MPEP Source · 37 CFR 1.832 · Sequence Listing Format · Sequence Listing Content · Sequence Listing Requirements
Statutory · Informative · Always
[mpep-2412-05-d-567e9b592db07cce631c753a]
Representation of Sequence Variants Required
Note:
This rule points to MPEP § 2412.05(c) for the representation and inclusion of sequence variants in a patent application's 'Sequence Listing XML'.

WIPO Standard ST.26, paragraph 34, provides that a region containing a known number of contiguous “X” residues for which the same description applies may be jointly described in one feature key using the syntax “x..y” as the location descriptor in the element INSDFeature_location (see MPEP § 2413.01(g) subsections II-III for information regarding “feature keys” and subsection IV, for information regarding INSDFeature_location). For representation and inclusion of sequence variants, see MPEP § 2412.05(c). For details of how to represent variants in a “Sequence Listing XML,” see MPEP § 2413.01(g), subsection XII.

Jump to MPEP Source · 37 CFR 1.832 · Sequence Listing Format · Sequence Listing Requirements · Sequence Listing Content
Statutory · Informative · Always
[mpep-2412-05-d-6f7fb7f4b329c8e7a174aeae]
How to Represent Sequence Variants in Sequence Listing XML
Note:
This rule provides guidance on representing sequence variants in the 'Sequence Listing XML' part of a patent application, as detailed in MPEP § 2413.01(g), subsection XII.

WIPO Standard ST.26, paragraph 34, provides that a region containing a known number of contiguous “X” residues for which the same description applies may be jointly described in one feature key using the syntax “x..y” as the location descriptor in the element INSDFeature_location (see MPEP § 2413.01(g) subsections II-III for information regarding “feature keys” and subsection IV, for information regarding INSDFeature_location). For representation and inclusion of sequence variants, see MPEP § 2412.05(c). For details of how to represent variants in a “Sequence Listing XML,” see MPEP § 2413.01(g), subsection XII.

Jump to MPEP Source · 37 CFR 1.832 · Sequence Listing Format · Sequence Listing Requirements · Sequence Listing Content

Citations

Primary topic · Citation
Sequence Listing Content; Sequence Listing Format · 37 CFR § 1.831(b)
Sequence Listing Content; Sequence Listing Format · MPEP § 2412.03
Sequence Listing Content; Sequence Listing Format · MPEP § 2412.03(a)
Sequence Listing Content; Sequence Listing Format · MPEP § 2412.03(c)
Sequence Listing Content; Sequence Listing Format · MPEP § 2412.05(a)
Sequence Listing Content; Sequence Listing Format · MPEP § 2412.05(c)
Sequence Listing Content; Sequence Listing Format · MPEP § 2413.01(g)

Source Text from USPTO’s MPEP

This is an exact copy of the MPEP from the USPTO. It is here for your reference to see the section in context.
