MPEP § 2412.05(b) — Representation and Symbols of Nucleotide Sequence Data (Annotated Rules)

§2412.05(b) Representation and Symbols of Nucleotide Sequence Data

USPTO MPEP version: BlueIron's Update: 2025-12-31

This page consolidates and annotates all enforceable requirements under MPEP § 2412.05(b), including statutory authority, regulatory rules, examiner guidance, and practice notes. It is provided as guidance, with links to the ground truth sources. This is information only; it is not legal advice.

This section addresses Representation and Symbols of Nucleotide Sequence Data. Primary authority: 37 CFR 1.831(b) and 37 CFR 1.832. Contains: 11 requirements, 1 prohibition, 1 guidance statement, and 4 permissions.

Key Rules

Topic

Sequence Listing Content

27 rules
Statutory · Informative · Always
[mpep-2412-05-b-97de2fe476965dfbaab1329f]
Requirement for Sequence Listing Disclosure
Note:
This section applies to every application with a filing date (or, for national phase applications, an international filing date) on or after July 1, 2022 that discloses one or more nucleotide and/or amino acid sequences as defined in 37 CFR 1.831(b).

[Editor Note: This section is applicable to all applications with a filing date, or, for national phase applications, an international filing date, on or after July 1, 2022, having disclosure of one or more nucleotide and/or amino acid sequences as defined in 37 CFR 1.831(b). Formatting representations of XML (eXtensible Markup Language) elements in this section appear different than shown in Standard ST.26, which may be accessed at: www.wipo.int/export/sites/www/standards/en/pdf/03-26-01.pdf.]

Jump to MPEP Source · 37 CFR 1.831(b) · Sequence Listing Content · Sequence Listing Requirements · Sequence Listing Format
Statutory · Required · Always
[mpep-2412-05-b-42d7c4ec484075c43f0184de]
Representation and Symbols for Nucleotide Sequences Must Conform to WIPO Standard ST.26
Note:
The nucleotide sequence data must be represented using the symbols and methods specified in WIPO Standard ST.26, including nucleotide analogs, modified nucleotides, and unknown residues.
(b) The representation and symbols for nucleotide sequence data shall conform to the requirements of paragraphs (b)(1) through (4) of this section.
  • (1) A nucleotide sequence must be represented in the manner described in paragraphs 11–12 of WIPO Standard ST.26.
  • (2) All nucleotides, including nucleotide analogs, modified nucleotides, and “unknown” nucleotides, within a nucleotide sequence must be represented using the symbols set forth in paragraphs 13–16, 19, and 21 of WIPO Standard ST.26.
  • (3) Modified nucleotides within a nucleotide sequence must be described in the manner discussed in paragraphs 17, 18, and 19 of WIPO Standard ST.26.
  • (4) A region containing a known number of contiguous “a,” “c,” “g,” “t,” or “n” residues for which the same description applies may be jointly described in the manner described in paragraph 22 of WIPO Standard ST.26.
Jump to MPEP Source · 37 CFR 1.832 · Sequence Listing Content · Sequence Listing Format · Sequence Listing Requirements
Statutory · Required · Always
[mpep-2412-05-b-d3b8eb89498cd0dc3fe2ea29]
Nucleotide Sequence Must Conform to WIPO Standard ST.26
Note:
A nucleotide sequence must be represented in the manner described by paragraphs 11-12 of WIPO Standard ST.26.

(b) The representation and symbols for nucleotide sequence data shall conform to the requirements of paragraphs (b)(1) through (4) of this section. (1) A nucleotide sequence must be represented in the manner described in paragraphs 11–12 of WIPO Standard ST.26.

Jump to MPEP Source · 37 CFR 1.832 · Sequence Listing Content · Sequence Listing Format · Sequence Listing Requirements
Statutory · Required · Always
[mpep-2412-05-b-75ab5c2b91f3d4b300f459a2]
Symbols for Nucleotide Sequences Must Conform to WIPO Standard ST.26
Note:
All nucleotides, including analogs and unknowns, must be represented using specific symbols as defined in WIPO Standard ST.26.

(b) The representation and symbols for nucleotide sequence data shall conform to the requirements of paragraphs (b)(1) through (4) of this section.

(2) All nucleotides, including nucleotide analogs, modified nucleotides, and “unknown” nucleotides, within a nucleotide sequence must be represented using the symbols set forth in paragraphs 13–16, 19, and 21 of WIPO Standard ST.26.

Jump to MPEP Source · 37 CFR 1.832 · Sequence Listing Content · Sequence Listing Format · Sequence Listing Requirements
Statutory · Required · Always
[mpep-2412-05-b-2cbb1daad0f14e9c414e44e4]
Modified Nucleotides Must Be Described According to WIPO Standard ST.26
Note:
Nucleotide sequences containing modified nucleotides must follow the description requirements outlined in paragraphs 17, 18, and 19 of WIPO Standard ST.26.

(b) The representation and symbols for nucleotide sequence data shall conform to the requirements of paragraphs (b)(1) through (4) of this section.

(3) Modified nucleotides within a nucleotide sequence must be described in the manner discussed in paragraphs 17, 18, and 19 of WIPO Standard ST.26.

Jump to MPEP Source · 37 CFR 1.832 · Sequence Listing Content · Sequence Listing Format · Sequence Listing Requirements
Statutory · Prohibited · Always
[mpep-2412-05-b-97257b0cc299cf54f28e1852]
Single Strand Representation Required for Nucleotide Sequences
Note:
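A nucleotide sequence must be represented by a single strand in the 5’ to 3’ direction, and the sequence must not include 5’, 3’, or similar designations. A double-stranded sequence whose strands are fully complementary may be presented as a single sequence or as two separate sequences; if the strands are not fully complementary, it must be presented as two separate sequences, each with its own sequence identifier.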
WIPO Standard ST.26, paragraph 11, provides that a nucleotide sequence must be represented only by a single strand, in the 5’ to 3’ direction from left to right, or in the direction from left to right that mimics the 5’ to 3’ direction. The designations 5’ and 3’ or any other similar designations must not be included in the sequence. A double-stranded nucleotide sequence disclosed by enumeration of the residues of both strands must be represented as:
  • (a) a single sequence or as two separate sequences, each assigned its own sequence identifier, where the two separate strands are fully complementary to each other, or
  • (b) two separate sequences, each assigned its own sequence identifier, where the two strands are not fully complementary to each other.
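As a rough illustration of paragraph 11, the following Python sketch strips 5'/3' labels from a raw strand and checks whether two strands, each written 5' to 3', are fully complementary, which determines whether option (a) or option (b) applies. The function names are hypothetical and not part of ST.26, and only the unambiguous symbols a, c, g, and t are handled.

```python
import re

# Watson-Crick pairs for the four unambiguous symbols only (assumption:
# ambiguity symbols and "n" are out of scope for this sketch).
COMPLEMENT = {"a": "t", "c": "g", "g": "c", "t": "a"}


def strip_end_labels(raw: str) -> str:
    """Remove 5'/3' labels and any non-letter characters; return lower case."""
    without_labels = re.sub(r"[53]\s*['’′]", "", raw)
    return re.sub(r"[^A-Za-z]", "", without_labels).lower()


def reverse_complement(seq: str) -> str:
    """Return the complementary strand, also written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def fully_complementary(top: str, bottom: str) -> bool:
    """True when both strands (each given 5' to 3') pair at every position."""
    return len(top) == len(bottom) and reverse_complement(top) == bottom
```

If `fully_complementary` returns False, the strands fall under option (b) and each strand needs its own sequence identifier.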
Jump to MPEP Source · 37 CFR 1.832 · Sequence Listing Content · Sequence Listing Format · Sequence Listing Requirements
Statutory · Informative · Always
[mpep-2412-05-b-a205c156a22dd18591e3f093]
Double-Stranded Sequences That Are Not Fully Complementary Must Be Represented Separately
Note:
Where the two strands of a double-stranded nucleotide sequence are not fully complementary to each other, the sequence must be represented as two separate sequences, each assigned its own sequence identifier.

WIPO Standard ST.26, paragraph 11, provides that a nucleotide sequence must be represented only by a single strand, in the 5’ to 3’ direction from left to right, or in the direction from left to right that mimics the 5’ to 3’ direction. The designations 5’ and 3’ or any other similar designations must not be included in the sequence. A double-stranded nucleotide sequence disclosed by enumeration of the residues of both strands must be represented as:

(b) two separate sequences, each assigned its own sequence identifier, where the two strands are not fully complementary to each other.

Jump to MPEP Source · 37 CFR 1.832 · Sequence Listing Content · Sequence Listing Format · Sequence Listing Requirements
Statutory · Informative · Always
[mpep-2412-05-b-8ab80a36456e24546ff6570b]
First Nucleotide Is Position 1
Note:
The first nucleotide in a sequence must be numbered as position 1, with continuous numbering throughout the sequence.

WIPO Standard ST.26, paragraph 12, provides that the first nucleotide presented in the sequence is residue position number 1. When nucleotide sequences are circular in configuration, applicant must choose the nucleotide in residue position number 1. Numbering is continuous throughout the entire sequence in the 5’ to 3’ direction, or in the direction that mimics the 5’ to 3’ direction. The last residue position number must equal the number of nucleotides in the sequence.
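The numbering rule in paragraph 12 can be sketched in Python as follows. The helper names are hypothetical; the sketch only shows that position numbers start at 1, run continuously, and end at the sequence length, and that a circular sequence can be rewritten so that a chosen nucleotide becomes position 1.

```python
def rotate_circular(seq: str, start_index: int) -> str:
    """Rewrite a circular sequence so the residue at the given 0-based
    index becomes residue position number 1."""
    if not 0 <= start_index < len(seq):
        raise ValueError("start_index is outside the sequence")
    return seq[start_index:] + seq[:start_index]


def number_residues(seq: str) -> list:
    """Pair every residue with its 1-based position number."""
    return list(enumerate(seq, start=1))


numbered = number_residues(rotate_circular("gattaca", 3))
# The last position number equals the number of nucleotides.
assert numbered[-1][0] == len("gattaca")
```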

Jump to MPEP Source · 37 CFR 1.832 · Sequence Listing Content · Sequence Listing Format · Sequence Listing Requirements
Statutory · Required · Always
[mpep-2412-05-b-adef20eda1db670ee941e8b0]
Nucleotide Position Numbering for Circular Sequences
Note:
When a nucleotide sequence is circular, the applicant must choose which nucleotide is residue position number 1.

WIPO Standard ST.26, paragraph 12, provides that the first nucleotide presented in the sequence is residue position number 1. When nucleotide sequences are circular in configuration, applicant must choose the nucleotide in residue position number 1. Numbering is continuous throughout the entire sequence in the 5’ to 3’ direction, or in the direction that mimics the 5’ to 3’ direction. The last residue position number must equal the number of nucleotides in the sequence.

Jump to MPEP Source · 37 CFR 1.832 · Sequence Listing Content · Sequence Listing Requirements · Sequence Listing Format
Statutory · Required · Always
[mpep-2412-05-b-be9e379ad628234302476ead]
Last Residue Position Must Equal Nucleotides Count
Note:
The last residue position number in a nucleotide sequence must match the total count of nucleotides, ensuring proper numbering continuity.

WIPO Standard ST.26, paragraph 12, provides that the first nucleotide presented in the sequence is residue position number 1. When nucleotide sequences are circular in configuration, applicant must choose the nucleotide in residue position number 1. Numbering is continuous throughout the entire sequence in the 5’ to 3’ direction, or in the direction that mimics the 5’ to 3’ direction. The last residue position number must equal the number of nucleotides in the sequence.

Jump to MPEP Source · 37 CFR 1.832 · Sequence Listing Content · Sequence Listing Format · Sequence Listing Requirements
Statutory · Required · Always
[mpep-2412-05-b-5cbf0210c4f933b821a1490f]
Lower-Case Letters for Nucleotides
Note:
All nucleotides in a sequence must be represented using lower-case letters as per WIPO Standard ST.26.

WIPO Standard ST.26, paragraph 13, provides that all nucleotides in a sequence must be represented using the symbols as listed in Table 1: List of Nucleotides Symbols (see MPEP § 2412.03(a)). Only lower-case letters must be used. Any symbol used to represent a nucleotide is the equivalent of only one residue.
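A minimal Python check for the paragraph 13 symbol rule might look like this. The symbol set is an assumption based on the standard IUPAC nucleotide codes, with "t" standing in for "u"; the authoritative list is Table 1 in MPEP § 2412.03(a). The function name is hypothetical.

```python
# Assumed contents of Table 1: IUPAC nucleotide codes in lower case,
# without "u" because "t" also stands for uracil in RNA.
TABLE_1_SYMBOLS = frozenset("acgtmrwsykvhdbn")


def invalid_positions(seq: str) -> list:
    """Return (position, character) pairs that are not allowed symbols.

    Positions are 1-based. Upper-case letters are reported as invalid
    because only lower-case letters may be used.
    """
    return [
        (pos, ch)
        for pos, ch in enumerate(seq, start=1)
        if ch not in TABLE_1_SYMBOLS
    ]
```

Because each symbol is the equivalent of exactly one residue, the length of a valid string is also its residue count.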

Jump to MPEP Source · 37 CFR 1.832 · Sequence Listing Content · Sequence Listing Format · Sequence Listing Requirements
Statutory · Informative · Always
[mpep-2412-05-b-945eccba13a261e726a4de11]
Each Nucleotide Symbol Represents One Residue
Note:
Any symbol used to represent a nucleotide is the equivalent of only one residue.

WIPO Standard ST.26, paragraph 13, provides that all nucleotides in a sequence must be represented using the symbols as listed in Table 1: List of Nucleotides Symbols (see MPEP § 2412.03(a)). Only lower-case letters must be used. Any symbol used to represent a nucleotide is the equivalent of only one residue.

Jump to MPEP Source · 37 CFR 1.832 · Sequence Listing Content · Sequence Listing Requirements · Sequence Listing Format
Statutory · Required · Always
[mpep-2412-05-b-a8b8c502161f91be112dd89e]
Modified Nucleotide Description Required
Note:
Uracil in DNA or thymine in RNA must be described in a feature table as they are considered modified nucleotides.

WIPO Standard ST.26, paragraph 14, sets forth that the symbol “t” will be construed as thymine in deoxyribonucleic acid (DNA) and uracil in ribonucleic acid (RNA). Uracil in DNA or thymine in RNA is considered a modified nucleotide and must be further described in a feature table. See MPEP § 2413.01(g), subsection I for more detail regarding a “feature table.”
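To illustrate paragraph 14, the Python sketch below converts a raw sequence into lower-case symbols with "t" in place of "u" and reports the positions that would need a feature table entry. It assumes the raw input uses "u" for uracil and "t" for thymine (common laboratory notation); the function name and the "DNA"/"RNA" labels are hypothetical.

```python
def normalize_t_symbol(raw: str, molecule: str):
    """Return (sequence, positions needing a feature table description).

    Assumption: in the raw input, "u" means uracil and "t" means thymine.
    Uracil in DNA and thymine in RNA are flagged as modified nucleotides.
    """
    if molecule not in ("DNA", "RNA"):
        raise ValueError("molecule must be 'DNA' or 'RNA'")
    residues = raw.lower()
    # The base that counts as modified depends on the molecule type.
    unexpected = "u" if molecule == "DNA" else "t"
    needs_feature = [
        pos for pos, ch in enumerate(residues, start=1) if ch == unexpected
    ]
    # Every uracil or thymine is written as "t" in the sequence itself.
    return residues.replace("u", "t"), needs_feature
```

For example, `normalize_t_symbol("acgu", "DNA")` yields the sequence `"acgt"` and flags position 4, because uracil in DNA is treated as a modified nucleotide.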

Jump to MPEP Source · 37 CFR 1.832 · Sequence Listing Content · Sequence Listing Requirements · Sequence Listing Format
Statutory · Recommended · Always
[mpep-2412-05-b-5f412b6f5268104ce42e7f3a]
Use Most Restrictive Nucleotide Symbol
Note:
When a nucleotide could be 'a' or 'g', use 'r' instead of 'n'.

WIPO Standard ST.26, paragraph 15, provides that where an ambiguity symbol (representing two or more alternative nucleotides) is appropriate, the most restrictive symbol should be used, as listed in Table 1: List of Nucleotides Symbols (see MPEP § 2412.03(a)). For example, if a nucleotide in a given position could be “a” or “g”, then “r” should be used, rather than “n”. The symbol “n” will be construed as any one of “a”, “c”, “g”, or “t/u” except where it is used with a further description in a feature table. The symbol “n” must not be used to represent anything other than a nucleotide. A single modified or “unknown” nucleotide may be represented by the symbol “n”, together with a further description in a feature table. See MPEP § 2413.01(g), subsection I, for more detail regarding a “feature table.” For representation of sequence variants, i.e., alternatives, deletions, insertions or substitutions relative to a primary sequence, see MPEP § 2412.05(c); and also MPEP § 2413.01(g), subsection XII for information on variants.
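The "most restrictive symbol" recommendation can be sketched as a lookup in Python. The mapping uses the standard IUPAC ambiguity codes, which are assumed here to match Table 1 (see MPEP § 2412.03(a) for the authoritative table); the function name is hypothetical.

```python
# Assumed Table 1 content: each symbol and the bases it can stand for.
SYMBOL_TO_BASES = {
    "a": "a", "c": "c", "g": "g", "t": "t",
    "m": "ac", "r": "ag", "w": "at", "s": "cg", "y": "ct", "k": "gt",
    "v": "acg", "h": "act", "d": "agt", "b": "cgt", "n": "acgt",
}
BASES_TO_SYMBOL = {frozenset(b): s for s, b in SYMBOL_TO_BASES.items()}


def most_restrictive_symbol(alternatives) -> str:
    """Return the symbol that covers exactly the given alternative bases."""
    key = frozenset(alternatives)
    if key not in BASES_TO_SYMBOL:
        raise ValueError("alternatives must be drawn from a, c, g, t")
    return BASES_TO_SYMBOL[key]


# "a" or "g" maps to "r", not to the broader "n".
assert most_restrictive_symbol("ag") == "r"
```

Every non-empty combination of a, c, g, and t has exactly one matching symbol in this mapping, so an exact lookup always returns the narrowest symbol.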

Jump to MPEP Source · 37 CFR 1.832 · Sequence Listing Content · Sequence Listing Requirements · Sequence Listing Format
Statutory · Informative · Always
[mpep-2412-05-b-0163411bb06a7ada8f3cc417]
Nucleotide Symbol Representation
Note:
The symbol 'n' represents any of 'a', 'c', 'g', or 't/u' unless further described in a feature table.

WIPO Standard ST.26, paragraph 15, provides that where an ambiguity symbol (representing two or more alternative nucleotides) is appropriate, the most restrictive symbol should be used, as listed in Table 1: List of Nucleotides Symbols (see MPEP § 2412.03(a)). For example, if a nucleotide in a given position could be “a” or “g”, then “r” should be used, rather than “n”. The symbol “n” will be construed as any one of “a”, “c”, “g”, or “t/u” except where it is used with a further description in a feature table. The symbol “n” must not be used to represent anything other than a nucleotide. A single modified or “unknown” nucleotide may be represented by the symbol “n”, together with a further description in a feature table. See MPEP § 2413.01(g), subsection I, for more detail regarding a “feature table.” For representation of sequence variants, i.e., alternatives, deletions, insertions or substitutions relative to a primary sequence, see MPEP § 2412.05(c); and also MPEP § 2413.01(g), subsection XII for information on variants.

Jump to MPEP Source · 37 CFR 1.832 · Sequence Listing Content · Sequence Listing Format · Sequence Listing Requirements
Statutory · Prohibited · Always
[mpep-2412-05-b-73f34ab64238dff83f78798a]
Symbol 'n' Must Represent Nucleotides Only
Note:
The symbol 'n' must not be used to represent anything other than a nucleotide.

WIPO Standard ST.26, paragraph 15, provides that where an ambiguity symbol (representing two or more alternative nucleotides) is appropriate, the most restrictive symbol should be used, as listed in Table 1: List of Nucleotides Symbols (see MPEP § 2412.03(a)). For example, if a nucleotide in a given position could be “a” or “g”, then “r” should be used, rather than “n”. The symbol “n” will be construed as any one of “a”, “c”, “g”, or “t/u” except where it is used with a further description in a feature table. The symbol “n” must not be used to represent anything other than a nucleotide. A single modified or “unknown” nucleotide may be represented by the symbol “n”, together with a further description in a feature table. See MPEP § 2413.01(g), subsection I, for more detail regarding a “feature table.” For representation of sequence variants, i.e., alternatives, deletions, insertions or substitutions relative to a primary sequence, see MPEP § 2412.05(c); and also MPEP § 2413.01(g), subsection XII for information on variants.

Jump to MPEP Source · 37 CFR 1.832 · Sequence Listing Content · Sequence Listing Requirements · Sequence Listing Format
Statutory · Permitted · Always
[mpep-2412-05-b-0eb88fbf11b2bed4c6374204]
Single Modified or Unknown Nucleotide May Be Represented by 'n' with Feature Table Description
Note:
A single modified or “unknown” nucleotide may be represented by the symbol ‘n’, together with a further description in a feature table. Sequence variants are addressed separately in MPEP § 2412.05(c).

WIPO Standard ST.26, paragraph 15, provides that where an ambiguity symbol (representing two or more alternative nucleotides) is appropriate, the most restrictive symbol should be used, as listed in Table 1: List of Nucleotides Symbols (see MPEP § 2412.03(a)). For example, if a nucleotide in a given position could be “a” or “g”, then “r” should be used, rather than “n”. The symbol “n” will be construed as any one of “a”, “c”, “g”, or “t/u” except where it is used with a further description in a feature table. The symbol “n” must not be used to represent anything other than a nucleotide. A single modified or “unknown” nucleotide may be represented by the symbol “n”, together with a further description in a feature table. See MPEP § 2413.01(g), subsection I, for more detail regarding a “feature table.” For representation of sequence variants, i.e., alternatives, deletions, insertions or substitutions relative to a primary sequence, see MPEP § 2412.05(c); and also MPEP § 2413.01(g), subsection XII for information on variants.

Jump to MPEP Source · 37 CFR 1.832 · Sequence Listing Content · Sequence Listing Requirements · Sequence Listing Format
Statutory · Prohibited · Always
[mpep-2412-05-b-65b08a41469e8c3e0dd97f16]
Modified Nucleotides Should Be Represented as Unmodified Where Possible
Note:
Modified nucleotides in a sequence should be represented by their unmodified counterparts (a, c, g, or t) whenever possible. A modified nucleotide that cannot be represented by any other symbol in Table 1 must be represented by the symbol 'n'.

WIPO Standard ST.26, paragraph 16, sets forth that modified nucleotides should be represented in the sequence as the corresponding unmodified nucleotides, i.e., “a”, “c”, “g” or “t” whenever possible. Any modified nucleotide in a sequence that cannot otherwise be represented by any other symbol in Table 1: List of Nucleotides Symbols (see MPEP § 2412.03(a)), i.e., an “other” nucleotide, such as a non-naturally occurring nucleotide, must be represented by the symbol “n”. The symbol “n” is the equivalent of only one residue.

Jump to MPEP Source · 37 CFR 1.832 · Sequence Listing Content · Sequence Listing Format · Sequence Listing Requirements
Statutory · Informative · Always
[mpep-2412-05-b-afe254ff3d5ed9a512732292]
"Other" Modified Nucleotides Must Be Represented as 'n'
Note:
A modified nucleotide that cannot be represented by any other symbol in Table 1 (an "other" nucleotide, such as a non-naturally occurring nucleotide) must be represented by the symbol 'n', which is the equivalent of only one residue.

WIPO Standard ST.26, paragraph 16, sets forth that modified nucleotides should be represented in the sequence as the corresponding unmodified nucleotides, i.e., “a”, “c”, “g” or “t” whenever possible. Any modified nucleotide in a sequence that cannot otherwise be represented by any other symbol in Table 1: List of Nucleotides Symbols (see MPEP § 2412.03(a)), i.e., an “other” nucleotide, such as a non-naturally occurring nucleotide, must be represented by the symbol “n”. The symbol “n” is the equivalent of only one residue.

Jump to MPEP Source · 37 CFR 1.832 · Sequence Listing Content · Sequence Listing Format · Sequence Listing Requirements
Statutory · Required · Always
[mpep-2412-05-b-ea1d574dcd636979318540b1]
Unknown Nucleotide Must Be Represented by 'n'
Note:
Any unknown nucleotide in a sequence must be represented by the symbol 'n' and should be further described in a feature table using the feature key 'unsure'.

WIPO Standard ST.26, paragraph 21, provides that any “unknown” nucleotide must be represented by the symbol “n” in the sequence. An “unknown” nucleotide should be further described in a feature table using the feature key “unsure”. The symbol “n” is the equivalent of only one residue. See MPEP § 2413.01(g), subsection I, for more detail regarding a “feature table.”
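As a hedged illustration of paragraph 21, the Python sketch below builds a bare feature entry with the feature key "unsure" for one "n" residue. The element names `INSDFeature`, `INSDFeature_key`, and `INSDFeature_location` follow the INSD naming used in ST.26, but this fragment is only an assumption about the minimal shape of such an entry; see MPEP § 2413.01(g) for the actual requirements. The function name is hypothetical.

```python
import xml.etree.ElementTree as ET


def unsure_feature(seq: str, position: int) -> ET.Element:
    """Build a feature element marking one 1-based position as unknown."""
    if not 1 <= position <= len(seq):
        raise ValueError("position is outside the sequence")
    if seq[position - 1] != "n":
        raise ValueError("an unknown nucleotide must be written as 'n'")
    feature = ET.Element("INSDFeature")
    ET.SubElement(feature, "INSDFeature_key").text = "unsure"
    ET.SubElement(feature, "INSDFeature_location").text = str(position)
    return feature


# Serialize the entry for the third residue of "acngt".
xml_text = ET.tostring(unsure_feature("acngt", 3), encoding="unicode")
```

The caller supplies the position explicitly because not every "n" is an unknown nucleotide; without a feature table entry, "n" is construed as any one of a, c, g, or t/u.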

Jump to MPEP Source · 37 CFR 1.832 · Sequence Listing Content · Sequence Listing Format · Sequence Listing Requirements
Statutory · Recommended · Always
[mpep-2412-05-b-df4fc3afa4efe46b1864f8da]
Unknown Nucleotide Should Be Described in Feature Table
Note:
An unknown nucleotide, represented by 'n', should be further described in a feature table using the feature key ‘unsure’.

WIPO Standard ST.26, paragraph 21, provides that any “unknown” nucleotide must be represented by the symbol “n” in the sequence. An “unknown” nucleotide should be further described in a feature table using the feature key “unsure”. The symbol “n” is the equivalent of only one residue. See MPEP § 2413.01(g), subsection I, for more detail regarding a “feature table.”

Jump to MPEP Source · 37 CFR 1.832 · Sequence Listing Content · Sequence Listing Requirements · Sequence Listing Format
Statutory · Informative · Always
[mpep-2412-05-b-0be5f8a800267eb6bc9ddef6]
Unknown Nucleotides Must Be Represented by 'n'
Note:
Any unknown nucleotide in a sequence must be represented by the symbol 'n' and should be further described in a feature table using the feature key 'unsure'.

WIPO Standard ST.26, paragraph 21, provides that any “unknown” nucleotide must be represented by the symbol “n” in the sequence. An “unknown” nucleotide should be further described in a feature table using the feature key “unsure”. The symbol “n” is the equivalent of only one residue. See MPEP § 2413.01(g), subsection I, for more detail regarding a “feature table.”

Jump to MPEP Source · 37 CFR 1.832 · Sequence Listing Content · Sequence Listing Format · Sequence Listing Requirements
Statutory · Required · Always
[mpep-2412-05-b-ae2ff4b903e70f6393433f66]
Requirement for Describing Modified Nucleotides in Feature Table
Note:
A modified nucleotide must be further described in a feature table using the feature key 'modified_base' and the mandatory qualifier 'mod_base', with a single abbreviation from Table 2 as the qualifier value.

WIPO Standard ST.26, paragraph 17, specifies that a modified nucleotide must be further described in a feature table (see MPEP § 2413.01(g), subsection I, for more detail regarding a “feature table”) using the feature key “modified_base” and the mandatory qualifier “mod_base” in conjunction with a single abbreviation from Table 2: List of Modified Nucleotides, below, as the qualifier value. See MPEP § 2413.01(g) subsections II and III, for more information regarding use of a feature key; and MPEP § 2413.01(g) subsections V and VI, for more information regarding use of a qualifier. If the abbreviation is “OTHER”, the complete unabbreviated name of the modified nucleotide must be provided as the value in a “note” qualifier. For a listing of alternative modified nucleotides, the qualifier value “OTHER” may be used in conjunction with a further “note” qualifier. The abbreviations (or full names) provided in Table 2 must not be used in the sequence itself.
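The paragraph 17 rules can be sketched as a small Python helper that assembles a feature description and enforces the "OTHER" rule. The dictionary layout and function name are hypothetical and are not the ST.26 XML format; the abbreviation is not checked against Table 2 here, and the note text in the usage line is a placeholder, not a real chemical name.

```python
def modified_base_feature(position: int, abbreviation: str, note: str = None) -> dict:
    """Describe one modified nucleotide at a 1-based position.

    The feature key is "modified_base" and the mandatory qualifier is
    "mod_base". When the abbreviation is "OTHER", a "note" qualifier
    carrying the complete unabbreviated name is required.
    """
    if abbreviation == "OTHER" and not note:
        raise ValueError("'OTHER' requires the full name in a note qualifier")
    qualifiers = [("mod_base", abbreviation)]
    if note:
        qualifiers.append(("note", note))
    return {
        "key": "modified_base",
        "location": str(position),
        "qualifiers": qualifiers,
    }


entry = modified_base_feature(7, "OTHER", note="<complete unabbreviated name>")
```

The abbreviation and the full name appear only in the feature description; the sequence itself still carries a Table 1 symbol at that position.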

Jump to MPEP Source · 37 CFR 1.832 · Sequence Listing Content · Sequence Listing Format · Sequence Listing Requirements
Statutory · Required · Always
[mpep-2412-05-b-1064a7f773019653287ce1c2]
Complete Name Required for OTHER Modified Nucleotides
Note:
If the abbreviation is 'OTHER', provide the complete unabbreviated name of the modified nucleotide in a note qualifier.

WIPO Standard ST.26, paragraph 17, specifies that a modified nucleotide must be further described in a feature table (see MPEP § 2413.01(g), subsection I, for more detail regarding a “feature table”) using the feature key “modified_base” and the mandatory qualifier “mod_base” in conjunction with a single abbreviation from Table 2: List of Modified Nucleotides, below, as the qualifier value. See MPEP § 2413.01(g) subsections II and III, for more information regarding use of a feature key; and MPEP § 2413.01(g) subsections V and VI, for more information regarding use of a qualifier. If the abbreviation is “OTHER”, the complete unabbreviated name of the modified nucleotide must be provided as the value in a “note” qualifier. For a listing of alternative modified nucleotides, the qualifier value “OTHER” may be used in conjunction with a further “note” qualifier. The abbreviations (or full names) provided in Table 2 must not be used in the sequence itself.

Jump to MPEP Source · 37 CFR 1.832 · Sequence Listing Content · Sequence Listing Requirements · Sequence Listing Format
Statutory · Required · Always
[mpep-2412-05-b-c4361fea720c08169ae5483c]
Modified Nucleotides Must Be Described in Feature Table
Note:
A nucleotide sequence that includes one or more regions of consecutive modified nucleotides sharing the same backbone moiety must be further described in a feature table as required for a modified nucleotide.

WIPO Standard ST.26, paragraph 18, describes that a nucleotide sequence including one or more regions of consecutive modified nucleotides that share the same backbone moiety must be further described in a feature table as required for a modified nucleotide. See MPEP § 2413.01(g), subsection I, for information regarding a feature table and MPEP § 2412.03(e) regarding modified nucleotides. The modified nucleotides of each such region may be jointly described in a single INSDFeature element of a “feature table” as described below. See MPEP § 2413.01(g), subsection I, for information regarding INSDFeature elements of a feature table. The most restrictive unabbreviated chemical name that encompasses all of the modified nucleotides in the range or a list of the chemical names of all the nucleotides in the range must be provided as the value in the “note” qualifier. For example, a glycol nucleic acid sequence containing “a”, “c”, “g”, or “t” nucleobases may be described in the “note” qualifier as “2,3-dihydroxypropyl nucleosides.” Alternatively, the same sequence may be described in the “note” qualifier as “2,3-dihydroxypropyladenine, 2,3-dihydroxypropylthymine, 2,3-dihydroxypropylguanine, or 2,3-dihydroxypropylcytosine.” Where an individual modified nucleotide in the region includes an additional modification, then the modified nucleotide must also be further described in the feature table as required for a modified nucleotide.
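A joint description of a region, as paragraph 18 permits, might be sketched in Python as follows. The sketch assumes the region uses the `modified_base` feature key with a `mod_base` value of `OTHER` and a `start..end` location string; the source says only that the region is described "as required for a modified nucleotide", so these details are assumptions. The dictionary layout and function name are hypothetical.

```python
def region_feature(start: int, end: int, note: str, seq_length: int) -> dict:
    """Jointly describe consecutive modified nucleotides in one feature.

    Positions are 1-based and inclusive. The note carries either the most
    restrictive chemical name covering the whole range or a list of names.
    """
    if not 1 <= start <= end <= seq_length:
        raise ValueError("region must lie within the sequence")
    if not note:
        raise ValueError("a note with the chemical name(s) is required")
    return {
        "key": "modified_base",
        "location": f"{start}..{end}",
        "qualifiers": [("mod_base", "OTHER"), ("note", note)],
    }


# A 12-residue glycol nucleic acid sequence described as a single region.
gna = region_feature(1, 12, "2,3-dihydroxypropyl nucleosides", seq_length=12)
```

A nucleotide inside the region that carries an additional modification would still need its own separate feature entry.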

Jump to MPEP Source · 37 CFR 1.832 · Sequence Listing Content · Sequence Listing Format · Sequence Listing Requirements
Statutory · Required · Always
[mpep-2412-05-b-1f14c1738d27fa09faf41775]
Note Qualifier Must Contain Chemical Names for Modified Nucleotides
Note:
The 'note' qualifier must contain either the most restrictive unabbreviated chemical name that encompasses all of the modified nucleotides in the range or a list of the chemical names of all the nucleotides in the range.

WIPO Standard ST.26, paragraph 18, describes that a nucleotide sequence including one or more regions of consecutive modified nucleotides that share the same backbone moiety must be further described in a feature table as required for a modified nucleotide. See MPEP § 2413.01(g), subsection I, for information regarding a feature table and MPEP § 2412.03(e) regarding modified nucleotides. The modified nucleotides of each such region may be jointly described in a single INSDFeature element of a “feature table” as described below. See MPEP § 2413.01(g), subsection I, for information regarding INSDFeature elements of a feature table. The most restrictive unabbreviated chemical name that encompasses all of the modified nucleotides in the range or a list of the chemical names of all the nucleotides in the range must be provided as the value in the “note” qualifier. For example, a glycol nucleic acid sequence containing “a”, “c”, “g”, or “t” nucleobases may be described in the “note” qualifier as “2,3-dihydroxypropyl nucleosides.” Alternatively, the same sequence may be described in the “note” qualifier as “2,3-dihydroxypropyladenine, 2,3-dihydroxypropylthymine, 2,3-dihydroxypropylguanine, or 2,3-dihydroxypropylcytosine.” Where an individual modified nucleotide in the region includes an additional modification, then the modified nucleotide must also be further described in the feature table as required for a modified nucleotide.

Jump to MPEP Source · 37 CFR 1.832 · Sequence Listing Content · Sequence Listing Format · Sequence Listing Requirements
Statutory · Required · Always
[mpep-2412-05-b-d97d51f198c8dbda3f90c4e4]
Additional Modifications Within a Region Must Be Described in Feature Table
Note:
Where an individual modified nucleotide in a jointly described region includes an additional modification, that nucleotide must also be further described in the feature table as required for a modified nucleotide.

WIPO Standard ST.26, paragraph 18, describes that a nucleotide sequence including one or more regions of consecutive modified nucleotides that share the same backbone moiety must be further described in a feature table as required for a modified nucleotide. See MPEP § 2413.01(g), subsection I, for information regarding a feature table and MPEP § 2412.03(e) regarding modified nucleotides. The modified nucleotides of each such region may be jointly described in a single INSDFeature element of a “feature table” as described below. See MPEP § 2413.01(g), subsection I, for information regarding INSDFeature elements of a feature table. The most restrictive unabbreviated chemical name that encompasses all of the modified nucleotides in the range or a list of the chemical names of all the nucleotides in the range must be provided as the value in the “note” qualifier. For example, a glycol nucleic acid sequence containing “a”, “c”, “g”, or “t” nucleobases may be described in the “note” qualifier as “2,3-dihydroxypropyl nucleosides.” Alternatively, the same sequence may be described in the “note” qualifier as “2,3-dihydroxypropyladenine, 2,3-dihydroxypropylthymine, 2,3-dihydroxypropylguanine, or 2,3-dihydroxypropylcytosine.” Where an individual modified nucleotide in the region includes an additional modification, then the modified nucleotide must also be further described in the feature table as required for a modified nucleotide.

Jump to MPEP Source · 37 CFR 1.832 · Sequence Listing Content · Sequence Listing Requirements · Sequence Listing Format
Topic

Sequence Listing Format

14 rules
Statutory · Permitted · Always
[mpep-2412-05-b-420553017ae13f5ac02fb0f6]
Applicability and XML Formatting in This Section
Note:
This section applies to applications filed on or after July 1, 2022 that disclose nucleotide and/or amino acid sequences. The XML element formatting shown in this section appears different from the formatting shown in WIPO Standard ST.26.

[Editor Note: This section is applicable to all applications with a filing date, or, for national phase applications, an international filing date, on or after July 1, 2022, having disclosure of one or more nucleotide and/or amino acid sequences as defined in 37 CFR 1.831(b). Formatting representations of XML (eXtensible Markup Language) elements in this section appear different than shown in Standard ST.26, which may be accessed at: www.wipo.int/export/sites/www/standards/en/pdf/03-26-01.pdf.]

Jump to MPEP Source · 37 CFR 1.831(b) · Sequence Listing Format · Sequence Listing Requirements · Sequence Listing Content
Statutory · Permitted · Always
[mpep-2412-05-b-553277e49dea62f085860864]
Joint Description of Contiguous Nucleotides
Note:
A region with a known number of contiguous 'a', 'c', 'g', 't', or 'n' residues can be jointly described using the method in WIPO Standard ST.26 paragraph 22.

(b) The representation and symbols for nucleotide sequence data shall conform to the requirements of paragraphs (b)(1) through (4) of this section.

(4) A region containing a known number of contiguous “a,” “c,” “g,” “t,” or “n” residues for which the same description applies may be jointly described in the manner described in paragraph 22 of WIPO Standard ST.26.
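One way to locate candidate regions for the joint description allowed by paragraph (b)(4) is sketched below in Python. The function name, the `min_length` parameter, and the `start..end` location string are assumptions made for illustration. Finding a run of identical symbols does not by itself establish that the same description applies to every residue in it; that remains a judgment for the applicant.

```python
from itertools import groupby


def contiguous_runs(seq: str, min_length: int = 2) -> list:
    """Return (symbol, "start..end") for runs of a, c, g, t, or n.

    Positions are 1-based and inclusive. Runs shorter than min_length
    and runs of other symbols are skipped.
    """
    runs = []
    position = 1
    for symbol, group in groupby(seq):
        length = len(list(group))
        if symbol in "acgtn" and length >= min_length:
            runs.append((symbol, f"{position}..{position + length - 1}"))
        position += length
    return runs
```

For example, `contiguous_runs("acgnnnnnta", min_length=3)` reports one run of "n" spanning positions 4 to 8.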

Jump to MPEP Source · 37 CFR 1.832 · Sequence Listing Format · Sequence Listing Content · Sequence Listing Requirements
Statutory · Informative · Always
[mpep-2412-05-b-28cab157c52e333922697c15]
Numbering of Nucleotides in Sequence Listing
Note:
The first nucleotide presented is residue position number 1, and numbering is continuous throughout the sequence in the 5’ to 3’ direction. For circular sequences, the applicant chooses which nucleotide is position 1.

WIPO Standard ST.26, paragraph 12, provides that the first nucleotide presented in the sequence is residue position number 1. When nucleotide sequences are circular in configuration, applicant must choose the nucleotide in residue position number 1. Numbering is continuous throughout the entire sequence in the 5’ to 3’ direction, or in the direction that mimics the 5’ to 3’ direction. The last residue position number must equal the number of nucleotides in the sequence.

Jump to MPEP Source · 37 CFR 1.832 · Sequence Listing Format · Sequence Listing Requirements · Sequence Listing Content
Statutory · Required · Always
[mpep-2412-05-b-eba9b36009d55958dfabf81e]
Nucleotides Must Use Specified Symbols
Note:
All nucleotides in a sequence must be represented using the symbols listed in Table 1, and only lower-case letters are allowed.

WIPO Standard ST.26, paragraph 13, provides that all nucleotides in a sequence must be represented using the symbols as listed in Table 1: List of Nucleotides Symbols (see MPEP § 2412.03(a)). Only lower-case letters must be used. Any symbol used to represent a nucleotide is the equivalent of only one residue.
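
A minimal Python sketch of a symbol check follows. The symbol set is an assumption of this sketch, reproduced from the editor's reading of ST.26 Table 1; it should be verified against MPEP § 2412.03(a) before any real use. The function name `invalid_symbols` is hypothetical.

```python
# Assumed contents of ST.26 Table 1 (lower-case only; note there is no "u").
ST26_NUCLEOTIDE_SYMBOLS = frozenset("acgtmrwsykvhdbn")


def invalid_symbols(sequence: str) -> list:
    """Return (1-based position, character) pairs for characters not in Table 1.

    Each symbol stands for exactly one residue, so the position of a
    character in the string is also its residue position.
    """
    return [
        (position, char)
        for position, char in enumerate(sequence, start=1)
        if char not in ST26_NUCLEOTIDE_SYMBOLS
    ]


if __name__ == "__main__":
    print(invalid_symbols("acgtn"))
    print(invalid_symbols("acGu"))
```

Upper-case letters and "u" are both reported, since only the lower-case Table 1 symbols are accepted.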

Jump to MPEP Source · 37 CFR 1.832 · Sequence Listing Format · Sequence Listing Requirements · Sequence Listing Content
Statutory · Informative · Always
[mpep-2412-05-b-4e2e4993bf949a0f3dc22dae]
Symbol 't' for Thymine and Uracil
Note:
The symbol ‘t’ represents thymine in DNA and uracil in RNA. Uracil in DNA or thymine in RNA is treated as a modified nucleotide and must be further described in a feature table.

WIPO Standard ST.26, paragraph 14, sets forth that the symbol “t” will be construed as thymine in deoxyribonucleic acid (DNA) and uracil in ribonucleic acid (RNA). Uracil in DNA or thymine in RNA is considered a modified nucleotide and must be further described in a feature table. See MPEP § 2413.01(g), subsection I for more detail regarding a “feature table.”

Jump to MPEP Source · 37 CFR 1.832 · Sequence Listing Format · Sequence Listing Content · Sequence Listing Requirements
Statutory · Recommended · Always
[mpep-2412-05-b-edb1bdabf87dbc3d05651346]
Ambiguity Symbols Should Be Most Restrictive
Note:
When an ambiguity symbol is used to represent two or more alternative nucleotides, the most restrictive symbol from Table 1 should be used.

WIPO Standard ST.26, paragraph 15, provides that where an ambiguity symbol (representing two or more alternative nucleotides) is appropriate, the most restrictive symbol should be used, as listed in Table 1: List of Nucleotides Symbols (see MPEP § 2412.03(a)). For example, if a nucleotide in a given position could be “a” or “g”, then “r” should be used, rather than “n”. The symbol “n” will be construed as any one of “a”, “c”, “g”, or “t/u” except where it is used with a further description in a feature table. The symbol “n” must not be used to represent anything other than a nucleotide. A single modified or “unknown” nucleotide may be represented by the symbol “n”, together with a further description in a feature table. See MPEP § 2413.01(g), subsection I, for more detail regarding a “feature table.” For representation of sequence variants, i.e., alternatives, deletions, insertions or substitutions relative to a primary sequence, see MPEP § 2412.05(c); and also MPEP § 2413.01(g), subsection XII for information on variants.
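
The "most restrictive symbol" guidance can be sketched as a lookup from a set of alternative nucleotides to a single symbol. The mapping below is an assumption of this sketch, based on the editor's reading of ST.26 Table 1, and should be checked against MPEP § 2412.03(a). The function name `most_restrictive_symbol` is hypothetical.

```python
from typing import Iterable

# Assumed Table 1 mapping: symbol -> the nucleotides it can stand for
# ("t" covers t/u, consistent with paragraph 14).
_BASES_BY_SYMBOL = {
    "a": "a", "c": "c", "g": "g", "t": "t",
    "m": "ac", "r": "ag", "w": "at", "s": "cg", "y": "ct", "k": "gt",
    "v": "acg", "h": "act", "d": "agt", "b": "cgt",
    "n": "acgt",
}
# Invert the table so an exact set of alternatives selects one symbol.
_SYMBOL_BY_BASES = {frozenset(bases): sym for sym, bases in _BASES_BY_SYMBOL.items()}


def most_restrictive_symbol(alternatives: Iterable[str]) -> str:
    """Return the symbol that covers exactly the given alternatives."""
    key = frozenset(alternatives)
    try:
        return _SYMBOL_BY_BASES[key]
    except KeyError:
        raise ValueError(f"no Table 1 symbol for {sorted(key)}") from None


if __name__ == "__main__":
    print(most_restrictive_symbol({"a", "g"}))
```

Because every non-empty subset of "a", "c", "g", "t" has exactly one symbol in this table, an exact-match lookup always yields the most restrictive choice, so "a" or "g" gives "r" rather than "n".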

Jump to MPEP Source · 37 CFR 1.832 · Sequence Listing Format · Sequence Listing Content · Sequence Listing Requirements
Statutory · Informative · Always
[mpep-2412-05-b-8a224b8d251fe6c3da982f03]
Representation of Sequence Variants
Note:
For alternatives, deletions, insertions, or substitutions relative to a primary sequence, follow MPEP §2412.05(c) and §2413.01(g), subsection XII.

WIPO Standard ST.26, paragraph 15, provides that where an ambiguity symbol (representing two or more alternative nucleotides) is appropriate, the most restrictive symbol should be used, as listed in Table 1: List of Nucleotides Symbols (see MPEP § 2412.03(a)). For example, if a nucleotide in a given position could be “a” or “g”, then “r” should be used, rather than “n”. The symbol “n” will be construed as any one of “a”, “c”, “g”, or “t/u” except where it is used with a further description in a feature table. The symbol “n” must not be used to represent anything other than a nucleotide. A single modified or “unknown” nucleotide may be represented by the symbol “n”, together with a further description in a feature table. See MPEP § 2413.01(g), subsection I, for more detail regarding a “feature table.” For representation of sequence variants, i.e., alternatives, deletions, insertions or substitutions relative to a primary sequence, see MPEP § 2412.05(c); and also MPEP § 2413.01(g), subsection XII for information on variants.

Jump to MPEP Source · 37 CFR 1.832 · Sequence Listing Format · Sequence Listing Requirements · Sequence Listing Content
Statutory · Required · Always
[mpep-2412-05-b-4e216144a1ebad3a51a9b302]
Uracil and Thymine Representation in Sequence Listings
Note:
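Uracil in DNA or thymine in RNA must be represented as 't' and described in a feature table using the feature key 'modified_base', the qualifier 'mod_base' with the value 'OTHER', and the qualifier 'note' with the value 'uracil' or 'thymine', respectively.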

WIPO Standard ST.26, paragraph 19, specifies that uracil in DNA or thymine in RNA are considered modified nucleotides and must be represented in the sequence as “t” and be further described in a feature table using the feature key “modified_base”, the qualifier “mod_base” with “OTHER” as the qualifier value and the qualifier “note” with “uracil” or “thymine”, respectively, as the qualifier value. See MPEP § 2413.01(g), subsection I for more detail regarding a “feature table.”
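
As a hedged illustration of paragraph 19, the Python sketch below builds one feature element with the standard library. The element names INSDFeature and INSDFeature_location appear in this section; the names INSDFeature_key, INSDFeature_quals, INSDQualifier, INSDQualifier_name, and INSDQualifier_value are assumptions recalled from the ST.26 DTD and should be verified against the Standard. The function name `uracil_or_thymine_feature` is hypothetical.

```python
import xml.etree.ElementTree as ET


def uracil_or_thymine_feature(position: int, molecule: str) -> ET.Element:
    """Build a modified_base feature for uracil in DNA or thymine in RNA.

    The residue itself is still written as "t" in the sequence; this
    element only carries the further description.
    """
    notes = {"DNA": "uracil", "RNA": "thymine"}
    if molecule not in notes:
        raise ValueError("molecule must be 'DNA' or 'RNA'")

    feature = ET.Element("INSDFeature")
    ET.SubElement(feature, "INSDFeature_key").text = "modified_base"
    ET.SubElement(feature, "INSDFeature_location").text = str(position)
    quals = ET.SubElement(feature, "INSDFeature_quals")
    # mod_base is OTHER; the note names the actual nucleobase.
    for name, value in (("mod_base", "OTHER"), ("note", notes[molecule])):
        qualifier = ET.SubElement(quals, "INSDQualifier")
        ET.SubElement(qualifier, "INSDQualifier_name").text = name
        ET.SubElement(qualifier, "INSDQualifier_value").text = value
    return feature


if __name__ == "__main__":
    print(ET.tostring(uracil_or_thymine_feature(7, "DNA"), encoding="unicode"))
```

A real sequence listing would normally be produced with an authoring tool rather than hand-built XML; the sketch is only meant to show how the feature key and the two qualifiers fit together.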

Jump to MPEP Source · 37 CFR 1.832 · Sequence Listing Format · Sequence Listing Requirements · Sequence Listing Content
Statutory · Prohibited · Always
[mpep-2412-05-b-893455a1df22582cb9f7cbc2]
Abbreviations Not Allowed In Sequence
Note:
The abbreviations and full names of modified nucleotides listed in Table 2 must not be used within the sequence itself.

WIPO Standard ST.26, paragraph 17, specifies that a modified nucleotide must be further described in a feature table (see MPEP § 2413.01(g), subsection I, for more detail regarding a “feature table”) using the feature key “modified_base” and the mandatory qualifier “mod_base” in conjunction with a single abbreviation from Table 2: List of Modified Nucleotides, below, as the qualifier value. See MPEP § 2413.01(g) subsections II and III, for more information regarding use of a feature key; and MPEP § 2413.01(g) subsections V and VI, for more information regarding use of a qualifier. If the abbreviation is “OTHER”, the complete unabbreviated name of the modified nucleotide must be provided as the value in a “note” qualifier. For a listing of alternative modified nucleotides, the qualifier value “OTHER” may be used in conjunction with a further “note” qualifier. The abbreviations (or full names) provided in Table 2 must not be used in the sequence itself.
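
The qualifier rules in paragraph 17 can be sketched as a small consistency check. This is a minimal illustration under stated assumptions: qualifiers are modeled as a plain dictionary, Table 2 is not reproduced (any value other than "OTHER" is accepted without lookup), and the function name `modified_base_problems` is hypothetical.

```python
from typing import Dict, List


def modified_base_problems(qualifiers: Dict[str, str]) -> List[str]:
    """Return human-readable problems for a modified_base feature's qualifiers."""
    problems = []
    mod_base = qualifiers.get("mod_base", "").strip()
    note = qualifiers.get("note", "").strip()

    if not mod_base:
        # mod_base is the mandatory qualifier for the modified_base feature key.
        problems.append("missing mandatory qualifier 'mod_base'")
    elif mod_base == "OTHER" and not note:
        # OTHER requires the complete unabbreviated name in a note qualifier.
        problems.append("'mod_base' is OTHER but no 'note' gives the unabbreviated name")
    return problems


if __name__ == "__main__":
    print(modified_base_problems({"mod_base": "OTHER"}))
    print(modified_base_problems({"mod_base": "OTHER", "note": "uracil"}))
```

A fuller check would also confirm that a non-OTHER value is one of the Table 2 abbreviations; that lookup is left out here because the table is not reproduced in this section.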

Jump to MPEP Source · 37 CFR 1.832 · Sequence Listing Format · Sequence Listing Requirements · Sequence Listing Content
Statutory · Permitted · Always
[mpep-2412-05-b-957299e1fa0eb7f70a465902]
Modified Nucleotides Must Be Described in Feature Table
Note:
A region of consecutive modified nucleotides that share the same backbone moiety must be described in a feature table as required for a modified nucleotide; the nucleotides of each such region may be jointly described in a single INSDFeature element.

WIPO Standard ST.26, paragraph 18, describes that a nucleotide sequence including one or more regions of consecutive modified nucleotides that share the same backbone moiety must be further described in a feature table as required for a modified nucleotide. See MPEP § 2413.01(g), subsection I, for information regarding a feature table and MPEP § 2412.03(e) regarding modified nucleotides. The modified nucleotides of each such region may be jointly described in a single INSDFeature element of a “feature table” as described below. See MPEP § 2413.01(g), subsection I, for information regarding INSDFeature elements of a feature table. The most restrictive unabbreviated chemical name that encompasses all of the modified nucleotides in the range or a list of the chemical names of all the nucleotides in the range must be provided as the value in the “note” qualifier. For example, a glycol nucleic acid sequence containing “a”, “c”, “g”, or “t” nucleobases may be described in the “note” qualifier as “2,3-dihydroxypropyl nucleosides.” Alternatively, the same sequence may be described in the “note” qualifier as “2,3-dihydroxypropyladenine, 2,3-dihydroxypropylthymine, 2,3-dihydroxypropylguanine, or 2,3-dihydroxypropylcytosine.” Where an individual modified nucleotide in the region includes an additional modification, then the modified nucleotide must also be further described in the feature table as required for a modified nucleotide.

Jump to MPEP Source · 37 CFR 1.832 · Sequence Listing Format · Sequence Listing Requirements · Sequence Listing Content
Statutory · Permitted · Always
[mpep-2412-05-b-f09069c8a6c9484e0b716d7b]
Note Qualifier for Modified Nucleotides Must Describe Chemical Names
Note:
The note qualifier must provide the most restrictive unabbreviated chemical name encompassing all modified nucleotides in a range or list them individually.

WIPO Standard ST.26, paragraph 18, describes that a nucleotide sequence including one or more regions of consecutive modified nucleotides that share the same backbone moiety must be further described in a feature table as required for a modified nucleotide. See MPEP § 2413.01(g), subsection I, for information regarding a feature table and MPEP § 2412.03(e) regarding modified nucleotides. The modified nucleotides of each such region may be jointly described in a single INSDFeature element of a “feature table” as described below. See MPEP § 2413.01(g), subsection I, for information regarding INSDFeature elements of a feature table. The most restrictive unabbreviated chemical name that encompasses all of the modified nucleotides in the range or a list of the chemical names of all the nucleotides in the range must be provided as the value in the “note” qualifier. For example, a glycol nucleic acid sequence containing “a”, “c”, “g”, or “t” nucleobases may be described in the “note” qualifier as “2,3-dihydroxypropyl nucleosides.” Alternatively, the same sequence may be described in the “note” qualifier as “2,3-dihydroxypropyladenine, 2,3-dihydroxypropylthymine, 2,3-dihydroxypropylguanine, or 2,3-dihydroxypropylcytosine.” Where an individual modified nucleotide in the region includes an additional modification, then the modified nucleotide must also be further described in the feature table as required for a modified nucleotide.

Jump to MPEP Source · 37 CFR 1.832 · Sequence Listing Format · Sequence Listing Requirements · Sequence Listing Content
Statutory · Required · Always
[mpep-2412-05-b-7dfffed4b0ea5e86e21cfe53]
Uracil and Thymine Must Be Described in Sequence Listing
Note:
Uracil in DNA or thymine in RNA must be represented as 't' and further described in a feature table using the 'modified_base' feature key and the 'mod_base' and 'note' qualifiers.

WIPO Standard ST.26, paragraph 19, provides that uracil in DNA or thymine in RNA are considered modified nucleotides and must be represented in the sequence as “t” and be further described in a feature table using the feature key “modified_base”, the qualifier “mod_base” with “OTHER” as the qualifier value and the qualifier “note” with “uracil” or “thymine”, respectively, as the qualifier value. See MPEP § 2413.01(g), subsections II and III, for more information regarding use of a feature key; and MPEP § 2413.01(g), subsections V and VI, for more information regarding use of a qualifier.

Jump to MPEP Source · 37 CFR 1.832 · Sequence Listing Format · Sequence Listing Requirements · Sequence Listing Content
Statutory · Permitted · Always
[mpep-2412-05-b-8ada939a7a6a4b7d125fd686]
Joint Description of Contiguous Residues
Note:
A region containing a known number of contiguous 'a', 'c', 'g', 't', or 'n' residues to which the same description applies can be described jointly using a single INSDFeature element with the syntax 'x..y' as the location descriptor.

WIPO Standard ST.26, paragraph 22, specifies that a region containing a known number of contiguous “a”, “c”, “g”, “t”, or “n” residues for which the same description applies may be jointly described using a single INSDFeature element with the syntax “x..y” as the location descriptor in the element INSDFeature_location. See MPEP § 2413.01(g) subsection IV, for information regarding INSDFeature_location. For representation of sequence variants, i.e., alternatives, deletions, insertions or substitutions, see MPEP § 2412.05(c) and MPEP § 2413.01(g), subsection XI.
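
The “x..y” location descriptor can be sketched as a small helper that formats and sanity-checks a region. The function name `joint_location` and the bounds checks are assumptions of this sketch; ST.26 defines further location formats that are not covered here.

```python
def joint_location(first: int, last: int, sequence_length: int) -> str:
    """Format a region of contiguous residues as an "x..y" location descriptor.

    first and last are 1-based residue positions, inclusive. The checks
    only enforce that the region lies inside the sequence and spans
    more than one residue.
    """
    if not 1 <= first < last <= sequence_length:
        raise ValueError("expected 1 <= first < last <= sequence_length")
    return f"{first}..{last}"


if __name__ == "__main__":
    # Residues 3 through 10 of a 20-residue sequence share one description.
    print(joint_location(3, 10, 20))
```

The returned string would be placed in the INSDFeature_location element of the single INSDFeature element that describes the whole region.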

Jump to MPEP Source · 37 CFR 1.832 · Sequence Listing Format · Sequence Listing Content · Sequence Listing Requirements
Statutory · Informative · Always
[mpep-2412-05-b-4031477ad0c0d78495daad1b]
Representation of Sequence Variants Required
Note:
For the representation of sequence variants (alternatives, deletions, insertions, or substitutions), see MPEP § 2412.05(c) and MPEP § 2413.01(g), subsection XI.

WIPO Standard ST.26, paragraph 22, specifies that a region containing a known number of contiguous “a”, “c”, “g”, “t”, or “n” residues for which the same description applies may be jointly described using a single INSDFeature element with the syntax “x..y” as the location descriptor in the element INSDFeature_location. See MPEP § 2413.01(g) subsection IV, for information regarding INSDFeature_location. For representation of sequence variants, i.e., alternatives, deletions, insertions or substitutions, see MPEP § 2412.05(c) and MPEP § 2413.01(g), subsection XI.

Jump to MPEP Source · 37 CFR 1.832 · Sequence Listing Format · Sequence Listing Requirements · Sequence Listing Content
Topic

Sequence Listing Requirements

1 rule
Statutory · Permitted · Always
[mpep-2412-05-b-68afb6ba5ca6c4e64fb4d413]
Other Modified Nucleotides Must Be Described
Note:
If a modified nucleotide is not listed in Table 2, the 'mod_base' qualifier value must be 'OTHER' and the complete unabbreviated name of the modified nucleotide must be provided in a 'note' qualifier.

WIPO Standard ST.26, paragraph 17, specifies that a modified nucleotide must be further described in a feature table (see MPEP § 2413.01(g), subsection I, for more detail regarding a “feature table”) using the feature key “modified_base” and the mandatory qualifier “mod_base” in conjunction with a single abbreviation from Table 2: List of Modified Nucleotides, below, as the qualifier value. See MPEP § 2413.01(g) subsections II and III, for more information regarding use of a feature key; and MPEP § 2413.01(g) subsections V and VI, for more information regarding use of a qualifier. If the abbreviation is “OTHER”, the complete unabbreviated name of the modified nucleotide must be provided as the value in a “note” qualifier. For a listing of alternative modified nucleotides, the qualifier value “OTHER” may be used in conjunction with a further “note” qualifier. The abbreviations (or full names) provided in Table 2 must not be used in the sequence itself.

Jump to MPEP Source · 37 CFR 1.832 · Sequence Listing Requirements · Sequence Listing Content · Sequence Listing Format

Citations

Primary topic · Citation
Sequence Listing Content, Sequence Listing Format · 37 CFR § 1.831(b)
Sequence Listing Content, Sequence Listing Format · MPEP § 2412.03(a)
Sequence Listing Content, Sequence Listing Format · MPEP § 2412.03(e)
Sequence Listing Content, Sequence Listing Format · MPEP § 2412.05(c)
Sequence Listing Content, Sequence Listing Format, Sequence Listing Requirements · MPEP § 2413.01(g)

Source Text from USPTO’s MPEP

This is an exact copy of the MPEP from the USPTO. It is here for your reference to see the section in context.
