MPEP § 2423.01 — Format and Symbols To Be Used in a “Sequence Listing” (Annotated Rules)

§2423.01 Format and Symbols To Be Used in a “Sequence Listing”

USPTO MPEP version: BlueIron's Update: 2025-12-31

This page consolidates and annotates all enforceable requirements under MPEP § 2423.01, including statutory authority, regulatory rules, examiner guidance, and practice notes. It is provided as guidance, with links to the ground truth sources. This is information only; it is not legal advice.

This section addresses Format and Symbols To Be Used in a “Sequence Listing”. Primary authority: 37 CFR 1.822(b) and 37 CFR 1.822(d). Contains: 2 requirements and 1 other statement.

Key Rules

Topic

Sequence Listing Content

6 rules
Statutory · Informative · Always
[mpep-2423-01-21705691908c38955a03f51a]
Sequence Listing Not Applicable After July 1, 2022
Note:
This rule does not apply to applications filed on or after July 1, 2022, that disclose nucleotide and/or amino acid sequences.

[Editor Note: This section is not applicable to applications filed on or after July 1, 2022, having disclosures of nucleotide and/or amino acid sequences as defined in 37 CFR 1.831(b). See MPEP §§ 2412 – 2419 for guidance on WIPO ST.26 requirements for applications filed on or after July 1, 2022.]

37 CFR 1.77 · 37 CFR 1.831(b) · Sequence Listing Content · Sequence Listing Requirements · Sequence Listing Format
Statutory · Informative · Always
[mpep-2423-01-c28b9776f75f0f9153fd0a8a]
Format and Symbols for Sequence Listings
Note:
This rule specifies the required format and symbols to be used when listing nucleotide and amino acid sequences in a patent application.

37 CFR 1.822 sets forth the format and symbols to be used for listing nucleotide and/or amino acid sequence data. The symbols for representing the nucleotide and/or amino acid characters in the sequences are set forth in Appendices A and C to Subpart G of Part 1 of the CFR. See MPEP § 2422 (I). No other symbols shall be used in nucleotide and amino acid sequences. The “modified base” and “modified and unusual amino acid” symbols appearing in Appendices B and D to Subpart G of Part 1 of the CFR (see 37 CFR 1.822 and MPEP § 2422 (I)) are not to be set forth in the sequences recited in the “Sequence Listing”. However, “modified base” or “modified and unusual amino acid” symbols may be used in the written description and/or drawing portions of the specification. To properly enter notations for modified bases or amino acids in the “Sequence Listing”, the Feature section of the “Sequence Listing” should be used. That is, a modified base or amino acid may be presented in a given sequence as the corresponding unmodified base or amino acid if the modified base or amino acid is one of those listed in Appendices B and D to Subpart G of Part 1 of the CFR and the modification is also set forth in the Feature section of the “Sequence Listing”. Otherwise, all nucleotide bases or amino acids not appearing in Appendices A and C to Subpart G of Part 1 of the CFR must be listed in a given sequence as “n” or “Xaa,” respectively, with further information given in the Feature section of the “Sequence Listing” by including one or more feature keys listed in Appendices E and F to Subpart G of Part 1 of the CFR. See 37 CFR 1.822(b).

37 CFR 1.822 · Sequence Listing Content · Sequence Listing Format · Sequence Listing Requirements
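
The "no other symbols" requirement lends itself to a simple automated check. The following is a minimal sketch, not an official validator: the set `ASSUMED_NUCLEOTIDE_SYMBOLS` is an assumption standing in for Appendix A (it is not reproduced on this page and should be verified against the CFR), the check is assumed to be case-sensitive, and the function name is hypothetical.

```python
# ASSUMPTION: lowercase IUPAC-style codes used as a stand-in for Appendix A.
# Verify against Appendix A to Subpart G of Part 1 before relying on this set.
ASSUMED_NUCLEOTIDE_SYMBOLS = frozenset("acgturymkswbdhvn")


def find_invalid_nucleotide_symbols(sequence: str) -> list[tuple[int, str]]:
    """Return (1-based position, character) for every character that is
    not in the assumed symbol set. An empty list means no problems found."""
    return [
        (position, character)
        for position, character in enumerate(sequence, start=1)
        if character not in ASSUMED_NUCLEOTIDE_SYMBOLS
    ]


if __name__ == "__main__":
    print(find_invalid_nucleotide_symbols("acgtn"))   # []
    print(find_invalid_nucleotide_symbols("acgXt"))   # [(4, 'X')]
```

Reporting positions rather than a bare pass/fail makes it easier to decide, residue by residue, whether a Feature entry is needed.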
Statutory · Informative · Always
[mpep-2423-01-92a28814f7a0ce608e156510]
Symbols for Nucleotide and Amino Acid Sequences Must Be Used
Note:
The rule requires the use of specific symbols for representing nucleotide and amino acid characters in sequence listings, as defined in Appendices A and C to Subpart G of Part 1 of the CFR.

37 CFR 1.822 · Sequence Listing Content · Sequence Listing Requirements · Specification
Statutory · Required · Always
[mpep-2423-01-dac6b4f26ad10e1da6313bf2]
Symbols Must Be Standard for Sequences
Note:
All nucleotide and amino acid sequences must use standard symbols as defined in Appendices A and C of Subpart G of Part 1 of the CFR.

37 CFR 1.822 · Sequence Listing Content · Sequence Listing Requirements · Specification
Statutory · Permitted · Always
[mpep-2423-01-2271c4454acc8a98989a6a91]
Modified Bases and Amino Acids Must Be Described in Feature Section
Note:
A modified base or amino acid may be presented as the corresponding unmodified base or amino acid if it is listed in Appendices B and D to Subpart G of Part 1 of the CFR and the modification is also set forth in the Feature section of the Sequence Listing.

37 CFR 1.822 · Sequence Listing Content · Sequence Listing Format · Sequence Listing Requirements
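
As an illustration of the option described in the Note above, the sketch below lists a modified amino acid as its unmodified counterpart and records the modification separately for the Feature section. The mapping `ASSUMED_MODIFIED_TO_UNMODIFIED` is a small illustrative assumption, not a copy of Appendix D, and the function name and the (position, description) note format are hypothetical.

```python
# ASSUMPTION: illustrative entries only; the real list of modified and
# unusual amino acids lives in Appendix D and must be checked there.
ASSUMED_MODIFIED_TO_UNMODIFIED = {
    "Hyl": ("Lys", "hydroxylysine"),
    "4Hyp": ("Pro", "4-hydroxyproline"),
}


def present_as_unmodified(residues: list[str]) -> tuple[list[str], list[tuple[int, str]]]:
    """Replace each modified symbol with its unmodified counterpart and
    collect (1-based position, description) notes for the Feature section."""
    listed: list[str] = []
    feature_notes: list[tuple[int, str]] = []
    for position, symbol in enumerate(residues, start=1):
        if symbol in ASSUMED_MODIFIED_TO_UNMODIFIED:
            unmodified, description = ASSUMED_MODIFIED_TO_UNMODIFIED[symbol]
            listed.append(unmodified)
            feature_notes.append((position, description))
        else:
            listed.append(symbol)  # left untouched; not validated here
    return listed, feature_notes


if __name__ == "__main__":
    print(present_as_unmodified(["Gly", "4Hyp", "Hyl"]))
```

The sketch only handles the substitution step; symbols that are neither standard nor in the mapping pass through unchanged and would need a separate check.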
Statutory · Required · Always
[mpep-2423-01-d75d74334e57dd42e304b74b]
Nucleotide and Amino Acid Listing Requirement
Note:
All nucleotide bases or amino acids not in Appendices A and C must be listed as 'n' or 'Xaa', with additional details provided in the Feature section using keys from Appendices E and F.

37 CFR 1.822 · Sequence Listing Content · Sequence Listing Format · Sequence Listing Requirements
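
The fallback described in the Note above (list the residue as "n" and explain it in the Feature section) can be sketched for nucleotides as follows. This is an illustration under stated assumptions: `ASSUMED_NUCLEOTIDE_SYMBOLS` stands in for Appendix A and should be verified against the CFR, the input is assumed to be one token per position, and the function name is hypothetical.

```python
# ASSUMPTION: lowercase IUPAC-style codes used as a stand-in for Appendix A.
ASSUMED_NUCLEOTIDE_SYMBOLS = frozenset("acgturymkswbdhvn")


def replace_unlisted_bases(bases: list[str]) -> tuple[str, list[tuple[int, str]]]:
    """Take one token per position. Tokens outside the assumed symbol set
    are listed as "n"; each such position (and any literal "n") is returned
    so that it can be described in the Feature section."""
    listed: list[str] = []
    needs_feature: list[tuple[int, str]] = []
    for position, token in enumerate(bases, start=1):
        if token in ASSUMED_NUCLEOTIDE_SYMBOLS and token != "n":
            listed.append(token)
        else:
            listed.append("n")
            needs_feature.append((position, token))
    return "".join(listed), needs_feature


if __name__ == "__main__":
    # "zz" is a made-up token standing for a base with no listed symbol.
    print(replace_unlisted_bases(["a", "c", "zz", "g"]))  # ('acng', [(3, 'zz')])
```

The returned positions are only a worklist; choosing the appropriate feature key from Appendices E and F remains a manual step.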
Topic

Sequence Listing Format

5 rules
Statutory · Recommended · Always
[mpep-2423-01-69fd104607d61da3c2b13d21]
Modified Bases and Amino Acids Must Be Noted in Feature Section
Note:
Notations for modified bases or amino acids should be entered in the Feature section of the Sequence Listing, not in the sequence itself.

37 CFR 1.822 · Sequence Listing Format · Sequence Listing Requirements · Specification
Statutory · Required · Always
[mpep-2423-01-87f1eaf1bb1e0e6f1413b7f0]
Three-Letter Symbols Required for Amino Acids in Sequence Listing
Note:
The use of three-letter symbols, with the first character in upper case and the remaining two in lower case, is required for amino acids in the sequence listing.

In 37 CFR 1.822(b) and 37 CFR 1.822(d), the use of three-letter symbols for amino acids is required in the “Sequence Listing”. The three-letter symbols must be presented using the upper case for the first character and lower case for the remaining two characters. Applicants are encouraged to use the three-letter symbols for amino acids throughout the disclosure, instead of the one-letter symbols, for easier reading of the application and any patent issuing therefrom.

37 CFR 1.822(b) · Sequence Listing Format · Sequence Listing Requirements
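
The capitalization requirement above (upper case first character, lower case for the remaining two) is easy to check mechanically. The sketch below is an illustration only; the function names are hypothetical, and it checks letter case alone, not whether the symbol actually appears in Appendix C.

```python
def has_required_case(symbol: str) -> bool:
    """True if the symbol is three characters long with an upper-case
    first character and two lower-case characters after it."""
    return len(symbol) == 3 and symbol[0].isupper() and symbol[1:].islower()


def normalize_case(symbol: str) -> str:
    """Rewrite a symbol so the first character is upper case and the
    rest are lower case, e.g. "ALA" or "ala" becomes "Ala"."""
    return symbol[:1].upper() + symbol[1:].lower()


if __name__ == "__main__":
    for raw in ["Ala", "ALA", "ala", "xaa"]:
        print(raw, has_required_case(raw), normalize_case(raw))
```

Running the normalizer before the check avoids flagging symbols that differ only in letter case.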
Statutory · Required · Always
[mpep-2423-01-b4f7f553bdeb3965444ab6ff]
Amino Acid Symbols Must Use Upper Case For First Character And Lower Case For Remaining Two Characters
Note:
Applicants must use three-letter symbols for amino acids with the first character in upper case and the remaining two in lower case in the Sequence Listing.

37 CFR 1.822(b) · Sequence Listing Format · Sequence Listing Requirements
Statutory · Informative · Always
[mpep-2423-01-7968dda0f6135f6906c99706]
Three-Letter Amino Acid Symbols Encouraged Throughout Disclosure
Note:
Applicants are encouraged to use three-letter symbols for amino acids throughout the disclosure instead of one-letter symbols for easier readability.

37 CFR 1.822(b) · Sequence Listing Format · Sequence Listing Requirements
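
Where a draft disclosure uses one-letter amino acid codes, converting them to three-letter symbols can be automated. The sketch below is a convenience illustration, not part of the rule: it covers only the twenty standard amino acids using the conventional one-letter to three-letter correspondence, separates symbols with a space as an assumed presentation choice, and uses a hypothetical function name.

```python
# Conventional one-letter to three-letter correspondence for the
# twenty standard amino acids.
ONE_TO_THREE = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}


def to_three_letter(one_letter_sequence: str) -> str:
    """Convert a one-letter amino acid string to space-separated
    three-letter symbols. Raises ValueError for any unmapped letter."""
    try:
        return " ".join(ONE_TO_THREE[ch] for ch in one_letter_sequence.upper())
    except KeyError as exc:
        raise ValueError(f"no three-letter symbol known for {exc.args[0]!r}") from None


if __name__ == "__main__":
    print(to_three_letter("MKV"))  # Met Lys Val
```

Raising an error on unmapped letters, rather than silently substituting a placeholder, keeps ambiguous residues visible for manual review.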
Statutory · Informative · Always
[mpep-2423-01-9c7ddd7a77e14bf9610d1b6d]
Format for Presenting Sequence Data Required
Note:
These paragraphs specify how characters in sequences should be grouped, spaced, presented, and numbered.

37 CFR 1.822(c) through (e) set forth the format for presenting sequence data. These paragraphs set forth the manner in which the characters in sequences are to be grouped, spaced, presented and numbered.

37 CFR 1.822(c) · Sequence Listing Format · Sequence Listing Requirements
Topic

Specification

1 rule
Statutory · Permitted · Always
[mpep-2423-01-fad0fbc038f08641eb21d70b]
Symbols Not to Be Listed in Sequence Listing
Note:
Modified base and modified amino acid symbols must not be included in the sequences recited in the 'Sequence Listing', but can be used in the written description and drawings.

37 CFR 1.822 · Specification · Sequence Listing Content · Sequence Listing Format
Topic

Sequence Listing Requirements

1 rule
Statutory · Informative · Always
[mpep-2423-01-6ad250c7eaaf77ab1ed25371]
Format for Sequence Listing Data
Note:
This rule specifies how characters in sequence data must be grouped, spaced, presented, and numbered.

37 CFR 1.822(c) · Sequence Listing Requirements · Sequence Listing Format

Citations

Primary topic · Citation
Sequence Listing Content, Sequence Listing Format, Specification · 37 CFR § 1.822
Sequence Listing Content, Sequence Listing Format, Specification · 37 CFR § 1.822(b)
Sequence Listing Format, Sequence Listing Requirements · 37 CFR § 1.822(c)
Sequence Listing Format · 37 CFR § 1.822(d)
Sequence Listing Content · 37 CFR § 1.831(b)
Sequence Listing Content · MPEP § 2412
Sequence Listing Content, Sequence Listing Format, Specification · MPEP § 2422

Source Text from USPTO’s MPEP

This is an exact copy of the MPEP from the USPTO. It is here for your reference to see the section in context.
