How should variable-length regions in sequences be represented in a Sequence Listing?
Represent a variable-length region in a Sequence Listing in two steps:
- Write the region out at its maximum length, repeating the variable unit as many times as the maximum allows.
- Specify in the Features section that the amino acid (or nucleotide) at each position in that region is either absent or present.
The MPEP provides this guidance: "Sequences that have variable-length regions depicted as, for example, Ala Ala Leu Leu (Xaa Xaa)n Ile Pro where n=0-234 or agccttgggaca(nnnnn)m gtcatt where m=0-354 or Ser Met Ala Xaa Ser where Xaa could be 1, 2, 3, 4 and/or 5 amino acids must still comply with the Sequence Rules. The method to use is to repeat the variable-length region as many times as the maximum length and specify in the Features section that the amino acid (or nucleotide) at a specified position is either absent or present."
(MPEP 2429)
This approach ensures that the Sequence Listing accurately represents all possible variations of the sequence while complying with the Sequence Rules.
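The expansion described in the MPEP passage can be sketched in a few lines of Python. This is only an illustration: the function name `expand_variable_region`, its parameters, and the wording of the feature notes are assumptions made for this example, not an official Sequence Listing format or USPTO tool. The example uses a small maximum (n = 2) so the result is easy to read.

```python
def expand_variable_region(prefix, unit, max_repeats, suffix):
    """Expand prefix + (unit)n + suffix to its maximum length.

    Returns the full residue list and a list of (position, note) pairs.
    Positions are 1-based and cover every residue in the repeated region.
    The note text is illustrative only, not an official feature format.
    """
    residues = list(prefix) + list(unit) * max_repeats + list(suffix)
    start = len(prefix) + 1                      # first position of the region
    end = len(prefix) + len(unit) * max_repeats  # last position of the region
    features = [
        (pos, f"{residues[pos - 1]} at position {pos} may be present or absent")
        for pos in range(start, end + 1)
    ]
    return residues, features


# Ala Ala Leu Leu (Xaa Xaa)n Ile Pro, with an illustrative maximum of n = 2
residues, features = expand_variable_region(
    ["Ala", "Ala", "Leu", "Leu"], ["Xaa", "Xaa"], 2, ["Ile", "Pro"]
)
print(" ".join(residues))
for pos, note in features:
    print(pos, note)
```

Applied to the MPEP's own example with n = 0-234, the same call with `max_repeats=234` would write out 468 Xaa residues (positions 5 through 472) in a 474-residue sequence, each of which would need to be described as present or absent.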