What is Sequence Analysis?

We observe many different species around us. Some of them share astounding similarities, while others are entirely different. The code for all these features lies within the DNA, RNA and peptide sequences of an organism. Biotechnologists and molecular biologists therefore spend long hours on sequence analysis to understand the features, structures, functions and evolution of different species.

Sequence analysis is an everyday practice used to infer common ancestry, to detect functional equivalence, or simply to search for similar entries in a database.

In terms of structure, genetic sequences are made up of combinations of four bases, designated A, C, G, and T (or U): DNA uses thymine (T), while RNA uses uracil (U). Proteins are built from 20 amino acids, each with different chemical properties and designated by a one-letter or three-letter code. Sequences of any realistic length can only be interpreted with specialized software tools, and in the context of patent eligibility or infringement issues their structure and function have gained added importance.
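As a small illustration of these alphabets, the sketch below checks which letters a sequence uses and rewrites a DNA sequence as RNA. The function names `classify_sequence` and `transcribe` are made up for this example, and the code assumes plain unambiguous base codes only (no N or other ambiguity letters).

```python
DNA_BASES = set("ACGT")
RNA_BASES = set("ACGU")


def classify_sequence(seq: str) -> str:
    """Return 'DNA', 'RNA', or 'unknown' based on the letters used.

    A sequence with neither T nor U (for example 'ACG') fits both
    alphabets; this sketch reports it as 'DNA' by convention.
    """
    letters = set(seq.upper())
    if not letters:
        return "unknown"
    if letters <= DNA_BASES:
        return "DNA"
    if letters <= RNA_BASES:
        return "RNA"
    return "unknown"


def transcribe(dna: str) -> str:
    """Rewrite a DNA coding-strand sequence as RNA by replacing T with U."""
    return dna.upper().replace("T", "U")


if __name__ == "__main__":
    print(classify_sequence("ATGC"))  # DNA
    print(classify_sequence("AUGC"))  # RNA
    print(transcribe("ATGC"))         # AUGC
```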

The analysis of genetic sequences plays a major role in the patenting world, where sequences are treated either as chemical compounds or as information-encoding elements.

Also, know the things that you should avoid while doing sequence analysis.

The Methodology of Sequence Analysis

To carry out sequence analysis, it is best to perform sequence alignment and to consult established biological databases. Both methods are described below:

Sequence Alignment:

In simple terms, sequence alignment is a method of arranging DNA, RNA or protein sequences to find regions of similarity. Sequences whose similarity reflects shared ancestry are known as homologous.

This similarity can be a consequence of a functional, structural or evolutionary relationship between the sequences.
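To make the idea of "regions of similarity" concrete, here is a minimal sketch of local alignment using the Smith-Waterman dynamic-programming approach. It is an illustration, not the method any particular tool uses: the scoring values (match 2, mismatch -1, gap -2) are arbitrary assumptions, gaps are scored linearly, and the function name `local_alignment` is hypothetical.

```python
def local_alignment(a: str, b: str, match: int = 2, mismatch: int = -1, gap: int = -2):
    """Return (score, aligned_a, aligned_b) for one best local alignment."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    # Fill the table; a cell never drops below 0, which lets an
    # alignment start anywhere in either sequence.
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = score[i - 1][j] + gap
            left = score[i][j - 1] + gap
            score[i][j] = max(0, diag, up, left)
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)

    # Trace back from the best cell until a zero score is reached.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and score[i][j] > 0:
        diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
        if score[i][j] == diag:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


if __name__ == "__main__":
    # The shared region "ACGT" is found despite the differing flanks.
    print(local_alignment("TTTACGTAAA", "CCCACGTCCC"))  # (8, 'ACGT', 'ACGT')
```

The table has one cell per pair of positions, so time and memory grow with the product of the two lengths. That is fine for short sequences, but database-scale searches rely on faster heuristic tools.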

Biological Database Searches:

Several databases are maintained by public institutions and updated regularly. These include:


GenBank is the NIH genetic sequence database, an annotated collection of all publicly available DNA sequences. It is also one of the fastest-growing repositories of known genetic sequences. It is part of the International Nucleotide Sequence Database Collaboration, which comprises the DNA Data Bank of Japan (DDBJ), the European Nucleotide Archive (ENA), and GenBank at NCBI. These three organizations exchange data on a daily basis.

How can you search and retrieve data from GenBank?

  • Search GenBank for sequence identifiers and annotations with Entrez Nucleotide.
  • BLAST search: BLAST stands for Basic Local Alignment Search Tool. It lets you search and align GenBank sequences against a query sequence. It searches CoreNucleotide, dbEST, and dbGSS independently.
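For readers who want to retrieve a record programmatically rather than through the web pages, the sketch below builds a request URL for NCBI's E-utilities `efetch` endpoint and parses FASTA text. It makes several assumptions: the helper names `build_efetch_url`, `fetch_fasta` and `parse_fasta` are hypothetical, the accession U49845 is used purely as an example identifier, and the network call is shown but not exercised here. Check NCBI's current usage guidelines before sending automated requests.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

EFETCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def build_efetch_url(accession: str, db: str = "nuccore") -> str:
    """Build an efetch URL asking for one record as plain-text FASTA."""
    params = {"db": db, "id": accession, "rettype": "fasta", "retmode": "text"}
    return EFETCH_URL + "?" + urlencode(params)


def fetch_fasta(accession: str) -> str:
    """Download the FASTA text for an accession (requires network access)."""
    with urlopen(build_efetch_url(accession)) as response:
        return response.read().decode("utf-8")


def parse_fasta(text: str):
    """Parse FASTA text into a list of (header, sequence) tuples."""
    records, header, chunks = [], None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


if __name__ == "__main__":
    print(build_efetch_url("U49845"))
    print(parse_fasta(">demo record\nACGT\nTTAA\n"))
```

Calling `parse_fasta(fetch_fasta("U49845"))` would combine the two steps. A retrieved sequence can then be used as a BLAST query through the NCBI website.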


The European Molecular Biology Laboratory (EMBL) Nucleotide Sequence Database is a comprehensive collection of primary nucleotide sequences maintained at the European Bioinformatics Institute (EBI). It covers DNA and RNA sequences collected from the scientific literature and patent applications, together with data submitted directly by genome sequencing centres, research groups, individual scientists and patent offices.


UniProt is a freely accessible database of protein sequence and functional information, with many entries derived from genome sequencing projects. It contains a large amount of information about the biological function of proteins, drawn from the research literature.

Before the UniProt consortium was formed, EBI and SIB together produced the Swiss-Prot and TrEMBL databases, while PIR produced the Protein Sequence Database (PIR-PSD). These databases coexisted with differing protein sequence coverage and annotation priorities.

Also read: WIPO Sequence Listings Standards for PCT Applications

The Function of Sequence Analysis in Bioinformatics:

Sequence analysis plays a vital role in the field of bioinformatics. The ultimate goal of aligning biological sequences is to determine how similar they are. The applications of sequence alignment are as follows:

  • Gene finding: Also called gene prediction, this procedure locates the regions of a genome that encode genes. It covers both protein-coding genes and RNA genes.
  • Function prediction: Sequences are aligned to determine whether two genes are similar. If they are, and the function of the first gene is known, the same function can be tentatively assigned to the second.
  • Genome sequence assembly: De novo assembly is a very important task in bioinformatics. It relies on alignments between the short DNA sequences (reads) produced by next-generation sequencers.
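As a rough illustration of the first application, the sketch below scans a DNA sequence for open reading frames (ORFs): stretches that begin with the start codon ATG and run in-frame to a stop codon. This is a deliberately naive stand-in for gene finding. It looks only at the three forward reading frames, ignores the reverse strand and introns, and the name `find_orfs` and the `min_codons` threshold are assumptions made for this example. Real gene prediction tools use far richer statistical models.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna: str, min_codons: int = 3):
    """Return (start, end, sequence) for ORFs in the three forward frames.

    start and end are 0-based, end-exclusive offsets into the input;
    min_codons counts every codon in the ORF, including ATG and the stop.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        # Step through the sequence one codon at a time in this frame.
        for pos in range(frame, len(dna) - 2, 3):
            codon = dna[pos:pos + 3]
            if start is None and codon == "ATG":
                start = pos
            elif start is not None and codon in STOP_CODONS:
                end = pos + 3
                if (end - start) // 3 >= min_codons:
                    orfs.append((start, end, dna[start:end]))
                start = None
    return orfs


if __name__ == "__main__":
    # One ORF in the third frame: ATG AAA TTT TAG
    print(find_orfs("CCATGAAATTTTAGGG"))  # [(2, 14, 'ATGAAATTTTAG')]
```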

Get Sequence Listing prepared with The Sequence Listing Company

If you have read this far, you know how complex sequence analysis can get. Looking for points of similarity in a sequence millions of bases long is as overwhelming and tedious as it sounds. It is therefore better to find an expert with years of experience preparing sequence listings.

If you are looking for one such expert, ‘The Sequence Listing Company’ is at your service. We have years of experience preparing sequence listings as per the Patent Office Standards (USPTO and WIPO ST.25). We offer a quick turnaround time to ensure timely delivery of services, and we keep our rates budget-friendly.

To use our service, visit The Sequence Listing Company.
