Chapter 9 of 12
NCERT Solutions

Introduction to Bioinformatics

CBSE · Class 11 · Biotechnology

NCERT Solutions for Introduction to Bioinformatics, CBSE Class 11 Biotechnology.

11 Questions Solved · 1 Section

EXERCISES

1. Name the two modalities of analysis following sequencing.
Answer:

The two modalities of analysis following sequencing are:

1. De novo assembly – In this approach, the sequenced reads are assembled without the use of a reference genome. It is used when no reference genome is available for the organism under study.

2. Reference-guided (Genome-guided) assembly/mapping – In this approach, the sequenced reads are aligned or mapped to an already available reference genome. It is used when a well-annotated reference genome exists for the organism.
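
The two modalities can be contrasted with a toy example. The following is a minimal sketch, not how production assemblers or aligners work: the sequences, the exact-match mapping, and the helper names (`map_read`, `overlap`, `merge_reads`) are all assumptions made for illustration.

```python
def map_read(reference: str, read: str) -> int:
    """Reference-guided: return the 0-based position of an exact match, or -1."""
    return reference.find(read)


def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `a` that equals a prefix of `b`."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def merge_reads(a: str, b: str, min_len: int = 3):
    """De novo: join two reads through their overlap, with no reference used."""
    n = overlap(a, b, min_len)
    return a + b[n:] if n else None


# Hypothetical toy data
reference = "ATGCGTACGTTAGC"
reads = ["ATGCGTAC", "GTACGTTAGC"]

positions = [map_read(reference, r) for r in reads]  # needs a reference
contig = merge_reads(reads[0], reads[1])             # needs only the reads

print(positions)
print(contig)
```

In the reference-guided case each read is placed on a known sequence; in the de novo case the reads themselves are stitched into a longer contig.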
2. Name any three major types of variants.
Answer:

Three major types of variants are:

1. Single Nucleotide Polymorphisms (SNPs) – A variation at a single nucleotide position in the genome where one nucleotide is substituted by another (e.g., A→G).

2. Insertions and Deletions (InDels) – Small insertions or deletions of one or more nucleotide bases in the DNA sequence.

3. Copy Number Variations (CNVs) – Variations in the number of copies of a particular segment of the genome; a segment may be duplicated or deleted, leading to more or fewer copies than normal.

*(Other acceptable answers include structural variants such as inversions and translocations.)*
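
The variant types above can be told apart by comparing the reference allele with the observed allele. The following is a minimal sketch under simplifying assumptions: the function names and the default expected copy number of 2 (a diploid genome) are hypothetical choices for illustration, and real variant callers use far richer evidence.

```python
def classify_variant(ref: str, alt: str) -> str:
    """Classify a small variant from its reference and alternate alleles."""
    if len(ref) == 1 and len(alt) == 1:
        return "SNP" if ref != alt else "no change"
    if len(alt) > len(ref):
        return "insertion"
    if len(alt) < len(ref):
        return "deletion"
    return "multi-nucleotide substitution"


def classify_copy_number(copies: int, expected: int = 2) -> str:
    """Classify a segment by its copy number relative to the expected count."""
    if copies > expected:
        return "copy number gain"
    if copies < expected:
        return "copy number loss"
    return "normal copy number"


print(classify_variant("A", "G"))     # single base substituted
print(classify_variant("A", "ATG"))   # bases added
print(classify_variant("ATG", "A"))   # bases removed
print(classify_copy_number(4))        # segment duplicated
```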
3. What are disease-specific variants termed?
(a) somatic
(b) germline
Correct Option: (a) Somatic

Justification:

Disease-specific variants are termed somatic variants (somatic mutations). They arise in the non-reproductive (somatic) cells of an individual during their lifetime, are associated with diseases such as cancer, and are not passed on to offspring.

Germline variants, on the other hand, are present in the egg or sperm cells, so they are inherited by offspring and occur in every cell of the body. They are therefore not specific to the diseased tissue.
4. Which is the preferred tool for transcriptome assembly, in the de novo and genome-guided modalities?
(a) Tophat2
(b) Trinity
Correct Option: (b) Trinity

Justification:

Trinity is the preferred tool for transcriptome assembly in both the de novo and genome-guided modalities. It is specifically designed for the reconstruction of full-length transcripts from RNA-Seq data. Trinity can work without a reference genome (de novo) as well as with a reference genome (genome-guided mode).

TopHat2, in contrast, is a splice-aware read aligner that maps RNA-Seq reads to a reference genome; it does not assemble transcripts itself.
5. What is the difference between BLAT and BLAST?
Concept: Both BLAT (BLAST-Like Alignment Tool) and BLAST (Basic Local Alignment Search Tool) are used for sequence similarity searches, but they differ in speed, approach, and application.

| Feature | BLAST | BLAT |
|---|---|---|
| Full Form | Basic Local Alignment Search Tool | BLAST-Like Alignment Tool |
| Speed | Relatively slower | Much faster than BLAST |
| Database | Searches against a database of sequences | Searches against a pre-indexed genome |
| Best suited for | Searching protein/nucleotide databases (e.g., GenBank) | Aligning sequences to a large genome (e.g., human genome) |
| Sensitivity | High sensitivity, even for distantly related sequences | Less sensitive for highly divergent sequences; best for highly similar sequences |
| Use case | Homology searches across species | Rapid mapping of ESTs, mRNAs, or other closely matching sequences to a genome |

In summary: BLAST is more sensitive and is used for database searches across diverse sequences, while BLAT is faster and is preferred for aligning highly similar nucleotide sequences (about 95% identity or more) to a genome.
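
The speed of BLAT comes largely from looking up short words (k-mers) in an index of the genome that is built once in advance. The following is a minimal sketch of that idea only, assuming a toy genome, a word size of 4, and hypothetical function names; it finds exact seed matches and does not perform the extension and scoring steps of the real tools.

```python
from collections import defaultdict


def build_index(genome: str, k: int) -> dict:
    """Index non-overlapping k-mers of the genome by their start position."""
    index = defaultdict(list)
    for pos in range(0, len(genome) - k + 1, k):
        index[genome[pos:pos + k]].append(pos)
    return index


def seed_hits(index: dict, query: str, k: int) -> list:
    """Look up every overlapping k-mer of the query in the genome index."""
    hits = []
    for qpos in range(len(query) - k + 1):
        for gpos in index.get(query[qpos:qpos + k], []):
            hits.append((qpos, gpos))
    return hits


# Hypothetical toy data
genome = "ACGTTGCAAGGCTTAC"
query = "TTGCAAGGC"

index = build_index(genome, k=4)      # done once per genome
hits = seed_hits(index, query, k=4)   # fast dictionary lookups per query
print(hits)
```

Hits that share the same offset (genome position minus query position) point to one candidate region, which a real tool would then align in full.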
6. What came first? Structural Bioinformatics or Genome informatics?
Answer: Structural Bioinformatics came first.

Explanation:

- Structural Bioinformatics has its roots in the early work on protein structure determination and analysis. The Protein Data Bank (PDB), which stores 3D structural data of biological macromolecules, was established in 1971. Early computational methods to analyse and predict protein structures predate the genomics era.

- Genome Informatics (the computational analysis of genome sequences) gained prominence after the development of DNA sequencing techniques (Sanger sequencing, 1977) and especially with large-scale genome sequencing projects such as the Human Genome Project (initiated in 1990).

Therefore, Structural Bioinformatics preceded Genome Informatics, as the analysis of macromolecular structures began before large-scale genome sequencing became feasible.
7. Name any two of the major classes of biological macromolecules.
Answer:

Two of the major classes of biological macromolecules are:

1. Nucleic Acids – These include DNA (Deoxyribonucleic acid) and RNA (Ribonucleic acid). They store and transmit genetic information and play a central role in protein synthesis.

2. Proteins – These are polymers of amino acids and perform a vast array of functions in the cell, including catalysis (enzymes), structural support, transport, and signalling.

*(Other acceptable answers: Carbohydrates and Lipids.)*
8. DNA sequences can be represented by which of the following data formats?
(a) FASTQ
(b) FASTA
(c) AB1
(d) All of the above
Correct Option: (d) All of the above

Justification:

DNA sequences can be represented in all three formats:

- (a) FASTQ – A text-based format that stores both the nucleotide sequence and its corresponding quality scores. It is the standard output format from Next Generation Sequencing (NGS) platforms.

- (b) FASTA – A simple text-based format that represents nucleotide or protein sequences using single-letter codes. It begins with a '>' header line followed by the sequence.

- (c) AB1 – A binary file format produced by Sanger sequencing instruments (e.g., Applied Biosystems). It contains the raw chromatogram data along with the called DNA sequence.

Hence, all of the above formats can be used to represent DNA sequences.
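
The two text formats can be read with a few lines of code. The following is a minimal sketch, assuming hypothetical record names and the Phred+33 quality encoding (each quality character's ASCII code minus 33 gives the score); AB1 is a binary format and is not handled here.

```python
def parse_fasta(text: str) -> dict:
    """Return {name: sequence} from FASTA text ('>' header, then sequence)."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]
            records[name] = ""
        elif name is not None:
            records[name] += line
    return records


def parse_fastq_record(lines: list) -> tuple:
    """Parse one 4-line FASTQ record into (name, sequence, quality scores)."""
    header, seq, plus, qual = lines
    if not header.startswith("@") or not plus.startswith("+"):
        raise ValueError("not a FASTQ record")
    if len(seq) != len(qual):
        raise ValueError("sequence and quality lengths differ")
    scores = [ord(c) - 33 for c in qual]  # assumes Phred+33 encoding
    return header[1:], seq, scores


# Hypothetical toy records
fasta_text = ">seq1 example\nATGC\nGTA\n>seq2\nTTAA\n"
fastq_lines = ["@read1", "ATGC", "+", "II5!"]

fasta_records = parse_fasta(fasta_text)
read_name, read_seq, read_scores = parse_fastq_record(fastq_lines)
print(fasta_records)
print(read_name, read_seq, read_scores)
```

FASTA keeps only the sequence, while FASTQ additionally carries one quality score per base.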
9. Can a phylogeny be produced directly from a multi-fasta file? Justify your answer.
Answer: No, a phylogeny cannot be produced directly from a multi-fasta file.

Justification:

A multi-FASTA file contains multiple sequences in FASTA format (each with a '>' header and its sequence), but these sequences are not aligned to each other.

To construct a phylogenetic tree (phylogeny), the following steps are required:

Step 1 – Multiple Sequence Alignment (MSA): The sequences in the multi-FASTA file must first be aligned using tools such as ClustalW, MUSCLE, or MAFFT. This alignment identifies homologous positions across all sequences.

Step 2 – Phylogenetic Analysis: Only after alignment can a phylogenetic tree be constructed, using tools like MEGA, PhyML, or RAxML and methods such as:
- Neighbour-Joining (NJ)
- Maximum Likelihood (ML)
- Bayesian Inference

Conclusion: A multi-FASTA file is the raw input, but its sequences must undergo multiple sequence alignment before a phylogeny can be generated. Therefore, a phylogeny cannot be produced directly from an unaligned multi-FASTA file.
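
The need for alignment shows up as soon as distances between sequences are computed. The following is a minimal sketch, assuming hypothetical aligned sequences and a simple p-distance (the fraction of compared columns that differ); distance-based methods such as Neighbour-Joining start from a matrix of values like these, and real tools use more refined models.

```python
def p_distance(a: str, b: str) -> float:
    """Fraction of differing columns between two ALIGNED sequences.

    Columns where either sequence has a gap ('-') are skipped.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for x, y in zip(a, b):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x != y:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return differences / compared


# Hypothetical output of a multiple sequence alignment
aligned = {
    "s1": "ATGC-TA",
    "s2": "ATGCGTA",
    "s3": "ATCCGTT",
}

names = sorted(aligned)
matrix = {
    (m, n): p_distance(aligned[m], aligned[n])
    for m in names for n in names if m < n
}
print(matrix)
```

Without the gap inserted by the alignment, `s1` would be one base shorter than the others, its columns would no longer correspond to theirs, and the function would reject the input.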
10. Which tool can help you visualise variants in a circular manner?
(a) UCSC Genome Browser
(b) CIRCOS
(c) IGV
Correct Option: (b) CIRCOS

Justification:

CIRCOS is a software tool specifically designed to visualise genomic data in a circular layout. It is widely used to display chromosomal relationships, structural variants, copy number variations, and other genomic features in a circular, visually informative manner.

- UCSC Genome Browser displays genomic data in a linear track-based format.
- IGV (Integrative Genomics Viewer) also displays genomic data in a linear format and is used to visualise read alignments and variants along a reference genome.

Hence, CIRCOS is the correct tool for circular visualisation of variants.
11. Why do we need to sequence nucleic acids? What can one gain by understanding the sequence of nucleic acids?
Answer:

Why do we need to sequence nucleic acids?

Nucleic acids (DNA and RNA) carry the fundamental genetic information of all living organisms. Sequencing nucleic acids allows us to read this information at the molecular level. The need to sequence nucleic acids arises because:

1. The sequence of DNA determines the sequence of proteins, which in turn determines the structure and function of cells and organisms.
2. Variations in nucleic acid sequences are responsible for genetic diseases, evolutionary differences, and individual traits.
3. Understanding gene expression (via RNA sequencing) helps us know which genes are active under specific conditions.
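
The first two points can be made concrete by translating a DNA sequence into protein and then changing one base. The following is a minimal sketch, assuming the standard genetic code, a toy coding sequence, and a hypothetical `translate` helper that reads the coding strand directly.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate a coding DNA sequence, stopping at the first stop codon."""
    protein = []
    for pos in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[pos:pos + 3]]
        if amino_acid == "*":  # stop codon
            break
        protein.append(amino_acid)
    return "".join(protein)


# Hypothetical toy sequences differing by a single base (A -> T)
normal = "ATGGAGTGA"
variant = "ATGGTGTGA"

print(translate(normal))   # codon GAG encodes glutamic acid (E)
print(translate(variant))  # codon GTG encodes valine (V)
```

A single-base change from GAG to GTG replaces glutamic acid with valine; the substitution underlying sickle cell anaemia is of this kind.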

What can one gain by understanding the sequence of nucleic acids?

By understanding the sequence of nucleic acids, one can gain the following:

1. Gene identification: Locate and identify genes within a genome, including their regulatory regions.

2. Understanding genetic diseases: Identify mutations, SNPs, InDels, and other variants responsible for hereditary or acquired diseases (e.g., cancer, sickle cell anaemia).

3. Evolutionary studies: Compare sequences across species to understand evolutionary relationships and construct phylogenetic trees.

4. Drug and vaccine development: Identify target genes/proteins in pathogens for the development of new drugs, vaccines, and diagnostic tools.

5. Transcriptomics: RNA sequencing reveals gene expression patterns, helping understand how organisms respond to environmental changes, diseases, or developmental stages.

6. Personalised medicine: Understanding an individual's genomic sequence enables tailored medical treatments based on their genetic makeup.

7. Forensic science: DNA sequencing is used in forensic identification and paternity testing.

In summary, sequencing nucleic acids is fundamental to modern biology and medicine, providing insights into the blueprint of life, disease mechanisms, and evolutionary history.
