Introduction to Bioinformatics
CBSE · Class 11 · Biotechnology
NCERT Solutions for Introduction to Bioinformatics — CBSE Class 11 Biotechnology.
EXERCISES
1. Name the two modalities of analysis following sequencing.
Answer:
The two modalities of analysis following sequencing are:
1. De novo assembly – In this approach, the sequenced reads are assembled without the use of a reference genome. It is used when no reference genome is available for the organism under study.
2. Reference-guided (Genome-guided) assembly/mapping – In this approach, the sequenced reads are aligned or mapped to an already available reference genome. It is used when a well-annotated reference genome exists for the organism.
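To make the contrast concrete, here is a toy Python sketch. It is not how real assemblers or aligners work internally (they use graph structures and indexed, error-tolerant search). The function names `map_reads` and `merge_overlap`, the sequences, and the `min_overlap` value are all made up for illustration, and the reads are assumed to be error-free and already in left-to-right order.

```python
def map_reads(reference, reads):
    """Reference-guided: report where each read matches a known reference exactly."""
    return {read: reference.find(read) for read in reads}  # -1 means "not found"


def merge_overlap(left, right, min_overlap=3):
    """De novo building block: join two reads when a suffix of `left` equals a prefix of `right`."""
    for k in range(min(len(left), len(right)), min_overlap - 1, -1):
        if left.endswith(right[:k]):
            return left + right[k:]
    return None  # no sufficiently long overlap


# Hypothetical data: three overlapping reads drawn from a short reference.
reference = "ATGCGTACGTTAGC"
reads = ["ATGCGTAC", "GTACGTTA", "GTTAGC"]

# Reference-guided: each read is placed on the existing reference.
positions = map_reads(reference, reads)

# De novo: the reads are stitched together using only their overlaps.
contig = reads[0]
for read in reads[1:]:
    contig = merge_overlap(contig, read)

print(positions)
print(contig)
```

In this toy case the de novo contig happens to reproduce the reference, but note that the second approach never looks at `reference` at all.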
2. Name any three major types of variants.
Answer:
Three major types of variants are:
1. Single Nucleotide Polymorphisms (SNPs) – A variation at a single nucleotide position in the genome where one nucleotide is substituted by another (e.g., A→G).
2. Insertions and Deletions (InDels) – Small insertions or deletions of one or more nucleotide bases in the DNA sequence.
3. Copy Number Variations (CNVs) – Variations in the number of copies of a particular segment of the genome; a segment may be duplicated or deleted, leading to more or fewer copies than normal.
*(Other acceptable answers include structural variants and inversions.)*
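As a rough illustration, small variants can be told apart by comparing the reference allele with the alternate allele. The sketch below assumes alleles are given as plain strings, in the style of the REF and ALT columns of a variant file; the function name `classify_variant` is hypothetical. CNVs are not covered, because detecting them usually relies on read depth rather than on a single pair of alleles.

```python
def classify_variant(ref, alt):
    """Classify a small variant from its reference and alternate alleles."""
    if len(ref) == 1 and len(alt) == 1:
        return "SNP" if ref != alt else "no change"
    if len(alt) > len(ref):
        return "insertion"
    if len(alt) < len(ref):
        return "deletion"
    return "other substitution"  # same length, more than one base


# Hypothetical example alleles.
for ref, alt in [("A", "G"), ("A", "ATG"), ("ATG", "A")]:
    print(ref, alt, classify_variant(ref, alt))
```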
3. What are disease-specific variants termed?
(a) somatic
(b) germline
Answer: (a) somatic
Justification:
Disease-specific variants are termed somatic variants (or somatic mutations). They arise in non-reproductive (somatic) cells during an individual's lifetime, are not inherited by offspring, and are associated with acquired diseases such as cancer.
Germline variants, in contrast, are present in the egg or sperm cells and are passed on from parents to offspring. They are therefore not specific to the diseased tissue.
4. Which is the preferred tool for transcriptome assembly, in the de novo and genome-guided modalities?
(a) TopHat2
(b) Trinity
Answer: (b) Trinity
Justification:
Trinity is the preferred tool for transcriptome assembly in both the de novo and genome-guided modalities. It is designed to reconstruct full-length transcripts from RNA-Seq data, and it can run without a reference genome (de novo mode) as well as with one (genome-guided mode).
TopHat2, in contrast, is primarily a read-alignment tool used to map RNA-Seq reads to a reference genome; it is not a transcriptome assembler.
5. What is the difference between BLAT and BLAST?
Concept: Both BLAT (BLAST-Like Alignment Tool) and BLAST (Basic Local Alignment Search Tool) are used for sequence similarity searches, but they differ in speed, approach, and application.
| Feature | BLAST | BLAT |
|---|---|---|
| Full Form | Basic Local Alignment Search Tool | BLAST-Like Alignment Tool |
| Speed | Relatively slower | Much faster than BLAST |
| Database | Searches against a database of sequences | Searches against a pre-indexed genome |
| Best suited for | Searching protein/nucleotide databases (e.g., GenBank) | Aligning sequences to a large genome (e.g., human genome) |
| Sensitivity | High sensitivity, even for distantly related sequences | Less sensitive for highly divergent sequences; best for highly similar sequences |
| Use case | Homology searches across species | Rapid mapping of ESTs, mRNA, or short reads to a genome |
In summary: BLAST is more sensitive and is used for database searches across diverse sequences, while BLAT is faster and is preferred for aligning highly similar sequences (≥95% identity) to a genome.
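Much of BLAT's speed comes from searching an index of the genome that is built once, in advance. The Python sketch below shows that idea in miniature: it indexes non-overlapping k-mers of a genome, then looks up every overlapping k-mer of a query to find exact "seed" matches. It is a simplified illustration and not BLAT's actual code; the names `build_index` and `find_seeds`, the sequences, and `k = 4` are assumptions chosen for the example.

```python
from collections import defaultdict


def build_index(genome, k):
    """Index the start positions of non-overlapping k-mers in the genome."""
    index = defaultdict(list)
    for start in range(0, len(genome) - k + 1, k):
        index[genome[start:start + k]].append(start)
    return index


def find_seeds(query, index, k):
    """Return (query_offset, genome_offset) pairs where a query k-mer hits the index."""
    seeds = []
    for q in range(len(query) - k + 1):
        for g in index.get(query[q:q + k], []):
            seeds.append((q, g))
    return seeds


# Hypothetical toy data.
genome = "ACGTTGCAAGCTTAGGCT"
query = "GCAAGCTTAG"
k = 4

index = build_index(genome, k)  # built once, reused for every query
seeds = find_seeds(query, index, k)
print(seeds)
```

A real tool would then extend each seed into a full alignment. Because only exact k-mer hits are found, this style of search works best when the query is highly similar to the genome, which matches the sensitivity row in the table above.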
6. What came first? Structural Bioinformatics or Genome informatics?
Explanation:
- Structural Bioinformatics has its roots in the early work on protein structure determination and analysis. The Protein Data Bank (PDB), which stores 3D structural data of biological macromolecules, was established in 1971. Early computational methods to analyse and predict protein structures predate the genomics era.
- Genome Informatics (Genomics/Bioinformatics focused on genome sequences) gained prominence after the development of DNA sequencing techniques (Sanger sequencing, 1977) and especially with large-scale genome sequencing projects such as the Human Genome Project (initiated in 1990).
Therefore, Structural Bioinformatics preceded Genome Informatics, as the analysis of macromolecular structures began before large-scale genome sequencing became feasible.
7. Name any two of the major classes of biological macromolecules.
Two of the major classes of biological macromolecules are:
1. Nucleic Acids – These include DNA (Deoxyribonucleic acid) and RNA (Ribonucleic acid). They store and transmit genetic information and play a central role in protein synthesis.
2. Proteins – These are polymers of amino acids and perform a vast array of functions in the cell, including catalysis (enzymes), structural support, transport, and signalling.
*(Other acceptable answers: Carbohydrates and Lipids.)*
8. DNA sequences can be represented by which of the following data formats?
(a) FASTQ
(b) FASTA
(c) AB1
(d) All of the above
Answer: (d) All of the above
Justification:
DNA sequences can be represented in all three formats:
- (a) FASTQ – A text-based format that stores both the nucleotide sequence and its corresponding quality scores. It is the standard output format from Next Generation Sequencing (NGS) platforms.
- (b) FASTA – A simple text-based format that represents nucleotide or protein sequences using single-letter codes. It begins with a '>' header line followed by the sequence.
- (c) AB1 – A binary file format produced by Sanger sequencing instruments (e.g., Applied Biosystems). It contains the raw chromatogram data along with the called DNA sequence.
Hence, all of the above formats can be used to represent DNA sequences.
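The two text-based formats are simple enough to read with a few lines of Python. The sketch below is a minimal illustration, not a replacement for an established parser: the function names and sample records are hypothetical, FASTQ records are assumed to span exactly four lines, and quality characters are assumed to use the common Phred+33 encoding. AB1 is binary and is not handled here.

```python
def parse_fasta(text):
    """Return {header: sequence} from FASTA text ('>' header, then sequence lines)."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:]
            records[name] = ""
        elif name is not None:
            records[name] += line  # sequences may be wrapped over several lines
    return records


def parse_fastq(text):
    """Return (name, sequence, quality_scores) tuples from 4-line FASTQ records."""
    lines = [l.strip() for l in text.splitlines() if l.strip()]
    records = []
    for i in range(0, len(lines) - 3, 4):
        header, seq, _plus, qual = lines[i:i + 4]
        scores = [ord(c) - 33 for c in qual]  # assumes Phred+33 encoding
        records.append((header[1:], seq, scores))
    return records


# Hypothetical sample records.
fasta_text = ">seq1\nACGT\nTTGA\n>seq2\nGGCC\n"
fastq_text = "@read1\nACGT\n+\nII5!\n"

fasta_records = parse_fasta(fasta_text)
fastq_records = parse_fastq(fastq_text)
print(fasta_records)
print(fastq_records)
```

The FASTQ record carries one quality score per base, which is the extra information FASTA lacks.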
9. Can a phylogeny be produced directly from a multi-FASTA file? Justify your answer.
Justification:
A multi-FASTA file contains multiple sequences in FASTA format (each with a '>' header and its sequence), but these sequences are not aligned to each other.
To construct a phylogenetic tree (phylogeny), the following steps are required:
Step 1 – Multiple Sequence Alignment (MSA): The sequences in the multi-FASTA file must first be aligned using tools such as ClustalW, MUSCLE, or MAFFT. This alignment identifies homologous positions across all sequences.
Step 2 – Phylogenetic Analysis: Only after alignment can a phylogenetic tree be constructed, using methods such as:
- Neighbour-Joining (NJ)
- Maximum Likelihood (ML)
- Bayesian Inference
Tools such as MEGA, PhyML, and RAxML are commonly used for this step.
Conclusion: A multi-FASTA file is the raw input, but its sequences must be aligned before a phylogeny can be generated. Therefore, a phylogeny cannot be produced directly from an unaligned multi-FASTA file.
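One way to see why alignment must come first: distance-based methods such as Neighbour-Joining start from pairwise distances, and these are computed column by column. The Python sketch below computes a simple p-distance (the fraction of differing positions) and refuses sequences that are not of equal, aligned length. The function name `p_distance` and the three aligned sequences are made up for illustration, and columns containing a gap are skipped as a simplifying assumption.

```python
from itertools import combinations


def p_distance(a, b):
    """Fraction of differing columns between two aligned sequences (gap columns skipped)."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for x, y in zip(a, b):
        if x == "-" or y == "-":
            continue  # ignore columns with a gap in either sequence
        compared += 1
        if x != y:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return differences / compared


# Hypothetical alignment, as it might look after Step 1.
aligned = {
    "s1": "ACGT-ACGT",
    "s2": "ACGTTACGA",
    "s3": "ACCT-ACGA",
}

distances = {
    (m, n): p_distance(aligned[m], aligned[n])
    for m, n in combinations(aligned, 2)
}
print(distances)
```

A distance matrix like this is the kind of input a tree-building method works from. Unaligned sequences of different lengths would raise an error at the first check.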
10. Which tool can help you visualise variants in a circular manner?
(a) UCSC Genome Browser
(b) CIRCOS
(c) IGV
Answer: (b) CIRCOS
Justification:
CIRCOS is a software tool specifically designed to visualise genomic data in a circular layout. It is widely used to display chromosomal relationships, structural variants, copy number variations, and other genomic features in a circular, visually informative manner.
- UCSC Genome Browser displays genomic data in a linear track-based format.
- IGV (Integrative Genomics Viewer) also displays genomic data in a linear format and is used to visualise read alignments and variants along a reference genome.
Hence, CIRCOS is the correct tool for circular visualisation of variants.
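The core idea of a circular layout can be sketched in Python: chromosomes are placed end to end around a circle, so every genomic position corresponds to an angle. This is only an illustration of the geometry and does not use CIRCOS or its configuration format. The function name `position_to_angle` and the chromosome names and lengths are hypothetical, and no gaps are left between chromosomes.

```python
def position_to_angle(chrom, pos, chrom_lengths):
    """Convert a (chromosome, position) pair to an angle in degrees on a circle."""
    total = sum(chrom_lengths.values())
    offset = 0
    for name, length in chrom_lengths.items():
        if name == chrom:
            return 360.0 * (offset + pos) / total
        offset += length  # chromosomes are laid out one after another
    raise KeyError(chrom)


# Hypothetical genome with two chromosomes.
chrom_lengths = {"chrA": 100, "chrB": 300}

angle_a = position_to_angle("chrA", 50, chrom_lengths)
angle_b = position_to_angle("chrB", 100, chrom_lengths)
print(angle_a, angle_b)
```

In a linear browser the same variants would simply be plotted along a horizontal axis for one region at a time.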
11. Why do we need to sequence nucleic acids? What can one gain by understanding the sequence of nucleic acids?
Why do we need to sequence nucleic acids?
Nucleic acids (DNA and RNA) carry the fundamental genetic information of all living organisms. Sequencing nucleic acids allows us to read this information at the molecular level. The need to sequence nucleic acids arises because:
1. The sequence of DNA determines the sequence of proteins, which in turn determines the structure and function of cells and organisms.
2. Variations in nucleic acid sequences are responsible for genetic diseases, evolutionary differences, and individual traits.
3. Understanding gene expression (via RNA sequencing) helps us know which genes are active under specific conditions.
What can one gain by understanding the sequence of nucleic acids?
By understanding the sequence of nucleic acids, one can gain the following:
1. Gene identification: Locate and identify genes within a genome, including their regulatory regions.
2. Understanding genetic diseases: Identify mutations, SNPs, InDels, and other variants responsible for hereditary or acquired diseases (e.g., cancer, sickle cell anaemia).
3. Evolutionary studies: Compare sequences across species to understand evolutionary relationships and construct phylogenetic trees.
4. Drug and vaccine development: Identify target genes/proteins in pathogens for the development of new drugs, vaccines, and diagnostic tools.
5. Transcriptomics: RNA sequencing reveals gene expression patterns, helping understand how organisms respond to environmental changes, diseases, or developmental stages.
6. Personalised medicine: Understanding an individual's genomic sequence enables tailored medical treatments based on their genetic makeup.
7. Forensic science: DNA sequencing is used in forensic identification and paternity testing.
In summary, sequencing nucleic acids is fundamental to modern biology and medicine, providing insights into the blueprint of life, disease mechanisms, and evolutionary history.