BINC BioInformatics Syllabus - Advanced

Sequence analysis

Scoring matrices: Detailed method of derivation of the PAM and BLOSUM matrices

Pairwise sequence alignments: Needleman-Wunsch and Smith-Waterman algorithms and their implementation
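
As an illustration of this topic, the following is a minimal sketch of the dynamic-programming recurrences behind the two algorithms. It computes alignment scores only (no traceback), and it assumes a simple match/mismatch score with a linear gap penalty rather than a PAM or BLOSUM matrix with affine gaps. The function names and default parameter values are illustrative choices, not part of the syllabus.

```python
def _substitution(x, y, match, mismatch):
    """Score for aligning residue x with residue y (assumed simple scheme)."""
    return match if x == y else mismatch


def needleman_wunsch_score(a, b, match=1, mismatch=-1, gap=-1):
    """Global alignment score of sequences a and b (linear gap penalty)."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    # Aligning a prefix against the empty sequence costs one gap per residue.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + _substitution(a[i - 1], b[j - 1], match, mismatch)
            up = score[i - 1][j] + gap      # gap in b
            left = score[i][j - 1] + gap    # gap in a
            score[i][j] = max(diag, up, left)
    return score[-1][-1]


def smith_waterman_score(a, b, match=1, mismatch=-1, gap=-1):
    """Best local alignment score of sequences a and b (linear gap penalty)."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + _substitution(a[i - 1], b[j - 1], match, mismatch)
            up = score[i - 1][j] + gap
            left = score[i][j - 1] + gap
            # Local alignment: a cell never drops below zero, so an
            # alignment can restart anywhere in the matrix.
            score[i][j] = max(0, diag, up, left)
            best = max(best, score[i][j])
    return best
```

The only differences between the two functions are the initialization of the first row and column and the floor of zero in the local version; recovering the alignment itself would additionally require storing traceback pointers.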

Multiple sequence alignments (MSA): Use of HMM-based Algorithm for MSA (e.g. SAM method)

Taxonomy and phylogeny: Phylogenetic analysis algorithms such as Maximum Parsimony, UPGMA, Transformed Distance, Neighbors-Relation and Neighbor-Joining; probabilistic models of evolution and the associated maximum likelihood algorithm; bootstrapping methods; use of tools such as PHYLIP, MEGA and PAUP
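
As an illustration of the distance-based methods listed above, the following is a minimal sketch of UPGMA clustering. It returns only the tree topology as nested tuples (branch lengths are omitted), and it assumes a symmetric distance matrix supplied as a list of lists. The function name and input layout are illustrative assumptions, not the interface of PHYLIP, MEGA or PAUP.

```python
def upgma(labels, matrix):
    """Build a UPGMA tree topology from a symmetric distance matrix.

    labels: list of taxon names; matrix: list of lists of pairwise distances.
    Returns nested tuples, e.g. ("C", ("A", "B")).
    """
    n = len(labels)
    # Each cluster id maps to (subtree, number of leaves in the subtree).
    clusters = {i: (label, 1) for i, label in enumerate(labels)}
    # Pairwise distances keyed by (smaller id, larger id).
    dist = {(i, j): matrix[i][j] for i in range(n) for j in range(i + 1, n)}
    next_id = n
    while len(clusters) > 1:
        # Merge the closest pair of clusters.
        i, j = min(dist, key=dist.get)
        tree_i, size_i = clusters.pop(i)
        tree_j, size_j = clusters.pop(j)
        # Distance from the merged cluster to every other cluster is the
        # size-weighted average of the two old distances.
        new_dists = {}
        for k in clusters:
            d_ik = dist[(min(i, k), max(i, k))]
            d_jk = dist[(min(j, k), max(j, k))]
            new_dists[(k, next_id)] = (size_i * d_ik + size_j * d_jk) / (size_i + size_j)
        # Drop distances involving the two merged clusters.
        dist = {pair: d for pair, d in dist.items() if i not in pair and j not in pair}
        dist.update(new_dists)
        clusters[next_id] = ((tree_i, tree_j), size_i + size_j)
        next_id += 1
    tree, _ = next(iter(clusters.values()))
    return tree


example_tree = upgma(["A", "B", "C"], [[0, 2, 6], [2, 0, 6], [6, 6, 0]])
```

In the example, A and B are the closest pair and are joined first; C then joins the (A, B) cluster.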

Sequence patterns and profiles:

Algorithms for derivation of and searching sequence patterns: MEME, PHI-BLAST, ScanProsite and PRATT

Algorithms for generation of sequence profiles: Profile Analysis method of Gribskov, HMMER, PSI-BLAST

Protein and nucleic acid properties: e.g. proteomics tools at the ExPASy server, GCG utilities and EMBOSS

Structural Biology

Identification/assignment of secondary structural elements from the knowledge of the 3-D structure of a macromolecule using DSSP and STRIDE methods

Prediction of protein secondary structure: PHD and PSIPRED methods

Tertiary structure: Detailed protocols/algorithms for homology modeling, fold recognition and ab initio approaches

Structures of oligomeric proteins and study of interaction interfaces

Molecular modeling and simulations

Macromolecular force fields, solvation, long-range forces

Geometry optimization algorithms: Steepest descent, conjugate gradient
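
As an illustration of the steepest descent method, the following is a minimal sketch that repeatedly steps against the gradient with a fixed step size until the gradient norm falls below a tolerance. The toy two-variable quadratic energy, the function names and the step size are assumptions chosen for illustration; a real molecular mechanics minimizer would use a force-field gradient and typically a line search.

```python
def steepest_descent(grad, x0, step=0.1, tol=1e-8, max_iter=10000):
    """Minimize a function given its gradient, using a fixed step size."""
    x = list(x0)
    for _ in range(max_iter):
        g = grad(x)
        # Stop when the gradient norm is small enough.
        if sum(gi * gi for gi in g) ** 0.5 < tol:
            break
        # Move against the gradient (the direction of steepest descent).
        x = [xi - step * gi for xi, gi in zip(x, g)]
    return x


def toy_gradient(p):
    """Gradient of the assumed toy energy E(x, y) = (x - 1)^2 + 2 (y + 0.5)^2."""
    x, y = p
    return [2.0 * (x - 1.0), 4.0 * (y + 0.5)]


minimum = steepest_descent(toy_gradient, [0.0, 0.0])
```

For this energy the minimum lies at (1, -0.5). Conjugate gradient differs in that each new search direction also includes a contribution from the previous direction, which usually reduces the number of iterations in narrow energy valleys.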

Various simulation techniques: molecular dynamics (MD), Monte Carlo, docking strategies, etc.

Molecular mechanics, conformational searches


Large scale genome sequencing strategies

Genome assembly and annotation

Genome databases of plants, animals and pathogens


Gene networks: basic concepts, computational models such as the lambda repressor and the lac operon

Prediction of genes, promoters, splice sites, regulatory regions: basic principles, application of methods to prokaryotic and eukaryotic genomes and interpretation of results

Basic concepts in identification of disease genes; role of bioinformatics: OMIM database, reference genome sequence, integrated genomic maps, gene expression profiling; identification of SNPs, SNP database (dbSNP). Role of SNPs in pharmacogenomics, SNP arrays

Basic concepts in identification of drought stress response genes, insect resistance genes, nutrition-enhancing genes


DNA microarray: database and basic tools, Gene Expression Omnibus (GEO), ArrayExpress, SAGE databases

DNA microarray: understanding of microarray data, normalizing microarray data, detecting differential gene expression, correlation of gene expression data to biological process and computational analysis tools (especially clustering approaches)

Comparative genomics:

Basic concepts and applications, BLAST2, MegaBLAST algorithms, PipMaker, AVID, VISTA, MUMmer, applications of suffix trees in comparative genomics, synteny and gene order comparisons

Comparative genomics databases: COG, VOG

Functional genomics:

Application of sequence-based and structure-based approaches to assignment of gene functions – e.g. sequence comparison, structure analysis (especially active sites, binding sites) and comparison, pattern identification, etc. Use of various derived databases in function assignment, use of SNPs for identification of genetic traits

Gene/protein function prediction using machine learning tools, viz. neural networks, SVM, etc.


Protein arrays: basic principles

Computational methods for identification of polypeptides from mass spectrometry

Protein arrays: bioinformatics-based tools for analysis of proteomics data (Tools available at ExPASy Proteomics server); databases (such as InterPro) and analysis tools

Protein-protein interactions: databases such as DIP, PPI server and tools for analysis of protein-protein interactions

Modeling biological systems

Systems biology – Use of computers in simulation of cellular subsystems

Metabolic networks, or network of metabolites and enzymes

Metabolic pathways: databases such as KEGG, EMP

Study of plant pathways – MetaCyc, AraCyc

Signal transduction networks

Gene regulatory networks

Bioinformatics Resources at the species level

ICTV Database, AVIS, VirGen, Viral genomes at NCBI, VBRC, VBCA, PBRC and Subviral RNA database, Species 2000, TreeBASE, etc.

Drug design

Drug discovery process

Role of Bioinformatics in drug design

Target identification and validation, lead optimization and validation

Structure-based drug design and ligand-based drug design

Modeling of target-small molecule interactions

Vaccine design:

Reverse vaccinology and immunoinformatics

Databases in Immunology

B-cell epitope prediction methods

T-cell epitope prediction methods

Resources to study antibodies, antigen-antibody interactions

Structure-Activity Relationship – QSARs and QSPRs, QSAR methodology, various descriptors used in QSARs: electronic, topological and quantum-chemical descriptors. Use of Genetic Algorithms, Neural Networks and Principal Component Analysis in the QSAR equations

Bio-Informatics National Certification (BINC) 2019 Syllabus