Dr. Sumit K. Bag
Full Name: Dr. Sumit K. Bag
Designation: Sr. Principal Scientist
Address: Computational Biology Lab, Molecular Biology and Biotechnology Division CSIR-National Botanical Research Institute, Rana Pratap Marg, Lucknow-226001
Email Address: sumit.bag@nbri.res.in
Contact Number: 2297914

Research Interests:


The group has mainly been involved in the design and development of biological software using machine learning and artificial intelligence. We have also studied gene expression during plant development and under stress, and investigated the regulatory genomics of Arabidopsis thaliana during stress. Our group is also interested in the evolution of various gene families within cotton species.

Research

Research Details


Design and Development of Model based SNP pipeline in plants

A major current challenge is distinguishing false-positive SNPs from true ones in biological systems using computational tools. This motivated the development of a computational system that can predict SNPs in diploid and polyploid species, classify the potential SNPs, and provide complete biological detail for each predicted SNP. Features from the regions flanking SNP sites have not previously been reported for the classification of true SNPs. The present work is an attempt to predict SNPs efficiently in plant datasets. SNP flanking nucleotide sequences from six datasets, i.e., Arabidopsis thaliana, Secale cereale, Solanum lycopersicum, Oryza sativa, Gossypium hirsutum and Triticum aestivum, were analysed to select distinguishing patterns. The study presents a highly accurate prediction method that classifies potential SNPs using features derived solely from the DNA flanking sequences. Mono-, di-, tri- and tetranucleotide composition and binary composition were introduced to improve the prediction performance of machine learning classifiers trained on known SNP sequences. The resulting models achieved high performance, with ROC values ranging from 0.75 to 0.95 under 10-fold cross-validation. The developed model has been integrated into a pipeline with a graphical user interface for ease of use by multiple users. The resulting pipeline, PLANET-SNP, is therefore a promising tool for predicting and annotating potential SNPs within a single system, which is particularly beneficial for researchers from non-computational backgrounds.
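As an illustration of the flanking-sequence features named above, the following is a minimal Python sketch, not the PLANET-SNP implementation. It computes mono- to tetranucleotide composition and a binary (one-hot) composition for the two flanks of a candidate SNP. The function names, the choice to encode each flank separately, and the example flank length of 10 bases are assumptions made for illustration only.

```python
from itertools import product

ALPHABET = "ACGT"


def kmer_composition(seq, k):
    """Relative frequency of every possible k-mer (4**k values, fixed order)."""
    seq = seq.upper()
    kmers = ["".join(p) for p in product(ALPHABET, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    windows = 0
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if kmer in counts:  # skip windows containing N or other ambiguity codes
            counts[kmer] += 1
            windows += 1
    return [counts[m] / windows if windows else 0.0 for m in kmers]


def binary_composition(seq):
    """One-hot encode each base as four bits; unknown bases become all zeros."""
    bits = []
    for base in seq.upper():
        bits.extend(1 if base == letter else 0 for letter in ALPHABET)
    return bits


def flank_features(left_flank, right_flank, max_k=4):
    """Concatenate composition features of both flanks (hypothetical layout)."""
    features = []
    for flank in (left_flank, right_flank):
        for k in range(1, max_k + 1):
            features.extend(kmer_composition(flank, k))
        features.extend(binary_composition(flank))
    return features


if __name__ == "__main__":
    vector = flank_features("ACGTACGTAC", "TTGACCATGA")
    print(len(vector))  # 2 flanks x (4 + 16 + 64 + 256 + 4 * 10) features
```

A vector of this kind could then be passed to any standard classifier and evaluated with 10-fold cross-validation; which classifiers and window sizes the pipeline actually uses is not stated here.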


Evolutionary and conservation analysis of core promoter architecture in Gossypium hirsutum for fiber specific genes.

Gossypium hirsutum (AADD), an allotetraploid cotton, is preferred for agriculture over its diploid progenitors Gossypium arboreum (AA) and Gossypium raimondii (DD). To illuminate the domestication process of Gossypium hirsutum during the speciation event, its genome has been re-sequenced over the past few years by different independent groups, and multi-dimensional data have been generated to decipher the molecular mechanisms of fiber development. The emergence of such big data in bioinformatics, together with high-performance computing power, gives an opportunity to uncover significant biological information that was previously overlooked. In this series, we remapped publicly available mock-treated or untreated RNA-seq data to extract fiber-specific or fiber-exclusive genes through data mining. It is also important to understand the evolutionary pressure on the cis-regulatory elements of fiber-related genes relative to their diploid progenitors, in order to postulate a promoter architecture model that correlates with their innate expression.
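The criterion used to call a gene fiber-specific is not given above. One commonly used measure of tissue specificity is the tau index, so the sketch below uses it purely as an assumed stand-in: a gene is kept when its tau value is high and its peak expression is in fiber. The function names, the 0.8 cutoff, the tissue labels, and the toy expression values are all hypothetical.

```python
def tau_index(expression):
    """Tissue-specificity index in [0, 1]; 1 means expressed in a single tissue."""
    n = len(expression)
    peak = max(expression)
    if n < 2 or peak <= 0:
        return 0.0
    return sum(1 - x / peak for x in expression) / (n - 1)


def fiber_specific_genes(table, tissues, fiber="fiber", tau_cutoff=0.8):
    """Return genes with high tau whose maximum expression is in the fiber column.

    `table` maps gene IDs to expression values ordered like `tissues`.
    The cutoff is an assumed value, not one taken from the study.
    """
    fiber_idx = tissues.index(fiber)
    selected = []
    for gene, values in table.items():
        if tau_index(values) >= tau_cutoff and values[fiber_idx] == max(values):
            selected.append(gene)
    return selected


if __name__ == "__main__":
    tissues = ["fiber", "leaf", "root", "stem"]
    toy_table = {  # made-up normalised expression values
        "geneA": [50, 1, 0, 2],
        "geneB": [20, 22, 19, 21],
        "geneC": [0, 0, 30, 1],
    }
    print(fiber_specific_genes(toy_table, tissues))
```

In practice the input would be normalised expression values (for example TPM) obtained from the remapped RNA-seq reads, and the cutoff would need to be tuned to the data.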


Genome-wide identification, functional and evolutionary analysis of Histone deacetylase 2 (HD2) gene family in Gossypium species

Cotton crops are mainly affected by biotic and abiotic stresses, which cause yield losses. Whiteflies (Bemisia tabaci) are a main biotic stress factor affecting cotton growth and yield. Drought and salt are the main abiotic causes of cotton yield loss. In barley, Histone deacetylase 2 (HD2) genes are reported to be involved in biotic and abiotic stresses, and in Arabidopsis and rice HD2 genes have been shown to be involved in abiotic stresses. Our research has identified nine cotton HD2 genes in Gossypium hirsutum and Gossypium barbadense, each comprising a conserved HD2 domain. These cotton HD2 genes also show significant expression under both biotic and abiotic stress conditions. Therefore, the HD2 gene family may serve as an important target for improving stress tolerance in cotton as well.

Publications

  1. Prasad, P., U. Khatoon, R.K. Verma, S.V. Sawant, and S.K. Bag, Data mining of transcriptional biomarkers at different cotton fiber developmental stages. Functional & Integrative Genomics, 2022. 22(5): p. 989-1002.

  2. Bano, N., S. Fakhrah, R.A. Lone, C.S. Mohanty, and S.K. Bag, Genome-wide identification and expression analysis of the HD2 protein family and its response to drought and salt stress in Gossypium species. Front Plant Sci, 2023. 14: p. 1109031.

  3. Bhardwaj, A. and S.K. Bag, PLANET-SNP pipeline: PLants based ANnotation and Establishment of True SNP pipeline. Genomics, 2019. 111(5): p. 1066-1077.

  4. Bhardwaj, A., Y.V. Dhar, M.H. Asif, and S.K. Bag, In silico identification of SNP diversity in cultivated and wild tomato species: insight from molecular simulations. Scientific Reports, 2016. 6: p. 38715.

  5. Dutta, P., P. Prasad, Y. Indoilya, N. Gautam, A. Kumar, V. Sahu, M. Kumari, S. Singh, A.K. Asthana, S.K. Bag, and D. Chakrabarty, Unveiling the molecular mechanisms of arsenic tolerance and resilience in the primitive bryophyte Marchantia polymorpha L. Environ Pollut, 2024. 346: p. 123506.
Patents
Scholars
  • Dr. Aproov Tewari (Project Scientist II)

  • Ms. Bhawna Pandey (Project Associate I)
Contact

    Computational Biology Lab

    Molecular Biology and Biotechnology Division

    CSIR-National Botanical Research Institute

    Rana Pratap Marg,
    Lucknow-226001

    Email: sumit.bag@nbri.res.in

    Phone: +91-522-2297914