Coexpression and coregulation analysis of time-series gene expression data in estrogen-induced breast cancer cell
1 Institute of Bioinformatics, University Medical Center Goettingen, University of Goettingen, Goldschmidtstrasse 1, D-37077 Goettingen, Germany
2 Department of Computer Science and Engineering, University of Kalyani, Kalyani-741235, India
3 Department of Computer Science and Engineering, Jadavpur University, Kolkata-700032, India
4 Machine Intelligence Unit, Indian Statistical Institute, Kolkata-700108, India
Algorithms for Molecular Biology 2013, 8:9 doi:10.1186/1748-7188-8-9Published: 23 March 2013
Estrogen is a chemical messenger that has an influence on many breast cancers as it helps cells to grow and divide. These cancers are often known as estrogen responsive cancers in which estrogen receptor occupies the surface of the cells. The successful treatment of breast cancers requires understanding gene expression, identifying of tumor markers, acquiring knowledge of cellular pathways, etc. In this paper we introduce our proposed triclustering algorithm δ-TRIMAX that aims to find genes that are coexpressed over subset of samples across a subset of time points. Here we introduce a novel mean-squared residue for such 3D dataset. Our proposed algorithm yields triclusters that have a mean-squared residue score below a threshold δ.
We have applied our algorithm on one simulated dataset and one real-life dataset. The real-life dataset is a time-series dataset in estrogen induced breast cancer cell line. To establish the biological significance of genes belonging to resultant triclusters we have performed gene ontology, KEGG pathway and transcription factor binding site enrichment analysis. Additionally, we represent each resultant tricluster by computing its eigengene and verify whether its eigengene is also differentially expressed at early, middle and late estrogen responsive stages. We also identified hub-genes for each resultant triclusters and verified whether the hub-genes are found to be associated with breast cancer. Through our analysis CCL2, CD47, NFIB, BRD4, HPGD, CSNK1E, NPC1L1, PTEN, PTPN2 and ADAM9 are identified as hub-genes which are already known to be associated with breast cancer. The other genes that have also been identified as hub-genes might be associated with breast cancer or estrogen responsive elements. The TFBS enrichment analysis also reveals that transcription factor POU2F1 binds to the promoter region of ESR1 that encodes estrogen receptor α. Transcription factor E2F1 binds to the promoter regions of coexpressed genes MCM7, ANAPC1 and WEE1.
Thus our integrative approach provides insights into breast cancer prognosis.