Background: Chromosomal instability is a frequent hallmark of cancer genomes and can be characterised by chromosomal rearrangements and viral integration events. Although high-throughput sequencing and bioinformatics have enabled the discovery of such genomic aberrations, the potential of integrative proteogenomic analysis to reveal how these events alter protein expression and function has yet to be thoroughly examined.
Method: In this study, we analysed publicly available transcriptomic and proteomic data from 123 TCGA breast cancer samples. Transcriptomic analysis used genome-guided and de novo assembly methods to detect chromosomal instability and contamination events in RNA-seq data. This was followed by proteomic analysis, in which proteins were identified and quantified by applying a database-searching algorithm to mass-spectrometry datasets.
Results: Our study provides further insight into translocation and contamination events in the cohort. For instance, we identified the KANSL1-ARL17A rearrangement in 20% (25/123) of the samples, across almost all stages of cancer development. Importantly, KANSL1 and ARL17A are cancer-causing genes involved in tumour suppression, DNA repair and the movement of proteins within the cell. Although the fusion has previously been identified in breast and pancreatic cancers, our integrative proteogenomic analysis shows that the canonical sequences of these proteins are partially expressed in tissues exhibiting the fusion, thus altering the normal functional landscape of the cell. In addition, analysis of microbial RNA detected the bacterium Ralstonia pickettii (strain 12J) in 63% (78/123) of the samples. Furthermore, the peptide “LGAIPGAGGTQR”, from a predicted protein of this organism, was identified by proteomics, validating the presence of the contaminating agent.
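As an illustration of the peptide-level check described above, the following minimal sketch performs an in-silico tryptic digest and looks up a peptide in a small protein database. It is not the pipeline used in the study: real database-search engines score tandem mass spectra against candidate peptides rather than matching strings. The function names (`tryptic_digest`, `find_peptide`), the simplified cleavage rule, and both toy protein sequences are hypothetical and were invented for this example. Only the peptide LGAIPGAGGTQR comes from the text.

```python
# Minimal sketch: in-silico tryptic digestion and exact peptide lookup.
# Assumption: simplified trypsin rule (cleave after K or R unless the next
# residue is P), with no missed cleavages and no modifications.

def tryptic_digest(protein: str) -> list[str]:
    """Return the fully cleaved tryptic peptides of a protein sequence."""
    peptides = []
    start = 0
    for i, residue in enumerate(protein):
        next_residue = protein[i + 1] if i + 1 < len(protein) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        # Keep the C-terminal fragment that does not end in K or R.
        peptides.append(protein[start:])
    return peptides


def find_peptide(peptide: str, database: dict[str, str]) -> list[str]:
    """Return the IDs of proteins whose tryptic digest contains the peptide."""
    return [
        protein_id
        for protein_id, sequence in database.items()
        if peptide in tryptic_digest(sequence)
    ]


# Hypothetical toy sequences, invented for illustration only; they are NOT
# real Ralstonia pickettii or human protein sequences.
TOY_DATABASE = {
    "toy_contaminant_protein": "MSTKLGAIPGAGGTQRAVDEK",
    "toy_human_protein": "MAAGKRPLLSVR",
}

if __name__ == "__main__":
    print(find_peptide("LGAIPGAGGTQR", TOY_DATABASE))
```

In practice, a peptide would count as evidence for a contaminant only if it maps uniquely to the contaminant proteome and not to any host protein, which is why the lookup is run against all proteins in the database rather than the contaminant alone.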
Conclusion: Applied to breast cancer data, the method shows the potential contribution of integrative proteogenomics to understanding the molecular biology of cancer, and to potential treatment, through comparative genomics and the discovery of novel features. At the same time, the identification of foreign genomes suggests possible cross-contamination during sample collection and processing, which could lead to erroneous conclusions if disregarded.