Gene 712 (2019) 143961
2.2. Stage-specific mRNA and lncRNA selection from TCGA followed by mRNA-lncRNA-miRNA interaction network generation
TANRIC provided the expression data of lncRNAs from TCGA (Li et al., 2015), the mRNA expression data were derived directly from TCGA, and the clinical data of the corresponding patients were downloaded from FireBrowse. After stage-based sample stratification, significant mRNAs and lncRNAs were identified by Bayesian t-test followed by Benjamini-Hochberg correction to control the false discovery rate. Stage-specific significant RNAs were selected from Venn diagrams considering common significant mRNAs and lncRNAs between one stage and the other three stages using InteractiVenn (Heberle et al., 2015). Similarly, significant mRNAs for each stage were also identified from two GEO datasets (GSE52903 and GSE29817) in addition to TCGA, and the common significant mRNAs obtained from the three datasets were used for the final analysis. If no mRNAs were common to all three stage-based selections, mRNAs common to any two sets were considered for further analysis; for lncRNAs, however, only the set common to all three datasets was chosen. Cell-based validated mRNA-miRNA and lncRNA-miRNA binding association information was obtained from TarBase v8 (Karagkouni et al., 2018) and the LncBase v2 experimental module (Paraskevopoulou et al., 2013; Paraskevopoulou et al., 2016), respectively, using the selected mRNAs and lncRNAs as inputs. Finally, important mRNA-miRNA-lncRNA association networks were visualised using Cytoscape software (Su et al., 2014) for miRNAs validated to interact with both the selected mRNAs and lncRNAs.
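The multiple-testing and dataset-overlap steps described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes the per-gene p-values from the Bayesian t-test are already available, the 0.05 cutoff is an assumption (the text does not state one), the "any two sets" fallback is read as the union of pairwise overlaps, and the gene labels and function names are hypothetical.

```python
from itertools import combinations


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by ascending p
    adjusted = [1.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def significant(pvalues_by_gene, alpha=0.05):
    """Genes whose adjusted p-value is at or below alpha (alpha is assumed)."""
    genes = list(pvalues_by_gene)
    adj = benjamini_hochberg([pvalues_by_gene[g] for g in genes])
    return {g for g, q in zip(genes, adj) if q <= alpha}


def consensus(gene_sets, allow_pairwise=True):
    """Genes common to all sets; optionally fall back to any-two overlaps."""
    common = set.intersection(*gene_sets)
    if common or not allow_pairwise:
        return common
    pairwise = set()
    for a, b in combinations(gene_sets, 2):
        pairwise |= a & b
    return pairwise


# Toy inputs with hypothetical gene labels (not data from the study).
tcga = significant({"g1": 0.001, "g2": 0.004, "g3": 0.60, "g4": 0.80})
geo_a = {"g2", "g5"}
geo_b = {"g5", "g6"}

# mRNA rule: all-three overlap, else any-two overlap.
mrna_final = consensus([tcga, geo_a, geo_b])
# lncRNA rule: all-three overlap only.
lncrna_final = consensus([tcga, geo_a, geo_b], allow_pairwise=False)
```

Keeping the fallback behind a flag lets the same function express both the mRNA rule and the stricter lncRNA rule.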
2.3. Oncoprint, DNA methylation status and co-occurrence analysis of the mRNAs selected from TCGA and survival analysis of mRNAs as well as the validated interacting miRNA partners of selected mRNAs and lncRNAs identified from TCGA
Furthermore, for the validated interacting miRNA partners of the selected mRNAs and lncRNAs, miRNA family enrichment analysis was performed using the miRNet tool (Fan et al., 2016; Fan and Xia, 2018), and the common miRNAs in the enriched families were identified using InteractiVenn (Heberle et al., 2015). Oncoprint (Unberath et al., 2019), DNA methylation status and mRNA co-occurrence analyses were performed for the selected mRNAs using cBioPortal (Gao et al., 2013). Cox regression based survival analysis was also performed for the validated interacting miRNA partners of the selected mRNAs and lncRNAs.
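To make the Cox regression step concrete, the sketch below fits a single-covariate Cox proportional hazards model by Newton-Raphson on the partial likelihood. It is an illustration under stated assumptions rather than the software used in the study, which the text does not name: one continuous covariate (for example, the expression level of one miRNA), the Breslow form of the risk sets, and invented toy survival data. The function names are hypothetical.

```python
import math


def cox_score_and_information(beta, times, events, x):
    """Score (first derivative) and information (negative second derivative)
    of the Cox partial log-likelihood for one covariate."""
    score, info = 0.0, 0.0
    for i, (t_i, d_i) in enumerate(zip(times, events)):
        if not d_i:
            continue  # censored subjects contribute only through risk sets
        # Risk set: every subject still under observation at time t_i.
        at_risk = [j for j in range(len(times)) if times[j] >= t_i]
        w = [math.exp(beta * x[j]) for j in at_risk]
        s0 = sum(w)
        s1 = sum(wj * x[j] for wj, j in zip(w, at_risk))
        s2 = sum(wj * x[j] * x[j] for wj, j in zip(w, at_risk))
        mean = s1 / s0
        score += x[i] - mean
        info += s2 / s0 - mean * mean
    return score, info


def fit_univariate_cox(times, events, x, max_iter=50, tol=1e-10):
    """Return (beta, standard error) from Newton-Raphson started at beta = 0."""
    beta = 0.0
    for _ in range(max_iter):
        score, info = cox_score_and_information(beta, times, events, x)
        if abs(score) < tol:
            break
        beta += score / info  # Newton step on a concave log-likelihood
    _, info = cox_score_and_information(beta, times, events, x)
    return beta, 1.0 / math.sqrt(info)


# Toy data (invented): follow-up time, event indicator (1 = death), expression.
times = [5, 8, 12, 3, 9, 15, 7, 20]
events = [1, 1, 0, 1, 1, 0, 1, 1]
expr = [2.1, 1.0, 0.5, 1.8, 2.5, 0.7, 0.9, 1.2]

beta, se = fit_univariate_cox(times, events, expr)
hazard_ratio = math.exp(beta)  # hazard ratio per unit increase in expression
wald_z = beta / se             # Wald statistic for testing beta = 0
```

In practice an established survival library would be preferable, since it handles ties, multiple covariates and convergence checks more carefully than this sketch.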
2.4. Pathway, GO and DSigDB analysis of mRNAs
Reactome pathway analysis, GO analysis for biological process (BP), molecular function (MF) and cellular component (CC), and DSigDB analysis were performed after pooling all mRNAs selected in this study using Enrichr (Kuleshov et al., 2016). To predict the functional role of the selected lncRNAs, WikiPathways analysis was performed on lncRNA-mRNA co-expression networks prepared using ENViz, a Cytoscape app (Steinfeld et al., 2015). Finally, information on each selected mRNA was mined using GeneCards (Stelzer et al., 2016). DSigDB analysis was included because recent studies have often used gene expression signatures to identify potential drug candidates for a particular disease through a systems pharmacology approach (Yoo et al., 2015). An outline of the methodology used for staging marker selection and biological function prediction is provided in Fig. 1.
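The idea behind pathway, GO and drug-signature enrichment can be illustrated with a simple over-representation test. The sketch below computes an upper-tail hypergeometric probability for the overlap between a selected gene list and one annotated gene set. It is a simplified stand-in under stated assumptions, not a reproduction of Enrichr's scoring: the background size, the gene labels and the annotated set are all hypothetical.

```python
from math import comb


def enrichment_pvalue(selected, gene_set, background_size):
    """Overlap size and upper-tail hypergeometric probability of seeing
    at least that many gene_set members among the selected genes."""
    k = len(selected & gene_set)   # observed overlap
    n = len(selected)              # number of selected genes
    K = len(gene_set)              # size of the annotated set
    N = background_size            # number of genes in the background
    total = comb(N, n)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(K, n) + 1))
    return k, tail / total


# Toy example with hypothetical labels (not a real pathway annotation).
selected = {"geneA", "geneB", "geneC"}
toy_set = {"geneA", "geneB", "geneD", "geneE"}
overlap, p_value = enrichment_pvalue(selected, toy_set, background_size=20)
```

When many gene sets are tested, the resulting p-values would normally be adjusted for multiple testing before interpretation.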
Fig. 1. Outline of the proposed methodology.
3.1. mRNA and lncRNA selection from microarray data and TCGA
Stage-specific lncRNAs and mRNAs selected from TCGA and validated in two GEO datasets (GSE52903 and GSE29817) are listed in Table 1, and Table 2 lists the 52 DE genes selected from microarray data. The number of patients at each stage in the datasets used for mRNA and lncRNA selection is provided in Supplementary Table S1, while details of the selected mRNAs and lncRNAs are provided in Supplementary Tables S2, S3 and S4. The numbers of individual and common mRNAs selected from TCGA, GSE52903 and GSE29817, and of lncRNAs from TCGA, are shown in the Venn diagram (Fig. 2). The names of the mRNAs and lncRNAs found either common to two datasets or exclusive to a single set in the Venn diagram are provided in the supplementary spreadsheets. Three mRNAs (CXCL10, HBA2 and SLC2A1) were found in both the TCGA and microarray data. When DE mRNAs from GEPIA (Tang et al., 2017) were compared with the selected mRNAs from TCGA, it was noted that HBA1, HBA2 and HBB were downregulated, while CXCL10 was upregulated in CC. Among the selected mRNAs from microarray data, KRT5, CD24, KRT16, CXCL10, NKG7, LAMC2 and LAMP3 were upregulated, while ACTG2, MYH11, FBLN1 and DST were downregulated.