Q-omics-based identification of candidate molecular features associated with poor prognosis in breast cancer
Query
Molecular features associated with poor prognosis in breast cancer
Workflow
-Literature Discovery
-Suggested Task Selection
-Survival Analysis
-Consensus Analysis
-NetCrafter
Analysis
In this tutorial, OmixMind extracts insights from the literature, and Q-omics suggests relevant analysis tasks. We then run a recommended survival analysis to examine prognostic impact and explore consensus data across different conditions and lineages.
Insight
Survival differed significantly depending on the expression level of specific genes. Higher expression of some genes was linked to poorer survival, while other genes showed the opposite trend.
Step 1: Query Search
OmixMind
-Discovering the latest research trends
-Exploring relevant public datasets
-Obtaining significant data analysis results
Step 1 Q-omics Query Search: Screenshot of the OmixMind interface with the query 'molecular features associated with poor prognosis in breast cancer' entered. Users can either import their own omics data (User omics data analysis / Import) or run Insights or Data mining on the query.
Action — Try the Insights button to generate literature trends and categories.
Step 2: Literature Insight Generation
Insights
-Identifying the most relevant trending papers from the last 10 years
-Organizing them into five main research categories related to the user query
-Suggesting relevant tasks available within Q-omics
Task 1: Predict top 50 RNA prognostic biomarkers in breast cancer (Interactive)
The leading 50 RNA expression features are ranked by their association with clinical survival outcomes in breast invasive carcinoma. These high-priority prognostic candidates are validated for technical reproducibility across diverse sampling splits to ensure robust findings within the cohort.
Task 2: Concordant Tumor-Upregulated and Poor-Prognosis Survival Hazards (Computed)
This task joins precomputed normal-vs-tumor and survival analyses to pinpoint genes that are significantly upregulated in breast cancer tumors and whose high expression is consistently linked to poor overall survival.
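The join described for Task 2 can be sketched as a filter over two keyed tables. This is a minimal illustration, not the Q-omics implementation: the gene names, field names (`log2_fc`, `hazard_ratio`, `p`), values, and the 0.05 threshold are all hypothetical.

```python
# Hypothetical precomputed tables; every name and value below is made up.
tumor_vs_normal = {
    "GENE_A": {"log2_fc": 2.1, "p": 0.0004},
    "GENE_B": {"log2_fc": -1.3, "p": 0.0010},
    "GENE_C": {"log2_fc": 1.6, "p": 0.2000},
    "GENE_D": {"log2_fc": 1.9, "p": 0.0020},
}
survival = {
    "GENE_A": {"hazard_ratio": 1.8, "p": 0.003},
    "GENE_B": {"hazard_ratio": 1.5, "p": 0.010},
    "GENE_D": {"hazard_ratio": 0.6, "p": 0.004},
}


def concordant_genes(de, surv, alpha=0.05):
    """Genes significantly up in tumor AND with high expression tied to worse survival."""
    hits = []
    # Inner join: only genes present in both tables are considered.
    for gene in sorted(de.keys() & surv.keys()):
        upregulated = de[gene]["log2_fc"] > 0 and de[gene]["p"] < alpha
        poor_prognosis = surv[gene]["hazard_ratio"] > 1 and surv[gene]["p"] < alpha
        if upregulated and poor_prognosis:
            hits.append(gene)
    return hits


print(concordant_genes(tumor_vs_normal, survival))  # ['GENE_A']
```

Only `GENE_A` passes both filters: `GENE_B` is downregulated, `GENE_C` has no survival record, and `GENE_D` has a hazard ratio below 1.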
Task 3: On-demand Vital Status Expression Screening (Computed)
Computes a live statistical comparison (Welch's t-test) of raw RNA expression levels between deceased (dead) and living (alive) breast cancer patients in the TCGA BRCA cohort to screen for unstratified prognosis markers.
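For readers unfamiliar with Welch's t-test, the sketch below computes the statistic and the Welch-Satterthwaite degrees of freedom using only the standard library. The expression values are invented for illustration and do not come from TCGA.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic and approximate degrees of freedom (unequal variances)."""
    va = variance(a) / len(a)  # squared standard error of group a
    vb = variance(b) / len(b)  # squared standard error of group b
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Made-up expression values for one gene in two vital-status groups.
deceased = [1.0, 2.0, 3.0, 4.0, 5.0]
alive = [2.0, 4.0, 6.0, 8.0, 10.0]

t, df = welch_t(deceased, alive)
print(round(t, 3), round(df, 2))  # -1.897 5.88
```

Turning `t` and `df` into a p-value requires the t distribution, which the standard library does not provide; `scipy.stats.ttest_ind(a, b, equal_var=False)` is one way to get it directly.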
Task 4: Somatic Mutation Association with Survival Days (Computed)
Calculates the statistical impact of somatic mutations on patient survival duration (OS Days) in BRCA by comparing the survival timelines of mutant carriers versus wild-type patients across frequently mutated genes.
Task 5: Multi-Omics Transcription-Translation Concordance of Prognostic Biomarkers (Computed)
Performs an on-demand RNA-vs-protein correlation on CPTAC breast cancer tissues for top-ranked precomputed prognostic survival hits, filtering for genes with highly coupled transcriptional and translational hazard profiles.
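The task description does not say which correlation measure is used. As one plausible reading, the sketch below computes a Pearson correlation between paired RNA and protein values for a single gene; the choice of Pearson and the sample values are assumptions.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences of paired values."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Made-up paired measurements for one gene across five tissue samples.
rna = [2.0, 3.5, 4.1, 5.0, 6.2]
protein = [1.1, 1.9, 2.0, 2.8, 3.1]

print(round(pearson_r(rna, protein), 3))
```

A gene would count as "highly coupled" when this value exceeds some cutoff; the cutoff Q-omics applies is not given in the source.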
Action — Try Task 1: Predict top 50 RNA prognostic biomarkers in breast cancer to see the results.
Step 3: Survival Analysis
Overview & Findings
Examining survival rate variations across multiple sampling methods
Identifying 50 genes with significant survival impact (p < 0.01)
-29 genes (red): higher expression predicts worse survival
-21 genes (blue): higher expression predicts better survival
Select a specific gene to perform the following analyses:
-Consensus samplings
-Consensus lineages
-Kaplan-Meier plot
-NetCrafter
Table: RNA expression of 50 genes associated with patient survival in BRCA (Breast invasive carcinoma). Columns show gene symbol, measure (OS/DFS), sample split, gender, stage, period, AUC Group1 and Group2, p-value, and sampling and lineage consensus counts. Rows colored by prognostic direction (red = worse survival, blue = better survival).
Action — Select LINC02070 as an example to explore the detailed validation results.
Kaplan-Meier Plot
Higher expression (> 66.7%) of LINC02070 is significantly associated (p < 0.001) with worse survival in breast cancer under the following sampling options:
Cohort sampling process
-Patients of all stages and genders were included and divided by a tertile split on LINC02070 expression. The highest and lowest expression groups included 26 and 1,042 patients, respectively, and were compared in the survival analysis. Overall survival (OS) was used to assess long-term clinical outcomes.
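The plot for this comparison reports a log-rank p-value. The sketch below shows how a two-group log-rank test works, using only the standard library; the function name and toy data are illustrative and this is not the Q-omics code.

```python
from math import erfc, sqrt


def logrank_test(times1, events1, times2, events2):
    """Two-group log-rank test. events: 1 = death observed, 0 = censored.

    Returns (chi-square statistic, p-value on 1 degree of freedom).
    """
    pooled = [(t, e, 0) for t, e in zip(times1, events1)]
    pooled += [(t, e, 1) for t, e in zip(times2, events2)]
    event_times = sorted({t for t, e, _ in pooled if e == 1})

    obs_minus_exp, var = 0.0, 0.0
    for t in event_times:
        n1 = sum(1 for u, _, g in pooled if g == 0 and u >= t)  # at risk, group 1
        n2 = sum(1 for u, _, g in pooled if g == 1 and u >= t)  # at risk, group 2
        d1 = sum(1 for u, e, g in pooled if g == 0 and u == t and e == 1)
        d2 = sum(1 for u, e, g in pooled if g == 1 and u == t and e == 1)
        n, d = n1 + n2, d1 + d2
        obs_minus_exp += d1 - d * n1 / n  # observed minus expected deaths in group 1
        if n > 1:
            var += d * (n1 / n) * (n2 / n) * (n - d) / (n - 1)

    chi2 = obs_minus_exp ** 2 / var
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    return chi2, erfc(sqrt(chi2 / 2))


# Toy data: group 1 dies early, group 2 dies late.
chi2, p = logrank_test([1, 2, 3], [1, 1, 1], [10, 11, 12], [1, 1, 1])
print(round(chi2, 2), round(p, 3))
```

With groups as unbalanced as 26 versus 1,042 patients, the test is still valid, but the estimate for the small group rests on few events.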
Kaplan-Meier overall survival (OS) plot for LINC02070 in BRCA using a Tertile split. Group 1 (high >66.7% expression, n=26, red curve) shows markedly worse survival, dropping to about 0.5 and then falling sharply by around 90 months, while Group 2 (low ≤33.3% expression, n=1042, blue curve) declines gradually to about 0.3 over roughly 290 months (log-rank p < 0.001).
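Each curve in a Kaplan-Meier plot is a product of per-time-point survival fractions. A minimal sketch of the estimator follows, with invented follow-up times in months; it is not taken from Q-omics.

```python
def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct event time.

    events: 1 = death observed, 0 = censored at that time.
    """
    curve, surv = [], 1.0
    for t in sorted({u for u, e in zip(times, events) if e == 1}):
        at_risk = sum(1 for u in times if u >= t)
        deaths = sum(1 for u, e in zip(times, events) if u == t and e == 1)
        surv *= 1 - deaths / at_risk  # multiply by the fraction surviving past t
        curve.append((t, surv))
    return curve


# Made-up follow-up data: months and event indicators.
months = [5, 8, 12, 12, 20]
died = [1, 0, 1, 1, 0]

for t, s in kaplan_meier(months, died):
    print(t, round(s, 3))
```

Censored patients (here at 8 and 20 months) never trigger a drop in the curve; they only shrink the at-risk count for later time points.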
Step 4: Consensus Analysis
Consensus Samplings
Determining how consistently the gene remains significantly associated with survival (p < 0.05) across varying clinical conditions
-Measure
Survival metric to evaluate (Overall Survival, Disease-Free Survival)
-Sample split
Cutoff method to divide patients into expression groups (Median, Tertile, Quartile)
-Gender
Patient sex to include (Male, Female)
-Stage
Tumor progression phases to include (Stage I-IV, etc.)
-Period
Clinical observation timeframe (3-year, 5-year, Extended)
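The five parameters above define a grid of sampling options. The sketch below enumerates that grid from the option values named in this list only. The gender and stage lists are likely incomplete (the stage entry ends in "etc."), so the count here is smaller than the number of sampling groups Q-omics reports.

```python
from itertools import product

# Option values as named in the list above; gender and stage are likely partial.
options = {
    "measure": ["OS", "DFS"],
    "split": ["Median", "Tertile", "Quartile"],
    "gender": ["Male", "Female"],
    "stage": ["Stage I-IV"],  # other stage groupings exist but are not listed
    "period": ["3-year", "5-year", "Extended"],
}


def sampling_grid(opts):
    """Every combination of one value per parameter, as a list of dicts."""
    keys = list(opts)
    return [dict(zip(keys, combo)) for combo in product(*opts.values())]


grid = sampling_grid(options)
print(len(grid))  # 36
print(grid[0])
```

A survival test is then run once per combination, and each result is kept alongside its parameter values.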
Identifying a total of 120 significant sampling options for LINC02070
Q-omics consensus sampling list for LINC02070, showing 120 significant sampling options. Each row lists a combination of measure (DFS), sample split (Median, Quartile, Tertile), gender, stage, and period (3-year, 5-year, Extended) with AUC Group1 and Group2 values and p-values (p < 0.001), using 216 sampling groups.
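The sampling consensus is a count of the sampling options in which the association stays significant. A minimal sketch, assuming each result is a record with a `p_value` field (a hypothetical name) and using invented p-values:

```python
def consensus_count(results, alpha=0.05):
    """Number of sampling options whose survival association has p < alpha."""
    return sum(1 for r in results if r["p_value"] < alpha)


# Made-up results for three sampling options.
results = [
    {"measure": "DFS", "split": "Median", "period": "3-year", "p_value": 0.0004},
    {"measure": "DFS", "split": "Tertile", "period": "5-year", "p_value": 0.0300},
    {"measure": "OS", "split": "Quartile", "period": "Extended", "p_value": 0.2100},
]

print(consensus_count(results))  # 2
```

The same counting logic applies to the lineage consensus, with one record per tumor lineage instead of per sampling option.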
Consensus Lineages
Examining if the survival association is breast cancer-specific or conserved across other lineages
Identifying significant survival associations across 6 tumor lineages in relation to LINC02070 expression
Q-omics consensus lineages and sampling options table: list of 6 lineages (BRCA, KIRC, PCPG, SKCM, STAD, THCA) significantly associated with LINC02070 expression in patient survival analysis. Columns include measure, sample split, gender, stage, period, AUC for groups 1 and 2, p-value, and sampling consensus count.
Step 5: NetCrafter
An ontology-driven platform for constructing de novo gene networks that are specific to each input gene list and quantitatively defined by ontology-weighted similarity
Networks are further decomposed into optimal Leiden sub-networks, facilitating multifunctional interpretation and the identification of gene interaction hotspots
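The source does not define "ontology-weighted similarity". As one possible interpretation, the sketch below scores gene pairs with a weighted Jaccard index over their ontology terms, giving rarer terms more weight, and keeps pairs above a threshold as network edges. The weighting formula, threshold, gene names, and term sets are all assumptions made for illustration.

```python
from itertools import combinations
from math import log

# Hypothetical gene-to-ontology-term annotations.
annotations = {
    "GENE_A": {"term1", "term2"},
    "GENE_B": {"term1", "term2", "term3"},
    "GENE_C": {"term3"},
    "GENE_D": {"term4"},
}


def term_weights(ann):
    """Weight each term by rarity within the input gene list (assumed scheme)."""
    n = len(ann)
    counts = {}
    for terms in ann.values():
        for term in terms:
            counts[term] = counts.get(term, 0) + 1
    return {term: log(1 + n / c) for term, c in counts.items()}


def weighted_jaccard(a, b, w):
    """Weight of shared terms divided by weight of all terms of either gene."""
    union = sum(w[t] for t in a | b)
    return sum(w[t] for t in a & b) / union if union else 0.0


def build_edges(ann, threshold=0.2):
    """Edges between gene pairs whose similarity reaches the threshold."""
    w = term_weights(ann)
    edges = {}
    for g1, g2 in combinations(sorted(ann), 2):
        score = weighted_jaccard(ann[g1], ann[g2], w)
        if score >= threshold:
            edges[(g1, g2)] = round(score, 3)
    return edges


print(build_edges(annotations))
```

Because the weights depend on the input gene list, the same gene pair can score differently in different lists, which matches the description of networks "specific to each input gene list". Leiden community detection would then run on these weighted edges using a graph library; that step is not shown here.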
Gene network of shared ontologies table. Columns: gene symbol, # total functions, # significant functions, measure, sample split, gender, stage, period, AUC Group1, AUC Group2, p-value, sampling consensus, and lineage consensus. Rows include LCE1A, HSPB3, COX7B2, IGHV3-38, DNAAF4, and CCNE1. 4300 total genes available for function analysis; 4300 selected genes.
Subnetwork decomposition view of the gene co-expression network for the selected prognostic genes: all nodes shown uniformly in red over gray edges, with labeled hub genes PCDHGA1 and CSBG2.
Subnetwork Decomposition
Leiden community-detection (subnet) view of the same gene network: nodes colored by detected module, with distinct red, orange, green, blue, and yellow clusters and a dense orange community at lower left; labeled gene PCDHGA1.
Leiden Subnet
Functional genes view of the network: most nodes shown in gray with a subset of functionally relevant genes highlighted in color (purple, green, cyan, and black).
Functional Genes