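# B: clinical data
time_limit <- 180
META_clin <- readRDS("data/appendix/META_clin.rds") %>%
  # Administrative censoring at time_limit. The Status recoding was lost in
  # extraction and is reconstructed here, assuming 0 codes a censored patient.
  mutate(Status = if_else(OS_MONTHS < time_limit, Status, 0),
         OS_MONTHS = if_else(OS_MONTHS > time_limit, time_limit, OS_MONTHS)) %>%
  filter(PAM50 != "NC")

# Kaplan-Meier estimate of overall survival, stratified by PAM50 subtype
KM <- survfit(Surv(OS_MONTHS, Status) ~ PAM50, data = META_clin)
plot_B <- ggsurvplot(KM,
                     palette = paletteer_d("ggsci::light_uchicago")[c(1, 3:6, 9)],
                     risk.table = TRUE,
                     conf.int = TRUE,
                     risk.table.y.text.col = TRUE,
                     #risk.table.col = "strata",
                     risk.table.y.text = FALSE,
                     tables.height = 0.25,
                     ylim = c(0.25, 1),
                     xlab = "Time (in months)",
                     ylab = "Survival probability",
                     break.time.by = 30,
                     fontsize = 4,
                     ggtheme = theme_pubclean())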

# Combine panel A with the survival curve and risk table of panel B
plot_grid(plot_A,
          plot_grid(plot_B$plot, plot_B$table, nrow = 2, rel_heights = c(3, 2)),
          ncol = 2, rel_widths = c(2, 3.5), scale = 0.9, labels = "AUTO")
```

(ref:METABRIC-caption) Available omics and survival in the METABRIC breast cancer dataset. (A) Number of patients for each omics type and their combinations, depicted as a Venn diagram. (B) Overall survival probability for all patients with clinical follow-up, stratified by PAM50 breast cancer subtype; administrative censoring at 180 months.
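
The administrative censoring applied in panel B can be illustrated on a toy data frame. The following is a minimal base R sketch, not the code used for the figure: the values are made up, and the coding of `Status` (1 = death, 0 = censored) is an assumption rather than something stated for METABRIC.

```r
# Toy follow-up data (hypothetical values, not taken from METABRIC)
time_limit <- 180
toy <- data.frame(
  OS_MONTHS = c(50, 200, 180, 240),
  Status    = c(1, 1, 0, 0)  # assumed coding: 1 = death, 0 = censored
)

# Step 1: any patient followed beyond the limit is treated as censored,
# because an event after the limit is not observed within the study window.
toy$Status <- ifelse(toy$OS_MONTHS > time_limit, 0, toy$Status)

# Step 2: follow-up times are capped at the limit.
toy$OS_MONTHS <- pmin(toy$OS_MONTHS, time_limit)

print(toy)
```

The order of the two steps matters: the status has to be recoded while the original follow-up times are still available, since after capping no time exceeds `time_limit` any more.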

B.0.1 TCGA: Breast cancer

Another reference database for breast cancer is the one from the TCGA consortium (TCGA and others 2012). The cohort is smaller than METABRIC and its clinical follow-up is more limited. In contrast, the omics data are more comprehensive and include RNA sequencing as well as relative protein quantification by reverse-phase protein arrays (RPPA) (Figure \@ref(fig:TCGA-bp)A).

```{r TCGA-bp, echo=FALSE, out.width = "70%", fig.cap='(ref:TCGA-bp-caption)', fig.scap='Available omics for TCGA Breast and Prostate cancer', fig.align='center', fig.height=5, fig.width=8}

References

TCGA, and others. 2012. “Comprehensive Molecular Portraits of Human Breast Tumours.” Nature 490 (7418). Nature Publishing Group: 61.