How to use

Select the cases and genes to analyze.

Change the settings from their defaults if needed.
After filtering, select the Results tab to view the analysis results.

Required: C-CAT data files (zipped files are acceptable)

Choose case CSV Files

Browse...

Choose report CSV Files

Browse...

Download sample clinical data

Download sample report data

Filter on histology

File import

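Import the list of cases and gene mutations to analyze.

If 'Analyze with new dataset' is set to No, all cases are loaded, or the previously analyzed cases if any exist.
If it is set to Yes, the uploaded CSV files are imported.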

Filters for clinical information

Filters on genes

For detailed study of mutations of a gene

Other Settings

Figures in results are downloadable as png files.

FELIS: Functions Especially for LIquid and Solid tumor clinical sequencing.

https://github.com/MANO-B/FELIS

The following settings are for advanced analysis only

These settings work, but there is normally no need to change them.

Option files

Correspondence table between ID and histology (CSV)

Browse...

Specify the ID of the patient whose diagnosis you want to correct and the modified histology.

Download CSV file template

Correspondence table between ID and drug information (CSV)

Browse...

To analyze only the drug after curation

Download CSV template (before CGP)

Download CSV template (after CGP)

Correspondence table for drug renaming (CSV)

Browse...

To analyze similar drugs together by renaming the drugs
Reclassified as molecular targeted therapies, immune checkpoint inhibitors, etc.
Drugs are listed in alphabetical order, separated by commas

Download CSV template

Correspondence table for drug combination renaming (CSV)

Browse...

To rename drug combinations to groups
Reclassified as molecular targeted therapies, immune checkpoint inhibitors, etc.
Drugs are listed in alphabetical order, separated by commas

Download CSV template

Option files

Correspondence table for mutation renaming (CSV)

Browse...

To analyze similar mutations together by renaming mutations
Reclassified as Exon 19 mutation, Exon 20 mutation, gene amplification, etc.
Mutations in the designated gene that are not specified in the table are classified as 'Other'

Download CSV template

Correspondence table for mutation reannotation (CSV)

Browse...

To reannotate variants
F: pathogenic variants, G: neutral variants

Download CSV template

Correspondence table of histological type renaming (CSV)

Browse...

To analyze similar tissue types together by renaming them
Reclassified as differentiated gastric cancer, undifferentiated gastric cancer, etc.

Download CSV template

Correspondence table for drug renaming based on regimen (CSV)

Browse...

Provide the drug name if the drug name is unknown and the regimen is known.
Conversion from MAP therapy to 'Cisplatin,Doxorubicin,Methotrexate'

Download CSV template

Analysis setting

If you select No, CSV files may not be necessary.

If you select Yes, repeated runs of the same analysis are faster.

Click the button after changing settings

Table. Characteristics for selected patients. The present, retrospective cohort study was performed with clinicogenomic, real-world data on the patients who were registered in the C-CAT database from June 1, 2019. The patients were registered by hospitals throughout Japan and provided written informed consent to the secondary use of their clinicogenomic data for research.

Figure. Recurrent oncogenic mutations in selected cases. The 30 genes with the highest frequency of oncogenic mutations are shown. Mutational landscapes were created using ComplexHeatmap package for R.

Click the button after changing settings

Figure. Frequency of oncogenic mutations in the selected gene. The most frequent oncogenic mutations are shown with amino acid change.

Mutplot by Zhang W, PMID:31091262. If error occurs, correct 'source/UniPlot.txt'.

Protein structure source: Uniprot

Github for Mutplot. Link for the website

Hidden Download

Figure for probability

Figure for odds ratio

Figure. Alterations among mutually exclusive or co-occurring pairs. The 30 genes with the highest frequency of oncogenic mutations were selected to determine whether oncogenic mutations tend to occur together in each pair of genes. Blue boxes indicate mutual exclusivity and red boxes indicate co-occurrence. An asterisk shows a significant correlation (p < 0.001). Analysis was performed with the Rediscover package for R. Odds ratios were estimated by Fisher's exact test. An odds ratio less than 1 does not necessarily correspond to mutual exclusivity as evaluated by the negative binomial distribution: an OR < 1 indicates a negative association between two events but does not by itself imply mutual exclusivity. Factors such as overdispersion and the underlying data distribution influence how the OR should be interpreted, so an OR below 1 should not be relied on alone to infer mutual exclusivity; the specific context and the model assumptions must also be considered.
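
As a rough illustration of the odds ratio part of this caption, the sketch below computes a sample odds ratio and a two-sided Fisher exact p-value for one gene pair from a 2×2 table. It is not the Rediscover test and not the app's code; the function name and the table layout are assumptions made for this example, and the sample odds ratio (ad/bc) differs slightly from the conditional estimate that R's `fisher.test` reports.

```python
from math import comb

def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher exact test for a 2x2 table (illustrative layout):
    a = both genes mutated, b = gene A only, c = gene B only, d = neither."""
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def prob(x):
        # Hypergeometric probability of x co-mutated cases with fixed margins
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    # Sum the probabilities of all tables at least as extreme as the observed one
    p_value = sum(prob(x) for x in range(lo, hi + 1)
                  if prob(x) <= p_obs * (1 + 1e-7))
    odds_ratio = (a * d) / (b * c) if b * c > 0 else float("inf")
    return odds_ratio, min(1.0, p_value)

# Example: 1 co-mutated case, 9 and 11 single-gene cases, 3 with neither
odds_ratio, p_value = fisher_exact_2x2(1, 9, 11, 3)
```

An odds ratio below 1 from such a table only signals a negative association; whether a pair is called mutually exclusive in the figure depends on the separate test described in the caption.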

Figure. Recurrent oncogenic mutations across subtypes. The 30 genes with the highest frequency of oncogenic mutations were displayed.

Figure. Distribution of age, sex, detected oncogenic mutations, tumor mutation burden (TMB), metastasis pattern, patients with a treatment option recommended by the expert panel, patients who received the recommended chemotherapy, median time from the initiation date of the first palliative chemotherapy to CGP, and median time from CGP to final observation. In the boxplots of age and TMB, the box borders indicate the 25th and 75th percentiles, the inner line the median, and the whiskers 1.5× the interquartile range.

Summary, cluster and mutated gene

Summary, cluster and histology

Raw data

Figure. Unsupervised clustering of the patients based on the detected oncogenic mutations. Two-dimensional mutational pattern mapping was generated using Uniform Manifold Approximation and Projection (UMAP). For each cluster, the three variants and histotypes with the highest odds ratios that were more common than in the other clusters (p < 0.05) are shown. Clustering analysis was performed as follows. All pathogenic mutations detected in the cancer-related genes were assembled into a binary matrix per patient. The dimension of this input matrix was reduced using UMAP via the umap package for R (with default hyperparameters). Clustering was then performed using Hierarchical Density-Based Spatial Clustering of Applications with Noise (HDBSCAN) via the dbscan package for R (EPS: 1.0; minimum points: 3).

Mochizuki T, et al., Factors predictive of second-line chemotherapy in soft tissue sarcoma: An analysis of the National Genomic Profiling Database. Cancer Science, 2023. Link for the paper

Figure. Unsupervised clustering based on oncogenic mutations. For each histology, the percentage of cases belonging to one of the clusters is shown as a bar graph. Characteristic oncogenic mutations were found in each cluster. There was a tendency for each histologic type to cluster in specific clusters. Heterogeneity of genetic variation within histologic types was assessed by Shannon's entropy. Low Shannon entropy values indicate that the genetic mutation pattern of the tumor is uniform, while high values indicate diversity.
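
The heterogeneity measure in this caption can be sketched as follows. This is a minimal example, assuming the entropy is computed from the number of cases of one histology falling in each cluster; the log base (2 here) and the function name are assumptions, since the caption does not state them.

```python
from math import log2

def shannon_entropy(cluster_counts):
    """Shannon entropy (in bits) of one histology's distribution over clusters."""
    total = sum(cluster_counts)
    # Empty clusters contribute nothing (the limit of p*log(p) as p -> 0 is 0)
    return -sum((c / total) * log2(c / total) for c in cluster_counts if c > 0)

# All cases in one cluster: uniform mutation pattern, entropy 0
uniform = shannon_entropy([10, 0, 0])
# Cases spread evenly over four clusters: diverse pattern, entropy 2 bits
diverse = shannon_entropy([5, 5, 5, 5])
```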

Figure. Level of evidence for targeted therapy for detected gene mutations. The highest level of evidence was extracted for each patient. Evidence levels of C-CAT are defined as A for biomarkers that predict a response to Japanese Pharmaceuticals and Medical Devices Agency (PMDA)– or FDA-approved therapies or are described in professional guidelines, B for biomarkers that predict a response based on well-powered studies with consensus of experts in the field, C for biomarkers that predict a response to therapies approved by the PMDA or FDA in another type of tumor or that predict a response based on clinical studies, D for biomarkers that predict a response based on case reports, E for biomarkers that show plausible therapeutic significance based on preclinical studies.

Survival difference will be evaluated with restricted mean survival time in this section. Analysis with hazard ratio is also provided in 'Overall survival with risk-set adjustment' section (Survival analysis start date = CGP test date).

Figure. Survival after the CGP test. Kaplan–Meier estimation and log-rank tests were performed with the survival package for R. EP, expert panel; RMST, restricted mean survival time.

Raw data

Figure. Survival after the CGP test. Kaplan–Meier estimation and log-rank tests were performed with the survival package for R. EP, expert panel; RMST, restricted mean survival time.

Group 1

Group 2

Propensity score-based adjustment

Propensity score matching

Inverse probability weighting

Threshold for IPW

Download love plot of PS-matching

Download love plot of IPCW

Download IPW weight distribution

Download IPCW weight distribution

Download PS distribution

To reduce confounding between the two groups, we performed propensity score matching. The propensity score was estimated using a logistic regression model including prespecified clinically relevant covariates (CGP platform, sex, age, PS, histology, treatment lines before CGP, and the best treatment effect before CGP). Patients were matched 1:1 using nearest-neighbor matching without replacement on the logit of the propensity score (MatchIt package, method = "nearest", distance = "logit"). A caliper width of 0.2 on the logit scale was applied to restrict matches to comparable individuals. Matched sets were identified using the MatchIt subclass variable, and each subclass was treated as a matched pair in subsequent analyses.
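
The matching step can be pictured with the greedy sketch below. It is not MatchIt's implementation: it processes treated patients in input order, takes the propensity scores as given, and applies the caliper directly as an absolute difference on the logit scale, as described above. All names are hypothetical.

```python
from math import log

def logit(p):
    return log(p / (1 - p))

def match_nearest(ps_treated, ps_control, caliper=0.2):
    """Greedy 1:1 nearest-neighbor matching without replacement on logit(PS).
    Returns (treated_index, control_index) pairs; unmatched patients are dropped."""
    available = {j: logit(p) for j, p in enumerate(ps_control)}
    pairs = []
    for i, p in enumerate(ps_treated):
        if not available:
            break
        target = logit(p)
        # Closest remaining control on the logit scale
        j = min(available, key=lambda k: abs(available[k] - target))
        if abs(available[j] - target) <= caliper:
            pairs.append((i, j))
            del available[j]  # without replacement
    return pairs

pairs = match_nearest([0.5, 0.8], [0.52, 0.1, 0.79])
```

Each returned pair plays the role of one matched subclass in the later pair-level analyses.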

Covariate balance before and after matching was evaluated using standardized mean differences (SMDs) with the cobalt package. Adequate balance was defined a priori as an absolute SMD < 0.1 for all covariates. Balance diagnostics were visualized using Love plots. The maximum absolute SMD after matching was additionally reported to provide a single summary measure of balance.
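
For a continuous covariate, the balance statistic can be sketched as below. This assumes the common pooled-standard-deviation form of the SMD; cobalt offers several denominators, so the exact values it reports may differ. The function name is hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def standardized_mean_difference(x_group1, x_group2):
    """SMD for a continuous covariate, using the pooled standard deviation."""
    pooled_sd = sqrt((variance(x_group1) + variance(x_group2)) / 2)
    return (mean(x_group1) - mean(x_group2)) / pooled_sd

# Ages in two hypothetical matched groups
smd = standardized_mean_difference([62, 70, 58, 66], [61, 69, 60, 65])
balanced = abs(smd) < 0.1  # the a priori threshold stated above
```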

Because propensity score matching induces dependence within matched pairs, confidence intervals for the RMST difference were obtained using nonparametric bootstrap resampling at the matched-pair level. Specifically, matched pairs were resampled with replacement 2,000 times, the RMST difference was recalculated for each bootstrap replicate, and the 2.5th and 97.5th percentiles of the bootstrap distribution were used to derive a two-sided 95% confidence interval.
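
The pair-level bootstrap can be sketched as follows. To stay short, the example uses the mean within-pair difference as a stand-in statistic; in the analysis described above, the statistic would be the RMST difference recomputed on each resample. The function names, the seed, and the simple percentile indexing are assumptions.

```python
import random

def pair_bootstrap_ci(pairs, statistic, n_boot=2000, seed=0):
    """Percentile 95% CI from resampling whole matched pairs with replacement."""
    rng = random.Random(seed)
    replicates = []
    for _ in range(n_boot):
        # Resample pairs, not individuals, to preserve within-pair dependence
        sample = [rng.choice(pairs) for _ in pairs]
        replicates.append(statistic(sample))
    replicates.sort()
    lower = replicates[round(0.025 * n_boot)]
    upper = replicates[round(0.975 * n_boot) - 1]
    return lower, upper

def mean_pair_difference(sample):
    # Stand-in for the RMST difference between the two matched groups
    return sum(a - b for a, b in sample) / len(sample)

pairs = [(400, 310), (520, 500), (150, 260), (610, 480), (300, 290)]
ci = pair_bootstrap_ci(pairs, mean_pair_difference)
```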

To account for baseline imbalances between treatment groups, we applied inverse probability of treatment weighting (IPTW) based on the propensity score (PS). The PS was estimated using a logistic regression model including prespecified baseline covariates. Stabilized weights were constructed to estimate the average treatment effect (ATE).
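
A minimal sketch of stabilized ATE weights is given below, assuming the propensity scores have already been estimated. The numerator is the marginal probability of the received treatment, which is the usual definition of stabilization; the function name is hypothetical, and any weight truncation applied by the app is not shown.

```python
def stabilized_iptw(treated, propensity):
    """Stabilized inverse probability of treatment weights for the ATE.
    treated: 1/0 indicators; propensity: estimated P(treated = 1 | covariates)."""
    p_treated = sum(treated) / len(treated)
    weights = []
    for t, ps in zip(treated, propensity):
        if t == 1:
            weights.append(p_treated / ps)
        else:
            weights.append((1 - p_treated) / (1 - ps))
    return weights

weights = stabilized_iptw([1, 1, 0, 0], [0.5, 0.25, 0.5, 0.25])
```

A treated patient with a low propensity score (0.25 here) receives a weight above 1, because similar patients are under-represented in the treated group.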

Weighted Kaplan–Meier (KM) survival curves were estimated using case weights corresponding to the IPTW. This approach yields survival functions representing a pseudo-population in which the distribution of measured baseline covariates is balanced between treatment groups. All survival times were analyzed on the original time scale (days).

Under IPTW, group-specific survival functions were estimated using weighted KM estimators. Because the KM estimator is a right-continuous step function, RMST was computed by exact integration of the step function, without numerical approximation. Specifically, RMST was calculated as the sum over successive time intervals of the interval length multiplied by the survival probability at the beginning of the interval. This yields an exact estimate of the area under the weighted KM curve up to the defined time.
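
The exact integration described above can be sketched as follows: a weighted Kaplan–Meier estimator, then a sum of rectangle areas under the step function up to a truncation time. This is an illustration under the stated definitions rather than the app's code; the function names and the argument `tau` are assumptions.

```python
def weighted_km(times, events, weights):
    """Weighted Kaplan-Meier curve as [(event_time, survival)] pairs."""
    data = sorted(zip(times, events, weights))
    at_risk = sum(weights)
    survival, curve = 1.0, []
    i = 0
    while i < len(data):
        t = data[i][0]
        event_weight = leaving_weight = 0.0
        while i < len(data) and data[i][0] == t:
            _, event, w = data[i]
            event_weight += w * event
            leaving_weight += w
            i += 1
        if event_weight > 0:
            survival *= 1 - event_weight / at_risk
            curve.append((t, survival))
        at_risk -= leaving_weight  # events and censored cases leave the risk set
    return curve

def rmst(curve, tau):
    """Area under the step function from 0 to tau (interval length x S at its start)."""
    area, prev_t, prev_s = 0.0, 0.0, 1.0
    for t, s in curve:
        if t >= tau:
            break
        area += (t - prev_t) * prev_s
        prev_t, prev_s = t, s
    return area + (tau - prev_t) * prev_s

curve = weighted_km([100, 250, 400], [1, 0, 1], [1.0, 1.0, 1.0])
rmst_days = rmst(curve, tau=365)
```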

Confidence intervals (CIs) for the IPTW-adjusted RMST difference were obtained using a nonparametric bootstrap procedure. When matched pairs were available, resampling was performed at the pair level. Otherwise, bootstrap samples were generated using probability-proportional-to-size resampling, with sampling probabilities proportional to the IPTW weights, reflecting each individual’s contribution to the weighted pseudo-population. Within each bootstrap sample, RMST was recalculated using the same weighting scheme, and the RMST difference was re-estimated.

The 95% CI was derived from the empirical 2.5th and 97.5th percentiles of the bootstrap distribution. This approach captures sampling variability of the weighted survival process while preserving the time scale and interpretation of RMST in days.

Between-group differences in survival distributions were assessed using weighted log-rank–type tests. Test statistics were constructed as weighted score statistics accumulated over observed event times, with weights derived from the IPTW and, for Wilcoxon-type tests, additional weighting based on the pooled weighted survival function. P-values were obtained from chi-square distributions with one degree of freedom. Additionally, weighted Cox proportional hazards models with robust variance estimation were fitted to estimate hazard ratios, with stratification applied when matched pairs were present.

All analyses were conducted using the survival, MatchIt, and cobalt packages.

Survival difference is evaluated with restricted mean survival time in this section. Analysis with hazard ratio is also provided in the 'Overall survival with risk-set adjustment' section (Survival analysis start date = CGP test date).

Figure. Survival periods after CGP and gene mutations, estimated with the conventional Kaplan-Meier estimator. Restricted mean survival time at two years (in days) was estimated with the survRM2 package for R.

If there are too many histology subtypes, multivariable analysis may fail. Go to Settings and set: “Analyze without detailed histology” → “Yes, use OncoTree 1st level”.

Table. Univariable and multivariable regression analyses were performed with the gtsummary package for R. Factors in the multivariable analysis were selected by backward elimination based on the Akaike information criterion with the stats package for R. Variables with a variance inflation factor > 10 were excluded to reduce multicollinearity.

Figure. Treatment reach rate and survival period after CGP. We estimated the cumulative incidence function (CIF) using Gray’s method for competing risk analysis. Treatment initiation was defined as the primary outcome, death was considered a competing event, and censoring was treated as informative. The cumulative incidence at each time point t was calculated as the probability of experiencing the specified event by t, reflecting the conditional probability in the presence of competing events. If the recommended treatment was confirmed to have been administered but the treatment initiation date was unknown, the treatment was assumed to have started at the median of the entire observation period. Statistical analyses were performed using the cmprsk package in R, and CIFs were estimated with 95% confidence intervals. We used competing risk analysis instead of the conventional Kaplan–Meier method for the following reasons:
Presence of competing events: Patients who die permanently lose the opportunity to receive treatment, making death a competing event.
Bias avoidance: Treating deaths as simple censored observations in Kaplan–Meier analysis could overestimate the treatment initiation rate.
Clinical interpretability: CIFs provide probabilities that more accurately reflect event occurrence as observed in real-world clinical settings.
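
The difference between a CIF and a naive 1 − KM estimate can be seen in the small sketch below. It implements the standard nonparametric cumulative incidence estimator without confidence intervals or Gray's test, so it is only an illustration of the idea and not a substitute for cmprsk. The status coding (0 = censored, 1 = treatment started, 2 = death) and the function name are assumptions.

```python
def cumulative_incidence(times, status, cause=1):
    """Nonparametric CIF for one cause in the presence of competing events.
    status: 0 = censored, 1 = event of interest, 2 = competing event."""
    data = sorted(zip(times, status))
    n_at_risk = len(data)
    event_free = 1.0  # probability of no event of any kind just before t
    cif, curve = 0.0, []
    i = 0
    while i < len(data):
        t = data[i][0]
        d_cause = d_any = n_here = 0
        while i < len(data) and data[i][0] == t:
            s = data[i][1]
            d_cause += (s == cause)
            d_any += (s != 0)
            n_here += 1
            i += 1
        if d_any:
            cif += event_free * d_cause / n_at_risk
            event_free *= 1 - d_any / n_at_risk
            curve.append((t, cif))
        n_at_risk -= n_here
    return curve

# Four patients: treated at t=1, died at t=2, censored at t=3, treated at t=4
curve = cumulative_incidence([1, 2, 3, 4], [1, 2, 0, 1])
```

In this toy data set the CIF for treatment initiation ends at 0.75, whereas 1 − KM with the death treated as censoring would reach 1.0, which is the overestimation mentioned above.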

Figure. Based on clinical information, a nomogram was developed to predict treatment reach. The nomogram was created with the lrm function of the rms package for R with penalty=1. Nagelkerke R2 was calculated with the blorr package for R. Possible sampling bias was corrected with 500 bootstrap resamples, and the concordance index was then estimated. Best_effect: the best treatment effect of CTx before CGP.

Table. Univariable and multivariable regression analyses were performed with the gtsummary package for R. Factors in the multivariable analysis were selected by backward elimination based on the Akaike information criterion with the stats package for R. Variables with a variance inflation factor > 10 were excluded to reduce multicollinearity.

Figure. Decision curve analysis was performed to verify the usefulness of the nomogram with dcurves package for R. Ten-fold cross-validation was performed to prevent overfitting. If the blue line is located above the other lines, then the nomogram-based decision to perform CGP testing may be worthwhile.
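
The quantity plotted in a decision curve is the net benefit at each threshold probability. The sketch below uses the standard definition (true positives minus false positives weighted by the threshold odds, per patient) and omits the cross-validation step; it is not the dcurves implementation, and the function names are hypothetical.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of acting on predictions with probability >= threshold."""
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1 - threshold)

def net_benefit_treat_all(y_true, threshold):
    """Reference strategy: act on every patient regardless of prediction."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)

reached = [1, 0, 1, 0, 0, 1]
predicted = [0.8, 0.3, 0.6, 0.1, 0.4, 0.7]
model_nb = net_benefit(reached, predicted, threshold=0.5)
all_nb = net_benefit_treat_all(reached, threshold=0.5)
```

The model is worth using at a given threshold when its net benefit exceeds both the treat-all strategy and zero (the treat-none strategy).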

Figure. Predicted treatment reach rate and Receiver Operating Characteristic (ROC) curve of the nomogram using pre-CGP information, drawn with the pROC package for R. The nomogram was based on logistic regression analysis, a random forest model, and a LightGBM model. For each model, sensitivity and specificity were calculated from the prediction results of 5-fold cross-validation and ROC curves were plotted. The random forest and LightGBM models used 5-fold cross-validation on a single training set and a grid search with n=8 for each parameter to determine the optimal parameters.

Download raw data

Not shown when machine learning is not performed

Clinical information 1

Clinical information 2

Metastasis information

Figure. Overall survival after the first survival-prolonging chemotherapy after adjusting for left-truncation bias. To evaluate the association between prognostic factors and survival, a risk-set (number at risk) adjustment model was applied to adjust for left-truncation bias with survival package. Note that the analysis assumes quasi-independent left-truncation (conditional Kendall tau = 0).
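
The risk-set adjustment can be illustrated with a Kaplan–Meier estimator that counts a patient as at risk only after the entry time (here, the CGP test). This is a minimal sketch of the idea, assuming times measured from the start of chemotherapy and entry strictly before the event or censoring time; it is not the survival package's code, and the function name is hypothetical.

```python
def km_left_truncated(entry, time, event):
    """Kaplan-Meier with delayed entry: a patient is at risk at t only if
    entry < t <= time. Assumes entry < time for every patient."""
    event_times = sorted({t for t, e in zip(time, event) if e})
    survival, curve = 1.0, []
    for t in event_times:
        at_risk = sum(1 for a, x in zip(entry, time) if a < t <= x)
        deaths = sum(1 for x, e in zip(time, event) if e and x == t)
        survival *= 1 - deaths / at_risk
        curve.append((t, survival))
    return curve

# Three patients: (entry, death time). The second entered at t=2,
# so that patient is not in the risk set for the death at t=1.
adjusted = km_left_truncated([0, 2, 0], [1, 3, 4], [1, 1, 1])
naive = km_left_truncated([0, 0, 0], [1, 3, 4], [1, 1, 1])
```

Ignoring the delayed entry (the `naive` curve) gives higher survival estimates at early time points, because patients who could not have been observed to die early are counted as having survived.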

Beware of left-truncation bias.

Raw data

Figure. Overall survival after the first survival-prolonging chemotherapy after adjusting for left-truncation bias. To evaluate the association between prognostic factors and survival, a risk-set (number at risk) adjustment model was applied to adjust for left-truncation bias with survival package. Note that the analysis assumes quasi-independent left-truncation (conditional Kendall tau = 0).

Beware of left-truncation bias.

Group 1

Group 2

Figure. Overall survival after the first survival-prolonging chemotherapy after adjusting for left-truncation bias. To evaluate the association between prognostic factors and survival, a risk-set (number at risk) adjustment model was applied to adjust for left-truncation bias with survival package. Note that the analysis assumes quasi-independent left-truncation (conditional Kendall tau = 0).

Group 1

Group 2

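About this analysis method (click to show details)

This app corrects the survival times of the CGP-tested cohort, which is real-world data (RWD), so that they are consistent with the age-group-specific OS (1 to 5 years) obtained from hospital-based cancer registries (macro data), and estimates the independent prognostic impact (Time Ratio; TR) of gene mutations and clinical factors.

However, residual confounding arising from unobserved background factors (e.g., PS, organ function, treatment eligibility, physician choice) cannot be fully removed. This analysis should therefore be interpreted not as establishing a causal effect but as an association (prognostic impact) standardized within the observable range.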

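Three main distortions addressed by the correction

Differences in age composition (case-mix shift): The CGP cohort often has a different age distribution from the general population (hospital-based cancer registry), so a direct comparison distorts the survival curves.
Selection on reaching CGP (survivor bias, similar to left truncation): Patients who underwent CGP testing are those who survived long enough to reach testing; patients who died soon after diagnosis are not observed.
Correlation between time to testing and prognosis (dependent truncation): Patients with a short interval from diagnosis to CGP often have rapidly progressing disease and also short survival after testing, which creates a biological correlation.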

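Improved correction approach: age-group-specific OS calibration + model estimation

Incorporating external information (hospital-based cancer registry): The age-group-specific 1- to 5-year OS (five points) obtained from the hospital-based cancer registry is used as reference information.
Calibration weights per age group (curve matching): For the observed OS of the CGP cohort (time_all), weights (iptw) are optimized within each age group so that the weighted KM curve matches the external 1- to 5-year OS points (smoothed curve).
Note: Because the weights match the age-group-specific curves, the result does not necessarily match a single all-ages curve.
Visualization with a mixture target: In the plots, the age-group-specific external curves are mixed according to the age composition of the analyzed population (e.g., Group 1/2), and the resulting mixture target (externally expected curve) is overlaid. This allows a comparison with less apparent discrepancy caused by differences in age composition.
Between-group comparison (HR) and displayed metrics: The difference between groups is estimated as a hazard ratio (HR) with a weighted Cox model (robust variance) and shown in the title with its 95% confidence interval.
Note: If events are extremely rare in either group, the HR may be unstable or not estimable (NA).
Median OS (95% CI): The median OS and 95% CI of each group are calculated from the weighted KM curve and shown in the title.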

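What is ESS (Effective Sample Size)?

When weighting gives a few cases large weights, the uncertainty of the estimate increases. ESS indicates how many patients' worth of information the weighted analysis is equivalent to, and is calculated approximately as ESS=(Σw)^2/Σ(w^2).

ESS ≈ N: The weights are nearly uniform and little information is lost (stable).
Small ESS: A few cases tend to dominate the estimate, and the results tend to be unstable.
Non-integer values are normal: ESS is an equivalent sample size and a continuous quantity, so it can take non-integer values.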
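
The ESS formula above translates directly into code. The sketch below is a minimal example; the function name is hypothetical.

```python
def effective_sample_size(weights):
    """ESS = (sum of weights)^2 / (sum of squared weights)."""
    return sum(weights) ** 2 / sum(w * w for w in weights)

uniform = effective_sample_size([1.0, 1.0, 1.0, 1.0])   # equals N = 4
dominated = effective_sample_size([3.0, 1.0])           # 16 / 10 = 1.6
```

With one case carrying three times the weight of the other, two patients contribute only about 1.6 patients' worth of information.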

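How to read the forest plot: what is the Time Ratio (TR)?

The forest plot uses the Time Ratio (TR; acceleration factor) based on an AFT model rather than the HR from a Cox model. TR is an intuitive measure of the factor by which survival time is stretched or shrunk.

TR > 1.0 : Favorable prognosis (e.g., TR=1.5 means survival time is extended 1.5-fold)
TR < 1.0 : Poor prognosis (e.g., TR=0.5 means survival time is halved)
TR = 1.0 : No effect

Note: Points show point estimates and error bars show 95% confidence intervals. An interval that does not cross TR=1.0 suggests a statistically significant association.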
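
In an AFT model, covariate effects are estimated on the log-time scale, so a time ratio is the exponential of a coefficient. The sketch below assumes a Wald-type 95% interval built from a coefficient and its standard error; how the app actually derives its intervals is not stated in this text, and the function name and inputs are hypothetical.

```python
from math import exp

def time_ratio(coef, se, z=1.96):
    """Time ratio and Wald 95% CI from an AFT log-time coefficient."""
    tr = exp(coef)
    lower, upper = exp(coef - z * se), exp(coef + z * se)
    # The association is flagged when the interval excludes TR = 1.0
    excludes_one = lower > 1.0 or upper < 1.0
    return tr, lower, upper, excludes_one

# Hypothetical coefficient 0.4055 (TR about 1.5) with standard error 0.1
tr, lower, upper, excludes_one = time_ratio(0.4055, 0.1)
```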

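Important notes (interpretation)

Curves that do not match do not necessarily mean that bias removal failed: When matching is done per age group, the result may not match the all-ages external curve. Always check against the mixture target.
Possible residual confounding: If unobserved factors other than age (PS, treatment eligibility, severity, etc.) differ between groups, a difference may remain after correction.
Estimates are unstable in groups with few events: The HR and the median CI may become wide or not estimable.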

Simulation Study: External data based calibration

Simulation Results

Single Run Estimates (Point Estimate & 95% CI)

400 Iterations Summary (Mean, MSE, and Coverage Rate [CR])

Visualizations for Manuscript (Fig 1 - 3)

Univariate Dependent Truncation: Copula vs Lynden-Bell

Estimated Median Survival Times

Reconstructed Marginal Survival Curves

Figure. Overall survival after the first survival-prolonging chemotherapy after adjusting for left-truncation bias. To evaluate the association between oncogenic mutations and survival, a risk-set adjustment model was applied to adjust for left-truncation bias with the survival package.

Beware of left-truncation bias.

Figure. Overall survival after the first survival-prolonging chemotherapy after adjusting for left-truncation bias. To evaluate the association between oncogenic mutations and survival, a risk-set adjustment model was applied to adjust for left-truncation bias with the survival package.

This may take several minutes.

Beware of left-truncation bias.

Figure. Hazard ratios estimated with a Cox model using the survival package.

This may take several minutes.

This setting also applies to Bayesian estimation in other tabs.

Group 1

Group 2

This may take several minutes.

Figure. Overall survival after the first survival-prolonging chemotherapy after adjusting for left-truncation bias. To evaluate the association between oncogenic mutations and survival, a Bayesian survival simulation based on a semi-independent, two-hit model was performed to adjust for left-truncation bias. Two survival curves from the date of commencement of chemotherapy for prolonging survival to the date of CGP and from the CGP testing date to the date of death were fitted with Weibull distribution and log-logistic distribution, respectively. The overall survival curve from the first chemotherapy induction was approximated by merging these survival curves. Survival curves were obtained from each of the 8000 iterations of inference, and the median survival and 95% equal-tailed CIs were calculated. Bayesian inference was performed with the rstan package for R. P value of conditional Kendall tau statistics was calculated, and the survival curves were adjusted for length bias, using a structural transformation method with tranSurv package for R.

Tamura T, et al., Selection bias due to delayed comprehensive genomic profiling in Japan. Cancer Science, 2022. Link for the paper

This may take several minutes.

Figure. Overall survival after the first survival-prolonging chemotherapy after adjusting for left-truncation bias. To evaluate the association between oncogenic mutations and survival, a Bayesian survival simulation based on a semi-independent, two-hit model was performed to adjust for left-truncation bias. Two survival curves from the date of commencement of chemotherapy for prolonging survival to the date of CGP and from the CGP testing date to the date of death were fitted with Weibull distribution and log-logistic distribution, respectively. The overall survival curve from the first chemotherapy induction was approximated by merging these survival curves. Survival curves were obtained from each of the 8000 iterations of inference, and the median survival and 95% equal-tailed CIs were calculated. Bayesian inference was performed with the rstan package for R.

Tamura T, et al., Selection bias due to delayed comprehensive genomic profiling in Japan. Cancer Science, 2022. Link for the paper

Simulation settings

Figure. Overall survival after the first survival-prolonging chemotherapy.

Tamura T, et al., Selection bias due to delayed comprehensive genomic profiling in Japan. Cancer Science, 2022. Link for the paper

Hidden Download

Drug response analysis

Overall drug usage

Patients without treatment time are excluded from the treatment time dataset.

Patients with RECIST-NE are excluded from the objective response dataset.

Patients without treatment time or with RECIST-NE are excluded from the adverse effect dataset.

Figure. Time on treatment analysis for the survival-prolonging chemotherapy. Time on treatment is the period from the start date to the end date of chemotherapy. Patients still on the medication at the time of CGP testing were censored; for all other patients, the end of treatment was counted as an event. The curve was generated using the Kaplan-Meier method.

Figure. Treatment time.

Group 1

Group 2

Table. Univariable and multivariable regression analyses were performed with the gtsummary package for R. Factors in the multivariable analysis were selected by backward elimination based on the Akaike information criterion with the stats package for R. Variables with a variance inflation factor > 10 were excluded to reduce multicollinearity.

Volcano plots for frequent regimens

Raw data for volcano plots of all regimens

Hazard ratios and p-values were calculated with a Cox regression model that accounts for pathology.

All regimens were included, as were genes mutated in at least 8 of the treated patients.

Volcano plots for frequent regimens

Figure. Patients with objective response data who were treated with the specified drugs were divided into two groups, those who achieved an objective response and those who did not. Odds ratios and p-values were calculated and shown as a volcano plot.

Odds ratios and p-values were calculated with a logistic regression model that accounts for pathology.

All regimens were included, as were genes mutated in at least 8 of the treated patients.

Objective response: CR or PR.

Table. Univariable and multivariable regression analyses were performed with the gtsummary package for R. Factors in the multivariable analysis were selected by backward elimination based on the Akaike information criterion with the stats package for R. Variables with a variance inflation factor > 10 were excluded to reduce multicollinearity.

Candidate genes: arbitrarily selected genes, the five most frequently mutated genes, and genes with a significant odds ratio for objective response rate or a significant association with time on treatment.

Objective response: CR or PR.

Table. Univariable and multivariable regression analyses were performed with the gtsummary package for R. Factors in the multivariable analysis were selected by backward elimination based on the Akaike information criterion with the stats package for R. Variables with a variance inflation factor > 10 were excluded to reduce multicollinearity.

Candidate genes: arbitrarily selected genes, the five most frequently mutated genes, and genes with a significant odds ratio for objective response rate or a significant association with time on treatment.

Objective response: CR or PR.

Table. Univariable and multivariable regression analyses were performed with the gtsummary package for R. Factors in the multivariable analysis were selected by backward elimination based on the Akaike information criterion with the stats package for R. Variables with a variance inflation factor > 10 were excluded to reduce multicollinearity.

Candidate genes: arbitrarily selected genes, the five most frequently mutated genes, and genes with a significant odds ratio for objective response rate or a significant association with time on treatment.

Disease control: CR, PR, or SD.

OR: Objective response (CR or PR), ORR: Objective response rate, DC: Disease control (CR, PR, or SD), DCR: Disease control rate

95% confidence intervals were calculated using the Clopper-Pearson method.
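
For readers unfamiliar with the Clopper-Pearson interval, the sketch below computes it for a response rate by inverting the exact binomial tail probabilities with bisection. This is an illustration using only the standard library and not the app's code; the function names are hypothetical.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def clopper_pearson(k, n, alpha=0.05):
    """Exact (Clopper-Pearson) two-sided CI for k responders out of n patients."""
    def bisect(f):
        # f is increasing in p on [0, 1] with a sign change; find its root
        lo, hi = 0.0, 1.0
        for _ in range(100):
            mid = (lo + hi) / 2
            if f(mid) < 0:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    # Lower bound: the p at which P(X >= k) equals alpha / 2
    lower = 0.0 if k == 0 else bisect(
        lambda p: (1 - binom_cdf(k - 1, n, p)) - alpha / 2)
    # Upper bound: the p at which P(X <= k) equals alpha / 2
    upper = 1.0 if k == n else bisect(
        lambda p: alpha / 2 - binom_cdf(k, n, p))
    return lower, upper

# Objective response in 3 of 20 evaluable patients (hypothetical numbers)
orr_ci = clopper_pearson(3, 20)
```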

Volcano plots for frequent regimens

Figure. Patients with objective response data who were treated with the specified drugs were divided into two groups, those who achieved an objective response and those who did not. Odds ratios and p-values were calculated and shown as a volcano plot.

Odds ratios and p-values were calculated with a logistic regression model that accounts for pathology.

All regimens were included, as were genes mutated in at least 8 of the treated patients.

Table. Univariable and multivariable regression analyses were performed with the gtsummary package for R. Factors in the multivariable analysis were selected by backward elimination based on the Akaike information criterion with the stats package for R. Variables with a variance inflation factor > 10 were excluded to reduce multicollinearity.

Candidate genes: arbitrarily selected genes, the five most frequently mutated genes, and genes with a significant odds ratio for objective response rate or a significant association with time on treatment.

Objective response: CR or PR.

Table. Univariable and multivariable regression analyses were performed with the gtsummary package for R. Factors in the multivariable analysis were selected by backward elimination based on the Akaike information criterion with the stats package for R. Variables with a variance inflation factor > 10 were excluded to reduce multicollinearity.

Candidate genes: arbitrarily selected genes, the five most frequently mutated genes, and genes with a significant odds ratio for objective response rate or a significant association with time on treatment.

Objective response: CR or PR.

Cumulative Incidence of Adverse Events

Cumulative incidence of adverse events stratified by drug effectiveness, accounting for the competing risk of treatment completion. The observation period was defined as 90 days from treatment completion to capture both acute and delayed adverse events. Blue line: patients with objective response; Red line: patients without response. Shaded areas represent 95% confidence intervals. Statistical comparison was performed using Gray's test. The subdistribution hazard ratio with 95% confidence interval and p-value from the Fine-Gray competing risk regression model adjusted for age, sex, smoking status, diagnosis, and treatment line are displayed.

No Response: PD/SD, Response: PR/CR

FELIS: Flexible Exploration for LIquid and Solid tumor clinical sequencing data

— C-CAT Secondary Use Data Analysis Platform —

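FELIS (Flexible Exploration for LIquid and Solid tumor clinical sequencing data) is a locally run web app for C-CAT (secondary use) data. Through a GUI (R/Shiny), it covers the whole workflow from cohort definition and visualization to outcome analysis that accounts for clinically important biases.

Note: This software is intended only for those who can lawfully obtain C-CAT secondary use data. Use it in accordance with your institution's ethical review and data use terms. Users of this program shall hold C-CAT and NCC harmless from claims by third parties for any damage, direct or indirect, arising from the use of this program. Users shall not prevent C-CAT and NCC from exercising their right to indemnification with respect to third-party claims for damages arising from the use of this program.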

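1. Overview

C-CAT is the national database in Japan that aggregates CGP test results and clinical information from cancer genomic medicine. FELIS targets its secondary use data and aims to let users iterate through the following steps in a no-code/low-code manner.

Cohort construction
Mutation summary and visualization
Treatment and prognosis analysis

A distinguishing feature is that the GUI integrates analysis workflows designed with delayed entry and left truncation in mind, which are common problems in CGP real-world data.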

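Settings: Specification of the analysis target (cases, histology, genes, treatments, etc.), thresholds, model settings, etc.
Results: Figures and tables for each analysis (including save/download).
Instruction / Tips: Shows the README / Tips inside the app (the Instruction tab displays README.md).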

3. Results: meaning of the outputs (comprehensive reference)

The following describes each output, following the FELIS menu structure.

■ Case summary

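Summarized by mutation pattern: Table summarizing patients' baseline characteristics and clinical variables by the specified mutation pattern, etc.
Summarized by histology: Number of cases, clinical information, and mutation summary per histology (OncoTree, etc.).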

■ Oncoprint（図・テーブル）

Oncoprint 選択コホートで頻度の高い遺伝子の変異景観（患者×遺伝子）を可視化。
Lolliplot for the selected gene 指定遺伝子のアミノ酸変化（ホットスポット等）の頻度分布。
Table of clinical and mutation information per patient 患者単位の臨床情報と変異情報のテーブル。解析結果の元データ確認・二次解析用。

■ Mutual exclusivity（共起/排他）

Figure for probability / odds ratio 変異ペアの共起/排他傾向を、確率・オッズ比などで可視化（青＝排他、赤＝共起）。
table_mutually_exclusive ペアごとの統計量（p値等）を一覧。

■ Variant rate by histology（組織型別頻度）

figure_mut_subtype_plot 組織型サブタイプ別の頻度上位遺伝子の変異率を比較。

■ Mutation and treatment option（変異クラスタリング等）

Basic data 組織型ごとの基本統計（年齢、性別、変異、TMB、治療オプション/実治療、CGP前後の期間など）を図表化。
UMAP clustering based on mutations / Cluster and histology relationship 変異パターンに基づく患者クラスタ（UMAP + DBSCAN等）を作成し、クラスタと組織型・変異の富化を表示。
Heterogeneity within histologic types 組織型ごとのクラスタ分布（集中/分散）をエントロピー等で評価。
Frequency of patients with targeted therapy 各患者におけるエビデンスレベル別の治療オプション頻度を集計。

■ Survival after CGP（CGP後生存）

Survival and clinical information CGP検査日を起点にした生存曲線（KM等）と、層別（組織型・PS・治療など）。
Custom survival analysis 2群比較をGUIで定義し、KM/RMST等を出力。PSM/IPWなどの背景調整（Love plot、重み分布、PS分布等）も提供。
Survival and mutations, forest plot 変異有無等の層別でRMST差/推定効果をフォレストプロットで表示。
Hazard ratio Coxモデル等での単変量/多変量の推定結果（gtsummary形式）。
Survival period and treatment reach rate “治療開始”をアウトカムとし、死亡を競合リスクとして扱うCIF（累積発生関数）等の到達解析。

■ CGP benefit prediction（到達予測・DCA/ROC）

Nomogram CGP前情報から“治療到達”を予測するノモグラムを表示。
Odds ratio 到達に関連する因子のオッズ比（単変量/多変量）。
Decision curve DCA（Decision Curve Analysis）で、予測モデルを使う臨床的価値（Net benefit）を評価。
ROC curve of nomogram ROC（AUC等）による予測精度の評価。
Input data（Input_data） GUIから臨床変数を手入力し、上記モデルで予測値を返すシミュレーション。

■ Overall survival with risk-set adjustment（左側切断補正手法：risk-set）

Survival and clinical information（SurvivalandtreatmentafterCTx） 生存の起点を「化学療法開始」等に置いた時の、左側切断（delayed entry）を補正した推定。
Custom survival analysis 2群比較におけるRisk-set補正後の比較。
Frequent variants and survival / Diagnosis and survival / Mutational cluster and survival 変異、診断、クラスタなどを説明変数にした生存解析の図（フォレスト/曲線）。
Hazard ratio for survival after CTx 補正枠組み下でのCoxモデル等の推定結果テーブル。

■ Survival after CTx with Bayesian inference（左側切断補正手法：Bayesian）

Survival corrected for left-truncation bias Stanを用いたベイズ推定により、左側切断を考慮した生存曲線の中央値と信用区間を出力。
Custom survival analysis ベイズ枠組みでの2群間比較。
Genetic variants and survival / Diagnosis and survival 変異や診断で層別した、補正後の生存曲線や効果推定。

■ Survival after CTx with control cohort data（experimental）

Custom survival analysis 院内がん登録を対照コホート情報として用いた実験的解析手法。現状はSettingsでStage4患者のみを選択する必要があり、また診断後の生存期間解析に限定。

■ Bias correction simulation（シミュレーション）

Left-truncation bias adjustment simulation パラメータを与えて、左側切断補正の挙動を視覚的に理解するためのシミュレーター。

■ Drug response（ToT/ORR/AE/薬剤別生存）

Settings 解析対象の投与ライン、解析期間（Time on treatment (ToT)など）、対象薬剤群（まとめ上げ）などを指定。
Drug usage data 患者×薬剤使用の元データテーブル（ダウンロード可能）。
Treatment time and clinical information ToTのKM曲線、事前治療との関係（散布図）、フォレストプロット。
Treatment time comparison 2群定義をGUIで指定し、ToTを比較。
ToT by gene mutation cluster / by mutated genes / by mutation pattern クラスタ・遺伝子変異・変異パターンでToTを層別化。
Hazard ratio on time on treatment（genes / clusters） ToTをアウトカムとしたCox等の推定結果テーブル。
Volcano plot（ToT / ORR / AE） レジメン×遺伝子等の関連を、効果量と有意性でvolcano plotを表示。
Cumulative incidence of adverse effect AEをイベントとして競合リスクを意識した累積発生を表示。
Survival and drug “投薬開始日”起点の生存を薬剤別に表示。

4. 統計・疫学的バイアスへの対応 (Methodology)

FELISの設計思想:

セキュア環境: オフライン環境で動作し、クラウドを使わずに高度な可視化を実現。
選択バイアスの克服: CGP RWD特有の delayed entry / left truncation による生存の過大評価（不死時間バイアス）を、Risk-set補正やベイズシミュレーションで解決。

5. 引用 (References)

研究成果に使用する際は、以下の文献を引用してください。

@article{Mano2024FELIS,
  title={FELIS: Flexible Exploration for LIquid and Solid tumor clinical sequencing data},
  author={IKEGAMI, Masachika},
  journal={GitHub Repository},
  url={https://github.com/MANO-B/FELIS},
  year={2024}
}

FELIS: Flexible Exploration for LIquid and Solid tumor clinical sequencing data

— C-CAT Secondary Use Data Analysis Platform —

FELIS is a local-execution Web application (R/Shiny) designed for researchers to perform cohort definition, visualization, and bias-aware outcome analysis using C-CAT secondary use data.

Note: This software is intended only for those who can lawfully obtain C-CAT secondary use data. Please comply with your institution's ethical review and data usage terms. The user of this program shall indemnify C-CAT and NCC against any third-party claims for direct or indirect damages arising from the use of this program. The user shall not prevent C-CAT and NCC from exercising their right of recourse against third-party claims for damages arising from the use of this program.

1. Overview

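C-CAT (Center for Cancer Genomics and Advanced Therapeutics) maintains the national database that aggregates CGP test results and clinical information from cancer genomic medicine in Japan. FELIS aims to enable “no-code/low-code” iterative exploration of its secondary use data, covering: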

Cohort construction
Mutation summarization and visualization
Treatment and prognosis analysis

Specifically, it integrates analysis workflows that account for delayed entry and left truncation, which are common issues in real-world CGP (Cancer Genomic Profiling) data.
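
To illustrate why delayed entry matters, here is a minimal Python sketch of a Kaplan-Meier estimate with a risk-set adjustment for left truncation. All names and numbers are hypothetical and this is not the FELIS implementation: times are assumed to be measured from chemotherapy start, `entry` is the time of the CGP test, and a patient counts as at risk only after entry.

```python
def km_left_truncated(entry, exit_time, event):
    """Product-limit survival estimate with delayed entry.

    entry     : time at which each patient comes under observation
    exit_time : time of death or censoring
    event     : 1 if death was observed at exit_time, else 0
    Returns a list of (time, survival probability) at each death time.
    """
    death_times = sorted({x for x, d in zip(exit_time, event) if d})
    surv = 1.0
    curve = []
    for t in death_times:
        # Risk set: patients already entered and not yet exited at time t.
        at_risk = sum(1 for a, x in zip(entry, exit_time) if a < t <= x)
        deaths = sum(1 for x, d in zip(exit_time, event) if d and x == t)
        if at_risk == 0:
            continue
        surv *= 1 - deaths / at_risk
        curve.append((t, surv))
    return curve


# Hypothetical toy cohort (years since chemotherapy start).
entry = [0, 0, 2, 3]
exit_time = [1, 4, 5, 6]
event = [1, 1, 1, 0]

adjusted = km_left_truncated(entry, exit_time, event)
# Ignoring delayed entry is equivalent to pretending everyone entered at 0.
naive = km_left_truncated([0] * len(entry), exit_time, event)
print("adjusted:", adjusted)
print("naive:   ", naive)
```

In this toy cohort the naive estimate is higher at the later time points, because late-entering patients are counted as survivors during a period in which they could not have been observed to die.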

Settings: Specify analysis targets (cases, histology, genes, treatments), thresholds, and model settings.
Results: Display and download charts/tables for each analysis.
Instruction / Tips: Display this README and operational tips within the app.

3. Results: Detailed Reference of Outputs

Below is a comprehensive list of what each output in the FELIS menu signifies.

■ Case summary

Summarized by mutation pattern: A table summarizing patient attributes and clinical variables based on specified mutation patterns.
Summarized by histology: Case counts, clinical info, and mutation summaries grouped by histology (e.g., OncoTree).

■ Oncoprint

Oncoprint: Visualization of the mutational landscape (Patients x Genes) for high-frequency genes in the selected cohort.
Lolliplot for the selected gene: Frequency distribution of amino acid changes (hotspots) for a specific gene.
Table of clinical and mutation information per patient: Raw patient-level data for verification and secondary analysis.

■ Mutual exclusivity

Figure for probability / odds ratio: Visualization of co-occurrence or mutual exclusivity trends between gene pairs (red = co-occurrence, blue = exclusivity).
table_mutually_exclusive: Statistical table (p-values, etc.) for each pair.

■ Variant rate by histology

figure_mut_subtype_plot: Comparison of mutation rates of top genes across histological subtypes to identify characteristic mutations.

■ Mutation and treatment option

Basic data: Statistics (Age, Sex, TMB, Treatment options, etc.) by histology.
UMAP clustering based on mutations: Dimensionality reduction and clustering (DBSCAN/HDBSCAN) to identify patient clusters based on mutation patterns.
Cluster and histology relationship: Enrichment analysis of clusters across histology and mutations.
Heterogeneity within histologic types: Evaluation of cluster distribution (entropy) within each histological type.
Frequency of patients with targeted therapy: Aggregated frequency of treatment options by evidence level.
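
As a rough illustration of the entropy-based heterogeneity measure, the Python sketch below computes the Shannon entropy of cluster membership within one histology. The function name and labels are hypothetical, and the exact measure used by FELIS may differ.

```python
import math
from collections import Counter


def cluster_entropy(cluster_labels):
    """Shannon entropy (bits) of the cluster distribution.

    0 means every patient falls in one cluster; larger values mean the
    histology is spread over more clusters.
    """
    n = len(cluster_labels)
    counts = Counter(cluster_labels)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


# Hypothetical cluster assignments for two histologies.
print(cluster_entropy(["C1", "C1", "C1", "C1"]))  # concentrated
print(cluster_entropy(["C1", "C1", "C2", "C2"]))  # evenly split over two clusters
```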

■ Survival after CGP

Survival and clinical information: Survival curves (KM, etc.) starting from the date of CGP testing.
Custom survival analysis: GUI-defined two-group comparison with KM/RMST and adjustments like PSM/IPW (including Love plots and weight distributions).
Survival and mutations, forest plot: Forest plot showing RMST differences or estimated effects stratified by mutation status.
Hazard ratio: Univariate and multivariate estimates from Cox models, shown as a table (gtsummary format).
Survival period and treatment reach rate: Cumulative Incidence Function (CIF) for “treatment reach,” treating death as a competing risk.

■ CGP benefit prediction

Nomogram: Predictive model (logistic regression) for reaching recommended treatment based on pre-CGP info.
Odds ratio: ORs for factors related to treatment reach.
Decision curve: Decision Curve Analysis (DCA) to evaluate the clinical net benefit of the model.
ROC curve of nomogram: Evaluation of predictive accuracy (AUC).
Input data: Simulator to return predicted values based on manually entered clinical variables.
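
The net benefit plotted in a decision curve can be sketched as follows. This Python snippet uses the standard definition, net benefit = TP/n - (FP/n) * pt/(1 - pt) at threshold probability pt; the function names and toy data are hypothetical and not taken from FELIS.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of acting on predictions at a given threshold probability."""
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1 - threshold)


def net_benefit_treat_all(y_true, threshold):
    """Reference strategy: treat every patient as predicted positive."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Hypothetical data: 1 = reached recommended treatment, with predicted probabilities.
reached = [1, 0, 1, 0]
predicted = [0.9, 0.2, 0.6, 0.7]
for pt in (0.25, 0.5):
    print(pt, net_benefit(reached, predicted, pt), net_benefit_treat_all(reached, pt))
```

A model is clinically useful at a threshold when its net benefit exceeds both the treat-all reference and zero (the treat-none reference).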

■ Overall survival with risk-set adjustment (Left-truncation correction)

Survival and clinical information (after CTx): Survival estimation starting from “Chemotherapy initiation” with risk-set adjustment to handle the bias of delayed CGP entry.
Custom survival analysis: Two-group comparison using the risk-set adjustment framework.
Frequent variants / Diagnosis / Mutational cluster and survival: Survival analysis (Forest/Curves) with mutations, histology, or clusters as explanatory variables.
Hazard ratio for survival after CTx: Cox model estimation results for corrected workflows.

■ Survival after CTx with Bayesian inference

Survival corrected for left-truncation bias: Bayesian estimation using Stan that outputs the median survival curve with credible intervals.
Custom survival analysis (BayesCustom): Two-group comparison within the Bayesian framework.
Genetic variants / Diagnosis and survival: Corrected survival curves stratified by variants or histology.

■ Survival after CTx with control cohort data (Experimental)

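Custom survival analysis (ControlCustom): Experimental analysis that uses hospital-based cancer registry data as external control cohort information. Currently, only Stage 4 patients must be selected in Settings, and the analysis is limited to survival after diagnosis.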

■ Bias correction simulation

Left-truncation bias adjustment simulation: Simulation output to visualize how correction methods behave under different parameters. Used for sensitivity checks.

■ Drug response

Settings (Drugusebylineoftreatment): Define lines of therapy, ToT/TTF definitions, and drug groupings. Provides an overview of commonly used regimens.
Drug usage data (Drugperpatient): Raw patient-by-drug usage table (downloadable).
Treatment time and clinical information: ToT KM curves, correlation with pre-treatment (scatter plots), and forest plots.
Treatment time comparison (ToT_interactive): Interactive GUI for comparing ToT between two defined groups.
ToT by gene mutation cluster / by mutated genes / by mutation pattern: Stratification of ToT by clusters, specific mutations, or patterns.
Hazard ratio on time on treatment: Cox estimation for ToT outcomes.
Volcano plot (ToT / ORR / AE): Exploration of associations between regimens/genes and outcomes (effect size vs. significance).
Cumulative incidence of adverse effect: Visualization of AE occurrences considering competing risks.
Survival and drug (Survival_drug): Survival analysis starting from the “Drug initiation date.”

4. Methodological Background (Bias Correction)

Design rationale of FELIS:

C-CAT data requires secure/offline processing, making cloud-based platforms difficult to use.
Real-world CGP data suffers from selection bias due to delayed entry/left truncation.
FELIS addresses this by separating Post-CGP analysis (Naive KM/CIF) and Post-CTx analysis (Risk-set adjustment/Bayesian simulation).

5. References

If you use FELIS in your research, please cite:

@article{Mano2024FELIS,
  title={FELIS: Flexible Exploration for LIquid and Solid tumor clinical sequencing data},
  author={Ikegami, Masachika},
  journal={GitHub Repository},
  url={https://github.com/MANO-B/FELIS},
  year={2024}
}

License: MIT License
Developer: MANO-B

💡 FELIS 解析Tips集

【ご利用ガイド】 下記の各項目（▶ 太字の部分）をクリックすると、詳細な解析手順が展開されます。もう一度クリックすると閉じます。

💊 薬剤奏効性解析

▶ ある治療と、一つ前のラインの治療との効果比較をしたい ▽

Setting から組織型や年齢などの絞り込みを行う
Results -> Drug response -> Settings から薬剤の選択を行う
Results -> Drug response -> Summary Tables から使用状況を確認する
Results -> Drug response -> Time on treatment -> Comparison of the treatment duration of the selected drug and the preceding treatment line で、同一患者群での前治療と指定治療の time on treatment の比較を行う
Results -> Drug response -> Time on treatment -> Treatment time comparison で、CTx lines in which drugs used等を指定し前治療と指定治療の time on treatment の比較を行う

💡 Tip: 1ライン前の治療よりもToTが長ければ、その治療は指定した患者群に有効性が高い可能性があります。

▶ 二つの治療の治療効果比較をしたい ▽

Setting から組織型や年齢などの絞り込みを行う
Results -> Drug response -> Settings から薬剤の選択を行う
Results -> Drug response -> Summary Tables から使用状況を確認する
Results -> Drug response -> Time on treatment -> Treatment duration in the selected regimen, grouped by regimen で、Settingsで選択した治療の time on treatment の比較を行う
Results -> Drug response -> Time on treatment -> Treatment time comparison で、CTx lines in which drugs used等を指定し2つの治療の time on treatment の比較を行う
Analysis -> Survival after drug initiation date -> Choose drugs for treatment effect analysis で二つ以上の治療を選択する。比較したい2レジメンを、それぞれ Drug set 1 と Drug set 2 に入力する
Analysis -> Drug response analysis -> Survival and drug で、設定後にAnalyzeボタンを押し、緩和的化学療法導入後に指定したレジメンで2群で分けた生存曲線を確認する

💡 Tip: 例えば膵がんの1st line治療としてFOLFIRINOXとGEM + nab-PTXのどちらが生存期間が優れるかの比較ができたりします。

▶ 組織型ごとの治療効果をみたい ▽

Setting から組織型や年齢などの絞り込みを行う
Results -> Drug response -> Settings から薬剤の選択を行う
Results -> Drug response -> Summary Tables から使用状況を確認する
Results -> Drug response -> Time on treatment -> Treatment duration in the selected regimen, grouped by regimen で、Settingsで選択した治療の time on treatment の比較を行う
Results -> Drug response -> Time on treatment -> Treatment duration of all drugs in the selected line, grouped by diagnosisで、組織型で群分けした全治療の Time on treatment を Kaplan-Meier 法で評価する
Results -> Drug response -> Time on treatment -> Time on treatment and pre-treatment for the specified treatment, KM-curve で、遺伝子変異の有無で群分けした指定治療の Time on treatment を Kaplan-Meier 法で評価する

💡 解釈のポイント: 治療期間が組織型間で変わりがなく、指定薬剤での治療期間が特定の組織型で長期である場合、その組織型が指定薬剤の有効性が高い可能性が示唆されます。

▶ ある薬剤の効果における遺伝子変異の有無の意義をみたい ▽

Setting から組織型や年齢などの絞り込みを行う
Results -> Drug response -> Settings から薬剤の選択を行う
Results -> Drug response -> Summary Tables から使用状況を確認する
Results -> Drug response -> Time on treatment -> Treatment duration in the selected regimen, grouped by regimen で、Settingsで選択した治療の time on treatment の比較を行う
Results -> Drug response -> Time on treatment -> Treatment duration of all drugs in the selected line, grouped by diagnosisで、全ての治療あるいは指定治療の Time on treatment を Kaplan-Meier 法で評価する
Results -> Drug response -> Time on treatment -> Treatment duration in the selected regimen, grouped by detailed mutation で、遺伝子変異の有無で群分けした指定治療の Time on treatment を Kaplan-Meier 法で評価する
Results -> Drug response -> Time on treatment -> Time on treatment by tissue type, KM-curve で、遺伝子変異の有無などで2群に分けたうえで Time on treatment を Kaplan-Meier 法で評価する

💡 解釈のポイント: 全薬剤での治療期間が遺伝子変異の有無で変わりがなく、指定薬剤での治療期間が遺伝子変異の有無で差がある場合、その遺伝子変異が指定薬剤の biomarker である可能性が示唆されます。

▶ 薬剤の効果と遺伝子変異の関係を volcano plot で網羅的に確認したい ▽

Setting から組織型や年齢などの絞り込みを行う
Results -> Drug response -> Settings から薬剤の選択を行う
Results -> Drug response -> Summary Tables から使用状況を確認する
Results -> Drug response -> Time on treatment -> Volcano plot for treatment time, hazard ratio でTime on treatment に関連する遺伝子変異を探索する
Results -> Drug response -> Response rate -> Volcano plot for objective response rate でObjective response に関連する遺伝子変異を探索する

💡 見方: 水色の遺伝子では変異があると奏効率が高く、赤い遺伝子では変異があると奏効率が低くなります。下の方から表が閲覧できます。

📈 生存期間解析

▶ CGP検査後の予後と遺伝子変異の関係をみたい ▽

Setting から組織型や年齢などの絞り込みを行う
Results -> Survival after CGP -> Survival analysis -> Survival after CGP and performance status から指定遺伝子セット内のいずれかに変異があるか否かで群分けした生存曲線を確認する。
Results -> Survival after CGP -> Survival analysis -> Survival after CGP and mutations, forest plot で、変異頻度の高い遺伝子について、変異の有無での2群間での生存期間の比較を行う
Results -> Survival after CGP -> Survival analysis -> Survival after CGP and mutations, KM-curve で、変異頻度の高い遺伝子について、変異の有無での2群間での生存曲線の比較を行う
Results -> Survival after CGP -> Survival analysis -> Custom survival analysis で、変異の有無などで分けた2群間での生存曲線の比較を行う

📝 補足: Setting の Timing for RMST measuring in survival analysis (years) で、forest plot で描画する生存期間 (restricted mean survival time) の差を計算する時期を指定します。

▶ 緩和的化学療法導入後の予後と遺伝子変異の関係をみたい ▽

Setting から組織型や年齢、治療コースなどの絞り込みを行う。とくに Genes of interest で注目する遺伝子セットを指定する。
Results -> Overall survival with risk-set adjustment -> Survival and clinical information -> Entire cohort で、cKendall tauが0付近で、妥当な生存期間が予測されているか確認する
Results -> Overall survival with risk-set adjustment -> Custom survival analysis で、臨床情報や遺伝子変異情報によって群分けして生存期間の比較を行う
Results -> Survival after CTx with Bayesian inference -> Genetic variants and survival, forest plot で、変異頻度の高い遺伝子について、変異の有無での2群間での生存期間の比較を行う
Results -> Survival after CTx with Bayesian inference -> Genetic variants and survival, KM-curve で、変異頻度の高い遺伝子について、変異の有無での2群間での生存曲線の比較を行う
Results -> Survival after CTx with control cohort data (experimental) -> Custom survival analysis で、臨床情報や遺伝子変異情報によって群分けして生存期間の比較を行う

📝 補足: 左側切断バイアスを補正した上で生存期間解析を行います。3種類の補正手法がありますが、院内がん登録という外部情報で補正した手法が最も妥当なように思われます（こちら）

▶ ある薬剤を生存中に使用したか否かでの生存期間の差をみたい ▽

Setting から組織型や年齢などの絞り込みを行う
Results -> Drug response -> Settings から薬剤の選択を行う
Results -> Drug response -> Summary Tables から使用状況を確認する
Results -> Drug response -> Survival after CGP -> Survival and drug で、緩和的化学療法導入後に指定したレジメンを使用したか否かでの2群で分けた生存曲線を確認する

📝 補足: 左側切断バイアスを補正した場合としない場合で生存曲線が描かれます。

▶ ある薬剤をCGP検査後に使用したか否かでの生存期間の差をみたい ▽

Setting から組織型や年齢などの絞り込みを行う
Results -> Survival after CGP -> Survival analysis -> Custom survival analysis で、薬剤の使用歴などで分けた2群間での生存曲線の比較を行う

📝 補足: 通常のカプラン・マイアー生存曲線が描かれます。

▶ CGP検査後の死亡ハザードに関係する因子を抽出したい ▽

Setting から組織型や年齢などの絞り込みを行う。
Results -> Survival after CGP -> Survival analysis -> Hazard ratio から、単変量解析・多変量解析でのハザード比に関係する臨床情報や遺伝子変異を検討する。

📝 補足: 赤池情報量規準（AIC）を用いて変数減少法で自動的に変数選択を行っています。多重共線性はVIF>10となる因子を除外しています。

👥 患者背景

▶ 患者背景のTableが欲しい ▽

Setting から組織型や年齢の絞り込みを行う
Results -> Case summary から結果を確認する
必要があれば全体をコピーしてWordに保存する

📝 補足: Setting の Filters on mutation types で選択した遺伝子変異の有無で群分けして表示されます。

▶ 患者背景のFigureが欲しい ▽

Setting から組織型や年齢の絞り込みを行う
Results -> Clustering analysis -> Basic data から結果を確認する
必要があれば全体をコピーしてWordに保存する

📖 項目の意味:

Driver: 何らかのがん化変異 (C-CAT evidence level “F”) が検出された症例か否かを示します。

Pts with recommended CTx: エキスパートパネルで推奨治療があった症例の割合。

Pts received recommended CTx: 推奨治療を実際に受けた症例の割合。

Median time from CTx to CGP: 緩和的化学療法開始日からCGP検査日までの期間の中央値。

Median time from CGP to death: CGP検査日から死亡までの期間のKaplan-Meier法での中央値。

▶ オンコプリント（遺伝子変異の一覧表）が欲しい ▽

Setting から組織型や年齢の絞り込みを行う
Results -> Oncoprint -> Figures -> Oncoprint から結果を確認する
Results -> Oncoprint -> Downloadable table から患者の臨床情報と変異情報をダウンロードする
Results -> Variation by histology から組織型ごとにどの遺伝子変異の頻度が高いのかを確認する

📝 補足: 描画した元データは Downloadable table からExcelファイルでダウンロード可能です。

▶ ロリプロット（遺伝子のどのアミノ酸残基に変異が多いかの図）が欲しい ▽

Setting から組織型や年齢の絞り込みを行う
Results -> Oncoprint -> Figures -> Lolliplot for the selected gene から遺伝子を選択し結果を確認する

📝 補足: 現状ではエキソンスキッピングやイントロンの変異には対応していません。

▶ 遺伝子間の相互排他性・共変異の情報が欲しい ▽

Setting から組織型や年齢の絞り込みを行う
Results -> Mutually exclusivity から結果を確認する

💡 見方: X軸の遺伝子とY軸の遺伝子の交わるセルの色が青いと両者は相互排他的、赤いと共変異の関係です。

🎯 治療到達性解析

▶ どのような患者にCGP検査を行うと治療到達率が高いのかを知りたい ▽

Setting から組織型や年齢、治療コースなどの絞り込みを行う。とくに Genes of interest で注目する遺伝子セットを指定する。
Results -> CGP benefit prediction -> Factors leading to treatment -> Factors leading to treatment, pre-CGP, Nomogram から、検査前に得られる患者の臨床情報に基づいて治療到達率を予測するノモグラムを得る。
Results -> CGP benefit prediction -> Factors leading to treatment -> Factors leading to treatment, pre-CGP, Odds ratio から、検査前に得られる患者の臨床情報が治療到達率に与える影響を単変量・多変量で解析する。
Results -> CGP benefit prediction -> Factors leading to treatment -> ROC curve of nomogram から、ノモグラムによる予測の精度をROC曲線で確認する。
Results -> CGP benefit prediction -> Factors leading to treatment -> Factors leading to treatment, decision curve から、ノモグラムによる予測の臨床的有用性をdecision curve analysisで評価した結果を確認する。
Results -> CGP benefit prediction -> Factors leading to treatment -> Input your data から、特定の患者さんの情報を入力すると治療到達率が予想される。

📖 参考: Decision curve analysis についてはこちらやこちらを参照下さい。

▶ CGP検査後の生存期間と治療到達率の関係性を知りたい ▽

Setting から組織型や年齢、治療コースなどの絞り込みを行う。とくに Genes of interest で注目する遺伝子セットを指定する。
Results -> Survival after CGP -> Survival analysis -> Survival period and treatment reach rate から、CGP検査後の生存期間と治療到達率について移動平均を取ったグラフを確認する。

💡 考察: CGP検査後に短期で死亡する患者は治療到達率が低いため、いかに予後が悪そうな患者さんに早めに検査を行うか、そしてPSが保たれ一定程度の生存期間がある患者さんに検査を行うかが重要と思われます。死亡を競合リスクとしたときのCGP後の推奨治療到達率は、Upfront CGPによって推奨治療に到達できる患者の割合とも考えられます。

⏳ 左側切断バイアスのシミュレーション

▶ 左側切断バイアスについて確認したい ▽

C-CATのデータのように、生存期間の測定開始日と検査日（観察開始日）が異なる場合、通常のカプラン・マイアー法では生存期間の推定が困難です。生存期間の測定開始日から検査日まで、全症例が生存している、Immortal biasが存在しているからです。生存期間の測定開始日から検査日までの生存期間と、検査日から最終観察日までの生存期間に分割すると、バイアスの一部が解消されます。ただし、「CGP検査を受けた患者は受けなかった患者と何が違うのか」は究極的にはわからず、ある程度の選択バイアスの解消は不可能と考えます。

Results -> Bias correction simulation -> An example of bias adjustment を開く。
お好みに応じてパラメタを調整する
Left-truncation bias adjustment simulation ボタンを押す
真の生存曲線、通常のカプラン・マイアー生存曲線、CGP検査前後で分割した生存曲線、ベイズ推定でのバイアス解消手法による生存曲線が描画されます。

💡 解釈: 概ね良好なバイアスの補正ができているのではないでしょうか。

💾 その他

▶ 図を保存したい ▽

図を右クリックし、拡張子を .png として名前をつけて保存して下さい。

▶ サマリーの表を保存したい ▽

画面上でテキストを選択し、コピーしてWordかExcelに貼り付けて保存して下さい。

▶ 生データの表を保存したい ▽

左上にあるボタンでエクセルファイルあるいはCSVファイルなどで名前をつけて保存して下さい。

💡 FELIS Analysis Tips

This section summarizes procedures for commonly needed analyses. Click an item to see its detailed steps.

💊 Drug response analysis

I want to compare the efficacy of a certain treatment with the treatment of the previous line

Filter by histology, age, etc. in Setting
Select the drugs in Results -> Drug response -> Settings
Check drug usage in Results -> Drug response -> Summary Tables
In Results -> Drug response -> Time on treatment -> Comparison of the treatment duration of the selected drug and the preceding treatment line, compare the time on treatment of the selected treatment with that of the preceding line in the same patients
In Results -> Drug response -> Time on treatment -> Treatment time comparison, specify options such as CTx lines in which drugs used and compare the time on treatment of the preceding and the selected treatment

💡 Tip: If the ToT is longer than that of the preceding line, the treatment may be highly effective in the selected patient group.

I want to compare the treatment efficacy of two treatments

Filter by histology, age, etc. in Setting
Select the drugs in Results -> Drug response -> Settings
Check drug usage in Results -> Drug response -> Summary Tables
In Results -> Drug response -> Time on treatment -> Treatment duration in the selected regimen, grouped by regimen, compare the time on treatment of the treatments selected in Settings
In Results -> Drug response -> Time on treatment -> Treatment time comparison, specify options such as CTx lines in which drugs used and compare the time on treatment of the two treatments
In Analysis -> Survival after drug initiation date -> Choose drugs for treatment effect analysis, select two or more treatments, and enter the two regimens to compare in Drug set 1 and Drug set 2, respectively
In Analysis -> Drug response analysis -> Survival and drug, press the Analyze button and check the survival curves after the start of palliative chemotherapy for the two groups defined by the specified regimens

💡 Tip: For example, you can compare whether FOLFIRINOX or GEM + nab-PTX gives longer survival as first-line treatment for pancreatic cancer.

I want to see the treatment efficacy for each histology

Filter by histology, age, etc. in Setting
Select the drugs in Results -> Drug response -> Settings
Check drug usage in Results -> Drug response -> Summary Tables
In Results -> Drug response -> Time on treatment -> Treatment duration in the selected regimen, grouped by regimen, compare the time on treatment of the treatments selected in Settings
In Results -> Drug response -> Time on treatment -> Treatment duration of all drugs in the selected line, grouped by diagnosis, evaluate the time on treatment of all treatments, grouped by histology, with the Kaplan-Meier method
In Results -> Drug response -> Time on treatment -> Time on treatment and pre-treatment for the specified treatment, KM-curve, evaluate the time on treatment of the specified treatment, grouped by the presence or absence of gene mutations, with the Kaplan-Meier method

💡 Interpretation: If the time on treatment does not differ between histologies overall, but the time on treatment of the specified drug is longer in a particular histology, the drug may be especially effective in that histology.

I want to see the significance of the presence or absence of a gene mutation on the effect of a certain drug

Filter by histology, age, etc. in Setting
Select the drugs in Results -> Drug response -> Settings
Check drug usage in Results -> Drug response -> Summary Tables
In Results -> Drug response -> Time on treatment -> Treatment duration in the selected regimen, grouped by regimen, compare the time on treatment of the treatments selected in Settings
In Results -> Drug response -> Time on treatment -> Treatment duration of all drugs in the selected line, grouped by diagnosis, evaluate the time on treatment of all treatments or of the specified treatment with the Kaplan-Meier method
In Results -> Drug response -> Time on treatment -> Treatment duration in the selected regimen, grouped by detailed mutation, evaluate the time on treatment of the specified treatment, grouped by the presence or absence of gene mutations, with the Kaplan-Meier method
In Results -> Drug response -> Time on treatment -> Time on treatment by tissue type, KM-curve, split patients into two groups (for example, by the presence or absence of a gene mutation) and evaluate the time on treatment with the Kaplan-Meier method

💡 Interpretation: If the time on treatment across all drugs does not differ by mutation status, but the time on treatment of the specified drug does, the mutation may be a biomarker for that drug.

I want to comprehensively check the relationship between drug effects and gene mutations using a volcano plot

Filter by histology, age, etc. in Setting
Select the drugs in Results -> Drug response -> Settings
Check drug usage in Results -> Drug response -> Summary Tables
In Results -> Drug response -> Time on treatment -> Volcano plot for treatment time, hazard ratio, explore gene mutations associated with time on treatment
In Results -> Drug response -> Response rate -> Volcano plot for objective response rate, explore gene mutations associated with objective response

💡 How to read: Genes shown in light blue have a higher response rate when mutated; genes shown in red have a lower response rate when mutated. The corresponding table is shown further down the page.

📈 Survival analysis

I want to see the relationship between prognosis after CGP testing and gene mutations

Filter by histology, age, etc. in Setting
In Results -> Survival after CGP -> Survival analysis -> Survival after CGP and performance status, check the survival curves grouped by whether any gene in the specified gene set is mutated
In Results -> Survival after CGP -> Survival analysis -> Survival after CGP and mutations, forest plot, compare survival time between patients with and without a mutation for each frequently mutated gene
In Results -> Survival after CGP -> Survival analysis -> Survival after CGP and mutations, KM-curve, compare survival curves between patients with and without a mutation for each frequently mutated gene
In Results -> Survival after CGP -> Survival analysis -> Custom survival analysis, compare survival curves between two groups defined by, for example, mutation status

📝 Note: Timing for RMST measuring in survival analysis (years) in Setting specifies the time horizon at which the difference in restricted mean survival time (RMST) shown in the forest plot is calculated.
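
For readers unfamiliar with RMST, the following Python sketch shows the idea: RMST at horizon tau is the area under the Kaplan-Meier curve from 0 to tau, and the forest plot compares this area between groups. The function names and data are hypothetical, and no confidence interval is computed here.

```python
from itertools import groupby


def kaplan_meier(times, events):
    """Return [(time, survival)] at each death time (events: 1 = death, 0 = censored)."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = 1.0
    curve = []
    for t, group in groupby(data, key=lambda row: row[0]):
        flags = [d for _, d in group]
        deaths = sum(flags)
        if deaths:
            surv *= 1 - deaths / at_risk
            curve.append((t, surv))
        at_risk -= len(flags)
    return curve


def rmst(curve, tau):
    """Area under the step-function survival curve between 0 and tau."""
    area, prev_t, prev_s = 0.0, 0.0, 1.0
    for t, s in curve:
        if t >= tau:
            break
        area += prev_s * (t - prev_t)
        prev_t, prev_s = t, s
    return area + prev_s * (tau - prev_t)


# Hypothetical survival times in years for patients with and without a mutation.
mutated = kaplan_meier([1, 2, 3, 4], [1, 1, 0, 1])
wild_type = kaplan_meier([2, 3, 4, 5], [1, 1, 1, 1])
tau = 3  # corresponds to the RMST timing chosen in Setting
print(rmst(wild_type, tau) - rmst(mutated, tau))
```

Choosing tau beyond the range where enough patients remain under follow-up makes the estimate unstable, which is why the horizon is exposed as a setting.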

I want to see the relationship between prognosis after the introduction of palliative chemotherapy and gene mutations

Filter by histology, age, treatment course, etc. in Setting. In particular, specify the gene set of interest in Genes of interest.
In Results -> Overall survival with risk-set adjustment -> Survival and clinical information -> Entire cohort, confirm that cKendall tau is close to 0 and that the estimated survival time is plausible
In Results -> Overall survival with risk-set adjustment -> Custom survival analysis, compare survival time between groups defined by clinical or mutation information
In Results -> Survival after CTx with Bayesian inference -> Genetic variants and survival, forest plot, compare survival time between patients with and without a mutation for each frequently mutated gene
In Results -> Survival after CTx with Bayesian inference -> Genetic variants and survival, KM-curve, compare survival curves between patients with and without a mutation for each frequently mutated gene
In Results -> Survival after CTx with control cohort data (experimental) -> Custom survival analysis, compare survival time between groups defined by clinical or mutation information

📝 Note: These analyses correct for left-truncation bias. Of the three correction methods, the one that uses external information from hospital-based cancer registries appears to be the most valid.
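
The check on cKendall tau can be understood with the sketch below. It assumes that "cKendall tau" refers to the conditional Kendall's tau between the truncation time (chemotherapy start to CGP test) and the survival time, which is close to 0 when the two are quasi-independent, as risk-set adjustment requires. The Python function is a hypothetical, simplified illustration, not the routine FELIS calls.

```python
def _sign(x):
    return (x > 0) - (x < 0)


def conditional_kendall_tau(trunc, time, event):
    """Conditional Kendall's tau for left-truncated, right-censored data.

    trunc : left-truncation (entry) time of each patient
    time  : observed time of death or censoring
    event : 1 if death was observed, else 0
    A pair is comparable when both patients had entered before the earlier
    observed time and that earlier time is a death. Ties in time are skipped.
    """
    total, comparable = 0, 0
    n = len(time)
    for i in range(n):
        for j in range(i + 1, n):
            if time[i] == time[j]:
                continue
            if max(trunc[i], trunc[j]) > min(time[i], time[j]):
                continue
            first = i if time[i] < time[j] else j
            if not event[first]:
                continue
            total += _sign(trunc[i] - trunc[j]) * _sign(time[i] - time[j])
            comparable += 1
    return total / comparable if comparable else float("nan")


# Hypothetical data in which later entry goes with longer survival.
print(conditional_kendall_tau([1, 2, 3], [4, 5, 6], [1, 1, 1]))
```

A value far from 0 suggests that entry time and survival are dependent, in which case the adjusted estimates should be interpreted with caution.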

I want to see the difference in survival time depending on whether a certain drug was used at any point while the patient was alive

Filter by histology, age, etc. in Setting
Select the drugs in Results -> Drug response -> Settings
Check drug usage in Results -> Drug response -> Summary Tables
In Results -> Drug response -> Survival after CGP -> Survival and drug, check the survival curves after the start of palliative chemotherapy for the two groups defined by whether the specified regimen was used

📝 Note: Survival curves are drawn both with and without correction for left-truncation bias.

I want to see the difference in survival time depending on whether a certain drug was used after the CGP test

Filter by histology, age, etc. in Setting
In Results -> Survival after CGP -> Survival analysis -> Custom survival analysis, compare survival curves between two groups defined by, for example, drug usage history

📝 Note: Standard (unadjusted) Kaplan-Meier survival curves are drawn.

I want to extract factors related to the mortality hazard after the CGP test

Filter by histology, age, etc. in Setting.
In Results -> Survival after CGP -> Survival analysis -> Hazard ratio, examine which clinical variables and gene mutations are associated with the hazard of death in univariate and multivariate analyses.

📝 Note: Variables are selected automatically by backward elimination based on the Akaike Information Criterion (AIC). To limit multicollinearity, factors with VIF > 10 are excluded.

👥 Patient Background

I want a Table of the patient background

Filter by histology, age, etc. in Setting
Check the results in Results -> Case summary
If necessary, copy the entire table and paste it into Word

📝 Note: The table is grouped by the presence or absence of the gene mutations selected in Filters on mutation types in Setting.

I want a Figure of the patient background

Filter by histology, age, etc. in Setting
Check the results in Results -> Clustering analysis -> Basic data
If necessary, copy the output and paste it into Word

📖 Meaning of items:

Driver: Whether any oncogenic mutation (C-CAT evidence level “F”) was detected in the case.

Pts with recommended CTx: Percentage of cases for which the expert panel recommended a treatment.

Pts received recommended CTx: Percentage of cases that actually received the recommended treatment.

Median time from CTx to CGP: Median time from the start of palliative chemotherapy to the CGP test date.

Median time from CGP to death: Kaplan-Meier median time from the CGP test date to death.

I want an oncoprint (a list of gene mutations)

Filter by histology, age, etc. in Setting
Check the results in Results -> Oncoprint -> Figures -> Oncoprint
Download the patients' clinical and mutation information from Results -> Oncoprint -> Downloadable table
In Results -> Variation by histology, check which gene mutations are frequent in each histology

📝 Note: The data underlying the figure can be downloaded as an Excel file from Downloadable table.

I want a lolliplot (a diagram showing which amino acid residues of a gene have many mutations)

Filter the cases by histology, age, and other criteria in Setting.
Select a gene and check the results in Results -> Oncoprint -> Figures -> Lolliplot for the selected gene.

📝 Note: Exon-skipping and intronic mutations are not currently supported.

I want information on mutual exclusivity and co-occurring mutations between genes

Filter the cases by histology, age, and other criteria in Setting.
Check the results in Results -> Mutually exclusivity.

💡 How to read: Each cell lies at the intersection of the gene on the X-axis and the gene on the Y-axis. A blue cell means the two genes tend to be mutually exclusive; a red cell means their mutations tend to co-occur.
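One common way to quantify such a pairwise relationship is a 2x2 table of mutation status with an odds ratio and Fisher's exact test. The sketch below assumes that approach purely for illustration; the text above does not state which statistic the app uses, and the counts are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the table [[a, b], [c, d]].

    a: both genes mutated, b: only gene A mutated,
    c: only gene B mutated, d: neither mutated.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x co-mutated cases with fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    # Sum the probabilities of all tables at least as extreme as the observed one.
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))


def odds_ratio(a, b, c, d):
    """Odds ratio with a 0.5 continuity correction so empty cells do not divide by zero."""
    return ((a + 0.5) * (d + 0.5)) / ((b + 0.5) * (c + 0.5))


# Hypothetical counts for genes A and B in 16 cases.
a, b, c, d = 8, 2, 1, 5
p_value = fisher_exact_two_sided(a, b, c, d)
ratio = odds_ratio(a, b, c, d)
print(p_value, ratio)
```

Under this reading, an odds ratio above 1 points toward co-occurrence (red in the figure) and one below 1 toward mutual exclusivity (blue).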

🎯 Treatment reachability analysis

I want to know what kind of patients have a high treatment reach rate when a CGP test is performed

Filter the cases by histology, age, treatment course, and other criteria in Setting. In particular, specify the gene set of interest in Genes of interest.
Obtain a nomogram that predicts the treatment reach rate from clinical information available before the test in Results -> CGP benefit prediction -> Factors leading to treatment -> Factors leading to treatment, pre-CGP, Nomogram.
Examine, in univariate and multivariate analyses, how pre-test clinical information affects the treatment reach rate in Results -> CGP benefit prediction -> Factors leading to treatment -> Factors leading to treatment, pre-CGP, Odds ratio.
Check the accuracy of the nomogram's predictions with the ROC curve in Results -> CGP benefit prediction -> Factors leading to treatment -> ROC curve of nomogram.
Check the clinical utility of the nomogram's predictions with decision curve analysis in Results -> CGP benefit prediction -> Factors leading to treatment -> Factors leading to treatment, decision curve.
Enter a specific patient's information in Results -> CGP benefit prediction -> Factors leading to treatment -> Input your data to predict that patient's treatment reach rate.

📖 Reference: For decision curve analysis, please refer here or here.
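For orientation, the quantity plotted in a decision curve is the net benefit at each threshold probability, using the standard formula TP/n - (FP/n) x pt/(1 - pt). The sketch below applies it to hypothetical outcomes and predicted probabilities; it is not the app's implementation.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of acting on predictions at or above a threshold probability."""
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1 - threshold)


def net_benefit_treat_all(y_true, threshold):
    """Net benefit of the reference strategy that treats every patient as positive."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Hypothetical data: 1 = reached the recommended treatment, with predicted probabilities.
y_true = [1, 1, 0, 0, 1, 0, 0, 0, 1, 0]
y_prob = [0.9, 0.8, 0.7, 0.2, 0.6, 0.1, 0.3, 0.4, 0.35, 0.05]

for pt in (0.2, 0.35, 0.5):
    print(pt, net_benefit(y_true, y_prob, pt), net_benefit_treat_all(y_true, pt))
```

A model is considered useful at a given threshold when its net benefit exceeds both the treat-all value and zero (the treat-none strategy).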

I want to know the relationship between survival time after CGP testing and treatment reach rate

Filter the cases by histology, age, treatment course, and other criteria in Setting. In particular, specify the gene set of interest in Genes of interest.
Check the moving-average plot of survival time after CGP testing and treatment reach rate in Results -> Survival after CGP -> Survival analysis -> Survival period and treatment reach rate.

💡 Consideration: Because the treatment reach rate is low among patients who die shortly after the CGP test, it appears important to test patients with a likely poor prognosis early, while performance status (PS) is maintained and a reasonable survival time can still be expected. The rate of reaching the recommended treatment after CGP, with death treated as a competing risk, can also be regarded as the proportion of patients who could reach the recommended treatment with upfront CGP.

⏳ Left-truncation bias simulation

I want to examine left-truncation bias

When the date from which survival time is measured differs from the test date (the start of observation), as in the C-CAT data, survival time is difficult to estimate with the conventional Kaplan-Meier method. Every registered case is necessarily alive from the start of measurement until the test date, so immortal time bias is present. Splitting survival time into the interval from the start of measurement to the test date and the interval from the test date to the last observation date removes part of the bias. However, how patients who received the CGP test differ from those who did not remains unknown, so we consider that some degree of selection bias cannot be eliminated.

Open Results -> Bias correction simulation -> An example of bias adjustment.
Adjust the parameters as desired.
Press the Left-truncation bias adjustment simulation button.
Four curves are drawn: the true survival curve, the conventional Kaplan-Meier survival curve, the survival curve split before and after the CGP test, and the bias-corrected survival curve obtained by Bayesian estimation.

💡 Interpretation: In general, the bias-corrected curve should lie close to the true survival curve, showing that the correction works reasonably well.
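To make the immortal-time problem concrete, the sketch below compares a naive Kaplan-Meier estimate with a delayed-entry (risk-set adjusted) estimate, in which a patient only counts as at risk after the CGP test date. This is a simplified illustration on hypothetical data; it assumes that the test date is independent of survival, and it is not the Bayesian method used in the simulation tab.

```python
def km_delayed_entry(entry, exit_time, event):
    """Kaplan-Meier estimate where each patient joins the risk set only after entry.

    entry     : days from the start of survival measurement to the CGP test
    exit_time : days from the start of survival measurement to death or censoring
    event     : 1 for death, 0 for censoring
    """
    death_times = sorted({t for t, e in zip(exit_time, event) if e == 1})
    surv = 1.0
    curve = []
    for t in death_times:
        # At risk at time t: already tested (entry < t) and still under follow-up.
        at_risk = sum(1 for a, b in zip(entry, exit_time) if a < t <= b)
        deaths = sum(
            1 for a, b, e in zip(entry, exit_time, event) if a < t and b == t and e == 1
        )
        if at_risk > 0:
            surv *= 1.0 - deaths / at_risk
        curve.append((t, surv))
    return curve


# Hypothetical patients: the CGP test happens some time after treatment starts.
entry = [50, 80, 150, 250]
exit_time = [100, 300, 200, 400]
event = [1, 1, 1, 0]

# Naive analysis pretends everyone was observed from day 0.
naive = km_delayed_entry([0] * len(entry), exit_time, event)
adjusted = km_delayed_entry(entry, exit_time, event)
print(naive)
print(adjusted)
```

The naive curve is higher at every death time because patients are credited with follow-up during which they could not have been observed to die.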

💾 Others

I want to save the figure

Right-click the figure and save it with a file name ending in .png.

I want to save the summary table

Select the text on the screen, copy it, paste it into Word or Excel, and save it.

I want to save the raw data table

Use the button at the top left to save it as an Excel or CSV file.

The following settings are for advanced analysis only

If you select No, CSV files may not be necessary.

If you select Yes, repeated runs of the same analysis are faster.

Figure. Recurrent oncogenic mutations in the selected cases. The 30 genes with the highest frequency of oncogenic mutations are shown. Mutational landscapes were created using the ComplexHeatmap package for R.

Figure. Frequency of oncogenic mutations in the selected gene. The most frequent oncogenic mutations are shown with their amino acid changes.

Mutplot by Zhang W, PMID:31091262. If an error occurs, correct 'source/UniPlot.txt'.

Protein structure source: UniProt

Figure. Recurrent oncogenic mutations across subtypes. The 30 genes with the highest frequency of oncogenic mutations are shown.

Summary, cluster and mutated gene

Summary, cluster and histology

Raw data

In this section, survival differences are evaluated with restricted mean survival time (RMST). A hazard ratio analysis is also provided in the 'Overall survival with risk-set adjustment' section (Survival analysis start date = CGP test date).

Figure. Survival analysis after the CGP test. Conventional Kaplan–Meier estimation and the log–rank test were performed with the survival package for R. EP, expert panel; RMST, restricted mean survival time.

Raw data

Figure. Survival analysis after the CGP test. Conventional Kaplan–Meier estimation and the log–rank test were performed with the survival package for R. EP, expert panel; RMST, restricted mean survival time.

Group 1

Group 2

Propensity score-based adjustment

Propensity score matching

Inverse probability weighting

Threshold for IPW

The 95% CI was derived from the empirical 2.5th and 97.5th percentiles of the bootstrap distribution. This approach captures sampling variability of the weighted survival process while preserving the time scale and interpretation of RMST in days.
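The percentile step described above can be sketched as follows. For brevity the statistic here is a plain mean of hypothetical survival times in days; in the analysis described, the bootstrapped statistic would be the weighted RMST difference, which is not reproduced here.

```python
import random


def percentile(sorted_values, q):
    """Empirical quantile with linear interpolation between order statistics."""
    pos = q * (len(sorted_values) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] * (1 - frac) + sorted_values[hi] * frac


def bootstrap_ci(values, statistic, n_boot=2000, seed=0):
    """95% percentile bootstrap CI: resample with replacement, then take 2.5% and 97.5%."""
    rng = random.Random(seed)
    n = len(values)
    stats = sorted(
        statistic([values[rng.randrange(n)] for _ in range(n)]) for _ in range(n_boot)
    )
    return percentile(stats, 0.025), percentile(stats, 0.975)


def mean(xs):
    return sum(xs) / len(xs)


# Hypothetical survival times in days (stand-in for the real weighted statistic).
days = [120, 340, 560, 200, 410, 95, 730, 610, 280, 450]
lower, upper = bootstrap_ci(days, mean)
print(mean(days), lower, upper)
```

Because the interval is read directly from the resampled values, it stays on the same scale as the statistic itself, here days.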

All analyses were conducted using the survival, MatchIt, and cobalt packages.

In this section, survival differences are evaluated with restricted mean survival time (RMST). A hazard ratio analysis is also provided in the 'Overall survival with risk-set adjustment' section (Survival analysis start date = CGP test date).

Figure. Survival periods after CGP by gene mutation, estimated with the conventional Kaplan-Meier estimator. Restricted mean survival time over two years (in days) was estimated with the survRM2 package for R.
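RMST is the area under the survival curve up to a fixed horizon. The sketch below integrates a step-shaped Kaplan-Meier curve up to two years (730 days). The curve values are hypothetical, and this is an illustration of the definition rather than of the survRM2 package itself.

```python
def rmst(curve, tau):
    """Area under a step survival curve from 0 to tau.

    curve : [(time, survival)] sorted by time, survival being the value from that time on
    tau   : time horizon in the same unit as the curve (days here)
    """
    area = 0.0
    prev_t, prev_s = 0.0, 1.0
    for t, s in curve:
        if t >= tau:
            break
        # Survival stays at prev_s until the next drop at time t.
        area += prev_s * (t - prev_t)
        prev_t, prev_s = t, s
    # Last rectangle from the final drop before tau up to tau.
    area += prev_s * (tau - prev_t)
    return area


# Hypothetical Kaplan-Meier curves for two groups (days, survival probability).
mutated = [(100, 0.8), (300, 0.5), (600, 0.25)]
wild_type = [(200, 0.9), (500, 0.7)]

tau = 730
difference = rmst(wild_type, tau) - rmst(mutated, tau)
print(rmst(mutated, tau), rmst(wild_type, tau), difference)
```

The difference between two groups' RMST values reads as the average number of days of life gained or lost within the horizon.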

If there are too many histology subtypes, multivariable analysis may fail. Go to Settings and set: “Analyze without detailed histology” → “Yes, use OncoTree 1st level”.

Download raw data

Beware of left-truncation bias.

Raw data

Beware of left-truncation bias.

Group 1

Group 2

Group 1

Group 2

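Three major distortions addressed by the correction

Improved correction approach: age-group-specific OS calibration + model estimation

What is ESS (effective sample size)?

How to read the forest plot: what is the time ratio (TR)?

Important notes (interpretation)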

Simulation Study: External data based calibration

Population & Target Gene Settings

Left-Truncation (T1) Pattern

Censoring Pattern (C2)

Simulation Results

Single Run Estimates (Point Estimate & 95% CI)

400 Iterations Summary (Mean, MSE, and Coverage Rate [CR])

Visualizations for Manuscript (Fig 1 - 3)

Univariate Dependent Truncation: Copula vs Lynden-Bell

Simulation Parameters

Estimated Median Survival Times

Reconstructed Marginal Survival Curves

Figure. Overall survival after the first survival-prolonging chemotherapy, adjusted for left-truncation bias. To evaluate the association between oncogenic mutations and survival, a risk-set adjustment model was fitted with the survival package to adjust for left-truncation bias.

Beware of left-truncation bias.

Beware of left-truncation bias.

Figure. Overall survival after the first survival-prolonging chemotherapy, adjusted for left-truncation bias. To evaluate the association between oncogenic mutations and survival, a risk-set adjustment model was fitted with the survival package to adjust for left-truncation bias.

This takes several minutes.

Beware of left-truncation bias.

Figure. Hazard ratios estimated by a Cox model with the survival package.

This takes several minutes.

This setting also applies to Bayesian estimation in other tabs.

Group 1

Group 2

This takes several minutes.

This takes several minutes.

Figure. Overall survival after the first survival-prolonging chemotherapy.

Overall drug usage

Patients without a treatment time are excluded from the treatment time dataset

Patients with RECIST-NE are excluded from the objective response dataset

Patients without a treatment time or with RECIST-NE are excluded from the adverse effect dataset

Figure. Treatment time.

Group 1

Group 2

Volcano plots for frequent regimens