FELIS version 4.2.10, data version 20251024

How to use

Select the cases and genes to analyze.

Change the settings from their defaults if necessary.
After filtering, select the Results tab to view the analysis results.

Required: C-CAT data files (zipped files are acceptable)

Choose case CSV Files

Choose report CSV Files

Download sample clinical data

Download sample report data

Filter on histology

File import

Import the lists of cases and gene mutations to analyze.

If "Analyze with new dataset" is set to No, all cases are loaded, or the previously analyzed cases if there are any.
If it is set to Yes, the uploaded CSV files are imported.

Filters for clinical information

Filters on genes

For detailed study of mutations of a gene

Other Settings

Figures in results are downloadable as png files.

FELIS; Functions Especially for LIquid and Solid tumor clinical sequencing.

https://github.com/MANO-B/FELIS

The following settings are for advanced analysis only

These settings work, but there is usually no need to change them.

Option files

Correspondence table between ID and histology (CSV)

Specify the ID of the patient whose diagnosis you want to correct and the modified histology.

Download CSV file template

Correspondence table between ID and drug information (CSV)

To analyze only the drug after curation

Download CSV template (before CGP)

Download CSV template (after CGP)

Correspondence table for drug renaming (CSV)

To analyze similar drugs together by renaming them.
For example, drugs can be reclassified as molecular targeted therapies, immune checkpoint inhibitors, etc.
List the drugs in alphabetical order, separated by commas.

Download CSV template

Correspondence table for drug combination renaming (CSV)

To rename drug combinations as groups.
For example, combinations can be reclassified as molecular targeted therapies, immune checkpoint inhibitors, etc.
List the drugs in alphabetical order, separated by commas.

Download CSV template

Option files

Correspondence table for mutation renaming (CSV)

To analyze similar mutations together by renaming them.
For example, mutations can be reclassified as Exon 19 mutation, Exon 20 mutation, gene amplification, etc.
Mutations in the designated gene that are not listed are labeled 'Other'.

Download CSV template

Correspondence table for mutation reannotation (CSV)

To reannotate variants
F: pathogenic variants, G: neutral variants

Download CSV template

Correspondence table of histological type renaming (CSV)

To analyze similar tissue types together by renaming them
Reclassified as differentiated gastric cancer, undifferentiated gastric cancer, etc.

Download CSV template

Correspondence table for drug renaming based on regimen (CSV)

Provide the drug names when the regimen is known but the individual drug names are not.
Example: MAP therapy is converted to 'Cisplatin,Doxorubicin,Methotrexate'.

Download CSV template

Analysis setting

If you select No, CSV files may not be necessary.

If you select Yes, repeated runs of the same analysis are faster.

Click the button after changing the settings.

Table. Characteristics for selected patients. The present, retrospective cohort study was performed with clinicogenomic, real-world data on the patients who were registered in the C-CAT database from June 1, 2019. The patients were registered by hospitals throughout Japan and provided written informed consent to the secondary use of their clinicogenomic data for research.

Figure. Recurrent oncogenic mutations in selected cases. The 30 genes with the highest frequency of oncogenic mutations are shown. Mutational landscapes were created using the ComplexHeatmap package for R.

Click the button after changing the settings.

Figure. Frequency of oncogenic mutations in the selected gene. The most frequent oncogenic mutations are shown with amino acid change.

Mutplot by Zhang W, PMID:31091262. If an error occurs, correct 'source/UniPlot.txt'.

Protein structure source: Uniprot

Github for Mutplot. Link for the website

Figure for probability

Figure for odds ratio

Figure. Alterations among mutually exclusive or co-occurring pairs. The 30 genes with the highest frequency of oncogenic mutations were selected to determine whether oncogenic mutations tend to occur together in each pair of genes. Blue boxes indicate mutual exclusivity and red boxes indicate co-occurrence. An asterisk marks a significant correlation (p < 0.001). The analysis was performed with the Rediscover package for R. Odds ratios were estimated with the Fisher exact test. An odds ratio less than 1 does not necessarily correspond to mutual exclusivity as evaluated by the negative binomial distribution: a low odds ratio (OR < 1) indicates a negative association between two events but does not by itself imply mutual exclusivity. Overdispersion and the underlying data distribution can influence how the OR should be read, so mutual exclusivity should not be inferred from the OR alone. Interpret it together with the model-based test, the specific context, and the model assumptions.
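The odds ratio and Fisher exact test mentioned in this caption can be illustrated with a minimal sketch. It is written in Python with only the standard library for illustration and is not the FELIS code, which uses R. The function names and the example counts are hypothetical. The sample odds ratio computed here can differ slightly from the conditional maximum-likelihood estimate that R's `fisher.test` reports.

```python
from math import comb

def odds_ratio(a, b, c, d):
    """Sample odds ratio for the 2x2 table [[a, b], [c, d]].

    a: both genes mutated, b: only gene A mutated,
    c: only gene B mutated, d: neither mutated.
    """
    if b * c == 0:
        return float("inf")
    return (a * d) / (b * c)

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value.

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    total = sum(prob(x) for x in range(lo, hi + 1)
                if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, total)

# Hypothetical counts for one gene pair in 200 patients.
or_value = odds_ratio(2, 48, 38, 112)
p_value = fisher_exact_two_sided(2, 48, 38, 112)
```

An odds ratio below 1 with a small p-value suggests a negative association for that pair. As the caption notes, this alone does not establish mutual exclusivity.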

Figure. Recurrent oncogenic mutations across subtypes. The 30 genes with the highest frequency of oncogenic mutations are shown.

Figure. Distribution of age, sex, detected oncogenic mutations, tumor mutation burden (TMB), metastasis pattern, patients with a treatment option recommended by the expert panel, patients who received the recommended chemotherapy, median time from the initiation date of the first palliative chemotherapy to CGP, and median time from CGP to final observation. In the boxplots of age and TMB, the box borders indicate the 25th and 75th percentiles, the inner line the median, and the whiskers 1.5× the interquartile range.

Summary, cluster and mutated gene

Summary, cluster and histology

Raw data

Figure. Unsupervised clustering of the patients based on the detected oncogenic mutations. Two-dimensional mutational pattern mapping was generated using Uniform Manifold Approximation and Projection (UMAP). For each cluster, the three variants and histotypes with the highest odds ratios that were more common than in the other clusters (p < 0.05) are shown. Clustering analysis was performed as follows. All pathogenic mutations detected in the cancer-related genes were assembled into a binary matrix per patient. The dimension of this input matrix was reduced with UMAP via the umap package for R (with default hyperparameters). Clusters were then identified using Hierarchical Density-Based Spatial Clustering of Applications with Noise (HDBSCAN) via the dbscan package for R (EPS: 1.0; minimum points: 3).
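The binary matrix that serves as input to UMAP can be sketched as follows. This is an illustrative Python example, not the FELIS code. The function name, the patient IDs, and the input format (a mapping from patient ID to mutated genes) are assumptions. The UMAP and HDBSCAN steps themselves are not reproduced here.

```python
def build_binary_matrix(mutations):
    """Build a patient-by-gene 0/1 matrix.

    mutations: dict mapping patient ID -> iterable of gene names with an
    oncogenic mutation. Returns (patients, genes, matrix), where
    matrix[i][j] is 1 if patient i carries a mutation in gene j, else 0.
    """
    patients = sorted(mutations)
    genes = sorted({g for gs in mutations.values() for g in gs})
    col = {g: j for j, g in enumerate(genes)}
    matrix = [[0] * len(genes) for _ in patients]
    for i, p in enumerate(patients):
        for g in mutations[p]:
            matrix[i][col[g]] = 1
    return patients, genes, matrix

# Hypothetical input: three patients, one without any detected mutation.
patients, genes, matrix = build_binary_matrix({
    "P001": ["TP53", "KRAS"],
    "P002": ["TP53"],
    "P003": [],
})
```

Each row of `matrix` is one patient's mutation profile. Patients without any detected mutation appear as all-zero rows.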

Mochizuki T, et al., Factors predictive of second-line chemotherapy in soft tissue sarcoma: An analysis of the National Genomic Profiling Database. Cancer Science, 2023. Link for the paper

Figure. Unsupervised clustering based on oncogenic mutations. For each histology, the percentage of cases belonging to one of the clusters is shown as a bar graph. Characteristic oncogenic mutations were found in each cluster. There was a tendency for each histologic type to cluster in specific clusters. Heterogeneity of genetic variation within histologic types was assessed by Shannon's entropy. Low Shannon entropy values indicate that the genetic mutation pattern of the tumor is uniform, while high values indicate diversity.
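Shannon's entropy for one histologic type can be computed from the number of its cases in each cluster. The following is a minimal Python sketch. The function name is hypothetical, and the base-2 logarithm is an assumption, since the caption does not state which base FELIS uses.

```python
from math import log2

def shannon_entropy(counts):
    """Shannon entropy (in bits) of a distribution given as counts.

    counts: number of cases of one histology in each cluster.
    Clusters with zero cases contribute nothing to the sum.
    """
    total = sum(counts)
    return -sum((c / total) * log2(c / total) for c in counts if c > 0)

# All cases in one cluster: uniform mutation pattern, entropy 0.
uniform = shannon_entropy([10, 0, 0])
# Cases spread evenly over four clusters: diverse pattern, entropy 2 bits.
diverse = shannon_entropy([5, 5, 5, 5])
```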

Figure. Level of evidence for targeted therapy for detected gene mutations. The highest level of evidence was extracted for each patient. Evidence levels of C-CAT are defined as A for biomarkers that predict a response to Japanese Pharmaceuticals and Medical Devices Agency (PMDA)– or FDA-approved therapies or are described in professional guidelines, B for biomarkers that predict a response based on well-powered studies with consensus of experts in the field, C for biomarkers that predict a response to therapies approved by the PMDA or FDA in another type of tumor or that predict a response based on clinical studies, D for biomarkers that predict a response based on case reports, E for biomarkers that show plausible therapeutic significance based on preclinical studies.

Survival differences are evaluated with restricted mean survival time in this section. Analysis with hazard ratios is also provided in the 'Overall survival with risk-set adjustment' section (Survival analysis start date = CGP test date).

Figure. Survival analysis after the CGP test. The conventional Kaplan–Meier estimator and the log-rank test were performed with the survival package for R. EP: expert panel. RMST: restricted mean survival time.

Raw data

Group 1

Group 2

Propensity score-based adjustment

Propensity score matching

Inverse probability weighting

Threshold for IPW

Download love plot of PS-matching
Download love plot of IPCW
Download IPW weight distribution
Download IPCW weight distribution
Download PS distribution

To reduce confounding between the two groups, we performed propensity score matching. The propensity score was estimated using a logistic regression model including prespecified clinically relevant covariates (CGP platform, sex, age, PS, histology, treatment lines before CGP, and the best treatment effect before CGP). Patients were matched 1:1 using nearest‐neighbor matching without replacement on the logit of the propensity score (MatchIt package, method = “nearest”, distance = “logit”). A caliper width of 0.2 on the logit scale was applied to restrict matches to comparable individuals. Matched sets were identified using the MatchIt subclass variable, and each subclass was treated as a matched pair in subsequent analyses.
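The matching step described above can be illustrated with a greedy sketch in Python. It is not the MatchIt implementation. The function names are hypothetical, and the order in which treated patients are processed and the way the caliper is scaled are assumptions made for this example.

```python
from math import log

def logit(p):
    """Log-odds of a probability in (0, 1)."""
    return log(p / (1 - p))

def match_nearest(ps_treated, ps_control, caliper=0.2):
    """Greedy 1:1 nearest-neighbor matching without replacement.

    ps_treated, ps_control: dict of patient ID -> propensity score.
    A pair is kept only if the logit distance is within the caliper.
    Returns a list of (treated_id, control_id) pairs.
    """
    available = {cid: logit(p) for cid, p in ps_control.items()}
    pairs = []
    for tid, p in ps_treated.items():
        if not available:
            break
        lt = logit(p)
        # Closest remaining control on the logit scale.
        cid = min(available, key=lambda c: abs(available[c] - lt))
        if abs(available[cid] - lt) <= caliper:
            pairs.append((tid, cid))
            del available[cid]  # without replacement
    return pairs

# Hypothetical propensity scores.
pairs = match_nearest({"t1": 0.30, "t2": 0.80},
                      {"c1": 0.31, "c2": 0.33, "c3": 0.50})
```

In this example `t2` stays unmatched because no control lies within the caliper, which is how the caliper restricts matches to comparable individuals.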

Covariate balance before and after matching was evaluated using standardized mean differences (SMDs) with the cobalt package. Adequate balance was defined a priori as an absolute SMD < 0.1 for all covariates. Balance diagnostics were visualized using Love plots. The maximum absolute SMD after matching was additionally reported to provide a single summary measure of balance.
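For a continuous covariate, the standardized mean difference can be sketched as below. This is an illustrative Python example with a hypothetical function name. It assumes the pooled standard deviation of the two groups as the denominator, whereas the denominator used by the cobalt package depends on its settings.

```python
from math import sqrt
from statistics import mean, variance

def standardized_mean_difference(x_group1, x_group2):
    """SMD of one continuous covariate between two groups.

    Uses sqrt((var1 + var2) / 2) as the standardizing denominator.
    """
    pooled_sd = sqrt((variance(x_group1) + variance(x_group2)) / 2)
    if pooled_sd == 0:
        return 0.0
    return (mean(x_group1) - mean(x_group2)) / pooled_sd

def is_balanced(smd, threshold=0.1):
    """Balance criterion stated in the text: absolute SMD below 0.1."""
    return abs(smd) < threshold

# Hypothetical ages in two matched groups.
smd_age = standardized_mean_difference([61, 64, 70, 58], [60, 65, 69, 59])
```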

Because propensity score matching induces dependence within matched pairs, confidence intervals for the RMST difference were obtained using nonparametric bootstrap resampling at the matched‐pair level. Specifically, matched pairs were resampled with replacement 2000 times, the RMST difference was recalculated for each bootstrap replicate, and the 2.5th and 97.5th percentiles of the bootstrap distribution were used to derive a two‐sided 95% confidence interval.
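Pair-level resampling with a percentile interval can be sketched as follows in Python. The function names are hypothetical. To keep the example short, the statistic is a simple mean difference within pairs, used as a stand-in for the RMST difference, and the percentile indexing is a simple nearest-rank choice.

```python
import random

def bootstrap_ci_pairs(pairs, statistic, n_boot=2000, seed=0):
    """Percentile 95% CI from resampling matched pairs with replacement.

    pairs: list of matched pairs (any objects).
    statistic: function taking a list of pairs and returning a number.
    """
    rng = random.Random(seed)
    stats = []
    for _ in range(n_boot):
        # Resample whole pairs so within-pair dependence is preserved.
        sample = [pairs[rng.randrange(len(pairs))] for _ in pairs]
        stats.append(statistic(sample))
    stats.sort()
    lower = stats[int(0.025 * n_boot)]
    upper = stats[int(0.975 * n_boot) - 1]
    return lower, upper

def mean_pair_difference(sample):
    """Stand-in statistic: mean of (group 1 value - group 2 value)."""
    return sum(a - b for a, b in sample) / len(sample)

# Hypothetical survival times (days) for five matched pairs.
example_pairs = [(400, 350), (210, 260), (500, 420), (330, 300), (150, 170)]
ci_low, ci_high = bootstrap_ci_pairs(example_pairs, mean_pair_difference)
```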

To account for baseline imbalances between treatment groups, we applied inverse probability of treatment weighting (IPTW) based on the propensity score (PS). The PS was estimated using a logistic regression model including prespecified baseline covariates. Stabilized weights were constructed to estimate the average treatment effect (ATE).
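Stabilized weights for the ATE can be written in a few lines. The sketch below is in Python with a hypothetical function name, and it takes the propensity scores as given rather than fitting the logistic regression.

```python
def stabilized_weights(treated, ps):
    """Stabilized IPTW weights for the average treatment effect.

    treated: list of 0/1 group indicators.
    ps: list of estimated probabilities of being in group 1.
    Weight = P(T=1) / ps for group 1, P(T=0) / (1 - ps) for group 0.
    """
    p_treat = sum(treated) / len(treated)
    return [
        p_treat / e if t == 1 else (1 - p_treat) / (1 - e)
        for t, e in zip(treated, ps)
    ]

# Hypothetical example with four patients.
weights = stabilized_weights([1, 1, 0, 0], [0.5, 0.25, 0.5, 0.25])
```

Multiplying by the marginal group probability keeps the weights centered near 1, which usually reduces the influence of extreme propensity scores compared with unstabilized weights.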

Weighted Kaplan–Meier (KM) survival curves were estimated using case weights corresponding to the IPTW. This approach yields survival functions representing a pseudo-population in which the distribution of measured baseline covariates is balanced between treatment groups. All survival times were analyzed on the original time scale (days).

Under IPTW, group-specific survival functions were estimated using weighted KM estimators. Because the KM estimator is a right-continuous step function, RMST was computed by exact integration of the step function, without numerical approximation. Specifically, RMST was calculated as the sum over successive time intervals of the interval length multiplied by the survival probability at the beginning of the interval. This yields an exact estimate of the area under the weighted KM curve up to the defined time.
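The exact integration of the step function can be sketched as follows. This is an illustrative Python example, not the FELIS implementation. The function names and the toy data are hypothetical.

```python
def weighted_km(times, events, weights=None):
    """Weighted Kaplan-Meier curve as a list of (time, survival) steps.

    times: follow-up times; events: 1 = event, 0 = censored;
    weights: case weights (for example IPTW), default 1 for everyone.
    """
    if weights is None:
        weights = [1.0] * len(times)
    data = sorted(zip(times, events, weights))
    at_risk = float(sum(weights))
    surv, curve, i = 1.0, [], 0
    while i < len(data):
        t = data[i][0]
        d = removed = 0.0
        while i < len(data) and data[i][0] == t:
            d += data[i][1] * data[i][2]   # weighted events at t
            removed += data[i][2]          # weight leaving the risk set
            i += 1
        if d > 0:
            surv *= 1 - d / at_risk
            curve.append((t, surv))
        at_risk -= removed
    return curve

def rmst(curve, tau):
    """Area under the step curve from 0 to tau (exact, no approximation)."""
    area, prev_t, prev_s = 0.0, 0.0, 1.0
    for t, s in curve:
        if t >= tau:
            break
        area += (t - prev_t) * prev_s  # interval length x survival at its start
        prev_t, prev_s = t, s
    return area + (tau - prev_t) * prev_s

# Toy data: events at days 1, 2, and 4; one patient censored at day 3.
curve = weighted_km([1, 2, 3, 4], [1, 1, 0, 1])
rmst_4 = rmst(curve, tau=4)
```

For the toy data the survival steps are 0.75, 0.5, and 0, so the area up to day 4 is 1×1 + 1×0.75 + 2×0.5 = 2.75 days.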

Confidence intervals (CIs) for the IPTW-adjusted RMST difference were obtained using a nonparametric bootstrap procedure. When matched pairs were available, resampling was performed at the pair level. Otherwise, bootstrap samples were generated using probability-proportional-to-size resampling, with sampling probabilities proportional to the IPTW weights, reflecting each individual’s contribution to the weighted pseudo-population. Within each bootstrap sample, RMST was recalculated using the same weighting scheme, and the RMST difference was re-estimated.

The 95% CI was derived from the empirical 2.5th and 97.5th percentiles of the bootstrap distribution. This approach captures sampling variability of the weighted survival process while preserving the time scale and interpretation of RMST in days.

Between-group differences in survival distributions were assessed using weighted log-rank–type tests. Test statistics were constructed as weighted score statistics accumulated over observed event times, with weights derived from the IPTW and, for Wilcoxon-type tests, additional weighting based on the pooled weighted survival function. P-values were obtained from chi-square distributions with one degree of freedom. Additionally, weighted Cox proportional hazards models with robust variance estimation were fitted to estimate hazard ratios, with stratification applied when matched pairs were present.

All analyses were conducted using the survival, MatchIt, and cobalt packages.

Survival differences are evaluated with restricted mean survival time in this section. Analysis with hazard ratios is also provided in the 'Overall survival with risk-set adjustment' section (Survival analysis start date = CGP test date).

Figure. Survival periods after CGP and gene mutations, estimated with the conventional Kaplan–Meier estimator. Restricted mean survival time at two years (in days) was estimated with the survRM2 package for R.

If there are too many histology subtypes, multivariable analysis may fail. Go to Settings and set: “Analyze without detailed histology” → “Yes, use OncoTree 1st level”.

Table. Univariable and multivariable regression analyses were performed with the gtsummary package for R. Factors in the multivariable analysis were selected by backward elimination based on the Akaike information criterion with the stats package for R. Variables with a variance inflation factor > 10 were excluded to eliminate multicollinearity.

Figure. Treatment reach rate and survival period after CGP. We estimated the cumulative incidence function (CIF) using Gray’s method for competing risk analysis. Treatment initiation was defined as the primary outcome, death was considered a competing event, and censoring was treated as non-informative. The cumulative incidence at each time point t was calculated as the probability of experiencing the specified event by t in the presence of competing events. If the recommended treatment was confirmed to have been administered but the treatment initiation date was unknown, the treatment was assumed to have started at the median of the entire observation period. Statistical analyses were performed using the cmprsk package in R, and CIFs were estimated with 95% confidence intervals.
We used competing risk analysis instead of the conventional Kaplan–Meier method for the following reasons:
Presence of competing events: Patients who die permanently lose the opportunity to receive treatment, making death a competing event.
Bias avoidance: Treating deaths as simple censored observations in Kaplan–Meier analysis could overestimate the treatment initiation rate.
Clinical interpretability: CIFs provide probabilities that more accurately reflect event occurrence as observed in real-world clinical settings.
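A nonparametric cumulative incidence estimate in the presence of a competing event can be sketched as follows. This Python example is illustrative only and is not the cmprsk implementation. The function name, the status coding, and the toy data are assumptions, and confidence intervals and Gray's test are omitted.

```python
def cumulative_incidence(times, status, cause=1):
    """Cumulative incidence of one event type with competing risks.

    status: 0 = censored, 1 = event of interest (treatment started),
            2 = competing event (death before treatment).
    Returns a list of (time, cumulative incidence) steps.
    """
    data = sorted(zip(times, status))
    n_risk = len(data)
    surv = 1.0   # probability of being free of any event so far
    cif, out, i = 0.0, [], 0
    while i < len(data):
        t = data[i][0]
        d_cause = d_any = n_at_t = 0
        while i < len(data) and data[i][0] == t:
            s = data[i][1]
            d_cause += s == cause
            d_any += s != 0
            n_at_t += 1
            i += 1
        if d_any:
            # Event-free up to now, then the event of interest at t.
            cif += surv * d_cause / n_risk
            surv *= 1 - d_any / n_risk
            out.append((t, cif))
        n_risk -= n_at_t
    return out

# Toy data: treatment at days 1 and 4, death at day 2, censoring at day 3.
cif_steps = cumulative_incidence([1, 2, 3, 4], [1, 2, 0, 1])
```

For the toy data the final cumulative incidence is 0.75. Treating the death as censoring and taking one minus the Kaplan–Meier estimate would instead reach 1.0, which illustrates the overestimation described above.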

Figure. Based on clinical information, a nomogram was developed to predict treatment reach. The nomogram was created with the lrm function of the rms package for R with a setting of penalty=1. Nagelkerke R2 was calculated with blorr package for R. Possible sampling bias was corrected with 500-time bootstrap sampling and then concordance index was estimated. Best_effect: the best treatment effect of CTx before CGP.

Table. Univariable and multivariable regression analyses were performed with the gtsummary package for R. Factors in the multivariable analysis were selected by backward elimination based on the Akaike information criterion with the stats package for R. Variables with a variance inflation factor > 10 were excluded to eliminate multicollinearity.

Figure. Decision curve analysis was performed to verify the usefulness of the nomogram with dcurves package for R. Ten-fold cross-validation was performed to prevent overfitting. If the blue line is located above the other lines, then the nomogram-based decision to perform CGP testing may be worthwhile.

Figure. Predicted treatment reach rate and receiver operating characteristic (ROC) curves of the nomogram using pre-CGP information, drawn with the pROC package for R. The predictions were based on a logistic regression model, a random forest model, and a LightGBM model. For each model, sensitivity and specificity were calculated from 5-fold cross-validation predictions and ROC curves were plotted. The random forest and LightGBM models used 5-fold cross-validation on a single training set and a grid search with n=8 for each parameter to determine the optimal parameters.

Download raw data

Not shown when machine learning is not performed

Clinical information 1

Clinical information 2

Metastasis information

Figure. Overall survival after the first survival-prolonging chemotherapy after adjusting for left-truncation bias. To evaluate the association between prognostic factors and survival, a risk-set (number at risk) adjustment model was applied to adjust for left-truncation bias with survival package. Note that the analysis assumes quasi-independent left-truncation (conditional Kendall tau = 0).

Take care of left-truncation bias.

Raw data

Group 1

Group 2

Figure. Overall survival after the first survival-prolonging chemotherapy after adjusting for left-truncation bias. To evaluate the association between oncogenic mutations and survival, a risk-set adjustment model was performed to adjust for left-truncation bias with survival package.

Take care of left-truncation bias.

Group 1

Group 2

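About this analysis and simulation method (click to show details)

To extract the "true treatment effects and prognostic impact of gene mutations" from the CGP testing cohort, which is real-world data (RWD), this app simulates a survival analysis that removes epidemiological biases using an accelerated failure time (AFT) model. Note, however, that when FOLFIRINOX and GEM+nab-PTX are compared in pancreatic cancer, for example, survival is longer with the former, and this is thought to be driven largely by patient selection bias that cannot be determined from the database (patients able to tolerate the more burdensome FOLFIRINOX were the ones who received it). Please understand that the analysis shows correlation, not causal inference. It can still be useful for building survival prediction models.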

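Three major biases addressed

Survivor bias (left truncation): Patients who underwent CGP testing are a special population who "survived long enough to reach testing". Comparing them directly with the general population (hospital-based cancer registry) makes their prognosis look unduly good.
Differences in patient background: Compared with the general population, the CGP cohort is skewed, for example toward younger patients.
Correlation with speed of progression: Patients with a short interval from diagnosis to CGP testing (rapid progression) also have short survival after testing, which is a natural biological correlation.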

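Analytical approach (Doubly Robust Estimation)

Standardization of patient background by IPTW (inverse probability weighting): Using survival rates from the hospital-based cancer registry (macro data) as the reference, CGP patients are weighted so that their "age" and "timing of reaching testing" are forced to match the distribution of the general population.
Multivariate accelerated failure time model (Multivariate AFT Model): The strong confounders age, sex, and histology are adjusted for in a multivariable model to extract the pure, independent effect of a "specific gene mutation" on survival time.
Reconstruction of absolute time (simulation): Starting from the expected survival days of the general population, the extracted effects are multiplied in, and a hypothetical Kaplan–Meier curve for "what if these patients were the general population" is reconstructed while preserving clinical correlations.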

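How to read the forest plot: what is the Time Ratio (TR)?

Unlike the usual hazard ratio (HR), this analysis reports the "Time Ratio (acceleration factor)". It is an intuitive measure of "by what factor the survival time a patient would otherwise have had is stretched or shrunk".

TR > 1.0: favorable prognosis (e.g., TR = 1.5 means survival time is extended 1.5-fold)
TR < 1.0: poor prognosis (e.g., TR = 0.5 means survival time is halved)
TR = 1.0: no effect

* Points in the plot are point estimates and error bars are 95% confidence intervals. If the interval does not cross the dotted line at TR = 1.0, the factor is a statistically significant independent factor.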
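As a small worked example of reading a Time Ratio, the sketch below converts a coefficient from a log-linear AFT model into a TR with a 95% confidence interval. It is written in Python for illustration. The function name and the numbers are hypothetical, and a normal-approximation interval on the log scale is assumed.

```python
from math import exp, log

def time_ratio(beta, se, z=1.96):
    """Time ratio and 95% CI from an AFT coefficient and its standard error.

    In a log-linear AFT model, TR = exp(beta). The factor is called
    significant here when the interval does not include 1.0.
    """
    tr = exp(beta)
    lower, upper = exp(beta - z * se), exp(beta + z * se)
    significant = lower > 1.0 or upper < 1.0
    return tr, lower, upper, significant

# Hypothetical coefficient corresponding to a 1.5-fold longer survival time.
tr, lower, upper, significant = time_ratio(log(1.5), se=0.1)
# A median survival of 200 days would be scaled to about 300 days.
scaled_median = 200 * tr
```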

Simulation Study: Tamura & Ikegami Model Ver 2.3.2

Simulation Results

Single Run Estimates (Point Estimate & 95% CI)

400 Iterations Summary (Mean, MSE, and Coverage Rate [CR])

Visualizations for Manuscript (Fig 1 - 3)

Univariate Dependent Truncation: Copula vs Lynden-Bell

Estimated Median Survival Times

Reconstructed Marginal Survival Curves

Figure. Overall survival after the first survival-prolonging chemotherapy after adjusting for left-truncation bias. To evaluate the association between oncogenic mutations and survival, a risk-set adjustment model was performed to adjust for left-truncation bias with survival package.

Take care of left-truncation bias.

Figure. Overall survival after the first survival-prolonging chemotherapy after adjusting for left-truncation bias. To evaluate the association between oncogenic mutations and survival, a risk-set adjustment model was performed to adjust for left-truncation bias with survival package.

It takes minutes.

Take care of left-truncation bias.

Figure. Hazard ratio estimated by the Cox model with the survival package.

It takes minutes.

This setting also applies to Bayesian estimation in other tabs.

Group 1

Group 2

It takes minutes.

Figure. Overall survival after the first survival-prolonging chemotherapy after adjusting for left-truncation bias. To evaluate the association between oncogenic mutations and survival, a Bayesian survival simulation based on a semi-independent, two-hit model was performed to adjust for left-truncation bias. Two survival curves from the date of commencement of chemotherapy for prolonging survival to the date of CGP and from the CGP testing date to the date of death were fitted with Weibull distribution and log-logistic distribution, respectively. The overall survival curve from the first chemotherapy induction was approximated by merging these survival curves. Survival curves were obtained from each of the 8000 iterations of inference, and the median survival and 95% equal-tailed CIs were calculated. Bayesian inference was performed with the rstan package for R. P value of conditional Kendall tau statistics was calculated, and the survival curves were adjusted for length bias, using a structural transformation method with tranSurv package for R.

Tamura T, et al., Selection bias due to delayed comprehensive genomic profiling in Japan. Cancer Science, 2022. Link for the paper

It takes minutes.

Figure. Overall survival after the first survival-prolonging chemotherapy after adjusting for left-truncation bias. To evaluate the association between oncogenic mutations and survival, a Bayesian survival simulation based on a semi-independent, two-hit model was performed to adjust for left-truncation bias. Two survival curves from the date of commencement of chemotherapy for prolonging survival to the date of CGP and from the CGP testing date to the date of death were fitted with Weibull distribution and log-logistic distribution, respectively. The overall survival curve from the first chemotherapy induction was approximated by merging these survival curves. Survival curves were obtained from each of the 8000 iterations of inference, and the median survival and 95% equal-tailed CIs were calculated. Bayesian inference was performed with the rstan package for R.

Tamura T, et al., Selection bias due to delayed comprehensive genomic profiling in Japan. Cancer Science, 2022. Link for the paper

Simulation settings

Figure. Overall survival after the first survival-prolonging chemotherapy.

Tamura T, et al., Selection bias due to delayed comprehensive genomic profiling in Japan. Cancer Science, 2022. Link for the paper

Drug response analysis

Overall drug usage

Patients without treatment time excluded in treatment time dataset

Patients with RECIST-NE excluded in objective response dataset

Patients without treatment time or with RECIST-NE excluded in adverse effect dataset

Figure. Time on treatment analysis for the survival-prolonging chemotherapy. Time on treatment is the period from the start date to the end date of chemotherapy. Patients who were still on the medication at the time of CGP testing were censored; otherwise, the end of treatment was counted as an event. The curve was generated using the Kaplan–Meier method.

Figure. Treatment time.

Group 1

Group 2

Table. Univariable and multivariable regression analyses were performed with the gtsummary package for R. Factors in the multivariable analysis were selected by backward elimination based on the Akaike information criterion with the stats package for R. Variables with a variance inflation factor > 10 were excluded to eliminate multicollinearity.

Volcano plots for frequent regimens

Raw data for volcano plots of all regimens

Hazard ratios and p-values were calculated with a Cox regression model that accounts for pathology.

All regimens were included, together with genes mutated in at least 8 of the treated patients.

Volcano plots for frequent regimens

Figure. Patients with objective response data who were treated with the specified drugs were divided into two groups, those who achieved an objective response and those who did not. Odds ratios and p-values were calculated and shown as a volcano plot.

Odds ratios and p-values were calculated with a logistic regression model that accounts for pathology.

All regimens were included, together with genes mutated in at least 8 of the treated patients.

Objective response: CR or PR.

Table. Univariable and multivariable regression analyses were performed with the gtsummary package for R. Factors in the multivariable analysis were selected by backward elimination based on the Akaike information criterion with the stats package for R. Variables with a variance inflation factor > 10 were excluded to eliminate multicollinearity.

Candidate genes: arbitrarily selected genes, the five most frequently mutated genes, and genes significantly associated with objective response rate (odds ratio) or time on treatment.

Objective response: CR or PR.

Table. Univariable and multivariable regression analyses were performed with the gtsummary package for R. Factors in the multivariable analysis were selected by backward elimination based on the Akaike information criterion with the stats package for R. Variables with a variance inflation factor > 10 were excluded to eliminate multicollinearity.

Candidate genes: arbitrarily selected genes, the five most frequently mutated genes, and genes significantly associated with objective response rate (odds ratio) or time on treatment.

Disease control: CR, PR, or SD.

OR: Objective response (CR or PR), ORR: Objective response rate, DC: Disease control (CR, PR, or SD), DCR: Disease control rate

95% confidence intervals were calculated using the Clopper-Pearson method.
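The Clopper-Pearson interval for a response rate can be reproduced without a statistics library by inverting the binomial tail probabilities numerically. The following Python sketch is illustrative. The function names and the example counts are hypothetical, and bisection is used here instead of the beta-distribution quantiles that statistical packages normally use.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def _solve_increasing(f, iters=200):
    """Root of an increasing function f on [0, 1] by bisection."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def clopper_pearson(k, n, alpha=0.05):
    """Exact (Clopper-Pearson) CI for k responders out of n patients."""
    # Lower limit: P(X >= k | p) = alpha / 2
    lower = 0.0 if k == 0 else _solve_increasing(
        lambda p: (1 - binom_cdf(k - 1, n, p)) - alpha / 2)
    # Upper limit: P(X <= k | p) = alpha / 2
    upper = 1.0 if k == n else _solve_increasing(
        lambda p: alpha / 2 - binom_cdf(k, n, p))
    return lower, upper

# Hypothetical example: 5 objective responses among 10 treated patients.
orr_low, orr_high = clopper_pearson(5, 10)
```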

Volcano plots for frequent regimens

Figure. Patients with objective response data who were treated with the specified drugs were divided into two groups, those who achieved an objective response and those who did not. Odds ratios and p-values were calculated and shown as a volcano plot.

Odds ratios and p-values were calculated with a logistic regression model that accounts for pathology.

All regimens were included, together with genes mutated in at least 8 of the treated patients.

Table. Univariable and multivariable regression analyses were performed with the gtsummary package for R. Factors in the multivariable analysis were selected by backward elimination based on the Akaike information criterion with the stats package for R. Variables with a variance inflation factor > 10 were excluded to eliminate multicollinearity.

Candidate genes: arbitrarily selected genes, the five most frequently mutated genes, and genes significantly associated with objective response rate (odds ratio) or time on treatment.

Objective response: CR or PR.

Cumulative Incidence of Adverse Events

Cumulative incidence of adverse events stratified by drug effectiveness, accounting for the competing risk of treatment completion. The observation period was defined as 90 days from treatment completion to capture both acute and delayed adverse events. Blue line: patients with objective response; Red line: patients without response. Shaded areas represent 95% confidence intervals. Statistical comparison was performed using Gray's test. The subdistribution hazard ratio with 95% confidence interval and p-value from the Fine-Gray competing risk regression model adjusted for age, sex, smoking status, diagnosis, and treatment line are displayed.

No Response: PD/SD, Response: PR/CR

FELIS for C-CAT database

Functions Especially for LIquid and Solid tumor clinical sequencing for C-CAT database.
English version of this README file.
Copyright © 2024 Masachika Ikegami, Released under the MIT license.

Usage guide and tips

Using the linked guide as a reference should make the software easier to understand.

Trial Website

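A feature-limited version (v1.6.6) can be tried on Shinyapps.io.
Because that environment has only 1 GB of memory, it frequently crashes from running out of memory. In general, please run FELIS in a local environment by following the steps below.
Owing to limited computing resources, survival analysis after CGP testing and multivariable analysis of odds ratios and hazard ratios can be run only locally.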

Web application for analyzing C-CAT secondary-use data

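The Center for Cancer Genomics and Advanced Therapeutics (C-CAT), established at the National Cancer Center, aggregates the results of cancer gene panel tests (Comprehensive Genomic Profiling, CGP tests) performed under health insurance together with clinical information. A framework exists for secondary use of this information for academic research and for the development of pharmaceuticals and other products. At present the data can be used only in studies that have passed ethical review both at the investigator's institution and at C-CAT, and organizations other than hospitals and academia must pay an annual fee of 7.8 million yen, so the barrier to entry is high. However, compared with AACR Project GENIE, a similar overseas database, C-CAT has more detailed drug and clinical information, and it is expected to enable analyses of rare cancers and rare fractions from angles that were not previously possible.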

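Working with C-CAT data involves problems specific to the analysis of big data and real-world data, and it requires enough programming knowledge to carry out a certain amount of data processing. We created this software to lower the barrier to analysis through a GUI, to make exploratory research based on clinical questions from daily practice easier for clinicians, and to promote the use of C-CAT secondary-use data. Felis is the scientific name for cats, and names related to C-CAT seem to follow a cat-themed convention.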

Please understand that, at present, only those who can obtain data from C-CAT can use this software.
If you are unsure how to use it, please contact maikegam at ncc.go.jp.

The analysis methods are based on the following papers

Tamura T et al., Selection bias due to delayed comprehensive genomic profiling in Japan, Cancer Sci, 114(3):1015-1025, 2023.
For left-truncation bias, see also the linked website.

Mochizuki T et al., Factors predictive of second-line chemotherapy in soft tissue sarcoma: An analysis of the National Genomic Profiling Database, Cancer Sci, 115(2):575-588, 2024.

System Requirements

Hardware Requirements

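Analyses of a few thousand cases run without problems, but analyses of tens of thousands of cases require 32 GB of memory or more.
Survival analysis runs Monte Carlo simulations with Stan. A CPU with four or more cores, as fast as possible, is recommended.
RAM: 4+ GB
CPU: 4+ cores

A survival analysis of 3,000 cases and 30 genes takes about one hour on an M1 Max Mac Studio with 64 GB RAM.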

Software Requirements

Docker file

If Docker is available, you can start using FELIS right away without a tedious installation.
For how to use Docker, see the linked guides for Windows and for macOS.
With Docker Desktop, allocate four or more CPU cores and as much memory as possible.
The FELIS Docker image is registered on Docker Hub.

# Start Docker Desktop first
# Run the following in Command Prompt (Windows) or Terminal (Mac)
# Use sudo where needed
# Run the same command to upgrade to a newer version
docker pull ikegamitky/felis:latest

# If upgrading fails, it may help to replace "latest" with an explicit version, as in the examples below.
# In that case, also replace "latest" with that version in the commands that follow.
# Intel:
docker pull ikegamitky/felis:1.6.5
# Apple silicon Mac:
docker pull ikegamitky/felis:1.6.5.mac

# If an old container is still running, stop and remove it as follows.
docker ps -a
docker kill [container id]
docker rm [container id]

To use FELIS, run the following command and open http://localhost:3838 in a browser.

docker run -d --rm -p 3838:3838 ikegamitky/felis:latest R --no-echo -e 'library(shiny);runApp("/srv/shiny-server/felis-cs", launch.browser=F)'

## If the command above does not work, try the following
# Start the Docker container
docker run -it --rm -p 3838:3838 ikegamitky/felis:latest R
# Run the following two lines in R
library(shiny)
runApp("/srv/shiny-server/felis-cs", launch.browser=F)

When running FELIS on a server, the ssh connection is no longer needed once you have entered the following command from a terminal.
If the server's IP address is 172.25.100.1, open http://172.25.100.1:3838 in a browser to use FELIS.

# ssh username@servername
docker run -d -p 3838:3838 ikegamitky/felis:latest nohup shiny-server
# exit

If you use Docker, skip ahead to the section on loading the analysis files.

R language

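Install R by following the official website as appropriate.
No particular version is required, but this software was developed with v4.3.2.
The steps below are carried out in R started from the command line.

Rstan

See RStan Getting Started (Japanese).

Installation on macOS requires the Xcode CLT, and because macrtools is installed from GitHub, a GitHub account is also needed. See the linked website for details. If you do not need survival analysis, you may choose not to install Rstan.
On Windows, install the Rtools release that matches your R version.
On Linux, install as appropriate for your distribution.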

## macOS
## Register on GitHub and obtain a PAT
### 1. Sign in to GitHub.
### 2. Open Settings - Developer Settings from the Dashboard.
### 3. Generate a Personal access token (classic) without ticking any checkboxes.
### 4. Copy the generated token.
## Run the following command in Terminal to install the Command Line Tools for Xcode
### xcode-select --install
## Run the following commands in the R console
install.packages("remotes")
remotes::install_github("coatless-mac/macrtools", auth_token = "YOUR_PAT")
options(timeout=1000)
macrtools::macos_rtools_install()
dotR <- file.path(Sys.getenv("HOME"), ".R")
if (!file.exists(dotR)) dir.create(dotR)
M <- file.path(dotR, "Makevars")
if (!file.exists(M)) file.create(M)
arch <- ifelse(R.version$arch == "aarch64", "arm64", "x86_64")
cat(paste("\nCXX17FLAGS += -O3 -mtune=native -arch", arch, "-ftemplate-depth-256"),
    file = M, sep = "\n", append = FALSE)
install.packages("rstan", repos = c("https://mc-stan.org/r-packages/", getOption("repos")))

## Windows
## Install Rtools
## Run the following command in the R console
install.packages("rstan", repos = c("https://mc-stan.org/r-packages/", getOption("repos")))

Shiny

Shiny is used to provide FELIS as a web app.

install.packages("shiny")

Package dependencies

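Install the dependency packages from the R console.
The first installation takes a long time (about two hours at best, and completing it is difficult if you are not used to the process).
System libraries may also have to be installed with apt, brew, or similar as needed, which is laborious, so using Docker is recommended.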

install.packages(c('ggplot2', 'umap', 'tidyr', 'dbscan', 'shinyWidgets', 'readr', 'dplyr', 'stringr', 'RColorBrewer', 'gt', 'gtsummary', 'flextable', 'survival', 'gridExtra', 'survminer', 'tranSurv', 'DT', 'ggsci', 'scales', 'patchwork', 'sjPlot', 'sjlabelled', 'forcats', 'markdown','PropCIs','shinythemes', 'data.table', 'ggrepel', 'httr', 'plyr', 'rms', 'dcurves', 'Matching', 'blorr', 'broom', 'survRM2', 'rsample', 'shinydashboard', 'pROC', 'withr', 'rpart', 'ranger', 'bonsai', 'tidymodels', 'discrim', 'klaR', 'probably', 'lightgbm', 'partykit', 'betacal', 'ggbeeswarm', 'BiocManager'), dependencies = TRUE)
BiocManager::install("maftools", update=FALSE)
BiocManager::install("ComplexHeatmap", update=FALSE)
BiocManager::install("drawProteins", update=FALSE)
install.packages("Rediscover")
install.packages("tidybayes")

# If installing drawProteins fails
# Sign in to GitHub, issue a PAT, and then run the following
install.packages("remotes")
remotes::install_github('brennanpincardiff/drawProteins', auth_token = "YOUR_PAT")

# If installing rms fails for your R version
# Check the following URL and change the version as appropriate
# https://cran.r-project.org/src/contrib/Archive/rms/
install.packages("remotes")
remotes::install_version(package = "rms", version = "6.7.0", dependencies = FALSE)

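R settings

RStudio is recommended.
If Japanese text in figures is not displayed correctly, see the linked page.

Starting FELIS

Downloading FELIS
Download the ZIP file of the FELIS version you want to use and extract it into a folder of your choice.

wget https://github.com/MANO-B/FELIS/raw/main/felis_latest.zip
unzip felis_latest.zip

Here the folder is assumed to be "/srv/shiny-server/felis-cs".

Launching FELIS
The following commands start the web app.
In RStudio, you can also start it with the Run App button shown at the top right of the window.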

$ R

R version 4.3.2 (2023-10-31) -- "Eye Holes"
Copyright (C) 2023 The R Foundation for Statistical Computing
Platform: aarch64-apple-darwin20 (64-bit)
.
.
.
Type 'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> library(shiny)
> runApp('/srv/shiny-server/felis-cs', launch.browser=T)

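Loading the analysis files

Obtaining the analysis files
First, download the information for the cases you want to analyze from the C-CAT secondary-use search portal. Set the portal language to Japanese rather than English, select the cases, and download the following two files:
- Report CSV (full data output)
- Case CSV (full data output)
ZIP files can be used as they are, without extracting them. Dummy data for trial use is available for download.

Open the Input C-CAT files tab.
Use the Browse... buttons at the top left of the screen to select and load the downloaded case CSV and report CSV.
Multiple files can be selected and loaded at once.
As an option, you can also supply correspondence tables that change drug names or histology.

Specifying the analysis target

Open the Setting tab.
Press the Start file loading/analysis settings button to display the settings.
Many items can be configured.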

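Filters on histology

Filter by histology
    Narrows down the histologies to be analyzed.
Histology type to be analyzed
    Select a group of histologies to be treated and analyzed as a single histology (leave unselected if none).
Name for histology type
    Select the name that represents the histologies to be analyzed together.
Minimum patients for each histology
    Rare histologies can be renamed to their site of origin for analysis.
    Sets the minimum number of cases for a histology to be analyzed.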

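Filters on clinical information

Filter by sex
    Narrows down the sexes to be analyzed.
Filter by panel
    Narrows down the cancer gene panel tests to be analyzed.
Age for analysis
    Narrows down the age range to be analyzed.
Threshold age for oncoprint
    Sets the threshold for the Young/Old classification in the Oncoprint.
Filter by performance status
    Narrows down the PS values to be analyzed.
Filter by smoking status
    Narrows down the smoking histories to be analyzed.
Filter by test-year
    Narrows down the years in which the test was performed.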

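Filters on genes

Genes of interest (if any)
    Select genes to prioritize in the Oncoprint, survival analysis, and other analyses.
Gene-set of interest (if any)
    Select a gene set of particular interest, if any.
Case selection based on the mutations
    Restricts the analysis to cases with, or cases without, the mutations.
Genes for lolliplot (if any)
    Select the gene of interest. Full rendering requires an Internet connection.
    Without an Internet connection, a simplified plot is shown.
    The Mutplot script is used.
Threshold mutation count for lolliplot
    A setting for highlighting frequent mutations.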

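Filters on mutation types

Gene to analyze (if any)
    Select a gene whose mutation sites and patterns you want to examine in detail.
    Example: EGFR TKD mutations
Variants
    Select the mutations to be analyzed together as one mutation pattern.
Name for variants
    Name the mutation pattern.
Pathological significance of the genes
    The pathological significance included in the analysis can be changed for this gene only.
Treat specified variants independently?
    The specified mutations alone can be treated as a separate gene.
    Example: rename EGFR TKD mutations to the gene EGFR_TKD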

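Other settings

Gene number for oncoprint
    Limits the number of genes included in the Oncoprint and survival analysis.
    This particularly affects the time required for survival analysis.
Oncoprint display
    Sets the sort order in the Oncoprint.
Variants for analysis
    Choose whether to analyze oncogenic mutations only or all mutations regardless of pathological significance.
How to analyze fusion genes
    When a gene has many fusion partners, each individual fusion has few cases.
    Choose whether to analyze them together, as in NTRK fusion or ALK fusion.
Distance value for DBSCAN clustering
    Sets the distance threshold used to separate clusters in the clustering analysis.
Timing for RMST measuring
    Specifies the time point for the restricted mean survival time analysis.
CTx lines to analyze
    Specifies the treatment lines to be analyzed.
    If only 1st-line is specified, the comparison with the previous treatment is not performed.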

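Running the analysis

Open the Analysis tab.
Many analyses are available. Explanatory text is shown at the bottom of the page where relevant.
Results appear in the tab that corresponds to each button.
Displayed figures can be saved with the .png extension.

Show case summary

A summary of the selected cases is shown in the Case summary tab.

The Summarized by mutation pattern tab groups the cases by mutation pattern.
The Summarized by histology tab groups the cases by histology.

Comparison of mutation counts between panels

The Comparison figure tab shows the distribution of VAF of mutations for each panel.
The Comparison figure tab also shows a table summarizing TMB and mutation counts by histology and panel.

Show Oncoprint

The Oncoprint tab shows the gene mutations of the selected cases.
The Lolliplot for the selected gene tab shows a lolliplot of the selected gene. An Internet connection is required. If it is not displayed correctly, add the UniProt ID to /APP_DIR/source/UniProt.txt.
The Table of clinical and mutation information per patient tab shows a per-case table, which can be downloaded with the button at the top left.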

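Show mutual exclusivity and co-occurrence

The Mutual exclusivity tab shows the results of a mutual exclusivity analysis between gene mutations using the Rediscover package.
Blue indicates a mutually exclusive relationship and red indicates co-occurrence.
An asterisk is shown when P<0.001.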
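
To make the blue/red reading concrete, the following R sketch tests one gene pair with Fisher's exact test. This is a deliberately simpler stand-in, not the method of the Rediscover package, and the gene names and counts are made up for illustration.

```r
# Toy data for 200 patients (TRUE = mutated). Counts are invented.
kras <- rep(c(TRUE, FALSE), c(80, 120))
# EGFR mutations placed only in KRAS wild-type cases to mimic mutual exclusivity
egfr <- c(rep(FALSE, 80), rep(c(TRUE, FALSE), c(60, 60)))

tab <- table(KRAS = kras, EGFR = egfr)
ft <- fisher.test(tab)

# Odds ratio below 1 suggests mutual exclusivity, above 1 suggests co-occurrence
relation <- if (ft$estimate < 1) "mutually exclusive" else "co-occurring"
# Same asterisk rule as described above
flag <- if (ft$p.value < 0.001) "*" else ""
```

In a heatmap, `relation` would decide the cell colour and `flag` the asterisk for this pair.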

Show the mutation rate of each gene by histology

The Variation by histology tab shows, for frequently mutated genes, the frequency of gene mutations in each histology.

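Clustering based on gene mutations

Clustering based on mutated genes is performed with UMAP and DBSCAN. Results are shown under the Clustering analysis tab.

The Basic data tab shows basic information for each histology.
- Driver: proportion of cases in which at least one oncogenic mutation was detected
- option and treat: frequency (%) with which the expert panel recommended a treatment, and with which the patient received it
- time_before_CGP: median survival (days) from the start of palliative chemotherapy to the CGP test
- time_after_CGP: median survival (days) from the CGP test to death
The UMAP clustering based on mutations tab shows the histologies and gene mutations enriched in each cluster.
Up to three histologies enriched at P<0.05 are shown, in descending order of odds ratio relative to the other clusters.
Up to three gene mutations enriched at P<0.05 are shown, in descending order of odds ratio relative to the other clusters.
The Cluster and age relationship tab shows the age groups in each cluster.
The Cluster and histology relationship tab shows the histologies in each cluster.
The Heterogeneity within histologic types tab shows, as an entropy value, whether each histology concentrates in a few clusters or spreads over many.
The value is the Shannon entropy; lower values indicate stronger concentration.
The Table of clusters and histologies tab shows a table of the relationship between clusters and histologies.
It can be downloaded with the button at the top left.
The Table of clusters and genetic variants tab shows a table of the relationship between clusters and gene mutations.
It can be downloaded with the button at the top left.
The Frequency of patients with targeted therapy tab shows, for each histology, how often a mutation matching a drug with an Evidence level was detected.
Unified to the highest Evidence level: a patient with both an Evidence level A drug and a level B drug is counted as A.
The Relationship between pre-CGP and post-CGP periods tab shows, for each histology, the relationship between mean survival from the start of chemotherapy to the CGP test and from the CGP test to death.
The Patients per histology and treatment reach rate tab shows the relationship between the number of cases and treatment information for each histology.
One scatter plot shows the number of patients against the proportion with a drug at Evidence level A, B or higher, and C or higher.
Another shows the number of patients against the proportion with a recommended treatment, the proportion who received it, and the proportion of patients with a recommended treatment who received it.
The Pre-CGP period and treatment reach rate tab shows the relationship between the time to the CGP test and treatment information for each histology.
The Post-CGP period and treatment reach rate tab shows the relationship between survival after the CGP test and treatment information for each histology.
The Age and treatment reach rate tab shows the relationship between mean age and treatment information for each histology.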
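
The entropy reading in the Heterogeneity within histologic types tab can be illustrated with a short calculation. The R sketch below assumes base-2 logarithms (bits) and uses invented histology labels and cluster assignments; the source does not state which log base FELIS uses.

```r
# Shannon entropy of a vector of cluster labels (log base 2 is an assumption)
shannon_entropy <- function(clusters) {
  p <- table(clusters) / length(clusters)
  p <- p[p > 0]
  -sum(p * log2(p))
}

# Toy data: histology "A" sits in one cluster, "B" is spread evenly over four
df <- data.frame(
  histology = c(rep("A", 8), rep("B", 8)),
  cluster   = c(rep(1, 8), 1, 1, 2, 2, 3, 3, 4, 4)
)

entropy_by_histology <- tapply(df$cluster, df$histology, shannon_entropy)
```

Histology "A" gets entropy 0 (fully concentrated) and "B" gets 2 bits (evenly spread over four clusters), matching the rule that lower values mean stronger concentration.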

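Survival analysis after the CGP test

Survival after the CGP test is analyzed with a focus on gene mutations, treatment, PS, and other factors. The 95% confidence interval is calculated with the log-log transformation.
The results indicate which patients have a poor prognosis and should be considered for early CGP testing.
Results are shown under the Survival after CGP tab.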
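
As a rough illustration of a log-log confidence interval, the following R sketch fits a Kaplan-Meier curve with the survival package and requests `conf.type = "log-log"`. The follow-up times and event indicators are invented, and this is not necessarily how FELIS calls the package internally.

```r
library(survival)

# Toy data: days from CGP test to death (event = 1) or last follow-up (event = 0)
time  <- c(30, 45, 60, 90, 120, 150, 200, 250, 300, 365)
event <- c(1, 1, 0, 1, 1, 0, 1, 1, 0, 0)

# Kaplan-Meier estimate with 95% CI on the log(-log(S)) scale
fit <- survfit(Surv(time, event) ~ 1, conf.type = "log-log")

s <- summary(fit)
ci <- data.frame(time = s$time, surv = s$surv, lower = s$lower, upper = s$upper)
```

One practical property of this transformation is that the interval limits stay between 0 and 1, unlike the plain (untransformed) interval.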

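The Survival and treatment after CGP tab shows survival analysis grouped by whether a treatment was recommended and by the treatment received.
UnrecomTreat(+): patients who received a treatment other than the recommended one
RecomTreat(+): patients who received the recommended treatment
Treat(-): patients who received no treatment after the CGP test
The Survival after CGP and performance status tab shows survival analysis grouped by histology, PS, and presence or absence of gene mutations.
The Survival after CGP and previous treatment tab shows survival analysis grouped by best overall response to previous treatment and by number of treatment courses.
For gene mutations, if genes of interest are set, cases are grouped by whether any of them is mutated.
The Survival after CGP and mutations, forest plot tab shows median survival (days) grouped by presence or absence of gene mutations.
The 95% confidence interval is not shown when mutations are infrequent.
The Survival after CGP and mutations, KM-curve tab shows Kaplan-Meier survival curves grouped by presence or absence of gene mutations.
The Hazard ratio for survival after CGP tab shows a forest plot of hazard ratios for death after the CGP test.
Factors that agree in 95% or more of cases are judged to be multicollinear and are excluded.
Factors with two or fewer death events are not shown.
The Factors leading to Treatment, pre-CGP, Nomogram tab shows a nomogram that predicts whether a patient will receive the recommended treatment.
The Factors leading to Treatment, post-CGP, Nomogram tab shows a nomogram that predicts whether a patient with a recommended treatment will receive it.
The Factors leading to Treatment, pre-CGP, Odds ratio tab shows factors associated with receiving the recommended treatment.
The Factors leading to Treatment, post-CGP, Odds ratio tab shows factors associated with receiving the recommended treatment among patients who have one.
The Factors leading to Treatment, decision curve tab shows the performance of the nomogram for receiving the recommended treatment, evaluated by decision curve analysis. See the linked page for how to interpret the results.
The Factors leading to Treatment, table tab shows the details of the decision curve analysis.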

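Survival analysis after chemotherapy induction (takes time)

Survival after the start of palliative chemotherapy is analyzed with left-truncation bias taken into account. Because the analysis relies on simulation with Stan, it takes on the order of tens of minutes. Results are shown under the Survival after CTx tab.

The Survival corrected for left-truncation bias tab shows survival analysis without correction for left-truncation bias, with correction by number at risk, and with correction by simulation.
The Survival corrected for left-truncation bias tab also shows survival analysis grouped by presence or absence of mutations in the genes of interest.
For gene mutations, if genes of interest are set, cases are grouped by whether any of them is mutated.
If no genes of interest are set, cases are grouped by whether the most frequently mutated gene is mutated.
The Genetic variants and survival, forest plot tab shows the difference in median survival (days) by presence or absence of gene mutations.
Results are not shown when there are few death events.
The Genetic variants and survival, KM-curve tab shows survival curves grouped by presence or absence of gene mutations.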

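List of drugs used in Palliative CTx (1st-4th line)

The regimens used in the 1st to 4th lines of palliative chemotherapy are extracted and shown in the Analysis tab. Select the drugs of interest for the subsequent analyses.
Because some entries appear to be inaccurate, it is best to use this list only to see overall trends.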

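Response analysis for the drugs selected with the button above

Focusing on time on treatment (ToT), this analysis evaluates the relationship between the duration of drug response and gene mutations or histology.
Results are shown under the Drug response tab.

The Drug use, by line of treatment tab summarizes information on all drugs by treatment line.
The Drug use, by treatment effect tab summarizes information on all drugs by treatment effect.
The Use of designated line agents, by mutation pattern tab summarizes information on drugs in the designated lines by mutation pattern.
The Use of designated line agents, by histology tab summarizes information on drugs in the designated lines by histology.
The Use of designated line agents, by mutated genes tab summarizes information on drugs in the designated lines by gene mutation of interest.
The Use of designated lines and drugs with ToT information, by mutation pattern tab summarizes the designated lines and drugs that have ToT information by mutation pattern.
The Use of designated lines and drugs with ToT information, by histology tab summarizes the designated lines and drugs that have ToT information by histology.
The Use of designated lines and drugs with ToT information, by mutated genes tab summarizes the designated lines and drugs that have ToT information by gene mutation of interest.
The Use of designated lines and drugs with RECIST information, by mutation pattern tab summarizes the designated lines and drugs that have RECIST information by mutation pattern.
The Use of designated lines and drugs with RECIST information, by histology tab summarizes the designated lines and drugs that have RECIST information by histology.
The Use of designated lines and drugs with RECIST information, by mutated genes tab summarizes the designated lines and drugs that have RECIST information by gene mutation of interest.
The Time on treatment and pre-treatment for the specified treatment, scatter plot tab shows a waterfall plot and a scatter plot of the relationship between ToT of the drug of interest and ToT of the preceding treatment.
Censored cases are excluded.
The Time on treatment and pre-treatment for the specified treatment, KM-curve tab shows Kaplan-Meier curves comparing ToT of the drug of interest with ToT of the preceding treatment and of other drugs, and relating gene mutations to ToT.
The Time on treatment by tissue type, KM-curve tab shows Kaplan-Meier curves of ToT for the drug of interest by histology.
The Time on treatment by gene mutation cluster, KM-curve tab shows Kaplan-Meier curves of ToT for the drug of interest by gene mutation cluster.
The Time on treatment by mutated genes, forest plot tab shows a forest plot of median ToT for the drug of interest by presence or absence of gene mutations.
The Time on treatment by mutated genes, KM-curve tab shows Kaplan-Meier curves of ToT for the drug of interest by presence or absence of gene mutations.
The Time on treatment and mutations of interest, KM-curve tab shows Kaplan-Meier curves relating ToT to the genes and mutation patterns of interest.
The Hazard ratio on time on treatment tab shows a table of hazard ratios for factors leading to treatment discontinuation.
The Odds ratio on objective response rate tab shows a table of odds ratios for factors leading to objective response.
The Odds ratio on disease control rate tab shows a table of odds ratios for factors leading to disease control.
The Mutation clustering and RECIST tab shows a table of response by gene mutation cluster.
The 95% confidence interval is calculated with the Clopper–Pearson method.
The Mutation pattern and RECIST tab shows a table of response by mutation pattern.
The Histology and RECIST tab shows a table of response by histology.
The Mutated genes and RECIST tab shows a table of response by gene mutation.
The Survival and drug tab draws survival curves after the CGP test for each drug used after CGP.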
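
For reference, a Clopper–Pearson interval for a response rate can be computed in base R as sketched below. The counts (12 responders out of 40) are invented, and the sketch does not claim to reproduce the exact call FELIS makes.

```r
# Exact (Clopper-Pearson) 95% CI for a response rate: 12 responders out of 40
x <- 12
n <- 40
ci <- binom.test(x, n, conf.level = 0.95)$conf.int

# The same interval written out with beta quantiles
ci_beta <- c(qbeta(0.025, x, n - x + 1), qbeta(0.975, x + 1, n - x))
```

Both forms give the same limits; the beta-quantile version makes explicit why the interval never leaves the range 0 to 1.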

Instructions

The Instruction tab shows how to use the software.

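Planned features

Add mutual exclusivity analysis between pathways
Add survival analysis from the time of diagnosis (with grouping by stage at diagnosis once more cases have it registered)
Add an analysis of which better predicts drug response: results of tests performed before panel testing, such as HER2 immunostaining and MSI, or gene mutations found by panel testing
Add an analysis of the association between variant frequency in liquid sequencing and drug response
Evaluate differences in mutation accumulation patterns between histologies, including VUS
Modularize functions

Recommended FELIS version for each C-CAT database version

Column names in C-CAT data may be added or changed between versions, so a compatible FELIS version is required.
C-CAT database version 20240820 & 20240621: FELIS version 1.9.1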

FELIS analysis examples

This section summarizes how to run the analyses that are likely to be needed most often.

Drug response analysis

Comparing a treatment with the treatment in the preceding course

1. Import the case/report CSV files in Input C-CAT files
2. In Setting, filter by histology, age, treatment course, and so on
3. Press Analysis -> Drug response analysis -> List of drugs used in Palliative CTx
4. Check drug usage in Results -> Drug response -> Tables -> Drug use, by line of treatment
5. Select the treatment in Analysis -> Drug response analysis -> Choose drugs for treatment effect analysis
6. Run the analysis with Analysis -> Drug response analysis -> Analyze with the setting selected above
7. In Results -> Drug response -> Time on treatment -> Time on treatment and pre-treatment for the specified treatment, scatter plot, compare time on treatment between the previous treatment and the specified treatment within the same patients
8. In Results -> Drug response -> Time on treatment -> Time on treatment and pre-treatment for the specified treatment, KM-curve, compare time on treatment between the previous treatment and the specified treatment within the same patients

Comparing the effect of two treatments

1. Import the case/report CSV files in Input C-CAT files
2. In Setting, filter by histology, age, treatment course, and so on
3. Press Analysis -> Drug response analysis -> List of drugs used in Palliative CTx
4. Check drug usage in Results -> Drug response -> Tables -> Drug use, by line of treatment
5. Select two or more treatments in Analysis -> Drug response analysis -> Choose drugs for treatment effect analysis. Enter the two regimens to compare in Drug set 1 and Drug set 2, respectively
6. Run the analysis with Analysis -> Drug response analysis -> Analyze with the setting selected above
7. In Results -> Drug response -> Survival after CGP -> Survival and drug, check the survival curves after the start of palliative chemotherapy for the two groups defined by the specified regimens

In particular, restricting the analysis to 1st-line treatment makes it possible to compare, for example, whether FOLFIRINOX or GEM + nab-PTX gives longer survival as 1st-line treatment for pancreatic cancer.
Because left-truncation bias is strong for recently launched drugs, an analysis matched on treatment start date has also been added.

Evaluating treatment effect by histology

1. Import the case/report CSV files in Input C-CAT files
2. In Setting, filter by histology, age, treatment course, and so on
3. Press Analysis -> Drug response analysis -> List of drugs used in Palliative CTx
4. Check drug usage in Results -> Drug response -> Tables -> Drug use, by line of treatment
5. Select the treatment in Analysis -> Drug response analysis -> Choose drugs for treatment effect analysis
6. Run the analysis with Analysis -> Drug response analysis -> Analyze with the setting selected above
7. In Results -> Drug response -> Time on treatment -> Time on treatment by tissue type, KM-curve, evaluate time on treatment for all treatments or for the specified treatment with the Kaplan-Meier method

If treatment duration differs between all drugs and the specified drug, this suggests that the specified drug may be effective or ineffective for that histology.

Evaluating the significance of a gene mutation for the effect of a drug

1. Import the case/report CSV files in Input C-CAT files
2. In Setting, filter by histology, age, treatment course, and so on, and specify the genes to explore
3. Press Analysis -> Drug response analysis -> List of drugs used in Palliative CTx
4. Check drug usage in Results -> Drug response -> Tables -> Drug use, by line of treatment
5. Select the treatment in Analysis -> Drug response analysis -> Choose drugs for treatment effect analysis
6. Run the analysis with Analysis -> Drug response analysis -> Analyze with the setting selected above
7. In Results -> Drug response -> Time on treatment -> Time on treatment and pre-treatment for the specified treatment, KM-curve, evaluate time on treatment for the specified treatment, grouped by presence or absence of the gene mutation, with the Kaplan-Meier method

If treatment duration differs between all drugs and the specified drug, this suggests that the gene mutation may be a biomarker for the specified drug.

Surveying the relationship between drug effect and gene mutations with a volcano plot

1. Import the case/report CSV files in Input C-CAT files
2. In Setting, filter by histology, age, treatment course, and so on
3. Press Analysis -> Drug response analysis -> List of drugs used in Palliative CTx
4. Check drug usage in Results -> Drug response -> Tables -> Drug use, by line of treatment
5. Select the treatment in Analysis -> Drug response analysis -> Choose drugs for treatment effect analysis
6. Run the analysis with Analysis -> Drug response analysis -> Analyze with the setting selected above
7. In Results -> Drug response -> Response rate -> Volcano plot for objective response rate, look for gene mutations associated with response

Genes in red at the upper right have a higher response rate when mutated, and genes in blue at the upper left have a lower response rate when mutated.
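
The coordinates of one point on such a volcano plot can be sketched as follows. This R example assumes Fisher's exact test per gene, with log2 odds ratio on the x-axis and -log10 p-value on the y-axis; the gene names, counts, and the helper `volcano_point` are hypothetical, and FELIS may use a different model, for example an adjusted regression.

```r
# Toy cohort: 40 responders followed by 60 non-responders
resp <- rep(c(TRUE, FALSE), c(40, 60))
# GENE_UP is mutated mostly in responders, GENE_DOWN mostly in non-responders
gene_up   <- c(rep(TRUE, 30), rep(FALSE, 10), rep(TRUE, 10), rep(FALSE, 50))
gene_down <- c(rep(TRUE, 2),  rep(FALSE, 38), rep(TRUE, 30), rep(FALSE, 30))

# Hypothetical helper: volcano coordinates for one gene
volcano_point <- function(mut, resp) {
  tab <- table(factor(mut, c(FALSE, TRUE)), factor(resp, c(FALSE, TRUE)))
  ft <- fisher.test(tab)
  c(log2_or = log2(unname(ft$estimate)), neg_log10_p = -log10(ft$p.value))
}

pts <- rbind(GENE_UP   = volcano_point(gene_up, resp),
             GENE_DOWN = volcano_point(gene_down, resp))
```

`GENE_UP` lands on the right (positive log2 odds ratio) and `GENE_DOWN` on the left, both high on the y-axis, which corresponds to the red and blue regions described above.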

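Survival analysis

Relating prognosis after the CGP test to gene mutations

1. Import the case/report CSV files in Input C-CAT files
2. In Setting, filter by histology, age, treatment course, and so on. In particular, specify the gene set of interest in Genes of interest.
3. Press the Analysis -> Survival analysis after CGP test button
4. In Results -> Survival after CGP -> Survival analysis -> Survival after CGP and performance status, check the survival curves grouped by whether any gene in the specified gene set is mutated.
5. In Results -> Survival after CGP -> Survival analysis -> Survival after CGP and mutations, forest plot, compare survival time between the groups with and without a mutation for frequently mutated genes
6. In Results -> Survival after CGP -> Survival analysis -> Survival after CGP and mutations, KM-curve, compare survival curves between the groups with and without a mutation for frequently mutated genes

Timing for RMST measuring in survival analysis (years) in Setting specifies the time point at which the difference in survival time (restricted mean survival time) drawn in the forest plot is calculated.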
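
Restricted mean survival time is the area under the Kaplan-Meier curve up to a chosen time point. The base-R sketch below computes it directly for two invented groups with a two-year limit (730 days); the function `rmst` is written for illustration and is not the survRM2 implementation.

```r
# RMST: area under the Kaplan-Meier step function from 0 to tau
rmst <- function(time, event, tau) {
  ord <- order(time)
  time <- time[ord]
  event <- event[ord]
  ut <- unique(time[event == 1 & time <= tau])  # distinct event times up to tau
  surv <- 1
  prev <- 0
  area <- 0
  for (t in ut) {
    area <- area + surv * (t - prev)            # rectangle before the drop at t
    d <- sum(time == t & event == 1)            # deaths at t
    n <- sum(time >= t)                         # number at risk just before t
    surv <- surv * (1 - d / n)
    prev <- t
  }
  area + surv * (tau - prev)                    # last rectangle up to tau
}

tau <- 730                                      # two years, in days
rmst_mut <- rmst(c(100, 200, 300, 400), rep(1, 4), tau)
rmst_wt  <- rmst(c(300, 500, 700, 900), rep(1, 4), tau)
rmst_diff <- rmst_wt - rmst_mut
```

With these toy values the mutated group has an RMST of 250 days and the wild-type group 557.5 days, so the difference plotted in a forest plot would be 307.5 days.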

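Relating prognosis after the start of palliative chemotherapy to gene mutations

1. Import the case/report CSV files in Input C-CAT files
2. In Setting, filter by histology, age, treatment course, and so on. In particular, specify the gene set of interest in Genes of interest.
3. Press the Analysis -> Survival analysis after CTx induction button
4. In Results -> Survival after CTx -> Genetic variants and survival, forest plot, compare survival time between the groups with and without a mutation for frequently mutated genes
5. In Results -> Survival after CTx -> Genetic variants and survival, KM-curve, compare survival curves between the groups with and without a mutation for frequently mutated genes

Survival curves are drawn both with and without correction for left-truncation bias.

Comparing survival by whether a drug was used at any time during the patient's lifetime

1. Import the case/report CSV files in Input C-CAT files
2. In Setting, filter by histology, age, treatment course, and so on
3. Press Analysis -> Drug response analysis -> List of drugs used in Palliative CTx
4. Check drug usage in Results -> Drug response -> Tables -> Drug use, by line of treatment
5. Select the treatment in Analysis -> Drug response analysis -> Choose drugs for treatment effect analysis
6. Run the analysis with Analysis -> Drug response analysis -> Analyze with the setting selected above
7. In Results -> Drug response -> Survival after CGP -> Survival and drug, check the survival curves for the two groups defined by whether the specified regimen was used after the start of palliative chemotherapy

Survival curves are drawn both with and without correction for left-truncation bias.

Comparing survival by whether a drug was used after the CGP test

1. Import the case/report CSV files in Input C-CAT files
2. In Setting, filter by histology, age, treatment course, and so on
3. Press Analysis -> Drug response analysis -> List of drugs used in Palliative CTx
4. Check drug usage in Results -> Drug response -> Tables -> Drug use, by line of treatment
5. Select the treatment in Analysis -> Drug response analysis -> Choose drugs for treatment effect analysis
6. Run the analysis with Analysis -> Drug response analysis -> Analyze with the setting selected above
7. In Results -> Drug response -> Survival after CGP -> Survival and drug, check the survival curves for two or three groups: patients who did or did not receive the specified regimen after the CGP test, and patients who received no treatment

Ordinary Kaplan-Meier survival curves are drawn.

Identifying factors related to the hazard of death after the CGP test

1. Import the case/report CSV files in Input C-CAT files
2. In Setting, filter by histology, age, treatment course, and so on.
3. Press the Analysis -> Survival analysis after CGP test button
4. In Results -> Survival after CGP -> Survival analysis -> Hazard ratio for survival after CGP -genes, examine the clinical information and gene mutations related to the hazard ratio in univariable and multivariable analysis.
5. In Results -> Survival after CGP -> Survival analysis -> Hazard ratio for survival after CGP -genes, examine the clinical information and gene patterns (clustering) related to the hazard ratio in univariable and multivariable analysis.

Variables are selected automatically with the Akaike information criterion.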

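Patient characteristics

Getting a table of patient characteristics

1. Import the case/report CSV files in Input C-CAT files
2. In Setting, filter by histology, age, and so on
3. Press the Case summary button in Analysis
4. Check the results in Results -> Case summary
5. If needed, copy the whole table and save it in Word

The table is grouped by presence or absence of the gene mutations selected in Filters on mutation types in Setting.

Getting a figure of patient characteristics

1. Import the case/report CSV files in Input C-CAT files
2. In Setting, filter by histology, age, and so on
3. Press the Clustering based on variants button in Analysis
4. Check the results in Results -> Clustering analysis -> Basic data
5. If needed, copy the whole figure and save it in Word

Driver indicates whether any oncogenic mutation (C-CAT evidence level "F") was detected in the case.
Pts with recommended CTx is the proportion of cases for which the expert panel recommended a treatment.
Pts received recommended CTx is the proportion of cases that actually received the recommended treatment.
Median time from CTx to CGP is the median time from the start of palliative chemotherapy to the CGP test date.
Median time from CGP to death is the Kaplan-Meier median of the time from the CGP test date to death.

Getting an oncoprint (overview table of gene mutations)

1. Import the case/report CSV files in Input C-CAT files
2. In Setting, filter by histology, age, and so on
3. Press the Oncoprint button in Analysis
4. Check the results in Results -> Oncoprint -> Figures -> Oncoprint
5. Press the Mutation rate of each gene for each histology button in Analysis
6. In Results -> Variation by histology, check which gene mutations are frequent in each histology

The underlying data for the plot can be downloaded as an Excel file from Downloadable table.

Getting a lolliplot (figure showing which amino acid residues of a gene are frequently mutated)

1. Import the case/report CSV files in Input C-CAT files
2. In Setting, filter by histology, age, and so on. Specify the gene in Gene for lolliplot in the third column.
3. Press the Oncoprint button in Analysis
4. Check the results in Results -> Oncoprint -> Figures -> Lolliplot for the selected gene

The underlying data for the plot can be downloaded as an Excel file from Downloadable table.
Exon skipping and intronic mutations are not currently supported.

Getting information on mutual exclusivity and co-occurrence between genes

1. Import the case/report CSV files in Input C-CAT files
2. In Setting, filter by histology, age, and so on
3. Press the Mutually exclusive or co-occurring mutation button in Analysis
4. Check the results in Results -> Mutual exclusivity

If the cell where the X-axis gene and the Y-axis gene intersect is blue, the two genes are mutually exclusive; if it is red, they co-occur.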

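Treatment reach analysis

Finding out which patients have a high treatment reach rate when given a CGP test

1. Import the case/report CSV files in Input C-CAT files
2. In Setting, filter by histology, age, treatment course, and so on. In particular, specify the gene set of interest in Genes of interest.
3. Press the Analysis -> CGP benefit prediction analysis button
4. From Results -> CGP benefit prediction -> Factors leading to treatment -> Factors leading to treatment, pre-CGP, Nomogram, obtain a nomogram that predicts the treatment reach rate from clinical information available before the test.
5. In Results -> CGP benefit prediction -> Factors leading to treatment -> Factors leading to treatment, pre-CGP, Odds ratio, examine in univariable and multivariable analysis how clinical information available before the test affects the treatment reach rate.
6. In Results -> CGP benefit prediction -> Factors leading to treatment -> ROC curve of nomogram, check the accuracy of the nomogram's predictions with the ROC curve.
7. In Results -> CGP benefit prediction -> Factors leading to treatment -> Factors leading to treatment, decision curve, check the clinical usefulness of the nomogram's predictions as evaluated by decision curve analysis.
8. In Results -> CGP benefit prediction -> Factors leading to treatment -> Analyze your data, enter the information for a specific patient to obtain a predicted treatment reach rate.

For decision curve analysis, see the linked pages.

Relating survival after the CGP test to the treatment reach rate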

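Data curation

Curating histology

Detailed histology is sometimes not registered, mainly for cases from around 2019.
These cases can be reclassified based on the free-text information entered manually by staff at each hospital.
1. Import the case/report CSV files in Input C-CAT files
2. In Setting, filter by histology, age, and so on
3. Press the Oncoprint button in Analysis
4. Download the results with the Excel button at the top left of Oncoprint -> Downloadable table
5. Using column P (病理診断名, pathological diagnosis), column Q (臨床診断名, clinical diagnosis), and column R (提出検体の病理診断名, pathological diagnosis of the submitted specimen) as references, correct column S (がん種.OncoTree.)
6. Press Input C-CAT files -> Correspondence table between ID and histology (CSV) -> Download CSV file template and save the file
7. Paste the contents of the ハッシュID (hash ID) column and the がん種.OncoTree. column of the table from step 5 into the ID and Histology columns
8. Load the CSV file you created in Input C-CAT files -> Correspondence table between ID and histology (CSV)

Curating only the cases in which the がん種.OncoTree. entry is identical to the がん種.OncoTree.LEVEL1. entry should keep the effort small.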

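Simulation of left-truncation bias

Checking how left-truncation bias behaves

When, as with C-CAT data, the start date for measuring survival differs from the test date (the start of observation), survival is difficult to estimate with the ordinary Kaplan-Meier method.
This is because immortal time bias exists: every case is alive from the start date for measuring survival until the test date.
Splitting survival into the period from the start date to the test date and the period from the test date to the last observation removes part of the bias.
However, how patients who underwent CGP testing differ from those who did not is ultimately unknowable, and we consider that a certain amount of selection bias cannot be removed.
1. Open Results -> Bias correction simulation -> An example of bias adjustment.
2. Adjust the parameters as you like
3. Press the Left-truncation bias adjustment simulation button
4. The true survival curve, the ordinary Kaplan-Meier survival curve, the survival curve split before and after the CGP test, and the survival curve from the Bayesian bias-correction method are drawn.

The bias appears to be corrected reasonably well overall.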
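
The effect described above can be reproduced with a small simulation. The R sketch below uses delayed-entry (risk-set adjusted) Kaplan-Meier estimation from the survival package rather than the Bayesian method used in FELIS, and every parameter (exponential survival with a mean of 365 days, CGP timing uniform over two years, no censoring) is an assumption chosen for illustration.

```r
library(survival)
set.seed(42)

n <- 20000
t_death <- rexp(n, rate = 1 / 365)   # true survival from start of CTx (days)
t_cgp   <- runif(n, 0, 730)          # time from start of CTx to the CGP test
seen    <- t_death > t_cgp           # only patients alive at CGP enter the database

d <- data.frame(entry = t_cgp[seen], time = t_death[seen], event = 1)

true_median <- 365 * log(2)          # median of the assumed exponential distribution

# Naive Kaplan-Meier: ignores that patients were immortal until the CGP test
naive <- survfit(Surv(time, event) ~ 1, data = d)
# Risk-set adjustment: patients join the risk set only from their CGP date
adjusted <- survfit(Surv(entry, time, event) ~ 1, data = d)

naive_median    <- unname(summary(naive)$table["median"])
adjusted_median <- unname(summary(adjusted)$table["median"])
```

Under these assumptions the naive median is far longer than the true median of about 253 days, while the risk-set adjusted median is close to it.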

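Other

Saving a figure

Right-click the figure and save it under a name with the .png extension.

Saving a summary table

Save the table.

Saving a raw data table

Use the buttons at the top left to save the table under a name as an Excel file, a CSV file, or another format.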

If you select No, CSV files may not be necessary.

If you select Yes, repeated runs of the same analysis are faster.

Figure. Recurrent oncogenic mutations in selected cases. The 30 genes with the highest frequency of oncogenic mutations are shown. Mutational landscapes were created using ComplexHeatmap package for R.

Figure. Frequency of oncogenic mutations in the selected gene. The most frequent oncogenic mutations are shown with amino acid change.

Mutplot by Zhang W, PMID:31091262. If an error occurs, correct 'source/UniProt.txt'.

Protein structure source: Uniprot

Figure. Recurrent oncogenic mutations across subtypes. The 30 genes with the highest frequency of oncogenic mutations were displayed.

Summary, cluster and mutated gene

Summary, cluster and histology

Raw data

Survival difference will be evaluated with restricted mean survival time in this section. Analysis with hazard ratio is also provided in 'Overall survival with risk-set adjustment' section (Survival analysis start date = CGP test date).

Figure. Survival analysis after CGP test. The conventional Kaplan–Meier estimator and the log–rank test were performed with the survival package for R. EP: expert panel. RMST: restricted mean survival time.

Raw data

Group 1

Group 2

Propensity score-based adjustment

Propensity score matching

Inverse probability weighting

Threshold for IPW

The 95% CI was derived from the empirical 2.5th and 97.5th percentiles of the bootstrap distribution. This approach captures sampling variability of the weighted survival process while preserving the time scale and interpretation of RMST in days.
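
The percentile bootstrap described here can be sketched in a few lines of R. For brevity the example is unweighted and assumes no censoring, so RMST reduces to the mean of survival time capped at the limit; the group sizes, distributions, and the helper `rmst_uncensored` are all invented for illustration and do not reproduce the weighted procedure.

```r
set.seed(1)
tau <- 730                                   # restriction time in days (assumed)
g1 <- rexp(200, rate = 1 / 300)              # toy survival times, group 1
g2 <- rexp(200, rate = 1 / 600)              # toy survival times, group 2

# Without censoring, RMST is the mean of min(T, tau)
rmst_uncensored <- function(t, tau) mean(pmin(t, tau))

point_estimate <- rmst_uncensored(g2, tau) - rmst_uncensored(g1, tau)

# Resample each group with replacement and recompute the RMST difference
boot <- replicate(2000, {
  rmst_uncensored(sample(g2, replace = TRUE), tau) -
    rmst_uncensored(sample(g1, replace = TRUE), tau)
})

# Percentile interval: empirical 2.5th and 97.5th percentiles
ci <- quantile(boot, c(0.025, 0.975))
```

Because the interval is read directly from the bootstrap distribution, it stays on the same scale as the RMST difference, in days.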

All analyses were conducted using the survival, MatchIt, and cobalt packages.

Survival difference is evaluated with restricted mean survival time in this section. Analysis with hazard ratio is also provided in 'Overall survival with risk-set adjustment' section (Survival analysis start date = CGP test date).

Figure. Survival periods after CGP and gene mutations estimated with the conventional Kaplan-Meier estimator. Restricted mean survival time at two years (days) was estimated with the survRM2 package in R.

If there are too many histology subtypes, multivariable analysis may fail. Go to Settings and set: “Analyze without detailed histology” → “Yes, use OncoTree 1st level”.

Download raw data

Take care of left-truncation bias.

Figure. Overall survival after the first survival-prolonging chemotherapy after adjusting for left-truncation bias. To evaluate the association between oncogenic mutations and survival, a risk-set adjustment model was performed to adjust for left-truncation bias with survival package.

Three major biases addressed

Analysis approach (Doubly Robust Estimation)

How to read the forest plot: what is the Time Ratio (TR)?

Simulation Study: Tamura & Ikegami Model Ver 2.3.2

Population & Target Gene Settings

Left-Truncation (T1) Pattern

Censoring Pattern (C2)

Simulation Results

Single Run Estimates (Point Estimate & 95% CI)

400 Iterations Summary (Mean, MSE, and Coverage Rate [CR])

Visualizations for Manuscript (Fig 1 - 3)

Univariate Dependent Truncation: Copula vs Lynden-Bell

Simulation Parameters

Estimated Median Survival Times

Reconstructed Marginal Survival Curves

It takes minutes.

Take care of left-truncation bias.

Figure. Hazard ratio estimated by the Cox model with the survival package.

It takes minutes.

This setting also applies to Bayesian estimation in other tabs.

Group 1

Group 2

It takes minutes.

It takes minutes.

Figure. Overall survival after the first survival-prolonging chemotherapy.

Overall drug usage

Patients without treatment time excluded in treatment time dataset

Patients with RECIST-NE excluded in objective response dataset

Patients without treatment time or with RECIST-NE excluded in adverse effect dataset

Figure. Treatment time.

Group 1