Disclaimer

This platform provides ethical risk assessment for pathology foundation models (PFMs). The evaluation results are generated automatically based on the submitted models and available data. These results are provided for informational purposes only and do not constitute professional, medical, or legal advice.

By using this platform, you acknowledge and agree to the following:

  • You will use this tool solely for research or authorized clinical workflow purposes.
  • You will not redistribute model outputs or assessment results without proper approval.
  • PFMs may produce probabilistic predictions that are not guaranteed to be correct in all cases.
  • Final diagnostic, treatment, or management decisions must be made by certified healthcare professionals.
  • The platform operators do not guarantee the completeness, accuracy, or suitability of the assessment results for any specific purpose.
  • Use of the platform is at your own risk, and the operators are not liable for any direct or indirect consequences arising from the use of the results.
  • Potential risks include, but are not limited to, misinterpretation of probabilistic predictions, incomplete evaluation of model behavior, and unexpected biases in the assessment.
  • Users are responsible for ensuring compliance with applicable laws, regulations, and institutional policies when using model outputs.

By using this platform, you acknowledge that you have read, understood, and accepted this disclaimer.

  • 3 ethical aspects: privacy leakage, clinical reliability, and fairness.
  • 26 evaluation tasks: 26 datasets.
  • > 8 organs: 8 types of cancer plus pan-cancer.
  • 2 image types: whole slide images (WSIs) and regions of interest (ROIs).

Overview

Privacy Leakage

Examining whether feature embeddings generated by PFMs encode sensitive patient information, and assessing the extent to which this information can be inferred from those embeddings.

Factors: Gender, Age, Race, Institution

Datasets: 9

Results:

PFM-generated feature embeddings encode patients' demographic information.

PFM-generated feature embeddings encode patients' medical institution.
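
One common way to test whether an attribute can be inferred from embeddings is to train a simple probe on them and compare its accuracy with a majority-class baseline. The sketch below is only an illustration under stated assumptions: it uses a nearest-centroid probe and synthetic vectors in place of real PFM embeddings, and the names `nearest_centroid_probe` and `fake_embedding` are hypothetical. It is not necessarily the probe used by this platform.

```python
import random
from collections import Counter


def nearest_centroid_probe(train_x, train_y, test_x, test_y):
    """Return (probe accuracy, majority-class baseline accuracy) on the test split."""
    dims = len(train_x[0])
    counts = Counter(train_y)
    sums = {}
    # Accumulate the mean embedding per attribute value (training split only).
    for x, y in zip(train_x, train_y):
        acc = sums.setdefault(y, [0.0] * dims)
        for i, v in enumerate(x):
            acc[i] += v
    centroids = {y: [s / counts[y] for s in sums[y]] for y in sums}

    def predict(x):
        # Assign the attribute value whose centroid is closest (squared Euclidean).
        return min(
            centroids,
            key=lambda y: sum((a - b) ** 2 for a, b in zip(x, centroids[y])),
        )

    correct = sum(predict(x) == y for x, y in zip(test_x, test_y))
    majority = counts.most_common(1)[0][0]
    baseline = sum(y == majority for y in test_y)
    return correct / len(test_y), baseline / len(test_y)


# Synthetic stand-in for PFM embeddings (assumption): a hypothetical
# two-valued attribute shifts the first of eight dimensions.
rng = random.Random(0)


def fake_embedding(label):
    shift = 1.5 if label == "A" else -1.5
    return [rng.gauss(shift, 1.0)] + [rng.gauss(0.0, 1.0) for _ in range(7)]


labels = ["A", "B"] * 200
data = [fake_embedding(lab) for lab in labels]
probe_acc, baseline_acc = nearest_centroid_probe(
    data[:300], labels[:300], data[300:], labels[300:]
)
print(f"probe accuracy={probe_acc:.2f}, majority baseline={baseline_acc:.2f}")
```

A probe accuracy well above the baseline suggests that the attribute is recoverable from the embeddings; accuracy near the baseline suggests this particular probe cannot recover it.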

Clinical Reliability

Examining how non-diagnostic features affect the generalization ability and reliability of PFM-based downstream models, and evaluating their performance in out-of-distribution (OOD) settings.

Diagnostic tasks: patch classification and retrieval, WSI classification and survival analysis

Datasets: 9

Results:

Patch classification/retrieval performance declines in OOD settings.

WSI prediction performance declines in OOD settings.
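
The comparison behind these results can be sketched as scoring the same downstream model on an in-distribution split and on an OOD split, then reporting the difference. The example below is a toy illustration under stated assumptions: the one-dimensional feature, the site offset standing in for a non-diagnostic feature, and the helper names `ood_drop` and `make_split` are all hypothetical, and accuracy is used only as an example metric.

```python
import random


def accuracy(preds, labels):
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


def ood_drop(predict, id_split, ood_split):
    """Return (in-distribution accuracy, OOD accuracy, drop) for one model."""
    id_x, id_y = id_split
    ood_x, ood_y = ood_split
    id_acc = accuracy([predict(x) for x in id_x], id_y)
    ood_acc = accuracy([predict(x) for x in ood_x], ood_y)
    return id_acc, ood_acc, id_acc - ood_acc


rng = random.Random(0)


def make_split(n, site_offset):
    # Toy feature (assumption): diagnostic signal + site-specific offset + noise.
    ys = [i % 2 for i in range(n)]
    xs = [(1.0 if y else -1.0) + site_offset + rng.gauss(0.0, 0.5) for y in ys]
    return xs, ys


def predict(x):
    # Decision threshold that suits the source site, where the offset is 0.
    return int(x > 0.0)


id_acc, ood_acc, drop = ood_drop(
    predict,
    make_split(500, site_offset=0.0),  # same site as training
    make_split(500, site_offset=1.0),  # unseen site with shifted features
)
print(f"ID accuracy={id_acc:.2f}, OOD accuracy={ood_acc:.2f}, drop={drop:.2f}")
```

In this toy setup the drop comes entirely from the site offset, which carries no diagnostic information; that is the kind of non-diagnostic dependence the reliability evaluation is meant to expose.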

Fairness

Examining whether downstream models trained on PFM-derived feature embeddings keep performance disparities across demographic or institutional subgroups to a minimum.

Diagnostic tasks: patch and WSI classification

Datasets: 9

Results:

Downstream models trained on PFM-derived features may exhibit fairness issues, raising concerns about their equitable applicability in real-world clinical environments.
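
A performance disparity of this kind can be quantified by scoring predictions separately for each subgroup and reporting the largest gap. The sketch below assumes accuracy as the metric and uses made-up predictions with hypothetical subgroup names; the platform's actual fairness metrics are not specified here.

```python
from collections import defaultdict


def subgroup_accuracy(preds, labels, groups):
    """Accuracy computed separately for each subgroup."""
    hits, totals = defaultdict(int), defaultdict(int)
    for p, y, g in zip(preds, labels, groups):
        totals[g] += 1
        hits[g] += int(p == y)
    return {g: hits[g] / totals[g] for g in totals}


def max_gap(per_group):
    """Largest difference in accuracy between any two subgroups."""
    return max(per_group.values()) - min(per_group.values())


# Made-up predictions for two hypothetical institutions.
preds = [1, 0, 1, 1, 0, 1, 0, 0]
labels = [1, 0, 1, 0, 0, 1, 1, 1]
groups = ["site_1"] * 4 + ["site_2"] * 4

per_group = subgroup_accuracy(preds, labels, groups)
gap = max_gap(per_group)
print(per_group, gap)
```

Here the model is correct on 3 of 4 cases from `site_1` and 2 of 4 from `site_2`, so the gap is 0.25. The same pattern applies to demographic subgroups by swapping the `groups` list.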

Contact us

Liansheng Wang, lswang@xmu.edu.cn (Corresponding Author)

Weiping Lin, wplin@stu.xmu.edu.cn (First Contributing Author)