Multi-task learning pipeline for prostate cancer classification using the SICAPv2 dataset.
- Multi-label classification for mixed Gleason patterns
- Multi-resolution processing (224x224 + 384x384)
- Transformer-based feature fusion
- Specialized cribriform detection
- Stain normalization
The SICAPv2 dataset is located in data/SICAPv2/ with the following structure:
data/SICAPv2/
├── images/ # 18,783 histology patch images (512x512, 10X magnification)
├── masks/ # Corresponding annotation masks
├── partition/
│ ├── Test/ # Train/test split files
│ └── Validation/ # Cross-validation folds (Val1-Val4)
├── wsi_labels.xlsx # WSI-level Gleason scores
└── readme.txt # Dataset documentation
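
Before training, it can help to confirm that the dataset was unpacked into the layout shown above. The following is a minimal sketch, not part of the repository: the helper name `missing_entries` and the constant `EXPECTED_ENTRIES` are hypothetical, and the root path `data/SICAPv2` is assumed to be relative to the directory you run it from.

```python
from pathlib import Path

# Entries copied from the directory tree above. The names of this constant
# and of the helper below are illustrative, not taken from the repository.
EXPECTED_ENTRIES = [
    "images",
    "masks",
    "partition/Test",
    "partition/Validation",
    "wsi_labels.xlsx",
    "readme.txt",
]


def missing_entries(root):
    """Return the expected entries that do not exist under `root`."""
    root = Path(root)
    return [entry for entry in EXPECTED_ENTRIES if not (root / entry).exists()]


if __name__ == "__main__":
    # Assumes the script is run from the repository root.
    missing = missing_entries("data/SICAPv2")
    if missing:
        print("Missing dataset entries:", ", ".join(missing))
    else:
        print("Dataset layout looks complete.")
```

The check only tests for existence; it does not verify the number of images or the contents of the partition files.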
- Set up a virtual environment:
    python3 -m venv venv
    source venv/bin/activate
    pip install -r requirements.txt
- Test the setup:
    python test_imports.py
- Run training:
    python main.py

- Total samples: 18,783 patches (512×512 pixels)
- Magnification: 10X
- Labels: Multi-label classification (NC, G3, G4, G5, G4C)
- Cribriform detection: Specialized detection for G4C patterns
- Cross-validation: 4-fold patient-based splits available
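
Because a patch can contain more than one Gleason pattern, each patch label can be represented as a multi-hot vector over the five classes listed above. The sketch below illustrates that encoding only. The function name `encode_labels` and the class-to-index order are assumptions, and the pipeline's actual label format may differ.

```python
# Class names come from the label list above. Their order (and therefore
# the index of each class in the vector) is an assumption for illustration.
CLASSES = ["NC", "G3", "G4", "G5", "G4C"]


def encode_labels(present):
    """Turn a collection of class names into a multi-hot list of 0/1 values."""
    present = set(present)
    unknown = present - set(CLASSES)
    if unknown:
        raise ValueError(f"Unknown class names: {sorted(unknown)}")
    return [1 if name in present else 0 for name in CLASSES]


# A patch that shows both Gleason 3 and Gleason 4 patterns.
mixed_patch = encode_labels(["G3", "G4"])
print(mixed_patch)  # [0, 1, 1, 0, 0]
```

With this representation, a model would typically emit one independent score per class instead of a single softmax over all classes, which is what allows mixed patterns to be predicted together.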
Results and model checkpoints are saved to the SICAPv2_results/ directory.