This portfolio project classifies satellite image tiles as agricultural or non-agricultural land using deep learning models built in both Keras/TensorFlow and PyTorch.
The work progresses from data-loading experiments to CNN baselines, then to CNN-Vision Transformer hybrid models that combine local feature extraction with global attention.
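The hybrid idea can be illustrated with a minimal PyTorch sketch (names, layer sizes, and depth here are illustrative, not the project's actual architecture): a small convolutional stem extracts local features, the resulting feature map is flattened into tokens for a Transformer encoder that applies global attention, and a linear head produces a single logit for the binary task.

```python
import torch
import torch.nn as nn

class CNNViTHybrid(nn.Module):
    """Illustrative CNN-ViT hybrid: CNN stem -> token sequence -> Transformer."""

    def __init__(self, embed_dim: int = 64, num_heads: int = 4, depth: int = 2):
        super().__init__()
        # CNN stem: local feature extraction, downsamples 64x64 input to an 8x8 grid
        self.stem = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, embed_dim, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(embed_dim, embed_dim, 3, stride=2, padding=1),
        )
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=embed_dim, nhead=num_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(encoder_layer, num_layers=depth)
        self.head = nn.Linear(embed_dim, 1)  # single logit for binary output

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        feats = self.stem(x)                       # (B, C, H', W')
        tokens = feats.flatten(2).transpose(1, 2)  # (B, H'*W', C) token sequence
        tokens = self.encoder(tokens)              # global self-attention
        return self.head(tokens.mean(dim=1))       # mean-pool tokens -> logit

model = CNNViTHybrid()
logits = model(torch.randn(2, 3, 64, 64))
print(logits.shape)  # torch.Size([2, 1])
```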
- Binary land-use classification from satellite image tiles
- 6,000-image dataset with balanced classes
- Keras and PyTorch data pipelines
- CNN baselines in both frameworks
- CNN-ViT hybrid models in both frameworks
- Cross-framework evaluation using accuracy, precision, recall, F1, ROC-AUC, loss, and confusion matrices
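The cross-framework comparison is possible because every model reports the same metric set from predicted probabilities. A hedged sketch of such a helper, using scikit-learn (the `evaluate` name and 0.5 threshold are illustrative, not the project's actual `metrics.py` API):

```python
import numpy as np
from sklearn.metrics import (accuracy_score, confusion_matrix, f1_score,
                             precision_score, recall_score, roc_auc_score)

def evaluate(y_true, y_prob, threshold: float = 0.5) -> dict:
    """Compute the shared comparison metrics from predicted probabilities."""
    y_prob = np.asarray(y_prob)
    y_pred = (y_prob >= threshold).astype(int)  # hard labels for count metrics
    return {
        "accuracy": accuracy_score(y_true, y_pred),
        "precision": precision_score(y_true, y_pred),
        "recall": recall_score(y_true, y_pred),
        "f1": f1_score(y_true, y_pred),
        "roc_auc": roc_auc_score(y_true, y_prob),  # uses probabilities, not labels
        "confusion_matrix": confusion_matrix(y_true, y_pred),
    }

metrics = evaluate([0, 1, 1, 0], [0.1, 0.9, 0.8, 0.4])
print(metrics["accuracy"])  # 1.0
```

Because both frameworks emit plain probability arrays, the same helper scores Keras and PyTorch models identically.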
The dataset contains 6,000 JPG satellite tiles:
| Class | Meaning | Images |
|---|---|---|
| class_0_non_agri | Non-agricultural land | 3,000 |
| class_1_agri | Agricultural land | 3,000 |
The project data pipeline downloads the dataset from IBM Skills Network cloud storage:
https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/4Z1fwRR295-1O3PMQBH6Dg/images-dataSAT.tar
Large local data files are kept out of Git. See data/README.md for setup notes.
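A download-and-extract step along these lines keeps the raw data local and out of Git (a sketch using only the standard library; the function names and `data/` destination are illustrative, not the project's actual pipeline code):

```python
import tarfile
import urllib.request
from pathlib import Path

DATA_URL = ("https://cf-courses-data.s3.us.cloud-object-storage"
            ".appdomain.cloud/4Z1fwRR295-1O3PMQBH6Dg/images-dataSAT.tar")

def download_archive(dest: Path = Path("data")) -> Path:
    """Fetch the tar once; skip the download if it is already on disk."""
    dest.mkdir(parents=True, exist_ok=True)
    archive = dest / "images-dataSAT.tar"
    if not archive.exists():
        urllib.request.urlretrieve(DATA_URL, archive)
    return archive

def extract_archive(archive: Path, dest: Path) -> Path:
    """Unpack the image tiles next to the archive."""
    with tarfile.open(archive) as tar:
        tar.extractall(dest)
    return dest
```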
| Model | Accuracy | Precision | Recall | F1 Score | ROC-AUC |
|---|---|---|---|---|---|
| Keras CNN | 0.9925 | 1.0000 | 0.9850 | 0.9924 | 1.0000 |
| PyTorch CNN | 0.9988 | 0.9983 | 0.9993 | 0.9988 | 1.0000 |
| Keras CNN-ViT Hybrid | 0.9958 | 0.9990 | 0.9927 | 0.9958 | 0.9998 |
| PyTorch CNN-ViT Hybrid | 0.9990 | 0.9990 | 0.9990 | 0.9990 | 1.0000 |
The PyTorch models achieved the strongest scores in these runs, with the final PyTorch CNN-ViT hybrid reaching 99.90% accuracy.
.
├── scripts/
│   ├── 01_data_loading_memory_vs_generator.py
│   ├── 02_keras_data_pipeline.py
│   ├── 03_pytorch_data_pipeline.py
│   ├── 04_keras_cnn_classifier.py
│   ├── 05_pytorch_cnn_classifier.py
│   ├── 06_keras_vs_pytorch_cnn_comparison.py
│   ├── 07_keras_cnn_vit_hybrid.py
│   ├── 08_pytorch_cnn_vit_hybrid.py
│   └── 09_final_cnn_vit_evaluation.py
├── src/
│   ├── config.py
│   ├── data_utils.py
│   ├── metrics.py
│   └── visualization.py
├── LICENSE
├── data.md
├── models.md
├── results_summary.md
└── requirements.txt
The scripts/ folder contains the project workflow as Python source code, organized from data loading through final model evaluation.
These scripts are intended for code review, search, and future refactoring.
Create an environment and install the dependencies:
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

Run or inspect the Python scripts in scripts/ for the source workflow.
This project began as an IBM/Coursera deep learning capstone sequence. I reorganized it into a portfolio-ready repository with a clear Python workflow, documented results, reusable helper modules, and GitHub-friendly handling for large data and model artifacts.
- Add a small inference script for classifying a new satellite tile.
- Export selected plots from model runs into reports/figures/.
- Add a lightweight Streamlit demo for interactive predictions.
- Track experiments with a reproducible configuration file.
This project is licensed under the Apache License 2.0. See LICENSE for the full license text.