The table below lists the tasks AgentFly has been trained and evaluated on, with a link to the training report for each. The shipped training scripts live under `examples/train_scripts/`; to start a new task, copy the closest match and adapt it (see Build Your Own Task).
| Task | Model | Report | Status |
|---|---|---|---|
| SearchR1 | Qwen2.5 | report | ✅ |
| WebShop | Qwen2.5 | report | ✅ |
| ScienceWorld | Qwen3-4B-Instruct | report | ✅ |
| SWE | Qwen3-32B | report | ✅ (ongoing) |
| SimuScene | SFT DeepSeek-R1-Distill-Qwen | report | ✅ |
Training curves and metrics are logged to WandB for each experiment.