datasets.simula.no

A collection of open datasets published by Simula Research Laboratory and SimulaMet.

Currently, we have published the following datasets:

Medical and Biology Datasets

  • Cellular, A cell autophagy dataset. [project]
  • Depresjon, The Depresjon Dataset. [publication | project]
  • GastroVision, A multicenter dataset. [publication | project]
  • HTAD, A Home-Tasks Activities Dataset with Wrist-accelerometer and Audio Features. [publication | project]
  • HYPERAKTIV, A Motor Activity Database of Patients with ADHD. [publication | project]
  • HyperKvasir, The Largest Gastrointestinal Dataset. [publication | project]
  • Kvasir, A Multi-Class Image-Dataset for Computer Aided Gastrointestinal Disease Detection. [publication | project]
  • Kvasir Capsule, The largest gastrointestinal PillCam dataset. [publication | project]
  • Kvasir Instrument, A gastrointestinal instrument Dataset. [publication | project]
  • Kvasir SEG, Segmented Polyp Dataset for Computer Aided Gastrointestinal Disease Detection. [publication | project]
  • Kvasir-VQA, A Text-Image Pair GI Tract Dataset. [publication | project]
  • Kvasir-VQA-x1, A Large-Scale Multi-Task Benchmark for GI Tract Visual Question Answering. [publication | project]
  • KvasirCapsule SEG, A Capsule Endoscopy Segmentation Dataset. [publication | project]
  • MedMultiPoints, A Multimodal Dataset for Object Detection, Localization, and Counting in Medical Imaging. [publication | project]
  • Medico Multimedia - VISEM Tracking, A sperm tracking dataset. [publication | project]
  • Nerthus, A Bowel Preparation Quality Video Dataset. [publication | project]
  • Psykose, A Motor Activity Database of Patients with Schizophrenia. [publication | project]
  • VISEM, A Multimodal Video Dataset of Human Spermatozoa. [publication | project]
  • VISEM QC, A sperm quality control dataset. [project]

Sport and Activity Datasets

  • Alfheim, Soccer video and player position dataset. [publication | project]
  • Arx, A Text-Classification Dataset Consisting of Norwegian Soccer Articles from VG and TV2. [publication | project]
  • ExposureEngine, Oriented Logo Detection and Sponsor Visibility Analytics in Sports Broadcasts. [project]
  • Heimdallr, A Dataset For Sport Analysis. [project]
  • HockeyAI, A Multi-Class Ice Hockey Dataset for Object Detection. [publication | project]
  • HockeyOrient, A Dataset for Ice Hockey Player Orientation Classification. [publication | project]
  • HockeyRink, A Dataset for Precise Ice Hockey Rink Keypoint Mapping and Analytics. [publication | project]
  • PMData, A lifelogging dataset of 16 persons during 5 months using Fitbit, Google Forms and PMSys. [publication | project]
  • ScopeSense, An 8.5-month sport, nutrition, and lifestyle lifelogging dataset. [project]
  • Soccer Summarization, Soccer game captions and summary in English for game summarization. [publication | project]
  • SoccerChat, A Multimodal Video-Text Dataset for Natural Language Soccer Game Understanding. [publication | project]
  • SoccerMon, Subjective and objective data collected over two years from two different elite women's soccer teams. [project]
  • SoccerNet-Echoes, A Soccer Game Audio Commentary Dataset. [publication | project]
  • SoccerSum, The SoccerSum Dataset for Automated Detection, Segmentation, and Tracking of Objects on the Soccer Pitch. [publication | project]
  • TACDEC, Dataset of Tackle Events in Soccer Game Videos. [publication | project]

Other Datasets

  • Anarchy Online, Server-side Network Traffic from Anarchy Online: Analysis, Statistics and Applications. [publication | project]
  • European Cloud Cover, A dataset containing reanalysis data from ERA5 and satellite retrievals from Meteosat Second Generation. [publication | project]
  • Eye Tracker, A Serious Game Based Dataset. [publication | project]
  • HSDPA, HSDPA-bandwidth logs for mobile HTTP streaming scenarios. [publication | project]
  • Image Sentiment, A dataset for image sentiment analysis. [publication | project]
  • Njord, A fishing boat dataset. [project]
  • Right Inflight, A Dataset for Exploring the Automatic Prediction of Movies Suitable for a Watching Situation. [project]
  • THREAT, A Large Annotated Corpus for Detection of Violent Threats. [project]
  • Toadstool, A Dataset for Training Emotional and Intelligent Machines Playing Super Mario Bros. [publication | project]
  • WICO Graph Dataset, A Labeled Dataset of Twitter Subgraphs based on Conspiracy Theory and 5G-Corona Misinformation Tweets. [publication | project]
  • WICO Text, A labeled dataset of conspiracy theory and 5G-corona misinformation tweets. [publication | project]

How to contribute

Datasets are added via pull request. See CONTRIBUTING.md for the full walkthrough.

Contact

If you have any questions or need assistance, please open an issue in the repository or contact steven@simula.no.
