The Image Seed Data Set contains 276 real X-ray images of wheat kernels from three species: Kama (72 samples), Canadian (96 samples), and Rosa (108 samples). The images were obtained using a short-focus X-ray apparatus (Elektronika 25), digitized, and standardized by centering each grain image in a black-padded canvas of uniform size.
The classes are imbalanced (72, 96, and 108 samples), as is common in real-world data. To support robust evaluation, the dataset includes five cross-validation (CV) splits. Each split uses approximately 80% of the images for training and 20% for testing (220 or 221 training images and 56 or 55 test images). The test sets are disjoint across folds, so every image appears in the test set of exactly one fold.
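
The disjoint-test-set property can be illustrated with a minimal sketch. This is not how the published splits were generated, and it does not reproduce their per-class counts; the function name `make_disjoint_folds`, the seed, and the round-robin assignment are assumptions made purely for illustration. For actual experiments, the splits distributed with the dataset should be used.

```python
import random


def make_disjoint_folds(n_samples, n_folds=5, seed=0):
    """Partition sample indices into n_folds disjoint test sets.

    Hypothetical helper: returns a list of (train_indices, test_indices)
    pairs in which every index is a test index in exactly one fold.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices round-robin so fold sizes differ by at most one.
    test_sets = [indices[k::n_folds] for k in range(n_folds)]
    folds = []
    for test in test_sets:
        held_out = set(test)
        train = [i for i in indices if i not in held_out]
        folds.append((sorted(train), sorted(test)))
    return folds


# 276 is the total number of images in the dataset.
folds = make_disjoint_folds(276)
for k, (train, test) in enumerate(folds):
    print(f"fold {k}: {len(train)} train, {len(test)} test")
```

With 276 images and five folds, each test set holds roughly 20% of the data, which matches the 80/20 proportion described above.
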
The following table shows the number of training and test samples per class across all five folds:
| Class | Set | Fold 0 | Fold 1 | Fold 2 | Fold 3 | Fold 4 |
|---|---|---|---|---|---|---|
| Kama | Train | 57 | 56 | 59 | 55 | 61 |
| Kama | Test | 15 | 16 | 13 | 17 | 11 |
| Canadian | Train | 81 | 77 | 69 | 84 | 73 |
| Canadian | Test | 15 | 19 | 27 | 12 | 23 |
| Rosa | Train | 83 | 88 | 93 | 82 | 86 |
| Rosa | Test | 25 | 20 | 15 | 26 | 22 |
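
As a quick sanity check, the counts in the table above can be verified arithmetically: within each fold, the training and test counts of a class should add up to the class total, and the test counts of a class summed over all folds should also equal the class total, since the test sets are disjoint. The sketch below transcribes the numbers from the table; the names `CLASS_TOTALS`, `TRAIN`, `TEST`, and `check_split_counts` are illustrative and not part of the dataset.

```python
# Per-class counts transcribed from the table above (folds 0 to 4).
CLASS_TOTALS = {"Kama": 72, "Canadian": 96, "Rosa": 108}
TRAIN = {
    "Kama": [57, 56, 59, 55, 61],
    "Canadian": [81, 77, 69, 84, 73],
    "Rosa": [83, 88, 93, 82, 86],
}
TEST = {
    "Kama": [15, 16, 13, 17, 11],
    "Canadian": [15, 19, 27, 12, 23],
    "Rosa": [25, 20, 15, 26, 22],
}


def check_split_counts(totals, train, test):
    """Validate the table's internal consistency and return per-fold test sizes."""
    for name, total in totals.items():
        # Within each fold, train and test together must cover the whole class.
        for tr, te in zip(train[name], test[name]):
            assert tr + te == total, (name, tr, te)
        # Disjoint test sets that cover all images must sum to the class total.
        assert sum(test[name]) == total, name
    n_folds = len(next(iter(test.values())))
    return [sum(test[name][k] for name in totals) for k in range(n_folds)]


fold_test_sizes = check_split_counts(CLASS_TOTALS, TRAIN, TEST)
n_images = sum(CLASS_TOTALS.values())
test_fractions = [size / n_images for size in fold_test_sizes]
print(fold_test_sizes)
print([round(f, 3) for f in test_fractions])
```

The per-class test counts vary noticeably between folds (for example, Canadian ranges from 12 to 27), so per-class metrics may be worth reporting alongside overall accuracy.
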
The following resources will be made available after the official article publication:
If you use the dataset, please cite the following works:
@article{KOWALSKI_seeds_2025:MONA,
author = {Kowalski, Piotr A. and Jeczmionek, Ernest and Charytanowicz, Malgorzata and {\L}ukasik, Szymon and Kulczycki, Piotr},
doi = {10.1007/s11036-025-02479-0},
issn = {1572-8153},
journal = {Mobile Networks and Applications},
title = {Seeds Image -- Introduction and Baseline Experiments with the New Labeled Benchmark for Machine Learning Tasks},
url = {https://doi.org/10.1007/s11036-025-02479-0},
year = {2025},
}
@InProceedings{kowalski2025classification,
author = {Kowalski, Piotr A. and Jeczmionek, Ernest and Charytanowicz, Malgorzata and {\L}ukasik, Szymon and Niewczas, Jerzy and Kulczycki, Piotr},
editor = {Perakovic, Dragan and Knapcikova, Lucia},
title = {Classification of Wheat Species Using Convolutional Neural Networks: A Comparative Study},
booktitle = {Future Access Enablers for Ubiquitous and Intelligent Infrastructures},
year = {2025},
publisher = {Springer Nature Switzerland},
address = {Cham},
pages = {3--8},
isbn = {978-3-031-72393-3}
}