Identifying arthropods in real time is a hard problem. First, arthropods comprise an extremely large number of species, even when the search is narrowed to the biodiversity within a specific biome. Furthermore, they come in a huge variety of forms, so classifying at a higher level, such as order, is still difficult. Consider the incredible variety within the order Coleoptera, for example.
Yet there are applications that require just that, real-time arthropod taxonomy: agriculture, biodiversity calculations, collection scanning and, more generally, anything that benefits from large-scale automation.

So here is a first try, comparing several machine learning methods and evaluating the results in terms of accuracy and speed.
For more details please see the Jupyter Notebooks: https://github.com/aot29/arthropod_taxonomy/
Dataset
A dataset of arthropod images was obtained from Kaggle (Drange, 2020). This dataset, manually compiled from iNaturalist and similar sources, contains 15376 files covering 7 taxa (orders) of arthropods:
- Araneae (spiders)
- Coleoptera (beetles)
- Diptera (flies)
- Hemiptera (cicadas, aphids, …)
- Hymenoptera (wasps, bees, …)
- Lepidoptera (butterflies, moths)
- Odonata (dragonflies, …)
This dataset is licensed as CC BY-NC-SA 4.0.
Most images were taken in the wild by multiple authors using a variety of photo equipment. Most images contain one specimen of interest, but some contain multiple specimens of the same or of different taxa. Each image comes with one or several crop regions, so I applied the crops and resized the resulting images to 300×300 pixels.
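The crop-and-resize step could look roughly like the sketch below, which uses Pillow. The box format (left, upper, right, lower in pixels) and the function name `crop_and_resize` are assumptions made for illustration; the dataset's actual annotation format may differ, and the notebooks may do this differently.

```python
from PIL import Image

TARGET_SIZE = (300, 300)  # (width, height) used in the text


def crop_and_resize(image, boxes, size=TARGET_SIZE):
    """Return one resized crop per box.

    `boxes` is assumed to hold (left, upper, right, lower) pixel
    coordinates; the dataset's real annotation format may differ.
    """
    crops = []
    for box in boxes:
        region = image.crop(box)           # cut out one specimen
        crops.append(region.resize(size))  # stretch the crop to 300x300
    return crops


# Usage with a synthetic image standing in for a dataset photo.
photo = Image.new("RGB", (640, 480), color=(30, 120, 60))
specimens = crop_and_resize(photo, [(10, 20, 210, 170), (300, 100, 600, 400)])
print([s.size for s in specimens])
```

Note that resizing a non-square crop to a square distorts its aspect ratio; padding the crop first would be an alternative.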
Baseline and Support Vector Machine
As a baseline, I chose the most-common-class baseline, which always predicts the most frequent class. Since there are seven classes (arthropod orders) with roughly the same image count in each, this baseline is right about 16% of the time (0.1581 in Table 1), only slightly above the 1/7 ≈ 14% expected from uniform random guessing. Any model has to do better than that.
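The most-common-class baseline can be sketched in a few lines of plain Python. The label counts below are made up for illustration and are not the real dataset counts; the function name is hypothetical as well.

```python
from collections import Counter


def most_common_class_accuracy(train_labels, test_labels):
    """Accuracy obtained by always predicting the most frequent training label."""
    majority_label, _ = Counter(train_labels).most_common(1)[0]
    hits = sum(1 for label in test_labels if label == majority_label)
    return majority_label, hits / len(test_labels)


# Toy labels (not the real dataset): 7 orders with slightly uneven counts.
orders = ["Araneae", "Coleoptera", "Diptera", "Hemiptera",
          "Hymenoptera", "Lepidoptera", "Odonata"]
counts = [16, 14, 14, 14, 14, 14, 14]
labels = [order for order, n in zip(orders, counts) for _ in range(n)]

label, accuracy = most_common_class_accuracy(labels, labels)
print(label, round(accuracy, 4))  # prints: Araneae 0.16
```

scikit-learn's `DummyClassifier(strategy="most_frequent")` implements the same idea behind the usual `fit`/`score` interface.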
The first model I tried was a Support Vector Machine (SVM). This is a comparatively simple statistical model that takes flat feature vectors as input, so the images were flattened. The model was fitted with 90% of the data and tested on the remaining 10%. The average accuracy was 24.14%, not much better than the baseline, and prediction was also very slow. A more sophisticated model is required.
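A minimal sketch of the flatten, split and fit steps with scikit-learn follows. It runs on small random arrays rather than the real images, and the kernel and other SVM settings are assumptions, since the text does not state them.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Synthetic stand-in for the image set: 70 tiny RGB images, 7 classes.
# (Real images would be 300x300x3, i.e. 270,000 features each.)
images = rng.random((70, 16, 16, 3))
labels = np.repeat(np.arange(7), 10)

# Flatten each image into a single feature vector.
features = images.reshape(len(images), -1)

# 90% of the data for fitting, 10% held out for testing.
x_train, x_test, y_train, y_test = train_test_split(
    features, labels, test_size=0.1, random_state=0
)

model = SVC()  # default RBF kernel; the kernel used in the post is an unknown
model.fit(x_train, y_train)
score = model.score(x_test, y_test)
print("test accuracy:", score)
```

On random pixels the accuracy is meaningless; the point is only the shape of the pipeline. With 270,000 features per image, both fitting and prediction become expensive, which is consistent with the slow timings reported.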
Convolutional Neural Network (CNN)
A CNN is a neural network that extracts features from an image by sliding a small window over the image. I applied a simple CNN consisting of 2 convolutional layers, 2 max pooling layers and 1 dense layer to the dataset, and tested it on the same test data as the previous model. The average accuracy was 18.31% (Table 1), not much better than the baseline. The CNN appears very confused by arthropod variability!
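A network with that layer count could be written in Keras roughly as follows. Filter counts, kernel sizes, the rescaling step and the optimizer are assumptions chosen for illustration; only the number and type of the convolutional, pooling and dense layers comes from the text.

```python
import tensorflow as tf
from tensorflow.keras import layers


def build_simple_cnn(input_shape=(300, 300, 3), num_classes=7):
    """2 convolutional layers, 2 max pooling layers, 1 dense layer."""
    return tf.keras.Sequential([
        tf.keras.Input(shape=input_shape),
        layers.Rescaling(1.0 / 255),              # assumed: scale pixels to [0, 1]
        layers.Conv2D(16, 3, activation="relu"),  # assumed filter count and kernel size
        layers.MaxPooling2D(),
        layers.Conv2D(32, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Flatten(),
        layers.Dense(num_classes, activation="softmax"),  # one output per order
    ])


model = build_simple_cnn()
model.compile(
    optimizer="adam",
    loss="sparse_categorical_crossentropy",  # expects integer class labels
    metrics=["accuracy"],
)
print(model.output_shape)
```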
Transfer learning from MobileNetV3Small
Therefore, a more sophisticated method is required. Transfer learning is a machine learning technique where a pre-trained model, here MobileNetV3Small (Howard et al., 2019), is used to extract features from the image. The advantage is that the pre-trained model was fitted using millions of internet images (cats, dogs, cars, …), and so already knows quite a bit about the world before it sees its first arthropod. I built a small neural network where the pre-trained model is the first layer, followed by a pooling layer, a normalization layer and 2 dense layers (not in that order), implemented using Keras (Chollet et al., 2015). The average accuracy on the same 10% test data was 63%. Not terrific, but much better.
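As a rough sketch, such a model could be assembled in Keras as shown below. The layer order, the choice of `BatchNormalization` as the normalization layer, the hidden layer width and the decision to freeze the pre-trained weights are all assumptions; the text only lists the layer types.

```python
import tensorflow as tf
from tensorflow.keras import layers


def build_transfer_model(num_classes=7, input_shape=(300, 300, 3), weights="imagenet"):
    """MobileNetV3Small as feature extractor plus a small classification head."""
    base = tf.keras.applications.MobileNetV3Small(
        input_shape=input_shape,
        include_top=False,   # drop the ImageNet classifier, keep the features
        weights=weights,     # "imagenet" downloads pre-trained weights
    )
    base.trainable = False   # assumed: keep the pre-trained weights fixed

    inputs = tf.keras.Input(shape=input_shape)
    x = base(inputs, training=False)
    x = layers.GlobalAveragePooling2D()(x)        # pooling layer
    x = layers.BatchNormalization()(x)            # normalization layer (assumed type)
    x = layers.Dense(128, activation="relu")(x)   # assumed width
    outputs = layers.Dense(num_classes, activation="softmax")(x)
    return tf.keras.Model(inputs, outputs)
```

Calling `build_transfer_model()` fetches the ImageNet weights on first use; passing `weights=None` builds the same architecture with random weights, which is useful for checking shapes offline but is no longer transfer learning.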
I tried to improve the accuracy of the transfer learning model by adding an augmentation layer, i.e. a layer that serves slightly changed images at every epoch. Two random changes, drawn from mirroring, skewing, saturation, sharpness, flipping etc., are applied. However, the accuracy did not improve: at 61.35% it was slightly below the 62.92% obtained without augmentation.
Table 1. Average accuracy and average prediction time per image for each model.

model | average accuracy | average time per image |
---|---|---|
Dummy | 0.1581 | 0.000003 |
Support Vector Machine | 0.2414 | 4.365616 |
Convolutional NN | 0.1831 | 0.042676 |
Transfer MobileNet | 0.6292 | 0.042685 |
Transfer Augmented | 0.6135 | 0.042758 |
Speed
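So what about speed? Prediction speed, i.e. the time needed to predict the arthropod order when running the trained model (not the training time), was measured as the average per image over 10 runs of 1000 images randomly selected from the test dataset.
Here too, transfer learning holds up well (Table 1): it predicts about as fast as the simple CNN and roughly 100 times faster than the Support Vector Machine, while being far more accurate.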
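The timing procedure described above can be sketched as follows. The helper name, sampling without replacement, and passing each sample to the model in a single call are assumptions; `predict` stands for any function that takes a batch of images, for example a Keras model's `predict` method.

```python
import time

import numpy as np


def average_time_per_image(predict, test_images, runs=10, sample_size=1000, seed=0):
    """Mean prediction time per image, in seconds, averaged over several runs."""
    rng = np.random.default_rng(seed)
    per_image = []
    for _ in range(runs):
        # Draw a random sample of test images for this run.
        idx = rng.choice(len(test_images), size=sample_size, replace=False)
        batch = test_images[idx]
        start = time.perf_counter()
        predict(batch)                       # only prediction is timed
        elapsed = time.perf_counter() - start
        per_image.append(elapsed / sample_size)
    return float(np.mean(per_image))


def dummy_predict(batch):
    """Stand-in for a real model: one number per image."""
    return batch.mean(axis=(1, 2, 3))


# Usage with placeholder data instead of the real test set.
test_images = np.zeros((1500, 8, 8, 3), dtype=np.float32)
seconds = average_time_per_image(dummy_predict, test_images)
print(f"{seconds:.9f} s per image")
```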
Improving the results
The accuracy obtained is not very high; the speed, however, is OK in my opinion. To improve the accuracy, a larger training set would be required. The present dataset could be augmented using TensorFlow’s data augmentation functions. Additionally, there are several other pre-trained models that could be tested, although some might require a larger computer!
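One possible way to apply TensorFlow's augmentation functions to a dataset is sketched below with `tf.image` and `tf.data`. The particular transformations and their parameter ranges are assumptions picked for illustration, and the random tensors stand in for real images scaled to [0, 1].

```python
import tensorflow as tf


def augment(image):
    """Apply a few random changes to one image with values in [0, 1]."""
    image = tf.image.random_flip_left_right(image)                   # mirroring
    image = tf.image.random_saturation(image, lower=0.8, upper=1.2)  # assumed range
    image = tf.image.random_contrast(image, lower=0.8, upper=1.2)    # assumed range
    return tf.clip_by_value(image, 0.0, 1.0)                         # keep valid pixel range


# Placeholder images instead of the real dataset.
images = tf.random.uniform((4, 64, 64, 3))
dataset = tf.data.Dataset.from_tensor_slices(images)
augmented = dataset.map(augment)  # a different random variant on every pass

batch = next(iter(augmented.batch(4)))
print(batch.shape)
```

Because the transformations are random, iterating over `augmented` again yields different variants of the same images, which effectively enlarges the training set without collecting new photos.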
References
Chollet, F. et al. (2015). Keras. https://keras.io
Drange, G. (2020). Arthropod Taxonomy Orders Object Detection Dataset. Kaggle. https://doi.org/10.34740/KAGGLE/DSV/1240192
Howard, A., Sandler, M., Chu, G., Chen, L. C., Chen, B., Tan, M., … & Adam, H. (2019). Searching for MobileNetV3. In Proceedings of the IEEE/CVF International Conference on Computer Vision (pp. 1314-1324).