Conference: Symposium “Hörvermögen von Pinguinen / Hearing in Penguins”, 153rd Annual Meeting of the Deutsche Ornithologen-Gesellschaft (German Ornithologists' Society), 19 and 20 September 2020. Full presentation DOI: 10.13140/RG.2.2.33907.14883
Storing a taxonomic tree in a relational database
Taxonomic trees are ubiquitous in biodiversity software. A very common application is using a tree to let users browse the data. Other applications include training classification models, curating a collection, and visualizing research results.
Data is often stored in a relational database, such as MySQL. Unfortunately, relational databases are not particularly well suited for storing tree structures. Yet the choice of a database may be guided by more important requirements, and so the taxonomic tree is sometimes implemented as an afterthought. The result can be a structure that is difficult to query, takes more work than expected to maintain, and ultimately gives the end user a less satisfying experience.
I will show some counterexamples, and how to get a better result by using a data structure called “nested set”.
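As a rough illustration of the nested set idea, the sketch below stores each taxon with a left and a right bound, so that all descendants of a node fall strictly between its bounds. It uses Python's built-in SQLite as a stand-in for MySQL; the table name `taxon`, the column names `lft` and `rgt`, and the helper functions are assumptions made for this example, not the schema from the full post.

```python
import sqlite3

# In-memory SQLite database, used here only as a stand-in for MySQL.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE taxon ("
    "name TEXT PRIMARY KEY, lft INTEGER NOT NULL, rgt INTEGER NOT NULL)"
)

# Bounds come from a depth-first walk: lft is assigned on the way down,
# rgt on the way back up. A leaf always has rgt = lft + 1.
rows = [
    ("Aves", 1, 14),
    ("Sphenisciformes", 2, 7),
    ("Spheniscidae", 3, 6),
    ("Aptenodytes", 4, 5),
    ("Strigiformes", 8, 13),
    ("Strigidae", 9, 12),
    ("Strix", 10, 11),
]
conn.executemany("INSERT INTO taxon VALUES (?, ?, ?)", rows)


def descendants(conn, name):
    """Return all taxa below `name`, using a single non-recursive query."""
    query = """
        SELECT child.name
        FROM taxon AS parent
        JOIN taxon AS child
          ON child.lft > parent.lft AND child.rgt < parent.rgt
        WHERE parent.name = ?
        ORDER BY child.lft
    """
    return [row[0] for row in conn.execute(query, (name,))]


def ancestors(conn, name):
    """Return the path from the root down to the parent of `name`."""
    query = """
        SELECT anc.name
        FROM taxon AS node
        JOIN taxon AS anc
          ON anc.lft < node.lft AND anc.rgt > node.rgt
        WHERE node.name = ?
        ORDER BY anc.lft
    """
    return [row[0] for row in conn.execute(query, (name,))]


print(descendants(conn, "Strigiformes"))
print(ancestors(conn, "Aptenodytes"))
```

The trade-off of this layout is that reads such as subtree or breadcrumb queries need no recursion, while inserting or moving a taxon requires renumbering the bounds of the nodes that follow it.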
Computing α-diversity
Diversity indices are a common descriptive statistic used in biodiversity informatics. Diversity indices typically express the species richness of a given habitat or area. The α-diversity index is suitable when studying a single habitat and is expressed by a single number. Several equations are commonly used to compute α-diversity. In this example, I will be using Simpson’s diversity index, which is computed by the formula:
Where S is the number of species in the sample and p is the proportion of a particular species. Simpson’s diversity index is thus more influenced by common species than by rare ones and is often considered to be an index reflecting the actual species diversity in a sample.
To illustrate this, I will use data obtained from GBIF. Remember, α-diversity is suitable for expressing the diversity within a single habitat, so I will obtain data accordingly. Here I chose the Tiergarten, a large (210-hectare) park in central Berlin.
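Several variants of Simpson's index are in use. As a minimal sketch, the Python snippet below assumes the form D = Σ p², summed over the S species, together with its complement 1 − D (often called the Gini-Simpson index); the form used in the full post may differ. The function names and the species counts are invented for illustration.

```python
def simpson_index(counts):
    """Sum of squared proportions, D = sum(p_i ** 2), for a list of counts.

    Assumption: this is one common form of Simpson's index. Values near 1
    mean the sample is dominated by a single species.
    """
    total = sum(counts)
    if total == 0:
        raise ValueError("the sample contains no individuals")
    return sum((n / total) ** 2 for n in counts)


def gini_simpson(counts):
    """Complement 1 - D, which increases as diversity increases."""
    return 1 - simpson_index(counts)


# Hypothetical counts of individuals per species in one sample.
sample = {"Turdus merula": 40, "Parus major": 35, "Strix aluco": 5}
counts = list(sample.values())

print(simpson_index(counts))
print(gini_simpson(counts))
```

Because each proportion is squared, the two common species contribute almost all of D in this sample, while the rare species barely changes the result.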
Ecological datasets in Python
Datasets included in library distributions are very practical for explaining concepts and for tutorials, as no extra download is required. A while ago, I posted a list of biodiversity datasets that come with R-core. Here I continue along the same lines and list datasets that ship with popular Python libraries.
Comparing the distribution of Corvus corone and Corvus cornix
In this post, I will use a divergent color scale to plot two distributions on the same map. As an example, I chose to plot the European distribution of two species of corvids: the carrion crow (Corvus corone) and the hooded crow (Corvus cornix). There have been some adjustments to the taxonomic status of the hooded crow (see Parkin et al., 2003 for details); however, they are currently regarded as different species.
In this map, I will use a divergent color scale to show areas in Europe where each species is dominant, and also show areas where both species are present.
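One way to feed a divergent color scale, sketched here in Python under my own assumptions, is to reduce the two counts in each map cell to a single signed value between -1 and 1 and interpolate colors from a neutral midpoint towards one end color per species. The `dominance` score, the function names and the RGB values are illustrative choices, not the method of the full post.

```python
def dominance(n_corone, n_cornix):
    """Signed score in [-1, 1]: +1 only C. corone, -1 only C. cornix.

    Returns None for a cell without any occurrences, so that empty cells
    can be left blank instead of being drawn in the midpoint color.
    """
    total = n_corone + n_cornix
    if total == 0:
        return None
    return (n_corone - n_cornix) / total


# Illustrative end and midpoint colors as RGB triples (assumed values).
LOW = (33, 102, 172)     # C. cornix dominates
MID = (247, 247, 247)    # both species about equally common
HIGH = (178, 24, 43)     # C. corone dominates


def diverging_color(value):
    """Interpolate linearly from MID towards LOW or HIGH."""
    value = max(-1.0, min(1.0, value))
    end = HIGH if value >= 0 else LOW
    weight = abs(value)
    return tuple(round(m + weight * (e - m)) for m, e in zip(MID, end))


print(diverging_color(dominance(30, 10)))
print(diverging_color(dominance(5, 45)))
```

Cells where both species occur in similar numbers end up close to the neutral midpoint, which is what makes the zone of overlap visible on the map.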
Simple distribution maps using ggplot
In a previous post, I discussed how to plot GBIF occurrence data using OpenStreetMap. Here, I will plot a distribution map. Distribution maps differ from occurrence maps in that occurrences are aggregated and plotted as a heat map. Additionally, the map has to be projected using an equal-area projection.
I will illustrate these two features by plotting the distribution of the tawny owl (Strix aluco) in Europe.
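The two features can be sketched in a few lines of Python, independent of ggplot. The example assumes the Lambert cylindrical equal-area projection on a spherical Earth, chosen only because its formula is short (the full post may well use a different equal-area projection), and then counts occurrences per grid cell. The function names, cell size and coordinates are assumptions for illustration.

```python
import math
from collections import Counter

EARTH_RADIUS_KM = 6371.0  # mean radius, spherical approximation


def project_equal_area(lon_deg, lat_deg):
    """Lambert cylindrical equal-area projection: x = R*lon, y = R*sin(lat)."""
    x = EARTH_RADIUS_KM * math.radians(lon_deg)
    y = EARTH_RADIUS_KM * math.sin(math.radians(lat_deg))
    return x, y


def bin_occurrences(points, cell_km=100.0):
    """Count occurrences per grid cell in projected coordinates.

    Because the projection preserves area, every cell covers the same
    surface on the ground, so counts are comparable between latitudes.
    """
    counts = Counter()
    for lon, lat in points:
        x, y = project_equal_area(lon, lat)
        cell = (math.floor(x / cell_km), math.floor(y / cell_km))
        counts[cell] += 1
    return counts


# Hypothetical occurrences as (longitude, latitude) pairs.
points = [(13.40, 52.52), (13.41, 52.52), (2.35, 48.86)]
heat = bin_occurrences(points)
print(heat.most_common())
```

The resulting counts per cell are the values a heat map would color.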
Facets and time frames: plotting the migration of the stork
In a previous post, I discussed how to plot occurrence data from GBIF on a map. In this post, I will discuss how to plot a bird migration by producing an occurrence map for each month of the year. I will use the migration of the stork (Ciconia ciconia) as an example.
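The data preparation behind such a faceted plot amounts to splitting the occurrences into twelve groups, one per month. A minimal Python sketch follows; the record layout with the keys `date`, `lon` and `lat` is hypothetical, and so are the example records.

```python
from datetime import date


def group_by_month(records):
    """Split occurrence records into one list of coordinates per month.

    All twelve months are present in the result, so that months without
    observations still produce an (empty) panel.
    """
    panels = {month: [] for month in range(1, 13)}
    for record in records:
        month = date.fromisoformat(record["date"]).month
        panels[month].append((record["lon"], record["lat"]))
    return panels


# Hypothetical occurrence records.
records = [
    {"date": "2019-01-15", "lon": 31.2, "lat": 12.8},
    {"date": "2019-06-02", "lon": 13.4, "lat": 52.5},
    {"date": "2019-06-20", "lon": 21.0, "lat": 52.2},
]

panels = group_by_month(records)
for month, coords in panels.items():
    print(month, len(coords))
```

Each entry of `panels` then corresponds to one facet of the final figure.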
Plotting GBIF occurrence data on a map using OpenStreetMap
In a previous post, I discussed how to get occurrence data from the Global Biodiversity Information Facility (GBIF). For my current project at the Natural History Museum in Berlin, I work on penguins. In this post, I will plot occurrences of penguin species on a map. Occurrence maps show the geographical position of occurrences.
Getting occurrence data from GBIF
The Global Biodiversity Information Facility (GBIF) is a data aggregator for biodiversity data. The big advantage of using an aggregator like GBIF over getting data directly from the original data source is that an aggregator provides a single point of entry to many data sets, so that any one data set is technically interoperable with any other.
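As a minimal sketch of what the single point of entry looks like in practice, the Python snippet below builds a request for GBIF's public occurrence search endpoint using only the standard library. The helper names `build_search_url` and `fetch_page` are my own, and the full post may use a dedicated client library instead.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

GBIF_SEARCH = "https://api.gbif.org/v1/occurrence/search"


def build_search_url(scientific_name, limit=20, offset=0):
    """Build an occurrence search URL for one species.

    hasCoordinate=true restricts the results to georeferenced records,
    which is what a map needs.
    """
    params = {
        "scientificName": scientific_name,
        "hasCoordinate": "true",
        "limit": limit,
        "offset": offset,
    }
    return GBIF_SEARCH + "?" + urlencode(params)


def fetch_page(scientific_name, limit=20, offset=0):
    """Download and decode one page of results (requires network access)."""
    url = build_search_url(scientific_name, limit, offset)
    with urlopen(url, timeout=30) as response:
        return json.load(response)


print(build_search_url("Strix aluco", limit=5))
```

Calling `fetch_page("Strix aluco")` should return a JSON object whose `results` list holds the occurrence records; increasing `offset` by `limit` steps through further pages.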
Spectrograms
Spectrograms are a common visualization of sound data. Visualizing sound data can be useful when giving a presentation or for publication. Additionally, machine learning algorithms for classifying sound data generally use spectrograms as their starting point, instead of the sound data itself, as many advanced algorithms for classifying images are readily available. The example uses the R packages warbleR (Araya-Salas & Smith-Vidaurre, 2017), seewave (Sueur, Aubin & Simonis, 2008) and tuneR (Ligges et al., 2018).
This example draws the spectrogram of the call of a tawny owl (Strix aluco).
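To show what a spectrogram contains, here is a small Python sketch of the underlying computation: the signal is cut into overlapping frames, each frame is tapered with a Hann window, and the magnitude of its discrete Fourier transform is kept. This is not the API of warbleR, seewave or tuneR; the function name, frame size and the synthetic test tone are assumptions for illustration, and the naive transform is only practical for short signals.

```python
import cmath
import math


def spectrogram(samples, frame_size=64, hop=32):
    """Return a list of frames, each a list of magnitudes per frequency bin."""
    # Hann window to reduce spectral leakage at the frame edges.
    window = [
        0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_size - 1))
        for n in range(frame_size)
    ]
    frames = []
    for start in range(0, len(samples) - frame_size + 1, hop):
        chunk = samples[start:start + frame_size]
        frame = [s * w for s, w in zip(chunk, window)]
        magnitudes = []
        # Naive DFT; only bins up to the Nyquist frequency are kept.
        for k in range(frame_size // 2 + 1):
            coeff = sum(
                x * cmath.exp(-2j * math.pi * k * n / frame_size)
                for n, x in enumerate(frame)
            )
            magnitudes.append(abs(coeff))
        frames.append(magnitudes)
    return frames


# Synthetic 1000 Hz tone sampled at 8000 Hz, standing in for a recording.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 1000 * n / sample_rate) for n in range(256)]

frames = spectrogram(tone)
peak_bins = [max(range(len(f)), key=f.__getitem__) for f in frames]
# Convert the strongest bin of each frame back to a frequency in Hz.
print([bin_index * sample_rate / 64 for bin_index in peak_bins])
```

Plotting the frames side by side, with time on the horizontal axis, frequency on the vertical axis and magnitude as color, gives the familiar spectrogram image.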