Introduction

Human land use is a major driver of biodiversity loss (Sala et al. 2000). However, not all types of land use are equally threatening to biodiversity, and some strategies of land management can effectively sustain substantial biodiversity (Tscharntke et al. 2005; Rands et al. 2010; Mouysset et al. 2012). One of the prerequisites for appropriate land management is a thorough understanding of species distribution patterns, often across entire landscapes or regions (Gaston 2000; Dover et al. 2011). Quantifying distribution patterns, in turn, demands robust and reproducible field survey protocols for a range of different species (Lobo et al. 2010). Important variables in this context include patterns of local species richness (Yoccoz et al. 2001), species turnover (Tylianakis et al. 2005; Kessler et al. 2009), and species composition (Klimek et al. 2007).

Research projects investigating biodiversity distribution patterns are usually constrained by limited resources including money, personnel and time (Field et al. 2005; Baasch et al. 2010). These constraints pose limits on the affordable sampling effort, both with respect to the number of sites surveyed and the amount of effort per site. Scientists may opt for applying substantial effort at relatively few sites or for surveying a large number of sites with reduced effort. Collecting data in ways that allow the detection process to be modelled is often considered important to minimize the impact of false absences, especially in the case of animals (MacKenzie et al. 2002; Lahoz-Monfort et al. 2013; Stauffer et al. 2002). This is often done by repeatedly surveying a given site, but other methods are possible such as recording times to detection (Guillera-Arroita et al. 2011).

To collect reliable data using limited resources, ecologists thus face a trade-off between the number of survey sites and the number of repeated surveys at each sample site (Bried et al. 2011; Reed et al. 2011; Reynolds et al. 2011; Bailey et al. 2007; Suarez-Seoane et al. 2002; Guillera-Arroita and Lahoz-Monfort 2012; Guillera-Arroita et al. 2010). One tool to investigate tolerable information loss when survey effort is reduced is to evaluate the statistical power of the different survey designs (Field et al. 2005; Legg and Nagy 2006; Bailey et al. 2007; Vellend et al. 2008; Guillera-Arroita and Lahoz-Monfort 2012; Sewell et al. 2012). Power analysis calculates the size of an effect that is detectable with a certain level of confidence and significance for a given design. Power increases as more effort is spent per site (given that detectability increases), as well as when the number of sites is increased.

In this study, we examined how estimated species diversity patterns changed with varying survey intensity and a varying number of survey sites. We focused on a case study in Central Romania, a region that is characterized by low-intensity land use practices (Baur et al. 2006; Fischer et al. 2012; Kuemmerle et al. 2008), which have created a heterogeneous landscape that supports high biodiversity (Rakosy 2005; Page et al. 2012; Fischer et al. 2012). However, biodiversity in the region is threatened by a series of complex socio-economic changes, including potential changes in land use. These changes include land abandonment and agricultural intensification (Bouma et al. 1998; Stoate et al. 2009; Akeroyd and Page 2011), both of which have been observed to negatively affect biodiversity elsewhere in Europe (Suarez-Seoane et al. 2002; Verhulst et al. 2004).

We conducted surveys for three taxonomic groups, namely plants, birds and butterflies, which are particularly diverse in Romania compared to most other parts of Europe (Akeroyd 2006). Our study served as a pilot to design subsequent large-scale surveys for these groups. First, we investigated the effect of increasing survey intensity on diversity patterns, as represented by species richness, turnover and composition. Second, we calculated the statistical power of alternative plausible designs varying in survey intensity and number of survey sites for a specific relationship, namely the relationship between landscape heterogeneity, represented by the variability in land covers within a specific area, and species richness.

Methods

Study area

The study was conducted within a 50 km radius of Sighişoara, southern Transylvania, Romania (45°45′48N–46°40′17N; 24°8′7E–25°26′40E). The landscape is undulating, with altitudes between 266 and 1,095 m above sea level. It is characterized by a heterogeneous and fine-grained mosaic of different land uses, including substantial amounts of semi-natural vegetation. Approximately 37 % of land is arable, 24 % is grassland (pastures and meadows), and 28 % is covered by forests. We initially identified a large number of potential survey points by comprehensively walking the land around each of five villages, covering all major land covers around each village in the process. Based on this initial reconnaissance survey, we randomly selected 35 points as survey sites, located in arable land (n = 17), grassland (n = 13) and forest (n = 5). Each survey site was defined as a circle measuring one hectare. Sites were located with a minimum distance of 200 m from each other and a maximum distance of 6,339 m within one village.

Field surveys

Plants

We used two different survey approaches to quantify plant species richness and composition. First, we used a ‘classical’ approach at all 35 survey sites from 1st May to 30th May 2011. We established three 30 × 30 m plots in each 1 ha site. Within each 30 × 30 m plot, we selected one representative 3.16 × 3.16 m subplot, in which we recorded the presence and percentage cover of all vascular plant species (Fig. 1). Second, we used a ‘cartwheel’ approach to resample plants in a subset of 19 (n: arable land = 6, grassland = 8, forest = 5) of the 35 survey sites from 1st June to 15th July 2011. We decided to resample only sites that had remained largely unchanged since the first sampling round, i.e. in which no harvesting or mowing had occurred. In each 1 ha site, we distributed ten plots of 1 × 1 m at a random distance from the middle point, every 36 degrees. We alternated the random distances so that five plots were distributed within 40 m of the center (the inner 0.5 ha) and five were located between 40 and 56 m from the center (the outer 0.5 ha; Fig. 1). We then recorded the presence and percentage cover of all vascular plant species in each plot. Phenological changes over the two survey periods were minor, and did not cause systematic differences in the species detected.
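The geometry of the cartwheel layout can be illustrated with a minimal Python sketch. It is not the procedure used in the field; it assumes that distances are drawn uniformly within each ring, that the alternation starts with an inner plot, and that angles are measured clockwise from north. The function name `cartwheel_plots` is hypothetical. The ring radii follow from the areas of the circles: a 1 ha circle has a radius of about 56.4 m, and the inner 0.5 ha a radius of about 39.9 m.

```python
import math
import random

SITE_AREA_M2 = 10_000.0  # one hectare


def cartwheel_plots(n_plots=10, site_area=SITE_AREA_M2, rng=None):
    """Return (x, y, ring) plot positions, one every 360 / n_plots degrees.

    Assumptions (not stated in the text): distances are drawn uniformly
    within each ring, and even-numbered spokes receive the inner plots.
    """
    rng = rng or random.Random()
    outer_r = math.sqrt(site_area / math.pi)      # radius of the full site
    inner_r = math.sqrt(site_area / 2 / math.pi)  # radius of the inner half
    plots = []
    for i in range(n_plots):
        angle = math.radians(i * 360.0 / n_plots)  # 36 degrees apart for 10 plots
        if i % 2 == 0:
            ring, dist = "inner", rng.uniform(0.0, inner_r)
        else:
            ring, dist = "outer", rng.uniform(inner_r, outer_r)
        # x points east, y points north; angle is measured clockwise from north
        plots.append((dist * math.sin(angle), dist * math.cos(angle), ring))
    return plots


if __name__ == "__main__":
    for x, y, ring in cartwheel_plots(rng=random.Random(42)):
        print(f"{ring:5s}  x = {x:6.1f} m  y = {y:6.1f} m")
```

Passing a seeded `random.Random` instance makes a layout reproducible, which is convenient when plot positions need to be documented before fieldwork.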

Fig. 1 Illustration of the sampling scheme for (a) bird surveys; (b) plant surveys: classical approach; (c) plant surveys: cartwheel approach; and (d) butterfly surveys

Birds

Birds were surveyed at all 35 sites using 20 min point counts (Bibby 2000) between 1st May and 8th June 2011, on those days without rain or strong wind (Fig. 1). At each site, four surveys were conducted between 05:30 and 11:00 AM, noting the presence of singing males. We controlled for temporal bias by rotating the site order, except for the forest sites which were always surveyed first in the morning to maximize detections.

Butterflies

Butterflies were surveyed four times at 26 sites (12 sites in arable land, 12 grassland sites and two forest sites) by walking Standard Pollard Transects (Pollard and Yates 1993) between 1st June and 15th July 2011. At each site, we sampled four transects with a length of 50 m to the east, south, north and west from the center (i.e. total of 200 m per site; Fig. 1). Surveys were conducted at a pace of 10 m per minute when weather conditions were appropriate (no rain, <90 % cloud cover, >17 °C, no strong wind). All butterflies within 2.5 m on either side of a given transect were caught with a butterfly net, identified and released. For identification, we used pan-European and eastern European guides (Tshikolovets 2003; Lafranchis 2004).

Analysis

Estimation of species richness and composition

We calculated species richness as the sum of all recorded species per taxonomic group over all plots or repeats in a given site. We calculated Whittaker's β-diversity index as a measure of species turnover among the sites and repeats in our dataset (Whittaker 1960; Anderson et al. 2011).
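As a small worked illustration of these two quantities, the Python sketch below computes pooled richness and Whittaker's β from sets of species names. It assumes the multiplicative form β = γ / mean α; a common variant subtracts 1, and the text does not state which form was used. The function names and the toy species are hypothetical.

```python
def pooled_richness(samples):
    """Number of distinct species across all plots or repeats of one site."""
    return len(set().union(*samples))


def whittaker_beta(samples):
    """Whittaker's beta as gamma / mean alpha (assumed form; a common
    variant is gamma / mean alpha - 1)."""
    gamma = len(set().union(*samples))
    mean_alpha = sum(len(s) for s in samples) / len(samples)
    if mean_alpha == 0:
        raise ValueError("beta is undefined when no species were recorded")
    return gamma / mean_alpha


if __name__ == "__main__":
    # Three hypothetical repeat surveys at one site
    repeats = [{"A", "B"}, {"B", "C"}, {"A", "B", "C", "D"}]
    print(pooled_richness(repeats))  # 4 species in total
    print(whittaker_beta(repeats))   # 4 / (8 / 3) = 1.5
```

The same function can be applied among sites (each set being the pooled species list of one site) or among repeats within a site, which corresponds to the two levels of turnover mentioned above.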

To compare plant survey methods, we correlated the species richness obtained by the two approaches using Spearman Rank correlation. In subsequent analyses, we used data obtained by the cartwheel approach, since the randomized placement of plots within a site was more representative of the variation within a site.

We applied hierarchical community models to estimate true species richness at each site. Hierarchical community models can be used to estimate true species richness while accounting for species-specific detectability (Dorazio and Royle 2005; Dorazio et al. 2006). We considered the detectability of each species as a function of survey date and set the number of augmented species to 2/3 of the observed richness (Kéry and Royle 2009; Zipkin et al. 2009). Species augmentation accounts for the possibility that some species remained unobserved in a survey with imperfect detection. A community model with species augmentation will estimate the occupancy of unobserved species as a function of estimated detection probability of the observed species. The occupancy of observed and unobserved species, in turn, is used to calculate true species richness. Moreover, we assumed that detectability was constant and that populations were closed, that is, population sizes were constant and were not subject to processes such as recruitment, mortality or dispersal. Estimated true species richness at the site level was highly correlated with observed species richness (see Results). However, the estimated values of true species richness were rather high for plants and butterflies (see Results). This likely over-estimation probably resulted from the small number of sites and the fact that populations were not closed (for more details see Kéry and Schaub 2012, pp. 414–461). Based on the high correlations with observed richness, but partly unrealistically high estimates for butterflies and plants, we continued further analyses using observed species richness rather than estimated true richness values as a baseline describing the outcomes of a “full survey effort”.

We described species composition using ordination methods. We conducted detrended correspondence analyses (DCAs) with presence/absence data for birds and abundance data for plants, and analysed butterfly abundance data using principal component analyses (PCAs). We chose these ordination methods because the length of the gradient of the first DCA axis was >3 for plants and birds and <3 for butterflies (Ter Braak and Prentice 1988).

Assessment of the impact of survey effort reductions

For a given group of species, we were interested in comparing the data from a “full survey effort” with that of a “reduced survey effort”. Our full survey effort consisted of ten plots per site for plant surveys, four repeats per site for butterfly surveys, and four repeats per site for bird surveys. For each group, we considered species richness, species turnover and species composition. We treated the results of species richness and species composition resulting from the full survey effort as “observed” richness and composition, respectively. We simulated subsets of the full survey effort by randomly dropping one to seven plots (for plants) or one to three repeats (for birds and butterflies) from the dataset. Random sampling of reduced datasets was repeated 100 times for each level of reduction, and each reduced dataset was compared with the full dataset. Species richness and turnover of the reduced datasets were compared to those of the full dataset using Spearman Rank correlations.
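The subsampling procedure can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the original analysis code: each site is represented as a list of species sets (one per plot or repeat), the toy data are invented, and the helper names (`ranks`, `spearman`, `mean_subsample_correlation`) are hypothetical. Spearman's coefficient is computed as the Pearson correlation of average ranks.

```python
import random


def ranks(values):
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            out[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return out


def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")  # undefined when one variable is constant
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    return pearson(ranks(x), ranks(y))


def richness(samples):
    return len(set().union(*samples))


def mean_subsample_correlation(sites, n_keep, n_draws=100, rng=None):
    """Mean Spearman correlation between reduced and full site richness.

    sites: list of sites, each a list of species sets (one per plot/repeat).
    n_keep: number of plots or repeats retained per site in each draw.
    """
    rng = rng or random.Random()
    full = [richness(s) for s in sites]
    cors = []
    for _ in range(n_draws):
        reduced = [richness(rng.sample(s, n_keep)) for s in sites]
        cors.append(spearman(reduced, full))
    return sum(cors) / len(cors)


if __name__ == "__main__":
    # Invented data: 6 sites x 4 repeats, richer sites have longer species lists
    sites = [[{f"sp{m}" for m in range(j, j + k + 2)} for j in range(4)]
             for k in range(6)]
    for keep in (1, 2, 3):
        r = mean_subsample_correlation(sites, keep, rng=random.Random(0))
        print(keep, round(r, 3))
```

Species turnover can be handled in the same loop by replacing `richness` with a turnover index computed on the retained plots or repeats.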

We then assessed how strongly species composition changed when survey effort was reduced. We used Procrustes analysis, which quantifies differences in the positions of objects between two ordinations. We compared the ordination of each reduced dataset with that of the full dataset, and quantified the differences by calculating a correlation based on the symmetric sum of squares between the two ordinations (Peres-Neto and Jackson 2001).
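For readers unfamiliar with the statistic, the following Python sketch computes a symmetric Procrustes correlation for the special case of two-dimensional ordinations. It assumes the common definition in which both configurations are centred and scaled to unit sum of squares, the symmetric sum of squares is m² = 1 − (s₁ + s₂)², with s₁ and s₂ the singular values of the cross-product matrix, and the correlation is √(1 − m²). Reflections are allowed. The function name `procrustes_correlation` and the example coordinates are hypothetical.

```python
import math


def _center_and_scale(points):
    """Centre a 2-D configuration and scale it to unit sum of squares."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    centered = [(p[0] - mx, p[1] - my) for p in points]
    norm = math.sqrt(sum(x * x + y * y for x, y in centered))
    return [(x / norm, y / norm) for x, y in centered]


def procrustes_correlation(a, b):
    """Symmetric Procrustes correlation between two 2-D ordinations.

    a, b: lists of (axis1, axis2) site scores, in the same site order.
    """
    a, b = _center_and_scale(a), _center_and_scale(b)
    # 2 x 2 cross-product matrix M = A^T B
    m11 = sum(p[0] * q[0] for p, q in zip(a, b))
    m12 = sum(p[0] * q[1] for p, q in zip(a, b))
    m21 = sum(p[1] * q[0] for p, q in zip(a, b))
    m22 = sum(p[1] * q[1] for p, q in zip(a, b))
    frob2 = m11 ** 2 + m12 ** 2 + m21 ** 2 + m22 ** 2
    det = m11 * m22 - m12 * m21
    # For a 2 x 2 matrix, (s1 + s2)^2 = ||M||_F^2 + 2 |det M|
    sum_sv_sq = frob2 + 2 * abs(det)
    m2 = 1.0 - sum_sv_sq  # symmetric Procrustes sum of squares
    return math.sqrt(min(1.0, max(0.0, 1.0 - m2)))


if __name__ == "__main__":
    full = [(0, 0), (1, 0), (1, 1), (0, 2), (3, 1)]
    reduced = [(0.1, 0.0), (0.9, 0.2), (1.2, 0.9), (0.0, 1.8), (2.7, 1.1)]
    print(round(procrustes_correlation(full, reduced), 3))
```

For ordinations with more than two retained axes, the closed-form expression for the singular values no longer applies and a singular value decomposition from a numerical library would be needed instead.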

Power analysis of the effect of different survey designs

Study design and data quality fundamentally influence the statistical power in the analysis of survey data. We therefore investigated the effect of different designs on the power of linear models relating species richness to environmental variables. We used a simulation approach that reflects the nature of the variability in the field data, but in which the sample size can be varied. It is then possible to test how strong the actual effect of a specific variable needs to be for a dataset of a given sample size to detect it.

Specifically, we applied power analyses to detect effects of landscape heterogeneity on species richness. The loss of landscape heterogeneity is a key concern in Europe’s agricultural landscapes (Benton et al. 2003), and is particularly relevant to our study area, where low-input, small-scale farming is increasingly being replaced by industrialized high-input agriculture. We limited this analysis to arable sites, because this is where heterogeneity is most likely to be lost in the future due to land use intensification. We calculated heterogeneity as the standard deviation (SD) of the normalized difference vegetation index (NDVI) from 10 m monochromatic SPOT data (©CNES (2007), Distribution Spot Image SA) within each of the one hectare (arable) sites.
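The heterogeneity measure can be illustrated with a short Python sketch. It uses the standard definition NDVI = (NIR − Red) / (NIR + Red) and takes the sample standard deviation across the pixels of one site. The input format (a list of near-infrared and red reflectance pairs), the function names and the pixel values are assumptions for illustration only; the text does not describe how the bands were extracted. At 10 m resolution, a one hectare site contains on the order of 100 pixels.

```python
import statistics


def ndvi(nir, red):
    """Normalized difference vegetation index for one pixel."""
    return (nir - red) / (nir + red)


def site_heterogeneity(pixels):
    """Sample SD of NDVI across the pixels of one site.

    pixels: iterable of (nir, red) reflectance pairs (hypothetical format).
    """
    values = [ndvi(nir, red) for nir, red in pixels]
    return statistics.stdev(values)


if __name__ == "__main__":
    # Invented reflectances: a uniform field versus a mixed field
    uniform_field = [(0.50, 0.10)] * 6
    mixed_field = [(0.50, 0.10), (0.30, 0.20), (0.45, 0.08),
                   (0.25, 0.22), (0.55, 0.07), (0.35, 0.15)]
    print(site_heterogeneity(uniform_field))
    print(site_heterogeneity(mixed_field))
```

A uniform crop cover yields a heterogeneity of zero, whereas a mosaic of crops, fallow strips and field margins yields a larger value.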

The methods used for the subsequent simulations are described in detail by Bolker (2008), and are summarized here for our data. During the simulation we increased the sample size from the original number of 17 sites of arable land to a hypothetical maximum of 170 sites. We generated explanatory data from a uniform distribution spanning the range of heterogeneity values observed in the original 17 sites. We also varied effect size from no effect to a strong effect, that is, from no change in species richness along the heterogeneity gradient to a change in species richness that equaled the maximum number of species that was counted in a single site (32 species for plants, 12 species for birds and 22 species for butterflies). This effect was converted to 200 increasingly large hypothetical slopes for a regression line (from slope = 0 to increasingly steeper slopes). Based on a given slope, we simulated species richness for each taxonomic group. To these simulated species richness values, we added a random variation. Random variation was generated by randomly drawing values from a normal distribution with a mean of zero and a standard deviation as large as in the original species richness data (10.27 for plants, 1.93 for birds, and 5.43 for butterflies). For this purpose, we used the plant richness data from surveying seven plots, and bird and butterfly richness data from three repeated surveys.

For each dataset thus generated, we fitted a simple linear model of simulated richness on simulated heterogeneity. We repeated this process 1,000 times for each combination of number of survey sites and slope. For each combination of number of survey sites and slope, we noted how often we found a significant effect in the simulated data. Because data were simulated to be variable, sometimes the simulated effect was detected at the significance level of 0.05, and sometimes no effect was detected despite there being one (type II error). We were interested in how the incidence of type II errors varied with the number of survey sites and effect size (slope)—both more survey sites and steeper slopes will reduce the incidence of type II errors, that is, lead to greater statistical power. For each examined taxonomic group, and for a given number of survey sites, we noted the minimum slope (“minimum detectable effect” or MDE) at which the type II error rate was <0.2 (i.e. power >0.8). In a last step, the MDE was expressed as the difference in the number of species between the site with the lowest and highest heterogeneity.
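The simulation steps described above can be sketched in Python as follows. This is an illustrative outline rather than the original code (the authors cite Bolker 2008 for the method). It makes several assumptions: the intercept is set to zero (the slope test does not depend on it), the heterogeneity range `x_range` is a placeholder because the observed range is not reported here, and the critical t value is approximated by a Cornish-Fisher expansion so that only the standard library is needed. All function names are hypothetical.

```python
import math
import random
from statistics import NormalDist


def t_critical(df, alpha=0.05):
    """Two-sided critical t value (Cornish-Fisher approximation)."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return (z + (z ** 3 + z) / (4 * df)
            + (5 * z ** 5 + 16 * z ** 3 + 3 * z) / (96 * df ** 2))


def slope_is_significant(x, y, alpha=0.05):
    """Ordinary least squares fit of y on x, then a t test on the slope."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    rss = sum((b - intercept - slope * a) ** 2 for a, b in zip(x, y))
    se = math.sqrt(rss / (n - 2) / sxx)
    return abs(slope / se) > t_critical(n - 2, alpha)


def power(n_sites, slope, sd, x_range, n_sims=1000, rng=None):
    """Share of simulated datasets in which the slope is significant."""
    rng = rng or random.Random()
    lo, hi = x_range
    hits = 0
    for _ in range(n_sims):
        x = [rng.uniform(lo, hi) for _ in range(n_sites)]     # heterogeneity
        y = [slope * xi + rng.gauss(0.0, sd) for xi in x]     # richness
        hits += slope_is_significant(x, y)
    return hits / n_sims


def minimum_detectable_effect(n_sites, sd, x_range, max_effect,
                              n_slopes=200, n_sims=1000, target=0.8, rng=None):
    """Smallest tested effect (in species across the gradient) with power > target.

    Returns None if no tested slope reaches the target power.
    """
    rng = rng or random.Random()
    span = x_range[1] - x_range[0]
    for k in range(n_slopes):
        effect = max_effect * k / (n_slopes - 1)  # species between gradient ends
        if power(n_sites, effect / span, sd, x_range, n_sims, rng) > target:
            return effect
    return None


if __name__ == "__main__":
    # SD and maximum effect for birds are taken from the text;
    # the heterogeneity range (0.0, 0.1) is a placeholder.
    for n in (17, 50, 100, 170):
        mde = minimum_detectable_effect(n, 1.93, (0.0, 0.1), 12.0,
                                        n_slopes=50, n_sims=300,
                                        rng=random.Random(1))
        print(n, mde)
```

Because power is estimated by simulation, it is not guaranteed to increase monotonically with slope at small `n_sims`; the function returns the first tested slope that exceeds the target, so larger `n_sims` gives a more stable estimate at the cost of run time.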

Results

We detected 293 vascular plant species from 35 sites with the classical approach and 310 plant species from 19 sites with the cartwheel approach. We recorded 53 bird species (35 sites) and 81 butterfly species (26 sites) (Table 1). We found the highest values for species turnover between sites for plants with the classical approach (mean ± SD: β = 12.6 ± 11.1) and the cartwheel approach (β = 8.8 ± 5.9), followed by birds (β = 9.1 ± 6.9). Butterflies showed the lowest turnover (β = 7.1 ± 8.4).

Table 1 Mean species richness per site (and standard deviation) in the three land cover types surveyed

Plant species richness from the two different sampling methods was strongly positively correlated (Pearson correlation coefficient r = 0.77, df = 17, P < 0.05). Species richness differed between the two approaches most strongly within agricultural fields (Pearson correlation r = 0.04, df = 5, P = 0.9; non-arable sites: r = 0.92, df = 12, P < 0.05). Here, survey plots were selected to be within actual fields for the classical approach, while the random selection of plots in the cartwheel approach more frequently included weed and field edge vegetation. Consequently, estimates of richness were higher using the cartwheel method. There were positive correlations between the site-level richness of plants and butterflies (Pearson correlation r = 0.42, df = 24, P < 0.05; cartwheel approach r = 0.71, df = 14, P < 0.05), but no significant correlations between butterflies and birds (r = −0.02, df = 24, P = 0.91), and plants and birds (Pearson correlation r = −0.004, df = 33, P = 0.98; cartwheel approach r = −0.39, df = 17, P = 0.1).

Mean observed species richness per site was 46.9 for plants, 17.7 for butterflies and 9.6 for birds. Observed species richness correlated highly with estimated true species richness from the hierarchical community models (plants r = 0.83, df = 17, P < 0.001; birds r = 0.99, df = 33, P < 0.001; butterflies r = 0.99, df = 24, P < 0.001). However, the absolute values of estimated mean richness per site were unrealistically high for plants and butterflies (mean and 2.5–97.5 % credible interval): plants 92.6 (81.9–106.6); butterflies 60 (47.5–73.6); birds 9.4 (6.7–13.3). Hence, we continued all subsequent analyses using observed species richness. The average detection probabilities were estimated to be 0.25 for birds (±0.15 SD), 0.17 for plants (±0.12) and 0.16 for butterflies (±0.17).

Correlations between species richness from reduced survey effort and results from the full survey effort showed an overall pattern of asymptotic increase with increasing survey effort, especially for plants (Fig. 2). For species turnover and composition, we also found consistently high correlations between estimates from reduced survey effort and full survey effort. For example, when considering seven plant plots per site, three repeats for birds, and three repeats for butterflies, the mean correlations with estimates for the full dataset were >0.9, for species richness, turnover and composition (Fig. 2).

Fig. 2 Correlations between data from reduced survey effort (1 to 9 plots for plants; 1 to 3 repeats for birds and butterflies) and the maximum survey effort (10 plots for plants; 4 repeats for birds and butterflies). Reduced survey effort was simulated by randomly sub-setting the full data set 1,000 times for each level of data reduction

Power analysis with simulated data showed an exponential decrease of the minimum detectable effect with increasing sample size. The marginal increase in statistical power per additional survey site was lower when the number of sites was already high (Fig. 3). Minimum detectable effects were smallest for birds (1 species for 100 survey sites) and larger for butterflies and plants (approximately 3 species for 100 survey sites).

Fig. 3 Power analysis with simulated data. Minimum detectable effect (MDE) is plotted as a function of the number of survey sites. MDE was defined as the absolute change in species richness along the observed heterogeneity gradient in arable fields that could be detected in a linear model with given sample size

Discussion

Given the rapid changes under way in human-dominated landscapes, ecologists need efficient survey protocols to be able to detect effects on wildlife. Field research projects face logistical, time and monetary constraints (Tyre et al. 2003), which inherently limit the affordable survey intensity. Dense sampling schemes, such as survey protocols that aim to cover at least three percent of the area of a landscape with at least five repeats (Bried et al. 2011), are rarely feasible. Typically, only small portions of the landscape can be surveyed (Stohlgren et al. 1997). A common approach therefore is to rely on a stratified random sampling design and then extrapolate data across the landscape (Stohlgren et al. 1997; Rosenstock et al. 2002).

Here, we present a protocol to assess the effects of survey effort on the detection of biodiversity patterns based on a case study. We show that, for our data, survey effort per site could be moderately reduced, because the corresponding increase in bias was relatively small and relative biodiversity patterns remained stable. Such a reduction, however, needs to happen in a sensible and balanced way in order to ensure sufficient statistical power to detect environmental effects on species richness. Also, this conclusion is based on the assumption that detection probability does not vary spatially.

Overall, our findings are broadly consistent with a range of previous work from different systems. For example, Stohlgren et al. (1997) tested reducing a larger set of plant sample replicates in different vegetation communities in the Rocky Mountains and found that as few as ten quadrats of one square meter per sampling unit provided sufficient information to detect fine-scale patterns of plant diversity. Similarly, other studies showed that in Australia and California, most animal species that were surveyed could be detected even if survey effort within a given sampling protocol was reduced to three repeat surveys (Pellet 2008; Field et al. 2005). Based on an assessment of birds, amphibians and invertebrates in Australia, Tyre et al. (2003) further suggested that with current survey methods, sampling from 100 sites and pooling data over three repeats yielded accurate results. This, too, is consistent with our findings: using 100 or more sites led to minimum detectable effects of changes in species richness in response to heterogeneity of three species for plants and butterflies, and one species for birds. Given this consistency with findings from other studies, we assume our sampling protocol for landscape-scale surveys is applicable to other study systems as well.

Our results suggest that it can be reasonable to reduce survey effort per site when aiming at broad patterns of biodiversity and when the detectability of investigated taxa is high. Moreover, even a low survey effort per site can yield high statistical power provided that the survey effort per site is balanced in a meaningful way with the number of sites surveyed. A key advantage of using many sites is that the data are then much more likely to be representative of the study area as a whole, which is valid at least for occurrence patterns of organisms with relatively high abundance and detectability. Abundance greatly influences detectability, and both factors determine whether a species is actually recorded (Royle and Nichols 2003). Rare species and species with a low detectability are highly susceptible to false absences compared to common species or ones with a high detectability, which can lead to an underestimation of their distribution (MacKenzie and Royle 2005; Lahoz-Monfort et al. 2013). Therefore, higher levels of survey effort are often recommended for rare species (e.g. Bried and Pellet 2012). In summary, we demonstrated a useful sampling protocol for assessing broad diversity patterns of relatively abundant species in response to environmental gradients (Vellend et al. 2008). However, we caution that our method may be of limited use for rare or cryptic species. Ultimately, the required survey effort depends on the study area and the investigated species (Bried et al. 2012). With our case study, we provide an example of how to allocate project resources meaningfully to obtain high statistical power.

Conclusion

Developing field survey protocols is a challenging task for ecologists and demands thorough consideration of both theoretical and practical issues. Our results suggest that in Southern Transylvania, at least three temporal replicates on at least 100 study sites appeared to be sufficient to study landscape effects on diversity patterns of birds and butterflies following our sampling methods. To model plant diversity patterns, a combination of seven one square meter plots per one hectare site at approximately 100 sites appeared to be sufficient.

Before implementing landscape-scale surveys, we recommend that ecologists conduct pilot studies for several reasons: (1) to trial and customize different techniques and sampling schemes; (2) to identify the most efficient use of available resources; and (3) to estimate the statistical power of plausible alternative designs. Our findings suggest that, under certain conditions, relative patterns of biodiversity can remain relatively stable when survey effort is moderately reduced. This, in turn, can help to allocate resources to sampling more sites and to surveying large areas more representatively. The general procedure presented in this paper is transferable to other study systems and may be used as a guideline to help develop reasonable survey designs.