Does delaying the first mowing date benefit biodiversity in meadowland?

Meadows are regularly mown in order to provide fodder or litter for livestock and to prevent vegetation succession. However, the time of year at which meadows should be first mown in order to maximize biological diversity remains controversial and may vary with respect to context and focal taxa. We carried out a systematic review and meta-analysis on the effects of delaying the first mowing date upon plants and invertebrates in European meadowlands. Following a CEE protocol, ISI Web of Science, Science Direct, JSTOR, Google and Google Scholar were searched. We recorded all studies that compared the species richness of plants, or the species richness or abundance of invertebrates, between grassland plots mown at a postponed date (treatment) vs plots mown earlier (control). In order to be included in the meta-analysis, compared plots had to be similar in all management respects, except the date of the first cut that was (mostly experimentally) manipulated. They were also to be located in the same meadow type. Meta-analyses applying Hedges’d statistic were performed. Plant species richness responded differently to the date to which mowing was postponed. Delaying mowing from spring to summer had a positive effect, while delaying either from spring to fall, or from early summer to later in the season had a negative effect. Invertebrates were expected to show a strong response to delayed mowing due to their dependence on sward structure, but only species richness showed a clearly significant positive response. Invertebrate abundance was positively influenced in only a few studies. The present meta-analysis shows that in general delaying the first mowing date in European meadowlands has either positive or neutral effects on plant and invertebrate biodiversity (except for plant species richness when delaying from spring to fall or from early summer to later). 
Overall, there was also strong between-study heterogeneity, pointing to other major confounding factors, the elucidation of which requires further field experiments with both larger sample sizes and a distinction between taxon-specific and meadow-type-specific responses.

In response, most countries have implemented agri-environment schemes (AES), in which farmers are subsidised to modify their farming practice to provide environmental benefits. AES mostly aim at protecting and restoring farmland biodiversity [13,14]. They are voluntary programmes in which farmers usually receive direct payments for providing services that go beyond conventional agricultural practices, such as management of semi-natural habitats. Currently, about 30% of European farmland is under some sort of agri-environmental contract [15].
Low input (extensively managed) hay and litter meadows are among the most commonly implemented agri-environmental measures [13,16]. The most important management action on these grasslands is mowing. Mowing vegetation at least once a year has a positive effect on vascular plant species richness, especially when cuttings are removed [17,18]. However, since it has been demonstrated that early-summer mowing often has a detrimental effect on species richness of flowering plants, as it hampers completion of the reproductive cycle [17], later mowing is generally found to be more favourable for vascular plant biodiversity [19,20].
Annual mowing has a contrasting effect on invertebrates [21,22]. Although detrimental to many insects in the short term [23][24][25][26][27][28], mowing is beneficial to a large number of heliophilous and thermophilous species because it prevents the growth of bushes and trees and thus maintains semi-natural grasslands [29]. It has also been suggested that delaying the date of first mowing could be positive for a multitude of invertebrates, including butterflies, spiders, grasshoppers and ground beetles that depend on various vegetation structures [30][31][32][33][34][35]. For vertebrates, the situation is different: mowing renders food resources suddenly available (e.g. insects and rodents) that were previously hidden in the sward. Foragers may congregate towards these rich, although ephemeral food supplies [36,37]. On the other hand, ground-breeding birds are likely to be heavily penalised by early mowing [e.g. 38].
While most AES have the clear objective of restoring biodiversity and ecosystem services [13,14,39], they often bind farmers to threshold dates for agricultural operations. The date of the first mowing of meadows is usually defined as a trade-off between expected agricultural yield and supposed effects on wildlife. Given that this first mowing date is the most easily changed management practice [7,31], its appropriate adjustment is the most likely to provide environmental benefits at little economic cost. Using a meta-analytical framework, we studied the currently available scientific literature about the pros and cons for biodiversity of delaying mowing in farmed European meadowland; we also identified major gaps in knowledge related to this theme. The synthesis will be useful to both agro-ecologists and policy-makers involved in farmland management.

Objective
The primary objective of the review was to answer the following question: Does delaying the first mowing date increase biodiversity in European farmland meadows?

Methods
We followed the review methodology of the Collaboration for Environmental Evidence partnership [40,41] and published an a-priori protocol [41].

Search strategy
The following web databases were searched for relevant documents: ISI Web of Science, Science Direct, JSTOR, Google (first 100 hits) and Google Scholar (first 100 hits). Searches were conducted in English, French and German using translations of the following logical search string: (mowing OR cutting) AND (meadow OR grassland) AND (biodiversity OR richness). The term "Europe" was not included in the search keywords as stated in the Review Protocol [41], because European studies that do not mention the term would otherwise have been missed. Studies originating from regions outside Europe were later excluded from the review. Any apparently relevant citations or links were followed one step away from the original hit. In addition, national and international experts on the subject were asked for any relevant literature and unpublished data.

Article screening
All references retrieved from the web search were scanned at the title, abstract and full-text filter levels by a first reviewer. From the 367 initial references, 200 (randomly selected) were rescanned by a second reviewer in order to check for inclusion consistency. The following study inclusion criteria were used:
Relevant subjects: semi-natural grasslands that are mown annually, including conventionally managed grasslands, AES meadows, hay or litter meadows.
Types of intervention: first mowing date delayed (treatment).
Types of comparator: comparison with similar meadows or plots that are first mown on an earlier date (control). Treatment and control plots must be similar in all management respects, except the date of the first cut, and must be located in the same habitat type.
Types of outcome: species richness and/or abundance (any taxa).
Inclusion consistency was checked with kappa statistics, and agreement between the reviewers was satisfactory (k = 0.81) [42].
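The agreement check can be illustrated with a minimal sketch of Cohen's kappa, assuming each reviewer's screening outcome is coded as one include/exclude decision per reference. The function name and the ten example decisions are hypothetical and are not the data of this review.

```python
def cohen_kappa(decisions_a, decisions_b):
    """Cohen's kappa for two reviewers' decisions on the same references."""
    if len(decisions_a) != len(decisions_b) or not decisions_a:
        raise ValueError("need two equally long, non-empty decision lists")
    n = len(decisions_a)
    categories = set(decisions_a) | set(decisions_b)
    # Proportion of references on which both reviewers agree.
    observed = sum(a == b for a, b in zip(decisions_a, decisions_b)) / n
    # Agreement expected by chance from each reviewer's marginal frequencies.
    expected = sum(
        (decisions_a.count(c) / n) * (decisions_b.count(c) / n)
        for c in categories
    )
    if expected == 1.0:
        return 1.0  # both reviewers used one identical category throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical screening of ten references (1 = include, 0 = exclude).
reviewer_1 = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
reviewer_2 = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
kappa = cohen_kappa(reviewer_1, reviewer_2)
```

Kappa corrects the raw proportion of agreement for the agreement expected by chance, which matters here because most screened references are excluded by both reviewers.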

Study quality assessment
All articles accepted met the requirements of category II-2 and above of the classification system of [43]. This allowed for both experimental and observational studies to be included, but excluded studies that provided only qualitative evidence.

Data extraction
Some studies reported more than one treatment (two or more delayed cuts) or more than one type of measurable outcome (e.g. species richness and abundance, or different taxonomic groups such as plants and invertebrates). In these cases, all comparisons were recorded as independent data points, and this is why there are more data points (units of analysis) than articles [44,45] (Figure 1; Table 1).
The following information was extracted from the studies for each data point: (1) taxon, (2) species richness or abundance, (3) standard deviation, (4) sample size, (5) study duration in years, (6) plot size of vegetation relevés or sampling methodology for invertebrates, (7) ordinal days of the early cut and (8) delayed cut, and finally (9) meadow type, classified as dry, mesophilous or wet. Additional potential sources of heterogeneity were also extracted such as fertilizer application, number of cuts per year, grazing activity, and biogeographical region where the study was carried out. Diversity indexes such as the Shannon index were recorded when present, but did not lead to sufficient data points for a meta-analysis (MA).
Taxa were plants, invertebrates or a specific group of invertebrates. Standard deviations (SD) were usually retrieved from standard errors (SE) or variances. If no estimate of variance was provided, we requested it from the original authors. If original authors could not provide SD, or sample size was equal to one (i.e. no variance), the corresponding study was included only in the unweighted analyses (see statistical analysis section below). The ordinal days (day 1 = January 1st) of the early cut (control) and delayed cut (treatment) were used to calculate the number of days between the two mowing regimes. If the exact date of the early or delayed cut was unknown, but only the month was given, then the 15th of the month was used for calculations. If the terms "early" or "late" in a given month were mentioned, then the 7th or 24th, respectively, of that corresponding month was used.
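The date and SD conventions described above can be sketched as follows. This is an illustrative sketch only: the function names, the "mid" label for month-only dates and the default non-leap year are assumptions introduced here, not part of the original extraction procedure.

```python
import math
from datetime import date

# Day of month used when only a month, or "early"/"late" in a month, is given.
DAY_FOR_QUALIFIER = {"early": 7, "mid": 15, "late": 24}


def ordinal_day(month, day=None, qualifier="mid", year=2011):
    """Ordinal day (day 1 = January 1st) of a cut; year is an assumed non-leap year."""
    if day is None:
        day = DAY_FOR_QUALIFIER[qualifier]
    return date(year, month, day).timetuple().tm_yday


def sd_from_se(se, n):
    """Recover a standard deviation from a standard error of the mean."""
    return se * math.sqrt(n)


def sd_from_variance(variance):
    """Recover a standard deviation from a variance."""
    return math.sqrt(variance)


# Hypothetical comparison: early cut in "early June", delayed cut in "July".
early_cut = ordinal_day(6, qualifier="early")   # 7 June
delayed_cut = ordinal_day(7)                    # 15 July
days_between_cuts = delayed_cut - early_cut
```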
Delaying cutting is often studied within a broader context of agricultural extensification for biodiversity, including reduced number of mowing events, changes in fertilizer inputs and/or type of fertilizer, oversowing, etc. Studies of cases in which delaying mowing occurred in the presence of such confounding factors could not be included in the MA as the effect of delaying the first cut cannot be separated from these other confounding factors [e.g. 32].

Statistical analysis
Meta-analyses were conducted on three groups of studies according to their measurable outcomes: 1) plant species richness; 2) invertebrate species richness; 3) invertebrate abundances. Studies on plant species richness lasted between two and 40 years, and if multiple timepoints were available along the time series, only the data for the last year (longest time period) were considered. Studies on invertebrates were usually shorter, mostly three to four years, and due to a high inter-annual variation, these studies often reported biodiversity responses averaged across the years. Here we used these reported average values.
The Hedges'd statistic was used to estimate effect size, Hedges'd being equal to the standardized mean difference between delayed and early cuts [46]:

$$d = \frac{\bar{X}_D - \bar{X}_E}{S}\,J$$

where $\bar{X}_D$ and $\bar{X}_E$ are the means of the delayed and early cut outcomes, $S$ is their pooled standard deviation, and the term $J$ corrects for small-sample bias [47]. It was calculated using the function escalc of the R package metafor [48]. Random- and mixed-effects models (mixed-effects models are random-effects models with covariates) were chosen, as is now common practice for this kind of analysis [47]. Under random- and mixed-effects models, the true effect size, i.e. the effect size as if there were no sampling error, can vary from study to study, and is usually assumed to do so following a normal distribution [49,50]. Here the Q test and I² statistic were used to assess heterogeneity between studies. The Q test is the test of significance [90], and the I² statistic estimates how much of the total variability in the effect size estimates (composed of heterogeneity and sampling error) can be attributed to heterogeneity among the true effect sizes [48,50].
First, the null model was generated. Then all univariate models including the following moderators (effect modifiers) were tested: ordinal day, time lapse (in days) between the early and the delayed cuts, study duration (in years), meadow type and plot size of the vegetation relevés. Multivariate models (various combinations of the above-mentioned variables) were also explored. Further subgroup analyses were conducted to investigate the influence of key moderators separately. Models were ranked based on their AIC values (Akaike Information Criterion) and on the level of significance of the estimates [51]. Permutation tests were not always possible due to an insufficient number of data points, which limits the number of possible iterations. Therefore, test statistics of the effect sizes and the corresponding confidence intervals (CIs) were based on the normal distribution (Z test).
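The effect size and heterogeneity statistics described in this section can be illustrated with a minimal sketch. It is not the metafor implementation used in the review: it assumes the common approximation J = 1 − 3/(4(n_D + n_E − 2) − 1) for the small-sample correction, a standard large-sample variance for d, and the Q-based form of I², so its output may differ slightly from what metafor reports. All function names and example numbers are hypothetical.

```python
import math


def hedges_d(mean_delayed, mean_early, sd_delayed, sd_early, n_delayed, n_early):
    """Return (d, var_d) for one delayed-cut versus early-cut comparison."""
    df = n_delayed + n_early - 2
    pooled_sd = math.sqrt(
        ((n_delayed - 1) * sd_delayed ** 2 + (n_early - 1) * sd_early ** 2) / df
    )
    j = 1.0 - 3.0 / (4.0 * df - 1.0)  # approximate small-sample correction
    d = j * (mean_delayed - mean_early) / pooled_sd
    n_total = n_delayed + n_early
    var_d = n_total / (n_delayed * n_early) + d ** 2 / (2.0 * n_total)
    return d, var_d


def heterogeneity(effects, variances):
    """Inverse-variance pooled effect, Cochran's Q and Q-based I² (in %)."""
    weights = [1.0 / v for v in variances]
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))
    df = len(effects) - 1
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return pooled, q, i2


# Hypothetical data point: 12 vs 10 species, SD = 2 and n = 10 in both groups.
d, var_d = hedges_d(12.0, 10.0, 2.0, 2.0, 10, 10)
```

A positive d means higher richness or abundance under the delayed cut. Q is compared with a chi-square distribution on k − 1 degrees of freedom, where k is the number of data points.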
Publication bias was assessed using funnel plots, by applying a regression test for funnel plot asymmetry [46,48].
In addition to the properly weighted meta-analyses, unweighted meta-analyses were performed using the response ratio as effect size. The response ratio (lr) is equal to the natural logarithm of the ratio of the mean outcome under the delayed cut to the mean outcome under the early cut [46]. Note that this way a positive value means a positive effect of delaying mowing.
Although less powerful than properly weighted meta-analyses, this approach allows the inclusion of studies that did not report SD or where sample size was one, i.e. studies for which no Hedges'd could be calculated.
Bootstrapping was used to calculate 95% confidence intervals (CI); if the CI overlapped zero, the effect size was considered to be non-significant. All statistics were performed using R version 2.13.0 [52].
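The unweighted analysis can be sketched as follows. The text does not state which bootstrap variant was used, so this sketch assumes a simple percentile bootstrap of the mean response ratio. The function names, number of resamples, seed and example values are hypothetical.

```python
import math
import random


def log_response_ratio(mean_delayed, mean_early):
    """lr = ln(delayed / early); positive when the delayed cut gives the higher mean."""
    return math.log(mean_delayed / mean_early)


def bootstrap_ci(values, n_resamples=10000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the unweighted mean of `values`."""
    rng = random.Random(seed)
    n = len(values)
    # Resample data points with replacement and record each resample's mean.
    means = sorted(
        sum(rng.choices(values, k=n)) / n for _ in range(n_resamples)
    )
    lower_idx = int(n_resamples * alpha / 2)
    upper_idx = min(n_resamples - 1, int(n_resamples * (1 - alpha / 2)))
    return means[lower_idx], means[upper_idx]


def is_significant(lower, upper):
    """An effect is treated as significant when the CI does not overlap zero."""
    return not (lower <= 0.0 <= upper)


# Hypothetical (delayed, early) mean species richness for four data points.
pairs = [(22.0, 20.0), (18.0, 19.0), (30.0, 25.0), (15.0, 15.0)]
ratios = [log_response_ratio(delayed, early) for delayed, early in pairs]
lower, upper = bootstrap_ci(ratios, n_resamples=2000)
```

Because each data point carries equal weight, this approach needs neither SD nor sample size, at the cost of ignoring differences in precision between studies.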

Results
On 16 March 2011, 367 articles were retrieved from the web searches. The influence on biodiversity of delaying the first mowing date could be investigated in 27 articles that matched the inclusion criteria (Figure 1). Subsequently, twelve articles were excluded due to duplication or unsuitable data for a MA. Duplication occurred when two studies based on the same experimental set-up were clearly looking at the same metric while either addressing different questions or considering different times. For example, the articles [53] and [54] reported studies investigating the impact of different mowing regimes on plant species richness in the same experimental set-up and the same plots, but one after 15 years and the other after 22 years of management, respectively. In such cases, only the latest study (longest duration) was included in the MA. Nine additional studies were found in bibliography sections of the selected papers or obtained after contacting experts, which makes a total of 24 suitable studies submitted to the present MA (Figure 1). In some studies more than one delayed cut or more than one invertebrate group was investigated, resulting in a total of 55 data points (Table 1). All studies were experimental except three, which used an observational approach [31,55,56]; these observational studies were nevertheless well replicated (9 or 18 times, see Table 1). Of these 55 data points, 35 deal with plant species richness, ten with invertebrate species richness, and ten with invertebrate abundance. In eleven cases (nine for plant species richness, one for invertebrate species richness and one for invertebrate abundance), the study did not report SD, or sample size equalled one. Consequently, these data points could only be included in the MA assessing response ratio. Two suitable studies on seed shed and seed bank were also found, but not included because their very specific focus was too marginal with respect to our main research question [20,57]. There was no single study on birds that complied with our selection criteria. In effect, all bird studies consisted of observational studies with potential confounding factors. An additional file shows the included data points in more detail [see Additional file 2].
Table 1. All 55 data points (units of analysis) and their respective references included in the meta-analysis. Rows are ordered by taxon and specific outcome measures. Note that rows with two outcomes (species richness and abundance) count as two data points. The time (month) of the early and delayed first cut are given for both control and treatment plots, as well as the duration of the study in years and the sample size. Studies where the standard deviation (SD) was not provided could only be included in meta-analyses based on the response ratio. See Additional file 2 for more details on each data point. * Study designs were either experimental (exp), where mowing treatments were randomly applied, or observational (obs), where mowing treatments were not randomly applied.
Postponing the first mowing date is a widespread agri-environmental measure in Europe, though it is usually coupled with other measures such as reduction of fertilizer applications. This makes sense from an agronomical point of view since postponing mowing must be accompanied by reduced hay productivity in order to avoid over-mature grass lying on the ground and/or mouldering at the time of mowing. It would then be difficult to separate the effects of postponing mowing from the effect of fertilizer reduction. Therefore, most of the studies included in the present MA concern extensively managed grasslands with no fertilizer application and a single cut per year.

Effects on plant species richness
Results based on the response ratio were qualitatively the same as the results based on the Hedges'd. Therefore, only the results of the weighted meta-analysis based on the Hedges'd are presented below due to their superior explanatory power. An additional file shows the results of the unweighted meta-analysis based on the response ratio [see Additional file 3].
In the null model, no overall significant effect of delaying the first mowing date was supported as regards plant species richness (mean Hedges'd = 0.017 with 95% CI −0.237 to 0.2716, z = 0.134, P = 0.882, Figure 2). However, heterogeneity between studies was significant (Q = 56.88, d.f. = 25, P < 0.001, I² = 54%), indicating that the true effect size does vary from one study to the next. With study duration (in years) included in the model as a moderator, no significant influence of that moderator on the effect size was discerned (slope = 0.016 with 95% CI −0.019 to 0.051, z = 0.878, P = 0.380, Figure 3a).
In further univariate models, a significant negative influence of the date of the early cut (control) was established (slope = −0.015 with 95% CI −0.025 to −0.005, z = −2.878, P = 0.004, Figure 3b). This means that the earlier the cut in the year, the more pronounced the effect on biodiversity of delaying the first cut. On the other hand, when the early cut occurred late in the season (July to August), delaying it had no, or even a negative, effect on plant species richness. Between-study heterogeneity was significant (Q = 43.12, d.f. = 24, P = 0.010), indicating again that other moderators may also influence the effect sizes. In contrast, the date of the delayed cut did not significantly influence the effect size (slope = −0.007 with 95% CI −0.013 to 0.001, z = −1.805, P = 0.071), although it did explain some of the heterogeneity.
In order to further investigate this issue and to evaluate the extent to which heterogeneity can be explained by variation in this moderator (first mowing date), two subset MAs were conducted. The first included only the data points with an early cut in spring (before July 1) associated with a delayed cut in summer (July to September); the second included all other combinations of early and delayed cuts (spring to fall, early summer to late summer and summer to fall, but excluded one early spring to late spring study [58]). In the first case, mean Hedges'd became significantly positive (mean Hedges'd = 0.388 with 95% CI 0.092 to 0.684, z = 2.569, P = 0.010, Figure 2b). Between-study heterogeneity was significant (Q = 24.88, d.f. = 14, P = 0.036), while I² (40%) was not. In the second case, mean Hedges'd became significantly negative (mean Hedges'd = −0.504 with 95% CI −0.763 to −0.246, z = −3.828, P < 0.001, Figure 2c). Heterogeneity was not significant (Q = 4.56, d.f. = 9, P = 0.871), indicating that these latter studies provided consistent results.
Note that none of the models including one or more moderators (study duration, mowing date, time interval between mowings, habitat type, and plot size of the vegetation relevés) performed better than the null model according to AIC values [Additional file 4]. In addition, no asymmetry was detected in any of the funnel plots, giving no indication of publication bias [Additional file 5].

Effects on invertebrate abundance
Delaying the first mowing date had no significant effect on invertebrate abundance (mean Hedges'd = −0.053 with 95% CI −0.889 to 0.783, z = −0.1249, P = 0.901, Figure 5). However, the resulting Q-Q plot was not satisfactory, while the funnel plot showed a significant asymmetry in the distribution of the data points due to the two outlying studies of Morris [59,60]. After excluding Morris's studies from the analysis, model assumptions and the funnel plot became satisfactory, and delaying the first mowing date had a significant positive effect (mean Hedges'd = 0.533 with 95% CI 0.222 to 0.844, z = 3.3564, P = 0.001, Figure 5a), even in the absence of heterogeneity (Q = 6.59, d.f. = 6, P = 0.360). The apparent generality of this result must be treated with caution, however, as it is based on only two independent experiments. Model ranking accounting for all studies, including Morris's studies, showed that the model that included the dates of both early and delayed mowing had a lower AIC value, with a negative effect for early mowing (slope = −2.130 with 95% CI −3.017 to −1.241, z = −4.6989, P < 0.001) and a positive effect of delayed mowing (slope = 5.607 with 95% CI 3.283 to 7.930, z = 4.730, P < 0.001) [see Additional file 4]. This means that the effect size is greater the earlier the first mowing and the later the delayed mowing. The influence of study duration was not investigated because all study durations were either 3 or 4 years.

Limitations of available information
The main limitation of this systematic review is the low number of data points stemming from an even lower number of studies (Table 1), which precluded investigations on specific invertebrate taxa, and on the influence of several moderators. As a consequence, only the main general effects of postponing mowing could be clearly investigated. Moreover, in the MA there was great heterogeneity in plant species richness, indicating that factors (moderators) other than delaying the first mowing probably influence the effect size. While the date of the first mowing was found to be an important factor, study duration was not (Figure 3). It was also expected that heterogeneity would be influenced by the great variety of meadow types involved. However, no analyses could be conducted on this factor due to the highly unbalanced distribution of the habitats among the data points (n = 36 mesophilous meadows; 16 wet meadows; 3 dry meadows). Moreover, of the sixteen wet meadow data points, nine could not be included in the weighted MA.
Figure 2. The forest plot is divided into three sections according to postponing schedule: a) study that delayed the first cut from early spring to late spring, b) studies that delayed the first cut from spring (May-June) to summer (July-August-September), and c) studies that delayed the first cut from spring to fall, early summer to late summer or summer to fall. Effect sizes are Hedges'd, i.e. the standardized mean differences between delayed and early cuts. The squares and bars represent the mean values and 95% confidence intervals of the effect sizes, while the size of the squares reflects the weight of the studies. The combined effects (sub-summary and summary) appear as diamonds and the vertical dashed line represents the line of no effect.
Additional management factors such as fertilizer application, occurrence of a second cut, seed oversowing, and autumn grazing would also influence the effect size, but they could not be investigated for the same reasons. Note that the most common management practice (42 data points out of 55) was no fertilizer application, no grazing and a single cut per year. Study design could also play a role. While most studies were experimental, three used a purely observational approach [31,55,56]. Experimental frameworks also differed greatly in sample sizes, plot sizes and sampling methodologies, which additionally affect the probability of detecting changes. Publication bias was not apparent from the funnel plots; however, some biogeographical bias might be present as most studies originated from the UK [see Additional file 2].

Conclusions
The present study shows that, in most cases, delaying the first mowing date in European meadows has either positive or neutral effects on plant and invertebrate biodiversity. Our MA also provides evidence of between-study heterogeneity, emphasizing that factors other than mowing date might play an important role, a topic which deserves further investigation. These findings have particular relevance to all agri-environment schemes (AES) where the date of first mowing is strictly regulated. They are also important for the management of low input meadows, where delaying mowing may improve and secure primary production. It has been shown that primary productivity in more diverse plant communities is more stable and resilient to disturbances [61]. In addition to agricultural grasslands, open nature reserves are often mown [e.g. 62,63]. When conservation is the primary goal of such management, the first possible mowing date should be considered carefully.
Plant species richness reacted differently according to the way mowing was postponed. Delaying mowing from spring to summer had a positive effect, while delaying either from spring to fall, or from early summer to late summer, or from summer to fall had a negative effect (Figure 2). A longer time interval between the early and the delayed cut was expected to have a greater positive impact, but the time interval was, in fact, not significant.
Invertebrates were expected to show an even stronger response to delayed mowing than plants, due to their heavy dependence on vegetation structure [33,64,65] and high susceptibility to mechanized harvesting processes [66]. However, only invertebrate species richness showed a clear overall significant positive response (Figure 4), while no effect was detected on invertebrate abundance. It was only after removing two studies [59,60] contradicting basic MA assumptions that delaying the first mowing date was found to have a positive effect on invertebrate abundance (Figure 5).
Figure 3. Hedges'd versus a) study duration, or b) date of the early cut. Standardized mean differences (Hedges'd) of the effect of delaying the first mowing date on plant species richness as a function of a) study duration (in years), or b) the date of the early cut (control plot). The size of the dots reflects the weight of the study.
The types of meadow considered in this review, both from a phytosociological viewpoint (e.g. Arrhenatherion, Mesobromion, Filipendulion or Caricion) and a functional perspective (e.g. hay or litter meadow), are also believed to interact with the effects of delaying mowing. Unfortunately, the variety of meadow types across studies yielded an insufficiently balanced sample to enable investigation of the influence of that moderator. For the same reason, we were unable to consider effects on specific invertebrate taxa, notwithstanding that responses are also expected to vary with respect to taxa body size, mobility, and life history traits [27,62,67,68]. Extensification of grassland management practices is reported to positively affect general plant and invertebrate biodiversity [e.g. 32,69,70], which is confirmed by this MA. However, contrary to some other studies [e.g. 71], we could not detect any conservation conflicts between our two main focal taxa (plants and invertebrates), whereby some practices benefit one taxon to the detriment of the other.

Evidence of effectiveness and management recommendations
This review confirms that postponing mowing from spring (May-June) to summer (July-September) is appropriate to promote plant and invertebrate diversity. In contrast, postponing mowing from spring to fall (October-November) or from early summer (July) to late summer or fall may have a negative impact on plant species richness. Invertebrates might still benefit from it, but these two postponing schemes could not be differentiated due to small sample size. Regarding wet and litter meadows, a late cut (September or later) is usually recommended [72], but unfortunately we are not in a position to confirm this recommendation, in the absence of habitat-specific analyses.
When postponing mowing cannot be done at the field scale, leaving uncut grass areas within the cultivated landscape matrix can be an alternative solution to favour plants and animals [see also [73][74][75][76]]. At the landscape scale, creating a mosaic of different mowing regimes will increase species diversity, as there is no single appropriate mowing time that suits all organisms [33,54,77]. In addition to the date of first mowing, a low annual cutting frequency also promotes wild plants [78] and invertebrates [79,80]. There was no single study on birds that complied with our selection criteria. However, all studies on ground-nesting birds recommend postponing mowing until after fledglings have left the nests [e.g. [81][82][83][84][85]]. These management recommendations do not apply everywhere and must be related to the socio-economic context. For example, in highly fertilized systems (high intensity management) biodiversity is generally too low for these measures to have positive effects [e.g. 86].

Implications for further research
Our review focuses solely on the general effect of delaying the first mowing date upon plant and invertebrate species richness as well as invertebrate abundance. Some general trends could be extracted from the scientific literature, but there is still considerable uncertainty concerning the estimated effect sizes, since the influence of several moderators has barely been investigated. Altogether, invertebrates were far less documented than plants, with only seven studies of the impact of delaying mowing on species richness and/or abundance, and even these showed a major geographical bias (six studies from the UK and one from Finland). Clearly this is not sufficient to get the full picture: further long-term, experimental investigations of target taxonomic groups and species regarding responses to mowing regimes are needed. This lack of invertebrate studies is true not only for mowing but also for all factors that may influence grassland invertebrates, such as grazing, habitat fragmentation and management intensity [87]. Only experimental work can disentangle the effects of various, often concomitant management practices (e.g. mowing date and fertilizer application). We thus encourage experiments where management practices are investigated in a full factorial design or where a single management practice is tested against a control plot or field that differs only in regard to this practice. Field-scale experiments should be preferred to plot-scale experiments, especially when investigating animals that can move from one plot to another. Additionally, landscape characteristics are known to influence communities of plants and animals within farmland, and should therefore be accounted for in any attempt to model the effects of management practices on those communities [88].
provided thorough editing of the manuscript. All authors commented and approved the final manuscript.