Search strategy
Most of the evidence synthesised in this systematic review was selected from the recently compiled systematic map of biodiversity impacts of roadside management [18]. The systematic map was based on literature searches using 13 publication databases, four search engines and 36 specialist websites. The majority of these searches were performed in October–December 2015. English was the primary search language, but searches on specialist websites for relevant literature in Danish, Dutch, French, German, Norwegian, Spanish or Swedish were made using search terms in these languages. We checked the comprehensiveness of our searches using the bibliographies of five literature reviews (see Bernes et al. [17] for details on the search strategy and a full list of specialist websites and literature reviews).
When deciding whether an article included in the systematic map was also eligible for inclusion in the review, we used the criteria described in the next section. This set of eligibility criteria is a more restrictive version of that used for the systematic map.
To identify more recently published literature on the specific topic of the systematic review, we also performed a search update, using the following subset of the search terms used for the systematic map:
- Population: roadside*, “road side*”, (road* AND (verge* OR edge*)), roundabout*, “traffic island*”, “median strip*”, “central reservation*”, boulevard*, parkway*, (avenue* AND tree*).
- Outcomes: *diversity, species, abundance, vegetation.
The terms within the ‘population’ and ‘outcomes’ categories were combined using the Boolean operator ‘OR’. The two categories were then combined using the Boolean operator ‘AND’. An asterisk (*) is a ‘wildcard’ that represents any group of characters, including no character. The search terms were identical to the original search terms except for “dispers*”, which was not included in this updated search because species dispersal was not of interest for this review (and we did not find any relevant studies on this topic in the systematic map).
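The combination logic described above can be illustrated with a short Python sketch. It is only an illustration, not the exact string submitted to any database: the function name `build_query` is hypothetical, and real databases such as Scopus or TRID have their own field codes and quoting rules (the searches actually run are documented in Additional file 1).

```python
# Search terms as listed above. The names POPULATION, OUTCOMES and
# build_query are hypothetical and used for illustration only.
POPULATION = [
    'roadside*', '"road side*"', '(road* AND (verge* OR edge*))',
    'roundabout*', '"traffic island*"', '"median strip*"',
    '"central reservation*"', 'boulevard*', 'parkway*',
    '(avenue* AND tree*)',
]
OUTCOMES = ['*diversity', 'species', 'abundance', 'vegetation']


def build_query(population, outcomes):
    """Join terms within each category with OR, then join the two categories with AND."""
    population_block = " OR ".join(population)
    outcomes_block = " OR ".join(outcomes)
    return f"({population_block}) AND ({outcomes_block})"


query = build_query(POPULATION, OUTCOMES)
print(query)
```

Keeping the two term lists separate makes it easy to see that “dispers*” is simply absent from the outcomes list in this updated search.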
The search update was performed in May 2017 and covered literature published in 2015 or later. When making literature searches for the systematic map, we found that about 90% of recent studies eventually included as relevant had been identified through Scopus and/or Transport Research International Documentation (TRID). Therefore, we considered it sufficient to base the search update on these two resources, with a supplementary search in Google Scholar. When searching in Google Scholar, we examined the first 200 hits (based on relevance) for appropriate data. No language or document type restrictions were applied, but searches were conducted using English search terms only. A detailed description of our searches for literature is available in Additional file 1.
Article screening and study eligibility criteria
Articles identified during the search update were evaluated for inclusion at three successive levels. First, they were assessed by title. Next, each article found to be potentially relevant on the basis of title was judged for inclusion on the basis of abstract. Finally, each article found to be potentially relevant on the basis of abstract was judged for inclusion based on the full text. At all stages of this screening process, the reviewer tended towards inclusion in cases of uncertainty.
The screening was performed by reviewers who participated in the main screening of studies for the systematic map and who were therefore well acquainted with the relevant literature and with the study eligibility criteria. The screening of articles from the search update could be seen as a continuation of the main screening, for which detailed, multi-level consistency checking was performed. Articles identified by the primary reviewer as potentially utilisable based on the full text were also assessed by a second reviewer, and reviewers did not assess studies authored by themselves. Final decisions on whether to include doubtful cases were taken by the review team as a whole. A list of studies rejected on the basis of full-text assessment is provided in Additional file 2 together with the reasons for exclusion.
In order to be included in the review, studies included in the systematic map or identified during the search update had to pass each of the following criteria:
- Relevant subjects: Roadsides anywhere in the world. We defined a roadside as the unpaved zone of a road that is exposed to roadside management.
- Relevant types of intervention: Maintenance or restoration of roadsides based on non-chemical vegetation removal such as mowing, grazing, burning, clearance of shrubs and saplings, coppicing, pruning, or mechanical removal of invasive plants.
- Relevant type of comparator: Non-intervention or alternative forms of the interventions. Comparisons can in principle be made both temporally and spatially. Studies with a ‘BA’ (Before/After) design compare data collected at the same site prior to and following an intervention. Other studies may be based on comparison of different parts of a roadside, some that have been subject to a certain kind of management and some that have not. These may be termed ‘CI’ (Comparator/Intervention) studies, or ‘BACI’ (Before/After/Comparator/Intervention) if they present data collected both before and after the intervention.
- Relevant types of outcome: Measures of functional/taxonomic diversity of vascular plants or invertebrates (including abundance of assemblages and single species). Ratings of intervention effects based on visual assessments of vegetation vitality were not considered to be relevant.
- Relevant type of study: Primary field studies (reviews and other secondary compilations were not included).
- Language: Full text written in English, Danish, Dutch, French, German, Norwegian, Spanish or Swedish.
Study validity assessment
Studies that passed the eligibility criteria described above were subject to critical appraisal. Based on assessments of their clarity and susceptibility to bias, they were categorised as having high, low or unclear validity (with regard to our review question).
Studies were excluded from the review due to low validity if any of the following factors applied:
- No true replication (interventions not replicated).
- Intervention and comparator sites not well-matched (sites substantially different before intervention).
- Severely confounding factors present (e.g. additional treatments carried out at the intervention sites but not at the comparator sites).
We also excluded studies that were unclear to such an extent that their validity could not be judged, for instance due to absence of key information on study design. More specifically, we categorised a study as having unclear validity if any of the following factors applied:
- Methodological description insufficient (e.g. not clear to what extent the study was actually conducted at roadsides).
- Intervention data could not be interpreted (e.g. since they consisted of post hoc records such as ‘evidence of mowing’).
If none of the above five factors applied, the study was considered to have high validity and was included in the systematic review.
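The categorisation rule described above can be expressed as a small decision function. The following Python sketch is illustrative only: the factor labels and the function name are hypothetical, and the precedence of ‘low’ over ‘unclear’ when factors of both kinds apply is an assumption (the text does not specify it, and such studies were excluded either way).

```python
# Hypothetical labels for the five factors listed above.
LOW_VALIDITY_FACTORS = {
    "no_true_replication",
    "sites_not_well_matched",
    "severe_confounding",
}
UNCLEAR_VALIDITY_FACTORS = {
    "insufficient_methods_description",
    "uninterpretable_intervention_data",
}


def classify_validity(factors):
    """Return 'low', 'unclear' or 'high' for the factors recorded for one study."""
    factors = set(factors)
    unknown = factors - LOW_VALIDITY_FACTORS - UNCLEAR_VALIDITY_FACTORS
    if unknown:
        raise ValueError(f"Unrecognised factor(s): {sorted(unknown)}")
    # Assumption: 'low' takes precedence if factors of both kinds apply.
    if factors & LOW_VALIDITY_FACTORS:
        return "low"
    if factors & UNCLEAR_VALIDITY_FACTORS:
        return "unclear"
    return "high"  # none of the five factors apply, so the study is included


print(classify_validity([]))
print(classify_validity(["severe_confounding"]))
```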
The validity of each study was assessed by one reviewer and double-checked by another. Reviewers did not assess studies authored by themselves. Final decisions on how to judge doubtful cases were taken by the review team as a whole. A list of studies rejected on the basis of validity assessment is provided in Additional file 3 together with the reasons for exclusion.
Data extraction strategy
Outcome means and measures of variability (standard deviation) or precision (standard error) were extracted from tables, graphs and text in the included articles. When necessary, image analysis software (WebPlotDigitizer, http://arohatgi.info/WebPlotDigitizer/) was used. Extracted outcomes included measures of species richness, species diversity (e.g. diversity indices) and abundance of taxonomic or functional groups of organisms. Data on abundances of individual species were not extracted, since such data have limited relevance to our review question and since few studies had reported on the same species.
Where relevant outcomes had been reported in a format that impeded inclusion in quantitative analyses, study authors were asked to supply raw or summarised digital data instead. Where needed, we compiled and summarised the raw data received. Metadata, such as data on potential effect modifiers (see below), were extracted if present in the published material; no requests were sent for unpublished data of that kind.
Data were recorded using two Excel spreadsheets (see Additional file 4). In the first one, each row represented a comparison between an intervention and a control (no intervention). The second spreadsheet was used for studies where no untreated control was included; here, each row represented a comparison between two different kinds of intervention. Extracted data were examined by a second reviewer.
Potential effect modifiers and reasons for heterogeneity
To the extent that data were available, the following potential effect modifiers were considered and recorded:
Roadside data
- Type, timing and intensity/frequency of roadside management.
- Goals of the management (e.g. conservation/restoration of biodiversity).
- Roadside manager.
- Width, aspect and slope of roadside.
- Type and structure of roadside vegetation.
- Soil type.
- Nutrient status of the soil.
- Shading, e.g. by trees.
Road data
- Road type (width, type of surface).
- Time elapsed since the road (or roadside) was constructed.
- Traffic (no. of vehicles per day).
- Road maintenance (e.g. salting, gritting, dust control, snow clearance).
Study setting
- Geographical coordinates.
- Altitude.
- Mean annual temperature and precipitation.
- Vegetation, land use and history of land use in the surroundings of the road.
Where data on altitude and climate were not available in the included articles, we retrieved them from Google Earth and WorldClim [23], respectively, using the coordinates of study sites.
Data synthesis and presentation
The studies included in this review reported on several different types of vegetation removal (e.g. mowing, burning and grazing) and their effects on different measures of biodiversity (species richness, species diversity and abundance) of vascular plants and invertebrates. However, the only combinations of intervention and outcome that allowed for quantitative analysis, being covered by at least three studies, were the impacts of mowing (including different mowing regimes) on five aspects of plant communities: overall species richness and species diversity of plants, and abundance (cover) of forbs, graminoids and woody plants. Data on species richness were included in our analyses only if reported as the total number of vascular plant species (we did not use data on subcategories such as invasive or native species, flowering herbs, annual species etc.). Our analyses of species diversity were limited to data on the Shannon index (H′), since this was the only species diversity index consistently reported across studies.
Due to the heterogeneity of the published statistical analyses in general, and the lack of reported measures of variability in particular (such measures were only available for 37% of the studies), we conducted no meta-analysis of the extracted outcomes. None of the imputation methods described by Wiebe et al. [24] were considered appropriate for the data analysed in this review. Specifically, an approach based on weighted mean imputation of variance would suffer from the large between-study variation, and imputation methods using p values were not considered feasible due to inconsistent reporting of such values.
Therefore, we analysed the effects of mowing on biodiversity using a simplified approach based on one-sample t tests of study-level mean effect ratios. The analysis was conducted exclusively on the effects of mowing and how those effects depended on different mowing regimes. Other effect modifiers were not considered applicable in the analysis as they were not reported consistently across studies.
Effect ratios were calculated according to Eqs. 1 or 2, depending on the study design (CI or BACI):
$$ Effect\;ratio\;\left( {CI} \right) = \frac{BD_{I}}{BD_{C}} $$
(1)
$$ Effect\;ratio\;\left( {BACI} \right) = \frac{BD_{IA} / BD_{CA}}{BD_{IB} / BD_{CB}} $$
(2)
where BD = biodiversity measure, I = intervention, C = comparator, B = before, A = after. Generally, for each analysis, we calculated the average effect ratio across all relevant comparisons reported by a study. However, where studies provided separate outcomes for different road types (e.g. highways and rural roads), we calculated separate means for each road type. Where outcomes had been reported over several years, we only used data from the final year of sampling.
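As an illustration of Eqs. 1 and 2 and of the averaging within a study, the calculation can be sketched in Python. The function names and the example numbers are hypothetical and are not taken from any included study; the sketch also assumes that all comparator values are non-zero, since the ratios are otherwise undefined.

```python
from statistics import mean


def effect_ratio_ci(bd_i, bd_c):
    """Eq. 1: intervention value divided by comparator value."""
    return bd_i / bd_c


def effect_ratio_baci(bd_ia, bd_ca, bd_ib, bd_cb):
    """Eq. 2: after-intervention ratio divided by before-intervention ratio."""
    return (bd_ia / bd_ca) / (bd_ib / bd_cb)


def study_mean_effect_ratio(ratios):
    """Average effect ratio across all relevant comparisons from one study."""
    return mean(ratios)


# Made-up example: 12 species on mown plots vs. 10 on unmown plots (CI design).
ci = effect_ratio_ci(12, 10)

# Made-up example: the same after-values, with 8 species on both plot types before.
baci = effect_ratio_baci(12, 10, 8, 8)

# Two comparisons reported by one hypothetical study.
print(study_mean_effect_ratio([ci, 0.9]))
```

A ratio above one indicates a higher value of the biodiversity measure under the intervention than under the comparator, and a ratio below one indicates a lower value.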
To the extent possible, we analysed how each of the five plant-community aspects was affected by (i) mowing in general (vs. no mowing), (ii) specific mowing regimes (vs. no mowing), and (iii) differences between mowing regimes. We characterised the mowing regimes based on three elements: (1) frequency, i.e. number of treatments per year, (2) timing of the treatment(s), and (3) hay treatment (removal or no removal). When examining specific mowing regimes, we analysed the effects of each of these three elements separately (based on comparisons where the other two elements had been kept unchanged), but we also analysed the effects of specific combinations of mowing frequency and hay treatment. Similarly, when examining differences between mowing regimes, we made separate analyses of the effects of mowing frequency, timing of mowing, and hay treatment. These analyses were made both across all available data on such changes, and on subsets with specific selections of regime elements (e.g. comparisons between mowing once and twice per year that only included cases with hay removal).
Each set of study-mean effect ratios was analysed using a one-sample t test, testing the null hypothesis that the mean effect ratio did not differ from one. This was done for both weighted and unweighted data. Adjusted from [25], weighting was based on the number of replicate treatments (n) and the sampling area within each replicate (s), following Eq. 3. In the original equation, Steward et al. [25] multiplied the replicate term by both area sampled and plot area, but as we used data from both observational and experimental studies we applied only sampling area within replicates in our calculation.
$$ {\text{Observation}}\;{\text{weight = }}\sqrt[3]{{\frac{{{{\text{n}}_{\text{I}}} \times {{\text{n}}_{\text{C}}}}}{{{{\text{n}}_{\text{I}}}{ + }\;{{\text{n}}_{\text{C}}}}} \times \frac{{\sqrt {\text{s}} }}{2}}} $$
(3)
Thus, weighted analyses were restricted to studies reporting both n and s. To be able to compare results based on weighted and unweighted data, we also made unweighted analyses using the same subsets of studies as for the weighted analyses.
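For illustration, Eq. 3 and the unweighted one-sample t statistic can be sketched in Python using only the standard library. The function names and example values are hypothetical. The sketch deliberately does not show how the weights enter the weighted t test, since that is not specified above, and it stops at the t statistic and degrees of freedom; a p value would additionally require the t distribution from a statistics library.

```python
import math
from statistics import mean, stdev


def observation_weight(n_i, n_c, s):
    """Eq. 3: cube root of [n_I * n_C / (n_I + n_C)] * [sqrt(s) / 2]."""
    return ((n_i * n_c) / (n_i + n_c) * math.sqrt(s) / 2) ** (1 / 3)


def one_sample_t(ratios, mu0=1.0):
    """Unweighted one-sample t statistic for H0: mean effect ratio equals mu0.

    Returns the t statistic and the degrees of freedom (n - 1).
    Requires at least two ratios that are not all identical.
    """
    n = len(ratios)
    standard_error = stdev(ratios) / math.sqrt(n)
    return (mean(ratios) - mu0) / standard_error, n - 1


# Made-up example: 4 intervention and 4 comparator replicates,
# each sampled over an area of 16 (in whatever area unit the study reports).
weight = observation_weight(4, 4, 16)

# Made-up study-level mean effect ratios from three studies.
t_stat, df = one_sample_t([1.1, 1.3, 1.2])
print(weight, t_stat, df)
```

Because the weight requires both n and s, a study missing either value cannot be given a weight, which is why the weighted analyses use a subset of the studies.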
Additionally, we analysed the difference between effects on forbs and graminoids with paired t tests, using only studies that had reported abundances of both groups.
We were not able to examine the possible influence of publication bias on the synthesis because of the incomplete reporting of variation, precision and statistical significance in the included studies. The findings of studies included in the review have also been summarised in a narrative synthesis.