Searching for articles
The final search string will be:
(grassland* OR meadow* OR pasture*) AND (restor* OR seed addition OR seed transfer OR hay transfer OR sow* OR strew*) AND (*diversity OR enhance* OR success OR richness OR establish*)
This search string was developed following the recommendations of the CEE Guidelines. Scoping was done on the Web of Science database. It started with a first version of the search string, which was developed by extracting important terms from our test list of key studies in this field (see Additional file 1). The hits of this first search string were compared to the reference lists of two independent reviews on this topic [15, 28]. The search string was then adapted accordingly, yielding the final version with population, intervention and outcome terms from the question components. The population terms (grassland* OR meadow* OR pasture*) cover the desired final and/or the initial studied population. The intervention terms (restor* OR seed addition OR seed transfer OR hay transfer OR sow* OR strew*) are used repeatedly in grassland restoration and re-creation studies and ensure the inclusion of the intervention methods targeted by our review. The outcome terms (*diversity OR enhance* OR success OR richness OR establish*) cover the variety of results related to changes in biodiversity. No comparator terms were included in the search string, since our desired comparator (control site with no intervention) was not always mentioned in the title or abstract. If the search engine allows it, the search will be restricted to the research areas of Ecology, Restoration and Conservation Biology and related areas. Depending on the database, this will be done by adding further terms or through further refinement in the advanced search mode, e.g. in Web of Science by adding the terms AND SU = (Agriculture OR Biodiversity & Conservation OR Environmental Sciences & Ecology OR Evolutionary Biology OR Plant Sciences OR Zoology).
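As an illustration of how the search string is composed from the question components, the following minimal Python sketch joins the population, intervention and outcome terms with OR within a group and AND between groups. The term lists and the research areas are copied from the text above; the helper names (`or_block`, `build_search_string`) are hypothetical and only serve this example.

```python
# Term groups copied from the protocol text; helper names are illustrative only.
POPULATION = ["grassland*", "meadow*", "pasture*"]
INTERVENTION = ["restor*", "seed addition", "seed transfer",
                "hay transfer", "sow*", "strew*"]
OUTCOME = ["*diversity", "enhance*", "success", "richness", "establish*"]

# Web of Science research areas named in the protocol for the SU = (...) filter.
RESEARCH_AREAS = [
    "Agriculture",
    "Biodiversity & Conservation",
    "Environmental Sciences & Ecology",
    "Evolutionary Biology",
    "Plant Sciences",
    "Zoology",
]


def or_block(terms):
    """Join the terms of one question component with OR inside parentheses."""
    return "(" + " OR ".join(terms) + ")"


def build_search_string(groups, research_areas=None):
    """Join the component blocks with AND; optionally append the SU filter."""
    query = " AND ".join(or_block(group) for group in groups)
    if research_areas:
        query += " AND SU = " + or_block(research_areas)
    return query


search_string = build_search_string([POPULATION, INTERVENTION, OUTCOME])
restricted_search_string = build_search_string(
    [POPULATION, INTERVENTION, OUTCOME], RESEARCH_AREAS
)
print(search_string)
print(restricted_search_string)
```

Keeping the terms in separate lists makes it easy to document later adaptations of a single component without retyping the whole string.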
Relevant literature will be searched in the following bibliographic online databases:
Using the ‘Publish or Perish’ software, which retrieves references from Google Scholar (https://scholar.google.ch/), 1000 references will be checked as well.
On 26 April 2019, a pilot run was conducted in the Web of Science Core Collection with the above search string and the research area restriction (SU = …), which yielded 5751 hits.
Grey literature will be searched in the search engines BASE (https://www.base-search.net/) and Google (https://www.google.ch/), where the first 500 hits will be retrieved and scanned for relevance. Furthermore, we will look for grey literature by asking our stakeholder group and other national and international experts in the field. Finally, the following organizational websites will be searched:
Searches in bibliographic databases will be conducted in English using the above-mentioned search string. Using a simplified, translated search string in English, German, French and Polish, we will conduct additional searches for grey literature in Google Scholar, Google and BASE and go through the above-listed organizational websites in their respective languages.
Assembling a library of search results
All results from the above-mentioned searches will be added to a Mendeley library and duplicates will be removed.
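Duplicate removal will be handled within the reference manager. Purely to illustrate the idea, a minimal Python sketch of one possible matching rule is given below. It assumes that each record is a dictionary with optional `doi` and `title` keys (a hypothetical export format, not the reference manager's own) and matches on the DOI where available, otherwise on a normalized title.

```python
import re


def record_key(record):
    """Return a comparison key: the DOI if present, else a normalized title."""
    doi = (record.get("doi") or "").strip().lower()
    if doi:
        return ("doi", doi)
    title = (record.get("title") or "").lower()
    # Collapse punctuation and whitespace so minor formatting differences match.
    title = re.sub(r"[^a-z0-9]+", " ", title).strip()
    return ("title", title)


def remove_duplicates(records):
    """Keep the first occurrence of each record, preserving the input order."""
    seen = set()
    unique = []
    for record in records:
        key = record_key(record)
        if key in seen:
            continue
        seen.add(key)
        unique.append(record)
    return unique


# Made-up placeholder records, for demonstration only.
example_records = [
    {"doi": "10.1000/ABC", "title": "Example study A"},
    {"doi": "10.1000/abc ", "title": "Example Study A."},
    {"doi": None, "title": "Example study B"},
    {"title": "example  study b!"},
]
unique_records = remove_duplicates(example_records)
print(len(unique_records))
```

A rule of this kind will miss duplicates where one copy carries a DOI and the other does not, so a manual check of near-identical titles remains advisable.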
Article screening and study eligibility criteria
First, a random sample of 20% of the articles will be screened by the main reviewer at the title level and then at the abstract level. Studies that were conducted outside of Europe, that were not restoration studies or that otherwise do not match our research question will be excluded directly at the title or abstract level. For the remaining articles, a full-text screening will be performed. A second reviewer will perform the same screening process at each screening stage on the same subset of articles, and Cohen’s kappa will be used to check for inclusion consistency. If the kappa score is below 0.6, the inconsistencies among the reviewers will be discussed and the inclusion criteria redefined if necessary. The screening will then be repeated by both reviewers and inclusion consistency will be checked again. Once inclusion consistency is met, the main reviewer will finish the screening of the remaining articles.
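To make the consistency check concrete, the sketch below computes Cohen’s kappa as (p_o - p_e) / (1 - p_e), where p_o is the observed proportion of agreement and p_e the agreement expected by chance from each reviewer’s marginal frequencies. The function name and the example decisions are made up for illustration; they are not screening results.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two reviewers rating the same articles."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("Both reviewers must rate the same, non-empty set.")
    n = len(ratings_a)
    # Observed agreement: share of articles with identical decisions.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement from each reviewer's marginal category frequencies.
    counts_a = Counter(ratings_a)
    counts_b = Counter(ratings_b)
    categories = set(counts_a) | set(counts_b)
    expected = sum(counts_a[c] * counts_b[c] for c in categories) / (n * n)
    if expected == 1.0:
        return 1.0  # both reviewers used one and the same category throughout
    return (observed - expected) / (1 - expected)


# Made-up include/exclude decisions for ten articles.
reviewer_1 = ["include", "include", "exclude", "exclude", "include",
              "exclude", "exclude", "exclude", "include", "exclude"]
reviewer_2 = ["include", "exclude", "exclude", "exclude", "include",
              "exclude", "include", "exclude", "include", "exclude"]

kappa = cohens_kappa(reviewer_1, reviewer_2)
needs_discussion = kappa < 0.6  # threshold stated in the protocol
print(round(kappa, 3), needs_discussion)
```

In this made-up case the reviewers agree on 8 of 10 articles, yet kappa stays below the 0.6 threshold, which shows why raw agreement alone is not used as the criterion.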
The following criteria have to be fulfilled for an article to be included:
Eligible populations: Grasslands in temperate Europe, which we define as being within the Cfb zone according to the Köppen–Geiger climate classification system.
Eligible interventions: The only seed addition methods to be included are hay transfer from a species-rich donor grassland, sowing of seeds originating from a species-rich donor grassland from the respective region or sowing of a commercial seed mixture especially designed for restoration or re-creation purposes of grasslands [19, 23]. Before seed addition the soil has to be disturbed through either ploughing, harrowing or top soil removal.
Eligible comparators: Control sites/plots with no intervention, i.e. no seed/hay added and managed in the same way as the intervention plots.
Eligible outcomes: Species richness, percentage cover (for plants) or abundance (for invertebrates), or any biodiversity index of at least one taxonomic group.
Eligible types of study design: Only experimental studies will be included. These can be published as journal articles, PhD or MSc theses, book chapters, technical reports or other documents that fulfill our criteria.
A list with all excluded studies at abstract and full text level together with the reasons for exclusion will be provided.
Study validity assessment
Eligible studies will go through a critical appraisal of internal validity and will be categorized as having high, medium or low risk of bias with respect to our review question. A similar categorization was used by Jakobsson et al., but it was adapted to fit the purpose of this review. If a study shows high risk of bias and therefore low internal validity, it will be excluded from the synthesis. This will be the case if a study shows at least one of the following limitations:
Intervention and comparator sites are not well matched, e.g. soil conditions differ profoundly.
Severely confounding factors present.
Confounding factors can be the exposure of the intervention and comparator sites to different conditions after restoration/re-creation such as different types of management (mowing vs. grazing or a mix of both). If not excluded so far, a study will be categorized as being of medium internal validity if it matches one of the following conditions:
Study duration < 3 years, i.e. time from restoration/re-creation until last data collection.
Non-random plot allocation.
Because many restoration/re-creation studies are site limited, a completely random plot/site allocation is not always feasible, which increases the risk of selection bias. For this reason, we will also include studies with non-random plot allocation or with no replicates. In addition, if the description of the methods is insufficient, if the data in the results section are difficult to interpret, or if important measurements mentioned in the methods (these could be any of the ones listed in the “Data coding and extraction strategy” section below) are not or only partially reported, we will attempt to contact the corresponding authors in order to obtain the necessary data or explanations. If no answer is received, the respective study will be considered to be of medium internal validity. Studies with medium internal validity will be analyzed separately in a narrative analysis (see “Data synthesis and presentation”).
A subset of 20% of the studies will be appraised by two reviewers independently, and disagreements and the process of their resolution will be reported. The remaining studies will be appraised by the main reviewer. A list of the excluded articles with the reasons for exclusion will be provided. Studies to which none of the above-listed conditions apply will be regarded as having low risk of bias and therefore high internal validity, and as suitable for data extraction.
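The appraisal rules above can be summarized as a simple decision sequence. The following Python sketch is one possible reading of them, not a replacement for reviewer judgement. The class and field names are hypothetical, and treating missing replicates and unanswered author queries as grounds for medium internal validity follows our reading of this section and of the “Data synthesis and presentation” section.

```python
from dataclasses import dataclass


@dataclass
class StudyAppraisal:
    """Appraisal answers for one study (field names are illustrative)."""
    sites_well_matched: bool
    severe_confounding: bool
    duration_years: float
    random_allocation: bool
    has_replicates: bool = True
    missing_data_unresolved: bool = False  # authors contacted, no answer


def internal_validity(study):
    """Return 'low', 'medium' or 'high' internal validity for a study."""
    # High risk of bias: the study is excluded from the synthesis.
    if not study.sites_well_matched or study.severe_confounding:
        return "low"
    # Medium risk of bias: the study goes into the narrative analysis only.
    if (
        study.duration_years < 3
        or not study.random_allocation
        or not study.has_replicates
        or study.missing_data_unresolved
    ):
        return "medium"
    # Low risk of bias: suitable for data extraction.
    return "high"


# Made-up example: well matched, unconfounded, but only two years long.
example = StudyAppraisal(
    sites_well_matched=True,
    severe_confounding=False,
    duration_years=2,
    random_allocation=True,
)
print(internal_validity(example))
```

The exclusion conditions are checked first, so a study that is both short and severely confounded is excluded rather than moved to the narrative analysis.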
Data coding and extraction strategy
As response variables, the mean species richness and, if available, the mean evenness (e.g. Shannon’s index) will be extracted together with their respective standard deviations. If evenness is not provided, we will calculate it from the reported percentage cover (for plants), abundance (for invertebrates) and species richness, if feasible. Data will be obtained either from tables in the manuscripts or from the text. If other measures of variation are provided, such as the standard error, they will be converted into standard deviations. If the values are not provided in the manuscript, we will contact the corresponding author to ask for these values or for the raw data in order to calculate them.
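The two conversions mentioned above can be sketched in a few lines of Python. The standard deviation is recovered from a standard error of the mean as SD = SE × √n. For evenness, the sketch assumes Pielou’s J = H′ / ln(S), with H′ the Shannon index and S the species richness; this particular evenness measure is our assumption for the example, and it requires cover or abundance values per species. All function names are hypothetical.

```python
import math


def se_to_sd(standard_error, n):
    """Convert a standard error of the mean into a standard deviation."""
    if n < 1:
        raise ValueError("Sample size must be at least 1.")
    return standard_error * math.sqrt(n)


def shannon_index(abundances):
    """Shannon index H' from per-species cover or abundance values."""
    total = sum(abundances)
    if total <= 0:
        raise ValueError("At least one species must have a positive value.")
    proportions = [a / total for a in abundances if a > 0]
    return -sum(p * math.log(p) for p in proportions)


def pielou_evenness(abundances):
    """Pielou's J = H' / ln(S); undefined (NaN) for fewer than two species."""
    richness = sum(1 for a in abundances if a > 0)
    if richness < 2:
        return math.nan
    return shannon_index(abundances) / math.log(richness)


# Made-up values for illustration only.
sd = se_to_sd(2.0, 16)
evenness_equal = pielou_evenness([25, 25, 25, 25])
evenness_skewed = pielou_evenness([85, 5, 5, 5])
print(sd, evenness_equal, evenness_skewed)
```

Evenness equals 1 when all species have the same cover and decreases as a few species become dominant, which is why the skewed example yields the lower value.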
Meta-data which could potentially be relevant for comparison among studies will be coded and will include:
Mean annual precipitation
Mean annual temperature
Establishment year of the study
Former land use
Soil conditions before intervention, i.e. pH, N-content and P-content
Plant community of donor site or targeted community
Grassland habitat type, such as: dry, wet or mesotrophic grassland
Number of replicates
Seed addition method, such as: hay transfer, sowing of collected seeds from donor site or sowing of commercial seed mixture
Soil disturbance, such as: ploughing, harrowing or top soil removal
Management after initial restoration, such as: grazing, mowing or mulching.
Meta-data will be coded from tables or from the text in the manuscript. If the altitude or the climatological data are not provided in the original study, they will be obtained from the WorldClim database. If any of the other data cannot be found in the text, the authors will be contacted. The extracted data will be made available as an additional file.
In order to ensure consistency, data of a random set of five articles will be coded and extracted by two reviewers. In case of disagreement in the coding, the results will be discussed among the reviewers. Once agreement is met, data of the remaining articles will be coded and extracted by the main reviewer.
Potential effect modifiers/reasons for heterogeneity
Publications about grassland restoration or re-creation use data from experiments ranging in their study duration from 1 year to over 10 years. Especially in the first few years, the plant composition can fluctuate from one year to the next. For this reason, the study duration has a high potential to be an effect modifier. Soil conditions such as nutrient content can also play an important role in the success of the restoration. Soil measurements are not always performed before the restoration, but the former land use can be a good proxy for them, e.g. a highly intensive crop field with regular nutrient input via manure addition versus an extensively managed meadow. Finally, the climatic conditions can also influence the outcome. The list of potential effect modifiers is based on a previous literature search that we conducted and on expert knowledge, but it is not exhaustive and will be adapted during the review process if necessary.
Data synthesis and presentation
Due to logistical constraints, seed addition experiments for grassland restoration and re-creation are often limited to few or no replicates. Studies with non-random plot allocation, no replicates or where no standard deviation can be retrieved will be used for a narrative analysis (medium internal validity, see “Study validity assessment”), i.e. including descriptive statistics and brief descriptions from a selection of individual studies and their findings. If enough studies with replicates and their respective means and variances are found, a quantitative meta-analysis will be conducted. This meta-analysis will be done in R with the metafor package. Although we will use species richness as a common measure with the same unit, i.e. number of species, the methods with which species richness was assessed might differ from study to study, e.g. in the plot size used for the measurement. For this reason, we will calculate the standardized mean difference (Hedges’ d) and/or the response ratio for species richness together with the variances for each study. The same will be done for other measures such as cover (for plants), abundance (for invertebrates) and species evenness, if enough studies provide these values. Assuming heterogeneity between the studies, we will use for the further inferential analysis the random-effects model with unweighted estimation with the restricted maximum likelihood estimator if we have many studies, i.e. > 10; otherwise we will use the fixed-effects model with weighted estimation. Moderators will be added (see “Potential effect modifiers” section above) and their relative importance in explaining the variance will be assessed with the τ², I² and Q values. Furthermore, to check the robustness of the results, the risk of publication bias will be assessed with funnel plots and the p-uniform function from the puniform package [42, 43], and a sensitivity analysis will be carried out.
Finally, knowledge gaps and clusters will be identified in the field of grassland restoration and re-creation. Focus will be given to the different species groups included in the studies. While the reviews done on this topic so far mostly included plants as diversity measures [15, 27, 28, 29], an underrepresentation of other species groups, such as invertebrates, is expected. Moreover, we will check whether certain seed addition methods are used more frequently than others. To do so, studies with high and medium internal validity will be counted according to the above-mentioned categories (i.e. studies on plants or invertebrates, hay transfer vs. direct seeding etc.). The entire protocol complies with the ROSES reporting standards (see Additional file 2).
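The two effect sizes named above can be illustrated with a short Python sketch. The analysis itself is planned in R with the metafor package, so the code below only shows the arithmetic, using the commonly applied small-sample correction J = 1 - 3 / (4(n1 + n2 - 2) - 1) and large-sample variance approximations. These formula choices and the function names are our assumptions for the example.

```python
import math


def hedges_d(mean_t, sd_t, n_t, mean_c, sd_c, n_c):
    """Bias-corrected standardized mean difference and its variance."""
    df = n_t + n_c - 2
    pooled_sd = math.sqrt(((n_t - 1) * sd_t**2 + (n_c - 1) * sd_c**2) / df)
    correction = 1 - 3 / (4 * df - 1)  # small-sample correction factor J
    d = correction * (mean_t - mean_c) / pooled_sd
    variance = (n_t + n_c) / (n_t * n_c) + d**2 / (2 * (n_t + n_c))
    return d, variance


def log_response_ratio(mean_t, sd_t, n_t, mean_c, sd_c, n_c):
    """Log response ratio ln(treatment / control) and its variance."""
    if mean_t <= 0 or mean_c <= 0:
        raise ValueError("Both means must be positive.")
    ln_rr = math.log(mean_t / mean_c)
    variance = sd_t**2 / (n_t * mean_t**2) + sd_c**2 / (n_c * mean_c**2)
    return ln_rr, variance


# Made-up species richness values: seeded plots vs. control plots.
d, var_d = hedges_d(10, 2, 5, 8, 2, 5)
ln_rr, var_ln_rr = log_response_ratio(10, 2, 5, 8, 2, 5)
print(d, var_d, ln_rr, var_ln_rr)
```

Both measures are unit-free, so richness values recorded on different plot sizes can be compared across studies. The response ratio additionally requires positive means in both groups.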