How effective are created or restored freshwater wetlands for nitrogen and phosphorus removal?

Eutrophication of aquatic environments is a major environmental problem in large parts of the world. In Europe, EU legislation (the Water Framework Directive and the Marine Strategy Framework Directive), international conventions (OSPAR, HELCOM) and national environmental objectives emphasize the need to reduce the input of plant nutrients to freshwater and marine environments. A widely used method to achieve this is to let water pass through a constructed or restored wetland (CW). However, the large variation in measured nutrient removal rates in such wetlands calls for a systematic review. The objective of this review is to quantify nitrogen and phosphorus removal rates in constructed or restored wetlands and relate them to wetland characteristics, loading characteristics, and climate factors. Wetlands are created to treat water from a number of different sources. Sources that will be considered in this review include agricultural runoff and urban storm water runoff, as well as aquaculture wastewater and outlets from domestic wastewater treatment plants, with particular attention to the situation in Sweden. Although the performance of wetlands in temperate and boreal regions is most relevant to the Swedish stakeholders, a wider range of climatic conditions will be considered in order to make a thorough evaluation of climatic factors. Searches for primary studies will be performed in electronic databases as well as on the internet. One author will perform the screening of all retrieved articles at the title and abstract level. To check that the screening is consistent and complies with the agreed inclusion/exclusion criteria, subsets of 100 articles will be screened by the other authors. When screening at full-text level the articles will be evenly distributed among the authors. Kappa tests will be used to evaluate screening consistency. Data synthesis will be based on meta-regression. The nutrient removal rates will be taken as response variables and the effect modifiers will be used as explanatory variables. More specifically, the meta-regression will be performed using generalized additive models that can handle nonlinear relationships and major interaction effects. Furthermore, subgroup analyses will be undertaken to elucidate statistical relationships that are specific to particular types of wetlands.
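The kappa test mentioned for screening consistency can be illustrated with a short Python sketch. It assumes Cohen's kappa for two screeners, which is one common form of the statistic; the text does not say which variant was used. The function name and the include/exclude decisions are hypothetical and are not data from the review.

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters' categorical decisions on the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: share of items where both raters made the same call.
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Expected chance agreement from each rater's marginal label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / n**2
    if p_expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (p_observed - p_expected) / (1.0 - p_expected)

# Hypothetical screening decisions for ten abstracts (illustration only).
main_screener = ["include", "exclude", "exclude", "include", "exclude",
                 "exclude", "include", "exclude", "exclude", "exclude"]
second_screener = ["include", "exclude", "include", "include", "exclude",
                   "exclude", "exclude", "exclude", "exclude", "exclude"]
kappa = cohens_kappa(main_screener, second_screener)
print(f"kappa = {kappa:.2f}")
```

Kappa corrects the raw agreement (8 of 10 here) for the agreement expected by chance, so it is lower than the raw share whenever one decision dominates, as "exclude" usually does in title and abstract screening.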


Summary
Eutrophication of aquatic environments is a major environmental problem in large parts of the world. A widely used method for reducing the input of nutrients into freshwater and marine environments is to allow water to pass through a created or restored wetland. However, the large variation in measured nutrient removal rates in such wetlands has made it difficult to assess the efficiency of such interventions. The systematic review summarized below compiles all the available evidence, synthesizes the results and assesses the overall effect. We also examine various effect modifiers and show that some conditions are more favourable than others.
The literature search generated almost 6000 unique articles and reports. All these were screened for relevance, and all the relevant studies were critically appraised. In the end 93 articles, which contained studies of 203 wetlands, were used for data extraction. Most of the wetlands studied are located in Europe and North America. Quantitative synthesis consists of meta-analyses and response surface analyses. Regressions were performed using generalized additive models that can handle nonlinear relationships and interaction effects.
While the removal rates of both total nitrogen (TN) and total phosphorus (TP) are highly dependent on the loading rate, median values were found to be 93 and 1.2 g·m⁻²·yr⁻¹, respectively. Removal efficiency, or relative load reduction, for TN was significantly correlated to hydraulic loading rate (HLR) and annual average temperature (T). The median value was 37%, with a 95% confidence interval of 29-44%. Removal efficiency for TP was significantly correlated to inlet TP concentration, HLR, T and wetland area. Median TP removal efficiency was 46%, with a 95% confidence interval of 37-55%.
On average, created and restored wetlands significantly reduce the transport of TN and TP and may thus be effective in efforts to counteract eutrophication. However, restored wetlands on former farmland were significantly less efficient than other wetlands at TP removal. In addition, wetlands with precipitation-driven HLRs and/or hydrologic pulsing show significantly lower TP removal efficiencies compared to wetlands with controlled HLRs.
Why create and restore wetlands?
In Europe, as in many other parts of the world, nutrient enrichment of water bodies is a major environmental problem. Several EU directives emphasize the need to reduce the input of nutrients to both freshwater and marine ecosystems (e.g. the Water Framework Directive, the Marine Strategy Framework Directive and the Nitrates Directive). This is also an important part of the Helsinki Commission (HELCOM) Baltic Sea Action Plan, which contains several suggested measures targeting nutrient losses from agricultural land. Wetland creation is one of these, as it is known that the biogeochemical transformations that occur in wetlands generally result in a reduction in the nutrient content of the water flow.
Wetlands once covered a large proportion of the land area, but in many parts of the world they were drained or filled in order to create new forest and arable land. In Sweden, only a fraction of the mires, wet woodlands, wet meadows and transition zones between land and water that existed in the 19th century remain. However, in recent decades the ecosystem services provided by wetlands have been increasingly acknowledged. To compensate for the massive loss of wetland areas, wetland creation has been practiced in Sweden on a fairly large scale since the 1990s, initially focused on nitrogen removal and biodiversity enhancement. Thousands of hectares of wetlands have been financed through various governmental initiatives.

Different types of wetlands
Created wetlands are wetlands on land that has never been wetland before. Restored wetlands are wetlands on previously drained land or natural wetland areas that have been altered by other means. Created wetlands can be of different types, and are usually classified as Free Water Surface Constructed Wetlands (FWS), Horizontal Subsurface Flow Constructed Wetlands (HSF) and Vertical Flow Constructed Wetlands (VF).
Free Water Surface Constructed wetlands are usually between 0.1 and 2 m deep, with a plant community composed of a mix of algae and submersed, floating or emergent wetland plants.
Horizontal Subsurface Flow Constructed wetlands are typically designed with a permeable filter material ("soil") planted with emergent wetland plants. Water flows horizontally in and beneath their rhizosphere, which creates a mix of saturated anaerobic and unsaturated aerobic zones. A Vertical Flow Constructed wetland is similarly constructed, but water is applied to the surface of the filtering material and percolates through the rhizosphere. This results in a typically unsaturated aerated "soil". In some VF wetlands the water is applied at the bottom of the filtering material, resulting in an upward flow through the rhizosphere and saturated anaerobic conditions in the soil.
In this systematic review a small number of wetlands were classified as Combined Horizontal wetlands. These are wetlands that have been integrated with other units, such as a sedimentation pond or an overflow unit. Here, nutrient removal was calculated for the whole system. In addition, some wetlands were classified as Riparian wetlands, that is, wetlands at the interface between land and a river or stream.
When wetlands are restored, interventions are typically made to recreate previously drained, or by other means altered, natural wetlands. Restoration refers to recovery of ecological and hydrological processes as well as geomorphology.

Large variations in nutrient removal efficiency and removal rates
There are numerous studies of the physical and biogeochemical processes involved in the removal of both nitrogen and phosphorus. These are therefore relatively well known. Nitrogen removal takes place through: (1) sedimentation and sediment accretion; (2) plant uptake; and (3) denitrification and volatilization. The processes involved in phosphorus removal are: (1) sedimentation and sediment accretion; (2) plant uptake; (3) sorption; and (4) precipitation/co-precipitation.
The success of each of the above-mentioned mechanisms can depend on a large number of factors related to loading characteristics, wetland characteristics and climate. It is therefore not surprising that studies on nutrient removal efficiency and removal rates in wetlands have shown widely varying results. This makes it difficult to assess the extent to which wetland creation is an efficient measure to reduce eutrophication and to fulfil the different Swedish environmental goals related to eutrophication.

What is a systematic review?
In this review, we used a systematic approach to synthesise available evidence on the effects of created and restored wetlands on nitrogen and phosphorus removal. Systematic reviews are entirely based on existing studies. In this respect, they do not differ from ordinary literature reviews of scientific questions.
The difference lies, instead, in the rigour. A systematic review is characterised by meticulous planning, methodical procedures and a transparent, objective and complete documentation of all assessments carried out in the course of the work. This approach is designed to increase reliability and repeatability, avoid bias and facilitate meta-analysis (quantitative conclusions based on data from several different studies).
The purpose of the systematic review summarized below is to clarify whether created and restored wetlands are efficient at removing nitrogen and phosphorus from water, and to identify factors that might explain the large observed variations in removal efficiency and removal rates.

Large body of evidence
The literature searches, which are described in detail in an a priori published protocol, generated approximately 6000 unique articles that could potentially provide useful data. The protocol also specified exactly which studies were eligible for inclusion in the review by defining sets of relevance criteria and quality criteria. After screening, 93 articles, containing studies on 203 different wetlands, were included in the review.
We only included wetlands treating secondary- or tertiary-treated domestic wastewater, urban storm water, lake/river water, freshwater aquaculture effluents and runoff from agricultural fields. Untreated wastewater was not considered since it is not permissible to discharge such water into the environment in most European countries. Industrial or agricultural wastewater can vary considerably in composition, and was therefore also excluded. Farmyard runoff was in most cases classified as agricultural wastewater, and thus excluded, since it is often mixed with untreated parlour washings and silage/farmyard manure effluents, among other things. Another important inclusion criterion was that the wetlands must have been studied for at least one complete annual cycle in order to avoid bias due to seasonal variations.
Due to climatological constraints, most of the studies were performed in Europe and North America, but a small number of studies from Australia, New Zealand and East Asia were included as well (see figure 1). Although not a prerequisite for inclusion, all of the included wetlands were primarily created or restored for the purpose of nutrient removal. However, a few of them were multi-purpose wetlands where additional design constraints had been taken into account.
There was a large span in wetland size. The included wetlands ranged between 1 and 10⁷ m². For comparison, most of the created or restored wetlands in Sweden range between 10² and 10⁵ m² (figure 2a). The span in HLR among the included wetlands was also large. The HLR ranged between 0.1 and 1350 m/yr, but for most wetlands the HLR was below 50 m/yr, and in 90% of cases the HLR was below 150 m/yr (figure 2b).

Wetlands work, but planning and design is important
Most studies on nutrient removal in wetlands do not report any variance in annual removal rates or efficiencies. In some cases this is a consequence of the fact that only one wetland was studied and that the study only lasted for one year. There is thus neither any true replication nor any quasi-replication. In other cases the study of one wetland lasted for several years but only a long-term average was reported with no information about the inter-annual variance.
Average and median values for loading rates, removal rates and removal efficiencies for TN and TP are shown in tables 1 and 2, respectively. The values in these tables are based on all included studies, i.e. those with and without replication. The studies have been assigned to one of two quality categories. Studies with the highest quality were assigned to category 3. To be included in this category the studies must fulfil certain criteria regarding study length, sampling frequency, hydrological control and replication. Studies in category 2 were judged to be less reliable but still good enough to be included in the review.
The annual loading rates of TN in the included wetlands ranged from 2.1 to 2486 g·m⁻²·yr⁻¹ and averaged 505 g·m⁻²·yr⁻¹. The average removal rate of TN was 181 g·m⁻²·yr⁻¹, while the average removal efficiency was 39%. The range in loading and removal rates is quite wide, and the distribution is skewed to the right (median values are much lower than average values). The removal efficiencies are closer to being normally distributed (Figure 3). Although there is no significant difference in average or median removal efficiencies between category 2 studies and category 3 studies, the variability is smaller among category 3 studies. It is worth noting that none of the wetlands among the category 3 studies had negative removal rates.
As in the case of TN, the spans in the loading and removal rates of TP are quite large (table 2). The average loading rate and removal rate were 36 and 13 g·m⁻²·yr⁻¹, respectively. The median values are much lower, indicating a skewed distribution (figure 4). There is no significant difference in average TP removal efficiencies between category 2 studies and category 3 studies but, as with TN, the variability between wetlands was smaller among the category 3 studies compared to the category 2 studies.
Replicated studies can be used in meta-analyses. The advantage of such analyses is that high quality studies with large data sets can be given more weight in the quantitative synthesis, and that uncertainties in the overall effect can be calculated. In this systematic review we have used standard methods for meta-analyses where the log response ratio is the effect size. The log response ratio (ln R) is defined as ln R = ln(Load_out / Load_in), so that negative values indicate a net reduction in load.
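How the log response ratio feeds into a weighted summary can be sketched as follows. This is a minimal Python illustration, not the review's actual procedure: it assumes simple inverse-variance (fixed-effect) weighting, whereas the text only states that standard meta-analysis methods were used. All function names, loads and variances are hypothetical.

```python
import math

def log_response_ratio(load_in, load_out):
    """ln R = ln(Load_out / Load_in); negative values mean net removal."""
    if load_in <= 0 or load_out <= 0:
        raise ValueError("loads must be positive to take a log ratio")
    return math.log(load_out / load_in)

def weighted_summary(effects, variances):
    """Inverse-variance weighted mean effect and its standard error."""
    weights = [1.0 / v for v in variances]
    mean = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    standard_error = math.sqrt(1.0 / sum(weights))
    return mean, standard_error

# Hypothetical annual loads (g per m2 per yr) and ln R variances,
# made up for illustration only.
studies = [
    {"load_in": 100.0, "load_out": 60.0, "var": 0.01},
    {"load_in": 400.0, "load_out": 300.0, "var": 0.04},
    {"load_in": 50.0, "load_out": 25.0, "var": 0.02},
]
effects = [log_response_ratio(s["load_in"], s["load_out"]) for s in studies]
mean_ln_r, se = weighted_summary(effects, [s["var"] for s in studies])
print(f"summary ln R = {mean_ln_r:.3f} +/- {se:.3f}")
```

Studies with a small variance (long records, many replicates) receive large weights, which is the sense in which high quality studies count for more in the synthesis.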
The average effect size for various groups of wetlands is shown in Figure 5 for TN and in Figure 6 for TP. Numerical values are presented in Table 3. For TN the overall summary effect (ln R±1 S.E.) is -0.46±0.05, which represents a median TN removal ratio (R) of 0.63. This means that the median TN load reduction, or removal efficiency, is 37%, with a 95% confidence interval of 29-44%. The TN removal efficiency is significantly higher in wetlands receiving tertiary treated wastewater than in wetlands receiving secondary treated wastewater. No significant differences can be demonstrated between other groups of wetlands.
The variation in removal efficiency is generally larger for TP than for TN, resulting in wider 95% confidence intervals in the averages for the different groups of wetlands. For TP the overall summary effect (ln R±1 S.E.) is -0.62±0.08, which represents a median TP removal ratio (R) of 0.54. This means that the median TP load reduction, or removal efficiency, is 46% with a 95% confidence interval of 37-55%. Subgroup analysis shows that land use history and flow regime may influence TP removal efficiency.
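The back-transformation from ln R to a percentage load reduction can be checked with a few lines of Python. This is a minimal sketch: the function name is hypothetical, and the interval uses a normal approximation (z = 1.96), which is an assumption and may not be how the review computed its intervals. With the summary effects quoted above, it gives about 37% for TN and 46% for TP, with interval bounds within one to two percentage points of the reported ones.

```python
import math

def removal_efficiency(ln_r, se, z=1.96):
    """Convert a summary ln R and its standard error to % load reduction.

    Returns (estimate, lower, upper) in percent. The interval is a normal
    approximation (z = 1.96), which is an assumption of this sketch.
    """
    estimate = 100.0 * (1.0 - math.exp(ln_r))
    # A more negative ln R means more removal, so the bounds swap sides.
    lower = 100.0 * (1.0 - math.exp(ln_r + z * se))
    upper = 100.0 * (1.0 - math.exp(ln_r - z * se))
    return estimate, lower, upper

# Summary effects (ln R, 1 S.E.) as quoted in the text.
for nutrient, ln_r, se in [("TN", -0.46, 0.05), ("TP", -0.62, 0.08)]:
    est, low, high = removal_efficiency(ln_r, se)
    print(f"{nutrient}: {est:.0f}% (approx. 95% CI {low:.0f}-{high:.0f}%)")
```

Because the interval is symmetric on the log scale, it becomes asymmetric once converted to percentages.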
TP removal efficiency is significantly lower in restored wetlands on cropland than in other wetlands. The main difference between restored wetlands on cropland and created wetlands on cropland is that to restore a wetland, it is not really necessary to excavate the soil extensively since the location can naturally accommodate a wetland. In principle, it is thus sufficient to just stop draining the area. One possible explanation for the observed difference could be that in restored wetlands, accumulated phosphorus in the agricultural soil is released when the conditions are changed. In addition, the grouping by water regime suggests that wetlands with precipitation-driven HLR are significantly less effective than wetlands with other water regimes. This is also true when the restored wetlands on formerly drained cropland are excluded (Figure 6d). If such wetlands were included, the difference between precipitation-driven and other wetlands would appear to be even more significant (data not shown). Inclusion or exclusion of restored wetlands on formerly drained cropland does not alter the general patterns shown in the other subgroup analyses. TP removal efficiency also tends to be higher in climates with hot summers, but the 95% confidence intervals overlap.
To further examine the importance of various effect modifiers we performed response surface analyses using generalized additive models (GAMs), taking the potential pairwise interaction effects of the predictors into account by allowing thin plate splines (TPS) in the GAMs. This type of regression analysis is based on mean values per wetland study, and the response surfaces derived illustrate how estimates of median removal efficiency and median removal rate are influenced by various effect modifiers.
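The idea of a thin plate spline response surface over two predictors can be illustrated with the sketch below. It is not the GAM fitting used in the review: it is a plain two-dimensional thin plate smoothing spline solved directly with NumPy, with a fixed penalty instead of an estimated smoothing parameter. All function names are hypothetical and the data are synthetic, generated only to make the example run.

```python
import numpy as np

def tps_kernel(a, b):
    """Thin plate spline basis r^2 * log(r) between two sets of 2-D points."""
    r = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)
    with np.errstate(divide="ignore", invalid="ignore"):
        k = r**2 * np.log(r)
    return np.where(r > 0, k, 0.0)  # the limit of r^2 log r at r = 0 is 0

def fit_tps(points, values, smoothing=0.0):
    """Solve for the spline weights and the linear part (intercept, slopes)."""
    n = len(points)
    poly = np.hstack([np.ones((n, 1)), points])
    system = np.zeros((n + 3, n + 3))
    system[:n, :n] = tps_kernel(points, points) + smoothing * np.eye(n)
    system[:n, n:] = poly
    system[n:, :n] = poly.T  # side condition: weights orthogonal to planes
    rhs = np.concatenate([values, np.zeros(3)])
    coef = np.linalg.solve(system, rhs)
    return coef[:n], coef[n:]

def predict_tps(points, weights, linear, new_points):
    """Evaluate the fitted surface at new 2-D points."""
    poly = np.hstack([np.ones((len(new_points), 1)), new_points])
    return tps_kernel(new_points, points) @ weights + poly @ linear

# Synthetic wetlands (illustration only): log10 HLR, temperature, efficiency.
rng = np.random.default_rng(0)
log_hlr = rng.uniform(-1.0, 3.0, 40)
temperature = rng.uniform(0.0, 20.0, 40)
efficiency = 60.0 - 12.0 * log_hlr + temperature + rng.normal(0.0, 5.0, 40)

# Standardise the predictors so that neither dominates the distance r.
predictors = np.column_stack([log_hlr, temperature])
scaled = (predictors - predictors.mean(axis=0)) / predictors.std(axis=0)

weights, linear = fit_tps(scaled, efficiency, smoothing=5.0)
fitted = predict_tps(scaled, weights, linear, scaled)
print(f"residual SD: {np.std(efficiency - fitted):.2f}")
```

With `smoothing=0.0` the surface passes through every data point. Larger values pull it towards a plane, and choosing that penalty from the data is part of what a GAM fitting routine does.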
TN removal efficiency (% load reduction) was significantly negatively related to HLR. TN removal efficiency was also found to be positively correlated to annual average air temperature.
Other investigated predictors showed non-significant (p>0.05) relationships to TN removal efficiency. Using both HLR and air temperature as predictors in a GAM improved the model fit (reduced the deviance) and demonstrated that the linear response to air temperature was also significant in the presence of a function of hydraulic loading. The model fit was further improved when the one-dimensional splines in log hydraulic loading and air temperature, respectively, were replaced by a thin plate spline that allowed interaction effects between hydraulic loading and air temperature (Figure 7a).
The TN removal rate, expressed in g·m⁻²·d⁻¹, was found to be positively correlated with the inflow concentration, with a steeper increase in removal rate at concentrations higher than about 18 mg/l. The TN removal rate was also positively correlated with hydraulic loading. Furthermore, the TN removal rate was negatively correlated with wetland area, but the decline in removal rate with wetland size appeared to be somewhat lower at areas above approximately 1 ha. When both hydraulic loading and TN concentration at inlet were used as predictors in a GAM the deviance was substantially reduced, and a further reduction was achieved when the two one-dimensional splines were replaced by a thin plate spline allowing interaction effects. A plot of predicted removal rates according to this model is shown in Figure 7b, where the overall positive response to hydraulic loading and inflow concentration is clearly visible.
[Figure caption: Error bars show the 95% confidence interval (where the number of wetlands (n) is 1, it is based on the within-study variance only).]
According to combined linear/spline regression models, the removal efficiency of TP was influenced by all four of the investigated predictors, that is, TP inlet concentration, hydraulic loading, wetland area and air temperature. When GAMs with two predictors were examined the best fit (lowest deviance) was obtained for a thin plate spline model with log inlet concentration and log HLR (Figure 7c).
The TP removal rate (in g·m⁻²·d⁻¹) was positively correlated with concentration at inlet, with a steeper increase in removal rate at concentrations above approximately 0.55 mg/l. In contrast, the TP removal rate was negatively correlated with wetland area at areas below 2 ha (above 2 ha the removal rate was fairly constant). When both inlet concentration and HLR were used as predictors of removal rate, and the interaction effects between these predictors were taken into account using a thin plate spline function, the deviance was significantly lower than in the best one-dimensional spline model. Fitted TP removal rates are shown in Figure 7d. Figures 7b and 7d suggest that the removal rates are very low at low nutrient concentrations at the wetland inlet and low HLRs. To obtain an appreciable removal rate either the inlet concentration or the HLR (or both) needs to be increased. On the other hand, the HLR should be increased with some caution since the removal efficiency decreases with increasing HLR (figures 7a and 7c). When a wetland is being designed, a balance should thus be found between an HLR that is high enough to allow for a meaningful removal rate at a given inlet concentration, and an HLR that is low enough to keep the removal efficiency sufficiently high to make a significant difference to the total transport of nutrients.

Implications for policy and management
Median values for removal efficiency of total nitrogen and total phosphorus were 37% and 46%, respectively.
Nutrient loading rates (inlet concentration × HLR) need to be carefully estimated as part of the design of created and restored wetlands. In general, high nutrient loading rates result in high removal rates (expressed in g·m⁻²·yr⁻¹). However, high hydraulic loading rates may result in reduced removal efficiency (expressed in %).
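The unit arithmetic behind the loading rate is simple enough to sketch in Python. The function names and the design values below are hypothetical and are not taken from any wetland in the review.

```python
def nutrient_loading_rate(inlet_conc_mg_per_l, hlr_m_per_yr):
    """Areal nutrient load in g per m2 per yr.

    1 mg/l equals 1 g/m3, and an HLR in m/yr is m3 of water per m2 of
    wetland per year, so the product is already in g per m2 per yr.
    """
    return inlet_conc_mg_per_l * hlr_m_per_yr

def removal_rate(loading_rate, efficiency_percent):
    """Areal removal rate implied by a loading rate and a % load reduction."""
    return loading_rate * efficiency_percent / 100.0

# Hypothetical design values (illustration only).
load = nutrient_loading_rate(inlet_conc_mg_per_l=5.0, hlr_m_per_yr=20.0)
removed = removal_rate(load, efficiency_percent=37.0)
print(f"load = {load:.0f}, removal = {removed:.0f} (g per m2 per yr)")
```

Doubling the HLR doubles the load at a given inlet concentration, but the removal rate only doubles if the efficiency stays constant. The text indicates that efficiency tends to fall as HLR rises.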
The removal efficiency for total phosphorus is significantly lower in restored wetlands on cropland compared to other wetlands. Such wetlands have in many cases been shown to release more phosphorus than they receive. Water regime seems to be another factor that can influence phosphorus removal efficiency. Wetlands where the HLR is driven by precipitation show a significantly lower phosphorus removal efficiency than wetlands with a controlled HLR.

Implications for further research
Figure 7. Response surface analyses based on generalized additive models (GAM) taking interaction effects into account by allowing thin plate splines (TPS): a) TN removal efficiency predicted by log HLR and air temperature, b) TN removal rate predicted by log TN concentration at wetland inlet and log HLR, c) TP removal efficiency predicted by log HLR and log TP concentration at wetland inlet, d) TP removal rate predicted by log TP concentration at wetland inlet and log HLR.
Hydrological processes, and especially hydraulic loading, are inadequately measured in many papers: 45 of the 143 papers with relevant outcome data excluded from this study only included inlet measurements, had incomplete water balances or lacked hydrological data, making it impossible or too uncertain to calculate mass balances. Only total nitrogen or total phosphorus was measured in many studies. This prevented us from evaluating the importance of speciation of these elements to the removal of nitrogen and phosphorus.
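Why a mass balance needs flow data at both ends of the wetland can be shown with a small Python sketch. The function names and the quarterly records are hypothetical. The point is only that loads are volume times concentration, so concentrations alone, or inlet data alone, are not enough.

```python
def period_load(volumes_m3, concentrations_mg_per_l):
    """Sum of volume x concentration over sampling periods, in g (mg/l = g/m3)."""
    if len(volumes_m3) != len(concentrations_mg_per_l):
        raise ValueError("need one concentration per flow volume")
    return sum(v * c for v, c in zip(volumes_m3, concentrations_mg_per_l))

def mass_balance_efficiency(q_in, c_in, q_out, c_out):
    """Percent load reduction from paired inflow and outflow records."""
    load_in = period_load(q_in, c_in)
    load_out = period_load(q_out, c_out)
    return 100.0 * (1.0 - load_out / load_in)

# Hypothetical quarterly records; outflow volumes are smaller than inflow
# volumes because some water is lost within the wetland.
q_in, c_in = [1000.0, 500.0, 200.0, 800.0], [4.0, 6.0, 8.0, 5.0]
q_out, c_out = [900.0, 400.0, 100.0, 750.0], [3.0, 5.0, 7.0, 4.0]
efficiency = mass_balance_efficiency(q_in, c_in, q_out, c_out)
print(f"{efficiency:.1f}% load reduction")
```

In this made-up record the load reduction is about 33%, a figure that cannot be obtained from the inlet and outlet concentrations without the matching flow volumes.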
Long-term performance of wetlands as nutrient sinks, extending over more than 20 years, is poorly investigated. More research is also needed on the effects of hydrologic pulsing and different management methods.
The variation between studies is considerably smaller among high quality studies compared to studies that were judged less reliable. This suggests that part of the large variation between studies may be explained by measurement errors due to less rigorous study designs.

How this review was conducted
During the planning phase of the systematic review, Swedish stakeholders with an interest in mitigation of eutrophication were invited to comment on the scope and focus of the review. The final design of the review was described in a protocol that was published in the peer-reviewed journal Environmental Evidence in August 2013.
Searches for scientific literature were made in ten different literature databases. Grey literature was searched for using the search engine Google. In addition to searches where English search terms were used, searches were also performed using Swedish, Danish and Dutch search terms. Furthermore, the websites of relevant specialist organizations were also searched. Generally, the first 100 hits were examined in the searches using Google and on specialist websites.
The search strings used in the searches generated almost 6000 unique articles and reports (figure 8).
Based on title and abstract, most of them could be excluded for lacking relevant results. However, 962 articles and reports were read in full, and during that process 252 articles and reports were found to provide information relevant to our question. These 252 articles were then critically appraised. This means that they were evaluated against a set of a priori defined quality criteria regarding length of study, sampling frequency and hydrological measurements, among other things. In the end 93 articles, which contained studies of 203 wetlands, were found to be of sufficiently high quality to provide reliable data on nutrient removal, 91 of which were published in peer-reviewed scientific journals and two of which were grey literature reports.
Quantitative synthesis of the extracted data was conducted using standard methods for meta-analysis and log response ratios as effect size. Subgroup analyses were performed to investigate whether heterogeneity in the results could be explained by single variables. To further investigate the importance of various effect modifiers, response surface analyses were performed using generalized additive models (GAMs). Potential pairwise interaction effects of the predictors were taken into account by allowing thin plate splines (TPS) in the GAMs.
This systematic review was initiated and financed by the Mistra Council for Evidence-Based Environmental Management (EviEM). The review was conducted by a specially appointed team of researchers (Figure 9) chaired by Wilhelm Granéli, Professor Emeritus of Aquatic Ecology at Lund University, Sweden. The project was managed by Magnus Land, EviEM.

Free access to full report
The full report on this systematic review is published in the journal Environmental Evidence (http://environmentalevidencejournal.biomedcentral.com/). The report is also available on EviEM's website (www.eviem.se).