Identifying the most effective behavioural assays and predator cues for quantifying anti-predator responses in mammals: a systematic review protocol

Mammals, globally, are facing population declines. Strategies increasingly employed to recover threatened mammal populations include protecting populations inside predator-free havens, and translocating animals from one site to another, or from a captive breeding program. These approaches can expose predator-naïve animals to predators they have never encountered and as a result, many conservation projects have failed due to the predation of individuals that lacked appropriate anti-predator responses. Hence robust ways to measure anti-predator responses are urgently needed to help identify naïve populations at risk, to select appropriate animals for translocation, and to monitor managed populations for trait change. Here, we outline a protocol for a systematic review that collates existing behavioural assays developed for the purpose of quantifying anti-predator responses, and identifies assay types and predator cues that provoke the greatest behavioural responses. We will retrieve articles from academic bibliographic databases and grey literature sources (such as government and conservation management reports), using a Boolean search string. Each article will be screened for the satisfaction of eligibility criteria determined using the PICO (Population—Intervention—Comparator—Outcome) framework, to yield the final article pool. Using metadata extracted from each article, we will map all known behavioural assays for quantifying anti-predator responses in mammals and will then examine the context in which each assay has been implemented (e.g. species tested, predator cue characteristics). Finally, with mixed effects modelling, we will determine which of these assays and predator cue types elicit the greatest behavioural responses (standardised difference in response between treatment and control groups). 
The final review will highlight the most robust methodology, will reveal promising techniques on which to focus future assay development, and will collate relevant information for conservation managers.

well-informed and well-tested management interventions. Many of these interventions will need to be underpinned by a mechanistic understanding of species' behaviour.
How an animal responds to predators has substantial bearing on its ability to survive. Predation, particularly from introduced predators, has been a major driver of mammal declines and extinctions around the world [5–9]. This is especially true for individuals and populations that have had limited or no exposure to predators, such as many island populations [10,11], individuals raised in captivity and those moved to an environment with novel predators [12–14]. Improving our understanding of how animals behave in response to predatory stimuli should provide crucial insights for their conservation management and can improve our ability to retain antipredator traits in managed populations [12,15,16]. An animal's response to predators may be behavioural (e.g. spatial and temporal avoidance [17,18], avoiding detection [19] and evasion [20]) or physical (e.g. chemical [21] and physical defences [22]). Behavioural responses are likely to be more plastic and responsive at shorter time frames than physical responses, and are therefore particularly important when considering the acute impacts of predators on the persistence of predator-naïve species.
Common strategies employed to prevent faunal extinctions include captive breeding [23], translocations (the deliberate movement of animals from one population or site for release in another [24]) and establishment of populations in predator-free havens (areas isolated from predators through a geographical or physical barrier, such as islands or fenced enclosures [25–27]). Such approaches have secured a number of populations of mammals, including African elephants [28,29], European lynx [30], elk [31], giant pandas [32], and Tasmanian devils [33]. Despite their initial successes, these strategies are at risk of longer-term failure because they can expose naïve individuals to novel contexts for which they may lack appropriate behavioural responses. Further, such populations become vulnerable to acute population collapses from uncontrolled predator incursions.
Australia provides a compelling case study to illustrate the challenges of mammal conservation. More than one third of modern mammal extinctions have occurred in Australia, largely due to the introduction of feral cats and foxes [34]. In response, havens free of introduced predators are a key component of conserving much of the remaining mammal fauna [26,27,35]. Australia's current network of havens provides habitats for at least 32 mammal species, and has secured at least 188 populations and sub-populations [26]. Evidence is emerging, however, that in the absence of feral and/or native predators, havened populations no longer exhibit anti-predator behaviours [13,36–40]. This renders individuals in these populations fundamentally unfit for reintroduction into areas where predators still persist. Because the success of many translocations has ultimately been compromised by predation [35,41,42], the future of mammal conservation in Australia, and more broadly, hinges on developing methods and strategies that can quantify and conserve antipredator behaviours in havened and translocated populations [39].
To undertake an adaptive management approach, we require monitoring and evaluation of anti-predator responses in mammalian species. Despite awareness that behavioural traits such as boldness or shyness can influence conservation outcomes, measuring such traits is rarely incorporated into monitoring and management [16,43]. Anti-predator responses have only recently been identified as a potential barrier to the success of conservation projects [13,37–39], and while an array of academic literature exists that details various methods for measuring these behaviours [15,38,39,44–48], accessing the methodologies, comparing them for rigor and identifying the most appropriate measure is labour intensive. Stakeholders, such as conservation and population managers, are likely to be seeking this information, but also likely to be limited by the time and resources necessary to find it. Ultimately, we currently lack a robust framework for the universal monitoring and evaluation of anti-predator traits [49]. The first step to developing such a framework is to understand which behavioural assays have been conducted, which are most effective (capture or provoke the greatest behavioural response), and whether the type of predator cue is important. In the absence of this crucial information, the adoption of inappropriate and poorly-performing behavioural metrics may prevail.

Identification and engagement of stakeholders
In addition to the review team, stakeholders relevant to this review have been identified as those who research or manage animal populations, for example, members of species recovery teams (Fig. 1). To ensure the information collected throughout this review is tailored toward the target audience, and thus of the most relevance for application, a variety of stakeholders from each of the categories in Fig. 1 were consulted during the development of this protocol. We invited 27 stakeholders to comment on the draft protocol, and after receiving 16 replies (ten from Australia and six from other countries), we incorporated their suggestions.

Objective of the review
We will present all known behavioural assays for measuring or quantifying anti-predator responses in mammals by collating information into an accessible format.
Specifically, we will: (1) reveal different methods, (2) describe the context within which each method was conducted, and (3) highlight methods or aspects that warrant further examination, thus guiding the future development of behavioural assays. Further, using a modelling approach, we will then identify which types of behavioural assays and predator cues elicit the greatest responses in mammals (difference in effect size between the treatment and control conditions). A formal evidence synthesis is required to explore all potential methods and to avoid bias toward those published in academic journals, because much information may come from governmental reports and species recovery plans [16,50]. The final review will act as a guide: it will highlight existing methodologies and provide additional information to assess their relevance, allowing stakeholders to easily select the most appropriate and effective behavioural assay for their purpose.
Using the PICO (Population-Intervention-Comparator-Outcome) framework [51], we have broken our review into two questions that will define our search scope. We will first systematically map all known methodologies answering a primary question: what behavioural assays have been used to quantify anti-predator responses in mammals? The elements of this question are:

Intervention
(i) A behavioural assay that quantifies anti-predator responses to predator exposure
(ii) A behavioural assay that quantifies anti-predator responses to predator cues
Articles that conform to both the Population and Intervention criteria will be used to answer this primary question. A secondary question we seek to answer will be assessed quantitatively by modelling the metadata collected from each article, asking: which assay-types and predator cues elicit the greatest behavioural responses? This question utilises the same Population and Intervention criteria as the primary question, but requires further assessment using Comparator and Outcome criteria to select studies for the systematic review. The additional elements of the secondary question are:

Comparator
Comparison between levels of predator exposure (e.g. before versus after exposure, exposure versus no exposure) or comparison between exposure to a predator cue versus a control.

Outcome
Difference in the behavioural response between the treatment (e.g. predator/predator cue exposure) and control conditions. Metrics of responses will differ between studies depending on assay type and will be compared using standardised effect sizes.
Articles that involve at least one Comparator element can then additionally be considered for the systematic review to investigate which Intervention elements (behavioural assays and predator cues) produce the greatest Outcome. The PICO elements of our two questions are illustrated in Fig. 2.

Fig. 1 End-user stakeholder groups (right-hand boxes) consulted when designing a systematic review of methods that quantify anti-predator behaviour in mammals. Arrows indicate each group's broad interests in the various steps (left-hand boxes) required for improving conservation outcomes

Searching for articles
Scoping
To develop a search strategy, an initial scoping exercise was conducted using a test-list of 10 benchmark articles that assess anti-predator responses (Additional file 1), each selected because it covers a different combination of assays and predator scenarios. The titles, key words, and abstracts of each scoping article were mined, both manually and using word clouds (R package wordcloud [52]; in the R environment [53]), to determine the most appropriate search terms [54]. An initial search string was then created using Boolean operators to combine the relevant terms based on the review team's knowledge, and the terms identified from the scoping articles. Trial searches were conducted using the Web of Science: Core Collection. We systematically removed terms that appeared to broaden the search outside the scope of the review. To ensure the proposed strategy adequately returned relevant literature, the search output was scanned for relevant articles and each of the scoping benchmark articles. Unreturned articles were then closely inspected, and the search strategy was adjusted until it retrieved all 10 benchmark articles [51]. The comprehensiveness of the search strategy was then tested using a list of 5 independent articles (Additional file 1), all of which were retrieved by the final search strategy.
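The benchmark check described in the scoping step can be illustrated with a minimal sketch. It assumes that the benchmark list and the exported search output are available as plain lists of titles; the function names and the example titles are hypothetical placeholders, not the articles in Additional file 1.

```python
def normalise_title(title: str) -> str:
    """Lower-case a title and replace punctuation with spaces so that
    trivially different renderings of the same title compare equal."""
    cleaned = "".join(ch.lower() if ch.isalnum() else " " for ch in title)
    return " ".join(cleaned.split())


def missing_benchmarks(benchmark_titles, retrieved_titles):
    """Return the benchmark titles that do not appear in the search output."""
    retrieved = {normalise_title(t) for t in retrieved_titles}
    return [t for t in benchmark_titles if normalise_title(t) not in retrieved]


# Placeholder titles for illustration only (not real articles).
benchmarks = [
    "Benchmark article A: flight initiation distance",
    "Benchmark article B: giving-up density",
]
search_output = [
    "benchmark article a - Flight Initiation Distance",
    "An unrelated record",
]

# Any title listed here would prompt a closer look at the search string.
print(missing_benchmarks(benchmarks, search_output))
```

An empty list would indicate that every benchmark article was retrieved; matching on exact normalised titles is a simplification, and identifiers such as DOIs may be more reliable where they are available.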

Academic literature
We will search the following bibliographic databases, chosen for the subject matter each covers, to collect peer-reviewed journal articles: Web of Science (Core Collection, BIOSIS Citation Index, Zoological Record, CAB Abstracts) and Scopus.

Grey literature
To reduce bias toward published literature, we also aim to search a variety of grey literature sources [49,50]. Using our search string above, we will collate theses and dissertations from two bibliographic databases specific to grey literature: ProQuest Dissertation and EThOS: UK Theses and Dissertations. Conference proceedings will be searched in the Web of Science database using the predetermined search strategy. The following websites will also be searched, using the search terms "anti-predator" and "antipredator": opengrey.eu; trove.nla.gov.au. Specialist documents will be searched for within the following repositories, using the search terms "anti-predator" and "antipredator": IUCN general publications (https://portals.iucn.org/library/dir/publications-list); IUCN Conservation Planning Specialist Group (http://www.cpsg.org/document-repository); Conservation Evidence (http://www.ConservationEvidence.com); WWF (https://www.worldwildlife.org/publications). A web-based search engine, Google (www.google.com), will be used to supplement our search results. The first 50 links returned using each combination of the search terms "anti-predator/antipredator" and "behaviour/behavior" will be inspected and added to the article pool if not yet identified [55].

Additional literature
Based on the knowledge of the review team and stakeholders, additional publications not identified by the search strategy may also be included. Search results will be limited to articles written and published in English (due to the language capabilities of the review team). All database and grey-literature searches will be documented, and this information will be made available with the final review publication. All searches will be conducted within two years of the final analysis being submitted for publication.

Article screening and study eligibility criteria
Duplicate articles will be removed, and article screening will be conducted through CADIMA [51,56]. To remove bias, two screeners will independently review articles at title and abstract level simultaneously to determine relevance, followed by the full text versions, to decide which meet the inclusion criteria. Each screener will assess an overlap of 10% of all articles (to a maximum of 50 articles screened) at both the title/abstract stage, and at the full text stage. Reliability between screeners will be assessed using Kappa calculations (with values > 0.5 deemed acceptable [12,57]). In instances where screeners do not agree on the inclusion/exclusion of an article, they will discuss, and then consult a third member of the review team if necessary. If theses or dissertations have additionally been published as journal articles or specialist reports, we will assess the methods described in both, and only include the article that provides the most detail. While not anticipated, if reviewers find themselves assessing their own work, a third impartial member of the review team will oversee the assessment of any conflicting articles. A full list of excluded articles will be made available with the final review, detailing reasoning for their exclusion.
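As an illustration of the reliability check between screeners, the sketch below computes Cohen's kappa for two screeners' include/exclude decisions on the overlapping articles. The protocol refers only to "Kappa calculations", so the choice of Cohen's kappa, the function name and the example decisions are assumptions made for illustration.

```python
from collections import Counter


def cohens_kappa(decisions_a, decisions_b):
    """Cohen's kappa for two raters' categorical decisions on the same items.

    kappa = (p_observed - p_expected) / (1 - p_expected)
    """
    if len(decisions_a) != len(decisions_b) or not decisions_a:
        raise ValueError("Both screeners must rate the same non-empty set of articles")
    n = len(decisions_a)
    # Proportion of articles on which the two screeners agree.
    p_observed = sum(a == b for a, b in zip(decisions_a, decisions_b)) / n
    # Agreement expected by chance, from each screener's marginal frequencies.
    counts_a, counts_b = Counter(decisions_a), Counter(decisions_b)
    p_expected = sum(counts_a[k] * counts_b[k] for k in counts_a) / n**2
    if p_expected == 1:
        raise ValueError("Kappa is undefined when both screeners use a single category")
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical decisions on ten overlapping articles.
screener_1 = ["include"] * 4 + ["exclude"] * 6
screener_2 = ["include"] * 3 + ["exclude"] * 6 + ["include"]

kappa = cohens_kappa(screener_1, screener_2)
print(round(kappa, 3), "acceptable" if kappa > 0.5 else "needs discussion")
```

In this made-up example the screeners agree on eight of ten articles, which gives a kappa just above the 0.5 threshold stated in the protocol.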
Each article will be screened against eligibility criteria based on the PICO framework as outlined in Table 1. The screeners will first review each article by title and abstract simultaneously, to assess the satisfaction of the eligibility criteria (Table 1).
Articles that satisfy the Population and Intervention eligibility criteria will be used to pursue the primary question, and will then additionally be assessed against the Comparator and Outcome eligibility criteria for inclusion in the secondary quantitative component where they may address the effectiveness of the Intervention elements; either assay types or predator cue types. All articles considered for this analysis must have incorporated at least one of the Comparator elements and all of the Outcome elements listed in Table 1. In articles with more than one predator cue or population type (e.g. current, historic and control predator cues or exposure > 5 years ago, in the last five years and never exposed), we will extract the effect size (difference between the treatment condition and the control) of the cue or population that was hypothesized by the authors to elicit the largest response (thus limiting the number of data entries from each article to one per assay).

Study validity assessment
Studies that satisfy the Population and Intervention criteria but not the Comparator and Outcome criteria will not be critically appraised and will exclusively be used in the narrative synthesis identifying different methodologies for quantifying anti-predator responses. Those studies that do satisfy the four Population, Intervention, Comparator and Outcome eligibility criteria will undergo further critical appraisal using the CEE critical appraisal tool (Additional file 2, [59]). Critical appraisal will be undertaken by two members of the review team, and each appraiser will assess an overlap of 5% of studies (to a maximum of 20) to ensure consistency. If appraisers reach different conclusions around any study, the validity criteria will be refined, and consistency checking will be repeated.

Table 1 Study eligibility criteria based on PICO (Population-Intervention-Comparator-Outcome) framework
Population: Eligible subjects include any population of non-human terrestrial mammals (free-living, wild-caught, captive, or domesticated) from around the world. We will not include studies that have used simulated populations
Intervention: Eligible studies will use behavioural assays to quantify anti-predator behaviour in response to:
(i) Exposure to live true predators
(ii) Exposure to predator-related cues, or events that represent a proxy for predatory situations (studies with humans as the predator can be included)
Comparator: The study must contain at least one of the following comparisons [12]:
(i) A before/after comparison (BA) that investigates how anti-predator responses change before and after exposure to predators
(ii) A control/intervention comparison (CI) that compares anti-predator responses between a group exposed to the predator/s and a designated control group not exposed
(iii) A control/intervention comparison (CI) that compares anti-predator responses of individuals exposed to both a predator cue and a control treatment
(iv) A before/after/control/intervention comparison (BACI) combining the above components
Outcome: Metrics for behavioural responses will vary between assays and will be compared using standardised effect sizes (the difference in mean behavioural responses between the treatment and control conditions). To calculate standardized effect sizes (using Hedges' g [58]), articles must provide (i) the mean response to each treatment, (ii) its corresponding variance (standard deviation, standard error or variance), and (iii) the sample size for each treatment

Data coding and extraction strategy
Once screened, the following meta-data variables will be extracted or scored where possible:
• Species
  - Common name
  - Latin name
  - IUCN conservation status
  - Size (small < 5 kg, medium 5-20 kg, large > 20 kg)
• Assay
  - Assay type (e.g. flight initiation distance, trap behaviour, giving-up density)
  - Behaviour measured (e.g. avoidance, docility, exploratory behaviour, fear)
  - What equipment is required (e.g. camera traps, specialist equipment)
• Type of predator exposure
  - Comparison between populations with varying exposure to predators (yes/no)
  - Use of predator cue (yes/no)
For the quantitative component, we will extract the mean response of each treatment, its corresponding variance (standard deviation, standard error or variance), and the sample size for each treatment. In articles where this information is presented graphically, we will calculate the measures from the figures (with the axes as scale bars) using the software ImageJ [60]. Metadata will be scored using a customised data sheet (Additional file 3; adapted from [61]) by two members of the review team. Each member will cross-check 5% of articles (to a maximum of 20) to ensure consistency, and if differences are found in the extracted information, the meta-data protocol will be refined and cross-checking will be repeated until all extracted data are consistent. Where any information is unclear or missing, authors will be contacted. After contacting authors, if the treatment/control standard deviations or sample sizes are absent, or if more than 50% of metadata are still missing, the article will be excluded from the quantitative review component. Extracted data will be made available with the full review as supplementary material.
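Because articles may report spread as a standard deviation, a standard error or a variance, the extracted values need to be placed on a common scale before effect sizes are calculated. The sketch below shows one way this could be done, assuming the reported standard error is the standard error of the mean; the function name and the labels "sd", "se" and "variance" are hypothetical.

```python
import math


def to_standard_deviation(value: float, kind: str, n: int) -> float:
    """Convert a reported measure of spread to a standard deviation.

    kind: "sd" (standard deviation), "se" (standard error of the mean)
          or "variance".
    n:    sample size of the treatment the value belongs to.
    """
    if value < 0:
        raise ValueError("A measure of spread cannot be negative")
    if kind == "sd":
        return value
    if kind == "se":
        if n < 1:
            raise ValueError("Sample size must be at least 1")
        # SE of the mean = SD / sqrt(n), so SD = SE * sqrt(n).
        return value * math.sqrt(n)
    if kind == "variance":
        return math.sqrt(value)
    raise ValueError(f"Unknown measure of spread: {kind!r}")


# Example: a standard error of 0.5 from 16 animals corresponds to SD = 2.0.
print(to_standard_deviation(0.5, "se", 16))
```

Recording which measure each article reported alongside the converted value keeps the extraction traceable.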

Potential effect modifiers/reasons for heterogeneity
The following additional factors to be investigated by the review were compiled using the expertise of the review team, incorporating suggestions from stakeholders. We may unintentionally exclude some useful data by only searching articles written in the English language. There may be a bias in the types of animals for which measures have been developed, for example, threatened or charismatic species. The type of predator cue used may substantially affect the outcome, as less effective cues may not be representative of an individual's response to a true predation event [62–65]. For the most robust quantification of behaviour, methodology should use repeat measures, incorporate measures of repeatability, and validate the assays, for example, by quantifying the fitness outcomes of various behavioural responses [66,67]. With such a systematic review, we hope to highlight where biases may be occurring, and reveal areas where more robust methodology is needed to guide the development of behavioural assays.

Data synthesis and presentation
The results from this systematic review will be presented both in a narrative synthesis (to address the primary question) and with a quantitative analysis (to address the secondary question) [51]. To answer the first question, what behavioural assays have been used to quantify anti-predator responses in mammals, each article and the associated meta-data will be detailed in a table of findings that will divide studies up based on the different assay types. Specific examples of different methods will be discussed in further detail within the text of the review. Descriptive statistics based on the meta-data will be used to reveal patterns such as species tested. We will discuss techniques that are used regularly and aspects of existing methodology that have been well developed and tested. For example, we will quantify the number of replicates per study, reveal the proportion of studies that incorporated measures of repeatability, and assess how existing methods have been validated (and describe the mechanisms used). We will also discuss features that are lacking from existing methodology, or characteristics that are poorly represented (e.g. specific taxonomic groups). There will be a section that features suggestions for future development of behavioural assays.
The secondary question, which assay-types and predator cues elicit the greatest behavioural response, will be answered based on the meta-data extracted surrounding the experimental design of each study. Using the treatment means, standard deviations and sample size extracted from each study, we will calculate a standardized measure of effect size for differences between means using Hedges' g [58]:
\[ g = \frac{\mu_t - \mu_c}{s_p} \]
where $\mu_t$ is the mean of the treatment group, $\mu_c$ is the mean of the control group and $s_p$ is the pooled standard deviation. The formula for pooled standard deviation is:
\[ s_p = \sqrt{\frac{(n_t - 1)\,s_t^2 + (n_c - 1)\,s_c^2}{n_t + n_c - 2}} \]
where $n_t$ and $s_t$ are the number of observations and standard deviation for the treatment group respectively, and $n_c$ and $s_c$ are the number of observations and standard deviation for the control group respectively. Hedges' g was chosen over other effect size measures such as Cohen's d, as it is suited to a range of sample sizes and because it facilitates comparisons across studies by weighting each measure based on the number of observations [68]. We will build two mixed effects models using R [53] to identify which predator cue types and behavioural assay types elicit the greatest difference in effect size (Hedges' g), while controlling for potential confounding factors where possible. We will include each study's unique identifier as a random effect in both models to account for the non-independence of multiple effect sizes from each study. The protocol for this review adheres to the ROSES guidelines (see Additional file 4 for checklist).
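The effect size calculation described in this section can be sketched as follows: the difference between treatment and control means divided by the pooled standard deviation. The function names are hypothetical, the input values are made up, and the optional small-sample correction factor is an assumption that is often applied with Hedges' g but is not stated in this protocol, so it is switched off by default.

```python
import math


def pooled_sd(n_t: int, s_t: float, n_c: int, s_c: float) -> float:
    """Pooled standard deviation of a treatment and a control group."""
    if n_t < 2 or n_c < 2:
        raise ValueError("Each group needs at least two observations")
    numerator = (n_t - 1) * s_t**2 + (n_c - 1) * s_c**2
    return math.sqrt(numerator / (n_t + n_c - 2))


def hedges_g(mean_t, s_t, n_t, mean_c, s_c, n_c, small_sample_correction=False):
    """Standardised mean difference between treatment and control.

    With small_sample_correction=True the estimate is multiplied by
    1 - 3 / (4 * (n_t + n_c) - 9); this correction is an assumption
    added here and is not described in the protocol text.
    """
    s_p = pooled_sd(n_t, s_t, n_c, s_c)
    if s_p == 0:
        raise ValueError("Pooled standard deviation is zero; effect size is undefined")
    g = (mean_t - mean_c) / s_p
    if small_sample_correction:
        g *= 1 - 3 / (4 * (n_t + n_c) - 9)
    return g


# Made-up example: treatment mean 12, control mean 10, both SD 2, n = 10 each.
print(hedges_g(12.0, 2.0, 10, 10.0, 2.0, 10))
```

With equal standard deviations of 2 in both groups the pooled standard deviation is also 2, so the uncorrected effect size in this example is 1.0.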