The first step in planning a search is to design a strategy that maximises the probability of identifying relevant articles whilst minimising the time spent doing so. This article details several aspects of a search strategy. Planning may also include discussions about eligibility criteria for subsequent screening, as these are often linked to search terms. It should also cover the decision criteria defining when to stop the search: resource constraints (such as time, manpower and skills) may be a major reason to limit the search and should be anticipated and explained in the protocol (see “Deciding when to stop”).
Establishing a test-list
A test-list is a set of articles that have already been identified as relevant to the question of the evidence synthesis (i.e. they are within the scope and provide some evidence to answer the question). The test-list can be created by asking experts, researchers and stakeholders (i.e. anyone who has an interest in the review question) for suggestions and by perusing existing reviews. The project team should read the articles of the test-list to make sure they are relevant to the synthesis question. Establishing a test-list is independent of the search itself; it is used to help develop the search strategy and to assess its performance. The performance of a search strategy should be reported, i.e. whether the search strategy correctly retrieves relevant articles and whether all available relevant literature to answer the evidence synthesis question is likely to have been identified (see “Assessing retrieval performance”). The test-list may be presented in the protocol submitted for peer-review.
The test-list should ideally cover the range of authors, journals, and research projects within the scope of the question. To be an effective tool it needs to reflect the range of the evidence likely to be encountered in the review. The number of articles to include in the test-list is a case-by-case decision and may also depend on the breadth of the question. With a very small test-list, the project team may wrongly conclude that the search is effective when it is not. Results against the test-list may indicate to the project team that the search strategy needs improving, or help decide when to stop the search (see “Deciding when to stop”).
Identifying search terms
An efficient search string means that a maximum of relevant papers will be found at the outset and the project team will not have to re-run the search during the conduct of the evidence synthesis. Moreover, it may be re-used as such when amending or updating the search in the future, saving time and resources (see “Part 4”). Initial search terms can usually be generated from the question elements and by looking at the articles in the test-list. However, authors may not always describe the full range of the PICO/PECO criteria in the few words available in the title and abstract. As a consequence, building search strings from search terms requires project teams to draw upon their scientific expertise, a certain degree of imagination, and an analysis of titles and abstracts, to consider how authors might use different terminologies to describe their research.
Reading the articles of the test-list as well as existing relevant reviews often helps to identify search terms describing the population, intervention/exposure, outcome(s), and the context of interest. Synonyms can also be looked up in dictionaries. An advantage of involving librarians in the project team and among the peer-reviewers is that they bring their knowledge of specialist thesauri to the creation of search term lists. For example, for questions in agriculture, CAB Abstracts provides a thesaurus whose terms are added to database records. The thesaurus terms can offer broader or narrower concepts for the search term of interest, and can provide additional ways to capture articles or to discover overlooked words (http://www.cabi.org/cabthesaurus/). As well as database thesauri that offer terms for use within individual databases, there are other thesauri that are independent of databases. For example, the Terminological Resource for Plant Functional Diversity (http://top-thesaurus.org/) offers terms for 700 plant characteristics, plant traits and environmental associations. Experts and stakeholders may suggest additional keywords, for instance when an intervention is related to a special device (e.g. technical name of an engine, chemical names of pollutants) or a population is very specific (e.g. taxonomic names which have changed over time, technical terminology of genetically-modified organisms). Other approaches can be used to identify search terms and facilitate eligibility screening (e.g. text-mining, citation screening, cluster analysis and semantic analysis) and are likely to be helpful for CEE evidence synthesis.
The search terms identified using these various methods are presented as part of the draft evidence-synthesis protocol so that additional terms may be suggested by peer-reviewers. Once the list is finalised in the published protocol it should not be changed, unless justification is provided in the final evidence-synthesis.
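Once term lists are finalised, synonyms for one concept are typically combined with OR and the concept blocks combined with AND. The sketch below illustrates this assembly; the population and intervention terms, and the exact quoting/wildcard conventions, are illustrative assumptions, not a prescribed syntax (each platform's rules differ):

```python
def build_search_string(concept_blocks):
    """Combine synonym lists with OR within a concept block and AND
    between blocks; multi-word terms are quoted as phrases."""
    def fmt(term):
        return f'"{term}"' if " " in term else term
    blocks = ["(" + " OR ".join(fmt(t) for t in terms) + ")"
              for terms in concept_blocks]
    return " AND ".join(blocks)

# Hypothetical population and intervention terms for illustration.
population = ["hedgerow*", "field margin*", "shelterbelt*"]
intervention = ["restor*", "replant*", "rehabilitat*"]

print(build_search_string([population, intervention]))
# (hedgerow* OR "field margin*" OR shelterbelt*) AND (restor* OR replant* OR rehabilitat*)
```

Generating strings programmatically from the published term lists makes later amendments or updates to the search reproducible: changing one list regenerates the whole string.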
Identifying relevant sources of articles
Various sources of articles relevant to the question may exist. Understanding the coverage, functions and limitations of information sources can be time-consuming, so involving a librarian or information specialist at this stage is highly recommended. We use “bibliography” to refer to a list of articles, generally described by authorship, title, year of publication, place of publication, editor and, often, keywords as well as, more recently, DOIs. A bibliographic source allows such bibliographies to be created by providing a search and retrieval interface. Much of the information today is likely to come from searches of electronic bibliographic sources, which are becoming increasingly comprehensive as more material is digitised (see “Addressing the need for grey literature” and “Searching for grey literature”). In this paper we use the term “electronic bibliographic source” in the broad sense: it includes individual electronic bibliographic sources (e.g. Biological Abstracts) as well as platforms that allow simultaneous searches of several sources of information (e.g. Web of Science or Google Scholar) or sources accessed through search engines (such as Google). Platforms are a way to access databases.
Coverage and accessibility
Several sources should be searched to ensure that as many relevant articles as possible are identified [1, 15]. A decision needs to be made as to which sources are the most appropriate for the question. This mostly depends on the disciplines addressed by the question (e.g. biology, social sciences, other disciplines), on identifying the sources likely to provide the greatest quantity of relevant articles for a limited number of searches, and on their contribution to reducing the various biases described earlier in the paper (see “Identifying relevant sources of articles”). The quantity of results returned by an electronic bibliographic source is not a good indicator of the relevance of the articles identified and thus should not be a criterion for selecting or discarding that source. Information about access to databases and articles (coverage) can be gathered within the project team by sharing knowledge and experience, by asking librarians and information experts and, if needed, stakeholders. Peer-review of the evidence synthesis protocol may also provide extra feedback on the relevance of searching other sources.
Some databases are open-access, such as Google Scholar, whereas others, such as Agricola (http://agricola.nal.usda.gov/), require a subscription. Access to electronic bibliographic sources may therefore depend on institutional library subscriptions, so availability to project teams will vary across organisations. A diverse project team drawn from a range of institutions may therefore be beneficial to ensure adequate breadth of search strategies. When the project team does not have access to all the relevant bibliographic sources, it should explain its approach, list the relevant sources that could not be searched, and acknowledge these limitations. This may include indications as to how the evidence synthesis could be upgraded at a later stage.
Types of sources
We first present bibliographic sources that allow the use of search strings, illustrated mostly from the environmental sciences. An extensive list of searchable databases for the social sciences is available in Kugley et al. Other sources and methods mentioned below (such as searches on Google) are complementary but cannot be the core of the search process of an evidence synthesis, as they are less reproducible and transparent.
Bibliographic sources may vary in the search tools provided by their platforms. Help pages give information on search capabilities and these should be read carefully. Involving librarians who keep up-to-date with developments in information sources and platforms is likely to save considerable time.
Electronic bibliographic sources
The platforms which provide access to bibliographic information sources may vary according to:
The syntax needed within search strings (see “Building the search string”) and the complexity of search strings that they will accept.
Access: not all bibliographic sources are completely accessible. It depends on the subscriptions available to the project team members in their institutions. The Web of Science platform, for example, contains several databases, and it is important to check and document which ones are accessible to the project team via that platform.
Disciplines: subject-specific bibliographic sources (e.g. CAB eBooks: applied life sciences, agriculture, environment, veterinary sciences, applied economics, food science and nutrition) versus multidisciplinary sources (e.g. Scopus, Web of Science).
Geographical regions (e.g. HAPI, the Hispanic American Periodicals Index, for Latin America, or CORDIS for Europe). It may be necessary to search region-specific bibliographic sources if the evidence-synthesis question has a regional focus.
Document types: scientific papers, conference proceedings, chapters, books, theses. Many university libraries hold digital copies of their theses, such as the EThOS British Library thesis database. Conference papers may be a source of unpublished results relevant to the synthesis, and may be found through the BIOSIS Citation Index or the Conference Proceedings Citation Index (Thomson Reuters 2016).
Time periods covered: at the time of writing, some records in the Web of Science Core Collection date back to 1900 (although by no means all), and in Scopus to 1960.
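The syntax variations noted above mean that one conceptual query must usually be rewritten for each platform. As an illustration, the same two-concept query is shown below with two commonly used field tags (TS= for a Web of Science topic search, TITLE-ABS-KEY for Scopus); the query terms are hypothetical, and each platform's help pages should be checked before running a real search:

```python
# One conceptual query expressed in two platform syntaxes (illustrative).
query = '(hedgerow* OR "field margin*") AND (restor* OR replant*)'

platform_queries = {
    "Web of Science": f"TS=({query})",    # TS = topic field tag
    "Scopus": f"TITLE-ABS-KEY({query})",  # title/abstract/keyword fields
}

for platform, q in platform_queries.items():
    print(f"{platform}: {q}")
```

Keeping the conceptual query in one place and generating the platform-specific variants from it helps ensure that the same terms are searched everywhere, and makes the exact strings easy to report.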
The websites of individual commercial publishers may be valuable sources of evidence, since they can also offer access to books, chapters of books, and other material (e.g. datasets). Their respective search tools and related help pages allow the retrieval of relevant articles based on search terms. For example, Elsevier’s ScienceDirect and Wiley Interscience are publishers’ platforms that give access to their journals and tables of contents and, depending on licence, abstracts and article downloads.
Web-based search engines
Google is one example of a web-based search engine that searches the Internet for content including articles, books, theses, reports and grey literature (see “Addressing the need for grey literature” and “Searching for grey literature”). It also provides its own search tools and help pages. Such resources are typically not transparent (i.e. they order results using an unknown and often changing algorithm) and are restricted in their scope or in the number of results that can be viewed by the user (e.g. Google Scholar). Google Scholar has been shown not to be suitable as a standalone resource in systematic reviews, but it remains a valuable tool for supplementing bibliographic searches [6, 19] and for obtaining full-text PDFs of articles. BASE, the Bielefeld Academic Search Engine (https://www.base-search.net), is developed by the University of Bielefeld (Germany) and gives access to a wide range of information, including academic articles, audio files, maps, theses, newspaper articles, and datasets. It lists its sources of data and displays detailed search results, which facilitates transparent reporting.
Finding full-text documents
Full-text documents will be needed only once the findings of the search have been screened for eligibility on title and abstract and retained for screening at full-text. Limited access to full-texts can be a source of bias in the synthesis, and finding documents may be time-consuming, as it may involve inter-library loans or direct contact with authors. Documents can be obtained directly if (a) the articles are open-access, (b) the articles have been placed on an author’s personal webpage, or (c) they are included in the project team’s institutional subscriptions. Checking institutional access when listing bibliographic sources may help the project team anticipate the need for extra support.
Choosing bibliographic management software
Specific reference management software may be used to export the results of the search from the bibliographic source to a computer or to a dedicated online space (e.g. EndNote online). This can assist later removal of duplicates and eligibility screening. Establishing an efficient workflow to collect, organise, store and share the articles retrieved by the searches should save the project team time. Common reference management software includes EndNote and Reference Manager (subscription), Zotero (open-source) and Mendeley (freeware). The choice of software is likely to be influenced by available resources and the familiarity of the project team with specific software, and may require training. It should ideally be made at the beginning of the project, during scoping, and is particularly important if the project team is dispersed across different locations, to ensure that access to references is facilitated at the different stages of the work.
The following elements may help when choosing bibliographic management software:
Ease of transferring references between different software packages in case the project team members do not have access to all packages;
Ability to add extra metadata relevant to the evidence synthesis (for instance coding around language, geographical location of results reported in each article) to assist with study identification or grouping for analysis (including bibliometric analysis);
Limitations that may pose a problem (e.g. EndNote online is limited to 10,000 references);
Possibility to retrieve full-texts, automatically or semi-automatically;
Limitations to the number of users of the software;
Remote access to the software and/or results (to share among team members);
Options for storage (e.g. the Cloud) and associated costs;
Possibilities to create bibliographic lists according to the style(s) required by the editor of the review (e.g. cite-as-you-write).
The functionality for exporting lists of bibliographic records varies across both electronic sources and the reference management software used to store records. Some platforms may require citations to be exported individually (e.g. Google Scholar) whereas others allow downloading in batches (e.g. Web of Science). When the maximum batch size is much smaller than the total number of records to be exported, the export must be made in a series of batches, which is time-consuming (even though Web of Science has allowed batches of up to 5000 records since 2017, searches may produce many thousands of records). Extracting articles ordered by publication date rather than by relevance (e.g. all articles published between 1950 and 2000 in a first session, and the remainder later) may prevent errors. In all cases, the project team needs to make sure all articles have been correctly retrieved (preferably with their abstracts). Some publishers ask to be contacted when large quantities of articles are to be exported, and this may be worth considering. If there is no easy way to access the full set of results, it is important to be transparent about the possible impact of this when reporting the search.
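The batch-export and duplicate-removal workflow described above can be sketched as follows, assuming each exported record has been parsed into a dict with doi and title fields (the field names and records are hypothetical; real exports would come from RIS or similar files):

```python
def deduplicate(batches):
    """Merge exported batches, keeping one record per DOI
    (or per whitespace-normalised lowercase title when the DOI is missing)."""
    seen, unique = set(), []
    for batch in batches:
        for record in batch:
            key = record.get("doi") or " ".join(record["title"].lower().split())
            if key not in seen:
                seen.add(key)
                unique.append(record)
    return unique

# Hypothetical records illustrating a cross-batch duplicate.
batch1 = [{"doi": "10.1000/a", "title": "Hedgerow restoration"}]
batch2 = [{"doi": "10.1000/a", "title": "Hedgerow restoration"},
          {"doi": None, "title": "Field  margin replanting"}]

merged = deduplicate([batch1, batch2])
print(len(merged))  # 2
```

In practice reference management software performs this step, but the sketch shows why records lacking DOIs need title normalisation (case, spacing) before duplicates can be detected reliably.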
Addressing the need for grey literature
“Grey literature” refers to documents that may be difficult to locate because they are not indexed in the usual bibliographic sources. It has been defined as “manifold document types produced on all levels of government, academics, business and industry in print and electronic formats that are protected by intellectual property rights, of sufficient quality to be collected and preserved by libraries and institutional repositories, but not controlled by commercial publishers; i.e. where publishing is not the primary activity of the producing body” (12th International Conference on Grey Literature, Prague, 2010). Grey literature includes reports, proceedings, theses and dissertations, newsletters, technical notes, white papers, etc. (see the list on http://www.greynet.org/greysourceindex/documenttypes.html). This literature may not be easily found by internet and bibliographic searches, and may need to be identified by other means (e.g. asking experts), which can be time-consuming and requires careful planning.
Searches for grey literature might be included in an evidence synthesis for two main reasons: (1) to try to minimise possible publication bias (see “Submitting the search strategy in the protocol for peer-review”), since ‘positive’ (i.e. confirmative, statistically significant) results are more likely to be published in academic journals; and (2) to include studies not intended for the academic domain, such as practitioner reports and consultancy documents, which may nevertheless contain relevant information such as details on study methods or results not reported in journal articles, which are often limited by word length.
Deciding when to stop
If time and resources were unlimited, the project team would be able to identify all published articles relevant to the evidence-synthesis question. In the real world this is rarely possible. Deciding when to stop a search should be based on explicit criteria, explained in the protocol or synthesis. Often, reaching the budget limit (in terms of project team time) is the key reason for stopping the search, but the justification should rely primarily on whether the performance of the search is acceptable to the project team. Searching only one database is not considered adequate. Observing a high retrieval rate for the test-list should not preclude conducting additional searches in other sources to check whether new relevant papers are identified. In practice, when searching electronic bibliographic sources, search terms and search strings are modified progressively, based on what is retrieved at each iteration, using the test-list as one indicator of performance. When each additional unit of time spent searching returns fewer relevant references, this may be a good indication that it is time to stop. Statistical techniques, such as capture-recapture and the relative recall method, exist to guide decisions about when to stop searching, although to our knowledge they have not been used in CEE evidence synthesis to date.
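As an illustration of the capture-recapture idea mentioned above, the Lincoln-Petersen estimator treats two independent searches as two “capture” occasions: with n1 and n2 relevant articles found by each source and an overlap of m articles found by both, the total pool of relevant articles is estimated as n1 * n2 / m. The counts below are invented for illustration, and real searches are rarely fully independent, so the estimate should be read as a rough guide only:

```python
def lincoln_petersen(n1, n2, overlap):
    """Estimate the total number of relevant articles in existence,
    given two searches and the number of articles found by both."""
    if overlap == 0:
        raise ValueError("Estimator undefined with no overlap between sources.")
    return n1 * n2 / overlap

# Hypothetical counts: 40 and 30 relevant articles, 20 found by both.
estimated_total = lincoln_petersen(40, 30, 20)
print(estimated_total)  # 60.0

found_so_far = 40 + 30 - 20  # 50 unique relevant articles already retrieved
print(estimated_total - found_so_far)  # an estimated 10.0 still unfound
```

When the estimated number of unfound articles approaches zero, that supports a decision to stop; a large gap suggests further sources are worth searching.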
For web-searches (e.g. using Google) it is difficult to give specific guidance on how much searching effort is acceptable. In some evidence syntheses, authors have chosen a “first 50 hits” approach (hits meaning articles) or a “first 200 hits” approach, but CEE does not encourage such arbitrary cut-offs. What should be reported is whether stopping the screening after the first 50 (or more) retrieved articles is justified by a decline in the relevance of new articles. As long as relevant articles are still being identified, the project team should ideally keep screening the list of results.
Submitting the search strategy in the protocol for peer-review
Publishing the search strategy in the evidence synthesis protocol enables peer-reviewers and stakeholders to provide input at an early stage: to detect missing elements (e.g. keywords, databases, important sources of grey literature), highlight possible misunderstandings, and question the relevance of some options (scope, dates, variety of outcomes, etc.) before the final search is conducted. This step aims to ensure that the search will be of the best possible quality and relevance for the future users of the synthesis. If the scope of the search needs to be restricted due to resource limitations, this is presented to readers before the review is conducted, which should minimise misunderstanding and criticism when the results are disclosed.