
Annual Review of Psychology

Volume 70, 2019, review article: "How to Do a Systematic Review: A Best Practice Guide for Conducting and Reporting Narrative Reviews, Meta-Analyses, and Meta-Syntheses."

  • Andy P. Siddaway1, Alex M. Wood2, and Larry V. Hedges3
  • Affiliations: 1Behavioural Science Centre, Stirling Management School, University of Stirling, Stirling FK9 4LA, United Kingdom; email: [email protected]; 2Department of Psychological and Behavioural Science, London School of Economics and Political Science, London WC2A 2AE, United Kingdom; 3Department of Statistics, Northwestern University, Evanston, Illinois 60208, USA; email: [email protected]
  • Vol. 70:747-770 (Volume publication date January 2019) https://doi.org/10.1146/annurev-psych-010418-102803
  • First published as a Review in Advance on August 08, 2018
  • Copyright © 2019 by Annual Reviews. All rights reserved

Systematic reviews are characterized by a methodical and replicable methodology and presentation. They involve a comprehensive search to locate all relevant published and unpublished work on a subject; a systematic integration of search results; and a critique of the extent, nature, and quality of evidence in relation to a particular research question. The best reviews synthesize studies to draw broad theoretical conclusions about what a literature means, linking theory to evidence and evidence to theory. This guide describes how to plan, conduct, organize, and present a systematic review of quantitative (meta-analysis) or qualitative (narrative review, meta-synthesis) information. We outline core standards and principles and describe commonly encountered problems. Although this guide targets psychological scientists, its high level of abstraction makes it potentially relevant to any subject area or discipline. We argue that systematic reviews are a key methodology for clarifying whether and how research findings replicate and for explaining possible inconsistencies, and we call for researchers to conduct systematic reviews to help elucidate whether there is a replication crisis.
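As a minimal illustration of the quantitative (meta-analysis) route described in the abstract, the Python sketch below pools a handful of invented effect sizes using fixed-effect inverse-variance weighting and reports Cochran's Q and the I² heterogeneity statistic; the numbers are hypothetical and the code is not drawn from the article itself.

```python
import numpy as np

# Hypothetical standardized mean differences (d) and their variances
# from five made-up primary studies (purely illustrative values).
effects = np.array([0.30, 0.45, 0.12, 0.50, 0.28])
variances = np.array([0.04, 0.09, 0.02, 0.12, 0.05])

# Fixed-effect, inverse-variance pooling
weights = 1.0 / variances
pooled = np.sum(weights * effects) / np.sum(weights)
pooled_se = np.sqrt(1.0 / np.sum(weights))
ci_low, ci_high = pooled - 1.96 * pooled_se, pooled + 1.96 * pooled_se

# Cochran's Q and the I^2 heterogeneity statistic
q = np.sum(weights * (effects - pooled) ** 2)
df = len(effects) - 1
i_squared = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0

print(f"Pooled effect: {pooled:.3f} (95% CI {ci_low:.3f} to {ci_high:.3f})")
print(f"Q = {q:.2f}, I^2 = {i_squared:.1f}%")
```

In practice, a random-effects model is usually preferred when between-study heterogeneity is substantial; the fixed-effect version is shown only because it is the simplest case.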



Systematic Literature Review

  • First Online: 01 January 2014


  • Aline Dresch 4
  • Daniel Pacheco Lacerda 4
  • José Antônio Valle Antunes Jr 5


This chapter presents a method for performing a Systematic Literature Review, a critical step in conducting scientific research, and focuses in particular on the importance of this step for research conducted under the Design Science paradigm.

The knowledge of the world is only to be acquired in the world, and not in the closet. (Philip Chesterfield)

Maria Isabel Wolf Motta Morandi — UNISINOS — [email protected] .

Luis Felipe Riehs Camargo — UNISINOS — [email protected] .



Author information

Authors and Affiliations

GMAP | UNISINOS, Porto Alegre/RS, Brazil

Aline Dresch & Daniel Pacheco Lacerda

UNISINOS, Porto Alegre/RS, Brazil

José Antônio Valle Antunes Jr


Corresponding author

Correspondence to Aline Dresch .


Copyright information

© 2015 Springer International Publishing Switzerland

About this chapter

Dresch, A., Lacerda, D.P., Antunes, J.A.V. (2015). Systematic Literature Review. In: Design Science Research. Springer, Cham. https://doi.org/10.1007/978-3-319-07374-3_7


DOI: https://doi.org/10.1007/978-3-319-07374-3_7

Published: 20 August 2014

Publisher Name: Springer, Cham

Print ISBN: 978-3-319-07373-6

Online ISBN: 978-3-319-07374-3

eBook Packages: Business and Economics, Business and Management (R0)


Preparing your manuscript

This section provides general style and formatting information only. Formatting guidelines for specific article types can be found below. 

  • Methodology
  • Systematic review update

General formatting guidelines

  • Preparing main manuscript text
  • Preparing illustrations and figures
  • Preparing tables
  • Preparing additional files

Preparing figures

When preparing figures, please follow the formatting instructions below.

  • Figures should be numbered in the order they are first mentioned in the text, and uploaded in this order. Multi-panel figures (those with parts a, b, c, d etc.) should be submitted as a single composite file that contains all parts of the figure.
  • Figures should be uploaded in the correct orientation.
  • Figure titles (max 15 words) and legends (max 300 words) should be provided in the main manuscript, not in the graphic file.
  • Figure keys should be incorporated into the graphic, not into the legend of the figure.
  • Each figure should be closely cropped to minimize the amount of white space surrounding the illustration. Cropping figures improves accuracy when placing the figure in combination with other elements when the accepted manuscript is prepared for publication on our site. For more information on individual figure file formats, see our detailed instructions.
  • Individual figure files should not exceed 10 MB. If a suitable format is chosen, this file size is adequate for extremely high quality figures.
  • Please note that it is the responsibility of the author(s) to obtain permission from the copyright holder to reproduce figures (or tables) that have previously been published elsewhere. In order for all figures to be open access, authors must have permission from the rights holder if they wish to include images that have been published elsewhere in non open access journals. Permission should be indicated in the figure legend, and the original source included in the reference list.

Figure file types

We accept the following file formats for figures:

  • EPS (suitable for diagrams and/or images)
  • PDF (suitable for diagrams and/or images)
  • Microsoft Word (suitable for diagrams and/or images, figures must be a single page)
  • PowerPoint (suitable for diagrams and/or images, figures must be a single page)
  • TIFF (suitable for images)
  • JPEG (suitable for photographic images, less suitable for graphical images)
  • PNG (suitable for images)
  • BMP (suitable for images)
  • CDX (ChemDraw - suitable for molecular structures)

For information and suggestions of suitable file formats for specific figure types, please see our author academy .

Figure size and resolution

Figures are resized during publication of the final full text and PDF versions to conform to the BioMed Central standard dimensions, which are detailed below.

Figures on the web:

  • width of 600 pixels (standard), 1200 pixels (high resolution).

Figures in the final PDF version:

  • width of 85 mm for half page width figure
  • width of 170 mm for full page width figure
  • maximum height of 225 mm for figure and legend
  • image resolution of approximately 300 dpi (dots per inch) at the final size

Figures should be designed such that all information, including text, is legible at these dimensions. All lines should be wider than 0.25 pt when constrained to standard figure widths. All fonts must be embedded.
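As a worked example of these dimensions, the sketch below (assuming Python with Matplotlib, plus Pillow for TIFF output; the file names and plotted values are placeholders) creates a full-page-width figure at 170 mm and exports it at roughly 300 dpi with fonts embedded in the vector version.

```python
import matplotlib
import matplotlib.pyplot as plt

# Embed fonts as TrueType (Type 42) in vector output so all text stays present.
matplotlib.rcParams["pdf.fonttype"] = 42
matplotlib.rcParams["ps.fonttype"] = 42

MM_PER_INCH = 25.4
width_mm, height_mm = 170, 120          # full-page-width figure, under the 225 mm height cap
fig, ax = plt.subplots(figsize=(width_mm / MM_PER_INCH, height_mm / MM_PER_INCH))

ax.plot([2015, 2017, 2019, 2021, 2023], [120, 180, 260, 340, 410], marker="o")
ax.set_xlabel("Year")
ax.set_ylabel("Published reviews (illustrative counts)")

# ~300 dpi at the final size for raster output; PDF as the vector alternative.
fig.savefig("figure1.tiff", dpi=300)
fig.savefig("figure1.pdf")
```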

Figure file compression

  • Vector figures should, if possible, be submitted as PDF files, which are usually more compact than EPS files.
  • TIFF files should be saved with LZW compression, which is lossless (it decreases file size without decreasing quality), in order to minimize upload time (see the sketch after this list).
  • JPEG files should be saved at maximum quality.
  • Conversion of images between file types (especially lossy formats such as JPEG) should be kept to a minimum to avoid degradation of quality.
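If figures are post-processed programmatically, the compression settings above can be applied with Pillow; the sketch below is illustrative only, and the file names are placeholders rather than journal requirements.

```python
from PIL import Image

# Open a previously exported raster figure (placeholder filename).
img = Image.open("figure1.png")

# TIFF with lossless LZW compression keeps quality while shrinking the upload.
img.save("figure1.tiff", compression="tiff_lzw")

# If a JPEG is unavoidable (photographic images), save it once at maximum quality
# rather than re-encoding repeatedly, to avoid cumulative lossy degradation.
img.convert("RGB").save("figure1.jpg", quality=95)
```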

If you have any questions or are experiencing a problem with figures, please contact the customer service team at [email protected] .

Preparing main manuscript text

Quick points:

  • Use double line spacing
  • Include line and page numbering
  • Use SI units
  • Ensure that all special characters used are embedded in the text, otherwise they will be lost during conversion to PDF
  • Do not use page breaks in your manuscript

File formats

The following word processor file formats are acceptable for the main manuscript document:

  • Microsoft Word (DOC, DOCX)
  • Rich text format (RTF)
  • TeX/LaTeX 

Please note: editable files are required for processing in production. If your manuscript contains any non-editable files (such as PDFs) you will be required to re-submit an editable file when you submit your revised manuscript, or after editorial acceptance in case no revision is necessary.

Additional information for TeX/LaTeX users

You are encouraged to use the Springer Nature LaTeX template when preparing a submission. A PDF of your manuscript files will be compiled during submission using pdfLaTeX and TexLive 2021. All relevant editable source files must be uploaded during the submission process. Failing to submit these source files will cause unnecessary delays in the production process.

Style and language

Improving your written English

Presenting your work in well-written English gives it the best chance of being understood and evaluated fairly by editors and reviewers.

We offer editing services that can help you get your writing ready for submission.

Language quality checker 

You can upload your manuscript and get a free language check from our partner AJE. The software uses AI to make suggestions that can improve writing quality. Trained on more than 300,000 research manuscripts from over 400 areas of study and more than 2,000 field-specific topics, the tool delivers fast, highly accurate English language improvements. Your paper will be digitally edited and returned to you within approximately 10 minutes.

Try the tool for free now

Language and manuscript preparation services 

Let one of our experts assist you with getting your manuscript and language into shape. Our services cover:

  • English language improvement 
  • scientific in-depth editing and strategic advice 
  • figure and tables formatting 
  • manuscript formatting to match your target journal
  • specialist academic translation to English from Spanish, Portuguese, Japanese, or simplified Chinese

Get started and save 15%

Please note that using these tools, or any other service, is not a requirement for publication, nor does it imply or guarantee that editors will accept the article, or even select it for peer review.


Data and materials

For all journals, BioMed Central strongly encourages all datasets on which the conclusions of the manuscript rely to be either deposited in publicly available repositories (where available and appropriate) or presented in the main paper or additional supporting files, in machine-readable format (such as spreadsheets rather than PDFs) whenever possible. Please see the list of recommended repositories in our editorial policies.

For some journals, deposition of the data on which the conclusions of the manuscript rely is an absolute requirement. Please check the Instructions for Authors for the relevant journal and article type for journal specific policies.

For all manuscripts, information about data availability should be detailed in an ‘Availability of data and materials’ section. For more information on the content of this section, please see the Declarations section of the relevant journal’s Instructions for Authors. For more information on BioMed Central's policies on data availability, please see our editorial policies.

Formatting the 'Availability of data and materials' section of your manuscript

The following format for the 'Availability of data and materials' section of your manuscript should be used:

"The dataset(s) supporting the conclusions of this article is(are) available in the [repository name] repository, [unique persistent identifier and hyperlink to dataset(s) in http:// format]."

The following format is required when data are included as additional files:

"The dataset(s) supporting the conclusions of this article is(are) included within the article (and its additional file(s))."

BioMed Central endorses the Force 11 Data Citation Principles and requires that all publicly available datasets be fully referenced in the reference list with an accession number or unique identifier such as a DOI.

For databases, this section should state the web/ftp address at which the database is available and any restrictions to its use by non-academics.

For software, this section should include:

  • Project name: e.g. My bioinformatics project
  • Project home page: e.g. http://sourceforge.net/projects/mged
  • Archived version: DOI or unique identifier of archived software or code in a repository (e.g. Zenodo)
  • Operating system(s): e.g. Platform independent
  • Programming language: e.g. Java
  • Other requirements: e.g. Java 1.3.1 or higher, Tomcat 4.0 or higher
  • License: e.g. GNU GPL, FreeBSD etc.
  • Any restrictions to use by non-academics: e.g. licence needed

Information on available repositories for other types of scientific data, including clinical data, can be found in our editorial policies .

See our editorial policies for author guidance on good citation practice.

Please check the submission guidelines for the relevant journal and article type. 

What should be cited?

Only articles, clinical trial registration records and abstracts that have been published or are in press, or are available through public e-print/preprint servers, may be cited.

Unpublished abstracts, unpublished data and personal communications should not be included in the reference list, but may be included in the text and referred to as "unpublished observations" or "personal communications" giving the names of the involved researchers. Obtaining permission to quote personal communications and unpublished data from the cited colleagues is the responsibility of the author. Only footnotes are permitted. Journal abbreviations follow Index Medicus/MEDLINE.

Any in press articles cited within the references and necessary for the reviewers' assessment of the manuscript should be made available if requested by the editorial office.

How to format your references

Please check the Instructions for Authors for the relevant journal and article type for examples of the relevant reference style.

Web links and URLs: All web links and URLs, including links to the authors' own websites, should be given a reference number and included in the reference list rather than within the text of the manuscript. They should be provided in full, including both the title of the site and the URL, as well as the date the site was accessed, in the following format: The Mouse Tumor Biology Database. http://tumor.informatics.jax.org/mtbwi/index.do . Accessed 20 May 2013. If an author or group of authors can clearly be associated with a web link, such as for weblogs, then they should be included in the reference.

Authors may wish to make use of reference management software to ensure that reference lists are correctly formatted.

Preparing tables

When preparing tables, please follow the formatting instructions below.

  • Tables should be numbered and cited in the text in sequence using Arabic numerals (i.e. Table 1, Table 2 etc.).
  • Tables less than one A4 or Letter page in length can be placed in the appropriate location within the manuscript.
  • Tables larger than one A4 or Letter page in length can be placed at the end of the document text file. Please cite and indicate where the table should appear at the relevant location in the text file so that the table can be added in the correct place during production.
  • Larger datasets, or tables too wide for A4 or Letter landscape page can be uploaded as additional files. Please see [below] for more information.
  • Tabular data provided as additional files can be uploaded as an Excel spreadsheet (.xls) or comma separated values (.csv). Please use the standard file extensions (see the sketch after this list).
  • Table titles (max 15 words) should be included above the table, and legends (max 300 words) should be included underneath the table.
  • Tables should not be embedded as figures or spreadsheet files, but should be formatted using the ‘Table object’ function in your word processing program.
  • Color and shading may not be used. Parts of the table can be highlighted using superscript, numbering, lettering, symbols or bold text, the meaning of which should be explained in a table legend.
  • Commas should not be used to indicate numerical values.
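Where a large table is produced programmatically, it can be exported directly in one of the accepted additional-file formats; the short pandas sketch below is illustrative, and its column names, values and file name are hypothetical.

```python
import pandas as pd

# Illustrative data-extraction table; column names and values are placeholders.
table = pd.DataFrame(
    {
        "study": ["Author A 2019", "Author B 2021"],
        "design": ["RCT", "Cohort"],
        "n": [150, 420],
        "key_result": ["d = 0.30", "OR = 1.8"],
    }
)

# Comma-separated values with the standard extension, one accepted format
# for tabular additional files (an Excel spreadsheet is the other).
table.to_csv("Additional_file_2_extraction_table.csv", index=False)
```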

If you have any questions or are experiencing a problem with tables, please contact the customer service team at [email protected] .

Preparing additional files

As the length and quantity of data are not restricted for many article types, authors can provide datasets, tables, movies, or other information as additional files.

All Additional files will be published along with the accepted article. Do not include files such as patient consent forms, certificates of language editing, or revised versions of the main manuscript document with tracked changes. Such files, if requested, should be sent by email to the journal’s editorial email address, quoting the manuscript reference number. Please do not send completed patient consent forms unless requested.

Results that would otherwise be indicated as "data not shown" should be included as additional files. Since many web links and URLs rapidly become broken, BioMed Central requires that supporting data are included as additional files, or deposited in a recognized repository. Please do not link to data on a personal/departmental website. Do not include any individual participant details. The maximum file size for additional files is 20 MB each, and files will be virus-scanned on submission. Each additional file should be cited in sequence within the main body of text.
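Because the 20 MB per-file limit is easy to exceed with high-resolution media, a quick local check before submission can help. The sketch below is illustrative: the file names are placeholders, and it treats 20 MB as 20 MiB.

```python
import os

MAX_BYTES = 20 * 1024 * 1024  # the stated 20 MB per-file limit, read here as 20 MiB

# Placeholder file names; substitute your own additional files.
for path in ["Additional_file_1.pdf", "Additional_file_2_extraction_table.csv"]:
    size = os.path.getsize(path)
    status = "OK" if size <= MAX_BYTES else "TOO LARGE"
    print(f"{path}: {size / 1_048_576:.1f} MB ({status})")
```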

If additional material is provided, please list the following information in a separate section of the manuscript text:

  • File name (e.g. Additional file 1)
  • File format including the correct file extension for example .pdf, .xls, .txt, .pptx (including name and a URL of an appropriate viewer if format is unusual)
  • Title of data
  • Description of data

Additional files should be named "Additional file 1" and so on and should be referenced explicitly by file name within the body of the article, e.g. 'An additional movie file shows this in more detail [see Additional file 1]'.

For further guidance on how to use Additional files or recommendations on how to present particular types of data or information, please see How to use additional files .


Annual Journal Metrics

Citation Impact 2023: Journal Impact Factor: 6.3; 5-year Journal Impact Factor: 4.5; Source Normalized Impact per Paper (SNIP): 1.919; SCImago Journal Rank (SJR): 1.620

Speed 2023: Submission to first editorial decision (median days): 92; Submission to acceptance (median days): 296

Usage 2023: Downloads: 3,531,065; Altmetric mentions: 3,533


Systematic Reviews

ISSN: 2046-4053

  • General enquiries: [email protected]


  • Volume 24, Issue 2
  • Five tips for developing useful literature summary tables for writing review articles

  • Ahtisham Younas1,2, http://orcid.org/0000-0003-0157-5319
  • Parveen Ali3,4, http://orcid.org/0000-0002-7839-8130
  • 1 Memorial University of Newfoundland, St John's, Newfoundland, Canada
  • 2 Swat College of Nursing, Pakistan
  • 3 School of Nursing and Midwifery, University of Sheffield, Sheffield, South Yorkshire, UK
  • 4 Sheffield University Interpersonal Violence Research Group, Sheffield University, Sheffield, UK
  • Correspondence to Ahtisham Younas, Memorial University of Newfoundland, St John's, NL A1C 5C4, Canada; ay6133{at}mun.ca

https://doi.org/10.1136/ebnurs-2021-103417


Introduction

Literature reviews offer a critical synthesis of empirical and theoretical literature to assess the strength of evidence, develop guidelines for practice and policymaking, and identify areas for future research. 1 A literature review is often essential and is usually the first task in any research endeavour, particularly in master's or doctoral education. For effective data extraction and rigorous synthesis in reviews, the use of literature summary tables is of utmost importance. A literature summary table provides a synopsis of an included article: it succinctly presents the article's purpose, methods, findings and other information pertinent to the review. The aim of developing these literature summary tables is to give the reader the key information at a glance. Since there are multiple types of reviews (eg, systematic, integrative, scoping, critical and mixed methods) with distinct purposes and techniques, 2 there are various approaches to developing literature summary tables, which makes the task complex, especially for novice researchers or reviewers. Here, we offer five tips, relevant to all types of reviews, for authors of review articles who want to create useful and relevant literature summary tables. We also provide examples from our published reviews to illustrate how useful literature summary tables can be developed and what sort of information should be provided.

Tip 1: provide detailed information about frameworks and methods

Figure 1: Tabular literature summaries from a scoping review. Source: Rasheed et al.3

The provision of information about conceptual and theoretical frameworks and methods is useful for several reasons. First, in quantitative reviews (reviews synthesising the results of quantitative studies) and mixed reviews (reviews synthesising the results of both qualitative and quantitative studies to address a mixed review question), it allows readers to assess the congruence of the core findings and methods with the adopted framework and tested assumptions. In qualitative reviews (reviews synthesising the results of qualitative studies), this information helps readers recognise the underlying philosophical and paradigmatic stance of the authors of the included articles. For example, imagine that the authors of an article included in a review used phenomenological inquiry for their research. In that case, the review authors and the readers of the review need to know which philosophical stance (transcendental or hermeneutic) guided the inquiry. Review authors should, therefore, include the philosophical stance in their literature summary for that article. Second, information about frameworks and methods enables review authors and readers to judge the quality of the research, which allows the strengths and limitations of the article to be discerned. For example, suppose the authors of an included article intended to develop a new scale and test its psychometric properties and, to achieve this aim, used a convenience sample of 150 participants and performed exploratory factor analysis (EFA) and confirmatory factor analysis (CFA) on the same sample. Such an approach would indicate a flawed methodology because EFA and CFA should not be conducted on the same sample. The review authors must include this information in their summary table; omitting it could lead to the inclusion of a flawed article in the review, thereby jeopardising the review's rigour.
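The sample-splitting point above can be made concrete with a small sketch on hypothetical data: scikit-learn's FactorAnalysis stands in for the exploratory step, and the confirmatory model is only indicated in a comment, since fitting it would require a dedicated CFA/SEM package.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(0)
X = rng.normal(size=(150, 12))   # hypothetical 150 participants x 12 scale items

# Split the sample so the exploratory and confirmatory steps use different participants.
X_efa, X_cfa = train_test_split(X, test_size=0.5, random_state=0)

# Exploratory step on one half only (FactorAnalysis as a stand-in for EFA).
efa = FactorAnalysis(n_components=3, random_state=0).fit(X_efa)
print("EFA loadings shape:", efa.components_.T.shape)

# The confirmatory model would then be specified from these loadings and fitted
# to X_cfa with a dedicated CFA/SEM package, not to X_efa again.
```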

Tip 2: include strengths and limitations for each article

Critical appraisal of the individual articles included in a review is crucial for increasing the rigour of the review. Despite using various templates for critical appraisal, authors often do not provide detailed information about each reviewed article's strengths and limitations. Merely noting the quality score based on standardised critical appraisal templates is not adequate, because readers should be able to identify the reasons for assigning a weak or moderate rating. Many recent critical appraisal checklists (eg, the Mixed Methods Appraisal Tool) discourage review authors from assigning a quality score and instead recommend noting the main strengths and limitations of included studies. It is also vital to report the methodological and conceptual limitations and strengths of the articles included in the review, because not all review articles include empirical research papers; rather, some reviews synthesise the theoretical aspects of articles. Providing information about conceptual limitations is also important for readers to judge the quality of the foundations of the research. For example, if you included a mixed-methods study in the review, reporting the methodological and conceptual limitations of 'integration' is critical for evaluating the study's strength. Suppose the authors only collected qualitative and quantitative data and did not state the intent and timing of integration. In that case, the study is weak: integration occurred only at the level of data collection and may not have occurred at the analysis, interpretation and reporting levels.

Tip 3: write conceptual contribution of each reviewed article

While reading and evaluating review papers, we have observed that many review authors only provide core results of the article included in a review and do not explain the conceptual contribution offered by the included article. We refer to conceptual contribution as a description of how the article’s key results contribute towards the development of potential codes, themes or subthemes, or emerging patterns that are reported as the review findings. For example, the authors of a review article noted that one of the research articles included in their review demonstrated the usefulness of case studies and reflective logs as strategies for fostering compassion in nursing students. The conceptual contribution of this research article could be that experiential learning is one way to teach compassion to nursing students, as supported by case studies and reflective logs. This conceptual contribution of the article should be mentioned in the literature summary table. Delineating each reviewed article’s conceptual contribution is particularly beneficial in qualitative reviews, mixed-methods reviews, and critical reviews that often focus on developing models and describing or explaining various phenomena. Figure 2 offers an example of a literature summary table. 4

Figure 2: Tabular literature summaries from a critical review. Source: Younas and Maddigan.4

Tip 4: compose potential themes from each article during summary writing

While developing literature summary tables, many authors use the themes or subthemes reported in the included articles as the key results of their own review. Such an approach prevents the review authors from understanding the article's conceptual contribution, developing a rigorous synthesis and drawing reasonable interpretations of results from an individual article. Ultimately, it affects the generation of novel review findings. For example, one of the articles about women's healthcare-seeking behaviours in developing countries reported a theme of 'social-cultural determinants of health as precursors of delays'. Instead of using this theme as one of the review findings, the reviewers should read and interpret beyond the description given in the article, compare and contrast the themes and findings of one article with those of other articles to identify similarities and differences, and thereby understand and explain the bigger picture for their readers. Therefore, while developing literature summary tables, think twice before using predeveloped themes. Including your own themes in the summary tables (see figure 1) demonstrates to readers that a robust method of data extraction and synthesis has been followed.

Tip 5: create your personalised template for literature summaries

Templates are often available for data extraction and the development of literature summary tables. The available templates may take the form of a table, chart or structured framework that extracts essential information about every article. The commonly used information includes authors, purpose, methods, key results and quality scores. While extracting all relevant information is important, such templates should be tailored to meet the needs of the individual review. For example, for a review about the effectiveness of healthcare interventions, a literature summary table must include information about the intervention, its type, content, timing, duration, setting, effectiveness, negative consequences, and receivers' and implementers' experiences of its use. Similarly, literature summary tables for articles included in a meta-synthesis must include information about the participants' characteristics, research context and conceptual contribution of each reviewed article, so as to help the reader make an informed judgement about the usefulness of each individual article and of the review as a whole.
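If data extraction is managed in software rather than a word processor, a personalised template can simply be a spreadsheet whose columns mirror the fields discussed in the tips above. The pandas sketch below is one illustrative way to set this up; the column names and the single example row are hypothetical rather than taken from the cited reviews.

```python
import pandas as pd

# Columns combine the commonly used fields with the review-specific additions
# suggested in Tips 1-4; adapt them to the needs of the individual review.
columns = [
    "authors", "purpose", "framework", "methods", "participants",
    "key_results", "strengths", "limitations",
    "conceptual_contribution", "potential_themes",
]

summary = pd.DataFrame(columns=columns)

# Hypothetical entry for one included article.
summary.loc[len(summary)] = [
    "Author A et al. (2020)",
    "Explore compassion teaching strategies",
    "Hermeneutic phenomenology",
    "Interviews with 12 nursing students",
    "Undergraduate nursing students, one site",
    "Case studies and reflective logs fostered compassion",
    "Rich description of teaching context",
    "Single site; no data on longer-term outcomes",
    "Experiential learning as a route to teaching compassion",
    "Experiential learning; reflection",
]

summary.to_csv("literature_summary_table.csv", index=False)
```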

In conclusion, narrative or systematic reviews are almost always conducted as a part of any educational project (thesis or dissertation) or academic or clinical research. Literature reviews are the foundation of research on a given topic. Robust and high-quality reviews play an instrumental role in guiding research, practice and policymaking. However, the quality of reviews is also contingent on rigorous data extraction and synthesis, which require developing literature summaries. We have outlined five tips that could enhance the quality of the data extraction and synthesis process by developing useful literature summaries.

  • Aromataris E ,
  • Rasheed SP ,

Twitter @Ahtisham04, @parveenazamali

Funding The authors have not declared a specific grant for this research from any funding agency in the public, commercial or not-for-profit sectors.

Competing interests None declared.

Patient consent for publication Not required.

Provenance and peer review Not commissioned; externally peer reviewed.



Systematic literature reviews over the years

Affiliations

  • 1 Assignity, Krakow, Poland.
  • 2 Public Health Department, Aix-Marseille University, Marseille, France.
  • 3 Clever-Access, Paris, France.
  • PMID: 37614556
  • PMCID: PMC10443963
  • DOI: 10.1080/20016689.2023.2244305

Purpose: Nowadays, systematic literature reviews (SLRs) and meta-analyses are often placed at the top of the hierarchy of evidence. The main objective of this paper is to evaluate trends in SLRs of randomized controlled trials (RCTs) over the years. Methods: The Medline database was searched using a highly focused search strategy. Each paper was coded according to a specific ICD-10 code, and the number of RCTs included in each evaluated SLR was also retrieved. All SLRs analyzing RCTs were included; protocols, commentaries, and errata were excluded. No restrictions were applied. Results: A total of 7,465 titles and abstracts were analyzed, of which 6,892 were included for further analyses. There was a gradual increase in the number of SLRs published annually, with a significant increase during the last several years. Overall, the most frequently analyzed areas were diseases of the circulatory system (n = 750) and endocrine, nutritional, and metabolic diseases (n = 734). The majority of SLRs included between 11 and 50 RCTs each. Conclusions: Recognition of the usefulness of SLRs is growing at an increasing speed, as reflected by the growing number of published studies. The most frequently evaluated diseases are in alignment with the leading causes of death and disability worldwide.
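For readers who want to explore similar trends themselves, a Medline/PubMed search can be run programmatically through the NCBI E-utilities, for example via Biopython's Entrez module. The sketch below uses a deliberately generic query and date window; it is not the authors' actual "highly focused" search strategy, which is not reproduced in this excerpt.

```python
from Bio import Entrez

Entrez.email = "[email protected]"  # NCBI requires a contact address; placeholder shown here

# Generic, illustrative query for systematic reviews of randomized controlled trials.
query = (
    '("systematic review"[Title/Abstract]) AND '
    '("randomized controlled trial"[Title/Abstract] OR '
    '"randomised controlled trial"[Title/Abstract])'
)

# retmax=0 returns only the hit count, which is enough for a trend check.
handle = Entrez.esearch(db="pubmed", term=query, datetype="pdat",
                        mindate="2000", maxdate="2023", retmax=0)
record = Entrez.read(handle)
handle.close()

print("Matching records:", record["Count"])
```

Counting hits per year or per ICD-10 chapter, as the paper does, would require further coding of each retrieved record; that step is not sketched here.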

Keywords: ICD-10 classification; Systematic literature review; randomized controlled trial; rapid review.

© 2023 The Author(s). Published by Informa UK Limited, trading as Taylor & Francis Group.


Conflict of interest statement

No potential conflict of interest was reported by the author(s).

Figures:

  • The number of RCT SLRs published over the years. Source: PubMed (search run…
  • The number of RCTs published over the years. Source: PubMed (search run in…
  • Distribution of RCT SLRs over the years (search run in May 2023).
  • Number of RCT SLRs stratified by disease area and by the number of…
  • The number of RCT SLRs stratified by disease area, published in 3 distinct…
  • The number of SLRs on COVID-19 published since 2020. Source: PubMed (search run…
  • The number of epidemiological SLRs published over the years. Source: PubMed (search run…
  • The burden of disease by cause, measured in DALYs. Source: Institute for Health…



Comparative efficacy and safety of bimekizumab in psoriatic arthritis: a systematic literature review and network meta-analysis


Philip J Mease, Dafna D Gladman, Joseph F Merola, Peter Nash, Stacy Grieve, Victor Laliman-Khara, Damon Willems, Vanessa Taieb, Adam R Prickett, Laura C Coates, Comparative efficacy and safety of bimekizumab in psoriatic arthritis: a systematic literature review and network meta-analysis, Rheumatology , Volume 63, Issue 7, July 2024, Pages 1779–1789, https://doi.org/10.1093/rheumatology/kead705


Objectives: To understand the relative efficacy and safety of bimekizumab, a selective inhibitor of IL-17F in addition to IL-17A, vs other biologic and targeted synthetic DMARDs (b/tsDMARDs) for PsA using network meta-analysis (NMA).

Methods: A systematic literature review (most recent update conducted on 1 January 2023) identified randomized controlled trials (RCTs) of b/tsDMARDs in PsA. Bayesian NMAs were conducted for efficacy outcomes at Weeks 12–24 for b/tsDMARD-naïve and TNF inhibitor (TNFi)-experienced patients. Safety at Weeks 12–24 was analysed in a mixed population. Odds ratios (ORs) and differences of mean change with the associated 95% credible interval (CrI) were calculated for the best-fitting models, and the surface under the cumulative ranking curve (SUCRA) values were calculated to determine relative rank.

Results: The NMA included 41 RCTs for 22 b/tsDMARDs. For minimal disease activity (MDA), bimekizumab ranked 1st in b/tsDMARD-naïve patients and 2nd in TNFi-experienced patients. In b/tsDMARD-naïve patients, bimekizumab ranked 6th, 5th and 3rd for ACR20/50/70 response, respectively. In TNFi-experienced patients, bimekizumab ranked 1st, 2nd and 1st for ACR20/50/70, respectively. For Psoriasis Area and Severity Index (PASI) 90/100, bimekizumab ranked 2nd and 1st in b/tsDMARD-naïve patients, respectively, and 1st and 2nd in TNFi-experienced patients, respectively. Bimekizumab was comparable to b/tsDMARDs for serious adverse events.

Conclusion: Bimekizumab ranked favourably among b/tsDMARDs for efficacy on joint, skin and MDA outcomes, and showed comparable safety, suggesting it may be a beneficial treatment option for patients with PsA.

For joint efficacy, bimekizumab ranked highly among approved biologic/targeted synthetic DMARDs (b/tsDMARDs).

Bimekizumab provides better skin efficacy (Psoriasis Area and Severity Index, PASI100 and PASI90) than many other available treatments in PsA.

For minimal disease activity, bimekizumab ranked highest of all available b/tsDMARDs in b/tsDMARD-naïve and TNF inhibitor–experienced patients.

PsA is a chronic, systemic, inflammatory disease in which patients experience a high burden of illness [ 1–3 ]. PsA has multiple articular and extra-articular disease manifestations including peripheral arthritis, axial disease, enthesitis, dactylitis, skin psoriasis (PSO) and psoriatic nail disease [ 4 , 5 ]. Patients with PsA can also suffer from related inflammatory conditions, uveitis and IBD [ 4 , 5 ]. Approximately one fifth of all PSO patients, increasing to one quarter of patients with moderate to severe PSO, will develop PsA over time [ 6 , 7 ].

The goal of treatment is to control inflammation and prevent structural damage to minimize disease burden, normalize function and social participation, and maximize the quality of life of patients [ 1 , 4 ]. As PsA is a heterogeneous disease, the choice of treatment is guided by individual patient characteristics, efficacy against the broad spectrum of skin and joint symptoms, and varying contraindications to treatments [ 1 , 4 ]. There are a number of current treatments classed as conventional DMARDs such as MTX, SSZ, LEF; biologic (b) DMARDs such as TNF inhibitors (TNFi), IL inhibitors and cytotoxic T lymphocyte antigen 4 (CTLA4)-immunoglobulin; and targeted synthetic (ts) DMARDs which include phosphodiesterase-4 (PDE4) and Janus kinase (JAK) inhibitors [ 1 , 8 ].

Despite the number of available treatment options, the majority of patients with PsA report that they do not achieve remission and additional therapeutic options are needed [ 9 , 10 ]. Thus, the treatment landscape for PsA continues to evolve and treatment decisions increase in complexity, especially as direct comparative data are limited [ 2 ].

Bimekizumab, a monoclonal IgG1 antibody that selectively inhibits IL-17F in addition to IL-17A, is approved for the treatment of adults with active PsA in Europe [ 11 , 12 ]. Both IL-17A and IL-17F are pro-inflammatory cytokines implicated in PsA [ 11 , 13 ]. IL-17F is structurally similar to IL-17A and expressed by the same immune cells; however, the mechanisms that regulate expression and kinetics differ [ 13 , 14 ]. IL-17A and IL-17F are expressed as homodimers and as IL-17A–IL-17F heterodimers that bind to and signal via the same IL-17 receptor A/C complex [ 13 , 15 ].

In vitro studies have demonstrated that dual inhibition of IL-17A and IL-17F with bimekizumab suppressed PsA inflammatory gene expression, T cell and neutrophil migration, and periosteal new bone formation more effectively than blocking IL-17A alone [ 11 , 14 , 16 , 17 ]. Furthermore, IL-17A and IL-17F protein levels are elevated in psoriatic lesions, and a head-to-head trial in PSO demonstrated the superiority of bimekizumab 320 mg every 4 weeks (Q4W) or every 8 weeks (Q8W) over the IL-17A inhibitor secukinumab in complete clearance of psoriatic skin [ 16 , 18 ]. Collectively, this evidence suggests that neutralizing both IL-17F and IL-17A may provide more potent abrogation of IL-17-mediated inflammation than neutralizing IL-17A alone.

Bimekizumab 160 mg Q4W demonstrated significant improvements in efficacy outcomes compared with placebo, and an acceptable safety profile in adults with PsA in the phase 3 RCTs BE OPTIMAL (NCT03895203) (b/tsDMARD-naïve patients) and BE COMPLETE (NCT03896581) (TNFi inadequate responders) [ 19 , 20 ].

The objective of this study was to establish the comparative efficacy and safety of bimekizumab 160 mg Q4W vs other available PsA treatments, using network meta-analysis (NMA).

Search strategy

A systematic literature review (SLR) was conducted according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines [ 21 ] and adhered to the principles outlined in the Cochrane Handbook for Systematic Reviews of Interventions, Centre for Reviews and Dissemination’s Guidance for Undertaking Reviews in Healthcare, and Methods for the Development of National Institute for Health and Care Excellence (NICE) Public Health Guidance [ 22–24 ]. The SLR of English-language publications was originally conducted on 3 December 2015, with updates on 7 January 2020, 2 May 2022 and 1 January 2023 in Medical Literature Analysis and Retrieval System Online (MEDLINE ® ), Excerpta Medica Database (Embase ® ) and the Cochrane Central Register of Controlled Trials (CENTRAL) for literature published from January 1991 onward using the Ovid platform. Additionally, bibliographies of SLRs and meta-analyses identified through database searches were reviewed to ensure any publications not identified in the initial search were included in this SLR. Key clinical conference proceedings not indexed in Ovid (from October 2019 to current) and ClinicalTrials.gov were also manually searched. The search strategy is presented in Supplementary Table S1 (available at Rheumatology online).

Study inclusion

Identified records were screened independently and in duplicate by two reviewers and any discrepancies were reconciled via discussion or a third reviewer. The SLR inclusion criteria were defined by the Patient populations, Interventions, Comparators, Outcome measures, and Study designs (PICOS) Statement ( Supplementary Table S2 , available at Rheumatology online). The SLR included published studies assessing approved therapies for the treatment of PsA. Collected data included study and patient population characteristics, interventions, comparators, and reported clinical and patient-reported outcomes relevant to PsA. For efficacy outcomes, pre-crossover data were extracted in studies where crossover occurred. All publications included in the analysis were evaluated according to the Cochrane risk-of-bias tool for randomized trials as described in the Cochrane Handbook [ 25 ].

Network meta-analysis methods

NMA is the quantitative assessment of relative treatment effects and associated uncertainty of two or more interventions [ 26 , 27 ]. It is used frequently in health technology assessment, guideline development and to inform treatment decision making in clinical practice [ 26 ].
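As a concrete illustration of the idea behind an NMA, the sketch below performs a simple anchored (Bucher-style) indirect comparison of two hypothetical drugs through a shared placebo arm. This is a frequentist simplification of a single edge of a network, not the Bayesian models used in this study, and all counts are invented.

```python
# Illustrative anchored indirect comparison (Bucher method) via a common
# placebo arm: a simplified frequentist analogue of one edge of an NMA,
# not the Bayesian models used in this study. All numbers are made up.
import math

def log_or_and_se(events_trt, n_trt, events_ctrl, n_ctrl):
    """Log odds ratio of treatment vs control and its standard error."""
    a, b = events_trt, n_trt - events_trt
    c, d = events_ctrl, n_ctrl - events_ctrl
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return log_or, se

# Trial 1: drug A vs placebo; Trial 2: drug B vs placebo (hypothetical counts)
logor_ap, se_ap = log_or_and_se(60, 100, 25, 100)
logor_bp, se_bp = log_or_and_se(45, 100, 20, 100)

# Indirect comparison A vs B through the shared placebo anchor
logor_ab = logor_ap - logor_bp
se_ab = math.sqrt(se_ap**2 + se_bp**2)
ci = (math.exp(logor_ab - 1.96 * se_ab), math.exp(logor_ab + 1.96 * se_ab))
print(f"OR A vs B = {math.exp(logor_ab):.2f}, 95% CI {ci[0]:.2f}-{ci[1]:.2f}")
```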

Bimekizumab 160 mg Q4W was compared with current b/tsDMARDs at regulatory-approved doses ( Table 1 ) by NMA. All comparators were selected on the basis they were relevant to clinical practice, i.e. recommended by key clinical guidelines, licensed by key regulatory bodies and/or routinely used.

Table 1. NMA intervention and comparators

Intervention
  • IL-17A/17Fi: Bimekizumab 160 mg Q4W

Comparators
  • IL-17Ai: Secukinumab 150 mg (with or without loading dose) Q4W or 300 mg Q4W; ixekizumab 80 mg Q4W
  • IL-23i: Guselkumab 100 mg Q4W or Q8W; risankizumab 150 mg Q4W
  • IL-12/23i: Ustekinumab 45 mg or 90 mg Q12W
  • TNFi: Adalimumab 40 mg Q2W; certolizumab pegol 200 mg Q2W or 400 mg Q4W (pooled); etanercept 25 mg twice a week; golimumab 50 mg s.c. Q4W or 2 mg/kg i.v. Q8W; infliximab 5 mg/kg at weeks 0, 2, 6, 14 and 22
  • CTLA4-Ig: Abatacept 125 mg Q1W
  • JAKi: Tofacitinib 5 mg BID; upadacitinib 15 mg QD
  • PDE-4i: Apremilast 30 mg BID
  • Other: Placebo

See Supplementary Table S4 , available at Rheumatology online for additional dosing schedules used in included studies. BID: twice daily; CTLA4-Ig: cytotoxic T lymphocyte antigen 4-immunoglobulin; IL-17A/17Fi: IL-17A/17F inhibitor; IL-17Ai: IL-17A inhibitor; IL-12/23i: IL-12/23 inhibitor; IL-23i: IL-23 inhibitor; JAKi: Janus kinase inhibitor; NMA: network meta-analysis; PDE-4i: phosphodiesterase-4 inhibitor; Q1W: once weekly; Q2W: every 2 weeks; Q4W: every 4 weeks; Q8W: every 8 weeks; Q12W: every 12 weeks; QD: once daily; TNFi: TNF inhibitor.

Two sets of primary analyses were conducted, one for a b/tsDMARD-naïve PsA population and one for a TNFi-experienced PsA population. Prior treatment with TNFis has been shown to impact the response to subsequent bDMARD treatments [ 28 ]. In addition, most trials involving b/tsDMARDs for the treatment of PsA (including bimekizumab) report separate data on both b/tsDMARD-naïve and TNFi-experienced subgroups, making NMA in each of these patient populations feasible.

For each population the following outcomes were analysed: American College of Rheumatology response (ACR20/50/70), Psoriasis Area and Severity Index (PASI90/100), and minimal disease activity (MDA). The analysis of serious adverse events (SAEs) was conducted using a mixed population (i.e. b/tsDMARD-naïve, TNFi-experienced and mixed population data were all included) because, following discussions with clinicians, patients’ previous TNFi exposure was not anticipated to impact safety outcomes. The NMA included studies for which data were available at week 16; if 16-week data were not available (or earlier crossover occurred), data available at weeks 12, 14 or 24 were included. Pre-crossover data were included in the analyses for efficacy outcomes to avoid intercurrent events.

Heterogeneity between studies in age, sex, ethnicity, mean time since diagnosis, and concomitant MTX, NSAID or steroid use was assessed using Grubbs’ test (also called the extreme Studentized deviate method) to identify outlier studies.
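For illustration, a minimal version of Grubbs' test applied to a set of hypothetical study-level means (e.g. mean age) might look like the sketch below; the exact implementation used by the authors is not reported, so this is only an illustration of the method.

```python
# Minimal sketch of Grubbs' test (extreme Studentized deviate) to flag a
# single outlying study on a baseline characteristic, e.g. mean age.
# Illustrative values only; the authors' implementation may differ.
import numpy as np
from scipy import stats

x = np.array([47.1, 48.3, 49.0, 50.2, 46.8, 48.9, 61.5])  # hypothetical study means

n = len(x)
g = np.max(np.abs(x - x.mean())) / x.std(ddof=1)           # Grubbs statistic

alpha = 0.05
t_crit = stats.t.ppf(1 - alpha / (2 * n), n - 2)
g_crit = (n - 1) / np.sqrt(n) * np.sqrt(t_crit**2 / (n - 2 + t_crit**2))

print(f"G = {g:.2f}, critical value = {g_crit:.2f}, outlier: {g > g_crit}")
```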

All univariate analyses involved a 10 000-iteration run-in (burn-in) phase and a 10 000-iteration phase for parameter estimation. All calculations were performed using the R2JAGS package to run Just Another Gibbs Sampler (JAGS) 3.2.3 and the code reported in the NICE Decision Support Unit (DSU) Technical Support Document series [ 29–33 ]. Convergence was confirmed through inspection of the ratios of Monte-Carlo error to the standard deviations of the posteriors; values >5% are strong signs of convergence issues [ 31 ]. In some cases, trials reported zero events for an outcome (ACR70, PASI100, SAE) in one or more arms; a continuity correction was applied in these cases because, without it, most models did not converge or produced implausibly wide posterior distributions with little clinical sense [ 31 ].
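Two of these steps can be sketched briefly: a standard 0.5 continuity correction for trial arms with zero events, and a deliberately naive Monte-Carlo-error-to-posterior-SD check. The posterior draws below are simulated stand-ins; in practice the MC error would be taken from the JAGS chain output and should account for autocorrelation (e.g. via the effective sample size).

```python
# Two small, hedged illustrations: (1) a standard 0.5 continuity correction
# for a trial arm with zero events, and (2) a naive Monte-Carlo-error check
# on posterior draws. The study used JAGS via R2JAGS; a real MC error should
# account for chain autocorrelation (e.g. effective sample size).
import numpy as np

def continuity_correct(events, totals, add=0.5):
    """If any arm has zero (or all) events, add 0.5 to every cell of that trial."""
    events, totals = np.asarray(events, float), np.asarray(totals, float)
    if (events == 0).any() or (events == totals).any():
        events = events + add
        totals = totals + 2 * add
    return events, totals

ev, n = continuity_correct([0, 7], [120, 118])        # hypothetical SAE counts
print(ev, n)

rng = np.random.default_rng(0)
draws = rng.normal(loc=0.4, scale=0.1, size=10_000)   # stand-in posterior sample
mc_error = draws.std(ddof=1) / np.sqrt(draws.size)    # naive MC standard error
ratio = mc_error / draws.std(ddof=1)
print(f"MC error / posterior SD = {ratio:.3%} (flag if > 5%)")
```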

Four NMA models [fixed effects (FE) unadjusted, FE baseline risk-adjusted, random effects (RE) unadjusted and RE baseline risk-adjusted] were assessed, and the best-fit models were chosen using methods described in NICE DSU Technical Support Document 2 [ 31 ]. Odds ratios (ORs) and differences of mean change (MC) with the associated 95% credible intervals (CrIs) were calculated for each treatment comparison in the evidence network for the best-fitting models and presented in league tables and forest plots. In addition, the probability of bimekizumab 160 mg Q4W being better than other treatments was calculated using the surface under the cumulative ranking curve (SUCRA) to determine relative rank. Conclusions (i.e. better/worse or comparable) for bimekizumab 160 mg Q4W vs comparators were based on whether the pairwise 95% CrI of the OR (dichotomous outcomes) or difference of MC (continuous outcomes) included 1 or 0, respectively. If the 95% CrI included 1 or 0, bimekizumab 160 mg Q4W and the comparator were considered comparable; if it did not, bimekizumab 160 mg Q4W was considered either better or worse depending on the direction of the effect.
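The SUCRA calculation and the credible-interval decision rule can be illustrated with a small sketch; the rank-probability matrix and posterior draws below are invented stand-ins for what the MCMC output would provide, and the treatment labels are hypothetical.

```python
# Sketch of SUCRA from posterior rank probabilities, plus the credible-interval
# decision rule described above. All numbers are invented stand-ins.
import numpy as np

treatments = ["BKZ", "drug_X", "PBO"]
# Rows = treatments; columns = posterior probability of being ranked 1st, 2nd, 3rd.
rank_probs = np.array([
    [0.70, 0.25, 0.05],
    [0.28, 0.60, 0.12],
    [0.02, 0.15, 0.83],
])

a = rank_probs.shape[1]                            # number of treatments/ranks
cum = np.cumsum(rank_probs, axis=1)[:, : a - 1]    # cumulative ranking curves
sucra = cum.sum(axis=1) / (a - 1)
for name, s in zip(treatments, sucra):
    print(f"SUCRA({name}) = {s:.2f}")

def compare(or_draws):
    """Comparable if the 95% CrI of the OR includes 1, otherwise better/worse."""
    lo, hi = np.percentile(or_draws, [2.5, 97.5])
    if lo <= 1 <= hi:
        return "comparable"
    return "better" if lo > 1 else "worse"

rng = np.random.default_rng(1)
or_draws = rng.lognormal(mean=0.3, sigma=0.2, size=5000)  # stand-in posterior ORs
print("BKZ vs drug_X:", compare(or_draws))
```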

Compliance with ethics guidelines

This article is based on previously conducted studies and does not contain any new studies with human participants or animals performed by any of the authors.

Study and patient characteristics

The SLR identified 4576 records through databases and 214 records through grey literature, of which 3143 were included for abstract review. Following the exclusion of a further 1609 records, a total of 1534 records were selected for full-text review. A total of 66 primary studies from 246 records were selected for data extraction. No trial was identified as having a moderate or high risk of bias ( Supplementary Table S3 , available at Rheumatology online).

Of the 66 studies identified in the SLR, 41 studies reported outcomes at weeks 12, 16 or 24 and met the criteria for inclusion in the NMA in either a b/tsDMARD-naïve population ( n  = 20), a TNFi-experienced population ( n  = 5), a mixed population with subgroups ( n  = 13) or a mixed PsA population without subgroups reported ( n  = 3). The PRISMA diagram is presented in Fig. 1 . Included and excluded studies are presented in Supplementary Tables S4 and S5 , respectively (available at Rheumatology online).

PRISMA flow diagram. The PRISMA flow diagram for the SLR conducted to identify published studies assessing approved treatments for the treatment of PsA. cDMARD: conventional DMARD; NMA: network meta-analysis; NR: not reported; PD: pharmacodynamic; PK: pharmacokinetic; PRISMA: Preferred Reporting Items for Systematic Reviews and Meta-Analyses; RCT: randomized controlled trial; SLR: systematic literature review


The baseline study and patient characteristics (where reported) are presented in Supplementary Table S6 (available at Rheumatology online). There were 20–483 patients included in treatment arms. The median age of patients was 48.9 years, the median percentage of males was 50.3% and a median of 92.3% of patients were Caucasian. Patients had a mean time since diagnosis of 7.6 years and a mean PASI score of 8.7. The mean (range) use of concomitant MTX, NSAIDs and steroids were 53.9% (29.1% to 84.0%), 72.4% (33.3% to 100.0%) and 16.8% (9.2% to 30.0%), respectively. Heterogeneity was generally low across studies except for the concomitant use of MTX, NSAIDs and steroids. Using an approach consistent with established NMA methods in PsA [ 34–36 ], a meta-regression model using JAGS code reported in NICE DSU Technical Support Document 3 [ 33 ] was used to account for variation in placebo responses when model-fit statistics suggested that baseline risk-adjusted models provided a better fit to the data.
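A rough frequentist analogue of this baseline-risk adjustment is a weighted meta-regression of each study's log odds ratio on its centred placebo-arm log-odds. The sketch below uses invented numbers and is meant only to convey the idea; it does not reproduce the Bayesian meta-regression specified in NICE DSU Technical Support Document 3.

```python
# Frequentist analogue (weighted least squares) of baseline-risk adjustment:
# regress each study's log OR on its centred placebo log-odds. Invented data;
# the study itself used Bayesian meta-regression per NICE DSU TSD 3.
import numpy as np
import statsmodels.api as sm

log_or = np.array([0.9, 1.2, 0.7, 1.5, 1.1])       # study-level log odds ratios (invented)
se = np.array([0.30, 0.25, 0.35, 0.28, 0.22])      # their standard errors (invented)
placebo_logodds = np.array([-1.8, -1.2, -2.1, -0.9, -1.5])  # placebo-arm log-odds

x = placebo_logodds - placebo_logodds.mean()        # centre the baseline-risk covariate
X = sm.add_constant(x)
fit = sm.WLS(log_or, X, weights=1 / se**2).fit()    # inverse-variance weights
print(fit.params)  # [adjusted pooled log OR, baseline-risk interaction slope]
```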

NMA results

The network diagrams for ACR50 in b/tsDMARD-naïve and TNFi-experienced patients are presented in Fig. 2A and B, with network diagrams for other outcomes presented in Supplementary Fig. S1 (available at Rheumatology online). The networks for ACR response were larger, in terms of both the number of studies and patients included, than the networks for PASI. Similarly, the networks for b/tsDMARD-naïve patients were larger than those for TNFi-experienced patients across all outcomes analysed. Placebo was the common comparator in all networks, and a few studies with more than two arms (OPAL-Broaden, Select-PsA-1, SPIRIT-P1 and BE OPTIMAL) included adalimumab as a reference arm in b/tsDMARD-naïve patients. Lastly, networks included studies in which the primary outcome was evaluated at time points later than 16 weeks (e.g. the EXCEED study at 52 weeks), but, as per the methods, 16-week data formed the network.

Network of evidence for ACR50. (A) b/tsDMARD-naïve patients. (B) TNFi-experienced patients. The size of the circle representing each intervention is proportional to the number of patients included in the analysis. The line width is proportional to the number of studies connecting the interventions. ABA: abatacept; ADA: adalimumab; APR: apremilast; b/tsDMARD-naïve: biologic and targeted synthetic DMARD-naïve; BKZ: bimekizumab; CZP: certolizumab pegol; ETA: etanercept; GOL: golimumab; GUS: guselkumab; IFX: infliximab; IV: intravenous; IXE: ixekizumab; PBO: placebo; Q4W: every 4 weeks; Q8W: every 8 weeks; RIS: risankizumab; SEC: secukinumab; TNFi-experienced: TNF inhibitor–experienced; TOF: tofacitinib; UPA: upadacitinib; UST: ustekinumab; w/o LD: without loading dose


The best-fit model is noted for each outcome with full model fit statistics for all outcomes presented in Supplementary Table S7 (available at Rheumatology online). Forest plots for ACR50 and PASI100 are presented in Figs 3 and 4 , with forest plots for other outcomes, along with the league tables in Supplementary Fig. S2 and Table S8 , respectively (available at Rheumatology online).

ACR50. The results for the NMA on ACR50 at week 16. (A) b/tsDMARD-naïve patients including forest plot and SUCRA values. FE baseline–adjusted model DIC = 469.59. (B) TNFi-experienced patients including forest plot and SUCRA values. RE-unadjusted model DIC = 205.33. aWeek 24 data were used as week 16 data was not available. *The 95% CrI does not include 1; bimekizumab 160 mg Q4W is considered either better or worse depending on the direction of the effect. ABA: abatacept; ADA: adalimumab; APR: apremilast; b/tsDMARD-naïve: biologic and targeted synthetic DMARD-naïve; BKZ: bimekizumab; CrI: credible interval; CZP: certolizumab pegol; DIC: deviance information criterion; ETA: etanercept; FE: fixed effects; GOL: golimumab; GUS: guselkumab; IFX: infliximab; IV: intravenous; IXE: ixekizumab; NMA: network meta-analysis; PBO: placebo; Q4W: every 4 weeks; Q8W: every 8 weeks; RE: random effects; RIS: risankizumab; SEC: secukinumab; SUCRA: surface under the cumulative ranking curve; TNFi-experienced: TNF inhibitor–experienced; TOF: tofacitinib; UPA: upadacitinib; UST: ustekinumab; w/o LD: without loading dose


PASI100. The results for the NMA on PASI100 at week 16: (A) b/tsDMARD-naïve patients including forest plot and SUCRA values. FE baseline–adjusted model DIC = 150.27. (B) TNFi-experienced patients including forest plot and SUCRA values. RE-unadjusted model DIC = 81.76. aWeek 24 data were used as week 16 data was not available. *The 95% CrI does not include 1; bimekizumab 160 mg 4W is considered better. ADA: adalimumab; b/tsDMARD-naïve: biologic and targeted synthetic DMARD-naïve; BKZ, bimekizumab; CrI, credible interval; CZP, certolizumab pegol; DIC, deviance information criterion; FE, fixed effects; GOL, golimumab; GUS, guselkumab; IXE, ixekizumab; NMA, network meta-analysis; PASI, Psoriasis Area and Severity Index; PBO, placebo; Q4W, every 4 weeks; Q8W, every 8 weeks; RE, random effects; SEC, secukinumab; SUCRA, surface under the cumulative ranking curve; TNFi-experienced, TNF inhibitor–experienced; UPA, upadacitinib


Joint outcomes

For ACR50 outcomes, the best-fit models for b/tsDMARD-naïve and TNFi-experienced were the FE baseline–adjusted model and RE-unadjusted model, respectively.

b/tsDMARD-naïve patients

Bimekizumab 160 mg Q4W ranked 6th for ACR20 (SUCRA = 0.75), 5th for ACR50 (SUCRA = 0.74) ( Fig. 3A ) and 3rd for ACR70 (SUCRA = 0.80) among 21 treatments. For ACR50, bimekizumab 160 mg Q4W was better than placebo, abatacept 125 mg, guselkumab 100 mg Q4W, ustekinumab 45 mg, risankizumab 150 mg, guselkumab 100 mg Q8W and ustekinumab 90 mg; worse than golimumab 2 mg/kg i.v.; and comparable to the remaining treatments in the network ( Fig. 3A ).

TNFi-experienced patients

Bimekizumab 160 mg Q4W ranked 1st among 16 treatments for ACR20 (SUCRA = 0.96), 2nd among 15 treatments for ACR50 (SUCRA = 0.84) ( Fig. 3B ) and 1st among 16 treatments for ACR70 (SUCRA = 0.83). Bimekizumab 160 mg Q4W was better than placebo, abatacept 125 mg, secukinumab 150 mg without loading dose, tofacitinib 5 mg and secukinumab 150 mg; and comparable to the remaining treatments in the network on ACR50 ( Fig. 3B ).

Skin outcomes

For PASI100 outcomes, the best-fit models for b/tsDMARD-naïve and TNFi-experienced were the FE baseline–adjusted model and RE-unadjusted model, respectively.

Bimekizumab 160 mg Q4W ranked 2nd among 15 treatments (SUCRA = 0.89) for PASI90 and 1st among 11 treatments (SUCRA = 0.95) for PASI100 ( Fig. 4A ). Bimekizumab 160 mg Q4W was better than placebo, certolizumab pegol pooled, golimumab 2 mg/kg i.v., secukinumab 150 mg, adalimumab 40 mg, upadacitinib 15 mg, secukinumab 300 mg and ixekizumab 80 mg Q4W; and comparable to the remaining treatments in the network on PASI100 ( Fig. 4A ).

Bimekizumab 160 mg Q4W ranked 1st among 10 treatments (SUCRA = 0.85) for PASI90 and 2nd among 7 treatments (SUCRA = 0.79) for PASI100 ( Fig. 4B ). Bimekizumab 160 mg Q4W was better than placebo, ixekizumab 80 mg Q4W and upadacitinib 15 mg; and comparable to the remaining treatments in the network on PASI100 ( Fig. 4B ).

For MDA, the best-fit models for b/tsDMARD-naïve and TNFi-experienced were the FE baseline–adjusted model and RE-unadjusted model, respectively.

Bimekizumab 160 mg Q4W ranked 1st among 13 treatments (SUCRA = 0.91) and was better than placebo [OR (95% CrI) 6.31 (4.61–8.20)], guselkumab 100 mg Q4W [2.06 (1.29–3.10)], guselkumab 100 mg Q8W [1.76 (1.09–2.69)], risankizumab 150 mg [1.99 (1.40–2.76)] and adalimumab 40 mg [1.41 (1.01–1.93)]; and comparable to the remaining treatments in the network ( Supplementary Fig. S2G , available at Rheumatology online).

Bimekizumab 160 mg Q4W ranked 1st among 11 treatments (SUCRA = 0.83) and was better than placebo [12.10 (5.31–28.19)] and tofacitinib 5 mg [6.81 (2.14–21.35)]; and comparable to the remaining treatments in the network ( Supplementary Fig. S2H , available at Rheumatology online).

The network for SAEs for a mixed population included 23 treatments and the best-fit model was an RE-unadjusted model (due to study populations and time point reporting heterogeneity). Bimekizumab 160 mg Q4W showed comparable safety to all treatments in the network ( Supplementary Fig. S2I , available at Rheumatology online).

The treatment landscape for PsA is complex, with numerous treatment options and limited direct comparative evidence. Bimekizumab 160 mg Q4W has recently been approved for the treatment of active PsA by the European Medicines Agency and recommended by NICE in the UK, and the published phase 3 results warrant comparison with existing therapies by NMA.

This NMA included 41 studies evaluating 22 b/tsDMARDs including the novel IL-17F and IL-17A inhibitor, bimekizumab. Overall, bimekizumab 160 mg Q4W ranked favourably among b/tsDMARDS for efficacy in joint, skin and disease activity outcomes in PsA across both b/tsDMARD-naïve and TNFi-experienced populations. The safety of bimekizumab 160 mg Q4W was similar to the other b/tsDMARDs.

The Group for Research and Assessment of Psoriasis and Psoriatic Arthritis (GRAPPA) and EULAR provide evidence-based recommendations for the treatment of PsA [ 1 , 2 ]. To treat peripheral arthritis symptoms in PsA, efficacy across the classes of current b/tsDMARDs is considered similar by both GRAPPA and EULAR, in part due to a lack of data comparing licensed therapies in a head-to-head trial setting [ 1 , 2 ]. EULAR recommends the use of JAK inhibitors in the case of inadequate response, intolerance or when a bDMARD is not appropriate [ 1 ]. This recommendation was made when tofacitinib was the only available JAK inhibitor, but reflects current marketing authorizations for tofacitinib and upadacitinib which indicate use in patients with an inadequate response or prior intolerance to TNFis (USA) or bDMARDs (Europe) [ 37–40 ]. This NMA suggests that bimekizumab 160 mg Q4W may have an advantage over current treatments, including IL-23 inhibitors in b/tsDMARD-naïve patients, and secukinumab 150 mg and tofacitinib in TNFi-experienced patients, as evidenced by our analysis of ACR50 for which the pairwise comparisons were significantly in favour of bimekizumab 160 mg Q4W.

For the treatment of skin symptoms in PsA, IL-23, IL-12/23 and IL-17A inhibitors are currently recommended due to their greater efficacy compared with TNFis [ 1 , 4 ]. GRAPPA also suggests considering efficacy demonstrated in direct comparative studies in PSO when selecting a treatment for PsA skin symptoms [ 2 ]. In our analysis of complete skin clearance as measured by PASI100, bimekizumab 160 mg Q4W demonstrated the likelihood of significantly greater efficacy than IL-17A, JAK inhibitors and TNFis in b/tsDMARD-naïve patients and IL-17A and JAK inhibitors in TNFi-experienced patients. Furthermore, the NMA results for skin clearance in PsA are in alignment with previous studies in PSO that demonstrated superiority of bimekizumab 320 mg Q4W or Q8W vs secukinumab, ustekinumab and adalimumab ( P  < 0.001) (note that the dosing of bimekizumab in PSO differs from that in PsA) [ 12 , 18 , 41 , 42 ].

There are similarities between our results and other recently published NMAs of b/tsDMARDs in PsA, although methodological heterogeneity across all NMAs makes comparisons challenging [ 34–36 , 43–45 ]. Among recent NMAs, the largest evaluated 21 treatments [ 34 ] and only four considered subgroups of b/tsDMARD-naïve and TNFi-experienced patients or those with inadequate response [ 35 , 36 , 43 , 45 ]. Furthermore, different or pooled levels of response were evaluated for ACR and PASI outcomes.

Previous NMAs also support IL-17, IL-12/23 and IL-23 inhibitors having greater efficacy for skin symptoms than TNFis [ 35 , 36 ]. In an overall PsA population, McInnes et al. demonstrated that secukinumab 300 mg, ixekizumab 80 mg Q4W, and ustekinumab 45 mg and 90 mg were likely more efficacious than TNFis for PASI90 [ 35 ]. In another NMA by Ruyssen-Witrand et al. , results suggested that ixekizumab 80 mg Q4W had significantly greater efficacy than adalimumab, certolizumab pegol pooled, and etanercept 25 mg twice weekly/50 mg once weekly for any PASI score (50%, 75%, 90% and 100% reduction) in bDMARD-naïve patients [ 36 ].

For joint outcomes, Mease et al. compared guselkumab Q4W and Q8W with other b/tsDMARDs in a network of 21 treatments in an overall PsA population for ACR50 [ 34 ]. Both guselkumab dosing schedules were better than abatacept and apremilast, but golimumab 2 mg/kg i.v. had a higher likelihood of ACR50 response than guselkumab Q8W [ 34 ]. Despite MDA being assessed in clinical trials of bDMARD therapies and being a treatment target in PsA [ 46 ], evidence for comparative efficacy on this outcome is limited. None of the most recent NMAs before this one included an analysis of MDA [ 34–36 ]. With regard to safety outcomes, previous NMAs evaluating SAEs also found no difference between b/tsDMARDs and placebo or between b/tsDMARDs [ 34 , 36 , 44 , 45 ].

This study has a number of strengths. To our knowledge this NMA represents the most comprehensive and in-depth comparative efficacy analysis of approved treatments in PsA to date. The evidence was derived from a recent SLR, ensuring that new RCTs and updated results from previously published RCTs were included. It is also the first NMA to include the phase 3 BE COMPLETE and BE OPTIMAL trials of bimekizumab [ 19 , 20 ]. Our NMA used robust methods and accounted for variation in placebo response through network meta-regression in accordance with NICE DSU Technical Support Documents [ 31–33 ]. As an acknowledgement of the evolution of treatment advances, separate analyses of b/tsDMARD-naïve and TNFi-experienced subgroups were conducted with the intent to assist healthcare decision-making in different clinical settings. In addition, a panel of clinical experts were consulted from project inception and are authors of this paper, ensuring inclusion of a comprehensive set of clinically meaningful outcomes, including the composite, treat-to-target outcome of MDA.

Despite the robust evidence base and methodology, this NMA has limitations. Indirect treatment comparisons such as this NMA are not a substitute for head-to-head trials. There was heterogeneity in the endpoints and reporting in the included studies. Fewer studies reporting PASI outcomes resulted in smaller networks compared with the network of studies evaluating ACR response criteria. Not all trials reported outcomes at the same timepoint, thereby reducing the comparability of trial results, which has been transparently addressed by noting where week 24 data were used vs week 12, 14 or 16 data. The analyses for the TNFi-experienced population were limited by potential heterogeneity, especially in the analyses where fewer studies were included in the networks, as this group could include patients who had an inadequate response to TNFi or discontinued TNFi treatment due to other reasons (e.g. lost access). Also, in the analyses for the TNFi-experienced population, very low patient numbers for some treatments resulted in less statistical power. Additionally, the data included in the analysis were derived exclusively from RCTs, for which the study populations may not reflect a typical patient population seen in real-world practice. For example, trial results may be different in patients with oligoarthritis who are not well-represented in clinical trials.

Over the years covering our SLR, we acknowledge that patient populations and the PsA treatment landscape have evolved. After a thorough review of baseline patient characteristics, no significant differences were observed across the studies included in the NMA. To further mitigate uncertainty, baseline regression was used to actively correct for changes in the placebo rate over time ensuring a consistent and fair comparison across all included treatments. In addition, our analyses were conducted in separate b/tsDMARD-naïve and TNFi-experienced populations that reflect the evolving PsA patient population over time. Radiographic progression was not within the purview of this NMA because the NMA focused on a shorter timeframe than the 52-week duration typically recommended by the literature for investigating radiographic progression. Furthermore, there is existing literature on this topic, as exemplified by the work of Wang et al. in 2022 [ 47 ]. Nevertheless, the comprehensive and current evidence base, examination of multiple endpoints, and consistency with previous reported NMAs lend credence to our results.

Overall, the results of this NMA demonstrated the favourable relative efficacy and safety of bimekizumab 160 mg Q4W vs all approved treatments for PsA. Bimekizumab ranked high in terms of efficacy on joint, skin and MDA outcomes in both b/tsDMARD-naïve and TNFi-experienced patient populations, and showed comparable safety to other treatments. In the evolving PsA treatment landscape, bimekizumab 160 mg Q4W is a potentially beneficial treatment option for patients with PsA.

Supplementary material is available at Rheumatology online.

The datasets generated during and/or analysed during the current study are available from the corresponding author on reasonable request.

This study was funded in full by UCB Pharma.

Disclosure statement : P.J.M.: has received research grants from AbbVie, Amgen, BMS, Eli Lilly, Gilead, Janssen, Novartis, Pfizer, Sun Pharma and UCB Pharma; consultancy fees from AbbVie, Acelyrin, Aclaris, Amgen, BMS, Boehringer Ingelheim, Eli Lilly, Galapagos, Gilead, GSK, Janssen, Moonlake Pharma, Novartis, Pfizer, Sun Pharma and UCB Pharma; and speakers’ bureau for AbbVie, Amgen, Eli Lilly, Janssen, Novartis, Pfizer and UCB Pharma. L.C.C.: received grants/research support from AbbVie, Amgen, Celgene, Eli Lilly, Janssen, Novartis, Pfizer and UCB; worked as a paid consultant for AbbVie, Amgen, Bristol Myers Squibb, Celgene, Eli Lilly, Gilead, Galapagos, Janssen, Moonlake, Novartis, Pfizer and UCB; and has been paid as a speaker for AbbVie, Amgen, Biogen, Celgene, Eli Lilly, Galapagos, Gilead, GSK, Janssen, Medac, Novartis, Pfizer and UCB. D.D.G.: consultant and/or received grant support from Abbvie, Amgen, BMS, Celgene, Eli Lilly, Galapagos, Gilead, Janssen, Novartis, Pfizer and UCB. J.F.M.: consultant and/or investigator for AbbVie, Amgen, Biogen, BMS, Dermavant, Eli Lilly, Janssen, LEO Pharma, Novartis, Pfizer, Regeneron, Sanofi, Sun Pharma and UCB Pharma. P.N.: research grants, clinical trials and honoraria for advice and lectures on behalf of AbbVie, Boehringer Ingelheim, BMS, Eli Lilly, Galapagos/Gilead, GSK, Janssen, Novartis, Pfizer, Samsung, Sanofi and UCB Pharma. S.G. and V.L.-K.: employees of Cytel, Inc. which served as a consultant on the project. A.R.P., D.W. and V.T.: employees and stockholders of UCB Pharma.

The authors acknowledge Leah Wiltshire of Cytel for medical writing and editorial assistance based on the authors’ input and direction, Heather Edens (UCB Pharma, Smyrna, GA, USA) for publication coordination and Costello Medical for review management, which were funded by UCB Pharma. This analysis was funded by UCB Pharma in accordance with Good Publication Practice (GPP 2022) guidelines ( http://www.ismpp.org/gpp-2022 ). Data were previously presented at ISPOR-US 2023 (Boston, MA, USA, 7–10 May 2023).

Gossec L , Baraliakos X , Kerschbaumer A et al.  EULAR recommendations for the management of psoriatic arthritis with pharmacological therapies: 2019 update . Ann Rheum Dis 2020 ; 79 : 700 – 12 .


Coates LC , Soriano ER , Corp N et al. ; GRAPPA Treatment Recommendations domain subcommittees . Group for Research and Assessment of Psoriasis and Psoriatic Arthritis (GRAPPA): updated treatment recommendations for psoriatic arthritis 2021 . Nat Rev Rheumatol 2022 ; 18 : 465 – 79 .

Fitzgerald O , Ogdie A , Chandran V et al.  Psoriatic arthritis . Nat Rev Dis Primers 2021 ; 7 : 59 .

Coates LC , Soriano ER , Corp N et al.  Treatment recommendations domain subcommittees. Group for Research and Assessment of Psoriasis and Psoriatic Arthritis (GRAPPA): updated treatment recommendations for psoriatic arthritis 2021 . Nat Rev Rheumatol 2022 ; 27 : 1 – 15 .

Najm A , Goodyear CS , McInnes IB , Siebert S. Phenotypic heterogeneity in psoriatic arthritis: towards tissue pathology-based therapy . Nat Rev Rheumatol 2023 ; 19 : 153 – 65 .

Ogdie A , Weiss P. The epidemiology of psoriatic arthritis . Rheum Dis Clin North Am 2015 ; 41 : 545 – 68 .

Alinaghi F , Calov M , Kristensen LE et al.  Prevalence of psoriatic arthritis in patients with psoriasis: a systematic review and meta-analysis of observational and clinical studies . J Am Acad Dermatol 2019 ; 80 : 251 – 65.e19 .

Singh JA , Guyatt G , Ogdie A et al.  Special Article: 2018 American college of rheumatology/national psoriasis foundation guideline for the treatment of psoriatic arthritis . Arthritis Rheumatol 2019 ; 71 : 5 – 32 .

Coates LC , Robinson DE , Orbai AM et al.  What influences patients' opinion of remission and low disease activity in psoriatic arthritis? Principal component analysis of an international study . Rheumatology (Oxford) 2021 ; 60 : 5292 – 9 .

Gondo G , Mosca M , Hong J et al.  Demographic and clinical factors associated with patient-reported remission in psoriatic arthritis . Dermatol Ther (Heidelb) 2022 ; 12 : 1885 – 95 .

Glatt S , Baeten D , Baker T et al.  Dual IL-17A and IL-17F neutralisation by bimekizumab in psoriatic arthritis: evidence from preclinical experiments and a randomised placebo-controlled clinical trial that IL-17F contributes to human chronic tissue inflammation . Ann Rheum Dis 2018 ; 77 : 523 – 32 .

UCB Pharma S.A . Bimzelx ® (bimekizumab): Summary of Product Characteristics. 2023 . https://www.ema.europa.eu/en/medicines/human/EPAR/bimzelx (26 June 2023, date last accessed).

Adams R , Maroof A , Baker T et al.  Bimekizumab, a novel humanized IgG1 antibody that neutralizes both IL-17A and IL-17F . Front Immunol 2020 ; 11 : 1894 .

Burns LA , Maroof A , Marshall D et al.  Presence, function, and regulation of IL-17F-expressing human CD4(+) T cells . Eur J Immunol 2020 ; 50 : 568 – 80 .

Kuestner R , Taft D , Haran A et al.  Identification of the IL-17 receptor related molecule IL-17RC as the receptor for IL-17F . J Immunol 2007 ; 179 : 5462 – 73 .

Johansen C , Usher PA , Kjellerup RB et al.  Characterization of the interleukin-17 isoforms and receptors in lesional psoriatic skin . Br J Dermatol 2009 ; 160 : 319 – 24 .

Shah M , Maroof A , Gikas P et al.  Dual neutralisation of IL-17F and IL-17A with bimekizumab blocks inflammation-driven osteogenic differentiation of human periosteal cells . RMD Open 2020 ; 6 : e001306 .

Reich K , Warren RB , Lebwohl M et al.  Bimekizumab versus Secukinumab in Plaque Psoriasis . New Engl J Med 2021 ; 385 : 142 – 52 .

McInnes IB , Asahina A , Coates LC et al.  Bimekizumab in patients with psoriatic arthritis, naive to biologic treatment: a randomised, double-blind, placebo-controlled, phase 3 trial (BE OPTIMAL) . Lancet 2023 ; 401 : 25 – 37 .

Merola JF , Landewe R , McInnes IB et al.  Bimekizumab in patients with active psoriatic arthritis and previous inadequate response or intolerance to tumour necrosis factor-alpha inhibitors: a randomised, double-blind, placebo-controlled, phase 3 trial (BE COMPLETE) . Lancet 2023 ; 401 : 38 – 48 .

Page MJ , McKenzie JE , Bossuyt PM et al.  The PRISMA 2020 statement: an updated guideline for reporting systematic reviews . BMJ 2021 ; 372 : n71 .

Higgins JP. Cochrane handbook for systematic reviews of interventions. Vol. 2. 2nd edn. Chichester, UK: John Wiley & Sons, 2019 .

National Institute for Health and Care Excellence . The guidelines manual: Process and methods [PMG6]. 2012 . https://www.nice.org.uk/process/pmg6/chapter/introduction (3 March 2023, date last accessed).

Booth AM , Wright KE , Outhwaite H. Centre for Reviews and Dissemination databases: value, content, and developments . Int J Technol Assess Health Care 2010 ; 26 : 470 – 2 .

Sterne JAC , Savovic J , Page MJ et al.  RoB 2: a revised tool for assessing risk of bias in randomised trials . BMJ 2019 ; 366 : l4898 .

Daly C , Dias S , Welton N , Anwer S , Ades A. NICE Guidelines Technical Support Unit. Meta-Analysis: Guideline Methodology Document 1 (Version 1). 2021 . http://www.bristol.ac.uk/population-health-sciences/centres/cresyda/mpes/nice/guideline-methodology-documents-gmds/ (1 March 2023, date last accessed).

Dias S , Caldwell DM. Network meta-analysis explained . Arch Dis Child Fetal Neonatal Ed 2019 ; 104 : F8 – F12 .

Merola JF , Lockshin B , Mody EA. Switching biologics in the treatment of psoriatic arthritis . Semin Arthritis Rheum 2017 ; 47 : 29 – 37 .

Openbugs (website) 2014 . http://www.openbugs.net/w/FrontPage (6 April 2023, date last accessed).

Lunn D , Spiegelhalter D , Thomas A , Best N. The BUGS project: evolution, critique and future directions . Stat Med 2009 ; 28 : 3049 – 67 .

Dias S , Welton NJ , Sutton AJ , Ades AE. NICE DSU technical support document 2: a generalised linear modelling framework for pairwise and network meta-analysis of randomised controlled trials. London. 2016 . https://www.sheffield.ac.uk/nice-dsu/tsds/full-list (25 January 2023, date last accessed).

Dias S , Welton N , Sutton AJ , Caldwell DM , Lu G , Ades AE. NICE DSU technical support document 4: inconsistency in networks of evidence based on randomised controlled trials. 2011 . https://www.sheffield.ac.uk/nice-dsu/tsds/full-list (25 January 2023, date last accessed).

Dias S , Sutton AJ , Welton N , Ades AE. NICE DSU Technical support document 3: heterogeneity: subgroups, meta-regression, bias and bias-adjustment. 2011 . https://www.sheffield.ac.uk/nice-dsu/tsds/full-list (25 January 2023, date last accessed).

Mease PJ , McInnes IB , Tam LS et al.  Comparative effectiveness of guselkumab in psoriatic arthritis: results from systematic literature review and network meta-analysis . Rheumatology (Oxford) 2021 ; 60 : 2109 – 21 .

McInnes IB , Sawyer LM , Markus K et al.  Targeted systemic therapies for psoriatic arthritis: a systematic review and comparative synthesis of short-term articular, dermatological, enthesitis and dactylitis outcomes . RMD Open 2022 ; 8 : e002074 .

Ruyssen-Witrand A , Perry R , Watkins C et al.  Efficacy and safety of biologics in psoriatic arthritis: a systematic literature review and network meta-analysis . RMD Open 2020 ; 6 : e001117 .

Pfizer Inc . XELJANZ ® (tofacitinib): Summary of Product Characteristics. 2022 . https://www.ema.europa.eu/en/medicines/human/EPAR/xeljanz (4 May 2023, date last accessed).

Pfizer Inc . XELJANZ ® (tofacitinib): US Prescribing Information. 2022 . https://labeling.pfizer.com/ShowLabeling.aspx?id=959 (4 May 2023, date last accessed).

Abbvie Inc . RINVOQ ® (upadacitinib) extended-release tablets, for oral use: US Prescribing Information. 2023 . https://www.rxabbvie.com/pdf/rinvoq_pi.pdf (4 May 2023, date last accessed).

AbbVie Deutschland GmbH & Co. KG . RINVOQ ® (upadacitinib): Summary of Product Characteristics. 2023 . https://www.ema.europa.eu/en/medicines/human/EPAR/rinvoq (4 May 2023, date last accessed).

Warren RB , Blauvelt A , Bagel J et al.  Bimekizumab versus Adalimumab in Plaque Psoriasis . N Engl J Med 2021 ; 385 : 130 – 41 .

Reich K , Papp KA , Blauvelt A et al.  Bimekizumab versus ustekinumab for the treatment of moderate to severe plaque psoriasis (BE VIVID): efficacy and safety from a 52-week, multicentre, double-blind, active comparator and placebo controlled phase 3 trial . Lancet 2021 ; 397 : 487 – 98 .

Gladman DD , Orbai AM , Gomez-Reino J et al.  Network meta-analysis of tofacitinib, biologic disease-modifying antirheumatic drugs, and apremilast for the treatment of psoriatic arthritis . Curr Ther Res Clin Exp 2020 ; 93 : 100601 .

Qiu M , Xu Z , Gao W et al.  Fourteen small molecule and biological agents for psoriatic arthritis: a network meta-analysis of randomized controlled trials . Medicine (Baltimore) 2020 ; 99 : e21447 .

Kawalec P , Holko P , Mocko P , Pilc A. Comparative effectiveness of abatacept, apremilast, secukinumab and ustekinumab treatment of psoriatic arthritis: a systematic review and network meta-analysis . Rheumatol Int 2018 ; 38 : 189 – 201 .

Gossec L , McGonagle D , Korotaeva T et al.  Minimal disease activity as a treatment target in psoriatic arthritis: a review of the literature . J Rheumatol 2018 ; 45 : 6 – 13 .

Wang SH , Yu CL , Wang TY , Yang CH , Chi CC. Biologic disease-modifying antirheumatic drugs for preventing radiographic progression in psoriatic arthritis: a systematic review and network meta-analysis . Pharmaceutics 2022 ; 14 .



  • Open access
  • Published: 04 July 2024

Post-COVID syndrome prevalence: a systematic review and meta-analysis

  • Ruhana Sk Abd Razak 1 ,
  • Aniza Ismail 1 , 2 ,
  • Aznida Firzah Abdul Aziz 3 ,
  • Leny Suzana Suddin 4 ,
  • Amirah Azzeri 5 &
  • Nur Insyirah Sha’ari 1  

BMC Public Health, Volume 24, Article number: 1785 (2024)


Since the coronavirus disease 2019 (COVID-19) pandemic began, the number of individuals recovering from COVID-19 infection has increased. Post-COVID syndrome (PCS), defined as signs and symptoms that develop during or after an infection consistent with COVID-19, continue beyond 12 weeks, and are not explained by an alternative diagnosis, has also gained attention. We systematically reviewed and determined the pooled prevalence estimate of PCS worldwide based on published literature.

Relevant articles from the Web of Science, Scopus, PubMed, Cochrane Library, and Ovid MEDLINE databases were screened using a Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA)-guided systematic search process. The included studies were in English, were published from January 2020 to April 2024, reported overall PCS prevalence as one of the outcomes studied, involved a human population with a confirmed COVID-19 diagnosis, and assessed participants at 12 weeks post-COVID infection or beyond. The primary outcome, the pooled prevalence of PCS, was estimated by a random-effects meta-analysis of the PCS prevalence data extracted from the individual studies. This study was registered on PROSPERO (CRD42023435280).

Forty-eight studies met the eligibility criteria and were included in this review. Sixteen were accepted for the meta-analysis estimating the pooled prevalence of PCS worldwide, which was 41.79% (95% confidence interval [CI] 39.70–43.88%, I² = 51%, p = 0.03). Based on different assessment or follow-up timepoints after acute COVID-19 infection, PCS prevalence estimated at the ≥ 3rd, ≥ 6th, and ≥ 12th month timepoints was 45.06% (95% CI: 41.25–48.87%), 41.30% (95% CI: 34.37–48.24%), and 41.32% (95% CI: 39.27–43.37%), respectively. Sex-stratified PCS prevalence was estimated at 47.23% (95% CI: 44.03–50.42%) in males and 52.77% (95% CI: 49.58–55.97%) in females. Based on continental regions, the pooled PCS prevalence was estimated at 46.28% (95% CI: 39.53–53.03%) in Europe, 46.29% (95% CI: 35.82–56.77%) in America, 49.79% (95% CI: 30.05–69.54%) in Asia, and 42.41% (95% CI: 0.00–90.06%) in Australia.

The prevalence estimates in this meta-analysis could be used in further comprehensive studies on PCS, which might enable the development of better PCS management plans to reduce the effect of PCS on population health and the related economic burden.


The novel severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), first reported on 31 December 2019 in Wuhan, China, causes the infectious coronavirus disease 2019 (COVID-19) [ 1 , 2 ]. The World Health Organization (WHO) declared COVID-19 a Public Health Emergency of International Concern (PHEIC) on 30 January 2020, then a pandemic on 11 March 2020 [ 3 , 4 ]. The COVID-19 pandemic has resulted in an increasing number of people recovering from acute SARS-CoV-2 infection [ 5 ]. COVID-19 patients typically recover within a few weeks after symptom onset. However, some patients might experience health-related effects in the longer term. Widely known as long COVID and post-COVID-19 condition, the conditions that occur post-COVID infection are also referred to with other terms, namely PCS, post-COVID-19 syndrome, long-haul COVID, post-acute COVID-19, long-term effects of COVID, or chronic COVID [ 6 , 7 , 8 , 9 , 10 , 11 , 12 ]. The WHO defined the post-COVID-19 condition as symptoms that occur at least 3 months after probable or confirmed SARS-CoV-2 infection that persist for at least 2 months and cannot be explained by an alternative diagnosis [ 13 ]. The symptoms might fluctuate, relapse, persist from the initial infection, or might also be new-onset after recovery from the acute illness [ 13 ]. In a COVID-19 rapid guideline, the National Institute for Health and Care Excellence (NICE), the Royal College of General Practitioners (RCGP), and the Scottish Intercollegiate Guidelines Network (SIGN) classified long COVID as “ongoing symptomatic COVID-19” and “post-COVID-19 syndrome”. Ongoing symptomatic COVID-19 is defined as signs and symptoms that persist 4–12 weeks after acute COVID-19, while post-COVID-19 syndrome is defined as signs and symptoms that develop during or after an infection in line with COVID-19 that continue for > 12 weeks and are not explained by an alternative diagnosis [ 14 ]. Given the increasing number of COVID-19 survivors, the above terms have gained widespread recognition in the scientific and medical communities [ 10 , 11 ].

Post-recovery symptoms have become an increasing concern for COVID-19 survivors [6]. Several studies have determined that COVID-19 exerts a wide range of long-term effects on virtually all body systems, including the respiratory, cardiovascular, neurological, gastrointestinal, psychiatric, and dermatological systems [6]. Cough and fatigue are among the multiorgan symptoms described following COVID-19 infection, as are shortness of breath, headache, palpitations, chest discomfort, joint pain, physical limitations, depression, and insomnia [7]. A published review revealed that hepatic and gastrointestinal (n = 6), cardiovascular (n = 9), musculoskeletal and rheumatologic (n = 22), respiratory (n = 27), and neurologic and psychiatric (n = 41) issues were the most prevalent late complications that might occur after COVID-19 infection [15]. Certain risk factors, such as older age and biological sex, cannot be changed; managing other preventable and modifiable risk factors, such as chronic comorbidities, may therefore help prevent high-risk people from developing persistent COVID-19 symptoms, even months after acute COVID-19 infection. Epidemiological studies and related clinical trials addressing leading hypotheses may aid in the development of good management practices, including effective prevention and early intervention strategies to control the risk factors and manage the complications [16]. Regular disease surveillance and monitoring, implementation of related health promotion strategies, and continued research into the best vaccines and treatment options may help lower the prevalence of PCS [17, 18].

An increasing number of published studies have focused on PCS. However, robust studies on this dynamic post-COVID condition are still required to identify the risk factors; explore the underlying aetiology; and plan strategies for preventative, rehabilitation, clinical, and public health management to enhance COVID-19 recovery and long-term outcomes [ 12 ]. Such studies should be conducted using the most recent data on PCS prevalence. Therefore, the present study systematically reviewed and determined the pooled prevalence of PCS worldwide based on current published literature.

Study design

Articles related to PCS and the prevalence data available worldwide were obtained using the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) framework. The review protocol was registered with PROSPERO (CRD42023435280). All authors have a background in the related field and contributed collectively to meeting the study objective. The research question was developed, then a systematic search was conducted to identify and screen eligible studies based on the inclusion and exclusion criteria. Articles were identified from five primary databases. Relevant information and data were extracted from available full-text primary articles to answer the research question. The methodological quality of the included articles was assessed with the Joanna Briggs Institute (JBI) critical appraisal checklist. Subsequently, a meta-analysis was conducted to estimate the pooled prevalence of PCS worldwide.

Outcomes and measures

The overall prevalence estimate of any persistent health conditions and symptoms at ≥ 12 weeks after the index date was set as the primary outcome. The 12-week timeframe was adopted to conform with the clinical definition of PCS, i.e., symptoms and signs that develop during or after an infection consistent with COVID-19, are not explained by an alternative diagnosis, and continue beyond 12 weeks.

Inclusion and exclusion criteria

A set of inclusion and exclusion criteria was used as the basis for identifying and selecting relevant articles for this systematic review and meta-analysis. The inclusion criteria were: 1) availability of full text; 2) article written in English; 3) article published between 1 January 2020 and 27 April 2024; 4) study related to prolonged post-COVID-19 conditions and involving a human population with COVID-19 diagnosis confirmed by PCR, antibody testing, or clinical diagnosis; 5) study with an index date defined by the COVID-19 onset date, first test or diagnosis, hospitalisation date, or discharge date; and 6) study with adequate data on the estimate of overall PCS prevalence in a community, i.e., studies that did not focus on the prevalence of a specific PCS symptom as their only outcome. This criterion ensured that the primary outcome of this meta-analysis, the pooled overall prevalence of PCS, was derived only from studies with identical outcomes, and it limited the potential bias of including studies that published only symptom-specific PCS prevalence estimates. The final inclusion criterion was 7) an assessment, follow-up, or clinical check-up date at least 12 weeks after the index date. The exclusion criteria were non-accessible articles and publications with content unrelated to the research question. Non-primary publications, such as book chapters, letters to the editor, and case reports, were also excluded.

Search strategy

The search terms used in the article identification stage were derived from medical subject heading (MeSH) terms and synonyms related to the review topic. Then, two authors (RR and NIS) conducted a systematic search of the abovementioned databases using the search strings developed from combining the identified search terms and Boolean operators. The search string used was (("PCS" OR "post COVID syndrome" OR "post COVID-19 syndrome" OR "post COVID condition*" OR "post COVID-19 condition*" OR "post COVID" OR "post-COVID" OR "post COVID-19" OR "post-COVID-19" OR "post COVID sequela" OR "post-COVID sequela" OR "post COVID sequelae" OR "post-COVID sequelae" OR "long COVID" OR "long-COVID" OR "long haul*" OR "long-haul*" OR "long COVID-19" OR "long-COVID-19" OR "covid syndrome" OR "covid-19 syndrome" OR "post-acute COVID-19 syndrome" OR "post acute COVID-19" OR "post acute COVID" OR "chronic COVID" OR "chronic COVID-19" OR "persistent COVID" OR "persistent post-COVID" OR "persistent post COVID" OR "prolonged COVID" OR "prolonged post-COVID" OR "prolonged post COVID") AND ("prevalence*")). Available filters based on the inclusion and exclusion criteria were applied during the database search.

Data sources

Relevant articles searched and identified from five databases (Web of Science [WOS], Scopus, PubMed, Cochrane Library, Ovid MEDLINE) on 29 April 2024, were downloaded by author RR and collected in Mendeley Desktop version 1.19.8. Subsequently, duplicates were identified and removed by author NIS, and the shortlisted articles were transferred to Microsoft Excel for further screening.

Study selection

Relevant studies were selected via a screening process conducted by two authors, who independently screened the article titles and abstracts and then retrieved the full texts of the shortlisted articles. Efforts were made to include all available studies, including accessing publications via institutional accounts. Subsequently, two authors (RR and NIS) examined the full texts of potentially eligible papers separately, followed by discussion and re-evaluation between them to resolve any contradictory decisions. A third author (AI) was consulted when there were uncertainties in the decision-making process.

Data extraction

Two authors (RR and NIS) then extracted and tabulated the relevant data elements (article title, authors, publication year, study design, country, study population, study setting, sample size and number of cases identified, duration from index to assessment date, PCS prevalence estimates).

Methodological quality assessment

The methodological quality of the studies was evaluated with the Joanna Briggs Institute (JBI) Critical Appraisal Checklist for Studies Reporting Prevalence Data to ascertain how well each article addressed the possibility of bias. All articles screened and selected for inclusion in this systematic review were appraised by two critical appraisers (RR and NIS). The JBI checklist contains nine items, each comprising one question: (Item 1: Was the sample frame appropriate to address the target population?), (Item 2: Were study participants sampled in an appropriate way?), (Item 3: Was the sample size adequate?), (Item 4: Were the study subjects and the setting described in detail?), (Item 5: Was the data analysis conducted with sufficient coverage of the identified sample?), (Item 6: Were valid methods used for the identification of the condition?), (Item 7: Was the condition measured in a standard, reliable way for all participants?), (Item 8: Was there appropriate statistical analysis?), and (Item 9: Was the response rate adequate, and if not, was the low response rate managed appropriately?). Each item is coded as "yes/no/unclear/not applicable" and scored as yes = 1 and no, unclear, or not applicable = 0. The total score of each included study was expressed as a percentage, which was then categorized into three levels of risk of bias: high (20–50% of items scored yes), moderate (50–80%), and low (80–100%). Based on the assessment results, both appraisers discussed and finalised the decision on the overall appraisal, i.e., whether to include the assessed study in the review.
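The scoring rule just described can be summarized in a small helper; this is a minimal illustrative sketch (the function name and the example answers are hypothetical, not part of the review's materials):

```r
# Hypothetical helper illustrating the JBI scoring rule described above
jbi_risk_of_bias <- function(item_answers) {
  score <- mean(item_answers == "yes") * 100   # percentage of the 9 items scored "yes"
  if (score >= 80) "low" else if (score >= 50) "moderate" else "high"
}

# Example: 7 of 9 items answered "yes" -> 77.8% -> "moderate" risk of bias
jbi_risk_of_bias(c("yes", "yes", "yes", "no", "yes", "yes", "yes", "unclear", "yes"))
```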

Statistical analysis

The meta-analysis was conducted using the metaprop function of the meta package in R 4.3.1. Because of the heterogeneity of the included studies, arising from differences in study populations, geographical regions, and PCS assessment timepoints, a random-effects model was considered the better choice for assigning weights to each study in the meta-analysis. The pooled prevalence and the effect size of each study were presented in a forest plot, in which the size of each study marker was proportional to its weight. Statistical heterogeneity was measured with the I² statistic and its p-value, where a p-value of ≤ 0.05 and an I² of ≥ 50% indicated high heterogeneity. Visual inspection of the symmetry of the generated funnel plot was conducted to determine any influence of publication bias on the findings. Egger's test and Begg's rank correlation test were also conducted to further identify the presence of any asymmetry.
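The analysis described above could be reproduced along the following lines. This is a minimal sketch only: the data frame and its columns (study, cases, n) are hypothetical stand-ins for the extracted prevalence data, and the argument names follow recent versions of the meta package; the source confirms only that the metaprop function and a random-effects model were used.

```r
library(meta)  # provides metaprop(), forest(), funnel(), metabias(), metainf()

# Hypothetical extracted data: one row per included study
dat <- data.frame(
  study = c("Study A", "Study B", "Study C"),
  cases = c(250, 1200, 90),    # participants meeting the PCS definition
  n     = c(600, 2900, 210)    # participants assessed >= 12 weeks after the index date
)

# Pooled prevalence under a random-effects model, as described in the text
m <- metaprop(event = cases, n = n, studlab = study, data = dat,
              sm = "PLOGIT",             # pool logit-transformed proportions (metaprop default)
              random = TRUE, common = FALSE)

summary(m)  # pooled proportion with 95% CI, I^2, tau^2, and the Q-test p-value
forest(m)   # forest plot with marker size proportional to study weight
```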

Overall, a total of 3321 records were identified from the main literature search conducted at the end of April 2024, of which 907 duplicate articles were removed. Screening of the article titles and abstracts resulted in 2325 articles unrelated to the research question being excluded. All remaining articles were retrieved to determine their accessibility, and the 89 successfully retrieved full-text articles were reviewed and assessed for eligibility. Articles with content not relevant to this study were excluded. Studies in which the mean or median duration of prolonged signs or symptoms, healthcare utilisation, or follow-up was < 12 weeks from acute COVID-19 symptom onset were excluded to ensure that the samples with persistent COVID-19 symptoms in the finalised studies met the definition of PCS. A total of 41 articles were excluded because these studies and their contents did not align with the review topic or the other inclusion and exclusion criteria. Finally, 48 articles were included in this review. The PRISMA flow diagram in Fig. 1 depicts the literature selection process, the search criteria, and the number of articles involved at each stage.

Fig. 1 Flowchart showing the study selection process and number of results

Study characteristics and PCS prevalence

Table 1 presents the characteristics of the 48 included studies, including the overall PCS prevalence data from each study. Among these, 21 studies were from European countries, 14 from the American region, 10 from Asia, two from Australia, and one from the African continent. Forty-one of the included studies were cohort studies, five were cross-sectional studies, and two were case–control studies. The studies involved sample sizes of 106 to 124,313 individuals diagnosed with COVID-19 at least 12 weeks prior to the assessment date. The duration from the index date to the assessment date ranged from 12 weeks to 25.5 months. Among the included studies, 10 focused mainly on previously hospitalized COVID-19 patients, and one examined PCS among non-hospitalized COVID-19 patients. The majority of the included studies (35, as shown in Table 1) involved both previously hospitalized and non-hospitalized COVID-19 patients. Most of the examined populations in the 48 included studies were adults, and the percentage of female participants varied from 26.5% to 77.5%.

The Joanna Briggs Institute (JBI) Critical Appraisal Checklist for Studies Reporting Prevalence Data was used to assess the methodological quality of the included studies. The assessment results reflect the methodological quality and risk-of-bias level of the individual studies, categorized as low (80–100% of items scored yes), moderate (50–80%), or high (20–50%) risk of bias. The assessment results aided in finalizing the decision on the overall appraisal of the individual studies, i.e., whether to include each assessed study in the review. Based on the checklist, the majority of the 48 included articles were of high methodological quality, with a low risk of bias. The risk-of-bias level of each study is listed in the last column ('Risk of Bias') of Table 1 (summary of characteristics of the 48 included studies). All 48 assessed studies were accepted for inclusion in this review.

Pooled prevalence estimate of post-COVID syndrome

As shown by the forest plot in Fig. 2, the prevalence estimates of PCS reported in the 48 included individual studies ranged from 3.4% to 90.41%. Because a meta-analysis using the prevalence data from all 48 included studies showed very high heterogeneity (I² = 100%, p = 0) and funnel-plot asymmetry indicated by Egger's test, only 16 studies were accepted for the meta-analysis of overall PCS prevalence, after potential influential outliers were excluded on the basis of influence analyses (including leave-one-out analysis) and the risk-of-bias assessment of the studies.

Fig. 2 Forest plot presenting the Post-COVID Syndrome (PCS) prevalence data from all 48 studies
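One way to carry out the leave-one-out component of the influence analysis mentioned above is metainf() from the meta package. A minimal sketch, reusing the object m from the earlier example; the review's actual exclusion decisions combined several criteria, so this is illustrative only:

```r
# Re-estimate the pooled prevalence omitting one study at a time
inf <- metainf(m, pooled = "random")
print(inf)  # studies whose omission markedly shifts the estimate or I^2 are candidate outliers
```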

In the meta-analysis of the 16 retained studies, the pooled prevalence of PCS estimated by the random-effects model was 41.79% (95% CI: 39.70%–43.88%). The forest plot in Fig. 3 depicts the results derived from the random-effects model, while Fig. 4 shows the funnel plot for the publication bias assessment of the 16 studies.

Fig. 3 Forest plot presenting the Post-COVID Syndrome (PCS) pooled prevalence

Fig. 4 Funnel plot for the publication bias assessment of the 16 studies

Assessment of heterogeneity

Generally, heterogeneity is to be expected in a meta-analysis [67]. I² was used to measure heterogeneity, with thresholds of ≥ 25%, ≥ 50%, and ≥ 75% denoting low, moderate, and high heterogeneity, respectively. The meta-analysis conducted with the random-effects model to calculate the pooled prevalence of PCS in this study revealed significant mild-to-moderate heterogeneity across the included studies (I² = 52%, p < 0.01). The between-study heterogeneity, i.e., the variance in the underlying distribution of true effect sizes, was estimated at τ² = 0.0009. In meta-analyses, heterogeneity is frequently unavoidable because of variations in study quality, methodology, sample size, and participant inclusion criteria [49, 68].
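For reference, the heterogeneity quantities reported here are stored as components of the meta-analysis object from the earlier sketch; assuming the component names used by recent versions of the meta package:

```r
# Continuing from the metaprop object `m` created in the earlier sketch
m$I2       # I^2: proportion of total variability attributable to between-study heterogeneity
m$tau2     # tau^2: estimated variance of the true effect sizes (on the logit scale used here)
m$pval.Q   # p-value of Cochran's Q test for heterogeneity
```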

Assessment of publication bias

Publication bias might occur when journals and authors publish only articles with the outcome of interest, and it can be detected by visual inspection of funnel plots. As shown in Fig. 4, publication bias was visually suggested by the asymmetrical funnel plot. However, further analysis using Egger's test did not indicate the presence of funnel plot asymmetry, as the result was not statistically significant (p = 0.4661). The Begg rank correlation test was also not significant (p = 0.7871). These formal test findings suggested that the results were not influenced by publication bias. Nevertheless, any visual asymmetry in the funnel plot might also be caused by true heterogeneity rather than publication bias [69].
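The three checks named above map onto standard calls in the meta package. A minimal sketch, again on the object m from the earlier example; note that metabias() by default requires at least ten studies, as in the 16-study analysis here, so it would not run on the three-study toy data:

```r
funnel(m)                             # visual check of funnel-plot symmetry
metabias(m, method.bias = "linreg")   # Egger's regression test for asymmetry
metabias(m, method.bias = "rank")     # Begg and Mazumdar rank correlation test
```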

PCS prevalence at different Post-COVID assessment timepoints

To assess whether the pooled prevalence of PCS increased over time after acute COVID-19 infection, we stratified the included studies by assessment or follow-up timepoint. A subgroup analysis was performed to estimate the pooled PCS prevalence at ≥ 3, ≥ 6, and ≥ 12 months post-COVID-19 infection. As shown in Fig. 5, the estimated pooled Post-COVID Syndrome prevalences at the ≥ 3-, ≥ 6-, and ≥ 12-month timepoints were 45.06% (95% CI: 41.25%–48.87%, I² = 59%, p = 0.02), 41.30% (95% CI: 34.37%–48.24%, I² = 87%, p < 0.01), and 41.32% (95% CI: 39.27%–43.37%, I² = 21%, p = 0.27), respectively.

Fig. 5 Forest plot showing the Post-COVID Syndrome prevalence at different assessment timepoints
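A subgroup analysis of this kind can be expressed with the subgroup argument available in recent versions of the meta package. A minimal sketch, assuming a hypothetical timepoint column added to the illustrative data frame from the earlier example:

```r
# Hypothetical follow-up timepoint for each study in the toy data frame `dat`
dat$timepoint <- c(">= 3 months", ">= 6 months", ">= 12 months")

# Refit the random-effects model with subgroup-specific pooled estimates
m_sub <- metaprop(event = cases, n = n, studlab = study, data = dat,
                  sm = "PLOGIT", random = TRUE, common = FALSE,
                  subgroup = timepoint)

forest(m_sub)  # forest plot split by timepoint, analogous to Fig. 5
```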

Post-COVID syndrome prevalence in male and female

A further subgroup analysis was conducted to examine PCS prevalence among males and females. For this purpose, data from 10 of the 48 included articles were used in the subgroup analysis, after the exclusion of influential outliers to estimate the pooled prevalences with less heterogeneity. As shown in Fig. 6, the estimated Post-COVID Syndrome prevalence was 47.23% in males (95% CI: 44.03%–50.42%) and 52.77% in females (95% CI: 49.58%–55.97%). The studies had significant moderate heterogeneity (I² = 51%, p = 0.03).

Fig. 6 Forest plot showing the Post-COVID Syndrome prevalence in males and females

Post-COVID syndrome prevalence in different continental regions

Another subgroup analysis based on stratification of PCS prevalence by continental regions was also performed. For this purpose, data from all 48 articles were included in the analysis.

The estimated Post-COVID Syndrome prevalences according to continental region are shown in Fig. 7. The pooled prevalence was 46.28% (95% CI: 39.53%–53.03%) in Europe, 46.29% (95% CI: 35.82%–56.77%) in America, 49.79% (95% CI: 30.05%–69.54%) in Asia, and 42.41% (95% CI: 0.00%–90.06%) in Australia. Only one study from the African continent was included in this review, with a PCS prevalence of 50.33% (95% CI: 44.55%–56.11%). Most of the subgroups showed significant heterogeneity (I² = 100%, p < 0.01).

Fig. 7 Forest plot showing the Post-COVID Syndrome prevalence in different continental regions

Post-COVID syndrome (PCS)

In this systematic review and meta-analysis, the NICE definition of PCS, i.e., signs and symptoms that develop during or after an infection consistent with COVID-19, continue for > 12 weeks, and are not explained by an alternative diagnosis, was used as the basis for identifying overall PCS prevalence data [14] from published studies worldwide. The cut-off point of 12 weeks was strictly applied when extracting and analysing the relevant data during the systematic review process.

Overall prevalence estimates of PCS worldwide

In this review, a total of 2414 published articles were screened from the 3321 articles identified from five databases using a PRISMA-guided systematic search. Forty-eight studies that individually reported PCS prevalence data were included, and the meta-analysis of 16 of these studies determined that the estimated pooled prevalence of PCS worldwide was 41.79% (95% CI: 39.70%–43.88%). Beyond the articles included in this meta-analysis, other notable published studies reporting PCS prevalence data might have been missed owing to limitations of our study, including the suitability of articles for meta-analysis and the strict inclusion criteria.

The local prevalences reported globally varied widely, increasing the likelihood of true heterogeneity when meta-analysed. Among the factors causing this variation were the different post-COVID-19 assessment timepoints used in the individual studies. Generally, most related published studies reported the prevalence of persisting COVID-19 symptoms at 3, 6, 9, 12, 18, and even 24 months after the onset of acute COVID-19 infection. In this meta-analysis, the follow-up or assessment timepoints were categorized into ≥ 3, ≥ 6, and ≥ 12 months after the index date, for which the pooled prevalence estimates were 45.06%, 41.30%, and 41.32%, respectively. A cross-sectional study in Malaysia reported that 21.1%, or approximately 1 in 5, COVID-19 survivors reported persistent ill health > 3 months post-COVID infection [70]. A study in India reported that 9.4% of people had long-term symptoms after COVID-19 [71]. Two studies in Saudi Arabia by Jabali et al. and Alkwai et al. reported approximately 49% and 51.2% overall PCS prevalence, respectively, while two studies in Turkey by Baris et al. and Kayaaslan et al. reported approximately 27.1% and 47.5% prevalence, respectively [6, 72, 73, 74]. In the Republic of Korea, Kim et al. reported a 52.7% prevalence of post-acute COVID-19 syndrome 12 months after COVID-19 infection [75]. A study in Japan reported a 56.14% prevalence [76], while a study in Mexico reported a high prevalence of 68% at approximately 90 days post-COVID infection [77]. In Canada, Estrada et al. reported a 28.5% prevalence of persistent post-COVID-19 symptoms 90 days after infection [78]. A large retrospective cohort study in the UK reported an overall prevalence of 36.55% [8], while another UK study reported that 2.3% of COVID-19 survivors reported symptoms persisting for ≥ 12 weeks [79]. Three post-COVID studies in Germany reported overall prevalences of 6.5%, 8.3%, and 49.3%, respectively [80, 81, 82]. Boscolo-Rizzo et al. reported that 53% of Italians reported chronic COVID-related symptoms 12 months following the onset of mild to moderate COVID [83], while 59.5% of people in Luxembourg reported at least one symptom 12 months after COVID infection [84]. Two post-COVID studies in Spain reported 14.34% and 48% prevalence of persistent symptoms at 6 months post-COVID, respectively [85, 86]. In the Netherlands, 12.7% of COVID-19 patients experienced persistent somatic symptoms that could be attributed to COVID-19 at a median of 101 days after infection [87]. A cohort study in Switzerland stated that 26% of people with PCR-confirmed SARS-CoV-2 infection reported not having fully recovered after 6–8 months [88]. A prospective cohort study in Russia stated that 47.1% of previously hospitalised patients with COVID-19 reported persistent symptoms at a median of 218 days post-discharge [89]. A prospective cohort study in France reported a higher prevalence of 60% [90]. A meta-analysis by Lopez-Leon et al. determined that 80% (95% CI: 65%–92%) of people diagnosed with COVID-19 developed at least one long-term symptom beyond 2 weeks and up to 110 days following acute COVID-19 infection [91]. A review by Chen et al. that meta-analysed post-COVID-19 condition prevalence at 120 days after COVID-19 infection revealed that the estimated global pooled prevalence was 49% (95% CI: 40%–59%) [92].
The review also estimated that the prevalence at 30, 60, 90, and 120 days after COVID-19 infection was 37% (95% CI: 26%–49%), 25% (95% CI: 15%–38%), 32% (95% CI: 14%–57%), and 49% (95% CI: 40%–59%), respectively [92]. Rahmati et al. also reported that 41.7% of COVID-19 survivors experienced at least one unresolved symptom 2 years after SARS-CoV-2 infection, still suffering from neurological, physical, or psychological sequelae [93]. In another meta-analysis by O'Mahoney et al., which included studies with a mean follow-up of 126 days post-COVID-19 infection, at least 45% of survivors went on to experience at least one unresolved symptom, regardless of hospitalisation status [94]. The 41.79% pooled prevalence of PCS worldwide estimated in this review is broadly in line with most of the pooled prevalences reported in other meta-analyses.

Symptom-specific PCS prevalence

This review focused mainly on determining the pooled prevalence estimate of PCS in general, hence the strict inclusion criteria. In view of the higher expected bias arising from the criteria and keywords set for obtaining the primary outcome of this study, we did not conduct subgroup analyses for symptom-specific pooled prevalence estimates. Compared with the limited number of studies focusing mainly on overall community-based PCS prevalence, numerous studies have focused on symptom-specific prevalence estimates of the conditions occurring post-COVID infection, although the terms used varied according to the interval between the initial infection and the assessment date.

Regarding symptom-specific prevalence, the WHO study on the clinical case definition by a Delphi consensus noted that shortness of breath, tiredness, and cognitive impairment are among the typical symptoms of PCS, which might affect daily functioning [ 95 ]. A review of the sequelae of other coronavirus infections determined that fatigue, psychological symptoms, and respiratory symptoms were common among SARS and Middle East respiratory syndrome (MERS) survivors [ 96 ]. A comprehensive systematic review and meta-analysis reported that the most common symptoms at the 3- to < 6-month assessment were fatigue (32%), shortness of breath (25%), sleep disorder (24%), and difficulty focusing (22%) [ 97 ]. Moy et al. stated that the most frequently reported symptoms were fatigue, brain fog, anxiety, insomnia, and depression, with female patients presenting 58% higher probability (95% CI: 1.02, 2.45) of experiencing persistent symptoms [ 70 ].

Sociodemographic-specific PCS prevalence

For sociodemographic-specific prevalence, PCS prevalence was generally higher in the female population. Female patients were less likely to have recovered [88] and were more susceptible to prolonged symptoms compared with male patients [98]. However, some research suggested that there might be a referral bias due to the higher participation in follow-up care by female patients compared with male patients [99]. A cohort study in Moscow reported that female sex was associated with post-COVID conditions at the 6- and 12-month assessments (OR: 2.04, 95% CI: 1.57–2.65 and OR: 2.04, 95% CI: 1.54–2.69, respectively) [100]. Furthermore, women experienced moderate or severe dyspnoea more often than men (53.8% vs. 21.1%) [101]. Martin-Loeches et al. stated that women were 69% more likely to develop persistent post-COVID-19 symptoms than men [102]. Moreover, most patients with persistent symptoms post-COVID infection were female (63.8%) [22]. In China, women were more likely to experience fatigue and anxiety or depression at the 6-month follow-up after COVID-19 infection [103]. A prospective cohort study in Milan, Italy, reported that women had a threefold higher risk of having persistent COVID-19 symptoms [104]. A few studies suggested that hormones might be involved in perpetuating the hyperinflammatory status of the acute COVID-19 phase in female patients even after recovery [30, 31]. While stronger immunoglobulin G (IgG) antibody production in female patients in the early phase of the illness might contribute to a more favourable outcome, it might also be involved in perpetuating disease manifestations [105]. In this study, the sex-stratified PCS prevalence was estimated at 47.23% (95% CI: 44.03%–50.42%) in males and 52.77% (95% CI: 49.58%–55.97%) in females, which is in line with the findings of most publications on a similar subject.

Populations with comorbidities such as respiratory problems, hypertension, and diabetes also had a higher PCS prevalence, indicating the role of these diseases in influencing the persistence of COVID-19 symptoms. Multiple studies also reported that a high body mass index (BMI) was associated with higher hospitalisation rates and increased COVID-19 illness severity, resulting in a higher risk of developing persistent COVID-19 symptoms. Patients with obesity (BMI ≥ 30 kg/m²) were more likely to experience moderate or severe dyspnoea (37.5%) than those with BMI < 30 kg/m² (27.0%), leading to a higher risk of post-acute COVID-19 [101]. Studies conducted before the COVID-19 pandemic era also identified inadequate humoral and cellular immune responses to vaccination against various viruses in individuals with higher BMI [106, 107]. Another study reported a weak association between obesity and persisting fatigue post-COVID infection [108], although this might have been due to the higher risk of chronic fatigue among overweight people, particularly obese individuals [109]. Apart from that, hospitalisation during the acute phase might also contribute to a higher PCS prevalence, as individuals hospitalised during the acute phase of the infection had a higher prevalence of prolonged symptoms (54%) compared with non-hospitalised patients (34%). In addition to the reported cases, there are also a substantial number of undetected infections due to several circumstances, including silent infections, diagnostic challenges, and underreporting [110, 111, 112].

Geographical region-specific PCS-prevalence

In this review, the estimated pooled prevalence by continental region was highest in Asia (49.79%), followed by America (46.29%), Europe (46.28%), and Australia (42.41%). In a meta-analysis published in April 2022, which focused on post-COVID-19 condition prevalence at > 28 days after infection, Chen et al. reported that the regional pooled prevalence estimates were highest in Asia at 51% (95% CI: 37%–65%), followed by Europe at 44% (95% CI: 32%–56%) and the USA at 31% (95% CI: 21%–43%). The regional differences described in another meta-analysis showed that the pooled prevalence among hospitalised populations was significantly higher in Europe, at 62.7% (95% CI: 56.5%–68.5%), compared with both North America, at 38.9% (95% CI: 24.0%–56.3%), and Asia, at 40.9% (95% CI: 34.5%–47.7%) [94]. Fewer studies on PCS prevalence have been published from the Australian and African continental regions than from the Asian, European, and American regions. The fact that Australia is the only country in the Australian continent might explain the smaller number of related publications from the region. For the African region, a study included in this review reported that the prevalence of persistent symptoms 3 months following acute SARS-CoV-2 infection was 50.2% in Liberia [59]. Based on a meta-analysis of long-COVID studies with a minimum duration of 4 weeks after acute COVID-19 onset, Müller et al. reported that the prevalence of long COVID in African countries varied widely, from 2% in Ghana to 86% in Egypt [113]. The scarcity of published studies on this health condition in the African region might be due to various factors influencing reporting, including inadequate clinical data and diagnostics, limited access to healthcare services, and a lack of awareness [113].

Strengths and limitations

Numerous post-COVID studies did not use the same term to refer to PCS. In this review, the inclusion criteria used in the study selection process allowed more PCS-specific prevalence data to be captured, which is a strength of this study. In addition, the subgroup analyses conducted in this study provide additional information on PCS prevalence based on the factors studied. Among the limitations of this study is that some potentially relevant studies might not have been identified during the database search, or might have been eliminated during the screening process, because of the different keywords and titles used. This review might also have been subject to language bias, as only articles in English were included. Another limitation is the high between-study heterogeneity in the meta-analysis, which might be true heterogeneity arising from various sources, such as differences in assessment timepoints, sociodemographic differences worldwide, and the smaller number of studies from certain geographical regions, such as Australia, the only country in its continental region, and resource-poor countries in Africa and certain parts of Asia.

Conclusions

This meta-analysis determined that the estimated pooled prevalence of PCS worldwide was 41.79% (95% CI: 39.70%–43.88%). The included studies had significant moderate heterogeneity (I² = 51%, p = 0.03). The estimated prevalence could be used in further comprehensive studies, including more detailed analyses stratifying the prevalence by symptom-specific risk factors, which might enable the development of better healthcare management plans for individuals with PCS. The provision of proper health, social, and economic protections for the higher-risk population is essential, as PCS affects population health and concurrently contributes to a higher economic burden on patients and countries.

Availability of data and materials

Data relevant to the study were included in the article.

Abbreviations

BMI: Body mass index

CI: Confidence interval

COVID-19: Coronavirus disease 2019

JBI: Joanna Briggs Institute

MERS: Middle East respiratory syndrome

MeSH: Medical Subject Headings

NICE: National Institute for Health and Care Excellence

Post-acute COVID-19 syndrome

PCS: Post-COVID syndrome

PHEIC: Public Health Emergency of International Concern

PRISMA: Preferred Reporting Items for Systematic Reviews and Meta-Analyses

RCGP: Royal College of General Practitioners

RdRp: RNA-dependent RNA polymerase

SARS-CoV-2: Severe acute respiratory syndrome coronavirus 2

SIGN: Scottish Intercollegiate Guidelines Network

T2DM: Type 2 diabetes mellitus

WHO: World Health Organization

WOS: Web of Science

τ²: Tau-squared

Gorbalenya AE, Baker SC, Baric RS, de Groot RJ, Drosten C, Gulyaeva AA, et al. The species severe acute respiratory syndrome-related coronavirus: classifying 2019-nCoV and naming it SARS-CoV-2. Nat Microbiol. 2020;5(4):536–44.

Lee A. Wuhan novel coronavirus (COVID-19): why global control is challenging? Elsevier Public Heal Emerg Collect. 2020;179:A1.

Summers J, Cheng HY, Lin HH, Barnard LT, Kvalsvig A, Wilson N, et al. Potential lessons from the Taiwan and New Zealand health responses to the COVID-19 pandemic. Lancet Reg Heal – West Pacific. 2020;4:44.

World Health Organization. WHO Director-General’s Opening Remarks at the Media Briefing on COVID-19 - 11 March 2020. 2020.

Ministry of Health Malaysia. Post Covid-19 Management Protocol. 2021.

Kayaaslan B, Eser F, Kalem AK, Kaya G, Kaplan B, Kacar D, et al. Post-COVID syndrome: a single-center questionnaire study on 1007 participants recovered from COVID-19. J Med Virol. 2021;93(12):6566–74.

Lancet T. Facing up to long COVID. Lancet. 2020;396(10266):1861.

Taquet M, Dercon Q, Luciano S, Geddes JR, Husain M, Harrison PJ. Incidence, co-occurrence, and evolution of long-COVID features: a 6-month retrospective cohort study of 273,618 survivors of COVID-19. PLoS Med. 2021;18(9):e1003773.

Nalbandian A, Sehgal K, Gupta A, Madhavan MV, McGroder C, Stevens JS, et al. Post-acute COVID-19 syndrome. Nat Med. 2021;27(4):601–15.

Felicity C, Elisa P. How and why patients made Long Covid. Soc Sci Med. 2021;268:113426.

Yong SJ. Long COVID or post-COVID-19 syndrome: putative pathophysiology, risk factors, and treatments. Infect Dis (Lond). 2021;53(10):737–54. https://doi.org/10.1080/23744235.2021.1924397 .

Michelen M, Manoharan L, Elkheir N, Cheng V, Dagens A, Hastie C, et al. Characterising long COVID: a living systematic review. BMJ Glob Heal. 2021;6(9):e005427.

World Health Organization. A clinical case definition of post COVID-19 condition by a Delphi consensus. 2021. Available from: https://www.who.int/publications/i/item/WHO-2019-nCoV-Post_COVID-19_condition-Clinical_case_definition-2021.1 .

NICE Guidelines, editor. COVID-19 rapid guideline: managing the long-term effects of COVID-19. NICE Guidelines. 2020.

Alinaghi SAS, Bagheri AB, Razi A, Mojdeganlou P, Mojdeganlou H, Afsah AM, et al. Late complications of covid-19; an umbrella review on current systematic reviews. Arch Acad Emerg Med. 2023;11(1):e28.

Davis HE, McCorkell L, Vogel JM, Topol EJ. Long COVID: major findings, mechanisms and recommendations. Nat Rev Microbiol. 2023;21:133–46.

Mehraeen E, SeyedAlinaghi SA, Karimi A. The post-Omicron situation: The end of the pandemic or a bigger challenge? J Med Virol. 2022;94(8):3501–2.

Crook H, Raza S, Nowell J, Young M, Edison P. Long covid-mechanisms, risk factors, and management. BMJ. 2021;374:n1648. http://ovidsp.ovid.com/ovidweb.cgi?T=JS&PAGE=reference&D=med20&NEWS=N&AN=34312178 .

Bellan M, Baricich A, Patrucco F, Zeppegno P, Gramaglia C, Balbo PE, et al. Long-term sequelae are highly prevalent one year after hospitalization for severe COVID-19. Sci Rep. 2021;11(1):22666. http://ovidsp.ovid.com/ovidweb.cgi?T=JS&PAGE=reference&D=med20&NEWS=N&AN=34811387 .

Bliddal S, Banasik K, Pedersen OB, Nissen J, Cantwell L, Schwinn M, et al. Acute and persistent symptoms in non-hospitalized PCR-confirmed COVID-19 patients. Sci Rep [Internet]. 2021;11(1):13153. Available from:. http://ovidsp.ovid.com/ovidweb.cgi?T=JS&PAGE=reference&D=med20&NEWS=N&AN=34162913

Peghin M, Palese A, Venturini M, De Martino M, Gerussi V, Graziano E, et al. Post-COVID-19 symptoms 6 months after acute infection among hospitalized and non-hospitalized patients. Clin Microbiol Infect. 2021;27(10):1507–13. http://ovidsp.ovid.com/ovidweb.cgi?T=JS&PAGE=reference&D=med20&NEWS=N&AN=34111579 .

Zayet S, Zahra H, Royer P-YY, Tipirdamaz C, Mercier J, Gendrin V, et al. Post-COVID-19 syndrome: Nine months after SARS-CoV-2 infection in a cohort of 354 patients: data from the first wave of COVID-19 in nord franche-comté hospital, France. Microorganisms. 2021;9(8):1719.

Fjelltveit EB, Blomberg B, Kuwelker K, Zhou F, Onyango TB, Brokstad KA, et al. Symptom burden and immune dynamics 6 to 18 months following mild severe acute respiratory syndrome coronavirus 2 infection (SARS-CoV-2): a case-control study. Clin Infect Dis. 2022;76:60–70.

Fumagalli C, Zocchi C, Tassetti L, Silverii MV, Amato C, Livi L, et al. Factors associated with persistence of symptoms 1 year after COVID-19: A longitudinal, prospective phone-based interview follow-up cohort study. Eur J Intern Med. 2022;97:36–41.

Helmsdal G, Hanusson KD, Kristiansen MF, Foldbo BM, Danielsen ME, Steig BÁ, et al. Long COVID in the long run - 23-month follow-up study of persistent symptoms. Open Forum Infect Dis. 2022;9(7). Available from:. https://www.scopus.com/inward/record.uri?eid=2-s2.0-85136307702&doi=10.1093%2Fofid%2Fofac270&partnerID=40&md5=cbcb5cc2ee6b5652237920f2edfe20ea

Kingery JR, Safford MM, Martin P, Lau JD, Rajan M, Wehmeyer GT, et al. Health status, persistent symptoms, and effort intolerance one year after acute COVID-19 infection. J Gen Intern Med. 2022;37(5):1218–25. http://ovidsp.ovid.com/ovidweb.cgi?T=JS&PAGE=reference&D=med21&NEWS=N&AN=35075531 .

Knight DRT, Munipalli B, Logvinov II, Halkar MG, Mitri G, Dabrh AMA, et al. Perception, prevalence, and prediction of severe infection and post-acute sequelae of COVID-19. Am J Med Sci. 2022;363(4):295–304. https://www.scopus.com/inward/record.uri?eid=2-s2.0-85123740987&doi=10.1016%2Fj.amjms.2022.01.002&partnerID=40&md5=4f42b267a8346594bf423562cea9f026 .

Nehme M, Braillard O, Chappuis F, Courvoisier DS, Kaiser L, Soccal PM, et al. One-year persistent symptoms and functional impairment in SARS-CoV-2 positive and negative individuals. J Intern Med. 2022;292(1):103–15.

Petersen MS, Kristiansen MF, Hanusson KD, Foldbo BM, Danielsen ME, Steig BA, et al. Prevalence of long COVID in a national cohort: longitudinal measures from disease onset until 8 months’ follow-up. Int J Infect Dis. 2022;122:437–41.

Rivera-Izquierdo M, Lainez-Ramos-Bossini AJ, de Alba IGF, Ortiz-Gonzalez-Serna R, Serrano-Ortiz A, Fernandez-Martinez NF, et al. Long COVID 12 months after discharge: persistent symptoms in patients hospitalised due to COVID-19 and patients hospitalised due to other causes-a multicentre cohort study. BMC Med. 2022;20(1):92.

Tisler A, Stirrup O, Pisarev H, Kalda R, Meister T, Suija K, et al. Post-acute sequelae of COVID-19 among hospitalized patients in Estonia: Nationwide matched cohort study. PLoS One. 2022;17(11):e0278057. http://ovidsp.ovid.com/ovidweb.cgi?T=JS&PAGE=reference&D=med22&NEWS=N&AN=36417409 .

Titze-de-Almeida R, da Cunha TR, Dos Santos Silva LD, Ferreira CS, Silva CP, Ribeiro AP, et al. Persistent, new-onset symptoms and mental health complaints in Long COVID in a Brazilian cohort of non-hospitalized patients. BMC Infect Dis. 2022;22(1):133. http://ovidsp.ovid.com/ovidweb.cgi?T=JS&PAGE=reference&D=med21&NEWS=N&AN=35135496 .

Wu Q, Ailshire JA, Crimmins EM. Long COVID and symptom trajectory in a representative sample of Americans in the first year of the pandemic. Sci Rep. 2022;12(1):11647. http://ovidsp.ovid.com/ovidweb.cgi?T=JS&PAGE=reference&D=med22&NEWS=N&AN=35804058 .

Babicki M, Kołat D, Kapusta J, Kałuzińska-Kołat Ż, Jankowski P, Mastalerz-Migas A, et al. Prevalence and assessment of risk factors among Polish adults with post-COVID-19 syndrome: a 12-month follow-up study. Polish Arch Intern Med. 2023;133(12):16512.

Boglione L, Poletti F, Rostagno R, Moglia R, Cantone M, Esposito M, et al. Long-COVID syndrome in hospitalized patients after 2 years of follow-up. J Public Heal Emerg. 2023;7:4.

Daniel CL, Fillingim S, James J, Bassler J, Lee A. Long COVID prevalence and associated characteristics among a South Alabama population. Public Health. 2023;221:135–41.

Fatima S, Ismail M, Ejaz T, Shah Z, Fatima S, Shahzaib M, et al. Association between long COVID and vaccination: A 12-month follow-up study in a low- to middle-income country. PLoS One. 2023;18(11):e0294780.

Feter N, Caputo EL, Leite JS, Delpino FM, Silva LS da, Vieira YP, et al. Prevalence and factors associated with long COVID in adults from Southern Brazil: Findings from the PAMPA cohort. Cad Saude Publica. 2023;39(12):e00098023.

Gaspar P, Dias M, Parreira I, Gonçalves HD, Parlato F, Maione V, et al. Predictors of long-COVID-19 and its impact on quality of life: longitudinal analysis at 3, 6 and 9 months after discharge from a Portuguese centre. In: Acta Medica Portuguesa. 2023;36(10):647–60.

Hastie CE, Lowe DJ, McAuley A, Mills NL, Winter AJ, Black C, et al. True prevalence of long-COVID in a nationwide, population cohort study. Nat Commun. 2023;14(1):7892.

Hua MJ, Gonakoti S, Shariff R, Corpuz C, Acosta RAH, Chang H, et al. Prevalence and Characteristics of Long COVID 7–12 Months After Hospitalization Among Patients From an Urban Safety-Net Hospital: A Pilot Study. AJPM Focus. 2023;2(3):100091.

Jayasekera MMPT, De Silva NL, Edirisinghe EMDT, Samarawickrama T, Sirimanna SWDRC, Govindapala BGDS, et al. A prospective cohort study on post COVID syndrome from a tertiary care centre in Sri Lanka. Sci Rep. 2023;13(1):15569.

Jogdand MS, Bhondwe MR, Jogdand KS, Yerpude PN, Tathe GR, Wadiyar SS. Prevalence and Determinants of Long COVID among the COVID-19 Survivors: A Cross-sectional Study from A Rural Area of Maharashtra. Indian J Community Heal. 2023;35(2):193–8.

Khanafer N, Henaff L, Bennia S, Termoz A, Chapurlat R, Escuret V, et al. Factors Associated with Long COVID-19 in a French Multicentric Prospective Cohort Study. Int J Environ Res Public Health. 2023;20(17):6678.

Kim Y, Bae S, Chang HH, Kim SW. Long COVID prevalence and impact on quality of life 2 years after acute COVID-19. Sci Rep. 2023;13(1):11207.

Krishnadath I, Harkisoen S, Gopie F, van der Hilst K, Hollum M, Woittiez L, et al. Prevalence of persistent symptoms after having COVID-19 in a cohort in Suriname. Rev Panam Salud Publica/Pan Am J Public Heal. 2023;47:e79.

Lapa J, Rosa D, Mendes JPL, Deusdará R, Romero GAS. Prevalence and Associated Factors of Post-COVID-19 Syndrome in a Brazilian Cohort after 3 and 6 Months of Hospital Discharge. Int J Environ Res Public Health. 2023;20(1):848.

Martínez-Ayala MC, Proaños NJ, Cala-Duran J, Lora-Mantilla AJ, Cáceres-Ramírez C, Villabona-Flórez SJ, et al. Factors associated with long COVID syndrome in a Colombian cohort. Front Med. 2023;10:1325616.

Montoy JCC, Ford J, Yu H, Gottlieb M, Morse D, Santangelo M, et al. Prevalence of symptoms ≤12 months after acute illness, by COVID-19 testing status among adults — United States, December 2020–March 2023. MMWR Morb Mortal Wkly Rep. 2023;72(32):859–65. https://www.cdc.gov/mmwr/volumes/72/wr/mm7232a2.htm .

Peghin M, De Martino M, Palese A, Chiappinotto S, Fonda F, Gerussi V, et al. Post-COVID-19 syndrome 2 years after the first wave: the role of humoral response, vaccination and reinfection. Open Forum Infect Dis. 2023;10(7):ofad364.

Rodríguez Onieva A, Vallejo Basurte C, Fernández Bersabé A, Camacho Cerro L, Valverde Bascón B, Muriel Sanjuan N, et al. Clinical characterization of the persistent COVID-19 symptoms: a descriptive observational study in primary care. J Prim Care Community Heal. 2023;14:21501319231208283.

Silva KM, Freitas DCA, Medeiros SS, Miranda LVA, Carmo JBM, Silva RG, et al. Prevalence and predictors of COVID-19 long-term symptoms: a cohort study from the Amazon Basin. Am J Trop Med Hyg. 2023;109(2):466–70.

Talhari C, Criado PR, Castro CCS, Ianhez M, Ramos PM, Miot HA. Prevalence of and risk factors for post-COVID: Results from a survey of 6,958 patients from Brazil. An Acad Bras Cienc. 2023;95(1):e20220143.

Tran TK, Truong SN, Thanh LT, Gia NLH, Trung HP, Dinh BT. Post-COVID condition: a survey of patients recovered from COVID-19 in Central Vietnam. J Infect Dev Ctries. 2023;17(9):1213–20.

van der Maaden T, Mutubuki EN, de Bruijn S, Leung KY, Knoop H, Slootweg J, et al. Prevalence and Severity of Symptoms 3 Months After Infection With SARS-CoV-2 Compared to Test-Negative and Population Controls in the Netherlands. J Infect Dis. 2023;227(9):1059–67.

Wahlgren C, Forsberg G, Divanoglou A, Östholm Balkhed Å, Niward K, Berg S, et al. Two-year follow-up of patients with post-COVID-19 condition in Sweden: a prospective cohort study. Lancet Reg Heal - Eur. 2023;28:100595.

Wong MCS, Huang J, Wong YY, Wong GLH, Yip TCF, Chan RNY, et al. Epidemiology, Symptomatology, and Risk Factors for Long COVID Symptoms: Population-Based, Multicenter Study. JMIR Public Heal Surveill. 2023;9:e42315.

Bello-Chavolla OY, Fermín-Martínez CA, Ramírez-García D, Vargas-Vázquez A, Fernández-Chirino L, Basile-Alvarez MR, et al. Prevalence and determinants of post-acute sequelae after SARS-CoV-2 infection (Long COVID) among adults in Mexico during 2022: a retrospective analysis of nationally representative data. Lancet Reg Heal - Am. 2024;30:100688.

Gwaikolo C, Sackie-Wapoe Y, Badio M, Glidden D V., Lindan C, Martin J. Prevalence and determinants of post-acute sequelae of COVID-19 in Liberia. Int J Epidemiol. 2024;53(1):dyad167.

Jangnin R, Ritruangroj W, Kittisupkajorn S, Sukeiam P, Inchai J, Maneeton B, et al. Long-COVID prevalence and its association with health outcomes in the post-vaccine and antiviral-availability era. J Clin Med. 2024;13(5):1208.

Keng Tok PS, Kang KY, Ng SW, Rahman NA, Syahmi MA, Pathmanathan MD, et al. Post COVID-19 condition among adults in Malaysia following the omicron wave: a prospective cohort study. PLoS One. 2024;19(1):e0296488.

Nguyen KH, Bao Y, Mortazavi J, Allen JD, Chocano-Bedoya PO, Corlin L. Prevalence and Factors Associated with Long COVID Symptoms among U.S. Adults, 2022. Vaccines. 2024;12(1):99.

Patro M, Gothi D, Anand S, Priyadarshini DPDK, Ojha UC, Pal RS, et al. Follow-up study of COVID-19 sequelae (FOSCO study). Lung India. 2024;41(2):103–9.

Salmon D, Slama D, Linard F, Dumesges N, Le Baut V, Hakim F, et al. Patients with Long COVID continue to experience significant symptoms at 12 months and factors associated with improvement: A prospective cohort study in France (PERSICOR). Int J Infect Dis. 2024;140:9–16.

Tan S, Pryor AJG, Melville GW, Fischer O, Hewitt L, Davis KJ. The lingering symptoms of post-COVID-19 condition (long-COVID): a prospective cohort study. Intern Med J. 2024;54(2):224–33.

Woldegiorgis M, Cadby G, Ngeh S, Korda RJ, Armstrong PK, Maticevic J, et al. Long COVID in a highly vaccinated but largely unexposed Australian population following the 2022 SARS-CoV-2 Omicron wave: a cross-sectional survey. Med J Aust. 2024;220(6):323–30.

Higgins JPT. Commentary: Heterogeneity in meta-analysis should be expected and appropriately quantified. Int J Epidemiol. 2008;37(5):1158–60.

Melsen WG, Bootsma MCJ, Rovers MM, Bonten MJM. The effects of clinical and statistical heterogeneity on the predictive values of results from meta-analyses. Clin Microbiol Infect. 2014;20(2):123–9. Available from:  http://dx.doi.org/10.1111/1469-0691.12494 .

Tang JL, Liu JL. Misleading funnel plot for detection of bias in meta-analysis. J Clin Epidemiol. 2000;53(5):477–84.

Moy FM, Hairi NN, Lim ERJ, Bulgiba A. Long COVID and its associated factors among COVID survivors in the community from a middle-income country-An online cross-sectional study. PLoS One. 2022;17(8):e0273364. http://ovidsp.ovid.com/ovidweb.cgi?T=JS&PAGE=reference&D=med22&NEWS=N&AN=36040960 .

Arjun MC, Singh AK, Pal D, Das K, Venkateshan M. Characteristics and predictors of Long COVID among diagnosed cases of COVID-19. PLoS One. 2022;17(12):e0278825. http://ovidsp.ovid.com/ovidweb.cgi?T=JS&PAGE=reference&D=medl&NEWS=N&AN=36538532 .

Jabali MA, Alsabban AS, Bahakeem LM, Zwawy MA, Bagasi AT, Bagasi HT, et al. Persistent Symptoms Post-COVID-19: An Observational Study at King Abdulaziz Medical City, Jeddah, Saudi Arabia. CUREUS J Med Sci. 2022;14(4):e24343.

Alkwai HM, Khalifa AM, Ahmed AM, Alnajib AM, Alshammari KA, Alrashidi MM, et al. Persistence of COVID-19 symptoms beyond 3 months and the delayed return to the usual state of health in Saudi Arabia: A cross-sectional study. Sage Open Med. 2022;10:20503121221129918.

Baris SA, Toprak OB, Cetinkaya PD, Fakili F, Kokturk N, Kul S, et al. The predictors of long-COVID in the cohort of Turkish Thoracic Society-TURCOVID multicenter registry: One year follow-up results. Asian Pac J Trop Med. 2022;15(9):400–9. https://www.scopus.com/inward/record.uri?eid=2-s2.0-85139758970&doi=10.4103%2F1995-7645.354422&partnerID=40&md5=2e324372c1b34f9b8d857275f3aebbe1 .

Kim Y, Bitna Ha, Kim SW, Chang HH, Kwon KT, Bae S, et al. Post-acute COVID-19 syndrome in patients after 12 months from COVID-19 infection in Korea. BMC Infect Dis. 2022;22(1):1–12. https://doi.org/10.1186/s12879-022-07062-6 .

Imoto W, Yamada K, Kawai R, Imai T, Kawamoto K, Uji M, et al. A cross-sectional, multicenter survey of the prevalence and risk factors for Long COVID. Sci Rep. 2022;12(1):22413. http://ovidsp.ovid.com/ovidweb.cgi?T=JS&PAGE=reference&D=medl&NEWS=N&AN=36575200 .

Wong-Chew RM, Rodríguez Cabrera EX, Rodríguez Valdez CA, Lomelin-Gascon J, Morales-Juárez L, de la Cerda MLR, et al. Symptom cluster analysis of long COVID-19 in patients discharged from the Temporary COVID-19 Hospital in Mexico City. Ther Adv Infect Dis. 2022;9:20499361211069264.

Estrada-Codecido J, Chan AK, Andany N, Lam PW, Nguyen M, Pinto R, et al. Prevalence and predictors of persistent post-COVID-19 symptoms. JAMMI. 2022;7(3):208–19. https://www.scopus.com/inward/record.uri?eid=2-s2.0-85139237175&doi=10.3138%2Fjammi-2022-0013&partnerID=40&md5=476625042f96602c577e2016e65b5de9 .

Sudre CH, Murray B, Varsavsky T, Graham MS, Penfold RS, Bowyer RC, et al. Attributes and predictors of long COVID. Nat Med. 2021;27(4):626–31. http://ovidsp.ovid.com/ovidweb.cgi?T=JS&PAGE=reference&D=med19&NEWS=N&AN=33692530 .

Peter RS, Nieters A, Krausslich HG, Brockmann SO, Gopel S, Kindle G, et al. Post-acute sequelae of covid-19 six to 12 months after infection: population based study. BMJ. 2022;379:e071050. http://ovidsp.ovid.com/ovidweb.cgi?T=JS&PAGE=reference&D=med22&NEWS=N&AN=36229057 .

Kostev K, Smith L, Koyanagi A, Konrad M, Jacob L. Post-COVID-19 conditions in children and adolescents diagnosed with COVID-19. Pediatr Res. 2022;95:182–7.

Forster C, Colombo MG, Wetzel AJ, Martus PJ, Joos S. Persisting symptoms after COVID-19. Dtsch Arztebl Int. 2022;119(10):167-+.

Boscolo-Rizzo P, Guida F, Polesel J, Marcuzzo AV, Capriotti V, D’Alessandro A, et al. Sequelae in adults at 12 months after mild-to-moderate coronavirus disease 2019 (COVID-19). Int Forum Allergy Rhinol. 2021;11(12):1685–8. https://www.scopus.com/inward/record.uri?eid=2-s2.0-85107434855&doi=10.1002%2Falr.22832&partnerID=40&md5=906d5f4f7fc87c42348fe39371f1395b .

Fischer A, Zhang L, Elbéji A, Wilmes P, Oustric P, Staub T, et al. Long COVID symptomatology after 12 months and its impact on quality of life according to initial coronavirus disease 2019 disease severity. Open Forum Infect Dis [Internet]. 2022;9(8). Available from: https://www.scopus.com/inward/record.uri?eid=2-s2.0-85141512873&doi=10.1093%2Fofid%2Fofac397&partnerID=40&md5=9e13b2fa20e3f1ebd3deac40523cbd4d .

Montenegro P, Moral I, Puy A, Cordero E, Chantada N, Cuixart L, et al. Prevalence of Post COVID-19 Condition in Primary Care: A Cross Sectional Study. Int J Environ Res Public Health. 2022;19(3):1836.

Moreno-Perez O, Merino E, Leon-Ramirez JM, Andres M, Ramos JM, Arenas-Jimenez J, et al. Post-acute COVID-19 syndrome. Incidence and risk factors: A Mediterranean cohort study. J Infect. 2021;82(3):378–83. http://ovidsp.ovid.com/ovidweb.cgi?T=JS&PAGE=reference&D=med19&NEWS=N&AN=33450302 .

Ballering AV, van Zon SKR, Hartman TC, Rosmalen JGM. Initiative LCR. Persistence of somatic symptoms after COVID-19 in the Netherlands: an observational cohort study. Lancet. 2022;400(10350):452–61 WE-Science Citation Index Expanded (SCI.

Menges D, Ballouz T, Anagnostopoulos A, Aschmann HE, Domenghino A, Fehr JS, et al. Burden of post-COVID-19 syndrome and implications for healthcare service planning: A population-based cohort study. PLoS One. 2021;16(7):e0254523.

Munblit D, Bobkova P, Spiridonova E, Shikhaleva A, Gamirova A, Blyuss O, et al. Incidence and risk factors for persistent symptoms in adults previously hospitalized for COVID-19. Clin Exp Allergy. 2021;51(9):1107–20. http://ovidsp.ovid.com/ovidweb.cgi?T=JS&PAGE=reference&D=med20&NEWS=N&AN=34351016 .

Ghosn J, Piroth L, Epaulard O, Le Turnier P, Mentré F, Bachelet D, et al. Persistent COVID-19 symptoms are highly prevalent 6 months after hospitalization: results from a large prospective cohort. Clin Microbiol Infect. 2021;27(7):1041.e1–1041.e4.

Lopez-Leon S, Wegman-Ostrosky T, Perelman C, Sepulveda R, Rebolledo PA, Cuapio A, et al. More than 50 long-term effects of COVID-19: a systematic review and meta-analysis. Sci Rep. 2021;11(1):1–12. https://www.nature.com/articles/s41598-021-95565-8 . [cited 2023 Jan 23].

Chen C, Haupert SR, Zimmermann L, Shi X, Fritsche LG, Mukherjee B. Global prevalence of post-coronavirus disease 2019 (COVID-19) condition or long COVID: a meta-analysis and systematic review. J Infect Dis. 2022;226(9):1593–607. https://academic.oup.com/jid/article/226/9/1593/6569364 .

Rahmati M, Udeh R, Yon DK, Lee SW, Dolja-Gore X, McEVoy M, et al. A systematic review and meta-analysis of long-term sequelae of COVID-19 2-year after SARS-CoV-2 infection: a call to action for neurological, physical, and psychological sciences. J Med Virol. 2023;95(6):e28852.

O’Mahoney LL, Routen A, Gillies C, Ekezie W, Welford A, Zhang A, et al. The prevalence and long-term health effects of Long Covid among hospitalised and non-hospitalised populations: a systematic review and meta-analysis. eClinicalMedicine. 2023;55:101762.

Soriano JB, Murthy S, Marshall JC, Relan P, Diaz J V. A clinical case definition of post-COVID-19 condition by a Delphi consensus. The Lancet Infectious Diseases. 2022;22(4):e102–7.

O’Sullivan O. Long-term sequelae following previous coronavirus epidemics. Clin Med J R Coll Physicians London. 2021;21(1):e68–e70.

Alkodaymi MS, Omrani OA, Fawzy NA, Shaar BA, Almamlouk R, Riaz M, et al. Prevalence of post-acute COVID-19 syndrome symptoms at different follow-up periods: a systematic review and meta-analysis. Clin Microbiol Infect. 2022;28(5):657–66. http://ovidsp.ovid.com/ovidweb.cgi?T=JS&PAGE=reference&D=med21&NEWS=N&AN=35124265 .

Halpin S, O’Connor R, Sivan M. Long COVID and chronic COVID syndromes. J Med Virol. 2021;93(3):1242–3.

Bell ML, Catalfamo CJ, Farland L V., Ernst KC, Jacobs ET, Klimentidis YC, et al. Post-acute sequelae of COVID-19 in a non-hospitalized cohort: results from the Arizona CoVHORT. PLoS One. 2021;16(8):e0254347.

Pazukhina E, Andreeva M, Spiridonova E, Bobkova P, Shikhaleva A, El-Taravi Y, et al. Prevalence and risk factors of Post-COVID-19 Condition in adults and children at 6 and 12 months after hospital discharge: a prospective, cohort study in moscow (Stop COVID). SSRN Electron J. 2022;20:244.

Parkin A, Davison J, Tarrant R, Ross D, Halpin S, Simms A, et al. A Multidisciplinary NHS COVID-19 Service to Manage Post-COVID-19 Syndrome in the Community. 2021. https://doi.org/10.1177/21501327211010994 .

Martin-Loeches I, Motos A, Menéndez R, Gabarrús A, González J, Fernández-Barat L, et al. ICU-Acquired Pneumonia Is Associated with Poor Health Post-COVID-19 Syndrome. J Clin Med. 2022;11(1):224.

Huang C, Huang L, Wang Y, Li X, Ren L, Gu X, et al. -month consequences of COVID-19 in patients discharged from hospital: a cohort study. Lancet. 2021;397(10270):220–32.

Bai F, Tomasoni D, Falcinella C, Barbanotti D, Castoldi R, Mulè G, et al. Female gender is associated with long COVID syndrome: a prospective cohort study. Clin Microbiol Infect. 2022;28(4):611.e9-611.e16.

Zeng F, Dai C, Cai P, Wang J, Xu L, Li J, et al. A comparison study of SARS-CoV-2 IgG antibody between male and female COVID-19 patients: A possible reason underlying different outcome between sex. J Med Virol. 2020;92(10):2050–4.

Sheridan PA, Paich HA, Handy J, Karlsson EA, Hudgens MG, Sammon AB, et al. Obesity is associated with impaired immune response to influenza vaccination in humans. Int J Obes. 2012;36(8):1072–7.

Painter SD, Ovsyannikova IG, Poland GA. The weight of obesity on the human immune response to vaccination. Vaccine. 2015;33(36):4422–9.

Townsend L, Dyer AH, Jones K, Dunne J, Mooney A, Gaffney F, et al. Persistent fatigue following SARS-CoV-2 infection is common and independent of severity of initial infection. PLoS One. 2020;15(11):e0240784.

Flores S, Brown A, Adeoye S, Jason LA, Evans M. Examining the impact of obesity on individuals with chronic fatigue syndrome. Work Heal Saf. 2013;61(7):299–307.

McElfish PA, Purvis R, James LP, Willis DE, Andersen JA. Perceived barriers to covid-19 testing. Int J Environ Res Public Health. 2021;18(5):2278.

Zimmermann L, Bhattacharya S, Purkayastha S, Kundu R, Bhaduri R, Ghosh P, et al. SARS-CoV-2 infection fatality rates in india: systematic review, meta-analysis and model-based estimation. Stud Microeconomics. 2021;9(2):137–79.

Rahmandad H, Lim TY, Sterman J. Behavioral dynamics of COVID-19: estimating underreporting, multiple waves, and adherence fatigue across 92 nations. Syst Dyn Rev. 2021;37(1):5–31.

Müller SA, Isaaka L, Mumm R, Scheidt-Nave C, Heldt K, Schuster A, et al. Prevalence and risk factors for long COVID and post-COVID-19 condition in Africa: a systematic review. Lancet Glob Heal. 2023;11(11):e1713–24.

Acknowledgements

The authors express their sincere gratitude to the Ministry of Higher Education (MoHE) Malaysia for funding this research through the Fundamental Research Grant Scheme (grant number FRGS/1/2022/SKK04/UKM/02/2), and to all who contributed to the production of this article.

Funding

This research was funded by the Ministry of Higher Education (MoHE) Malaysia through the Fundamental Research Grant Scheme (grant number FRGS/1/2022/SKK04/UKM/02/2).

Author information

Authors and Affiliations

Department of Public Health Medicine, Faculty of Medicine, Universiti Kebangsaan Malaysia, Bandar Tun Razak, Cheras, Kuala Lumpur, 56000, Malaysia

Ruhana Sk Abd Razak, Aniza Ismail & Nur Insyirah Sha’ari

Faculty of Public Health, Universitas Sumatera Utara, Jalan Universitas No. 21 Kampus USU, Medan, North Sumatra, 20155, Indonesia

Aniza Ismail

Department of Family Medicine, Faculty of Medicine, Universiti Kebangsaan Malaysia, Bandar Tun Razak, Cheras, Kuala Lumpur, 56000, Malaysia

Aznida Firzah Abdul Aziz

Department of Public Health Medicine, Faculty of Medicine, Universiti Teknologi MARA (UiTM), Sungai Buloh, Selangor, Malaysia

Leny Suzana Suddin

Department of Primary Care, Faculty of Medicine and Health Sciences, Universiti Sains Islam Malaysia (USIM), Persiaran Ilmu, Putra Nilai, Nilai, Negeri Sembilan, 71800, Malaysia

Amirah Azzeri

Contributions

Conception of the work: RR and AI. Initial search, data extraction, screening process, quality assessment, and data analysis: RR, NIS, and AI. Results interpretation: AI, AFAA, LSS, AA, and RR. Drafting the article: RR and NIS. Critical revision of the manuscript: AI, AFAA, LSS, AA, and RR. Final approval of the manuscript: all authors.

Corresponding author

Correspondence to Aniza Ismail.

Ethics declarations

Ethics approval and consent to participate

Not applicable. This study is a systematic review based on published studies. Patients and/or the public were not involved in the design, conduct, reporting, or dissemination plans of this review.

Consent for publication

Not applicable (patients and/or the public were not involved in the design, conduct, reporting, or dissemination plans of this review). However, all authors approved and consented to the publication of this review.

Competing interests

The authors declare no competing interests.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Supplementary Material 1.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ . The Creative Commons Public Domain Dedication waiver ( http://creativecommons.org/publicdomain/zero/1.0/ ) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Sk Abd Razak, R., Ismail, A., Abdul Aziz, A.F. et al. Post-COVID syndrome prevalence: a systematic review and meta-analysis. BMC Public Health 24, 1785 (2024). https://doi.org/10.1186/s12889-024-19264-5

Received : 25 September 2023

Accepted : 25 June 2024

Published : 04 July 2024

DOI : https://doi.org/10.1186/s12889-024-19264-5

Share this article

Anyone you share the following link with will be able to read this content:

Sorry, a shareable link is not currently available for this article.

Provided by the Springer Nature SharedIt content-sharing initiative

Keywords: Post-COVID syndrome

  • Systematic review
  • Open access
  • Published: 24 June 2024

A systematic review of experimentally tested implementation strategies across health and human service settings: evidence from 2010-2022

  • Laura Ellen Ashcraft   ORCID: orcid.org/0000-0001-9957-0617 1 , 2 ,
  • David E. Goodrich 3 , 4 , 5 ,
  • Joachim Hero 6 ,
  • Angela Phares 3 ,
  • Rachel L. Bachrach 7 , 8 ,
  • Deirdre A. Quinn 3 , 4 ,
  • Nabeel Qureshi 6 ,
  • Natalie C. Ernecoff 6 ,
  • Lisa G. Lederer 5 ,
  • Leslie Page Scheunemann 9 , 10 ,
  • Shari S. Rogal 3 , 11   na1 &
  • Matthew J. Chinman 3 , 4 , 6   na1  

Implementation Science volume 19, Article number: 43 (2024)

Studies of implementation strategies range in rigor, design, and evaluated outcomes, presenting interpretation challenges for practitioners and researchers. This systematic review aimed to describe the body of research evidence testing implementation strategies across diverse settings and domains, using the Expert Recommendations for Implementing Change (ERIC) taxonomy to classify strategies and the Reach Effectiveness Adoption Implementation and Maintenance (RE-AIM) framework to classify outcomes.

We conducted a systematic review of studies examining implementation strategies from 2010-2022 and registered with PROSPERO (CRD42021235592). We searched databases using terms “implementation strategy”, “intervention”, “bundle”, “support”, and their variants. We also solicited study recommendations from implementation science experts and mined existing systematic reviews. We included studies that quantitatively assessed the impact of at least one implementation strategy to improve health or health care using an outcome that could be mapped to the five evaluation dimensions of RE-AIM. Only studies meeting prespecified methodologic standards were included. We described the characteristics of studies and frequency of implementation strategy use across study arms. We also examined common strategy pairings and cooccurrence with significant outcomes.

Our search resulted in 16,605 studies; 129 met inclusion criteria. Studies tested an average of 6.73 strategies (0-20 range). The most assessed outcomes were Effectiveness ( n =82; 64%) and Implementation ( n =73; 56%). The implementation strategies most frequently occurring in the experimental arm were Distribute Educational Materials ( n =99), Conduct Educational Meetings ( n =96), Audit and Provide Feedback ( n =76), and External Facilitation ( n =59). These strategies were often used in combination. Nineteen implementation strategies were frequently tested and associated with significantly improved outcomes. However, many strategies were not tested sufficiently to draw conclusions.

This review of 129 methodologically rigorous studies built upon prior implementation science data syntheses to identify implementation strategies that had been experimentally tested and summarized their impact on outcomes across diverse outcomes and clinical settings. We present recommendations for improving future similar efforts.

Contributions to the literature

While many implementation strategies exist, it has been challenging to compare their effectiveness across a wide range of trial designs and practice settings.

This systematic review provides a transdisciplinary evaluation of implementation strategies across population, practice setting, and evidence-based interventions using a standardized taxonomy of strategies and outcomes.

Educational strategies were employed ubiquitously; nineteen other commonly used implementation strategies, including External Facilitation and Audit and Provide Feedback, were associated with positive outcomes in these experimental trials.

This review offers guidance for scholars and practitioners alike in selecting implementation strategies and suggests a roadmap for future evidence generation.

Implementation strategies are “methods or techniques used to enhance the adoption, implementation, and sustainment of evidence-based practices or programs” (EBPs) [ 1 ]. In 2015, the Expert Recommendations for Implementing Change (ERIC) study organized a panel of implementation scientists to compile a standardized set of implementation strategy terms and definitions [ 2 , 3 , 4 ]. These 73 strategies were then organized into nine “clusters” [ 5 ]. The ERIC taxonomy has been widely adopted and further refined [ 6 , 7 , 8 , 9 , 10 , 11 , 12 , 13 ]. However, much of the evidence for individual or groups of ERIC strategies remains narrowly focused. Prior systematic reviews and meta-analyses have assessed strategy effectiveness, but have generally focused on a specific strategy, (e.g., Audit and Provide Feedback) [ 14 , 15 , 16 ], subpopulation, disease (e.g., individuals living with dementia) [ 16 ], outcome [ 15 ], service setting (e.g., primary care clinics) [ 17 , 18 , 19 ] or geography [ 20 ]. Given that these strategies are intended to have broad applicability, there remains a need to understand how well implementation strategies work across EBPs and settings and the extent to which implementation knowledge is generalizable.

There are challenges in assessing the evidence of implementation strategies across many EBPs, populations, and settings. Heterogeneity in population characteristics, study designs, methods, and outcomes has made it difficult to quantitatively compare which strategies work and under which conditions [21]. Moreover, there remains significant variability in how researchers operationalize, apply, and report strategies (individually or in combination) and outcomes [21, 22]. Still, synthesizing data related to using individual strategies would help researchers replicate findings and better understand possible mediating factors, including the cost, timing, and delivery by specific types of health providers or key partners [23, 24, 25]. Such an evidence base would also aid practitioners with implementation planning, such as when and how to deploy a strategy for optimal impact.

Building upon previous efforts, we therefore conducted a systematic review to evaluate the level of evidence supporting the ERIC implementation strategies across a broad array of health and human service settings and outcomes, as organized by the evaluation framework RE-AIM (Reach, Effectiveness, Adoption, Implementation, Maintenance) [26, 27, 28]. A secondary aim of this work was to identify patterns in the scientific reporting of strategy use that could inform not only reporting standards for strategies but also the methods employed in future studies. The current study was guided by the following research questions (see Footnote 1):

What implementation strategies have been most commonly and rigorously tested in health and human service settings?

Which implementation strategies were commonly paired?

What is the evidence supporting commonly tested implementation strategies?

Methods

We used the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA-P) model [29, 30, 31] to develop and report on the methods for this systematic review (Additional File 1). This study was considered to be non-human subjects research by the RAND institutional review board.

Registration

The protocol was registered with PROSPERO (PROSPERO 2021 CRD42021235592).

Eligibility criteria

This review sought to synthesize evidence for implementation strategies from research studies conducted across a wide range of health-related settings and populations. Inclusion criteria required studies to: 1) be available in English; 2) be published between January 1, 2010 and September 20, 2022; 3) be based on experimental research (excluding protocols, commentaries, conference abstracts, and proposed frameworks); 4) be set in a health or human service context (described below); 5) test at least one quantitative outcome that could be mapped to the RE-AIM evaluation framework [26, 27, 28]; and 6) evaluate the impact of an implementation strategy that could be classified using the ERIC taxonomy [2, 32]. We defined health and human service settings broadly, including inpatient and outpatient healthcare settings, specialty clinics, mental health treatment centers, long-term care facilities, group homes, correctional facilities, child welfare or youth services, aging services, and schools, and required that the focus be on a health outcome. We excluded hybrid type I trials that primarily focused on establishing EBP effectiveness, qualitative studies, studies that described implementation barriers and facilitators without assessing implementation strategy impact on an outcome, and studies not meeting the standardized rigor criteria defined below.

Information sources

Our three-pronged search strategy included searching academic databases (CINAHL, PubMed, and Web of Science, chosen for replicability and transparency), seeking recommendations from expert implementation scientists, and assessing existing, relevant systematic reviews and meta-analyses.

Search strategy

Search terms included “implementation strateg*” OR “implementation intervention*” OR “implementation bundl*” OR “implementation support*.” The search, conducted on September 20, 2022, was limited to English language and publication between 2010 and 2022, similar to other recent implementation science reviews [ 22 ]. This timeframe was selected to coincide with the advent of Implementation Science and when the term “implementation strategy” became conventionally used [ 2 , 4 , 33 ]. A full search strategy can be found in Additional File 2.
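
As a rough illustration of how such a query can be assembled programmatically, the sketch below joins the four truncated phrases into a single boolean string and appends the review's date and language limits; the field tags follow PubMed-style syntax and are an assumption here, not the authors' exact strings (those are given in Additional File 2).

```python
# Illustrative sketch only: composing the boolean search described above for a
# PubMed-style interface. Field tags and filter syntax are assumptions.
terms = [
    '"implementation strateg*"',
    '"implementation intervention*"',
    '"implementation bundl*"',
    '"implementation support*"',
]
boolean_block = " OR ".join(terms)

# Limit to English-language records published 2010 through the search date.
query = (
    f"({boolean_block}) "
    'AND ("2010/01/01"[Date - Publication] : "2022/09/20"[Date - Publication]) '
    "AND english[Language]"
)
print(query)
```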

Title and abstract screening process

Each study's title and abstract were read by two reviewers, who dichotomously scored studies on each of the six eligibility criteria described above as yes=1 or no=0, resulting in a total score ranging from 0 to 6. Abstracts receiving a six from both reviewers were included in the full text review. Those with only one score of six were adjudicated by a senior member of the team (MJC, SSR, DEG). The study team held weekly meetings to troubleshoot and resolve any ongoing issues noted through the abstract screening process.
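
The dual-reviewer decision rule can be made concrete with a short sketch; the function below assumes hypothetical inputs (a list of six 0/1 criterion scores per reviewer) and simply encodes the rule described above.

```python
def screening_decision(reviewer_a, reviewer_b):
    """Each argument is a list of six 0/1 eligibility-criterion scores."""
    total_a, total_b = sum(reviewer_a), sum(reviewer_b)
    if total_a == 6 and total_b == 6:
        return "include in full-text review"
    if total_a == 6 or total_b == 6:
        return "adjudicate by senior team member"
    return "exclude"


# Example: the reviewers disagree on one criterion, so the abstract is adjudicated.
print(screening_decision([1, 1, 1, 1, 1, 1], [1, 1, 1, 0, 1, 1]))
```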

Full text screening

During the full text screening process, we reviewed, in pairs, each article that had progressed through abstract screening. Conflicts between reviewers were adjudicated by a senior member of the team for a final inclusion decision (MJC, SSR, DEG).

Review of study rigor

After reviewing published rigor screening tools [ 34 , 35 , 36 ], we developed an assessment of study rigor that was appropriate for the broad range of reviewed implementation studies. Reviewers evaluated studies on the following: 1) presence of a concurrent comparison or control group (=2 for traditional randomized controlled trial or stepped wedge cluster randomized trial and =1 for pseudo-randomized and other studies with concurrent control); 2) EBP standardization by protocol or manual (=1 if present); 3) EBP fidelity tracking (=1 if present); 4) implementation strategy standardization by operational description, standard training, or manual (=1 if present); 5) length of follow-up from full implementation of intervention (=2 for twelve months or longer, =1 for six to eleven months, or =0 for less than six months); and 6) number of sites (=1 for more than one site). Rigor scores ranged from 0 to 8, with 8 indicating the most rigorous. Articles were included if they 1) included a concurrent control group, 2) had an experimental design, and 3) received a score of 7 or 8 from two independent reviewers.
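
To make the rubric easier to follow, the sketch below expresses it as a scoring function; the dictionary keys are hypothetical, but the point values mirror the six elements above, and the final comment notes the additional inclusion requirements.

```python
def rigor_score(study):
    """Compute the 0-8 rigor score from the six elements described above.
    The dict keys are hypothetical; point values follow the text."""
    score = 0
    # 1) Concurrent comparison or control group.
    if study["design"] in ("randomized_controlled_trial", "stepped_wedge"):
        score += 2
    elif study["design"] == "pseudo_randomized_with_concurrent_control":
        score += 1
    # 2) EBP standardized by protocol or manual.
    score += 1 if study["ebp_standardized"] else 0
    # 3) EBP fidelity tracked.
    score += 1 if study["ebp_fidelity_tracked"] else 0
    # 4) Implementation strategy standardized (description, training, or manual).
    score += 1 if study["strategy_standardized"] else 0
    # 5) Length of follow-up from full implementation.
    if study["followup_months"] >= 12:
        score += 2
    elif study["followup_months"] >= 6:
        score += 1
    # 6) More than one site.
    score += 1 if study["n_sites"] > 1 else 0
    return score


example = {
    "design": "randomized_controlled_trial",
    "ebp_standardized": True,
    "ebp_fidelity_tracked": True,
    "strategy_standardized": True,
    "followup_months": 12,
    "n_sites": 29,
}
# Articles additionally required a concurrent control, an experimental design,
# and scores of 7 or 8 from two independent reviewers.
print(rigor_score(example))  # -> 8
```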

Outside expert consultation

We contacted 37 global implementation science experts who were recognized by our study team as leaders in the field or who were commonly represented among first or senior authors in the included abstracts. We asked each expert for recommendations of publications meeting study inclusion criteria (i.e., quantitatively evaluating the effectiveness of an implementation strategy). Recommendations were recorded and compared to the full abstract list.

Systematic reviews

Eighty-four systematic reviews were identified through the initial search strategy (See Additional File 3). Systematic reviews that examined the effectiveness of implementation strategies were reviewed in pairs for studies that were not found through our initial literature search.

Data abstraction and coding

Data from the full text review were abstracted in pairs, with conflicts resolved by senior team members (DEG, MJC) using a standard Qualtrics abstraction form. The form captured the setting, number of sites and participants studied, evidence-based practice/program of focus, outcomes assessed (based on RE-AIM), strategies used in each study arm, whether the study took place in the U.S. or outside of the U.S., and the findings (i.e., was there significant improvement in the outcome(s)?). We coded implementation strategies used in the Control and Experimental Arms. We defined the Control Arm as receiving the lowest number of strategies (which could mean zero strategies or care as usual) and the Experimental Arm as the most intensive arm (i.e., receiving the highest number of strategies). When studies included multiple Experimental Arms, the Experimental Arm with the least intensive implementation strategy(ies) was classified as “Control” and the Experimental Arm with the most intensive implementation strategy(ies) was classified as the “Experimental” Arm.
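
The arm-classification convention amounts to ranking arms by how many strategies they received; the sketch below illustrates this with made-up data, treating the arm with the fewest strategies as the Control Arm, the arm with the most as the Experimental Arm, and the set difference between them as the "tested" strategies described below.

```python
# Hypothetical study with two arms and the strategies reported in each.
study_arms = {
    "arm_1": {"Distribute Educational Materials"},
    "arm_2": {
        "Distribute Educational Materials",
        "Audit and Provide Feedback",
        "External Facilitation",
    },
}

control = min(study_arms, key=lambda arm: len(study_arms[arm]))
experimental = max(study_arms, key=lambda arm: len(study_arms[arm]))
tested = study_arms[experimental] - study_arms[control]

print(f"Control: {control}; Experimental: {experimental}")
print(f"Tested strategies: {sorted(tested)}")
```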

Implementation strategies were classified using standard definitions (MJC, SSR, DEG), based on minor modifications to the ERIC taxonomy [ 2 , 3 , 4 ]. Modifications resulted in 70 named strategies and were made to decrease redundancy and improve clarity. These modifications were based on input from experts, cognitive interview data, and team consensus [ 37 ] (See Additional File 4). Outcomes were then coded into RE-AIM outcome domains following best practices as recommended by framework experts [ 26 , 27 , 28 ]. We coded the RE-AIM domain of Effectiveness as either an assessment of the effectiveness of the EBP or the implementation strategy. We did not assess implementation strategy fidelity or effects on health disparities as these are recently adopted reporting standards [ 27 , 28 ] and not yet widely implemented in current publications. Further, we did not include implementation costs as an outcome because reporting guidelines have not been standardized [ 38 , 39 ].

Assessment and minimization of bias

Assessment and minimization of bias is an important component of high-quality systematic reviews. The Cochrane Collaboration guidance for conducting high-quality systematic reviews recommends including a specific assessment of bias for individual studies by assessing the domains of randomization, deviations of intended intervention, missing data, measurement of the outcome, and selection of the reported results (e.g., following a pre-specified analysis plan) [ 40 , 41 ]. One way we addressed bias was by consolidating multiple publications from the same study into a single finding (i.e., N =1), so-as to avoid inflating estimates due to multiple publications on different aspects of a single trial. We also included high-quality studies only, as described above. However, it was not feasible to consistently apply an assessment of bias tool due to implementation science’s broad scope and the heterogeneity of study design, context, outcomes, and variable measurement, etc. For example, most implementation studies reviewed had many outcomes across the RE-AIM framework, with no one outcome designated as primary, precluding assignment of a single score across studies.

We used descriptive statistics to present the distribution of health or healthcare area, settings, outcomes, and the median number of included patients and sites per study, overall and by country (classified as U.S. vs. non-U.S.). Implementation strategies were described individually, using descriptive statistics to summarize the frequency of strategy use “overall” (in any study arm), and the mean number of strategies reported in the Control and Experimental Arms. We additionally described the strategies that were only in the experimental (and not control) arm, defining these as strategies that were “tested” and may have accounted for differences in outcomes between arms.

We described frequencies of pair-wise combinations of implementation strategies in the Experimental Arm. To assess the strength of the evidence supporting implementation strategies that were used in the Experimental Arm, study outcomes were categorized by RE-AIM and coded based on whether the association between use of the strategies resulted in a significantly positive effect (yes=1; no=0). We then created an indicator variable if at least one RE-AIM outcome in the study was significantly positive (yes=1; no=0). We plotted strategies on a graph with quadrants based on the combination of median number of studies in which a strategy appears and the median percent of studies in which a strategy was associated with at least one positive RE-AIM outcome. The upper right quadrant—higher number of studies overall and higher percent of studies with a significant RE-AIM outcome—represents a superior level of evidence. For implementation strategies in the upper right quadrant, we describe each RE-AIM outcome and the proportion of studies which have a significant outcome.
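
The quadrant assignment can be illustrated with a brief sketch; the strategy counts and percentages below are placeholders rather than the review's data, and the cutoffs are simply the medians of whatever summary is supplied.

```python
from statistics import median

# Placeholder summary: strategy -> (number of Experimental-Arm studies,
# proportion of those studies with >=1 significantly positive RE-AIM outcome).
# These values are illustrative, not the review's actual data.
strategy_summary = {
    "Audit and Provide Feedback": (76, 0.80),
    "External Facilitation": (59, 0.78),
    "Identify and Prepare Champions": (30, 0.77),
    "Use Mass Media": (3, 0.33),
}

n_median = median(n for n, _ in strategy_summary.values())
pct_median = median(p for _, p in strategy_summary.values())

for strategy, (n, pct) in strategy_summary.items():
    stronger = n > n_median and pct > pct_median
    label = "upper-right (stronger evidence)" if stronger else "other quadrant"
    print(f"{strategy}: {label}")
```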

Search results

We identified 14,646 articles through the initial literature search, 17 articles through expert recommendation (three of which were not included in the initial search), and 1,942 articles through reviewing prior systematic reviews (Fig. 1). After removing duplicates, 9,399 articles were included in the initial abstract screening. Of those, 48% (n=4,075) of abstracts were reviewed in pairs for inclusion. Articles with a score of five or six were reviewed a second time (n=2,859). One quarter of abstracts that scored lower than five were reviewed a second time at random. We screened the full text of 1,426 articles in pairs. Common reasons for exclusion were 1) study rigor, including no clear delineation between the EBP and the implementation strategy, 2) not testing an implementation strategy, and 3) article types that did not meet inclusion criteria (e.g., commentaries and protocols). Six hundred seventeen articles were reviewed for study rigor, with 385 excluded for reasons related to study design and rigor and 86 removed for other reasons (e.g., not a research article). Among the three additional expert-recommended articles, one met inclusion criteria and was added to the analysis. The final number of studies abstracted was 129, representing 143 publications.

Fig. 1 Expanded PRISMA flow diagram, describing each step in the review and abstraction process for the systematic review.

Descriptive results

Of the 129 included studies (Table 1; see also Additional File 5 for a summary of included studies), 103 (79%) were conducted in a healthcare setting. EBP healthcare settings varied and included primary care (n=46; 36%), specialty care (n=27; 21%), mental health (n=11; 9%), and public health (n=30; 23%), with 64 studies (50%) occurring in an outpatient healthcare setting. Studies included a median of 29 sites and a median target population of 1,419 (e.g., patients or students). The number of strategies varied widely across studies, with Control Arms averaging approximately two strategies (range = 0-20, including studies with no strategy in the comparison group) and Experimental Arms averaging eight strategies (range = 1-21). Non-US studies (n=73) included more sites and larger target populations on average, with a median of 32 sites and 1,531 patients assessed per study.

Organized by RE-AIM, the most evaluated outcomes were Effectiveness ( n = 82, 64%) and Implementation ( n = 73, 56%); followed by Maintenance ( n =40; 31%), Adoption ( n =33; 26%), and Reach ( n =31; 24%). Most studies ( n = 98, 76%) reported at least one significantly positive outcome. Adoption and Implementation outcomes showed positive change in three-quarters of studies ( n =78), while Reach ( n =18; 58%), Effectiveness ( n =44; 54%), and Maintenance ( n =23; 58%) outcomes evidenced positive change in approximately half of studies.

The following describes the results for each research question.

Table 2 shows the frequency of studies within which an implementation strategy was used in the Control Arm, Experimental Arm(s), and tested strategies (those used exclusively in the Experimental Arm) grouped by strategy type, as specified by previous ERIC reports [ 2 , 6 ].

Control arm

In about half the studies (53%; n =69), the Control Arms were “active controls” that included at least one strategy, with an average of 1.64 (and up to 20) strategies reported in control arms. The two most common strategies used in Control Arms were: Distribute Educational Materials ( n =52) and Conduct Educational Meetings ( n =30).

Experimental arm

Experimental conditions included an average of 8.33 implementation strategies per study (Range = 1-21). Figure 2 shows a heat map of the strategies that were used in the Experimental Arms in each study. The most common strategies in the Experimental Arm were Distribute Educational Materials ( n =99), Conduct Educational Meetings ( n =96), Audit and Provide Feedback ( n =76), and External Facilitation ( n =59).

Fig. 2 Implementation strategies used in the Experimental Arm of included studies. Explore more here: https://public.tableau.com/views/Figure2_16947070561090/Figure2?:language=en-US&:display_count=n&:origin=viz_share_link

Tested strategies

The average number of implementation strategies that were included in the Experimental Arm only (and not in the Control Arm) was 6.73 (range = 0-20; see Footnote 2). Overall, the top 10% of tested strategies included Conduct Educational Meetings (n=68), Audit and Provide Feedback (n=63), External Facilitation (n=54), Distribute Educational Materials (n=49), Tailor Strategies (n=41), Assess for Readiness and Identify Barriers and Facilitators (n=38), and Organize Clinician Implementation Team Meetings (n=37). Few studies tested a single strategy (n=9); these strategies included Audit and Provide Feedback, Conduct Educational Meetings, Conduct Ongoing Training, Create a Learning Collaborative, External Facilitation (n=2), Facilitate Relay of Clinical Data to Providers, Prepare Patients/Consumers to be Active Participants, and Use Other Payment Schemes. Three implementation strategies were included in the Control or Experimental Arms but were never tested: Use Mass Media, Stage Implementation Scale Up, and Fund and Contract for the Clinical Innovation.

Table 3  shows the five most used strategies in Experimental Arms with their top ten most frequent pairings, excluding Distribute Educational Materials and Conduct Educational Meetings, as these strategies were included in almost all Experimental and half of Control Arms. The five most used strategies in the Experimental Arm included Audit and Provide Feedback ( n =76), External Facilitation ( n =59), Tailor Strategies ( n =43), Assess for Readiness and Identify Barriers and Facilitators ( n =43), and Organize Implementation Teams ( n =42).

Strategies frequently paired with these five strategies included two educational strategies: Distribute Educational Materials and Conduct Educational Meetings. Other commonly paired strategies included Develop a Formal Implementation Blueprint, Promote Adaptability, Conduct Ongoing Training, Purposefully Reexamine the Implementation, and Develop and Implement Tools for Quality Monitoring.

We classified the strength of evidence for each strategy by evaluating both the number of studies in which each strategy appeared in the Experimental Arm and the percentage of times there was at least one significantly positive RE-AIM outcome. Using these factors, Fig. 3 shows the number of studies in which individual strategies were evaluated (on the y axis) compared to the percentage of times that studies including those strategies had at least one positive outcome (on the x axis). Due to the non-normal distribution of both factors, we used the median (rather than the mean) to create four quadrants. Strategies in the lower left quadrant were tested in fewer than the median number of studies (8.5) and were less frequently associated with a significant RE-AIM outcome (75%). The upper right quadrant included strategies that occurred in more than the median number of studies (8.5) and had more than the median percent of studies with a significant RE-AIM outcome (75%); thus those 19 strategies were viewed as having stronger evidence. Of those 19 implementation strategies, Conduct Educational Meetings, Distribute Educational Materials, External Facilitation, and Audit and Provide Feedback continued to occur frequently, appearing in 59-99 studies.

Fig. 3 Experimental Arm implementation strategies with a significant RE-AIM outcome. Explore more here: https://public.tableau.com/views/Figure3_16947017936500/Figure3?:language=en-US&publish=yes&:display_count=n&:origin=viz_share_link

Figure 4 graphically illustrates the proportion of significant outcomes for each RE-AIM outcome for the 19 commonly used and evidence-based implementation strategies in the upper right quadrant. These findings again show the widespread use of Conduct Educational Meetings and Distribute Educational Materials. Implementation and Effectiveness outcomes were assessed most frequently, with Implementation being the most commonly reported significantly positive outcome.

Fig. 4 RE-AIM outcomes for the 19 top-right-quadrant implementation strategies. The y-axis is the number of studies, and the x-axis is a stacked bar chart for each RE-AIM outcome (R=Reach, E=Effectiveness, A=Adoption, I=Implementation, M=Maintenance). Blue denotes studies with at least one significant RE-AIM outcome; light blue denotes studies that used the given implementation strategy and did not have a significant RE-AIM outcome. Explore more here: https://public.tableau.com/views/Figure4_16947017112150/Figure4?:language=en-US&publish=yes&:display_count=n&:origin=viz_share_link

Discussion

This systematic review identified 129 experimental studies examining the effectiveness of implementation strategies across a broad range of health and human service settings. Overall, we found that evidence is lacking for most ERIC implementation strategies, that most studies employed combinations of strategies, and that implementation outcomes, categorized by RE-AIM dimensions, have not been universally defined or applied. Accordingly, other researchers have described the need for universal outcome definitions and descriptions across implementation research studies [28, 42]. Our findings have important implications not only for the current state of the field but also for creating guidance to help investigators determine which strategies to examine and in which contexts.

The four most evaluated strategies were Distribute Educational Materials, Conduct Educational Meetings, External Facilitation, and Audit and Provide Feedback. Conducting Educational Meetings and Distributing Educational Materials were surprisingly the most common. This may reflect the fact that education strategies are generally considered to be “necessary but not sufficient” for successful implementation [ 43 , 44 ]. Because education is often embedded in interventions, it is critical to define the boundary between the innovation and the implementation strategies used to support the innovation. Further specification as to when these strategies are EBP core components or implementation strategies (e.g., booster trainings or remediation) is needed [ 45 , 46 ].

We identified 19 implementation strategies that were tested in at least 8 studies (more than the median) and were associated with positive results at least 75% of the time. These strategies can be further categorized as being used early (pre-implementation) versus later in implementation. Among preparatory or pre-implementation activities, strategies that had strong evidence included educational activities (Meetings, Materials, Outreach Visits, Train for Leadership, Use Train the Trainer Strategies) and site diagnostic activities (Assess for Readiness, Identify Barriers and Facilitators, Conduct Local Needs Assessment, Identify and Prepare Champions, and Assess and Redesign Workflows). Strategies that target the implementation phase include those that provide coaching and support (External and Internal Facilitation), involve additional key partners (Intervene with Patients to Enhance Uptake and Adherence), and engage in quality improvement activities (Audit and Provide Feedback, Facilitate the Relay of Clinical Data to Providers, Purposefully Reexamine the Implementation, Conduct Cyclical Small Tests of Change, Develop and Implement Tools for Quality Monitoring).

There were many ERIC strategies that were not represented in the reviewed studies, specifically the financial and policy strategies. Ten strategies were not used in any studies: Alter Patient/Consumer Fees, Change Liability Laws, Change Service Sites, Develop Disincentives, Develop Resource Sharing Agreements, Identify Early Adopters, Make Billing Easier, Start a Dissemination Organization, Use Capitated Payments, and Use Data Experts. One limitation of this investigation was that not all individual strategies or combinations were investigated. Reasons for the absence of these strategies in our review may include challenges with testing certain strategies experimentally (e.g., changing liability laws), limitations in our search terms, and the relative paucity of implementation strategy trials compared to clinical trials. Many "untested" strategies require large-scale structural changes with leadership support (see [47] for a policy experiment example). Recent preliminary work has assessed the feasibility of applying policy strategies and described the challenges with doing so [48, 49, 50]. While not impossible in large systems like the VA (for example, the randomized evaluation of the VA Stratification Tool for Opioid Risk Management), their large size, structure, and organizational imperatives make these initiatives challenging to evaluate experimentally. Likewise, the absence of these ten strategies may have been the result of our inclusion criteria, which required an experimental design. Thus, creative study designs may be needed to test high-level policy or financial strategies experimentally.

Some strategies that were likely under-represented in our search strategy included electronic medical record reminders and clinical decision support tools and systems. These are often considered “interventions” when used by clinical trialists and may not be indexed as studies involving ‘implementation strategies’ (these tools have been reviewed elsewhere [ 51 , 52 , 53 ]). Thus, strategies that are also considered interventions in the literature (e.g., education interventions) were not sought or captured. Our findings do not imply that these strategies are ineffective, rather that more study is needed. Consistent with prior investigations [ 54 ], few studies meeting inclusion criteria tested financial strategies. Accordingly, there are increasing calls to track and monitor the effects of financial strategies within implementation science to understand their effectiveness in practice [ 55 , 56 ]. However, experts have noted that the study of financial strategies can be a challenge given that they are typically implemented at the system-level and necessitate research designs for studying policy-effects (e.g., quasi-experimental methods, systems-science modeling methods) [ 57 ]. Yet, there have been some recent efforts to use financial strategies to support EBPs that appear promising [ 58 ] and could be a model for the field moving forward.

The relationship between the number of strategies used and improved outcomes has been described inconsistently in the literature. While some studies have found improved outcomes with a bundle of strategies that were uniquely combined or a standardized package of strategies (e.g., Replicating Effective Programs [59, 60] and Getting To Outcomes [61, 62]), others have found that "more is not always better" [63, 64, 65]. For example, Rogal and colleagues documented that VA hospitals implementing a new evidence-based hepatitis C treatment chose >20 strategies, when multiple years of data linking strategies to outcomes showed that 1-3 specific strategies would have yielded the same outcome [39]. Considering that most studies employed multiple or multifaceted strategies, it seems that there is a benefit to using a targeted bundle of strategies that is purposefully aligned with site/clinic/population norms, rather than simply adding more strategies [66].

It is difficult to assess the effectiveness of any one implementation strategy in bundles where multiple strategies are used simultaneously. Even a ‘single’ strategy like External Facilitation is, in actuality, a bundle of narrowly constructed strategies (e.g., Conduct Educational Meetings, Identify and Prepare Champions, and Develop a Formal Implementation Blueprint). Thus, studying External Facilitation does not allow for a test of the individual strategies that comprise it, potentially masking the effectiveness of any individual strategy. While we cannot easily disaggregate the effects of multifaceted strategies, doing so may not yield meaningful results. Because strategies often synergize, disaggregated results could either underestimate the true impact of individual strategies or conversely, actually undermine their effectiveness (i.e., when their effectiveness comes from their combination with other strategies). The complexity of health and human service settings, imperative to improve public health outcomes, and engagement with community partners often requires the use of multiple strategies simultaneously. Therefore, the need to improve real-world implementation may outweigh the theoretical need to identify individual strategy effectiveness. In situations where it would be useful to isolate the impact of single strategies, we suggest that the same methods for documenting and analyzing the critical components (or core functions) of complex interventions [ 67 , 68 , 69 , 70 ] may help to identify core components of multifaceted implementation strategies [ 71 , 72 , 73 , 74 ].

In addition, to truly assess the impacts of strategies on outcomes, it may be necessary to track fidelity to implementation strategies (not just the EBPs they support). While this can be challenging, without some degree of tracking and fidelity checks, one cannot determine whether a strategy’s apparent failure to work was because it 1) was ineffective or 2) was not applied well. To facilitate this tracking there are pragmatic tools to support researchers. For example, the Longitudinal Implementation Strategy Tracking System (LISTS) offers a pragmatic and feasible means to assess fidelity to and adaptations of strategies [ 75 ].

Implications for implementation science: four recommendations

Based on our findings, we offer four recommended “best practices” for implementation studies.

1. Prespecify strategies using standard nomenclature. This study reaffirmed the need to apply not only a standard naming convention (e.g., ERIC) but also a standard reporting structure for implementation strategies. While reporting systems like those by Proctor [1] or Pinnock [75] would optimize learning across studies, few manuscripts specify strategies as recommended [76, 77]. Pre-specification allows planners and evaluators to assess the feasibility and acceptability of strategies with partners and community members [24, 78, 79] and allows evaluators and implementers to monitor and measure the fidelity, dose, and adaptations to strategies delivered over the course of implementation [27]. In turn, these data can be used to assess the costs, analyze their effectiveness [38, 80, 81], and ensure more accurate reporting [82, 83, 84, 85]. This specification should include, among other data, the intensity, stage of implementation, and justification for the selection. Information regarding why strategies were selected for specific settings would further the field and be of great use to practitioners [63, 65, 69, 79, 86].

2. Ensure that standards for measuring and reporting implementation outcomes are consistently applied and account for the complexity of implementation studies. Part of improving standardized reporting must include clearly defining outcomes and linking each outcome to particular implementation strategies. It was challenging in the present review to disentangle the impact of the intervention(s) (i.e., the EBP) versus the impact of the implementation strategy(ies) for each RE-AIM dimension. For example, fidelity was often reported for the EBP but not for the implementation strategies. Similarly, Reach and Adoption of the intervention would be reported for the Experimental Arm but not for the Control Arm, prohibiting statistical comparisons of the relative impact of strategies on the EBP between study arms. Moreover, many studies evaluated numerous outcomes, risking data dredging. Further, the significant heterogeneity in the ways in which implementation outcomes are operationalized and reported is a substantial barrier to conducting large-scale meta-analytic approaches to synthesizing evidence for implementation strategies [67]. The field could look to others in the social and health sciences for examples of how to test, validate, and promote a common set of outcome measures to aid in bringing consistency across studies and real-world practice (e.g., the NIH-funded Patient-Reported Outcomes Measurement Information System [PROMIS], https://www.healthmeasures.net/explore-measurement-systems/promis).

3. Develop infrastructure to learn cross-study lessons in implementation science. Data repositories, like those developed by NCI for rare diseases, the U.S. HIV Implementation Science Coordination Initiative [87], and the Behavior Change Technique Ontology [88], could allow implementation scientists to report their findings in a more standardized manner, which would promote ease of communication and contextualization of findings across studies. For example, the HIV Implementation Science Coordination Initiative requested that all implementation projects use common frameworks, developed user-friendly databases to enable practitioners to match strategies to determinants, and developed a dashboard of studies that assessed implementation determinants [89, 90, 91, 92, 93, 94].

4. Develop and apply methods to rigorously study common strategies and bundles. These findings support prior recommendations for improved empirical rigor in implementation studies [46, 95]. Many studies were excluded from our review based on not meeting methodological rigor standards. Understanding the effectiveness of discrete strategies deployed alone or in combination requires reliable and low-burden tracking methods to collect information about strategy use and outcomes. For example, frameworks like the Implementation Replication Framework [96] could help interpret findings across studies using the same strategy bundle. Other tracking approaches may leverage technology (e.g., cell phones, tablets, EMR templates) [78, 97] or find novel, pragmatic approaches to collect recommended strategy specifications over time (e.g., dose, deliverer, and mechanism) [1, 9, 27, 98, 99]. Rigorous reporting standards could inform more robust analyses and conclusions (e.g., moving toward the goal of understanding causality, microcosting efforts) [24, 38, 100, 101]. Such detailed tracking is also required to understand how site-level factors moderate implementation strategy effects [102]. In some cases, adaptive trial designs like sequential multiple assignment randomized trials (SMARTs) and just-in-time adaptive interventions (JITAIs) can be helpful for planning strategy escalation.

Limitations

Despite the strengths of this review, there were certain notable limitations. For one, we only included experimental studies, omitting many informative observational investigations that cover the range of implementation strategies. Second, our study period was centered on the creation of the journal Implementation Science and not on the standardization and operationalization of implementation strategies in the publication of the ERIC taxonomy (which came later). This, in conjunction with latency in reporting study results and funding cycles, means that the employed taxonomy was not applied in earlier studies. To address this limitation, we retroactively mapped strategies to ERIC, but it is possible that some studies were missed. Additionally, indexing approaches used by academic databases may have missed relevant studies. We addressed this particular concern by reviewing other systematic reviews of implementation strategies and soliciting recommendations from global implementation science experts.

Another potential limitation comes from the ERIC taxonomy itself: strategy listings like ERIC are only useful when they are widely adopted and used in conjunction with guidelines for specifying and reporting strategies [1] in protocol and outcome papers. Although the ERIC paper has been widely cited (over three thousand times, accessed about 186 thousand times), it is still not universally applied, making tracking the impact of specific strategies more difficult. However, our experience with this review seemed to suggest that ERIC's use was increasing over time. Also, some have commented that ERIC strategies can be unclear and are missing key domains. Thus, researchers are making definitions clearer for lay users [37, 103], increasing the number of discrete strategies for specific domains like HIV treatment, acknowledging strategies for new functions (e.g., de-implementation [104], local capacity building), accounting for phases of implementation (dissemination, sustainment [13], scale-up), addressing settings [12, 20] and actors' roles in the process, and making mechanisms of change for selecting strategies more user-friendly through searchable databases [9, 10, 54, 73, 104, 105, 106]. In sum, we found the utility of the ERIC taxonomy to outweigh any of the taxonomy's current limitations.

As with all reviews, the search terms influenced our findings. As such, the broad terms for implementation strategies (e.g., “evidence-based interventions”[ 7 ] or “behavior change techniques” [ 107 ]) may have led to inadvertent omissions of studies of specific strategies. For example, the search terms may not have captured tests of policies, financial strategies, community health promotion initiatives, or electronic medical record reminders, due to differences in terminology used in corresponding subfields of research (e.g., health economics, business, health information technology, and health policy). To manage this, we asked experts to inform us about any studies that they would include and cross-checked their lists with what was identified through our search terms, which yielded very few additional studies. We included standard coding using the ERIC taxonomy, which was a strength, but future work should consider including the additional strategies that have been recommended to augment ERIC, around sustainment [ 13 , 79 , 106 , 108 ], community and public health research [ 12 , 109 , 110 , 111 ], consumer or service user engagement [ 112 ], de-implementation [ 104 , 113 , 114 , 115 , 116 , 117 ] and related terms [ 118 ].

We were unable to assess the bias of studies due to non-standard reporting across the papers and the heterogeneity of study designs, measurement of implementation strategies and outcomes, and analytic approaches. This could have resulted in over- or underestimating the results of our synthesis. We addressed this limitation by being cautious in our reporting of findings, specifically in identifying “effective” implementation strategies. Further, we were not able to gather primary data to evaluate effect sizes across studies in order to systematically evaluate bias, which would be fruitful for future study.

Conclusions

This novel review of 129 studies summarized the body of evidence supporting the use of ERIC-defined implementation strategies to improve health or healthcare. We identified commonly occurring implementation strategies, frequently used bundles, and the strategies with the highest degree of supportive evidence, while simultaneously identifying gaps in the literature. Additionally, we identified several key areas for future growth and operationalization across the field of implementation science with the goal of improved reporting and assessment of implementation strategies and related outcomes.

Availability of data and materials

All data for this study are included in this published article and its supplementary information files.

Notes

1. We modestly revised the following research questions from our PROSPERO registration after reading the articles and better understanding the nature of the literature: 1) What is the available evidence regarding the effectiveness of implementation strategies in supporting the uptake and sustainment of evidence intended to improve health and healthcare outcomes? 2) What are the current gaps in the literature (i.e., implementation strategies that do not have sufficient evidence of effectiveness) that require further exploration?

2. Tested strategies are those which exist in the Experimental Arm but not in the Control Arm. Comparative effectiveness or time-staggered trials may not have any unique strategies in the Experimental Arm and therefore, in our analysis, would have no tested strategies.

Abbreviations

CDC: Centers for Disease Control
CINAHL: Cumulated Index to Nursing and Allied Health Literature
D&I: Dissemination and Implementation
EBP: Evidence-based practices or programs
ERIC: Expert Recommendations for Implementing Change
MOST: Multiphase Optimization Strategy
NCI: National Cancer Institute
NIH: National Institutes of Health
The Pittsburgh Dissemination and Implementation Science Collaborative
SMART: Sequential Multiple Assignment Randomized Trial
US: United States
VA: Department of Veterans Affairs

Proctor EK, Powell BJ, McMillen JC. Implementation strategies: recommendations for specifying and reporting. Implement Sci. 2013;8:139.

Powell BJ, Waltz TJ, Chinman MJ, Damschroder LJ, Smith JL, Matthieu MM, et al. A refined compilation of implementation strategies: results from the Expert Recommendations for Implementing Change (ERIC) project. Implement Sci. 2015;10:21.

Waltz TJ, Powell BJ, Chinman MJ, Smith JL, Matthieu MM, Proctor EK, et al. Expert recommendations for implementing change (ERIC): protocol for a mixed methods study. Implement Sci IS. 2014;9:39.

Powell BJ, McMillen JC, Proctor EK, Carpenter CR, Griffey RT, Bunger AC, et al. A Compilation of Strategies for Implementing Clinical Innovations in Health and Mental Health. Med Care Res Rev. 2012;69:123–57.

Waltz TJ, Powell BJ, Matthieu MM, Damschroder LJ, Chinman MJ, Smith JL, et al. Use of concept mapping to characterize relationships among implementation strategies and assess their feasibility and importance: results from the Expert Recommendations for Implementing Change (ERIC) study. Implement Sci. 2015;10:109.

Perry CK, Damschroder LJ, Hemler JR, Woodson TT, Ono SS, Cohen DJ. Specifying and comparing implementation strategies across seven large implementation interventions: a practical application of theory. Implement Sci. 2019;14(1):32.

Community Preventive Services Task Force. Community Preventive Services Task Force: All Active Findings June 2023 [Internet]. 2023 [cited 2023 Aug 7]. Available from: https://www.thecommunityguide.org/media/pdf/CPSTF-All-Findings-508.pdf

Solberg LI, Kuzel A, Parchman ML, Shelley DR, Dickinson WP, Walunas TL, et al. A Taxonomy for External Support for Practice Transformation. J Am Board Fam Med JABFM. 2021;34:32–9.

Leeman J, Birken SA, Powell BJ, Rohweder C, Shea CM. Beyond “implementation strategies”: classifying the full range of strategies used in implementation science and practice. Implement Sci. 2017;12:1–9.

Leeman J, Calancie L, Hartman MA, Escoffery CT, Herrmann AK, Tague LE, et al. What strategies are used to build practitioners’ capacity to implement community-based interventions and are they effective?: a systematic review. Implement Sci. 2015;10:1–15.

Nathan N, Shelton RC, Laur CV, Hailemariam M, Hall A. Editorial: Sustaining the implementation of evidence-based interventions in clinical and community settings. Front Health Serv. 2023;3:1176023.

Balis LE, Houghtaling B, Harden SM. Using implementation strategies in community settings: an introduction to the Expert Recommendations for Implementing Change (ERIC) compilation and future directions. Transl Behav Med. 2022;12:965–78.

Nathan N, Powell BJ, Shelton RC, Laur CV, Wolfenden L, Hailemariam M, et al. Do the Expert Recommendations for Implementing Change (ERIC) strategies adequately address sustainment? Front Health Serv. 2022;2:905909.

Ivers N, Jamtvedt G, Flottorp S, Young JM, Odgaard-Jensen J, French SD, et al. Audit and feedback effects on professional practice and healthcare outcomes. Cochrane Database Syst Rev. 2012;6:CD000259.

Google Scholar  

Moore L, Guertin JR, Tardif P-A, Ivers NM, Hoch J, Conombo B, et al. Economic evaluations of audit and feedback interventions: a systematic review. BMJ Qual Saf. 2022;31:754–67.

Sykes MJ, McAnuff J, Kolehmainen N. When is audit and feedback effective in dementia care? A systematic review. Int J Nurs Stud. 2018;79:27–35.

Barnes C, McCrabb S, Stacey F, Nathan N, Yoong SL, Grady A, et al. Improving implementation of school-based healthy eating and physical activity policies, practices, and programs: a systematic review. Transl Behav Med. 2021;11:1365–410.

Tomasone JR, Kauffeldt KD, Chaudhary R, Brouwers MC. Effectiveness of guideline dissemination and implementation strategies on health care professionals’ behaviour and patient outcomes in the cancer care context: a systematic review. Implement Sci. 2020;15:1–18.

Seda V, Moles RJ, Carter SR, Schneider CR. Assessing the comparative effectiveness of implementation strategies for professional services to community pharmacy: A systematic review. Res Soc Adm Pharm. 2022;18:3469–83.

Lovero KL, Kemp CG, Wagenaar BH, Giusto A, Greene MC, Powell BJ, et al. Application of the Expert Recommendations for Implementing Change (ERIC) compilation of strategies to health intervention implementation in low- and middle-income countries: a systematic review. Implement Sci. 2023;18:56.

Chapman A, Rankin NM, Jongebloed H, Yoong SL, White V, Livingston PM, et al. Overcoming challenges in conducting systematic reviews in implementation science: a methods commentary. Syst Rev. 2023;12:1–6.

Article   CAS   Google Scholar  

Proctor EK, Bunger AC, Lengnick-Hall R, Gerke DR, Martin JK, Phillips RJ, et al. Ten years of implementation outcomes research: a scoping review. Implement Sci. 2023;18:1–19.

Michaud TL, Pereira E, Porter G, Golden C, Hill J, Kim J, et al. Scoping review of costs of implementation strategies in community, public health and healthcare settings. BMJ Open. 2022;12:e060785.

Sohn H, Tucker A, Ferguson O, Gomes I, Dowdy D. Costing the implementation of public health interventions in resource-limited settings: a conceptual framework. Implement Sci. 2020;15:1–8.

Peek C, Glasgow RE, Stange KC, Klesges LM, Purcell EP, Kessler RS. The 5 R’s: an emerging bold standard for conducting relevant research in a changing world. Ann Fam Med. 2014;12:447–55.

Article   CAS   PubMed   PubMed Central   Google Scholar  

Glasgow RE, Vogt TM, Boles SM. Evaluating the public health impact of health promotion interventions: the RE-AIM framework. Am J Public Health. 1999;89:1322–7.

Shelton RC, Chambers DA, Glasgow RE. An Extension of RE-AIM to Enhance Sustainability: Addressing Dynamic Context and Promoting Health Equity Over Time. Front Public Health. 2020;8:134.

Holtrop JS, Estabrooks PA, Gaglio B, Harden SM, Kessler RS, King DK, et al. Understanding and applying the RE-AIM framework: Clarifications and resources. J Clin Transl Sci. 2021;5:e126.

Moher D, Shamseer L, Clarke M, Ghersi D, Liberati A, Petticrew M, et al. Preferred reporting items for systematic review and meta-analysis protocols (PRISMA-P) 2015 statement. Syst Rev. 2015;4:1.

Shamseer L, Moher D, Clarke M, Ghersi D, Liberati A, Petticrew M, et al. Preferred reporting items for systematic review and meta-analysis protocols (PRISMA-P) 2015: elaboration and explanation. BMJ. 2015;349:g7647.

Page MJ, McKenzie JE, Bossuyt PM, Boutron I, Hoffmann TC, Mulrow CD, et al. The PRISMA 2020 statement: an updated guideline for reporting systematic reviews. BMJ [Internet]. 2021;372. Available from: https://www.bmj.com/content/372/bmj.n71

Rabin BA, Brownson RC, Haire-Joshu D, Kreuter MW, Weaver NL. A Glossary for Dissemination and Implementation Research in Health. J Public Health Manag Pract. 2008;14:117–23.

Eccles MP, Mittman BS. Welcome to Implementation Science. Implement Sci. 2006;1:1.

Article   PubMed Central   Google Scholar  

Miller WR, Wilbourne PL. Mesa Grande: a methodological analysis of clinical trials of treatments for alcohol use disorders. Addict Abingdon Engl. 2002;97:265–77.

Miller WR, Brown JM, Simpson TL, Handmaker NS, Bien TH, Luckie LF, et al. What works? A methodological analysis of the alcohol treatment outcome literature. Handb Alcohol Treat Approaches Eff Altern 2nd Ed. Needham Heights, MA, US: Allyn & Bacon; 1995:12–44.

Wells S, Tamir O, Gray J, Naidoo D, Bekhit M, Goldmann D. Are quality improvement collaboratives effective? A systematic review BMJ Qual Saf. 2018;27:226–40.

Yakovchenko V, Chinman MJ, Lamorte C, Powell BJ, Waltz TJ, Merante M, et al. Refining Expert Recommendations for Implementing Change (ERIC) strategy surveys using cognitive interviews with frontline providers. Implement Sci Commun. 2023;4:1–14.

Wagner TH, Yoon J, Jacobs JC, So A, Kilbourne AM, Yu W, et al. Estimating costs of an implementation intervention. Med Decis Making. 2020;40:959–67.

Gold HT, McDermott C, Hoomans T, Wagner TH. Cost data in implementation science: categories and approaches to costing. Implement Sci. 2022;17:11.

Boutron I, Page MJ, Higgins JP, Altman DG, Lundh A, Hróbjartsson A. Considering bias and conflicts of interest among the included studies. In: Higgins JPT, Thomas J, Chandler J, Cumpston M, Li T, Page MJ, Welch VA, editors. Cochrane Handbook for Systematic Reviews of Interventions. 2019. https://doi.org/10.1002/9781119536604.ch7 . 

Higgins JP, Savović J, Page MJ, Elbers RG, Sterne J. Assessing risk of bias in a randomized trial. Cochrane Handb Syst Rev Interv. 2019;6:205–28.

Reilly KL, Kennedy S, Porter G, Estabrooks P. Comparing, Contrasting, and Integrating Dissemination and Implementation Outcomes Included in the RE-AIM and Implementation Outcomes Frameworks. Front Public Health [Internet]. 2020 [cited 2024 Apr 24];8. Available from: https://www.frontiersin.org/journals/public-health/articles/ https://doi.org/10.3389/fpubh.2020.00430/full

Grimshaw JM, Thomas RE, MacLennan G, Fraser C, Ramsay CR, Vale L, et al. Effectiveness and efficiency of guideline dissemination and implementation strategies. Health Technol Assess Winch Engl. 2004;8:iii–iv 1-72.

CAS   Google Scholar  

Beidas RS, Kendall PC. Training Therapists in Evidence-Based Practice: A Critical Review of Studies From a Systems-Contextual Perspective. Clin Psychol Publ Div Clin Psychol Am Psychol Assoc. 2010;17:1–30.

Powell BJ, Beidas RS, Lewis CC, Aarons GA, McMillen JC, Proctor EK, et al. Methods to Improve the Selection and Tailoring of Implementation Strategies. J Behav Health Serv Res. 2017;44:177–94.

Powell BJ, Fernandez ME, Williams NJ, Aarons GA, Beidas RS, Lewis CC, et al. Enhancing the Impact of Implementation Strategies in Healthcare: A Research Agenda. Front Public Health [Internet]. 2019 [cited 2021 Mar 31];7. Available from: https://www.frontiersin.org/articles/ https://doi.org/10.3389/fpubh.2019.00003/full

Frakt AB, Prentice JC, Pizer SD, Elwy AR, Garrido MM, Kilbourne AM, et al. Overcoming Challenges to Evidence-Based Policy Development in a Large. Integrated Delivery System Health Serv Res. 2018;53:4789–807.

PubMed   Google Scholar  

Crable EL, Lengnick-Hall R, Stadnick NA, Moullin JC, Aarons GA. Where is “policy” in dissemination and implementation science? Recommendations to advance theories, models, and frameworks: EPIS as a case example. Implement Sci. 2022;17:80.

Crable EL, Grogan CM, Purtle J, Roesch SC, Aarons GA. Tailoring dissemination strategies to increase evidence-informed policymaking for opioid use disorder treatment: study protocol. Implement Sci Commun. 2023;4:16.

Bond GR. Evidence-based policy strategies: A typology. Clin Psychol Sci Pract. 2018;25:e12267.

Loo TS, Davis RB, Lipsitz LA, Irish J, Bates CK, Agarwal K, et al. Electronic Medical Record Reminders and Panel Management to Improve Primary Care of Elderly Patients. Arch Intern Med. 2011;171:1552–8.

Shojania KG, Jennings A, Mayhew A, Ramsay C, Eccles M, Grimshaw J. Effect of point-of-care computer reminders on physician behaviour: a systematic review. CMAJ Can Med Assoc J. 2010;182:E216-25.

Sequist TD, Gandhi TK, Karson AS, Fiskio JM, Bugbee D, Sperling M, et al. A Randomized Trial of Electronic Clinical Reminders to Improve Quality of Care for Diabetes and Coronary Artery Disease. J Am Med Inform Assoc JAMIA. 2005;12:431–7.

Dopp AR, Kerns SEU, Panattoni L, Ringel JS, Eisenberg D, Powell BJ, et al. Translating economic evaluations into financing strategies for implementing evidence-based practices. Implement Sci. 2021;16:1–12.

Dopp AR, Hunter SB, Godley MD, Pham C, Han B, Smart R, et al. Comparing two federal financing strategies on penetration and sustainment of the adolescent community reinforcement approach for substance use disorders: protocol for a mixed-method study. Implement Sci Commun. 2022;3:51.

Proctor EK, Toker E, Tabak R, McKay VR, Hooley C, Evanoff B. Market viability: a neglected concept in implementation science. Implement Sci. 2021;16:98.

Dopp AR, Narcisse M-R, Mundey P, Silovsky JF, Smith AB, Mandell D, et al. A scoping review of strategies for financing the implementation of evidence-based practices in behavioral health systems: State of the literature and future directions. Implement Res Pract. 2020;1:2633489520939980.

PubMed   PubMed Central   Google Scholar  

Dopp AR, Kerns SEU, Panattoni L, Ringel JS, Eisenberg D, Powell BJ, et al. Translating economic evaluations into financing strategies for implementing evidence-based practices. Implement Sci IS. 2021;16:66.

Kilbourne AM, Neumann MS, Pincus HA, Bauer MS, Stall R. Implementing evidence-based interventions in health care:application of the replicating effective programs framework. Implement Sci. 2007;2:42–51.

Kegeles SM, Rebchook GM, Hays RB, Terry MA, O’Donnell L, Leonard NR, et al. From science to application: the development of an intervention package. AIDS Educ Prev Off Publ Int Soc AIDS Educ. 2000;12:62–74.

Wandersman A, Imm P, Chinman M, Kaftarian S. Getting to outcomes: a results-based approach to accountability. Eval Program Plann. 2000;23:389–95.

Wandersman A, Chien VH, Katz J. Toward an evidence-based system for innovation support for implementing innovations with quality: Tools, training, technical assistance, and quality assurance/quality improvement. Am J Community Psychol. 2012;50:445–59.

Rogal SS, Yakovchenko V, Waltz TJ, Powell BJ, Kirchner JE, Proctor EK, et al. The association between implementation strategy use and the uptake of hepatitis C treatment in a national sample. Implement Sci. 2017;12:1–13.

Smith SN, Almirall D, Prenovost K, Liebrecht C, Kyle J, Eisenberg D, et al. Change in patient outcomes after augmenting a low-level implementation strategy in community practices that are slow to adopt a collaborative chronic care model: a cluster randomized implementation trial. Med Care. 2019;57:503.

Rogal SS, Yakovchenko V, Waltz TJ, Powell BJ, Gonzalez R, Park A, et al. Longitudinal assessment of the association between implementation strategy use and the uptake of hepatitis C treatment: Year 2. Implement Sci. 2019;14:1–12.

Harvey G, Kitson A. Translating evidence into healthcare policy and practice: Single versus multi-faceted implementation strategies – is there a simple answer to a complex question? Int J Health Policy Manag. 2015;4:123–6.

Engell T, Stadnick NA, Aarons GA, Barnett ML. Common Elements Approaches to Implementation Research and Practice: Methods and Integration with Intervention Science. Glob Implement Res Appl. 2023;3:1–15.

Michie S, Fixsen D, Grimshaw JM, Eccles MP. Specifying and reporting complex behaviour change interventions: the need for a scientific method. Implement Sci IS. 2009;4:40.

Smith JD, Li DH, Rafferty MR. The Implementation Research Logic Model: a method for planning, executing, reporting, and synthesizing implementation projects. Implement Sci IS. 2020;15:84.

Perez Jolles M, Lengnick-Hall R, Mittman BS. Core Functions and Forms of Complex Health Interventions: a Patient-Centered Medical Home Illustration. JGIM J Gen Intern Med. 2019;34:1032–8.

Schroeck FR, Ould Ismail AA, Haggstrom DA, Sanchez SL, Walker DR, Zubkoff L. Data-driven approach to implementation mapping for the selection of implementation strategies: a case example for risk-aligned bladder cancer surveillance. Implement Sci IS. 2022;17:58.

Frank HE, Kemp J, Benito KG, Freeman JB. Precision Implementation: An Approach to Mechanism Testing in Implementation Research. Adm Policy Ment Health. 2022;49:1084–94.

Lewis CC, Klasnja P, Lyon AR, Powell BJ, Lengnick-Hall R, Buchanan G, et al. The mechanics of implementation strategies and measures: advancing the study of implementation mechanisms. Implement Sci Commun. 2022;3:114.

Geng EH, Baumann AA, Powell BJ. Mechanism mapping to advance research on implementation strategies. PLoS Med. 2022;19:e1003918.

Pinnock H, Barwick M, Carpenter CR, Eldridge S, Grandes G, Griffiths CJ, et al. Standards for Reporting Implementation Studies (StaRI) Statement. BMJ. 2017;356:i6795.

Proctor E, Silmere H, Raghavan R, Hovmand P, Aarons G, Bunger A, et al. Outcomes for Implementation Research: Conceptual Distinctions, Measurement Challenges, and Research Agenda. Adm Policy Ment Health Ment Health Serv Res. 2011;38:65–76.

Hooley C, Amano T, Markovitz L, Yaeger L, Proctor E. Assessing implementation strategy reporting in the mental health literature: a narrative review. Adm Policy Ment Health Ment Health Serv Res. 2020;47:19–35.

Proctor E, Ramsey AT, Saldana L, Maddox TM, Chambers DA, Brownson RC. FAST: a framework to assess speed of translation of health innovations to practice and policy. Glob Implement Res Appl. 2022;2:107–19.

Cullen L, Hanrahan K, Edmonds SW, Reisinger HS, Wagner M. Iowa Implementation for Sustainability Framework. Implement Sci IS. 2022;17:1.

Saldana L, Ritzwoller DP, Campbell M, Block EP. Using economic evaluations in implementation science to increase transparency in costs and outcomes for organizational decision-makers. Implement Sci Commun. 2022;3:40.

Eisman AB, Kilbourne AM, Dopp AR, Saldana L, Eisenberg D. Economic evaluation in implementation science: making the business case for implementation strategies. Psychiatry Res. 2020;283:112433.

Akiba CF, Powell BJ, Pence BW, Nguyen MX, Golin C, Go V. The case for prioritizing implementation strategy fidelity measurement: benefits and challenges. Transl Behav Med. 2022;12:335–42.

Akiba CF, Powell BJ, Pence BW, Muessig K, Golin CE, Go V. “We start where we are”: a qualitative study of barriers and pragmatic solutions to the assessment and reporting of implementation strategy fidelity. Implement Sci Commun. 2022;3:117.

Rudd BN, Davis M, Doupnik S, Ordorica C, Marcus SC, Beidas RS. Implementation strategies used and reported in brief suicide prevention intervention studies. JAMA Psychiatry. 2022;79:829–31.

Painter JT, Raciborski RA, Matthieu MM, Oliver CM, Adkins DA, Garner KK. Engaging stakeholders to retrospectively discern implementation strategies to support program evaluation: Proposed method and case study. Eval Program Plann. 2024;103:102398.

Bunger AC, Powell BJ, Robertson HA, MacDowell H, Birken SA, Shea C. Tracking implementation strategies: a description of a practical approach and early findings. Health Res Policy Syst. 2017;15:1–12.

Mustanski B, Smith JD, Keiser B, Li DH, Benbow N. Supporting the growth of domestic HIV implementation research in the united states through coordination, consultation, and collaboration: how we got here and where we are headed. JAIDS J Acquir Immune Defic Syndr. 2022;90:S1-8.

Marques MM, Wright AJ, Corker E, Johnston M, West R, Hastings J, et al. The Behaviour Change Technique Ontology: Transforming the Behaviour Change Technique Taxonomy v1. Wellcome Open Res. 2023;8:308.

Merle JL, Li D, Keiser B, Zamantakis A, Queiroz A, Gallo CG, et al. Categorising implementation determinants and strategies within the US HIV implementation literature: a systematic review protocol. BMJ Open. 2023;13:e070216.

Glenshaw MT, Gaist P, Wilson A, Cregg RC, Holtz TH, Goodenow MM. Role of NIH in the Ending the HIV Epidemic in the US Initiative: Research Improving Practice. J Acquir Immune Defic Syndr. 1999;2022(90):S9-16.

Purcell DW, Namkung Lee A, Dempsey A, Gordon C. Enhanced Federal Collaborations in Implementation Science and Research of HIV Prevention and Treatment. J Acquir Immune Defic Syndr. 1999;2022(90):S17-22.

Queiroz A, Mongrella M, Keiser B, Li DH, Benbow N, Mustanski B. Profile of the Portfolio of NIH-Funded HIV Implementation Research Projects to Inform Ending the HIV Epidemic Strategies. J Acquir Immune Defic Syndr. 1999;2022(90):S23-31.

Zamantakis A, Li DH, Benbow N, Smith JD, Mustanski B. Determinants of Pre-exposure Prophylaxis (PrEP) Implementation in Transgender Populations: A Qualitative Scoping Review. AIDS Behav. 2023;27:1600–18.

Li DH, Benbow N, Keiser B, Mongrella M, Ortiz K, Villamar J, et al. Determinants of Implementation for HIV Pre-exposure Prophylaxis Based on an Updated Consolidated Framework for Implementation Research: A Systematic Review. J Acquir Immune Defic Syndr. 1999;2022(90):S235-46.

Chambers DA, Emmons KM. Navigating the field of implementation science towards maturity: challenges and opportunities. Implement Sci. 2024;19:26, s13012-024-01352–0.

Chinman M, Acosta J, Ebener P, Shearer A. “What we have here, is a failure to [replicate]”: Ways to solve a replication crisis in implementation science. Prev Sci. 2022;23:739–50.

Chambers DA, Glasgow RE, Stange KC. The dynamic sustainability framework: addressing the paradox of sustainment amid ongoing change. Implement Sci. 2013;8:117.

Lengnick-Hall R, Gerke DR, Proctor EK, Bunger AC, Phillips RJ, Martin JK, et al. Six practical recommendations for improved implementation outcomes reporting. Implement Sci. 2022;17:16.

Miller CJ, Barnett ML, Baumann AA, Gutner CA, Wiltsey-Stirman S. The FRAME-IS: a framework for documenting modifications to implementation strategies in healthcare. Implement Sci IS. 2021;16:36.

Xu X, Lazar CM, Ruger JP. Micro-costing in health and medicine: a critical appraisal. Health Econ Rev. 2021;11:1.

Barnett ML, Dopp AR, Klein C, Ettner SL, Powell BJ, Saldana L. Collaborating with health economists to advance implementation science: a qualitative study. Implement Sci Commun. 2020;1:82.

Lengnick-Hall R, Williams NJ, Ehrhart MG, Willging CE, Bunger AC, Beidas RS, et al. Eight characteristics of rigorous multilevel implementation research: a step-by-step guide. Implement Sci. 2023;18:52.

Riley-Gibson E, Hall A, Shoesmith A, Wolfenden L, Shelton RC, Doherty E, et al. A systematic review to determine the effect of strategies to sustain chronic disease prevention interventions in clinical and community settings: study protocol. Res Sq [Internet]. 2023 [cited 2024 Apr 19]; Available from: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10312971/

Ingvarsson S, Hasson H, von Thiele Schwarz U, Nilsen P, Powell BJ, Lindberg C, et al. Strategies for de-implementation of low-value care—a scoping review. Implement Sci IS. 2022;17:73.

Lewis CC, Powell BJ, Brewer SK, Nguyen AM, Schriger SH, Vejnoska SF, et al. Advancing mechanisms of implementation to accelerate sustainable evidence-based practice integration: protocol for generating a research agenda. BMJ Open. 2021;11:e053474.

Hailemariam M, Bustos T, Montgomery B, Barajas R, Evans LB, Drahota A. Evidence-based intervention sustainability strategies: a systematic review. Implement Sci. 2019;14:N.PAG-N.PAG.

Michie S, Atkins L, West R. The behaviour change wheel. Guide Des Interv 1st Ed G B Silverback Publ. 2014;1003:1010.

Birken SA, Haines ER, Hwang S, Chambers DA, Bunger AC, Nilsen P. Advancing understanding and identifying strategies for sustaining evidence-based practices: a review of reviews. Implement Sci IS. 2020;15:88.

Metz A, Jensen T, Farley A, Boaz A, Bartley L, Villodas M. Building trusting relationships to support implementation: A proposed theoretical model. Front Health Serv. 2022;2:894599.

Rabin BA, Cain KL, Watson P, Oswald W, Laurent LC, Meadows AR, et al. Scaling and sustaining COVID-19 vaccination through meaningful community engagement and care coordination for underserved communities: hybrid type 3 effectiveness-implementation sequential multiple assignment randomized trial. Implement Sci IS. 2023;18:28.

Gyamfi J, Iwelunmor J, Patel S, Irazola V, Aifah A, Rakhra A, et al. Implementation outcomes and strategies for delivering evidence-based hypertension interventions in lower-middle-income countries: Evidence from a multi-country consortium for hypertension control. PLOS ONE. 2023;18:e0286204.

Woodward EN, Ball IA, Willging C, Singh RS, Scanlon C, Cluck D, et al. Increasing consumer engagement: tools to engage service users in quality improvement or implementation efforts. Front Health Serv. 2023;3:1124290.

Norton WE, Chambers DA. Unpacking the complexities of de-implementing inappropriate health interventions. Implement Sci IS. 2020;15:2.

Norton WE, McCaskill-Stevens W, Chambers DA, Stella PJ, Brawley OW, Kramer BS. DeImplementing Ineffective and Low-Value Clinical Practices: Research and Practice Opportunities in Community Oncology Settings. JNCI Cancer Spectr. 2021;5:pkab020.

McKay VR, Proctor EK, Morshed AB, Brownson RC, Prusaczyk B. Letting Go: Conceptualizing Intervention De-implementation in Public Health and Social Service Settings. Am J Community Psychol. 2018;62:189–202.

Patey AM, Grimshaw JM, Francis JJ. Changing behaviour, ‘more or less’: do implementation and de-implementation interventions include different behaviour change techniques? Implement Sci IS. 2021;16:20.

Rodriguez Weno E, Allen P, Mazzucca S, Farah Saliba L, Padek M, Moreland-Russell S, et al. Approaches for Ending Ineffective Programs: Strategies From State Public Health Practitioners. Front Public Health. 2021;9:727005.

Gnjidic D, Elshaug AG. De-adoption and its 43 related terms: harmonizing low-value care terminology. BMC Med. 2015;13:273.

Download references

Acknowledgements

The authors would like to acknowledge the early contributions of the Pittsburgh Dissemination and Implementation Science Collaborative (Pitt DISC). LEA would like to thank Dr. Billie Davis for analytical support. The authors would like to acknowledge the implementation science experts who recommended articles for our review, including Greg Aarons, Mark Bauer, Rinad Beidas, Geoffrey Curran, Laura Damschroder, Rani Elwy, Amy Kilbourne, JoAnn Kirchner, Jennifer Leeman, Cara Lewis, Dennis Li, Aaron Lyon, Gila Neta, and Borsika Rabin.

Dr. Rogal’s time was funded in part by a University of Pittsburgh K award (K23-DA048182) and by a VA Health Services Research and Development grant (PEC 19-207). Drs. Bachrach and Quinn were supported by VA HSR Career Development Awards (CDA 20-057, PI: Bachrach; CDA 20-224, PI: Quinn). Dr. Scheunemann’s time was funded by the US Agency for Healthcare Research and Quality (K08HS027210). Drs. Hero, Chinman, Goodrich, and Ernecoff and Mr. Qureshi were funded by the Patient-Centered Outcomes Research Institute (PCORI) AOSEPP2 Task Order 12 to conduct a landscape review of US studies on the effectiveness of implementation strategies, with results reported here (https://www.pcori.org/sites/default/files/PCORI-Implementation-Strategies-for-Evidence-Based-Practice-in-Health-and-Health-Care-A-Review-of-the-Evidence-Full-Report.pdf and https://www.pcori.org/sites/default/files/PCORI-Implementation-Strategies-for-Evidence-Based-Practice-in-Health-and-Health-Care-Brief-Report-Summary.pdf). Dr. Ashcraft and Ms. Phares were funded by the Center for Health Equity Research and Promotion (CIN 13-405). The funders had no involvement in this study.

Author information

Shari S. Rogal and Matthew J. Chinman are co-senior authors.

Authors and Affiliations

Center for Health Equity Research and Promotion, Corporal Michael Crescenz VA Medical Center, Philadelphia, PA, USA

Laura Ellen Ashcraft

Department of Biostatistics, Epidemiology, and Informatics, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, USA

Center for Health Equity Research and Promotion, VA Pittsburgh Healthcare System, Pittsburgh, PA, USA

David E. Goodrich, Angela Phares, Deirdre A. Quinn, Shari S. Rogal & Matthew J. Chinman

Division of General Internal Medicine, Department of Medicine, University of Pittsburgh, Pittsburgh, PA, USA

David E. Goodrich, Deirdre A. Quinn & Matthew J. Chinman

Clinical & Translational Science Institute, University of Pittsburgh, Pittsburgh, PA, USA

David E. Goodrich & Lisa G. Lederer

RAND Corporation, Pittsburgh, PA, USA

Joachim Hero, Nabeel Qureshi, Natalie C. Ernecoff & Matthew J. Chinman

Center for Clinical Management Research, VA Ann Arbor Healthcare System, Ann Arbor, Michigan, USA

Rachel L. Bachrach

Department of Psychiatry, University of Michigan Medical School, Ann Arbor, MI, USA

Division of Geriatric Medicine, University of Pittsburgh, Department of Medicine, Pittsburgh, PA, USA

Leslie Page Scheunemann

Division of Pulmonary, Allergy, Critical Care, and Sleep Medicine, University of Pittsburgh, Department of Medicine, Pittsburgh, PA, USA

Departments of Medicine and Surgery, University of Pittsburgh, Pittsburgh, Pennsylvania, USA

Shari S. Rogal


Contributions

LEA, SSR, and MJC conceptualized the study. LEA, SSR, MJC, and JOH developed the study design. LEA and JOH acquired the data. LEA, DEG, AP, RLB, DAQ, LGL, LPS, SSR, NQ, and MJC conducted the abstract and full-text review and the rigor assessment. LEA, DEG, JOH, AP, RLB, DAQ, NQ, NCE, SSR, and MJC conducted the data abstraction. DEG, SSR, and MJC adjudicated conflicts. LEA and SSR analyzed the data. LEA, SSR, JOH, and MJC interpreted the data. LEA, SSR, and MJC drafted the work. All authors substantially revised the work. All authors approved the submitted version and agreed to be personally accountable for their contributions and the integrity of the work.

Corresponding author

Correspondence to Laura Ellen Ashcraft .

Ethics declarations

Ethics approval and consent to participate.

Not applicable.

Consent for publication

The manuscript does not contain any individual person’s data.

Competing interests

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Supplementary material 1.
Supplementary material 2.
Supplementary material 3.
Supplementary material 4.
Supplementary material 5.
Supplementary material 6.
Supplementary material 7.
Supplementary material 8.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ . The Creative Commons Public Domain Dedication waiver ( http://creativecommons.org/publicdomain/zero/1.0/ ) applies to the data made available in this article, unless otherwise stated in a credit line to the data.


About this article

Cite this article.

Ashcraft, L.E., Goodrich, D.E., Hero, J. et al. A systematic review of experimentally tested implementation strategies across health and human service settings: evidence from 2010-2022. Implementation Sci 19, 43 (2024). https://doi.org/10.1186/s13012-024-01369-5


Received : 09 November 2023

Accepted : 27 May 2024

Published : 24 June 2024

DOI : https://doi.org/10.1186/s13012-024-01369-5


Keywords

  • Implementation strategy
  • Health-related outcomes




An overview of methodological approaches in systematic reviews

Prabhakar Veginadu

1 Department of Rural Clinical Sciences, La Trobe Rural Health School, La Trobe University, Bendigo Victoria, Australia

Hanny Calache

2 Lincoln International Institute for Rural Health, University of Lincoln, Brayford Pool, Lincoln UK

Akshaya Pandian

3 Department of Orthodontics, Saveetha Dental College, Chennai Tamil Nadu, India

Mohd Masood

Associated data.

APPENDIX B: List of excluded studies with detailed reasons for exclusion

APPENDIX C: Quality assessment of included reviews using AMSTAR 2

The aim of this overview is to identify and collate evidence from existing published systematic review (SR) articles evaluating various methodological approaches used at each stage of an SR.

The search was conducted in five electronic databases from inception to November 2020 and updated in February 2022: MEDLINE, Embase, Web of Science Core Collection, Cochrane Database of Systematic Reviews, and APA PsycINFO. Title and abstract screening were performed in two stages by one reviewer, supported by a second reviewer. Full‐text screening, data extraction, and quality appraisal were performed by two reviewers independently. The quality of the included SRs was assessed using the AMSTAR 2 checklist.

The search retrieved 41,556 unique citations, of which 9 SRs were deemed eligible for inclusion in the final synthesis. Included SRs evaluated 24 unique methodological approaches used for defining the review scope and eligibility, literature search, screening, data extraction, and quality appraisal in the SR process. Limited evidence supports the following: (a) searching multiple resources (electronic databases, handsearching, and reference lists) to identify relevant literature; (b) excluding non‐English, gray, and unpublished literature; and (c) use of text‐mining approaches during title and abstract screening.

The overview identified limited SR‐level evidence on various methodological approaches currently employed during five of the seven fundamental steps in the SR process, as well as some methodological modifications currently used in expedited SRs. Overall, findings of this overview highlight the dearth of published SRs focused on SR methodologies and this warrants future work in this area.

1. INTRODUCTION

Evidence synthesis is a prerequisite for knowledge translation. 1 A well conducted systematic review (SR), often in conjunction with meta‐analyses (MA) when appropriate, is considered the “gold standard” of methods for synthesizing evidence related to a topic of interest. 2 The central strength of an SR is the transparency of the methods used to systematically search, appraise, and synthesize the available evidence. 3 Several guidelines, developed by various organizations, are available for the conduct of an SR; 4 , 5 , 6 , 7 among these, Cochrane is considered a pioneer in developing rigorous and highly structured methodology for the conduct of SRs. 8 The guidelines developed by these organizations outline seven fundamental steps required in SR process: defining the scope of the review and eligibility criteria, literature searching and retrieval, selecting eligible studies, extracting relevant data, assessing risk of bias (RoB) in included studies, synthesizing results, and assessing certainty of evidence (CoE) and presenting findings. 4 , 5 , 6 , 7

The methodological rigor involved in an SR can require a significant amount of time and resources, which may not always be available. 9 As a result, there has been a proliferation of modifications to the traditional SR process, such as refining, shortening, bypassing, or omitting one or more steps, 10 , 11 for example, limiting the number and type of databases searched; restricting by publication date, language, and types of studies included; and using one reviewer, rather than two or more, for screening and selection of studies. 10 , 11 These methodological modifications are made to accommodate the needs and resource constraints of reviewers and stakeholders (e.g., organizations, policymakers, health care professionals, and other knowledge users). While such modifications are considered time and resource efficient, they may introduce bias into the review process, reducing its usefulness. 5

Substantial research has examined the various approaches used in standardized SR methodology and their impact on the validity of SR results. A number of published reviews have examined the approaches or modifications corresponding to single 12 , 13 or multiple steps 14 involved in an SR. However, there is not yet a comprehensive summary of the SR‐level evidence for all seven fundamental steps of an SR. Such a holistic evidence synthesis would provide an empirical basis for confirming the validity of currently accepted practices in the conduct of SRs. Furthermore, a balance must sometimes be struck between resource availability and the need to synthesize the evidence as well as possible given the constraints. This evidence base would also inform the choice of modifications to be made to SR methods, as well as the potential impact of these modifications on SR results. An overview is considered the approach of choice for summarizing existing evidence on a broad topic, directing the reader to evidence, or highlighting gaps in evidence, where the evidence is derived exclusively from SRs. 15 Therefore, for this review, an overview approach was used to (a) identify and collate evidence from existing published SR articles evaluating the various methodological approaches employed in each of the seven fundamental steps of an SR and (b) highlight both the gaps in current research and potential areas for future research on the methods employed in SRs.

2. METHODS

An a priori protocol was developed for this overview but was not registered with the International Prospective Register of Systematic Reviews (PROSPERO), as the review was primarily methodological in nature and did not meet PROSPERO eligibility criteria for registration. The protocol is available from the corresponding author upon reasonable request. This overview was conducted based on the guidelines for the conduct of overviews as outlined in The Cochrane Handbook. 15 Reporting followed the Preferred Reporting Items for Systematic reviews and Meta‐analyses (PRISMA) statement. 3

2.1. Eligibility criteria

Only published SRs, with or without an associated MA, were included in this overview. We adopted the defining characteristics of SRs from The Cochrane Handbook. 5 According to The Cochrane Handbook, a review was considered systematic if it satisfied the following criteria: (a) clearly states the objectives and eligibility criteria for study inclusion; (b) provides reproducible methodology; (c) includes a systematic search to identify all eligible studies; (d) reports assessment of the validity of the findings of the included studies (e.g., RoB assessment of the included studies); and (e) systematically presents all the characteristics or findings of the included studies. 5 Reviews that did not meet all of the above criteria were not considered an SR for this study and were excluded. MA‐only articles were included if it was mentioned that the MA was based on an SR.

SRs and/or MA of primary studies evaluating methodological approaches used in defining review scope and study eligibility, literature search, study selection, data extraction, RoB assessment, data synthesis, and CoE assessment and reporting were included. The methodological approaches examined in these SRs and/or MA could also relate to substeps or elements of these steps; for example, applying limits on date or type of publication is an element of the literature search. Included SRs examined or compared various aspects of a method or methods and the associated factors, including but not limited to: precision or effectiveness; accuracy or reliability; impact on the SR and/or MA results; reproducibility of SR steps or bias introduced; and time and/or resource efficiency. SRs assessing the methodological quality of SRs (e.g., adherence to reporting guidelines), evaluating techniques for building search strategies or the use of specific database filters (e.g., use of Boolean operators or search filters for randomized controlled trials), examining various tools used for RoB or CoE assessment (e.g., ROBINS vs. Cochrane RoB tool), or evaluating statistical techniques used in meta‐analyses were excluded. 14

2.2. Search

The search for published SRs was performed on the following scientific databases initially from inception to third week of November 2020 and updated in the last week of February 2022: MEDLINE (via Ovid), Embase (via Ovid), Web of Science Core Collection, Cochrane Database of Systematic Reviews, and American Psychological Association (APA) PsycINFO. Search was restricted to English language publications. Following the objectives of this study, study design filters within databases were used to restrict the search to SRs and MA, where available. The reference lists of included SRs were also searched for potentially relevant publications.

The search terms included keywords, truncations, and subject headings for the key concepts in the review question: SRs and/or MA, methods, and evaluation. Some of the terms were adopted from the search strategy used in a previous review by Robson et al., which reviewed primary studies on methodological approaches used in study selection, data extraction, and quality appraisal steps of SR process. 14 Individual search strategies were developed for respective databases by combining the search terms using appropriate proximity and Boolean operators, along with the related subject headings in order to identify SRs and/or MA. 16 , 17 A senior librarian was consulted in the design of the search terms and strategy. Appendix A presents the detailed search strategies for all five databases.
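As a rough illustration of the general idea of combining concept groups with Boolean operators, the sketch below assembles a single query string from hypothetical keyword groups. It is not the strategy reported in Appendix A, and database-specific syntax (subject headings, proximity operators, field tags) is deliberately omitted.

```python
# Illustrative sketch: combining keyword groups for the three review concepts
# (systematic reviews/meta-analyses, methods, evaluation) with Boolean operators.
# The terms are placeholders, not the strategy reported in Appendix A.

concepts = {
    "reviews": ['"systematic review*"', '"meta-analys*"'],
    "methods": ["method*", "approach*", "technique*"],
    "evaluation": ["evaluat*", "compar*", "accuracy"],
}

def or_group(terms):
    """Join synonyms for one concept with OR and wrap them in parentheses."""
    return "(" + " OR ".join(terms) + ")"

# Concepts are intersected with AND, as in a typical database search strategy.
query = " AND ".join(or_group(terms) for terms in concepts.values())
print(query)
```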

2.3. Study selection and data extraction

Title and abstract screening of references was performed in three steps. First, one reviewer (PV) screened all the titles and excluded obviously irrelevant citations, for example, articles on topics not related to SRs and non‐SR publications (such as randomized controlled trials, observational studies, and scoping reviews). Next, from the remaining citations, a random sample of 200 titles and abstracts was screened against the predefined eligibility criteria by two reviewers (PV and MM), independently and in duplicate. Discrepancies were discussed and resolved by consensus. This step ensured that the responses of the two reviewers were calibrated for consistency in applying the eligibility criteria during screening. Finally, all the remaining titles and abstracts were reviewed by a single “calibrated” reviewer (PV) to identify potential full‐text records. Full‐text screening was performed by at least two authors independently (PV screened all the records, and duplicate assessment was conducted by MM, HC, or MG), with discrepancies resolved via discussions or by consulting a third reviewer.

Data related to review characteristics, results, key findings, and conclusions were extracted by at least two reviewers independently (PV performed data extraction for all the reviews and duplicate extraction was performed by AP, HC, or MG).
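Agreement between two reviewers during calibration screening or duplicate extraction of this kind is often quantified, for example with Cohen's kappa. The sketch below uses hypothetical include/exclude decisions and is not part of the workflow reported by the authors.

```python
# Illustrative sketch: measuring inter-reviewer agreement on a calibration sample
# of title/abstract decisions using Cohen's kappa. Decisions are hypothetical.
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters' categorical decisions on the same items."""
    assert len(rater_a) == len(rater_b)
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Expected chance agreement from each rater's marginal proportions.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    labels = set(rater_a) | set(rater_b)
    expected = sum((counts_a[l] / n) * (counts_b[l] / n) for l in labels)
    return (observed - expected) / (1 - expected)

reviewer_1 = ["include", "exclude", "exclude", "include", "exclude", "exclude"]
reviewer_2 = ["include", "exclude", "include", "include", "exclude", "exclude"]
print(round(cohens_kappa(reviewer_1, reviewer_2), 2))  # 0.67 for this toy sample
```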

2.4. Quality assessment of included reviews

The quality assessment of the included SRs was performed using the AMSTAR 2 (A MeaSurement Tool to Assess systematic Reviews). The tool consists of a 16‐item checklist addressing critical and noncritical domains. 18 For the purpose of this study, the domain related to MA was reclassified from critical to noncritical, as SRs with and without MA were included. The other six critical domains were used according to the tool guidelines. 18 Two reviewers (PV and AP) independently responded to each of the 16 items in the checklist with either “yes,” “partial yes,” or “no.” Based on the interpretations of the critical and noncritical domains, the overall quality of the review was rated as high, moderate, low, or critically low. 18 Disagreements were resolved through discussion or by consulting a third reviewer.
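For readers applying the same tool programmatically, the sketch below derives an overall rating from item responses using a simplified reading of the published AMSTAR 2 guidance (no or one noncritical weakness = high; more than one noncritical weakness = moderate; one critical flaw = low; more than one critical flaw = critically low). The item numbers flagged as critical are an illustrative assumption, not necessarily the exact six critical domains applied in this overview.

```python
# Illustrative sketch: deriving an overall AMSTAR 2 confidence rating from item
# responses, using a simplified reading of the tool's published guidance.
# The set of critical items below is an assumption for illustration only.

CRITICAL_ITEMS = {2, 4, 7, 9, 13, 15}  # assumed set; MA-related domain treated as noncritical

def amstar2_rating(responses):
    """responses: dict mapping item number (1-16) to 'yes', 'partial yes', or 'no'."""
    critical_flaws = sum(
        1 for item, answer in responses.items()
        if item in CRITICAL_ITEMS and answer == "no"
    )
    noncritical_weaknesses = sum(
        1 for item, answer in responses.items()
        if item not in CRITICAL_ITEMS and answer != "yes"
    )
    if critical_flaws > 1:
        return "critically low"
    if critical_flaws == 1:
        return "low"
    return "moderate" if noncritical_weaknesses > 1 else "high"

# Example: a single critical flaw (e.g., no justification for excluded studies) -> "low".
answers = {item: "yes" for item in range(1, 17)}
answers[7] = "no"
print(amstar2_rating(answers))
```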

2.5. Data synthesis

To provide an understandable summary of existing evidence syntheses, characteristics of the methods evaluated in the included SRs were examined and key findings were categorized and presented based on the corresponding step in the SR process. The categories of key elements within each step were discussed and agreed by the authors. Results of the included reviews were tabulated and summarized descriptively, along with a discussion on any overlap in the primary studies. 15 No quantitative analyses of the data were performed.

3. RESULTS

From the 41,556 unique citations identified through the literature search, 50 full‐text records were reviewed, and nine systematic reviews 14 , 19 , 20 , 21 , 22 , 23 , 24 , 25 , 26 were deemed eligible for inclusion. The flow of studies through the screening process is presented in Figure 1. A list of excluded studies with reasons can be found in Appendix B.

Figure 1. Study selection flowchart

3.1. Characteristics of included reviews

Table 1 summarizes the characteristics of the included SRs. The majority of the included reviews (six of nine) were published after 2010. 14 , 22 , 23 , 24 , 25 , 26 Four of the nine included SRs were Cochrane reviews. 20 , 21 , 22 , 23 The number of databases searched in the reviews ranged from 2 to 14; two reviews searched gray literature sources, 24 , 25 and seven reviews included a supplementary search strategy to identify relevant literature. 14 , 19 , 20 , 21 , 22 , 23 , 26 Three of the included SRs (all Cochrane reviews) included an integrated MA. 20 , 21 , 23

Table 1. Characteristics of included studies

Crumley, 2005. Search strategy: 2004 (last searched); seven databases; four journals handsearched, reference lists, and contacting authors. SR design: SR; n = 64. Topic/subject area: RCTs and CCTs; not specified. Objective: to identify and quantitatively review studies comparing two or more different resources (e.g., databases, Internet, handsearching) used to identify RCTs and CCTs for systematic reviews. Authors’ comments on study quality: most of the studies adequately described reproducible search methods and the expected search yield; poor quality was mainly due to a lack of rigor in reporting selection methodology, and the majority of studies did not indicate the number of people involved in independently screening the searches or applying eligibility criteria to identify potentially relevant studies.

Hopewell, 2007. Search strategy: 2002 (last searched); eight databases; selected journals and published abstracts handsearched, and contacting authors. SR design: SR and MA; n = 34 (34 in quantitative analysis). Topic/subject area: RCTs; health care. Objective: to systematically review empirical studies that compared the results of handsearching with the results of searching one or more electronic databases to identify reports of randomized trials. Authors’ comments on study quality: the electronic search was designed and carried out appropriately in the majority of studies, while the appropriateness of handsearching was unclear in half the studies because of limited information; the screening methods used in both groups were comparable in most studies.

Hopewell, 2007. Search strategy: 2005 (last searched); two databases; selected journals and published abstracts handsearched, reference lists, citations, and contacting authors. SR design: SR and MA; n = 5 (5 in quantitative analysis). Topic/subject area: RCTs; health care. Objective: to systematically review research studies that investigated the impact of gray literature in meta‐analyses of randomized trials of health care interventions. Authors’ comments on study quality: in the majority of studies, electronic searches were designed and conducted appropriately, and the selection of studies for eligibility was similar for handsearching and database searching; there were insufficient data in most studies to assess the appropriateness of handsearching and investigator agreement on the eligibility of trial reports.

Horsley, 2011. Search strategy: 2008 (last searched); three databases; reference lists, citations, and contacting authors. SR design: SR; n = 12. Topic/subject area: any topic or study area. Objective: to investigate the effectiveness of checking reference lists for the identification of additional, relevant studies for systematic reviews, with effectiveness defined as the proportion of relevant studies identified by review authors solely by checking reference lists. Authors’ comments on study quality: interpretability and generalizability of the included studies was difficult; extensive heterogeneity among the studies in the number and type of databases used; lack of control in the majority of studies related to the quality and comprehensiveness of searching.

Morrison, 2012. Search strategy: 2011 (last searched); six databases and gray literature. SR design: SR; n = 5. Topic/subject area: RCTs; conventional medicine. Objective: to examine the impact of English-language restriction on systematic review‐based meta‐analyses. Authors’ comments on study quality: the included studies were assessed as having good reporting quality and validity of results; methodological issues were mainly noted in the areas of sample power calculation and distribution of confounders.

Robson, 2019. Search strategy: 2016 (last searched); three databases; reference lists and contacting authors. SR design: SR; n = 37. Topic/subject area: not reported. Objective: to identify and summarize studies assessing methodologies for study selection, data abstraction, or quality appraisal in systematic reviews. Authors’ comments on study quality: the quality of the included studies was generally low; only one study was assessed as having low RoB across all four domains, and the majority of studies were assessed as having unclear RoB across one or more domains.

Schmucker, 2017. Search strategy: 2016 (last searched); four databases; reference lists. SR design: SR; n = 10. Topic/subject area: study data; medicine. Objective: to assess whether the inclusion of data that were not published at all and/or published only in the gray literature influences pooled effect estimates in meta‐analyses and leads to different interpretation. Authors’ comments on study quality: the majority of the included studies could not be judged on the adequacy of matching or adjusting for confounders of the gray/unpublished data in comparison to published data; generalizability of results was low or unclear in four research projects.

Morissette, 2011. Search strategy: 2009 (last searched); five databases; reference lists and contacting authors. SR design: SR and MA; n = 6 (5 included in quantitative analysis). Topic/subject area: not reported. Objective: to determine whether blinded versus unblinded assessments of risk of bias result in similar or systematically different assessments in studies included in a systematic review. Authors’ comments on study quality: four studies had unclear risk of bias, while two studies had high risk of bias.

O'Mara‐Eves, 2015. Search strategy: 2013 (last searched); 14 databases and gray literature. SR design: SR; n = 44. Topic/subject area: not reported. Objective: to gather and present the available research evidence on existing methods for text mining related to the title and abstract screening stage in a systematic review, including the performance metrics used to evaluate these technologies. Authors’ comments on study quality: quality was appraised on two criteria (sampling of test cases and adequacy of methods description for replication); no study was excluded based on quality (author contact).

SR = systematic review; MA = meta‐analysis; RCT = randomized controlled trial; CCT = controlled clinical trial; N/R = not reported.

The included SRs evaluated 24 unique methodological approaches (26 in total) used across five steps in the SR process; 8 SRs evaluated 6 approaches, 19 , 20 , 21 , 22 , 23 , 24 , 25 , 26 while 1 review evaluated 18 approaches. 14 Exclusion of gray or unpublished literature 21 , 26 and blinding of reviewers for RoB assessment 14 , 23 were evaluated in two reviews each. Included SRs evaluated methods used in five different steps in the SR process, including methods used in defining the scope of review (n = 3), literature search (n = 3), study selection (n = 2), data extraction (n = 1), and RoB assessment (n = 2) (Table 2).

Table 2. Summary of findings from reviews evaluating systematic review methods

Defining review scope and eligibility

Excluding study data based on publication status:
• Hopewell, 2007 (review quality: moderate). Method assessed: gray vs. published literature; outcome: pooled effect estimate. Conclusions: published trials are usually larger and show an overall greater treatment effect than gray trials; excluding trials reported in gray literature from SRs and MAs may exaggerate the results.
• Schmucker, 2017 (review quality: moderate). Method assessed: gray and/or unpublished vs. published literature; outcomes: pooled effect estimate (primary) and impact on interpretation of MA (secondary). Conclusions: excluding unpublished trials had no or only a small effect on the pooled estimates of treatment effects; there was insufficient evidence to conclude the impact of including unpublished or gray study data on MA conclusions.

Excluding study data based on language of publication:
• Morrison, 2012 (review quality: low). Method assessed: English-language vs. non-English-language publications; outcomes: bias in summary treatment effects (primary) and number of included studies and patients, methodological quality, and statistical heterogeneity (secondary). Conclusions: no evidence of a systematic bias from the use of English-language restrictions in systematic review-based meta-analyses in conventional medicine; conflicting results on the methodological and reporting quality of English and non-English-language RCTs; further research required.

Literature search

Resources searching:
• Crumley, 2005 (review quality: critically low). Method assessed: searching two or more resources vs. resource-specific searching; outcomes: recall and precision. Conclusions: multiple-source comprehensive searches are necessary to identify all RCTs for a systematic review; for electronic databases, using the Cochrane HSS or a complex search strategy in consultation with a librarian is recommended.

Supplementary searching:
• Hopewell, 2007 (review quality: moderate). Method assessed: handsearching only vs. searching one or more electronic databases; outcome: number of identified randomized trials. Conclusions: handsearching is important for identifying trial reports published in nonindexed journals for inclusion in systematic reviews of health care interventions; where time and resources are limited, the majority of full English-language trial reports can be identified using a complex search or the Cochrane HSS.
• Horsley, 2011 (review quality: low). Method assessed: checking reference lists (no comparison); outcomes: additional yield of checking reference lists (primary) and additional yield by publication type, study design, or both, and data pertaining to costs (secondary). Conclusions: there is some evidence to support checking reference lists to complement the literature search in systematic reviews.

Study selection

(For all Robson, 2019 comparisons in this table, the outcomes were the accuracy, reliability, or efficiency of a method (primary) and factors affecting the accuracy or reliability of a method (secondary); the review was rated low quality.)

Reviewer characteristics:
• Robson, 2019. Single vs. double reviewer screening: using two reviewers for screening is recommended; if resources are limited, one reviewer can screen and the other can verify the list of excluded studies. Experienced vs. inexperienced reviewers for screening: screening must be performed by experienced reviewers. Screening by blinded vs. unblinded reviewers: blinding of reviewers during screening is not recommended, as the blinding process was time-consuming and had little impact on the results of MA.

Use of technology for study selection:
• Robson, 2019. Use vs. nonuse of dual computer monitors for screening: there are no significant differences in the time spent on abstract or full-text screening. Use of Google Translate to translate non-English citations to facilitate screening: Google Translate can be used to screen German-language citations.
• O'Mara‐Eves, 2015 (review quality: critically low). Method assessed: use of text mining for title and abstract screening; outcome: any evaluation concerning workload reduction. Conclusions: text-mining approaches can be used to reduce the number of studies to be screened, increase the rate of screening, improve the workflow with screening prioritization, and replace the second reviewer; the evaluated approaches reported workload savings of between 30% and 70%.

Order of screening:
• Robson, 2019. Title-first screening vs. simultaneous title-and-abstract screening: title-first screening showed no substantial gain in time compared with simultaneous title and abstract screening.

Data extraction

Reviewer characteristics:
• Robson, 2019. Single vs. double reviewer data extraction: use two reviewers for data extraction; if resources preclude this, single-reviewer extraction followed by verification of outcome data by a second reviewer (where statistical analysis is planned). Experienced vs. inexperienced reviewers for data extraction: experienced reviewers must be used for extracting continuous outcomes data. Data extraction by blinded vs. unblinded reviewers: blinding of reviewers during data extraction is not recommended, as it had no impact on the results of MA.

Use of technology for data extraction:
• Robson, 2019. Use vs. nonuse of dual computer monitors for data extraction: using two computer monitors may improve the efficiency of data extraction. Data extraction by two English-speaking reviewers using Google Translate vs. two reviewers fluent in the respective languages: Google Translate provides limited accuracy for data extraction. Computer-assisted vs. double-reviewer extraction of graphical data: use of computer-assisted programs to extract graphical data is supported.

Obtaining additional data:
• Robson, 2019. Contacting study authors for additional data: contacting authors to obtain additional relevant data is recommended.

RoB assessment and quality appraisal

Reviewer characteristics:
• Robson, 2019. Quality appraisal by blinded vs. unblinded reviewers: inconsistent results on RoB assessments performed by blinded and unblinded reviewers; blinding reviewers for quality appraisal is not recommended. Experienced vs. inexperienced reviewers for quality appraisal: reviewers performing quality appraisal must be trained, and the quality assessment tool must be pilot tested. Use of additional guidance vs. no additional guidance for quality appraisal: providing guidance and decision rules for quality appraisal improved inter-rater reliability in RoB assessments.
• Morissette, 2011 (review quality: moderate). Method assessed: RoB assessment by blinded vs. unblinded reviewers; outcomes: mean difference and 95% confidence interval between RoB assessment scores (primary) and qualitative level of agreement, mean RoB scores and measures of variance for the results of the RoB assessments, and inter-rater reliability between blinded and unblinded reviewers (secondary). Conclusions: findings on the difference between blinded and unblinded RoB assessments are inconsistent across studies; pooled effects show no differences between RoB assessments completed in a blinded or unblinded manner.

Obtaining additional data:
• Robson, 2019. Contacting study authors for additional information, or using supplementary information available in the published trials, vs. no additional information for quality appraisal: additional data related to study quality obtained by contacting study authors improved the quality assessment.

RoB assessment of qualitative studies:
• Robson, 2019. Structured vs. unstructured appraisal of qualitative research studies: use a structured tool if qualitative and quantitative study designs are included in the review; for qualitative reviews, either a structured or unstructured quality appraisal tool can be used.

There was some overlap in the primary studies evaluated in the included SRs on the same topics: Schmucker et al. 26 and Hopewell et al. 21 ( n  = 4), Hopewell et al. 20 and Crumley et al. 19 ( n  = 30), and Robson et al. 14 and Morissette et al. 23 ( n  = 4). There were no conflicting results between any of the identified SRs on the same topic.

3.2. Methodological quality of included reviews

Overall, the quality of the included reviews was assessed as moderate at best (Table  2 ). The most common critical weakness in the reviews was failure to provide justification for excluding individual studies (four reviews). Detailed quality assessment is provided in Appendix C .

3.3. Evidence on systematic review methods

3.3.1. Methods for defining review scope and eligibility

Two SRs investigated the effect of excluding data obtained from gray or unpublished sources on the pooled effect estimates of MA. 21 , 26 Hopewell et al. 21 reviewed five studies that compared the impact of gray literature on the results of a cohort of MA of RCTs in health care interventions. Gray literature was defined as information published in "print or electronic sources not controlled by commercial or academic publishers." Findings showed an overall greater treatment effect for published trials than for trials reported in gray literature. In a more recent review, Schmucker et al. 26 addressed similar objectives by investigating gray and unpublished data in medicine. In addition to gray literature, defined similarly to the previous review by Hopewell et al., the authors also evaluated unpublished data, defined as "supplemental unpublished data related to published trials, data obtained from the Food and Drug Administration or other regulatory websites or postmarketing analyses hidden from the public." The review found that, in the majority of the MA, excluding gray literature had little or no effect on the pooled effect estimates. The evidence was too limited to conclude whether data from gray and unpublished literature had an impact on the conclusions of MA. 26

Morrison et al. 24 examined five studies measuring the effect of excluding non-English language RCTs on the summary treatment effects of SR-based MA in various fields of conventional medicine. Although none of the included studies reported a major difference in treatment effect estimates between English-only and non-English-inclusive MA, the review found inconsistent evidence regarding the methodological and reporting quality of English and non-English trials. 24 As such, there might be a risk of introducing "language bias" when excluding non-English language RCTs. The authors also noted that the number of non-English trials varies across medical specialties, as does the impact of these trials on MA results. Based on these findings, Morrison et al. 24 conclude that, to minimize the risk of introducing "language bias," literature searches should include non-English studies when resources and time allow.

3.3.2. Methods for searching studies

Crumley et al. 19 analyzed recall (also referred to as "sensitivity" by some researchers; defined as the "percentage of relevant studies identified by the search") and precision (defined as the "percentage of studies identified by the search that were relevant") when searching a single resource to identify randomized controlled trials and controlled clinical trials, as opposed to searching multiple resources. The studies included in their review most frequently compared a MEDLINE-only search with a search involving a combination of other resources. The review found low median recall estimates (median values between 24% and 92%) and very low median precision (median values between 0% and 49%) for most electronic databases when searched individually. 19 A between-database comparison, based on the type of search strategy used, showed better recall and precision for complex and Cochrane Highly Sensitive Search Strategies (CHSSS). In conclusion, the authors emphasize that literature searches for trials in SRs must include multiple sources. 19
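As a quick illustration of the two metrics defined above (a minimal sketch with made-up study identifiers, not data from Crumley et al.), the recall and precision of a single-database search can be computed against a known set of relevant studies:

```python
# Minimal sketch with made-up identifiers (not data from Crumley et al.):
# recall and precision of one database search against the full set of
# relevant studies, using the definitions quoted above.
relevant = {"trial_01", "trial_02", "trial_03", "trial_04", "trial_05"}  # all relevant studies
retrieved = {"trial_02", "trial_03", "trial_07", "trial_09"}             # records returned by one database

true_positives = relevant & retrieved
recall = len(true_positives) / len(relevant)      # share of relevant studies the search found
precision = len(true_positives) / len(retrieved)  # share of retrieved records that were relevant

print(f"recall = {recall:.0%}, precision = {precision:.0%}")  # recall = 40%, precision = 50%
```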

In an SR comparing handsearching and electronic database searching, Hopewell et al. 20 found that handsearching retrieved more relevant RCTs (retrieval rate of 92%–100%) than searching a single electronic database (retrieval rates of 67% for PsycINFO/PsycLIT, 55% for MEDLINE, and 49% for Embase). The retrieval rates varied depending on the quality of handsearching, the type of electronic search strategy used (e.g., simple, complex, or CHSSS), and the type of trial reports searched (e.g., full reports, conference abstracts). The authors concluded that handsearching was particularly important for identifying full trials published in nonindexed journals and in languages other than English, as well as those published as abstracts and letters. 20

The effectiveness of checking reference lists to retrieve additional relevant studies for an SR was investigated by Horsley et al. 22 The review reported that checking reference lists yielded 2.5%–40% more studies, depending on the quality and comprehensiveness of the electronic search used. The authors conclude that there is some evidence, although from poor-quality studies, to support checking reference lists to supplement database searching. 22

3.3.3. Methods for selecting studies

Three approaches relevant to reviewer characteristics, namely the number, experience, and blinding of reviewers involved in the screening process, were highlighted in an SR by Robson et al. 14 Based on the retrieved evidence, the authors recommended that two independent, experienced, and unblinded reviewers be involved in study selection. 14 The review authors also suggested a modified approach for when resources are limited, in which one reviewer screens and a second reviewer verifies the list of excluded studies. It should be noted, however, that this suggestion is likely based on the authors' opinion, as none of the studies included in the review provided evidence on this approach.

Robson et al. 14 also reported on two methods involving the use of technology for screening studies: using Google Translate to translate non-English citations (for example, German-language articles) into English to facilitate screening was considered viable, whereas using two computer monitors for screening did not increase screening efficiency. Title-first screening was found to be slightly more efficient than simultaneous screening of titles and abstracts, but the time saved was not substantial. Therefore, considering that search results are routinely exported as titles and abstracts, Robson et al. 14 recommend screening titles and abstracts simultaneously. However, the authors note that these conclusions were based on a very limited number of low-quality studies (in most instances, one study per method). 14

3.3.4. Methods for data extraction

Robson et al. 14 examined three approaches for data extraction relevant to reviewer characteristics, namely the number, experience, and blinding of reviewers (similar to the study selection step). Although based on limited evidence from a small number of studies, the authors recommended the use of two experienced and unblinded reviewers for data extraction. The experience of the reviewers was suggested to be especially important when extracting continuous outcome (or quantitative) data. However, when resources are limited, data extraction by one reviewer with verification of the outcome data by a second reviewer was recommended.

As for methods involving the use of technology, Robson et al. 14 identified limited evidence supporting the use of two monitors to improve data extraction efficiency and the use of computer-assisted programs to extract graphical data. However, the use of Google Translate for data extraction from non-English articles was not considered viable. 14 In the same review, Robson et al. 14 identified evidence supporting contacting authors to obtain additional relevant data.

3.3.5. Methods for RoB assessment

Two SRs examined the impact of blinding of reviewers for RoB assessments. 14 , 23 Morissette et al. 23 investigated the mean differences between the blinded and unblinded RoB assessment scores and found inconsistent differences among the included studies providing no definitive conclusions. Similar conclusions were drawn in a more recent review by Robson et al., 14 which included four studies on reviewer blinding for RoB assessment that completely overlapped with Morissette et al. 23

The use of experienced reviewers and the provision of additional guidance for RoB assessment were examined by Robson et al. 14 The review concluded that providing reviewers with intensive training, and with guidance on assessing studies that report insufficient data, improves RoB assessments. 14 Obtaining additional data related to quality assessment by contacting study authors was also found to help RoB assessments, although this was based on limited evidence. When appraising studies in qualitative or mixed-methods reviews, Robson et al. 14 recommend the use of a structured RoB tool as opposed to an unstructured tool. No SRs were identified on the data synthesis, CoE assessment, or reporting steps.

4. DISCUSSION

4.1. Summary of findings

Nine SRs examining 24 unique methods used across five steps of the SR process were identified in this overview. The collective evidence supports some current traditional and modified SR practices, while challenging other approaches. However, the quality of the included reviews was assessed to be moderate at best, and in the majority of the included SRs the evidence related to the evaluated methods was obtained from a very limited number of primary studies. As such, interpretations of these SRs should be made cautiously.

The evidence gathered from the included SRs corroborates a few current SR approaches. 5 For example, it is important to search multiple resources to identify relevant trials (RCTs and/or CCTs). The resources must include a combination of electronic database searching, handsearching, and checking the reference lists of retrieved articles. 5 However, no SRs were identified that evaluated the impact of the number of electronic databases searched. A recent study by Halladay et al. 27 found that articles on therapeutic interventions retrieved by searching databases other than PubMed (including Embase) contributed only a small amount of information to the MA and had a minimal impact on the MA results. The authors concluded that when resources are limited and a large number of studies is expected to be retrieved for the SR or MA, a PubMed-only search can yield reliable results. 27

Findings from the included SRs also reiterate some methodological modifications currently employed to "expedite" the SR process. 10 , 11 For example, excluding non-English language trials and gray/unpublished trials from MA has been shown to have minimal or no impact on the results of MA. 24 , 26 However, the efficiency of these SR methods, in terms of the time and resources used, has not been evaluated in the included SRs. 24 , 26 Of the included SRs, only two focused on the aspect of efficiency 14 , 25 ; O'Mara-Eves et al. 25 report some evidence to support the use of text-mining approaches for title and abstract screening in order to increase the rate of screening. Moreover, only one included SR 14 considered primary studies that evaluated the reliability (inter- or intra-reviewer consistency) and accuracy (validity when compared against a "gold standard" method) of SR methods. This can be attributed to the limited number of primary studies that evaluated these outcomes when evaluating SR methods. 14 The lack of outcome measures related to reliability, accuracy, and efficiency precludes definitive recommendations on the use of these methods and modifications; future research should focus on these outcomes.
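The text-mining tools evaluated by O'Mara-Eves et al. 25 vary widely; purely as an illustration of the general idea of screening prioritization (an assumed workflow, not the specific approaches assessed in that review), records screened so far can be used to train a simple classifier that ranks the remaining records by predicted relevance:

```python
# Illustrative sketch of screening prioritization (assumed workflow, not the
# specific tools evaluated by O'Mara-Eves et al.): rank unscreened abstracts
# by predicted relevance using labels from records screened so far.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

screened_texts = [
    "RCT of handsearching versus database searching for trial reports",
    "Case report of a single patient with a rare dermatological condition",
    "Randomized trial comparing single and dual reviewer screening",
    "Editorial on open access publishing policy",
]
labels = [1, 0, 1, 0]  # 1 = included at title/abstract stage, 0 = excluded

unscreened_texts = [
    "Randomised controlled trial of text mining to support citation screening",
    "Letter to the editor about journal impact factors",
]

# Fit a TF-IDF + logistic regression model on the screened records.
vectorizer = TfidfVectorizer(stop_words="english")
X_screened = vectorizer.fit_transform(screened_texts)
model = LogisticRegression().fit(X_screened, labels)

# Score the unscreened records and present the most likely includes first.
scores = model.predict_proba(vectorizer.transform(unscreened_texts))[:, 1]
for score, text in sorted(zip(scores, unscreened_texts), reverse=True):
    print(f"{score:.2f}  {text}")
```

Reviewers would still screen every record; the ranking only changes the order in which records are examined, which is how such approaches can speed up the identification of relevant studies.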

Some evaluated methods may be relevant to multiple steps; for example, exclusions based on publication status (gray/unpublished literature) and language of publication (non-English language studies) can be specified in the a priori eligibility criteria or incorporated as limits in the search strategy. The SRs included in this overview focused on the effect of study exclusions on pooled treatment effect estimates or MA conclusions. Excluding studies from the search results after conducting a comprehensive search, based on different eligibility criteria, may yield different results compared with limiting the search itself. 28 Further studies are required to examine this aspect.

Although we acknowledge the lack of standardized quality assessment tools for methodological study designs, we adhered to the Cochrane criteria for identifying SRs in this overview. This was done to ensure consistency in the quality of the included evidence. As a result, we excluded three reviews that did not provide any form of discussion of the quality of their included studies. The methods investigated in these reviews concern supplementary searching, 29 data extraction, 12 and screening. 13 However, the methods reported in two of these three reviews, by Mathes et al. 12 and Waffenschmidt et al., 13 were also examined in the SR by Robson et al., 14 which was included in this overview; in most instances (with the exception of one study in each of Mathes et al. 12 and Waffenschmidt et al. 13 ), the studies examined in these excluded reviews overlapped with those in the SR by Robson et al. 14

One of the key gaps in knowledge observed in this overview was the dearth of SRs on the methods used in the data synthesis component of SRs. Narrative and quantitative syntheses are the two most commonly used approaches for synthesizing data in evidence synthesis. 5 There are some published studies on the proposed indications and implications of these two approaches. 30 , 31 These studies found that both data synthesis methods produce comparable results and have their own advantages, suggesting that the choice of method should be based on the purpose of the review. 31 With an increasing number of "expedited" SR approaches (so-called rapid reviews) avoiding MA, 10 , 11 further research is warranted in this area to determine the impact of the type of data synthesis on the results of the SR.

4.2. Implications for future research

The findings of this overview highlight several areas of paucity in primary research and evidence synthesis on SR methods. First, no SRs were identified on the methods used in two important components of the SR process: data synthesis and CoE assessment and reporting. As for the included SRs, only a limited number of evaluation studies were identified for several methods. This indicates that further research is required to corroborate many of the methods recommended in current SR guidelines. 4 , 5 , 6 , 7 Second, some SRs evaluated only the impact of methods on the results of quantitative synthesis and MA conclusions. Future research should also focus on the interpretation of SR results. 28 , 32 Finally, most of the included SRs were conducted on specific topics related to the field of health care, limiting the generalizability of the findings to other areas. It is important that future research evaluating evidence syntheses broadens its objectives and includes studies on different topics within the field of health care.

4.3. Strengths and limitations

To our knowledge, this is the first overview summarizing current evidence from SRs and MA on the different methodological approaches used in several fundamental steps of SR conduct. The overview methodology followed well-established guidelines and the strict criteria defined for the inclusion of SRs.

There are several limitations related to the nature of the included reviews. Evidence for most of the methods investigated in the included reviews was derived from a limited number of primary studies. Also, the majority of the included SRs may be considered outdated, as they were published (or last updated) more than 5 years ago 33 ; only three of the nine SRs were published in the last 5 years. 14 , 25 , 26 Therefore, important and recent evidence related to these topics may not have been included. A substantial number of the included SRs were conducted in the field of health care, which may limit the generalizability of the findings. Some method evaluations in the included SRs focused only on quantitative analysis components and MA conclusions. As such, the applicability of these findings to SRs more broadly is still unclear. 28 Considering the methodological nature of our overview, limiting the inclusion of SRs according to the Cochrane criteria might have resulted in missing some relevant evidence from reviews without a quality assessment component. 12 , 13 , 29 Although the included SRs performed some form of quality appraisal of the included studies, most of them did not use a standardized RoB tool, which may affect the confidence in their conclusions. Due to the type of outcome measures used for the method evaluations in the primary studies and the included SRs, some of the identified methods have not been validated against a reference standard.

Some limitations in the overview process must be noted. While our literature search was comprehensive, covering five bibliographic databases and a supplementary search of reference lists, no gray literature sources or other evidence resources were searched. Also, the search was primarily conducted in health databases, which might have resulted in missing SRs published in other fields. Moreover, only English-language SRs were included, for feasibility. As the literature search retrieved a large number of citations (i.e., 41,556), the title and abstract screening was performed by a single reviewer, calibrated for consistency in the screening process by another reviewer, owing to time and resource limitations. These factors might have resulted in some errors in retrieving and selecting relevant SRs. The SR methods were grouped based on key elements of each recommended SR step, as agreed by the authors. This categorization pertains to the identified set of methods and should be considered subjective.

5. CONCLUSIONS

This overview identified limited SR‐level evidence on various methodological approaches currently employed during five of the seven fundamental steps in the SR process. Limited evidence was also identified on some methodological modifications currently used to expedite the SR process. Overall, findings highlight the dearth of SRs on SR methodologies, warranting further work to confirm several current recommendations on conventional and expedited SR processes.

CONFLICT OF INTEREST

The authors declare no conflicts of interest.

Supporting information

APPENDIX A: Detailed search strategies

ACKNOWLEDGMENTS

The first author is supported by a La Trobe University Full Fee Research Scholarship and a Graduate Research Scholarship.

Open Access Funding provided by La Trobe University.

Veginadu P, Calache H, Gussy M, Pandian A, Masood M. An overview of methodological approaches in systematic reviews. J Evid Based Med. 2022;15:39–54. doi:10.1111/jebm.12468

  • Open access
  • Published: 03 July 2024

The impact of evidence-based nursing leadership in healthcare settings: a mixed methods systematic review

  • Maritta Välimäki 1 , 2 ,
  • Shuang Hu 3 ,
  • Tella Lantta 1 ,
  • Kirsi Hipp 1 , 4 ,
  • Jaakko Varpula 1 ,
  • Jiarui Chen 3 ,
  • Gaoming Liu 5 ,
  • Yao Tang 3 ,
  • Wenjun Chen 3 &
  • Xianhong Li 3  

BMC Nursing volume 23, Article number: 452 (2024)


The central component in impactful healthcare decisions is evidence. Understanding how nurse leaders use evidence in their own managerial decision making is still limited. This mixed methods systematic review aimed to examine how evidence is used to solve leadership problems and to describe the measured and perceived effects of evidence-based leadership on nurse leaders and their performance, organizational, and clinical outcomes.

We included articles using any type of research design. Nurse leaders were defined as nurses, nurse managers, or other nursing staff working in a healthcare context who attempt to influence the behavior of individuals or a group in an organization using an evidence-based approach. Seven databases were searched up to 11 November 2021. The JBI Critical Appraisal Checklist for Quasi-experimental Studies, the JBI Critical Appraisal Checklist for Case Series, and the Mixed Methods Appraisal Tool were used to evaluate the risk of bias in quasi-experimental studies, case series, and mixed methods studies, respectively. The JBI approach to mixed methods systematic reviews was followed, and a parallel-results convergent approach to synthesis and integration was adopted.

Thirty-one publications were eligible for the analysis: case series ( n  = 27), mixed methods studies ( n  = 3), and a quasi-experimental study ( n  = 1). All studies were included regardless of methodological quality. Leadership problems were related to the implementation of knowledge into practice, the quality of nursing care, and resource availability. Organizational data were used in 27 studies to understand leadership problems, scientific evidence from the literature was sought in 26 studies, and stakeholders' views were explored in 24 studies. Perceived and measured effects of evidence-based leadership focused on nurses' performance, organizational outcomes, and clinical outcomes. Economic data were not available.

Conclusions

This is the first systematic review to examine how evidence is used to solve leadership problems and to describe its measured and perceived effects across different sites. Although a variety of perceptions and effects were identified regarding nurses' performance as well as organizational and clinical outcomes, the available knowledge concerning evidence-based leadership is currently insufficient. Therefore, more high-quality research, including clinical trial designs, is still needed.

Trial registration

The study was registered (PROSPERO CRD42021259624).


Global health demands have set new roles for nurse leaders [ 1 ]. Nurse leaders are referred to as nurses, nurse managers, or other nursing staff working in a healthcare context who attempt to influence the behavior of individuals or a group based on goals that are congruent with organizational goals [ 2 ]. They are seen as professionals "armed with data and evidence, and a commitment to mentorship and education", and as a group in which "leaders innovate, transform, and achieve quality outcomes for patients, health care professionals, organizations, and communities" [ 3 ]. Effective leadership occurs when team members critically follow leaders and are motivated by a leader's decisions based on the organization's requests and targets [ 4 ]. On the other hand, problems caused by poor leadership may also occur, regarding staff relations, stress, sickness, or retention [ 5 ]. Therefore, leadership requires an understanding of the different problems to be solved by synthesizing evidence from research, clinical expertise, and stakeholders' preferences [ 6 , 7 ]. If based on evidence, leadership decisions, also referred to as leadership decision making [ 8 ], could ensure adequate staffing [ 7 , 9 ] and produce sufficient and cost-effective care [ 10 ]. However, nurse leaders still rely in their decision making on their personal [ 11 ] and professional experience [ 10 ] rather than on research evidence, which can lead to deficiencies in the quality and safety of care delivery [ 12 , 13 , 14 ]. As all nurses should demonstrate leadership in their profession, their leadership competencies should be strengthened [ 15 ].

Evidence-informed decision-making, referred to as evidence appraisal and application and the evaluation of decisions [ 16 ], has been recognized as one of the core competencies for leaders [ 17 , 18 ]. The role of evidence in nurse leaders' managerial decision making has been promoted by public authorities [ 19 , 20 , 21 ]. Evidence-based management, another concept related to evidence-based leadership, has been promoted as having the potential to improve healthcare services [ 22 ]. It can guide nursing leaders in developing working conditions, staff retention, implementation practices, strategic planning, patient care, and the success of leadership [ 13 ]. Collins and Holton [ 23 ], in their systematic review and meta-analysis, examined 83 studies regarding leadership development interventions. They found that leadership training can result in significant improvement in participants' skills, especially at the knowledge level, although the training effects varied across studies. Cummings et al. [ 24 ] reviewed 100 papers (93 studies) and concluded that participation in leadership interventions had a positive impact on the development of a variety of leadership styles. Clavijo-Chamorro et al. [ 25 ], in their review of 11 studies, focused on leadership-related factors that facilitate evidence implementation: teamwork, organizational structures, and transformational leadership. The role of nurse managers was to facilitate evidence-based practices by transforming contexts to motivate the staff and move toward a shared vision of change.

As far as we are aware, however, only a few systematic reviews have focused on evidence-based leadership or related concepts in the healthcare context with the aim of analysing how nurse leaders themselves use evidence in the decision-making process. Young [ 26 ] targeted definitions and acceptance of evidence-based management (EBMgt) in healthcare, while Hasanpoor et al. [ 22 ] identified facilitators and barriers, sources of evidence used, and the role of evidence in the process of decision making. Both reviews concluded that EBMgt is of great importance but is used to a limited extent in healthcare settings due to a lack of time, a lack of research management activities, and policy constraints. A review by Williams [ 27 ] showed that the use of evidence to support management decision making is marginal due to a shortage of relevant evidence. Fraser [ 28 ], in their review, further indicated that the potential of evidence-based knowledge is not used in decision making by leaders as effectively as it could be. Non-use of evidence occurs, and leaders base their decisions mainly on single studies, real-world evidence, and experts' opinions [ 29 ]. Systematic reviews and meta-analyses rarely provide evidence on management-related interventions [ 30 ]. Tate et al. [ 31 ] concluded, based on their systematic review and meta-analysis, that the ability of nurse leaders to use and critically appraise research evidence may influence the way policy is enacted and how resources and staff are used to meet certain objectives set by policy. This can further influence staff and workforce outcomes. It is therefore important that nurse leaders have the capacity and motivation to use the strongest evidence available to effect change and guide their decision making [ 27 ].

Despite a growing body of evidence, we found only one review focusing on the impact of evidence-based knowledge. Geert et al. [ 32 ] reviewed literature from 2007 to 2016 to understand the elements of design, delivery, and evaluation of leadership development interventions that are most reliably linked to outcomes at the level of the individual and the organization, and that are of most benefit to patients. The authors concluded that it is possible to improve individual-level outcomes among leaders, such as knowledge, motivation, skills, and behavior change, using evidence-based approaches. Some of the most effective interventions included, for example, interactive workshops, coaching, action learning, and mentoring. However, these authors found limited research evidence describing how nurse leaders themselves use evidence to support their managerial decisions in nursing and what the outcomes are.

To fill this knowledge gap and complement the existing knowledge base, in this mixed methods review we aimed to (1) examine what leadership problems nurse leaders solve using an evidence-based approach and (2) how they use evidence to solve these problems. We also explored the (3) measured and (4) perceived effects of the evidence-based leadership approach in healthcare settings. Both qualitative and quantitative components of the effects of evidence-based leadership were examined to provide greater insight into the available literature [ 33 ]. Together with knowledge about the evidence-based leadership approach and its impact on nursing [ 34 , 35 ], the knowledge gained in this review can be used to inform clinical policy or organizational decisions [ 33 ]. The study is registered (PROSPERO CRD42021259624). The methods used in this review were specified in advance and documented a priori in a published protocol [ 36 ]. Key terms of the review and the search terms are defined in Table  1 (population, intervention, comparison, outcomes, context, other).

In this review, we used a mixed methods approach [ 37 ]. A mixed methods systematic review was selected as this approach has the potential to produce findings of direct relevance to policy makers and practitioners [ 38 ]. Johnson and Onwuegbuzie [ 39 ] have defined mixed methods research as "the class of research in which the researcher mixes or combines quantitative and qualitative research techniques, methods, approaches, concepts or language into a single study." Therefore, we combined quantitative and narrative analysis to appraise and synthesize empirical evidence, and we held them as equally important in informing clinical policy or organizational decisions [ 34 ]. In this review, a comprehensive synthesis of quantitative and qualitative data was performed first and then discussed in the Discussion section (parallel-results convergent design) [ 40 ]. We hoped that the different types of analysis would complement each other and that a deeper picture of the topic, in line with our research questions, could be gained [ 34 ].

Inclusion and exclusion criteria

Inclusion and exclusion criteria of the study are described in Table  1 .

Search strategy

A three-step search strategy was utilized. First, an initial limited search of MEDLINE was undertaken, followed by an analysis of the words used in the titles, abstracts, and key index terms of retrieved articles. Second, the search strategy, including the identified keywords and index terms, was adapted for each included database, and a second search was undertaken on 11 November 2021. The full search strategy for each database is described in Additional file 1 . Third, the reference lists of all studies included in the review were screened for additional studies. No year limits or language restrictions were used.

Information sources

The database search included the following: CINAHL (EBSCO), Cochrane Library (academic database for medicine and health science and nursing), Embase (Elsevier), PsycINFO (EBSCO), PubMed (MEDLINE), Scopus (Elsevier), and Web of Science (academic database across all scientific and technical disciplines, ranging from medicine and social sciences to arts and humanities). These databases were selected as they represent typical databases in the health care context. Subject headings from each of the databases were included in the search strategies. Boolean operators 'AND' and 'OR' were used to combine the search terms. An information specialist from the University of Turku Library was consulted in the formulation of the search strategies.
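The review's actual strategies are given in Additional file 1; purely as an illustration of how subject headings and free-text keywords are combined with Boolean operators, a hypothetical PubMed-style fragment (the specific terms and field tags below are assumptions, not the strategy used in this review) might look like:

```python
# Hypothetical PubMed-style query fragment (illustrative assumption only;
# the review's real search strategies are reported in Additional file 1).
query = (
    '("nurse administrators"[MeSH Terms] OR nurse leader*[Title/Abstract]) '
    'AND ("evidence-based practice"[MeSH Terms] OR "evidence based"[Title/Abstract]) '
    'AND ("decision making"[MeSH Terms] OR "decision making"[Title/Abstract])'
)
print(query)
```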

Study selection

All identified citations were collated and uploaded into Covidence software (Covidence systematic review software, Veritas Health Innovation, Melbourne, Australia; www.covidence.org ), and duplicates were removed by the software. Titles and abstracts were screened and assessed against the inclusion criteria independently by two reviewers out of four (MV, KH, TL, WC), and any discrepancies were resolved by a third reviewer. Studies meeting the inclusion criteria were retrieved in full and archived in Covidence. Access to one full-text article was lacking: the authors of that study were contacted about the missing full text, but no full text was received. The full texts of all remaining studies were retrieved and assessed independently against the inclusion criteria by two of four independent reviewers (MV, KH, TL, WC). Studies that did not meet the inclusion criteria were excluded, and the reasons for exclusion were recorded in Covidence. Any disagreements that arose between the reviewers were resolved through discussion with XL.
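Covidence removed duplicates automatically; as a rough, assumption-flagged illustration of the kind of normalized matching that deduplication tools generally rely on (not a description of Covidence's internals), duplicate citations can be grouped by a normalized title key:

```python
# Rough sketch of normalization-based deduplication (illustrative assumption,
# not how Covidence actually implements it): group citations by a normalized
# title key and keep the first record in each group.
import re

def title_key(title: str) -> str:
    """Normalize a title for duplicate matching (lowercase, strip punctuation)."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()

records = [
    {"title": "Evidence-based leadership in nursing."},
    {"title": "Evidence based leadership in nursing"},   # duplicate of the first
    {"title": "A different study"},
]

seen, unique = set(), []
for rec in records:
    key = title_key(rec["title"])
    if key not in seen:
        seen.add(key)
        unique.append(rec)

print(len(unique), "unique records")   # 2 unique records
```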

Assessment of methodological quality

Eligible studies were critically appraised by two independent reviewers (YT, SH). Standardized critical appraisal instruments based on the study design were used. First, quasi-experimental studies were assessed using the JBI Critical Appraisal Checklist for Quasi-experimental studies [ 44 ]. Second, case series were assessed using the JBI Critical Appraisal Checklist for Case Series [ 45 ]. Third, mixed methods studies were appraised using the Mixed Methods Appraisal Tool [ 46 ].

To increase inter-reviewer reliability, the review agreement was calculated (SH) [ 47 ]. A kappa value greater than 0.8 was considered to represent a high level of agreement (on the 0–1 scale). In our data, the agreement was 0.75. Discrepancies between the two reviewers were resolved through discussion and modifications and were confirmed by XL. As an outcome, studies that met the inclusion criteria proceeded to critical appraisal and were assessed as suitable for inclusion in the review. The scores for each item and the overall critical appraisal scores are presented.
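For readers unfamiliar with the agreement statistic used here, the following minimal sketch (with made-up screening decisions, not the review's data) shows how Cohen's kappa is computed for two reviewers:

```python
# Minimal sketch of Cohen's kappa for two reviewers' include/exclude decisions
# (made-up data for illustration; not the agreement data from this review).
from collections import Counter

reviewer_a = ["include", "exclude", "exclude", "include", "exclude", "exclude", "include", "exclude"]
reviewer_b = ["include", "exclude", "include", "include", "exclude", "exclude", "include", "exclude"]

n = len(reviewer_a)
observed = sum(a == b for a, b in zip(reviewer_a, reviewer_b)) / n  # raw agreement

# Chance-expected agreement from each reviewer's marginal category counts.
counts_a, counts_b = Counter(reviewer_a), Counter(reviewer_b)
expected = sum(counts_a[c] * counts_b[c] for c in set(reviewer_a) | set(reviewer_b)) / n**2

kappa = (observed - expected) / (1 - expected)
print(f"observed agreement = {observed:.2f}, kappa = {kappa:.2f}")
```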

Data extraction

For data extraction, specific tables were created. First, study characteristics (author(s), year, country, design, number of participants, setting) were extracted by two authors independently (JC, MV) and reviewed by TL. Second, descriptions of the interventions were extracted by two reviewers (JV, JC) using the structure of the TIDieR (Template for Intervention Description and Replication) checklist (brief name, the goal of the intervention, material and procedure, models of delivery and location, dose, modification, adherence and fidelity) [ 48 ]. The extractions were confirmed (MV).

Third, due to a lack of effectiveness data and wide heterogeneity between study designs and the presentation of outcomes, no attempt was made to pool the quantitative data statistically; the findings of the quantitative data are presented in narrative form only [ 44 ]. Separate data extraction tables for each research question were designed specifically for this study. For both qualitative studies (and the qualitative components of mixed methods studies) and quantitative studies, the data were extracted and tabulated into text format according to the preplanned research questions [ 36 ]. To test the quality of the tables and the data extraction process, three authors independently extracted the data from the first five studies (in alphabetical order). After that, the authors came together to determine whether their data extraction approaches were consistent with each other's output and whether the content of each table was in line with the research questions. No reason was found to modify the data extraction tables or the planned process. After consensus on the data extraction process was reached, the data were extracted in pairs by independent reviewers (WC, TY, SH, GL). Any disagreements that arose between the reviewers were resolved through discussion with a third reviewer (MV).

Data analysis

We were not able to conduct a meta-analysis due to the lack of effectiveness data from clinical trials. Instead, we used inductive thematic analysis with constant comparison to answer the research questions [ 46 , 49 ], using tabulated primary data from qualitative and quantitative studies as reported by the original authors in narrative form only [ 47 ]. In addition, a qualitizing process was used to transform quantitative data into qualitative data; this helped us to convert the whole dataset into themes and categories. We then applied thematic analysis to the narrative data as follows. First, the text was read carefully, line by line, to reveal topics answering each specific review question (MV). Second, the data were coded, and themes were formed by data categorization. The process of deriving the themes was inductive, based on constant comparison [ 49 ]. The results of the thematic analysis and data categorization were first described in narrative format, and then the total number of studies in which each specific category was identified was calculated (%).
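The final quantification step described above is simple tallying; as a minimal sketch with hypothetical categories (not the review's actual coding), it amounts to:

```python
# Minimal sketch of the tallying step (hypothetical coded categories, not the
# review's actual coding): count how many studies mention each theme and
# express it as a percentage of all included studies.
themes_by_study = {
    "study_01": {"organizational evidence", "scientific evidence"},
    "study_02": {"organizational evidence"},
    "study_03": {"stakeholder views", "scientific evidence"},
}

n_studies = len(themes_by_study)
counts = {}
for themes in themes_by_study.values():
    for theme in themes:
        counts[theme] = counts.get(theme, 0) + 1

for theme, count in sorted(counts.items()):
    print(f"{theme}: {count}/{n_studies} studies ({count / n_studies:.0%})")
```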

Stakeholder involvement

The method of reporting stakeholders' involvement follows the key components described by [ 50 ]: (1) people involved, (2) geographical location, (3) how people were recruited, (4) format of involvement, (5) amount of involvement, (6) ethical approval, (7) financial compensation, and (8) methods for reporting involvement.

In our review, stakeholder involvement targeted nurses and nurse leaders in China. The Nurse Directors of two hospitals recommended potential participants, who received a personal invitation letter from the researchers to participate in a discussion meeting. Stakeholders' participation was based on their own free will. Due to COVID-19, one online meeting (1 h) was organized (25 May 2022). Eleven participants joined the meeting. Ethical approval was not applied for, and no financial compensation was offered. At the end of the meeting, experiences of stakeholders' involvement were explored.

The meeting started with an introductory PowerPoint presentation. The rationale, methods, and preliminary review results were shared with the participants [ 51 ]. The meeting continued with general questions for the participants: (1) Are you aware of the concepts of evidence-based practice or evidence-based leadership?; (2) How important is it to use evidence to support decisions among nurse leaders?; (3) How is the evidence-based approach used in hospital settings?; and (4) What type of evidence is currently used to support nurse leaders' decision making (e.g., scientific literature, organizational data, stakeholder views)?

Two people took notes on the course and content of the conversation. The notes were later transcribed verbatim, and the key points of the discussions were summarised. Although the answers offered by the stakeholders were very short, the information was useful for validating the preliminary content of the results, adding to the rigor of the review, and obtaining additional perspectives. A recommendation from the stakeholders was incorporated into the Discussion section of this review, increasing the applicability of the review in the real world [ 50 ]. At the end of the discussion, participants were asked about the value of stakeholders' involvement. Participants shared that the experience of participating was unique and that the topic of discussion was challenging. Two authors of the review group further represented stakeholders by working together with the research team throughout the review study.

Search results

From seven different electronic databases, 6053 citations were identified as potentially relevant to the review. Then, 3133 duplicates were removed by an automation tool (Covidence: www.covidence.org ), and one was removed manually. The titles and abstracts of 3040 citations were reviewed, and a total of 110 full texts were included (one extra citation was found in a reference list but later excluded). Based on the eligibility criteria, 31 studies (32 hits) were critically appraised and deemed suitable for inclusion in the review. The search results and selection process are presented in the PRISMA [ 52 ] flow diagram in Fig.  1 . The full list of references for the included studies can be found in Additional file 2 . To avoid confusion between articles in the reference list and studies included in the analysis, the studies included in the review are referred to inside the article using the reference number of each study (e.g., ref 1, ref 2).

Figure 1. Search results and study selection and inclusion process [ 52 ]

Characteristics of included studies

The studies had multiple purposes, aiming to develop practice, implement a new approach, improve quality, or develop a model. The 31 studies (across 32 hits) comprised case series studies ( n  = 27), mixed methods studies ( n  = 3), and a quasi-experimental study ( n  = 1). All studies were published between 2004 and 2021. The highest number of papers was published in 2020.

Table  2 describes the characteristics of included studies and Additional file 3 offers a narrative description of the studies.

Methodological quality assessment

Quasi-experimental studies.

We had one quasi-experimental study (ref 31). All questions in the critical appraisal tool were applicable. The total score of the study was 8 (out of a possible 9). Only one response of the tool was ‘no’ because no control group was used in the study (see Additional file 4 for the critical appraisal of included studies).

Case series studies . A case series study is typically defined as a collection of subjects with common characteristics. The studies do not include a comparison group and are often based on prevalent cases and a sample of convenience [ 53 ]. Munn et al. [ 45 ] further claim that case series are best described as observational studies lacking experimental and randomized characteristics; they are descriptive studies without a control or comparator group. Of the 27 case series studies included in our review, the critical appraisal scores varied from 1 to 9. Five references were conference abstracts with empirical study results, which scored from 1 to 3. Full reports of these studies were searched for in electronic databases but not found. Critical appraisal scores for the remaining 22 studies ranged from 1 to 9 out of a possible 10. One question (Q3) was not applicable to 13 studies: "Were valid methods used for identification of the condition for all participants included in the case series?" Only two studies clearly reported the demographics of the participants (Q6). Twenty studies met criterion 8 ("Were the outcomes or follow-up results of cases clearly reported?") and 18 studies met criterion 7 ("Was there clear reporting of clinical information of the participants?") (see Additional file 4 for the critical appraisal of included studies).

Mixed-methods studies

Mixed-methods studies involve a combination of qualitative and quantitative methods. This is a common design and includes convergent, sequential explanatory, and sequential exploratory designs [ 46 ]. There were three mixed-methods studies. The critical appraisal scores for the three studies ranged from 60% to 100% of a possible 100%. Two studies met all the criteria, while one study fulfilled 60% of the scored criteria due to a lack of information needed to judge whether the sampling strategy was relevant to the research question (Q4.1) or whether the risk of nonresponse bias was low (Q4.4) (see Additional file 4 for the critical appraisal of included studies).

Intervention or program components

The intervention or program components were categorized and described using the TIDieR checklist: name and goal, theory or background, material, procedure, provider, models of delivery, location, dose, modification, and adherence and fidelity [ 48 ]. A description of the intervention in each study is provided in Additional file 5 and a narrative description in Additional file 6 .

Leadership problems

In line with the inclusion criteria, data for the leadership problems were categorized in all 31 included studies (see Additional file 7 for leadership problems). Three types of leadership problems were identified: implementation of knowledge into practice, the quality of clinical care, and resources in nursing care. A narrative summary of the results is reported below.

Implementing knowledge into practice

Eleven studies (35%) aimed to solve leadership problems related to the implementation of knowledge into practice. Studies showed how to support nurses in evidence-based practice (EBP) implementation (ref 3, ref 5), how to engage nurses in using evidence in practice (ref 4), how to convey the importance of EBP (ref 22), or how to change practice (ref 4). Other problems were how to facilitate nurses' use of guideline recommendations (ref 7) and how nurses can make evidence-informed decisions (ref 8). General concerns also included the linkage between theory and practice (ref 1) as well as how to implement the EBP model in practice (ref 6). In addition, studies were motivated by the need for revisions or updates of protocols to improve clinical practice (ref 10) as well as the need to standardize nursing activities (ref 11, ref 14).

The quality of the care

Thirteen studies (42%) focused on solving problems related to the quality of clinical care. In these studies, a high number of catheter infections led to a failure to achieve organizational goals (ref 2, ref 9). A need to reduce patient symptoms in stem cell transplant patients undergoing high-dose chemotherapy (ref 24) was also one of the problems to be solved. In addition, the projects focused on how to prevent pressure ulcers (ref 26, ref 29), how to enhance the quality of cancer treatment (ref 25), and how to reduce the need for invasive constipation treatment (ref 30). Concerns about patient safety (ref 15), high fall rates (ref 16, ref 19), and dissatisfaction of patients (ref 16, ref 18) and nurses (ref 16, ref 30) were also problems that had initiated the projects. Studies further addressed how to promote good contingency care in residential aged care homes (ref 20) and how to increase recognition of human trafficking problems in healthcare (ref 21).

Resources in nursing care

Nurse leaders identified problems in their resources, especially staffing problems. These problems were identified in seven studies (23%), which involved concerns about how to prevent nurses from leaving the job (ref 31), how to ensure appropriate recruitment, staffing, and retention of nurses (ref 13), and how to decrease nurses' burden and time spent on nursing activities (ref 12). Leadership turnover was also reported as a source of dissatisfaction (ref 17); studies addressed a lack of structured transition and training programs, which led to turnover (ref 23), as well as how to improve intershift handoff among nurses (ref 28). The optimal design for new hospitals was also examined (ref 27).

Main features of evidence-based leadership

Out of 31 studies, 17 (55%) included all four domains of an evidence-based leadership approach, and four studies (13%) included evidence of critical appraisal of the results (ref 11, ref 14, ref 23, ref 27) (see Additional file 8 for the main features of evidence-based leadership).

Organizational evidence

Twenty-seven studies (87%) reported how organizational evidence was collected and used to solve leadership problems (ref 2). Retrospective chart reviews (ref 5), a review of the extent of specific incidents (ref 19), and chart auditing (ref 7, ref 25) were conducted. A gap between guideline recommendations and actual care was identified using organizational data (ref 7) while the percentage of nurses’ working time spent on patient care was analyzed using an electronic charting system (ref 12). Internal data (ref 22), institutional data, and programming metrics were also analyzed to understand the development of the nurse workforce (ref 13).

Surveys (ref 3, ref 25), interviews (ref 3, ref 25), and group reviews (ref 18) were used to better understand the leadership problem to be solved. Employee opinion surveys on leadership (ref 17), a nurse satisfaction survey (ref 30), and a variety of reporting templates (ref 28) were used for data collection. Sometimes, leadership problems were identified by evidence facilitators or a PI's team who worked with staff members (ref 15, ref 17). Problems in clinical practice were also identified by the Nursing Professional Council (ref 14), managers (ref 26), or nurses themselves (ref 24). Current practices were reviewed (ref 29) and a gap analysis was conducted (ref 4, ref 16, ref 23), together with a SWOT analysis (ref 16). In addition, hospital mission and vision statements, the established research culture, and the proportion of nursing alumni with formal EBP training were analyzed (ref 5). On the other hand, one study stated that no systematic hospital-specific sources of data regarding job satisfaction or organizational commitment were used (ref 31). In addition, statements of organizational analysis were used at a general level only (ref 1).

Scientific evidence identified

Twenty-six studies (84%) reported the use of scientific evidence in their evidence-based leadership processes. A literature search was conducted (ref 21), and questions, PICO elements, and keywords were identified (ref 4) in collaboration with a librarian. Electronic databases, including PubMed (ref 14, ref 31), Cochrane, and EMBASE (ref 31), were searched. Galiano (ref 6) used Wiley Online Library, Elsevier, CINAHL, Health Source: Nursing/Academic Edition, PubMed, and the Cochrane Library, while Hoke (ref 11) conducted an electronic search using CINAHL and PubMed to retrieve articles.

Identified journals were reviewed manually (ref 31). The findings were summarized using ‘elevator speech’ (ref 4). In a study by Gifford et al. (ref 9) evidence facilitators worked with participants to access, appraise, and adapt the research evidence to the organizational context. Ostaszkiewicz (ref 20) conducted a scoping review of literature and identified and reviewed frameworks and policy documents about the topic and the quality standards. Further, a team of nursing administrators, directors, staff nurses, and a patient representative reviewed the literature and made recommendations for practice changes.

Clinical practice guidelines were also used to offer scientific evidence (ref 7, ref 19). Evidence was further retrieved from a combination of nursing policies, guidelines, journal articles, and textbooks (ref 12), as well as from published guidelines and literature (ref 13). Internal evidence, professional practice knowledge, and relevant theories and models were synthesized (ref 24), while another study (ref 25) reviewed individual studies and synthesized them with systematic reviews or clinical practice guidelines. The team reviewed the research evidence (ref 3, ref 15) or conducted a literature review (ref 22, ref 28, ref 29), a literature search (ref 27), a systematic review (ref 23), a review of the literature (ref 30), or 'the scholarly literature was reviewed' (ref 18). In addition, 'an extensive literature review of evidence-based best practices was carried out' (ref 10). However, a detailed description of how each review was conducted was lacking.

Views of stakeholders

A total of 24 studies (77%) reported methods for how the views of stakeholders, i.e., professionals or experts, were considered. Support to run this study was received from nursing leadership and multidisciplinary teams (ref 29). Experts and stakeholders joined the study team in some cases (ref 25, ref 30), and in other studies, their opinions were sought to facilitate project success (ref 3). Sometimes a steering committee was formed by a Chief Nursing Officer and Clinical Practice Specialists (ref 2). More specifically, stakeholders’ views were considered using interviews, workshops and follow-up teleconferences (ref 7). The literature review was discussed with colleagues (ref 11), and feedback and support from physicians as well as the consensus of staff were sought (ref 16).

A summary of the project findings and suggestions for the studies were discussed at 90-minute weekly meetings by 11 charge nurses. Nurse executive directors were consulted over a 10-week period (ref 31). An implementation team (nurse, dietician, physiotherapist, occupational therapist) was formed to support the implementation of evidence-based prevention measures (ref 26). Stakeholders volunteered to join in the pilot implementation (ref 28), or a stakeholder team met to determine the best strategy for change management, discuss shortcomings in evidence-based criteria, and plan strategies to address those areas (ref 5). Nursing leaders, staff members (ref 22), 'process owners' (ref 18), and program team members (ref 18, ref 19, ref 24) met regularly to discuss the problems. Critical input was sought from clinical educators, physicians, nutritionists, pharmacists, and nurse managers (ref 24). The unit director and senior nursing staff reviewed the contents of the product, and the final version of the clinical pathways was reviewed and approved by the Quality Control Commission of the Nursing Department (ref 12). In addition, two co-design workshops with 18 residential aged care stakeholders were organized to explore their perspectives about factors to include in a model prototype (ref 20). Further, an agreement with stakeholders on implementing continuous quality services within an open relationship was reached (ref 1).

Critical appraisal

In five studies (16%), a critical appraisal targeting the literature search was carried out. The appraisals were conducted by interns and teams who critiqued the evidence (ref 4). In Hoke's study, four areas that had emerged in the literature were critically reviewed (ref 11). Other methods were to 'critically appraise the search results' (ref 14). Journal club team meetings (ref 23) were organized to grade the level and quality of evidence, and one team 'critically appraised relevant evidence' (ref 27). However, the studies lacked details of how the appraisals were conducted.

The perceived effects of evidence-based leadership

Perceived effects of evidence-based leadership on nurses’ performance.

Eleven studies (35%) described perceived effects of evidence-based leadership on nurses' performance (see Additional file 9 for perceived effects of evidence-based leadership), which were categorized into four groups: awareness and knowledge, competence, ability to understand patients' needs, and engagement. First, regarding 'awareness and knowledge', different projects provided nurses with new learning opportunities (ref 3). Staff knowledge (ref 20, ref 28), skills, and education levels improved (ref 20), as did nurses' knowledge comprehension (ref 21). Second, interventions and approaches focusing on management and leadership positively influenced participants' competence to improve the quality of services. Their confidence level (ref 1) and motivation to change practice increased, their self-esteem improved, and they were more positive and enthusiastic in their work (ref 22). Third, some nurses were relieved that they had learned to better handle patients' needs (ref 25). For example, a systematic work approach increased nurses' awareness of the patients who were at risk of developing health problems (ref 26). Finally, nurse leaders were more engaged with staff, encouraging them to adopt the new practices and recognizing their efforts to change (ref 8).

Perceived effects on organizational outcomes

Nine studies (29%) described the perceived effects of evidence-based leadership on organizational outcomes (see Additional file 9 for perceived effects of evidence-based leadership). These were categorized into three groups: use of resources, staff commitment, and team effort. First, more appropriate use of resources was reported (ref 15, ref 20), and working time was used more efficiently (ref 16). In general, a structured approach made implementing change more manageable (ref 1). On the other hand, at the beginning of the change process, the feedback from nurses was unfavorable, and they experienced discomfort with the new work style (ref 29). New approaches were also perceived as time consuming (ref 3). Second, nurse leaders believed that fewer nursing staff than expected left the organization over the course of the study (ref 31). Third, the project helped staff in their efforts to make changes, and it validated the importance of working as a team (ref 7). Collaboration and support between the nurses increased (ref 26). On the other hand, the new work style caused challenges in teamwork (ref 3).

Perceived effects on clinical outcomes

Five studies (16%) reported the perceived effects of evidence-based leadership on clinical outcomes (see Additional file 9 for perceived effects of evidence-based leadership), which were categorized into two groups: general patient outcomes and specific clinical outcomes. First, in general terms, one project helped to connect guideline recommendations with patient outcomes (ref 7), and another was perceived as beneficial for patients overall, and especially for improving patient safety (ref 16). On the other hand, some nurses thought that the new working style did not work at all for patients (ref 28). Second, the new approach helped to optimize the management of patients’ clinical problems and to support person-centered care (ref 20). Bowel management, for example, received very good feedback (ref 30).

The measured effects of evidence-based leadership

The measured effects on nurses’ performance.

Data were obtained from 20 studies (65%) (see Additional file 10 for measured effects of evidence-based leadership), and nurse performance outcomes were categorized into three groups: awareness and knowledge, engagement, and satisfaction. First, six studies (19%) measured participants’ awareness and knowledge levels. An internship for staff nurses helped participants understand the process of using evidence-based practice, grow professionally, think innovatively, gain the knowledge needed to answer clinical questions with evidence-based practice, and complete an evidence-based practice project (ref 3). In evidence-based practice implementation programs, those with formal EBP training showed improvements in knowledge, attitude, confidence, awareness, and application after the intervention (ref 3, ref 11, ref 20, ref 23, ref 25). In contrast, in another study, attitudes towards EBP remained stable (p = 0.543), and the proportion of participants applying EBP decreased, although the differences over the years were not significant (p = 0.879) (ref 6).

Second, 10 studies (35%) described nurses’ engagement with new practices (ref 5, ref 6, ref 7, ref 10, ref 16, ref 17, ref 18, ref 21, ref 25, ref 27). Nine studies (29%) reported an improvement in participants’ compliance (ref 6, ref 7, ref 10, ref 16, ref 17, ref 18, ref 21, ref 25, ref 27). In contrast, in DeLeskey’s study (ref 5), although improvement was found in the documentation of post-operative nausea and vomiting (PONV) risk factors (from 2.5% to 63%) and in the communication of risk factors among anaesthesia and surgical staff (from 0% to 62%), the improvement did not achieve the goal. The reasons for the limited improvement were analysed: it was noted that only those patients who had been seen by the pre-admission testing nurse had risk assessments completed. Appropriate treatment/prophylaxis increased from 69% to 77% and from 30% to 49%, while routine assessment for PONV and rescue treatment (97% and 100% before the project) were both at 100% following the project. The results were discussed with staff, but further reasons for the lack of engagement in nursing care were not reported.

Third, six studies (19%) reported nurses’ satisfaction with project outcomes. The results showed that using evidence in managerial decisions improved nurses’ satisfaction and attitudes toward their organization (p < 0.05) (ref 31). Nurses’ overall job satisfaction improved as well (ref 17). Nurses’ satisfaction with the usability of the electronic charting system significantly improved after introduction of the intervention (ref 12). In a handoff project in seven hospitals, improvement was reported in all satisfaction indicators used in the study, although the level of improvement varied across units (ref 28). In addition, positive changes were reported in nurses’ ability to autonomously perform their job (‘How satisfied are you with the tools and resources available for you to treat and prevent patient constipation?’; 54%, n = 17 vs. 92%, n = 35, p < 0.001) (ref 30).

The measured effects on organizational outcomes

Thirteen studies (42%) described the effects of a project on organizational outcomes (see Additional file 10 for measured effects of evidence-based leadership), which were categorized into two groups: staff compliance and changes in practices. First, studies reported improved organizational outcomes due to better staff compliance with care (ref 4, ref 13, ref 17, ref 23, ref 27, ref 31). Second, changes in organizational practices were also described (ref 11), such as changes in patient documentation (ref 12, ref 21). Van Orne (ref 30) found a statistically significant reduction in the average rate of invasive medication administration between pre-intervention and post-intervention (p = 0.01). Salvador (ref 24) also reported an improvement in a proactive approach to mucositis prevention with an evidence-based oral care guide. On the other hand, concerns were also raised, such as not having enough time for the new bedside report (ref 16) or a lack of improvement in the assessment of diabetic ulcers (ref 8).

The measured effects on clinical outcomes

A variety of improvements in clinical outcomes were reported (see Additional file 10 for measured effects of evidence-based leadership), which were categorized into two groups: patient clinical status and patient satisfaction. First, improvements in patient clinical status were reported. The incidence of catheter-associated urinary tract infections (CAUTI) decreased by 27.8% between 2015 and 2019 (ref 2), while a patient-centered quality improvement project reduced CAUTI rates to zero (ref 10). A significant decrease in the MRSA transmission rate was also reported (ref 27), and in another study the incidence of central line-associated bloodstream infections (CLABSIs) dropped following the introduction of CHG bathing (ref 14). Further, patient nausea decreased from 18% to 5% and vomiting to 0% (ref 5), while the percentage of patients who left the hospital without being seen was below 2% after the project (ref 17). In addition, a significant reduction in the prevalence of pressure ulcers was found (ref 26, ref 29), and a significant reduction in mucositis severity and distress was achieved (ref 24). Patient fall rates decreased (ref 15, ref 16, ref 19, ref 27).

Second, patient satisfaction improved after project implementation (ref 28). The scale in which consumers assess healthcare providers showed improvement, but the changes were not statistically significant. Improvements in an emergency department leadership model and in methods of communication with patients improved patient satisfaction scores by 600% (ref 17). In addition, a new evidence-based unit improved patients’ experiences of the unit, although not all items improved significantly (ref 18).

Stakeholder involvement in the mixed-method review

To ensure stakeholder involvement in the review and the real-world relevance of our research [53], to achieve a higher level of meaning in our review results, and to gain new perspectives on our preliminary findings [50], a meeting with 11 stakeholders was organized. First, we asked whether participants were aware of the concepts of evidence-based practice or evidence-based leadership. Responses revealed that participants were familiar with the concept of evidence-based practice, but the topic of evidence-based leadership was entirely new. Examples of nurses’ and nurse leaders’ responses are as follows: “I have heard a concept of evidence-based practice but never a concept of evidence-based leadership.” Another participant described: “I have heard it [evidence-based leadership] but I do not understand what it means.”

Second, as stakeholder involvement is beneficial to the relevance and impact of health research [54], we asked how important evidence is to them in supporting decisions in health care services. One participant described it as follows: “Using evidence in decisions is crucial to the wards and also to the entire hospital.” Third, we asked how the evidence-based approach is used in hospital settings. Participants expressed that literature is commonly used to solve clinical problems in patient care but not to solve leadership problems: “In [patient] medication and care, clinical guidelines are regularly used. However, I am aware only a few cases where evidence has been sought to solve leadership problems.”

Last, we asked what types of evidence are currently used to support nurse leaders’ decision making (e.g., scientific literature, organizational data, stakeholder views). The participants were aware that different types of information were collected in their organization on a daily basis (e.g., patient satisfaction surveys). However, the information was seldom used to support decision making because nurse leaders did not know how to access it. Even so, the participants agreed that the use of evidence from different sources was important in approaching any leadership or managerial problem in the organization. Participants also suggested that all nurse leaders should receive systematic training on the topic, which could support the daily use of the evidence-based approach.

To our knowledge, this article represents the first mixed-methods systematic review to examine leadership problems, how evidence is used to solve them, and what the perceived and measured effects of evidence-based leadership are on nurse leaders and their performance, organizational, and clinical outcomes. This review has two key findings. First, the available research data suggest that evidence-based leadership has potential in the healthcare context, not only to improve knowledge and skills among nurses, but also to improve organizational outcomes and the quality of patient care. Second, remarkably little published research has explored the effects of evidence-based leadership using a rigorous trial design. We validated the preliminary results with nurse stakeholders and confirmed that nursing staff, especially nurse leaders, were not familiar with the concept of evidence-based leadership, nor were they used to implementing evidence in their leadership decisions. Our data were drawn from many databases, and we screened a large number of studies. We also checked existing registers and databases and found no registered or ongoing similar reviews. Therefore, our results are unlikely to change in the near future.

We found that, after identifying the leadership problems, 26 (84%) of the 31 studies used organizational data, 25 (81%) used scientific evidence from the literature, and 21 (68%) considered the views of stakeholders in attempting to understand specific leadership problems more deeply. However, only four studies critically appraised any of these findings. Considering previous critical assessments of nurse leaders’ use of evidence in their decision making [14, 30, 31, 34, 55], our results are still quite promising.

Our results support a previous systematic review by Geerts et al. [32], which concluded that it is possible to improve leaders’ individual-level outcomes, such as knowledge, motivation, skills, and behavior change, using evidence-based approaches. In particular, Collins and Holton [23] found that leadership training resulted in significant knowledge and skill improvements, although the effects varied widely across studies. In our study, evidence-based leadership was seen to enable changes in clinical practice, especially in patient care. On the other hand, we recognize that not all change efforts were successful [56, 57, 58]. An evidence-based approach can also provoke negative attitudes and feelings: negative emotions in participants have been reported in response to changes, such as discomfort with a new working style [59]. Another study reported inconvenience in using a new intervention and its potential risks for patient confidentiality. Sometimes making changes is more time consuming than continuing with current practice [60]. These findings may partially explain why new interventions or programs do not always fully achieve their goals. On the other hand, DuBose and Mayo [61] state that nurse leaders who are prepared with knowledge of resistance can minimize the potential negative consequences and capitalize on the powerful impact of change adaptation.

We found that only six studies used a specific model or theory to understand the mechanism of change that could guide leadership practices. Participants’ reactions to new approaches may be an important factor in predicting how a new intervention will be implemented in clinical practice. Therefore, stronger efforts should be made to better understand the use of evidence, how participants’ reactions, emotions, or practice changes could be predicted or supported using appropriate models or theories, and how the use of these models is linked with leadership outcomes. In this task, nurse leaders have an important role. At the same time, more responsibility for developing health services has been placed on the shoulders of nurse leaders, who may already be suffering from pressure and an increased burden at work. Working in a leadership position may also lead to role conflict. A study by Lalleman et al. [62] found that nurses were used to helping other people, often in ad hoc situations. The helping attitude of nurses combined with a structured managerial role may cause dilemmas, which may lead to stress. Many nurse leaders opt to leave their positions in less than 5 years [63]. To better fulfill the requirements of health services in the future, the role of nurse leaders in evidence-based leadership needs to be developed further to avoid ethical and practical dilemmas in their leadership practices.

It is worth noting that the perceived and measured effects did not strongly support each other but rather opened a new avenue for understanding evidence-based leadership. Specifically, several perceived effects (competence, ability to understand patients’ needs, use of resources, team effort, and specific clinical outcomes) were not corroborated by the measured effects, while several measured effects (nurses’ performance satisfaction, changes in practices, and satisfaction with clinical outcomes) were not reflected in the perceived effects. These findings may indicate that different outcomes emerge when the effects of evidence-based leadership are examined using different methodological approaches. Future studies using well-designed methods, including mixed-methods designs, are encouraged to examine the consistency between the perceived and measured effects of evidence-based leadership in health care.

There is potential in nursing to support change by demonstrating conceptual and operational commitment to research-based practices [64]. Nurse leaders are well positioned to influence and lead professional governance, quality improvement, service transformation, change, and shared governance [65]. In this task, evidence-based leadership could be key to solving deficiencies in the quality and safety of care [14] and inefficiencies in healthcare delivery [12, 13]. As the WHO has reported, there are about 28 million nurses worldwide, and the demand for nurses will put nursing resources in the spotlight [1]. Indeed, evidence could be used to find solutions to economic deficits and other problems through leadership skills. This is important because, when nurses are able to show leadership and control in their own work, they are less likely to leave their jobs [66]. On the other hand, based on our discussions with stakeholders, nurse leaders are not used to using evidence in their own work. Further, evidence-based leadership is not possible if nurse leaders do not have access to a relevant, robust body of evidence, adequate funding, resources, and organizational support, and evidence-informed decision making may only offer short-term solutions [55]. We still believe that implementing evidence-based strategies into the work of nurse leaders may create opportunities to protect this critical workforce from burnout or leaving the field [67]. However, the role of the evidence-based approach in helping nurse leaders solve these problems remains a key question.

Limitations

This study aimed to use a broad search strategy to ensure a comprehensive review; nevertheless, limitations exist. We may have missed studies not included in the major international databases. To keep the search results manageable, we did not use specific databases to systematically search the grey literature, although it is a rich source of evidence for systematic reviews and meta-analyses [68]. We did, however, include published conference abstracts and proceedings that appeared in our scientific databases. It has been stated that conference abstracts and proceedings with empirical study results make up a large share of the studies cited in systematic reviews [69]. At the same time, the limited space available for conference publications can lead to methodological issues that reduce the validity of review results [68]. We also found that the great majority of studies were carried out in Western countries, restricting the generalizability of the results beyond English-speaking countries. The study interventions and outcomes were too heterogeneous across studies to be meaningfully pooled using statistical methods; thus, our narrative synthesis could hypothetically be biased. To increase the transparency of the data and of all decisions made, the data, their categorization, and the conclusions are based on the original studies, are presented in separate tables, and can be found in the Additional files. Regarding the methodological approach [34], we used a mixed-methods systematic review, with the core intention of combining quantitative and qualitative data from primary studies. The aim was to create a breadth and depth of understanding that could confirm or dispute the evidence and ultimately answer the review question posed [34, 70]. Although the method is gaining traction due to its usefulness and practicality, guidance on combining quantitative and qualitative data in mixed-methods systematic reviews is still limited and largely at the theoretical stage [40]. It could therefore be argued that other methodologies, for example an integrative review, could have been used to combine diverse methodologies [71]. We still believe that the results of this mixed-methods review add value when compared with previous systematic reviews concerning leadership and an evidence-based approach.

Our mixed-methods review fills the gap regarding how nurse leaders themselves use evidence to guide their leadership role and what the measured and perceived impacts of evidence-based leadership are in nursing. Although the scarcity of controlled studies on this topic is concerning, the available research data suggest that evidence-based leadership interventions can improve nurse performance, organizational outcomes, and patient outcomes. Leadership problems are also well recognized in healthcare settings. More knowledge and a deeper understanding of the role of nurse leaders, and of how they can use evidence in their own managerial leadership decisions, are still needed. Despite the limited number of studies, we believe that this narrative synthesis provides a good foundation for developing evidence-based leadership in the future.

Implications

Based on our review results, several implications can be recommended. First, the future success of nursing depends on knowledgeable, capable, and strong leaders. Therefore, nurse leaders worldwide need to be educated about the best ways to manage challenging situations in healthcare contexts using an evidence-based approach in their decisions. This recommendation was also proposed by nurses and nurse leaders during our discussion meeting with stakeholders.

Second, curricula in educational organizations and on-the-job training for nurse leaders should be updated to support a general understanding of how to use evidence in leadership decisions. Third, patients and family members should be more involved in the evidence-based approach. It is therefore important that nurse leaders learn how to better consider the views of patients and family members as stakeholders in the evidence-based leadership approach.

Future studies should be prioritized as follows: establishing clear parameters for what constitutes evidence-based leadership and how it is measured; using theories or models in research to explain the mechanisms by which practice can be effectively changed; conducting robust effectiveness studies with trial designs to evaluate the impact of evidence-based leadership; studying the role of patients and family members in improving the quality of clinical care; and investigating the financial impact of an evidence-based leadership approach within the respective healthcare systems.

Data availability

The authors obtained all data for this review from published manuscripts.

World Health Organization. State of the world’s nursing 2020: investing in education, jobs and leadership. 2020. https://www.who.int/publications/i/item/9789240003279 . Accessed 29 June 2024.

Hersey P, Campbell R. Leadership: a behavioral science approach. The Center for; 2004.

Cline D, Crenshaw JT, Woods S. Nurse leader: a definition for the 21st century. Nurse Lead. 2022;20(4):381–4. https://doi.org/10.1016/j.mnl.2021.12.017 .

Chen SS. Leadership styles and organization structural configurations. J Hum Resource Adult Learn. 2006;2(2):39–46.

McKibben L. Conflict management: importance and implications. Br J Nurs. 2017;26(2):100–3.

Haghgoshayie E, Hasanpoor E. Evidence-based nursing management: basing Organizational practices on the best available evidence. Creat Nurs. 2021;27(2):94–7. https://doi.org/10.1891/CRNR-D-19-00080 .

Majers JS, Warshawsky N. Evidence-based decision-making for nurse leaders. Nurse Lead. 2020;18(5):471–5.

Tichy NM, Bennis WG. Making judgment calls. Harvard Business Rev. 2007;85(10):94.

Sousa MJ, Pesqueira AM, Lemos C, Sousa M, Rocha Á. Decision-making based on big data analytics for people management in healthcare organizations. J Med Syst. 2019;43(9):1–10.

Guo R, Berkshire SD, Fulton LV, Hermanson PM. Use of evidence-based management in healthcare administration decision-making. Leadersh Health Serv. 2017;30(3):330–42.

Liang Z, Howard P, Rasa J. Evidence-informed managerial decision-making: what evidence counts?(part one). Asia Pac J Health Manage. 2011;6(1):23–9.

Hasanpoor E, Janati A, Arab-Zozani M, Haghgoshayie E. Using the evidence-based medicine and evidence-based management to minimise overuse and maximise quality in healthcare: a hybrid perspective. BMJ evidence-based Med. 2020;25(1):3–5.

Shingler NA, Gonzalez JZ. EBM: a pathway to evidence-based nursing management. Nursing. 2017;47(2):43–6.

Farokhzadian J, Nayeri ND, Borhani F, Zare MR. Nurse leaders’ attitudes, self-efficacy and training needs for implementing evidence-based practice: is it time for a change toward safe care? Br J Med Med Res. 2015;7(8):662.

American Nurses Association. ANA leadership competency model. Silver Spring, MD; 2018.

Royal College of Nursing. Leadership skills. 2022. https://www.rcn.org.uk/professional-development/your-career/nurse/leadership-skills . Accessed 29 June 2024.

Kakemam E, Liang Z, Janati A, Arab-Zozani M, Mohaghegh B, Gholizadeh M. Leadership and management competencies for hospital managers: a systematic review and best-fit framework synthesis. J Healthc Leadersh. 2020;12:59.

Liang Z, Howard PF, Leggat S, Bartram T. Development and validation of health service management competencies. J Health Organ Manag. 2018;32(2):157–75.

World Health Organization. Global Strategic Directions for Nursing and Midwifery. 2021. https://apps.who.int/iris/bitstream/handle/10665/344562/9789240033863-eng.pdf . Accessed 29 June 2024.

NHS Leadership Academy. The nine leadership dimensions. 2022. https://www.leadershipacademy.nhs.uk/resources/healthcare-leadership-model/nine-leadership-dimensions/ . Accessed 29 June 2024.

Canadian Nurses Association. Evidence-informed decision-making and nursing practice: Position statement. 2018. https://hl-prod-ca-oc-download.s3-ca-central-1.amazonaws.com/CNA/2f975e7e-4a40-45ca-863c-5ebf0a138d5e/UploadedImages/documents/Evidence_informed_Decision_making_and_Nursing_Practice_position_statement_Dec_2018.pdf . Accessed 29 June 2024.

Hasanpoor E, Hajebrahimi S, Janati A, Abedini Z, Haghgoshayie E. Barriers, facilitators, process and sources of evidence for evidence-based management among health care managers: a qualitative systematic review. Ethiop J Health Sci. 2018;28(5):665–80.

Collins DB, Holton EF III. The effectiveness of managerial leadership development programs: a meta-analysis of studies from 1982 to 2001. Hum Res Dev Q. 2004;15(2):217–48.

Cummings GG, Lee S, Tate K, Penconek T, Micaroni SP, Paananen T, et al. The essentials of nursing leadership: a systematic review of factors and educational interventions influencing nursing leadership. Int J Nurs Stud. 2021;115:103842.

Clavijo-Chamorro MZ, Romero-Zarallo G, Gómez-Luque A, López-Espuela F, Sanz-Martos S, López-Medina IM. Leadership as a facilitator of evidence implementation by nurse managers: a metasynthesis. West J Nurs Res. 2022;44(6):567–81.

Young SK. Evidence-based management: a literature review. J Nurs Adm Manag. 2002;10(3):145–51.

Williams LL. What goes around comes around: evidence-based management. Nurs Adm Q. 2006;30(3):243–51.

Fraser I. Organizational research with impact: working backwards. Worldviews Evidence-Based Nurs. 2004;1:S52–9.

Roshanghalb A, Lettieri E, Aloini D, Cannavacciuolo L, Gitto S, Visintin F. What evidence on evidence-based management in healthcare? Manag Decis. 2018;56(10):2069–84.

Jaana M, Vartak S, Ward MM. Evidence-based health care management: what is the research evidence available for health care managers? Eval Health Prof. 2014;37(3):314–34.

Tate K, Hewko S, McLane P, Baxter P, Perry K, Armijo-Olivo S, et al. Learning to lead: a review and synthesis of literature examining health care managers’ use of knowledge. J Health Serv Res Policy. 2019;24(1):57–70.

Geerts JM, Goodall AH, Agius S. Evidence-based leadership development for physicians: a systematic literature review. Soc Sci Med. 2020;246:112709.

Barends E, Rousseau DM, Briner RB. Evidence-based management: The basic principles. Amsterdam; 2014. https://research.vu.nl/ws/portalfiles/portal/42141986/complete+dissertation.pdf#page=203 . Accessed 29 June 2024.

Stern C, Lizarondo L, Carrier J, Godfrey C, Rieger K, Salmond S, et al. Methodological guidance for the conduct of mixed methods systematic reviews. JBI Evid Synthesis. 2020;18(10):2108–18. https://doi.org/10.11124/JBISRIR-D-19-00169 .

The Lancet. 2020: unleashing the full potential of nursing. Lancet (London, England). 2019. p. 1879.

Välimäki MA, Lantta T, Hipp K, Varpula J, Liu G, Tang Y, et al. Measured and perceived impacts of evidence-based leadership in nursing: a mixed-methods systematic review protocol. BMJ Open. 2021;11(10):e055356. https://doi.org/10.1136/bmjopen-2021-055356 .

The Joanna Briggs Institute. Joanna Briggs Institute reviewers’ manual: 2014 edition. Joanna Briggs Inst. 2014; 88–91.

Pearson A, White H, Bath-Hextall F, Salmond S, Apostolo J, Kirkpatrick P. A mixed-methods approach to systematic reviews. JBI Evid Implement. 2015;13(3):121–31.

Johnson RB, Onwuegbuzie AJ. Mixed methods research: a research paradigm whose time has come. Educational Researcher. 2004;33(7):14–26.

Hong QN, Pluye P, Bujold M, Wassef M. Convergent and sequential synthesis designs: implications for conducting and reporting systematic reviews of qualitative and quantitative evidence. Syst Reviews. 2017;6(1):61. https://doi.org/10.1186/s13643-017-0454-2 .

Ramis MA, Chang A, Conway A, Lim D, Munday J, Nissen L. Theory-based strategies for teaching evidence-based practice to undergraduate health students: a systematic review. BMC Med Educ. 2019;19(1):1–13.

Sackett DL, Rosenberg WM, Gray JM, Haynes RB, Richardson WS. Evidence based medicine: what it is and what it isn’t. BMJ. 1996;312(7023):71–2.

Goodman JS, Gary MS, Wood RE. Bibliographic search training for evidence-based management education: a review of relevant literatures. Acad Manage Learn Educ. 2014;13(3):322–53.

Aromataris E, Munn Z. Chapter 3: Systematic reviews of effectiveness. JBI Manual for Evidence Synthesis. 2020; https://synthesismanual.jbi.global .

Munn Z, Barker TH, Moola S, Tufanaru C, Stern C, McArthur A, et al. Methodological quality of case series studies: an introduction to the JBI critical appraisal tool. JBI Evid Synth. 2020;18(10):2127–33.

Hong Q, Pluye P, Fàbregues S, Bartlett G, Boardman F, Cargo M, et al. Mixed methods Appraisal Tool (MMAT) Version 2018: user guide. Montreal: McGill University; 2018.

McKenna J, Jeske D. Ethical leadership and decision authority effects on nurses’ engagement, exhaustion, and turnover intention. J Adv Nurs. 2021;77(1):198–206.

Maxwell M, Hibberd C, Aitchison P, Calveley E, Pratt R, Dougall N, et al. The TIDieR (template for intervention description and replication) checklist. The patient Centred Assessment Method for improving nurse-led biopsychosocial assessment of patients with long-term conditions: a feasibility RCT. NIHR Journals Library; 2018.

Braun V, Clarke V. Using thematic analysis in psychology. Qualitative Res Psychol. 2006;3(2):77–101.

Pollock A, Campbell P, Struthers C, Synnot A, Nunn J, Hill S, et al. Stakeholder involvement in systematic reviews: a scoping review. Syst Reviews. 2018;7:1–26.

Braye S, Preston-Shoot M. Emerging from out of the shadows? Service user and carer involvement in systematic reviews. Evid Policy. 2005;1(2):173–93.

Page MJ, McKenzie JE, Bossuyt PM, Boutron I, Hoffmann TC, Mulrow CD, et al. The PRISMA 2020 statement: an updated guideline for reporting systematic reviews. Syst Reviews. 2021;10(1):1–11.

Porta M. Pilot investigation, study. A dictionary of epidemiology. Oxford University Press Oxford; 2014. p. 215.

Kreis J, Puhan MA, Schünemann HJ, Dickersin K. Consumer involvement in systematic reviews of comparative effectiveness research. Health Expect. 2013;16(4):323–37.

Joseph ML, Nelson-Brantley HV, Caramanica L, Lyman B, Frank B, Hand MW, et al. Building the science to guide nursing administration and leadership decision making. JONA: J Nurs Adm. 2022;52(1):19–26.

Gifford W, Davies BL, Graham ID, Tourangeau A, Woodend AK, Lefebre N. Developing Leadership Capacity for Guideline Use: a pilot cluster Randomized Control Trial: Leadership Pilot Study. Worldviews Evidence-Based Nurs. 2013;10(1):51–65. https://doi.org/10.1111/j.1741-6787.2012.00254.x .

Hsieh HY, Henker R, Ren D, Chien WY, Chang JP, Chen L, et al. Improving effectiveness and satisfaction of an electronic charting system in Taiwan. Clin Nurse Specialist. 2016;30(6):E1–6. https://doi.org/10.1097/NUR.0000000000000250 .

McAllen E, Stephens K, Swanson-Biearman B, Kerr K, Whiteman K. Moving Shift Report to the Bedside: an evidence-based Quality Improvement Project. OJIN: Online J Issues Nurs. 2018;23(2). https://doi.org/10.3912/OJIN.Vol23No02PPT22 .

Thomas M, Autencio K, Cesario K. Positive outcomes of an evidence-based pressure injury prevention program. J Wound Ostomy Cont Nurs. 2020;47:S24.

Cullen L, Titler MG. Promoting evidence-based practice: an internship for Staff nurses. Worldviews Evidence-Based Nurs. 2004;1(4):215–23. https://doi.org/10.1111/j.1524-475X.2004.04027.x .

DuBose BM, Mayo AM. Resistance to change: a concept analysis. Nursing forum. Wiley Online Library; 2020. pp. 631–6.

Lalleman PCB, Smid GAC, Lagerwey MD, Shortridge-Baggett LM, Schuurmans MJ. Curbing the urge to care: a bourdieusian analysis of the effect of the caring disposition on nurse middle managers’ clinical leadership in patient safety practices. Int J Nurs Stud. 2016;63:179–88.

Martin E, Warshawsky N. Guiding principles for creating value and meaning for the next generation of nurse leaders. JONA: J Nurs Adm. 2017;47(9):418–20.

Griffiths P, Recio-Saucedo A, Dall’Ora C, Briggs J, Maruotti A, Meredith P, et al. The association between nurse staffing and omissions in nursing care: a systematic review. J Adv Nurs. 2018;74(7):1474–87. https://doi.org/10.1111/jan.13564 .

Lúanaigh PÓ, Hughes F. The nurse executive role in quality and high performing health services. J Nurs Adm Manag. 2016;24(1):132–6.

de Kok E, Weggelaar-Jansen AM, Schoonhoven L, Lalleman P. A scoping review of rebel nurse leadership: descriptions, competences and stimulating/hindering factors. J Clin Nurs. 2021;30(17–18):2563–83.

Warshawsky NE. Building nurse manager well-being by reducing healthcare system demands. JONA: J Nurs Adm. 2022;52(4):189–91.

Paez A. Gray literature: an important resource in systematic reviews. J Evidence-Based Med. 2017;10(3):233–40.

McAuley L, Tugwell P, Moher D. Does the inclusion of grey literature influence estimates of intervention effectiveness reported in meta-analyses? Lancet. 2000;356(9237):1228–31.

Sarah S. Introduction to mixed methods systematic reviews. https://jbi-global-wiki.refined.site/space/MANUAL/4689215/8.1+Introduction+to+mixed+methods+systematic+reviews . Accessed 29 June 2024.

Whittemore R, Knafl K. The integrative review: updated methodology. J Adv Nurs. 2005;52(5):546–53.

Acknowledgements

We want to thank the funding bodies, the Finnish National Agency of Education, Asia Programme, the Department of Nursing Science at the University of Turku, and Xiangya School of Nursing at the Central South University. We also would like to thank the nurses and nurse leaders for their valuable opinions on the topic.

The work was supported by the Finnish National Agency of Education, Asia Programme (grant number 26/270/2020) and the University of Turku (internal fund 26003424). The funders had no role in the study design and will not have any role during its execution, analysis, interpretation of the data, decision to publish, or preparation of the manuscript.

Author information

Authors and affiliations.

Department of Nursing Science, University of Turku, Turku, FI-20014, Finland

Maritta Välimäki, Tella Lantta, Kirsi Hipp & Jaakko Varpula

School of Public Health, University of Helsinki, Helsinki, FI-00014, Finland

Maritta Välimäki

Xiangya School of Nursing, Central South University, Changsha, 410013, China

Shuang Hu, Jiarui Chen, Yao Tang, Wenjun Chen & Xianhong Li

School of Health and Social Services, Häme University of Applied Sciences, Hämeenlinna, Finland

Hunan Cancer Hospital, Changsha, 410008, China

Gaoming Liu

Contributions

Study design: MV, XL. Literature search and study selection: MV, KH, TL, WC, XL. Quality assessment: YT, SH, XL. Data extraction: JC, MV, JV, WC, YT, SH, GL. Analysis and interpretation: MV, SH. Manuscript writing: MV. Critical revisions for important intellectual content: MV, XL. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Xianhong Li .

Ethics declarations

Ethics approval and consent to participate.

No ethical approval was required for this study.

Consent for publication

Not applicable.

Competing interests

The authors declare no competing interests.

Differences between the original protocol

We modified the criteria for the included studies: we included published conference abstracts and proceedings, which form a relatively broad part of the scientific knowledge base. We originally planned to conduct a survey with open-ended questions followed by a face-to-face meeting to discuss the preliminary results of the review. However, to avoid placing an extra burden on nurses during COVID-19, we decided to limit the validation process to the online discussion only.

Additional information

Publisher’s note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary Materials 1–10

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ . The Creative Commons Public Domain Dedication waiver ( http://creativecommons.org/publicdomain/zero/1.0/ ) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

Reprints and permissions

About this article

Cite this article.

Välimäki, M., Hu, S., Lantta, T. et al. The impact of evidence-based nursing leadership in healthcare settings: a mixed methods systematic review. BMC Nurs 23 , 452 (2024). https://doi.org/10.1186/s12912-024-02096-4

Received : 28 April 2023

Accepted : 13 June 2024

Published : 03 July 2024

DOI : https://doi.org/10.1186/s12912-024-02096-4


  • Evidence-based leadership
  • Health services administration
  • Organizational development
  • Quality in healthcare



  • Open access
  • Published: 05 July 2024

Integrating virtual patients into undergraduate health professions curricula: a framework synthesis of stakeholders’ opinions based on a systematic literature review

  • Joanna Fąferek 1 ,
  • Pierre-Louis Cariou 2 ,
  • Inga Hege 3 ,
  • Anja Mayer 4 ,
  • Luc Morin 2 ,
  • Daloha Rodriguez-Molina 5 ,
  • Bernardo Sousa-Pinto 6 &
  • Andrzej A. Kononowicz 7  

BMC Medical Education, volume 24, Article number: 727 (2024)

Virtual patients (VPs) are widely used in health professions education. When they are well integrated into curricula, they are considered to be more effective than loosely coupled add-ons. However, it is unclear what constitutes their successful integration. The aim of this study was to identify and synthesise the themes found in the literature that stakeholders perceive as important for successful implementation of VPs in curricula.

We searched five databases from 2000 to September 25, 2023. We included qualitative, quantitative, mixed-methods and descriptive case studies that defined, identified, explored, or evaluated a set of factors that, in the perception of students, teachers, course directors and researchers, were crucial for VP implementation. We excluded effectiveness studies that did not consider implementation characteristics, and studies that focused on VP design factors. We included English-language full-text reports and excluded conference abstracts, short opinion papers and editorials. Synthesis of results was performed using the framework synthesis method with Kern’s six-step model as the initial framework. We appraised the quality of the studies using the QuADS tool.

Our search yielded a total of 4808 items, from which 21 studies met the inclusion criteria. We identified 14 themes that formed an integration framework. The themes were: goal in the curriculum; phase of the curriculum in which to implement VPs; effective use of resources; VP alignment with curricular learning objectives; prioritisation of use; relation to other learning modalities; learning activities around VPs; time allocation; group setting; presence mode; VP orientation for students and faculty; technical infrastructure; quality assurance, maintenance, and sustainability; and assessment of VP learning outcomes and learning analytics. We investigated the occurrence of themes across studies to demonstrate the relevance of the framework. The quality of the studies did not influence the coverage of the themes.

Conclusions

The resulting framework can be used to structure plans and discussions around implementation of VPs in curricula. It has already been used to organise the curriculum implementation guidelines of a European project. We expect it will direct further research to deepen our knowledge on individual integration themes.

Introduction

Virtual patients (VPs) are defined as interactive computer simulations of real-life clinical scenarios for the purpose of health professions training, education, or assessment [ 1 ]. Several systematic reviews have demonstrated that learning using VPs is associated with educational gains when compared to no intervention and is non-inferior to traditional, non-computer-aided, educational methods [ 2 , 3 , 4 ]. This conclusion holds true across several health professions, including medicine [ 3 , 5 ], nursing [ 6 ] and pharmacy [ 7 ]. The strength of VPs in health professions education lies in fostering clinical reasoning [ 4 , 6 , 8 ] and related communication skills [ 5 , 7 , 9 ]. At the same time, the research syntheses report high heterogeneity of obtained results [ 2 , 4 ]. Despite suggestions in the literature that VPs that are well integrated into curricula are more effective than loosely coupled add-ons [ 5 , 10 , 11 ], there is no clarity on what constitutes successful integration. Consequently, the next important step in the research agenda around VPs is to investigate strategies for effectively implementing VPs into curricula [ 9 , 12 , 13 ].

In the context of healthcare innovation, implementation is the process of taking up a new finding, policy, or technology into the routine practice of health services [14, 15, 16]. In many organisations, innovations are rolled out intuitively, which at times ends in failure even though the new tool has previously shown good results in laboratory settings [17]. A large review of over 500 implementation studies showed that better-implemented health promotion programs yield 2–3 times larger mean effect sizes than poorly implemented ones [18]. Underestimation of the importance and difficulty of implementation processes is costly and may lead to unjustified attribution of failure to the new product, while the actual problem is inadequate methods for integrating the innovation into practice [15].

The need for research into different ways of integrating computer technology into medical schools was recognised by Friedman as early as 1994 [ 19 ]. However, studies of the factors and processes of technology implementation in medical curricula have long been scarce [ 12 ]. While the terminology varies across studies, we will use the terms introduction, integration, incorporation , and implementation of VPs into curricula interchangeably. Technology adoption is the decision to use a new technology in a curriculum, and we view it as the first phase of implementation. In an early guide to the integration of VPs into curricula, Huwendiek et al. recommended, based on their experience, the consideration of four aspects relevant to successful implementation: blending face-to-face learning with on-line VP sessions; designing collaborative learning around VPs; allowing students flexibility in deciding when/where/how to learn with VPs; and constructively aligning learning objectives with suitable VPs and matched assessment [ 20 ]. In a narrative review of VPs in medical curricula, Cendan and Lok identified a few practices which are recommended for the use of VPs in curricula: filling gaps in clinical experience with standardised and safe practice, replacing paper cases with interactive models showing variations in clinical presentations, and providing individualised feedback based on objective observation of student activities. These authors also highlighted cost as a significant barrier to the implementation process [ 21 ]. Ellaway and Davies proposed a theoretical construct based on Activity Theory to relate VPs to their use and to link to other educational interventions in curricula [ 22 ]. However, a systematic synthesis of the literature on the identified integration factors and steps relevant to VP implementation is lacking.

The context of this study was a European project called iCoViP (International Collection of Virtual Patients; https://icovip.eu ) , which involved project partners from France, Germany, Poland, Portugal, and Spain and succeeded in creating a collection of 200 open-access VPs available in 6 languages to support clinical reasoning education [ 23 ]. Such a collection would benefit from being accompanied by integration guidelines to inform potential users on how to implement the collection into their curricula. However, guidelines require frameworks to structure the recommendations. Existing integration frameworks are limited in scope for a specific group of health professions, were created mostly for evaluation rather than guidance, or are theoretical or opinion-based, without an empirical foundation [ 24 , 25 , 26 ].

Inspired by the methodological development of qualitative literature synthesis [ 27 ], we decided to build a mosaic of the available studies in order to identify and describe what stakeholders believe is important when planning the integration of VPs into health professions curricula. The curriculum stakeholders in our review included students, teachers, curriculum planners, and researchers in health professions education. We aimed to develop a framework that would configure existing research on curriculum implementations, structure future practice guidelines, and inform research agendas in order to strengthen the evidence behind the recommendations.

Therefore, the research aim of this study was to identify and synthesise themes across the literature that, in stakeholders’ opinions, are important for the successful implementation of VPs in health professions curricula.

This systematic review is reported in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) framework [ 28 ].

Eligibility criteria

We selected studies whose main objective was to define, identify, explore, or evaluate a set of factors that, in the view of the authors or study participants, contribute to the successful implementation of VPs in curricula. Table  1 summarises the inclusion and exclusion criteria.

The curricula in which VPs were included targeted undergraduate health professions students, such as human medicine, dentistry, nursing, or pharmacy programs. We were interested in the perspectives of all possible stakeholders engaged in planning or directly affected by undergraduate health professions curricula, such as students, teachers, curriculum planners, course directors, and health professions education researchers. We excluded postgraduate and continuing medical education curricula, faculty development courses not specifically designed to prepare a faculty to teach an undergraduate curriculum with VPs, courses for patients, as well as education at secondary school level and below. Also excluded were alternative and complementary medicine programs and programs in which students do not interact with human patients, such as veterinary medicine.

Similar to the previous systematic review [ 4 ], we excluded from the review VP simulations that required non-standard computer equipment (like virtual reality headsets) and those in which the VP was merely a static case vignette without interaction or the VP was simulated by a human (e.g., a teacher answering emails from students as a virtual patient). We included studies in which VPs were presented in the context of health professions curricula; we excluded studies in which VPs were used as extracurricular activities (e.g., one-time learning opportunities, such as conference workshops) or merely as part of laboratory experimentation.

We included all studies that presented original research, and we excluded editorials and opinion papers. Systematic reviews were included in the first stage so we could manually search for references in order to detect relevant studies that had potentially been omitted. We included studies that aimed to comprehensively identify or evaluate external contextual factors relevant for the integration of VPs into curricula or that examined activities around VPs and the organisational, curricular and accreditation context (the constructed and framed layers of activities in Ellaway & Davies’ model [ 22 ]). As the goal was to investigate integration strategies, we excluded VP design studies that looked into techniques for authoring VPs or researched technical or pedagogical mechanisms encoded in VPs that could not be easily altered (i.e., encoded layer of VP activities [ 22 ]). As we looked into studies that comprehensively investigated a set of integration factors that are important in the implementation process, we excluded studies that focus on program effectiveness (i.e., whether or not a VP integration worked) but do not describe in detail how the VPs were integrated into curricula or investigate what integration factors contributed to the implementation process. We also excluded studies that focused on a single integration factor as our goal was to explore the broad perspective of stakeholders’ opinions on what factors matter in integration of VPs into curricula.

We only included studies published in English as we aimed to qualitatively analyse the stakeholders’ opinions in depth and did not want to rely on translations. We chose the year 2000 as the starting point for inclusion. We recognise that VPs were used before this date but also acknowledge the significant shift in infrastructure from offline technologies to the current web-based platforms, user-friendly graphical web browsers, and broadband internet, all of which appeared around the turn of the millennium. Additionally, VP literature before 2000 was mainly focused on demonstrating technology rather than integrating these tools into curricula [ 12 , 19 ].

Information sources and search

We systematically searched the following five bibliographic databases: MEDLINE (via PubMed), EMBASE (via Elsevier), Educational Resource Information Center (ERIC) (via EBSCO), CINAHL Complete (via EBSCO), Web of Science (via Clarivate). The search strategies are presented in Supplementary Material S1 . We launched the first query on March 8, 2022, and the last update was carried out on September 25, 2023. The search results were imported into the Rayyan on-line software [ 29 ]. Duplicate items were removed. Each abstract was screened by at least two reviewers working independently. In the case of disagreement between reviewers, we included the abstract for full text analysis. Next, we downloaded the full text of the included abstracts, and pairs of reviewers analysed the content in order to determine whether they met the inclusion criteria. In the case of disagreement, a third reviewer was consulted to arbitrate the decision.
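As a minimal sketch of the dual-review decision rules described above (the function names and structure are our own illustration, not part of Rayyan or the authors' workflow), the screening logic can be expressed as follows: any disagreement at the abstract stage is resolved inclusively, while full-text disagreements go to a third reviewer.

```python
# Illustrative sketch of the screening rules described above; the helper names
# are hypothetical and not taken from any review software.

def abstract_stage(include_a: bool, include_b: bool) -> bool:
    """Two reviewers screen independently; any vote to include (i.e., any
    disagreement) advances the record to full-text analysis."""
    return include_a or include_b

def fulltext_stage(include_a: bool, include_b: bool,
                   arbiter: bool | None = None) -> bool | str:
    """Two reviewers assess the full text; disagreements are arbitrated
    by a third reviewer."""
    if include_a == include_b:
        return include_a
    if arbiter is None:
        return "refer to third reviewer"
    return arbiter

# Example: reviewers disagree on an abstract -> the record goes to full text.
assert abstract_stage(True, False) is True
```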

Data extraction and analysis

Reviewers working independently extracted relevant characteristics of the included studies to an online spreadsheet. We extracted such features as the country in which the study was conducted, the study approach, the data collection method, the year of implementation in the curriculum, the medical topic of the VPs, the type and number of participants, the number of included VPs, the type of VP software, and the provenance of the cases (e.g., self-developed, part of a commercial database or open access repository).

The qualitative synthesis followed the five steps of the framework synthesis method [ 27 , pp. 188–190]. In the familiarisation phase (step 1), the authors who were involved previously in the screening and data extraction process read the full text versions of the included studies to identify text segments containing opinions on how VPs should be implemented into curricula.

Next, after a working group discussion, we selected David Kern’s six-step model of curriculum development [30] as the pragmatic initial frame (step 2). Even though it is not a VP integration framework in itself, we regarded it as a “best fit” to configure a broad range of integration factors spanning the whole process of curriculum development. Kern’s model is often used for curriculum design and reform and has also been applied in the design of e-learning curricula [31]. Through a series of asynchronous rounds of comments, online meetings, and one face-to-face workshop involving a group of stakeholders from the iCoViP project, we iteratively clustered the recommendations into emerging themes. Each theme was subsumed under one of Kern’s six steps in the initial framework. Next, we formulated definitions of the themes.

In the indexing phase (step 3), two authors (JF and AK) systematically coded the results and discussion sections of all the included empirical studies, line-by-line, using the developed themes as a coding frame. Text segments grouped into individual themes were comparatively analysed for consistency and to identify individual topics within themes. Coding was performed using MaxQDA software for qualitative analysis (MaxQDA, version 22.5 [ 32 ]). Disagreements were discussed and resolved by consensus, leading to iterative refinements of the coding frame, clarifications of definitions, and re-coding until a final framework was established.

Subsequently, the studies were charted (step 4) into tables in order to compare their characteristics. Similar papers were clustered based on study design to facilitate closer comparisons. A quality appraisal of the included studies was then performed using a standardised tool. Finally, a visual representation of the framework was designed and discussed among the research team, allowing for critical reflection on the consistency of the themes.

In the concluding step (step 5), in order to ensure the completeness and representativeness of the framework for the analysed body of literature, we mapped the themes from the developed framework to the studies in which they were found, and we analysed how individual themes corresponded to the conceptual and implementation evaluation models identified during the review. We looked for patterns and attempted to interpret them. We also looked for inconsistencies and tensions in the studies to identify potential areas for future research.
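A short sketch (with entirely hypothetical study IDs and using theme labels from the framework) illustrates the mapping in step 5: coded segments exported from qualitative-analysis software are tallied into a theme-by-study occurrence overview.

```python
# Illustrative sketch with made-up data: tallying which studies mention which
# framework themes, as in the mapping step described above.

from collections import defaultdict

coded_segments = [  # (study, theme) pairs as exported from the coding software
    ("study_01", "Goal in the curriculum"),
    ("study_01", "Time allocation"),
    ("study_02", "Goal in the curriculum"),
    ("study_03", "Technical infrastructure"),
]

theme_to_studies: dict[str, set[str]] = defaultdict(set)
for study, theme in coded_segments:
    theme_to_studies[theme].add(study)

for theme, studies in sorted(theme_to_studies.items()):
    print(f"{theme}: found in {len(studies)} studies ({', '.join(sorted(studies))})")
```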

Quality appraisal of the included studies

To appraise the quality of the included studies, we selected the QuADS (Quality Assessment with Diverse Studies) tool [ 33 ], which is suitable for assessing the quality of studies with diverse designs, including mixed- or multi-method studies. This tool consists of 13 items on a four-point scale (0: not reported; 1: reported but inadequate; 2: reported and partially adequate; 3: sufficiently reported). QuADS has previously been successfully used in synthesis of studies in the field of health professions education [ 34 ] and technology-enhanced learning environments [ 35 ]. The included qualitative studies, quantitative surveys, and mixed-methods interview studies were independently assessed by two reviewers (JF, AK). The results were then compared; if differences arose, the justifications were discussed and a final judgement was reached by consensus. Following the approach taken by Goagoses et al. [ 35 ], we divided the studies into three groups, depending on the summary quality score: weak (≤ 49% of QuADS points); medium (50–69%) and high (≥ 70%) study quality.
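As a concrete illustration of the banding rule above, the following sketch (a hypothetical helper of our own, not taken from the QuADS publication) converts a study's 13 item ratings into a percentage of the maximum score (13 × 3 = 39 points) and assigns the weak/medium/high label used in this review.

```python
# Illustrative sketch: summarising one study's QuADS ratings (13 items, each
# scored 0-3) as a percentage and a quality band, per the thresholds above
# (<=49% weak, 50-69% medium, >=70% high).

def quads_band(item_scores: list[int]) -> tuple[float, str]:
    if len(item_scores) != 13 or any(s not in (0, 1, 2, 3) for s in item_scores):
        raise ValueError("QuADS expects 13 items scored 0-3")
    percent = 100 * sum(item_scores) / (13 * 3)
    if percent <= 49:
        band = "weak"
    elif percent < 70:
        band = "medium"
    else:
        band = "high"
    return round(percent, 1), band

# Example: a study rated 2 on every item scores 66.7% -> "medium" quality.
print(quads_band([2] * 13))  # (66.7, 'medium')
```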

Characteristics of the included studies

The selection process for the included studies is presented in Fig.  1 .

Figure 1. PRISMA flowchart of the study selection process

Our search returned a total of 4808 items. We excluded duplicate records (n = 2201), abstracts that did not meet the inclusion criteria (n = 2526), and, after full-text analysis, complete reports that did not meet the criteria (n = 59). In the end, 21 studies met our inclusion criteria.

Types of included studies

Of the 21 included studies, 18 were classified as empirical studies, while three presented theoretical or evaluation models.

The purpose of the 18 empirical studies was to survey or directly observe the reaction of stakeholders to curriculum integration strategies in order to identify or describe the relevant factors (Table  2 ). Study types included qualitative ( n  = 4) [ 11 , 36 , 37 , 38 ], mixed-methods ( n  = 4) [ 39 , 40 , 41 , 42 ], quantitative survey ( n  = 4) [ 10 , 43 , 44 , 45 ], and descriptive case studies ( n  = 6) [ 46 , 47 , 48 , 49 , 50 , 51 ]. Data collection methods included questionnaires ( n  = 9) [ 10 , 39 , 40 , 41 , 42 , 43 , 44 , 45 , 48 ], focus groups and small group interviews ( n  = 8) [ 11 , 36 , 37 , 38 , 39 , 41 , 42 , 48 ], system log analyses ( n  = 3) [ 44 , 47 , 48 ], direct observations ( n  = 1) [ 44 ], and narrative descriptions of experiences with integration ( n  = 5) [ 46 , 47 , 49 , 50 , 51 ]. The vast majority of studies reported experiences from the integration of VPs into medical curricula ( n  = 15). Two studies reported integration of VPs into nursing programs [ 40 , 51 ], one into a dentistry program [ 40 ] and one into a pharmacy program [ 41 ]. One study did not specify the health professions program [ 46 ].

The remaining three included studies took a more theoretical approach: one aimed to create a conceptual model [ 25 ], while the other two [ 24 , 26 ] presented evaluation models of the integration process (Table  3 ). We analysed them separately, given their different structures, and mapped the components of these models to our framework in the last stage of the framework synthesis.

Themes in the developed framework

The developed framework (Table  4 ), which we named the iCoViP Virtual Patient Curriculum Integration Framework (iCoViP Framework), contains 14 themes and 51 topic codes. The final version of the codebook used in the study can be found in Supplementary Material  S2 . Below, we describe the individual themes.

General needs assessment

In the Goal theme, we coded perceptions regarding appropriate general uses of VPs in curricula. This covers the competencies to be trained using VPs, but also unique strengths and limitations of VPs as a learning method that should influence decisions regarding their adoption in curricula.

A common opinion was that VPs should target clinical reasoning skills and subskills such as the acquisition and organisation of clinical information, the development of illness scripts (signs, symptoms, risk factors, knowledge of disease progression over time), and patient-centred care (including personal preferences and cultural competencies in patient interaction) [ 11 , 36 , 37 , 38 , 39 , 40 , 42 , 43 , 44 , 45 , 46 , 49 , 50 , 51 ]. According to these opinions, a strength of VPs is their potential for self-directed learning in an authentic, practice-relevant, safe environment that provides opportunities for reflection and “productive struggle” [ 37 , 39 , 49 ]. VPs also make it possible for students to practise decision-making in undifferentiated patient cases and to observe the development of disease longitudinally [ 45 ]. For instance, some students valued the potential of VPs as a tool that integrates basic knowledge with clinical application in a memorable experience:

We associate a disease more to a patient than to the textbook. If I saw the patient, saw the photo and questioned the patient in the program, I will remember more easily, I’ll have my flashback of that pathology more than if I only studied my class notes or a book. {Medical student, 4th year, Colombia} [ 36 ].

Another perceived function of VPs is to help fill gaps in curricula and clinical experiences [ 36 , 37 , 38 , 42 , 45 , 50 ]. This supporting factor for the implementation of VPs in curricula is particularly strong when combined with the need to meet regulatory requirements [ 42 ].

Varying opinions were expressed regarding the aim of VPs to represent rare diseases (or common conditions but with unusual symptoms) [ 43 , 48 ] versus common clinical pictures [ 37 , 40 ]. Another tension arose when considering whether VPs should be used to introduce new factual/conceptual knowledge versus serving as a knowledge application and revision tool:

The students, however, differed from leaders and teachers in assuming that VPS should offer a reasonable load of factual knowledge with each patient. More as a surprise came the participants’ preference for usual presentations of common diseases. [ 40 ].

Limitations of VPs were voiced when the educational goal involved physical contact and hands-on training because, in some aspects of communication skills, physical examination, or the use of medical equipment, VPs are clearly inferior to real patients, human actors, or physical mannequins [ 36 , 51 ].

Targeted needs assessment

The Phase theme described the point in the curriculum at which the introduction of VPs was regarded as appropriate. According to some opinions, VPs should be introduced early in curricula to compensate for otherwise limited exposure to real patients [ 39 , 43 ]:

Students of the pre-clinical years show a high preference in the adoption of VPs as learning activities. That could be explained from the lack of any clinical contact with real patients in their two first years of study and their willingness to have early, even virtual, clinical encounters. [ 43 ].

The tendency to introduce VPs early in curricula was confronted with the problem of students’ limited core knowledge as they were required to use VPs before they had learnt about the features of the medical conditions they were supposed to recognise [ 41 , 48 ]. At the other end of the time axis, we did not encounter opinions that specified when it would be too late to use VPs in curricula. Even final-year students stated that they preferred to augment their clinical experience with VPs [ 43 ].

In the Resources theme, we gathered opinions regarding the cost and assets required for the integration of VPs into curricula. Cost can be a barrier that, if not addressed properly, can slow down or even stop an implementation; it should therefore be considered early in the implementation process. This includes monetary funds [ 42 ] and the availability of adequately qualified personnel [ 38 ] and their time [ 47 ].

For instance, it was found that if a faculty member is primarily focused on clinical work, their commitment to introducing VP innovations will be limited, and they will tend to revert to previous practices unless additional resources are provided to support the change [ 38 ].

The Resources theme also included strategies to follow when only limited resources are available to implement VPs in a curriculum. Suggested solutions included sharing VPs with other institutions [ 50 ], exchanging know-how on the implementation of VPs with more experienced institutions and networks of excellence [ 38 , 42 ], and increasing faculty awareness of the benefits of using VPs, including the reduced workload after VPs have been introduced into curricula [ 38 ]. Finally, another aspect of this theme was the (lack of) awareness of the cost of implementing VPs in curricula across stakeholder groups [ 40 ].

Goals and objectives

The Alignment theme grouped utterances highlighting the importance of selecting the correct VP content for curricula and of matching VPs with several elements of curricula, such as learning objectives and the content delivered through other learning forms, as well as the need to adapt VPs to local circumstances. The selection criteria included discussion of the number of VPs [ 36 ], the fine-grained learning objectives that could be achieved using VPs [ 42 , 50 ], and the selection of an appropriate difficulty level, which preferably should gradually increase [ 11 , 49 ].

It was noticed that VPs can be used to systematically cover a topic. For example, they can align with implementation of clinical reasoning themes in curricula [ 38 ] or map a range of diseases that are characteristic of a particular region of interest, thereby filling gaps in important clinical exposure and realistically representing the patient population [ 36 ].

Several approaches to aligning VPs with curricula were mentioned, including the selection of learning methods adjusted to the type of learning objectives [ 45 ], the introduction of VPs in small portions in relevant places in curricula to avoid large-scale changes [ 38 ], the alignment of VP content with assessment [ 39 ], and making this alignment visible by explicitly presenting the specific learning objectives addressed by VPs [ 49 ]. It is crucial to retain the cohesion of educational content across a range of learning modalities:

I worked through a VP, and then I went to the oncology ward where I saw a patient with a similar disease. After that we discussed the disease. It was great that it was all so well coordinated and it added depth and some [sic!] much needed repetition to the case. {Medical student, 5th year, Germany} [ 11 ].

We also noted unresolved dilemmas, such as whether to present VPs in English as the modern lingua franca to support the internationalisation of studies, versus the need to adapt VPs to the local native language of learners in order to improve accessibility and perceived relevance [ 50 ].

Prioritisation

Several studies presented ideas for achieving higher Prioritisation of VPs in student agendas. A common but “heavy-handed” approach to increasing motivation was to make the completion of VPs a mandatory requirement for obtaining course credits [ 36 , 48 , 51 ]. However, this approach was often criticised for promoting superficial learning and failing to foster self-directed learning [ 47 ]. Motivation was reported to increase when content was exam-relevant [ 11 ].

Another reported strategy is to increase the engagement of teachers, who can intensively reference VPs in their classes, give meaningful feedback regarding their use [ 40 ], or construct group activities around them [ 46 ]. It was suggested that VPs ought to have dedicated time for their use, which should not compete with activities of obviously higher priority, such as meeting real patients [ 37 ].

Another idea for motivation was adjustment of VPs to local needs, language and culture. It was indicated that it would be helpful to promote VPs’ authenticity by stressing the similarity of presented scenarios to problems clinicians encounter in clinical practice (e.g., using teacher testimonials [ 48 ]). Some students saw VPs as being more relevant when they are comprehensively described in course guides and syllabi [ 39 ]. The opinions about VPs that circulate among more-experienced students are also important:

Definitely if the year above kind of approves of something you definitely think you need it. {Medical student, 3rd year, UK} [ 39 ].

Peer opinion was also important for teachers, who were reported to be more likely to adopt VPs in their teaching if they had heard positive opinions from colleagues using them, knew the authors of VP cases, or respected organisations that endorse the use of VP software [ 38 , 42 ]:

I was amazed because it was a project that seemed to have incredible scope, it was huge. I was impressed that there was the organization to really roll out and develop all these cases and have this national organization involved. {Clerkship director, USA} [ 42 ].

Educational strategies

The Relation theme contained opinions about the connections between VPs and other types of learning activities. This theme was divided into preferences regarding which types of activities should be replaced or extended by VPs, and the relative order in which they should appear in curricula. We noticed general warnings that VPs should not be added on top of existing activities as this is likely to cause work overload for students [ 10 , 45 ]. The related forms of education that came up in the discussions were expository methods like lectures and reading assignments (e.g., textbooks, websites), small group discussions in seminars (e.g., problem-based learning [PBL] sessions, follow-up seminars), alternative forms of simulations (e.g., simulated patients, human patient simulators), clinical teaching (i.e., meeting with real patients and bedside learning opportunities), and preparation for assessments.

Lectures were seen as a form of providing core knowledge that could later be applied in VPs:

Working through the VP before attending the lecture was not as useful to me as attending the lecture before doing the VP. I feel I was able to get more out of the VP when I first attended the lecture in which the substance and procedures were explained. {Medical student, 5th year, Germany} [ 11 ].

Textbooks were helpful as a source of reference knowledge while solving VPs, enabling students to reflect while applying this knowledge in a clinical context. Such a learning scenario was regarded as impossible in front of real patients:

But here it’s very positive right now when we really don’t know everything about rheumatic diseases, that we can sit with our books at the same time as we have a patient in front of us. {Medical student, 3rd year, Sweden} [ 37 ].

Seminars (small group discussions) were perceived as learning events that motivate students to work intensively with VPs and as an opportunity to ask questions about them [ 11 , 46 , 47 ], with the warning that teachers should not simply repeat the content of VPs as this would be boring [ 44 ]. The reported combination of VPs with simulated patients made it possible to increase the fidelity of the latter by means of realistic representation of clinical signs (e.g., cranial nerve palsies) [ 48 ]. It was noticed that VPs can connect different forms of simulation, “turn[ing] part-task training into whole-task training” [ 46 ], or allow more thorough and nuanced preparation for other forms of simulation (e.g., mannequin-based simulation) [ 46 ]. A common thread in the discussion was the relation between VPs and clinical teaching [ 10 , 11 , 37 , 39 , 45 , 46 ]. The opinions included warnings against spending too much time with VPs at the expense of bedside teaching [ 37 , 51 ]. The positive role of VPs was highlighted in preparing for clinical experience or as a follow-up to meeting real patients because working with VPs is not limited by time and is not influenced by emotions [ 37 ].

Huwendiek et al. [ 11 ] suggested a complete sequence of activities, which has been confirmed in other studies [ 48 ]: lectures, VPs, seminars and, finally, real patients. However, we also identified alternative solutions, such as VPs discussed between lectures as springboards to introduce new concepts [ 49 ]. In addition, some studies concluded that students should have the right to decide which form of learning they prefer in order to achieve their learning objectives [ 38 , 48 ], but this conflicts with limited resources, a problem the students seem not to consider when expressing their preferences.

In the Activities theme, we grouped statements about tasks constructed by teachers around VPs. This includes teachers asking questions to probe whether students have understood the content of VPs, and guiding students in their work with VPs [ 11 , 49 ]. Students were also expected to ask their teachers questions to clarify content [ 43 ]. Some educators felt that students trained using VPs ask too many questions instead of relying more on their clinical reasoning skills and asking fewer, but more pertinent questions [ 38 ].

Students were asked to compare two or more VPs with similar symptoms to recognise key diagnostic features [ 11 ] and to reflect on cases, discuss their decisions, and summarise VPs to their peers or document them in a standardised form [ 11 , 46 , 49 , 51 ]. Another type of activity was working with textbooks while solving VP cases [ 37 ] or following a standard/institutional checklist [ 51 ]. Finally, some students expected more activities around VPs and felt left alone to struggle with learning with VPs [ 37 ].

Implementation

Another theme grouped stakeholders’ opinions regarding Time. A prominent topic was the time required for VP activities. Some statements provided the exact amount of time allocated to VP activities (e.g., one hour a week [ 51 ]), sometimes suggesting that it should be increased. There were several comments from students complaining about insufficient time allocated for VP activities:

There was also SO much information last week and with studying for discretionary IRATs constantly, I felt that I barely had enough time to synthesize the information and felt burdened by having a deadline for using the simulation. {Medical student, 2nd year, USA} [ 48 ].

Interestingly, the perceived lack of time was sometimes interpreted by researchers as a matter of students not assigning high enough priority to VP tasks because they do not consider them relevant [ 39 ].

Some students expected their teachers to help them with time management. Mechanisms for this included explicitly allocated time slots for work with VPs, declaration of the required time spent on working with VPs, and setting deadlines for task completion:

Without a time limit we can say: I’ll check the cases later, and then nothing happens; but if there’s a time limit, well, this week I see cardiac failure patients etc. It’s more practical for us and also for the teachers, I think. {Medical student, 4th year, Colombia} [ 36 ].

This expectation conflicts with the views that students should learn to self-regulate their activities, that setting a minimum amount of time that students should spend working with VPs will discourage them from doing more, and that deadlines cause an acute burst of activity shortly before them, but no activity otherwise [ 47 , 48 ].

Finally, it was interesting to notice that some educators and students perceived VPs as a more time-efficient way of collecting clinical experience than meeting real patients [ 37 , 38 ].

The Group theme included preferences for working alone or in a group. The identified comments revealed tensions between the benefits of working in groups, such as gaining new perspectives, higher motivation thanks to teamwork, peer support:

You get so much more from the situation when you discuss things with someone else, than if you would be working alone. {Medical student, 3rd year, Sweden} [ 37 ].

and the flexibility of working alone [ 43 , 44 , 46 , 49 ]. Some studies reported on their authors’ experiences in selection of group size [ 11 , 48 ]. It was also noted that smaller groups motivated more intensive work [ 41 , 44 ].

In the Presence theme, we coded preferences regarding whether students should work on VPs in a computer lab, a shared space, seminar rooms, or at home. Some opinions valued flexibility in selecting the place of work (provided a good internet connection is available) [ 11 , 36 ]. Students reported working from home in order to prepare well for work in a clinical setting:

... if you can work through a VP at home, you can check your knowledge about a certain topic by working through the relevant VP to see how you would do in a more realistic situation. {Medical student, 5th year, Germany} [ 11 ].

Some elements of courses related to simulated patient encounters had to be done during obligatory face-to-face training in a simulation lab (e.g., physical examination) that accompanied work with VPs [ 51 ]. Finally, it was observed that VPs offer sufficient flexibility to support different forms of blended learning scenarios [ 46 ]. Synchronous collaborative learning can be combined with asynchronous individual learning, which is particularly effective when there is a need for collaboration between geographically dispersed groups [ 46 ], for instance if a school has more than one campus.

Orientation

In the Orientation theme, we included all comments that relate to the need for teacher training, the content of teacher training courses, and the form of preparation of faculty members and students for using VPs. Knowledge and skills mentioned as useful for faculty were awareness of how VPs fit into curricula [ 42 ], small-group facilitation skills, clinical experience [ 11 ], and experience with online learning [ 38 ]. Teachers expected to be informed about the advantages/disadvantages and evidence of effectiveness of VPs [ 38 ]. For students, the following prerequisites were identified: the ability to operate VP tools and experience with online learning in general, high proficiency in the language in which the VPs are presented and, for some scenarios (e.g., learning by design), also familiarity with VP methodology [ 38 , 47 , 48 , 50 , 51 ]. It was observed that the introduction of VPs is more successful when both teachers and students are familiar with the basics of clinical reasoning theory and explicit teaching methods [ 38 ].

Identified forms of student orientation for the use of VPs included demonstrations and introductions at the start of learning units [ 42 ], handouts and email reminders, and the publication of online schedules listing the assigned VPs and the expected time to complete them [ 11 , 48 ].

Infrastructure

The Infrastructure theme grouped stakeholders’ requirements regarding the technical environment in which VPs work. This included the following aspects: stable internet connection, secure login, usability of the user interface, robust software (well tested for errors and able to handle many simultaneous users), interoperability (e.g., support for the standardised exchange of VPs between universities) and access to an IT helpdesk [ 11 , 40 , 42 , 47 , 50 ]. It was noticed that technical glitches can have a profound influence on the perceived success of VP integration:

Our entire team had some technical difficulties, whether during the log-in process or during the patient interviews themselves and felt that our learning was somewhat compromised by this. {Medical student, 2nd year, USA} [ 48 ].

Evaluating the effectiveness

Sustainability & quality.

In the Sustainability & Quality theme, we indexed statements regarding the need to validate and update VP content and its alignment with curricular goals and current assessment, in response to changes in local conditions and regulatory requirements [ 45 ].

The need to add new cases to VP collections that are currently in use was mentioned [ 40 ]. This theme also included the requirement to evaluate students’ opinions on VPs using questionnaires, feedback sessions and observations [ 47 , 48 , 49 ]. Some of the stakeholders required evidence regarding the quality of VPs before they decided to adopt them [ 38 , 42 , 50 ]. Interestingly, it was suggested that awareness of the need for quality control of VPs varied between stakeholder groups, with low estimation of the importance of this factor among educational leaders:

Leaders also gave very low scores to both case validation and case exchange with other higher education institutions (the latter finding puts into perspective the current development of VPS interoperability standards). The leaders’ lack of interest in case validation may reflect a de facto conviction, that it is the ‘shell’ that validates the content. [ 40 ].

The Assessment theme encompasses a broad selection of topics related to the various ways in which VPs are used in the assessment of educational outcomes. This includes general comments on VPs as an assessment form, the use of VPs in formative and summative assessment, and the use of learning analytics methods around VPs.

General topics identified in this theme included which learning objectives should be assessed with VPs, such as the ability to conduct medical diagnostic processes effectively [ 36 ], the authenticity of VPs as a form of examination [ 36 ], the use of VPs for self-directed assessment [ 11 , 39 , 43 , 46 ], and the emotions associated with assessment using VPs, e.g., reduced stress and a feeling of competitiveness [ 36 , 48 ].

Other topics discussed in the context of assessment included the pedagogical value of using VPs for assessments [ 36 ], such as the improved retention of information through reflection on diagnostic errors made with VPs [ 48 ], and VPs’ ability to illustrate the consequences of students’ errors [ 46 ]. Methods of providing feedback during learning with VPs were also described [ 11 ]. It was highlighted that data from assessments using VPs can aid teachers in planning future training [ 49 , 51 ]. Furthermore, it was observed that feedback from formative assessments with VPs motivates students to engage more deeply in their future learning [ 10 , 41 , 47 ]:

It definitely helped what we did wrong and what we should have caught, because there was a lot that I missed and I didn’t realize it until I got the feedback and in the feedback it also said where you would find it most of the time and why you would have looked there in the first place. {Pharmacy student, 4th year, Canada} [ 41 ].

In several papers [ 42 , 47 , 48 , 51 ], suggestions were made regarding the types of metrics that can be used to gauge students’ performance (e.g., time to complete tasks related to VPs, the accuracy of answers given in the context of VPs, recall and precision in selecting key features in the diagnostic process, the order of selecting diagnostic methods, and the quality of medical documentation prepared by students from VPs). The use of specific metrics and the risks associated with them were discussed. For instance, time spent on a task was sometimes seen as a metric of decision efficiency (a speed-based decision score) that should be minimised [ 48 ], or as an indicator of diligence in VP analysis that should be maximised [ 47 ]. Time measurements in on-line environments can be influenced by external factors like parallel learning using different methods (e.g. consulting a textbook) or interruptions unrelated to learning [ 47 ].

Finally, the analysed studies discussed summative aspects of assessment, including arguments regarding the validity of using VPs in assessments [ 51 ], the need to ensure alignment between VPs and examination content [ 49 ], and the importance of VP assessment in relation to other forms of assessment (e.g., whether it should be part of high-stakes examinations) [ 40 , 51 ]. The studies also explored forms of assessment that should be used to test students’ assimilation of content delivered through VPs [ 47 ], the challenges related to assessing clinical reasoning [ 38 ], and the risk of academic dishonesty in grading based on VP performance [ 48 ].

Mapping of the literature using the developed framework

We mapped the occurrence of the iCoViP Framework themes across the included empirical studies, as presented in Fig.  2 .

figure 2

Code matrix of the occurrence of themes in the included empirical studies

Table  5 displays the pooled number of studies in which each theme occurred. The three most frequently covered themes were Prioritisation , Goal , and Alignment ; these were present in approximately 90% of the analysed papers. Each theme from the framework appeared in at least four studies. The least common themes, present in fewer than one-third of the studies, were Phase , Presence , and Resources .
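
As a minimal illustration of how such pooled counts can be derived from a code matrix, the sketch below (Python; the study identifiers and their coded themes are invented placeholders, whereas the real figures in Table 5 come from the qualitative coding of the 18 empirical studies) counts, for each theme, the number and percentage of studies in which it occurs.

```python
# Illustrative sketch only: a toy study-by-theme occurrence matrix.
# The studies and their coded themes are invented; theme names are
# taken from the iCoViP Framework.

from collections import Counter

coded_themes_per_study = {
    "study_A": {"Goal", "Alignment", "Prioritisation", "Time"},
    "study_B": {"Goal", "Prioritisation", "Assessment"},
    "study_C": {"Alignment", "Resources", "Prioritisation"},
}

theme_counts = Counter(
    theme for themes in coded_themes_per_study.values() for theme in themes
)

n_studies = len(coded_themes_per_study)
for theme, count in theme_counts.most_common():
    print(f"{theme}: {count}/{n_studies} studies ({100 * count / n_studies:.0f}%)")
```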

We mapped the iCoViP Framework to the three identified existing theoretical and evaluation models (Fig.  3 ).

figure 3

Mapping of the existing integration models to the iCoViP Framework

None of the compared models contained a category that could not be mapped to the themes from the iCoViP Framework. The model by Georg & Zary [ 25 ] covered the fewest themes from our framework, including only the common categories of Goal, Alignment, Activities and Assessment . The remaining two models by Huwendiek et al. [ 24 ] and Kleinheksel & Ritzhaupt [ 26 ] underpinned integration quality evaluation tools and covered the majority of themes (9 out of 14 each). There were three themes not covered by any of the models: Phase, Resources, and Presence .
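
The coverage comparison above can be thought of as a simple set-coverage check. The sketch below (Python) illustrates the idea; the theme subsets assigned to the two evaluation models are invented placeholders chosen only so that the totals match those reported here, whereas the authoritative mapping is shown in Fig. 3.

```python
# Illustrative sketch: checking which framework themes are covered by the
# compared models using set operations. The per-model subsets for the two
# evaluation models are placeholders; the real mapping is given in Fig. 3.

framework_themes = {
    "Goal", "Phase", "Resources", "Alignment", "Prioritisation", "Relation",
    "Activities", "Time", "Group", "Presence", "Orientation", "Infrastructure",
    "Sustainability & Quality", "Assessment",
}

model_coverage = {
    "Georg & Zary": {"Goal", "Alignment", "Activities", "Assessment"},
    "Huwendiek et al.": framework_themes
    - {"Phase", "Resources", "Presence", "Group", "Time"},
    "Kleinheksel & Ritzhaupt": framework_themes
    - {"Phase", "Resources", "Presence", "Relation", "Orientation"},
}

for model, themes in model_coverage.items():
    print(f"{model}: covers {len(themes)}/{len(framework_themes)} themes")

covered_by_any = set().union(*model_coverage.values())
print("Covered by no model:", sorted(framework_themes - covered_by_any))
```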

Quality assessment of studies

The details of the quality appraisal of the empirical studies using the QuADS tool are presented in Supplementary Material S3 . The rated papers had medium (50–69%; [ 39 , 40 , 43 ]) to high quality (≥ 70%; [ 10 , 11 , 36 , 37 , 38 , 41 , 42 , 44 , 45 ]). Owing to the difficulty in identifying the study design elements in the included descriptive case studies [ 46 , 47 , 48 , 49 , 50 , 51 ], we decided against assessing their methodological quality with the QuADS tool. This difficulty can also be interpreted as indicative of the low quality of the studies in this group.

The QuADS quality criterion that was most problematic in the reported studies was the inadequate involvement of stakeholders in study design. Most studies reported the involvement of students or teachers only in questionnaire pilots, but not in the conceptualisation of the research. Another issue was the lack of explicit reference to the theoretical frameworks upon which the studies were based. Finally, in many of the studies, participants were selected using convenience sampling, or the authors did not report purposeful selection of the study group.

We found high-quality studies in qualitative, quantitative, and mixed-methods research. There was no statistical correlation between study quality and the number of topics covered. For sensitivity analysis, we excluded all medium-quality and descriptive studies from the analysis; this did not reduce the number of iCoViP Framework topics covered by the remaining high-quality studies.

In our study, we synthesised the literature that describes stakeholders’ perceptions of the implementation of VPs in health professions curricula. We systematically analysed research reports from a mix of study designs that provided a broad perspective on the relevant factors. The main outcome of this study is the iCoViP Framework, which represents a mosaic of 14 themes encompassing many specific topics encountered by stakeholders when reflecting on VPs in health professions curricula. We examined the prevalence of the identified themes in the included studies to justify the relevance of the framework. Finally, we assessed the quality of the analysed studies.

Significance of the results

The significance of the developed framework lies in its ability to provide the health professions education community with a structure that can guide VP implementation efforts and serve as a scaffold for training and research in the field of integration of VPs in curricula. The developed framework was immediately applied in the structuring of the iCoViP Curriculum Implementation Guideline. This dynamic document, available on the website of the iCoViP project [ https://icovip.eu/knowledge-base ], presents the recommendations taken from the literature review and the project partners’ experiences with how to implement VPs, particularly the collection of 200 VPs developed during the iCoViP project [ 23 ]. To improve the accessibility of this guideline, we have added a glossary with definitions of important terms. We have already been using the framework to structure faculty development courses on the topic of teaching with VPs.

It is clear from our study that the success of integrating VPs into curricula depends on the substantial effort that is required of stakeholders to make changes in the learning environment to enable VPs to work well in the context of local health professions education programs. The wealth of themes discussed in the literature around VPs confirms what is known from implementation science: the quality of the implementation is as important as the quality of the product [ 15 ]. This might be disappointing for those who hope VPs are a turnkey solution that can be easily purchased to save time, under the misconception that implementation will occur effortlessly.

Our review also makes it evident that the implementation of VPs is a team endeavour. Without understanding, acceptance and mutual support at all levels of the institutional hierarchy and across a broad range of professional backgrounds, the different aspects of integrating VPs into curricula will not fit together. Students should not be left to their own devices when using VPs. They need to understand the relevance of the learning method used in a given curriculum by observing teachers’ engagement in the routine use of VPs, and they should properly understand the relationship between VPs and student assessment. Despite the IT-savviness of many students, they should be shown how and when to use VPs, while also being allowed room for creative, self-directed learning. Finally, students should not get the impression that their use of VPs comes at the expense of something to which they give higher priority, such as direct patient contact or teacher feedback. Teachers facilitating learning with VPs should be convinced of their utility and effectiveness, and they need to know how to use VPs themselves before recommending them to students. It is important that teachers are aware that VPs, like any other teaching resources, require quality control linked with perpetual updates. They should feel supported by more-experienced colleagues and an IT helpdesk if methodological or technical issues arise. Last but not least, curriculum managers should recognise the benefits and limitations of VPs, how they align with institutional goals, and that their adoption requires both time and financial resources to sustain. All of this entails communication, coordinated efforts, and shared decision-making during the implementation of VPs in curricula.

Implications for the field

Per Nilsen has divided implementation theories, models and frameworks into three broad categories: process models, determinant frameworks and evaluation models [ 16 ]. We view the iCoViP Framework primarily as a process model. This perspective originates from the initial framework we adopted in our systematic review, namely Kern’s six-step curriculum development process [ 30 ], which facilitates the grouping of curriculum integration factors into discrete steps and suggests a specific order in which to address implementation tasks. Our intention in using this framework was also to structure how-to guidelines, which are another hallmark of process models. As already noted by Nilsen, and as is evident in Kern’s model, implementation process models are rarely applied linearly in practice and require a pragmatic transition between steps, depending on the situation.

The boundary between the classes of implementation models is blurred [ 16 ] and there is significant overlap. It is therefore not surprising that the iCoViP Framework can also be interpreted through the lens of a determinant framework that organises many factors (facilitators and barriers) influencing VP implementation in curricula. Nilsen’s category of determinant frameworks includes the CFIR framework [ 52 ], which was also chosen by Kassianos et al. to structure their study included in this review [ 38 ]. A comparison of the themes emerging from their study and our framework indicates a high degree of agreement (as depicted in Fig.  2 ). We interpret this as a positive indication of research convergence. Our framework extends this research by introducing numerous fine-grained topic codes that are characteristic of VP integration into curricula.

The aim of our research was not to develop an evaluation framework. For this purpose, the two evaluation tools available in the literature by Huwendiek et al. [ 24 ] and Kleinheksel & Ritzhaupt [ 26 ] are suitable. However, the factors proposed in our framework can further inform and potentially extend existing or new tools for assessing VP integration.

Despite the plethora of available implementation science theories and models [ 16 ], their application in health professions curricula is limited [ 15 ]. The studies included in this systematic review only occasionally reference implementation science theories directly (exceptions are CFIR and UTAUT [ 38 ], Rogers’ Diffusion of Innovation Theory [ 26 , 42 ] and Surry’s RIPPLES model [ 42 ]). However, it is important to acknowledge that implementation science is itself an emerging field that is gradually gaining recognition. Furthermore, as noticed by Dubrowski & Dubrowski [ 17 ], the direct application of general implementation science models does not guarantee success and requires verification and adaptation.

Limitations and strengths

This study is based on stakeholders’ perceptions of the integration of VPs into curricula. The strength of the evidence behind the recommendations expressed in the analysed studies is low from a positivist perspective as it is based on subjective opinions. However, by adopting a more interpretivist stance in this review, our goal is not to offer absolute, ready-to-copy recommendations. Instead, we aim to provide a framework that organises the implementation themes identified in the literature into accessible steps. It is beyond the scope of this review to supply an inventory of experimental evidence for the validity of the recommendations in each topic, as was intended in previous systematic reviews [ 4 ]. We recognise that, for some themes, it will always be challenging to achieve a higher level of evidence due to practical constraints in organising studies that experiment with different types of curricula. The complexity, peculiarities, and context-dependency of implementation likely preclude one-size-fits-all recommendations for VP integration. Nevertheless, even in such a situation, a framework for sorting through past experiences with integration of VPs proves valuable for constructing individual solutions that fit a particular context.

The aim of our study was to cover experiences from different health professions programs in the literature synthesis. However, with a few exceptions, the results show a dominance of medical programs in research on VP implementation in curricula. This, although beyond the authors’ control, limits the applicability of our review findings. The data clearly indicates a need for more research into the integration of VPs into health professions curricula other than medicine.

The decision to exclude single-factor studies from the framework synthesis is justified by our aim to provide a comprehensive overview of the integration process. Nevertheless, recommendations from identified single-factor studies [ 53 , 54 , 55 ] were subsequently incorporated into the individual themes in the iCoViP project implementation guideline. We did not encounter any studies on single factors that failed to align with any of the identified themes within the framework. Due to practical reasons concerning the review’s feasibility, we did not analyse studies in languages other than English and did not explore non-peer-reviewed grey literature databases. However, we recognise the potential of undertaking such activities in preparing future editions of the iCoViP guideline as we envisage this resource as an evolving document.

We acknowledge that our systematic review was shaped by the European iCoViP project [ 23 ]. However, we did not confine our study to just a single VP model, thereby encompassing a broad range of technical implementations. The strength of this framework synthesis lies in the diversity of its contributors affiliated with several European universities in different countries, who were at different stages of their careers, and had experience with various VP systems.

Further research

The iCoViP framework, by charting a map of themes around VP integration in health professions curricula, provides a foundation for further, more focused research on individual themes. The less-common themes or conflicts and inconsistencies in recommendations found in the literature synthesis may be a promising starting point.

An example of this is the phase of the curriculum into which a given VP fits. We see that proponents of early and late introduction of VPs use different arguments. The recommendation that VPs should be of increasing difficulty seems to be valid, but what is missing is the detail of what this means in practice. We envisage that this will be researched by exploring models of integration that cater for different levels of student expertise.

There are also varying opinions between those who see VPs as tools for presenting rare, intriguing cases, and those who see the commonality and practice relevance of the clinical problems presented in VPs as the most important factor. However, these opposing stances can be harmonised by developing a methodology for establishing a well-balanced case-mix of VPs with different properties, depending on the needs of the learners and the curricular context. Another point of division is the recognition of VPs as a tool for internationalising studies and supporting student mobility, versus the expectation that VPs should be adapted to local circumstances. These disparate beliefs can be reconciled by research into the design of activities around VPs that explicitly addresses the different expectations and confirms or refutes their usefulness.

A significant barrier to the adoption of VPs is cost. While universities are occasionally willing to make a one-off investment in VPs for prestige or research purposes, the field needs more sustainable models. These should be suitable for different regions of the world and demonstrate how VPs can be maintained at a high level of quality in the face of limited time and resources. This is particularly important in low-resource countries and those affected by crises (e.g., war, natural disasters, pandemics), where the need for VPs is even greater than in developed countries due to the shortage of health professionals involved in teaching [ 56 ]. However, most of the studies included in our systematic review are from high-income countries. This shows a clear need for more research into the implementation of VPs in health professions curricula in developing countries.

Finally, an interesting area for future research is the interplay of different types of simulation modalities in curricula. The studies we reviewed do not recommend one type of simulation over another as each method has its unique advantages. In line with previous suggestions [ 46 ], we see a need for further research into practical implementation methods of such integrated simulation scenarios in curricula.

This framework synthesis of mixed-methods studies on the curricular integration of VPs structured stakeholders’ perceptions into 14 themes. We envision that teachers, course directors and curriculum designers will benefit from this framework when they decide to introduce VPs into their teaching. We anticipate that our summary will inspire health professions education researchers to conduct new studies that deepen our understanding of how to implement VPs in curricula effectively and efficiently. Last but not least, we hope that our research will empower students to express their expectations regarding how they would like to learn with VPs, thus helping them to become better health professionals in the future.

Data availability

All datasets produced and analysed during the current study are available from the corresponding author upon reasonable request.

Abbreviations

  • VP: Virtual patient
  • iCoViP: International Collection of Virtual Patients
  • QuADS: Quality Assessment with Diverse Studies
  • Liaison Committee on Medical Education (LCME) accreditation standard
  • Computer-assisted Learning in Paediatrics Program
  • PBL: Problem-based learning

Ellaway R, Poulton T, Fors U, McGee JB, Albright S. Building a virtual patient commons. Med Teach. 2008;30:170–4.

Cook DA, Erwin PJ, Triola MM. Computerized virtual patients in health professions education: a systematic review and meta-analysis. Acad Med. 2010;85:1589–602.

Consorti F, Mancuso R, Nocioni M, Piccolo A. Efficacy of virtual patients in medical education: a meta-analysis of randomized studies. Comput Educ. 2012;59:1001–8.

Kononowicz AA, Woodham LA, Edelbring S, Stathakarou N, Davies D, Saxena N, et al. Virtual patient simulations in health professions education: systematic review and meta-analysis by the Digital Health Education Collaboration. J Med Internet Res. 2019;21:e14676.

Lee J, Kim H, Kim KH, Jung D, Jowsey T, Webster CS. Effective virtual patient simulators for medical communication training: a systematic review. Med Educ. 2020;54:786–95.

Foronda CL, Fernandez-Burgos M, Nadeau C, Kelley CN, Henry MN. Virtual simulation in nursing education: a systematic review spanning 1996 to 2018. Simul Healthc. 2020;15:46–54.

Richardson CL, White S, Chapman S. Virtual patient technology to educate pharmacists and pharmacy students on patient communication: a systematic review. BMJ Simul Technol Enhanc Learn. 2020;6:332–8.

Plackett R, Kassianos AP, Mylan S, Kambouri M, Raine R, Sheringham J. The effectiveness of using virtual patient educational tools to improve medical students’ clinical reasoning skills: a systematic review. BMC Med Educ. 2022;22:365.

Kelly S, Smyth E, Murphy P, Pawlikowska T. A scoping review: virtual patients for communication skills in medical undergraduates. BMC Med Educ. 2022;22:429.

Berman N, Fall LH, Smith S, Levine DA, Maloney CG, Potts M, et al. Integration strategies for using virtual patients in clinical clerkships. Acad Med. 2009;84:942–9.

Huwendiek S, Duncker C, Reichert F, De Leng BA, Dolmans D, Van Der Vleuten CPM, et al. Learner preferences regarding integrating, sequencing and aligning virtual patients with other activities in the undergraduate medical curriculum: a focus group study. Med Teach. 2013;35:920–9.

Cook DA. The Research we still are not doing: an agenda for the study of computer-based learning. Acad Med. 2005;80:541–8.

Berman NB, Fall LH, Maloney CG, Levine DA. Computer-assisted instruction in clinical education: a roadmap to increasing CAI implementation. Adv Health Sci Educ. 2008;13:373–83.

Eccles MP, Mittman BS. Welcome to implementation science. Implement Sci. 2006;1:1.

Dubrowski R, Barwick M, Dubrowski A. I wish I knew this before… An implementation science primer and model to guide implementation of simulation programs in medical education. In: Safir O, Sonnadara R, Mironova P, Rambani R, editors. Boot Camp Approach to Surgical Training. Cham: Springer International Publishing; 2018. pp. 103–21.

Nilsen P. Making sense of implementation theories, models and frameworks. Implement Sci. 2015;10:53.

Dubrowski R, Dubrowski A. Why should implementation science matter in simulation-based health professions education? Cureus. 2018. https://doi.org/10.7759/cureus.3754 .

Durlak JA, DuPre EP. Implementation matters: a review of research on the influence of implementation on program outcomes and the factors affecting implementation. Am J Community Psychol. 2008;41:327–50.

Friedman C. The research we should be doing. Acad Med. 1994;69:455–7.

Huwendiek S, Muntau AC, Maier EM, Tönshoff B, Sostmann K. E-Learning in Der Medizinischen Ausbildung: Leitfaden Zum Erfolgreichen Einsatz in Der Pädiatrie. Monatsschr Kinderheilkd. 2008;156:458–63.

Cendan J, Lok B. The use of virtual patients in medical school curricula. Adv Physiol Educ. 2012;36:48–53.

Ellaway RH, Davies D. Design for learning: deconstructing virtual patient activities. Med Teach. 2011;33:303–10.

Mayer A, Da Silva Domingues V, Hege I, Kononowicz AA, Larrosa M, Martínez-Jarreta B, et al. Planning a collection of virtual patients to train clinical reasoning: a blueprint representative of the European population. Int J Environ Res Public Health. 2022;19:6175.

Huwendiek S, Haider HR, Tönshoff B, Leng BD. Evaluation of curricular integration of virtual patients: development of a student questionnaire and a reviewer checklist within the electronic virtual patient (eVIP) project. Bio-Algorithms Med-Syst. 2009;5:35–44.

Georg C, Zary N. Web-based virtual patients in nursing education: development and validation of theory-anchored design and activity models. J Med Internet Res. 2014;16:e105.

Kleinheksel AJ, Ritzhaupt AD. Measuring the adoption and integration of virtual patient simulations in nursing education: an exploratory factor analysis. Comput Educ. 2017;108:11–29.

Gough D, Oliver S, Thomas J. An introduction to systematic reviews. SAGE; 2017.

Moher D, Liberati A, Tetzlaff J, Altman DG, for the PRISMA Group. Preferred reporting items for systematic reviews and meta-analyses: the PRISMA statement. BMJ. 2009;339:b2535.

Ouzzani M, Hammady H, Fedorowicz Z, Elmagarmid A. Rayyan—a web and mobile app for systematic reviews. Syst Rev. 2016;5:210.

Thomas PA, Kern DE, Hughes MT, Chen BY, editors. Curriculum development for medical education: a six-step approach. Third edition. Baltimore: Johns Hopkins University Press; 2016.

Chen BY, Kern DE, Kearns RM, Thomas PA, Hughes MT, Tackett S. From modules to MOOCs: application of the Six-Step Approach to Online Curriculum Development for Medical Education. Acad Med. 2019;94:678–85.

VERBI Software. MAXQDA 2022.5. Software. 2023. maxqda.com.

Harrison R, Jones B, Gardner P, Lawton R. Quality assessment with diverse studies (QuADS): an appraisal tool for methodological and reporting quality in systematic reviews of mixed- or multi-method studies. BMC Health Serv Res. 2021;21:144.

Opie JE, McLean SA, Vuong AT, Pickard H, McIntosh JE. Training of lived experience workforces: a rapid review of content and outcomes. Adm Policy Ment Health Ment Health Serv Res. 2023;50:177–211.

Goagoses N, Suovuo T, Bgt, Winschiers-Theophilus H, Suero Montero C, Pope N, Rötkönen E, et al. A systematic review of social classroom climate in online and technology-enhanced learning environments in primary and secondary school. Educ Inf Technol. 2024;29:2009–42.

Botezatu M, Hult H, Fors UG. Virtual patient simulation: what do students make of it? A focus group study. BMC Med Educ. 2010;10:91.

Edelbring S, Dastmalchi M, Hult H, Lundberg IE, Dahlgren LO. Experiencing virtual patients in clinical learning: a phenomenological study. Adv Health Sci Educ. 2011;16:331–45.

Kassianos AP, Plackett R, Kambouri MA, Sheringham J. Educators’ perspectives of adopting virtual patient online learning tools to teach clinical reasoning in medical schools: a qualitative study. BMC Med Educ. 2023;23:424.

McCarthy D, O’Gorman C, Gormley G. Intersecting virtual patients and microbiology: fostering a culture of learning. Ulster Med J. 2015;84(3):173-8.

Botezatu M, Hult Hå, Kassaye Tessma M, Fors UGH. As time goes by: stakeholder opinions on the implementation and use of a virtual patient simulation system. Med Teach. 2010;32:e509–16.

Dahri K, MacNeil K, Chan F, Lamoureux E, Bakker M, Seto K, et al. Curriculum integration of virtual patients. Curr Pharm Teach Learn. 2019;11:1309–15.

Schifferdecker KE, Berman NB, Fall LH, Fischer MR. Adoption of computer-assisted learning in medical education: the educators’ perspective. Med Educ. 2012;46:1063–73.

Dafli E, Fountoukidis I, Hatzisevastou-Loukidou C, D Bamidis P. Curricular integration of virtual patients: a unifying perspective of medical teachers and students. BMC Med Educ. 2019;19:416.

Edelbring S, Broström O, Henriksson P, Vassiliou D, Spaak J, Dahlgren LO, et al. Integrating virtual patients into courses: follow-up seminars and perceived benefit. Med Educ. 2012;46:417–25.

Lang VJ, Kogan J, Berman N, Torre D. The evolving role of online virtual patients in internal medicine clerkship education nationally. Acad Med. 2013;88:1713–8.

Ellaway R, Topps D, Lee S, Armson H. Virtual patient activity patterns for clinical learning. Clin Teach. 2015;12:267–71.

Hege I, Ropp V, Adler M, Radon K, Mäsch G, Lyon H, et al. Experiences with different integration strategies of case-based e-learning. Med Teach. 2007;29:791–7.

Hirumi A, Johnson T, Reyes RJ, Lok B, Johnsen K, Rivera-Gutierrez DJ, et al. Advancing virtual patient simulations through design research and interPLAY: part II—integration and field test. Educ Technol Res Dev. 2016;64:1301–35.

Kulasegaram K, Mylopoulos M, Tonin P, Bernstein S, Bryden P, Law M, et al. The alignment imperative in curriculum renewal. Med Teach. 2018;40:443–8.

Fors UGH, Muntean V, Botezatu M, Zary N. Cross-cultural use and development of virtual patients. Med Teach. 2009;31:732–8.

Kelley CG. Using a virtual patient in an advanced assessment course. J Nurs Educ. 2015;54:228–31.

Damschroder LJ, Aron DC, Keith RE, Kirsh SR, Alexander JA, Lowery JC. Fostering implementation of health services research findings into practice: a consolidated framework for advancing implementation science. Implement Sci. 2009;4:50.

Zary N, Johnson G, Fors U. Web-based virtual patients in dentistry: factors influencing the use of cases in the Web‐SP system. Eur J Dent Educ. 2009;13:2–9.

Maier EM, Hege I, Muntau AC, Huber J, Fischer MR. What are effects of a spaced activation of virtual patients in a pediatric course? BMC Med Educ. 2013;13:45.

Johnson TR, Lyons R, Kopper R, Johnsen KJ, Lok BC, Cendan JC. Virtual patient simulations and optimal social learning context: a replication of an aptitude–treatment interaction effect. Med Teach. 2014;36:486–94.

Mayer A, Yaremko O, Shchudrova T, Korotun O, Dospil K, Hege I. Medical education in times of war: a mixed-methods needs analysis at Ukrainian medical schools. BMC Med Educ. 2023;23:804.

Acknowledgements

The authors would like to thank Zuzanna Oleniacz and Joanna Ożga for their contributions in abstract screening and data extraction, as well as all the participants who took part in the iCoViP project and the workshops.

The study has been partially funded by the ERASMUS + program, iCoViP project (International Collection of Virtual Patients) from European Union grant no. 2020-1-DE01-KA226-005754 and internal funds from Jagiellonian University Medical College (N41/DBS/001125).

Author information

Authors and affiliations

Center for Innovative Medical Education, Jagiellonian University Medical College, Medyczna 7, Krakow, 30-688, Poland

Joanna Fąferek

Faculty of Medicine, Paris Saclay University, Le Kremlin-Bicetre, 94270, France

Pierre-Louis Cariou & Luc Morin

Paracelsus Medical University, Prof.-Ernst-Nathan-Str. 1, 90419, Nürnberg, Germany

Medical Education Sciences, University of Augsburg, 86159, Augsburg, Germany

Institute and Clinic for Occupational, Social and Environmental Medicine, LMU University Hospital, 80336, Munich, Germany

Daloha Rodriguez-Molina

Department of Community Medicine, Information and Health Decision Sciences, Faculty of Medicine, University of Porto, Porto, Portugal

Bernardo Sousa-Pinto

Department of Bioinformatics and Telemedicine, Jagiellonian University Medical College, Medyczna 7, Krakow, 30-688, Poland

Andrzej A. Kononowicz

Contributions

JF and AK conceived the idea for the study. JF coordinated the research team activities. All authors contributed to the writing of the review protocol. AK designed the literature search strategies. All authors participated in screening and data extraction. JF retrieved and managed the abstracts and full-text articles. JF and AK performed qualitative analysis of the data and quality appraisal. AK, JF and IH designed the illustrations for this study. All authors interpreted the analysis and contributed to the discussion. JF and AK drafted the manuscript. PLC, IH, AM, LM, DRM, BSP read and critically commented on the manuscript. All authors gave final approval of the version submitted.

Corresponding authors

Correspondence to Joanna Fąferek or Andrzej A. Kononowicz .

Ethics declarations

Ethics approval and consent to participate

Systematic review of literature - not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary Material 1

Supplementary Material 2

Supplementary Material 3

Supplementary Material 4

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ . The Creative Commons Public Domain Dedication waiver ( http://creativecommons.org/publicdomain/zero/1.0/ ) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Fąferek, J., Cariou, PL., Hege, I. et al. Integrating virtual patients into undergraduate health professions curricula: a framework synthesis of stakeholders’ opinions based on a systematic literature review. BMC Med Educ 24, 727 (2024). https://doi.org/10.1186/s12909-024-05719-1

Received: 20 March 2024

Accepted: 27 June 2024

Published: 05 July 2024

DOI: https://doi.org/10.1186/s12909-024-05719-1

Keywords

  • Curriculum development
  • Systematic review
  • Framework synthesis

BMC Medical Education

ISSN: 1472-6920
