To reduce the likelihood that variability in appraisal scoring between researchers excluded valid material from the synthesis, 42% (43/103) of all reports categorized as ‘Mid’ were re-appraised for inter-rater reliability. A purposive approach was used to capture potential sources of variability. The selection of excluded reports for re-appraisal therefore ensured that different types of interventions (e.g. population-based, experimental, school-based) were covered, and care was taken to have these reliability appraisals carried out by researchers with different backgrounds (e.g. epidemiology vs. health promotion). The investigative team settled any discrepancies in the categorization of reports arising from this reliability appraisal.