Finding the standard setting method for your assessment

https://doi.org/10.29060/TAPS.2025-10-3/TT003

Gominda Ponnamperuma

MBBS, MMEd, PhD
Professor in Medical Education
Faculty of Medicine, University of Colombo, Sri Lanka

Standard setting is the process of deciding the boundary or standard that separates candidates into two (e.g. pass and fail) or more groups, based on the ability they demonstrate in an assessment. Standard setting methods can be broadly grouped into four clusters (see table below).

When to use which method, though a crucial decision for any Board of Examiners, is inadequately explored in the literature. The following brief guide attempts to bridge this literature gap.

Cluster of methods	Key features	Issues	When to use
Arbitrary standards and norm-referenced standards	Arbitrary standards produce a fixed pass mark, e.g., candidates scoring 50% or more pass. Norm-referenced standards produce a fixed pass rate, e.g., the top-scoring 40% of candidates pass.	The pass mark is unrelated to the difficulty of assessment items.	Arbitrary standards: not indicated for high-stakes assessment. Norm-referencing: used for selection purposes.
Test-centred methods	A group of experts (judges) estimates the probability that a hypothetical borderline candidate (one who has a 50% probability of passing) or a just-passing candidate would pass the test items, e.g. Angoff (1971), Ebel (1972), Nedelsky (1954), Bookmark (Karantonis & Sireci, 2006) and Jaeger (1982) methods. The judges’ estimates are collated through an averaging process. An expert (judge) is a subject-matter specialist with considerable experience as a teacher and an assessor, who is well versed in the educational basis of standard setting.	Although the pass mark is directly related to the difficulty of test items, human judgement is not infallible: the pass mark can vary from one panel of judges to another, even for the same test; finding a sizeable group of experts (at least 8) who satisfy all requirements is difficult; it is difficult for judges to visualise a hypothetical borderline candidate; and the process is time-consuming. Owing to these difficulties, the pass mark can be unrealistic.	When an adequate number of properly trained and experienced expert judges, who can devote quality time to the standard setting process, is available. When modifications such as the Modified Angoff method can be used to overcome unrealistic standards by allowing judges to be informed by actual results of previous similar exams.
Partially results-based methods-I: Examinee-centred methods	Based on actual candidate performance, judges group candidates into two or more groups, e.g. Borderline group (Smee & Blackmore, 2001), Borderline regression (Kramer et al., 2003), Contrasting groups (Livingston & Zieky, 1982, p. 35) and Up-down (Livingston & Zieky, 1982, p. 43) methods. The pass mark is calculated using the actual candidate scores.	Although judgements are realistic, the introduction of actual test results tends to make the standard cohort-dependent, i.e., norm-referencing features influence the standard.	When there is a sufficiently large number of candidates. When a global score or a global pass/fail decision is available, in addition to the usual itemised score. When the judges are well trained in making a global decision independent of the itemised scores.
Partially results-based methods-II: Compromise methods	Judges make judgements by looking at test items, and those judgements are superimposed on actual candidate scores to derive the pass mark, e.g., Hofstee method (Hofstee, 1973).	Expert judgements and actual results may not match each other. The standard can be cohort-dependent due to the norm-referencing features of actual candidate scores.	When trained judges, actual results of a sizeable cohort of candidates, and expertise in handling both the judges’ judgements and the results are available. Mostly used as a backup method to verify standards generated by other methods.
Results-based methods	Judges are not needed for standard setting. The pass mark is generated by statistically manipulating the actual marks, e.g. Cohen (Cohen-Schotanus & van der Vleuten, 2010) and Wijnen (1971) methods.	Due to the norm-referencing influence, the pass mark could be high and defensibility would be an issue.	These methods should be used in high-stakes assessment only when an adequate evidence base has been built by running them in parallel with another, more established method.
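
To make the contrast between the clusters concrete, the following Python sketch computes a pass mark in four of the ways described in the table: a fixed pass mark, a fixed pass rate, an Angoff-style average of judges' estimates, and a borderline group mean. It is a minimal illustration under stated assumptions, not a validated procedure: the function names, the sample scores, ratings and estimates are all hypothetical, the Angoff-style function assumes each judge supplies one probability per item, and the borderline group function assumes the pass mark is taken as the mean score of candidates whom judges rated "borderline" (the median is also used in practice).

```python
from statistics import mean


def fixed_pass_mark_decisions(scores, pass_mark=50.0):
    """Arbitrary standard: the same pass mark regardless of item difficulty."""
    return [score >= pass_mark for score in scores]


def norm_referenced_cut_score(scores, pass_rate=0.40):
    """Norm-referenced standard: the lowest score within the top `pass_rate`
    fraction of the cohort. Tied scores at the cut would all pass."""
    ranked = sorted(scores, reverse=True)
    n_pass = max(1, round(len(ranked) * pass_rate))
    return ranked[n_pass - 1]


def angoff_pass_mark(estimates):
    """Test-centred (Angoff-style) standard, as a percentage.

    estimates[j][i] is judge j's estimated probability (0 to 1) that a
    borderline candidate passes item i. Estimates are averaged across
    judges for each item, then across items.
    """
    item_means = [mean(item) for item in zip(*estimates)]
    return 100.0 * mean(item_means)


def borderline_group_pass_mark(scores, global_ratings, label="borderline"):
    """Examinee-centred standard: mean score of candidates whom judges
    globally rated as borderline (assumed variant; see lead-in)."""
    group = [s for s, r in zip(scores, global_ratings) if r == label]
    if not group:
        raise ValueError("no candidates were rated as borderline")
    return mean(group)


# Hypothetical cohort of ten candidates (percentage scores and global ratings).
scores = [35, 42, 48, 50, 55, 58, 61, 67, 72, 80]
ratings = ["fail", "fail", "borderline", "borderline", "borderline",
           "pass", "pass", "pass", "pass", "pass"]

# Hypothetical estimates from three judges on a four-item test.
judge_estimates = [
    [0.6, 0.5, 0.7, 0.4],
    [0.7, 0.5, 0.6, 0.4],
    [0.5, 0.5, 0.8, 0.4],
]

print("Fixed pass mark, candidates passing:", sum(fixed_pass_mark_decisions(scores)))
print("Norm-referenced cut score:", norm_referenced_cut_score(scores))
print("Angoff-style pass mark (%):", round(angoff_pass_mark(judge_estimates), 1))
print("Borderline group pass mark:", borderline_group_pass_mark(scores, ratings))
```

Only the Angoff-style function depends on the test items; the other three depend on a fixed number or on the cohort's scores, which is the source of the cohort-dependence noted in the table. A real panel would need the larger number of judges (at least 8) and the larger cohorts that the table recommends.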

References

Angoff, W. H. (1971). Scales, norms, and equivalent scores. In R. L. Thorndike (Ed.), Educational measurement (2nd ed., pp. 508-600). American Council on Education.

Ebel, R. L. (1972). Essentials of educational measurement. Prentice Hall.

Nedelsky, L. (1954). Absolute grading standards for objective tests. Educational and Psychological Measurement, 14(1), 3-19. https://doi.org/10.1177/001316445401400101

Karantonis, A., & Sireci, S. G. (2006). The bookmark standard-setting method: A literature review. Educational Measurement: Issues and Practice, 25(1), 4-12. https://doi.org/10.1111/j.1745-3992.2006.00047.x

Jaeger, R. M. (1982). An iterative structured judgment process for establishing standards on competency test: Theory and application. Educational Evaluation and Policy Analysis, 4(4), 461-476. https://doi.org/10.3102/01623737004004461

Smee, S. M., & Blackmore, D. E. (2001). Setting standards for an Objective Structured Clinical Examination: The borderline group method gains ground on Angoff. Medical Education, 35(11), 1009-1010. https://doi.org/10.1111/j.1365-2923.2001.01047.x

Kramer, A., Muijtjens, A., Jansen, K., Dusman, H., Tan, L., & van der Vleuten, C. (2003). Comparison of a rational and an empirical standard setting procedure for an OSCE. Medical Education, 37(2), 132-139. https://doi.org/10.1046/j.1365-2923.2003.01429.x

Livingston, S. A., & Zieky, M. J. (1982). Passing scores: A manual for setting standards of performance on educational and occupational tests. Educational Testing Service.

Hofstee, W. K. B. (1973). Een alternatief voor normhandhaving bij toetsen. Nederlands Tijdschrift voor de Psychologie, 28, 215-227.

Cohen-Schotanus, J., & van der Vleuten, C. P. M. (2010). A standard setting method with the best performing students as point of reference: Practical and affordable. Medical Teacher, 32(2), 154-160. https://doi.org/10.3109/01421590903196979

Wijnen, W. H. F. W. (1971). Onder of boven de maat. Swets & Zeitlinger.
