Author unknown - Mededworld and amee 2013 conference connect - page 3
ABSTRACT BOOK: SESSION 2 MONDAY 26 AUGUST: 0830-1015
students get the opportunity to create and develop mutual relationships with the patients. The complex learning processes of final year nursing students, which involve complex constructions of learning thresholds, need to be further explored and handled by supervisors. References: Clouder L. 2006. Caring as a "threshold concept". Transforming students in higher education into health(care) professionals. Teaching in Higher
Education, 10:4, 505-517.
Hsieh H-F. & Shannon S. E. 2005. Three approaches to qualitative content analysis. Qualitative Health
Research, 15, 1277-1288.
McCune V. 2009. Final year biosciences students'
willingness to engage: teaching-learning environments,
authentic learning experiences and identities. Studies in
Higher Education, 34:3, 347-61.
Manninen K., Welin Henriksson E., Scheja M. & Silen C.
2013. Authenticity in learning - nursing students'
experiences at a clinical education ward. Health
Education, 113: 2 (in press)
Mezirow J. 2009. An overview on transformative
learning. In K. Illeris (Ed.), Contemporary theories of
learning (pp. 90-105). London: Routledge.
The effectiveness of service learning: A critical review of the literature
Ruth Mc Menamin (National University of Ireland Galway, College of Medicine, Nursing and Health Sciences, Galway, Ireland)
Margaret Mc Grath (National University of Ireland Galway, College of Medicine, Nursing and Health Sciences, Galway, Ireland)
Introduction: Service learning (SL) is increasingly used as a pedagogical tool by healthcare educators worldwide (1). There is a paucity of information on the impacts of SL on students' learning outcomes and the difficulties of transferring SL from one location to another. This review aims to: i) identify the reported impacts of SL for healthcare students; ii) highlight gaps in our understanding of this pedagogical approach; and iii) provide guidance on priority areas for future research. This work is timely as it evaluates the emerging international evidence base on the impacts of SL and the challenges of localisation (2).
Methods: A critical review of the literature. Seven databases were searched for all available literature on the impacts of SL for undergraduate healthcare students. Data were extracted relating to research aims, population and sample size, design and data collection method(s), key findings, type of measure(s) used and the nature of the impact(s) reported. Six categories of learning outcomes guided the analysis: (i) personal and interpersonal development, (ii) understanding and applying knowledge, (iii) engagement, curiosity and reflective practice, (iv) critical thinking, (v) perspective transformation and (vi) citizenship (3).
Results: The initial search identified 1485 papers, of which 1423 were excluded for failing to meet the inclusion criteria. Screening the bibliographies of the remaining 62 papers identified a further 15 relevant studies. Following quality appraisal, the set of 77 was reduced to 53 eligible papers for detailed analysis. This review highlights the lack of clarity in the definition and understanding of SL and confirms the paucity of literature on the impacts of SL. Civic awareness is an explicit aim of SL, yet only a minority of studies reported changes in this domain. Positive learning outcomes are primarily reported in students' personal and interpersonal development, which includes gains in cultural competence and comfort in collaborating with the 'different other'.
Discussion and Conclusion: SL is a complex educational approach involving communities, students and institutions with the aspiration that the shared relationship is equally beneficial and reciprocal. The idiosyncratic nature of SL experiences makes it difficult to identify definite learning outcomes that can be generalised. Future studies based on the interpretative paradigm and focused on the process rather than the outcomes of SL may expand our understanding of this pedagogy. Currently, the evidence base to support the use of SL in undergraduate healthcare curricula is not established, which creates opportunities and challenges for those considering introducing this teaching approach. We encourage educators to continue to share evidence about the impact(s) of SL from rigorous study designs by transforming tacit knowledge into tangible research questions. Exploring questions around: 1. Defining SL; 2. How SL experiences lead to particular academic and partner outcomes; 3. The nature of the evidence and measurement tools being used; 4. Whether the unique features of SL are being evaluated as outcomes; 5. Whether participation in SL has long-term impacts; and 6. What will make this pedagogical approach most educationally effective will create new evidence enabling us to make informed decisions about the implementation of SL in healthcare education.
References: (1) Dharamsi S, Richards M, Louie D, Murray D, Berland A, Whitfield M, et al. Enhancing medical students' conceptions of the CanMEDS Health Advocate Role through international service-learning and critical reflection: a phenomenological study. Medical Teacher.
(2) Boland J, A., McIlrath L. The Process of Localising Pedagogies for Civic Engagement in Ireland: The Significance of Conceptions, Culture and Context. In: McIlrath L, Mac Labhrainn I, editors. Higher Education and Civic Engagement: International Perspectives. Hampshire: Ashgate Publishing: Corporate Social Responsibility Series; 2007. p. 83-103.
(3) Eyler J, Giles DE, Jr. Where's the Learning in Service-Learning? Jossey-Bass Higher and Adult Education Series: Jossey-Bass, Inc.; 1999.
(4) En Wee L, Xin YW, Koh GCH. Doctors-to-be at the doorstep: Comparing service-learning programs in an Asian medical school. Medical Teacher.
(5) Meili R, Fuller D, Lydiate J. Teaching social accountability by making the links: Qualitative evaluation of student experiences in a service-learning project. Medical Teacher. 2011;33:659-66.
Reliability estimations of the mini-CEX using traditional and construct-aligned scales
Alberto Alves de Lima (Instituto Cardiovascular de Buenos Aires, Education and research, Blanco Encalada 1525, Libertador 6302, Buenos Aires 1428, Argentina) Augusto Lavalle Cobo (Instituto Cardiovascular de Buenos Aires, Education and research, Buenos Aires, Argentina)
Ana Iribarren (Instituto Cardiovascular de Buenos Aires, Education and research, Buenos Aires, Argentina) Lujan Forti (Instituto Cardiovascular de Buenos Aires, Education and research, Buenos Aires, Argentina) Mariano Albertal (Instituto Cardiovascular de Buenos Aires, Education and research, Buenos Aires, Argentina) Cees Van der Vleuten (Maastricht University, Educational development and Research, Maastricht, Netherlands)
Introduction: Recently, Crossley et al. have demonstrated that in real-life settings Mini-CEX scales constructed to reflect the development of clinical sophistication and independence (CS) have higher utility than the traditional ones (TS), since they are more reliable and therefore provide evidence of greater validity (1, 2). The aim of this study is to reproduce these findings in a controlled setup and to evaluate the different variance components in both scales. Methods: Three encounters were videotaped from each of 21 residents (R). The patients were the same for all R. Each encounter was assessed by 3 assessors (A), who assessed all encounters for all R. The A assessed the encounters twice: the first time using the TS and 30 days later using the CS. Each A was an internal medicine specialist from outside the institute and was blinded to the level of expertise of the R. All of them had previous experience with the mini-CEX and were involved in medical education. This delivered a fully crossed (all random) two-facet generalisability design each time (3).
Results: For both scales, roughly a third of the total variance was associated with universe score variance (TS 36% vs CS 29%). The largest source of variance in the TS was general error (49%), followed by the main effect of assessors (7%). In the CS the largest source of variance was general error (34%), followed by the assessors' variability for some residents (23%). Generalisability coefficients indicated that, for both types of scale, a sample of approximately 7 encounters was needed, assuming one different assessor per encounter and a different case per encounter (the usual situation in real practice); 4 encounters were needed when 2 raters were used and 3 encounters when 3 raters were used.
Discussion and Conclusion: According to the results obtained, and contrary to our expectations, the TS and the CS showed similar performance in terms of sources of variance and in the resulting reliability. Unexplained general error appears to be the major cause of unreliability of both scales, followed by assessor leniency/stringency in the TS and the assessors' variability for some residents in the CS. The explanation for these results may be that assessors were blinded to the level of expertise of the residents. Although the CS were carefully built and aligned with the priorities and the 'reality map' of those who would use them to assess, knowledge of the level of expertise of the residents could be central as a frame of reference to enhance the reliability of the CS (4,5). In conclusion, traditional scales and construct-aligned scales showed similar performance in terms of sources of variance and in the resulting reliability.
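The projection of how many encounters are needed can be illustrated with a decision-study calculation. The sketch below is not the authors' analysis: it uses the standard relative generalisability coefficient for a fully crossed persons x assessors x cases design, and the variance components are hypothetical placeholders because the abstract reports only some of them. The function names are likewise illustrative. Note that the abstract's "one different assessor per encounter" scenario is a nested design, which this crossed formula does not reproduce.

```python
def g_coefficient(var_p, var_pa, var_pc, var_pac_e, n_assessors, n_cases):
    """Relative G coefficient for a fully crossed persons x assessors x cases design.

    var_p      universe score (person) variance
    var_pa     person-by-assessor interaction variance
    var_pc     person-by-case interaction variance
    var_pac_e  residual (three-way interaction confounded with error)
    """
    relative_error = (
        var_pa / n_assessors
        + var_pc / n_cases
        + var_pac_e / (n_assessors * n_cases)
    )
    return var_p / (var_p + relative_error)


def min_cases_for(target, var_p, var_pa, var_pc, var_pac_e, n_assessors, max_cases=50):
    """Smallest number of cases reaching the target coefficient, or None if not reached."""
    for n_cases in range(1, max_cases + 1):
        g = g_coefficient(var_p, var_pa, var_pc, var_pac_e, n_assessors, n_cases)
        if g >= target:
            return n_cases
    return None


# Hypothetical variance components (proportions of total variance), NOT the study's values.
components = dict(var_p=0.35, var_pa=0.05, var_pc=0.10, var_pac_e=0.45)

for n_assessors in (1, 2, 3):
    needed = min_cases_for(0.8, n_assessors=n_assessors, **components)
    print(f"assessors per encounter: {n_assessors}, cases needed for G >= 0.8: {needed}")
```

Adding assessors shrinks both the person-by-assessor term and the residual term, which is why fewer encounters are needed as the number of raters per encounter rises.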
References: (1) Crossley J, Johnson G, Booth J, Wade W. Good questions, good answers: construct alignment improves the performance of workplace-based assessment scales. Medical Education. 2011
Jun;45(6):560-9. PubMed PMID: 21501218.
(2) Crossley J, Jolly B. Making sense of work-based assessment: ask the right questions, in the right way, about the right things, of the right people. Medical
Education. 2012 Jan;46(1):28-37. PubMed PMID: 22150194.
(3) Alves de Lima A, Conde D, Costabel J, Corso J, Van der Vleuten C. A laboratory study on the reliability estimations of the mini-CEX. Advances in health sciences education: theory and practice. 2011 Dec 23. PubMed
(4) Kogan JR, Conforti L, Bernabeo E, Iobst W, Holmboe E. Opening the black box of clinical skills assessment via observation: a conceptual model. Medical Education.
2011 Oct;45(10):1048-60. PubMed PMID: 21916943.
(5) Regehr G, Eva K, Ginsburg S, Halwani Y, Sidhu RS. Assessment in Postgraduate Medical Education: Trends and Issues in Assessment in the workplace. Members of
the FMEC PG Consortium. 2011.
2F Short Communications: Assessment: OSCE 1 - Standard setting and scoring
Location: Chamber Hall, PCC
Comparison of Absolute and Borderline Regression Standard Setting Method in Evaluating OSCE Performance
Asty Amalia (Faculty of Medicine Hasanuddin University, Medical Education, Komp. Hartaco Indah Blok 1 H No. 11, Id, Makassar 90224, Indonesia)
Background: Harden et al. explained that the OSCE offers the advantages of controlled grading criteria and easy repeatability of the examination. Studies have shown that OSCEs help students develop procedural, communication and physical examination skills. Standard setting is one of the essential issues of the OSCE. Our faculty has been using the absolute method but has never analyzed it. The Indonesian National Association for Competence Examination has been using the borderline regression method for the National OSCE try out, but the analysis has never been published. This research was performed to compare the absolute and borderline regression standard setting methods and to investigate which procedure would be most effective in determining a proper cutoff score in the OSCE.
Summary of work: The research was performed in our eight-station, third-year, final-semester OSCE with a total of 232 students. The results were then analyzed with the absolute and borderline regression methods using Microsoft Excel 2003.
Summary of results: Using the borderline regression method, the average number of remedial students was 49.25, with an average cut-off score of 60.5%, whereas using the absolute method, with a cut-off score of 80%, the average number of remedial students was 126.25.
Conclusions: The results for the third year showed that the borderline regression method is reasonable, justifiable and credible in determining the pass standard. It can be applied right after the examination and is efficient. The weakness of this method is that the pass score cannot be determined before the assessment. Take-home messages: Further research is needed using additional standard setting methods and including the students' point of view on the method.
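For readers unfamiliar with the borderline regression method, a minimal sketch follows. It assumes the usual formulation, in which each station's checklist scores are regressed on examiners' global ratings and the cut-off is the predicted score at the borderline grade. The rating scale, the borderline grade of 2 and the data are hypothetical and are not taken from the study, which used Microsoft Excel.

```python
from statistics import mean


def borderline_regression_cutoff(global_ratings, checklist_scores, borderline_grade):
    """Fit a least-squares line of checklist score on global rating and
    return the predicted score at the borderline grade."""
    x_bar = mean(global_ratings)
    y_bar = mean(checklist_scores)
    sxx = sum((x - x_bar) ** 2 for x in global_ratings)
    sxy = sum(
        (x - x_bar) * (y - y_bar)
        for x, y in zip(global_ratings, checklist_scores)
    )
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept + slope * borderline_grade


def count_remedial(checklist_scores, cutoff):
    """Number of students scoring below the cut-off at this station."""
    return sum(1 for score in checklist_scores if score < cutoff)


# Hypothetical station: global ratings on a 1-5 scale (2 = borderline), scores in percent.
ratings = [1, 2, 2, 3, 3, 4, 4, 5]
scores = [40, 50, 55, 60, 65, 75, 80, 90]

station_cutoff = borderline_regression_cutoff(ratings, scores, borderline_grade=2)
print("BRM cut-off:", round(station_cutoff, 1))
print("Remedial under BRM:", count_remedial(scores, station_cutoff))
print("Remedial under a fixed 80% cut-off:", count_remedial(scores, 80))
```

Because the cut-off is derived from the cohort's own scores, it is only available after the examination, which is the weakness noted in the conclusions.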
The Objective Borderline Method: A probabilistic approach for standard setting
Boaz Shulruf (University of New South Wales, Medical Education, UNSW, NSW 2052, Sydney 2052, Australia) Philip Jones (University of New South Wales, Medical Education, Sydney, Australia)
Background: Despite the availability of standard setting methods, the determination of Pass/Fail decisions in clinical examinations remains problematic. The
objective borderline method (OBM), employing a probability approach to reclassify borderline grades, has been recently introduced. This study describes a modification of the OBM (OBM2) that uses two parameters (examinee ability and item difficulty) to determine the probability that a Borderline grade is a Fail or Pass.
Summary of work: Examinees' borderline grades from clinical examinations were reclassified as Pass or Fail based on the probability, derived by the OBM2 method, that a Borderline grade was likely to be a Pass or Fail in two different ways: OBM2-Pass (probability of Borderline to pass) and OBM2-not-fail (probability of Borderline not to fail). The overall examination outcomes were compared using the original grades and the reclassified grades.
Summary of results: The overall examination outcomes of both OBM2 models were more stringent than the original method and, as expected, the OBM2-Pass was more stringent than the OBM2-not-fail. Nonetheless, the OBM2-Pass had the largest positive predictive value (0.95) for predicting success in the clinical examination of the subsequent year.
Conclusions: The OBM2-Pass model is a simple, statistically robust and valid method for making Pass/Fail decisions over Borderline grades. Take-home messages: Using a probabilistic approach for making Pass/Fail decisions over Borderline grades is more practical and more defensible than other methods. Further research into this developing topic is needed.
How low can you go? Measuring the error in OSCE standard setting for a range of cohort sizes
Matt Homer (University of Leeds, Leeds Institute of Medical Education, School of Medicine, Worsley Building 7.09, Leeds LS2 9JT, United Kingdom) John Patterson (Barts and the London School of Medicine and Dentistry, Centre for Medical Education, London, United Kingdom)
Godfrey Pell (University of Leeds, Leeds Institute of Medical Education, School of Medicine, Leeds, United Kingdom)
Richard Fuller (University of Leeds, Leeds Institute of Medical Education, School of Medicine, Leeds, United Kingdom)
Background: The borderline regression method (BRM) is a widely accepted standard setting method for OSCEs. However, it is unclear whether this method is appropriate for use with small cohorts (e.g. specialist post-graduate examinations). Summary of work: This work investigates how the robustness of the BRM changes as the cohort size varies. Using re-sampling methods and pre-existing OSCE data from two institutions, the 'quality' of an OSCE is evaluated for cohorts of approximately n=300 down to n=15. The error in pass marks, the r-squared coefficient and Cronbach's alpha are all used as metrics of assessment quality.
Summary of results: The re-sampling approach proved robust, producing replicable results. For larger cohorts (n>200), the standard error in the overall pass mark is small (less than 0.5%), and for individual stations it is of the order of 1-2%. These errors grow as the sample size reduces, with cohorts of <50 candidates showing unacceptably large error. Alpha and r-squared also become unstable for small cohorts. Conclusions: Institutions working with small cohorts need to carefully consider whether their standard setting methods are sufficiently robust. If possible, the errors in the standard setting should be estimated and steps taken to ensure defensible pass/fail decisions are made. Using an innovative methodology, this work shows that the BRM is highly robust at large cohort sizes, but that for n<50 it becomes subject to large errors. Take-home messages: With cohort sizes below 50, institutions should be aware of the potentially large errors in standard setting, particularly under the BRM.
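The general idea of estimating pass-mark error by re-sampling can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: it draws bootstrap samples (with replacement) of a chosen size from a single-station cohort, recomputes the BRM cut-off for each and takes the standard deviation of the cut-offs as the standard error. The synthetic cohort, the borderline grade of 2 and all function names are hypothetical.

```python
import random
from statistics import mean, stdev


def brm_cutoff(ratings, scores, borderline_grade=2):
    """Predicted checklist score at the borderline grade from a least-squares fit.
    Returns None if every rating in the sample is identical (slope undefined)."""
    x_bar, y_bar = mean(ratings), mean(scores)
    sxx = sum((x - x_bar) ** 2 for x in ratings)
    if sxx == 0:
        return None
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(ratings, scores))
    return y_bar + (sxy / sxx) * (borderline_grade - x_bar)


def cutoff_standard_error(cohort, n, replicates=500, seed=0):
    """Bootstrap standard error of the BRM cut-off for sub-cohorts of size n."""
    rng = random.Random(seed)
    cutoffs = []
    for _ in range(replicates):
        sample = [rng.choice(cohort) for _ in range(n)]
        cutoff = brm_cutoff([r for r, _ in sample], [s for _, s in sample])
        if cutoff is not None:
            cutoffs.append(cutoff)
    return stdev(cutoffs)


def synthetic_cohort(size, seed=1):
    """Hypothetical (global rating, checklist score) pairs for one station."""
    rng = random.Random(seed)
    cohort = []
    for _ in range(size):
        rating = rng.randint(1, 5)
        score = 30 + 12 * rating + rng.gauss(0, 8)
        cohort.append((rating, score))
    return cohort


cohort = synthetic_cohort(300)
for n in (300, 100, 50, 15):
    print(f"n={n}: standard error of cut-off = {cutoff_standard_error(cohort, n):.2f}")
```

The printed standard errors should grow as n falls, mirroring the pattern the abstract reports. The magnitudes depend entirely on the synthetic data and carry no empirical meaning.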
Estimating the Reproducibility of OSCE Scores When Exams Involve Multiple Circuits
David Swanson (National Board of Medical Examiners, International Programs, 3750 Market Street,
Philadelphia 19104, United States)
Kate Johnson (St Georges, University of London, London, United Kingdom)
David Oliveira (St Georges, University of London, London, United Kingdom)
Kevin Haynes (St Georges, University of London, London, United Kingdom)
Katharine Boursicot (St Georges, University of London, London, United Kingdom)
Background: Schools commonly administer full-class OSCEs using multiple circuits at different sites and times. Estimation of score reproducibility is difficult because circuits can differ in difficulty even when the "same" stations are used, since different markers and standardized patients are involved.
Summary of work: A new generalizability-theory-based method was developed to examine score reproducibility on 15-station end-of-year OSCEs taken by 276 students in the SGUL MBBS course in 2010 and 302 students in 2011. Stations used checklist-based scoring in 2010 and ratings-based scoring in 2011. Ratings-based scores were found to be less reproducible because of greater variation in the stringency of examiners marking the same station in different circuits. Summary of results: Rather than running a persons-by-stations ANOVA ignoring circuits, the new method involves running a persons-by-examiners-nested-in-circuits ANOVA, then adding stations to the design to control for overall differences in station difficulty, and working with the two sets of resulting variance components to separate variation due to overall station difficulty from circuit-specific variation in examiner stringency. From a practical standpoint, a persons-by-stations ANOVA ignoring circuits produced an estimated generalizability for ratings-based scores for short
stations that was several hundredths larger than for checklist-based scores; the reverse was true when the new method was applied.
Conclusions: In designing and evaluating checklist-based and ratings-based scoring methods for OSCE stations, variation in examiner stringency across circuits should be taken into account in analyses of score reproducibility.
Take-home messages: Confounding of examinee ability and circuit difficulty in multi-circuit OSCEs should not be ignored in analyzing reproducibility of scores.
A clarification study of internal scales clinicians use to assess undergraduate medical students
Catherine Hyde (Keele University, School of Medicine, Keele, United Kingdom)
Janet Lefroy (Keele University, School of Medicine, Keele, United Kingdom)
Simon Gay (Keele University, School of Medicine, Keele, United Kingdom)
Sarah Yardley (Keele University, School of Medicine, Keele, United Kingdom)
Robert McKinley (Keele University School of Medicine, Academic General Practice, David Weatherall Building, Keele ST5 5BG, United Kingdom)
Background: Clinicians hold internal constructs which they use to make often intuitive judgements about learners and colleagues. Grading scales which align with these internal scales may be more reliable. This study aims to understand the constructs clinicians use to make judgements of undergraduate medical students' consultation skills and whether we can develop construct-aligned scales for their assessment. Summary of work: JL and CH conducted semi-structured face-to-face interviews with 15 clinicians who had a minimum of 2 years' experience in OSCE examinations of undergraduates at one English medical school and who were also actively practising and teaching. Interviews were audio-recorded. During interviews clinicians were asked to draw assessment scales for assessment domains and populate them with words and phrases which described the range of student performance. The audio-recording and scales from each interview were analysed by the interviewer and a researcher using framework analysis informed by realist theory. Emerging scales for each construct were reviewed in round-table meetings and fed back to subsequent participants. The finalised scales will be reviewed in a focus group with clinicians who participated. Summary of results: Details of the results will be presented. Preliminary results suggest that clinician assessors hold internal scales which they can use to describe meaningful scales, though individual assessors weigh the importance of particular scales differently. Take-home messages: This work suggests that designing assessment scales more aligned to the internal scales clinicians use to assess undergraduate medical students is feasible. Further work is needed to investigate the reliability and generalisability of the scales.
Simplified Scoring for the Medical Council of Canada's Part II (MCCQEII) Examination: Does expert weighting make a difference?
Andrea Gotzmann (Medical Council of Canada, Research and Development, 2283 St. Laurent Blvd., Ottawa K1G 5A2, Canada)
Debra (Dallie) Sandilands (University of British Columbia, Vancouver, Canada)
Bruno Zumbo (University of British Columbia, Vancouver, Canada)
Andre De Champlain (Medical Council of Canada, Research and Development, Ottawa, Canada) Marguerite Roy (Medical Council of Canada, Research and Development, Ottawa, Canada)
Background: Current scoring schemes for the MCCQEII OSCE often include expert weighting of items, rating scales and even stations. The assumption is that these weights yield a more valid measure of clinical competency. However, there is relatively little empirical evidence that supports the assumption that complex weighting schemes impact score and decision reliability. The purpose of this research was to assess whether such weighting improved classification decisions on the MCCQEII, required as part of medical licensure for Canadian medical graduates.
Summary of work: Four scoring models were applied to three past administrations of the MCCQEII: (1) Complex/Component (item and component weights); (2) Complex/Station (item weights, no component weights); (3) Simple/Component (no item weights, component weights); and (4) Simple/Station (no weights). Reliability estimates, pass/fail rates, and classification decisions were compared across the four scoring models. Summary of results: Score reliability values (Cronbach's alpha) ranged from 0.74 to 0.78, with a slight increase noted for the simplest scoring model. Pass/fail rates varied slightly across the scoring models, but these differences were quite small in magnitude (less than 3%). Classification decisions were very consistent across scoring models (accuracy 0.87 to 0.99; consistency 0.82 to 0.98), which suggests that 82% to 99% of the decisions were accurately or consistently applied in the pass/fail categories, regardless of whether or not complex weighting was implemented. Conclusions: Results indicate that using a simplified scoring model (no weights) yielded reliability estimates (both for scores and, more importantly, decisions) that were virtually identical to those obtained with more complex weighting schemes.
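The comparison of weighted and unweighted scoring can be illustrated with a small Cronbach's alpha calculation. This sketch is not the MCC's scoring procedure: it treats each station score as one item, applies hypothetical station weights and compares alpha with and without them. The scores, weights and function names are invented for illustration.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a table of examinees (rows) by stations (columns)."""
    k = len(rows[0])
    item_variances = [variance([row[j] for row in rows]) for j in range(k)]
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


def apply_weights(rows, weights):
    """Multiply each station score by its weight."""
    return [[w * x for w, x in zip(weights, row)] for row in rows]


# Hypothetical station scores for six examinees on four stations.
scores = [
    [6, 7, 5, 6],
    [8, 8, 7, 9],
    [5, 4, 6, 5],
    [9, 8, 9, 8],
    [7, 6, 6, 7],
    [4, 5, 4, 3],
]
expert_weights = [1.0, 1.5, 0.5, 2.0]  # hypothetical expert weights

print("alpha, no weights:    ", round(cronbach_alpha(scores), 3))
print("alpha, expert weights:", round(cronbach_alpha(apply_weights(scores, expert_weights)), 3))
```

Running both lines on real administration data, rather than these invented scores, would show whether weighting changes the reliability estimate by a meaningful amount.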