Regular readers here know that when it comes to clinical exams for mental health licensure, I’m not a fan. A recent article of mine, published in the peer-reviewed Journal of Mental Health and Clinical Psychology, tackles a key component of the legal underpinning for these exams. As I explain, despite the claims of exam developers, clinical exams in mental health care do not appear to meet basic testing industry standards.
Those testing industry standards come from the Standards for Educational and Psychological Testing, jointly published by the American Educational Research Association (AERA), the American Psychological Association (APA), and the National Council on Measurement in Education (NCME). The most recent edition (2014) is open access and available for download here. Test developers routinely assert that their exams are consistent with these standards.
My review finds otherwise. The exams fall short in key areas, including construct clarity, validity, fairness, and statistical analysis. Some of the standards the exams fail to meet are ones the Standards themselves label “foundational” and “overarching,” underscoring how fundamental these problems are.
What knowledge is “correct”?
Two examples are worth noting here. First, industry standards demand that those about to take a test of knowledge be given “adequate information to help them properly prepare for a test[,] so that the test results accurately reflect their standing on the construct being assessed and lead to fair and adequate score interpretations” (p. 133). While exam developers offer broad outlines of the topic areas their tests may cover, none offers a comprehensive list of the references used in exam development. As a result, examinees have no way of knowing whose version of a treatment model is treated as the “correct” one.
California’s MFT Clinical Exam makes the problem plain: examinees aren’t even told which code of ethics they should use. Two organizations, the American Association for Marriage and Family Therapy (AAMFT) and the California Association of Marriage and Family Therapists (CAMFT), publish ethics codes for marriage and family therapists, and the two codes differ substantively in many places. Examinees may get questions wrong, and indeed may fail their exams, not because they lack knowledge, but because their knowledge doesn’t match the sources the item writers anchored their questions to.
Bias is found, but not remedied
Second, developers must conduct statistical analyses to determine whether their exams are fair across demographic groups. Industry standards call for this analysis at both the item level and the exam level, and they further demand that when scoring errors are identified, developers take steps to address them. Developers simply choose not to.

The Association of Social Work Boards (ASWB) has acknowledged that scored items are sometimes identified as biased and removed from future exams. But examinees whose tests previously included those biased items are never notified, and their scores are never corrected. ASWB also refuses to conduct exam-level analysis, and has offered shifting rationales for that refusal. Recently, researchers demonstrated that item-level bias too small to reach statistical significance on any single item can accumulate into a statistically significant amount of bias across the exam as a whole, which further demonstrates why bias must be assessed at both levels.
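To make that aggregation point concrete, here is a minimal simulation sketch. It is mine, not from my article or the cited research, and the group sizes, exam length, and per-item bias level are illustrative assumptions. It uses a simple chi-square test on each item as a stand-in for a formal differential item functioning (DIF) analysis, then compares total scores between groups:

```python
# Minimal sketch: small per-item bias that per-item tests mostly miss
# can still produce a highly significant gap in total exam scores.
# All parameters below are illustrative assumptions, not real exam data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

N_PER_GROUP = 500   # hypothetical examinees in each demographic group
N_ITEMS = 150       # hypothetical exam length
P_CORRECT = 0.70    # baseline probability of a correct response
ITEM_BIAS = 0.02    # small per-item disadvantage for the focal group

# Dichotomous (right/wrong) responses for a reference group and a focal
# group of equal underlying ability; the focal group's items are made
# slightly harder by the assumed bias.
reference = rng.binomial(1, P_CORRECT, size=(N_PER_GROUP, N_ITEMS))
focal = rng.binomial(1, P_CORRECT - ITEM_BIAS, size=(N_PER_GROUP, N_ITEMS))

# Item-level check: chi-square test on each item's 2x2 (group x correct)
# table. With bias this small, most items pass this test individually.
flagged = 0
for j in range(N_ITEMS):
    correct = np.array([reference[:, j].sum(), focal[:, j].sum()])
    table = np.column_stack([correct, N_PER_GROUP - correct])
    _, p, _, _ = stats.chi2_contingency(table)
    flagged += p < 0.05

# Exam-level check: compare total scores between the two groups.
_, p_total = stats.ttest_ind(reference.sum(axis=1), focal.sum(axis=1))

print(f"Items individually flagged as biased: {flagged} of {N_ITEMS}")
print(f"p-value for the exam-level score gap: {p_total:.2e}")
```

With these assumed numbers, only a small fraction of the 150 items gets flagged on its own, while the exam-level score gap is highly significant; the exact counts will vary with the assumptions, but the pattern is the point.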
Misplaced trust
My article is open access, so please read the whole thing. But the TL;DR is this: clinical exams in mental health care don’t appear to meet the testing industry’s minimum expectations. Unfortunately, licensing boards don’t ask the tough questions they should. They instead simply take developers at their word that the exams are built and evaluated properly. That trust, it appears, is not well placed.