Authors: Anastasios Bastounis, Anthea Sutton, Joanna Leaviss, Scott Weich, Sally Ohlsen and Andrew Booth
Key take-home messages
The McLean Screening Instrument for BPD (MSI-BPD) demonstrates good diagnostic accuracy with sensitivity of 0.81-0.82 and specificity of 0.72-0.85, making it an effective initial screening tool.
The Borderline Symptom List-23 (BSL-23) shows excellent internal consistency (α=0.78-0.97) and is particularly well-suited for tracking symptom severity and monitoring therapy progress over time.
Three instruments were identified as especially effective for monitoring symptom changes: BSL-23, Zanarini Rating Scale (ZAN-BPD), and Borderline Evaluation of Severity Over Time (BEST).
Self-report instruments provide advantages in BPD assessment including ease of administration and cost-effectiveness, though they should be used alongside clinician-administered assessments.
Accuracy in psychometric measurement of BPD includes not only diagnostic precision but also responsiveness to change over time, which is crucial for evaluating treatment effectiveness.
Introduction
Borderline Personality Disorder (BPD) is a complex mental health condition characterised by pervasive patterns of instability in interpersonal relationships, self-image, and emotions, often accompanied by marked impulsivity. According to the DSM-5, individuals with BPD may exhibit frantic efforts to avoid real or imagined abandonment; unstable and intense interpersonal relationships; identity disturbance; impulsivity in areas that are potentially self-damaging; recurrent suicidal behaviour; affective instability; chronic feelings of emptiness; inappropriate anger; and transient, stress-related paranoid ideation or severe dissociative symptoms (APA, 2013). The prevalence of BPD varies across studies and populations, with cumulative prevalence estimated at around 3% for adolescents and young adults (Johnson et al., 2008), rising considerably among adult populations (Ha et al., 2014) and populations with comorbid mental health conditions (Grant et al., 2008; Tomko et al., 2014).
Given the complexity and variability of BPD symptoms, accurate assessment and monitoring are crucial for effective treatment planning and outcome evaluation. This becomes even more apparent considering the ongoing debate as to whether BPD should be regarded as a separate entity among personality disorders (Tyrer et al., 2019), since BPD criteria can be viewed as reflections of general impairments in personality functioning rather than as components of a distinct personality disorder (Sharp & Wall, 2021; Sharp et al., 2015). Self-report instruments have become indispensable tools for the screening and tracking of symptom changes in individuals with BPD. While they should be used in conjunction with clinician-administered assessments and interviews for a definitive diagnosis, self-report instruments offer several advantages, including ease of administration and cost-effectiveness in research and clinical settings.
Aim & objectives
The aim of this rapid, scoping, mini-report is to provide a description of the self-report instruments used for screening and tracking symptom severity in individuals with Borderline Personality Disorder (BPD). The objectives of this mini-report are: (i) to provide a description of the most frequently used self-report screening tools and their psychometric properties, including accuracy, and (ii) to focus on tools specifically developed for tracking symptom changes and/or assessing therapy progress in individuals with BPD of varying severity.
This brief report initially focused on a cross-comparison between the McLean Screening Instrument for Borderline Personality Disorder (MSI-BPD) and the Borderline Personality Questionnaire (BPQ), with particular emphasis on their responsiveness. Subsequently, it was expanded to cover other instruments as applied to individuals with BPD or individuals with BPD traits. This mini-report offers a generic presentation of the most widely used tools for the screening of BPD across populations. The report also focuses on measures that are designed to serve both as initial screening instruments and as tools for tracking symptom changes and monitoring therapy progress. This report does not provide an exhaustive account of the instruments used for the diagnosis of BPD.
Methods
A high-precision search of MEDLINE (via Ovid) was conducted, along with hand-searching of reference lists from relevant papers in the field of BPD. No language or date restrictions were applied. Particular focus was given to screening instruments, as well as instruments used alongside clinical interviews for the diagnosis of BPD. Emphasis was also placed on studies published in peer-reviewed journals that reported the psychometric or diagnostic properties of these instruments. During the initial phase of the search, we prioritised the responsiveness of the instruments (i.e., their ability to track symptom changes over time). However, after running the search and reviewing the results, we adopted a more holistic approach, considering a broader range of psychometric and diagnostic properties. A rudimentary search strategy was developed in the MEDLINE database, combining the Medical Subject Heading (MeSH) 'Borderline Personality Disorder' with relevant free-text keywords (e.g., 'impulsivity', 'irritability', 'accuracy'), as well as highly specific keyword searches using the names of established screening and diagnostic instruments for BPD.
The full search strategy is provided in Appendix I. A single reviewer screened titles/abstracts and full texts and extracted data from the selected records. Data consisted of study ID, name of the instrument, number of items, type of administration, number and type of constructs targeted by each instrument, type of population, and psychometric properties (e.g., internal consistency, sensitivity, specificity). Metrics such as sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), area under the curve (AUC) of the ROC curve, and likelihood ratios are generally considered critical for evaluating the screening accuracy of a psychometric tool.
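As an illustration of how these accuracy metrics relate to one another, the following minimal Python sketch derives them from a 2x2 table of screening results against a reference-standard diagnosis. The class name, function name, and counts are hypothetical and are not taken from any included study; the sketch also assumes that every row and column of the table is non-empty.

```python
from dataclasses import dataclass


@dataclass
class ConfusionTable:
    """2x2 table of screening result versus reference-standard diagnosis."""
    tp: int  # screen positive, reference positive
    fp: int  # screen positive, reference negative
    fn: int  # screen negative, reference positive
    tn: int  # screen negative, reference negative


def screening_metrics(t: ConfusionTable) -> dict:
    """Return standard screening-accuracy metrics for one 2x2 table."""
    sensitivity = t.tp / (t.tp + t.fn)
    specificity = t.tn / (t.tn + t.fp)
    ppv = t.tp / (t.tp + t.fp)
    npv = t.tn / (t.tn + t.fn)
    # Likelihood ratios are undefined at the extremes; report infinity there.
    lr_pos = sensitivity / (1 - specificity) if specificity < 1 else float("inf")
    lr_neg = (1 - sensitivity) / specificity if specificity > 0 else float("inf")
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "lr_pos": lr_pos,
        "lr_neg": lr_neg,
    }


# Hypothetical counts chosen for illustration only.
example = ConfusionTable(tp=81, fp=15, fn=19, tn=85)
for name, value in screening_metrics(example).items():
    print(f"{name}: {value:.2f}")
```

Sensitivity and specificity depend only on the rows of the table defined by the reference standard, whereas PPV and NPV also depend on how many cases and non-cases the sample contains.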
Measuring responsiveness is challenging. At a minimum, it requires longitudinal studies and pre-defined hypotheses about how changes on the measures being tested (e.g., Tool X and Tool Y) relate to one another over time and/or about the magnitude of the anticipated change (e.g., a small effect size, d = -0.20, expected to be detectable within 10 days after therapy).
There are generally two approaches for measuring responsiveness:
● Criterion-based approach, when a gold standard is available. Because clinical interviews are often considered the gold standard for diagnosing BPD, correlations between the test scores of the tool being evaluated (e.g., BSL-23 or BEST) and a gold standard (e.g., a structured clinical interview) are commonly used to assess responsiveness. For dichotomous outcomes (e.g., BPD vs. BPD-free), the area under the ROC curve (AUC) is generally considered sufficient to assess responsiveness.
● Construct-based approach, when no gold standard is available. This relies on predefined hypotheses about how a measure should behave over time in the absence of a gold standard.
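The criterion-based approach for a dichotomous outcome can be sketched as follows. This is a minimal Python example, assuming that each participant has a change score on the tool under evaluation and a gold-standard classification (e.g., improved vs. not improved at clinical interview). The function name and all scores are hypothetical. The AUC is computed as the proportion of improved/not-improved pairs in which the improved participant has the larger change score, with ties counted as one half (the Mann-Whitney formulation of the AUC).

```python
def auc_from_change_scores(improved, not_improved):
    """AUC as the probability that a randomly chosen improved participant
    shows a larger change score than a randomly chosen non-improved one."""
    wins = 0.0
    for a in improved:
        for b in not_improved:
            if a > b:
                wins += 1.0
            elif a == b:
                wins += 0.5  # ties contribute half a win
    return wins / (len(improved) * len(not_improved))


# Hypothetical change scores (baseline minus follow-up; larger = more improvement),
# grouped by a gold-standard judgement of clinical improvement.
improved = [1.2, 0.9, 1.5, 0.4, 1.1]
not_improved = [0.1, 0.5, -0.2, 0.3]

print(f"AUC = {auc_from_change_scores(improved, not_improved):.2f}")
```

An AUC of 0.5 would indicate that change scores do not distinguish the two groups at all, while values approaching 1.0 indicate that the tool's change scores track the gold-standard classification closely.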
Given the overall descriptive purpose of the mini-report, quality assessment of included studies was not undertaken. The heatmap and the spider plots were created in R (version 4.3.1), run in RStudio, using the "ggplot2" and "fmsb" packages.
Table 1 provides a brief glossary and explanation of the reported psychometric properties (de Vet et al., 2011).
Results
Overall, data were extracted from 22 studies reporting psychometric properties and/or diagnostic accuracy metrics of five screening tools (Appendix II). Based on our assessment, these are the most widely implemented and extensively researched tools in BPD populations. Most included studies were diagnostic validation studies, and in most studies, clinical or subclinical populations were included (Appendix II & III). Based on the included studies, the most frequently reported psychometric properties across instruments were internal consistency, test-retest reliability, and convergent validity (Figure 1). The MSI-BPD was the most consistently evaluated instrument for screening, assessed in twelve studies, followed by the BSL-23 screening instrument, which was assessed in four studies (Appendix II & III).
Figure 1 shows a heatmap titled "Psychometric properties by instrument" that visualises the psychometric qualities of different BPD assessment tools. A heatmap is a data visualisation technique that uses colour intensity to represent numerical values in a matrix format. In this case, the colours range from dark purple (lowest values around 0.5) through blue, green, and yellow (highest values around 0.9), as indicated by the colour scale on the right side.
Higher sensitivity values indicate that the psychometric tool is better at correctly identifying true positives (e.g., individuals with a disorder). Higher specificity values indicate that the psychometric tool is better at correctly identifying true negatives (e.g., individuals without a disorder). A higher positive predictive value (PPV) means that a positive test result is more likely to indicate a true positive, reflecting the tool's accuracy in a given population, while a higher negative predictive value (NPV) indicates that a negative test result is more likely to be a true negative, enhancing confidence in ruling out a condition. A higher AUC (Area Under the Curve) in a Receiver Operating Characteristic (ROC) curve reflects better overall performance of the psychometric tool, balancing sensitivity and specificity and indicating a higher discriminative ability. Higher internal consistency values reflect greater consistency among the items in a psychometric tool, suggesting that they measure the same underlying construct, while higher test-retest reliability values indicate that the tool produces stable and consistent results over repeated administrations, reflecting temporal stability. Higher convergent validity values suggest that the tool is strongly correlated with other measures that assess the same or closely related constructs, confirming construct validity. Higher discriminant validity values indicate that the tool is not strongly correlated with measures of different, unrelated constructs, supporting the uniqueness of the construct being measured.
This particular heatmap compares the five most commonly used BPD assessment instruments (ZAN-BPD, MSI-BPD, BSL-23, BPQ, and BEST) across various psychometric metrics such as sensitivity, specificity, positive/negative predictive values, AUC (Area Under Curve), responsiveness, test-retest reliability, convergent validity, and discriminant validity.
The visualisation allows the reader to quickly identify that:
● Some instruments have more complete psychometric evaluation than others (more coloured cells)
● BSL-23 shows strong performance across multiple metrics (predominantly green cells)
● MSI-BPD demonstrates good overall psychometric properties with a mix of medium to high values
● Each instrument has different strengths in terms of specific psychometric properties
● Some metrics have not been evaluated for certain instruments (grey cells)
This heatmap provides a comprehensive visual comparison of the measurement quality across these different assessment tools.
Figure 1. Heatmap of the psychometric properties by screening instrument.
Among those instruments that have been more thoroughly validated, the MSI-BPD demonstrates acceptable internal consistency, with Cronbach's alpha values ranging from 0.74 to 0.77 across samples (Figure 2). Test-retest reliability is reported at 0.72, while sensitivity ranged from 0.81 to 0.82 and specificity from 0.72 to 0.85 (Appendix II & III). This profile suggests that the tool is useful both in identifying true cases and excluding non-cases.
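How useful a given sensitivity and specificity are in practice depends on the prevalence of BPD in the population being screened. The minimal Python sketch below applies Bayes' theorem to show this dependence. It pairs a sensitivity of 0.81 with a specificity of 0.85, both taken from within the ranges reported above, purely for illustration; the pairing and the prevalence values are assumptions, not results from any single included study.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at a given prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Illustrative prevalence values (assumed): a low-prevalence community setting,
# a mixed outpatient setting, and a balanced case-control validation sample.
for prevalence in (0.03, 0.20, 0.50):
    ppv, npv = predictive_values(0.81, 0.85, prevalence)
    print(f"prevalence={prevalence:.2f}  PPV={ppv:.2f}  NPV={npv:.2f}")
```

Under these assumptions, the same instrument yields a much lower PPV in a low-prevalence setting than in a balanced validation sample, which is why the ratio of cases to non-cases in a study sample matters when predictive values are compared across studies.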
The BSL-23 shows excellent internal consistency, with Cronbach's alpha values ranging from 0.78 to 0.97, indicating strong reliability across diverse linguistic and cultural adaptations (Figure 2). Test-retest reliability is similarly strong, ranging from 0.61 to 0.93 (Appendix II & III). Convergent validity ranged from 0.55 to 0.78, suggesting moderate to strong correlation with related constructs. In terms of diagnostic accuracy, the BSL-23 shows a sensitivity of 0.76, a specificity of 0.83, and an area under the curve (AUC) of 0.87. These values suggest that while the BSL-23 is not primarily a diagnostic tool, it demonstrates strong potential for discriminating between clinical and non-clinical populations.
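For readers less familiar with Cronbach's alpha, the following minimal Python sketch shows how the coefficient is computed from item-level responses, using the standard formula based on the number of items, the sum of the item variances, and the variance of the total score. The function name and the response matrix are hypothetical (four items rated 0 to 4 by six respondents) and do not represent data from the MSI-BPD, the BSL-23, or any included study.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of item scores.

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total score))
    """
    k = len(rows[0])                      # number of items
    items = list(zip(*rows))              # transpose: one tuple per item
    sum_item_var = sum(variance(col) for col in items)
    total_var = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum_item_var / total_var)


# Hypothetical responses: six respondents, four Likert items scored 0-4.
responses = [
    [0, 1, 0, 1],
    [1, 1, 2, 1],
    [2, 3, 2, 2],
    [3, 3, 4, 3],
    [4, 3, 4, 4],
    [1, 2, 1, 0],
]

print(f"Cronbach's alpha = {cronbach_alpha(responses):.2f}")
```

Alpha increases when items covary strongly relative to their individual variances, and it also tends to increase with the number of items, which is worth bearing in mind when comparing instruments of different lengths.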
Figure 2. Spider plots with the psychometric profiles of the MSI-BPD and BSL-23. The plots illustrate the performance of each instrument across key psychometric properties, including sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), area under the curve (AUC), internal consistency, test-retest reliability, and convergent validity. Each axis represents a different psychometric property, with higher values indicating stronger performance.
Based on the retrieved records, three instruments were identified as particularly well-suited for tracking symptom severity and monitoring therapy progress over time. These instruments are:
● Borderline Symptom List-23 (BSL-23) (i.e., the condensed version of the BSL-95),
● Zanarini Rating Scale for Borderline Personality Disorder (ZAN-BPD),
● Borderline Evaluation of Severity Over Time (BEST).
Both the BSL-23 and BEST are self-report instruments, whereas the ZAN-BPD is a clinician-administered tool. The ZAN-BPD has nine items, the BEST has 15, and the BSL-23 has 23 (Appendix II & III). All three instruments utilise Likert scales and are designed to assess and detect symptom changes in individuals with BPD.
According to the authors of the validation studies, the BSL-23 was satisfactorily sensitive to change and able to track symptom changes one to three months post-therapy, based on paired comparisons (paired t-tests) of before-and-after treatment scores (Bohus et al., 2009; Nicastro et al., 2016; Soler et al., 2013). In these studies, sensitivity to change over time was assessed using effect sizes and p-values to compare pre- and post-treatment scores (Bohus et al., 2009; Nicastro et al., 2016), as well as correlation coefficients between BSL-23 scores and clinician-administered scales (Soler et al., 2013).
According to the authors, the BEST was found to satisfactorily measure sensitivity to clinical change and track symptom changes at post-treatment (five months) (Pfohl et al., 2009). The tool's ability to detect change was assessed by modeling mean assessment scores and evaluating effect sizes, measures of score variability, and p-values (Pfohl et al., 2009).
According to the authors, the ZAN-BPD was found to satisfactorily track symptom changes and measure sensitivity to change ten days post-assessment (Zanarini, 2003). The tool's ability to detect change was assessed by re-interviewing a sample of patients after a week and assessing the correlation coefficients of the differences in scores. A cross-comparison table of the qualitative properties of these instruments is presented in Appendix III.
It is important to note a potential issue with the selected studies evaluating the BEST and BSL-23. The authors of these studies primarily relied on correlations, effect sizes, and p-values of change scores without directly comparing these changes to a clinical interview, which is the gold standard. As noted in de Vet et al. (2011), the use of effect sizes and paired t-tests without a direct comparison to a gold standard is generally considered an inappropriate method for assessing responsiveness.
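To make the statistics under discussion concrete, the minimal Python sketch below computes the mean change, the standardised response mean (mean change divided by the standard deviation of the change scores), and the paired t statistic from pre- and post-treatment scores. The function name and all scores are hypothetical. Note that every quantity is derived from the instrument's own scores: nothing in the calculation refers to a gold standard, which is the limitation raised above.

```python
from math import sqrt
from statistics import mean, stdev


def paired_change_summary(pre, post):
    """Summarise pre/post change using only the instrument's own scores."""
    change = [after - before for before, after in zip(pre, post)]
    n = len(change)
    mean_change = mean(change)
    sd_change = stdev(change)               # sample SD of the change scores
    srm = mean_change / sd_change           # standardised response mean
    t_stat = mean_change / (sd_change / sqrt(n))  # paired t statistic, df = n - 1
    return {"mean_change": mean_change, "srm": srm, "t": t_stat, "df": n - 1}


# Hypothetical mean item scores before and after treatment (lower = fewer symptoms).
pre = [2.8, 3.1, 2.5, 3.4, 2.9, 3.0]
post = [2.1, 2.9, 1.9, 2.6, 2.7, 2.2]

summary = paired_change_summary(pre, post)
print(f"mean change = {summary['mean_change']:.2f}")
print(f"SRM = {summary['srm']:.2f}, t({summary['df']}) = {summary['t']:.2f}")
```

A large standardised response mean shows that scores moved, but not whether the movement corresponds to clinically verified change; establishing that requires comparison with an external criterion or testing of predefined hypotheses.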
Regarding the ZAN-BPD, this is a clinician-administered tool that we also highlighted as appropriate for tracking responsiveness. However, its responsiveness has only been evaluated within a short window (7-10 days), which might limit its utility for capturing longer-term changes in symptoms.
Discussion
This mini-report provides an overview of the most commonly used self-report instruments for screening and tracking symptom severity in individuals with borderline personality disorder (BPD). Its main strength is that it brings together the psychometric and diagnostic properties of these screening tools using a highly precise search strategy.
However, this mini-report inevitably has some limitations. First, it does not provide a comprehensive assessment of all potential BPD instruments, focussing mainly on a subset of well-studied tools. Second, it lacks a formal quality assessment of the included studies. Third, the focus on self-reports only, and the exclusion of clinician-administered tools, may limit the generalisability of the findings. Fourth, this mini-report does not address cultural or linguistic variations that may affect the cross-cultural applicability of these instruments. Fifth, this mini-report, in its current state, deviates from its published protocol in terms of eligibility criteria and methods underpinning the assessment and synthesis of data.
Conclusion
The BSL-23 is well-suited for monitoring BPD symptom severity over time, making it ideal for longitudinal studies and treatment evaluation (e.g., tracking therapy progress). In contrast, the MSI-BPD serves as an efficient initial screening tool to identify individuals who may require a more comprehensive evaluation for BPD. The choice between these instruments should be guided by the specific objectives of the assessment, the duration of follow-up, and the characteristics of the sample (e.g., ratio of healthy controls to individuals with subclinical or clinical symptoms), as well as by whether in-depth symptom tracking or preliminary screening is warranted.
References
Andre, J. A., Verschuere, B., & Lobbestael, J. (2015). Diagnostic value of the Dutch version of the McLean Screening Instrument for BPD (MSI-BPD). Journal of Personality Disorders, 29(1), 71-78.
American Psychiatric Association. (2013). Diagnostic and statistical manual of mental disorders (5th ed.). https://doi.org/10.1176/appi.books.9780890425596.
Azizi, M. R., Mohammadsadeghi, H., Alavi, K., Rasoulian, M., Karimzad, N., & Eftekhar Ardebili, M. (2019). Validity and reliability of Persian translation of the Borderline Evaluation of Severity over Time (BEST) questionnaire. Medical Journal of the Islamic Republic of Iran, 33, 133. https://doi.org/10.34171/mjiri.33.133
Bohus, M., Kleindienst, N., Limberger, M. F., Stieglitz, R. D., Domsalla, M., Chapman, A. L., Steil, R., Philipsen, A., & Wolf, M. (2009). The short version of the Borderline Symptom List (BSL-23): development and initial data on psychometric properties. Psychopathology, 42(1), 32-39. https://doi.org/10.1159/000173701
Chanen, A. M., Jovev, M., Djaja, D., McDougall, E., Yuen, H. P., Rawlings, D., & Jackson, H. J. (2008). Screening for borderline personality disorder in outpatient youth. Journal of Personality Disorders, 22(4), 353-364.
de Vet, H. C. W., Terwee, C. B., Mokkink, L. B., & Knol, D. L. (2011). Measurement in medicine: A practical guide. Cambridge University Press.
Grant, B. F., Chou, P. S., Goldstein, R. B., Huang, B., Stinson, F. S., Saha, T. D., Smith, S. M., Dawson, D. A., Pulay, A. J., Pickering, R. P., Ruan, W. J., & Kaplan, K. (2008). Prevalence, correlates, disability, and comorbidity of DSM-IV borderline personality disorder: Results from the Wave 2 National Epidemiologic Survey on Alcohol and Related Conditions. Journal of Clinical Psychiatry, 69(4), 533–545.
Ha, C., Balderas, J. C., Zanarini, M. C., Oldham, J. M., & Sharp, C. (2014). Psychiatric comorbidity in hospitalized adolescents with borderline personality disorder. Journal of Clinical Psychiatry, 75(5), e457–e464.
Johnson, J. G., Cohen, P., Kasen, S., Skodol, A. E., Hamagami, F., & Brook, J. S. (2008). Cumulative prevalence of personality disorders between adolescence and adulthood. Acta Psychiatrica Scandinavica, 118(5), 410–413.
Keng, S.-L., Lee, Y., Drabu, S., Hong, R. Y., Chee, C. Y. I., Ho, C. S. H., & Ho, R. C. M. (2019). Construct validity of the McLean Screening Instrument for Borderline Personality Disorder in two Singaporean samples. Journal of Personality Disorders, 33(4), 450–469.
Kröger, C., Huget, F., & Roepke, S. (2011). Diagnostische Effizienz des McLean Screening Instrument für Borderline-Persönlichkeitsstörung in einer Stichprobe, die eine stationäre, störungsspezifische Behandlung in Anspruch nehmen möchte. Psychotherapie, Psychosomatik, Medizinische Psychologie, 61(11), 481–486.
Kröger, C., Vonau, M., Kliem, S., & Kosfelder, J. (2011). Emotion dysregulation as a core feature of borderline personality disorder: comparison of the discriminatory ability of two self-rating measures. Psychopathology, 44(4), 253-260. https://doi.org/10.1159/000322806
Melartin, T., Hakkinen, M., Koivisto, M., Suominen, K., & Isometsa, E. T. (2009). Screening of psychiatric outpatients for borderline personality disorder with the McLean Screening Instrument for Borderline Personality Disorder (MSI-BPD). Nordic Journal of Psychiatry, 63(6), 475-479. https://doi.org/10.3109/08039480903062968
Munawar, K., Aqeel, M., Rehna, T., Shuja, K. H., Bakrin, F. S., & Choudhry, F. R. (2021). Validity and reliability of the Urdu version of the McLean Screening Instrument for Borderline Personality Disorder. Frontiers in Psychology, 12, 533526. https://doi.org/10.3389/fpsyg.2021.533526
Nicastro, R., Prada, P., Kung, A. L., Salamin, V., Dayer, A., Aubry, J. M., Guenot, F., & Perroud, N. (2016). Psychometric properties of the French borderline symptom list, short form (BSL-23). Borderline Personality Disorder and Emotion Dysregulation, 3, 4. https://doi.org/10.1186/s40479-016-0038-0
Noblin, J. L., Venta, A., & Sharp, C. (2014). The validity of the MSI-BPD among inpatient adolescents. Assessment, 21(2), 210-217. https://doi.org/10.1177/1073191112473177
Patel, A. B., Sharp, C., & Fonagy, P. (2011). Criterion validity of the MSI-BPD in a community sample of women. Journal of Psychopathology and Behavioral Assessment, 33(3), 403-408. https://doi.org/10.1007/s10862-011-9238-5
Pfohl, B., Blum, N., St John, D., McCormick, B., Allen, J., & Black, D. W. (2009). Reliability and validity of the Borderline Evaluation of Severity Over Time (BEST): a self-rated scale to measure severity and change in persons with borderline personality disorder. Journal of Personality Disorders, 23(3), 281-293.
Poreh, A. M., Rawlings, D., Claridge, G., Freeman, J. L., Faulkner, C., & Shelton, C. (2006). The BPQ: A scale for the assessment of borderline personality based on DSM-IV criteria. Journal of Personality Disorders, 20(3), 247-260.
Sharp, C., & Wall, K. (2021). DSM-5 level of personality functioning: Refocusing personality disorder on what it means to be human. Annual Review of Clinical Psychology, 17, 313–337.
Sharp, C., Wright, A. G. C., Fowler, J. C., Frueh, B. C., Allen, J. G., Oldham, J., & Clark, L. A. (2015). The structure of personality pathology: Both general (“g”) and specific (“s”) factors? Journal of Abnormal Psychology, 124(2), 387–398.
Shen, J. E., Huang, Y. H., Huang, H. C., Liu, H. C., Lee, T. H., Sun, F. J., Huang, C. R., & Liu, S. I. (2023). Psychometric properties of the Chinese Mandarin version of the Borderline Symptom List, short form (BSL-23) in suicidal adolescents. Borderline Personality Disorder and Emotion Dysregulation, 10(1), 23.
Soler, J., Domínguez-Clavé, E., García-Rizo, C., Vega, D., Elices, M., Martín-Blanco, A., & Pascual, J. C. (2016). Validation of the Spanish version of the McLean Screening Instrument for Borderline Personality Disorder. Revista de Psiquiatría y Salud Mental, 9(4), 195–202.
Soler, J., Vega, D., Feliu-Soler, A., Trujols, J., Soto, A., Elices, M., Ortiz, C., Pérez, V., Bohus, M., & Pascual, J. C. (2013). Validation of the Spanish version of the borderline symptom list, short form (BSL-23). BMC Psychiatry, 13.
Tomko, R. L., Trull, T. J., Wood, P. K., & Sher, K. J. (2014). Characteristics of borderline personality disorder in a community sample: Comorbidity, treatment utilization, and general functioning. Journal of Personality Disorders, 28(5), 734–750.
Tyrer, P., Mulder, R., Kim, Y. R., & Crawford, M. J. (2019). The development of the ICD-11 classification of personality disorders: An amalgam of science, pragmatism, and politics. Annual Review of Clinical Psychology, 15, 481–502.
van Alebeek, A., van der Heijden, P. T., Hessels, C., Thong, M., & van Aken, M. A. G. (2017). Comparison of three questionnaires to screen for borderline personality disorder in adolescents and young adults. European Journal of Psychological Assessment, 33(2), 123–128.
Zanarini, M. C. (2003). Zanarini Rating Scale for Borderline Personality Disorder (ZAN-BPD): A continuous measure of DSM-IV borderline psychopathology. Journal of Personality Disorders, 17(3), 233-242.
Zanarini, M. C., Vujanovic, A. A., Parachini, E. A., Boulanger, J. L., Frankenburg, F. R., & Hennen, J. (2003). A screening measure for BPD: The McLean Screening Instrument for Borderline Personality Disorder (MSI-BPD). Journal of Personality Disorders, 17(6), 568-573.
Zanarini, M. C., Weingeroff, J. L., Frankenburg, F. R., & Fitzmaurice, G. M. (2015). Development of the self-report version of the Zanarini Rating Scale for Borderline Personality Disorder. Personality and Mental Health, 9(4), 243-249.
Zimmerman, M., & Balling, C. (2019). Screening for borderline personality disorder with the McLean Screening Instrument: A review and critique of the literature. Journal of Personality Disorders, 33.
Table 2. Full search strategy implemented in MEDLINE (Ovid)
Table 3.
Table 4
Table 5. Strengths and limitations of selected self-report instruments for Borderline Personality Disorder (BPD).