Discussion
This is the first study to examine the validity and reliability of the SCI, or of any insomnia screening tool, in a sample of English-speaking stroke survivors. Results from 180 participants indicated that 41.7% met DSM-5 criteria for insomnia disorder, and 60.0% exhibited insomnia symptoms (including those classified as insomnia disorder), consistent with meta-analytic findings on post-stroke insomnia prevalence (2,3), and supporting the representativeness of the sample. The remaining 40% had no insomnia symptoms.
The SCI demonstrated ‘excellent’ diagnostic accuracy when screening for DSM-5 insomnia disorder (AUC=0.86) and/or symptoms (AUC=0.85) post-stroke. Moreover, examination of internal consistency indicated acceptable reliability (Cronbach’s α=0.84). The SCI-2 demonstrated ‘acceptable’ accuracy, with an optimal cut-off of ≤2, paralleling validations in the general population.19 However, the full SCI demonstrated significantly higher accuracy and should be the preferred assessment method where possible.
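As a reading aid for the AUC values above, the following minimal sketch shows one common interpretation of the AUC: the probability that a randomly chosen participant with insomnia scores lower on the index test than a randomly chosen participant without insomnia, with ties counted as half. Lower scores are treated as screen-positive because the cut-offs in this paper take the form ≤ threshold. The scores and the function name are hypothetical, invented purely for illustration, and are not data or methods from the study.

```python
from itertools import product


def auc_lower_is_positive(case_scores, non_case_scores):
    """AUC as the share of case/non-case pairs in which the case scores
    lower than the non-case (ties count as half a pair)."""
    wins = 0.0
    for case, non_case in product(case_scores, non_case_scores):
        if case < non_case:
            wins += 1.0
        elif case == non_case:
            wins += 0.5
    return wins / (len(case_scores) * len(non_case_scores))


# Hypothetical total scores, invented for illustration only.
insomnia = [4, 7, 9, 12, 15]
no_insomnia = [11, 16, 20, 24, 29]

auc = auc_lower_is_positive(insomnia, no_insomnia)
print(f"AUC = {auc:.2f}")  # 23 of 25 pairs are correctly ordered -> 0.92
```

An AUC of 0.5 would correspond to chance-level ordering of pairs, and 1.0 to perfect separation of the two groups.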
Estimates of diagnostic accuracy in the current study are similar to, although slightly lower than, those from recent validations in stroke populations of the SCI in Indonesian13 and the Insomnia Severity Index, Pittsburgh Sleep Quality Index and Athens Insomnia Index in Chinese.10 Nonetheless, comparisons should be drawn cautiously. Existing validations of SCI translations demonstrate heterogeneity in estimates of optimal cut-offs,15–18 suggesting possible confounds from translation and/or nuanced cultural differences. Additionally, a recent meta-analysis demonstrated that condition prevalence may influence the sensitivity and specificity of binary diagnostic tests.33 The prevalence of insomnia in the previous validations10 13 was 15.6% and 31.0%, respectively, both lower than in the current study (41.7%) and in previous estimates of post-stroke insomnia prevalence.2 3 Thus, the variance in prevalence should be considered when comparing studies of diagnostic accuracy, and when considering a screening tool for clinical or research use.
The optimal cut-off for detecting insomnia disorder using the SCI in the current study was found to be ≤13. This is lower than both the conventional cut-off for the general population of ≤16,12 14 and the optimal cut-off of ≤23 observed in the validation of the Indonesian translation of the SCI post-stroke.13 Were the conventional cut-off adopted in the current sample, 96% of positive insomnia disorder cases would be correctly identified. However, approximately 45.7% of participants without insomnia disorder would be incorrectly classified as a positive case (ie, having insomnia disorder). Depending on the intended use of the scale, there may be instances where clinicians or researchers wish to maximise sensitivity at the expense of specificity (or vice versa), but they should do so with caution. Use of an erroneously high threshold on the SCI will lead to an increased false-positive rate, potentially leading to inappropriate treatment and use of resources in clinical settings, and may cast doubt on the validity of findings in research.
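To make the consequence of the conventional cut-off concrete, the rounded percentages reported above can be converted into approximate participant counts. The sketch below is back-of-envelope arithmetic only: it assumes the rounded figures (180 participants, 41.7% prevalence, 96% sensitivity and a 45.7% false-positive rate at ≤16) can be multiplied through directly, so the resulting counts and the implied positive predictive value are approximations and are not values reported by the study.

```python
# Figures quoted in the text; all are rounded percentages.
n_total = 180
prevalence = 0.417               # met DSM-5 criteria for insomnia disorder
sensitivity_16 = 0.96            # at the conventional cut-off of <=16
false_positive_rate_16 = 0.457   # at the conventional cut-off of <=16

# Approximate group sizes implied by the prevalence.
n_disorder = round(n_total * prevalence)
n_no_disorder = n_total - n_disorder

# Approximate screen-positive counts at <=16.
true_positives = sensitivity_16 * n_disorder
false_positives = false_positive_rate_16 * n_no_disorder

# Share of screen-positive participants who actually have the disorder.
ppv = true_positives / (true_positives + false_positives)

print(f"With disorder: {n_disorder}, without: {n_no_disorder}")
print(f"True positives ~ {true_positives:.0f}, false positives ~ {false_positives:.0f}")
print(f"Implied positive predictive value ~ {ppv:.2f}")
```

Under these assumptions, roughly two in five screen-positive participants at ≤16 would not have insomnia disorder, which illustrates the cost of maximising sensitivity described above.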
Understanding the heterogeneity of optimal diagnostic thresholds is important for ensuring the SCI is appropriately used in different clinical samples, and future research should explore this further. One possibility is that poorer sleep post-stroke34 may necessitate a lower threshold. Mean SCI scores in the current study (x̄ = 13.74, SD=7.51) are slightly lower and display greater variance than results from a retrospective exploration of scores in 200 000 adults who completed the SCI (x̄ = 14.97, SD=5.93).14 However, the results of the latter14 may be biased towards people with sleep impairment, and the true population mean SCI score is likely higher. Variation in reference standards likely also contributes to variability in optimal cut-offs, and ideally studies seeking to validate measures of insomnia should select reference standards in accordance with guidelines on the assessment and diagnosis of insomnia disorder.9 In studies of diagnostic accuracy, the reference standard is assumed to be infallible. Yet the index test may not always be solely responsible for a discrepancy between the two.35 The conventional cut-off (≤16)12 14 was determined as the minimum score necessary for putative DSM-5 insomnia disorder, with convergent validity assessed against the Insomnia Severity Index (ISI). The ISI shows commendable accuracy but is not recommended as a standalone tool for diagnosing DSM-5 insomnia disorder9 and is less specific than clinical interviews. Thus, when comparing optimal diagnostic cut-offs between studies, one should be cognisant of the possibility that differences in reference standards explain some or all of the variation.
Spectrum bias, a phenomenon whereby accuracy of a diagnostic test may be confounded by symptom severity,36 may further explain discrepancies in optimal diagnostic thresholds between studies. Scores closer to the cut-off are more prone to misclassification; therefore, studies where index test scores demonstrate a bimodal distribution have a greater risk of spectrum bias. Hasan et al13 excluded 164 participants with sleep or psychiatric disorders other than insomnia in their validation. Many of these participants would likely have scored lower on the SCI regardless of insomnia classification, as suggested by the relationships between SCI score and measures of mental health in the current study and corroborated by meta-analytic results from Hertenstein et al.37 Likely as a result of their inclusion criteria, mean SCI scores in Hasan et al13 for participants without insomnia were 29.55 (SD=1.96), with little variance, compared with 19.33 (SD=6.81) in the current study, which is more akin to previous research.34 Such a skewed distribution of scores allows specificity to remain high at higher thresholds, contributing to potentially misleading estimates of accuracy and optimal cut-offs. Hasan and colleagues demonstrate that the Indonesian SCI can discriminate post-stroke insomnia disorder in the absence of potential confounds. The present study adds to this by validating the English SCI in self-reported stroke survivors and doing so while including participants with comorbid psychological or sleep disorders, thereby increasing generalisability of the findings. Results of the current study, when considered in the context of the existing literature, highlight the importance of considering spectrum bias in studies of diagnostic accuracy. When considering a screening tool, researchers and clinicians should consider whether a study’s selection criteria are clinically meaningful to the population they intend to use the test in.
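The argument that a narrow, high-scoring comparison group allows specificity to remain high at higher thresholds can be illustrated with a rough calculation. The sketch below takes the means and SDs quoted above for participants without insomnia and, as a strong simplifying assumption, treats scores in each group as normally distributed. SCI totals are bounded and discrete, so this approximation is crude and is not expected to reproduce the observed rates in either study; the function name is hypothetical and the output is illustrative only.

```python
from statistics import NormalDist


def approx_specificity(mean, sd, cutoff):
    """Approximate share of participants without insomnia who score above
    the cut-off (ie, screen negative), assuming normally distributed scores."""
    return 1.0 - NormalDist(mu=mean, sigma=sd).cdf(cutoff)


# Means and SDs quoted in the text for participants without insomnia.
hasan = {"mean": 29.55, "sd": 1.96}      # Hasan et al
current = {"mean": 19.33, "sd": 6.81}    # current study

# Cut-offs discussed in this paper: <=13, <=16 and <=23.
for cutoff in (13, 16, 23):
    spec_hasan = approx_specificity(hasan["mean"], hasan["sd"], cutoff)
    spec_current = approx_specificity(current["mean"], current["sd"], cutoff)
    print(f"cut-off <= {cutoff}: Hasan et al ~ {spec_hasan:.2f}, "
          f"current study ~ {spec_current:.2f}")
```

Under this approximation, specificity in the narrower distribution stays close to 1 even at ≤23, whereas in the wider distribution it falls steeply as the threshold rises, which is the pattern described in the paragraph above.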
This study has several strengths. A priori power analyses, a rarity in studies of diagnostic accuracy,38 ensured the study was suitably powered to obtain reliable estimates of accuracy. Consecutive completion of index and reference tests reduced the risk of temporal changes in disease state. Finally, stratifying analyses by insomnia classification reduced the risk of spectrum bias and prevalence influencing estimates of accuracy.33
Nonetheless, voluntary response sampling does carry a risk of response bias and may lead to over-representation of participants with an interest in sleep, or more severe sleep difficulties. However, prevalence in the current sample aligns with that of previous meta-analytic estimates.2 3 Recruiting participants via social media may have led to a younger than expected stroke sample (mean age=49.61 ± 12.41 years). Future research may wish to consult older stroke survivors to explore methods of increasing the accessibility of online stroke research to a wider age range. Nevertheless, stroke can affect people of any age, and research representative of younger stroke survivors is necessary, particularly to understand the challenges of improving life after stroke for working-age individuals, perhaps with young families, compared with older adults. Results of a linear regression from a large cross-sectional study (n=200 000) by Espie and colleagues14 did demonstrate that SCI score was predicted to decrease 0.057 points per year of life. Findings from the current study demonstrated no significant relationship between age, or gender, and total SCI score; however, it is likely that a much larger sample size would be needed to detect an effect as small as that described in Espie et al.14 Thus, although the modulating effect of demographic characteristics on the accuracy of the SCI was not explored, current findings provide no evidence to suggest that such an effect exists. Owing to the online nature of recruitment, participant self-reports of stroke were not verified by other means. Various efforts were made, however, to minimise the risk of individuals without a history of stroke volunteering to take part. First, recruitment adverts were shared explicitly by stroke researchers and third sector organisations supporting stroke survivors, targeting stroke peer support groups.
Moreover, offering no incentive to participants removed the risk of individuals taking part without a history of stroke for financial gain. Furthermore, while the results of a systematic review by Woodfield et al39 demonstrate that the sensitivity and positive predictive value of self-reported stroke are low in samples with low prevalence, specificity (96% to 99.1%) and negative predictive value (88.2% to 99.9%) were consistently high. As one would expect, the authors report that positive predictive values increased in line with prevalence. Thus, in samples with high prevalence of stroke, such as in the current study, the risk of false positives is likely to be small. We therefore believe self-reports of stroke in the present sample to be reliable. Nevertheless, where time and resources allow, future researchers may wish to verify self-reported stroke status by recruiting participants from stroke outpatient clinics or obtaining clinical records. Moreover, the number of incomplete responses (n=118) is worth noting. This may indicate systemic challenges related to conducting stroke research online, or difficulties faced by potential participants with language difficulties. Relatedly, the duration of the survey was relatively long, and future research may consider limiting survey length further to increase accessibility. Finally, participants were not asked which type of stroke they had. This was a methodological decision to avoid excluding participants who did not know or remember. Existing evidence does indicate potential relationships between insomnia and stroke type/location;40 therefore, future researchers may wish to consider purposive sampling methods that allow exploration of whether these phenomena may influence insomnia presentation and the accuracy of diagnostic measures.
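The reasoning above about self-reported stroke, that positive predictive value rises with prevalence when specificity is high, follows from Bayes' rule and can be sketched as below. The specificity of 0.96 is the lower end of the range quoted from Woodfield et al; the sensitivity of 0.60 and the prevalence values are assumptions chosen only for illustration, since neither is reported here, and the function name is hypothetical.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a binary test via Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_neg = (1.0 - sensitivity) * prevalence
    true_neg = specificity * (1.0 - prevalence)
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


SENSITIVITY = 0.60   # assumed for illustration; not a reported value
SPECIFICITY = 0.96   # lower end of the range quoted from Woodfield et al

# Hypothetical prevalences, from a general-population sample to a
# sample recruited almost entirely from stroke survivors.
for prevalence in (0.01, 0.10, 0.50, 0.90):
    ppv, npv = predictive_values(SENSITIVITY, SPECIFICITY, prevalence)
    print(f"prevalence {prevalence:.2f}: PPV ~ {ppv:.2f}, NPV ~ {npv:.2f}")
```

With specificity held fixed, the positive predictive value climbs steeply as prevalence increases, which is consistent with the expectation that false-positive self-reports are uncommon in a sample recruited through stroke support networks.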
In summary, this study confirms the validity and reliability of the English version of the SCI when detecting DSM-5 insomnia in self-reported stroke survivors and indicates that a lower threshold than is traditionally used should be considered when using the SCI in the stroke population. At the optimal diagnostic threshold of ≤13, the SCI demonstrates ‘excellent’ diagnostic accuracy. The brevity of the tool makes it attractive for clinical and research use, and the SCI-2 offers an acceptable shorter alternative. Nevertheless, the full SCI is more accurate than the SCI-2 and should be preferred where possible. The SCI should therefore be considered a valid and reliable screening tool for DSM-5 insomnia disorder and insomnia symptoms post-stroke.