Discussion
The American Heart Association/American Stroke Association (AHA/ASA) identified the relative lack of prehospital suspected stroke disease prevalence data as a major research limitation affecting LVO prediction scale assessments.20 We found that among 220 EMS suspected stroke patients who were prehospital CPSS positive and had LKW times within 6 hours, the most common final diagnosis category was stroke mimic (50%), followed by non-LVO ischaemic stroke (20.5%), intracranial haemorrhage (15.9%) and LVO stroke (13.6%). This distribution is similar to that of other studies that assessed prehospital suspected stroke patients without excluding significant subgroups. Taken together, prevalences in the literature have ranged from 25% to 68% for stroke mimics, 24% to 38% for non-LVO ischaemic strokes, 4% to 16% for intracranial haemorrhages and 4% to 15% for LVO strokes.10 11 24–26 Isolated M2 occlusions accounted for 33% of LVO strokes in our study and over 40% of occlusions in two larger studies with lower overall LVO prevalence than ours.10 11 Excluding M2 occlusions from our study would reduce the prevalence of LVO stroke to 9.1%. Knowledge of these estimated prevalences of LVO stroke among prehospital suspected stroke patients could help inform future prehospital triage modelling studies and triage test policy decisions when local data are not available.20 27
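The prevalence arithmetic above can be checked with a short sketch. The patient counts below are not reported in this section; they are back-calculated from the rounded percentages and the cohort size of 220, so they should be read as assumptions. The function name `prevalence_percent` is likewise illustrative.

```python
# Assumed counts, back-calculated from the reported percentages (n = 220).
TOTAL = 220
ASSUMED_COUNTS = {
    "stroke mimic": 110,
    "non-LVO ischaemic stroke": 45,
    "intracranial haemorrhage": 35,
    "LVO stroke": 30,
}
# Assumption: 33% of 30 LVO strokes corresponds to 10 isolated M2 occlusions.
ASSUMED_M2 = 10


def prevalence_percent(count, total=TOTAL):
    """Return prevalence as a percentage rounded to one decimal place."""
    return round(100 * count / total, 1)


if __name__ == "__main__":
    for diagnosis, count in ASSUMED_COUNTS.items():
        print(f"{diagnosis}: {prevalence_percent(count)}%")
    lvo_without_m2 = ASSUMED_COUNTS["LVO stroke"] - ASSUMED_M2
    print(f"LVO stroke excluding M2: {prevalence_percent(lvo_without_m2)}%")
```

Under these assumed counts, the four categories sum to the full cohort and removing the M2 occlusions from the LVO category reproduces the 9.1% figure quoted above.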
We sought to identify LVO prediction scale thresholds that met high PPV and NPV goals because predictive values are post-test probabilities of LVO stroke when the prevalence of disease in a study is similar to that of the target population. Predictive values are strongly influenced by disease prevalence but are also affected by variations in test sensitivity and specificity.6–8 Our prespecified goal of finding thresholds with PPVs ≥80% was too optimistic given the relatively low prevalence of LVO stroke and the diagnostic performance of LVO scales. Only FAST-ED ≥7 met this goal, but its sensitivity was so low (17%) that it missed 83% of LVO strokes. We subsequently lowered our PPV goal to ≥50% to find scale thresholds where a positive test result would mean that the patient was at least as likely to be suffering from LVO stroke as not. PASS =3, RACE ≥7 and FAST-ED ≥6 all met this goal, but none maintained a sensitivity ≥50%. Our prespecified analysis to find scale thresholds with NPVs ≥95% for LVO stroke identified multiple standard and alternative thresholds (table 2).
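The dependence of predictive values on prevalence follows from Bayes' theorem and can be sketched in a few lines. This is a minimal illustration, not the analysis code used in the study. The sensitivity and specificity passed in the example are illustrative inputs rather than results from our cohort, and the function names are hypothetical; only the 13.6% prevalence comes from the text above.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a binary test via Bayes' theorem."""
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


def specificity_needed(sensitivity, prevalence, target_ppv):
    """Return the minimum specificity that reaches target_ppv.

    Obtained by solving the PPV formula for specificity.
    """
    odds_factor = (1 - target_ppv) / target_ppv
    return 1 - sensitivity * prevalence * odds_factor / (1 - prevalence)


if __name__ == "__main__":
    prevalence = 0.136  # LVO stroke prevalence reported above
    # Illustrative inputs only (assumption, not a study result).
    ppv, npv = predictive_values(0.60, 0.85, prevalence)
    print(f"PPV = {ppv:.3f}, NPV = {npv:.3f}")
    # Specificity required for PPV >= 80% at an assumed 50% sensitivity.
    print(f"{specificity_needed(0.50, prevalence, 0.80):.3f}")
```

At a prevalence of 13.6%, a test with 50% sensitivity would need a specificity of roughly 98% to reach a PPV of 80%, which helps explain why that goal was met only at a threshold with very low sensitivity.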
With high alternative FAST-ED thresholds meeting our PPV goals, although with markedly low sensitivities, and the standard FAST-ED threshold ≥4 standing out as the only threshold to achieve >70% sensitivity and specificity simultaneously, it is tempting to conclude that FAST-ED might outperform other scales. However, these findings should be considered with caution for two reasons. First, Nguyen et al studied FAST-ED ≥4 in a larger prospective study in ambulances with EMS providers performing the examination and found that it performed similarly (60% sensitivity and 85% specificity) to C-STAT, PASS, G-FAST and RACE at their standard thresholds.10 Second, our results could be due to the FAST-ED training given to ED physicians as part of the Mission Protocol quality improvement initiative. The protocol had instructions for ED providers to use FAST-ED as part of their initial assessment and document it. However, this was not routinely done in practice and was very rarely documented. Training was not provided for other scales. The NIHSS subitem scores used in this study to score LVO prediction scales were obtained by neurologists without FAST-ED training; however, their examinations could have been influenced by discussions with ED providers. While this is a limitation of our study, it could also suggest that dedicated training might improve sensitivity and specificity in the ED.
Our results suggest that given the prevalence of LVO stroke among prehospital suspected stroke patients and the sensitivity and specificity tradeoffs of LVO prediction scales, no single scale threshold tested here will likely be able to guide prehospital stroke triage efficiently by itself. Attempts to reach PPVs near 50% by selecting higher than standard thresholds will result in many false-negative missed LVO strokes (figure 2, top right). While several standard thresholds and many lower alternative thresholds can reduce the probability of LVO stroke when tests are negative, this will be accompanied by many false positives. EMS systems that already use the CPSS as a binary initial stroke screen and choose to focus on avoiding missed LVOs could readily adopt CPSS ≥2 as an LVO prediction scale without training personnel to use a new scale. CIs in our study were wide, especially with thresholds that met high PPV goals, but our findings were predominantly in line with larger prospective prehospital validation studies assessing multiple scales simultaneously.10 11
Our primary analysis focused on a definition of LVO stroke that included ICA, M1, M2 and basilar occlusions. Our exploratory analysis that excluded M2 occlusions from the definition of LVO stroke (figures 2 and 3, top left) showed that standard thresholds may not miss many ICA, M1 or basilar occlusions (high sensitivity); however, the corresponding proportion of positive tests with these occlusions (PPV) will be low. It is reassuring that excluding M2s increases the sensitivities of some standard thresholds for ICA, M1 or basilar occlusions into the 80%–90% range, as more proximal occlusions are more reliably amenable to EVT. ICA and M1 occlusions are also more morbid and benefit from the strongest evidence base for EVT.28 29 Though more distal MCA occlusions have a higher recanalisation rate with thrombolysis, it is only in the 30% range.30 Registry study data support the benefit of EVT for M2 occlusions found in clinical practice; however, the 2019 AHA/ASA guidelines provide only a class IIb recommendation.29 Patients presenting with non-disabling symptoms or distal M2 occlusions may still represent situations with EVT equipoise. The Australian and New Zealand Clinical Guidelines for Stroke Management suggest EVT may be considered ‘based on individual patient and advanced imaging factors’.31 As noted above, M2s have comprised a significant proportion of LVOs in studies with prehospital suspected stroke patients (33% in our study and over 40% in two others), so the development of prehospital diagnostic tests that are sensitive for them while retaining specificity would be beneficial. Several device-based approaches, including some by the current authors, are under development to address the diagnostic limitations of LVO prediction scales, but no devices have been evaluated in large numbers of M2 occlusions or have completed prehospital validation studies.32–34
Alternatively, broadening our definition of successful triage to include not only M2 occlusions but also intracranial haemorrhage results in high PPVs but low sensitivities. This may be reassuring to decision makers in prehospital systems of care. Many of the ‘false positive’ non-LVO patients, when only ICA, M1 or basilar occlusions are counted as ‘true positives’, will be suffering from distal vessel occlusions or intracranial haemorrhage. These patients could also benefit from the frequent colocalisation of vascular and neurosurgical expertise at EVT centres. In health systems that can accommodate the added patient volume, there may be little downside to overtriage, especially where thrombolysis door-to-needle times are faster at EVT centres than at non-EVT centres. Though controversial, map-based modelling studies analysing the USA and Canada suggest that prehospital bypass of non-EVT centres may benefit even thrombolysis-eligible suspected LVO patients unless local non-EVT centres can provide door-to-needle times of 30 min or less.35
When examining the exclusion of M2s or addition of intracranial haemorrhages, it is important to note that changing the reference standard does not change patient-level test results and as a result does not affect patient-level triage decisions. For example, in each analysis, the number of patients with positive and negative CPSS =3 test results remains constant at 60 positives and 124 negatives (online supplemental tables 5–8). When the reference standard is changed from LVO stroke alone to LVO stroke and intracranial haemorrhage combined, the same patients still have positive and negative CPSS =3 test results, but the number of true positives, false negatives, false positives and true negatives changes. This in turn leads to different PPV, NPV, sensitivity and specificity results.
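The point that a new reference standard reclassifies outcomes without changing test results can be illustrated with a small sketch. Only the totals of 60 test positives and 124 test negatives come from the text above. The split of diagnoses within each group is entirely hypothetical and chosen for illustration, so the resulting accuracy figures are not study results; the actual cell counts are in the online supplemental tables.

```python
from collections import Counter

# HYPOTHETICAL cohort: (test_positive, final_diagnosis, patients).
# Only the row totals (60 positive, 124 negative) come from the text.
COHORT = [
    (True, "LVO", 15), (True, "ICH", 15), (True, "other", 30),
    (False, "LVO", 10), (False, "ICH", 9), (False, "other", 105),
]


def two_by_two(cohort, reference_positive):
    """Tabulate TP, FP, FN and TN for a given set of target diagnoses."""
    cells = Counter()
    for test_positive, diagnosis, patients in cohort:
        has_target = diagnosis in reference_positive
        # Test result matches disease status: "T"; otherwise "F".
        key = "T" if test_positive == has_target else "F"
        # Second letter records the test result itself: "P" or "N".
        key += "P" if test_positive else "N"
        cells[key] += patients
    return cells


def accuracy(cells):
    """Return PPV, NPV, sensitivity and specificity from the four cells."""
    return {
        "ppv": cells["TP"] / (cells["TP"] + cells["FP"]),
        "npv": cells["TN"] / (cells["TN"] + cells["FN"]),
        "sensitivity": cells["TP"] / (cells["TP"] + cells["FN"]),
        "specificity": cells["TN"] / (cells["TN"] + cells["FP"]),
    }


if __name__ == "__main__":
    for reference in ({"LVO"}, {"LVO", "ICH"}):
        cells = two_by_two(COHORT, reference)
        print(sorted(reference), dict(cells), accuracy(cells))
```

In both tabulations the number of positive tests (TP + FP) stays at 60 and the number of negative tests (TN + FN) stays at 124; only the allocation of patients between the true and false cells changes.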
There are important limitations to our study. We only included EMS suspected stroke CPSS positive patients with LKW times of 6 hours or less who were brought to ZSFG over the course of 1 year. The prevalence of disease may vary by region, and it is not clear whether scale performance would differ in the 6–24-hour stroke time window. At the time of our study, San Francisco was not using a prehospital triage system to divert patients with possible LVO stroke to EVT centres. However, it is possible that the Mission Protocol quality improvement effort at ZSFG, which included EMS outreach and education programmes, led to unmeasured changes in EMS routing patterns for suspected stroke patients that could bias the prevalence estimates described here.
The diagnostic performance portion of our study also excluded patients without NIHSS scores documented on ED arrival. This led to a disproportionate exclusion of patients with intracranial haemorrhage. This should be considered when interpreting our primary LVO stroke analysis and our exploratory analyses that included variations of LVO stroke and intracranial haemorrhage as combined reference standards.
Most importantly, our LVO prediction scales were retrospectively calculated from prospectively recorded NIHSS scores performed by neurologists in the ED rather than prehospital EMS providers prior to transport. While this allowed us to compare many scales at various thresholds, it does not represent the performance of scales as used by EMS providers in the prehospital setting. In addition, neurological examination changes can occur between EMS assessments and ED arrival.36 Our use of NIHSS scores also precluded us from testing promising scales such as the ambulance clinical triage for acute stroke treatment (ACT-FAST) algorithm due to its stepwise algorithm approach and the Los Angeles Motor Scale (LAMS) due to the absence of handgrip data.37 38 A recent multisociety consensus statement recommended maximum EMS travel times during LVO stroke triage that are tailored to urban, suburban and rural settings.39 However, there are no consensus PPV, NPV, sensitivity or specificity goals for prehospital LVO stroke triage. This led us to create PPV and NPV goals that we thought were clinically reasonable, given the prevalence of LVO stroke. We also considered a secondary analysis using an NPV goal of ≥99%. However, only the lowest alternative scale thresholds met this goal, and none did so with a specificity ≥25%.
Our study has several strengths. First, our use of consecutive prehospital stroke alerts over the entire study period allowed us to better estimate the prevalence of disease among suspected stroke patients identified by EMS providers in the prehospital setting. Second, we compared a wide range of LVO prediction scale thresholds rather than limiting our analysis to standard thresholds. In some cases, standard thresholds originally thought to be more sensitive than specific have subsequently been found to be more specific than sensitive.10 11 22 We propose that selecting thresholds that meet future consensus goals will be more valuable than a focus on standard thresholds. High NPV or sensitivity goals should likely be favoured for LVO stroke given the profound yet time-sensitive benefit of EVT as well as the association of transfers from non-EVT centres to EVT centres with worse outcomes.1 4 5 7
Finally, our study did not attempt to derive a new LVO prediction scale and instead focused on the external validation of multiple scales. Although our assessments occurred in an ED, all patients were identified as suspected stroke patients by EMS providers in the prehospital setting.
In conclusion, our data were consistent with others suggesting that the prevalence of LVO stroke among unselected prehospital suspected stroke patients is low. Prevalence is a key factor in the performance of LVO prediction scales. High alternative LVO prediction scale thresholds were required to meet a PPV goal of ≥50%, but these thresholds missed most LVO strokes. Including intracranial haemorrhages as true positives increased the number of scales that could provide PPVs ≥50%. Several standard thresholds and many alternative lower thresholds provided NPVs ≥95%, including CPSS ≥2, though false positives were common. EMS systems already using the CPSS as a binary initial stroke screen could also adopt it as a high NPV LVO triage prediction scale without having to incorporate an additional stroke scale. The limitations of these neurological examination-based tests support the need for further investigation of alternative approaches to prehospital LVO stroke identification, such as Mobile Stroke Units and portable LVO stroke diagnostic devices.34 In the meantime, implementing LVO prediction scales may still benefit correctly classified patients compared with prehospital systems that do not attempt prehospital LVO stroke triage.