Cross-Cultural Translation and Validation of the Thai Version of the Scale for the Assessment and Rating of Ataxia (SARA-TH)
Abstract
Objective
To culturally adapt the original English Scale for the Assessment and Rating of Ataxia to Thai (SARA-TH) and to evaluate the reliability and validity of the SARA-TH in assessing ataxia in acute ischemic stroke or transient ischemic attack (TIA) patients, as assessed by three healthcare professionals.
Methods
The SARA underwent translation and cross-cultural adaptation to Thai according to established guidelines. Reliability (e.g., internal consistency, intrarater reliability, interrater reliability) and validity (e.g., content validity, convergent validity) were assessed in a sample of 50 participants with ataxia after acute ischemic stroke or TIA. Spearman correlation analysis was used to examine the relationships between the SARA-TH and the Barthel Index (BI-TH), the National Institutes of Health Stroke Scale (NIHSS-TH), and the International Cooperative Ataxia Rating Scale (ICARS) to assess convergent validity. Interrater and intrarater reliability among experienced and novice neurologists, physiotherapists, and occupational therapists were assessed using weighted kappa.
Results
The SARA-TH demonstrated good comprehension and exhibited no significant floor or ceiling effects. It showed excellent internal consistency (Cronbach’s α≥0.776). Significant correlations were found between the SARA-TH score and the BI-TH score (rs=-0.743 to -0.665), NIHSS-TH score (rs=0.404–0.513), and ICARS score (rs=0.859–0.917). The intrarater reliability for each rater ranged from 0.724 to 1.000 (p<0.01), and the interrater reliability varied from 0.281 to 0.927 (p<0.01).
Conclusion
The SARA-TH has excellent internal consistency, validity, and intrarater reliability, as well as acceptable interrater reliability among health professionals with varying levels of experience. It is recommended for assessing ataxia severity in individuals following acute ischemic stroke or TIA.
INTRODUCTION
Ataxia is a neurological disorder characterized by deficits in the regulation and synchronization of voluntary muscle movements [1]. Ataxia commonly results from damage to the cerebellum or its connections, vestibular system, posterior column in the spinal cord or peripheral nervous system, which encompasses the neural pathways responsible for proprioception (joint positioning and movement information) transmission to the cerebellum. Symptoms and signs of ataxia correlate with the specific localization of lesions within the cerebellum or cerebellar connections, manifesting as disturbances in stance, gait, eye movements, muscle tone, skilled movements, and speech [2]. Cerebellar ataxia can be caused by multiple factors, such as cerebrovascular disease, infection, and genetic disorders [3]. A previous study in Hong Kong revealed that cerebellar ataxia significantly affects mobility, particularly gait ataxia, which is prevalent in up to 90% of cases and leads to falls and subsequent injuries. This impairment hampers patients’ occupational engagement, resulting in pronounced economic and social ramifications. The direct and indirect financial burdens associated with patient care and treatment over a span of six months amounted to HKD 146,832 [4]. To evaluate cerebellar ataxia accurately, employing a standardized measurement tool is essential to enhance the efficacy of rehabilitation interventions for patients with this condition.
The International Cooperative Ataxia Rating Scale (ICARS), a standardized assessment tool recognized for its high validity and reliability, has been employed extensively for evaluating patients with ataxia [5]. However, the assessment involves 19 items and typically requires 20–30 minutes to complete, rendering it impractical for daily evaluations. Consequently, the Scale for the Assessment and Rating of Ataxia (SARA), a semiquantitative evaluation tool, was developed to assess the level of impairment in individuals with cerebellar ataxia [6]. The SARA comprises eight items (gait, stance, sitting, speech disturbance, finger chase, nose-finger test, fast alternating hand movements, and heel-shin slide test) and has demonstrated excellent reliability and validity for assessing ataxia [7]. Scoring on the SARA ranges from 0 to 40, with higher scores indicating greater severity of ataxia. The SARA has been translated or adapted into multiple languages, including Brazilian Portuguese [8], Japanese [9], Chinese [10], Korean [11], and French [12]. These versions have demonstrated strong internal consistency, along with a commendable level of accuracy and reliability [8-12]. However, no Thai version of the SARA has been available.
Neurologists, physiotherapists (PTs), and occupational therapists (OTs) play crucial roles in utilizing the SARA assessment during patient rehabilitation. Neurologists conduct initial assessments for patients with ataxia, while PTs and OTs collaborate to design and implement rehabilitation programs aimed at restoring mobility and functional abilities. Therefore, this study aimed to culturally adapt the original English SARA to Thai. Additionally, we sought to evaluate the internal consistency, intrarater reliability, interrater reliability, content validity, and convergent validity of the Thai version (SARA-TH) for assessing ataxia in patients with acute ischemic stroke or transient ischemic attack (TIA), as assessed by three healthcare professionals. We hypothesized that the SARA-TH would demonstrate validity and reliability in assessing ataxia severity following a stroke by neurologists, PTs, or OTs.
METHODS
Permission was granted by Tanja Schmitz-Hübsch, the original author of the SARA. This study was divided into two stages: (1) linguistic translation of the original English version of the SARA into the Thai language and (2) tests of the reliability and validity of the SARA-TH.
Stage I: Linguistic translation of the SARA into the Thai language
The SARA-TH was translated and cross-culturally adapted according to the standard guidelines suggested by Beaton et al. [13]. An English lecturer with a PhD in linguistics and a neuro-PT with a PhD in physiotherapy and 15 years of experience, both proficient in English and Thai (with Thai as their first language), independently translated the original English version of the SARA into Thai, resulting in versions 1-A and 1-B. Subsequently, the translators, together with two researchers, a neuro-PT and a musculoskeletal PT, engaged in comprehensive discussions and prepared written reports before proceeding with back-translation. Version 2 resulted from consensus and was back-translated into English independently by two translators proficient in both English and Thai, yielding versions 3-A and 3-B. In the next step, an expert committee (consisting of four translators, one expert linguistic chair, and two researchers) discussed the original questionnaire, all translated versions, and the written reports until they reached agreement regarding the semantic, idiomatic, experiential, and conceptual equivalences between the original and target versions. The prefinal version was tested via video on five patients, who were not included in the data analysis, by one neurologist and one neuro-PT (15 years of experience). No modifications to the questionnaire were needed.
Stage II: Reliability and validity of the SARA-TH
The study protocol was approved by the Institutional Review Board of Naresuan University (NU-IRB P1-0084/2564) and registered with the Thai Clinical Trials Registry (TCTR20240423001). All procedures were performed in accordance with the Declaration of Helsinki. All participants provided written informed consent prior to their participation.
Participants
From April 2022 to March 2024, consecutive ischemic stroke or TIA patients admitted to the Stroke Unit of Naresuan University Hospital were included in this study. A total of 50 individuals were enrolled, adhering to guidelines recommending a sample size of ≥50 participants for evaluating reliability and validity [14]. Inclusion criteria were patients aged 20 to 80 years who had been diagnosed with ataxia due to acute ischemic stroke or TIA. The diagnosis was confirmed by brain imaging (computed tomography or magnetic resonance imaging) [15] and assessed by a neurologist with over 10 years of experience. Ataxia was determined based on clinical manifestations, including gait, truncal, and limb ataxia [2], evaluated through tandem walking, finger-to-nose, and heel-to-shin tests. Additionally, participants needed to be able to follow at least a 2-stage verbal command [16] and provide written informed consent. The exclusion criteria included patients with other neurological conditions affecting movement (e.g., Parkinson’s disease or multiple sclerosis) and those with severe orthopaedic issues such as fractures.
Instruments
Participants were evaluated on all the following assessments.
(1) Barthel Index-Thai version
The Barthel Index (BI) is a reliable tool for assessing activities of daily living (ADL) poststroke [17] and consists of 10 items: feeding, bathing, grooming, dressing, bowel control, bladder control, toileting, chair transfer, ambulation, and stair climbing. Each item is scored at 0, 5, or 10, yielding a total score ranging from 0 to 100. Higher scores indicate a greater degree of functional independence. Notably, it has been proposed as a suitable outcome measure for stroke trials and clinical practice [18]. Moreover, the BI has been translated into Thai (BI-TH) and has demonstrated favourable validity and reliability [19].
(2) National Institutes of Health Stroke Scale-Thai version
The National Institutes of Health Stroke Scale (NIHSS) serves as a comprehensive assessment tool that quantifies neurological deficits following a stroke. It includes 11 items assessing the level of consciousness, horizontal eye movement, visual fields, facial palsy, motor arm, motor leg, sensory function, limb ataxia, language (aphasia), speech (dysarthria), and extinction and attention (neglect). Ratings for each item are assessed on a scale of 0 to 4 points, where 0 denotes normal function, and provisions are made for untestable items. The total score ranges from 0 to 42, with higher scores reflecting increased severity [20]. This scale holds validity in predicting mortality following acute stroke and has been proposed for supporting decisions regarding stroke rehabilitation [21]. The NIHSS was adapted and validated as a Thai version (NIHSS-TH) and has been reported to be a reliable tool [22].
(3) ICARS
The ICARS consists of 19 items categorized into four subscales: posture and gait disturbance, kinetic function, speech disorders, and oculomotor disorders. Scores on the ICARS range from 0 to 100, with higher scores indicating more significant impairments [6]. The ICARS has demonstrated strong criterion validity, excellent reliability and adequate internal consistency [5].
(4) SARA
The SARA includes 8 items: gait, stance, sitting, speech disturbance, finger chase, nose-finger test, fast alternating hand movements and heel-shin slide. The total score ranges from 0 to 40, with higher scores indicating greater severity of ataxia [6]. The SARA showed strong concurrent validity with the BI [23] and robust construct validity by correlating significantly with the ICARS [24]. It also demonstrated excellent reliability [6] and adequate internal consistency for the original version [6] and the Chinese version [10].
Assessment procedures
Initially, the participants were assessed with the BI-TH and NIHSS-TH, administered by a registered nurse who was trained to use those assessments and had more than 10 years of experience working in the stroke unit. Subsequently, a neurologist assessed participants’ ataxia using the ICARS on the same day as the BI-TH and NIHSS-TH assessments. The participants then underwent SARA-TH assessments in the afternoon of the same day, which were video-recorded. Six health professionals (two neurologists, two PTs, and two OTs) were stratified based on their professional experience: three raters had more than 3 years of experience in their respective fields (experienced), while the other three had less than 3 years of experience (novice). The three-year experience threshold was established based on the time required for a general practitioner to achieve Diplomate status with the Thai Board of Neurology. In the absence of specialized certifications for PTs and OTs in Thailand, we applied the same criterion used for neurologists to these professionals. Accordingly, the experienced neurologist, PT, and OT had 12, 12, and 13 years of experience, respectively, since obtaining their qualifications. In contrast, the novice neurologist, PT, and OT had 2 months, 2 years, and 1 year of experience, respectively. These raters independently rated the SARA-TH scores of each participant from the video recordings twice, with a two-week interval between ratings.
Statistical analysis
Statistical calculations were performed for 50 participants using the SPSS statistical package (version 17). Descriptive statistics (percentages, means and standard deviations) were used to describe the participants’ demographic characteristics. The validity and reliability of the test were then investigated as follows.
(1) Content validity
Content validity was evaluated by an expert committee panel in the translational stage. Floor and ceiling effects were also examined; the established threshold for identifying them is 15% of participants achieving the lowest or highest possible score [25].
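The floor and ceiling check can be sketched in a few lines of Python. This is a minimal illustration and not the authors' SPSS procedure: the function name `floor_ceiling_effects`, the default 0 to 40 score range (taken from the SARA description), and the reading of the 15% threshold as "an effect is present when more than 15% of participants sit at an extreme score" are assumptions, and the example scores are hypothetical rather than study data.

```python
def floor_ceiling_effects(scores, min_score=0, max_score=40, threshold=0.15):
    """Return the share of scores at each extreme and whether it exceeds the threshold.

    Assumption: an effect is flagged when the share is strictly greater
    than `threshold` (15% by default).
    """
    if not scores:
        raise ValueError("scores must not be empty")
    n = len(scores)
    floor_share = sum(1 for s in scores if s == min_score) / n
    ceiling_share = sum(1 for s in scores if s == max_score) / n
    return {
        "floor_share": floor_share,
        "ceiling_share": ceiling_share,
        "floor_effect": floor_share > threshold,
        "ceiling_effect": ceiling_share > threshold,
    }


# Hypothetical total scores for 50 participants (not study data).
example_scores = [0, 0] + [5] * 20 + [12] * 20 + [30] * 8
result = floor_ceiling_effects(example_scores)
print(result)
```

In this hypothetical sample, two of 50 participants score the minimum, so neither a floor nor a ceiling effect would be flagged under the assumed rule.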
(2) Convergent validity
The correlations between the SARA-TH and BI-TH, NIHSS-TH, and ICARS were examined using the Spearman rank correlation coefficient (rho). The correlations were defined by the absolute value of rho as strong (|rho|≥0.5), moderate (0.3≤|rho|<0.5) or weak (|rho|<0.3) [26,27].
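As an illustration of this step, the following sketch computes Spearman's rho as the Pearson correlation of average ranks and then labels its strength with the cut-offs given above. It is a plain-Python approximation of what a statistics package does, not the authors' analysis code. The function names are hypothetical, applying the cut-offs to the absolute value of rho is an assumption made so that negative correlations (such as those with the BI-TH) are classified by magnitude, and the paired scores are invented.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    if sx == 0 or sy == 0:
        raise ValueError("rho is undefined when one variable is constant")
    return cov / (sx * sy)


def classify_strength(rho):
    """Label a correlation using the cut-offs in the text, applied to |rho|."""
    magnitude = abs(rho)
    if magnitude >= 0.5:
        return "strong"
    if magnitude >= 0.3:
        return "moderate"
    return "weak"


# Hypothetical paired scores (not study data): higher ataxia, lower independence.
sara_scores = [2, 5, 9, 14, 20, 27]
barthel_scores = [95, 90, 70, 60, 35, 20]
rho = spearman_rho(sara_scores, barthel_scores)
print(round(rho, 3), classify_strength(rho))
```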
(3) Internal consistency
The internal consistency of the SARA-TH was evaluated using Cronbach’s α coefficient. A Cronbach’s α value higher than 0.7 was acceptable [28].
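For readers unfamiliar with the statistic, Cronbach's α for k items is α = k/(k−1) × (1 − Σ item variances / variance of the total score). The sketch below implements that standard formula with sample variances. It is illustrative only: the function name `cronbach_alpha`, the three-item layout, and the scores are hypothetical, whereas the SARA-TH itself has eight items.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for rows of participants and columns of items."""
    if len(item_scores) < 2:
        raise ValueError("at least two participants are required")
    k = len(item_scores[0])
    if k < 2 or any(len(row) != k for row in item_scores):
        raise ValueError("each participant needs the same number (>= 2) of items")
    # Sample variance of each item across participants.
    item_variances = [variance([row[i] for row in item_scores]) for i in range(k)]
    # Sample variance of the participants' total scores.
    total_variance = variance([sum(row) for row in item_scores])
    if total_variance == 0:
        raise ValueError("alpha is undefined when all total scores are equal")
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical item scores for five participants on three items (not study data).
example_items = [
    [2, 1, 2],
    [4, 3, 3],
    [6, 4, 5],
    [1, 0, 1],
    [5, 4, 4],
]
alpha = cronbach_alpha(example_items)
print(round(alpha, 3), "acceptable" if alpha > 0.7 else "below the 0.7 criterion")
```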
(4) Intra- and interrater reliability
The intra- and interrater reliability of the SARA-TH were assessed using weighted kappa statistics for 50 participants. The following criteria were utilized to interpret the kappa values: ≤0 indicated no agreement, 0.01–0.20 indicated none to slight agreement, 0.21–0.40 indicated fair agreement, 0.41–0.60 indicated moderate agreement, 0.61–0.80 indicated substantial agreement, and 0.81–1.00 indicated almost perfect agreement [29].
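A weighted kappa for two sets of ordinal ratings can be sketched as follows, using the form 1 − (weighted observed disagreement / weighted expected disagreement). The text does not state which weighting scheme was used, so the linear default here is an assumption, with quadratic weights offered as an alternative. The function names and example ratings are hypothetical, and `interpret_kappa` assigns values that fall between the published bands (for example 0.205) to the band with the next upper bound, which is also an assumption.

```python
def weighted_kappa(ratings_a, ratings_b, categories, weights="linear"):
    """Weighted kappa for two equally long lists of ordinal ratings.

    `categories` lists every possible rating in ascending order.
    """
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    if weights not in ("linear", "quadratic"):
        raise ValueError("weights must be 'linear' or 'quadratic'")
    k = len(categories)
    if k < 2:
        raise ValueError("at least two categories are required")
    index = {category: i for i, category in enumerate(categories)}
    n = len(ratings_a)

    # Observed proportion of participants in each (rating A, rating B) cell.
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(ratings_a, ratings_b):
        observed[index[a]][index[b]] += 1 / n
    row_totals = [sum(observed[i]) for i in range(k)]
    col_totals = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    power = 1 if weights == "linear" else 2

    def disagreement(i, j):
        return (abs(i - j) / (k - 1)) ** power

    observed_dis = sum(
        disagreement(i, j) * observed[i][j] for i in range(k) for j in range(k)
    )
    expected_dis = sum(
        disagreement(i, j) * row_totals[i] * col_totals[j]
        for i in range(k)
        for j in range(k)
    )
    if expected_dis == 0:
        raise ValueError("kappa is undefined when both raters use one category")
    return 1 - observed_dis / expected_dis


def interpret_kappa(kappa):
    """Map a kappa value to the agreement labels listed in the text."""
    if kappa <= 0:
        return "no agreement"
    if kappa <= 0.20:
        return "none to slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


# Hypothetical ratings of one item (scored 0 to 4) by one rater on two occasions.
first_pass = [0, 1, 2, 3, 4, 2]
second_pass = [0, 1, 2, 3, 3, 2]
kappa = weighted_kappa(first_pass, second_pass, categories=[0, 1, 2, 3, 4])
print(round(kappa, 3), interpret_kappa(kappa))
```

The same function covers both designs described in the text: passing one rater's two sessions gives an intrarater estimate, and passing two different raters' scores from the same session gives an interrater estimate.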
RESULTS
Participants
A total of 50 participants were enrolled. The descriptive characteristics of the participants and scores for all measurements are provided in Table 1. There were no missing data.
Content validity
The SARA-TH was translated both forward and backwards, followed by a thorough review by a committee of investigators to address and resolve any discrepancies. The administration of SARA-TH proceeded without any issues. There were no significant floor or ceiling effects noted for the SARA-TH. Two percent of participants assessed by PT2 and four percent assessed by OT2 achieved the lowest possible SARA-TH score.
Convergent validity
The SARA-TH scores, evaluated by six raters, exhibited significant negative correlations with the BI-TH, ranging from -0.743 to -0.665 (p<0.01). Additionally, positive correlations were observed between the SARA-TH scores and the NIHSS-TH scores, ranging from 0.404 to 0.513 (p<0.01). Moreover, a strong positive correlation emerged between the SARA-TH and ICARS, ranging from 0.859 to 0.917 (p<0.01). Detailed correlations between the SARA-TH and BI-TH, NIHSS-TH, and ICARS scores are provided in Table 2.
Internal consistency
The Cronbach’s α coefficients for the SARA-TH, which assess gait, stance, sitting, speech disturbance, finger chase, nose-finger test, fast alternating hand movements, and heel-shin slide across six raters, ranged from 0.776 to 0.823, as documented in Table 3. These measures represent satisfactory internal consistency.
Intrarater reliability
The reliability of individual assessments was examined using weighted kappa scores to measure intrarater consistency, as outlined in Table 4. Among experienced neurologists, PTs, and OTs (Neuro1, PT1, and OT1), there was an exceptionally high level of agreement, ranging from 0.804 to 1.000, indicating almost perfect agreement. Similarly, novice Neuro, PT, and OT raters (Neuro2, PT2, and OT2) displayed substantial to almost perfect agreement in their two ratings, with weighted kappa scores ranging from 0.724 to 0.948.
Interrater reliability
Interrater reliability was assessed for individual SARA-TH items by both experienced and novice neurologists, PTs, and OTs. Among experienced raters, weighted kappa scores ranged from 0.431 to 0.927 (p<0.01), indicating moderate to almost perfect agreement. Between experienced and novice raters, weighted kappa scores ranged from 0.281 to 0.747 (p<0.01), signifying fair to substantial agreement. Table 5 presents the weighted kappa scores for these comparisons.
DISCUSSION
This study aimed to translate and cross-culturally adapt the original English version of the SARA into Thai and to test its validity. Furthermore, it investigated the interrater reliability among interdisciplinary professionals engaged in ataxic stroke rehabilitation. Patients with TIA and ischemic stroke were enrolled to confirm the feasibility of using the SARA-TH for early rehabilitation in cases of hyperacute stroke. The findings suggest that the SARA-TH had satisfactory reliability and validity in ataxic stroke patients.
We found no significant floor or ceiling effects for the SARA-TH, suggesting good content validity. Moreover, the content validity was also verified by the expert committee during the translation process. The SARA-TH score exhibited a strong correlation with the BI-TH, similar to the original version [6]. Consequently, the SARA-TH may offer utility in forecasting ADL functionality among ataxic stroke patients. Furthermore, similar to the original [30] and the Chinese [10] versions, the SARA-TH demonstrated a strong correlation with the ICARS, a scale encompassing 19 items that necessitates a lengthier completion time [10]. These results support the use of the SARA-TH as a routine assessment tool for patients with ataxic stroke, potentially supplanting the need for the ICARS. However, the SARA-TH demonstrated only a moderate correlation with the NIHSS-TH. This could be because the NIHSS evaluates a wider range of neurological deficits, including facial palsy, sensory function, and language difficulties, in addition to ataxia [20]. Therefore, although the SARA-TH is effective for assessing ataxia, it may not completely replace the NIHSS-TH for predicting the severity of overall neurological deficits. It is noteworthy that the SARA-TH showed only a moderate correlation with the NIHSS-TH but a strong correlation with the BI-TH. This differing trend can be explained by the International Classification of Functioning, Disability, and Health conceptual framework. Eight out of ten items on the BI (i.e., feeding, bathing, grooming, dressing, toileting, chair transfer, ambulation, and stair climbing) can be categorized into the activity-disability level, similar to four out of eight items on the SARA (i.e., gait, stance, sitting, and speech disturbance). In contrast, all items on the NIHSS can be categorized into the impairment level [31]. Therefore, it can be concluded that the SARA score represents not only patients’ impairments but also the activity-disability level of stroke patients.
The Cronbach’s alpha coefficient of the SARA-TH was greater than 0.7, indicating that all items within the test measure a unified concept or construct [32], similar to the original [6], Korean [11], Brazilian [8], Chinese [10], and French [12] versions.
The intrarater reliability of the SARA-TH showed substantial to almost perfect agreement (weighted kappa 0.724–1.000), similar to the original [6] and Korean [11] versions. This robust agreement may be attributed to the standardized patient assessment facilitated by a consistent original video. Each rater had the chance to observe the subject’s movements from a constant perspective during the initial assessment, reducing ambiguity in interpreting subsequent results. These findings align with a prior study [33] investigating intrarater reliability of movement performance in children based on video assessments.
The overall interrater reliability among the three experienced health professionals showed substantial to nearly perfect agreement [28]. This suggests that ataxia in stroke patients can be effectively assessed using the SARA-TH by health professionals with different backgrounds. However, four specific items (finger chase, nose-finger test, fast alternating hand movements, and heel-shin slide) exhibited moderate to substantial agreement among experienced neurologists, PTs, and OTs, and fair to moderate agreement between experienced and novice health professionals in each field. These outcomes align with findings from the Japanese version study, which reported the lowest interrater reliability for the finger chase and fast alternating hand movements [9]. This suggests that it can be challenging for raters to score the distance of abnormal finger or heel movements consistently. The operational definitions for the finger chase and nose-finger test items require raters to distinguish between different distances, such as between 5 and 2 centimetres for the nose-finger test and more than or less than 15 centimetres for the finger chase, which can be difficult to judge through observation alone. To improve reliability among raters, tools that aid in objectively measuring finger distance may prove essential. Similarly, accurately counting the number of times the heel goes off the shin can also pose challenges for reliability between raters. Scoring the fast alternating hand movements item requires raters to pay attention to both the time taken and the performance of each participant. Without consensus on how to assess these two aspects, reliability between raters may be compromised.
Intrarater reliability appeared to be more consistent than interrater reliability. This suggests that it could be preferable to have a single rater administer the SARA-TH in clinical or research settings. However, if multiple raters are involved, additional training in test administration and an exploration of interrater reliability are recommended to ensure consistent ratings. A Training Tool and Certification Program developed by the German Center for Neurodegenerative Diseases aims to enhance the quality of SARA assessments [34]. However, language proficiency, particularly in English, may limit accessibility to this program for some Thai raters.
While the SARA-TH demonstrates validity and reliability among health professionals, this study has certain limitations. First, our study participants may only represent individuals with stroke who do not exhibit varying severity in the sitting or speech items. Future research should investigate the psychometric properties of the SARA-TH in participants with different conditions. Second, the current study investigated the reliability between health professionals who assessed participants via video. Although we filmed all participants’ movements following the SARA-TH instructions from the same viewpoint to standardize what the assessors scored, the assessors had no chance to interact directly with participants. It would be interesting to investigate the reliability of SARA-TH scoring in real time. Last, this study examined the psychometric properties at a single time point only; further investigation into other properties, such as predictive validity, discriminative validity, construct validity, and responsiveness, is needed.
In conclusion, the SARA-TH was successfully translated into Thai, showing excellent internal consistency, validity, and intrarater reliability, as well as acceptable interrater reliability among health professionals with varying levels of experience. It is recommended for evaluating ataxia severity in individuals following acute ischemic stroke or TIA.
Notes
CONFLICTS OF INTEREST
No potential conflict of interest relevant to this article was reported.
FUNDING INFORMATION
This research was funded by Naresuan University, grant number R2566C036.
AUTHOR CONTRIBUTION
Conceptualization: Roongpiboonsopit D, Wiangkham T, Srisoparb W. Data curation: Roongpiboonsopit D, Srisoparb W. Methodology: Roongpiboonsopit D, Srisoparb W. Formal analysis: Isariyapan O, Srisoparb W. Funding acquisition: Roongpiboonsopit D, Wiangkham T, Srisoparb W. Investigation: Roongpiboonsopit D, Laohapiboolrattana W, Isariyapan O, Kongsuk J, Pattanapongpitak H, Sonkaew T, Termjai M, Isaravisavakul S, Wairit S, Srisoparb W. Resources: Wairit S. Project administration: Wiangkham T, Srisoparb W. Supervision: Wiangkham T. Validation: Roongpiboonsopit D, Laohapiboolrattana W, Wiangkham T, Pattanapongpitak H, Sonkaew T, Termjai M, Isaravisavakul S, Wairit S. Visualization: Wiangkham T, Srisoparb W. Writing – original draft: Srisoparb W. Writing – review and editing: Srisoparb W. Approval of final manuscript: all authors.
ACKNOWLEDGMENTS
All authors wish to express their gratitude to all participants and the physical therapy students who contributed to this study as research assistants.