Reliability and Applicability of the Bayley Scale of Infant Development-II for Children With Cerebral Palsy
Abstract
Objective
To assess the reliability and applicability of the Korean version of the Bayley Scale of Infant Development-II (BSID-II) in evaluating the developmental status of children with cerebral palsy (CP).
Methods
The inter-rater reliability of BSID-II scores from 68 children with CP (46 boys and 22 girls; mean age, 32.54±16.76 months; age range, 4 to 78 months) was evaluated by 10 pediatric occupational therapists. Patients were classified by age group, typology, and severity of motor impairment according to the level of the Gross Motor Function Classification System (GMFCS). Scoring was performed by video analysis, and intraclass correlation coefficients (ICCs) were obtained for each of the above classifications. To evaluate the clinical applicability of BSID-II for CP, its correlation with the Gross Motor Function Measure (GMFM), which is known as the standard motor assessment for CP, was investigated.
Results
ICC was 0.99 for the Mental scale and 0.98 for the Motor scale in all subjects. The values of ICC ranged from 0.92 to 0.99 for each age group, 0.93 to 0.99 for each typology, and 0.99 to 1.00 for each GMFCS level. A strong positive correlation was found between the BSID-II Motor raw score and the GMFM total score (r=0.84, p<0.001), and a moderate correlation was observed between the BSID-II Mental raw score and the GMFM total score (r=0.65, p<0.001).
Conclusion
The Korean version of BSID-II is a reliable tool for measuring the functional status of children with CP. The raw scores of BSID-II showed a strong correlation with GMFM, indicating the validity of this measure for clinical use in children with CP.
INTRODUCTION
Cerebral palsy (CP) is a group of disorders of the development of movement and posture that are attributed to various forms of non-progressive damage to the developing fetal or infant brain. Motor dysfunctions in CP are often accompanied by various problems involving sensation, cognition, communication, strabismus, perception, and behavior [1]. Recently, owing to advances in pediatric medicine, the survival rate of high-risk children, such as those born prematurely or those with traumatic brain injury, has increased [2]. When treating children with neurological or developmental disorders, exact assessment of their current developmental status is essential for planning the therapeutic strategy and for determining therapeutic efficacy. However, the functional implications of CP involve the various developmental domains listed above, that is, the gross motor domain, cognition, communication, perception, etc. It should also be noted that the severity of involvement varies across domains. Moreover, functional development in children with CP does not always follow the usual developmental stages of normal children. Thus, it is difficult to assess the variable status of children with CP exactly, and the key issues in assessing children with CP with any tool are its reliability and clinical applicability.
Reliability indicates the reproducibility of the same value through repeated assessment. When reproducibility is measured across different raters, it is called "inter-rater reliability". It reflects the degree of standardization of the test and the ability of the raters to perform the evaluation correctly [3]. Therefore, confirming the reliability of a tool is mandatory before its use in practice. There are some reports on reliable measures of motor function and performance in children with CP [3,4]. However, these reports have focused only on gross motor function, which is related to ambulation ability. Few studies have reported the reliability of measurement tools for other important aspects of CP, including cognition, communication, perception, and fine motor function [5].
The Bayley Scales of Infant Development (BSID) is a tool used worldwide to evaluate the development of various cognitive functions as well as the motor performance of infants and children [6]. The BSID was published on the basis of the California Mental scale and Motor scale, which have been used since the 1930s. After revision, the second edition was released in 1993 and is currently in use around the world [6]. This version was adapted and standardized in Korea as the Korean version of the Bayley Scale of Infant Development-II (BSID-II) and has been used since 2004 [7].
The BSID-II itself was not designed to diagnose impairments; rather, it focuses on providing normative data so that a child's current status can be assessed by comparison with the norms [8]. As for its applicability, the original English version was validated only in children with Down syndrome [9], prematurity [10], and prenatal drug exposure [11]. The Korean translated version provided validity data from just 14 disabled children without a specific diagnosis [7]. Although some studies have used BSID-II to assess the functional status of children with CP, none established the validity of applying BSID-II to CP subjects before enrollment [5,12-18]. Since the BSID-II was not developed to quantify the abilities of children with CP, who have a wide range of motor and cognitive dysfunction [8], the scale needs to be validated for CP subjects before its results are interpreted. To validate an instrument that measures function, it is essential to determine its reliability and its ability to assess the intended functional status.
The goal of this study was to evaluate the reliability and clinical applicability, i.e., the ability to assess the intended function, of the Korean version of BSID-II for children with CP. To confirm reliability across the whole range of conditions, groups stratified by age, typology, and severity were each examined. To evaluate clinical applicability, the correlation between the raw scores of BSID-II and the Gross Motor Function Measure (GMFM), which is known as the touchstone for functional evaluation in CP, was investigated. The correlation between the Mental and Motor scales was also assessed.
MATERIALS AND METHODS
This study was commenced after approval of the Institutional Review Board of CHA Medical Center, Republic of Korea. The parents of the participants provided written informed consent for this study before enrollment. Performance during each assessment of BSID-II was video-recorded, and each video record was used for BSID-II scoring by the other enrolled evaluators.
Subjects
The participants were children with CP who were receiving rehabilitation treatment from December 2010 to January 2011. The inclusion criteria were a diagnosis of CP with abnormalities in movement, posture, and muscle tone, and a brain lesion detectable on imaging studies that correlated with the physical impairment. The exclusion criteria were the presence of congenital anomalies or a highly probable genetic syndrome, and any medical or surgical condition affecting the analysis. The diagnosis of CP was made by a pediatric rehabilitation medicine doctor. The sample size was determined with reference to Shoukri's study [19]. Ten occupational therapists conducted the BSID-II assessments, and 68 children with CP (46 males and 22 females), whose function was scored as under 42 months of age on both the Motor and Mental scales, participated in the present study. The subjects were classified by three criteria for further analyses. First, they were classified by chronological age into five subgroups: 0-12, 13-24, 25-36, 37-42, and more than 42 months. They were also classified into five subgroups by typology: spastic bilateral, spastic unilateral, dystonia, choreoathetosis, and ataxia [1]. Finally, the severity of functional motor impairment was classified according to the Gross Motor Function Classification System (GMFCS), from level I (least impaired) to level V (most severely impaired) [20].
Evaluation tools
All subjects were evaluated with BSID-II, GMFM-88, and GMFCS.
Korean version of the BSID-II
The Korean BSID-II is an evaluation tool for assessing the developmental status of individual children, which was standardized by Park and Cho [7]. The Korean BSID-II is used for children aged 1 to 42 months; however, it is also applicable to children over 42 months of age with developmental delay if their function is below that of their normal counterparts [7,8]. BSID-II consists of Mental, Motor, and behavior rating scales. The Mental scale provides a raw score, a developmental age of mental status, and a Mental Developmental Index (MDI); the Motor scale provides a raw score, a developmental age of motor-related function, and a Psychomotor Developmental Index (PDI) [6,21]. The behavior scale rates the quality of the patient's behavior during the test [7]. The Mental scale was designed mainly to assess cognition through evaluation of sensory/perception, knowledge, memory, problem-solving skills, and language. The Motor scale evaluates the ability to control the gross muscle groups responsible for movements associated with crawling, sitting, walking, and jumping, and tests the fine motor manipulations involved in prehension, adaptive use of writing implements, and imitation of hand movements [22].
GMFM-88 and GMFCS
The GMFM-88 is a criterion-referenced observational measure for the assessment of children with CP [23]; it consists of 88 items grouped into 5 dimensions: lying and rolling; sitting; crawling and kneeling; standing; and walking, running, and jumping. The scale was proposed to quantitatively evaluate gross motor function. The score for each dimension is expressed as a percentage of the maximum score for that dimension. The total score is calculated by averaging the percentage scores across the 5 dimensions and ranges from 0 to 100.
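The scoring rule described above can be illustrated with a minimal sketch. The per-dimension item counts (17, 20, 14, 13, and 24) and the 0 to 3 rating of each item are assumptions taken from the published GMFM-88 score sheet and are not stated in this article; the function names and the example child are hypothetical.

```python
# Item counts per GMFM-88 dimension. ASSUMPTION: taken from the published
# GMFM-88 score sheet; the article states only "88 items in 5 dimensions".
GMFM88_ITEMS = {
    "A_lying_rolling": 17,
    "B_sitting": 20,
    "C_crawling_kneeling": 14,
    "D_standing": 13,
    "E_walking_running_jumping": 24,
}
MAX_ITEM_SCORE = 3  # ASSUMPTION: each item is rated on a 0-3 ordinal scale


def dimension_percent(item_scores, n_items):
    """Return one dimension's score as a percentage of its maximum."""
    if len(item_scores) != n_items:
        raise ValueError("expected %d item scores, got %d" % (n_items, len(item_scores)))
    if any(s < 0 or s > MAX_ITEM_SCORE for s in item_scores):
        raise ValueError("item scores must lie between 0 and 3")
    return 100.0 * sum(item_scores) / (n_items * MAX_ITEM_SCORE)


def gmfm88_total(scores_by_dimension):
    """Return the mean of the five dimension percentages (0 to 100)."""
    percents = [
        dimension_percent(scores_by_dimension[name], n_items)
        for name, n_items in GMFM88_ITEMS.items()
    ]
    return sum(percents) / len(percents)


# Hypothetical child: full marks in A and B, half of the C items at 3,
# and no items achieved in D or E.
example = {
    "A_lying_rolling": [3] * 17,
    "B_sitting": [3] * 20,
    "C_crawling_kneeling": [3] * 7 + [0] * 7,
    "D_standing": [0] * 13,
    "E_walking_running_jumping": [0] * 24,
}
print(round(gmfm88_total(example), 1))  # dimensions 100, 100, 50, 0, 0 -> 50.0
```

Because the total is an unweighted mean of percentages, each dimension contributes equally regardless of how many items it contains.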
During the GMFM evaluation, GMFCS is also assessed. GMFCS was developed by Palisano et al. [24] to classify the degree of gross motor impairment of children with CP into five levels. The distinction between levels is based on the ability to move and the need for supporting devices.
Methods to evaluate inter-rater reliability of BSID-II
Ten pediatric occupational therapists who were well trained in conducting BSID-II served as raters in this study. Three of them had more than 8 years of experience exclusively in the pediatric setting, 2 had 7 years, and 5 had 2 years. Before the study began, they went through a training period of about 1 month, 3 days a week and more than 4 hours a day, in which they watched and scored video recordings of actual tests conducted by each therapist.
The testing time for one child was approximately 40 minutes to 1 hour. While each therapist carried out the BSID-II, the whole process was video-recorded by an assistant therapist, who afterwards sorted the recording into mental and motor parts. The other 9 therapists then assessed the same patient by watching the video recordings. A pediatric physical therapist conducted GMFM and GMFCS for the same patients within 7 days after the BSID-II exam.
In addition, the raters looked for limitations of the Korean BSID-II in children with CP by comparing it with the original version of BSID-II.
Statistical analysis
Intraclass correlation coefficients (ICCs) for inter-rater reliability were analyzed according to age group, typology of CP, GMFCS level, and the raters' length of experience. Correlation analyses were performed to evaluate the relationship between the GMFM total score and the raw scores of the Motor and Mental scales of BSID-II, using the Pearson correlation coefficient or the Spearman rank coefficient according to the number of samples. The correlation between MDI and PDI was also obtained with the Pearson correlation coefficient or the Spearman rank coefficient. The correlation between GMFCS levels and the raw scores of the Motor and Mental scales of BSID-II was analyzed with the Spearman rank coefficient.
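For readers unfamiliar with the ICC, the following is a minimal sketch of one common form, ICC(2,1) in the Shrout and Fleiss naming (two-way random effects, absolute agreement, single rater). The article does not state which ICC model was selected in SPSS, so this choice is an assumption made for illustration only; the function name and the ratings are hypothetical and are not data from this study.

```python
def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    `ratings` is a complete table given as a list of rows, with one row per
    child and one column per rater. ASSUMPTION: the article does not say
    which ICC form was used; this is one common choice.
    """
    n = len(ratings)        # number of subjects (children)
    k = len(ratings[0])     # number of raters
    if n < 2 or k < 2 or any(len(row) != k for row in ratings):
        raise ValueError("need a complete table with >= 2 subjects and >= 2 raters")

    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares without replication
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)    # between subjects
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)    # between raters
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols                   # residual

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Illustrative ratings only: 6 subjects (rows) scored by 4 raters (columns).
toy = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
print(round(icc_2_1(toy), 2))  # 0.29: raters rank subjects similarly but differ in level
```

In this form, systematic differences between raters lower the coefficient, so values close to 1 require the raters to agree on the actual scores and not only on the ordering of the children.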
SPSS ver. 19.0 (IBM, Armonk, NY, USA) at CHA Medical Center was used for the analyses. ICCs below 0.75 were considered 'poor to fair', those above 0.75 'good', and those above 0.90 'excellent' [25]. For correlation, a coefficient of r≥0.8 indicated 'high' correlation, 0.6-0.8 'good', 0.4-0.6 'moderate', and ≤0.4 'poor' [26].
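The interpretation bands above can be written as a small lookup, sketched below. The handling of values that fall exactly on a boundary (for example, an ICC of exactly 0.75 or r of exactly 0.6) and the use of the absolute value for negative correlations are assumptions, since the text does not specify them; the function names are hypothetical.

```python
def interpret_icc(icc):
    """Label an ICC using the bands stated in the text [25]."""
    if icc > 0.90:
        return "excellent"
    if icc >= 0.75:  # ASSUMPTION: exactly 0.75 is treated as 'good'
        return "good"
    return "poor to fair"


def interpret_r(r):
    """Label a correlation coefficient using the bands stated in the text [26]."""
    size = abs(r)  # ASSUMPTION: negative correlations are judged by magnitude
    if size >= 0.8:
        return "high"
    if size >= 0.6:  # ASSUMPTION: exactly 0.6 is treated as 'good'
        return "good"
    if size > 0.4:
        return "moderate"
    return "poor"


print(interpret_icc(0.99))  # excellent
print(interpret_r(0.84))    # high
print(interpret_r(0.58))    # moderate
print(interpret_r(-0.86))   # high
```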
RESULTS
Characteristics of population
Sixty-eight children with CP, 46 boys (67.6%) and 22 girls (32.4%), participated in the present study. Their mean age was 32.54±16.76 months (range, 4 to 78 months). Demographic data are presented in Table 1.
The mean raw scores of the BSID-II Mental and Motor scales measured by the raters are presented in Table 2.
Inter-rater reliability of BSID-II
The ICC values for inter-rater reliability of BSID-II scores assessed by 10 raters were 0.99 for the Mental scale and 0.98 for the Motor scale in all subjects, which was interpreted as 'excellent' reliability (Table 3). The ICC values of BSID-II scores for each group divided by age, typology, GMFCS level, and the raters' length of experience also demonstrated 'excellent' reliability (Table 3).
Correlation between GMFCS and raw scores of BSID-II
The GMFCS levels had a significant negative correlation with both Motor scale (r=-0.86, p<0.001) and Mental scale (r=-0.60, p<0.001) raw scores of BSID-II.
Correlation between GMFM and raw scores of BSID-II
The total score of GMFM showed a positive correlation with both the Motor scale (r=0.84, p<0.001) and Mental scale (r=0.65, p<0.001) scores of BSID-II in all subjects. Analysis by typology revealed a high or good correlation between the GMFM total score and the Motor scale scores in the spastic bilateral and spastic unilateral subgroups (rs>0.77, p<0.001), whereas the dystonia and ataxia subgroups showed no such correlation (Table 4). Moderate and high correlations between the GMFM and Mental scale scores were observed in the spastic bilateral and spastic unilateral subgroups, respectively. When the analysis was conducted by GMFCS level, levels I, IV, and V showed a moderate to high correlation between the GMFM and both the BSID-II Motor and Mental scale scores (rs>0.58, p<0.030) (Table 4).
Correlation between PDI and MDI of BSID-II
Of a total of 68 participants, 55 were younger than 42 months of age and were enrolled for PDI and MDI correlation analysis. A moderate degree of positive correlation between the PDI and MDI was shown in all subjects. The correlations were good to high in all the subgroups of typology, except for ataxia (rs>0.60, p<0.030) (Table 5).
Points to consider when using the Korean BSID-II to evaluate children with CP, with reference to the original version
During the research process, we discovered several possible inaccuracies or ambiguous terms in the Motor scale of the Korean BSID-II when compared with the original English version of BSID-II. In the 23rd, 29th, and 32nd items of the Motor scale, the given condition for scoring could be misunderstood as 'bench sitting' because the Korean version describes it as 'sitting on a table'. Whereas the original version says 'sitting on the bench surface', the Korean version could be taken to mean 'sitting mat surface'. In the 27th item, the description 'moves wrist' in the Korean version could include wrist flexion and extension, but the original version expresses it as 'rotates wrist'. In the 67th item, 'If the child stands up after lying on the belly' is not a clear expression, whereas the original version describes it as 'If the child rolls into the prone position before standing up'. In the 78th item, the height of the rope for jumping in the Korean version is 5 cm, lower than the original version's 8 inches, which might be a misprint of 15 cm. In the 91st item, 'manipulates pencil in hand' is represented as 'involvement of any fingers fulfills the score', while the original version accepts 'using only the fingers to position the pencil'. In the present study, we chose to follow the original version of BSID-II for such ambiguous items.
Four important typographical errors were found in the scoring system and have been reported to the publisher.
DISCUSSION
In this study, the inter-rater reliability of the Korean version of BSID-II was examined by ten raters, based on scores from video recordings of children with CP. The inter-rater reliability of the Korean BSID-II was found to be excellent for both the Motor and Mental scales when applied to children with CP. The high ICC values were not affected by age, typology, or severity. This finding is important because BSID-II enables quantitative observation of the various abnormal functions of children with CP during their development in the absence of other available tools, especially for cognitive function [27]. To date, only the GMFM has been validated specifically for children with CP and has been used as a standardized measurement for observing gross motor function [23,28]. In the present study, we examined the clinical applicability of the Motor and Mental scales of BSID-II by analyzing their correlation with the GMFM score. The results showed a high correlation between the Motor scale of BSID-II and GMFM in all subjects. This suggests that the observational power of the BSID-II Motor scale is similar to that of the GMFM. The Mental scale also showed a correlation with the GMFM to a moderate degree.
Further subgroup analysis by typology and severity revealed a strong correlation of the BSID-II Motor scale with GMFM in the spastic bilateral and spastic unilateral subgroups and in GMFCS levels I, IV, and V, which also showed positive correlations with the Mental scale. However, the GMFCS level II and III groups differed in showing no correlation. This may be explained by differences in what the two independent systems score, which were accentuated in those groups. Although the GMFCS level II and III groups included other typologies, the majority of these children were spastic bilateral (Table 1). Considering their functional level, they mostly corresponded to 'spastic diplegia' in the old CP classification, which is characterized by better motor function in the upper limbs and poorer motor function in the lower limbs. Considering what each measure assesses, it is understandable that the GMFCS level II and III groups showed no correlation between the BSID-II Motor scale and GMFM. The GMFM mainly assesses the function of the lower extremities, whereas the BSID-II Motor scale considers the function of the upper extremities as one of the important determinants.
The Mental scale of BSID-II, or MDI, measures a broad range of cognition, including sensory/perception, language, and social skills [8]. We did not assess cognitive function with instruments other than the BSID-II Mental scale. Thus, validation for cognition could not be fully provided in this study. However, considering its high reliability, its significant correlation with motor variables, and the validity of the BSID-II Motor scale, which has the same scoring system, the BSID-II Mental scale can be suggested as a tool for evaluating cognition in CP subjects.
Across all subjects, MDI and PDI showed a moderate degree of correlation [26]. According to the subgroup analysis, the spastic unilateral subgroup showed a strong relationship between MDI and PDI. However, the number of subjects was too small to draw any conclusions. The good correlation in the spastic bilateral subgroup seems reasonable, because the severity of brain damage determines both motor and cognitive sequelae.
Taken together, BSID-II is a useful tool for evaluating children with CP who show delayed development in motor and other functions, including cognition. Moreover, the scale is useful for evaluating children older than 42 months whose function is lower than that of their normal counterparts. The overall correlation between BSID-II scores and GMFM was high (Table 4). This confirms the validity of the scoring system for children with CP.
In conclusion, the inter-rater reliability of the BSID-II Motor and Mental scales across ten raters was shown to be very high. The BSID-II Motor scale showed a high correlation with the GMFM in all subjects. MDI and PDI were also well correlated. The results of this study show that BSID-II is a valid tool for evaluating various functions in children with CP.
Notes
No potential conflict of interest relevant to this article was reported.