Determining the Reliability of a New Method for Measuring Joint Range of Motion Through a Randomized Controlled Trial
Abstract
Objective
To compare the reliability and validity of the Korean range of motion standard protocol (KRSP) for measuring joint range of motion (ROM) with those of the conventional ROM measurement using a goniometer.
Methods
We conducted a randomized controlled trial involving 91 healthy elderly individuals. We compared two strategies of measuring joint ROM to evaluate the reliability and validity of each standardized protocol: first, the KRSP based on the Chungnam National University guidelines and second, handheld goniometric measurement. In the first strategy, 3 examiners (1 rehabilitation doctor, 1 physical therapist, and 1 physical therapy student) independently measured joint ROM in 46 randomly selected subjects; in the second strategy, another 3 examiners (1 rehabilitation doctor, 1 physical therapist, and 1 physical therapy student) measured joint ROM in 45 randomly selected subjects. The reliability of each protocol was calculated using intraclass correlation coefficient, ICC(2,1), and root mean square error (RMSE).
Results
Both protocols showed good to excellent intra-rater reliability. With goniometer use, the inter-rater reliability was low, with ICC(2,1) (95% confidence interval) ranging from -0.078 (-0.296–0.494) to 0.643 (0.486–0.783), and RMSE was high. With the KRSP, the inter-rater reliability ranged from 0.846 (0.686–0.931) to 0.986 (0.972–0.994) and RMSE was low.
Conclusion
ROM measurements using the KRSP showed excellent reliability. These results indicate that this protocol can be the reference standard for measuring ROM in clinical settings as an alternative to goniometers.
INTRODUCTION
Joint range of motion (ROM) is the most common outcome used to evaluate the effect of treatment for musculoskeletal disease. Recognizing limitations in joint ROM is essential to clinicians for making diagnoses, evaluating improvement or deterioration in mobility, and determining functional limitations. Therefore, a reliable and valid tool to objectively measure disease improvement or progression, outcomes, and mobility impairment is important [1].
The measurement methods and types of evaluation of ROM vary among clinicians and institutions, including the American Medical Association and McBride guidelines, and there is no consensus [2-6]. For example, the measurement method for normal ROM provided by the Disability Act and National Pension Act of Korea is based on passive ROM, in which the examiner moves the patient, but it offers no information about the measurement position, the site of applied force, or the degree of pushing force. In a study by Cho et al. [7] regarding ankle ROM, a significant correlation was found between ankle ROM and pushing force; however, ROM was measured using different techniques according to the knee joint position and sex. Therefore, it is crucial to establish a standard reference for joint ROM measurement considering the pushing force (site and amount) and measurement position.
Classically, the standard goniometer has been the gold standard for measuring joint ROM in clinical settings because it is portable and relatively inexpensive. However, it has several limitations that make it difficult for clinicians to use. Clinicians need both hands to operate a goniometer, which makes limb stabilization difficult and thus increases the risk of measurement error [8]. An inclinometer is a tool that uses gravity as a constant reference to evaluate joint ROM. Digital inclinometers are easy to use, portable, and lightweight. However, they are expensive compared to goniometers. Moreover, to minimize the risk of measurement errors, examiners must always set the zero point accurately before use [9,10].
Different authors have reported a range of reliability of ROM measurements. Trijffel et al. [11] reviewed previous studies of inter-rater reliability for measuring passive ROM of the lower extremities using different measurement tools. They suggested that clinicians should be cautious when relying on results obtained through measurement of passive movements in joints for making decisions because of their low reliability. Therefore, it is necessary to establish standard protocols for ROM measurement to achieve consistent and precise values based on the measurement method, position, site, and degree of force.
The aim of this study was to compare the reliability and validity of the newly developed Korean ROM standard protocol (KRSP), which is based on the Chungnam National University guidelines, with those of conventional ROM measurement using a goniometer.
MATERIALS AND METHODS
We conducted a randomized controlled trial that included healthy elderly subjects recruited from the Department of Rehabilitation Medicine at Chungnam National University Hospital, Korea. A computer-generated randomization sequence was used. The reliability of joint ROM measurement was evaluated for the two measurement methods. The study was approved by the Chungnam National University Ethics Committee (No. 2016-07-014) and was performed in accordance with the protocol. All subjects provided written informed consent to participate in this study; however, they did not know whether they were in the experimental or comparison group (Fig. 1). The KRSP is described in detail in Supplementary Tables S1–S5.

Flow diagram of the randomized controlled trial. ROM, range of motion; KRSP, Korean range of motion standard protocol.
Participants and sample size
A total of 101 healthy elderly volunteers aged ≥65 years were recruited in the initial validation study. Before measuring ROM, the body mass index of all subjects was measured, and radiology and laboratory studies were conducted to rule out diseases affecting ROM. Those with a history of orthopedic surgery due to fracture or other diseases; with limited joint ROM due to severe osteoarthritis or rheumatoid arthritis, congenital defects, structural disease, amputation, and central or peripheral nervous system injuries; who were unable to participate due to their poor general condition; or who had difficulty in voluntary decision-making due to cognitive impairment were excluded. Participants were randomly allocated to the KRSP group or goniometer group.
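The random allocation described above could be sketched as follows. This is only an illustration under stated assumptions: the source reports a computer-generated randomization sequence but not its algorithm, so the simple seeded shuffle, the function name `allocate`, the seed value, and the choice to give the extra participant to the KRSP group are all hypothetical.

```python
import random


def allocate(participant_ids, seed):
    """Randomly split participants into two near-equal groups.

    Assumption: simple (unstratified, unblocked) randomization via a
    seeded shuffle; the study's actual procedure is not described.
    """
    rng = random.Random(seed)          # seeded so the sequence is reproducible
    shuffled = list(participant_ids)
    rng.shuffle(shuffled)
    half = (len(shuffled) + 1) // 2    # with an odd total, the first group gets one more
    return {
        "KRSP": sorted(shuffled[:half]),
        "goniometer": sorted(shuffled[half:]),
    }


# Hypothetical participant IDs 1..91 and an arbitrary seed.
groups = allocate(range(1, 92), seed=2016)
```

With 91 participants this split yields group sizes of 46 and 45, which matches the group sizes reported in the abstract.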
A sample size of 41 subjects with three observations per subject achieves 81% power to detect an intraclass correlation of 0.80 under the alternative hypothesis when the intraclass correlation under the null hypothesis is 0.65 using an F-test with a significance level of 0.05. Therefore, 101 participants were recruited in consideration of a 15% dropout rate.
Raters
Six examiners (2 rehabilitation doctors, 2 experienced physical therapists, and 2 physical therapy students) participated. Three examiners (1 rehabilitation doctor, 1 experienced physical therapist, and 1 physical therapy student) performed ROM measurements using the KRSP. The other 3 examiners used a goniometer for ROM measurement. All examiners reviewed the testing procedures of the KRSP and its use prior to actual testing. To prevent the occurrence of systematic differences between the examiners due to repeated testing, the sequence of the examiners was randomly determined. After a warm-up period and 3 familiarization trials for each measurement, which was the same in both groups, the testing procedure started at the subject’s dominant side. The interval between the measurements was at least 1 hour. Each examiner performed 3 additional measurements for further analysis of correlation. Data for each measurement were recorded by an independent recorder on separate data sheets so that the examiner was unable to view any measurements from previous encounters with each subject.
Procedure
KRSP group
ROM was measured according to change in the pushing force (upper limb: 0 kg, 2 kg of pushing force; lower limb: 0 kg, 4 kg). Zero kg of pushing force means full active ROM. All measurements were conducted according to the KRSP guidelines described in Supplementary Tables S1–S5.
Goniometer group
ROM was measured using a standard goniometer marked in increments of 1°, with 2 adjustable overlapping arms. The measurement position was the same as that used in the KRSP guidelines. Only active ROM was measured because both of the examiner’s hands were already occupied.
Statistical analysis
All statistical analyses were performed using IBM SPSS version 22 (IBM Corp, Armonk, NY, USA). The mean and standard deviation of ROM were calculated. All dependent variables demonstrated normal distribution (Kolmogorov-Smirnov test), and parametric tests were applied.
The independent t-test was used to compare interdevice differences. Relative reliability was assessed using the intraclass correlation coefficient, ICC(2,1), and the corresponding 95% confidence interval (CI). For intra-rater reliability, 3 trials for each measurement were compared. For inter-rater reliability, the mean of the 3 trials was used, and the measurements from the 3 independent examiners were compared to determine whether each instrument (in this case, the inclinometer and the goniometer) can be used with confidence and reliability. Absolute reliability was determined by calculating the root mean square error (RMSE) between measures and between examiners.
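For readers unfamiliar with these statistics, the following is a minimal sketch of how ICC(2,1) and RMSE can be computed. It assumes the standard Shrout and Fleiss definition of ICC(2,1) (two-way random effects, absolute agreement, single measurement) and a plain pairwise RMSE between two series of readings; the study itself used SPSS, and its exact RMSE formulation is not given, so the function names and the example angles below are hypothetical.

```python
from math import sqrt


def icc_2_1(scores):
    """ICC(2,1) from a subjects-by-examiners table.

    `scores` is a list of rows: one row per subject, one column per examiner.
    Uses the two-way ANOVA mean squares (Shrout and Fleiss formulation).
    """
    n = len(scores)        # number of subjects
    k = len(scores[0])     # number of examiners
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)   # between subjects
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)   # between examiners
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_err = ss_total - ss_rows - ss_cols                     # residual

    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_err / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


def rmse(a, b):
    """Root mean square error between two equal-length series of readings."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


# Hypothetical shoulder flexion angles (degrees): 4 subjects x 3 examiners.
angles = [
    [150.0, 152.0, 149.0],
    [165.0, 166.0, 164.0],
    [140.0, 143.0, 141.0],
    [172.0, 170.0, 171.0],
]
inter_rater_icc = icc_2_1(angles)
examiner_a_vs_b = rmse([r[0] for r in angles], [r[1] for r in angles])
```

ICC(2,1) penalizes systematic offsets between examiners as well as random disagreement, which is why it suits a comparison in which any examiner might take the measurement.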
ICC interpretation was based on the guidelines by Fleiss, according to which values >0.90 reflect excellent reliability; values between 0.80 and 0.89, good reliability; values between 0.70 and 0.79, moderate reliability; and values <0.70, low reliability [1]. Acceptable reliability was set a priori at ≥0.70. A p-value of less than 0.05 was considered significant.
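These interpretation bands can be expressed as a small lookup, sketched below. The function names are hypothetical, and because the stated bands leave values between 0.89 and 0.90 unassigned, the code assumes half-open intervals with 0.90 itself counted as excellent.

```python
def interpret_icc(icc):
    """Map an ICC value to the reliability band described in the text.

    Assumption: bands are treated as half-open intervals
    (>=0.90 excellent, >=0.80 good, >=0.70 moderate, otherwise low).
    """
    if icc >= 0.90:
        return "excellent"
    if icc >= 0.80:
        return "good"
    if icc >= 0.70:
        return "moderate"
    return "low"


def is_acceptable(icc, threshold=0.70):
    """Acceptable reliability was set a priori at >=0.70."""
    return icc >= threshold


# Extreme inter-rater ICC(2,1) values reported in the abstract.
bands = {
    "KRSP highest": interpret_icc(0.986),
    "KRSP lowest": interpret_icc(0.846),
    "goniometer highest": interpret_icc(0.643),
    "goniometer lowest": interpret_icc(-0.078),
}
```

Applied to the reported extremes, the KRSP inter-rater values fall in the good to excellent bands, whereas both goniometer extremes fall in the low band and below the a priori acceptability threshold.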
RESULTS
Of the 101 volunteers, 5 were excluded (2 due to surgery, 1 due to limited ROM, 1 due to cognitive impairment, and 1 who failed to undergo the measurement) and another 5 did not complete the measurement; a total of 91 participants were therefore included. Participant characteristics are listed in Table 1. ROM measurements for all 6 examiners are listed in Tables 2 and 3. The average of the measurements taken with the two instruments by different examiners was calculated to investigate any significant differences. The independent t-test showed a statistically significant difference between the groups in all measurements except shoulder extension, abduction, wrist extension, and ankle eversion. The inter- and intra-rater reliability data for the two measurement modalities among all 6 examiners are reported in Tables 4–8.
Intra-rater reliability
Both groups showed good to excellent intra-rater reliability. In the goniometer group, ICC(2,1) (95% CI) ranged from 0.850 (0.775–0.906) for hip flexion found by examiner F to 0.984 (0.975–0.990) for elbow flexion found by examiner F (Table 4). RMSE ranged from 1.483 (hip extension found by examiner F) to 3.268 (hip flexion found by examiner D) (Table 7). In the KRSP group, ICC(2,1) (95% CI) ranged from 0.985 (0.977–0.991) for active hip abduction found by examiner B to 0.999 (0.999–0.999) for passive shoulder flexion found by examiner A (Table 5). RMSE ranged from 0.543 (passive hip external rotation found by examiner A) to 3.268 (active wrist flexion found by examiner A) (Table 7).
Inter-rater reliability
In the goniometer group, the inter-rater reliability was low; ICC(2,1) (95% CI) ranged from -0.078 (-0.296–0.494) for ankle dorsiflexion with knee flexion 0° to 0.643 (0.486–0.783) for hip external rotation (Table 6). Higher RMSE was observed, ranging from 5.650 (shoulder flexion) to 9.342 (shoulder flexion) (Table 8). The KRSP group showed excellent inter-rater reliability; ICC(2,1) (95% CI) ranged from 0.846 (0.686–0.931) for active plantar flexion with 90° knee flexion to 0.986 (0.972–0.994) for active shoulder abduction (Table 6). RMSE ranged from 1.431 (active hip internal rotation) to 2.993 (passive hip external rotation) (Table 8).
DISCUSSION
Accurate and consistent measurement of joint ROM is important for physical examination and disability evaluation. In the current healthcare scenario, accurate medical recording has become critical [9]. The ability of healthcare providers to effectively communicate a subject’s condition or improvement is significant. However, measurement methods are extremely diverse, with no consensus or guidelines regarding measurement tools, joint positions, and pushing force.
Therefore, in this study, we developed an innovative method for measuring joint ROM. A standardized protocol (KRSP) applied to each joint was completed through preliminary experiments and consultation from experts. Our randomized controlled trial involving healthy elderly participants displayed excellent reliability of the new protocol for measuring joint ROM. To our knowledge, this is the first study to perform a reliability evaluation in a randomized controlled trial.
Several factors affect joint ROM measurement. Extrinsic factors include type of measurement (active vs. passive), site, amount of pushing force in passive ROM, movement of adjacent joints or muscles, measurement instrument, and skill level of the examiner. Intrinsic factors include sex, obesity, soft tissue condition (muscle mass, tendon length, and tissue viscoelasticity), and physical activity (exercise and/or occupation) [7,9,12]. While intrinsic factors are difficult to control, extrinsic factors can be controlled using a standardized protocol. In this study, we developed a highly reliable ROM measurement protocol by defining the measurement tools, measurement method, amount of pushing force, and position of pushing force.
There have been several new attempts to accurately evaluate ROM measurements. Some studies have measured joint ROM using an optical motion capture system to obtain more objective and error-free values [13]. This device has excellent accuracy, but it is expensive and difficult to use and apply in clinical settings. More recently, new motion analysis devices such as inertial sensors and Kinect have been introduced [14-17]. However, skilled technicians are needed to operate these devices; more importantly, their reliability has not been proven. Several studies have reported acceptable reliability of ROM measurement using a smartphone application [18-22]. This method is easy to use and inexpensive but cannot control various compensatory movements that occur during joint measurements, which can lead to poor reliability in joints with multiplanar motion.
Regarding the KRSP, we tried to standardize the measurement methods. The measurement position, site of sensor attachment, and site of pushing force were clearly defined. Several straps and molded devices were fabricated to prevent motion of adjacent body parts. During shoulder rotation measurement, a fabricated acrylic panel was set to maintain shoulder abduction at 90°. In the case of the hip joint, we faced many difficulties in controlling the measurement because of the variety of motion directions and variables. During hip flexion evaluation, a pelvic strap was used to prevent lumbar flexion. When evaluating knee flexion, a hip strap was used to prevent the pelvis from lifting up. To evaluate ankle motion at 90° knee flexion, a rectangular acrylic plate with a string was prepared to fix the lower leg.
Another point is that when the inclinometer is attached to the patient’s arm, as was the case in our study, both of the examiner’s hands are free. Therefore, the axis of the patient’s arm can be fixed with one hand, and the other hand can deliver pressure to the arm while measuring passive ROM. This led to excellent intra- and inter-rater reliability of the KRSP.
Several studies have investigated the inter-rater reliability of digital inclinometers for the measurement of joint ROM. Hoving et al. [23] reported ICCs of 0.11 to 0.80 in a study of shoulder joints of 6 healthy volunteers and 6 examiners using a digital inclinometer. A study by de Winter et al. [24] found ICCs of 0.28 to 0.90 for 155 patients with 2 examiners using a similar device. Our study showed high inter-rater ICC(2,1) ranging from 0.846 to 0.986. Regarding the intra-rater reliability of the KRSP, the average ICC(2,1) was 0.995 for the rehabilitation doctor, 0.994 for the physical therapist and 0.995 for the physical therapy student. Interpreting this with the results of inter-rater reliability, the KRSP showed excellent reliability, independent of the operator’s skill level, with high correlation coefficients for the averaged measures of all 3 examiners.
Previous studies have shown various ICCs for goniometers. For shoulder ROM measurement, Hayes et al. [25] reported inter-rater ICCs of 0.64–0.69 and intra-rater ICCs of 0.53–0.65 in standard goniometry. Shin et al. [19] reported slightly higher ICCs in standard goniometer among 3 examiners for both active (ICC range, 0.67–0.91; average ICC, 0.80) and passive (ICC range, 0.64–0.90; average ICC, 0.79) shoulder ROM. For elbow, wrist, knee, and ankle ROM measurement, authors have reported a moderate to high level of intra-rater reliability associated with the universal goniometer [26-31]. Rothstein et al. [26] showed that intra-rater reliability for knee flexion and extension and the elbow joints was high (ICC 0.91–0.99). Inter-rater reliability was also high (ICC 0.88–0.97), except for knee extension (ICC 0.63–0.70). Horger [30] found that measurement of wrist motion using a goniometer was highly reliable and that intra-rater reliability was higher than inter-rater reliability for all active and passive movements. In a study by Youdas et al. [28] focusing on the reliability of goniometric measurement of ankle ROM, in terms of the intra-rater reliability of measurements obtained with a goniometer, the ICC was 0.64 to 0.92 (median, 0.825) for ankle dorsiflexion and 0.47 to 0.96 (median, 0.865) for ankle plantarflexion. Inter-rater ICCs for measurements were 0.28 for ankle dorsiflexion and 0.25 for ankle plantarflexion. For the hip, a range of ICCs were reported. Herrero et al. [32] reported excellent (>0.80) intra-rater reliability and low (0.375 and 0.475) inter-rater reliability. Poulsen et al. [33] found moderate inter-rater reliability for goniometric hip ROM measurement.
Compared with previous studies, our analysis incorporated more examiners with varying skills and a larger sample size, further strengthening our results. The results of the studies by Youdas et al. [28] and Herrero et al. [32] are in agreement with those of our study regarding acceptable intra-rater reliability in repeated measures of joint ROM, independent of the skill level of the operator, and low inter-rater reliability. However, we found that goniometer use had higher intra-rater reliability and poorer inter-rater reliability. The high intra-rater reliability may be because the examiners remembered their measurements, as the interval between measurements was not long enough. Regarding the low inter-rater reliability, the potential sources of error include location of the axis of motion, inconsistent patient positioning and goniometric alignment, inability or disinclination of the patient to accurately reproduce active joint motion, and inconsistent identification or alteration of bony landmarks used to guide measurement [29].
Joint ROM measurement using a conventional goniometer should be made by the same examiner because considerable disagreement exists when measurements are made by different examiners. This disagreement could erroneously influence the clinical decision-making process. Owing to the lack of reliability of the conventional goniometer currently in use, clinicians should be cautious when interpreting results, especially those obtained by different examiners.
This study had some limitations. First, this study was performed with elderly participants. It is possible that more significant age-related effects would have become apparent if more participants were measured or if the cohort was younger. Second, this protocol is limited to the healthy participants studied, so the results cannot be extrapolated to injured populations. Third, the intra-rater reliability was relatively high. This may be because the 3 sequential measurements were made at the same time in a day and the examiner memorized the measured values. If ROM were evaluated at longer time intervals, the intra-rater reliability might be lower, especially for measurements made according to a non-standardized protocol.
The newly developed, highly reliable measurement method can be used as a reference standard for ROM measurement in clinical settings. Moreover, standardized and consistent evaluations of ROM would be possible, regardless of the examiner. Further studies on simplification as well as computerization of the developed protocol will make it more useful for application in clinical settings.
To conclude, ROM measurement using the goniometer currently in use showed poor inter-rater reliability and therefore cannot be used as an objective physical evaluation. ROM measurements according to the newly developed KRSP showed excellent inter-rater and intra-rater reliability. These results indicate that the KRSP can be the reference standard protocol for measuring ROM in the clinical setting as an alternative to a goniometer. Further research is required to enable better clinical application of this protocol.
Notes
No potential conflict of interest relevant to this article was reported.
Conceptualization: Ko H, Ahn SY. Methodology: Ahn SY, Ko H, Yoon JO, Cho SU, Park JH. Formal analysis: Ahn SY. Writing – original draft: Ahn SY. Writing – review and editing: Cho KH. Approval of final manuscript: all authors.
Acknowledgements
This research was supported by “The Foundation Assist Project of Future Advanced User Convenience Service” through the Ministry of Trade, Industry and Energy (MOTIE) (R0004840, 2019).
SUPPLEMENTARY MATERIALS
Supplementary materials can be found via https://doi.org/10.5535/arm.2019.43.6.707. Table S1. Measurement position. Table S2. Inclinometer measurement: sensor attachment, site of applying pushing force (upper limb). Table S3. Inclinometer measurement: sensor attachment, site of applying pushing force (lower limb). Table S4. Measurement protocol for upper limb range of motion. Table S5. Measurement protocol for lower limb range of motion.