Developing a testing battery for measuring dogs’ stifle functionality: the Finnish Canine Stifle Index (FCSI)
  1. Heli K Hyytiäinen1,
  2. Sari H Mölsä1,
  3. Jouni J T Junnila2,
  4. Outi M Laitinen-Vapaavuori1 and
  5. Anna K Hielm-Björkman1
  1. 1 Department of Equine and Small Animal Medicine, Faculty of Veterinary Medicine, University of Helsinki, Helsinki, Finland
  2. 2 Oy 4Pharma Ltd, Helsinki, Finland
  1. E-mail for correspondence; heli.hyytiainen{at}


This study aimed to develop a quantitative testing battery for dogs’ stifle functionality, as, unlike in human medicine, none is currently available in the veterinary field. Forty-three dogs with surgically treated unilateral cranial cruciate ligament rupture and 21 dogs with no known musculoskeletal problems were included. Eight previously studied tests (compensation in sitting and lying positions, symmetry of thrust in hindlimbs when rising from lying and sitting, static weight bearing, stifle flexion and extension and muscle mass symmetry) were summed into the Finnish Canine Stifle Index (FCSI). Sensitivities and specificities of the dichotomised FCSI score were calculated against orthopaedic examination, radiological and force platform analysis and a conclusive assessment (a combination of the previous three). One-way analysis of variance (ANOVA) was used to evaluate FCSI score differences between the groups. Cronbach’s alpha for internal consistency was calculated. The range of the index score was 0–263, with a proposed cut-off value of 60 between ‘adequate’ and ‘compromised’ functional performance. In comparison to the conclusive assessment, the sensitivity and specificity of the FCSI were 90 per cent and 90.5 per cent, respectively. Cronbach’s alpha for internal reliability of the FCSI score was 0.727. The estimated FCSI scores of the surgically treated and control dogs were 105 (95 per cent CI 93 to 116) and 20 (95 per cent CI 4 to 37), respectively. The difference between the groups was significant (P<0.001).

  • physiotherapy
  • orthopaedics
  • cruciate ligament
  • dogs
  • rehabilitation
  • lameness

This is an Open Access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See:


Cranial cruciate ligament rupture and subsequent osteoarthritis are two of the most common orthopaedic problems in dogs.1–4 In cranial cruciate ligament disease, physiotherapy is currently considered part of the treatment entirety, and it has been shown to be beneficial for surgically treated patients.1 5 Several approaches for the rehabilitation of these patients have been suggested, including hydrotherapy, electrotherapy modalities, manual therapies such as massage and passive range of motion exercises as well as active therapeutic exercises.1 2 5–8 Rehabilitation performed by a physiotherapist is based on clinical reasoning and a physiotherapy process,9 rather than on a fixed ‘recipe’-type of protocol. The physiotherapist collects and processes information, reaches an understanding of the problem and devises a plan; interventions are carried out, outcomes evaluated and the whole process is reviewed at the end of treatment.10 The therapy is based on constant re-evaluation of the patient, and the therapy plan may change accordingly. The main focus during physiotherapy is the functionality of the patient,11 which can be described as the dog’s ability to perform the activities of daily living, such as position changes (sit, lie down, walk) and performing around its normal habitat (thresholds, stairs, various ground surfaces).

In humans, several testing batteries have been developed and used to assess the progress and outcome of rehabilitation after anterior cruciate ligament reconstruction.12–18 This is especially important in sports medicine, where anterior cruciate ligament injury is most common and the therapist must be able to assess when the patient is able to return to sports.19 Small animal orthopaedics lacks specific testing batteries for assessing the functional outcome of stifle rehabilitation. Thus far, only individual methods have been used to evaluate the status of the patient. These include the use of a goniometer to measure the passive range of motion of dogs’ joints,20 a tape measure to measure dogs’ hindlimb circumference,21 22 bathroom scales to evaluate the static weight bearing (SWB) between hindlimbs23 and advanced equipment such as a force platform to measure ground reaction forces and temporospatial values.24 By combining some of the methods into a testing battery, the overall functional performance level of the dog could be better evaluated.

Recently, individual evaluation methods sensitive to stifle problems were introduced and ranked.25 The ranked methods in order from first to last were: evaluation of hindlimb muscle atrophy and sitting position, measurement of SWB and stifle flexion, evaluation of lying position and thrust from sitting, measurement of stifle extension, manual assessment of SWB, evaluation of thrust from lying position, assessment of movement in stairs and of diagonal movement, measurement of tarsus extension and flexion and visual evaluation of lameness. The primary objective of the present study was to develop an indexed testing battery to be used in dogs with stifle problems by combining and scoring the evaluation methods that have previously been shown to be best in detecting stifle dysfunction. This testing battery could then be used to quantify the level of a dog’s stifle dysfunction. The secondary objective was to test the sensitivity and specificity as well as the internal reliability of the completed testing battery. In addition, a cut-off value between ‘adequate’ and ‘compromised’ performance level was set for the testing battery’s total score.

Materials and methods

Written consent was obtained from the owners. Inclusion criteria for the study group were a body mass over 17 kg and a unilateral cranial cruciate ligament deficiency that had been surgically treated at least one year before the study took place.

Exclusion criteria were bilaterally treated cranial cruciate ligament rupture, other known concomitant stifle problems (ie, patellar luxation, septic or immune-mediated arthritis) or owner-reported orthopaedic or neurological problems.

Inclusion criteria for the control group dogs were an age of one to eight years, no known orthopaedic problems, no lameness or signs of orthopaedic disease on orthopaedic examination and radiographic screening results that were free of hip dysplasia according to the Fédération Cynologique Internationale screening protocol (grade A or B).

Items of testing battery

Eight items, reported to be the best and most sensitive of 14 previously studied physiotherapeutic stifle evaluation methods,25 were chosen in order to develop a new testing battery: the Finnish Canine Stifle Index (FCSI). Compensations in sitting and lying positions and symmetry of thrust of hindlimbs in getting up from sitting and lying positions were used as active items. Manual assessment of symmetry of muscles, measurement of symmetry in SWB between hindlimbs using bathroom scales and measurement of stifle passive range of motion (PROM) (flexion and extension) with a universal goniometer25 were used as passive items. Active items were those in which the dog performed the task itself, and passive items were those in which the examiner assessed the factors without the dog’s contribution. Testing was performed on both hindlimbs. All evaluations were done by one tester (HH). The protocols of each item/task are described in detail in online supplementary appendix 1.

Active items

In the active items, the quality of the performance and the number of compensations detected were the determining factors. Sitting position was scored using an ordinal scale of 0 to 3. Possible compensations detected were decreased flexion in the stifle and/or tarsus, external rotation of the limb, abduction of the limb, sitting on either hip or other severe compensations (any type of deviation from normal symmetrical positions of an animal due to pain or mechanical restrictions in the musculoskeletal system). In any of the listed positions, any noticeable difference to the contralateral limb’s position was considered a compensatory position. Scoring was executed as follows: 0=no compensations detected, 1=one of the above-mentioned compensations detected, 2=two of the above-mentioned compensations detected and 3=three or more of the above-mentioned compensations detected. Scoring of the lying position was done in a similar manner. In these two items (sitting and lying position), both limbs were evaluated individually.

Possible differences in the balance and amount of push between hindlimbs when getting up to standing were visually observed and named ‘symmetry of the thrust’. Symmetry of the thrust in hindlimbs from both sitting and lying positions was scored with one of two nominal variables: 0=adequate, 2=weaker thrust. The amount of muscle mass between hindlimbs, including biceps, hamstring and quadriceps mass, was assessed by manually palpating the thigh circumference, and the findings were named ‘symmetry of muscle mass’. Symmetry of muscle mass was scored in a similar fashion: 0=adequate, 2=less muscle mass. These items compared the weaker hindlimb with the better one, and only the weaker limb was assigned a numerical result, while the stronger limb was marked as ‘adequate’. In case of asymmetry, the side of the weaker thrust and the smaller muscle mass was scored as 2 and the contralateral limb as 0.
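The scoring rules for the active items and for muscle mass symmetry can be sketched in a few lines of Python. This is only an illustration of the rules as described above; the function names and the compensation labels are hypothetical and do not come from the study protocol.

```python
def compensation_score(compensations):
    """Ordinal 0-3 score for a sitting or lying position.

    `compensations` is a list of the compensations observed for one limb
    (labels are hypothetical). Three or more compensations all score 3.
    """
    return min(len(set(compensations)), 3)


def asymmetry_score(is_weaker_limb):
    """Nominal score for thrust or muscle mass symmetry.

    Only the weaker limb receives 2; the better (or symmetrical) limb
    is marked 'adequate', ie, 0.
    """
    return 2 if is_weaker_limb else 0


# Example: one limb showing two compensations in sitting
print(compensation_score(["external rotation", "abduction"]))
```

Counting distinct compensations and capping the count at 3 mirrors the ordinal scale, while the asymmetry items take only the values 0 and 2.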

Passive items

Symmetry of hindlimb SWB was scored from 0 to 3 based on the percentage difference of SWB between hindlimbs proportional to the body weight of the dog. The cut-off values for the scoring of the SWB were calculated based on the average percentage difference between hindlimbs in the clinically healthy group ((|SWB (left limb)−SWB (right limb)|/dog’s weight)×100). A normal difference was determined to be 3.3 per cent±2.7 per cent (mean±sd),23 and by successively adding sds to the mean, the measured differences were divided into four categories: symmetry index between 0 per cent and 6 per cent=0, between 6 per cent and 8.7 per cent=1, between 8.7 per cent and 11.4 per cent=2 and greater than 11.4 per cent=3. Thereby, SWB also compared the weaker hindlimb with the better one, and only the weaker limb was assigned a numerical result, while the stronger limb was marked as ‘adequate’.
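As a minimal sketch, the SWB symmetry index and its categorisation could be computed as follows. The function names are hypothetical, and the handling of values falling exactly on a category limit (assigned here to the lower category) is an assumption, as the text does not specify it.

```python
def swb_symmetry_index(left_kg, right_kg, body_weight_kg):
    """Percentage difference in static weight bearing between hindlimbs,
    relative to body weight: (|left - right| / weight) x 100."""
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    return abs(left_kg - right_kg) / body_weight_kg * 100.0


def swb_score(index_pct):
    """Score 0-3 from the symmetry index.

    Limits follow the text: mean 3.3 plus one, two and three sds of 2.7,
    ie, 6.0, 8.7 and 11.4 per cent. Boundary values go to the lower
    category (an assumption).
    """
    if index_pct <= 6.0:
        return 0
    if index_pct <= 8.7:
        return 1
    if index_pct <= 11.4:
        return 2
    return 3


# Hypothetical dog of 30 kg loading 8.0 kg and 5.0 kg on its hindlimbs
index = swb_symmetry_index(8.0, 5.0, 30.0)
print(round(index, 1), swb_score(index))
```

In this made-up example the difference of 3 kg corresponds to 10 per cent of body weight, which falls into category 2.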

PROM of the stifle joint flexion and extension were also scored from 0 to 3 based on the results of our clinically healthy dogs according to the limits presented in online supplementary appendix 2. Recorded values were divided into four categories (0–3), where 0 was given to dogs within the normal variation, 1 if the difference was within 2sds, 2 if the difference was within 3sds and 3 if the difference was above 3sds (see online supplementary appendix 2). Thus, flexion was scored: less than 51.7°=0, 51.7°−57.9°=1, 57.9°−64.2°=2 and over 64.2°=3. Extension in turn was scored: over 147.7°=0, 140.5°−147.7°=1, 133.3°−140.5°=2 and less than 133.3°=3. Thereby, the PROM evaluated both limbs individually.
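The PROM limits given above translate directly into two small lookup functions. The sketch below uses hypothetical function names and assumes that angles falling exactly on a limit belong to the category whose range lists that value first; the text does not state how ties are handled.

```python
def flexion_score(angle_deg):
    """Score 0-3 for stifle flexion; a larger angle means less flexion."""
    if angle_deg < 51.7:
        return 0
    if angle_deg <= 57.9:
        return 1
    if angle_deg <= 64.2:
        return 2
    return 3


def extension_score(angle_deg):
    """Score 0-3 for stifle extension; a smaller angle means less extension."""
    if angle_deg > 147.7:
        return 0
    if angle_deg >= 140.5:
        return 1
    if angle_deg >= 133.3:
        return 2
    return 3


# Hypothetical goniometer readings for one limb
print(flexion_score(55.0), extension_score(150.0))
```

Each limb is scored separately, so the two functions would be called once per hindlimb.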

Final synopsis of scoring

All of the above-mentioned scores were summed for each limb separately, resulting in a total score of 0–21 per limb. For some of the items, both limbs received a score, although the test focuses on scoring primarily only one hindlimb, the weaker one. As all eight items might not be performed by each individual due to early stages of rehabilitation-related limitations or merely due to poor cooperation, the final sum of scores should be divided by the number of evaluations done and multiplied by 100 (sum of (item1+item2+… item8)/number of evaluations conducted×100), resulting in a total score from 0 to 262.5, rounded up to 263. As a consequence, a dog that performs poorly in some of the items and fails to perform all eight items would receive a higher total score than a dog performing poorly on the same items, but performing all other items with no weaknesses. The number of evaluations included, if other than eight, is given after the index in parentheses.
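The total score formula can be illustrated with a short Python sketch. The item names are hypothetical labels for the eight items, and marking an item that was not performed with `None` is an assumption made for this example only.

```python
# Maximum score per item as described in the text (3+3+2+2+2+3+3+3 = 21)
MAX_ITEM_SCORE = {
    "sitting_position": 3,
    "lying_position": 3,
    "thrust_from_sitting": 2,
    "thrust_from_lying": 2,
    "muscle_mass_symmetry": 2,
    "static_weight_bearing": 3,
    "stifle_flexion": 3,
    "stifle_extension": 3,
}


def fcsi_total(item_scores):
    """Sum of item scores / number of items performed x 100.

    `item_scores` maps item name to score; None means 'not performed'.
    Returns the unrounded total and the number of items used.
    """
    performed = {k: v for k, v in item_scores.items() if v is not None}
    if not performed:
        raise ValueError("at least one item must be performed")
    for name, score in performed.items():
        if not 0 <= score <= MAX_ITEM_SCORE[name]:
            raise ValueError(f"score out of range for {name}")
    total = sum(performed.values()) / len(performed) * 100
    return total, len(performed)


# Worst possible limb: 21 / 8 x 100 = 262.5
print(fcsi_total(MAX_ITEM_SCORE))

# Hypothetical dog with faults summing to 6 on six performed items
partial = {"sitting_position": 2, "lying_position": 1, "thrust_from_sitting": 2,
           "thrust_from_lying": 0, "muscle_mass_symmetry": 0,
           "static_weight_bearing": 1, "stifle_flexion": None,
           "stifle_extension": None}
print(fcsi_total(partial))
```

With the same faults, the dog scores 100 when only six items are performed but would score 75 if the two remaining items were performed without faults, which is the divisor effect described above.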

Statistical analysis

The difference between the surgically treated dogs and the clinically healthy group in FCSI score was analysed with one-way analysis of variance (ANOVA). The internal consistency between the eight evaluation methods of the FCSI score was evaluated using Cronbach’s alpha. To define a cut-off value between adequate and compromised performance, a receiver operating characteristic curve (ROC) analysis was done.
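For readers unfamiliar with Cronbach’s alpha, the following sketch shows the standard formula, alpha = k/(k−1) × (1 − sum of item variances / variance of the total score), in plain Python. It is not the SAS or R code used in the study, and the small data set is made up purely to demonstrate the calculation.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a table of item scores.

    `rows` holds one list per dog, each with the same number of items (k).
    Raises ZeroDivisionError if all dogs have the same total score.
    """
    k = len(rows[0])
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least two items and two dogs")
    if any(len(row) != k for row in rows):
        raise ValueError("all rows must have the same number of items")
    item_variances = [variance([row[i] for row in rows]) for i in range(k)]
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Made-up scores for three dogs on two items
print(cronbach_alpha([[1, 2], [2, 1], [3, 3]]))
```

Values closer to 1 indicate that the items vary together across dogs, which is what the internal consistency analysis of the eight FCSI items examines.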

To evaluate the sensitivity and specificity of the testing battery, each dog’s FCSI score was compared with the results from four commonly used clinical evaluation methods applied to the same cohort of dogs. The four methods have been reported previously in detail by Hyytiäinen and others.23–25 First, an orthopaedic examination was performed by an experienced surgeon, including palpation of limbs and spine, lameness evaluation and evaluation of conscious proprioception and withdrawal reflex. Second, a radiological examination of osteoarthritic (OA) changes was done via mediolateral and craniocaudal views of the stifles and an extended ventrodorsal view of the hip joints. Third, a force platform analysis of five valid runs over the force plate, measuring the peak vertical force and vertical impulse, was performed. Fourth, a conclusive assessment, that is, the clinical decision based on the combined results of the three aforementioned evaluations, was used. All statistical analyses were done using SAS System for Windows, V.9.3 (SAS Institute, Cary, North Carolina, USA) and R for Windows, V.3.1.0 (R Foundation for Statistical Computing, Vienna, Austria).

Results


Forty-three surgically treated dogs were included in the study. Surgical treatment techniques included intracapsular (n=20), extracapsular (n=7) and osteotomy (n=16) techniques, the last comprising tibial plateau leveling osteotomy (n=9) and tibial tuberosity advancement (n=7). All dogs had osteoarthritic findings in the surgically treated joint, and 11 dogs in both stifle joints, confirmed with orthopaedic and radiographic examination at the time of the study. Nineteen of the included dogs were males and 24 were females, with a mean age and body weight (±sd) of 7.0±2.5 years and 37.6±9.4 kg, respectively. There were 15 labrador retrievers, 6 Rottweilers and 22 other medium-sized to large-sized breed dogs.

Twenty-one clinically healthy adult dogs from another study conducted simultaneously25 were used as controls and to set a range for scoring the SWB and PROM of a normal stifle joint. The control dogs included 7 males and 14 females. Their mean age and body weight (±sd) were 3.2±1.6 years and 35.5±8.3 kg, respectively. Twelve of the dogs were labrador retrievers and 9 were Rottweilers.

From a possible FCSI score range of 0–263, the mean (±sd) for surgically treated dogs was 105 (±43) (95 per cent CI 93 to 116) and for clinically healthy dogs 20 (±27) (95 per cent CI 4 to 37), with a significant (P<0.001) difference existing between the two groups (Table 1). When the final scores of the FCSI were dichotomised into ‘adequate’ or ‘compromised’ performance levels, a cut-off value of 60, based on the results of a ROC analysis, was set (Fig 1). Cronbach’s alpha for the internal reliability of the total FCSI score was 0.727.

FIG 1:

Receiver operating characteristic curve (ROC) analysis generating a cut-off value for the Finnish Canine Stifle Index (FCSI) testing battery.


TABLE 1:

Descriptive statistics and frequencies of the various testing methods

Sensitivity and specificity of the dichotomised FCSI score when compared with the conclusive assessment were 90 per cent and 90.5 per cent, with orthopaedic examination 88.4 per cent and 90.5 per cent, with vertical impulse 76.2 per cent and 45 per cent, and with peak vertical force 75 per cent and 46 per cent, respectively. Sensitivity compared with stifle radiographs and with stifle and hip radiographs together was 87.8 per cent. Specificity compared with radiographs could not be calculated, as the clinically healthy dogs were not radiographed. Both groups of dogs’ scores are presented in Fig 2.
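The calculation behind these figures can be sketched as follows. The example treats scores above the cut-off of 60 as ‘compromised’, which is an assumption about how a score of exactly 60 is handled, and the scores and reference results shown are invented for illustration only; they are not study data.

```python
def sensitivity_specificity(scores, reference_positive, cutoff=60.0):
    """Compare dichotomised FCSI scores with a reference method.

    scores: FCSI total score per dog.
    reference_positive: True where the reference method found a problem.
    Returns (sensitivity, specificity); a value is None when it cannot be
    calculated, eg, specificity when no dog is reference-negative.
    """
    tp = fn = tn = fp = 0
    for score, diseased in zip(scores, reference_positive):
        compromised = score > cutoff  # assumption: strictly above cut-off
        if diseased and compromised:
            tp += 1
        elif diseased:
            fn += 1
        elif compromised:
            fp += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else None
    specificity = tn / (tn + fp) if (tn + fp) else None
    return sensitivity, specificity


# Invented example: four dogs, three positive on the reference method
print(sensitivity_specificity([105, 20, 75, 40], [True, False, True, True]))
```

Returning `None` when there are no reference-negative dogs reflects the situation with the radiographs, for which specificity could not be calculated.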

FIG 2:

The Finnish Canine Stifle Index (FCSI) results of dogs with surgically treated cranial cruciate ligament deficiency (surgically treated) and of dogs with no known musculoskeletal problems (control). The limit of 60 between the groups of surgically treated and clinically healthy dogs yields a high sensitivity and adequate specificity.

Discussion


The principal aim of this study was to generate a testing battery with a numerical index. The total numerical score assigns a specific numerical value to the change in a patient’s function when the testing battery is used as a follow-up tool, for example, after surgery. Moreover, the range of the total score is classified into two categories, describing the level of performance as ‘adequate’ or ‘compromised’. This classification is helpful in clinical situations when estimating the need for further rehabilitation.

In this paper, the range of the index score was calculated to be 0–263. The differentiation between dogs with surgically treated cranial cruciate ligaments and osteoarthritic changes in their stifles and dogs with no known musculoskeletal problems based on FCSI results was explicit (Fig 2). The cut-off value between ‘adequate’ and ‘compromised’ was set at 60 based on the division of the FCSI results between dogs (Fig 2) and on the ROC analysis (Fig 1). Although the ROC analysis suggests that the optimal cut-off value would have been 68.75 or 56.25, both are impractical in use. As the value of 60 has equally good sensitivity and specificity, the authors suggest using it. To further study the nature of the index, Cronbach’s alpha was measured for all items of the testing battery. The result of 0.727 is moderate because the items are derived from two different components, active and passive.26

In the measurements of SWB and PROM, the scoring scale was based on the results from clinically healthy dogs. The clinically healthy dogs’ extension of the stifle was within the range of values of normal dogs published in earlier studies, whereas the flexion joint angle the authors report was slightly larger (less flexion). This may be due to the fact that in our study two breeds, labrador retrievers and Rottweilers (PROM mean 51.7°−147.7°), were included, whereas previous publications have reported on only one breed at a time: labrador retrievers (mean 42°−162°), Greyhounds (mean 50.60°−144.72°) and German Shepherd dogs (33°−153°).20 27 28 If the results of previous studies are looked at in light of the FCSI’s PROM scoring, all healthy individuals of other breeds, except Greyhounds, would get a PROM score of 0 in both flexion (less than 51.7°) and extension (over 147.7°). Healthy Greyhounds would get a score of 1 (=140.5°−147.7°) in stifle extension (online supplementary appendix 2). As the sd (8.86°) and the extension range (127.50°−161.50°) reported for Greyhounds are rather wide,28 there is no reason to doubt the applicability of the limits presented in our study. However, although the limits presented here are mainly supported by previously published information,25 it should be noted that breed differences may have some effect on the definition of normal stifle PROM, and further studies with other breeds are warranted.

To test the sensitivity and specificity, and thereby the criterion validity of the FCSI, orthopaedic, radiological and conclusive assessments were used,25 as they were considered to evaluate nearly the same aspects as the testing battery. In addition, they are the most common methods applied by veterinarians to evaluate the status of dogs with stifle problems. Both sensitivity and specificity of the FCSI were the weakest when compared with the results of the force platform analysis. With vertical impulse, the sensitivity and specificity were 76.2 per cent and 45 per cent, and with peak vertical force 75 per cent and 46 per cent, respectively. In comparison, with all other methods (orthopaedic, radiological and conclusive assessment) both sensitivity and specificity were above 87 per cent. The reason for the difference in sensitivity and specificity between the methods is unknown to us. An explanation may lie in the fact that these methods do not measure the same thing. The force platform in this case was used to measure dynamic peak vertical force and impulse in trot, but none of the items in the FCSI is directed at measuring movement or forces generated through the limbs during trotting. Moreover, although one might question the use of a gait metric, as no gait metric is used in the FCSI itself, the ones used here are considered to be the gold standards in lameness evaluation, and so far in evaluation of functionality. They have been used for lack of better comparisons. Nevertheless, it is important to recognise that the FCSI merely offers one more view regarding the functionality of the dog, and its use should always be supported by other, already recognised objective measurement tools. High sensitivity can be considered desirable when developing a testing battery, as detecting a diseased animal is of importance. There is no harm done if a false positive (ie, a dog with no problem with its stifle, but with a higher FCSI score) is more thoroughly evaluated, but if a false negative (ie, a dog with problems in its stifle, but with a low score) is left unnoticed, the consequences to that individual might be more harmful in the long run. In addition, with higher sensitivity than specificity, the outcome evaluation is not overly positive, but rather more stringent.

In the formula used to derive the total score of the FCSI, the divisor is the number of items performed. In a dog that does not perform all eight items and has faults in the items it does perform, the significance of the divisor is emphasised. Thus, the dog will get a somewhat higher total score than it would if it performed all eight items and showed no faults in the additional ones. In other words, if the dog performs only six items, the omission of the two tasks reduces the divisor from eight to six, and any faults within those six performed items have a greater influence on the total score. Thus, the test neither disregards poorly performing dogs nor penalises healthy dogs on items they fail to perform, although this may cause some error in the test results. On the other hand, if the dog’s test result indicates a stifle problem in some of the items, the missed items would more likely also have faults rather than a perfect performance. The formula has been designed to minimise the bias due to missing item results, although some level of bias due to absent information can never be totally excluded.

It is worth mentioning that the equipment needed when using the testing battery, a universal goniometer and two bathroom scales, is affordable and accessible to all clinicians, both veterinarians and physiotherapists. Furthermore, by performing the measurements in the same manner repeatedly, as described in the instructions of the testing battery (available at, the use of the FCSI itself is easy to standardise.

Despite surgical treatment of cranial cruciate ligament deficiency, progression of osteoarthritic changes in the surgically treated stifles is common.29 30 This is further supported by our findings that all of the dogs in our study group had osteoarthritic changes and at least some dysfunction in their surgically treated stifles. Although the inclusion criterion for this study was unilateral surgically treated cranial cruciate ligament injury, orthopaedic and radiographic evaluations revealed or confirmed that many dogs had osteoarthritic changes bilaterally in their stifle joints. This is in accordance with the previous literature, where 40 per cent–50 per cent of dogs with cranial cruciate ligament deficiency have bilaterally diseased stifles.31 32 Therefore, having dogs with bilateral stifle problems in our study group corresponds well with the clinical situation.

The FCSI has been developed for unilateral stifle dysfunction, to limit subjectivity in treatment outcome assessment. As some of the items in the testing battery give a score to both limbs, and as, for example, cranial cruciate ligament disease is often bilateral, both limbs may receive a score. However, only the more dysfunctional limb receives a full total score. In the case of a dog with a bilateral problem, the therapist is aware of the situation and can put the results of the test into context. However, it is important to note that the FCSI has not been designed to be a diagnostic test, but a measure for change in outcome.

The major limitation of our study was the small number of participating dogs. In addition, problems in the front limbs and trunk were not considered. Even though the battery is solely targeted to the hindlimbs, an orthopaedic and physiotherapeutic evaluation should always include evaluation of the patient as a whole. The FCSI was tested in dogs weighing between 17.5 and 60.0 kg, and the testing battery’s usability for small breed dogs warrants additional studies. Moreover, the stifles of the clinically healthy dogs were not radiographed, and other possible orthopaedic issues that all dogs included in the study may have had, but of which the authors were unaware, may have affected the results of the FCSI. While this is considered a limitation of the study, it again mimics reality at veterinary practices, as dogs potentially have issues beyond the one being examined. Also, the surgical method used to treat cranial cruciate ligament deficiency may affect the outcome of the treatment.33 Several different surgical methods had been used to treat the study group dogs’ cranial cruciate ligament deficiencies. Although this may have caused differences in the outcome of the treatment in these dogs, the differences would not affect the conclusions of this study. Our aim was to study the testing battery with a study group resembling a clinical population as closely as possible. Thus, the individual treatment method was not of interest here, but the actual functional outcome in each individual was.

In conclusion, a testing battery, the FCSI, for assessment of stifle dysfunction was generated. The FCSI could provide veterinarians and animal physiotherapists with an affordable clinical outcome measure, a tool that will also aid communication between physiotherapists and veterinarians. The quantitative result of the test is informative and can be used to evaluate outcome after surgical treatment and to indicate the need for further physiotherapy.


  • Funding This research received no specific grant from any funding agency in the public, commercial or not-for-profit sectors.

  • Competing interests None declared.

  • Ethics approval The study protocol was approved by the University of Helsinki Ethics Review Board at Viikki Campus.

  • Provenance and peer review Not commissioned; externally peer reviewed.
