Spine Segmental Assessment

Welcome to the Evidence in Motion Orthopaedic Manual Physical Therapy Fellowship Project. This space was created by and for the fellows in the Evidence in Motion Fellowship program. Please do not edit unless you are involved in this project, but please come back in the near future to check out new information!

Original Editors - Mark Shepherd, Matt Haberl and Cheryl Sparks

Top Contributors - Mark Shepherd  


Reliability of the segmental palpation of the lumbar spine. This will include assessment of the lumbar spine and look at reliability for reproduction of stiffness, mobility, and pain. Provide summaries of articles that support or refute the use of lumbar segmental palpation. Create this page as a succinct and highly referenced position statement on the use of segmental palpation of the lumbar spine.


Current research has repeatedly demonstrated the benefit of manual therapy in addition to standard conservative care for improving patient outcomes and reducing pain in patients with low back pain.[1][2][3][4] Clinical decision making associated with manual therapy often relies on the manual assessment of the spine. Questionable levels of reliability in the manual examination, however, have led some to question the utility of the manual assessment.[5] The mechanisms behind the effects of manual therapy are not well understood, and there is a poor relationship between findings from the manual lumbar assessment and actual pathology.[6][7][8][9][5] Therefore, the purpose of this review is to summarize the current literature on the reliability of the manual segmental examination of the lumbar spine and how it informs the management of patients with low back pain.

Search Strategy

The PubMed MEDLINE database was searched for the period of 1980-2010 in the English language using keywords: motion palpation, reliability AND palpation, reliability AND spinal motion palpation, intervertebral motion, and validity AND motion palpation. The PubMed MEDLINE search engine was utilized using a basic search template. Selection of this database was determined to allow for the inclusion of multidisciplinary research articles that included physical therapy, chiropractic medicine, and osteopathic medicine. Articles were then cross-referenced from identified systematic reviews and published research articles’ references to complete the manual search. Research articles and systematic reviews have been included in this review if they reported on research related to evidence for identifying segmental mobility or pain provocation in the lumbar spine.

Evidence for Identifying Stiffness with the Manual Examination

Clinicians treating low back pain (LBP) use specific assessments, such as posterior to anterior (PA) motion testing and judgments of hypermobility and hypomobility, to determine an appropriate treatment. For example, the clinical prediction rule for identifying patients with LBP who may benefit from lumbar thrust manipulation includes segmental mobility testing; specifically, the rule requires at least one hypomobile segment in the lumbar spine.[1][2] Abbott et al[7] in 2009 found that 98% of therapists surveyed base patient treatment choices at least partly on lumbar segmental motion assessment findings. Landel and colleagues[8] provide additional information regarding the intertester reliability of the PA examination in assessing intersegmental lumbar spine motion. The authors measured spinal mobility by a “PA force to a single vertebral spinous process in the prone position and judged as hypermobile, normal, and hypomobile” and defined lumbar segmental motion as “the difference between the intervertebral angles measured from the resting and the end-range images.”[8] Twenty-nine subjects under 45 years of age with complaints of non-specific low back pain were included in the data set. Grade IV PA forces, moving cranially from L5 to L1, were applied to each vertebral level. Dynamic interactive MRI was used to quantify lumbar segmental motion. The study found that intertester reliability among physical therapists for identifying the least mobile segment (hypomobility) was good: ICC = .69 (95% CI .53 to .82), agreement 82.8%.[8] Intertester reliability for identifying the most mobile segment (hypermobility) was poor: kappa = .29 (95% CI -.13 to .71), agreement 79.3%. Furthermore, validity against the imaging findings was poor for both judgments: for the most mobile segment, kappa = .04 (95% CI -.16 to .24), agreement 24.1%; for the least mobile segment, kappa = .00 (95% CI -.09 to .08), agreement 27.6%.[8]

Huijbregts[9], in 2002, performed a systematic review of reliability studies for spinal motion palpation. The article notes that few statistical conclusions can be drawn because many results are reported as percent agreement rather than kappa values and therefore do not correct for agreement expected by chance. The author concludes: “intrarater agreement varies from less than chance to generally moderate or substantial agreement. Interrater agreement only rarely exceeds poor to fair agreement. Rating scales measuring absence versus presence or magnitude of pain response yield higher agreement values than mobility rating scales.” [9](pg. 36) Huijbregts goes on to suggest that one possible explanation for higher intrarater than interrater reliability is that “raters might correctly identify the presence of the same segmental motion abnormality, but incorrectly name the segmental level at which this abnormality was found. Raters can be expected to be very consistent in their (in)correct identification of the spinal level; this will result in higher intra- rather than interrater reliability.”[9] (pg. 36)
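The difference between percent agreement and a chance-corrected kappa can be illustrated with a short calculation. The following is a minimal sketch of Cohen's kappa for two raters; the function names and the ratings are hypothetical and are not taken from any of the studies reviewed.

```python
from collections import Counter


def percent_agreement(rater_a, rater_b):
    """Fraction of items on which the two raters gave the same rating."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa: (p_o - p_e) / (1 - p_e).

    p_o is the observed agreement and p_e is the agreement expected by
    chance, computed from each rater's marginal rating frequencies.
    """
    n = len(rater_a)
    p_o = percent_agreement(rater_a, rater_b)
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    categories = set(counts_a) | set(counts_b)
    p_e = sum((counts_a[c] / n) * (counts_b[c] / n) for c in categories)
    if p_e == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (p_o - p_e) / (1 - p_e)


# Hypothetical ratings of 20 segments as "hypo" (hypomobile) or "normal".
# Both raters call most segments normal, so chance agreement is high.
rater_a = ["hypo"] * 2 + ["normal"] * 18
rater_b = ["hypo", "normal"] + ["normal"] * 17 + ["hypo"]

print(percent_agreement(rater_a, rater_b))
print(cohen_kappa(rater_a, rater_b))
```

In this made-up example the raters agree on 18 of 20 segments (90%), yet kappa is only about 0.44, because 82% agreement would be expected by chance alone when both raters label most segments normal. This is one way a study can report high percent agreement alongside a modest kappa.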

Degenhardt et al[10] investigated the interobserver reliability of common osteopathic palpatory tests used to evaluate the lumbar spine. Subjects included in the study (n = 119) were healthy volunteers with a mean age of 29 years. Board certified or board eligible osteopathic physicians based their diagnoses on relative agreement of the position of the transverse process compared to neutral. Flexion motion was assessed with subjects fully flexed in sitting, and extension motion was assessed in prone. Examiners underwent pre-training data collection assessments for tenderness, tissue texture and asymmetry in 3 planes (coronal, sagittal and transverse). This process included both a consensus phase and post-training to improve examiner technique in the hope of improving interexaminer reliability. The authors found that reliability for asymmetry in the transverse plane improved from 0.17 pre-training to 0.34 post-training; however, this was only a shift from poor to fair reliability.[10]

Evidence of Pain Provocation as a Part of the Manual Examination

Hestbæk and colleagues[5] reviewed studies from 1976-1995 on the reliability and validity of chiropractic tests. Within their review of primarily chiropractic research it was noted that “focusing on palpation for pain had consistently acceptable reliability values.” Other chiropractic tests examined in the review, including visual inspection, palpation for motion assessment, leg length inequality, muscle tension and misalignment, were found to be unreliable and non-validated.[5] French and colleagues confirmed that decisions on where to direct a specific manipulation were not reproducible and varied between practitioners.[1] These findings have caused practitioners to question the use and interpretation of the manual examination when it is used in isolation to determine treatment interventions.

When looking at the segmental examination in isolation, Seffinger and colleagues[2] performed a systematic review including studies by osteopathic physicians, chiropractors, and physical therapists. Twelve of the 19 included studies looked specifically at lumbar pain provocation tests, with 64% demonstrating acceptable reliability overall in patients referred for back pain.[2] The assessment procedure has commonly been described with the patient in prone as the practitioner applies a PA pressure to the spinous process of levels L1-L5 and/or the sacral base. Patients are then instructed to report whether or not the PA pressure reproduces pain.[10][3][4][5][6]

Physical therapists incorporating prone PAs for the reproduction of pain in patients with suspected lumbar instability demonstrated moderate intra-tester reliability (k = .57, agreement 82%).[5] Inter-rater reliability, however, was poor to moderate and has varied with the study and the patient population. Intraclass correlation coefficients have ranged from poor to moderate (ICC 0.21 to 0.73) in patients seeking chiropractic care for low back pain. This was also found to be poor to moderate (0.67-0.73) in a study including physical therapists examining patients with non-specific low back pain.[5] Again, PA mobilizations were performed and the patient was instructed to rate their pain level on a scale of 0-10; however, the data were collapsed during statistical evaluation, combining pain scale categories as in other studies.[3] Across the studies reviewed, pain provocation with digital pressure was most reliable at spinal levels L4-5 and S1.[2]
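The effect of collapsing a 0-10 pain scale before computing agreement can be sketched as follows. This is an illustration only: the cut-off, the function names and the ratings are all hypothetical, and the studies cited above do not necessarily collapse their data this way.

```python
def dichotomize(ratings, threshold=1):
    """Collapse 0-10 pain ratings into pain provoked (True) or not (False).

    The threshold is a hypothetical choice: any rating at or above it
    counts as pain provoked.
    """
    for r in ratings:
        if not 0 <= r <= 10:
            raise ValueError("pain ratings must be between 0 and 10")
    return [r >= threshold for r in ratings]


def agreement(a, b):
    """Fraction of levels on which the two sets of ratings match exactly."""
    if len(a) != len(b) or not a:
        raise ValueError("ratings must be non-empty and of equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


# Hypothetical pain ratings reported at L1-L5 during PA testing
# performed by two different examiners on the same patient.
examiner_1 = [0, 2, 5, 7, 3]
examiner_2 = [0, 3, 4, 7, 0]

raw = agreement(examiner_1, examiner_2)
collapsed = agreement(dichotomize(examiner_1), dichotomize(examiner_2))
print(raw, collapsed)
```

With these made-up numbers the examiners match exactly at 2 of 5 levels on the full scale but at 4 of 5 levels once the ratings are reduced to pain present or absent. Collapsing categories tends to raise agreement in this way, which is worth keeping in mind when comparing reliability values across studies that handled the pain scale differently.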

Despite poor to moderate reliability, prone PAs have consistently been relied upon to identify hypermobile and painful segments. However, a study by Fritz and colleagues found no correlation between pain reproduction and instability on flexion and extension stress images in patients with suspected lumbar instability.[5] This has been further supported by Landel et al with the use of dynamic MRI.[8] These findings suggest that a lumbar segment that is painful with PA testing does not necessarily indicate the lesion causing the pathology (i.e. lumbar instability). However, Hicks and colleagues[8] identified a sub-population of individuals who may benefit from stabilization exercises based on a cluster of findings including a positive prone instability test, straight leg raise greater than 90 degrees, noticeable aberrant motion during flexion, and age under 40 years. Indicators of failure for patients placed on a lumbar stabilization program included segmental hypomobility, absence of aberrant motions, low fear-avoidance behaviors and a negative prone instability test.[8] Teyhen and colleagues supported these findings, identifying that individuals who met 2 of 4 factors for success, or did not have 2 or more factors for failure, did have significant changes in lumbar motion during video fluoroscopy, indicating a role for the segmental examination in these individuals.[9]

Clinicians may not be able to accurately determine mobility because of a lack of perceived stiffness with PA testing in patients with low back pain.[8] This is likely due to the extensive overlying soft tissue, which can itself be a source of pain and makes it difficult to localize findings to the joint level. Extensive soft tissue surrounding the joint will likely disperse forces intended for the intervertebral joint.[7] Maher and Adams, however, identified a fair correlation between pain and stiffness (r = 0.27-0.40).[3] Binkley and colleagues[4] went on to note that clinical decisions based on pain provocation and stiffness could identify proper treatment to within 1.4 intervertebral levels between practitioners. Identification of hypomobility or hypermobility in conjunction with other clinical findings may assist in proper treatment classification of a patient with low back pain.[1][2][8]

Further argument has been made that reliability is improved with experience. Degenhardt et al reported that osteopathic students, who show low levels of reliability early in their training, can improve with intensive training.[10] These findings were obtained in healthy volunteers who elected to be examined and did not present for a formal examination of back pain. Reliability scores improved from poor-to-fair to substantial with training over 12 sessions in 3 months (k = .32 to k = .68, p. 0.02).[10]
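Verbal labels such as "fair" and "substantial" depend on the benchmark scale used to interpret kappa. The sketch below assumes the commonly cited Landis and Koch (1977) bands; this is an assumption for illustration, since the studies reviewed here do not all state which cut-offs they used, and the function name is hypothetical.

```python
def landis_koch_label(kappa):
    """Map a kappa value to the Landis and Koch (1977) descriptive band.

    Assumed bands: below 0 poor, 0-0.20 slight, 0.21-0.40 fair,
    0.41-0.60 moderate, 0.61-0.80 substantial, 0.81-1.00 almost perfect.
    """
    if not -1.0 <= kappa <= 1.0:
        raise ValueError("kappa must lie between -1 and 1")
    if kappa < 0.0:
        return "poor"
    bands = [
        (0.20, "slight"),
        (0.40, "fair"),
        (0.60, "moderate"),
        (0.80, "substantial"),
        (1.00, "almost perfect"),
    ]
    for upper, label in bands:
        if kappa <= upper:
            return label


# Values quoted in this review, labelled under the assumed bands.
for k in (0.17, 0.32, 0.34, 0.68):
    print(k, landis_koch_label(k))
```

Under these bands 0.32 and 0.34 are "fair" and 0.68 is "substantial", which matches the wording above, while 0.17 would be "slight" rather than "poor". Differences like this suggest that the original authors may have applied other cut-offs, so descriptive labels should be compared across studies with some caution.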

When utilizing prone PAs as a part of the manual examination, one must recognize that reliability may be limited between practitioners but can be increased when combined with pain provocation. Most studies identified reliability and validity based on the production of pain or lack thereof. It appears beneficial to obtain subjective pain ratings to identify baseline levels of discomfort or irritability prior to intervention. While prone PAs may not be as successful at determining instability in isolation, the examination can assist in clinical decision making when utilized in conjunction with other clinical findings and validated clinical prediction rules.

Conclusions - Clinical Bottom Line

It is difficult to draw conclusions on the overall reliability for the lumbar manual examination due to the inherent variability in clinical practice patterns. The use of manual assessment plays an integral role in physical therapist practice, and there exist many questions surrounding its diagnostic utility. Based on the research included in this review, several general conclusions should be considered:

  • Segmental mobility assessment is a common tool used in clinical practice to assist in clinical decision making and treatment selection based on current treatment guidelines and clinical prediction rules.
  • Inter-tester agreement is good for identifying lumbar segmental hypomobility but poor for identifying lumbar segmental hypermobility in patients with non-specific low back pain.
  • When utilizing prone lumbar PAs for reproduction of pain (present or absent), intra-tester reliability is moderate, whereas inter-tester reliability is poor.
  • Pain reproduction with lumbar manual assessment (PAs) does not correlate with segmental mobility or instability on diagnostic imaging, suggesting a poor relationship between pain provocation and lumbar pathology.
  • The segmental examination may provide valuable information to direct appropriate treatment when clustered with the patient’s overall symptomatology.

One common conclusion among the studies reviewed was the need for more clinical research in this area. Huijbregts, in his 2002 systematic review on the reliability of spinal motion testing, leaves clinicians with these lasting thoughts: “What is the role of motion palpation in the examination of the patient? Does motion palpation provide the needed crucial information on location, nature, and direction of the spinal segmental motion abnormality as proposed by some authors or is it just one component of the complex of history, tests, and measures that direct our attention and determine our diagnosis, prognosis, and intervention?” [9](pg. 37)

It is important to remember that mobility tests may only be a piece of the patient’s full presentation. The manual exam possesses better utility for pain provocation, but lacks the specificity to accurately identify or diagnose pain producing structures or pathology, as do most components of the examination pertinent to the lumbar spine.[8][5] Intra-rater reliability is superior to inter-rater reliability, and examination findings are thought to be more useful if judgments are based on a simple dichotomized scale (i.e. the presence or absence of pain and/or the presence or absence of stiffness). Manual therapy application has several supporting constructs and theories based on different schools of thought. Clinicians must be careful to understand the constructs studied when considering the relevance to their own practice. It is recommended that clinicians incorporate continuous patient feedback when manually assessing for pain, joint motion and resistance in order to corroborate pertinent findings from the manual exam and improve diagnostic utility.[3] The segmental examination provides useful information when combined with other clinical exam findings to assist in appropriate decision making.


  1. Flynn T, Fritz J, Whitman J, et al. A clinical prediction rule for classifying patients with low back pain who demonstrate short-term improvement with spinal manipulation. Spine (Phila Pa 1976). 2002;27:2835-2843.
  2. Childs JD, Fritz JM, Flynn TW, et al. A clinical prediction rule to identify patients with low back pain most likely to benefit from spinal manipulation: A validation study. Ann Intern Med. 2004;141:920-928.
  3. Fritz JM, Childs JD, Flynn TW. Pragmatic application of a clinical prediction rule in primary care to identify patients with low back pain with a good prognosis following a brief spinal manipulation intervention. BMC Fam Pract. 2005;6:29.
  4. United Kingdom back pain exercise and manipulation (UK BEAM) randomised trial: Cost effectiveness of physical treatments for back pain in primary care. BMJ. 2004;329:1381.
  5. Hestbæk L, Leboeuf-Yde C. Are chiropractic tests for the lumbo-pelvic spine reliable and valid? A systematic critical literature review. J Manipulative Physiol Ther. 2000;23:258-275. Available from: http://linkinghub.elsevier.com/retrieve/pii/S0161475400901738?showall=true.
  6. Bialosky JE, Bishop MD, Price DD, Robinson ME, George SZ. The mechanisms of manual therapy in the treatment of musculoskeletal pain: A comprehensive model. Man Ther. 2009;14:531-538.
  7. Abbott JH, Flynn TW, Fritz JM, Hing WA, Reid D, Whitman JM. Manual physical assessment of spinal segmental motion: intent and validity. Man Ther. 2009;14(1):36-44.
  8. Landel R, Kulig K, Fredericson M, Li B, Powers CM. Intertester reliability and validity of motion assessments during lumbar spine accessory motion testing. Phys Ther. 2008;88:43-49.
  9. Huijbregts PA. Spinal motion palpation: A review of reliability studies. Journal of Manual & Manipulative Therapy. 2002;10:24.
  10. Degenhardt BF, Snider KT, Snider EJ, Johnson JC. Interobserver reliability of osteopathic palpatory diagnostic tests of the lumbar spine: Improvements from consensus training. J Am Osteopath Assoc. 2005;105:465-473.