Test Diagnostics

Latest revision as of 12:15, 10 April 2023

Purpose

The purpose of this page is to provide users of Physiopedia a quick reference to commonly used diagnostic statistics in physical therapy practice and issues around evaluation of these statistics in clinical research. Diagnostic accuracy statistics are often used to describe the effectiveness of special tests in identifying specific disorders.  Knowing the diagnostic accuracy of special tests is important in obtaining an accurate diagnosis, and in turn maximizing treatment outcomes. [1]

Diagnosis in Physical Therapy Practice

Physical therapists use the diagnosis of specific conditions to guide their treatment options. Through the physiotherapy assessment, clinicians gather data to evaluate and form clinical judgements. [1] The diagnostic process begins with acquiring relevant data from the history and physical examination. Some data may be used to focus the examination on a specific part of the body, others to identify a specific pathology and some to select an appropriate intervention. [1]

Diagnostic Accuracy

Determining the diagnostic accuracy through the estimation of sensitivity and specificity of a test is the first step in the evaluation of a diagnostic test. [2] This is accomplished by comparing the performance of the test in question with a reference or "gold" standard in a 2x2 contingency table. [2]

2x2 Table [1]
                           Reference test positive result   Reference test negative result
Diagnostic test positive   True positive results (A)        False positive results (B)
Diagnostic test negative   False negative results (C)       True negative results (D)

Study Design Considerations

The optimal study design for diagnostic studies is suggested to be the prospective cohort study; in this design, the prospective blind comparison of the test(s) and the reference standard in a sample of patients relevant to clinical practice can reduce possible study bias. [1]

Study bias refers to the susceptibility of study results to deviate from the truth in a consistent manner. [3] Other factors may also contribute to study bias, such as the study population, the diagnostic test and the reference standard, and these should be carefully considered when evaluating the results of a study. [1] Fritz and Wainner [1] have summarised these issues in tabular form, presented below.

Study factors [1]

Population
  • Population should be representative of patients on whom the test is used

Diagnostic test
  • Intended purpose of test should be clearly defined
  • Test description in terms of procedures, performance and interpretation of results should be clear
  • The results of the reference standard should be unknown to examiners

Reference standard
  • Relevant to intended diagnostic purpose
  • Condition of interest clearly defined
  • Applied consistently to all study participants
  • Independent of diagnostic test
  • The results of the diagnostic test should be unknown to examiners

Overall Accuracy

The overall accuracy of a test is defined as the number of correct results divided by the total number of tests conducted, i.e. (A+D)/(A+B+C+D). [4] It reflects the proportion of tests that are correct; however, because it does not distinguish between false positive and false negative results, it is considered of limited value. [5]
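The limitation can be illustrated with a minimal Python sketch. The function name and the cell counts below are hypothetical and chosen only for illustration; A, B, C and D follow the 2x2 table above. Two tests with very different kinds of error end up with the same overall accuracy.

```python
def overall_accuracy(a, b, c, d):
    """Overall accuracy (A+D)/(A+B+C+D) from the cells of a 2x2 table."""
    total = a + b + c + d
    if total == 0:
        raise ValueError("the 2x2 table is empty")
    return (a + d) / total


# Hypothetical counts, for illustration only.
# Test 1 misses many patients who have the disorder (C is large).
test_1 = dict(a=30, b=5, c=20, d=45)
# Test 2 wrongly flags many patients who do not have the disorder (B is large).
test_2 = dict(a=45, b=20, c=5, d=30)

print(overall_accuracy(**test_1))  # 0.75
print(overall_accuracy(**test_2))  # 0.75
```

Both hypothetical tests are correct in 75 of 100 patients, yet the first produces mostly false negatives and the second mostly false positives, which is why sensitivity and specificity are reported separately.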

Sensitivity

Sensitivity is defined as the ability of a test to identify patients with a particular disorder. [6] In other words, it represents the proportion of a population with the target disorder that has a positive result with the diagnostic test, i.e. A/(A+C). [7] Tests that are highly sensitive are most useful for ruling out a disorder, as people who test negative are more likely not to have the target disorder. "SnNout" is an acronym that can be used to remember that a highly sensitive test with a negative result is good for ruling out the disorder in question. [8]

For example, the Neers Test has been reported to have a sensitivity of 0.93 for detecting subacromial impingement. So, if the test is negative, the examiner can be reasonably confident that the patient does not have impingement.

Specificity

Specificity is the ability of a test to identify patients that do not have the disorder in question. [9] In other words, specificity is the proportion of the population without the target disorder who test negative for the disorder, i.e. D/(B+D). [7] Therefore, tests that are highly specific are useful for ruling in a disorder. The acronym "SpPin" is commonly used to remember that a test with high specificity and a positive result is good for ruling in a disorder. [8]

For example, the Hawkins-Kennedy test for subacromial impingement has been reported by some to have a specificity of 1.00, or 100%. A positive test result is therefore very likely to indicate that the patient has impingement.
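As a rough sketch, sensitivity and specificity can be computed directly from the columns of the 2x2 table. The function names and the counts are hypothetical; they do not come from any of the studies cited on this page.

```python
def sensitivity(a, c):
    """Sensitivity A/(A+C): proportion of patients with the disorder who test positive."""
    if a + c == 0:
        raise ValueError("no patients with the disorder in the sample")
    return a / (a + c)


def specificity(b, d):
    """Specificity D/(B+D): proportion of patients without the disorder who test negative."""
    if b + d == 0:
        raise ValueError("no patients without the disorder in the sample")
    return d / (b + d)


# Hypothetical 2x2 counts, for illustration only.
A, B, C, D = 93, 20, 7, 80

print(sensitivity(A, C))  # 0.93
print(specificity(B, D))  # 0.8
```

Each statistic uses only one column of the table: sensitivity uses the patients who have the disorder according to the reference standard (A and C), and specificity uses those who do not (B and D).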

Predictive Values

Predictive values reflect the proportion of patients with a positive or negative result that are correct results. [1] These statistics are calculated horizontally from the 2x2 table. The positive predictive value represents the proportion of patients with a positive test result who actually have the condition, i.e. A/(A+B), whereas the negative predictive value refers to the proportion of patients with a negative test result who do not have the condition, i.e. D/(C+D). [1]
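The "horizontal" calculation can be sketched in Python as follows. The function names and counts are hypothetical and serve only as an illustration of the two formulas.

```python
def positive_predictive_value(a, b):
    """PPV A/(A+B): proportion of positive test results that are true positives."""
    if a + b == 0:
        raise ValueError("no positive test results in the sample")
    return a / (a + b)


def negative_predictive_value(c, d):
    """NPV D/(C+D): proportion of negative test results that are true negatives."""
    if c + d == 0:
        raise ValueError("no negative test results in the sample")
    return d / (c + d)


# Hypothetical 2x2 counts, for illustration only.
A, B, C, D = 93, 20, 7, 80

print(round(positive_predictive_value(A, B), 2))  # 0.82
print(round(negative_predictive_value(C, D), 2))  # 0.92
```

Each value uses one row of the table: the positive predictive value uses the positive test results (A and B) and the negative predictive value uses the negative test results (C and D).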

Watch this video[10] for a detailed discussion on the above statistics.

Likelihood Ratios

Likelihood ratios are an index measurement that combines the sensitivity and specificity values of a specific test. Likelihood ratios can be used to gauge the performance of a diagnostic test, as they indicate how much a given diagnostic test will lower or raise the pretest probability of the target disorder. [7]

  • Positive likelihood ratio (+LR) is the ratio of the probability of a positive test result in people who have the disorder to the probability of a positive test result in people who do not have it. In other words, +LR indicates the shift in probability that favors the existence of a disorder. [11] +LR is usually calculated by: +LR = Sensitivity / (1 - Specificity)
  • Negative likelihood ratio (-LR) is the ratio of the probability of a negative test result in people who have the disorder to the probability of a negative test result in people who do not have it. In other words, -LR indicates the shift in probability that favors the absence of the disorder. [12] -LR is usually calculated by: -LR = (1 - Sensitivity) / Specificity
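The two formulas, and the way a likelihood ratio shifts the pretest probability, can be sketched in Python. This is an illustration under stated assumptions rather than a clinical tool: the function names and example values are hypothetical, and the conversion from pretest to post-test probability uses the standard odds form (post-test odds = pretest odds x likelihood ratio), which this page does not spell out.

```python
import math


def positive_lr(sensitivity, specificity):
    """+LR = Sensitivity / (1 - Specificity)."""
    if specificity == 1:
        # No false positives were observed, so the ratio is unbounded.
        return math.inf
    return sensitivity / (1 - specificity)


def negative_lr(sensitivity, specificity):
    """-LR = (1 - Sensitivity) / Specificity."""
    if specificity == 0:
        raise ValueError("specificity of 0 makes -LR undefined")
    return (1 - sensitivity) / specificity


def post_test_probability(pretest_probability, likelihood_ratio):
    """Convert a pretest probability to a post-test probability via odds."""
    if not 0 < pretest_probability < 1:
        raise ValueError("pretest probability must be strictly between 0 and 1")
    pretest_odds = pretest_probability / (1 - pretest_probability)
    post_test_odds = pretest_odds * likelihood_ratio
    if math.isinf(post_test_odds):
        return 1.0
    return post_test_odds / (1 + post_test_odds)


# Hypothetical values, for illustration only.
sens, spec = 0.90, 0.80
plr = positive_lr(sens, spec)   # approximately 4.5
nlr = negative_lr(sens, spec)   # approximately 0.125

# With a pretest probability of 0.5, a positive result raises the probability
# and a negative result lowers it.
print(round(post_test_probability(0.5, plr), 2))
print(round(post_test_probability(0.5, nlr), 2))
```

Note that a reported specificity of exactly 1.00 gives an unbounded +LR, which the sketch returns as infinity rather than dividing by zero.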


Interpretation of Likelihood Ratios [11]
+LR           -LR          Interpretation
> 10.0        < 0.1        Generate large and often conclusive shifts in probability
5.0 - 10.0    0.1 - 0.2    Generate moderate shifts in probability
2.0 - 5.0     0.2 - 0.5    Generate small, but sometimes important shifts in probability
1.0 - 2.0     0.5 - 1.0    Alter probability to a small and rarely important degree
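The interpretation table can be expressed as a small lookup, sketched here in Python. The function name is hypothetical, and because the published ranges share their boundary values (for example 5.0 appears in two rows), the sketch assumes that a boundary value is assigned to the stronger category.

```python
def interpret_lr(lr):
    """Map a likelihood ratio (+LR or -LR) to the interpretation in the table above.

    Assumption: values on a shared boundary (e.g. 5.0 or 0.2) are assigned to
    the stronger of the two categories.
    """
    if lr <= 0:
        raise ValueError("a likelihood ratio must be positive")
    if lr > 10 or lr < 0.1:
        return "large and often conclusive shift in probability"
    if lr >= 5 or lr <= 0.2:
        return "moderate shift in probability"
    if lr >= 2 or lr <= 0.5:
        return "small, but sometimes important shift in probability"
    return "small and rarely important shift in probability"


print(interpret_lr(12.0))  # large and often conclusive shift in probability
print(interpret_lr(0.15))  # moderate shift in probability
print(interpret_lr(1.5))   # small and rarely important shift in probability
```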

Statistical Significance and Confidence Intervals

Results of studies of diagnostic tests are commonly analysed with the chi-square statistic and significance level. [1] This tests the hypothesis that the test results and reference standard have no association, and it should be interpreted in combination with the diagnostic accuracy estimates and their confidence intervals.
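A minimal sketch of this test for a 2x2 table is given below, using only the Python standard library. It assumes the Pearson chi-square statistic without continuity correction (1 degree of freedom); the cited sources do not specify a variant, and the function name and counts are hypothetical.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic (no continuity correction) and p-value
    for a 2x2 table with cells A, B (top row) and C, D (bottom row)."""
    n = a + b + c + d
    row_1, row_2 = a + b, c + d
    col_1, col_2 = a + c, b + d
    denominator = row_1 * row_2 * col_1 * col_2
    if denominator == 0:
        raise ValueError("every row and column total must be greater than zero")
    chi2 = n * (a * d - b * c) ** 2 / denominator
    # Upper-tail probability of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts, for illustration only.
chi2, p_value = chi_square_2x2(40, 10, 10, 40)
print(chi2)  # 36.0
print(p_value < 0.001)  # True
```

A small p-value only indicates that the test result and the reference standard are associated; it says nothing about how accurate the test is, which is why the accuracy estimates and their confidence intervals are still needed.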

Confidence intervals (CIs) refer to the precision of the diagnostic accuracy estimates. [1] 95% CIs are the most common, and indicate the range of values within which the population value would be expected to lie with 95% confidence. A wide CI indicates an imprecise estimate, so a diagnostic accuracy value with a wide CI may be of questionable clinical value. [1]
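The effect of sample size on precision can be sketched with a 95% CI for a proportion such as sensitivity or specificity. The Wilson score interval is used here as an assumption; the cited sources do not prescribe a particular method, and the function name and counts are hypothetical.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a proportion (z=1.96 gives roughly 95% coverage)."""
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0 <= successes <= n:
        raise ValueError("successes must be between 0 and n")
    p = successes / n
    denominator = 1 + z ** 2 / n
    centre = (p + z ** 2 / (2 * n)) / denominator
    half_width = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denominator
    return centre - half_width, centre + half_width


# Two hypothetical studies reporting a similar sensitivity (about 0.93)
# from very different numbers of patients with the disorder.
large_study = wilson_interval(93, 100)
small_study = wilson_interval(14, 15)

print(large_study)
print(small_study)
```

The point estimates are almost identical, but the interval from the smaller hypothetical study is several times wider, so its sensitivity estimate is much less precise.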

Reliability of Tests

The evaluation of diagnostic tests does not stop at the determination of diagnostic accuracy. A test should also be reliable in order to provide consistent and useful information for clinicians. Reliability refers to the ability of a test to produce the same results on different occasions, provided that the patient's status has not changed. [13] Reliability is considered a precursor to other examinations of the performance of diagnostic tests, but is better evaluated when embedded in the design of the diagnostic accuracy study. [1]

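Resources

STARD Statement for Reporting Diagnostic Accuracy Studies. Available from: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2517891/

Diagnostic Testing Accuracy by Cochrane Austria. Available from: https://www.youtube.com/watch?v=9a-d4d4UHD4

DiTA Diagnostic Test Accuracy database by PEDro. Available from: https://dita.org.au/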

References

  1. Fritz J, Wainner R. Examining diagnostic tests: an evidence-based perspective. Phys Ther 2001; 81(9):1546-1564.
  2. Fardy J, Barrett B. Evaluation of diagnostic tests. Methods Mol Biol 2015; 1281:289-300.
  3. Geddes J, Harrison P. Closing the gap between research and practice. Br J Psychiatry 1997; 171:220-225.
  4. Greenhalgh T. How to read a paper: papers that report diagnostic or screening tests. BMJ 1997; 315(7107):540-543.
  5. Bernstein J. Decision analysis. J Bone Joint Surg Am 1997; 79:1404-1414.
  6. Sackett D, Straus S, Richardson W, Rosenberg W, Haynes B. Evidence-based medicine: How to practice and teach EBM (2nd ed.). London: Harcourt Publishers Limited, 2000.
  7. Dutton M. Orthopaedic examination, evaluation, and intervention (2nd ed.). New York: The McGraw-Hill Companies, Inc, 2008.
  8. Flynn T, Cleland J, Whitman J. User's guide to the musculoskeletal examination: Fundamentals for the evidence-based clinician. Buckner, Kentucky: Evidence in Motion, 2008.
  9. Sackett D, Straus S, Richardson W. Evidence-based medicine: How to practice and teach EBM (2nd ed.). London: Harcourt Publishers Limited, 2000.
  10. Cochrane Austria. Diagnostic Testing Accuracy. Available from: https://youtu.be/9a-d4d4UHD4 (accessed 28-5-2022)
  11. Jaeschke R, Guyatt G, Sackett D. Users guide to the medical literature. III. How to use an article about a diagnostic test. B. What are the results and will they help me in caring for my patients? JAMA 1994; 271: 703-707.
  12. Cleland J. Orthopaedic clinical examination: An evidence-based approach for physical therapists. Carlstadt, NJ: Icon Learning Systems, LLC, 2005.
  13. Batterham A, George K. Reliability in evidence-based clinical practice: a primer for allied health professionals. Phys Ther Sport 2000; 1(2): 54-62.