What is the sensitivity and specificity of the Timed Up and Go test?

Oct 26, 2025

Geriatrics Physical Therapy Medical Testing

A 2014 meta-analysis found that the Timed Up and Go (TUG) test has a limited ability to predict falls in community-dwelling older adults and should not be used in isolation. A closer look at the data shows that the test's sensitivity and specificity vary, with pooled results suggesting it is more effective at ruling in, rather than ruling out, falls in high-risk individuals.

Quick Summary

The sensitivity and specificity of the Timed Up and Go test vary depending on the population studied and the cut-off time used. Overall, it has moderate predictive ability, often with higher specificity than sensitivity for fall risk. Its accuracy is limited when used alone, especially for high-functioning older adults.

Key Points

Moderate Predictive Ability: Overall, the TUG test has a moderate ability to predict future falls in older adults, with its predictive value depending heavily on the patient's functional status and the specific cut-off score used.
Higher Specificity than Sensitivity: In community-dwelling older adults, pooled meta-analysis data shows a higher specificity (74%) than sensitivity (31%) for predicting falls at the $\geq 13.5$ second cut-off. This means it is better at correctly identifying non-fallers than fallers.
Limited Value for High-Functioning Individuals: The test is less useful for distinguishing between fallers and non-fallers in healthy, high-functioning older people and is more valuable for assessing lower-functioning individuals.
Varying Cut-Off Points: The optimal cut-off time for predicting falls varies widely across studies, ranging from 10 to over 30 seconds, depending on the population and testing conditions. The CDC now recommends a stricter 12-second cut-off.
Improved Accuracy with Additions: Instrumental versions (iTUG) using wearable sensors and dual-task versions that incorporate cognitive or manual tasks can increase the diagnostic accuracy by providing more objective data beyond the total completion time.
Not a Standalone Test: Due to the multifactorial nature of fall risk, the TUG test should not be used in isolation. Clinicians should incorporate it as part of a broader, multi-component assessment to achieve higher predictive accuracy.
Useful for Measuring Change: Despite its limitations in predicting future falls, the TUG test demonstrates good reliability and is a useful tool for tracking functional mobility changes over time, especially following an intervention.

Understanding Timed Up and Go Test Accuracy

The Timed Up and Go (TUG) test is a widely used clinical tool for measuring basic mobility and assessing the risk of falls, particularly in older adults. Sensitivity and specificity are key metrics for evaluating its diagnostic accuracy. Sensitivity refers to the test's ability to correctly identify individuals who will experience the outcome (e.g., falls), while specificity measures its ability to correctly identify those who will not. The predictive value of the TUG test is complex and varies significantly based on the patient population and the specific cut-off score used.
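Written as formulas, the two definitions above look like this. The symbols are introduced here for illustration and are not taken from the cited studies: $TP$ is the number of fallers the test flags, $FN$ the number of fallers it misses, $TN$ the number of non-fallers it correctly does not flag, and $FP$ the number of non-fallers it flags.

$$
\text{Sensitivity} = \frac{TP}{TP + FN}, \qquad \text{Specificity} = \frac{TN}{TN + FP}
$$

A sensitivity of 31%, for example, means that out of every 100 people who go on to fall, roughly 31 would have been flagged by the test and 69 would not.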

Sensitivity and Specificity in Community-Dwelling Older Adults

For community-dwelling older adults, a meta-analysis using the common cut-off point of $\geq 13.5$ seconds revealed a pooled sensitivity of only 31% and a pooled specificity of 74%. This means the test is more accurate at correctly identifying those who will not fall (higher specificity) than it is at correctly identifying those who will (low sensitivity). A low sensitivity indicates that the test misses many individuals who are at risk of falls, producing false negatives. This suggests the TUG is better for "ruling in" fall risk in individuals with a slow time than for "ruling out" fall risk in those with a fast time.
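One way to see what these pooled figures mean for an individual result is to convert them into likelihood ratios and apply them to a starting estimate of fall risk. The sketch below does that arithmetic in Python. The sensitivity and specificity are the pooled values quoted above; the 30% pre-test probability and all function and variable names are assumptions chosen purely for illustration, not figures from the meta-analysis.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) for a test with the given sensitivity and specificity."""
    lr_positive = sensitivity / (1.0 - specificity)
    lr_negative = (1.0 - sensitivity) / specificity
    return lr_positive, lr_negative


def post_test_probability(pre_test_probability, likelihood_ratio):
    """Update a probability with a likelihood ratio by converting to odds and back."""
    pre_test_odds = pre_test_probability / (1.0 - pre_test_probability)
    post_test_odds = pre_test_odds * likelihood_ratio
    return post_test_odds / (1.0 + post_test_odds)


# Pooled values quoted in the text for the >= 13.5 second cut-off.
SENSITIVITY = 0.31
SPECIFICITY = 0.74

# Hypothetical starting risk of falling, assumed only for this example.
PRE_TEST_PROBABILITY = 0.30

lr_positive, lr_negative = likelihood_ratios(SENSITIVITY, SPECIFICITY)
risk_if_slow = post_test_probability(PRE_TEST_PROBABILITY, lr_positive)
risk_if_fast = post_test_probability(PRE_TEST_PROBABILITY, lr_negative)

print(f"LR+ = {lr_positive:.2f}, LR- = {lr_negative:.2f}")
print(f"Risk after a slow TUG (>= 13.5 s): {risk_if_slow:.1%}")
print(f"Risk after a fast TUG (< 13.5 s):  {risk_if_fast:.1%}")
```

Under these assumptions, a slow time moves the estimated risk from 30% to about 34%, and a fast time moves it down to about 29%. Both shifts are small, which is consistent with the conclusion that the TUG should not be used on its own.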

The Influence of Functional Status and Population

The diagnostic value of the TUG test is not uniform across all populations. Research shows that its discriminative ability is affected by the functional status of the group being tested. For high-functioning older adults living independently, the test has limited usefulness for distinguishing between fallers and non-fallers. However, it holds more value for less-healthy, lower-functioning groups, such as those in institutional settings or with specific pathologies.

High-functioning populations: A meta-analysis found only a small mean difference in TUG times between fallers and non-fallers (0.63 seconds) in high-functioning cohorts.
Lower-functioning populations: In institutional settings, the mean difference between fallers and non-fallers was much larger (3.59 seconds).
Other populations: Studies on specific groups like COPD patients, cancer survivors, and colorectal cancer patients have reported varying sensitivity and specificity values, often with different cut-off times tailored to the specific cohort.

The Problem with Varying Cut-Off Points

Defining the predictive value of the TUG is complicated by the wide range of reported cut-off points across studies. Optimal cut-offs can vary from 10 to 32.6 seconds based on the population and study methodology. The commonly cited $\geq 13.5$ second cut-off, based on a smaller study, shows limited predictive accuracy in larger meta-analyses. The CDC now suggests a 12-second cut-off, though optimal values remain varied.
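The practical effect of choosing one cut-off over another can be sketched in a few lines of Python. The example below checks a single TUG time against the two cut-offs named above. The dictionary keys, the function name, and the sample time of 12.8 seconds are hypothetical, and treating the 12-second cut-off as "at or above" mirrors the $\geq 13.5$ notation and is an assumption here.

```python
# Cut-offs mentioned in the text, in seconds. The key names are illustrative only.
CUT_OFFS_SECONDS = {
    "cdc_12s": 12.0,
    "commonly_cited_13_5s": 13.5,
}


def flag_by_cut_off(tug_seconds):
    """Return, for each cut-off, whether the TUG time is at or above it."""
    if tug_seconds <= 0:
        raise ValueError("TUG time must be a positive number of seconds")
    return {
        name: tug_seconds >= threshold
        for name, threshold in CUT_OFFS_SECONDS.items()
    }


# Hypothetical patient time, chosen to fall between the two cut-offs.
example_flags = flag_by_cut_off(12.8)
print(example_flags)
```

A time of 12.8 seconds is flagged under the 12-second cut-off but not under the 13.5-second one, so the same performance can be labelled differently depending on which threshold a clinic adopts.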

Comparing Standard TUG with Instrumented and Dual-Task Versions

Instrumented TUG (iTUG) uses wearable sensors to gather more detailed, objective data on sub-tasks like turning and gait speed, potentially improving accuracy. The dual-task TUG, which adds a cognitive or manual task, can also increase predictive value, especially for more active older adults.


Test Version	Primary Measurement	Predictive Value	Strengths	Limitations
Standard TUG	Total time to complete	Moderate, variable	Quick, easy to administer, inexpensive	Limited accuracy, especially in high-functioning populations
Instrumented TUG (iTUG)	Total time + detailed sub-phase metrics (e.g., turning speed)	Improved, but varies	More objective and sensitive data	Requires additional equipment and specialized analysis
Dual-Task TUG	Total time + performance during a concurrent task	Higher for physically active seniors	Challenges cognitive function and balance simultaneously	Requires careful selection of the secondary task to avoid undue stress

The Multifactorial Nature of Fall Risk

The TUG's variable accuracy underscores that falls result from multiple factors. It primarily assesses mobility and balance, making it insufficient as a standalone fall risk assessment tool. A multifactorial screening approach, integrating the TUG with other assessments, is recommended for a more comprehensive risk evaluation. For more information on complementary fall risk assessments, visit the CDC's STEADI initiative.

Conclusion

In conclusion, the sensitivity and specificity of the Timed Up and Go test have no single definitive value. For community-dwelling older adults at the $\geq 13.5$ second cut-off, pooled data indicate higher specificity (identifying non-fallers) than sensitivity (identifying fallers). These values fluctuate significantly based on population, functional status, and the cut-off used. While the TUG is quick and accessible, its predictive value for future falls is moderate at best, and the test should not be used in isolation. It is best integrated into a comprehensive, multi-component assessment to more accurately identify fall risk.

Frequently Asked Questions

For the TUG test, sensitivity is its ability to correctly identify individuals who will fall (true positives), while specificity is its ability to correctly identify those who will not fall (true negatives). A meta-analysis found a higher pooled specificity than sensitivity, suggesting the test is better at ruling in fall risk than at ruling it out.

For healthy, community-dwelling older adults, a TUG time of less than 10 seconds is generally considered a good, normal result. However, normative data varies by age group, with older individuals having slightly longer average times.

No, the TUG test is most effective in assessing mobility and fall risk in lower-functioning older adults or those with specific pathologies. Its ability to discriminate between fallers and non-fallers is limited in high-functioning, healthy older people.

Variations arise due to different study populations (e.g., age, health status), the specific cut-off times used to define high risk, and differences in testing conditions (e.g., type of chair, use of assistive devices).

While retrospective studies show an association between TUG time and a history of falls, its ability to predict future falls in prospective studies is limited to moderate at best. It is most accurate when used in conjunction with a multifactorial assessment.

The accuracy can be improved by using instrumented versions (iTUG) with sensors to analyze sub-task movements or by adding a dual-task component (cognitive or manual) to challenge the individual's balance and attention simultaneously.

The Centers for Disease Control and Prevention (CDC) recommends a 12-second cut-off to differentiate individuals at a higher risk of falls. However, research confirms that predictive values still vary.

Medical Disclaimer

This content is for informational purposes only and should not replace professional medical advice. Always consult a qualified healthcare provider regarding personal health decisions.