Introduction
Maximum oxygen uptake (VO2max), lactate thresholds and running economy have been widely used to assess endurance and aerobic capacity in middle- and long-distance runners, and all are related to athletic performance.1 However, these variables are time consuming and expensive to measure in field settings; indirect tests can be used as substitutes for these assessments. The utility of a test depends on its validity, accuracy and reliability (reproducibility). Validity can be assumed if a test accurately represents those features of the phenomenon that it is intended to describe, explain or theorise.2
Accuracy is the degree to which a test measures the true value. Finally, reliability describes the reproducibility of a test and is calculated from repeated measures; it can therefore be considered the degree to which an assessment tool produces stable and consistent results (also known as test-retest reliability). Low reliability or low accuracy may limit the applicability and utility of field performance tests.
However, the utility of field tests has commonly relied on construct validity, usually understood as the capacity of the test to estimate, or be associated with, laboratory variables or clinical tests.3 In this sense, one of the most studied physiological constructs is VO2max, which determines the maximum aerobic capacity and should be related to endurance and long-term performance.4 Thus, several field tests have been created to obtain a valid and reliable estimation of VO2max. One of the first was Cooper's test, a simple, time-limited, single-stage test in which athletes cover as many meters as possible during a 12-min all-out run.5 VO2max values estimated from Cooper's test and from a multistage shuttle run test have been strongly correlated in young healthy adults, which may indicate good concurrence, at least for this population. The same study showed good reliability (Φ: 0.96) and an acceptable systematic error of 4.3% for maximal oxygen uptake prediction.6 However, the accuracy of Cooper's test has not yet been reported, and reliability and accuracy data in athletes are lacking.
Since there is a lack of knowledge about the reproducibility (test-retest reliability) of field tests used to estimate endurance capacity, such as Cooper's test, in long-distance runners, our aim was to analyze the reliability and accuracy of Cooper's test in amateur long-distance runners over two repeated measures (test-retest).
Method
Subjects
Fifteen adult male amateur athletes (34.5 ± 1.9 years of age, 3.7 ± 4.6 years of training) volunteered to participate in the study. All athletes were informed of the study characteristics, procedures and risks; signed informed consent was then obtained from those who decided to enrol. The Ethical Review Institutional Board (IRB) at the University of Malaga approved the research protocol.
Experimental procedures
A test-retest approach was used, repeating Cooper's test twice with 48 h between tests. Reliability was analysed for all variables obtained from Cooper's test: distance, heart rate (HR) at the end of the test and rating of perceived exertion (RPE). Both tests were carried out on a 400-m synthetic track under similar meteorological conditions. On each test day athletes followed exactly the same protocol. First, a 15-min running warm-up was performed at 50-70% of the theoretical maximal HR (220 − age). Then the original Cooper's test was executed: athletes were asked to run all-out for 12 min along the inner lane of the track, and immediately afterwards a member of the research team recorded the distance in meters by placing a mark at the exact point where each athlete stopped. HR at the end of the test was recorded with a Polar RS300X HR monitor (Polar Electro, Finland), and each participant was asked individually for their RPE on the 0-10 Borg scale.7
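The warm-up intensity described above follows directly from the age-predicted maximal HR. The short Python sketch below illustrates the arithmetic; the function name `warmup_hr_zone` is hypothetical, and the mean age of the sample is used only as an example input (in practice each athlete's own age would be used).

```python
def warmup_hr_zone(age_years, low=0.50, high=0.70):
    """Return the (lower, upper) warm-up heart rate limits in bpm."""
    # Theoretical maximal HR as used in the protocol: 220 - age
    hr_max = 220 - age_years
    return low * hr_max, high * hr_max


# Example input: mean age of the sample (34.5 years)
lower, upper = warmup_hr_zone(34.5)
print(f"warm-up zone: {lower:.0f}-{upper:.0f} bpm")
```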
Statistical analysis
The accuracy of total distance in Cooper's test, maximal HR and RPE was calculated as the bias correction factor (Cb) from concordance correlation coefficient analysis. Absolute reliability was reported as the mean difference, the coefficient of variation in original units (CV = $\sqrt{\sum(\text{test}_1 - \text{test}_2)^2 / 2N}$), the standard error of the mean (SEM) and the effect size (ES) using Cohen's d. Relative reliability was studied using the intraclass correlation coefficient (ICC) and the relative CV (%CV = CV/mean × 100). For this study, an ICC < 0.50 was considered fair, from 0.50 to 0.75 good and > 0.75 excellent. A Cohen's d of 0.20 was considered small, 0.50 medium and 0.80 large. An agreement analysis was conducted to check for systematic and proportional bias using Bland and Altman plots8 and Kendall's Tau rank correlation coefficients.
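The absolute reliability statistics described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' analysis script: the function name `absolute_reliability` and the five pairs of distances are hypothetical, the SEM is taken to be the standard error of the mean difference, Cohen's d is assumed to use the root mean square of the two trial SDs, and %CV is assumed to be relative to the grand mean of both trials. The ICC is left out because the text does not state which ICC model was used.

```python
import math
from statistics import mean, stdev


def absolute_reliability(test1, test2):
    """Summarise test-retest agreement for paired measurements."""
    n = len(test1)
    diffs = [b - a for a, b in zip(test1, test2)]  # test 2 minus test 1
    mean_diff = mean(diffs)
    # CV in original units, as defined in the text: sqrt(sum(d^2) / 2N)
    cv = math.sqrt(sum(d * d for d in diffs) / (2 * n))
    # Relative CV; using the grand mean of both trials is an assumption
    grand_mean = mean(list(test1) + list(test2))
    cv_pct = cv / grand_mean * 100
    # Standard error of the mean difference (assumed meaning of SEM)
    sem = stdev(diffs) / math.sqrt(n)
    # Cohen's d with root-mean-square pooling of the two SDs (assumed)
    pooled_sd = math.sqrt((stdev(test1) ** 2 + stdev(test2) ** 2) / 2)
    cohen_d = mean_diff / pooled_sd
    return {"mean_diff": mean_diff, "cv": cv, "cv_pct": cv_pct,
            "sem": sem, "cohen_d": cohen_d}


# Hypothetical distances (m) for five runners, NOT the study data
trial1 = [2400, 2800, 3000, 3200, 3500]
trial2 = [2450, 2780, 3050, 3190, 3540]
stats = absolute_reliability(trial1, trial2)
for name, value in stats.items():
    print(f"{name}: {value:.3f}")
```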
Results
The anthropometric and training characteristics of the sample are reported in Table 1. Inter-subject variability for total distance covered was 10.9% and 11.8% in the first and second test respectively, which reflects the dispersion of the results around the sample mean. The accuracy of Cooper's test was relatively high for distance (Cb = 0.994) and HR (Cb = 0.956) but low for RPE (Cb = 0.478).
Variable | Mean±SD |
---|---|
Weight (kg) | 67.3 ± 10.7 |
Height (cm) | 171.0 ± 6.8 |
Age (years) | 34.5 ± 1.9 |
Body mass index (kg/m²) | 22.9 ± 1.5 |
Training time (years) | 3.7 ± 4.6 |
Weekly running distance (km/week) | 44.8 ± 9.8 |
No significant differences were found between test 1 and 2 either for total distance or HR. Additionally, our ICC results from test-retest data indicated that Cooper's test had a very good reliability for covered distance and HR (Table 2). Regarding RPE, we observed a good ICC, although a significant difference was found between RPE in the first and second test (P < 0.001, Table 2).
Reliability | Distance 1 (m) | Distance 2 (m) | HR1 (bpm) | HR2 (bpm) | RPE1 | RPE2 |
---|---|---|---|---|---|---|
Mean±SD | 3026 ± 330 | 3047 ± 359 | 182 ± 7.3 | 183 ± 5.7 | 8.7 ± 0.6 | 9.5 ± 0.5 |
Mean diff (95% CI) | 20.46 (−20.22 to 61.15) | | 1.13 (−0.66 to 2.93) | | 0.8 (0.48 to 1.11)* | |
ICC (95% CI) | 0.99 (0.96 to 0.99) | | 0.93 (0.80 to 0.98) | | 0.68 (0.05 to 0.89) | |
CV (CV %) | 52.2 (1.7%) | | 2.4 (1.3%) | | 0.7 (7.5%) | |
SEM | 18.97 | | 0.8387 | | 0.1447 | |
Cohen's d | 0.059 | | 0.173 | | 1.405 | |
Data in the table are from two repeated all-out Cooper's tests. Subscripts 1 and 2 indicate the first and second Cooper's test respectively. Statistics computed from both tests are shown once per variable. HR, maximal heart rate during the last minute of the test; SD, standard deviation; Mean diff, mean difference between first and second test; CI, confidence interval; ICC, intraclass correlation coefficient; CV, coefficient of variation (CV in original units = $\sqrt{\sum(\text{test}_1 - \text{test}_2)^2 / 2N}$; %CV = CV/mean × 100); SEM, standard error of the mean; RPE, rating of perceived exertion (scale from 0 to 10).
*P < 0.001, paired-samples t-test.
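As a consistency check, several of the reported statistics can be approximately recovered from the summary values in Table 2 alone. The Python sketch below is illustrative and rests on assumptions that are not stated in the text: Cb is computed with Lin's formula Cb = 2 / (v + 1/v + u²), where v is the ratio of the two SDs and u is the mean difference divided by the square root of their product; Cohen's d uses the root mean square of the two SDs; the SEM is the standard error of the mean difference; and the lower HR confidence limit is read as −0.66. Because the inputs are rounded, the outputs should only match the table approximately.

```python
import math

N = 15           # number of runners
T_CRIT = 2.1448  # two-sided 95% critical t value for N - 1 = 14 df

# (mean1, sd1, mean2, sd2, mean_diff, ci_low, ci_high) copied from Table 2
TABLE2 = {
    "distance": (3026, 330, 3047, 359, 20.46, -20.22, 61.15),
    "hr": (182, 7.3, 183, 5.7, 1.13, -0.66, 2.93),  # -0.66: assumed reading
    "rpe": (8.7, 0.6, 9.5, 0.5, 0.8, 0.48, 1.11),
}


def recompute(mean1, sd1, mean2, sd2, mean_diff, ci_low, ci_high):
    """Recover Cohen's d, Cb, SEM and CV from rounded summary statistics."""
    # Cohen's d with root-mean-square pooling of the two SDs (assumed)
    cohen_d = mean_diff / math.sqrt((sd1 ** 2 + sd2 ** 2) / 2)
    # Lin's bias correction factor: scale shift v and location shift u
    v = sd1 / sd2
    u = mean_diff / math.sqrt(sd1 * sd2)
    cb = 2 / (v + 1 / v + u ** 2)
    # SEM of the mean difference, recovered from the CI half-width
    sem = (ci_high - ci_low) / 2 / T_CRIT
    # Sum of squared differences from the SD of differences and mean diff
    sd_diff = sem * math.sqrt(N)
    sum_sq = (N - 1) * sd_diff ** 2 + N * mean_diff ** 2
    cv = math.sqrt(sum_sq / (2 * N))  # sqrt(sum(d^2) / 2N)
    return {"cohen_d": cohen_d, "cb": cb, "sem": sem, "cv": cv}


results = {name: recompute(*row) for name, row in TABLE2.items()}
for name, res in results.items():
    print(name, {key: round(val, 3) for key, val in res.items()})
```

Under these assumptions the recovered CV for distance is close to the reported 52.2 m only when the denominator is 2N, which supports that form of the formula.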
Agreement analysis from the Bland-Altman plots showed no systematic error for either distance (difference = −20.5 m, P > 0.05) or maximal HR (difference = −1.1 bpm, P > 0.05), and no proportional bias, as confirmed by Kendall's Tau rank correlation coefficient between the differences and the means of the measurements (Fig. 1).
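The agreement analysis can be sketched as follows. This is a minimal illustration, not the authors' procedure: the function names and the five pairs of distances are hypothetical, differences are taken as test 1 minus test 2 to match the sign of the values reported above, the limits of agreement use the usual bias ± 1.96 SD convention, and Kendall's tau is computed as tau-a (no correction for ties, no significance test). A tau close to zero between the differences and the pair means would indicate no proportional bias.

```python
from statistics import mean, stdev


def bland_altman(test1, test2):
    """Return differences, pair means, bias and 95% limits of agreement."""
    diffs = [a - b for a, b in zip(test1, test2)]  # test 1 minus test 2
    means = [(a + b) / 2 for a, b in zip(test1, test2)]
    bias = mean(diffs)                             # systematic error
    half_width = 1.96 * stdev(diffs)
    return diffs, means, bias, (bias - half_width, bias + half_width)


def kendall_tau_a(x, y):
    """Kendall's tau-a: (concordant - discordant) / number of pairs."""
    n = len(x)
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            product = (x[i] - x[j]) * (y[i] - y[j])
            score += (product > 0) - (product < 0)  # tied pairs add 0
    return score / (n * (n - 1) / 2)


# Hypothetical distances (m) for five runners, NOT the study data
trial1 = [2400, 2800, 3000, 3200, 3500]
trial2 = [2450, 2780, 3050, 3190, 3540]
diffs, means, bias, limits = bland_altman(trial1, trial2)
tau = kendall_tau_a(means, diffs)
print(f"bias = {bias:.1f} m, limits of agreement = "
      f"{limits[0]:.1f} to {limits[1]:.1f} m, tau = {tau:.2f}")
```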
Discussion
The aim of this study was to carry out a preliminary analysis of the reliability and accuracy of Cooper's test in amateur long-distance runners. Our data support the good reliability suggested previously by other authors, who studied Cooper's test in non-athletic samples.5,6 Despite the small differences between the two trials, the CV of Cooper's test was still around 52.2 m, although in relative units it was as low as 1.7%. This moderately high CV could be explained by the great heterogeneity in athletic performance within the sample (range: 2350-3520 m in trial 1 and 2275-3540 m in trial 2), so the same absolute distance may represent similar percentages for high and low extremes in performance. Despite this limitation, the heterogeneity may allow better generalization of our results, since they cover a larger range of performances, and may highlight the bias of reliability data from a previous study in which a more homogeneous sample than ours was analyzed.5 Moreover, the ES of the difference was as low as 0.059, which together with the non-significant difference in covered distance between trials may indicate the good repeatability of this test.
These results may be helpful for coaches and scientists when prescribing training load, reporting VO2max changes or predicting performance, since they help to interpret the variability of their outcomes. In addition, researchers could use these data to calculate sample sizes. This study is not without limitations. Our results could be biased by the intensity of the test, as it can be argued that the athletes did not exercise at maximum effort, or at the same effort, in both trials. The intensity of an aerobic exercise test can be easily checked using HR. In this study, all participants reached theoretical maximal HR values as predicted from age, which may suggest that both trials were performed all-out. Regarding HR reliability, a CV between 3.1 and 4% was observed, together with a low effect size of the difference (0.17) and a very small mean difference in maximal HR (1.13 bpm); taken together, these results suggest that trials 1 and 2 were similar in intensity. Additionally, RPE is a recognized marker of intensity and homeostatic disturbance during exercise and is usually monitored during exercise tests to complement other dimensions of intensity.9 Garcin analyzed the reliability of HR and RPE in progressive and constant-intensity exercises, concluding that these variables are reliable and replicable in these exercises.10 Nevertheless, our results did not confirm this evidence: RPE had low reliability, as shown by the very large ES found (1.4). A plausible reason for this disagreement is the athletes' limited experience in using this scale.
In conclusion, our results showed that Cooper's test is highly reliable when repeated after 48 h, as confirmed by HR and distance data. This study provides support for Cooper's test as an accurate and reliable test to assess performance in a sample of amateur long-distance runners. Nonetheless, more studies are necessary to validate performance-related constructs against Cooper's test and to confirm its utility as a training tool in field settings.