Revista Española de Salud Pública
On-line version ISSN 2173-9110
Print version ISSN 1135-5727
Abstract
UTRA, Isabel María Barroso; CANIZARES PEREZ, Mayilée and LERA MARQUES, Lydia. The influence of data structure for selecting statistical analysis methods. Rev. Esp. Salud Publica [online]. 2002, vol.76, n.2, pp.95-103. ISSN 2173-9110.
In medical research, data are often grouped, either by the design of the study or by the way the sample is selected. This structure must be taken into account in order to obtain correct estimates of the parameters and their standard errors. This methodological study aims to illustrate methods for estimating population parameters and fitting regression models with grouped data. For this purpose, nine variables from the First National Risk Factor and Preventive Measure Survey, conducted in Cuba in 1995, are used. The prevalence of high blood pressure is overestimated by 15% when conventional estimators are used, compared with the weighted and adjusted analysis. In the regression models for body mass index fitted by conventional procedures, sex, level of schooling, degree of sedentariness, smoking habit, and diastolic and systolic blood pressure were found to be significant. However, when the method that takes the cluster (conglomerate) structure into account was used, level of schooling and sedentariness ceased to be significant. When the random intercept model was fitted, 91.3% of the total variability was explained by individual-level variables, and the remaining 8.7% was attributed to the larger units. When estimating population parameters from cluster-structured data with unequal selection probabilities, sampling weights should be used together with analysis methods that take into account the (potential) correlation among subjects within the same cluster. When fitting regression models, it is important not only to estimate the coefficients efficiently but also to consider the approach (aggregated or disaggregated) used to model the problem under study.
Keywords: Conglomerates; Databases; Random effects; Aggregated; Disaggregated.
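
The contrast the abstract draws between conventional and weighted estimators, and the split of variability between individuals and larger units, can be illustrated with a minimal Python sketch. It is not the authors' procedure: the toy responses, the weights, and the mean cluster size of 20 are hypothetical, the function names are introduced here for illustration, and the design effect uses the common approximation 1 + (m - 1) * ICC, which the abstract does not mention. Only the 8.7% and 91.3% variance shares come from the abstract.

```python
def unweighted_prevalence(cases):
    """Conventional estimator: simple proportion of positive cases."""
    return sum(cases) / len(cases)


def weighted_prevalence(cases, weights):
    """Weighted estimator: sum(w_i * y_i) / sum(w_i)."""
    return sum(w * y for y, w in zip(cases, weights)) / sum(weights)


def intraclass_correlation(var_between, var_within):
    """Share of total variance attributable to the grouping units."""
    return var_between / (var_between + var_within)


def design_effect(icc, mean_cluster_size):
    """Approximate variance inflation from clustering: 1 + (m - 1) * ICC."""
    return 1 + (mean_cluster_size - 1) * icc


# Hypothetical toy sample: 1 = hypertensive, 0 = not hypertensive.
cases = [1, 1, 0, 0, 0]
# Hypothetical sampling weights (inverse selection probabilities).
weights = [1, 1, 2, 3, 3]

p_conventional = unweighted_prevalence(cases)      # 2 / 5
p_weighted = weighted_prevalence(cases, weights)   # 2 / 10

# Variance shares reported in the abstract for the random intercept model.
icc = intraclass_correlation(var_between=8.7, var_within=91.3)

# Hypothetical mean cluster size of 20 subjects.
deff = design_effect(icc, mean_cluster_size=20)

print(p_conventional, p_weighted, round(icc, 3), round(deff, 3))
```

In this toy sample the over-represented subjects are the positive cases, so the unweighted proportion exceeds the weighted one, which is the direction of bias the abstract reports for high blood pressure. Even a modest intraclass correlation inflates variances noticeably once clusters are large, which is why standard errors that ignore clustering tend to be too small.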