INTRODUCTION
Appetite encompasses the entire field of food intake, including selection, motivation, and preference (1). It is influenced by several psychological and physiological factors, including emotional states, hormones, epigenetics (2), menstrual cycle, age, and sex (3).
Appetite can be measured through subjective appetite sensations (SAS) such as hunger, fullness, satiety, desire to eat, and prospective food consumption (4). SAS can be measured with scales, and one of the most relevant tools for this purpose is the visual analogue scale (VAS) (5). However, before using VAS, it is necessary to evaluate their repeatability (6) to ensure consistent results when they are applied on more than one occasion to the same group of people (7,8). VAS were developed and validated in the English language (5), which makes them difficult to apply in other populations. Therefore, it has been suggested that they be adapted to the specific language and cultural context (9).
Moreover, owing to the variability in SAS according to menstrual cycle and sex, it is important to consider these variables when VAS are used (10). The menstrual cycle, which normally lasts 28-32 days, consists of a follicular phase (FP) of 12 to 14 days, in which low concentrations of estrogens and progesterone are present; 1 day of ovulation, in which the concentration of luteinizing hormone increases because of increased estradiol; and a luteal phase (LP) lasting between 12 and 14 days, characterized by high concentrations of estrogens and progesterone (10). Some studies have reported that hunger or food consumption is lower during the FP than during the LP (11-14), but other authors have reported less hunger in the LP (3). Furthermore, Brennan et al. found that energy intake differed between the menstrual cycle phases. Participants reported less hunger in FP than in LP, and the authors demonstrated that energy intake in young healthy women is highly repeatable during FP (11).
On the other hand, Gregersen et al. reported that appetite ratings were not influenced by BMI, diet, or weight, but that they differed according to age and sex, with women showing significantly higher satiety and fullness ratings than men (3). To our knowledge, no study has examined repeatability in men and in women in the different phases of the menstrual cycle. Therefore, the aim of this study was to analyze the repeatability of VAS scores in men and in women in the follicular and luteal phases of the menstrual cycle. We hypothesized that men and women in LP and in FP would experience different SAS after intake of a standardized meal.
MATERIALS AND METHODS
STUDY POPULATION
Thirty-four participants were recruited using posters and flyers from August to December 2019. The study was carried out in the Instituto de Nutrigenética y Nutrigenómica Traslacional at Universidad de Guadalajara. We hypothesized that men and women at different menstrual cycle phases present differences in the repeatability of subjective appetite sensations. The sample size was calculated with a statistical power of 80 %, according to the study of Horner et al., who considered a sample size between 36 and 73 to detect a 10 mm change in the VAS in a paired design (15). Inclusion criteria were men and women aged 18-25 years, with normal weight and a regular breakfast habit. Women were included if they had a regular menstrual cycle (28-32 days). A woman was considered to be in the follicular phase (FP) when she was within the first 5-7 days after the onset of menstruation, and in the luteal phase (LP) when she was at 20-24 days after the first day of menstruation (10), as reported in a medical record. Exclusion criteria were following a vegan or vegetarian diet, having food allergies, being an elite athlete, being under treatment for weight loss, taking medications that alter appetite, smoking, having respiratory symptoms, and, for women, being pregnant, breastfeeding, or using contraceptives. Forty-two subjects were included in the study and signed the informed consent; 5 of these did not meet all the inclusion criteria, and 3 abandoned the study in the re-test session. Finally, only the 34 participants who completed all sessions were included. This study was approved by the Ethics and Biosafety Committees of the Health Sciences University Center of the Universidad de Guadalajara (Register number: CI-03619). All participants signed an informed consent, and all procedures were performed according to the Declaration of Helsinki (16).
ANTHROPOMETRIC MEASUREMENTS
Anthropometric measurements were performed after 10 h of fasting, without shoes and with light clothes. Height was determined using a stadiometer with a precision of 0.1 cm and a measuring range up to 205 cm (SECA® stadiometer, SECA GMBH & Co., Hamburg, Germany; model 213). Body composition was analyzed by bioelectrical impedance (Inbody 370, Biospace Co., Seoul, Korea, 250 kg capacity, 0.1 kg precision). Waist circumference was measured at the narrowest point between the last rib and the iliac crest using a Lufkin Rosscraft® tape (Lufkin Rosscraft® metal tape measure, NV, USA; model W606, range 0 to 200 cm, accurate to 0.1 cm).
VISUAL ANALOGUE SCALES
The VAS consisted of a straight horizontal line of 100 mm with the words “None” or “Not at all” located at the left end, and the words “Extremely” or “As much as I have ever felt” at the right end. Participants were asked to draw a transverse mark with an ultrafine point pen (Bic crystal, 0.7 mm) between these two ends according to their appetite sensation at that specific moment. The score was quantified by measuring the distance from the left end of the line to the mark (1,17).
CROSS-CULTURAL ADAPTATION
To address the translation from English to Spanish, a linguist translator participated in the process. In addition, specialists in the area verified that the translation of the VAS was logical, easy to understand, and semantically and conceptually equivalent, so that the desired information could be collected (Fig. 1). After translation and back-translation, the research group reviewed both versions and evaluated their equivalence with bilingual and monolingual individuals. Subsequently, the VAS were applied to a group of nine university students, who were asked whether the instructions on each VAS were clear, to ensure their comprehensibility.
BREAKFAST DESIGN
The fixed breakfast meal consisted of a sandwich and plain water. The amount and type of ingredients were as follows: 2 slices of half-baked whole wheat bread, 1 ½ tablespoons of sour cream, 2 slices of turkey ham (40 g), 1 slice of red tomato, 1 leaf of lettuce, and 250 mL of plain water. The breakfast provided 267 kcal (42 % carbohydrates, 23 % protein, and 35 % fat). Energy and macronutrient content were analyzed with the Nutritionist Pro™ software (Axxya Systems, Stafford, TX, USA).
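As a rough check of the stated composition, the energy shares can be converted into grams of each macronutrient. The following is a minimal sketch that assumes the general Atwater factors (4 kcal/g for carbohydrate and protein, 9 kcal/g for fat), which are not stated in the text; the function name `macro_grams` is illustrative and is not part of the Nutritionist Pro™ software.

```python
# General Atwater factors (kcal per gram); an assumption, not stated in the text.
KCAL_PER_GRAM = {"carbohydrate": 4.0, "protein": 4.0, "fat": 9.0}


def macro_grams(total_kcal, energy_shares):
    """Convert energy shares (fractions of total kcal) into grams."""
    if abs(sum(energy_shares.values()) - 1.0) > 1e-9:
        raise ValueError("energy shares must sum to 1")
    return {
        name: total_kcal * share / KCAL_PER_GRAM[name]
        for name, share in energy_shares.items()
    }


# Values reported for the breakfast: 267 kcal, 42 % / 23 % / 35 %.
breakfast = macro_grams(
    267, {"carbohydrate": 0.42, "protein": 0.23, "fat": 0.35}
)
```

Under these assumptions, the meal would provide roughly 28 g of carbohydrate, 15 g of protein, and 10 g of fat.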
STUDY DESIGN
Participants were asked to attend the research unit in two different sessions (test and retest). They arrived at 7:40 h after an overnight fast of 10 hours. They were asked to arrive by bus, car, or train and to perform minimal or no physical activity on the day prior to the intervention. At 8:00 h the participants went to a white room with sufficient lighting at room temperature. Qualified nutritionists gave them instructions on how to fill in the VAS. While participants were fasting (from 8:10 to 8:15 h), the VAS were completed. Then, subjects had 10 minutes to eat breakfast. Immediately after, the VAS were completed one more time. Four weeks later, subjects were asked to repeat the same process under the same conditions.
STATISTICAL ANALYSES
Quantitative variables were assessed for normal distribution with the Kolmogorov-Smirnov test and expressed as mean ± standard deviation (SD) or median and interquartile range. Comparisons between groups were performed with one-way ANOVA or the Kruskal-Wallis test. Differences between tests and re-tests were analyzed with the paired t-test or Wilcoxon's test. Repeatability was analyzed through the coefficient of repeatability (CR) and intraclass correlation coefficient (ICC). CR was calculated as CR = 2 × SD, where SD is the SD of the differences between paired data (18).
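The CR defined above can be illustrated with a short calculation. This is a minimal sketch, not the SPSS procedure used in the study: it assumes the sample SD (n - 1 denominator), which the text does not specify, and it approximates the limits of agreement as the mean difference ± CR (that is, with the factor 2 rather than 1.96). The function names and the VAS scores are hypothetical.

```python
from statistics import mean, stdev


def coefficient_of_repeatability(test, retest):
    """Return CR = 2 x SD of the paired differences (test - retest)."""
    if len(test) != len(retest) or len(test) < 2:
        raise ValueError("need two equally long series with at least 2 pairs")
    diffs = [a - b for a, b in zip(test, retest)]
    # Sample SD (n - 1 denominator); an assumption, not stated in the text.
    return 2 * stdev(diffs)


def limits_of_agreement(test, retest):
    """Return (lower, upper) as the mean difference minus/plus the CR."""
    diffs = [a - b for a, b in zip(test, retest)]
    cr = coefficient_of_repeatability(test, retest)
    return mean(diffs) - cr, mean(diffs) + cr


# Hypothetical VAS scores in mm, invented for illustration only.
test_scores = [50, 60, 70, 40]
retest_scores = [45, 65, 60, 50]
cr = coefficient_of_repeatability(test_scores, retest_scores)
lower, upper = limits_of_agreement(test_scores, retest_scores)
```

A smaller CR indicates that test and retest scores lie closer together, so the scale is more repeatable.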
Bland-Altman plots were also constructed. ICC was calculated using the two-way mixed model and absolute agreement (19). Excellent repeatability was concluded when ICC > 0.8, good when between 0.7 and 0.8, and moderate when between 0.6 and 0.7 (20). To compare ICC between groups we used the cocor software version 1.1-3 (21). Data were analyzed using SPSS version 21.0 (IBM Corp., Armonk, NY). Graphics were created with GraphPad Prism version 8.3.1 (GraphPad Software, San Diego, CA).
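The ICC and its interpretation can be sketched as follows. This is an illustrative implementation, not the SPSS routine used in the study. It assumes the single-measure form of the absolute-agreement ICC (the text does not state whether single or average measures were used), it treats the category boundaries at 0.7 and 0.6 as inclusive, and it uses the label "below moderate" for values under 0.6, for which the text defines no category. The function names and example data are hypothetical.

```python
def icc_absolute_agreement(scores):
    """Single-measure, absolute-agreement ICC from a two-way ANOVA.

    `scores` holds one row per participant and one value per session,
    e.g. [test, retest].
    """
    n = len(scores)       # number of participants
    k = len(scores[0])    # number of sessions
    if n < 2 or k < 2 or any(len(row) != k for row in scores):
        raise ValueError("need >= 2 participants and >= 2 sessions per row")

    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Sums of squares for participants, sessions, and residual error.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


def classify_repeatability(icc):
    """Map an ICC value to the categories given in the text (20)."""
    if icc > 0.8:
        return "excellent"
    if icc >= 0.7:
        return "good"
    if icc >= 0.6:
        return "moderate"
    return "below moderate"  # placeholder label; not defined in the text


# Hypothetical [test, retest] scores with a constant shift of 1 between sessions.
example = [[1, 2], [3, 4], [5, 6]]
icc = icc_absolute_agreement(example)
```

In this example the ranking of participants is identical in both sessions, yet the ICC is below 1 because absolute agreement also penalizes the systematic shift between test and retest.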
RESULTS
GENERAL CHARACTERISTICS OF THE STUDY POPULATION
A total of 34 participants (50 % women) were enrolled, with a mean age of 21.0 ± 1.4 years. Nine women were in the FP and eight women were in the LP of the menstrual cycle. All anthropometric variables, except BMI, differed between men and women but not between the two groups of women. Age also did not differ between groups (Table I).
SPANISH ADAPTATION OF VAS
The cross-culturally adapted Spanish version of the VAS was easy to fill in, comprehensible, and well understood, and none of the scale forms remained incomplete or unanswered.
APPLICATION OF VAS
The change (∆) between post-breakfast and fasting VAS was compared between the test and retest sessions, and no differences were found. In addition, hunger, desire to eat, and prospective food consumption decreased after breakfast, whereas fullness and satiety increased (Fig. 2).
REPEATABILITY OF VAS
The CR of the five VAS in fasting ranged from 30 to 49 in men, from 27 to 37 in women in FP, and from 37 to 95 in women in LP. Post-breakfast, the CR ranged from 31 to 45, 29 to 52, and 20 to 75 in men, women in FP, and women in LP, respectively. With the ICC, women in LP displayed values below 0.5 in most of the VAS; in contrast, most of the VAS in women in FP and in men showed ICC values greater than 0.7. However, comparison of these coefficients between groups showed that only the ICC of hunger and desire to eat differed between men and women (Table II).
Finally, Bland-Altman plots for post-breakfast hunger and satiety in the three groups are shown in figure 3. In men, one subject fell outside the limits of agreement. In addition, in the LP group, the limits of agreement for the post-breakfast satiety scale were wider than in the other groups.
DISCUSSION
In the present study, the adaptation and repeatability of the five VAS used to measure SAS were assessed. The use of VAS allowed the evaluation of somatic sensory aspects in a practical and repeatable way, and we were able to translate the scales for use in the Spanish language. One of the challenges is the adaptation of scales validated and implemented in other countries. In Spanish-speaking countries, few studies related to the adaptation of scales for SAS have been reported. For example, Ozório et al. adapted VAS for assessing appetite in the Portuguese language in Brazil; however, the population involved in that study consisted exclusively of hospitalized cancer patients (22). Further, González-Antón et al. adapted VAS to the Spanish language in Spain to assess appetite sensations and glycemic response (23), which hinders an objective comparison with our study.
Among the parameters used to assess repeatability, the most widely used is the CR (18), but the ICC has also been applied successfully to describe repeatability in satiety research (24-26) and is recognized as a useful tool for objectively classifying repeatability as high or low (20). Importantly, the use of more than one indicator strengthens the interpretation of results and makes comparisons between studies easier (20).
In this study, the repeatability of VAS was assessed through the CR and ICC in three different groups: men, women in FP, and women in LP. Regarding the repeatability of VAS in men, it is important to note that although most authors have reported lower repeatability for fasting scores (5,15,27), our study showed that CR values were very similar for fasting and post-breakfast, with a CR range from 30 to 49. According to the CR, the fullness scale was the most repeatable in men, both in fasting and post-breakfast. The CR for the fullness scale was similar to those reported by Flint et al. (27) and Raben et al. (28), and smaller than the one reported by Horner et al. during fasting (15). In general, the CRs of the VAS reported by Horner et al. were higher than ours, probably because their study was done in subjects with overweight and obesity (15). In addition, according to the ICC interpretation (20), all scales applied in men, in both fasting and post-breakfast (except satiety in fasting), showed good repeatability.
Repeatability of VAS was also analyzed in women in the FP or LP of their menstrual cycle. Regarding the FP group, the study by Tucker et al. also included healthy women in the FP (25). Their results agree with ours in that the scales with the best repeatability were hunger and desire to eat, with an ICC between 0.67 and 0.72; nevertheless, our ICC values were higher. Tucker et al. also concluded that the scales were not repeatable because ICC values ranged from 0.18 to 0.40 (25). In our case, most fasting and post-breakfast ICC values in this group of women indicated good or excellent repeatability. Regarding the CR, the fasting scores for hunger, fullness, and desire to eat in the study of Tucker et al. were similar to those found in this study; the exception was the CR of prospective food consumption, which was lower in this study. The repeatability of VAS to assess SAS in women in LP has not been reported previously.
It has been mentioned that the interpretation of repeatability as strong or weak is very subjective (25). The CR means that 95 % of the differences between a test and a re-test will fall within this value, that is, within the limits of agreement proposed by Bland and Altman (18), whereas ICC values can be classified into categories such as excellent, good, or moderate repeatability (20). In this study, we compared ICC between groups to determine whether the repeatability of VAS differed between them. In fact, the repeatability of the five VAS in fasting and post-breakfast was similar between men, women in FP, and women in LP. The only exception was fasting hunger and desire to eat between men and women in LP. Some authors have suggested that VAS in appetite research should be used only when women are in the follicular phase, due to changes in energy intake and energy expenditure (25,27). Interestingly, no differences were detected in VAS repeatability between women in different menstrual cycle phases; nevertheless, more studies should be designed to corroborate our findings. To our knowledge, this is the first study to report objective comparisons of VAS repeatability scores by sex or by menstrual cycle phase in women in appetite research.
Different methodologies are used to evaluate the repeatability of VAS: some studies use a fixed meal (15) and others a meal with energy content adjusted to individual needs (27); some provide a standardized diet before the intervention (27), while others do not (25); and repeatability tests are performed at different time intervals, such as 3-4 weeks (27) or only 7 days (15). In addition, repeatability is calculated with different formulas (15,25). Therefore, it is important to establish standardized criteria so that studies can be compared more precisely.
The limitations of this study were the lack of biochemical markers of menstrual cycle phase and the fact that all participants were young subjects with normal weight, which does not allow us to generalize the usefulness of this instrument. Future studies are required to assess the repeatability of VAS in Spanish-speaking populations, including a wider range of ages and BMI.
CONCLUSIONS
In conclusion, the adaptation of the VAS for assessing SAS to the Spanish language was comprehensible, and the scales were easy to fill in and well understood. Repeatability of the VAS was similar between men and women in different phases of the menstrual cycle. Previously, no study had evaluated repeatability in women in the LP of the menstrual cycle, and none had compared ICC between groups; therefore, further studies considering these characteristics and with larger sample sizes are needed to corroborate our findings.