Introduction
Oral exams (also known as 'viva voce') were the first method used to evaluate medical students [1, 2]. Students were deemed worthy of certification after satisfying a jury that tested their skills through questions on specific knowledge [3]. For many years this method was regarded as extremely important. Plato considered it a better means of communication and evaluation than writing, since the written word can only remind those who already have the knowledge, whereas oral exchange can convey knowledge about the subject in question [4]. However, as book production increased, the need for oral exams decreased [5]. In the cardio-respiratory block of the 2nd year of the medicine course (Universidade da Beira Interior), the only oral exam, called 'fresh heart', takes place. The oral exam was introduced in 2005, and the coordinator of the cardio-respiratory block noticed that 1 out of 5 students failed it. It is therefore important to understand the factors that affect student performance. Performance is a complex subject, and several variables related to students, teachers and the university have an impact on the outcome. The subjectivity involved in an oral exam can be intimidating for the student, since it can compromise the neutrality of the process [2]. It is very easy for teachers to make judgements about the person standing before them, which highlights the importance of an impartial process [6].
The main objectives of this study were to statistically analyze the oral exam results from 2005 to 2014 and to evaluate the influence of gender, evaluation design, teachers, blood pressure and heart rate on performance.
Subjects and methods
Two populations were assessed in this study. Retrospectively (2005 to 2014), data were collected from 1042 2nd-year medicine students who had been evaluated with the 'fresh heart' oral exam. The study population (n = 144) was the 2nd-year medicine class of 2014/2015 of the Faculdade de Ciências da Saúde (FCS) of the Universidade da Beira Interior. Inclusion criteria were completing both questionnaires, before and after the exam, and having blood pressure and heart rate recorded before and after the oral exam. Exclusion criteria were partly completed questionnaires and missing blood pressure or heart rate measurements before or after the oral exam. The oral exam took place on December 10th, 2014, and the students were assigned to a scheduled list. Students waited in a room and were then called to another room, where they completed the pre-exam questionnaire (questionnaire A) and had their blood pressure and heart rate measured. After this, two students entered the oral exam room: one stayed near the door while the other took the oral exam in front of a table with fresh hearts and two teachers. When that exam ended, the student near the door called another student in and then went to the table to start the oral exam. The teachers were switched at random. After finishing the exam, students went to a room where they completed the second questionnaire (questionnaire B) and had their blood pressure and heart rate measured again.
The oral exam consisted of five questions about anatomical structures that had previously been discussed in classes with a practical component. The questions were distributed among three envelopes of different colors, from which students could choose. The questions covered three categories, in this order: anatomical orientation of the heart (one question), main structures (two questions) and secondary structures (two questions). The first three questions were crucial for passing: a student who missed any of them (the anatomical orientation of the heart or either of the two main structures) failed the exam.
The two questionnaires were applied before and after the oral exam. Students completed them voluntarily and anonymously, with written consent approved by the ethics commission of the college. Both questionnaires were translated and adapted from the study of Hashmat et al. [7] and consisted of two groups. The first group covered the students' demographic data (age and gender). The second group had two parts: the first recorded blood pressure and heart rate; the second assessed potential factors that may have affected the student's performance (questionnaire A: study method, workload and sleep; questionnaire B: room conditions and exam conditions) on a 5-level Likert scale (1: strongly disagree; 2: slightly disagree; 3: indifferent; 4: slightly agree; 5: strongly agree). Both questionnaires were tested in a restricted group of students to verify their validity. The study was approved by the Ethics Commission of the FCS (Process ID: EC-FCS-2015-006).
To establish a baseline for blood pressure and heart rate in a non-stress situation, measurements were taken during theoretical classes, allowing later comparison with those obtained on the day of the oral exam.
All statistical analyses were performed with SPSS v. 21. First, the questionnaire items on factors that could influence student performance were analyzed for internal consistency (Cronbach's α), which led to each item being assessed individually with the Mann-Whitney test. Student's t-test and the chi-square test were applied to determine whether there were significant differences between expected and observed frequencies in categories such as gender, blood pressure, heart rate and student performance.
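The internal-consistency step can be illustrated with a minimal sketch. The study itself used SPSS; the Python function below only illustrates the standard Cronbach's α formula, α = k/(k − 1) · (1 − Σ item variances / variance of the total score), where k is the number of items. The function name and the Likert answers in `example` are hypothetical and are not study data.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of item scores.

    Uses sample variances (n - 1 denominator). Raises ValueError when there
    are fewer than two items or fewer than two respondents.
    """
    if len(responses) < 2 or len(responses[0]) < 2:
        raise ValueError("need at least two respondents and two items")
    k = len(responses[0])
    # Transpose so that each tuple holds all answers to one item.
    items = list(zip(*responses))
    item_variance_sum = sum(variance(item) for item in items)
    # Variance of each respondent's total score across all items.
    total_variance = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


# Hypothetical answers on a 5-level Likert scale (rows: students, columns: items).
example = [
    [4, 5, 4],
    [2, 2, 3],
    [5, 4, 5],
    [3, 3, 3],
    [1, 2, 1],
]
alpha = cronbach_alpha(example)
print(f"Cronbach's alpha = {alpha:.3f}")
```

Values of α closer to 1 indicate that the items vary together. When α is low, the items are better examined one by one, which is consistent with the item-level Mann-Whitney comparisons described above.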
Results
Between 2005 and 2014, 1042 students took the oral exam. Of these, 71.5% were female (n = 745) and 28.5% were male (n = 297). Overall, 19.96% failed the oral exam and 80.04% passed. The failure rate was higher for female students (20.5%) than for male students (18.5%), but this difference was not statistically significant (χ² = 0.541; p = 0.462).
The 2nd-year medicine class of 2014/2015 had 144 students, of whom 70.1% (n = 101) were female and 29.9% (n = 43) were male. Of the 144 students, 23 failed the oral exam (15.9%). The failure rate was higher for male students (25.6%) than for female students (11.9%), and this difference was statistically significant (χ² = 4.218; p = 0.040). No association was found between the gender of the teachers and the gender of the students (Teacher X: χ² = 2.769; p = 0.096; Teacher Y: χ² = 1.266; p = 0.260). However, the failure rate was higher for teacher Y, and this difference was statistically significant (χ² = 4.226; p = 0.040).
In questionnaire A, no relationship was found between the assessed factors and the final result of the oral exam. In questionnaire B, the items 'I was nervous watching my colleague's exam' and 'The presence of a colleague in the room made me nervous' had more impact on those who failed the exam (Mann-Whitney: p = 0.041 and p = 0.014, respectively). No other association was found between student performance and other factors. Significant differences were also found between baseline and post-exam diastolic blood pressure (Student's t-test: p = 0.03) and between baseline and post-exam heart rate (Student's t-test: p = 0.028).
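The gender comparisons reported above can be checked with a short sketch of the Pearson chi-square test for a 2×2 table. The analysis itself was done in SPSS, so this Python function is an illustration only. Two assumptions are made: the cell counts, which the text does not give, are reconstructed from the reported percentages and group sizes, and the statistic is computed without continuity correction.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns the statistic and its p-value for 1 degree of freedom.
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom, P(X > chi2) equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Rows: female, male. Columns: failed, passed.
# Counts are ASSUMED, reconstructed from the reported percentages:
# 2005-2014: 20.5% of 745 is about 153; 18.5% of 297 is about 55.
chi2_retro, p_retro = chi_square_2x2(153, 592, 55, 242)
# 2014/2015: 11.9% of 101 is about 12; 25.6% of 43 is about 11.
chi2_class, p_class = chi_square_2x2(12, 89, 11, 32)

print(f"2005-2014: chi2 = {chi2_retro:.3f}, p = {p_retro:.3f}")
print(f"2014/2015: chi2 = {chi2_class:.3f}, p = {p_class:.3f}")
```

With these reconstructed counts, the sketch gives statistics close to the reported χ² = 0.541 (p = 0.462) for 2005-2014 and χ² = 4.218 (p = 0.040) for the 2014/2015 class.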
Discussion
Between 2005 and 2014, 19.96% of students failed the exam. Although the overall failure rate was higher for female students (20.5% versus 18.5% for male students), the difference was not statistically significant (p > 0.05), so it is concluded that performance in the exam between 2005 and 2014 was not related to gender. Several factors are known to influence the academic performance of students in undergraduate education; nonetheless, oral exams have their own specificities, particularly because they are seen as an assessment method that triggers stress [8]. The items 'I was nervous watching my colleague's exam' and 'The presence of a colleague at the back of the room made me nervous' were associated with the exam result, with higher frequency among those who failed (p < 0.05). Contrary to other studies [9], vital parameters assessed before the exam, such as blood pressure and heart rate, showed no association with the exam result. However, for the vital parameters assessed after the exam, the difference from baseline in diastolic blood pressure and heart rate was larger for those who failed the exam (p < 0.05). Because female gender is associated with higher stress levels in practical examinations [9], female students were expected to score worse in this exam. However, there was no relationship between gender and outcome during 2005-2014 and, in the 2014-2015 school year, male gender was associated with worse outcomes. Similar results have been found in other studies [10]. There may be personal prejudices influenced by previous encounters between the teacher and the student, by the student's appearance, or even by the gender of the student and the teacher [3].
Teacher Y was associated with a higher percentage of failures; however, no association was found between the student's gender and the teacher's gender. Although previous studies have found worse performance for the female teacher/female student pair, this was not the case in our study [11].