If we want students to improve at coding, especially at the introductory level, then it makes intuitive sense for the instructor to incorporate live coding into their lectures. This is a popular idea in the CS education literature, and many studies show that students and instructors perceive some pedagogical benefits of live coding.
But does live coding actually improve student outcomes more effectively than annotating and explaining pre-written code? Anshul Shah, Emma Hogan, Vardhan Agarwal, John Driscoll, Leo Porter, William G. Griswold, and Adalbert Gerald Soosai Raj consider this question in their ICER 2023 paper, “An Empirical Evaluation of Live Coding in CS1.” In short, they report that the answer is no. In fact, the only statistically significant results in their paper indicate potential drawbacks of live coding.
Experimental setup
In their experiment, one instructor at UCSD taught two sections of CS1. The sections were identical (e.g., weekly material, activities, and assignments) except for two aspects: the start time (9:30 am or 11:00 am) and the presentation of code examples (static-code or live-coding) in lectures.
In the static-code group (the 9:30 section), the instructor “showed students pre-written examples of code when demonstrating an implementation of a concept.” While presenting the examples, the instructor wrote annotations and visualizations.
In the live-coding group (the 11:00 section), the instructor “showed code examples by opening a Python file in an IDE and writing code in front of the students.” While coding, the instructor verbalized their thoughts and demonstrated effective development strategies such as interpreting error messages and using print statements to track variables.
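To make those two strategies concrete, here is a minimal sketch of what such a demonstration might look like. The function and its inputs are hypothetical and are not taken from the paper or the course materials.

```python
def average(scores):
    """Return the mean of a list of numbers."""
    total = 0
    for i, score in enumerate(scores):
        total += score
        # Temporary trace: show how the running total changes on each pass.
        print(f"i={i} score={score} total={total}")
    return total / len(scores)


try:
    average([])
except ZeroDivisionError as err:
    # Reading the error message points straight at the empty-list case.
    print(f"error: {err}")

print(average([80, 90, 100]))
```

The trace line would normally be deleted once the bug is understood; the point of showing it live is that students see when and why it was added.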
Attendance was consistently near 85-90% in both sections, and the sample size was about 110 students per group. Although students were not randomly assigned to the sections, at the time of registration, they didn’t know that one lecture would be taught with static code and the other with live coding. There were also no significant differences between the two groups in programming experience, demographics, or average GPA.
Questions and results
The paper considers three areas that live coding, in theory, could improve: students’ programming processes, course performance, and lecture experience. The main result of the paper is that in the first two areas, there were no significant differences between the two groups. Moreover, with regard to the third area, more students in the live-coding group thought the lectures were “too fast, did not hold their attention, and did not facilitate note-taking.” The authors’ exact research questions, and some details on their methods, are described below.
Programming processes
“How do adherence to effective programming processes and programming productivity differ between students in live coding and static-code pedagogies?”
The authors examined students’ behavior on various coding challenges held during lectures. More specifically, the authors analyzed snapshots of code obtained whenever a student ran or submitted code to the course’s online IDE (EdStem). Using those snapshots, they evaluated an assortment of metrics related to programmer productivity (e.g., “Measure of Incremental Development,” “Repeated Error Density,” time until correct implementation, number of print statements).
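As a rough illustration of snapshot-based metrics, the sketch below computes two of the simpler ones: the number of print statements and the time until a correct implementation. The `Snapshot` record and both helper functions are assumptions made for this example. They are not the authors’ data format or code, and the paper’s “Measure of Incremental Development” and “Repeated Error Density” are not attempted here.

```python
import ast
from dataclasses import dataclass


@dataclass
class Snapshot:
    """One saved copy of a student's code (hypothetical record layout)."""
    seconds_since_start: float
    source: str
    passed_all_tests: bool


def count_print_calls(source: str) -> int:
    """Count calls to the built-in print() in one snapshot's source code."""
    try:
        tree = ast.parse(source)
    except SyntaxError:
        # Snapshots taken mid-edit may not parse; count nothing for them.
        return 0
    return sum(
        isinstance(node, ast.Call)
        and isinstance(node.func, ast.Name)
        and node.func.id == "print"
        for node in ast.walk(tree)
    )


def time_until_correct(snapshots):
    """Return seconds until the first passing snapshot, or None if none pass."""
    for snap in sorted(snapshots, key=lambda s: s.seconds_since_start):
        if snap.passed_all_tests:
            return snap.seconds_since_start
    return None


# Toy history, invented for illustration only.
history = [
    Snapshot(60, "x = 1\nprint(x)", False),
    Snapshot(200, "x = 1\nprint(x)\nprint(x + 1)", True),
]
print([count_print_calls(s.source) for s in history])
print(time_until_correct(history))
```

Parsing with `ast` rather than searching for the text `print(` avoids counting occurrences inside strings or comments, at the cost of skipping snapshots that contain syntax errors.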
After running statistical tests on those metrics, the authors found no statistically significant differences between the two groups. The live-coding group performed better at a debugging task and had a shorter time until correct implementation on the final coding challenge (about 14 minutes compared to 16), but those results were also not statistically significant.
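For readers unfamiliar with how a gap like 14 versus 16 minutes can fail to reach significance, here is a minimal sketch of one generic way to compare two groups: a two-sided permutation test on the difference in means. This summary does not say which tests the authors ran, so the function below is an illustrative stand-in and not their method, and the sample values are invented.

```python
import random
from statistics import mean


def permutation_p_value(group_a, group_b, n_resamples=10_000, seed=0):
    """Two-sided permutation test for a difference in group means."""
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    at_least_as_extreme = 0
    for _ in range(n_resamples):
        # Reassign group labels at random and recompute the gap in means.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            at_least_as_extreme += 1
    return at_least_as_extreme / n_resamples


# Toy minutes-to-solution values, invented for illustration.
# These are not data from the paper.
toy_a = [9, 12, 14, 15, 20]
toy_b = [10, 13, 16, 18, 23]
print(permutation_p_value(toy_a, toy_b))
```

The returned value estimates how often a gap at least as large as the observed one would appear if group membership made no difference. When individual times vary widely, a two-minute gap in means can easily arise by chance.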
Course performance
“How does course performance on exams and assignments, specifically on code tracing, code explaining, and code writing questions, differ between students in live-coding and static-code pedagogies?”
The authors also measured students’ course performance based on programming assignments, worksheets, and two exams. Every question from the worksheets and exams was categorized as “code explaining,” “code writing,” “code tracing with loops,” “code tracing without loops,” or “basic questions.”
Again, none of the p-values calculated by the authors indicated statistical significance. This included a p-value for every type of question, grades on the programming assignments, and grades on the exams. For example, on the four “code writing” questions on the final exam, correctness differed by just one percentage point between the two groups.
Lecture experience
“How does the lecture experience, in terms of engagement during lecture and perceptions of code examples, differ between students in live-coding and static-code pedagogies?”
Finally, the authors examined how students felt about the lectures themselves based on their responses to a required mid-term survey and the official, anonymized, end-of-term course evaluation.
In the survey, more students in the static-code group reported benefits related to code comprehension. On the other hand, more students in the live-coding group reported benefits related to programming processes. The latter sounds like a positive effect of live coding, but the authors remind us that “these differences in perceived benefits did not materialize into discernible differences” according to the metrics they calculated when answering their other research questions (“Measure of Incremental Development,” “Repeated Error Density,” etc.).
Furthermore, in the live-coding group, 18% of responses mentioned that the instructor should slow down, whereas this rate was only 2% in the static-code group. (In both groups, about 1% of responses mentioned that the instructor should speed up.)
The only statistically significant results in the paper came from the end-of-term course evaluation. These results showed that students in the live-coding group were less engaged than those in the static-code group. In particular, as the authors put it, “students in the live-coding group were more likely to disagree that lectures held attention or facilitated note taking.”
Discussion
In summary, after running various statistical tests, the authors found that live coding was not more effective at improving students’ performance than presenting static code, and students in the live-coding group were less engaged during lectures. The authors allude to context switching and cognitive load as potential explanations:
“Live-coding examples require the instructor to move away from the lecture slides and open a new file in an IDE for each example, which leads to an inherent overhead cost of showing the examples…”
“These results may be due to the instructor simultaneously writing code while also explaining their reasoning and strategies. Students could be unsure whether to write down the code or the instructors’ explanations, potentially resulting in an inability to focus at all… Ultimately, students may not have been able to fully absorb the programming processes demonstrated in the live-coding examples because of the pace and difficulty in focusing.”
The authors speculate that “there may be an additional step between students observing effective programming processes and being able to apply these processes in their own work.” They then cite methods from Cognitive Apprenticeship beyond modeling (e.g., scaffolding, coaching) as possible ways to supplement traditional lectures.
Closing thoughts
My hunch is that the most effective CS1 lectures combine live coding with static code, perhaps in the following way: the instructor writes the code while verbalizing their thought process, then (ideally seamlessly, to reduce context switching) annotates and explains the code as if it had been written ahead of time. (Or maybe it’d be better to flip the order.) It sounds repetitive, but I think it’s useful to repeat concepts, especially when they are new. Even slight changes in presentation can provide additional insight.
Results like the ones in this paper highlight the importance of education research. Live coding might have intuitive appeal, but research allows us to check our intuitions.
As a reminder, the static-code lecture was at 9:30 am, whereas the live-coding lecture was at 11:00 am, so maybe there was a “hungry judge” effect.