In the assessment of second language learners’ aural skills, research on the inclusion of visual stimuli has been conducted almost exclusively in the context of listening assessment. While numerous researchers have advocated the inclusion of contextual information in test input (e.g., Ockey, 2010), little has been said about the scoring of speaking tests, which also involves raters’ listening comprehension. This study is designed to identify possible variation in the scoring of speaking test performance when the speech samples to be scored are presented in either audio-only or audio-visual format. A group of raters first used an online platform to score a set of audio-only speech samples from an achievement speaking test consisting of one monologic task. Weeks later, they scored the same samples presented in audio-visual format. Scores from the two scoring sessions were then compared. Findings suggest that the inclusion of visual stimuli may not significantly affect assigned scores or internal consistency. Yet, given the raters’ reported preference for the audio-visual format, the results call for further exploration of the potential positive effects of delivery methods on rater effects.