Multimedia instructional systems have been widely applied in teaching and learning, but which media presentation mode is best for English listening comprehension remains uncertain, and whether unnecessary information leads to cognitive overload for learners is likewise inconclusive. According to studies by Jones and Plass (2002) and Diao, Chandler and Sweller (2007), students learning with double mode (sound and text) outperformed students learning with single mode (sound) and had lower cognitive load. Studies.