Visualizing Video Sounds with Sound Word Animation
Text captions are an important means of conveying sound information in videos when the sound is not accessible. However, conventional text captions are far less expressive for non-verbal sounds, since they are designed to visualize speech. To address this problem, we propose a method that automatically transforms non-verbal video sounds into animated sound words and positions them near the sound source objects in the video. This provides a natural visual representation of non-verbal sounds, with rich information about the sound category and dynamics. We conducted a user study with over 300 participants using an online crowdsourcing service. The results showed that animated sound words not only visualize the dynamics of sound effectively and naturally while clarifying the position of the sound source, but also make video watching more enjoyable and increase the visual impact of the video.
- F. Wang, H. Nagano, K. Kashino and T. Igarashi, "Visualizing Video Sounds With Sound Word Animation to Enrich User Experience," in IEEE Transactions on Multimedia, vol. 19, no. 2, pp. 418-429, Feb. 2017. [doi]
- F. Wang, H. Nagano, K. Kashino and T. Igarashi, "Visualizing video sounds with sound word animation," 2015 IEEE International Conference on Multimedia and Expo (ICME), Turin, 2015, pp. 1-6. [doi]
- F. Wang, H. Nagano, K. Kashino and T. Igarashi, "A Method for Visualizing Video Sounds with Sound Word Animation" (in Japanese), Visual Computing / Graphics and CAD Joint Symposium 2014.
- Master's Thesis (submitted on February 20, 2014) [PDF]