Introduction to Professor Wen-Hsiao Peng and His Lab at National Yang Ming Chiao Tung University
- Could you briefly introduce yourself (and your University/Lab)?
I am a Professor with the Department of Computer Science, National Yang Ming Chiao Tung University (NYCU), Taiwan. I received my Ph.D. in Electronics Engineering from the then National Chiao Tung University (now NYCU) in 2005. From 2000 to 2001, I worked at the Intel Microprocessor Research Laboratory, Santa Clara, CA, USA, where I was involved in the development of International Organization for Standardization (ISO) Moving Picture Experts Group (MPEG)-4 Fine Granularity Scalability. From 2015 to 2016, I was a Visiting Scholar with the IBM Thomas J. Watson Research Center, Yorktown Heights, NY, USA. Since 2003, I have actively participated in the ISO/IEC and ITU-T video coding standardization process. In particular, NYCU’s MPEG research group is one of the few university teams around the world that participated in the ISO/IEC and ITU-T Calls-for-Proposals on Scalable Video Coding (2004), High-Efficiency Video Coding (2011), and HEVC Screen Content Coding Extensions (2014). My current research interests include learning-based video/image coding, multimedia analytics, and computer vision. Currently, I head the Visual Signal Processing and Communications Technical Committee in the IEEE Circuits and Systems Society. I serve as Associate Editor-in-Chief for Digital Communications for IEEE JETCAS, Associate Editor/Special Session Organizer for IEEE TCSVT, and Guest Editor for IEEE TCAS-II. I was a Distinguished Lecturer of APSIPA.
- What have been your most significant research contributions up to now?
My earlier research focused on inter-frame prediction for video coding. I formalized motion vectors for variable block-size motion compensation as motion samples taken from a dense motion field on an irregular sampling grid. This viewpoint laid the foundation for a more generic inter-frame prediction framework capable of reconstructing a temporal predictor from any irregularly sampled motion vectors. It also offered a theoretical justification for the widely discussed template matching prediction during the development of HEVC. More recently, I started to explore deep learning-assisted and deep learning-based video/image coding. For deep learning-assisted coding, I introduced reinforcement learning to video encoder control, showing that it could potentially be an attractive alternative for the dependent decision-making that arises in many encoder control problems. I also linked reinforcement learning to my earlier work on inter-frame prediction. That work demonstrated that the classic inter-prediction technique can be combined with modern deep reinforcement learning to arrive at an even more efficient solution than would otherwise be possible. For deep learning-based coding, I am exploring end-to-end learned codecs, for which many issues remain wide open.
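The idea of treating block motion vectors as irregular samples of a dense motion field can be illustrated with a minimal sketch. The code below is an assumption-laden toy, not the actual framework: it interpolates a dense field from sparse motion vector samples with inverse-distance weighting (one of many possible kernels) and warps a reference frame along that field to form a temporal predictor. The function names and the nearest-pixel warping are illustrative choices.

```python
import numpy as np

def dense_field_from_samples(mv_samples, height, width):
    """Interpolate a dense motion field from irregularly placed motion
    vector samples. Inverse-distance weighting is an illustrative
    interpolation kernel; the actual choice is a design decision.

    mv_samples: list of ((y, x), (dy, dx)) sample positions and vectors.
    Returns an array of shape (height, width, 2) holding (dy, dx)."""
    ys, xs = np.mgrid[0:height, 0:width]
    field = np.zeros((height, width, 2))
    weights = np.zeros((height, width))
    for (sy, sx), (dy, dx) in mv_samples:
        d2 = (ys - sy) ** 2 + (xs - sx) ** 2 + 1e-6  # avoid divide-by-zero
        w = 1.0 / d2
        field[..., 0] += w * dy
        field[..., 1] += w * dx
        weights += w
    field /= weights[..., None]
    return field

def temporal_predictor(ref_frame, field):
    """Warp the reference frame along the dense motion field to build
    the temporal predictor (nearest-pixel lookup for simplicity)."""
    h, w = ref_frame.shape
    ys, xs = np.mgrid[0:h, 0:w]
    src_y = np.clip(np.rint(ys + field[..., 0]).astype(int), 0, h - 1)
    src_x = np.clip(np.rint(xs + field[..., 1]).astype(int), 0, w - 1)
    return ref_frame[src_y, src_x]
```

Because the interpolation accepts samples at arbitrary positions, the same machinery covers regular block grids, variable block sizes, and template-matching-style sample placements as special cases, which is the point of the unified view.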
- What problems in your research field deserve more attention (or what problems will you like to solve) in the next few years, and why?
Both deep learning-assisted and deep learning-based video/image coding are emerging as hot topics that are expected to attract lots of attention in the research and standards communities. The former is likely to make its way into products much faster than expected, as it does not require any change to the codec itself. It may appear in the form of pre-processing, post-processing, or a combination of both to re-purpose or enhance any off-the-shelf codec. Deep learning-based video/image codecs, particularly end-to-end learned codecs, are catching up quickly in terms of compression performance. However, complexity and generalizability are two decisive factors in their success. Whether end-to-end learned codecs will really take off remains an open question. For academic research, there are many issues (e.g., rate control, content-dependent encoder optimization, multi-rate encoding, low-complexity considerations, new applications) that deserve further exploration and investigation.
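The deep learning-assisted pattern described above wraps an unchanged codec between a pre-processing and a post-processing stage. A minimal sketch of that pipeline shape follows; everything in it is hypothetical: the codec is replaced by a coarse-quantization stand-in, the pre-processing is a simple box filter where a learned network could sit, and the post-processing is a placeholder for a learned enhancement stage.

```python
import numpy as np

def off_the_shelf_codec(frame, step=16):
    """Stand-in for any conventional codec: coarse quantization mimics
    compression loss (purely illustrative, not a real codec call)."""
    return np.round(frame / step) * step

def preprocess(frame):
    """Example pre-processing: a 3x3 box filter as a placeholder for a
    learned network that conditions the frame before encoding."""
    padded = np.pad(frame, 1, mode="edge")
    h, w = frame.shape
    return sum(padded[dy:dy + h, dx:dx + w]
               for dy in range(3) for dx in range(3)) / 9.0

def postprocess(decoded):
    """Example post-processing: a no-op placeholder where a learned
    enhancement network would restore detail after decoding."""
    return decoded

def enhanced_pipeline(frame):
    # The codec in the middle is untouched; only the stages around it change.
    return postprocess(off_the_shelf_codec(preprocess(frame)))
```

The key property is visible in the middle call: the codec itself is never modified, which is why this route faces no standardization barrier and can reach products quickly.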
- What advice would you like to give to the young generation of researchers/engineers?
Try to keep an eye on the forefront developments in the video coding and machine learning communities. Be knowledgeable about old tricks; they often help formulate a more principled approach. Practice connecting the dots; techniques that seem irrelevant at first may turn out to be solutions to your problems. Last but not least, avoid relying entirely on black-box approaches; they are often hard to explain and hard to debug.