By Luka Murn, ESR at BBC Research and Development
Due to the increasing demand for video broadcast and streaming at better qualities and higher resolutions, efficient means of video compression are more important than ever. During the global pandemic, rapid growth in telemedicine and video conferencing, the increased penetration of online education, and the integration of innovative technologies such as virtual reality (VR) have all played a major role in driving demand for video solutions.
To address these needs and ensure interoperability between services and devices, new video coding standards are being developed. A joint team of video coding experts has been developing the next-generation video coding standard, Versatile Video Coding (VVC), which is due to be finalised by the end of 2020. To achieve 50% better compression for the same video quality compared with the previous standard, VVC provides novel coding tools, a few of which are based on highly simplified forms of Machine Learning (ML).
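To put the 50% figure in concrete terms, here is a minimal Python sketch of the arithmetic, reading the target as a 50% reduction in bitrate at equal quality. The function names, the 16 Mbps reference bitrate and the one-hour duration are hypothetical values chosen for illustration; they do not come from the standard or from my research.

```python
def bitrate_after_saving(reference_kbps: float, saving: float) -> float:
    """Bitrate needed for the same quality after a fractional bitrate saving."""
    if not 0.0 <= saving < 1.0:
        raise ValueError("saving must be in the range [0, 1)")
    return reference_kbps * (1.0 - saving)


def stream_size_gb(bitrate_kbps: float, duration_s: float) -> float:
    """Size of a constant-bitrate stream in gigabytes (10**9 bytes)."""
    bits = bitrate_kbps * 1000 * duration_s
    return bits / 8 / 1e9


# Hypothetical example: a stream encoded at 16 Mbps with the previous standard.
reference_kbps = 16000.0
duration_s = 3600.0  # one hour

# Assumed 50% bitrate saving at the same quality.
new_kbps = bitrate_after_saving(reference_kbps, 0.5)

print(f"Previous standard: {reference_kbps:.0f} kbps, "
      f"{stream_size_gb(reference_kbps, duration_s):.2f} GB per hour")
print(f"With 50% saving:   {new_kbps:.0f} kbps, "
      f"{stream_size_gb(new_kbps, duration_s):.2f} GB per hour")
```

Under these assumed numbers, the same hour of video would need half the bandwidth and half the storage, which is why such gains matter for broadcast and streaming at scale.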
My research has been focused on developing explainable ML solutions for video processing. When applied in video coding, ML approaches have demonstrated they can help deliver high-quality content at even lower bitrates. However, such solutions usually bring coding gains at the cost of substantial increases in computational complexity and memory consumption. In many cases, the high complexity of these schemes limits their potential for implementation within real-time applications. Additionally, the intricacy of ML models makes it challenging to understand exactly what these algorithms have learned, putting the trustworthiness of their deployment into question. Their methods of manipulating data need to be properly explained to mitigate potential unexpected outcomes.
In a previous blog post, I highlighted how the interpretability of ML models allows their learned outputs to be verified. Interpretability is an area of ML research that aims to explain, clearly and plainly, how the results of learned ML algorithms are derived. By interpreting trained ML models and understanding how these complex algorithms work, my research demonstrated that it is possible to extract relevant knowledge from such approaches. Applying that knowledge has made the models significantly more efficient and lightweight, supporting the development of ML tools for practical applications.
You can read more about this work in my research paper, to be presented at the International Conference on Image Processing, organised by IEEE. I’ve also provided a broad overview of the research carried out, including visualisations, in a blog post for the BBC Research & Development website.
Furthermore, the insights gained from work on interpretable ML for video coding can help influence the design of subsequent ML models. This has resulted in a series of guidelines for applying ML solutions to video processing, which I have summarised in a conference paper submitted to the International Broadcasting Convention.
Lastly, I have successfully completed my PhD transfer process at the DCU School of Computing. By submitting a written report and defending it at a transfer talk, I’ve demonstrated: (a) that I have developed a suitably detailed research plan, (b) that the research plan, if executed successfully, will produce research that satisfies the university’s requirements for awarding the PhD degree, and (c) that I have the ability to execute the proposed research plan.
I have also introduced and described my PhD research in a YouTube video produced by the DCU School of Computing.