├── 360-degree video papers.md
├── Conference Paper.md
├── Links and News.md
├── Multimedia Libraries and Repositories.md
├── Paper Weekly.md
├── README.md
├── Repository of papers.md
├── Research Group.md
├── The way of multimedia research.md
└── asserts
    ├── 19Infocom_stick.PNG
    ├── 20ATC_Firefly1.PNG
    ├── 20sigcomm_dds.PNG
    └── 21ASSR_1.PNG

--------------------------------------------------------------------------------
/360-degree video papers.md:
--------------------------------------------------------------------------------

Back to [Repository of papers](https://github.com/jinyucn/Video-Streaming-Research-Papers/blob/main/Repository%20of%20papers.md)

## Video Transmission

+ [Popularity-Aware 360-Degree Video Streaming](http://mcn.cse.psu.edu/paper/xianda/infocom-xianda21.pdf) [Infocom 21]
+ [Robust 360° Video Streaming via Non-Linear Sampling](https://www.cs.purdue.edu/cgvlab/papers/popescu/2021InfoCommCOREPopescu.pdf) [Infocom 21]
+ [AdaP-360: User-Adaptive Area-of-Focus Projections for Bandwidth-Efficient 360-Degree Video Streaming](http://www.cs.binghamton.edu/~yaoliu/publications/mm20-adap360.pdf) [MM 20]
+ [QuRate: Power-Efficient Mobile Immersive Video Streaming](https://dl.acm.org/doi/pdf/10.1145/3339825.3391863) [MMSys 20]
+ [Deja View: Spatio-Temporal Compute Reuse for Energy-Efficient 360° VR Video Streaming](https://ieeexplore.ieee.org/document/9138937/) [ISCA 20]
+ [SR360: Boosting 360-Degree Video Streaming with Super-Resolution](https://dl.acm.org/doi/abs/10.1145/3386290.3396929) [NOSSDAV 20]
+ [Streaming 360-Degree Videos Using Super-Resolution](https://www3.cs.stonybrook.edu/~mdasari/assets/pdf/infocom20.pdf) [Infocom 20]
+ [Tile Rate Allocation for 360-Degree Tiled Adaptive Video Streaming](https://dl.acm.org/doi/pdf/10.1145/3394171.3413550) [MM 20]
+ [EPASS360: QoE-aware 360-degree Video Streaming over Mobile Devices](https://ieeexplore.ieee.org/document/9024132) [TMC 20]
+ [DRL360: 360-degree Video Streaming with Deep Reinforcement Learning](https://ieeexplore.ieee.org/document/8737361) [Infocom 19]
+ [Pano: Optimizing 360° Video Streaming with a Better Understanding of Quality Perception](https://people.cs.uchicago.edu/~junchenj/docs/360StreamingQuality_SIGCOMM.pdf) [Sigcomm 19]
+ [Proactive Caching for Vehicular Multi-View 3D Video Streaming via Deep Reinforcement Learning](https://ieeexplore.ieee.org/document/8677285) [TWC 19]
+ [CLS: A Cross-user Learning based System for Improving QoE in 360-degree Video Adaptive Streaming](https://www.icst.pku.edu.cn/NetVideo/docs/20201104111337969645.pdf) [MM 18]
+ [Viewport-Driven Rate-Distortion Optimized 360° Video Streaming](https://ieeexplore.ieee.org/abstract/document/8422859) [ICC 18]
+ [360ProbDASH: Improving QoE of 360 Video Streaming Using Tile-based HTTP Adaptive Streaming](https://www.icst.pku.edu.cn/NetVideo/docs/20201104112437185498.pdf) [MM 17]
+ [Adaptive 360-Degree Video Streaming using Scalable Video Coding](https://dl.acm.org/citation.cfm?id=3123414) [MM 17]

## Live Video Streaming

+ [SphericRTC: A System for Content-Adaptive Real-Time 360-Degree Video Communication](https://dl.acm.org/doi/10.1145/3394171.3413999) [MM 20]
+ [An Analysis of Delay in Live 360° Video Streaming Systems](https://ece.northeastern.edu/fac-ece/dkoutsonikolas/publications/multimedia20.pdf) [MM 20]
+ [Low-latency FoV-adaptive Coding and Streaming for Interactive 360° Video Streaming](https://dl.acm.org/doi/pdf/10.1145/3394171.3413751) [MM 20]
+ [Flocking-based Live Streaming of 360-degree Video](https://dl.acm.org/doi/abs/10.1145/3339825.3391856) [MMSys 20]
+ [Mobile Streaming of Live 360-Degree Videos](https://www2.cs.sfu.ca/~mhefeeda/Papers/tmm20_vrcast.pdf) [TMM 20]
+ [MA360: Multi-Agent Deep Reinforcement Learning Based Live 360-Degree Video Streaming on Edge](https://www.icst.pku.edu.cn/NetVideo/docs/20201026174450838459.pdf) [ICME 20]
+ [A Measurement Study of YouTube 360° Live Video Streaming](https://zyan.gsucreate.org/papers/360measure_nossdav19.pdf) [NOSSDAV 19]
+ [Event-driven Stitching for Tile-based Live 360 Video Streaming](https://zyan.gsucreate.org/papers/livetexture_mmsys19.pdf) [MMSys 19]
+ [RATS: Adaptive 360-degree Live Streaming](https://dl.acm.org/doi/10.1145/3304109.3323837) [MMSys 19]
+ [LIME: Understanding Commercial 360° Live Video Streaming Services](https://bohan00.github.io/share/LIME_MMSys19.pdf) [MMSys 19]

## System

+ [How to Evaluate Mobile 360° Video Streaming Systems?](http://www.cse.buffalo.edu/faculty/dimitrio/publications/hotmobile20.pdf) [HotMobile 20]
+ [Freedom: Fast Recovery Enhanced VR Delivery Over Mobile Networks](https://dl.acm.org/doi/pdf/10.1145/3307334.3326087) [MobiSys 19]
+ [Tile-based Caching Optimization for 360° Videos](http://graphics.cs.aueb.gr/graphics/docs/papers/MOBIHOC-2019.pdf) [Mobihoc 19]
+ [A Two-Tier System for On-Demand Streaming of 360 Degree Video Over Dynamic Networks](https://par.nsf.gov/servlets/purl/10105313) [TCSVT 19]
+ [Flare: Practical Viewport-Adaptive 360-Degree Video Streaming for Mobile Devices](https://www.research.att.com/ecms/dam/sites/labs_research/content/publications/N_Flare_Practical_Viewport_Adaptive_360Degree_Video_Streaming.pdf) [Mobicom 18]
+ [Rubiks: Practical 360-Degree Streaming for Smartphones](https://www.cs.utexas.edu/~mubashir/papers/rubiks_mobisys.pdf) [MobiSys 18]
+ [Creating the Perfect Illusion: What Will It Take to Create Life-Like Virtual Reality Headsets?](https://www.microsoft.com/en-us/research/uploads/prod/2018/02/perfectillusion.pdf) [HotMobile 18]
+ [POI360: Panoramic Mobile Video Telephony over LTE Cellular Networks](http://xyzhang.ucsd.edu/papers/XXie_CoNEXT17_POI360.pdf) [CoNEXT 17]
+ [VR is on the Edge: How to Deliver 360° Videos in Mobile Networks](https://dl.acm.org/doi/10.1145/3097895.3097901) [VR/AR Network 17]
+ [VR/AR Immersive Communication: Caching, Edge Computing, and Transmission Trade-Offs](https://dl.acm.org/doi/pdf/10.1145/3097895.3097902) [VR/AR Network 17]

## Viewport Prediction

+ [Graph Learning Based Head Movement Prediction for Interactive 360 Video Streaming](https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=9416230) [TIP 21]
+ [PARIMA: Viewport Adaptive 360-Degree Video Streaming](https://arxiv.org/pdf/2103.00981) [WWW 21]
+ [LiveDeep: Online Viewport Prediction for Live Virtual Reality Streaming Using Lifelong Deep Learning](http://ieeexplore.ieee.org/document/9089486/) [VR 20]
+ [Viewport Prediction for 360° Videos: A Clustering Approach](https://dl.acm.org/doi/pdf/10.1145/3386290.3396934) [NOSSDAV 20]
+ [A Spherical Convolution Approach for Learning Long Term Viewport Prediction in 360 Immersive Video](https://aaai.org/ojs/index.php/AAAI/article/view/7377/7241) [AAAI 20]
+ [Sparkle: User-Aware Viewport Prediction in 360-degree Video Streaming](https://ieeexplore.ieee.org/document/9238515) [TMM 20]
+ [DGaze: CNN-Based Gaze Prediction in Dynamic Scenes](https://ieeexplore.ieee.org/document/8998375) [TVCG 20]
+ [Very Long Term Field of View Prediction for 360-degree Video Streaming](https://arxiv.org/pdf/1902.01439) [MIPR 19]
+ [LadderNet: Knowledge Transfer Based Viewpoint Prediction in 360° Video](https://ieeexplore.ieee.org/document/8682776) [ICASSP 19]
+ [Spherical Clustering of Users Navigating 360° Content](https://ir.cwi.nl/pub/28575/28575.pdf) [ICASSP 19]
+ [Viewport Prediction for Live 360-Degree Mobile Video Streaming Using User-Content Hybrid Motion Tracking](https://dl.acm.org/doi/10.1145/3328914) [IMWUT 19]
+ [Your Attention is Unique: Detecting 360-Degree Video Saliency in Head-Mounted Display for Head Movement Prediction](https://dl.acm.org/doi/10.1145/3240508.3240669) [MM 18]
+ [Gaze Prediction in Dynamic 360° Immersive Videos](http://openaccess.thecvf.com/content_cvpr_2018/papers/Xu_Gaze_Prediction_in_CVPR_2018_paper.pdf) [CVPR 18]
+ [CUB360: Exploiting Cross-Users Behaviors for Viewport Prediction in 360 Video Adaptive Streaming](https://ieeexplore.ieee.org/document/8486606) [ICME 18]
+ [Predictive View Generation to Enable Mobile 360-degree and VR Experiences](http://esdat.ucsd.edu/sites/esdat.ucsd.edu/files/publications/SIGCOMM_workshop_predictive_view.pdf) [VR/AR Network 18]
+ [Fixation Prediction for 360° Video Streaming to Head-Mounted Displays](https://2017.acmmmsys.org/wp-content/uploads/2017/05/Cheng-Hsin-Hsu.pdf) [NOSSDAV 17]
+ [Predicting Head Movement in Panoramic Video: A Deep Reinforcement Learning Approach](https://arxiv.org/pdf/1710.10755) [TPAMI 18]

## Dataset

+ [A Taxonomy and Dataset for 360° Videos](https://arxiv.org/pdf/1905.03823) [MMSys 19]
+ [360-degree Video Gaze Behaviour: A Ground-Truth Data Set and a Classification Algorithm for Eye Movements](https://dl.acm.org/doi/10.1145/3343031.3350947) [MM 19]
+ [Gaze Prediction in Dynamic 360° Immersive Videos](http://openaccess.thecvf.com/content_cvpr_2018/papers/Xu_Gaze_Prediction_in_CVPR_2018_paper.pdf) [CVPR 18]
+ [A Dataset of Head and Eye Movements for 360° Videos](https://hal.archives-ouvertes.fr/hal-01994923/document) [MMSys 18]
+ [360° Video Viewing Dataset in Head-Mounted Virtual Reality](https://dl.acm.org/doi/10.1145/3083187.3083219) [MMSys 17]
+ [360-Degree Video Head Movement Dataset](https://dl.acm.org/doi/10.1145/3083187.3083215) [MMSys 17]
+ [A Dataset of Head and Eye Movements for 360 Degree Images](https://dl.acm.org/doi/10.1145/3083187.3083218) [MMSys 17]
+ [A Dataset for Exploring User Behaviors in VR Spherical Video Streaming](https://dl.acm.org/doi/10.1145/3083187.3083210) [MMSys 17]

--------------------------------------------------------------------------------
/Conference Paper.md:
--------------------------------------------------------------------------------

# 21

## NSDI

+ [Move Fast and Meet Deadlines: Fine-grained Real-time Stream Processing with Cameo](https://www.usenix.org/conference/nsdi21/presentation/xu)

Le Xu, *University of Illinois at Urbana-Champaign;* Shivaram Venkataraman, *UW-Madison;* Indranil Gupta, *University of Illinois at Urbana-Champaign;* Luo Mai, *University of Edinburgh;* Rahul Potharaju, *Microsoft*

+ [Whiz: Data-Driven Analytics Execution](https://www.usenix.org/conference/nsdi21/presentation/grandl)

Robert Grandl, *Google;* Arjun Singhvi, *University of Wisconsin–Madison;* Raajay Viswanathan, *Uber Technologies Inc.;* Aditya Akella, *University of Wisconsin–Madison*
+ [SENSEI: Aligning Video Streaming Quality with Dynamic User Sensitivity](https://www.usenix.org/conference/nsdi21/presentation/zhang-xu)

Xu Zhang and Yiyang Ou, *University of Chicago;* Siddhartha Sen, *Microsoft Research;* Junchen Jiang, *University of Chicago*

## Mobicom

+ [PECAM: Privacy-Enhanced Video Streaming & Analytics via Securely-Recoverable Transformation](https://www.microsoft.com/en-us/research/uploads/prod/2021/02/mobicom21-final80.pdf)

Authors: Hao Wu and XueJin Tian (National Key Lab for Novel Software Technology, Nanjing University); Minghao Li (Cornell University); Yunxin Liu and Ganesh Ananthanarayanan (Microsoft Research); Fengyuan Xu and Sheng Zhong (National Key Lab for Novel Software Technology, Nanjing University)

+ [Elf: Accelerate High-resolution Mobile Deep Vision with Content-aware Parallel Offloading](https://dl.acm.org/doi/10.1145/3447993.3448628)

Authors: Wuyang Zhang (Rutgers University); Zhezhi He (Shanghai Jiao Tong University); Luyang Liu (Google); Zhenhua Jia (NVIDIA); Yunxin Liu (Microsoft Research Asia); Marco Gruteser (Google); Dipankar Raychaudhuri (Rutgers University); Yanyong Zhang (University of Science and Technology of China)

## Infocom

**Video transmission:**

+ **A Universal Transcoding and Transmission Method for Livecast with Networked Multi-Agent Reinforcement Learning**

Xingyan Chen and Changqiao Xu (Beijing University of Posts and Telecommunications, China); Mu Wang (State Key Laboratory of Networking and Switching Technology, China); Zhonghui Wu and Shujie Yang (Beijing University of Posts and Telecommunications, China); Lujie Zhong (Capital Normal University, China); Gabriel-Miro Muntean (Dublin City University, Ireland)

+ **AMIS: Edge Computing Based Adaptive Mobile Video Streaming**

Phil K Mu, Jinkai Zheng, Tom H. Luan and Lina Zhu (Xidian University, China); Mianxiong Dong (Muroran Institute of Technology, Japan); Zhou Su (Shanghai University, China)

+ **An Experience Driven Design for IEEE 802.11ac Rate Adaptation based on Reinforcement Learning**

Syuan-Cheng Chen, Chi-Yu Li and Chui-Hao Chiu (National Chiao Tung University, Taiwan)

+ **Popularity-Aware 360-Degree Video Streaming**

Xianda Chen, Tianxiang Tan and Guohong Cao (The Pennsylvania State University, USA)

+ **Robust 360 Video Streaming via Non-Linear Sampling**

Mijanur R Palash, Voicu Popescu, Amit Sheoran and Sonia Fahmy (Purdue University, USA)

**Video analysis:**

+ **AutoML for Video Analytics with Edge Computing**

Apostolos Galanopoulos, Jose A. Ayala-Romero and Douglas Leith (Trinity College Dublin, Ireland); George Iosifidis (Delft University of Technology, The Netherlands)
+ **Edge-assisted Online On-device Object Detection for Real-time Video Analytics**

Mengxi Hanyao, Yibo Jin, Zhuzhong Qian, Sheng Zhang and Sanglu Lu (Nanjing University, China)

+ **Enabling Edge-Cloud Video Analytics for Robotics Applications**

Yiding Wang and Weiyan Wang (Hong Kong University of Science and Technology, Hong Kong); Duowen Liu (Hong Kong University of Science & Technology, Hong Kong); Xin Jin (Johns Hopkins University, USA); Junchen Jiang (University of Chicago, USA); Kai Chen (Hong Kong University of Science and Technology, China)

+ **Towards Video Streaming Analysis and Sharing for Multi-Device Interaction with Lightweight DNNs**

Yakun Huang, Hongru Zhao and Xiuquan Qiao (Beijing University of Posts and Telecommunications, China); Jian Tang (Syracuse University, USA); Ling Liu (Georgia Tech, USA)

--------------------------------------------------------------------------------
/Links and News.md:
--------------------------------------------------------------------------------

# Links and News

This document shares some useful and interesting links and news encountered by the author.

Video:

+ [Lei Xiaohua (leixiaohua1020)'s blog](https://blog.csdn.net/leixiaohua1020)
+ [LiveVideoStack](https://www.zhihu.com/org/livevideostack)
+ [Making Your Own Simple MPEG-DASH Server (Windows 10)](https://www.instructables.com/Making-Your-Own-Simple-DASH-MPEG-Server-Windows-10/)
+ [Building an HTTP-FLV live-streaming media server with nginx](https://www.trickyedecay.me/2019/03/17/how-to-setup-an-live-server-with-nginx-base-on-http-flv/)
+ [Serving live web streams with nginx-http-flv-module and flv.js](https://blog.csdn.net/rain_meter/article/details/88127209)
+ [Awesome Video](https://github.com/krzemienski/awesome-video)

Research:

+ [Explore connected papers in a visual graph](https://www.connectedpapers.com/)

--------------------------------------------------------------------------------
/Multimedia Libraries and Repositories.md:
--------------------------------------------------------------------------------

# Multimedia Libraries and Repositories

This document shares some useful libraries and repositories collected by the author.

+ [Another repository](https://github.com/VideoForage/Video-Lit)

Framework:

+ [webrtc](https://webrtc.org/)
+ [GPAC](https://github.com/gpac/gpac)
+ [QUIC](https://www.chromium.org/quic)
+ [DASH](https://github.com/Dash-Industry-Forum/dash.js)

Coding:

+ [FFmpeg](https://ffmpeg.org/)
+ [x265](https://github.com/videolan/x265)
+ [SVC](https://avc.hhi.fraunhofer.de/svc)

Software:

+ [OBS](https://obsproject.com/)
+ [VLC player](https://www.videolan.org/)

--------------------------------------------------------------------------------
/Paper Weekly.md:
--------------------------------------------------------------------------------

# Paper Reading

This document records papers I have read each week since April 2021.

## Keywords

#abr #video_system #live #VR #360 #viewport_prediction #video_analysis

## Papers

+ **Enabling Edge-Cloud Video Analytics for Robotics Applications**

21 Infocom HKUST #video_analysis

This paper applies super-resolution (SR) to video analytics, targeting the poor tail accuracy seen in robotics applications. The authors split the problem into two cases: class-wise tail accuracy (under certain lighting and scenes, accuracy is highly sensitive to resolution) and frame-wise tail accuracy (some frames are inherently hard to detect).

They design two modules: ASR, which uses SR on the server side to enhance image quality, and CAC, which resembles rate control.

![21ASSR_1](asserts/21ASSR_1.PNG)

As I understand it, CAC is similar in spirit to the next paper below. Tail-frame detection: when an object's size is below 1% of the frame size, the frame is considered hard to detect and the resolution should be raised (a minimal sketch of this check follows). The bitrate is set by profiling: configurations that achieve high accuracy are found through repeated trials.
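A minimal sketch of the 1%-of-frame tail-frame heuristic described above; the function and threshold names are my own, not the paper's:

```python
def is_tail_frame(boxes, frame_w, frame_h, area_ratio=0.01):
    """Flag a frame as hard to detect ("tail frame") when any detected
    object covers less than `area_ratio` of the frame area (the 1%-of-frame
    heuristic summarized above; names and box format are illustrative)."""
    frame_area = frame_w * frame_h
    for (x, y, w, h) in boxes:        # boxes as (left, top, width, height)
        if (w * h) / frame_area < area_ratio:
            return True               # small object -> request higher resolution
    return False
```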
+ **Server-Driven Video Streaming for Deep Learning Inference**

20 Sigcomm University of Chicago #video_analysis

This paper targets video analytics (object detection/segmentation). It uses a camera (client)-server architecture: the camera uploads video and the server runs DNN inference. The novelty is splitting the upload into two phases: (1) upload a low-quality stream, which lets the server run a first-pass inference and return feedback to guide the next upload; (2) upload high-quality video to raise detection accuracy. The feedback is the set of bounding boxes from the first pass that must be re-uploaded in high quality for further checking. Compared with pure client-side processing (low accuracy, time-consuming) or pure server-side processing (bandwidth-hungry), the two-phase design aims to reduce bandwidth.

![20sigcomm_dds](asserts/20sigcomm_dds.PNG)

Taking the figure above as an example, the top left shows objects detected in earlier frames and the bottom left shows detections on the low-quality frame. The authors propose two rules to drop unnecessary retransmissions: (1) boxes that overlap previous detections by more than 30%; (2) objects larger than 4% of the frame (large objects are detectable even at low quality). The remaining bounding boxes are returned to the camera as feedback, and high-quality video for those regions is retransmitted (see the sketch below).
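A minimal sketch of this feedback filter under the two rules above. I use IoU as the overlap measure, which is an assumption; the summary only says "overlap". Names and the box format are illustrative:

```python
def feedback_regions(candidates, prev_boxes, frame_area,
                     overlap_thresh=0.30, large_thresh=0.04):
    """Keep only the low-quality detections that still need a
    high-quality re-upload, per the two rules summarized above."""
    def iou(a, b):
        ax1, ay1, ax2, ay2 = a
        bx1, by1, bx2, by2 = b
        iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
        ih = max(0.0, min(ay2, by2) - max(ay1, by1))
        inter = iw * ih
        union = ((ax2 - ax1) * (ay2 - ay1)
                 + (bx2 - bx1) * (by2 - by1) - inter)
        return inter / union if union > 0 else 0.0

    kept = []
    for box in candidates:                        # boxes as (x1, y1, x2, y2)
        area = (box[2] - box[0]) * (box[3] - box[1])
        if area / frame_area > large_thresh:      # rule 2: big enough already
            continue
        if any(iou(box, p) > overlap_thresh for p in prev_boxes):
            continue                              # rule 1: already confirmed
        kept.append(box)                          # needs a high-quality pass
    return kept
```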
+ **DeepVista: 16K Panoramic Cinema on Your Mobile Device**

21 WWW University of Minnesota #360

This paper uses an edge server to transcode panoramic video, extract the viewport, and deliver it, which differs from the traditional tile-based approach.

The video takes two forms. PS (panoramic stream): the lowest quality, with whole frames downloaded directly from the server to keep the picture continuous. VS (viewport stream): viewport-adaptive; the server sends high-quality (16K) video to the edge, which decodes, transcodes, re-encodes, and sends it to the client. The performance bottleneck is the edge-to-client bandwidth.

The paper mainly describes the work on the edge. A 16K frame is first split into two 8K frames, which the edge decodes on two GPUs in parallel to cut latency. An LSTM predicts the viewport (the prediction horizon is short; the paper computes it with a formula based on the frames awaiting transcoding). Using CubeMap projection, each frame is divided into blocks; the blocks inside the field of view are selected by their coordinates and reorganized into a single viewport frame. The authors then build a QoE model and pick the quality level by enumeration. (A small question: the authors say the levels of both PS and VS are decided, but I did not see the PS decision, nor how the two compete for bandwidth.) A strong point of this paper is the full system implementation.

+ **Intelligent Edge-Assisted Crowdcast with Deep Reinforcement Learning for Personalized QoE**

20 Infocom Simon Fraser University

On top of the traditional CDN-viewers architecture, the authors add edge servers to help with transcoding and distribution, so the system must schedule whether a given quality version of a given channel is served by the CDN or by an edge. They first formulate the optimization objective, covering QoE (latency, channel switching), transcoding cost, and bandwidth cost; the decision is whether each user u fetches channel h at version v from the CDN or from an edge.

The authors solve it with A3C. The state covers, in turn, resources, users, and demand; the action is which server/CDN serves the user, and that server then selects the h and v that maximize the reward. (It is not entirely clear to me whether this needs an extra algorithm or whether the policy directly outputs both decisions.)

The highlight of this paper is introducing the edge and the corresponding optimization.

+ **Popularity-Aware 360-Degree Video Streaming**

21 Infocom The Pennsylvania State University #360

The contribution of this paper is macrotiles. Traditional 360° streaming splits the video into many independently encoded tiles, which rules out motion-compensated prediction across tiles and hurts coding efficiency. The authors therefore encode the region users jointly focus on (the ROI) as one unit, keeping tile coding elsewhere. The ROI can then be encoded at high quality, and since it covers a large area, inter-frame prediction applies, improving coding efficiency.

The authors first formulate a QoE-maximization problem and prove it NP-hard via the multiple knapsack problem.

They then propose a clustering method to extract the ROI regions, each of which can be encoded independently.

For the current user, they first predict the viewport and check whether some ROI covers it. If so, that ROI is downloaded at the highest bitrate the bandwidth budget allows, and the other regions at the lowest bitrate. Otherwise, as in the traditional tile approach, a simple method allocates bitrates between the viewport and the remaining regions.

+ **PARIMA: Viewport Adaptive 360-Degree Video Streaming**

21 WWW Indian Institute of Technology Kharagpur, India #viewport_prediction

This paper mainly does viewport prediction, using prime objects and user trajectories. The main novelty lies in extracting high-level object features.

For prediction, the authors first use an ARIMA model to predict the viewing trajectory, then fuse the different features with a Passive-Aggressive regression model.

The object-based idea, though, is quite similar to [this paper](https://dl.acm.org/doi/10.1145/3328914).

+ **SphericRTC: A System for Content-Adaptive Real-Time 360-Degree Video Communication**

20 MM SUNY Binghamton #360

This paper reads like an optimization at the codec level, integrated into WebRTC. The main optimization is determining a per-pixel magnification: the closer a pixel is to the viewpoint, the higher its encoding resolution, i.e., the more the pixel is magnified (denser sampling, as I understand it).

From the viewpoint vector and each pixel's vector, the authors compute the angular difference and derive the pixel size $g$. The work is mostly about determining each pixel's size (density) from the view direction and a given resolution; the network adaptation part presumably relies on WebRTC itself.

+ **Firefly: Untethered Multi-user VR for Commodity Mobile Devices**

20 ATC University of Minnesota, Twin Cities #VR

This paper is about VR rendering. The optimization techniques include offline content preparation (the background is rendered and encoded on the server in advance), viewport-adaptive streaming (tiling, then predicting and downloading the user's viewport background), and adaptive content quality control (allocating limited bandwidth among multiple users). The authors show the system can support 15 users at 60 FPS.

![20ATC_Firefly1](asserts/20ATC_Firefly1.PNG)

1. Offline rendering engine: frames are split horizontally into multiple tiles, prepared in several quality versions, rendered and encoded in advance, and stored in a database for download.
2. Viewport prediction: the coordinates comprise translational movement (x, y, z) and rotational movement (yaw, pitch). A linear model trained on the previous 50 ms predicts the next 150 ms, so the time granularity is milliseconds. The authors also enlarge the FoV and increase the number of tiles when the user is stationary (to guard against sudden movement after standing still).
3. Adaptive quality control: a simple algorithm allocates bandwidth among users and decides the download quality of each tile.
4. Client-side hierarchical cache: the client keeps a multi-level cache; L1 stores decoded tiles, while L2 and L3 store tiles awaiting decoding, with L2 being faster and holding the more urgent tiles.
5. Handling dynamic foreground objects: like other work, small objects are rendered on the client, at a quality chosen according to the client's resources.

The authors implemented the system on Android.

+ **Stick: A Harmonious Fusion of Buffer-based and Learning-based Approach for Adaptive Streaming**

20 Infocom #abr

The ABR here is buffer-based, but the authors raise a question: how should the buffer bound of a buffer-based scheme be set under different bandwidths and videos? They therefore use a DRL model (DDPG) to decide the buffer-bound value.

They also note that the buffer bound can stay constant for stretches of time, so they design a Trigger algorithm: given the bandwidth, total buffer size, and current buffer, it outputs the probability of keeping the current buffer bound, i.e., whether to activate the DRL model to produce a new bound, saving computation.

![19Infocom_stick](asserts/19Infocom_stick.PNG)

+ **SenSei: Aligning Video Streaming Quality with Dynamic User Sensitivity**

21 NSDI University of Chicago

The point here is that users' sensitivity differs across chunks of the same video: at a shot on goal in a soccer match, for example, users demand higher quality and fewer stalls, so the corresponding QoE weights are higher. The overall idea matches HotDASH (18 ICNP). The authors inject stalls into different chunks, recruit raters on Amazon MTurk to watch and score them, and then fit per-chunk weights.

+ **Rubiks: Practical 360-Degree Streaming for Smartphones**

18 MobiSys UT Austin #video_system

The authors implement and optimize a 360° video system on smartphones. On the coding side, they split the video into tiles and layers (SVC). On the transmission side, they formulate a QoE objective that accounts for decoding time, with the per-tile bitrate and the number of layers as variables, and solve it by search.

The Android implementation is also a highlight of this paper.

+ **Favor: Fine-Grained Video Rate Adaptation**

18 MMSys UT Austin #abr

Beyond the traditional problem setting, this paper adds frame dropping and the playout rate. Frame dropping lowers the frame rate and shrinks the video size; slowing the playout rate stretches playback and avoids stalls. The QoE is optimized accordingly.

Methodologically, the authors use an MPC-like approach with a greedy allocation algorithm.

The implementation is based on VLC, with an HTTP client.

+ **Learning in situ: a randomized experiment in video streaming**

20 NSDI Stanford University [website](https://puffer.stanford.edu) #abr

The main contributions are a streaming platform deployed in the wild that collects data from real viewers, and an ABR algorithm (Fugu) that combines a DNN (supervised learning) for bandwidth prediction with classic MPC-style control solved by dynamic programming. Reading the whole paper, it feels a bit like a technical report.

Since Pensieve (2017), recent ABR papers have mostly tackled DRL's failure to generalize, i.e., the gap between (offline) experiments and (online) reality. This paper first collects real viewing data and measures each algorithm's QoE, finding that Pensieve can even underperform the traditional buffer-based scheme.

Q: The authors keep emphasizing the real environment, but what is its advantage over lab simulation? Lab bandwidth traces are also collected from the real world, and little else differs; might a sufficiently large lab dataset approximate the real environment?

On the algorithm side, the authors propose a DNN-based supervised bandwidth predictor (inputs at the top left of p. 7) whose output is a probability distribution over transmission-time bins for a chunk of known size.

Q: The intuitive DL approach to bandwidth prediction is a time-series model such as an LSTM. What is the advantage of this method, and is it meant to replace the LSTM?

Next, the authors solve for the bitrate with dynamic programming, though the description is rather brief. I previously studied DP in the Viterbi algorithm for HMMs and in RL. DP here has two dimensions: the horizontal temporal transition, here chunk i, and the vertical state distribution, here size s. At each step of chunk i, one computes the product of the next step's maximum value v* and the corresponding state-transition probability (the authors sweep from back to front, so it is i-1). From a DP perspective, the differences here are the direction of the sweep and, because the measurement is deterministic, the omission of transition probabilities. The Pr term is introduced to compute the QoE and has little to do with DP or probabilistic transitions; it is more like a split within one state ($K_i^{s}$) at the next step. A bit more explanation from the authors would have helped. (Seeing that formula plus value iteration first reminded me of DP in RL, and I wondered why it was not solved with a loop.) A toy value-iteration sketch follows.
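Below is a minimal deterministic backward value-iteration sketch in the spirit of the DP discussed above. All names and parameters are illustrative, not Fugu's exact formulation: `sizes[i][a]` is chunk i's size at level a, `utility[a]` its quality, and `throughput` is a point estimate, whereas Fugu instead weights by the predicted distribution of transmission times.

```python
def plan_bitrate(sizes, utility, throughput, chunk_sec=2.0,
                 buf0=10.0, buf_cap=20.0, mu=100.0, step=0.5):
    """Toy DP: maximize quality - mu*stall - smoothness over all chunks
    (illustrative sketch, not Fugu's exact formulation)."""
    n, levels = len(sizes), len(utility)
    nb = int(buf_cap / step) + 1
    # V[b][q] = best future QoE with buffer b*step seconds after having
    # just fetched level q; zero beyond the planning horizon.
    V = [[0.0] * levels for _ in range(nb)]
    first = [[0] * levels for _ in range(nb)]
    for i in range(n - 1, -1, -1):                 # backward sweep
        newV = [[0.0] * levels for _ in range(nb)]
        for b in range(nb):
            for q in range(levels):
                best, arg = float('-inf'), 0
                for a in range(levels):
                    t = sizes[i][a] / throughput   # download time
                    stall = max(t - b * step, 0.0)
                    nxt = min(max(b * step - t, 0.0) + chunk_sec, buf_cap)
                    r = utility[a] - mu * stall - abs(utility[a] - utility[q])
                    v = r + V[min(int(nxt / step), nb - 1)][a]
                    if v > best:
                        best, arg = v, a
                newV[b][q] = best
                if i == 0:                         # remember the first action
                    first[b][q] = arg
        V = newV
    return first[min(int(buf0 / step), nb - 1)][0]

# e.g. plan_bitrate([[1.0, 2.5]] * 10, [1.0, 3.0], throughput=1.5)
# returns the level to fetch for the first of ten 2-second chunks.
```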
ABR has long been two threads, RL and classic control, competing and merging, though lately they increasingly talk past each other.

------------------------------------------------------------------------------------------------------------------------------------------------------

+ **PiTree: Practical Implementation of ABR Algorithms Using Decision Trees**

19 MM Tsinghua University #abr

Uses decision trees to convert traditional DRL-based ABR algorithms into a lightweight form without much loss in performance.

+ **Ekya: Continuous Learning of Video Analytics Models on Edge Compute Servers**

20 arXiv UC Berkeley #video_analysis #video_system

Brings continuous learning into video analytics: inference and training proceed side by side.

Continuous training on limited edge resources requires smartly deciding **when** to retrain each video stream's model, **how much resources** to allocate, and **what configurations** to use. But making these decisions has two challenges.

When to allocate: after each training round. Resources: how many units of GPU resources. Configurations: number of training epochs, batch sizes, number of neurons in the last layer, number of frozen layers, and fraction of training data.

The authors design a heuristic algorithm for the allocation; the post-allocation accuracy is estimated with a small profiler.

+ **HotDASH: Hotspot Aware Adaptive Video Streaming using Deep Reinforcement Learning**

18 ICNP IIT Kharagpur, Kharagpur, India #abr

This paper prefetches hotspot video segments. When watching a video, there are often segments users care more about (e.g., the decisive moments of a match); when the network is good, high-quality versions of these segments are downloaded in advance so that they play at higher quality.

The authors build on Pensieve and use DRL: at each step, decide the bitrate of the next ordinary chunk, decide the bitrate of the next hotspot chunk, and then decide whether to download the hotspot chunk.

They implement a fairly complete system, with the ABR decision running as a service on the server side. Hotspots are scored with a different QoE quality function that yields higher values.

+ **User Centered Adaptive Streaming of Dynamic Point Clouds with Low Complexity Tiling**

20 MM

I did not fully get this paper. It deals with volumetric video transmission, which can be split into tiles just like 360° video. Here the split follows the capturing cameras: a ray is cast from the object's center to each point, and the point is assigned to the camera whose view vector is closest to that ray (my understanding). Once tiled, different quality representations are available, and tiles are requested according to the user's position and field of view.

+ **MultiLive: Adaptive Bitrate Control for Low-delay Multi-party Interactive Live Streaming**

20 Infocom Tsinghua University #live

This paper addresses multi-party interactive live streaming (co-streaming), where every endpoint is both an uploader and a downloader. The authors first formulate the problem: maximize QoE, covering quality, quality variation, stalls, and latency, under upload/download bandwidth caps, with each streamer's upload bitrate and each receiver's download bitrate as the variables.

They solve it with non-linear programming, yielding a transmission bitrate fixed per time window, and within each frame a PID controller corrects the error, giving the (download) transmission bitrate from each sender to each receiver (a toy PID sketch follows this summary). On the upload side, they design a clustering-based method to decide which bitrate a stream should upload. (The authors say a single SVC bitrate is uploaded; as I understand SVC, with base and enhancement layers, one stream can indeed be uploaded, yet the later algorithm can upload multiple bitrates, which I find a bit confusing.)

From the formulation to the solution to the parameter settings, quite a few parts are debatable, but the scenario and the idea are good. Questions: (1) The setup looks client-server, yet the overall scenario feels P2P. (2) The formulation caps both upload and download bandwidth, and a user can act as sender and receiver at once; how are the shares of these two kinds of bandwidth separated and balanced, and could the bandwidth constraint be dropped as in other papers? (3) The authors avoid transcoding by using multi-bitrate SVC coding; how do transcoding and SVC compare, given that transcode-and-forward is a common model in client-server architectures? (4) Parts of the formulation are per frame and parts per time interval, which feels inconsistent. (5) Equation 4 is not very intuitive: if playback is slowed, does the latter term become positive? Does it matter if the buffer empties? Could the target buffer size itself be an optimization variable? The playout speed changes once the threshold is hit, but for how long, and what is the QoE impact of the change? (6) The bitrate appears to be adjusted frame by frame; would that cause bitrate oscillation?
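A toy PID controller of the kind the MultiLive summary above mentions for per-frame rate correction; gains and usage are illustrative, not the paper's exact controller:

```python
class PID:
    """Toy PID for nudging the per-frame sending rate toward what the
    path actually sustained (illustrative gains, not MultiLive's)."""
    def __init__(self, kp=0.8, ki=0.1, kd=0.05):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_err = 0.0

    def correct(self, target_rate, measured_rate, dt=1 / 30):
        err = target_rate - measured_rate          # kbps error this frame
        self.integral += err * dt
        deriv = (err - self.prev_err) / dt
        self.prev_err = err
        return target_rate + (self.kp * err
                              + self.ki * self.integral
                              + self.kd * deriv)

# Usage: start from the window-level rate produced by the NLP solver and
# correct it each frame with the measured delivery rate as feedback.
pid = PID()
send_rate = 2000.0                                 # kbps, from the optimizer
for measured in (1800.0, 1850.0, 1900.0):          # per-frame measurements
    send_rate = pid.correct(send_rate, measured)
```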
------------------------------------------------------------------------------------------------------------------------------------------------------

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------

# Video-Streaming-Research-Papers

I am a graduate student working on **video streaming and video analysis (processing)**. I built this repository to share some knowledge about research in the field of multimedia. I hope it can help people who are interested in multimedia research and development.

The **documents** include:

[Repository of papers](https://github.com/jinyucn/Video-Streaming-Research/blob/main/Repository%20of%20papers.md): Some **papers** about video delivery and processing (paper list).

[Conference Paper](https://github.com/jinyucn/Video-Streaming-Research-Papers/blob/main/Conference%20Paper.md): Papers indexed by **conference**.

[Paper Weekly](https://github.com/jinyucn/Video-Streaming-Research/blob/main/Paper%20Weekly.md): **Summaries of papers** read every week.

[Multimedia Libraries and Repositories](https://github.com/jinyucn/Video-Streaming-Research/blob/main/Multimedia%20Libraries%20and%20Repositories.md): Some useful **tools and repositories**.

[Links and News](https://github.com/jinyucn/Video-Streaming-Research/blob/main/Links%20and%20News.md): Some useful and interesting **links and news**.

[Research Group](https://github.com/jinyucn/Video-Streaming-Research/blob/main/Research%20Group.md): Some **research groups** collected from the published papers.

[The way of multimedia research](https://github.com/jinyucn/Video-Streaming-Research/blob/main/The%20way%20of%20multimedia%20research.md): The **technical stack** in video streaming.

The repository is being updated.

Feel free to drop an **email** to [jychencs AT gmail DOT com] if you have any questions or want an in-depth discussion about multimedia research.

--------------------------------------------------------------------------------
/Repository of papers.md:
--------------------------------------------------------------------------------

# Repository of papers

Papers related to video transmission/processing often appear in journals and conferences in the multimedia, networking, and systems fields.

The most relevant top journals for video transmission/processing include **TON, TMC, TIP, TCSVT, TMM**, etc.

The most relevant top conferences for video transmission/processing include **SIGCOMM, MobiCom, INFOCOM, ACM MM, MMSys**, etc.

Other relevant conferences include **NSDI, OSDI, MobiSys, SOSP, ISCA**, etc.

[Another paper list](https://github.com/VideoForage/Video-Lit).
## Video Streaming

+ [SENSEI: Aligning Video Streaming Quality with Dynamic User Sensitivity](https://www.usenix.org/system/files/nsdi21-zhang.pdf) [NSDI 21]
+ [Learning in situ: a randomized experiment in video streaming](https://www.usenix.org/system/files/nsdi20-paper-yan.pdf) [NSDI 20] [[code]](https://github.com/stanfordsnr/puffer)
+ [OnRL: Improving Mobile Video Telephony via Online Reinforcement Learning](https://dl.acm.org/doi/10.1145/3372224.3419186) [Mobicom 20]
+ [GROOT: A Real-time Streaming System of High-Fidelity Volumetric Videos](https://juheonyi.github.io/files/GROOT.pdf) [Mobicom 20]
+ [ViVo: Visibility-Aware Mobile Volumetric Video Streaming](https://www-users.cs.umn.edu/~fengqian/paper/vivo_mobicom20.pdf) [Mobicom 20]
+ [Stick: A Harmonious Fusion of Buffer-based and Learning-based Approach for Adaptive Streaming](https://godka.github.io/Infocom_20_Stick.pdf) [Infocom 20]
+ [Interpreting Deep Learning-Based Networking Systems](https://dl.acm.org/doi/10.1145/3387514.3405859) [Sigcomm 20] [Metis]
+ [Comyco: Quality-Aware Adaptive Video Streaming via Imitation Learning](https://arxiv.org/pdf/1908.02270) [MM 19] [[code]](https://github.com/thu-media/Comyco)
+ [Learning to Coordinate Video Codec with Transport Protocol for Mobile Video Telephony](https://dl.acm.org/doi/10.1145/3300061.3345430) [Mobicom 19] [Concerto]
+ [PiTree: Practical Implementation of ABR Algorithms Using Decision Trees](https://zilimeng.com/papers/pitree-mm19.pdf) [MM 19] [[code]](https://github.com/transys-project/pitree)
+ [Jigsaw: Robust Live 4K Video Streaming](https://www.cs.utexas.edu/~jianhe/jigsaw-mobicom19.pdf) [Mobicom 19]
+ [Intelligent Edge-Assisted Crowdcast with Deep Reinforcement Learning for Personalized QoE](https://ieeexplore.ieee.org/document/8737456/) [Infocom 19] [DeepCast]
+ [QFlow: A Reinforcement Learning Approach to High QoE Video Streaming over Wireless Networks](https://arxiv.org/abs/1901.00959) [Mobihoc 19]
+ [Edge Computing Assisted Adaptive Mobile Video Streaming](https://ieeexplore.ieee.org/document/8395060) [TMC 19]
+ [Oboe: Auto-tuning Video ABR Algorithms to Network Conditions](https://engineering.purdue.edu/~isl/papers/sigcomm18-final128.pdf) [Sigcomm 18]
+ [HotDASH: Hotspot Aware Adaptive Video Streaming using Deep Reinforcement Learning](https://ieeexplore.ieee.org/iel7/8526479/8526788/08526814.pdf) [ICNP 18] [[code]](https://github.com/SatadalSengupta/hotdash)
+ [QARC: Video Quality Aware Rate Control for Real-Time Video Streaming via Deep Reinforcement Learning](https://arxiv.org/abs/1805.02482) [MM 18]
+ [Ensemble Adaptive Streaming – A New Paradigm to Generate Streaming Algorithms via Specializations](https://ieeexplore.ieee.org/document/8681142) [TMC 18]
+ [Neural Adaptive Video Streaming with Pensieve](https://people.csail.mit.edu/hongzi/content/publications/Pensieve-Sigcomm17.pdf) [Sigcomm 17] [[code]](https://github.com/hongzimao/pensieve)
+ [CS2P: Improving video bitrate selection and adaptation with data-driven throughput prediction](https://cs.cmu.edu/~junchenj/cs2p.pdf) [Sigcomm 16]
+ [BOLA: Near-optimal bitrate adaptation for online videos](https://arxiv.org/pdf/1601.06748.pdf) [Infocom 16]
+ [mDASH: A Markov Decision-Based Rate Adaptation Approach for Dynamic HTTP Streaming](https://ieeexplore.ieee.org/iel7/6046/4456689/07393865.pdf) [TMM 16]
+ [A Control-Theoretic Approach for Dynamic Adaptive Video Streaming over HTTP](https://conferences.sigcomm.org/sigcomm/2015/pdf/papers/p325.pdf) [Sigcomm 15] [MPC]
+ [A Buffer-Based Approach to Rate Adaptation: Evidence from a Large Video Streaming Service](https://web.stanford.edu/class/cs244/papers/sigcomm2014-video.pdf) [Sigcomm 14] [Buffer-Based]
+ [A Control-Theoretic Approach to Rate Adaption for DASH Over Multiple Content Distribution Servers](https://ieeexplore.ieee.org/document/6662391) [TCSVT 14]
+ [Improving fairness, efficiency, and stability in HTTP-based adaptive video streaming with FESTIVE](https://dl.acm.org/doi/10.1145/2413176.2413189) [CoNEXT 12]

| Year | Method | Detail |
| ---- | ------ | ------ |
| 20   | [Fugu](https://www.usenix.org/system/files/nsdi20-paper-yan.pdf) [NSDI 20] | DNN (bandwidth prediction) + DP (control) |
| 20   | [OnRL](https://dl.acm.org/doi/10.1145/3372224.3419186) [Mobicom 20] | Online RL |
| 20   | [Stick](https://godka.github.io/Infocom_20_Stick.pdf) [Infocom 20] | Buffer-based + learning-based |
| 19   | [Comyco](https://arxiv.org/pdf/1908.02270) [MM 19], [Concerto](https://dl.acm.org/doi/10.1145/3300061.3345430) [Mobicom 19] | Imitation learning |
| 19   | [PiTree](https://zilimeng.com/papers/pitree-mm19.pdf) [MM 19] | Explainable learning |
| 18   | [Oboe](https://engineering.purdue.edu/~isl/papers/sigcomm18-final128.pdf) [Sigcomm 18] | Auto-tuning parameters |
| 17   | [Pensieve](https://people.csail.mit.edu/hongzi/content/publications/Pensieve-Sigcomm17.pdf) [Sigcomm 17], [update](https://openreview.net/pdf?id=SJlCkwN8iV) [ICML 19] | Reinforcement learning |
| 16   | [CS2P](https://cs.cmu.edu/~junchenj/cs2p.pdf) [Sigcomm 16] | Data-driven throughput prediction |
| 16   | [BOLA](https://arxiv.org/pdf/1601.06748.pdf) [Infocom 16] | Buffer-based + Lyapunov optimization |
| 15   | [MPC](https://conferences.sigcomm.org/sigcomm/2015/pdf/papers/p325.pdf) [Sigcomm 15] | MPC |
| 14   | [Buffer-Based](https://web.stanford.edu/class/cs244/papers/sigcomm2014-video.pdf) [Sigcomm 14] | Buffer-based |
| 12   | [Rate-Based](https://dl.acm.org/doi/10.1145/2413176.2413189) [CoNEXT 12] | Rate-based |
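To make the buffer-based row above concrete, here is a minimal sketch in the spirit of the BBA scheme from the Sigcomm 14 paper: a linear map from buffer occupancy to bitrate between a reservoir and a cushion. The thresholds and names are illustrative, not the paper's exact configuration:

```python
def buffer_based_bitrate(buffer_sec, bitrates, reservoir=5.0, cushion=10.0):
    """Map buffer occupancy to a bitrate (BBA-style sketch): lowest rate
    below the reservoir, highest above reservoir+cushion, linear between."""
    rates = sorted(bitrates)
    if buffer_sec <= reservoir:              # danger zone: protect the buffer
        return rates[0]
    if buffer_sec >= reservoir + cushion:    # safe zone: go all in
        return rates[-1]
    frac = (buffer_sec - reservoir) / cushion
    target = rates[0] + frac * (rates[-1] - rates[0])
    # pick the highest available bitrate not exceeding the linear target
    return max(r for r in rates if r <= target)

# e.g. buffer_based_bitrate(8.0, [300, 750, 1200, 2400, 4800]) -> 1200
```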
## Super Resolution in Video Streaming

+ [Efficient Volumetric Video Streaming Through Super Resolution](https://dl.acm.org/doi/10.1145/3446382.3448663) [HotMobile 21]
+ [SplitSR: An End-to-End Approach to Super-Resolution on Mobile Devices](https://ubicomplab.cs.washington.edu/pdfs/splitsr.pdf) [IMWUT 21]
+ [Neural-Enhanced Live Streaming: Improving Live Video Ingest via Online Learning](https://dl.acm.org/doi/abs/10.1145/3387514.3405856) [Sigcomm 20] [LiveNAS]
+ [NEMO: Enabling Neural-enhanced Video Streaming on Commodity Mobile Devices](https://dl.acm.org/doi/10.1145/3372224.3419185) [Mobicom 20]
+ [Streaming 360-Degree Videos Using Super-Resolution](https://ieeexplore.ieee.org/document/9155477) [Infocom 20] [[code]](https://github.com/VideoForage/Video-Super-Resolution)
+ [SR360: Boosting 360-Degree Video Streaming with Super-Resolution](https://dl.acm.org/doi/abs/10.1145/3386290.3396929) [NOSSDAV 20]
+ [Improving Quality of Experience by Adaptive Video Streaming with Super-Resolution](https://ieeexplore.ieee.org/document/9155384) [Infocom 20]
+ [Supremo: Cloud-Assisted Low-Latency Super-Resolution in Mobile Devices](https://arxiv.org/pdf/1908.07985) [TMC 20]
+ [MobiSR: Efficient On-Device Super-Resolution through Heterogeneous Mobile Processors](https://arxiv.org/pdf/1908.07985) [Mobicom 19]
+ [Dejavu: Enhancing Videoconferencing with Prior Knowledge](http://panhu.me/pdf/2019/Dejavu.pdf) [HotMobile 19]
+ [Bridging the Edge-Cloud Barrier for Real-time Advanced Vision Analytics](https://www.usenix.org/system/files/hotcloud19-paper-wang.pdf) [HotCloud 19]
+ [Neural Adaptive Content-aware Internet Video Delivery](https://www.usenix.org/system/files/osdi18-yeo.pdf) [OSDI 18] [NAS] [[code]](https://github.com/kaist-ina/NAS_public)

## Live Video Streaming

*Some articles may be repeated.*

+ [Look Ahead at the First-mile in Livecast with Crowdsourced Highlight Prediction](https://www2.cs.sfu.ca/~jcliu/Papers/LookAhead20.pdf) [Infocom 20]
+ [Neural-Enhanced Live Streaming: Improving Live Video Ingest via Online Learning](https://dl.acm.org/doi/abs/10.1145/3387514.3405856) [Sigcomm 20] [LiveNAS]
+ [MultiLive: Adaptive Bitrate Control for Low-delay Multi-party Interactive Live Streaming](http://www.ece.sunysb.edu/~xwang/public/paper/MultiLive.pdf) [Infocom 20]
+ [Vabis: Video Adaptation Bitrate System for Time-Critical Live Streaming](https://jackkosaian.github.io/files/papers/sigcomm2019vantage.pdf) [TMM 20]
+ [Optimizing Social Welfare of Live Video Streaming Services in Mobile Edge Computing](https://ieeexplore.ieee.org/document/8653413) [TMC 20]
+ [Intelligent Edge-Assisted Crowdcast with Deep Reinforcement Learning for Personalized QoE](https://ieeexplore.ieee.org/document/8737456/) [Infocom 19] [DeepCast]
+ [Vantage: Optimizing video upload for time-shifted viewing of social live streams](https://jackkosaian.github.io/files/papers/sigcomm2019vantage.pdf) [Sigcomm 19]
+ [Edge-based Transcoding for Adaptive Live Video Streaming](https://www.usenix.org/conference/hotedge19/presentation/dogga) [HotEdge 19]
+ [QARC: Video Quality Aware Rate Control for Real-Time Video Streaming based on Deep Reinforcement Learning](https://dl.acm.org/doi/abs/10.1145/3240508.3240545) [MM 18]
+ [Characterizing User Behaviors in Mobile Personal Livecast: Towards an Edge Computing-assisted Paradigm](https://dl.acm.org/doi/10.1145/3219751) [ToMM 18]
+ [Cloud-Assisted Crowdsourced Livecast](https://dl.acm.org/doi/10.1145/3095755) [ToMM 17]
+ [Coping With Heterogeneous Video Contributors and Viewers in Crowdsourced Live Streaming: A Cloud-Based Approach](https://www2.cs.sfu.ca/~jcliu/Papers/HeterogeneousVideo.pdf) [TMM 16]
+ [When Crowd Meets Big Video Data: Cloud-Edge Collaborative Transcoding for Personal Livecast](https://ieeexplore.ieee.org/document/8478387) [TNSE 18]

19 MM Grand Challenge:

+ [A Hybrid Control Scheme for Adaptive Live Streaming](https://dl.acm.org/doi/pdf/10.1145/3343031.3356049)
+ [HD3: Distributed Dueling DQN with Discrete-Continuous Hybrid Action Spaces for Live Video Streaming](https://dl.acm.org/doi/pdf/10.1145/3343031.3356052)
+ [Continuous Bitrate & Latency Control with Deep Reinforcement Learning for Live Video Streaming](https://dl.acm.org/doi/pdf/10.1145/3343031.3356063)
+ [BitLat: Bitrate-adaptivity and Latency-awareness Algorithm for Live Video Streaming](https://dl.acm.org/doi/pdf/10.1145/3343031.3356069)
+ [Latency Aware Adaptive Video Streaming using Ensemble Deep Reinforcement Learning](https://dl.acm.org/doi/pdf/10.1145/3343031.3356071)

## 360-degree video

*3-DOF*

See [360-degree video papers](https://github.com/jinyucn/Video-Streaming-Research-Papers/blob/main/360-degree%20video%20papers.md).
## Volumetric Video

*6-DOF, point cloud*

+ [Efficient Volumetric Video Streaming Through Super Resolution](https://dl.acm.org/doi/10.1145/3446382.3448663) [HotMobile 21]
+ [GROOT: A Real-time Streaming System of High-Fidelity Volumetric Videos](https://juheonyi.github.io/files/GROOT.pdf) [Mobicom 20]
+ [ViVo: Visibility-Aware Mobile Volumetric Video Streaming](https://www-users.cs.umn.edu/~fengqian/paper/vivo_mobicom20.pdf) [Mobicom 20]
+ [Towards Viewport-dependent 6DoF 360 Video Tiled Streaming for Virtual Reality Systems](https://dl.acm.org/doi/10.1145/3394171.3413712) [MM 20]
+ [User Centered Adaptive Streaming of Dynamic Point Clouds with Low Complexity Tiling](https://ir.cwi.nl/pub/30378/30378.pdf) [MM 20]
+ [A Pipeline for Multiparty Volumetric Video Conferencing: Transmission of Point Clouds over Low Latency DASH](https://repository.tudelft.nl/islandora/object/uuid:4a0178a3-971b-491a-b856-014d091f188e/datastream/OBJ/download) [MMSys 20]
+ [Cloud Rendering-based Volumetric Video Streaming System for Mixed Reality Services](https://arxiv.org/pdf/2003.02526) [MMSys 20]
+ [Low-latency Cloud-based Volumetric Video Streaming Using Head Motion Prediction](https://arxiv.org/pdf/2001.06466) [NOSSDAV 20]
+ [Emerging MPEG Standards for Point Cloud Compression](https://ir.cwi.nl/pub/29040/Emerging-MPEG-Standards-for-Point-Cloud-Compression.pdf) [TCSVT 19]
+ [Rate-Utility Optimized Streaming of Volumetric Media for Augmented Reality](https://arxiv.org/pdf/1804.09864) [arXiv 18]
+ [Design, Implementation, and Evaluation of a Point Cloud Codec for Tele-Immersive Video](https://core.ac.uk/download/pdf/206494004.pdf) [TCSVT 17]

## Virtual Reality

Virtual reality papers study how to render with low latency in an edge/cloud architecture. They often render small foreground objects on the mobile device and offload the heavy background rendering to the server.
+ [Q-VR: System-Level Design for Future Mobile Collaborative Virtual Reality](https://arxiv.org/ftp/arxiv/papers/2102/2102.13191.pdf) [ASPLOS 21]
+ [Coterie: Exploiting Frame Similarity to Enable High-Quality Multiplayer VR on Commodity Mobile Devices](https://dl.acm.org/doi/abs/10.1145/3373376.3378516) [ASPLOS 20]
+ [Firefly: Untethered Multi-user VR for Commodity Mobile Devices](https://www.usenix.org/conference/atc20/presentation/liu-xing) [ATC 20]
+ [Mobile VR on Edge Cloud: A Latency-Driven Design](https://dl.acm.org/doi/10.1145/3304109.3306217) [MMSys 19]
+ [MUVR: Supporting Multi-User Mobile Virtual Reality with Resource Constrained Edge Cloud](https://ieeexplore.ieee.org/document/8567653/) [Edge Computing 18]
+ [Cutting the Cord: Designing a High-quality Untethered VR System with Low Latency Remote Rendering](https://dl.acm.org/doi/10.1145/3210240.3210313) [MobiSys 18]
+ [Furion: Engineering High-Quality Immersive Virtual Reality on Today's Mobile Devices](http://www.cse.psu.edu/~gxc27/teach/597/Furion.pdf) [Mobicom 17]
+ [CloudVR: Cloud Accelerated Interactive Mobile Virtual Reality](https://dl.acm.org/doi/pdf/10.1145/3240508.3240620) [MM 17]
+ [Enabling High-Quality Untethered Virtual Reality](https://cs.uwaterloo.ca/~oabari/Papers/NSDI17.pdf) [NSDI 17]
+ [FlashBack: Immersive Virtual Reality on Mobile Devices via Rendering Memoization](https://www.microsoft.com/en-us/research/wp-content/uploads/2016/06/flashback_mobisys2016.pdf) [MobiSys 16]

## Augmented Reality

Papers about augmented reality deal with the inference side of video analytics.

+ Cuttlefish: Neural Configuration Adaptation for Video Analysis in Live Augmented Reality [TPDS 21]
+ [Edge Assisted Real-time Object Detection for Mobile Augmented Reality](http://www.winlab.rutgers.edu/~luyang/papers/mobicom19_augmented_reality.pdf) [Mobicom 19]

## Video Analysis

+ [Whiz: Data-Driven Analytics Execution](https://www.usenix.org/system/files/nsdi21-grandl.pdf) [NSDI 21]
+ [Move Fast and Meet Deadlines: Fine-grained Real-time Stream Processing with Cameo](https://www.usenix.org/system/files/nsdi21-xu.pdf) [NSDI 21]
+ [PECAM: Privacy-Enhanced Video Streaming and Analytics via Securely-Recoverable Transformation](https://www.microsoft.com/en-us/research/uploads/prod/2021/02/mobicom21-final80.pdf) [Mobicom 21]
+ [Ekya: Continuous Learning of Video Analytics Models on Edge Compute Servers](https://arxiv.org/abs/2012.10557) [arXiv 20]
+ [Decomposable Intelligence on Cloud-Edge IoT Framework for Live Video Analytics](https://ieeexplore.ieee.org/document/9099311/) [IOTJ 20]
+ [Reducto: On-Camera Filtering for Resource-Efficient Real-Time Video Analytics](http://web.cs.ucla.edu/~ravi/publications/reducto_sigcomm20.pdf) [Sigcomm 20]
+ [Server-Driven Video Streaming for Deep Learning Inference](https://people.cs.uchicago.edu/~junchenj/docs/DDS-Sigcomm20.pdf) [Sigcomm 20]
+ [SPINN: Synergistic Progressive Inference of Neural Networks over Device and Clouds](https://arxiv.org/abs/2008.06402) [Mobicom 20]
+ [Chameleon: Scalable Adaptation of Video Analytics](https://people.cs.uchicago.edu/~junchenj/docs/Chameleon_SIGCOMM_CameraReady_faceblurred.pdf) [Sigcomm 18]
+ [NoScope: Optimizing Neural Network Queries over Video at Scale](https://www.vldb.org/pvldb/vol10/p1586-kang.pdf) [VLDB 17]
+ [Live Video Analytics at Scale with Approximation and Delay-Tolerance](https://www.usenix.org/conference/nsdi17/technical-sessions/presentation/zhang) [NSDI 17]
Model training systems:

+ [Optimus: An Efficient Dynamic Resource Scheduler for Deep Learning Clusters](https://i.cs.hku.hk/~cwu/papers/yhpeng-eurosys18.pdf) [EuroSys 18]
+ [Learning without Forgetting](https://arxiv.org/abs/1606.09282) [ECCV 16]
+ [Scalable Bayesian Optimization Using Deep Neural Networks](https://arxiv.org/abs/1502.05700) [ICML 15]
+ [Practical Bayesian Optimization of Machine Learning Algorithms](https://papers.nips.cc/paper/4522-practical-bayesian-optimization-of-machine-learning-algorithms.pdf) [NeurIPS 12]

Limited by the author's knowledge, some papers may be missing.

--------------------------------------------------------------------------------
/Research Group.md:
--------------------------------------------------------------------------------

# Research Group

This is a document about research groups around the world in the field of multimedia transmission/processing.

Considering the privacy of the researchers, this document may not be published.

People can look through the top conference/journal papers to find the groups.

--------------------------------------------------------------------------------
/The way of multimedia research.md:
--------------------------------------------------------------------------------

# The way of multimedia research

This is a document about how to build the **technical stack** for multimedia research/development.

- [x] Some basic knowledge of **computer science and mathematical tools**, especially computer networks, operating systems, convex optimization, machine learning, etc.
- [x] The **video streaming algorithms** listed in [Repository of papers](https://github.com/jinyucn/Video-Streaming-Research/blob/main/Repository%20of%20papers.md), such as Buffer-Based, Rate-Based, Pensieve, Comyco and PiTree.
- [ ] **Video coding** (H.264, H.265, SVC) and related tools (FFmpeg). **Video coding scientists** aim to design better compression methods that reduce the video size while maintaining the perceived quality, while **video transmission scientists** aim to design better rate adaptation schemes for the network conditions using existing coding tools. Their connection is very close, but their research focuses are a little different.
- [ ] **Image processing** (super-resolution, denoising and deblurring). Some advanced tools, such as StyleGAN, make video more interesting.
- [x] **Mobile/edge/cloud computing**. A video system sometimes needs to allocate resources and group mobile users. Optimization algorithms such as bandits, knapsack, greedy methods, auctions, game theory and DRL are commonly used to solve these system problems.
- [x] **Transmission protocols**, such as RTMP, RTSP, HLS, DASH.
- [ ] Advanced tools. [**DASH**](https://github.com/Dash-Industry-Forum/dash.js) is often used for on-demand video in experiments. [**webrtc**](https://webrtc.org/) is often used in video telephony. Also: [QUIC](https://www.chromium.org/quic), [VLC player](https://www.videolan.org/).
- [x] The ability to **formulate an optimization problem** for video streaming or video analysis.
- [x] A vision of the **latest research**, such as AI, 6-DOF videos (point cloud), etc.

------------------------------------------------------------------------------------------------------------------------------------------------------

Additional skills:

- [ ] Deep learning (PyTorch/TensorFlow). Nowadays, many SOTA algorithms, such as DRL, super-resolution and video coding, use deep learning based methods.
- [ ] **Android development**, if you want to develop your own mobile video application and system.
- [ ] **Video quality assessment**, i.e., how to evaluate a video's perceived quality. Basic metrics: PSNR, SSIM, VMAF (see the PSNR sketch after this list).
- [ ] **VR** focuses on rendering in time; a common VR development tool is Unity. **AR** is related to video analysis; tools include ARKit and ARCore.
- [ ] 360-degree video: viewport prediction, projection, gaze collection, stitching, etc.
- [ ] Using C/C++ to develop a video system.
- [ ] Many more: Nginx, graphics.
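A minimal sketch of PSNR, the most basic of the quality metrics listed above; SSIM and VMAF need dedicated libraries, and this assumes 8-bit frames held as numpy arrays:

```python
import numpy as np

def psnr(ref, dist, peak=255.0):
    """Peak signal-to-noise ratio between a reference frame and a
    distorted frame (higher is better; identical frames give infinity)."""
    mse = np.mean((ref.astype(np.float64) - dist.astype(np.float64)) ** 2)
    if mse == 0:
        return float('inf')
    return 10 * np.log10(peak ** 2 / mse)
```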
[A good road map in Chinese](https://zhuanlan.zhihu.com/p/354676754)

It may seem really hard to conquer the whole field of multimedia. However, you only need to be an expert in one area and learn the basics of the others. Multimedia applications have penetrated every aspect of our lives, such as video entertainment, video conferencing, and video analysis with computer vision algorithms. If you are a multimedia expert, you can create many amazing and valuable applications. Last but not least, interest is the most important thing.

--------------------------------------------------------------------------------
/asserts/19Infocom_stick.PNG:
https://raw.githubusercontent.com/jinyucn/Video-Streaming-Research-Papers/df84d1e54eddcdd47a6b064f63c9cb95b76704de/asserts/19Infocom_stick.PNG
--------------------------------------------------------------------------------
/asserts/20ATC_Firefly1.PNG:
https://raw.githubusercontent.com/jinyucn/Video-Streaming-Research-Papers/df84d1e54eddcdd47a6b064f63c9cb95b76704de/asserts/20ATC_Firefly1.PNG
--------------------------------------------------------------------------------
/asserts/20sigcomm_dds.PNG:
https://raw.githubusercontent.com/jinyucn/Video-Streaming-Research-Papers/df84d1e54eddcdd47a6b064f63c9cb95b76704de/asserts/20sigcomm_dds.PNG
--------------------------------------------------------------------------------
/asserts/21ASSR_1.PNG:
https://raw.githubusercontent.com/jinyucn/Video-Streaming-Research-Papers/df84d1e54eddcdd47a6b064f63c9cb95b76704de/asserts/21ASSR_1.PNG
--------------------------------------------------------------------------------