├── README.md
├── additional-resources.md
├── ffmpeg-integration.md
├── images
│   ├── abr-ladder.png
│   ├── ffmpeg-logo.png
│   ├── hevc.png
│   ├── ngcodec-logo.png
│   ├── system-architecture.png
│   ├── transcode-flow.png
│   ├── vp9.png
│   ├── vyusync-logo.png
│   └── xilinx-logo-red-black.png
├── installation-and-getting-started.md
├── known-issues-limitations.md
├── ngcodec-hevc-vp9-encoder.md
├── system-requirements.md
├── using-ffmpeg-with-xilinx.md
├── vyusync-decoder.md
└── xilinx-abr-scaler.md
/README.md:
--------------------------------------------------------------------------------
1 |
2 |
22 |
23 | ## Introduction
24 |
25 | ABR stands for adaptive bit rate streaming, a technique for video streaming over HTTP in which the source content is encoded at multiple bit rates and resolutions. This document describes the Xilinx video transcoding system, which accelerates ABR transcoding from H.264 to HEVC or from H.264 to VP9. The system supports live video input streams of up to 1920x1080 at 60 frames per second.
26 |
27 | ## Overview
28 | The streaming client is made aware of the available streams at differing bit rates. At the start, the client requests the lowest bit rate stream. If the client finds the download speed is greater than the bit rate of the stream, it requests the next higher bit rate. Later, if the client finds the download speed for a stream is lower than the bit rate (because the network throughput has deteriorated, for example) it requests a lower bit rate stream.
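
The switching rule described above can be sketched in a few lines of Python. This is only an illustration of the paragraph's logic, not the algorithm of any particular player: the function name `next_rendition` and the four-step ladder (borrowed from the bit rates used in the HEVC ABR example later in this document) are assumptions, and real clients typically add buffering and smoothing on top of this rule.

```python
from typing import List

def next_rendition(current: int, bitrates_kbps: List[int], download_kbps: float) -> int:
    """Return the index of the rendition to request next.

    bitrates_kbps must be sorted from lowest to highest bit rate.
    """
    if download_kbps > bitrates_kbps[current] and current + 1 < len(bitrates_kbps):
        return current + 1  # throughput exceeds the current stream: step up
    if download_kbps < bitrates_kbps[current] and current > 0:
        return current - 1  # throughput fell below the current stream: step down
    return current          # otherwise stay on the current rendition

# Hypothetical ladder in Kbps; the client starts at the lowest rendition.
ladder = [800, 1000, 2000, 3000]
index = 0
for measured in (1500, 2500, 5000, 900):
    index = next_rendition(index, ladder, measured)
    print(f"measured {measured} Kbps -> request {ladder[index]} Kbps stream")
```

The client only ever moves one step at a time, which matches the "next higher" and "lower" wording of the description.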
29 |
30 | 
31 |
32 | The three main functional parts of an ABR video transcoder are a decoder, a scaler, and an encoder. A typical transcoding pipeline works as follows:
33 |
34 | 1. The original 1080p60 H.264 elementary stream is decoded.
35 | 2. The scaler takes the uncompressed 1080p frames and produces multiple renditions of the content by scaling to resolutions of 720p, 480p, 360p, and 240p.
36 | 3. These renditions are then encoded by a single instance of the VP9 (or HEVC) encoder, which time-division multiplexes the hardware across them.
37 |
38 | Xilinx has integrated the video transcoding accelerators into FFmpeg. FFmpeg is an open source software project, containing a vast suite of libraries, plugins, and programs for handling video, audio, and other multimedia files and streams. The FFmpeg program itself, designed for command-line-based processing of video and audio files, is widely used for transcoding. FFmpeg is part of the workflow of many software projects, and its libraries are a core part of software media players such as VLC, for example.
39 |
40 | The next sections provide more details on the following video transcoding accelerators:
41 |
42 | * The H.264 video decoder provided by our partner [VYUsync](https://www.vyusync.com/).
43 | * The [Xilinx](https://www.xilinx.com/) ABR scaler.
44 | * The HEVC and VP9 encoders provided by our partner [NGCodec](https://ngcodec.com/).
45 |
46 | :arrow_forward:**Next Topic:** [2. VYUsync Decoder](vyusync-decoder.md)
47 |
--------------------------------------------------------------------------------
/additional-resources.md:
--------------------------------------------------------------------------------
1 |
2 |
22 |
23 | # Additional Resources
24 | * [Package files for the ABR Video Transcoding Evaluation](https://www.xilinx.com/products/boards-and-kits/alveo/applications/adaptive-bit-rate-video-transcoding-application.html#gettingStarted)
25 | * [Getting Started Guide for U200 and U250 Cards](https://www.xilinx.com/support/documentation/boards_and_kits/accelerator-cards/ug1301-getting-started-guide-alveo-accelerator-cards.pdf)
26 | * [Xilinx](https://www.xilinx.com/)
27 | * [NGCodec](https://ngcodec.com/)
28 | * [VYUsync](https://www.vyusync.com/)
29 | * [About FFmpeg](https://www.ffmpeg.org/about.html)
30 |
31 | :arrow_backward:**Previous Topic:** [9. Known Issues and Limitations](known-issues-limitations.md)
32 |
--------------------------------------------------------------------------------
/ffmpeg-integration.md:
--------------------------------------------------------------------------------
1 |
2 |
22 |
23 |
24 | # FFmpeg Integration
25 |
26 |
27 | 
28 |
29 | Xilinx has integrated the video transcoding accelerators into FFmpeg. FFmpeg is one of the most popular frameworks for video transcoding. It allows for a seamless transition from running a transcoding flow in pure software on a host machine to offloading key CPU-intensive workloads (such as video decoding, ABR scaling, and HEVC or VP9 encoding) to a Xilinx accelerator card such as the **U200**. Depending on the host/server configuration, up to eight accelerator cards can be supported without stressing the CPU(s).
30 |
31 |
32 | ## System Architecture
33 |
34 |
35 | The following diagram shows the anatomy of the layers of a host/server with a Xilinx accelerator card. Terminology is explained below.
36 |
37 | 
38 |
39 | * **Alveo U200**: The U200 data center accelerator card suitable for cloud or on-premise deployments. Application building is enabled through the [SDAccel™](https://www.xilinx.com/products/design-tools/software-zone/sdaccel.html) integrated environment.
40 | * **Device Support Archive (DSA)**: The archive contains all of the design and metadata needed for a hardware function to interact with the physical design. It is the output product of the hardware platform design process described in this guide.
41 | * **Xilinx Accelerator Binary (XCLBIN)**: Xilinx accelerator binary bitstreams are packaged in xclbin file format by the SDAccel compiler. Xilinx provides two xclbin accelerator binaries as part of the ABR video transcode evaluation package. One is a H.264 to ABR HEVC transcode accelerator, and the other is a H.264 to ABR VP9 transcode accelerator, as shown in the images below.
42 |
43 | 
44 |
45 | 
46 |
47 | * **PCIe**: The Xilinx PCIe hardware device consists of two regions, the static region and the programmable region. The static region provides the connectivity framework to the programmable region, which executes the hardware functions as defined in the software kernel.
48 | * **Static Region**: Represents the fixed logic portion (contained in the DSA) of the programmable device that manages the design state before, during, and after partial reconfiguration of the device. This logic is not reimplemented with the programmable region.
49 | * **Programmable Region**: Describes the partition region that accepts the hardware functions from the SDAccel development environment. One at a time, xclbins are downloaded into the programmable region of the device.
50 | * **Xilinx Runtime (XRT)**: XRT is implemented as a combination of user-space and kernel driver components. It supports both PCIe-based cards and Zynq® UltraScale+™ MPSoCs and provides the software interface to the Xilinx accelerator card.
51 | * **Xilinx Media Accelerator (XMA)**: The Xilinx Media Accelerator (XMA) library (`libxmaapi`) is a host interface that simplifies the development of applications managing and controlling video accelerators such as decoders, scalers, filters, and encoders. The `libxmaapi` library consists of two APIs: an application interface and a plugin interface. The application interface is a higher level, generalized interface intended for application developers responsible for integrating control of Xilinx accelerators into software frameworks such as FFmpeg, GStreamer, or proprietary frameworks. The plugin interface is a lower level interface intended for developers responsible for implementing hardware control of specific Xilinx acceleration kernels. In general, plugins are developed by kernel providers because they are specialized user-space drivers that are aware of the low-level hardware interface.
52 | * **FFmpeg**: Xilinx has integrated the VYUsync H.264 decoder accelerator, the Xilinx ABR scaler accelerator, and the NGCodec HEVC and VP9 encoder accelerators into a fork of FFmpeg version 3.3. All source code is available from this [link](https://github.com/Xilinx/FFmpeg-xma) on GitHub.
53 |
54 | ## FFmpeg Command Line Application
55 |
56 | Consider an FFmpeg command line that ingests an H.264 video elementary bitstream and re-encodes it as HEVC at a lower bit rate, targeting the Xilinx accelerated decoder and encoder. When the command starts, the FFmpeg `main()` function calls `xma_initialize()`. This function must run before any other XMA function that handles resource management or controls the Xilinx accelerators. It reads the system configuration file from `/tmp/ffmpeg_config.yml`. The XMA configuration file uses YAML syntax to describe how the system is to be configured. In this case, you can choose between an H.264 to ABR VP9 transcode configuration and an H.264 to ABR HEVC transcode configuration. The H.264 to ABR VP9 transcode configuration file is shown below.
57 |
58 |     SystemCfg:
59 |         - logfile: /tmp/xma.log
60 |         - loglevel: 2
61 |         - dsa: xilinx_u200_xdma_201820_1
62 |         - pluginpath: /opt/xilinx/xma/plugins
63 |         - xclbinpath: /opt/xilinx/xcdr/xclbins
64 |         - ImageCfg:
65 |             xclbin: u200_h264dec_abr_vp9_ddr1.xclbin
66 |             zerocopy: disable
67 |             device_id_map: [0]
68 |             KernelCfg: [[ instances: 1,
69 |                           function: decoder,
70 |                           plugin: libvyuh264.so,
71 |                           vendor: VYUSync,
72 |                           name: vyusync_h264_decoder_0,
73 |                           ddr_map: [0]],
74 |                         [ instances: 1,
75 |                           function: encoder,
76 |                           plugin: libngcvp9.so,
77 |                           vendor: NGCodec,
78 |                           name: krnl_ngcodec_pistachio_enc,
79 |                           ddr_map: [0]],
80 |                         [ instances: 1,
81 |                           function: scaler,
82 |                           plugin: libxabrscaler.so,
83 |                           vendor: Xilinx,
84 |                           name: v_abrscaler_top,
85 |                           ddr_map: [1]]]
86 |
87 | The configuration file tells FFmpeg, by way of `xma_initialize()`, which Xilinx accelerator binary (xclbin) should be loaded into the programmable region. It also communicates what the topology of the xclbin is in terms of the number of accelerators, what device memories they are using, and where to find the XMA software plugins.
88 |
89 | No entries need to be modified in the above configuration file other than path names (`pluginpath` and `xclbinpath` tags) or file names (`logfile` tag). The other tags are tightly connected to the Xilinx accelerator binary listed after the `xclbin` tag.
90 |
91 | One exception is the `device_id_map` tag, which is set to `[0]` in the above configuration file. This indicates that you are using only one Xilinx accelerator device. The `device_id_map` tag accepts a sequence of device identifiers, analogous to an array of integers. Device identifiers are zero-based, and the maximum allowed value is 15, for a total of 16 supported devices within a system. Given a server that can hold four Xilinx U200 accelerator cards, use the following setting:
92 |
93 | `device_id_map: [0,1,2,3]`
94 |
95 | All devices supplied in the sequence are then programmed by the `xma_initialize()` function with the xclbin specified in the `xclbin` tag. From an application and FFmpeg point of view, this means you can launch up to four FFmpeg command lines in parallel, each setting up a H.264 to ABR VP9 transcoding pipeline capable of running up to 1920x1080 at 60 frames per second.
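
As a small illustration of the `device_id_map` rules above, the following Python sketch generates the tag for a given number of cards and rejects counts outside the documented limit of 16 devices. The helper name `device_id_map_line` is hypothetical and is not part of the evaluation package.

```python
MAX_DEVICE_ID = 15  # identifiers are zero-based, so 16 devices at most

def device_id_map_line(num_cards: int) -> str:
    """Build the device_id_map tag for cards numbered 0 to num_cards - 1."""
    if not 1 <= num_cards <= MAX_DEVICE_ID + 1:
        raise ValueError(f"num_cards must be between 1 and {MAX_DEVICE_ID + 1}")
    ids = ",".join(str(device_id) for device_id in range(num_cards))
    return f"device_id_map: [{ids}]"

print(device_id_map_line(4))
```

For four cards, this prints the same setting as given above.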
96 |
97 | When the `xma_initialize()` function successfully completes, the FFmpeg `main()` function performs initialization of all requested processing plugins. In this case, the Xilinx accelerated H.264 decoder and Xilinx accelerated VP9 encoder plugins have been registered with FFmpeg and the initialization callback of the plugin is invoked. The FFmpeg encoder plugin begins by creating an XMA session using the `xma_enc_session_create()` function. The `xma_enc_session_create()` function finds an available resource based on the properties supplied and, assuming resources are available, invokes the XMA plugin initialization function. The XMA plugin initialization function allocates any required input and output buffers on the device and performs initialization of the Xilinx accelerator if needed.
98 |
99 | After initialization is complete, the FFmpeg plugin for the encoder uses the `xma_enc_session_send_frame()` function and the `xma_enc_session_recv_data()` function to send uncompressed frames and receive compressed data from the encoder accelerator. The FFmpeg plugin for the decoder uses the `xma_dec_session_send_data()` function and the `xma_dec_session_recv_frame()` function to send compressed data and receive uncompressed frames from the decoder accelerator.
100 |
101 | More details on XMA can be found in the [Xilinx Media Accelerator (XMA) Developers Guide](https://gitenterprise.xilinx.com/ipssw/libxmaapi/wiki).
102 |
103 | :arrow_backward:**Previous Topic:** [4. Xilinx ABR Scaler](xilinx-abr-scaler.md)
104 |
105 | :arrow_forward:**Next Topic:** [6. System Requirements](system-requirements.md)
106 |
--------------------------------------------------------------------------------
/images/abr-ladder.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Xilinx/ABR-video-transcode/15d2f376860bf9565d0eeaf3347652f66a637435/images/abr-ladder.png
--------------------------------------------------------------------------------
/images/ffmpeg-logo.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Xilinx/ABR-video-transcode/15d2f376860bf9565d0eeaf3347652f66a637435/images/ffmpeg-logo.png
--------------------------------------------------------------------------------
/images/hevc.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Xilinx/ABR-video-transcode/15d2f376860bf9565d0eeaf3347652f66a637435/images/hevc.png
--------------------------------------------------------------------------------
/images/ngcodec-logo.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Xilinx/ABR-video-transcode/15d2f376860bf9565d0eeaf3347652f66a637435/images/ngcodec-logo.png
--------------------------------------------------------------------------------
/images/system-architecture.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Xilinx/ABR-video-transcode/15d2f376860bf9565d0eeaf3347652f66a637435/images/system-architecture.png
--------------------------------------------------------------------------------
/images/transcode-flow.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Xilinx/ABR-video-transcode/15d2f376860bf9565d0eeaf3347652f66a637435/images/transcode-flow.png
--------------------------------------------------------------------------------
/images/vp9.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Xilinx/ABR-video-transcode/15d2f376860bf9565d0eeaf3347652f66a637435/images/vp9.png
--------------------------------------------------------------------------------
/images/vyusync-logo.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Xilinx/ABR-video-transcode/15d2f376860bf9565d0eeaf3347652f66a637435/images/vyusync-logo.png
--------------------------------------------------------------------------------
/images/xilinx-logo-red-black.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Xilinx/ABR-video-transcode/15d2f376860bf9565d0eeaf3347652f66a637435/images/xilinx-logo-red-black.png
--------------------------------------------------------------------------------
/installation-and-getting-started.md:
--------------------------------------------------------------------------------
1 |
2 |
22 |
23 | # Installation and Getting Started
24 |
25 | The required package files for setting up the ABR video transcoding evaluation and the U200 accelerator card can be found at the link [here](https://www.xilinx.com/products/boards-and-kits/alveo/applications/adaptive-bit-rate-video-transcoding-application.html#gettingStarted). Read the [Getting Started Guide for U200 and U250 Cards](https://www.xilinx.com/support/documentation/boards_and_kits/accelerator-cards/ug1301-getting-started-guide-alveo-accelerator-cards.pdf) for the installation procedure for the card as well as the Xilinx Runtime.
26 |
27 | After you have successfully installed the Xilinx Runtime and U200 deployment package, and downloaded the ABR video transcoding evaluation package, follow the steps below to install the package on your system.
28 |
31 | ## Ubuntu Installation Guide
32 |
33 | 1. Unzip and untar the video transcoding tarball:
34 |
35 | `tar -xvzf xcdr_deb_pkgs.tar.gz`
36 |
37 | 2. Edit `/etc/apt/sources.list` to add the directory where the packages are located. You need **sudo** access for this:
38 |
39 | `deb file:/home/user/xcdr_pkgs ./`
40 |
41 | 3. Run this command after changing `/etc/apt/sources.list`:
42 |
43 | `sudo apt-get update`
44 |
45 | 4. Install the downloaded packages with a single command line:
46 |
47 | `sudo apt-get install xcdr`
48 |
49 | >**:pushpin: NOTE** On Nimbix, you also need to install the `libva-drm1` package: `sudo apt-get install libva-drm1`.
50 |
51 |
54 | ## RHEL/CentOS Installation Guide
55 |
56 | 1. Unzip and untar the video transcoding tarball:
57 |
58 | `tar -xvzf xcdr_rpm_pkgs.tar.gz`
59 |
60 | 2. To add a local yum repository to the list of repositories, the `/etc/yum.repos.d` directory must be updated. Create a file called `localrepo.repo` that contains the following configuration:
61 |
62 | ```
63 | [localrepo]
64 | name=Xilinx Transcoder Repository
65 | baseurl=file:///home/user/xcdr_pkgs
66 | gpgcheck=0
67 | enabled=1
68 | ```
69 |
70 | 3. Install the downloaded packages with the following command line:
71 |
72 | `sudo yum install xcdr`
73 |
74 |
75 |
76 |
77 | The above steps install the ABR video transcoding evaluation package and all its dependencies. The transcoding package has dependencies (among others) on the Xilinx Runtime, the DSA for the U200 card, and FFmpeg. All packages are installed under the `/opt/xilinx/` directory.
78 |
79 | ```console
80 | .
81 | ├── dsa
82 | │   └── xilinx_u200_xdma_201820_1
83 | ├── ffmpeg
84 | │   ├── bin
85 | │   ├── etc
86 | │   ├── include
87 | │   ├── lib
88 | │   └── share
89 | ├── xcdr
90 | │   ├── bin
91 | │   ├── scripts
92 | │   └── xclbins
93 | ├── xma
94 | │   └── plugins
95 | └── xrt
96 |     ├── bin
97 |     ├── include
98 |     ├── lib
99 |     ├── license
100 |     └── share
101 | ```
102 |
103 | As a final step of the installation, execute the following command:
104 |
105 | `source /opt/xilinx/xcdr/setup.sh`
106 |
107 | This includes `/opt/xilinx/ffmpeg/bin` and `/opt/xilinx/xcdr/bin` in your path, and adds `/opt/xilinx/ffmpeg/lib` and `/opt/xilinx/xma/lib` to the `LD_LIBRARY_PATH`.
108 |
109 | When the installation is complete, you are almost ready to start using FFmpeg with the Xilinx accelerated video transcoding functionality. The package does not come with any sample H.264 video files, but you can download a copy of **Big Buck Bunny** from the following [link](http://distribution.bbb3d.renderfarming.net/video/mp4/bbb_sunflower_1080p_30fps_normal.mp4).
110 |
111 |
112 | :arrow_backward:**Previous Topic:** [6. System Requirements](system-requirements.md)
113 |
114 | :arrow_forward:**Next Topic:** [8. Using FFmpeg with Xilinx Accelerated Video Transcoding](using-ffmpeg-with-xilinx.md)
115 |
--------------------------------------------------------------------------------
/known-issues-limitations.md:
--------------------------------------------------------------------------------
1 |
2 |
22 |
23 | # Known Issues
24 | * Bit rates higher than 20 Mbps may hang the VP9 encoder.
25 |
26 | # Limitations
27 | * The Xilinx ABR video transcode package is provided for evaluation purposes only. The encoded video streams are watermarked with the NGCodec logo.
28 | * Although the video transcode solution can scale to as many as 16 cards per server for cloud deployments, the evaluation version is limited to a single accelerator card.
29 |
30 |
31 | :arrow_backward:**Previous Topic:** [8. Using FFmpeg with Xilinx Accelerated Video Transcoding](using-ffmpeg-with-xilinx.md)
32 | :arrow_forward:**Next Topic:** [10. Additional Resources](additional-resources.md)
33 |
--------------------------------------------------------------------------------
/ngcodec-hevc-vp9-encoder.md:
--------------------------------------------------------------------------------
1 |
2 |
22 | # NGCodec HEVC and VP9 Encoder
23 |
24 |
25 | NGCodec's RealityCodec™ is a broadcast-quality live distribution encoder for HEVC/VP9 with multiple ABR outputs. Running on Xilinx® FPGA instances in public clouds or customer data centers, the NGCodec encoder is an efficient way for OTT service providers and distributors, MSOs, and telephone companies to deliver the highest video quality at the lowest bit rates over the internet and other mediums. By leveraging the high-quality, high-density cloud platform to deliver the fewest bits, operators and service providers benefit immensely with a reduction in CAPEX and OPEX. The NGCodec encoder is readily integrated in FFmpeg and has flexible APIs for integration with other custom frameworks.
26 |
27 | ## Features
28 |
29 | * High-quality live encoding
30 | * Xilinx® FPGA accelerated encoding with no host CPU requirements
31 | * 32 simultaneously independent encoded streams on a single Xilinx device
32 | * Programmable latency of 1 to 4 seconds
33 | * Simple API based on industry standards
34 | * 4:2:0 8-bit and 10-bit with HDR on the roadmap
35 | The encoder currently supports:
36 |
37 | * Broadcast-quality 1080p60 HEVC/VP9 live encoding in a single Xilinx® Alveo U200 Data Center accelerator card suitable for cloud or on-premise deployments
38 | * Built-in multipass encoding
39 | * Flexible multiple ABR outputs with up to 32 streams with a single instance
40 | * HEVC: Main 10 Profile up to Level 5.1 HD/SD 4:2:0 8-bit
41 | * Constant bit rate (CBR), capped variable bit rate (VBR), and fixed QP modes
42 | * Bit rates: Configurable from 100 Kbps to 40 Mbps
43 | * Slice types: I, P, and B with flexible open/closed GOP modes and GOP lengths
44 |
45 | ## Benefits
46 |
47 | * 60 fps real-time encoding for resolutions up to 1920x1080 with better quality than x265 preset
48 | * 10 times lower power consumption than CPU/GPU
49 | * Support for HLS and DASH ABR outputs
50 | * Consistent output quality independent of the number of encoding channels
51 | * FFmpeg plugin
52 |
53 | ## Supported Encoding Tools
54 |
55 | * Advanced scene change detection algorithm
56 | * Enhanced video pre-analysis with configurable look-ahead
57 | * Coding tools: CABAC, deblocking filter, SAO filter, coding units up to 64x64 pixels, adaptive transform sizes up to 32x32 pixels, adaptive quantization, all inter and intra modes with rate-distortion optimization (RDO)
58 |
59 |
60 | ## Supported Resolutions and Formats
61 |
62 | * HD/SD resolutions down to 240p, with both horizontal and vertical dimensions divisible by 4.
63 | * 4:2:0 8-bit
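
The resolution constraint above is easy to check before launching a job. The following Python sketch is an illustration only: the helper name `is_supported_resolution` is hypothetical, and the upper bound of 1920x1080 is taken from the maximum input size stated elsewhere in this document rather than from an encoder specification.

```python
MAX_WIDTH, MAX_HEIGHT = 1920, 1080  # largest stream size mentioned in this document
MIN_HEIGHT = 240                    # "down to 240p"

def is_supported_resolution(width: int, height: int) -> bool:
    """Return True if the size is within the documented limits and both
    dimensions are divisible by 4."""
    within_limits = 0 < width <= MAX_WIDTH and MIN_HEIGHT <= height <= MAX_HEIGHT
    return within_limits and width % 4 == 0 and height % 4 == 0

for size in [(1280, 720), (848, 480), (854, 480), (424, 240), (426, 240)]:
    print(size, is_supported_resolution(*size))
```

Widths such as 854 and 426 fail the divisibility check, whereas the 848 and 424 widths used in the scaler examples of this documentation pass it.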
64 |
65 | For more information, please visit [www.ngcodec.com](https://www.ngcodec.com) or write to ian.jefferson@ngcodec.com.
66 |
67 | :arrow_backward:**Previous Topic:** [2. VYUsync Decoder](vyusync-decoder.md)
68 |
69 | :arrow_forward:**Next Topic:** [4. Xilinx ABR Scaler](xilinx-abr-scaler.md)
70 |
--------------------------------------------------------------------------------
/system-requirements.md:
--------------------------------------------------------------------------------
1 |
2 |
22 |
23 | # System Requirements
24 |
25 | The Xilinx ABR video transcoding evaluation package is supported on the following **host** platforms:
26 |
27 | * x86_64
28 | * AARCH64
29 | * PPC64LE
30 |
31 | A minimum of 64 GB of memory is required.
32 |
33 | For this evaluation package, currently only the following **Xilinx accelerator card** is supported:
34 |
35 | * Alveo U200 (xilinx_u200_dynamic_5_1)
36 |
37 | The package requires Linux kernel 3.10 and GCC with C++11 features, and has been tested on the following **software platforms**:
38 |
39 | * RHEL/CentOS 7.4 and 7.5
40 | * Ubuntu 16.04.4 LTS
41 |
42 | :arrow_backward:**Previous Topic:** [5. FFmpeg Integration](ffmpeg-integration.md)
43 |
44 | :arrow_forward:**Next Topic:** [7. Installation and Getting Started](installation-and-getting-started.md)
45 |
--------------------------------------------------------------------------------
/using-ffmpeg-with-xilinx.md:
--------------------------------------------------------------------------------
1 |
2 |
22 |
23 | # Using FFmpeg with Xilinx Accelerated Video Transcoding Functionality
24 |
25 |
26 | This section walks you through sample FFmpeg command lines that show how to use the Xilinx accelerated video transcoding functionality.
27 |
28 |
31 | ## Example 1: Running the Xilinx Accelerated H.264 Decoder
32 |
33 | Make sure to configure the device for either VP9 or HEVC encoding. In this case, configure the device for HEVC transcode acceleration with the following command:
34 |
35 | `xcdrctl -p HEVC -b U200`
36 |
37 | The `xcdrctl` command is a simple Python application, located under `/opt/xilinx/xcdr/bin`. Executing this application writes the configuration file for the HEVC transcoding accelerators in `/var/tmp/xmacfg.yaml`, and subsequently downloads the xclbin to the device.
38 |
39 | With the device programmed, you can now decode an elementary H.264 bitstream using the Xilinx accelerated decoder as follows:
40 |
41 | `ffmpeg -y -c:v VYUH264 -i input.h264 -vsync 0 output.yuv`
42 |
43 | `-c:v VYUH264` preceding the input file indicates that you are using the H.264 Xilinx accelerated decoder to decode the H.264 encoded elementary bitstream. The resulting decoded frames are written as raw video to the output file.
44 |
45 | >**:pushpin: NOTE** You can find sample FFmpeg commands in `/opt/xilinx/xcdr/scripts/`.
46 |
47 |
48 |
49 |
52 | ## Example 2: Running the Xilinx Accelerated Scaler
53 |
54 | The following command line shows how to scale the 1920x1080 uncompressed input frames to 1280x720:
55 |
56 | ```bash
57 | ffmpeg -f rawvideo -pix_fmt yuv420p -s:v 1920x1080 -i input.yuv \
58 | -filter_complex "scale_xma=1: out_1_width=1280:out_1_height=720:[a]" \
59 | -map '[a]' -frames 2000 -f rawvideo -pix_fmt yuv420p -y out.yuv
60 | ```
61 |
62 | In this evaluation package, the scaler can generate up to four scaled renditions of a single input at a time. This is shown in the command line below:
63 |
64 | ```bash
65 | ffmpeg -f rawvideo -pix_fmt yuv420p -s:v 1920x1080 -i input.yuv \
66 | -filter_complex "scale_xma=4: \
67 | out_1_width=1280:out_1_height=720: \
68 | out_2_width=848:out_2_height=480: \
69 | out_3_width=640:out_3_height=360: \
70 | out_4_width=424:out_4_height=240[a][b][c][d]" \
71 | -map '[a]' -f rawvideo -pix_fmt yuv420p -y out1.yuv \
72 | -map '[b]' -f rawvideo -pix_fmt yuv420p -y out2.yuv \
73 | -map '[c]' -f rawvideo -pix_fmt yuv420p -y out3.yuv \
74 | -map '[d]' -f rawvideo -pix_fmt yuv420p -y out4.yuv
75 | ```
76 |
77 | In the above command line, `-filter_complex "scale_xma...[a][b][c][d]"` scales the input frames to an image pyramid of 1280x720, 848x480, 640x360, and 424x240 using the Xilinx ABR scaler. Each of the scaled outputs `[a][b][c][d]` can then be referred to individually with a `-map` option, such as `-map '[a]'`, and written to its own output file. Preceding the input file, `-f rawvideo -pix_fmt yuv420p` indicates that the input frames are raw video, formatted as `yuv420p`, which is a planar YUV 4:2:0 video format. `-s:v 1920x1080` indicates that the resolution of the uncompressed input frames is 1920x1080.
78 |
79 |
80 |
83 | ## Example 3: Running the Xilinx Accelerated HEVC Encoder
84 |
85 | Using the command line below, you can run the Xilinx accelerated HEVC encoder to encode the 1280x720 scaled rendition into an HEVC elementary bitstream.
86 |
87 | `ffmpeg -f rawvideo -pix_fmt yuv420p -s:v 1280x720 -i input.yuv -frames 240 -c:v NGC265 -y out.hevc`
88 |
89 | `-c:v NGC265` preceding the output file indicates that you are encoding the raw video using the NGCodec HEVC Xilinx accelerated encoder to an HEVC elementary bitstream file. The standard method for getting the supported options and information about an encoder from FFmpeg is to issue the following command:
90 |
91 | `ffmpeg -h encoder=NGC265`
92 |
93 | To find the list of available encoders, issue the command:
94 |
95 | `ffmpeg -encoders`
96 |
97 | In the case of the Xilinx accelerated HEVC encoder, the results are as follows:
98 |
99 | ```console
100 | Encoder NGC265 [NGCodec H.265 / HEVC]:
101 | General capabilities: delay threads
102 | Threading capabilities: auto
103 | Supported pixel formats: yuv420p
104 | ngc265 AVOptions:
105 | -aq-mode AQ method (from 0 to 1) (default 1)
106 | -rc-lookahead Number of frames to look ahead for frametype and ratecontrol (from 8 to 64) (default 30)
107 | -idr-period IDR Period (from 0 to INT_MAX) (default 0)
108 | -aq-temp-gain Temporal AQ strength. Reduces blocking and blurring in flat and textured areas. (from 50 to 200) (default 100)
109 | -aq-spat-gain Spatial AQ strength. Reduces blocking and blurring in flat and textured areas. (from 50 to 200) (default 100)
110 | -minQP MIN QP for capped VBR (from -12 to 51) (default -12)
111 | ```
112 |
113 | An overview of all the relevant parameters that control the picture quality of the encoder (including FFmpeg standard controls such as `-b`, and `-g`) is shown in the table below.
114 |
115 | | Parameter Name | FFmpeg Command Option | Minimum to Maximum Value Range | Suggested Value |
116 | | :------------------------ |:-------------| :-------| :-------|
117 | | Fixed QP | -q | 0 to 51 | >=15 |
118 | | Min QP | -minQP | -12 to 51 | -12 |
119 | | Bit rate | -b | 100K to 35M | Depends on resolution |
120 | | I-Period Interval | -g | 0 to 32767 | 0 |
121 | | AQ Mode | -aq-mode | 0 to 1 | 1 |
122 | | Temporal AQ Gain | -aq-temp-gain | 50 to 200 | 100 |
123 | | Spatial AQ Gain | -aq-spat-gain | 50 to 200 | 100 |
124 | | Lookahead Distance | -rc-lookahead | 8 to 30 | 30 |
125 | | IDR Period | -idr-period | 0 to 32767 | 0 |
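
The ranges in the table can be checked before a command line is assembled. The Python sketch below is illustrative only: the helper `encoder_options` is hypothetical, the integer ranges are copied from the table, and the bit rate is left out because it takes suffixed values such as `5000k`. Note that the table lists a lookahead maximum of 30 while the encoder help output above reports 64; the sketch uses the table value.

```python
from typing import Dict, List

# Inclusive ranges copied from the table above.
OPTION_RANGES = {
    "-q": (0, 51),
    "-minQP": (-12, 51),
    "-g": (0, 32767),
    "-aq-mode": (0, 1),
    "-aq-temp-gain": (50, 200),
    "-aq-spat-gain": (50, 200),
    "-rc-lookahead": (8, 30),
    "-idr-period": (0, 32767),
}

def encoder_options(values: Dict[str, int]) -> List[str]:
    """Validate each option against its range and return FFmpeg arguments."""
    args: List[str] = []
    for option, value in values.items():
        if option not in OPTION_RANGES:
            raise KeyError(f"unknown option: {option}")
        low, high = OPTION_RANGES[option]
        if not low <= value <= high:
            raise ValueError(f"{option}={value} is outside {low} to {high}")
        args += [option, str(value)]
    return args

print(encoder_options({"-g": 60, "-idr-period": 60, "-rc-lookahead": 30}))
```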
126 |
127 |
128 |
131 | ## Example 4: Running Xilinx Accelerated Transcoding from H.264 to HEVC
132 |
133 | As well as running all three accelerators in isolation, you can also put them together in a transcoding pipeline. To transcode a single H.264 encoded elementary bitstream file into an HEVC encoded bitstream file, use the following command line:
134 |
135 | `ffmpeg -c:v VYUH264 -r 60 -i input.h264 -frames 100 -c:v NGC265 -g 60 -idr-period 60 -b:v 5000k -r 60 -y output.hevc`
136 |
137 | `-c:v VYUH264` preceding the input file indicates that you are using the VYUsync H.264 Xilinx accelerated decoder to decode the H.264 encoded elementary bitstream. `-c:v NGC265` preceding the output file indicates that you are encoding the decoded frames to an HEVC elementary bitstream file using the NGCodec HEVC Xilinx accelerated encoder. The bit rate target is 5,000 Kbps (5 Mbps), as indicated by `-b:v 5000k`. The IDR period is set to 60 frames with `-idr-period 60`, and the GOP length is set to 60 frames with `-g 60`.
138 |
139 |
140 |
143 | ## Example 5: Running Xilinx Accelerated Transcoding from H.264 to HEVC Using ABR
144 |
145 | The following command shows how to transcode a 1920x1080 H.264 encoded elementary bitstream file into four lower resolution HEVC encoded bitstream files.
146 |
147 | ```bash
148 | ffmpeg -c:v VYUH264 -i input.h264 \
149 | -filter_complex "scale_xma=4: \
150 | out_1_width=1280:out_1_height=720: \
151 | out_2_width=848:out_2_height=480: \
152 | out_3_width=640:out_3_height=360: \
153 | out_4_width=424:out_4_height=240[a][b][c][d]" \
154 | -map '[a]' -frames 2000 -c:v NGC265 -g 60 -idr-period 60 -b:v 3000k -r 60 -y out1.hevc \
155 | -map '[b]' -frames 2000 -c:v NGC265 -g 60 -idr-period 60 -b:v 2000k -r 60 -y out2.hevc \
156 | -map '[c]' -frames 2000 -c:v NGC265 -g 60 -idr-period 60 -b:v 1000k -r 60 -y out3.hevc \
157 | -map '[d]' -frames 2000 -c:v NGC265 -g 60 -idr-period 60 -b:v 800k -r 60 -y out4.hevc
158 | ```
159 |
160 | In the above command line, `-filter_complex "scale_xma...[a][b][c][d]"` scales the decoded frames to an image pyramid of 1280x720, 848x480, 640x360, and 424x240 using the Xilinx ABR scaler. Each of the scaled outputs `[a][b][c][d]` can then be referred to individually with the `-map` option, for example `-map '[a]'`. Each of the outputs is encoded with its own parameters (such as bit rate, GOP length, and so on) to an HEVC elementary bitstream file.
161 |
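Because the `scale_xma` argument follows a regular pattern, the filter string can be assembled from a list of resolutions instead of being typed by hand. The sketch below is an illustration only: `build_scale_xma` is a hypothetical helper, and it assumes the option syntax is exactly what the command above shows (`scale_xma=N` followed by `out_K_width`/`out_K_height` pairs). It prints a single-line form of the filter string rather than invoking FFmpeg.

```bash
# Hypothetical helper: build the scale_xma filter argument from WxH entries.
# Assumes the syntax shown in the command above; output labels are a, b, c, ...
build_scale_xma() {
    labels="abcdefgh"            # the ABR scaler produces up to eight outputs
    filter="scale_xma=$#"
    pads=""
    i=1
    for res in "$@"; do
        width=${res%x*}          # text before the "x"
        height=${res#*x}         # text after the "x"
        filter="$filter:out_${i}_width=$width:out_${i}_height=$height"
        label=$(printf '%s' "$labels" | cut -c "$i")
        pads="$pads[$label]"
        i=$((i + 1))
    done
    printf '%s%s\n' "$filter" "$pads"
}

build_scale_xma 1280x720 848x480 640x360 424x240
```

The printed string could then be supplied as the `-filter_complex` argument, with one `-map` entry per generated label.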
162 |
163 |
166 | ## Example 6: Running Xilinx Accelerated Transcoding from H.264 to VP9
167 |
168 | Now that you are switching from HEVC to VP9 encoding, configure the device for VP9 transcoding with the following command.
169 |
170 | `xcdrctl -p VP9 -b U200`
171 |
172 | This puts the configuration file for the VP9 transcoding accelerator in the appropriate location, and subsequently downloads and programs the xclbin to the device. If the device is not configured correctly, expect to see the following error:
173 |
174 | ```console
175 | ERROR: Unable to allocate NGCVP9 encoder session
176 | Error initializing output stream -- Error while opening encoder for output stream - maybe incorrect parameters such as bit_rate, rate, width or height
177 | 2018-08-31 13:30:14.729 ERROR xmares No available kernels of type 'scaler' from vendor NGCodec
178 | 2018-08-31 13:30:14.729 ERROR xmaencoder Failed to allocate free encoder kernel. Return code -3 Conversion failed!
179 | ```
180 |
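In a scripted workflow it can help to detect this failure and point the operator back to the configuration step. The following sketch assumes the FFmpeg output has been captured to a file and that the first line of the error text above is a reliable marker; `needs_vp9_config` is a hypothetical helper, not an existing tool.

```bash
# Hypothetical helper: succeeds when a captured FFmpeg log contains the
# encoder allocation error shown above.
needs_vp9_config() {
    grep -q "Unable to allocate NGCVP9 encoder session" "$1"
}

# Example with a stand-in log file (a real run would capture FFmpeg's stderr).
log=$(mktemp)
echo "ERROR: Unable to allocate NGCVP9 encoder session" > "$log"

if needs_vp9_config "$log"; then
    echo "Device not configured for VP9; run: xcdrctl -p VP9 -b U200"
fi
rm -f "$log"
```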
181 | With the device correctly configured, you can now transcode a single H.264 encoded elementary bitstream file into a VP9 encoded bitstream file using the following command line:
182 |
183 | `ffmpeg -c:v VYUH264 -r 60 -i input.h264 -frames 100 -f rawvideo -c:v NGCVP9 -g 60 -idr-period 60 -b:v 5000k -r 60 -y output.vp9`
184 |
185 | The above command line is nearly identical to the one used for HEVC transcoding. The differences are that `-c:v NGCVP9` selects the NGCodec VP9 encoder for encoding the decoded frames, and that `-f rawvideo` is added to force the output format of the `.vp9` file.
186 |
187 | The supported options for the Xilinx accelerated VP9 encoder can be queried with the following command:
188 |
189 | `ffmpeg -h encoder=NGCVP9`
190 |
191 | This shows the following results:
192 |
193 | ```console
194 | Encoder NGCVP9 [NGCodec vp9 ]:
195 | General capabilities: delay threads
196 | Threading capabilities: auto
197 | Supported pixel formats: yuv420p
198 | ngcvp9 AVOptions:
199 | -aq-mode AQ method (from 0 to 1) (default 1)
200 | -rc-lookahead Number of frames to look ahead for frametype and ratecontrol (from 8 to 64) (default 30)
201 | -idr-period IDR Period (from 0 to INT_MAX) (default 0)
202 | -aq-temp-gain Temporal AQ strength. Reduces blocking and blurring in flat and textured areas. (from 50 to 200) (default 100)
203 | -aq-spat-gain Spatial AQ strength. Reduces blocking and blurring in flat and textured areas. (from 50 to 200) (default 100)
204 | -minQP MIN QP for capped VBR (from -12 to 51) (default -12)
205 | ```
206 |
207 |
208 |
211 | ## Example 7: Running Xilinx Accelerated Transcoding from H.264 to VP9 Using ABR
212 |
213 | The following command shows how to transcode a 1920x1080 H.264 encoded elementary bitstream file into four lower resolution VP9 encoded bitstream files.
214 |
215 | ```bash
216 | ffmpeg -c:v VYUH264 -i input.h264 \
217 | -filter_complex "scale_xma=4: \
218 | out_1_width=1280:out_1_height=720: \
219 | out_2_width=848:out_2_height=480: \
220 | out_3_width=640:out_3_height=360: \
221 | out_4_width=424:out_4_height=240[a][b][c][d]" \
222 | -map '[a]' -frames 2000 -f rawvideo -c:v NGCVP9 -g 60 -idr-period 60 -b:v 3000k -r 60 -y out1.vp9 \
223 | -map '[b]' -frames 2000 -f rawvideo -c:v NGCVP9 -g 60 -idr-period 60 -b:v 2000k -r 60 -y out2.vp9 \
224 | -map '[c]' -frames 2000 -f rawvideo -c:v NGCVP9 -g 60 -idr-period 60 -b:v 1000k -r 60 -y out3.vp9 \
225 | -map '[d]' -frames 2000 -f rawvideo -c:v NGCVP9 -g 60 -idr-period 60 -b:v 800k -r 60 -y out4.vp9
226 | ```
227 |
228 | This command is again almost identical to the HEVC ABR transcoding command line. Again, `-filter_complex "scale_xma...[a][b][c][d]"` is used to scale the decoded frames to an image pyramid of 1280x720, 848x480, 640x360, and 424x240 using the Xilinx ABR scaler. Each of the scaled outputs `[a][b][c][d]` can then be referred to individually with the `-map` option, for example `-map '[a]'`. Each of the outputs is encoded with its own parameters (such as bit rate, GOP length, and so on) to a VP9 elementary bitstream file.
229 |
230 |
231 |
234 | ## Example 8: Running Xilinx Accelerated Transcoding from H.264 to VP9 Using ABR (Keeping the Original Frame Size)
235 |
236 | The following command shows how to transcode a 1920x1080 H.264 encoded elementary bitstream file into four lower resolution VP9 encoded bitstream files while also transcoding the 1920x1080 source into a VP9 encoded bitstream.
237 |
238 | ```bash
239 | ffmpeg -c:v VYUH264 -i input.h264 \
240 | -filter_complex "split=2[a][temp]; \
241 | [temp] scale_xma=4: \
242 | out_1_width=1280:out_1_height=720: \
243 | out_2_width=848:out_2_height=480: \
244 | out_3_width=640:out_3_height=360: \
245 | out_4_width=424:out_4_height=240[b][c][d][e]" \
246 | -map '[a]' -frames 2000 -f rawvideo -c:v NGCVP9 -g 60 -idr-period 60 -b:v 5000k -r 30 -y out1.vp9 \
247 | -map '[b]' -frames 2000 -f rawvideo -c:v NGCVP9 -g 60 -idr-period 60 -b:v 3000k -r 30 -y out2.vp9 \
248 | -map '[c]' -frames 2000 -f rawvideo -c:v NGCVP9 -g 60 -idr-period 60 -b:v 2000k -r 30 -y out3.vp9 \
249 | -map '[d]' -frames 2000 -f rawvideo -c:v NGCVP9 -g 60 -idr-period 60 -b:v 1000k -r 30 -y out4.vp9 \
250 | -map '[e]' -frames 2000 -f rawvideo -c:v NGCVP9 -g 60 -idr-period 60 -b:v 800k -r 30 -y out5.vp9
251 | ```
252 |
253 | This command is almost identical to the VP9 ABR transcoding command line. Here, `split=2[a][temp]` inside `-filter_complex` splits the uncompressed frames into two identical 1920x1080 streams. One of the streams, `[temp]`, goes to `scale_xma` to be scaled to the image pyramid of 1280x720, 848x480, 640x360, and 424x240 using the Xilinx ABR scaler. The remaining split output `[a]` and each of the scaled outputs `[b][c][d][e]` can then be referred to individually with the `-map` option, for example `-map '[a]'`. Each of the outputs is encoded with its own parameters (such as bit rate, GOP length, and so on) to a VP9 elementary bitstream file. This only works for frame rates up to 30 fps, which is why the example uses `-r 30`. At higher frame rates, the combined pixel rate of the five outputs exceeds the equivalent of 1080p60, which is the maximum supported by the encoder.
254 |
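The 30 fps limit follows from simple arithmetic on pixel rates. The sketch below is a back-of-the-envelope check, assuming the encoder load can be approximated as width x height x frame rate summed over all outputs and compared against 1080p60. `pixel_rate` is a hypothetical helper, and the real hardware limit may depend on other factors.

```bash
# Hypothetical helper: total pixels per second for a frame rate and a list
# of WxH outputs. Approximates encoder load as width * height * fps.
pixel_rate() {
    fps=$1
    shift
    total=0
    for res in "$@"; do
        total=$((total + ${res%x*} * ${res#*x}))
    done
    echo $((total * fps))
}

budget=$(pixel_rate 60 1920x1080)    # 1080p60 equivalent
ladder="1920x1080 1280x720 848x480 640x360 424x240"

for fps in 30 60; do
    rate=$(pixel_rate "$fps" $ladder)    # unquoted on purpose: split into words
    if [ "$rate" -le "$budget" ]; then
        echo "$fps fps: $rate pixels/s fits within $budget"
    else
        echo "$fps fps: $rate pixels/s exceeds $budget"
    fi
done
```

Under this approximation, the five outputs at 30 fps use about 90% of the 1080p60 budget, while at 60 fps they would need roughly 1.8 times the budget.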
255 |
256 |
257 |
258 | :arrow_backward:**Previous Topic:** [7. Installation and Getting Started](installation-and-getting-started.md)
259 |
260 | :arrow_forward:**Next Topic:** [9. Known Issues and Limitations](known-issues-limitations.md)
261 |
--------------------------------------------------------------------------------
/vyusync-decoder.md:
--------------------------------------------------------------------------------
1 |
2 |
22 |
23 | # VYUsync H.264 Decoder
24 |
25 | 
26 |
27 | VYUsync’s H.264 1080p60 Decoder IP core is a highly optimized, high-resolution decompression engine. The decoder is fully compliant with the ISO/IEC 14496-10 and ITU-T H.264 standards. The decoder has been validated in hardware using the conformance streams from ITU-T and other industry standard test suites for functional and performance testing. The H.264 Decoder IP has been field-tested and proven in various customer applications. Leading broadcast companies have evaluated and purchased the VYUsync H.264 Decoder IP core.
28 |
29 | The VYUsync decoder design is fully autonomous and does not require an external processor to aid the decode operation. It supports Constrained Baseline, Main, and High profiles that allow for bit depths of 8 and 10 bits per sample with support for 4:0:0, 4:2:0, and 4:2:2 Chroma sampling. The evaluation version is limited to 4:2:0 8-bit. The decoder can decode all the bitstreams encoded for the High tier and Level 4.1 as well as for all lower tiers.
30 |
31 | ## Supported Video Formats
32 |
33 | * NTSC, PAL
34 | * 720p50, 720p59.94, 720p60
35 | * 1080i50, 1080i59.94, 1080i60
36 | * 1080p23.98, 1080p24, 1080p25, 1080p29.97, 1080p30, 1080p50, 1080p59.94, 1080p60
37 |
38 | In addition, all non-standard resolutions are supported by the decoder up to a maximum resolution of 1920x1080 and a maximum frame rate of 60 fps. The decoder supports a CABAC bit rate up to 80 Mbps, and a CAVLC bit rate up to 160 Mbps.
39 |
40 | For more information about this decoder, please visit [www.vyusync.com](https://www.vyusync.com) or write to contact@vyusync.com.
41 |
42 | :arrow_backward:**Previous Topic:** [1. Overview](README.md)
43 |
44 | :arrow_forward:**Next Topic:** [3. NGCodec HEVC and VP9 Encoder](ngcodec-hevc-vp9-encoder.md)
45 |
--------------------------------------------------------------------------------
/xilinx-abr-scaler.md:
--------------------------------------------------------------------------------
1 |
2 |
22 |
23 | # Xilinx ABR Scaler
24 | 
25 |
26 | For streaming applications, video is distributed in different resolutions and bit rates to adapt to varying network bandwidth conditions. All ABR transcoding systems require an ABR scaler that downscales an input video stream to several different smaller resolutions that are then reencoded. These smaller resolutions are referred to as an image pyramid or an ABR ladder.
27 |
28 | The Xilinx ABR scaler is an accelerator capable of generating up to eight lower resolution output images from a single input image. The ABR resolution ladder is generated in a sequential fashion: a smaller resolution on the ladder is derived from the next larger resolution on the ladder. The video scaler core resamples the incoming video data stream using a separable polyphase horizontal and vertical filter to preserve hardware resources. The multiple resolution outputs are obtained by feeding previous output as input to the next stage in a cascading manner and applying tailored horizontal and vertical filter coefficients at each stage.
29 |
30 | In the ABR ladder below, outputs of 1280x720, 852x480, 640x360, and 416x240 are created from a 1920x1080 input.
31 |
32 | 
33 |
34 | The scaler takes the 1920x1080 input and scales it down to 1280x720. It then takes the generated 1280x720 output and scales it down to 852x480, which is then scaled down to 640x360, which is finally scaled down to 416x240. The ABR scaler generates all the output resolutions without any host intervention.
35 |
36 | ## Features
37 |
38 | * Supports up to 12 taps in both horizontal and vertical direction per stage.
39 | * High-quality polyphase scaling with 64 phases, with quality matching the FFmpeg default bicubic setting.
40 | * Dynamically configurable filter coefficients.
41 | * Supports 8-bit 4:2:0.
42 | * Luma and Chroma processed in parallel.
43 | * Supports 1080p60 real time or equivalent distributed between up to eight outputs.
44 | * Supports even spatial resolutions from 256x144 to 1920x1080.
45 | * Although the main use case is downscaling, the ABR scaler also allows for upscaling. The outputs must be configured in descending order.
46 |
47 | For more information, write to sean.gardner@xilinx.com.
48 |
49 | :arrow_backward:**Previous Topic:** [3. NGCodec HEVC and VP9 Encoder](ngcodec-hevc-vp9-encoder.md)
50 |
51 | :arrow_forward:**Next Topic:** [5. FFmpeg Integration](ffmpeg-integration.md)
52 |
--------------------------------------------------------------------------------