When recording video using the `gz::common::VideoEncoder` class, you can opt in to hardware (HW) acceleration for the encoding process. By default, only software encoders are used. This tutorial shows how to configure the encoder for HW acceleration and presents ready-made command lines for some typical use cases.
You can either use the `VideoEncoder` class directly, or you may encounter it via the video recorder plugin in Gazebo. In both cases, HW-accelerated encoding can be set up.
HW acceleration should give you higher encoding performance, potentially leaving more CPU power for the rest of your program or simulation, at the cost of a bit of GPU memory (video encoding uses different chips than 3D graphics or CUDA computations, so, performance-wise, the rest of the GPU should be unaffected).
Configuring hardware acceleration
What needs to be configured
In order to get HW accelerated encoding working, you need to get 3 things right:
- The encoder type
- The HW device to be used
- Whether to use a specialized HW surface
1. Encoder types
The support for HW-accelerated encoding is based on what the local installation of libavcodec (and your hardware) offers. If the libavcodec/FFmpeg on your system doesn't support HW acceleration, you're out of luck until you get a version that does. Information about various aspects of FFmpeg's acceleration support can be found on their HWAccelIntro wiki page.
If FFmpeg is correctly installed, you can list the available HW encoders by calling `ffmpeg -hide_banner -encoders | grep 264`.
Here is some sample output (illustrative only; the actual list depends on your FFmpeg build and which encoders it was compiled with):
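```
 V..... libx264              libx264 H.264 / AVC / MPEG-4 AVC / MPEG-4 part 10 (codec h264)
 V..... h264_nvenc           NVIDIA NVENC H.264 encoder (codec h264)
 V..... h264_qsv             H.264 / AVC / MPEG-4 AVC / MPEG-4 part 10 (Intel Quick Sync Video acceleration) (codec h264)
 V..... h264_vaapi           H.264/AVC (VAAPI) (codec h264)
```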
Gazebo Common so far supports:
Encoder | Technology | Platforms | GPU vendors | Gazebo Common name |
---|---|---|---|---|
h264_nvenc | NVidia NVEnc | Windows + x86 Linux | NVidia GPUs | NVENC |
h264_qsv | Intel QuickSync | Windows + x86 Linux | Intel GPUs | QSV |
h264_vaapi | VA-API | x86 Linux | most GPUs | VAAPI |
Adding support for more is possible and should not be a big problem, but needs someone with the right hardware to verify the implementation.
The acceleration libraries may also need some drivers for them to work (an installation example follows the list):
- NVEnc: `libnvidia-encode-*` or just the CUDA runtime
- VA-API: `libva-glx2`, `libva-drm2`, maybe also `libglvnd0`
- QSV: unclear, usually works out of the box
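For example, on Ubuntu these can usually be installed via apt (package names as listed above; the NVEnc package version must match your installed driver, so the `535` below is just an example):

```bash
# VA-API user-space libraries
sudo apt install libva-glx2 libva-drm2 libglvnd0
# NVEnc encode library (the version suffix must match the installed NVidia driver)
sudo apt install libnvidia-encode-535
```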
2. Device names
If your computer has multiple GPUs, it is important to specify which one to use. Here are some basic naming rules:
- NVEnc
  - Linux: Devices `/dev/nvidia0`, `/dev/nvidia1` etc. Beware that the numbers in the names of these devices can differ from the numbers CUDA assigns to the GPUs (CUDA orders the GPUs according to their compute capability, whereas these devices are probably numbered by their order on the PCI bus).
  - Windows: Devices are called the same as on Linux, even though such files do not exist in the system.
- QSV
  - Windows: The devices are called just `0`, `1` and so on.
  - Linux: When using QSV on Linux, you can use VA-API device names.
- VA-API
  - Linux:
    - DRM: The devices are called `/dev/dri/renderD128`, `/dev/dri/renderD129` etc. (DRM stands for Direct Rendering Manager, not Digital Rights Management.) To use these devices, make sure your user has write permission to the device files. Ubuntu usually doesn't give write access to everybody, but only to members of the `video` group; see the example after this list.
    - GLX: You can also pass an X-server string like `:0`, which means the encoder should use the GPU on which this X server is running (it needs to support the GLX extension). On headless machines, you should use DRM.
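For example, on Ubuntu you can inspect the render nodes and add your user to the `video` group like this (a new login is needed for the group change to take effect):

```bash
# List DRM render nodes and their owning group/permissions
ls -l /dev/dri/renderD*
# Grant the current user access by joining the video group
sudo usermod -a -G video $USER
```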
3. Using HW surface
The last thing you need to decide is whether the selected encoder should use a HW surface (pixel buffer) or the default CPU-located one. With most encoders, this is just a performance issue and they will work both with CPU and GPU surfaces.
It is best to experiment with the HW surface both enabled and disabled and compare the performance. Select the option that suits your use case better.
Putting it together - configuration via environment variables
To ease configuration of HW-accelerated encoding, the code using `VideoEncoder` does not have to support it explicitly. The code may concentrate on implementing the recording procedure itself and completely ignore any HW acceleration of the recording process. Users of the code can then enable HW acceleration using just these 3 environment variables:
`GZ_VIDEO_ALLOWED_ENCODERS`

This is the main variable that allows the `VideoEncoder` to probe for supported HW-accelerated encoders. It is a colon-separated list of the names described in the table above, e.g. `GZ_VIDEO_ALLOWED_ENCODERS=NVENC:QSV`. The special value `ALL` means that all encoders should be tried. The special value `NONE` (or an empty value) means that a SW encoder should be used.
If multiple values are specified, the system will probe all the allowed encoders, trying to start them up (if a device is specified, only with the given device). The first allowed encoder that successfully finishes the probe will be used.
The probing mechanism isn't 100% reliable, but it does what is reasonable in such an autodetection loop - it checks whether the required/supported device files exist, and if they do (or if there are no device files, as on Windows), the library tries to create an encoding context. If the context is successfully created, the encoder is considered working. Sometimes something can still go wrong at a later stage (e.g. insufficient GPU memory), and that kind of failure you have to handle yourself.
`GZ_VIDEO_ENCODER_DEVICE`

This is the name of the encoder device, as specified in the "Device names" section. If empty, the first working device will be used. This autodetection should suffice on single-GPU systems or if you don't care which GPU is used. If a device is specified, only encoders accepting this device name as an argument will be probed.
`GZ_VIDEO_USE_HW_SURFACE`

This variable has three possible values:

- `1`: Explicitly tell the encoder to use a GPU-located buffer.
- `0`: Explicitly tell the encoder to use a CPU-located buffer.
- empty: Let the library guess based on some pre-compiled hints.

Refer to the section "Using HW surface" for a more in-depth description of the meaning of this variable. Usually, leaving it empty should be just fine.
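Putting the three variables together, a HW-accelerated recording session could be launched like this (a sketch; `my_recording_app` stands for a hypothetical application that uses `VideoEncoder`):

```bash
# Allow only NVEnc, pin it to the first NVidia card and request a GPU buffer.
GZ_VIDEO_ALLOWED_ENCODERS=NVENC \
GZ_VIDEO_ENCODER_DEVICE=/dev/nvidia0 \
GZ_VIDEO_USE_HW_SURFACE=1 \
./my_recording_app
```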
Configuration in code
These values can also be configured in code.
As stated earlier, if you use the standard 6-argument signature of `VideoEncoder::Start()`, configuration is performed via the environment variables described above.
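For reference, it looks approximately like this (as declared in gz-common's VideoEncoder header; the exact form may vary between versions):

```cpp
// Approximate declaration; check your gz-common version.
public: bool Start(const std::string &_format,
                   const std::string &_filename,
                   const unsigned int _width,
                   const unsigned int _height,
                   const unsigned int _fps,
                   const unsigned int _bitRate);
```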
There are two more signatures:
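The first one looks approximately like this (again, check your gz-common version's header for the exact form):

```cpp
// Approximate declaration; check your gz-common version.
public: bool Start(const std::string &_format,
                   const std::string &_filename,
                   const unsigned int _width,
                   const unsigned int _height,
                   const unsigned int _fps,
                   const unsigned int _bitRate,
                   const bool _allowHwAccel);
```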
This signature adds the `_allowHwAccel` boolean argument, with which you can explicitly allow or disallow the configuration via environment variables.
Then there is the full signature:
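It looks approximately like this (the exact declaration may differ between gz-common versions):

```cpp
// Approximate declaration; check your gz-common version.
public: bool Start(const std::string &_format,
                   const std::string &_filename,
                   const unsigned int _width,
                   const unsigned int _height,
                   const unsigned int _fps,
                   const unsigned int _bitRate,
                   const FlagSet<HWEncoderType> &_allowedHwAccel,
                   const std::string &_hwAccelDevice = "",
                   std::optional<bool> _useHwSurface = {});
```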
The three added arguments correspond to the environment variables, but with this signature you can set their values from code (and environment variables will have no effect then). This would be useful if you want to e.g. implement a GUI chooser for the acceleration.
The `FlagSet<HWEncoderType>` captures a set of allowed encoders. Its value may be e.g. `gz::common::HWEncoderType::QSV | gz::common::HWEncoderType::NVENC`.
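For illustration, a call to the full overload might look like this (a minimal sketch under the assumptions above; the file name, resolution and device are arbitrary):

```cpp
#include <optional>
#include <gz/common/VideoEncoder.hh>

int main()
{
  gz::common::VideoEncoder encoder;

  // Allow only NVEnc and QuickSync to be probed.
  const auto allowedEncoders = gz::common::HWEncoderType::NVENC |
                               gz::common::HWEncoderType::QSV;

  // 1280x720 @ 25 FPS, default bitrate (0); the last three arguments
  // correspond to the env variables and override them when set from code.
  if (!encoder.Start("mp4", "/tmp/video.mp4", 1280, 720, 25, 0,
                     allowedEncoders, "/dev/nvidia0", std::nullopt))
    return 1;

  // ... repeatedly feed raw frames via encoder.AddFrame(...) ...

  encoder.Stop();
  return 0;
}
```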
How do I know it's working?
To make sure you configured the HW acceleration right, look at the info-level messages in which `VideoEncoder` documents the detected encoder. From such a message you can see, for example, that an NVEnc encoder was successfully used on card `/dev/nvidia1` to encode a video clip.
If something goes wrong, you'll see a lot of error messages in the console. Sometimes they are helpful, sometimes they are not. In the very worst case, a wrong configuration may lead to a segfault or other kinds of faulty behavior. This is also the reason why HW acceleration isn't turned on by default.
When the system cannot use any of the found devices, it prints the errors from the failed probes and falls back to the SW encoder.
Examples
Here are a few ready-made examples which might or might not work for you right away. Just give them a try and dig deeper into the configuration if something is wrong.
Linux/Win + Intel GPU
GZ_VIDEO_ALLOWED_ENCODERS=QSV
Linux/Win + NVidia GPU
GZ_VIDEO_ALLOWED_ENCODERS=NVENC
Linux + Intel/NVidia GPU
GZ_VIDEO_ALLOWED_ENCODERS=VAAPI
Linux NVidia Multi-GPU machine
GZ_VIDEO_ALLOWED_ENCODERS=NVENC GZ_VIDEO_ENCODER_DEVICE=/dev/nvidia2
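Similarly, you can pin VA-API to a specific render node using the device names from the "Device names" section, e.g.:
GZ_VIDEO_ALLOWED_ENCODERS=VAAPI GZ_VIDEO_ENCODER_DEVICE=/dev/dri/renderD129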
Caveats
FFmpeg on Windows via Conda
The current distribution of FFmpeg via Conda for Windows does not include the h264_qsv encoder; it only has h264_nvenc. This means that using Intel GPUs for HW acceleration is not possible out of the box. A possible solution is to make a custom build of FFmpeg in the workspace together with gz-common. Pull requests with the relevant tutorial are welcome.
NVEnc per-machine limit
If you have a multi-GPU station with desktop-class (not server-class) GPUs, you will run into an artificial limitation imposed by NVidia: you can only run 3 concurrent encoding sessions on one computer, no matter how many GPUs you have. The computer can even have 8 GPUs, but you will still only be able to encode 3 videos at a time. The exact maxima of encoding sessions are described on this NVidia support page.
To make things worse, there is no API that would tell you the number of currently running NVEnc sessions. The only way to find out that you're launching the fourth one is to try to start encoding and get a memory allocation error. This library catches this error and writes a lengthy description of what might have just happened (either really low memory or this artificial limit). Unfortunately, when you start the doomed fourth encoding session, all three "legal" sessions will crash. This might be really troublesome on e.g. multi-user systems where you don't even know which jobs of the other users are using NVEnc.
There is a workaround removing this artificial limit - patching the binary blob drivers using https://github.com/keylase/nvidia-patch . This is an unofficial patch that is not supported by NVidia or the Gazebo developers. It is up to you whether you can and want to do that, and there is no guarantee it will work forever.