Reference Pictures in libx264
libx264 is simply brilliant; it is today’s fastest and most bitrate-efficient open-source H.264 encoder library. Sadly, I don’t believe there is good documentation out there for beginners. So, in this post, I will try to explain what is going on under the hood during encoding. You can find the code in the encoder/encoder.c file under your default x264 folder.
In the code, each encoder thread is abstracted into an x264_t struct that holds the encoder state. You can find its declaration in common/common.h. Inside it, there are lots of sub-structs. Here are a few important points, in no particular order.
Manipulating reference lists for encoding
Let’s first go over some variables in the encoder.
1. h->i_ref[] : int array holding the number of reference frames for each list.
2. h->frames.reference[] : x264_frame_t* array of reference frames of length X264_REF_MAX+2 (2 sentinels). May contain both past and future frames.
3. h->fref[][] : x264_frame_t* table of final reference frames, 2-by-X264_REF_MAX+3.
Frames in (2) are distributed into (3) inside the x264_reference_build_list() function. Both lists are then sorted by distance to the current frame (closer reference frames get smaller indices), using the i_frame variable.
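To make the list construction concrete, here is a minimal sketch of the split-and-sort step. It is not the libx264 code: frame_t, ref_state_t, build_ref_lists() and REF_MAX are hypothetical stand-ins, and I am assuming that frames with a smaller POC than the current frame go to list0, frames with a larger POC go to list1, and that distance is the absolute difference of i_frame.

```c
#include <assert.h>
#include <stdlib.h>

#define REF_MAX 16 /* hypothetical stand-in for X264_REF_MAX */

/* Hypothetical, stripped-down frame: only the fields this sketch needs. */
typedef struct {
    int i_frame; /* frame number in display order */
    int i_poc;   /* picture order count */
} frame_t;

/* Hypothetical subset of the encoder state described above. */
typedef struct {
    frame_t *reference[REF_MAX + 2]; /* NULL-terminated pool of candidates */
    frame_t *fref[2][REF_MAX + 3];   /* final list0 and list1 */
    int      i_ref[2];               /* number of entries in each list */
} ref_state_t;

static int ref_distance(const frame_t *cur, const frame_t *ref)
{
    return abs(cur->i_frame - ref->i_frame);
}

/* Split the pool into past (list0) and future (list1) frames, then sort
 * each list so that the nearest reference ends up at index 0. */
static void build_ref_lists(ref_state_t *s, const frame_t *cur)
{
    assert(s != NULL && cur != NULL);
    s->i_ref[0] = s->i_ref[1] = 0;

    for (int i = 0; i < REF_MAX + 2 && s->reference[i]; i++) {
        frame_t *f = s->reference[i];
        if (f->i_poc < cur->i_poc)
            s->fref[0][s->i_ref[0]++] = f; /* past frame */
        else if (f->i_poc > cur->i_poc)
            s->fref[1][s->i_ref[1]++] = f; /* future frame */
    }

    for (int list = 0; list < 2; list++) {
        /* Insertion sort by distance to the current frame. */
        for (int i = 1; i < s->i_ref[list]; i++) {
            frame_t *key = s->fref[list][i];
            int j = i - 1;
            while (j >= 0 &&
                   ref_distance(cur, s->fref[list][j]) > ref_distance(cur, key)) {
                s->fref[list][j + 1] = s->fref[list][j];
                j--;
            }
            s->fref[list][j + 1] = key;
        }
    }
}
```

Calling build_ref_lists() with a zero-initialized ref_state_t whose reference[] pool has been filled leaves the nearest past frame in fref[0][0] and the nearest future frame in fref[1][0].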
The order of each reference list is double-checked in x264_reference_check_reorder(); if it differs from the standard’s default, a reordering is signaled. According to the standard, list0 is ordered with respect to frame_num if we have a P-slice, while list0 and list1 are ordered with respect to poc in case of a B-slice.
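A check of this kind could look roughly like the sketch below. The types and the needs_reorder() helper are hypothetical, not libx264 names. I am assuming the default orders are descending frame_num for list0 of a P-slice, descending poc for list0 of a B-slice and ascending poc for list1 of a B-slice, and I ignore frame_num wraparound and long-term references.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum { SLICE_P, SLICE_B } slice_type_t; /* hypothetical */

/* Hypothetical reference entry with only the two ordering keys. */
typedef struct {
    int frame_num;
    int poc;
} ref_t;

/* Returns true if the given list deviates from the assumed default order,
 * i.e. if a reordering would have to be signaled in the slice header. */
static bool needs_reorder(slice_type_t type, int list, const ref_t *refs, int n)
{
    /* Only B-slices have a list1. */
    assert(list == 0 || (list == 1 && type == SLICE_B));

    for (int i = 0; i + 1 < n; i++) {
        int framenum_diff = refs[i + 1].frame_num - refs[i].frame_num;
        int poc_diff      = refs[i + 1].poc - refs[i].poc;
        bool out_of_order;

        if (type == SLICE_P)
            out_of_order = framenum_diff > 0; /* expect descending frame_num */
        else if (list == 0)
            out_of_order = poc_diff > 0;      /* expect descending poc */
        else
            out_of_order = poc_diff < 0;      /* expect ascending poc */

        if (out_of_order)
            return true;
    }
    return false;
}
```

Since the lists were sorted by distance beforehand, this check only fires when the distance order and the default order disagree.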
Signaling the changes in reference picture lists
Slice headers carry, among other things, the information about the reference frames used to predict the current slice. In libx264, slice headers are of type x264_slice_header_t and they are initialized in the x264_slice_init() function, which then calls the x264_slice_header_init() function with the suitable arguments. Both of these functions copy the relevant state variables in the encoder object h (of type x264_t) onto their corresponding variables in the slice header h->sh. The actual byte-stream is written to the output in x264_slice_header_write().
Let’s go over the slice header fields that signal the changes in the reference picture lists:
- If there is a deviation from the default number of active reference frames signaled in the PPS, we
  - set h->sh.b_num_ref_idx_override that corresponds to the slice header field num_ref_idx_active_override_flag.
  - enter the number of active reference frames in list X by changing h->sh.i_num_ref_idx_lX_active that corresponds to the slice header field num_ref_idx_lX_active_minus1.

  Both of these are done in x264_slice_init().
- If there is a deviation from the default ordering of the reference frames in list X, we
  - set h->sh.b_ref_pic_list_reordering[X] that corresponds to the slice header field ref_pic_list_reordering_flag_lX.
  - enter the idc and arg pair(s) that will correctly signal the new ordering.

  Check out the final code snippet in x264_slice_header_init(); this is basically the algorithm mentioned in the previous post.
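The two cases above can be sketched for a single list as follows. This is an illustration under my own assumptions, not the libx264 implementation: toy_slice_header_t, set_active_refs() and set_reordering() are hypothetical names, only short-term references are handled, and each idc and arg pair is derived by predicting the next frame_num from the previous one (idc 0 meaning "subtract", idc 1 meaning "add", arg being the absolute difference minus one).

```c
#include <assert.h>
#include <stdlib.h>

#define MAX_REFS 16 /* hypothetical stand-in for X264_REF_MAX */

typedef struct {
    int idc; /* 0: subtract from the prediction, 1: add to it */
    int arg; /* absolute frame_num difference minus 1 */
} reorder_cmd_t;

/* Hypothetical single-list subset of the slice header fields. */
typedef struct {
    int b_num_ref_idx_override;
    int i_num_ref_idx_l0_active; /* written to the stream as value minus 1 */
    int b_ref_pic_list_reordering;
    reorder_cmd_t order[MAX_REFS];
} toy_slice_header_t;

/* Case 1: the number of active references differs from the PPS default. */
static void set_active_refs(toy_slice_header_t *sh, int n_active, int pps_default)
{
    sh->i_num_ref_idx_l0_active = n_active;
    sh->b_num_ref_idx_override  = (n_active != pps_default);
}

/* Case 2: the list order differs from the default, so emit one (idc, arg)
 * pair per entry. Assumes every entry differs from the running prediction. */
static void set_reordering(toy_slice_header_t *sh, const int *ref_frame_num,
                           int n_refs, int cur_frame_num, int log2_max_frame_num)
{
    assert(n_refs <= MAX_REFS);
    int pred = cur_frame_num; /* prediction starts at the current frame */
    int mask = (1 << log2_max_frame_num) - 1;

    sh->b_ref_pic_list_reordering = 1;
    for (int i = 0; i < n_refs; i++) {
        int diff = ref_frame_num[i] - pred;
        assert(diff != 0);
        sh->order[i].idc = diff > 0;
        sh->order[i].arg = (abs(diff) - 1) & mask;
        pred = ref_frame_num[i]; /* next entry is predicted from this one */
    }
}
```

For example, with a current frame_num of 10 and a desired list order of 7, 9, 8, the pairs come out as (0, 2), (1, 1) and (0, 0): step back by 3, forward by 2, then back by 1.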