libx264 is simply brilliant; it is today’s fastest and the most bitrate efficient open-source H.264 encoder library. Sadly, I don’t believe there’s a good documentation out there for beginners. So, in this post, I will try to explain what is going on under the hood during the encoding. You can find the code in encoder/encode.c file under your default x264 folder.

In the code, each encoder thread is abstracted into an x264_t struct that holds the encoder state. You can find its declaration in common/common.h. Inside it, there are lots of sub-structs. Here is a few important issues, without any order.

Manipulating reference lists for encoding

Let’s first go over some variables in the encoder.

  1. h->i_ref[]: int array of number of reference frames for each list
  2. h->frames.reference[]: x264_frame_t* array of reference frames of length X264_REF_MAX+2 (2 sentinels). May contain both past and future frames.
  3. h->fref[][]: x264_frame_t* table of final reference frames, 2-by-X264_REF_MAX+3.

Frames in (2) are distributed on (3) inside the x264_reference_build_list() function. Then both lists are sorted with respect to the distance to the frame (closer reference frames have smaller indices), using the i_frame variable.

#define XCHG(type,a,b) do{ type t = a; a = b; b = t; } while(0)
/* Order reference lists by distance from the current frame. */
for( int list = 0; list < 2; list++ ) {
  h->fref_nearest[list] = h->fref[list][0];
  do {
    b_ok = 1;
    for( int i = 0; i < h->i_ref[list] - 1; i++ ) {
      if( list ? h->fref[list][i+1]->i_poc < h->fref_nearest[list]->i_poc
          : h->fref[list][i+1]->i_poc > h->fref_nearest[list]->i_poc )
        h->fref_nearest[list] = h->fref[list][i+1];
        
        if( x264_reference_distance( h, h->fref[list][i] ) > x264_reference_distance( h, h->fref[list][i+1] ) ) {
           XCHG( x264_frame_t*, h->fref[list][i], h->fref[list][i+1] );
           b_ok = 0;
           break;
        }
    }
  } while( !b_ok );
}

The orders of the reference lists are double-checked in x264_reference_check_reorder(); if it is different from the standard’s default, a reordering is signaled. Check out the code snippet below - according to the standard, list0 is ordered with respect to frame_num if we have a P-slice, while list0 and list1 are ordered with respect to poc in case of a B-slice.

for( int list = 0; list <= (h->sh.i_type == SLICE_TYPE_B); list++ ) {
    for( int i = 0; i < h->i_ref[list] - 1; i++ )
    {
        int framenum_diff = h->fref[list][i+1]->i_frame_num - h->fref[list][i]->i_frame_num;
        int poc_diff = h->fref[list][i+1]->i_poc - h->fref[list][i]->i_poc;
        /* P and B-frames use different default orders. */
        if( h->sh.i_type == SLICE_TYPE_P ? framenum_diff > 0 : list == 1 ? poc_diff < 0 : poc_diff > 0 )
        {
            h->b_ref_reorder[list] = 1;
            return;
        }
    }
}

Signaling the changes in reference picture lists

Slice headers carry, among others, the information regarding the reference frames used to predict the current slice. In libx264, slice headers are of type x264_slice_header_t and they are initialized in the x264_slice_init() method, which then calls the x264_slice_header_init() method with the suitable arguments. Both of these methods involve copying the relevant state variables in the encoder object h (of type x264_t) onto their corresponding variables in the slice header h->sh. The actual byte-stream is written on the output file in x264_slice_header_write().

Let’s go over the slice header fields that signal the changes in the reference picture lists:

  • If there is a deviation from the default number of active reference frames signaled in the PPS, we
    1. set h->sh.b_num_ref_idx_override that corresponds to the slice header field num_ref_idx_active_override_flag.
    2. enter the number of active reference frames in list X by changing h->sh.i_num_ref_idx_lX_active that corresponds to the slice header field num_ref_idx_lX_active_minus1. Both of these are done in x264_slice_init():
h->sh.i_num_ref_idx_l0_active = h->i_ref[0] <= 0 ? 1 : h->i_ref[0];
h->sh.i_num_ref_idx_l1_active = h->i_ref[1] <= 0 ? 1 : h->i_ref[1];
if( h->sh.i_num_ref_idx_l0_active != h->pps->i_num_ref_idx_l0_default_active ||
   (h->sh.i_type == SLICE_TYPE_B && h->sh.i_num_ref_idx_l1_active != h->pps->i_num_ref_idx_l1_default_active) )
{
 h->sh.b_num_ref_idx_override = 1;
}
  • If there is a deviation from the default ordering of the reference frames in list X, we
    1. set h->sh.b_ref_pic_list_reordering[X] that corresponds to the slice header field ref_pic_list_reordering_flag_lX.
    2. enter the idc and arg pair(s) that will correctly signal the new ordering. Check out the final code snippet in x264_slice_header_init() - this is basically the algorithm mentioned in the previous post.
sh->b_ref_pic_list_reordering[0] = h->b_ref_reorder[0];
sh->b_ref_pic_list_reordering[1] = h->b_ref_reorder[1];

/* If the ref list isn't in the default order, construct reordering header */
for( int list = 0; list < 2; list++ )
{
if( sh->b_ref_pic_list_reordering[list] )
{
    int pred_frame_num = i_frame;
    for( int i = 0; i < h->i_ref[list]; i++ )
    {
        int diff = h->fref[list][i]->i_frame_num - pred_frame_num;
        sh->ref_pic_list_order[list][i].idc = ( diff > 0 );
        sh->ref_pic_list_order[list][i].arg = (abs(diff) - 1) & ((1 << sps->i_log2_max_frame_num) - 1);
        pred_frame_num = h->fref[list][i]->i_frame_num;
    }
}
}