README for the Dirac video codec
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

by Thomas Davies, BBC R&D (dirac@rd.bbc.co.uk)


1. Executive Summary
~~~~~~~~~~~~~~~~~~~~

Dirac is an open source video codec. It uses a traditional hybrid video codec
architecture, but with the wavelet transform instead of the usual block
transforms.  Motion compensation uses overlapped blocks to reduce block
artefacts that would upset the transform coding stage.

Dirac can code just about any size of video, from streaming up to HD and
beyond, although certain presets are defined for different applications and
standards.  These cover the parameters that need to be set for the encoder to
work, such as block sizes and temporal prediction structures, which must
otherwise be set by hand.

Dirac is intended to develop into real coding and decoding software, capable
of plugging into video processing applications and media players that need
compression. It is intended to develop into a simple set of reliable but
effective coding tools that work over a wide variety of content and formats,
using well-understood compression techniques, in a clear and accessible
software structure. It is not intended as a demonstration or reference coder.


2. Documentation
~~~~~~~~~~~~~~~~

A user guide and a guide to the software is in progress. More details on
running the codec can be found at http://dirac.sourceforge.net/


3. Building and installing
~~~~~~~~~~~~~~~~~~~~~~~~~~

  GNU/Linux, Unix, MacOS X, Cygwin, Mingw
  ---------------------------------------
    ./configure --enable-debug
        (to enable extra debug compile options)
     OR
    ./configure --enable-profile
        (to enable the g++ profiling flag -pg)
     OR
    ./configure --enable-debug --enable-profile
        (to enable extra debug compile options and profiling options)
     OR
     ./configure

     By default, both shared and static libraries are built. To build all-static
     libraries use
     ./configure --disable-shared

	 To build shared libraries only use
     ./configure --disable-static

     make
     make install

  The INSTALL file documents arguments to ./configure such as
  --prefix=/usr/local (specify the installation location prefix).

  
  MSYS and Microsoft Visual C++
  -----------------------------
     Download and install the no-cost Microsoft C++ compiler from
     http://msdn.microsoft.com/visualc/vctoolkit2003/

     Download and install MSYS (the MinGW Minimal SYStem), MSYS-1.0.10.exe, 
     from http://www.mingw.org/download.shtml. An MSYS icon will be available
     on the desktop.

     Click on the MSYS icon on the desktop to open a MSYS shell window.

     Create a .profile file to set up the environment variables required. 
     vi .profile

     Include the following three lines in the .profile file.

         export PATH=/c/Program\ Files/Microsoft\ Visual\ C++\ Toolkit\ 2003/bin:$PATH
         export INCLUDE=/c/Program\ Files/Microsoft\ Visual\ C++\ Toolkit\ 2003/include
         export LIB=/c/Program\ Files/Microsoft\ Visual\ C++\ Toolkit\ 2003/lib

         (Replace /c/Program\ Files/Microsoft\ Visual\ C++\ Toolkit\ 2003/ with
         the location where VC++ 2003 is installed if necessary)


     Exit from the MSYS shell and click on the MSYS icon on the desktop to open 
     a new MSYS shell window for the .profile to take effect.

     Change directory to the directory where Dirac was unpacked. By default 
	 only the dynamic libraries are built.

     ./configure CXX=cl --enable-debug
         (to enable extra debug compile options)
     OR
     ./configure CXX=cl --disable-shared
	     (to build static libraries)
     OR
     ./configure CXX=cl
     make
     make install

     The INSTALL file documents arguments to ./configure such as
     --prefix=/usr/local (specify the installation location prefix).

  Microsoft Visual C++ .NET 2003
  ------------------------------
  The MS VC++ .NET 2003 solution and project files are in win32/VS2003 
  directory.  Double-click on the solution file, dirac.sln, in the 
  win32/VS2003 directory.  The target 'Everything' builds the codec 
  libraries and utilities. Four build-types are supported

  Debug - builds unoptimised encoder and decoder dlls with debug symbols
  Release - builds optimised encoder and decoder dlls
  Static-Debug - builds unoptimised encoder and decoder static libraries
                 with debug symbols
  Static-Release - builds optimised encoder and decoder static libraries
 
  Static libraries are created in the win32/VS2003/lib/<build-type> directory.

  Encoder and Decoder dlls and import libraries, encoder and decoder apps are 
  created in the win32/VS2003/bin/<build-type> directory. The "C" public API
  is exported using the _declspec(dllexport) mechanism.

  Conversion utilites are created in the 
  win32/VS2003/utils/conversion/<build-type> directory. Only static versions
  are built.

  Instrumentation utility is created in the 
  win32/VS2003/utils/instrumentation/<build-type> directory. Only static
  versions are built.


4. Running the example programs
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

4.1 Command-line parameters

At the moment there is a simple command-line parser class which is 
used in all the executables. The general procedure for running a program
is to type:

  prog_name -<flag_name> flag_val ... param1 param2 ...

In other words, options are prefixed by a dash; some options take values, 
while others are boolean options that enable specific features. For example:
When running the encoder, the -qf options requires a numeric argument
specifying the "quality factor" for encoding. The -verbose option enables
detailed output and does not require an argument.

Running any program without arguments will display a list of parameters and
options.

4.2 File formats

The example coder and decoder use a temporary file format which consists of
raw 8-bit planar YUV data together with a header file. This means that data
is stored bytewise, with a frame of Y followed by a frame of U followed by
a frame of V, all scanned in the usual raster order. 

Other file formats are supported by means of conversion utilities that
may be found in the subdirectory util/conversion. These will convert to
and from raw RGB format, and support all the standard raw YUV formats as
well as bitmaps. Raw RGB can be obtained as an output from standard conversion
utilities such as ImageMagick.

Once a raw YUV file has been made, in order to run the codec, a header must be
constructed. The header records such picture information as: the picture
dimensions, which are taken to be those of the luminance or Y component, and
other metadata. The other metadata consists of the chroma format, the frame
rate in Hertz, a flag indicating interlace and, if interlace, a flag
indicating whether the interlace is top-field first. The chroma format setting
records whether the video is sampled 4:4:4, 4:2:2, 4:1:1 or 4:2:0, and is
essential. The frame rate setting is used to calculate bit-rate for the
encoder, and display rate for the decoder, and if omitted a rate of 12Hz is
assumed.

The header file is made using the make_header tool in subdirectory 
picheader/make_header. It's in text format so can also be edited manually.

Example.
  Compress an image sequence of 100 frames of 352x288 video in tiff format.

  Step 1.

  Use your favourite conversion routine to produce a single raw RGB file of 
  all the data. If your routine converts frame-by-frame then you will
  need to concatenate the output.

  Step 2.

  Convert from RGB to the YUV format of your choice. For example, to do
  420, type

  RGBtoYUV420 <file.rgb >file.yuv 352 288 100

  Note that this uses stdin and stdout to read and write the data.

  Step 3.

  Make the appropriate header to accompany the raw data file:

  make_header -xl 720 -yl 576 -cformat format420 -framerate 25 -interlace file

  This writes file.hdr with the corresponding parameters.

  Step 4.

  Run the encoder. This will produce a locally decoded output in the
  same format.

  Step 5.

  Convert back to RGB.

  YUV420toRGB <file.yuv >file.rgb 352 288 100

  Step 6.

  Use your favourite conversion utility to convert to the format of your
  choice.

You can also use the transcode utility to convert data to and from Dirac's
native formats (see http://zebra.fh-weingarten.de/~transcode/):

  This example uses a 720x576x50 DV source, and transcodes to 720x576 YUV in
  4:2:0 chroma format.  Cascading codecs (DV + Dirac) is generally a bad idea
  - use this only if you don't have any other source of uncompressed video.

    transcode -i source.dv -x auto,null --dv_yuy2_mode -k -V -y raw,null -o file.avi
    tcextract -i test.avi -x rgb > file.yuv

    make_header -xl 720 -yl 576 -cformat format420 -framerate 25 -interlace file

Viewing and playback utilities for uncompressed video include MPlayer and
ImageMagick's display command.

  Continuing the 352x288 4:2:0 example above, to display a single frame
  of raw YUV with ImageMagick use the following (use <spacebar> to see
  subsequent frames):

    display -size 352x288 test.yuv

  Raw YUV 420 data can also be played back in MPlayer - use the following 
  MPlayer command:

    mplayer -fps 15 -rawvideo on:size=152064:w=352:h=288 test.yuv

  (at the time of writing MPlayer could not playback 4:2:2 or 4:4:4 YUV data)


4.3 Encoding

There are a large number of parameters that can be used to run the encoder,
all of which are listed below, and which are set using the same conventions as
for make_header. However, things are simplified by using presets for different
applications. These set such things as block sizes and overlaps for motion
estimation and compensation (the codec used overlapped blocks), and
psychovisual weighting. They also define the context in which other parameters
like quality factors, operate. The presets are:

CIF      : for CIF video
SD576    : for standard definition video
HD720    : for 1280x720 High Definition progressive video
HD1080   : for 1920/1440x1080 High Definition progressive video [not yet supported]

The other useful parameter is the quality factor qf. This is a number from 0
to 10.  The higher the number, the better the quality. The encoder attempts to
adapt the encoding process to produce constant quality across the sequence.
(Due to variations in the content, it may not exactly achieve this.)

Simple coding example. Code an SD sequence to high quality.

Solution.

  dirac_encoder -SD576 -qf 9 test test_out

will read test.yuv and test.hdr as input, and output a compressed bitstream
test_out.drc as well as locally-decoded files test_out.yuv and test_out.hdr.

Other parameters.

verbose   : turn on verbosity (if you don't, you won't see the final bitrate!)
start     : code from this frame number
stop      : code up until this frame number
L1_sep    : the separation between L1 frames (frames that are predicted but 
            also used as reference frames, like P frames in MPEG-2)
num_L1    : the number of L1 frames before the next intra frame
xblen     : the width of blocks used for motion compensation
yblen     : the height of blocks used for motion compensation
xbsep     : the horizontal separation between blocks. Always <xblen
ybsep     : the vertical separation between blocks. Always <yblen
cpd       : normalised viewing distance parameter, in cycles per degree.
nolocal   : Do no generate diagnostics and locally decoded output

Using -start and -stop allows a small section to be coded, rather than the
whole thing.

Modifying L1_sep and num_L1 allows for new GOP structures to be used, and
should be entirely safe. There are two non-GOP modes that can also be used for
encoding: setting num_L1=0 gives I-frame only coding, and setting num_L1<0
will produce a sequence with infinitely many L1 frames i.e. with a single I
frame at the beginning of the sequence. 

Modifying the block parameters is strongly deprecated: it's likely to break
the encoder as there are many constraints. Modifying cpd will not break
anything, but will change the way noise is distributed which may be more (or
less) suitable for your application. Setting cpd equal zero turns off
perceptual weighting altogether.

Block separations must currently be set so that an integral number of
macroblocks fits into the frame horizontally and vertically. A macroblock is a
4x4 set of blocks, so 4xblen must divide the frame width and 4yblen the frame
height. 

4.4 Decoding

Decoding is much simpler. Just point the decoder input at the bitstream and the
output to a file:

  dirac_decoder -verbose test_enc test_dec

will decode into test_dec.{yuv,hdr} with running commentary.
