Kvazaar Encoder: Introduction and Testing
Introduction
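Kvazaar is an open-source video encoder that aims to provide an efficient video coding solution. It is written in C and supports multiple CPU architectures and operating systems, including x86, x64, and ARM CPUs and Windows, Linux, and macOS.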
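Kvazaar's threading module has three kinds of threads: an input thread, worker threads, and the main thread. The input thread reads the unencoded YUV frames, the worker threads carry out the actual encoding and compression work, and the main thread handles everything else. Internally it supports three parallelization mechanisms: tiles, WPP (wavefront parallel processing), and frame-level parallelism.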
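The three mechanisms correspond to command-line switches listed in the kvazaar --help output (--tiles, --wpp, --owf). The sketch below only prints example option strings. The mode names, the helper kvz_parallel_args, and the specific values (2x2 tiles, --owf 2) are illustrative assumptions, not recommended settings.

```shell
# Print example kvazaar options for one parallelization mechanism.
# MODE is one of: tiles, wpp, frame (names chosen for this sketch only).
kvz_parallel_args() {
  case "$1" in
    tiles) echo "--tiles 2x2" ;;        # uniform 2x2 tiles; per the help text this disables WPP
    wpp)   echo "--wpp --owf 0" ;;      # wavefront only; --owf N processes N+1 frames at a time
    frame) echo "--no-wpp --owf 2" ;;   # frame-level only: 3 frames in flight
    *)     echo "unknown mode: $1" >&2; return 1 ;;
  esac
}

kvz_parallel_args tiles
kvz_parallel_args wpp
kvz_parallel_args frame
```

The printed string can be appended to an encode command, for example ./kvazaar -i <input> --input-res <width>x<height> -o <output> followed by the chosen options.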
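On the source side, the code that handles the threads setting is mainly in cfg.c and encoder.c. The threads value is written (assigned) in four places; everywhere else it is only read. In kvz_config_parse, the function responsible for parsing command-line parameters, max_threads is derived from the number of logical cores of the CPU the encoder is running on, with a default of 4 cores.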
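The thread count can also be passed explicitly rather than left to auto-detection. A minimal sketch, assuming getconf _NPROCESSORS_ONLN is available (it is on typical Linux and macOS systems) and falling back to 4 to mirror the default mentioned above. logical_cores and kvz_thread_args are hypothetical helper names; --threads comes from the kvazaar help text. This is not how Kvazaar itself detects cores.

```shell
# Print the number of logical CPU cores, or 4 if it cannot be determined.
logical_cores() {
  n=$(getconf _NPROCESSORS_ONLN 2>/dev/null)
  case "$n" in
    ''|*[!0-9]*) n=4 ;;   # empty or non-numeric answer: use the fallback
  esac
  echo "$n"
}

# Print the kvazaar thread option for N threads (default: all logical cores).
kvz_thread_args() {
  echo "--threads ${1:-$(logical_cores)}"
}

kvz_thread_args      # uses the detected core count
kvz_thread_args 0    # per the help text, 0 runs everything on the main thread
```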
Overall, Kvazaar is a capable and flexible video encoder with efficient encoding performance and cross-platform support.
------ [From ERNIE Bot (文心一言)]
An open-source HEVC encoder licensed under 3-clause BSD
Join channel #ultravideo in Libera.Chat IRC network to contact us or come to our Discord
Kvazaar is still under development. Speed and RD-quality will continue to improve.
http://ultravideo.fi/#encoder for more information.
Related paper
Kvazaar 2.0: Fast and Efficient Open-Source HEVC Inter Encoder
Git repository
https://github.com/ultravideo/kvazaar
Test experiment
1. Download the source: git clone https://github.com/ultravideo/kvazaar
2. Install dependencies: following README.md, install the build dependencies for macOS
brew install automake
brew install libtool
brew install yasm
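Before starting the build, it can help to confirm that the tools are actually on PATH. A minimal sketch: missing_tools is a hypothetical helper, and libtool is left out of the example call because the command name it installs under can vary by platform.

```shell
# Print each named tool that is not found on PATH, one per line.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

# Empty output means both tools were found.
missing_tools automake yasm
```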
3. Build: the build produces the kvazaar executable and the library files
./autogen.sh
./configure
make
sudo make install
sudo ldconfig   # Linux only; macOS has no ldconfig, so skip this step there
4. Encoding parameters: run ./kvazaar --help in a terminal to print every available encoding option and how to use it.
Usage:
kvazaar -i <input> --input-res <width>x<height> -o <output>
Required:
-i, --input <filename> : Input file
--input-res <res> : Input resolution [auto]
- auto: Detect from file name.
- <int>x<int>: width times height
-o, --output <filename> : Output file
Presets:
--preset <preset> : Set options to a preset [medium]
- ultrafast, superfast, veryfast, faster,
fast, medium, slow, slower, veryslow
placebo
Input:
-n, --frames <integer> : Number of frames to code [all]
--seek <integer> : First frame to code [0]
--input-fps <num>[/<denom>] : Frame rate of the input video [25]
--source-scan-type <string> : Source scan type [progressive]
- progressive: Progressive scan
- tff: Top field first
- bff: Bottom field first
--input-format <string> : P420 or P400 [P420]
--input-bitdepth <int> : 8-16 [8]
--loop-input : Re-read input file forever.
--input-file-format <string> : Input file format [auto]
- auto: Check the file ending for format
- y4m (skips frame headers)
- yuv
Options:
--help : Print this help message and exit.
--version : Print version information and exit.
--(no-)aud : Use access unit delimiters. [disabled]
--debug <filename> : Output internal reconstruction.
--(no-)cpuid : Enable runtime CPU optimizations. [enabled]
--hash <string> : Decoded picture hash [checksum]
- none: 0 bytes
- checksum: 18 bytes
- md5: 56 bytes
--(no-)psnr : Calculate PSNR for frames. [enabled]
--(no-)info : Add encoder info SEI. [enabled]
--crypto <string> : Selective encryption. Crypto support must be
enabled at compile-time. Can be 'on' or 'off' or
a list of features separated with a '+'. [off]
- on: Enable all encryption features.
- off: Disable selective encryption.
- mvs: Motion vector magnitudes.
- mv_signs: Motion vector signs.
- trans_coeffs: Coefficient magnitudes.
- trans_coeff_signs: Coefficient signs.
- intra_pred_modes: Intra prediction modes.
--key <string> : Encryption key [16,213,27,56,255,127,242,112,
97,126,197,204,25,59,38,30]
--stats-file-prefix : A prefix used for stats files that include
bits, lambda, distortion, and qp for each ctu.
These are meant for debugging and are not
written unless the prefix is defined.
Video structure:
-q, --qp <integer> : Quantization parameter [22]
-p, --period <integer> : Period of intra pictures [64]
- 0: Only first picture is intra.
- 1: All pictures are intra.
- N: Every Nth picture is intra.
--vps-period <integer> : How often the video parameter set is re-sent [0]
- 0: Only send VPS with the first frame.
- N: Send VPS with every Nth intra frame.
-r, --ref <integer> : Number of reference frames, in range 1..15 [4]
--gop <string> : GOP structure [lp-g4d3t1]
- 0: Disabled
- 8: B-frame pyramid of length 8
- 16: B-frame pyramid of length 16
- lp-<string>: Low-delay P/B-frame GOP
(e.g. lp-g8d4t2, see README)
--intra-qp-offset <int>: QP offset for intra frames [-51..51] [auto]
- N: Set QP offset to N.
- auto: Select offset automatically based
on GOP length.
--(no-)open-gop : Use open GOP configuration. [enabled]
--cqmfile <filename> : Read custom quantization matrices from a file.
--scaling-list <string>: Set scaling list mode. [off]
- off: Disable scaling lists.
- custom: use custom list (with --cqmfile).
- default: Use default lists.
--bitrate <integer> : Target bitrate [0]
- 0: Disable rate control.
- N: Target N bits per second.
--rc-algorithm <string>: Select used rc-algorithm. [lambda]
- lambda: rate control from:
DOI: 10.1109/TIP.2014.2336550
- oba: DOI: 10.1109/TCSVT.2016.2589878
--(no-)intra-bits : Use Hadamard cost based allocation for intra
frames. Default on for gop 8 and off for lp-gop
--(no-)clip-neighbour : On oba based rate control whether to clip
lambda values to same frame's ctus or previous'.
Default on for RA GOPS and disabled for LP.
--(no-)lossless : Use lossless coding. [disabled]
--mv-constraint <string> : Constrain movement vectors. [none]
- none: No constraint
- frametile: Constrain within the tile.
- frametilemargin: Constrain even more.
--roi <filename> : Use a delta QP map for region of interest.
Reads an array of delta QP values from a file.
Text and binary files are supported and detected
from the file extension (.txt/.bin). If a known
extension is not found, the file is treated as
a text file. The file can include one or many
ROI frames each in the following format:
width and height of the QP delta map followed
by width * height delta QP values in raster
order. In binary format, width and height are
32-bit integers whereas the delta QP values are
signed 8-bit values. The map can be of any size
and will be scaled to the video size. The file
reading will loop if end of the file is reached.
See roi.txt in the examples folder.
--set-qp-in-cu : Set QP at CU level keeping pic_init_qp_minus26.
in PPS and slice_qp_delta in slize header zero.
--(no-)erp-aqp : Use adaptive QP for 360 degree video with
equirectangular projection. [disabled]
--level <number> : Use the given HEVC level in the output and give
an error if level limits are exceeded. [6.2]
- 1, 2, 2.1, 3, 3.1, 4, 4.1, 5, 5.1, 5.2, 6,
6.1, 6.2
--force-level <number> : Same as --level but warnings instead of errors.
--high-tier : Used with --level. Use high tier bitrate limits
instead of the main tier limits during encoding.
High tier requires level 4 or higher.
--(no-)vaq <integer> : Enable variance adaptive quantization with given
strength, in range 1..20. Recommended: 5.
[disabled]
Compression tools:
--(no-)deblock <beta:tc> : Deblocking filter. [0:0]
- beta: Between -6 and 6
- tc: Between -6 and 6
--sao <string> : Sample Adaptive Offset [full]
- off: SAO disabled
- band: Band offset only
- edge: Edge offset only
- full: Full SAO
--(no-)rdoq : Rate-distortion optimized quantization [enabled]
--(no-)rdoq-skip : Skip RDOQ for 4x4 blocks. [disabled]
--(no-)signhide : Sign hiding [disabled]
--(no-)smp : Symmetric motion partition [disabled]
--(no-)amp : Asymmetric motion partition [disabled]
--rd <integer> : Mode search complexity [0]
- 0: Skip intra if inter is good enough.
- 1: Rough intra mode search with SATD.
- 2: Refine mode search with SSE.
- 3: More SSE candidates for inter and
chroma mode search for 4x4 intra.
- 4: Even more SSE candidates for both.
- 5: Try all intra modes.
--(no-)mv-rdo : Rate-distortion optimized motion vector costs
[disabled]
--(no-)zero-coeff-rdo : If a CU is set inter, check if forcing zero
residual improves the RD cost. [enabled]
--(no-)full-intra-search : Try all intra modes during rough search.
[disabled]
--(no-)intra-chroma-search : Test non-derived intra chroma modes.
[disabled]
--(no-)transform-skip : Try transform skip [disabled]
--me <string> : Integer motion estimation algorithm [hexbs]
- hexbs: Hexagon Based Search
- tz: Test Zone Search
- full: Full Search
- full8, full16, full32, full64
- dia: Diamond Search
--me-steps <integer> : Motion estimation search step limit. Only
affects 'hexbs' and 'dia'. [-1]
--subme <integer> : Fractional pixel motion estimation level [4]
- 0: Integer motion estimation only
- 1: + 1/2-pixel horizontal and vertical
- 2: + 1/2-pixel diagonal
- 3: + 1/4-pixel horizontal and vertical
- 4: + 1/4-pixel diagonal
--(no-)fast-bipred : Only perform fast bipred search. [enabled]
--pu-depth-inter <int>-<int> : Inter prediction units sizes [0-3]
- 0, 1, 2, 3: from 64x64 to 8x8
- Accepts a list of values separated by ','
for setting separate depths per GOP layer
(values can be omitted to use the first
value for the respective layer).
--pu-depth-intra <int>-<int> : Intra prediction units sizes [1-4]
- 0, 1, 2, 3, 4: from 64x64 to 4x4
- Accepts a list of values separated by ','
for setting separate depths per GOP layer
(values can be omitted to use the first
value for the respective layer).
--ml-pu-depth-intra : Predict the pu-depth-intra using machine
learning trees, overrides the
--pu-depth-intra parameter. [disabled]
--(no-)combine-intra-cus: Whether the encoder tries to code a cu
on lower depth even when search is not
performed on said depth. Should only
be disabled if cus absolutely must not
be larger than limited by the search.
[enabled]
--force-inter : Force the encoder to use inter always.
This is mostly for debugging and is not
guaranteed to produce sensible bitstream or
work at all. [disabled]
--tr-depth-intra <int> : Transform split depth for intra blocks [0]
--(no-)bipred : Bi-prediction [disabled]
--cu-split-termination <string> : CU split search termination [zero]
- off: Don't terminate early.
- zero: Terminate when residual is zero.
--me-early-termination <string> : Motion estimation termination [on]
- off: Don't terminate early.
- on: Terminate early.
- sensitive: Terminate even earlier.
--fast-residual-cost <int> : Skip CABAC cost for residual coefficients
when QP is below the limit. [0]
--fast-coeff-table <string> : Read custom weights for residual
coefficients from a file instead of using
defaults [default]
--fast-rd-sampling : Enable learning data sampling for fast coefficient
table generation
--fastrd-accuracy-check : Evaluate the accuracy of fast coefficient
prediction
--fastrd-outdir : Directory to which to output sampled data or accuracy
data, into <fastrd-outdir>/0.txt to 50.txt, one file
for each QP that blocks were estimated on
--(no-)intra-rdo-et : Check intra modes in rdo stage only until
a zero coefficient CU is found. [disabled]
--(no-)early-skip : Try to find skip cu from merge candidates.
Perform no further search if skip is found.
For rd=0..1: Try the first candidate.
For rd=2.. : Try the best candidate based
on luma satd cost. [enabled]
--max-merge <integer> : Maximum number of merge candidates, 1..5 [5]
--(no-)implicit-rdpcm : Implicit residual DPCM. Currently only supported
with lossless coding. [disabled]
--(no-)tmvp : Temporal motion vector prediction [enabled]
Parallel processing:
--threads <integer> : Number of threads to use [auto]
- 0: Process everything with main thread.
- N: Use N threads for encoding.
- auto: Select automatically.
--owf <integer> : Frame-level parallelism [auto]
- N: Process N+1 frames at a time.
- auto: Select automatically.
--(no-)wpp : Wavefront parallel processing. [enabled]
Enabling tiles automatically disables WPP.
To enable WPP with tiles, re-enable it after
enabling tiles. Enabling wpp with tiles is,
however, an experimental feature since it is
not supported in any HEVC profile.
--tiles <int>x<int> : Split picture into width x height uniform tiles.
--tiles-width-split <string>|u<int> :
- <string>: A comma-separated list of tile
column pixel coordinates.
- u<int>: Number of tile columns of uniform
width.
--tiles-height-split <string>|u<int> :
- <string>: A comma-separated list of tile
row column pixel coordinates.
- u<int>: Number of tile rows of uniform
height.
--slices <string> : Control how slices are used.
- tiles: Put tiles in independent slices.
- wpp: Put rows in dependent slices.
- tiles+wpp: Do both.
--partial-coding <x-offset>!<y-offset>!<slice-width>!<slice-height>
: Encode partial frame.
Parts must be merged to form a valid bitstream.
X and Y are CTU offsets.
Slice width and height must be divisible by CTU
in pixels unless it is the last CTU row/column.
This parameter is used by kvaShare.
Video Usability Information:
--sar <width:height> : Specify sample aspect ratio
--overscan <string> : Specify crop overscan setting [undef]
- undef, show, crop
--videoformat <string> : Specify video format [undef]
- undef, component, pal, ntsc, secam, mac
--range <string> : Specify color range [tv]
- tv, pc
--colorprim <string> : Specify color primaries [undef]
- undef, bt709, bt470m, bt470bg,
smpte170m, smpte240m, film, bt2020
--transfer <string> : Specify transfer characteristics [undef]
- undef, bt709, bt470m, bt470bg,
smpte170m, smpte240m, linear, log100,
log316, iec61966-2-4, bt1361e,
iec61966-2-1, bt2020-10, bt2020-12
--colormatrix <string> : Specify color matrix setting [undef]
- undef, bt709, fcc, bt470bg, smpte170m,
smpte240m, GBR, YCgCo, bt2020nc, bt2020c
--chromaloc <integer> : Specify chroma sample location (0 to 5) [0]
Deprecated parameters: (might be removed at some point)
-w, --width <integer> : Use --input-res.
-h, --height <integer> : Use --input-res.
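As one concrete case from the help text, the --roi option describes a text file holding the map width and height followed by width * height delta QP values in raster order. The sketch below generates such a map with a constant delta. It assumes whitespace-separated integers, which the help text does not spell out, so roi.txt in the examples folder remains the authoritative layout. make_roi_map is a hypothetical helper.

```shell
# Print a WIDTH x HEIGHT delta-QP map: the size first, then one row per line.
# Every cell gets the same DELTA, purely for illustration.
make_roi_map() {
  width=$1 height=$2 delta=$3
  printf '%s %s\n' "$width" "$height"
  row=0
  while [ "$row" -lt "$height" ]; do
    col=0 line=""
    while [ "$col" -lt "$width" ]; do
      line="${line:+$line }$delta"   # append with a single space separator
      col=$((col + 1))
    done
    printf '%s\n' "$line"
    row=$((row + 1))
  done
}

make_roi_map 4 2 -3
```

Redirecting the output to a .txt file and passing that file to --roi would, per the help text, scale the map to the video size.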
5. Test: the standard test sequence vidyo1_720p_60.yuv is used as the test video, and the result is compared against the x265 encoder.
kvazaar
./kvazaar --input-res 1280x720 -i vidyo1_720p_60.yuv --seek 0 --preset veryfast -o yancey.hevc
Bitrate: 0.545 Mbps
FPS: 100.76
PSNR: 42.6514
B-frame setting: 7
x265
./x265 -o output.h265 --input input.yuv --fps 15 --input-res 1280x720 --preset ultrafast --psnr
Bitrate: 179.49 kb/s
FPS: 357.65
PSNR: 40.747
B-frame setting: 3
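In this simple comparison, Kvazaar still lags x265 in both coding performance and encoding complexity. It is far from what the paper describes: at the same quality, encoding 3 times faster than x265 at every preset while saving 10.7% bitrate.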
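The two commands above do not assume the same frame rate: x265 is given --fps 15, while the Kvazaar command leaves --input-fps unset, so per the help text it would fall back to 25. Because a reported bitrate scales with the assumed frame rate, one rough way to put the two numbers on the same footing is bits per frame. A minimal sketch, assuming those two frame rates and reading 0.545 Mbps as 545 kb/s. kbits_per_frame is a hypothetical helper, and the presets still differ, so this is not a like-for-like comparison.

```shell
# kbits_per_frame BITRATE_KBPS FPS: kilobits spent per frame, two decimals.
kbits_per_frame() {
  awk -v rate="$1" -v fps="$2" 'BEGIN { printf "%.2f\n", rate / fps }'
}

kbits_per_frame 545 25       # Kvazaar run: prints 21.80
kbits_per_frame 179.49 15    # x265 run:    prints 11.97
```

On that basis the gap in bits per frame is closer to 1.8x than the roughly 3x suggested by the raw bitrates.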
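6. Paper results:
Notes
- This is only a simple comparison for now, and the encoding parameters were not aligned with the settings described in the paper. A detailed analysis will follow when time allows; anyone interested can download the source and experiment.
- Test platform: Apple M1 Pro.
- x265 version: 3.5.
- Kvazaar version: built from the latest master source.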