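FFMPEG Structure Analysis: AVFrame
Note: I have written a series of articles analyzing FFMPEG's structures. Here is the full list:
FFMPEG Structure Analysis: AVFrame
FFMPEG Structure Analysis: AVFormatContext
FFMPEG Structure Analysis: AVCodecContext
FFMPEG Structure Analysis: AVIOContext
FFMPEG Structure Analysis: AVCodec
FFMPEG Structure Analysis: AVStream
FFMPEG Structure Analysis: AVPacket
FFMPEG has several key structures that cover protocol handling, demuxing, and decoding. They were analyzed in an earlier article and are not described again here.
Among them, AVFrame is one of the structures that carries the most bitstream parameters. This article analyzes in detail the meaning and purpose of its main fields.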
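First, here is the definition of the structure (located in avcodec.h in the FFMPEG version analyzed here; later releases moved it to libavutil/frame.h):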
- /*
-  * Lei Xiaohua (雷霄骅)
-  * [email protected]
-  * Communication University of China / Digital TV Technology
-  */
- /**
-  * Audio Video Frame.
-  * New fields can be added to the end of AVFRAME with minor version
-  * bumps. Similarly fields that are marked as to be only accessed by
-  * av_opt_ptr() can be reordered. This allows 2 forks to add fields
-  * without breaking compatibility with each other.
-  * Removal, reordering and changes in the remaining cases require
-  * a major version bump.
-  * sizeof(AVFrame) must not be used outside libavcodec.
-  */
- typedef struct AVFrame {
- #define AV_NUM_DATA_POINTERS 8
-     /** Image data
-      * pointer to the picture/channel planes.
-      * This might be different from the first allocated byte
-      * - encoding: Set by user
-      * - decoding: set by AVCodecContext.get_buffer()
-      */
-     uint8_t *data[AV_NUM_DATA_POINTERS];
-     /**
-      * Size, in bytes, of the data for each picture/channel plane.
-      *
-      * For audio, only linesize[0] may be set. For planar audio, each channel
-      * plane must be the same size.
-      *
-      * - encoding: Set by user
-      * - decoding: set by AVCodecContext.get_buffer()
-      */
-     int linesize[AV_NUM_DATA_POINTERS];
-     /**
-      * pointers to the data planes/channels.
-      *
-      * For video, this should simply point to data[].
-      *
-      * For planar audio, each channel has a separate data pointer, and
-      * linesize[0] contains the size of each channel buffer.
-      * For packed audio, there is just one data pointer, and linesize[0]
-      * contains the total size of the buffer for all channels.
-      *
-      * Note: Both data and extended_data will always be set by get_buffer(),
-      * but for planar audio with more channels that can fit in data,
-      * extended_data must be used by the decoder in order to access all
-      * channels.
-      *
-      * encoding: unused
-      * decoding: set by AVCodecContext.get_buffer()
-      */
-     uint8_t **extended_data;
-     /** Width and height
-      * width and height of the video frame
-      * - encoding: unused
-      * - decoding: Read by user.
-      */
-     int width, height;
-     /**
-      * number of audio samples (per channel) described by this frame
-      * - encoding: Set by user
-      * - decoding: Set by libavcodec
-      */
-     int nb_samples;
-     /**
-      * format of the frame, -1 if unknown or unset
-      * Values correspond to enum AVPixelFormat for video frames,
-      * enum AVSampleFormat for audio)
-      * - encoding: unused
-      * - decoding: Read by user.
-      */
-     int format;
-     /** Whether this is a key frame
-      * 1 -> keyframe, 0-> not
-      * - encoding: Set by libavcodec.
-      * - decoding: Set by libavcodec.
-      */
-     int key_frame;
-     /** Frame type (I, B, P)
-      * Picture type of the frame, see ?_TYPE below.
-      * - encoding: Set by libavcodec. for coded_picture (and set by user for input).
-      * - decoding: Set by libavcodec.
-      */
-     enum AVPictureType pict_type;
-     /**
-      * pointer to the first allocated byte of the picture. Can be used in get_buffer/release_buffer.
-      * This isn't used by libavcodec unless the default get/release_buffer() is used.
-      * - encoding:
-      * - decoding:
-      */
-     uint8_t *base[AV_NUM_DATA_POINTERS];
-     /**
-      * sample aspect ratio for the video frame, 0/1 if unknown/unspecified
-      * - encoding: unused
-      * - decoding: Read by user.
-      */
-     AVRational sample_aspect_ratio;
-     /**
-      * presentation timestamp in time_base units (time when frame should be shown to user)
-      * If AV_NOPTS_VALUE then frame_rate = 1/time_base will be assumed.
-      * - encoding: MUST be set by user.
-      * - decoding: Set by libavcodec.
-      */
-     int64_t pts;
-     /**
-      * reordered pts from the last AVPacket that has been input into the decoder
-      * - encoding: unused
-      * - decoding: Read by user.
-      */
-     int64_t pkt_pts;
-     /**
-      * dts from the last AVPacket that has been input into the decoder
-      * - encoding: unused
-      * - decoding: Read by user.
-      */
-     int64_t pkt_dts;
-     /**
-      * picture number in bitstream order
-      * - encoding: set by
-      * - decoding: Set by libavcodec.
-      */
-     int coded_picture_number;
-     /**
-      * picture number in display order
-      * - encoding: set by
-      * - decoding: Set by libavcodec.
-      */
-     int display_picture_number;
-     /**
-      * quality (between 1 (good) and FF_LAMBDA_MAX (bad))
-      * - encoding: Set by libavcodec. for coded_picture (and set by user for input).
-      * - decoding: Set by libavcodec.
-      */
-     int quality;
-     /**
-      * is this picture used as reference
-      * The values for this are the same as the MpegEncContext.picture_structure
-      * variable, that is 1->top field, 2->bottom field, 3->frame/both fields.
-      * Set to 4 for delayed, non-reference frames.
-      * - encoding: unused
-      * - decoding: Set by libavcodec. (before get_buffer() call)).
-      */
-     int reference;
-     /** QP table
-      * QP table
-      * - encoding: unused
-      * - decoding: Set by libavcodec.
-      */
-     int8_t *qscale_table;
-     /**
-      * QP store stride
-      * - encoding: unused
-      * - decoding: Set by libavcodec.
-      */
-     int qstride;
-     /**
-      *
-      */
-     int qscale_type;
-     /** Skipped-macroblock table
-      * mbskip_table[mb]>=1 if MB didn't change
-      * stride= mb_width = (width+15)>>4
-      * - encoding: unused
-      * - decoding: Set by libavcodec.
-      */
-     uint8_t *mbskip_table;
-     /** Motion vector table
-      * motion vector table
-      * @code
-      * example:
-      * int mv_sample_log2 = 4 - motion_subsample_log2;
-      * int mb_width = (width+15)>>4;
-      * int mv_stride = (mb_width << mv_sample_log2) + 1;
-      * motion_val[direction][x + y*mv_stride][0->mv_x, 1->mv_y];
-      * @endcode
-      * - encoding: Set by user.
-      * - decoding: Set by libavcodec.
-      */
-     int16_t (*motion_val[2])[2];
-     /** Macroblock type table
-      * macroblock type table
-      * mb_type_base + mb_width + 2
-      * - encoding: Set by user.
-      * - decoding: Set by libavcodec.
-      */
-     uint32_t *mb_type;
-     /** DCT coefficients
-      * DCT coefficients
-      * - encoding: unused
-      * - decoding: Set by libavcodec.
-      */
-     short *dct_coeff;
-     /** Reference frame list
-      * motion reference frame index
-      * the order in which these are stored can depend on the codec.
-      * - encoding: Set by user.
-      * - decoding: Set by libavcodec.
-      */
-     int8_t *ref_index[2];
-     /**
-      * for some private data of the user
-      * - encoding: unused
-      * - decoding: Set by user.
-      */
-     void *opaque;
-     /**
-      * error
-      * - encoding: Set by libavcodec. if flags&CODEC_FLAG_PSNR.
-      * - decoding: unused
-      */
-     uint64_t error[AV_NUM_DATA_POINTERS];
-     /**
-      * type of the buffer (to keep track of who has to deallocate data[*])
-      * - encoding: Set by the one who allocates it.
-      * - decoding: Set by the one who allocates it.
-      * Note: User allocated (direct rendering) & internal buffers cannot coexist currently.
-      */
-     int type;
-     /**
-      * When decoding, this signals how much the picture must be delayed.
-      * extra_delay = repeat_pict / (2*fps)
-      * - encoding: unused
-      * - decoding: Set by libavcodec.
-      */
-     int repeat_pict;
-     /**
-      * The content of the picture is interlaced.
-      * - encoding: Set by user.
-      * - decoding: Set by libavcodec. (default 0)
-      */
-     int interlaced_frame;
-     /**
-      * If the content is interlaced, is top field displayed first.
-      * - encoding: Set by user.
-      * - decoding: Set by libavcodec.
-      */
-     int top_field_first;
-     /**
-      * Tell user application that palette has changed from previous frame.
-      * - encoding: ??? (no palette-enabled encoder yet)
-      * - decoding: Set by libavcodec. (default 0).
-      */
-     int palette_has_changed;
-     /**
-      * codec suggestion on buffer type if != 0
-      * - encoding: unused
-      * - decoding: Set by libavcodec. (before get_buffer() call)).
-      */
-     int buffer_hints;
-     /**
-      * Pan scan.
-      * - encoding: Set by user.
-      * - decoding: Set by libavcodec.
-      */
-     AVPanScan *pan_scan;
-     /**
-      * reordered opaque 64bit (generally an integer or a double precision float
-      * PTS but can be anything).
-      * The user sets AVCodecContext.reordered_opaque to represent the input at
-      * that time,
-      * the decoder reorders values as needed and sets AVFrame.reordered_opaque
-      * to exactly one of the values provided by the user through AVCodecContext.reordered_opaque
-      * @deprecated in favor of pkt_pts
-      * - encoding: unused
-      * - decoding: Read by user.
-      */
-     int64_t reordered_opaque;
-     /**
-      * hardware accelerator private data (FFmpeg-allocated)
-      * - encoding: unused
-      * - decoding: Set by libavcodec
-      */
-     void *hwaccel_picture_private;
-     /**
-      * the AVCodecContext which ff_thread_get_buffer() was last called on
-      * - encoding: Set by libavcodec.
-      * - decoding: Set by libavcodec.
-      */
-     struct AVCodecContext *owner;
-     /**
-      * used by multithreading to store frame-specific info
-      * - encoding: Set by libavcodec.
-      * - decoding: Set by libavcodec.
-      */
-     void *thread_opaque;
-     /**
-      * log2 of the size of the block which a single vector in motion_val represents:
-      * (4->16x16, 3->8x8, 2-> 4x4, 1-> 2x2)
-      * - encoding: unused
-      * - decoding: Set by libavcodec.
-      */
-     uint8_t motion_subsample_log2;
-     /** (Audio) sample rate
-      * Sample rate of the audio data.
-      *
-      * - encoding: unused
-      * - decoding: read by user
-      */
-     int sample_rate;
-     /**
-      * Channel layout of the audio data.
-      *
-      * - encoding: unused
-      * - decoding: read by user.
-      */
-     uint64_t channel_layout;
-     /**
-      * frame timestamp estimated using various heuristics, in stream time base
-      * Code outside libavcodec should access this field using:
-      * av_frame_get_best_effort_timestamp(frame)
-      * - encoding: unused
-      * - decoding: set by libavcodec, read by user.
-      */
-     int64_t best_effort_timestamp;
-     /**
-      * reordered pos from the last AVPacket that has been input into the decoder
-      * Code outside libavcodec should access this field using:
-      * av_frame_get_pkt_pos(frame)
-      * - encoding: unused
-      * - decoding: Read by user.
-      */
-     int64_t pkt_pos;
-     /**
-      * duration of the corresponding packet, expressed in
-      * AVStream->time_base units, 0 if unknown.
-      * Code outside libavcodec should access this field using:
-      * av_frame_get_pkt_duration(frame)
-      * - encoding: unused
-      * - decoding: Read by user.
-      */
-     int64_t pkt_duration;
-     /**
-      * metadata.
-      * Code outside libavcodec should access this field using:
-      * av_frame_get_metadata(frame)
-      * - encoding: Set by user.
-      * - decoding: Set by libavcodec.
-      */
-     AVDictionary *metadata;
-     /**
-      * decode error flags of the frame, set to a combination of
-      * FF_DECODE_ERROR_xxx flags if the decoder produced a frame, but there
-      * were errors during the decoding.
-      * Code outside libavcodec should access this field using:
-      * av_frame_get_decode_error_flags(frame)
-      * - encoding: unused
-      * - decoding: set by libavcodec, read by user.
-      */
-     int decode_error_flags;
- #define FF_DECODE_ERROR_INVALID_BITSTREAM 1
- #define FF_DECODE_ERROR_MISSING_REFERENCE 2
-     /**
-      * number of audio channels, only used for audio.
-      * Code outside libavcodec should access this field using:
-      * av_frame_get_channels(frame)
-      * - encoding: unused
-      * - decoding: Read by user.
-      */
-     int64_t channels;
- } AVFrame;
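The AVFrame structure is generally used to store raw data (that is, uncompressed data: YUV or RGB for video, PCM for audio), together with some related information. For example, during decoding it holds the macroblock type table, the QP table, the motion vector table, and similar data. During encoding it also holds related data. This makes AVFrame a very important structure when using FFMPEG for bitstream analysis.
The main fields are described below (considering the decoding case):
uint8_t *data[AV_NUM_DATA_POINTERS]: the decoded raw data (YUV or RGB for video, PCM for audio)
int linesize[AV_NUM_DATA_POINTERS]: the size in bytes of one "row" of data in data. Note: this is not necessarily equal to the image width; because of alignment padding it is usually greater than or equal to it.
int width, height: width and height of the video frame (1920x1080, 1280x720...)
int nb_samples: the number of audio samples (per channel) contained in this AVFrame
int format: format of the decoded raw data (YUV420, YUV422, RGB24...)
int key_frame: whether this is a key frame
enum AVPictureType pict_type: frame type (I, B, P...)
AVRational sample_aspect_ratio: sample aspect ratio, that is, the width-to-height ratio of a single pixel (1:1, 16:15...). This is not the display aspect ratio (16:9, 4:3...), which also depends on the frame's width and height.
int64_t pts: presentation timestamp
int coded_picture_number: picture number in coding (bitstream) order
int display_picture_number: picture number in display order
int8_t *qscale_table: QP table
uint8_t *mbskip_table: skipped-macroblock table
int16_t (*motion_val[2])[2]: motion vector table
uint32_t *mb_type: macroblock type table
short *dct_coeff: DCT coefficients (I have not extracted these)
int8_t *ref_index[2]: motion-estimation reference frame list (it seems only newer standards such as H.264 involve multiple reference frames)
int interlaced_frame: whether the content is interlaced
uint8_t motion_subsample_log2: log2 of the size of the block that one motion vector represents
The other fields are not listed one by one; the source code documents them in detail. The following fields need some further explanation: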
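1.data[]
For packed formats (for example RGB24), all data is stored in data[0].
For planar formats (for example YUV420P), the data is split across data[0], data[1], data[2]... (for YUV420P, data[0] holds Y, data[1] holds U, and data[2] holds V).
For details, see: FFMPEG: Converting Between Raw Image Formats such as YUV and RGB (swscale)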
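Because linesize can be larger than the visible row width, the planes have to be read row by row. The following is a minimal sketch of packing a YUV420P picture into one contiguous buffer. It uses plain pointers and arrays in place of an actual AVFrame, and the helper names copy_plane and pack_yuv420p are hypothetical, not FFMPEG functions.

```c
#include <stdint.h>
#include <string.h>

/* Copy one plane row by row into a tightly packed buffer.
 * src and linesize play the role of frame->data[i] and frame->linesize[i];
 * row_bytes is the number of meaningful bytes in each row. */
static void copy_plane(uint8_t *dst, const uint8_t *src,
                       int linesize, int row_bytes, int rows)
{
    for (int y = 0; y < rows; y++)
        memcpy(dst + (size_t)y * row_bytes,
               src + (size_t)y * linesize,
               (size_t)row_bytes);
}

/* Pack a YUV420P picture: the Y plane is w x h, the U and V planes are
 * subsampled by 2 in both directions. Returns the number of bytes written. */
static size_t pack_yuv420p(uint8_t *dst, uint8_t *const data[3],
                           const int linesize[3], int w, int h)
{
    int cw = (w + 1) / 2;   /* chroma width  */
    int ch = (h + 1) / 2;   /* chroma height */
    size_t off = 0;

    copy_plane(dst + off, data[0], linesize[0], w, h);
    off += (size_t)w * h;
    copy_plane(dst + off, data[1], linesize[1], cw, ch);
    off += (size_t)cw * ch;
    copy_plane(dst + off, data[2], linesize[2], cw, ch);
    off += (size_t)cw * ch;
    return off;
}
```

With a real decoded frame, the arguments would be frame->data, frame->linesize, frame->width and frame->height. A single memcpy over a whole plane is only safe when linesize equals the row width.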
2.pict_type
It takes one of the following values:
- enum AVPictureType {
-     AV_PICTURE_TYPE_NONE = 0, ///< Undefined
-     AV_PICTURE_TYPE_I,        ///< Intra
-     AV_PICTURE_TYPE_P,        ///< Predicted
-     AV_PICTURE_TYPE_B,        ///< Bi-dir predicted
-     AV_PICTURE_TYPE_S,        ///< S(GMC)-VOP MPEG4
-     AV_PICTURE_TYPE_SI,       ///< Switching Intra
-     AV_PICTURE_TYPE_SP,       ///< Switching Predicted
-     AV_PICTURE_TYPE_BI,       ///< BI type
- };
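When logging frames during bitstream analysis, it is convenient to print the picture type as a single letter. The sketch below mirrors the enum above in a local, hypothetical enum PictType so that it compiles on its own; a real program would include the FFMPEG headers and could use libavutil's av_get_picture_type_char() instead.

```c
/* Local mirror of the values listed above (hypothetical names). */
enum PictType {
    PICT_NONE = 0, PICT_I, PICT_P, PICT_B,
    PICT_S, PICT_SI, PICT_SP, PICT_BI
};

/* Map a picture type to a one-letter label for log output. */
static char pict_type_char(enum PictType t)
{
    switch (t) {
    case PICT_I:  return 'I';
    case PICT_P:  return 'P';
    case PICT_B:  return 'B';
    case PICT_S:  return 'S';
    case PICT_SI: return 'i';
    case PICT_SP: return 'p';
    case PICT_BI: return 'b';
    default:      return '?';   /* undefined or out of range */
    }
}
```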
3.sample_aspect_ratio
The aspect ratio is a fraction. FFMPEG uses AVRational to represent fractions:
- /**
-  * rational number numerator/denominator
-  */
- typedef struct AVRational {
-     int num; ///< numerator
-     int den; ///< denominator
- } AVRational;
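sample_aspect_ratio describes the shape of a single pixel, so the aspect ratio of the displayed picture has to be derived from it together with the frame size: DAR = (width * sar.num) / (height * sar.den). The following is a minimal sketch of that calculation. Rational is a hypothetical stand-in for AVRational, and treating an unknown ratio (0/1) as square pixels is an assumption made for this example.

```c
#include <stdint.h>

typedef struct Rational { int num; int den; } Rational; /* stand-in for AVRational */

static int64_t gcd64(int64_t a, int64_t b)
{
    while (b) { int64_t t = a % b; a = b; b = t; }
    return a;
}

/* Display aspect ratio = (width * sar.num) / (height * sar.den), reduced. */
static Rational display_aspect_ratio(int width, int height, Rational sar)
{
    Rational dar = { 0, 1 };
    if (sar.num <= 0 || sar.den <= 0) {   /* unknown: assume square pixels */
        sar.num = 1;
        sar.den = 1;
    }
    int64_t num = (int64_t)width * sar.num;
    int64_t den = (int64_t)height * sar.den;
    int64_t g = gcd64(num, den);
    if (g > 0) {
        dar.num = (int)(num / g);
        dar.den = (int)(den / g);
    }
    return dar;
}
```

For example, a 720x576 frame with a sample aspect ratio of 16:15 is displayed at 4:3, and a 1920x1080 frame with square pixels at 16:9.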
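4.qscale_table
The QP table points to a block of memory that stores the QP value of every macroblock. Macroblocks are numbered from left to right, row by row, and each macroblock has one QP value.
qscale_table[0] is the QP of the macroblock at row 1, column 1; qscale_table[1] is the QP of the macroblock at row 1, column 2; qscale_table[2] is the QP of the macroblock at row 1, column 3; and so on.
The size of the table is calculated as follows:
Note: a macroblock is 16x16 pixels.
Table entries per row (the number of macroblocks per row, rounded up, plus one padding entry; in FFMPEG's MPEG-style decoders this stride is also what the qstride field reports):
- int mb_stride = ((pCodecCtx->width + 15) >> 4) + 1;
Total number of table entries:
- int mb_sum = ((pCodecCtx->height + 15) >> 4) * (((pCodecCtx->width + 15) >> 4) + 1);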
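The row-by-row layout can be sketched as follows, assuming the table is read with a row stride as given by the frame's qstride field. The helpers qp_at and average_qp are hypothetical names for this example, and the parameters stand for the AVFrame fields of the same name.

```c
#include <stdint.h>

/* QP of the macroblock at column mb_x, row mb_y (both zero-based). */
static int qp_at(const int8_t *qscale_table, int qstride, int mb_x, int mb_y)
{
    return qscale_table[mb_x + mb_y * qstride];
}

/* Average QP over all macroblocks of a width x height picture.
 * Padding entries beyond mb_width in each row are skipped. */
static double average_qp(const int8_t *qscale_table, int qstride,
                         int width, int height)
{
    int mb_width  = (width  + 15) >> 4;
    int mb_height = (height + 15) >> 4;
    long sum = 0;

    for (int y = 0; y < mb_height; y++)
        for (int x = 0; x < mb_width; x++)
            sum += qp_at(qscale_table, qstride, x, y);

    long count = (long)mb_width * mb_height;
    return count ? (double)sum / (double)count : 0.0;
}
```

Iterating with the stride rather than with the macroblock count per row keeps the padding entry at the end of each row out of the result.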
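5.motion_subsample_log2
The size of the picture area that one motion vector represents (expressed as its width or height, in pixels). Note that the stored value is the log2 of that size.
The code comment gives the following values:
4->16x16, 3->8x8, 2->4x4, 1->2x2
That is, when one motion vector represents a 16x16 area the value is 4; when it represents an 8x8 area the value is 3; and so on.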
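6.motion_val
The motion vector table stores all the motion vectors of one video frame.
It is stored in a rather unusual way:
- int16_t (*motion_val[2])[2];
The comment provides a piece of code:
- int mv_sample_log2 = 4 - motion_subsample_log2;
- int mb_width = (width + 15) >> 4;
- int mv_stride = (mb_width << mv_sample_log2) + 1;
- motion_val[direction][x + y * mv_stride][0->mv_x, 1->mv_y];
From this, the layout of the data is roughly clear:
1. The data is first split into two lists, L0 and L1.
2. Each list (L0 or L1) stores a sequence of MVs (each MV corresponds to one block, whose size is determined by motion_subsample_log2).
3. Each MV has a horizontal and a vertical component (x, y).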
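Note that in FFMPEG the storage of MVs is not tied to the storage of MBs. The first MV belongs to the block in the top-left corner of the picture (the block size depends on motion_subsample_log2), the second MV belongs to the block at row 1, column 2, and so on. The motion vectors of a single macroblock (16x16) are therefore likely to be laid out as shown below (line is the number of motion vectors in one row):
- // Example: motion vectors of an 8x8 partition and their macroblock:
- // -------------------------------
- // |              |              |
- // | mv[x]        | mv[x+1]      |
- // -------------------------------
- // |              |              |
- // | mv[x+line]   | mv[x+line+1] |
- // -------------------------------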
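The indexing described above can be sketched as follows. The stride formula comes from the comment in the structure definition. The helpers mv_stride, mb_first_mv_index and read_mv are hypothetical names for this example, and mapping a macroblock to its first vector by shifting its coordinates by mv_sample_log2 is an assumption derived from the layout described here.

```c
#include <stdint.h>

/* Number of table entries per row of motion vectors
 * ("line" in the diagram above). */
static int mv_stride(int width, int motion_subsample_log2)
{
    int mv_sample_log2 = 4 - motion_subsample_log2;
    int mb_width = (width + 15) >> 4;
    return (mb_width << mv_sample_log2) + 1;
}

/* Index of the top-left motion vector of macroblock (mb_x, mb_y). */
static int mb_first_mv_index(int width, int motion_subsample_log2,
                             int mb_x, int mb_y)
{
    int mv_sample_log2 = 4 - motion_subsample_log2;
    int stride = mv_stride(width, motion_subsample_log2);
    return (mb_x << mv_sample_log2) + (mb_y << mv_sample_log2) * stride;
}

/* Read the vector at MV-grid position (x, y) from one list,
 * where list corresponds to motion_val[0] or motion_val[1]. */
static void read_mv(int16_t (*list)[2], int stride, int x, int y,
                    int *mv_x, int *mv_y)
{
    *mv_x = list[x + y * stride][0];
    *mv_y = list[x + y * stride][1];
}
```

For a 1920-pixel-wide frame with motion_subsample_log2 equal to 3 (8x8 blocks), mv_stride is (120 << 1) + 1 = 241, and each macroblock covers a 2x2 group of vectors starting at the index returned by mb_first_mv_index.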
7.mb_type
The macroblock type table stores the type of every macroblock in one video frame. It is stored in much the same way as the QP table, except that its entries are of type uint32_t while those of the QP table are of type int8_t. Each macroblock has one type value.
The macroblock types are defined as follows:
- //The following defines may change, don't expect compatibility if you use them.
- #define MB_TYPE_INTRA4x4   0x0001
- #define MB_TYPE_INTRA16x16 0x0002 //FIXME H.264-specific
- #define MB_TYPE_INTRA_PCM  0x0004 //FIXME H.264-specific
- #define MB_TYPE_16x16      0x0008
- #define MB_TYPE_16x8       0x0010
- #define MB_TYPE_8x16       0x0020
- #define MB_TYPE_8x8        0x0040
- #define MB_TYPE_INTERLACED 0x0080
- #define MB_TYPE_DIRECT2    0x0100 //FIXME
- #define MB_TYPE_ACPRED     0x0200
- #define MB_TYPE_GMC        0x0400
- #define MB_TYPE_SKIP       0x0800
- #define MB_TYPE_P0L0       0x1000
- #define MB_TYPE_P1L0       0x2000
- #define MB_TYPE_P0L1       0x4000
- #define MB_TYPE_P1L1       0x8000
- #define MB_TYPE_L0         (MB_TYPE_P0L0 | MB_TYPE_P1L0)
- #define MB_TYPE_L1         (MB_TYPE_P0L1 | MB_TYPE_P1L1)
- #define MB_TYPE_L0L1       (MB_TYPE_L0   | MB_TYPE_L1)
- #define MB_TYPE_QUANT      0x00010000
- #define MB_TYPE_CBP        0x00020000
- //Note bits 24-31 are reserved for codec specific use (h264 ref0, mpeg1 0mv, ...)
Note: a macroblock can carry several type flags at once, but some flags are mutually exclusive. For example, a macroblock cannot be both 16x16 and 8x8.
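Since the type value is a set of bit flags, each property is tested with a bitwise AND. The sketch below copies a few of the flag values from the listing above so that it compiles on its own (FFMPEG warns that these values may change), and the helper names are hypothetical.

```c
#include <stdint.h>

/* Values copied from the listing above; they may differ between versions. */
#define MB_TYPE_INTRA4x4   0x0001
#define MB_TYPE_INTRA16x16 0x0002
#define MB_TYPE_INTRA_PCM  0x0004
#define MB_TYPE_16x16      0x0008
#define MB_TYPE_16x8       0x0010
#define MB_TYPE_8x16       0x0020
#define MB_TYPE_8x8        0x0040
#define MB_TYPE_SKIP       0x0800
#define MB_TYPE_P0L0       0x1000
#define MB_TYPE_P1L0       0x2000
#define MB_TYPE_P0L1       0x4000
#define MB_TYPE_P1L1       0x8000
#define MB_TYPE_L0         (MB_TYPE_P0L0 | MB_TYPE_P1L0)
#define MB_TYPE_L1         (MB_TYPE_P0L1 | MB_TYPE_P1L1)

/* Nonzero if any intra flag is set. */
static int mb_is_intra(uint32_t t)
{
    return (t & (MB_TYPE_INTRA4x4 | MB_TYPE_INTRA16x16 | MB_TYPE_INTRA_PCM)) != 0;
}

/* Nonzero if the macroblock is marked as skipped. */
static int mb_is_skip(uint32_t t)
{
    return (t & MB_TYPE_SKIP) != 0;
}

/* Nonzero if the macroblock predicts from list 0 (list == 0) or list 1. */
static int mb_uses_list(uint32_t t, int list)
{
    return (t & (list == 0 ? MB_TYPE_L0 : MB_TYPE_L1)) != 0;
}

/* One-character label for the inter partition size, '-' if none is set. */
static char mb_partition_char(uint32_t t)
{
    if (t & MB_TYPE_16x16) return 'W';  /* whole macroblock */
    if (t & MB_TYPE_16x8)  return 'H';  /* two horizontal halves */
    if (t & MB_TYPE_8x16)  return 'V';  /* two vertical halves */
    if (t & MB_TYPE_8x8)   return 'Q';  /* four quarters */
    return '-';
}
```

A visualization tool could call these helpers on every entry of mb_type to choose the color or label drawn for each macroblock.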
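8.ref_index
The motion-estimation reference frame list stores the reference frame indices of all macroblocks in one video frame. This list is of little use in older compression standards; only standards such as H.264 have the concept of multiple reference frames. I have not fully worked out this field yet. I only know that each macroblock has 4 of these values and that each value is the index of a reference frame. I will study it in more detail when I get the chance.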
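Below are the results produced by a bitstream analysis program I wrote. It visualizes the tables described above (drawn here with MFC drawing functions).
Video frame:
Result of extracting the QP values:
Beautified version (with color added):
Result of extracting the macroblock types:
Beautified version (with color added for clarity; s denotes a skipped macroblock):
Result of extracting the motion vectors (List0 here):
Result of extracting the motion-estimation reference frame indices: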