In an earlier article, "SkeyeRTSPLive: Bringing Traditional Video Surveillance to the Internet", I described the RTSP-to-RTMP restreaming process. The simplified flow is: pull the RTSP stream with SkeyeRTSPClient, obtain the encoded audio and video data, then push it out with SkeyeRTMPPusher. That sounds simple, but in actual development we found the process is not as straightforward as it seems. First, the RTSP protocol supports many audio and video encodings: audio can be AAC, G711, G726, and so on, and video can be H264, H265, MJPEG, MPEG, and others, while SkeyeRTMPPusher only supports pushing H264 (since extended to H265). For audio, we can transcode to AAC with SkeyeAACEncoder; for video, we can decode to raw frames with SkeyeVideoDecoder and then re-encode the raw data into the format RTMP pushing requires with SkeyeVideoEncoder. In this article we focus on the SkeyeVideoDecoder decoding pipeline built on Nvidia discrete GPUs.
SkeyeNvDecoder: SkeyeVideoDecoder's hardware decoding library for Nvidia discrete GPUs
The SkeyeNvDecoder library is a hardware decoder built on the Nvidia GPU driver. It is highly efficient and has strong parallel decoding capability: its throughput is at least 5-6x that of ffmpeg software decoding, and on the latest RTX-series cards it can reach 10-12x, easily handling many channels of 4K or even 8K video. This article uses the newest GPU driver available at the time of writing (2019-07-14); CUDA 10.0 or later is required.
1. The interface declarations are as follows:
#ifndef SKEYENVDECODERAPI_H
#define SKEYENVDECODERAPI_H

#include <string>

//++ typedefine start
#ifndef SKEYENVDECODER_HANDLE
#define SKEYENVDECODER_HANDLE void*
#endif//SKEYENVDECODER_HANDLE

typedef enum _OutputFormat //native = the decoder's default NV12 output
{
    native = 0, bgrp, rgbp, bgra, rgba, bgra64, rgba64
}OutputFormat;

typedef enum _SKEYENvDecoder_CodecType {
    // mirrors cudaVideoCodec, so values can be cast directly to the CUVID codec enum
    SKEYENvDecoder_Codec_MPEG1 = 0, SKEYENvDecoder_Codec_MPEG2, SKEYENvDecoder_Codec_MPEG4,
    SKEYENvDecoder_Codec_VC1, SKEYENvDecoder_Codec_H264, SKEYENvDecoder_Codec_JPEG,
    SKEYENvDecoder_Codec_H264_SVC, SKEYENvDecoder_Codec_H264_MVC, SKEYENvDecoder_Codec_HEVC,
    SKEYENvDecoder_Codec_VP8, SKEYENvDecoder_Codec_VP9,
} SKEYENvDecoder_CodecType;

#endif//SKEYENVDECODERAPI_H
2. Calling flow of the SkeyeNvDecoder decoding library
- Step 1: initialize and register the decoder

Note: the decoder registration function only needs to be called once globally.

int SKEYENvDecoder_Initsize(string &erroStr)
{
	try {
		if (!isInitsized) { // enumerate and initialize the GPUs only once
			ck(cuInit(0));
			int nGpu = 0;
			ck(cuDeviceGetCount(&nGpu));
			for (int i = 0; i < nGpu; i++) {
				CUdevice cuDevice = 0;
				ck(cuDeviceGet(&cuDevice, i));
				char szDeviceName[80];
				ck(cuDeviceGetName(szDeviceName, sizeof(szDeviceName), cuDevice));
				CUcontext cuContext = NULL;
				ck(cuCtxCreate(&cuContext, 0, cuDevice));
				m_ctxV.push_back({ cuContext, szDeviceName }); // one CUDA context per GPU
			}
			isInitsized = true;
		}
	}
	catch (std::exception &e) {
		erroStr = e.what();
		return -1;
	}
	return 0;
}
- Step 2: create a decoder instance
SKEYENVDECODER_HANDLE SKEYENvDecoder_Create(SKEYENvDecoder_CodecType codec, int videoW, int videoH, bool bLowLatency, OutputFormat eOutputFormat, int& errCode, string &erroStr)
{
	try {
		ck(cuInit(0));
		int nGpu = 0;
		ck(cuDeviceGetCount(&nGpu));
		// pick the next GPU in round-robin order
		m_curIndex++;
		m_curIndex = m_curIndex % nGpu;
		std::pair<CUcontext, std::string> &v = m_ctxV.at(m_curIndex % m_ctxV.size());
		CUcontext cuContext = v.first;
		// query the selected GPU's decode capabilities for this codec
		CUVIDDECODECAPS videoDecodeCaps = {};
		videoDecodeCaps.eCodecType = (cudaVideoCodec)codec;
		videoDecodeCaps.eChromaFormat = cudaVideoChromaFormat_420;
		videoDecodeCaps.nBitDepthMinus8 = 0;
		ck(cuCtxPushCurrent(cuContext));
		ck(cuvidGetDecoderCaps(&videoDecodeCaps));
		ck(cuCtxPopCurrent(NULL));
		if (!videoDecodeCaps.bIsSupported) {
			erroStr = "Codec not supported on this GPU Decoder";
			errCode = -1;
		}
		else if (videoDecodeCaps.nMaxWidth >= videoW && videoDecodeCaps.nMaxHeight >= videoH)
		{
			NvDecoder* pDecoder = new NvDecoder(cuContext, videoW, videoH, eOutputFormat != native,
				(cudaVideoCodec)codec, NULL, bLowLatency, eOutputFormat);
			pDecoder->Start();
			return pDecoder;
		}
		else
		{
			erroStr = "Width and height not supported on this GPU Decoder";
			errCode = -2;
		}
	}
	catch (std::exception &e)
	{
		erroStr = e.what();
	}
	return NULL;
}
- Step 3: call the decode function to decode data
int SKEYENvDecoder_Decode(SKEYENVDECODER_HANDLE handle, const uint8_t *pData, int nSize, uint8_t ***pppFrame, int* pnFrameLen, int *pnFrameReturned)
{
if (!handle)
return -1;
NvDecoder* pDecoder = (NvDecoder*)handle;
	int anSize[] = { 0, 3, 3, 4, 4, 8, 8 }; // bytes per pixel for each OutputFormat
	std::vector<uint8_t *>* vecOutBuffer = pDecoder->GetFrameBufferVector();
size_t nFrameSize = pDecoder->GetOutFrameSize();
*pnFrameLen = nFrameSize;
int nFrameReturned = 0, nFrame = 0;
uint8_t **ppFrame = NULL;
bool bLowLatency = pDecoder->IsSetLowLatency();
bool bSuc = pDecoder->Decode(pData, nSize, &ppFrame, &nFrameReturned, CUVID_PKT_ENDOFPICTURE/*bLowLatency?CUVID_PKT_ENDOFPICTURE : 0*/);
if (!bSuc)
return -2;
	//if (!nFrame && nFrameReturned > 0)
	//	LOG(INFO) << pDecoder->GetVideoInfo();
	for (int i = 0; i < nFrameReturned; i++)
	{
		if (native != pDecoder->GetSetOutputFormat())
		{
			if (i >= (int)(*vecOutBuffer).size())
			{
				(*vecOutBuffer).push_back(new uint8_t[nFrameSize]); // grow the output buffer pool on demand
			}
		}
if (pDecoder->GetBitDepth() == 8)
{
switch (pDecoder->GetSetOutputFormat())
{
case native:
//GetImage((CUdeviceptr)ppFrame[i], (*vecOutBuffer)[i], pDecoder->GetWidth(), pDecoder->GetHeight() + (pDecoder->GetChromaHeight() * pDecoder->GetNumChromaPlanes()));
break;
case bgrp:
if (pDecoder->GetOutputFormat() == cudaVideoSurfaceFormat_YUV444)
YUV444ToColorPlanar((uint8_t *)ppFrame[i], pDecoder->GetWidth(), (uint8_t*)pDecoder->GetDeviceImagePtr(), pDecoder->GetWidth(), pDecoder->GetWidth(), pDecoder->GetHeight());
else
Nv12ToColorPlanar((uint8_t *)ppFrame[i], pDecoder->GetWidth(), (uint8_t*)pDecoder->GetDeviceImagePtr(), pDecoder->GetWidth(), pDecoder->GetWidth(), pDecoder->GetHeight());
GetImage(pDecoder->GetDeviceImagePtr(), (*vecOutBuffer)[i], pDecoder->GetWidth(), 3 * pDecoder->GetHeight());
break;
case rgbp:
if (pDecoder->GetOutputFormat() == cudaVideoSurfaceFormat_YUV444)
YUV444ToColorPlanar((uint8_t *)ppFrame[i], pDecoder->GetWidth(), (uint8_t*)pDecoder->GetDeviceImagePtr(), pDecoder->GetWidth(), pDecoder->GetWidth(), pDecoder->GetHeight());
else
Nv12ToColorPlanar((uint8_t *)ppFrame[i], pDecoder->GetWidth(), (uint8_t*)pDecoder->GetDeviceImagePtr(), pDecoder->GetWidth(), pDecoder->GetWidth(), pDecoder->GetHeight());
GetImage(pDecoder->GetDeviceImagePtr(), (*vecOutBuffer)[i], pDecoder->GetWidth(), 3 * pDecoder->GetHeight());
break;
case bgra:
if (pDecoder->GetOutputFormat() == cudaVideoSurfaceFormat_YUV444)
YUV444ToColor32((uint8_t *)ppFrame[i], pDecoder->GetWidth(), (uint8_t*)pDecoder->GetDeviceImagePtr(), 4 * pDecoder->GetWidth(), pDecoder->GetWidth(), pDecoder->GetHeight());
else
Nv12ToColor32((uint8_t *)ppFrame[i], pDecoder->GetWidth(), (uint8_t*)pDecoder->GetDeviceImagePtr(), 4 * pDecoder->GetWidth(), pDecoder->GetWidth(), pDecoder->GetHeight());
GetImage(pDecoder->GetDeviceImagePtr(), (*vecOutBuffer)[i], 4 * pDecoder->GetWidth(), pDecoder->GetHeight());
break;
case rgba:
if (pDecoder->GetOutputFormat() == cudaVideoSurfaceFormat_YUV444)
YUV444ToColor32((uint8_t *)ppFrame[i], pDecoder->GetWidth(), (uint8_t*)pDecoder->GetDeviceImagePtr(), 4 * pDecoder->GetWidth(), pDecoder->GetWidth(), pDecoder->GetHeight());
else
Nv12ToColor32((uint8_t *)ppFrame[i], pDecoder->GetWidth(), (uint8_t*)pDecoder->GetDeviceImagePtr(), 4 * pDecoder->GetWidth(), pDecoder->GetWidth(), pDecoder->GetHeight());
GetImage(pDecoder->GetDeviceImagePtr(), (*vecOutBuffer)[i], 4 * pDecoder->GetWidth(), pDecoder->GetHeight());
break;
case bgra64:
if (pDecoder->GetOutputFormat() == cudaVideoSurfaceFormat_YUV444)
YUV444ToColor64((uint8_t *)ppFrame[i], pDecoder->GetWidth(), (uint8_t*)pDecoder->GetDeviceImagePtr(), 8 * pDecoder->GetWidth(), pDecoder->GetWidth(), pDecoder->GetHeight());
else
Nv12ToColor64((uint8_t *)ppFrame[i], pDecoder->GetWidth(), (uint8_t*)pDecoder->GetDeviceImagePtr(), 8 * pDecoder->GetWidth(), pDecoder->GetWidth(), pDecoder->GetHeight());
GetImage(pDecoder->GetDeviceImagePtr(), (*vecOutBuffer)[i], 8 * pDecoder->GetWidth(), pDecoder->GetHeight());
break;
case rgba64:
if (pDecoder->GetOutputFormat() == cudaVideoSurfaceFormat_YUV444)
YUV444ToColor64((uint8_t *)ppFrame[i], pDecoder->GetWidth(), (uint8_t*)pDecoder->GetDeviceImagePtr(), 8 * pDecoder->GetWidth(), pDecoder->GetWidth(), pDecoder->GetHeight());
else
Nv12ToColor64((uint8_t *)ppFrame[i], pDecoder->GetWidth(), (uint8_t*)pDecoder->GetDeviceImagePtr(), 8 * pDecoder->GetWidth(), pDecoder->GetWidth(), pDecoder->GetHeight());
GetImage(pDecoder->GetDeviceImagePtr(), (*vecOutBuffer)[i], 8 * pDecoder->GetWidth(), pDecoder->GetHeight());
break;
}
}
}
nFrame += nFrameReturned;
if (nFrameReturned > 0)
{
if (pnFrameReturned)
*pnFrameReturned = nFrameReturned;
if (native != pDecoder->GetSetOutputFormat())
{
if (pppFrame && (*vecOutBuffer).size() > 0)
*pppFrame = &(*vecOutBuffer)[0];
}
else
{
if (pppFrame && ppFrame)
*pppFrame = ppFrame;
}
}
	return 0; // success; *pnFrameReturned carries the number of decoded frames
}
- Step 4: stop decoding and destroy the decoder
void SKEYENvDecoder_Release(SKEYENVDECODER_HANDLE handle)
{
if (!handle)
return;
NvDecoder* pDecoder = (NvDecoder*)handle;
pDecoder->Stop();
delete pDecoder;
m_curIndex--;
	if (m_curIndex < 0)
		m_curIndex = 0;
}
- Step 5: unregister the decoder and release resources
int SKEYENvDecoder_Uninitsize()
{
isInitsized = false;
	for (int nI = 0; nI < (int)m_ctxV.size(); nI++)
	{
		cuCtxDestroy(m_ctxV[nI].first); // destroy the CUDA context created for each GPU
	}
	m_ctxV.clear();
	m_curIndex = -1;
	return 0;
}
With that, the SKEYENvDecoder wrapper is complete, and we can use its interface to run hardware decoding tests on an Nvidia GPU. In a real deployment, hardware-decoding 12 channels kept CPU usage (Core i5) at about 11%, with the 730 GPU's decoder at 75-80% utilization, as shown in the figure below:
For any technical questions, feel free to reach out for a technical exchange:
295222688@qq.com
You can also join the SkeyePlayer streaming media player QQ group for discussion:
102644504