NVDEC

Website, builds, or other suggestions.
Zeranoe
Site Admin
Posts: 688
Joined: Sat May 07, 2011 7:12 pm

Re: NVDEC

Post by Zeranoe » Thu Mar 08, 2018 9:10 pm

I've added

Code: Select all

--enable-nvdec
to the configure, so look for it in tonight's build.
hydra3333 wrote:
Thu Mar 08, 2018 2:23 am
Now, to find the latest status of OpenCL ... and try to convince that it can be done and put into a standard zeranoe build :D
The issue with OpenCL is the lack of the needed dll on a server. The API calls for OpenCL need to be implemented with LoadLibrary to get around this, but it hasn't been done yet as far as I know.

DJX
Posts: 54
Joined: Mon Aug 06, 2012 10:37 pm

Re: NVDEC

Post by DJX » Thu Mar 08, 2018 11:18 pm

Thank you.

hydra3333
Posts: 158
Joined: Sun Apr 28, 2013 1:03 pm

Re: NVDEC

Post by hydra3333 » Fri Mar 09, 2018 1:09 am

Zeranoe wrote:
Thu Mar 08, 2018 9:10 pm
I've added

Code: Select all

--enable-nvdec
to the configure, so look for it in tonight's build.

The issue with OpenCL is the lack of the needed dll on a server. The API calls for OpenCL need to be implemented with LoadLibrary to get around this, but it hasn't been done yet as far as I know.
Thank you for nvdec.

Khronos have implemented something; https://github.com/KhronosGroup/OpenCL-ICD-Loader
Based on deadsix27's python code, below are two hacked-up bits of rdp-type script which link it statically into ffmpeg, one for 32-bit and a slightly different one for 64-bit ... and it works. Just a thought.

Truly terrible example code:

Code: Select all

# NOTE: in a real script the two function definitions below must appear before these calls
bits_target=32
echo "KHRONOS OpenCL ICD build"
build_openCL_icd_v1_which_works "$bits_target"

bits_target=64
echo "KHRONOS OpenCL ICD build"
build_openCL_icd_new_v2 "$bits_target"


build_openCL_icd_v1_which_works() {
echo "---------------------------------------------------------------------------------------------------"
echo "---------------------------------------------------------------------------------------------------"
echo "build_openCL_icd_v1_which_works \"${1}\""
echo "---------------------------------------------------------------------------------------------------"
echo "---------------------------------------------------------------------------------------------------"
  # build openCL stuff ready for ffmpeg, in a way that the final ffmpeg build is independent of the make/model of video card being used
  # https://streamcomputing.eu/blog/2015-08-14/opencl-basics-multiple-opencl-devices-with-the-icd/
  #
  # "The OpenCL ICD extension (cl_khr_icd) allows multiple implementations of OpenCL to co-exist on the same system. 
  #  The OpenCL ICD Loader Library allows applications to choose a platform from the list of installed platforms
  #  and dispatches OpenCL API calls to the underlying implementation."
  #  
  # courtesy of the work by deadsix27 https://github.com/DeadSix27/python_cross_compile_script
  #
  # remove hangover headers etc which were needed for the interim ffmpeg build
  rm -rfv "${mingw_w64_x86_64_prefix}/include/OpenCL"   # -r because this is a directory
  rm -rfv "${mingw_w64_x86_64_prefix}/include/CL"       # -r because this is a directory
  rm -fv ${mingw_w64_x86_64_prefix}/lib/libOpenCL.a
  # the next 2 aren't copied originally, but try to remove them anyway
  rm -fv ${mingw_w64_x86_64_prefix}/lib/OpenCL.def	
  rm -fv ${mingw_w64_x86_64_prefix}/lib/OpenCL.dll
  # 1. the headers
  do_git_checkout https://github.com/KhronosGroup/OpenCL-Headers.git OpenCL-Headers
  cd OpenCL-Headers || exit 1 
    mkdir -pv "${mingw_w64_x86_64_prefix}/include/CL"  
    mkdir -pv "${mingw_w64_x86_64_prefix}/include/OpenCL"
    cd opencl22/CL  || exit 1 # nvidia is v1.2 as at 2017.05.28 but the ICD loader requires latest
	   cp -fv *.h "$mingw_w64_x86_64_prefix/include/CL/" || exit 1 
	   cp -fv *.h "$mingw_w64_x86_64_prefix/include/OpenCL/" || exit 1 
    cd ../..
  cd ..
  # 2. the icd loader
  do_git_checkout https://github.com/KhronosGroup/OpenCL-ICD-Loader.git OpenCL-ICD-Loader "6849f617e991e8a46eebf746df43032175f263b3" # 6849f617e991e8a46eebf746df43032175f263b3 is last working commit before they broke it for 32bit
  cd OpenCL-ICD-Loader || exit 1 
	 rm -fv "libOpenCL.dll.a"
	 rm -fv "$mingw_w64_x86_64_prefix/lib/libOpenCL.dll.a"
	 rm -fv "OpenCL.a"
	 rm -fv "$mingw_w64_x86_64_prefix/lib/OpenCL.a"
    rm -fv "0001-OpenCL-git-prefix-static.patch"
    apply_patch https://raw.githubusercontent.com/hydra3333/ffmpeg-windows-build-helpers-withOpenCL/master/patches/0001-OpenCL-git-prefix-static.patch "-p0" # "-p1"
    if [ "$1" = "32" ]; then   # this patch applies only to 32bit 
      # this patch apparently fixes the issue 
      #   CMakeFiles/OpenCL.dir/objects.a(icd_windows.c.obj):icd_windows.c:(.text+0x143): 
      #      undefined reference to `InitOnceExecuteOnce'
      #   collect2: error: ld returned 1 exit status
      #   CMakeFiles/OpenCL.dir/build.make:152: recipe for target 'bin/OpenCL.dll' failed
      #   make[2]: *** [bin/OpenCL.dll] Error 1
      apply_patch https://raw.githubusercontent.com/hydra3333/ffmpeg-windows-build-helpers-withOpenCL/master/patches/0001-OpenCL-icd-windows-c.patch #"-p1"
    fi
    cmake -G"Unix Makefiles" . -DENABLE_STATIC_RUNTIME=1 -DCMAKE_SYSTEM_NAME=Windows -DCMAKE_RANLIB=${cross_prefix}ranlib -DCMAKE_C_COMPILER=${cross_prefix}gcc -DCMAKE_CXX_COMPILER=${cross_prefix}g++ -DCMAKE_RC_COMPILER=${cross_prefix}windres -DCMAKE_INSTALL_PREFIX=$mingw_w64_x86_64_prefix -DBUILD_SHARED_LIBS=OFF -DCMAKE_FIND_ROOT_PATH=$mingw_w64_x86_64_prefix || exit 1
    make clean || exit 1 
    make -j $cpu_count || exit 1 
    cp -fv "OpenCL.a" "$mingw_w64_x86_64_prefix/lib/libOpenCL.dll.a" || exit 1       # brute force link success by copying to known .a filenames 
    cp -fv "OpenCL.a" "$mingw_w64_x86_64_prefix/lib/libOpenCL.a" || exit 1           # brute force link success by copying to known .a filenames 
	 #read -p "Finished build_openCL_icd_v1_which_works, press Enter to continue... or control-C if not happy"
  cd ..
}


build_openCL_icd_new_v2() {
echo "---------------------------------------------------------------------------------------------------"
echo "---------------------------------------------------------------------------------------------------"
echo "build_openCL_icd_new_v2 \"${1}\""
echo "---------------------------------------------------------------------------------------------------"
echo "---------------------------------------------------------------------------------------------------"
  # build openCL stuff ready for ffmpeg, in a way that the final ffmpeg build is independent of the make/model of video card being used
  # https://streamcomputing.eu/blog/2015-08-14/opencl-basics-multiple-opencl-devices-with-the-icd/
  #
  # "The OpenCL ICD extension (cl_khr_icd) allows multiple implementations of OpenCL to co-exist on the same system. 
  #  The OpenCL ICD Loader Library allows applications to choose a platform from the list of installed platforms
  #  and dispatches OpenCL API calls to the underlying implementation."
  #  
  # courtesy of the work by deadsix27 https://github.com/DeadSix27/python_cross_compile_script
  #
  # remove hangover headers etc which were needed for the interim ffmpeg build
  rm -rfv "${mingw_w64_x86_64_prefix}/include/OpenCL"   # -r because this is a directory
  rm -rfv "${mingw_w64_x86_64_prefix}/include/CL"       # -r because this is a directory
  rm -fv ${mingw_w64_x86_64_prefix}/lib/libOpenCL.a
  # the next 2 aren't copied originally, but try to remove them anyway
  rm -fv ${mingw_w64_x86_64_prefix}/lib/OpenCL.def	
  rm -fv ${mingw_w64_x86_64_prefix}/lib/OpenCL.dll
  #
  # 1. Download the headers and put them in the right places
  #
  do_git_checkout https://github.com/KhronosGroup/OpenCL-Headers.git OpenCL-Headers
  cd OpenCL-Headers || exit 1 
    mkdir -pv "${mingw_w64_x86_64_prefix}/include/CL"  
    mkdir -pv "${mingw_w64_x86_64_prefix}/include/OpenCL"
    cd opencl22/CL  || exit 1 # nvidia is v1.2 as at 2017.05.28 but the ICD loader requires latest
	   cp -fv *.h "$mingw_w64_x86_64_prefix/include/CL/" || exit 1 
	   cp -fv *.h "$mingw_w64_x86_64_prefix/include/OpenCL/" || exit 1 
    cd ../..
  cd ..
  #
  # 2. the icd loader
  #
  do_git_checkout https://github.com/KhronosGroup/OpenCL-ICD-Loader.git OpenCL-ICD-Loader  # 6849f617e991e8a46eebf746df43032175f263b3 is last working commit before they broke it
  cd OpenCL-ICD-Loader || exit 1 
	 rm -fv "libOpenCL.dll.a"
	 rm -fv "$mingw_w64_x86_64_prefix/lib/libOpenCL.dll.a"
	 rm -fv "OpenCL.a"
	 rm -fv "$mingw_w64_x86_64_prefix/lib/OpenCL.a"
    # 2018.02.02 from DeadSix27:
    apply_patch https://raw.githubusercontent.com/hydra3333/ffmpeg-windows-build-helpers-withOpenCL/master/patches/0001-OpenCL-git-prefix-20180223.patch "-p1"
    apply_patch https://raw.githubusercontent.com/hydra3333/ffmpeg-windows-build-helpers-withOpenCL/master/patches/0002-OpenCL-git-header-20180223.patch "-p1"
    mv icd_windows_hkr.patched.h icd_windows_hkr.h
    #
    # 3. make devpkey includable
    #
    # 2018.02.02 from DeadSix27: apparently devpkey.h is in mingw64, 
    #    https://github.com/DeadSix27/python_cross_compile_script/issues/29#issue-292584111
    #    https://github.com/DeadSix27/python_cross_compile_script/pull/31#issuecomment-362291618
    #    https://github.com/DeadSix27/python_cross_compile_script/issues/29
	 sed -i.bak 's/Devpkey.h/devpkey.h/' icd_windows_hkr.c
    #
    # 4. Build it
    #
    cmake -G"Unix Makefiles" . -DENABLE_STATIC_RUNTIME=1 -DCMAKE_SYSTEM_NAME=Windows -DCMAKE_RANLIB=${cross_prefix}ranlib -DCMAKE_C_COMPILER=${cross_prefix}gcc -DCMAKE_CXX_COMPILER=${cross_prefix}g++ -DCMAKE_RC_COMPILER=${cross_prefix}windres -DCMAKE_INSTALL_PREFIX=$mingw_w64_x86_64_prefix -DBUILD_SHARED_LIBS=OFF -DCMAKE_FIND_ROOT_PATH=$mingw_w64_x86_64_prefix || exit 1
    #
    make clean || exit 1 
    make -j $cpu_count || exit 1 
    #
    # 5. Copy the result to the right places
    #    
    cp -fv "libOpenCL.dll.a" "$mingw_w64_x86_64_prefix/lib/libOpenCL.dll.a"  || exit 1     # brute force link success by copying to known .a filenames (a failed copy aborts the build)
    cp -fv "libOpenCL.dll.a" "$mingw_w64_x86_64_prefix/lib/libOpenCL.a"  || exit 1         # brute force link success by copying to known .a filenames (a failed copy aborts the build)
    #read -p "Finished build_openCL_icd_new_v2, press Enter to continue... or control-C if not happy"
  cd ..
}
for ffmpeg to include it:

Code: Select all

config_options+=" --enable-opencl "
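To confirm that a finished build really picked up the option, something along these lines may help. It is only a sketch: the helper name has_configure_flag is made up for illustration, and it simply scans whatever text is piped into it (for example the output of ffmpeg -buildconf) for an exact flag.

```shell
# Hypothetical helper (not part of any build script in this thread).
# Reads configure text on stdin and succeeds if the exact flag "$1" is present.
has_configure_flag() {
  # Put every whitespace-separated token on its own line, then match whole lines only,
  # so that --enable-opencl does not match a longer flag that merely starts with it.
  tr -s ' \t' '\n\n' | grep -qxF -- "$1"
}

# Usage (assumes ffmpeg is on the PATH):
#   ffmpeg -hide_banner -buildconf | has_configure_flag --enable-opencl && echo "OpenCL enabled"
```

The same check works for --enable-nvdec.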

red5goahead
Posts: 18
Joined: Mon Dec 05, 2016 8:26 pm

Re: NVDEC

Post by red5goahead » Fri Mar 09, 2018 1:39 pm

so now with --enable-nvdec?

There is no new h264_nvdec codec in the list (help full).
What is the difference from the h264_cuvid decoder now?

DJX
Posts: 54
Joined: Mon Aug 06, 2012 10:37 pm

Re: NVDEC

Post by DJX » Fri Mar 09, 2018 3:45 pm

I'm also confused by this.
The output of "ffmpeg -codecs" does not list a decoder for NVDEC.

Code: Select all

 DEV.LS h264                 H.264 / AVC / MPEG-4 AVC / MPEG-4 part 10 (decoders: h264 h264_qsv h264_cuvid ) (encoders: libx264 libx264rgb h264_amf h264_nvenc h264_qsv nvenc nvenc_h264 )
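For anyone who wants to pull just the decoder names out of that listing, here is a rough sketch. The function name list_decoders is made up for illustration, and it assumes the "(decoders: ... )" layout shown in the line above.

```shell
# Hypothetical helper: reads `ffmpeg -codecs` output on stdin and prints,
# one per line, the decoders listed for the codec named in "$1".
list_decoders() {
  codec="$1"
  # Keep only the row whose second column is the codec name,
  # cut out the text inside "(decoders: ... )", and split it on spaces.
  awk -v codec="$codec" '$2 == codec' \
    | sed -n 's/.*(decoders: \([^)]*\)).*/\1/p' \
    | tr -s ' ' '\n' \
    | sed '/^$/d'
}

# Usage (assumes ffmpeg is on the PATH):
#   ffmpeg -hide_banner -codecs | list_decoders h264
```

A codec row without a "(decoders: ... )" group prints nothing.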

red5goahead
Posts: 18
Joined: Mon Dec 05, 2016 8:26 pm

Re: NVDEC

Post by red5goahead » Fri Mar 09, 2018 5:17 pm

I'm trying with usual encoding commands

ffmpeg -y -t 00:02:00 -hwaccel cuvid -c:v h264_cuvid -deint adaptive -resize 1920x1080 -i '.\Ferrari Challenge Mugello 458_raw.ts' -c:v hevc_nvenc -c:a aac -preset slow out.mp4

but without success: -resize is working, deinterlacing is not.

hydra3333
Posts: 158
Joined: Sun Apr 28, 2013 1:03 pm

Re: NVDEC

Post by hydra3333 » Mon Mar 12, 2018 5:13 am

red5goahead wrote:
Fri Mar 09, 2018 5:17 pm
. deinterlacing not
Last I heard that was a known feature.

https://github.com/rigaya/NVEnc
This apparently has a choice of GPU deinterlacers and resizers, and there is plenty of discussion of it over at videohelp. I'm looking at testing it for my needs, which involve transferring/adding a heap of HDR type metadata (once I understand that better) for 4k/HDR10 clips.

As an alternative example for you, I currently use yadif to pre-deinterlace 8-bit input. It's not GPU and it's slow, but it works, e.g. along these lines:

Code: Select all

ffmpeg_x64.exe -hide_banner -v verbose -strict -1 -init_hw_device opencl=ocl:1.0 -filter_hw_device ocl -threads 0 -i "input.mp4" -sws_flags lanczos+accurate_rnd+full_chroma_int+full_chroma_inp -filter_complex "[0:v]yadif=0:0:0" -pix_fmt yuv420p -c:v hevc_nvenc -profile:v main -level 5.1 -preset slow -rc vbr_hq -cq 28 -rc-lookahead 32 -spatial_aq 1 -c:a libfdk_aac -cutoff 18000 -ab 384k -ar 48000 -movflags +faststart -y "output.h265.mp4"
The OpenCL bit on the command line is only there in case I choose to add a GPU OpenCL filter like unsharp etc.

red5goahead
Posts: 18
Joined: Mon Dec 05, 2016 8:26 pm

Re: NVDEC

Post by red5goahead » Mon Mar 12, 2018 9:05 am

Thanks, but yadif already works fine:

ffmpeg -y -t 00:02:00 -c:v h264_cuvid -resize 1920x1080 -i '.\Ferrari Challenge Mugello 458_raw.ts' -vf yadif=1 -c:v hevc_nvenc -c:a aac -preset slow out.mp4

The question is about speed. Without hwaccel cuvid the speed is still adequate (about 3x) instead of 10-11x, and yadif=1 generates 50 fps footage, which is good.

hydra3333
Posts: 158
Joined: Sun Apr 28, 2013 1:03 pm

Re: NVDEC

Post by hydra3333 » Mon Mar 12, 2018 9:57 am

Is that resizing before deinterlacing ?
I was always told that was "bad" and to always do it after :)
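To make the ordering explicit, here is a tiny sketch that builds a -vf chain with the deinterlacer first and the scaler second. The function name build_vf_chain is made up for illustration; yadif and scale are the standard ffmpeg software filters.

```shell
# Hypothetical helper: prints a filter chain that deinterlaces first, then resizes.
# $1 = target width, $2 = target height
build_vf_chain() {
  # yadif=1 outputs one frame per field; scale runs on the already deinterlaced frames.
  printf 'yadif=1,scale=%s:%s' "$1" "$2"
}

# Usage (assumes ffmpeg is on the PATH):
#   ffmpeg -i input.ts -vf "$(build_vf_chain 1920 1080)" -c:v hevc_nvenc out.mp4
```

Note that both filters here run on the CPU, so this only illustrates the order, not an all-GPU path.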

Do consider checking out https://github.com/rigaya/NVEnc where, if needed, you could decode the input file with ffmpeg (if nvencc doesn't already read that format directly), pipe it into nvencc, and do pretty much all-GPU deinterlacing and resizing into an avc or hevc mp4. Just a thought.

red5goahead
Posts: 18
Joined: Mon Dec 05, 2016 8:26 pm

Re: NVDEC

Post by red5goahead » Mon Mar 12, 2018 10:23 am

I know, usually I'm using this:

ffmpeg -y -t 00:02:00 -c:v h264_cuvid -i '.\Ferrari Challenge Mugello 458_raw.ts' -vf yadif=1,scale=1920:1080 -c:v hevc_nvenc -c:a aac -preset slow out.mp4

I would like to do both the decoding and the encoding entirely in hardware.
