Asking here because this might have some visibility:
What's the status of hardware acceleration of next-generation video standards (h.265, VP9, something else)?
It's my understanding that after VP8 was pushed and then superseded rapidly, hardware manufacturers are now leery of implementing anything other than h.264.
And as a follow-on: what makes generic GPU hardware and software not sufficient for hardware acceleration of video decoding? What is it that makes my GPU stupendously good at pushing pixels and training neural nets and physics calculations and so many other things, but not better than my CPU at video decoding? Is there a reason h.265 and such aren't implemented in CUDA?
I don't know much about video encoding, not even enough to be dangerous, so to speak.
HEVC (h.265) has specialized hardware that encodes and decodes much faster than generic GPU compute. I believe typical graphics cards ship with dedicated hardware codec blocks, but I am not sure.
Regarding why HEVC isn't so easy to run on a GPU, it comes from the way video compression is done. In video compression, you don't really process each pixel independently, one by one. Instead, you consider blocks of pixels and look for similar ones in other frames. So it's not many similar small tasks but really one big one. Besides, to minimize the information sent (i.e. to compress better), the compression of each block is based on what the blocks to its left and above have already done. If I compress all my blocks in parallel, they can't use each other's information.
Does this help you see the big picture?
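To make that dependency concrete, here's a toy sketch (not any real codec; all names invented) of why blocks must be decoded in order: each block's prediction uses the already-reconstructed neighbours to its left and above, so a block can't start until they're done.

```javascript
// Toy "intra prediction" over a row-major grid of blocks. Each block is
// reduced to a single number for illustration; real codecs predict whole
// pixel blocks with many prediction modes. The predictor is the average
// of the already-decoded left and top neighbours, and the bitstream only
// carries the small residual correction.
function decodeBlocks(residuals, width) {
  const recon = [];
  for (let i = 0; i < residuals.length; i++) {
    const hasLeft = i % width > 0;
    const hasTop = i >= width;
    const left = hasLeft ? recon[i - 1] : 0;
    const top = hasTop ? recon[i - width] : 0;
    const n = (hasLeft ? 1 : 0) + (hasTop ? 1 : 0);
    const pred = n > 0 ? (left + top) / n : 0;
    // The loop order matters: recon[i - 1] and recon[i - width] must
    // already exist, which is exactly what blocks parallel decoding.
    recon.push(pred + residuals[i]);
  }
  return recon;
}

// decodeBlocks([5, 1, 2, 3], 2) → [5, 6, 7, 9.5]
```

Because every iteration reads the results of earlier iterations, you can't just hand each block to a separate GPU thread; dedicated decoder blocks instead pipeline these stages in silicon.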
VP9 decoding has been supported since Feature Set E; it ran as a combination of a native ASIC decoder and a set of shaders on the GPU.
So while it wasn't as power efficient as the dedicated PureVideo hardware block it was still GPU accelerated.
I'm curious, do you have a source for that? I know that Feature Set E cards used a hybrid ASIC/GPU approach for H.265 but I can't find a mention of hybrid VP9.
It's done via DXVA (2.0). If you query DirectX on a Maxwell card (or even Kepler) you should get something like VP9_VLD_Profile0: DXVA2/D3D11, SD / HD / FHD / 4K. (You can either do it manually, or there's a tool called DXVA Checker or tester or something along those lines.)
VP9 support in LAV Filters (an open-source implementation of DirectShow/DXVA filters) works with Maxwell cards for sure. I have a 780 Ti somewhere so I can check Kepler too, but IIRC it should work.
That said, this works for many video players that use DXVA or support external DirectShow filters, e.g. VLC/MPC-HC or proprietary players like Splash, but it won't work in a browser.
Chrome, for example, doesn't allow external filters IIRC, so you'll see a pretty big CPU jump.
For some reason it seems fine on 1080p VP9 videos on a Maxwell card (CPU at 2-3%), but at 4K it jumps to 70-80%. I have a feeling that since Maxwell has partial HEVC and VP9 support in the PureVideo block, it might decode 1080p with that and fall back to full CPU decoding for 4K; I wouldn't expect CPU decoding of 1080p to be so resource-efficient otherwise.
Just some more info: this is the DXVA GPU-accelerated decoder: http://imgur.com/a/9MPsp http://imgur.com/a/q8SfN. It basically decodes a VP9 stream with a shader and outputs it in an image format including NV12, which means you can use PureVideo for direct output.
This one can work with either a dedicated media player or with Edge. Chrome does its own thing and I'm not entirely sure what they do, and I have no clue about Safari/Opera/Firefox, so if someone wants to fill in about them that would be nifty.
Whilst technically DXVA is a software decoder (https://en.wikipedia.org/wiki/DirectX_Video_Acceleration), it does offload the heavy lifting to the GPU via shaders, so you can still somewhat classify it as "hardware decoding", since software decoders/encoders are usually defined as CPU-only.
One of the APIs is DXVA/MF on Windows but the underlying hardware is still NVDEC (and you can use the NVDEC API directly if you so choose). It and NVENC are specialised hardware modules.
>What's the status of hardware acceleration of next-generation video standards (h.265, VP9, something else)?
You're only likely to find proper hardware acceleration for the top codecs like h.264, not much else; everything else gets software decoding. From the MPEG[2] group, as you might expect, there's h.265. But there's also a whole new undertaking by the big players like Netflix, Google, etc., called AOMedia[1], whose new codec, AV1[3], is still in development. It's open source and royalty-free, which is not how things usually go in the MPEG group, since they charge royalties and hold patents on their codecs/standards.
>what makes generic GPU hardware and software not sufficient for hardware acceleration of video decoding?
Now, I'm not a video encoding/decoding expert, but I know just enough to be dangerous. What makes GPUs well suited is that they have many, many more cores, so each core can handle a specific area of the video output that needs to be decoded.
It's called parallel programming. Whatever can be implemented more efficiently on GPUs than on CPUs, people are doing it. This is now called GPGPU programming: tasks that were thought to be better done by the CPU are being reprogrammed for the GPU, since GPU development and performance have been going through the roof.
Now, when it comes to the why, think of it this way. Your average complex computation is like lifting a couple of really heavy boulders, so you need a strong person to lift them. That strong person is your CPU core, with its complexity and high clock rate. But video encoding is not that complex at the pixel level; it's more like picking up thousands of tennis balls. GPUs are like having a bunch of kids (current-gen GPUs have ~2000 cores even at the mid-range) who are nowhere near as strong as the CPU guy, but you can guess who gets that task done faster.
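A tiny illustration of the analogy (invented example, not codec code): the first function is "tennis ball" work, where every output element depends only on its own input, so thousands of weak cores could each take a slice; the second is "boulder" style, where each step needs the previous result, so extra cores don't help.

```javascript
// Independent per-element work: trivially parallelizable across GPU cores,
// since each output pixel depends only on its own input.
function brighten(pixels, amount) {
  return pixels.map(p => Math.min(255, p + amount));
}

// Dependent work: each output needs the previous one, forcing serial order
// no matter how many cores are available.
function runningAverage(pixels) {
  const out = [];
  let sum = 0;
  for (let i = 0; i < pixels.length; i++) {
    sum += pixels[i];
    out.push(sum / (i + 1));
  }
  return out;
}

// brighten([10, 250, 100], 20) → [30, 255, 120]
// runningAverage([10, 20, 30]) → [10, 15, 20]
```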
mpv is fantastic. I recently decided to abandon MPlayerX when I found out about their incredibly dishonest malware bundling tactics[1] but was surprised at how few good media players there are for macOS, until I read about mpv.
mpv took a little bit of getting used to since it doesn't offer a lot of useful stuff out of the box (automatically queuing files, subtitle downloads, etc) but since you can write quite powerful scripts for it that didn't remain a problem for long.
If you're on macOS and looking for something with a nice GUI, check out IINA [0]. I saw it on HN [1] last week, and it looks very promising, although it's still an alpha. It's based on mpv.
IINA is very promising. Keep in mind that it's still very early in development, but I really like the mindset of the developer and contributors: creating a macOS experience, not a cross-platform app that looks bad everywhere.
MPlayerX was once upon a time also that, but its maintenance went dormant a few years back, and the bundled crapware finally killed it. So for a while now I've been looking for a good player. Seems IINA is that player.
Small Tampermonkey script replacing all YT embeds with a custom URI + a custom URI handler calling mpv "youtube_url" (or in my case extracting the direct 720p stream link, https://xxxx.googlevideo.com/videoplayback..., and passing it to SMPlayer).
This lets you watch every single YT clip using the player of your choosing. The result is smooth video on 10-year-old laptops (1.8 GHz Core 2) where Flash/browser built-in codecs can barely play 480p.
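The original script wasn't published, but a hypothetical sketch of the approach might look like this (the `mpv://` scheme and all names are my invention; a matching URI handler registered with the OS would launch the player):

```javascript
// Turn a YouTube embed URL into a custom-scheme link a local handler can
// route to mpv/SMPlayer. Returns null for non-YouTube iframes.
function embedToCustomUri(src) {
  const m = src.match(/youtube(?:-nocookie)?\.com\/embed\/([\w-]{11})/);
  return m ? 'mpv://https://www.youtube.com/watch?v=' + m[1] : null;
}

// Userscript body: replace each embedded player with a plain link.
function replaceEmbeds(doc) {
  for (const frame of doc.querySelectorAll('iframe[src*="/embed/"]')) {
    const uri = embedToCustomUri(frame.src);
    if (uri) {
      const link = doc.createElement('a');
      link.href = uri;
      link.textContent = 'Open in external player';
      frame.replaceWith(link);
    }
  }
}
```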
Would you consider publishing the script somewhere? I can't speak for others, but I would personally be inclined to use it as I'm looking for a YouTube client alternative.
Yeah, I don't usually watch that many YouTube videos. In addition, I only follow a handful of content creators, and they don't post that frequently, e.g. Primitive Technology.
Not that it matters to everyone, but I believe YouTube limits youtube-dl users to 720p resolution. If you want 1080p or higher you have to do it in-browser.
Nope, youtube-dl is able to grab the DASH sources, which include 4K, 1440p and 1080p. :)
The DASH video and audio sources can then be muxed by mpv on playback.
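For anyone wanting to try it, the relevant flags look roughly like this (the URL is a placeholder; youtube-dl needs ffmpeg on PATH to mux the separate DASH streams):

```shell
# List all available formats, including the separate DASH video/audio streams
youtube-dl -F 'https://www.youtube.com/watch?v=VIDEOID'

# Download best video (which can be a 1080p/1440p/4K DASH stream) plus best
# audio, muxed into one file
youtube-dl -f 'bestvideo+bestaudio' 'https://www.youtube.com/watch?v=VIDEOID'

# Or let mpv fetch and mux the streams on the fly (when built with
# youtube-dl support)
mpv --ytdl-format='bestvideo+bestaudio' 'https://www.youtube.com/watch?v=VIDEOID'
```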
I always hear Safari described as "the new IE", but after switching back to a MacBook I've forced myself to keep Safari as my default browser, and it's pretty great from a user perspective.
It saves an incredible amount of battery life over Chrome (which makes me that much more wary of Electron apps eating up battery and memory at idle). Is there any specific architectural decision by Safari that enables those battery savings?
AFAIK the difference is that Safari is not as secure as Chrome. Chrome's multi-process security comes at a cost: pretty much everything a webpage wants or needs to do has to be shuttled between processes. All network requests, all disk I/O, all graphics happen in other processes. The communication overhead between those processes is the difference in CPU usage. It's also why Chrome has 10x fewer code execution bugs than Safari. Note: Chrome doesn't have 10x fewer bugs overall. It has the same number of bugs. It's just that, in the code-execution category, it has 10x fewer.
I'm only guessing that as Firefox goes multi-process for security reasons, the same thing will happen: their CPU usage will go up because of the overhead of cross-process communication, but their code execution bug percentage will go down.
WebKit2 behaves in the same way with regards to the points mentioned above. The sandboxed content process(es) are responsible for network, IO, layout, etc. The UI process (normally, the browser process) serves only as a broker for developer decision making and final drawing of the laid out content.
AFAIK WebKit2 does not use a separate process for graphics. Graphics calls go directly into whatever api is appropriate (CoreGraphics, OpenGL).
Whereas in Chrome that's not the case: all OS/driver-level graphics happen in the GPU process. That means all data has to be shuttled from the process that wants to display it to the GPU process. Even video, for example, gets decoded in a sandboxed process (because there might be exploitable bugs in the codecs), then that data has to be shuttled to the GPU process so it can be composited with the page. The compositing directives originate in the render process (the webpage) but eventually have to be translated into graphics commands that execute in the GPU process.
None of this is true in WebKit2 AFAIK, and it's one of the many reasons it has so many more code exploit bugs (15x actually for 2016).
Safari has been multi process for ages now (several years). I don't think that's the difference. I think Safari simply errs on the side of better resource consumption where Chrome's highest priority is cutting edge shinies for web front end devs and being as fast as possible at the cost of everything else.
One example of this is that Safari will suspend tabs (as in slow them down or freeze them while keeping them in memory) that you haven't used in 15+ minutes and that aren't playing music or anything. Effectively, this means Safari's total power impact is only that of the 2-4 tabs you're actively using, instead of however many you actually have open. It's a bit nicer than something like The Great Suspender, though, since it doesn't just trash the loaded pages and force a reload when you revisit those tabs.
If you're using an adblocker based on WebKit Content Blockers, that can have an impact too. They're much more efficient than even uBlock and add almost no additional CPU or RAM usage to web browsing, meaning you fully reap all the savings of blocking ads and egregious JS.
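For context, Content Blocker extensions hand Safari a static, declarative JSON rule list that WebKit compiles into an efficient matcher once, instead of running extension JavaScript on every request. A minimal rule file looks like this (the domains are placeholders):

```json
[
    {
        "trigger": { "url-filter": "ads\\.example\\.com" },
        "action": { "type": "block" }
    },
    {
        "trigger": {
            "url-filter": ".*",
            "resource-type": ["script"],
            "if-domain": ["*tracker.example"]
        },
        "action": { "type": "block-cookies" }
    }
]
```

The first rule blocks any resource whose URL matches the pattern; the second strips cookies from script loads on the listed domain and its subdomains.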
While technically true for Safari, all these advantages also apply if you build your own application with WKWebView. Each web view gets its own process up to a point, then processes are reused; inactive processes throttle down activity and suspend. The WebKit2 architecture is a lot more efficient in its memory management as well. It's a shame so many browsers went with Blink rather than WebKit2, mostly for political reasons.
I think the bigger problem with WebKit-based browsers on platforms other than macOS is the frequency of updates. For example, I find GNOME Web on Linux/*BSD perfectly suited to my own web browsing, but depending on the distribution, updates can be farther between than even Safari's once-per-year big releases, and far more infrequent than Safari Technology Preview's or Chrome's update schedule.
There aren't really any good WebKit-based options on Windows, though. IIRC Midori runs on Windows but is GTK based and feels out of place.
I'm not sure it's Safari per se. Safari is quite a "thin" layer around WebKit2; by "thin", I mean once you set aside all the iCloud stuff, bookmarks, etc. The tab management intelligence is performed by the WebKit2 backend process. So I think it's more a testament to the WebKit2 architecture. It's a real shame there is no good browser for Windows using WebKit2.
That's one way to frame it. Or you could use a plugin to request the h264 version, keep the battery-life gains of hardware acceleration on devices without VP9 decoding (VP9 hardware acceleration is quite recent), and still get to watch in 4K.
That's not how it works. If you request h264, there is no 4K option because there's no 4K h264 encoded version* .
Edit: As SG- pointed out, I also cannot read: YouTube is encoding 4K h264, just not offering it on the main site. So it's puzzling why they're doing this. Bandwidth is a possibility, but you'd think that would be applied consistently.
* You can test this using Chrome + h264ify. Which I use because the battery life when watching VP9 is truly horrific, and I don't care how much Google saves on bandwidth by pulling this stunt.
"The necessary support of the VP9 video codec for 4K playback today seems to currently apply to videos being watched directly on YouTube’s site. A quick way to check is to launch an uploaded 4K video on YouTube, and attempt to change the resolution. Recently uploaded videos only show HD options up to 1440p in Safari. Interestingly, if these very videos are embedded on an external page, then 4K video playback is still an option."
So it looks like they're encoding h264 in 4K, just not if you're on the actual Youtube site.
Interesting that the file size for VP9/WebM is usually smaller at every resolution except 4K, where MP4/h264 seems smaller (at least for this specific video), meaning it's not really about trying to save bandwidth.
You are right and I am wrong. 1440p was the max I was offered when forcing the Flash player in Opera and MP4 in Chrome.
Out of curiosity, why was this article framed around Safari when it affects all browsers that want to use h264 4K, including Chrome and Opera? Is there some way this is actually a Safari-centric change?
Because by default Chrome/Firefox/Edge support VP9 but Safari does not. By extension, if you aren't blocking VP9, and most people aren't, Chrome/Firefox/Edge get VP9 = 4K and Safari gets H264 = no 4K.
It's most likely not a bug, as YouTube and Google are known for forcing VP8/9/10/AV1 down people's throats, because it's theirs and they want to "show it to Apple". For video sites, H.264 is the only option if they want to support Apple devices, and its hardware decoding is widespread enough that it doesn't cause any issues. The only issues are that it's patented and that it compresses worse than VP9/H.265. But we could just switch to H.265 anyway, and given that Google deprecates its codecs every two years, I don't think they will ever have good hardware decoding support.
> The only issue is it's patented and it's bad at encoding compared to VP9/H.265
Your argument is that Google is evil for pushing a codec that's patent-free and better as a codec? Also, as others have stated, VP9 has wide hardware acceleration support (and will have wider support in the future).
That's the case most of the time I've bothered looking. It seems Google is trying to make the VP9 versions look better, and despite all the hype around their codec, the only way to achieve that is through bigger bitrates :/
Maybe it has something to do with costs? VP8 and VP9 are free, while distributing h.264 and h.265 content can cost millions in fees per year. It makes sense to use free codecs instead of h.26x.
Also, people should maybe realize that VP8 and VP9 are not entirely free of third-party patents. Instead, Google pays companies to use their patents and to distribute the related technologies freely.
Unless Google ever uses one of your patents, and you have to sue Google over that.
VP8 and VP9 come with a no-litigation clause that basically requires you to share all your patents with Google if you use VP8 and VP9 – which is a shame.
You only lose the right to use the WebM patents if you file litigation against any user of WebM (not just Google) over any implementation of WebM. You can sue Google about anything else and not suffer WebM license consequences.
Indeed, turns out I was wrong on that. Now I wonder: if I own MPEG patents, could I sue Google for misuse of them while licensing WebM? There's quite some overlap between them.
And yes, Facebook and Tesla’s patent licenses are completely ridiculous.
You can sue Google about misuse of MPEG patents and keep your protections under the WebM license, so long as you do not sue Google or anybody else about WebM's patents.
It turns out Google removed the "you lose the right to the license if you sue Google for literally anything" part, but Facebook still has it in React, and Tesla has it in their licenses.
That’s the usual wording of these clauses, and the problem.
Good. No one stops Apple from supporting free codecs. Google should have done something of the sort a long time ago. But they didn't, and proliferated H.264 usage.
>> VP9 uses a retarded amount of CPU compared to h264, so I can't blame Apple for not wanting to implement it.
h264 uses a retarded amount of bandwidth compared to VP9, so I can't blame Google for not wanting to implement it.
Google is still providing a choice here, except for 4K where it costs them twice as much. Apple on the other hand has no excuse for not implementing both. There is also a free low-power hardware implementation of VP9 which Apple (who makes their own SoC) could choose to use but hasn't.
> which Apple (who makes their own SoC) could choose to use but hasn't.
They don't make their own SoCs for their laptop and desktop machines, which is where this primarily matters. 4K on iPhones doesn't matter since they top out at 2560x1440 resolution.
Indeed, Google not doing something sooner is what caused Firefox to accept a proprietary h264 blob, leading to the eventual acceptance of h264 as a required codec in the WebRTC standard.
No big loss for me, no matter how you slice it. I don't watch that much video on YouTube and I prefer to d/l it with youtube-dl if I'm going to be involved with it longer than 5 minutes.