Asynchronous GPU Readback
Some time ago I mentioned reading back data from the GPU asynchronously. I’ve done some exploration into this today, and I’d like to get some of that down on paper. I’m not going to go through the whole process in as much depth as in some of my other posts, but I think there’s still a lot to learn here.
One thing I want to mention quickly beforehand is that the experimental API we were going to use is no longer experimental! That’s great news! We can now build the program against a stable Unity API.
Implementation
The asynchronous API works with requests. This basically means that we file a “request” at some point in time (probably after we send data to the GPU), and at another point in the future we’ll get our data back. The idea is that waiting for the data to come back doesn’t make the program wait, and so there’s effectively no framerate impact for getting the data back.
There are four main places where we can make use of this requests API:
Ideally we’d merge all these into a single request, but that’s for another time.
The simplest way I can think of to switch these to an asynchronous process is to do something like the following:
We simply make requests, and when they’re ready we feed the data back. This does effectively the same thing we’re doing already, but without blocking the CPU.
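The polling approach described above could be sketched roughly as follows. This is a minimal sketch rather than the project’s actual code: the class name PollingReadback, the fields buffer, results and pending, the element count, and the cap on in-flight requests are all assumptions made for illustration. Only AsyncGPUReadback.Request, done, hasError, GetData and CopyTo come from Unity itself.

```csharp
using System.Collections.Generic;
using Unity.Collections;
using UnityEngine;
using UnityEngine.Rendering;

// Hypothetical component: names and sizes are illustrative assumptions.
public class PollingReadback : MonoBehaviour
{
    const int Count = 1024;        // assumed number of elements in the buffer
    const int MaxInFlight = 4;     // assumed cap so the queue cannot grow without bound

    ComputeBuffer buffer;          // GPU-side data, assumed to be filled elsewhere
    Vector3[] results;             // CPU-side copy, allocated once and reused
    readonly Queue<AsyncGPUReadbackRequest> pending =
        new Queue<AsyncGPUReadbackRequest>();

    void Start()
    {
        buffer = new ComputeBuffer(Count, sizeof(float) * 3);
        results = new Vector3[Count];
    }

    void Update()
    {
        // Consume finished requests in the order they were filed.
        while (pending.Count > 0)
        {
            AsyncGPUReadbackRequest request = pending.Peek();
            if (request.hasError)
            {
                pending.Dequeue();             // drop failed requests
            }
            else if (request.done)
            {
                NativeArray<Vector3> data = request.GetData<Vector3>();
                data.CopyTo(results);          // lengths must match
                pending.Dequeue();
            }
            else
            {
                break;                         // oldest request is still in flight
            }
        }

        // File a new request without waiting for it.
        if (pending.Count < MaxInFlight)
        {
            pending.Enqueue(AsyncGPUReadback.Request(buffer));
        }
    }

    void OnDestroy()
    {
        AsyncGPUReadback.WaitAllRequests();    // let in-flight requests finish first
        buffer.Release();
    }
}
```

Checking the queue every frame matters here, because the data behind a completed request is only available for a short time after it finishes.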
Another way to make use of the requests API is to have the request tell us when it’s ready. This would look something like the following:
The second parameter of the request call is a C# action that will be called when the request is complete. This was my approach originally, but creating actions in a lambda-like fashion every frame actually incurs a fairly large overhead. A solution here would be to avoid lambdas and create the action manually, but to be honest I hadn’t thought of that until now. Both are good options.
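One way the callback variant without a per-frame lambda might look is sketched below. As before, this is an assumption-laden illustration: CallbackReadback, onReadback, inFlight and the buffer layout are hypothetical names, and limiting the component to a single outstanding request is a choice made for the sketch, not something this post prescribes.

```csharp
using System;
using UnityEngine;
using UnityEngine.Rendering;

// Hypothetical component: names and sizes are illustrative assumptions.
public class CallbackReadback : MonoBehaviour
{
    const int Count = 1024;                        // assumed element count

    ComputeBuffer buffer;                          // GPU-side data, assumed to be filled elsewhere
    Vector3[] results;                             // CPU-side copy, allocated once and reused
    Action<AsyncGPUReadbackRequest> onReadback;    // delegate created once, not every frame
    bool inFlight;                                 // true while a request is outstanding

    void Start()
    {
        buffer = new ComputeBuffer(Count, sizeof(float) * 3);
        results = new Vector3[Count];
        onReadback = OnReadback;                   // the only delegate allocation
    }

    void Update()
    {
        if (inFlight) return;                      // keep at most one request outstanding
        inFlight = true;
        AsyncGPUReadback.Request(buffer, onReadback);
    }

    // Unity invokes this once the request has completed.
    void OnReadback(AsyncGPUReadbackRequest request)
    {
        inFlight = false;
        if (request.hasError) return;
        request.GetData<Vector3>().CopyTo(results);
    }

    void OnDestroy()
    {
        AsyncGPUReadback.WaitAllRequests();        // let the in-flight request finish first
        buffer.Release();
    }
}
```

Storing the delegate in a field means the same Action instance is passed on every request, which avoids allocating a new one each frame.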
Something else to note here is that it’s extremely important to use CopyTo() rather than ToArray() to move the values into our arrays. ToArray() allocates a new managed array on every call, whereas CopyTo() writes into an array we already own. In the words of someone on the Unity forums: “ToArray() allocates like a motherf*er”.
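To make the difference concrete, here is a small sketch of the two ways of getting data out of a completed request. The helper class ReadbackCopy and its method names are hypothetical, and the element type Vector3 is an assumption. ToArray() and CopyTo() are the real NativeArray methods.

```csharp
using Unity.Collections;
using UnityEngine;
using UnityEngine.Rendering;

// Hypothetical helper: class and method names are illustrative assumptions.
public static class ReadbackCopy
{
    // Allocates a brand new managed array on every call,
    // which becomes garbage as soon as the caller is done with it.
    public static Vector3[] Allocating(AsyncGPUReadbackRequest request)
    {
        return request.GetData<Vector3>().ToArray();
    }

    // Copies into an array the caller allocated once and keeps reusing.
    // CopyTo requires source and destination to have the same length.
    public static bool Reusing(AsyncGPUReadbackRequest request, Vector3[] destination)
    {
        NativeArray<Vector3> data = request.GetData<Vector3>();
        if (data.Length != destination.Length)
        {
            Debug.LogWarning("Readback size does not match the destination array.");
            return false;
        }
        data.CopyTo(destination);
        return true;
    }
}
```

Both methods assume the request has already completed without error.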
Not so bad! The desync you’re seeing here is that the texture is updated each frame, but the mesh is only updated every other frame or so. In theory, we should be able to solve this by only updating the texture each time we push a new request, but despite my best efforts, Unity seems to refuse to chroma key the texture unless we update the texture each frame.
We’ve also successfully decoupled the hand’s bad performance from the rest of the game. In builds, this solution comfortably runs at over 200 FPS regardless of the hand parameters. Note: this doesn’t mean the hand portion runs, or even looks like it runs, at that framerate. It simply means that the rest of the game doesn’t suffer from the hand manager’s performance problems.
Concluding
I agree that a solution like this isn’t ready for production, but significant progress (in my opinion) has nonetheless been made in getting there. For now I’ve added asynchronous reading as an option that can be toggled in the editor and at runtime, although Unity seems to struggle if you toggle it too often. We can test with this parameter to at least get a cursory idea of what kind of impact this has on player experience.















