The "No. There is no additional overhead. Components participate in a unified renderloop outside of React. It outperforms Threejs in scale due to Reacts scheduling abilities." made me do a serious double-take.
If you want to go fast in this space then you need to care about data layout and how the system is structured end-to-end. Calling a function per object is going to hit a wall regardless of how you schedule.
Hundreds/Thousands of updates is not small but it's also not massively impressive either. I've done ~2,800 node scene graphs on underpowered ARM chips back in '09 at 60FPS including rendering. You have to use NEON, and be aware of your caches. No scheduling magic is going to change that unless you're just deferring work which sounds like what may be happening there.
FWIW I've also done this in Java via FlatBuffers (which uses ByteBuffer internally) to keep data coherent when driving animation frames, so it doesn't require dropping down to C/C++/Rust (although C#'s value types do make it easier).
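To make the data-layout point concrete, here is a minimal sketch in JavaScript. It assumes a hypothetical scene where each node has only an x position and an x velocity, stored as flat typed arrays rather than as one object per node. The names `posX`, `velX`, and `integrate` are made up for illustration and come from no particular engine.

```javascript
// Hypothetical layout: one flat Float32Array per field instead of one object per node.
const COUNT = 2800; // node count borrowed from the comment above

const posX = new Float32Array(COUNT);            // all x positions, contiguous
const velX = new Float32Array(COUNT).fill(1.5);  // all x velocities, contiguous

// One tight loop over contiguous memory: no per-object method call, no allocation.
function integrate(positions, velocities, dt) {
  for (let i = 0; i < positions.length; i++) {
    positions[i] += velocities[i] * dt;
  }
}

integrate(posX, velX, 0.5);
console.log(posX[0]); // 0.75
```

The loop touches memory sequentially, which is the cache-friendly access pattern the comment is describing. Whether it beats a per-object update in a given engine is something to measure, not assume.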
Their starting setup achieves 16000 textured and (relatively) complex models @ 30fps, which is already much more than the "optimized" thing is achieving here. And it doesn't "cheat" by pretending to be fast via simply not doing the updates that are expected. And once they apply the various optimizations with memory layout etc... they get to 150000 (!) textured models moving about on screen @ 30 fps. So let's say 75000 @ 60fps, which is more than 35x as many objects, and the objects are much more complex.
Am I missing something? Why is "2000 cubes @ 60fps" extraordinary?
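The back-of-the-envelope comparison above can be written out as follows. It assumes, as the comment does, that the achievable object count scales linearly with the frame budget, so halving the frame time halves the count. The variable names are illustrative only.

```javascript
// Figures taken from the comment above.
const modelsAt30fps = 150000;
const cubesAt60fps = 2000;

// Assumption: cost per object is constant, so 60fps allows half as many objects as 30fps.
const modelsAt60fps = modelsAt30fps / 2;

const ratio = modelsAt60fps / cubesAt60fps;
console.log(modelsAt60fps, ratio); // 75000 37.5
```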
You have to remember that this 700ms overhead is not the renderer. It's just setting fields on the underlying Three.JS Cube objects. The Three.JS renderer runs fine, and I don't even consider Three.JS to be a fast renderer. So yes, the overhead of React-Three-Fiber's tree reconciliation is seemingly massive. No, I do not have any answers for what it is doing. Nor do I have any guesses.
Thanks for clarifying. 700ms overhead is massive. It sounds like the approach is just deeply suboptimal for any sort of high performance and complicated 3d scenes. Perhaps that was not their objective...
there were multiple versions of that test, the first had nothing to do with the subject matter, the ones that people refer to (spinning cubes) had an artificial delay that was added to simulate cpu stress.
we're running in circles unfortunately and i've explained where the test you're referring to comes from and what it meant. i've posted the real test and if you want, engage in it.
async function test() {
  const chars = `!"§$%&/()=?*#<>-_.:,;+0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz`
  const font = await new Promise((res) => new THREE.FontLoader().load("https://raw.githubusercontent.com/drcmda/scheduler-test/master/public/Inter%20UI_Bold.json", res))
  console.time("test")
  for (let i = 0; i < 510; i++) {
    new THREE.TextGeometry(chars[Math.floor(Math.random() * chars.length)], {
      font,
      size: 1,
      height: 0.5,
      curveSegments: 80,
      bevelEnabled: false,
    })
  }
  console.timeEnd("test")
}
test()
// To really drive it home you'd have to repeat it every two seconds ...
// setInterval(test, 2000)
as for how react 18 concurrency works exactly, i don't think this is the right place to churn through it. the react team has published tons of reading material as well as public talks.
It's interesting because in normal React usage, the overhead of generating and throwing away a bunch of objects on each update is dwarfed by the cost of updates to the DOM.
But in Three.JS the actual updates to the tree should be cheap, right? There's no reflow, you're just setting values in memory to be used by the next render frame. If so, that changes the calculus.
There's also the fact that in an app, it's rare for actual state updates (and therefore React renders) to happen on every frame; usually it's only on interactions. Maybe the occasional animation (if it can't be handled by native CSS animations). Whereas in graphical contexts like this, it's much more likely you'll have lots of objects in continuous motion (and therefore continuous re-renders).
I can see the productivity gains being worth it for a lot of simpler use-cases, but I'm skeptical about the performance claims when you start to get into complex scenes with lots of entities.
Yeah, I can totally see the argument that it's a programming model that's well understood and you can get a productivity boost from that. However I don't think you can say that doesn't come with a cost, otherwise it would have been pretty widely adopted across the industry.
> otherwise it would have been pretty widely adopted across the industry
I'm not sure that's a fair explanation for why. It's totally possible to come up with new paradigms that are useful even though nobody's thought of them before.
I would think the main issue will be around JavaScript's tendency (cultural, syntactic, etc) to casually create and release objects all over the place, constantly. There's nothing intrinsically wrong with this, but it seems problematic for this use-case.
Example: JavaScript doesn't have named function parameters, because instead you just create and destructure an object:
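A minimal sketch of that pattern, with a hypothetical function and parameter names chosen only for illustration:

```javascript
// Hypothetical function: "named parameters" via a destructured options object.
function moveCube({ x = 0, y = 0, z = 0 }) {
  return [x, y, z];
}

// Every call site builds a fresh { ... } object just to carry the arguments
// (unless the engine manages to optimize the allocation away).
const position = moveCube({ x: 1, z: 3 });
console.log(position); // [ 1, 0, 3 ]
```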
The syntax encourages this, the React docs encourage this. JSX itself does this for every element you render. And for normal JavaScript use cases it works just fine. But when you're running this logic every frame, I would guess it will limit you at a certain point.
Despite that, I think people are onto something with the broader idea of coding a 3D scene declaratively. I'm just skeptical that React or its norms are the right path to doing it at scale.
Practically, if that function is somewhat hot, I wouldn't be surprised if V8 omitted the allocation altogether when generating optimized machine code. Internally, properties on objects already have an "order," so V8 could push each property onto the stack in that order or in reverse (depending on calling convention). And if it doesn't yet, that's not a difficult optimization to make.
That requires substantial escape analysis, which I've noticed V8 does not handle too well. There are similar issues with the for...of iteration protocol, which creates a new result object on every iteration; in my experiments it is about 10-15% slower than a C-style for loop and generates actual GC garbage.
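For reference, this is roughly what the iteration protocol does under the hood, next to the C-style equivalent. It is a sketch of the semantics only; whether the engine actually materializes each result object depends on its optimizer, and the timing difference quoted above is not reproduced here.

```javascript
const values = [1, 2, 3];

// Roughly what `for (const v of values)` desugars to: each step calls next(),
// which returns a result object of the form { value, done }.
let sumOf = 0;
const it = values[Symbol.iterator]();
for (let r = it.next(); !r.done; r = it.next()) {
  sumOf += r.value;
}

// C-style loop: index arithmetic only, no per-step result object.
let sumIndexed = 0;
for (let i = 0; i < values.length; i++) {
  sumIndexed += values[i];
}

console.log(sumOf, sumIndexed); // 6 6
```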
It seems like it would be easy to optimize, but it's one of those things where I know the V8 devs are much smarter than I am so I assume if they haven't figured out how to optimize it, it must be harder than it seems :P
Hierarchical component-based designs have been around in game-dev for ages. I first used them in '05 but I remember prior art even before then. It was pretty common to have a declarative way to define components (usually through a scripting language like Lua or sometimes a custom DSL).
I agree on the performance aspect, any inner-loop stuff always was down in a native language or heavily JIT'd path, but even then data layout drove it even more which usually required structuring the upstream systems ahead of the core logic. It's the reason why there's no "one-size fits all" game engine. They all make very discrete trade-offs in terms of entity counts, open world vs constrained layout and the like.
The claim that a general purpose web frontend framework adds no overhead to GPU intensive animation is indeed bizarre. The author seems to believe his own marketing that his framework is the end-all-be-all.