Why Spatial Partitioning is Not the Answer to Scaling the Metaverse
By Tobin Ireland, Chief Business Officer @ MetaGravity.
It has been a known problem in the games development industry for over 10 years: although games engines are fantastic for gameplay development, fidelity and rendering, they can only support multiplayer games up to a limit of 40 to 100 concurrent users (CCUs) in any given instance.
Games such as Minecraft, Roblox, Fortnite, World of Warcraft, Legends of Mir, Final Fantasy and Diablo 4 have hundreds of thousands of online players, yet those players are limited to playing in small, self-contained rooms of fewer than a hundred players.
There’s a good reason why Fortnite’s Battle Royale is capped at 100 players. Any more would either cost too much in cloud computing or crash due to unscalable game engine dynamics. Or both.
Scaling Metaverse Games
As the games industry builds the metaverse, this type of sharded gameplay experience becomes a significant limitation. This is due to the heavy compute requirements for simulating massively multiplayer online role-playing games (MMORPGs) in terms of scale and complexity.
As the number of online players increases, so does the difficulty of rendering complex interactive objects such as avatars, gameplay collisions and virtual worlds with full fidelity and resolution.
Many companies have tried to solve the problem of simulation scale, including leading games engines, cloud providers, games studios and deep tech start-ups. All have failed so far to deliver a platform that can properly support a large-scale commercial game simulation and today it remains a big unsolved problem.
Limitations of the Current Approach
Most of the attempts to increase simulation scale beyond the limits of a single game instance have used a technique called spatial partitioning.
This approach works on the premise that, instead of running many separate 40–100 player rooms, a single game world is spread across multiple servers by breaking up its 3-dimensional space into a 2-dimensional grid, where each tile is simulated by a server node.
As the scale of the simulation expands, additional resources are allocated using space partitioning data structures such as binary space partitioning trees, quadtrees and octrees. This means that cloud servers can be dynamically scaled up and down to support changes in simulation complexity. However, as the simulation scales, each tile needs to be synchronised at every tick of the game’s loop (typically around 25 times per second) in order to maintain the simulation’s consistent reality.
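To make the grid idea concrete, here is a minimal sketch of how a player's 3D position might be mapped to a tile, and that tile to a server node. It is an illustration only, not how any particular platform works: the tile size, the 4x4 grid, the one-node-per-tile mapping and all function names are assumptions made for this example.

```python
from typing import List, Tuple

Tile = Tuple[int, int]

TILE_SIZE = 100.0      # hypothetical tile edge length, in world units
GRID_W, GRID_H = 4, 4  # hypothetical 4x4 grid, one server node per tile


def tile_for(position: Tuple[float, float, float]) -> Tile:
    """Project a 3D position onto the 2D grid (height is ignored)."""
    x, _height, z = position
    # Clamp so that positions outside the world land on an edge tile.
    col = min(max(int(x // TILE_SIZE), 0), GRID_W - 1)
    row = min(max(int(z // TILE_SIZE), 0), GRID_H - 1)
    return (col, row)


def node_for(tile: Tile) -> int:
    """Assumed static mapping: each tile is owned by exactly one node."""
    col, row = tile
    return row * GRID_W + col


def neighbours(tile: Tile) -> List[Tile]:
    """Tiles sharing an edge or corner, which must be synced every tick."""
    col, row = tile
    result = []
    for dc in (-1, 0, 1):
        for dr in (-1, 0, 1):
            if (dc, dr) == (0, 0):
                continue
            c, r = col + dc, row + dr
            if 0 <= c < GRID_W and 0 <= r < GRID_H:
                result.append((c, r))
    return result


player = (250.0, 30.0, 120.0)
tile = tile_for(player)
print(tile, node_for(tile), len(neighbours(tile)))  # (2, 1) 6 8
```

Note that the partition depends only on where an object is, not on what it is doing. A player standing near a tile border is already relevant to two or more server nodes.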
Companies such as Improbable drove a lot of excitement in the games development industry when they first deployed their SpatialOS platform (now rebranded M²) into the cloud. And they did achieve a 10x increase in scale over what is possible in a full physics game simulation within a games engine like Unreal or Unity.
Hadean (note: MetaGravity’s founder, Rashid Mansoor, was the founder of Hadean between 2015 and 2021) proved it was possible to achieve a 15x improvement in scale over Improbable by building a spatial partitioning engine on top of the highly scalable cloud compute platform Hadean OS.
In 2019 Hadean successfully demonstrated 14,000+ live globally-networked players in a single game environment with CCP Games for Eve Aether Wars, which today remains the world record for live players in a single game instance.
Problem solved then?
Unfortunately not. Although the solution is logical and elegant in principle, it has actually proven to be extremely difficult to execute a spatial partitioning model in practice. And no games have ever been commercially released or supported on this model.
There are four core reasons for this:
- Data synchronisation burden: Spatial partitioning is blind to what is actually going on within the game: it simply splits the world along 2-dimensional spatial lines. To ensure consistency, the game needs to be pessimistically synchronised about 25 times per second across the entire grid of servers. Every tile needs to be synchronised with its neighbours as well as transitively with every other spatial tile across the entire game. This means that as the game scales, the data synchronisation burden grows far faster than the number of tiles. There are also difficult edge cases, such as the sniper problem (a player who can see and hit targets several tiles away), the movement of groups of objects over grid intersections, and the presence of physical objects such as walls over spatial grid boundaries. All of this means the scale of the simulation is ultimately limited by the laws of physics (the speed of light and network latency), and it makes the spatial partitioning approach extremely expensive.
- Scale: Spatial partitioning raised the limits of scale above what is possible within games engines, but only to thousands of participants, or 10 to 150 times the 40–100 user limit of a games engine. This has forced tradeoffs between concurrent users, AI complexity, the level of persistent interactive objects in the world, the real-physics complexity of the gameplay, and the fidelity and resolution of the graphics, as the total compute budget for the game is always limited. The future requirements of MMORPGs and the metaverse far exceed this level of simulation scale.
- Cost: Spatial partitioning has turned out to be 10 to 300 times more expensive (on an average cost per user per month basis) than existing multiplayer networking solutions. With spatial partitioning platforms, costs rise disproportionately as scale limits are pushed. Games publishers therefore make less profit as games scale with spatial partitioning, which may explain why the games industry has historically sometimes lacked enthusiasm for creating new games experiences that push simulation scale boundaries.
- Ease-of-use: All existing spatial partitioning platforms require games to be rebuilt by studio teams writing lower-level code outside of the major game engine development environments. It turns out that the game logic and data structures required to run a game on a 2D-grid spatial partitioning platform are unnatural for the way games are built in games engines, so significant rebuild work is always going to be required. It has been near-impossible to port an existing game, or one whose development has already started, to a spatial partitioning platform. This has drastically limited the market potential for this technology approach.
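The synchronisation burden in the first point can be put into rough numbers. The sketch below counts how many tile-to-tile links an n-by-n grid has, comparing neighbour-only synchronisation with the case where every tile must stay consistent with every other tile. It assumes one exchange per link per tick at 25 ticks per second. This is a deliberate simplification for illustration, and the function names are hypothetical.

```python
TICK_RATE = 25  # ticks per second, the figure used in this article


def neighbour_links(n: int) -> int:
    """Links between adjacent tiles (edges and corners) on an n x n grid."""
    straight = 2 * n * (n - 1)     # horizontal plus vertical adjacencies
    diagonal = 2 * (n - 1) ** 2    # both diagonal directions
    return straight + diagonal


def all_pairs_links(n: int) -> int:
    """Links if every tile must stay consistent with every other tile."""
    tiles = n * n
    return tiles * (tiles - 1) // 2


def syncs_per_second(links: int, tick_rate: int = TICK_RATE) -> int:
    """Assumes one exchange per link per tick."""
    return links * tick_rate


for n in (2, 4, 10):
    print(
        n * n,
        "tiles:",
        syncs_per_second(neighbour_links(n)),
        "neighbour syncs/s,",
        syncs_per_second(all_pairs_links(n)),
        "all-pairs syncs/s",
    )
```

Under these assumptions, going from 16 tiles to 100 tiles multiplies the server count by about 6, but the all-pairs links go from 120 to 4,950, roughly 41 times as many. The count grows with the square of the number of tiles, which is why adding servers does not buy a proportional increase in scale.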
A new approach to distributed cloud gaming infrastructure is required
Due to these limitations, spatial partitioning will only ever be a partial solution to game simulation scale. Rendering workarounds are possible to give the impression of scale, but these force tradeoffs between proper physics gameplay, avatar complexity and game resolution. Even a near-perfect engineering deployment of a spatial partitioning model will still face the same technical scale constraints, which can never be resolved, and its cloud operating costs and energy consumption will always prove uneconomic.
The games development and metaverse industries urgently need a new solution to simulation scale: one that is based on new mathematical breakthroughs and that also solves the linked cloud operating cost, electricity consumption, climate impact and ease-of-use limitations of the past.
Once the scale and cost problems are properly solved, games developers will start creating amazing experiences that we have not even dreamed of yet, and an ecosystem of innovation in metaverse gaming experiences will flourish.