Optimization of Draw Calls


I want to talk about what I think is a pretty cool trick for reducing the number of draw calls in Netstorm 2, an RTS.  Dealing with draw calls in an RTS is more challenging than in, say, an FPS because the player can create a large number of objects over the course of the game.  In an FPS, you mostly know in advance how many objects there will be (except maybe for missiles).  In an RTS, the point of the game is to dynamically create units for your army, and each one of these might be a draw call, which means it can get pretty expensive, pretty quickly.

The Bridges of RisingStorm County

There are a large number of bridge units built in a game of Netstorm

Netstorm has an additional challenge: one of the principal “units” a player will place is a bridge.  Bridges are free (although there is a limit on how many you can place every few seconds) and are used to maneuver.  It is in the player’s best interest to place bridges as fast as possible.  In an average game, hundreds of bridge pieces get built.

The bridges have Tetris-like shapes

Each bridge piece is a Tetris-like shape made up of smaller atomic pieces, e.g., a straight, a corner and a tee.  Under normal circumstances each of these atomic pieces would be a separate draw call.  Thanks to the Mesh Combiner script provided with Unity, they are very easily combined into the Tetris-like shapes.  However, that still leaves hundreds of Tetris-like shapes to deal with.
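To make the idea of combining concrete, here is a minimal sketch in plain C# of what merging several pieces into one mesh boils down to: concatenate the vertex data and shift each piece's triangle indices by the number of vertices already written.  This is not the Mesh Combiner script itself.  The SimpleMesh and MeshMerge names are made up for illustration, and the sketch assumes all positions are already expressed in a shared coordinate space.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a mesh: positions flattened as x,y,z triples,
// plus triangle indices into those positions.
public sealed class SimpleMesh {
    public float[] Positions = new float[0];
    public int[] Triangles = new int[0];
    public int VertexCount { get { return Positions.Length / 3; } }
}

public static class MeshMerge {
    // Merges the parts into a single mesh so they could be drawn in one call.
    // Assumes every part's positions are already in the same coordinate space.
    public static SimpleMesh Combine (IEnumerable<SimpleMesh> parts) {
        var positions = new List<float> ();
        var triangles = new List<int> ();
        foreach (SimpleMesh part in parts) {
            // Indices of this part must skip past the vertices written so far.
            int baseVertex = positions.Count / 3;
            positions.AddRange (part.Positions);
            foreach (int index in part.Triangles) {
                triangles.Add (index + baseVertex);
            }
        }
        return new SimpleMesh {
            Positions = positions.ToArray (),
            Triangles = triangles.ToArray ()
        };
    }
}
```

The cost is visible in the loops: every vertex and index of every piece is copied each time, which is why it pays to combine rarely rather than on every change.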

Small diversion:  A few years ago, a friend of mine and I worked on our own game engine.  For better or worse, I pushed us into using threads aggressively.  One cool benefit was that, since systems ran in independent threads, we could avoid the situation where the main game loop has to run as fast as possible to service any possible need.  Thus the CPU would only be used as much as it was needed, which for many situations was hardly at all.

In order to deal with these many independent systems we set up a publish/subscribe mechanism to decouple dependencies between the systems.  We also had several systems which would only run based on certain events happening, rather than running in a loop.  We created a central dispatch and scheduler which we called the Hub to manage the execution of these systems and dispatch messages to them.

Recently, I saw a way of bringing some aspects of the Hub back into Unity, specifically the dispatcher and the scheduler.  The dispatcher uses Unity’s SendMessage interface, but instead of needing to know which game object you want to send to, you just send the message to the dispatcher.  Any GameObject that wants to see that message simply needs to register for it.  One of the ways I use this inside Netstorm is to turn shadows off when the frame rate is too low and back on when it is safely high enough.  When the FPS drops below 15, a message is sent to the dispatcher saying “LowFPS” and the shadows are disabled.  Once the FPS goes back above 60, another message is sent and shadows are turned back on.  Because the FPS checker doesn’t know who is getting the message, other effects could also register for the low and high FPS messages to decide when to turn themselves off.  I could also use this for Level of Detail changes.
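As a rough illustration of that pattern, here is a minimal sketch in plain C# with no Unity dependency.  It is not the Geewhiz dispatcher: the MessageDispatcher and FpsWatcher classes are hypothetical, and the “HighFPS” message name is an assumption (only “LowFPS” is named above).  The two thresholds, 15 and 60, keep the shadows from flickering on and off when the frame rate hovers around a single cutoff.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical name-based dispatcher: senders and receivers only share a message name.
public sealed class MessageDispatcher {
    private readonly Dictionary<string, List<Action>> handlers =
        new Dictionary<string, List<Action>> ();

    // Any interested party registers a callback for a message name.
    public void Register (string message, Action handler) {
        List<Action> list;
        if (!handlers.TryGetValue (message, out list)) {
            list = new List<Action> ();
            handlers[message] = list;
        }
        list.Add (handler);
    }

    // The sender does not know (or care) who is listening.
    public void Dispatch (string message) {
        List<Action> list;
        if (!handlers.TryGetValue (message, out list)) return;
        // Iterate over a copy so handlers may register others while running.
        foreach (Action handler in list.ToArray ()) {
            handler ();
        }
    }
}

// Hypothetical FPS checker with two thresholds (hysteresis).
public sealed class FpsWatcher {
    private readonly MessageDispatcher dispatcher;
    private readonly float lowThreshold;
    private readonly float highThreshold;
    private bool low; // true after "LowFPS" was sent and before "HighFPS"

    public FpsWatcher (MessageDispatcher dispatcher,
                       float lowThreshold = 15f, float highThreshold = 60f) {
        this.dispatcher = dispatcher;
        this.lowThreshold = lowThreshold;
        this.highThreshold = highThreshold;
    }

    // Feed in the measured frame rate; a message goes out only on a state change.
    public void Report (float fps) {
        if (!low && fps < lowThreshold) {
            low = true;
            dispatcher.Dispatch ("LowFPS");
        } else if (low && fps > highThreshold) {
            low = false;
            dispatcher.Dispatch ("HighFPS"); // assumed message name
        }
    }
}
```

A shadow controller would call Register for both messages and toggle itself in the callbacks; a Level of Detail system could register for the same two names without the watcher changing at all.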

But, going back to the bridges, there’s an even cooler trick.  Running the mesh combiner is expensive, and Unity’s dynamic batching is limited to small meshes (on the order of 300 vertices, depending on the shader’s vertex attributes).  It would be nice to have something in between that we can choose to run as needed.  This is where the scheduler comes in.

It works as follows: when a bridge is placed, it checks whether it has been connected to another bridge or is independent.  If it’s independent, it creates a parent object that contains a scheduled mesh combiner.  This mesh combiner will run 3 seconds after it has started.  However, if another bridge is placed connected to it, the new bridge parents itself under the existing bridge’s parent and the scheduled combine is pushed back by 3 seconds.  Once the player stops building a particular bridge path for 3 seconds, the combiner finally runs and combines all the bridges.

Here’s what the code looks like:

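using System.Collections.Generic;
using UnityEngine;

public class MeshRecombiner : MonoBehaviour {
    // Seconds to wait after the last change before recombining.
    public float recombineDelay = 10.0f;

    // True while a recombine is pending.
    private bool dirty;

    // Invoked by the Scheduler once the delay has elapsed.
    void CheckRecombine () {
        if (dirty) {
            Recombine ();
            dirty = false;
        }
    }

    // Called when a child is added, removed or modified.
    void MarkDirty () {
        // Snapshot the children first: unparenting a child while
        // enumerating the transform can cause siblings to be skipped.
        List<Transform> children = new List<Transform> ();
        foreach (Transform child in transform) {
            children.Add (child);
        }
        foreach (Transform child in children) {
            if (child.name == "Combined mesh") {
                // Throw away the stale combined mesh.
                child.parent = null;
                Destroy (child.gameObject);
            } else {
                // Let each piece re-enable its own renderers.
                child.gameObject.SendMessage ("MarkDirty", null,
                    SendMessageOptions.DontRequireReceiver);
            }
        }
        if (!dirty) {
            Scheduler.GetInstance().AddSchedule (recombineDelay, "CheckRecombine", false, gameObject);
            dirty = true;
        } else {
            // Already pending: schedule again so the combine is pushed back.
            Scheduler.GetInstance().AddSchedule (recombineDelay, "CheckRecombine", gameObject);
        }
    }

    // Recombine () is not shown here; it is described below.
}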

Notice a few things.  First off, there is a function called “Recombine”, which is called by “CheckRecombine”.  “CheckRecombine” is what the scheduler calls.  “Recombine” is essentially the “Start” routine from the “MeshCombiner” class that ships with Unity.  One change is that a child object named “Combined mesh” is always created, even when there is only one material.  This simplifies the “MarkDirty” code.  The “MarkDirty” function is called when a child is added, removed or modified in some way that requires recombining (e.g., a material has changed on the child).  “MarkDirty” then discards all the current combined meshes and sends a message to all the children, which should enable their individual renderers again.  Finally, it sets up the scheduler to call “CheckRecombine” after the delay.
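The push-back behavior that the scheduler provides here is essentially a debounce.  As a minimal sketch of just that idea, in plain C# and independent of both Unity and the real Scheduler, the hypothetical DebouncedAction class below restarts its countdown on every Touch and runs its action only once a full delay has passed without one.  The explicit time parameter is an assumption made to keep the sketch deterministic; a real implementation would read the game clock.

```csharp
using System;

// Hypothetical debounce helper: each Touch pushes the deadline back by the full delay.
public sealed class DebouncedAction {
    private readonly double delay;
    private readonly Action action;
    private double deadline;
    private bool pending;

    public DebouncedAction (double delaySeconds, Action action) {
        this.delay = delaySeconds;
        this.action = action;
    }

    // Call whenever something changes (e.g., a bridge piece is added).
    // Starts the countdown, or restarts it if one is already running.
    public void Touch (double now) {
        deadline = now + delay;
        pending = true;
    }

    // Call periodically with the current time.  Runs the action at most once
    // per quiet period and returns true when it did.
    public bool Tick (double now) {
        if (!pending || now < deadline) return false;
        pending = false;
        action ();
        return true;
    }
}
```

With a delay of 3 seconds, pieces placed at t = 0 and t = 2 lead to a single combine at t = 5 rather than one combine per piece.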

Here’s an example of a simple child which can be combined:

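function Start () {
    // Tell the parent combiner that a new piece has arrived.
    transform.parent.gameObject.SendMessage ("MarkDirty");
    enabled = false;
}

// Sent by the parent when its combined mesh is discarded.
function MarkDirty () {
    // Show the individual renderers again until the next combine.
    for (var r : Renderer in GetComponentsInChildren (Renderer)) {
        r.enabled = true;
    }
}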

The final piece of this code is, of course, the Dispatcher and Scheduler.  However, they are much too large to copy and paste here.  Instead, they are on GitHub at https://github.com/tosos/Geewhiz under the project Geewhiz, which is the name of the previous game engine I worked on.

Using this method in a situation like the screenshot above, with all those bridges, the draw calls can be reduced from something like 700 in that image (with all effects enabled, including shadows) to on the order of 200.  Had it been enabled when the screenshot was taken, it would likely have taken the frame rate from 3 to something more like 15, at least.  These days it is possible to get much higher frame rates during multiplayer games of Netstorm.

Hopefully you will be able to get similar speedups with dynamically created objects using a combination of these techniques.  Drop a line in the comments here if you find some use for them.
