graphics - Cache-Friendly Scene Graph Storage for Hierarchical Transforms

Context

I am optimizing hierarchical transformation updates in a scene graph. My goal is to minimize cache misses during the parent-child transform propagation.

A common object-oriented scene graph suffers from lots of indirect accesses, so I tried a topologically sorted (i.e. the nodes are sorted by non decreasing ParentID) vector representation:


struct Mat4x4 {
    float e00, e01, e02, e03;
    float e10, e11, e12, e13;
    float e20, e21, e22, e23;
    float e30, e31, e32, e33;
};

struct Node {
    Mat4x4 Transform;
    int ParentID;
};

std::vector<Node> nodes;

The topological sort of the nodes vector is necessary to ensure correct order of propagation, i.e. not updating a children transform before its parent; the update process propagates transforms linearly:

for (size_t i = 0; i < nodes.size(); ++i) {
    if (nodes[i].ParentID >= 0) { // the node has a parent
        const Node &parent = nodes[nodes[i].ParentID];
        nodes[i].Transform = ApplyTransform(parent.Transform, nodes[i].Transform);
    }
}

The hot point within this loop seems to be the access to the parent node, and not the actual ApplyTransform computations. This observation comes from running a simple benchmark of the update function on a scene containing ~2000 characters, each represented as a tree of ~50 elements (a total of ~100000 nodes). In that case perf stat gives

stalled-cycles-backend:u         #   48.04% backend cycles idle

and by looking at recorded profiling data, most of the time is spent on movups and movaps operations, indicating frequent cache misses when loading the data for the computations. The benchmark code was compiled using g++ -O3, and is run on a single core.

Question

I was wondering: How can I improve the memory layouts or traversal strategies for better cache efficiency for hierarchical transforms? Are there any specific techniques or patterns used in large scale real-time rendering for solving this problem?

I am not looking for open-ended discussions on scene graphs but rather concrete data structures or patterns that improve memory locality.

科技改变生活-雨落星辰 - 所有的伟大,都源于一个勇敢的开始

graphics - Cache-Friendly Scene Graph Storage for Hierarchical Transforms - Stack Overflow

Context

Question

与本文相关的文章

评论列表(0)