Improving Your Node Graph
Prune Locations as Early as Possible
The Prune node is used to remove objects from the scene. Given the processing model used by Geolib3-MT, the sooner unneeded scene graph locations/objects are pruned from the scene, the less work required from both downstream Ops and Geolib3-MT itself. In this respect, Prune can be thought of as providing a filtering operation, reducing the size of the scene graph that must be processed by downstream Ops.
In the following example, assets are loaded by the AssetsIn node producing a scene graph of 100 locations. Bounding boxes are calculated by the ProcessBounds Op on all 100 locations and then unneeded scene graph locations for this shot are then pruned from the scene. However, if the Prune node is placed closer to the AssetsIn node, the scene graph processed by ProcessBounds is half the size. This directly translates to both memory and processing cost savings.
Prune scene graph objects/locations as early as possible. Doing so reduces the number of locations downstream Ops must process which corresponds to a reduction in both memory and scene processing time. |
Understand Parallel Scene Processing Dimensions
Geolib3-MT’s scene graph processing system searches for computationally independent tasks which can be evaluated in parallel. Broadly speaking, there are two dimensions of parallelism Geolib3-MT looks to exploit:
- Scene Graph Parallelism - In a deferred/lazily evaluated scene graph such as Katana, a scene graph contains potential work. It is potential because it is only computed once a scene graph location and its children become known through expansion.
Each child of a scene graph location represents a computationally independent task, depending only on its parent location. Once a scene graph location has been computed, all of its children can be computed in parallel. -
Op Tree Parallelism - Sub sections of the Op tree can be processed in parallel. In doing so Geolib3-MT fully expands the scene at the sub section of the Op tree it’s processing. Some ops are naturally parallelizable, such as source Ops (i.e. those Ops with no inputs). Some constructs are also naturally parallelizable, such as Merge Ops, in which each Op tree branch is independently computable.
Exploit parallelism in both dimensions to provide Geolib3-MT with the maximum backlog of known computationally independent tasks to process.
Use Merge Branches for Computationally Independent Scene Graphs
Each branch of a Katana node graph can be thought of as producing an independent scene graph, which are combined through the use of a Merge node.
As each branch of the Merge node represents a computationally independent scene graph, Geolib3-MT exploits this fact to expand each branch in parallel. If this behavior is not desired, it can be disabled in the RenderSettings node. To benefit from this parallel expansion of Op tree branches there are a number of things to consider:
- Consider your scene graph not as one large scene graph but multiple smaller scene graphs, each produced by one or more connected Ops.
- Consider the data dependencies that exist in your scene graphs and the nodes an Ops responsible for producing them. Are there opportunities to refactor your node graph to take advantage of the parallel processing available from multiple independent branches?
- Identify CPU-bound operations that must operate on multiple locations. Whilst Geolib3-MT exploits parallelism available within the scene graph itself (by processing independent locations in parallel), CPU bound operations could also be evaluated in parallel by duplicating CPU bound nodes and ops.
|
In this example, whilst the three scene graphs produced by the Alembic_In nodes are generated in parallel, the parallelism available to the CPUBoundOperation node is limited to the scene graph only. |
Example refactored node graph version in which the CPUBoundOperation is placed within each branch, which means that parallelism can be taken advantage of in both dimensions (scene graph and Op tree). |
Place chains of collapsible Nodes/Ops together
In order to process a scene graph location at a particular Op, Geolib3-MT must perform a number of steps, including but not limited to:
- Ensuring the location’s dependencies (parent location and the location it inherits its attributes from) have also been evaluated.
- Memory has been allocated to store the results of processing the scene graph location.
- Any inherited scene graph attributes have been applied prior to cooking the location.
- Evaluate the actual location using the Op’s cook function.
Whilst this process is efficient, it is not cost-free. Checking dependencies requires checking a central cook result store, memory allocation requires a (potential) system call and the cook call incurs a small overhead to call the function (even if the cook() call doesn’t mutate the scene graph).
With this in mind, anything that can be done to reduce the number of cooks has the potential to improve scene graph processing time and reduce memory usage.
Geolib3-MT’s new Op tree optimization feature performs a preprocessing step to analyze the topology of the Op tree. One such optimization is the collapsing of chains of Ops of the same type. For example, a series of four AttributeSet Ops acting on the same set of locations may be collapsed into a single AttributeSet acting on that set of locations. This reduces the number of cook results in the caching subsystem, thus reducing the memory footprint of scene traversal.