DXScene.MaxBackgroundThreadsCount
MaxBackgroundThreadsCount gets or sets an integer that specifies the maximum number of background threads that can be used for multi-threaded operations.
Examples: a value of 0 means that no background threads are used and all rendering is performed on the main thread. A value of 4 means that, besides the main thread, up to four additional background threads can be used for rendering.
The default value of MaxBackgroundThreadsCount is set as follows:
With only 1 processor (Environment.ProcessorCount), MaxBackgroundThreadsCount is set to 0 - no multi-threading.
With 8 or fewer processors, all processors are used - one for the main thread and (ProcessorCount minus 1) for background threads.
With more than 8 processors, MaxBackgroundThreadsCount is set to 7 by default (one main thread and 7 background threads). Using more threads brings only very small gains compared with the additional overhead of managing them.
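The default rule above can be written as a small function. The following is only an illustrative sketch in Python, not DXEngine code; the function name is hypothetical and the processor count is passed in as a plain integer instead of being read from Environment.ProcessorCount.

```python
def default_max_background_threads(processor_count: int) -> int:
    """Return the default described above for a given processor count.

    Hypothetical helper for illustration only; it is not part of DXEngine.
    """
    if processor_count <= 1:
        return 0  # single processor: no multi-threading
    if processor_count <= 8:
        return processor_count - 1  # one processor is left for the main thread
    return 7  # cap: more background threads bring very small gains


# Print the default for a few processor counts.
for count in (1, 2, 4, 8, 16):
    print(count, default_max_background_threads(count))
```

Note that 8 processors and 16 processors both end up with 7 background threads, so the cap only matters on machines with more than 8 processors.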
Overview of multi-threading in Ab3d.DXEngine
In DXEngine all the SceneNodes and RenderingQueue objects are created on the same thread on which the DXScene object was created (usually the UI thread). This ensures that all the memory write operations are correct.
After RenderingQueues are created, the RenderableObjects in the queues are immutable (cannot be changed) and this means that rendering the objects (setting up ConstantBuffers, DirectX states and issuing DirectX draw calls) can be done on multiple threads.
Still, the DXEngine's Effect that is used to render the object needs to support background rendering. In the current version of the DXEngine only the StandardEffect supports background thread rendering. This effect is the most commonly used effect because it is used to render all WPF objects (using WpfMaterial) and all objects with StandardMaterial. But 3D lines, objects that use instancing, per-vertex color material or some other effect are not rendered on the background thread. Those objects are always rendered on the main thread.
This may not seem optimal. But in most cases this approach provides an advantage because complex objects (instancing, 3D lines) are immediately sent to the GPU from the main thread. While those objects are already being processed by the graphics card, the DXEngine can create command lists to render standard objects in the background threads.
Also, all the transparent objects are always rendered on the main thread. This is needed because the order in which the transparent objects are rendered needs to be preserved.
The maximum number of used background threads is defined by the MaxBackgroundThreadsCount property.
The number of threads used is also determined by the number of objects to render (by default at least 100 objects per thread - defined by MinObjectsPerThread). This means that the number of threads actually used can be lower than MaxBackgroundThreadsCount when only a small number of objects need to be rendered.
For example, if there are 320 objects to render, MaxBackgroundThreadsCount is set to 7 and MinObjectsPerThread is set to 100 (the default), then only 2 background threads (and one main thread) will be used to render all the objects.
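The text does not give the exact formula DXEngine uses, so the sketch below is an assumption chosen to reproduce the example above: the total number of threads is the object count divided by MinObjectsPerThread (rounded down, at least 1), one of those threads is the main thread, and the rest are capped by MaxBackgroundThreadsCount. The function name is hypothetical and the code is Python, not DXEngine code.

```python
def background_threads_used(object_count: int,
                            max_background_threads: int = 7,
                            min_objects_per_thread: int = 100) -> int:
    """Estimate how many background threads would be used.

    ASSUMPTION: this formula is inferred from the documented example
    (320 objects -> 2 background threads); the real rule may differ.
    """
    # Each thread, including the main thread, should get at least
    # min_objects_per_thread objects.
    total_threads = max(object_count // min_objects_per_thread, 1)
    # One of the threads is the main thread; the rest run in the background.
    return min(total_threads - 1, max_background_threads)


print(background_threads_used(320))    # 320 // 100 = 3 threads -> 2 background
print(background_threads_used(50))     # too few objects -> main thread only
print(background_threads_used(5000))   # capped by max_background_threads
```

Under this assumption a scene needs at least 200 objects before the first background thread is used, and at least 800 objects before all 7 are used.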
It is also possible to fine-tune the use of multi-threading by setting the RenderObjectsRenderingStep.UseMultiThreading (enable or disable multi-threading in the specified RenderObjectsRenderingStep) or RenderingQueue.UseMultiThreading (enable or disable multi-threading in the specified RenderingQueue) properties.
DXEngine does not use .NET Tasks for multi-threading. The reason is that a Task object cannot be reused once it has completed. This means that if Tasks were used in DXEngine, new Task objects would need to be created for each thread in every frame. This would create a significant number of new objects. Therefore DXEngine uses its own thread manager (BackgroundThreadsManager). A BackgroundThreadsManager is set to the DXDevice.BackgroundThreadsManager property and is used by all DXScene objects that use that DXDevice (the same threads are used to render the objects). But if you want a DXScene to use its own threads for background rendering, you can assign a BackgroundThreadsManager to its BackgroundThreadsManager property (null by default - in this case the BackgroundThreadsManager from DXDevice is used).
BackgroundThreadsManager creates background threads and executes rendering actions there. It is always possible to get the number of created threads by checking its ThreadsCount property. Because the number of threads used is also determined by the number of shown objects, and because that number can change, the BackgroundThreadsManager does not immediately abort a thread when it is no longer required. Instead it waits for one second (by default) and aborts the thread only if it is still not needed.
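The idea of a long-lived thread that is reused across frames and retired only after an idle period can be sketched as follows. This is a minimal Python illustration of the general pattern, not the BackgroundThreadsManager implementation; the class name, method names and the idle_timeout_seconds parameter are all hypothetical.

```python
import queue
import threading


class ReusableWorker:
    """One long-lived thread that runs queued actions and retires when idle.

    Hypothetical sketch of the pattern; not DXEngine's BackgroundThreadsManager.
    """

    def __init__(self, idle_timeout_seconds: float = 1.0):
        self._actions = queue.Queue()
        self._idle_timeout = idle_timeout_seconds
        self._lock = threading.Lock()
        self._thread = None

    def submit(self, action) -> None:
        """Queue an action; start a thread only if none is currently running."""
        with self._lock:
            self._actions.put(action)
            if self._thread is None:
                self._thread = threading.Thread(target=self._run, daemon=True)
                self._thread.start()

    def is_running(self) -> bool:
        with self._lock:
            return self._thread is not None

    def _run(self) -> None:
        while True:
            try:
                # Wait for work; give up after the idle timeout.
                action = self._actions.get(timeout=self._idle_timeout)
            except queue.Empty:
                with self._lock:
                    if self._actions.empty():
                        # Still nothing to do: retire this thread.
                        self._thread = None
                        return
                continue  # work arrived just after the timeout; keep going
            action()


# Usage: the same thread serves every action submitted within the idle window.
worker = ReusableWorker(idle_timeout_seconds=1.0)
finished = threading.Event()
worker.submit(finished.set)
finished.wait()
```

Keeping the thread alive for a short time avoids repeatedly creating and destroying threads when the number of rendered objects fluctuates around a threshold from frame to frame.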
Performance gains from multi-threaded rendering can be substantial - in some cases it is possible to render more than 4 times the number of objects in the same time.
Multi-threading works best when your rendering process is CPU bound. This means that, to render a 3D scene, the CPU needs to work for longer than the GPU - most of the time is spent issuing draw commands to the graphics card. An example of such a 3D scene is one with many simple meshes (note that when the meshes are the same, you will get even better performance by using object instancing).
Another factor that significantly affects the performance gained from multi-threading is the value of the PresentationType parameter in DXViewportView. When DirectXOverlay is used, DXEngine does not need to wait until the graphics card finishes rendering the scene. This means that if the drawing process (sending DirectX commands to the graphics card - measured by DrawRenderTimeMs in the rendering statistics) is 4 times faster because of multi-threading, the DXEngine will be able to render 4 times as many objects in the same time (given a graphics card that is fast enough). But when DirectXImage is used (the default), the results are less dramatic. The reason is that in this case the DXEngine needs to wait until the image is fully rendered by the GPU. Because of multi-threading the drawing time can still be significantly lower, but with DirectXImage the time spent waiting for the GPU (CompleteRenderTimeMs) can be much longer than without multi-threading. Even so, multi-threading can still provide significant performance gains with DirectXImage.