Microsoft’s new-generation DirectStorage API is designed to let games take full advantage of the high read speeds of NVMe SSDs, loading large numbers of small game assets at gigabytes per second while reducing CPU overhead.
Until now, games have loaded data through ReadFile-based I/O, but as the amount of data grows and the size of each individual request shrinks, CPU usage often soars. In addition, to shorten download and installation times, developers compress game data and textures, so that data must be decompressed whenever the game loads it.
Previously, the CPU carried the burden of that decompression, which made load times long and kept games from exploiting the high read speeds of NVMe SSDs. The DirectStorage API has already been implemented on the Xbox console, although the console relies on NVMe hardware queues to manage I/O and on hardware-accelerated decompression.
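The traditional path described above, where many small compressed files are read one request at a time and then decompressed on the CPU, can be illustrated with a minimal sketch. This is not DirectStorage or ReadFile code: it uses Python's standard file I/O and zlib as stand-ins, and the file names, asset count, and asset size are hypothetical values chosen only for illustration.

```python
import os
import tempfile
import time
import zlib


def write_compressed_assets(directory, count=200, size=64 * 1024):
    """Create `count` small zlib-compressed files; return their paths and raw payloads."""
    paths, payloads = [], []
    for i in range(count):
        # Repetitive bytes compress well; a hypothetical stand-in for game data.
        raw = bytes([i % 251]) * size
        path = os.path.join(directory, f"asset_{i:04d}.bin")
        with open(path, "wb") as f:
            f.write(zlib.compress(raw))
        paths.append(path)
        payloads.append(raw)
    return paths, payloads


def load_assets_on_cpu(paths):
    """Read each file with its own blocking call, then decompress it on the CPU."""
    loaded = []
    for path in paths:
        with open(path, "rb") as f:
            compressed = f.read()  # one I/O request per small file
        loaded.append(zlib.decompress(compressed))  # CPU-side decompression
    return loaded


with tempfile.TemporaryDirectory() as tmp:
    asset_paths, originals = write_compressed_assets(tmp)
    start = time.perf_counter()
    assets = load_assets_on_cpu(asset_paths)
    elapsed = time.perf_counter() - start

print(f"loaded {len(assets)} assets in {elapsed:.4f}s of wall-clock time")
```

Every asset costs a separate read call plus CPU time for decompression, which is the per-request overhead that grows as requests become smaller and more numerous.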
The DirectStorage API on Windows 11 is very similar to the Xbox version, which is intended to make it easier for game developers to target both platforms.
Microsoft has published the source code of “BulkLoadDemo” in the microsoft/DirectStorage project on GitHub. Users can build the project into an executable with Visual Studio for testing.
Before testing, press Win + G to open the Xbox Game Bar, then go to Settings > Gaming features to check whether the computer supports DirectX 12 Ultimate and whether the GPU and storage devices support DirectStorage.
BulkLoadDemo uses DirectStorage to load a large number of models and reports figures such as load time, CPU usage, and bandwidth. According to the project's introduction, it is built on MiniEngine with support for glTF 2.0 models.
First, BulkLoadDemo was run from the PCIe 4.0 NVMe SSD (the system disk). The measured load time was 0.9s, with 0.96% CPU usage and 9.62GB/s of bandwidth. GPU decompression was then turned off with the -gpu-decompression option under otherwise identical settings, so that the CPU handles decompression instead.
With CPU decompression, the load time is 1.13s, with 96% CPU usage and 7.67GB/s of bandwidth. DirectStorage therefore improves load speed, but its more important benefit is that GPU decompression frees up the CPU.
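The gap between the two runs can be put into relative terms with a few lines of arithmetic. The sketch below uses only the load times and bandwidth figures quoted above; the variable names are illustrative.

```python
# Figures quoted in the article for the PCIe 4.0 NVMe system disk.
gpu_run = {"load_time_s": 0.90, "bandwidth_gb_s": 9.62}
cpu_run = {"load_time_s": 1.13, "bandwidth_gb_s": 7.67}

# Share of load time saved by moving decompression from the CPU to the GPU.
time_saved_pct = (
    (cpu_run["load_time_s"] - gpu_run["load_time_s"]) / cpu_run["load_time_s"] * 100
)

# Relative increase in reported bandwidth.
bandwidth_gain_pct = (
    gpu_run["bandwidth_gb_s"] / cpu_run["bandwidth_gb_s"] - 1
) * 100

print(f"load time reduced by {time_saved_pct:.1f}%")      # 20.4%
print(f"bandwidth increased by {bandwidth_gain_pct:.1f}%")  # 25.4%
```

In other words, GPU decompression cuts load time by about a fifth in this test, which is modest next to the change in CPU usage.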
The PCIe 4.0 NVMe SSD above is installed in the motherboard's first M.2 slot, which uses the CPU's PCIe lanes. Moving it to a slot connected through the PCH actually gave better results: 0.74s load time, 0.97% CPU usage, and 11.79GB/s of bandwidth.
Next, a different PCIe 3.0 NVMe SSD was tested on the PCH lanes. It measured 0.83s load time, 0.96% CPU usage, and 10.40GB/s of bandwidth.
Finally, a SATA SSD measured 4.19s load time, 1.44% CPU usage, and 2.07GB/s of bandwidth.
In terms of load time, there is little difference between the PCIe 4.0 and PCIe 3.0 SSDs, while the SATA SSD is far behind both. Beyond high-speed reads, though, the biggest advantage of DirectStorage is that GPU decompression frees up the CPU.
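To compare the storage devices side by side, the measurements above can be tabulated with a short script. As a sanity check it also multiplies load time by bandwidth for each run. The assumption here, which the article does not state, is that the reported bandwidth is the total data delivered divided by the load time, so this product approximates the size of the loaded data set.

```python
# (label, load time in seconds, reported bandwidth in GB/s), all with GPU decompression.
runs = [
    ("PCIe 4.0 NVMe, CPU lanes", 0.90, 9.62),
    ("PCIe 4.0 NVMe, PCH lanes", 0.74, 11.79),
    ("PCIe 3.0 NVMe, PCH lanes", 0.83, 10.40),
    ("SATA SSD", 4.19, 2.07),
]

sata_time = runs[-1][1]
summary = []
for label, load_time_s, bandwidth_gb_s in runs:
    # Assumed relation: data size = load time x bandwidth.
    implied_data_gb = load_time_s * bandwidth_gb_s
    speedup_vs_sata = sata_time / load_time_s
    summary.append((label, implied_data_gb, speedup_vs_sata))
    print(f"{label:<26} {implied_data_gb:5.2f} GB  {speedup_vs_sata:4.2f}x vs SATA")
```

The implied data size comes out at roughly 8.6 to 8.7 GB in every run, which is consistent with all runs loading the same data set. The NVMe drives finish about 4.7 to 5.7 times faster than the SATA SSD.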
Currently, the PC game that supports DirectStorage is “Forspoken”, but it has received mixed reviews because of poor optimization. When the opportunity arises, this game will be used to test and compare load speed and FPS with DirectStorage.
source: test project download