Automatic GPU Data Compression and Address Swizzling for CPUs via Modified Virtual Address Translation

Symposium on Interactive 3D Graphics and Games (I3D 2020)


Larry Seiler (Facebook Reality Labs), Daqi Lin (University of Utah), Cem Yuksel (University of Utah)


Abstract

We describe how to modify hardware page translation to enable CPU software access to compressed and swizzled GPU data arrays as if they were decompressed and stored in row-major order. In a shared memory system, this allows the CPU to access GPU data directly, without copying the data or losing the performance and bandwidth benefits of using compression and swizzling on the GPU. Our method is flexible enough to support a wide variety of existing and future swizzling and compression schemes, including block-based lossless compression that requires per-block meta-data. Providing automatic compression can improve performance, even without considering the cost of copying data. In our experiments, we observed up to 33% reduction in CPU/memory energy use and up to 35% reduction in CPU computation time.
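To make the idea of address swizzling concrete, the following is a minimal sketch of one simple block-tiled layout and how it differs from row-major addressing. It is not the scheme from the paper: the 8 × 8 block of 32-bit pixels, the row-major ordering of blocks, the row-major ordering of pixels within a block, and the function names `row_major_offset` and `tiled_offset` are all assumptions chosen for illustration.

```python
# Illustrative sketch only: one possible block-tiled ("swizzled") layout.
# Block size, ordering, and all names here are assumptions, not the paper's scheme.

BLOCK_W = 8            # pixels per block row (assumed)
BLOCK_H = 8            # pixel rows per block (assumed)
BYTES_PER_PIXEL = 4    # 32-bit pixels (assumed)
BLOCK_BYTES = BLOCK_W * BLOCK_H * BYTES_PER_PIXEL  # 256 bytes per block


def row_major_offset(x, y, width):
    """Byte offset of pixel (x, y) in a plain row-major image."""
    return (y * width + x) * BYTES_PER_PIXEL


def tiled_offset(x, y, width):
    """Byte offset of pixel (x, y) when the image is stored block by block.

    Blocks are laid out in row-major order, and the pixels inside each
    block are also row-major, so every block occupies 256 contiguous bytes.
    """
    if width % BLOCK_W != 0:
        raise ValueError("width must be a multiple of the block width")
    blocks_per_row = width // BLOCK_W
    block_index = (y // BLOCK_H) * blocks_per_row + (x // BLOCK_W)
    within_block = ((y % BLOCK_H) * BLOCK_W + (x % BLOCK_W)) * BYTES_PER_PIXEL
    return block_index * BLOCK_BYTES + within_block


if __name__ == "__main__":
    width = 16
    for x, y in [(0, 0), (8, 0), (0, 1), (0, 8)]:
        print((x, y), row_major_offset(x, y, width), tiled_offset(x, y, width))
```

In this sketch, vertically adjacent pixels inside a block sit 32 bytes apart instead of a full image row apart, which is the kind of locality a swizzled layout aims for. Software that sees the array as row-major would need a translation of this kind applied on its behalf.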


An example of lossless block compression. Each block is 256B and consists of four 64B memory accesses, storing 8 × 8 32-bit pixels. The level of compression is specified by two meta-data bits per block, indicating possible compression ratios 4:0, 4:1, 4:2, and 4:4. Note that 4:0 uses a default clear color for the entire block (e.g., black).
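The per-block meta-data described in this caption can be sketched as a small decoder that maps each two-bit code to the number of 64B accesses a block needs. The caption gives the four ratios but not the bit encoding, so the code-to-ratio assignment below, the packing of four codes per byte (lowest bits first), and the names `unpack_meta` and `memory_traffic` are assumptions made for illustration.

```python
# Illustrative sketch only: decoding two meta-data bits per 256B block.
# The bit encoding and the packing order are assumptions, not from the paper.

ACCESS_BYTES = 64            # size of one memory access
ACCESSES_PER_BLOCK = 4       # an uncompressed 256B block needs four accesses

# Assumed encoding: code 0 -> 4:0 (clear color, no data fetched),
# code 1 -> 4:1, code 2 -> 4:2, code 3 -> 4:4 (stored uncompressed).
ACCESSES_BY_CODE = (0, 1, 2, 4)


def unpack_meta(packed, num_blocks):
    """Extract one two-bit code per block from packed meta-data bytes."""
    codes = []
    for i in range(num_blocks):
        byte = packed[i // 4]                 # four codes per byte
        codes.append((byte >> (2 * (i % 4))) & 0b11)
    return codes


def memory_traffic(codes):
    """Return (bytes fetched with compression, bytes fetched without)."""
    compressed = sum(ACCESSES_BY_CODE[c] for c in codes) * ACCESS_BYTES
    uncompressed = len(codes) * ACCESSES_PER_BLOCK * ACCESS_BYTES
    return compressed, uncompressed


if __name__ == "__main__":
    packed = bytes([0b11100100])              # codes 0, 1, 2, 3 for four blocks
    codes = unpack_meta(packed, 4)
    print(codes, memory_traffic(codes))
```

With one block at each compression level, this sketch fetches 448 bytes instead of 1024, which illustrates why only two meta-data bits per block are enough to skip a large share of memory accesses.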


Lossless compression and decompression occur between the L1 and L2 caches, when page translation (performed by the page walker block) determines that the data is stored compressed in memory. Dotted lines mark the block that is added to the hardware. The datapath and latency for uncompressed data are unchanged.


Acknowledgements

This project was supported in part by a grant from Facebook Reality Labs.

BibTeX

@inproceedings{Seiler2020a,
   author       = {Larry Seiler and Daqi Lin and Cem Yuksel},
   title        = {Automatic {GPU} Data Compression and Address Swizzling for {CPU}s via Modified Virtual Address Translation},
   booktitle    = {Symposium on Interactive 3D Graphics and Games (I3D 2020)},
   year         = {2020},
   numpages     = {10},
   location     = {San Francisco, CA, USA},
   isbn         = {978-1-4503-7589-4},
   url          = {http://doi.acm.org/10.1145/3384382.3384533},
   doi          = {10.1145/3384382.3384533},
   publisher    = {ACM Press},
   address      = {New York, NY, USA},
}
