RleX packbits #68

andrews05 · 2022-09-26T07:24:18Z

This uses a modified packbits implementation with extended repeats. My tests show notably smaller file sizes than the current RleX format, and it currently decodes around 10x faster too (measured on a >10MB sprite, which I haven't attached here).
The surface has to be accessed via pointer since we're setting individual bytes, not entire colours. There's currently no bounds checking here, though I don't think there was before either.
The pack data is also accessed via pointer (rather than data reader) for better performance. Not sure if this is appropriate or not. I have at least added bounds checking here.
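For readers unfamiliar with the scheme, here is a minimal sketch of a classic PackBits decoder that reads and writes through raw pointers and checks bounds on both sides. It is not the code in this PR: the extended repeats mentioned above are not covered, and the function name `packbits_decode` and its signature are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Classic PackBits decode (hypothetical helper, not libGraphite's API).
// Reads [src, src + src_len) and writes into [dst, dst + dst_cap).
// Returns false if the input is truncated or a run would overflow dst.
bool packbits_decode(const std::uint8_t* src, std::size_t src_len,
                     std::uint8_t* dst, std::size_t dst_cap,
                     std::size_t* written)
{
    const std::uint8_t* const src_end = src + src_len;
    std::uint8_t* out = dst;
    std::uint8_t* const dst_end = dst + dst_cap;

    while (src < src_end) {
        const std::uint8_t header = *src++;
        if (header < 128) {
            // Literal run: copy the next header + 1 bytes verbatim.
            const std::size_t count = static_cast<std::size_t>(header) + 1;
            if (static_cast<std::size_t>(src_end - src) < count) return false;
            if (static_cast<std::size_t>(dst_end - out) < count) return false;
            std::memcpy(out, src, count);
            src += count;
            out += count;
        } else if (header > 128) {
            // Repeat run: the next byte is repeated 257 - header times (2..128).
            const std::size_t count = 257 - static_cast<std::size_t>(header);
            if (src == src_end) return false;
            if (static_cast<std::size_t>(dst_end - out) < count) return false;
            std::memset(out, *src++, count);
            out += count;
        }
        // header == 128 is a no-op in classic PackBits.
    }
    *written = static_cast<std::size_t>(out - dst);
    return true;
}
```

Both checks are cheap comparisons per run rather than per byte, so they should cost little next to the memcpy/memset calls themselves.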

The attached file includes a sprite with two copies: The current opcode-based version and the packbits version.

RleX Packbits.zip

andrews05 · 2022-09-28T07:03:47Z

I've found performance can be improved a little by changing block::set<uint8_t> and block::copy_from to use memset/memcpy respectively, rather than simd. Is this okay to change?

tjhancocks · 2022-09-28T08:54:35Z

libGraphite/data/data.cpp

-        byte = value;
-    }
-    simd::set(get<std::uint32_t *>(start), size() - start, v, bytes);
+    memset(get<std::uint8_t *>(start), value, bytes);
 }


I'll need to verify that the use of memset and memcpy here is faster. In theory they should be, but earlier versions of these algorithms from the start of the year were using them, and I updated to the current implementations for performance improvements.

I'm currently adding testing to the library and will be adding some timing metrics as well, so I'll want to run this through that to be sure the performance is acceptable. Due to the size of the assets being used by Kestrel, slight discrepancies can add up quickly.
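As a rough illustration of the kind of measurement described here, the sketch below times two fill strategies with `std::chrono::steady_clock`. It is an assumption-laden stand-in: `fill_words` is a plain word-at-a-time loop, not libGraphite's `simd::set`, and `average_ns` is a hypothetical helper, not part of the library's test suite.

```cpp
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Fill one 32-bit word at a time, then finish any tail bytes.
// A stand-in for a word-based fill; NOT libGraphite's simd::set.
void fill_words(std::uint8_t* dst, std::uint8_t value, std::size_t bytes)
{
    const std::uint32_t word = 0x01010101u * value; // value in all four bytes
    std::size_t i = 0;
    for (; i + 4 <= bytes; i += 4) {
        std::memcpy(dst + i, &word, 4); // sidesteps alignment and aliasing issues
    }
    for (; i < bytes; ++i) {
        dst[i] = value;
    }
}

// The memset-based alternative proposed in this PR.
void fill_memset(std::uint8_t* dst, std::uint8_t value, std::size_t bytes)
{
    std::memset(dst, value, bytes);
}

// Run `fill` over the buffer `rounds` times; return average nanoseconds per call.
template <typename Fill>
double average_ns(Fill fill, std::vector<std::uint8_t>& buf,
                  std::uint8_t value, int rounds)
{
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < rounds; ++i) {
        fill(buf.data(), value, buf.size());
    }
    const auto stop = std::chrono::steady_clock::now();
    const std::chrono::duration<double, std::nano> elapsed = stop - start;
    return elapsed.count() / rounds;
}
```

Results will vary by compiler, optimisation level, and platform, and an optimiser may elide repeated fills whose results are never read, so numbers from a harness like this are only a starting point.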

andrews05 · 2022-09-28

Cross-platform optimisations are tricky, huh? 😂
All I can say is that this is all working much faster for me on macOS/x86_64, but I realise it could be different on other platforms.

tjhancocks · 2022-09-28T08:55:40Z

libGraphite/data/data.cpp

@@ -320,16 +307,12 @@ auto graphite::data::block::increase_size_to(std::size_t new_size) -> void

 auto graphite::data::block::clear() -> void
 {
-    set((uint32_t)0);
+    set((uint8_t)0, size());
 }


This should not be needed, as it will ultimately pass through to set(uint32_t) anyway. If a difference in behaviour has been observed, then it should be the set function that is fixed, rather than this.

andrews05 · 2022-09-28

This is related to the change to memset. Surface calls block::clear, which calls set(uint8_t), which calls memset. I observed a large reduction in the time taken to initialise the surface for rlex.

tjhancocks · 2022-09-28T09:00:51Z

libGraphite/spriteworld/rleX.cpp

+                    }
+                }
+            }
+        }
    }


Again, I'm going to need to verify the performance of this. I moved to a lookup table to eliminate the excessive CPU cycles resulting from conditionals. Given that the goal of this PR is to further compress and reduce the size of the rlëX resource, it is invariably going to result in degraded performance, as more work will be required to decompress.

🤔

andrews05 · 2022-09-29T19:46:57Z

Encoder added. Ready for full test/review.
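For context, a classic PackBits encoder can be sketched as below. This is not the encoder added in this PR: it emits only the standard literal and repeat runs (no extended repeats), and the name `packbits_encode` is hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Classic PackBits encode (hypothetical helper, not the PR's encoder).
std::vector<std::uint8_t> packbits_encode(const std::vector<std::uint8_t>& in)
{
    std::vector<std::uint8_t> out;
    const std::size_t n = in.size();
    std::size_t i = 0;
    while (i < n) {
        // Measure the run of identical bytes starting at i (capped at 128).
        std::size_t run = 1;
        while (i + run < n && run < 128 && in[i + run] == in[i]) {
            ++run;
        }

        if (run >= 2) {
            // Repeat run: header is 257 - run, followed by the repeated byte.
            out.push_back(static_cast<std::uint8_t>(257 - run));
            out.push_back(in[i]);
            i += run;
        } else {
            // Literal run: extend until a pair of equal bytes begins
            // or 128 bytes have been collected.
            std::size_t len = 1;
            while (i + len < n && len < 128 &&
                   !(i + len + 1 < n && in[i + len] == in[i + len + 1])) {
                ++len;
            }
            // Header is len - 1, followed by the literal bytes.
            out.push_back(static_cast<std::uint8_t>(len - 1));
            out.insert(out.end(), in.data() + i, in.data() + i + len);
            i += len;
        }
    }
    return out;
}
```

Capping runs at 128 bytes is where the classic format loses ground on long uniform spans such as transparent rows, which is presumably the case the extended repeats in this PR are meant to address.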

andrews05 · 2022-10-06T21:08:34Z

I've just pushed an update to add bounds checking of the surface frame on decode, to protect against invalid data (or incompatible variants of rleX). Performance is improved a little too. Compared to the current implementation on the refactor branch, I'm seeing 4x to 20x faster decodes here, depending on the sprite.

RleX packbits decode

52f7905

andrews05 force-pushed the rlex_packbits branch from e1a8578 to bccfd5c Compare September 28, 2022 07:02

andrews05 force-pushed the rlex_packbits branch 2 times, most recently from 790b74c to 0690966 Compare September 28, 2022 07:58

tjhancocks reviewed Sep 28, 2022

andrews05 force-pushed the rlex_packbits branch from 0690966 to 52f7905 Compare September 29, 2022 05:50

RleX packbits encode

791ac89

andrews05 changed the title ~~[WIP] RleX packbits~~ RleX packbits Sep 29, 2022

Safer rleX decode

d0dfa37
