mirror of
https://git.proxmox.com/git/rustc
synced 2025-08-17 07:48:55 +00:00
# Optimizations

This document tracks the optimizations made after the initial implementation passed the corpus tests and a good amount of fuzzing.

## Introducing more unsafe code

These optimizations introduced more unsafe code. They should yield significant improvements; otherwise they are not really worth it.

### Optimizing bitreader with byteorder which uses ptr::copy_nonoverlapping

* bitreader_reversed::get_bits was identified with the Linux perf tool as using about 36% of the total run time
* Benchmark: decode enwik9
    * Before: about 14.7 seconds
    * After: about 12.2 seconds, with about 25% of the time spent in get_bits()
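
The kind of load this optimization relies on can be sketched as follows. This is a minimal illustration, not the crate's code: `read_u64_le` and `read_u64_le_slow` are hypothetical helpers, and the assumption is that the bitreader fills its bit container by loading eight bytes at once instead of assembling them one by one.

```rust
use std::ptr;

/// Loads the first 8 bytes of `src` as a little-endian u64 with a single copy.
/// Hypothetical helper for illustration; panics if `src` is shorter than 8 bytes.
fn read_u64_le(src: &[u8]) -> u64 {
    assert!(src.len() >= 8, "need at least 8 bytes");
    let mut value = 0u64;
    // SAFETY: `src` has at least 8 readable bytes (checked above), `value`
    // provides 8 writable bytes, and a fresh local cannot overlap `src`.
    unsafe {
        ptr::copy_nonoverlapping(src.as_ptr(), &mut value as *mut u64 as *mut u8, 8);
    }
    // The bytes were copied in memory order, so convert from little endian
    // to the native byte order (a no-op on little-endian targets).
    u64::from_le(value)
}

/// Byte-at-a-time reference version, used here only for comparison.
fn read_u64_le_slow(src: &[u8]) -> u64 {
    let mut value = 0u64;
    for (i, &byte) in src[..8].iter().enumerate() {
        value |= u64::from(byte) << (8 * i);
    }
    value
}
```

Both functions return the same value for any input of at least eight bytes; the first one simply gives the compiler a single unaligned load to work with.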

### Optimizing decodebuffer::repeat with ptr::copy_nonoverlapping

* decodebuffer::repeat was identified with the Linux perf tool as using about 28% of the total run time
* Benchmark: decode enwik9
    * Before: about 9.9 seconds
    * After: about 9.4 seconds
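
As a rough sketch of the idea, the function below appends a match that starts `offset` bytes before the end of a buffer. It is an assumption-laden illustration: `repeat` here is a hypothetical free function over a `Vec<u8>`, not the crate's method. The fast path is only valid when source and destination do not overlap; an overlapping match (length greater than offset) falls back to a byte-wise copy.

```rust
use std::ptr;

/// Appends `len` bytes that start `offset` bytes before the current end of `buf`.
/// Hypothetical sketch; panics if `offset` is zero or larger than the buffer.
fn repeat(buf: &mut Vec<u8>, offset: usize, len: usize) {
    assert!(offset >= 1 && offset <= buf.len(), "offset out of range");
    let start = buf.len() - offset;
    if len <= offset {
        // Source and destination are disjoint, so one bulk copy is enough.
        buf.reserve(len);
        let old_len = buf.len();
        // SAFETY: after `reserve` the capacity is at least `old_len + len`.
        // The source range [start, start + len) lies inside the initialized
        // part because `len <= offset`, and the destination range
        // [old_len, old_len + len) starts where the source range ends at the latest.
        unsafe {
            let base = buf.as_mut_ptr();
            ptr::copy_nonoverlapping(base.add(start), base.add(old_len), len);
            buf.set_len(old_len + len);
        }
    } else {
        // The match overlaps its own output, so earlier copied bytes feed
        // later ones. Copy byte by byte to keep that behavior.
        for i in 0..len {
            let byte = buf[start + i];
            buf.push(byte);
        }
    }
}
```

For example, repeating with offset 4 and length 4 on `abcd` yields `abcdabcd`, while offset 2 and length 5 on `abc` takes the overlapping path and yields `abcbcbcb`.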

### Use custom ringbuffer in the decodebuffer

The decode buffer must be able to do two things efficiently:

* Collect bytes from the front
* Copy bytes from the contents to the end

The standard library's VecDeque and Vec can each do one of these efficiently but not the other, so a custom ringbuffer implementation was written.
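
The two required operations can be illustrated with a small ring buffer. This is a simplified sketch in safe code, assuming a fixed capacity and byte-wise copies; the type `RingBuffer` and its method names are hypothetical and do not mirror the crate's actual implementation, which would be expected to copy in larger chunks.

```rust
/// Minimal fixed-capacity byte ring buffer (hypothetical sketch, safe code only).
struct RingBuffer {
    data: Vec<u8>,
    head: usize, // physical index of the oldest stored byte
    len: usize,  // number of stored bytes
}

impl RingBuffer {
    fn with_capacity(capacity: usize) -> Self {
        assert!(capacity > 0, "capacity must be non-zero");
        RingBuffer { data: vec![0; capacity], head: 0, len: 0 }
    }

    /// Appends one byte at the end; panics if the buffer is full.
    fn push(&mut self, byte: u8) {
        assert!(self.len < self.data.len(), "buffer full");
        let tail = (self.head + self.len) % self.data.len();
        self.data[tail] = byte;
        self.len += 1;
    }

    /// Copies `count` bytes, starting at logical index `start`, to the end.
    /// Bytes appended by this call may be read again by later iterations,
    /// which is what an overlapping match needs.
    fn extend_from_within(&mut self, start: usize, count: usize) {
        assert!(count == 0 || start < self.len, "start out of range");
        for i in 0..count {
            let src = (self.head + start + i) % self.data.len();
            let byte = self.data[src];
            self.push(byte);
        }
    }

    /// Removes up to `count` bytes from the front and returns them.
    /// Only `head` and `len` change; no remaining bytes are moved.
    fn drain_front(&mut self, count: usize) -> Vec<u8> {
        let count = count.min(self.len);
        let mut out = Vec::with_capacity(count);
        for i in 0..count {
            out.push(self.data[(self.head + i) % self.data.len()]);
        }
        self.head = (self.head + count) % self.data.len();
        self.len -= count;
        out
    }
}
```

Draining from the front only advances `head`, and copying from the contents appends at the tail, so neither operation has to shift the bytes that stay in the buffer.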

## Introducing NO additional unsafe code

These are just nice to have.

### Even better bitreaders

Studying this material led to a big improvement in bitreader speed:

* https://fgiesen.wordpress.com/2018/02/19/reading-bits-in-far-too-many-ways-part-1/
* https://fgiesen.wordpress.com/2018/02/20/reading-bits-in-far-too-many-ways-part-2/
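
To give a flavor of the general approach, here is a small bit reader that keeps unread bits in a 64-bit buffer and refills it with whole bytes. It is a sketch under assumptions: the `BitReader` type is hypothetical, it reads forward and LSB-first, and it is neither the crate's reversed bitreader nor a specific variant taken from the linked articles.

```rust
/// LSB-first bit reader backed by a 64-bit bit buffer (hypothetical sketch).
struct BitReader<'a> {
    src: &'a [u8],
    pos: usize,     // index of the next unread byte in `src`
    bit_buf: u64,   // unread bits, with the next bit in the lowest position
    bit_count: u32, // number of valid bits in `bit_buf`
}

impl<'a> BitReader<'a> {
    fn new(src: &'a [u8]) -> Self {
        BitReader { src, pos: 0, bit_buf: 0, bit_count: 0 }
    }

    /// Tops up the bit buffer with whole bytes while at least 8 bits are free.
    fn refill(&mut self) {
        while self.bit_count <= 56 && self.pos < self.src.len() {
            self.bit_buf |= u64::from(self.src[self.pos]) << self.bit_count;
            self.pos += 1;
            self.bit_count += 8;
        }
    }

    /// Reads `n` bits (at most 56); returns `None` if the input is exhausted.
    fn get_bits(&mut self, n: u32) -> Option<u64> {
        assert!(n <= 56, "at most 56 bits per call");
        if self.bit_count < n {
            self.refill();
        }
        if self.bit_count < n {
            return None;
        }
        let value = self.bit_buf & ((1u64 << n) - 1);
        self.bit_buf >>= n;
        self.bit_count -= n;
        Some(value)
    }
}
```

Most calls to `get_bits` are served from the buffer with a mask and a shift; the input slice is only touched when the buffer runs low.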