Speeding up Brilirs #122
sampsyo
left a comment
Well, that's pretty cool! Those are some nice performance results.
Just a couple of high-level questions:
- Now that the type checking is static rather than dynamic, it seems a little odd that it's coupled with the interpreter. Conceptually, it could be a separate flag or even a completely separate executable (like the current Python type checker and the proposed OCaml one in #107). IMO, a fast interpreter need not necessarily check types… just doing something crazy or crashing on bad programs is fine.
- Nightly Rust does make things a bit more inconvenient to use (especially since the underlying library doesn't require it). If it's really necessary, it's probably OK, especially if the features you want seem like they're coming to stable Rust soon… but if it's just for a little extra performance, then it doesn't quite seem worth it to require it?
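As a rough illustration of the first point, here is a minimal sketch of keeping the static check separate from execution. It assumes a hypothetical three-instruction IR (`Instr`, `Type`, `Value`, `check`, and `run` are all made-up names, not the data structures brilirs or the Bril library actually use). `check` walks the program once and rejects ill-typed instructions; `run` trusts its input and simply panics on a bad program.

```rust
use std::collections::HashMap;

// Hypothetical miniature IR for illustration only; not brilirs's real types.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Type { Int, Bool }

#[derive(Clone, Copy, Debug, PartialEq)]
enum Value { Int(i64), Bool(bool) }

enum Instr {
    Const { dest: &'static str, value: Value },
    Add { dest: &'static str, lhs: &'static str, rhs: &'static str },
    Not { dest: &'static str, arg: &'static str },
}

/// Static pass: one walk over the program, no execution.
fn check(prog: &[Instr]) -> Result<(), String> {
    let mut types: HashMap<&'static str, Type> = HashMap::new();
    for (i, instr) in prog.iter().enumerate() {
        // Each arm yields (destination, result type, arguments, required argument type).
        let (dest, result, args, want) = match instr {
            Instr::Const { dest, value } => {
                let ty = match value {
                    Value::Int(_) => Type::Int,
                    Value::Bool(_) => Type::Bool,
                };
                (*dest, ty, vec![], ty)
            }
            Instr::Add { dest, lhs, rhs } => (*dest, Type::Int, vec![*lhs, *rhs], Type::Int),
            Instr::Not { dest, arg } => (*dest, Type::Bool, vec![*arg], Type::Bool),
        };
        for arg in args {
            if types.get(arg) != Some(&want) {
                return Err(format!("instruction {}: `{}` is not a defined {:?}", i, arg, want));
            }
        }
        types.insert(dest, result);
    }
    Ok(())
}

/// Interpreter: performs no validation and panics on ill-typed input.
fn run(prog: &[Instr]) -> HashMap<&'static str, Value> {
    let mut env: HashMap<&'static str, Value> = HashMap::new();
    for instr in prog {
        let (dest, result) = match instr {
            Instr::Const { dest, value } => (*dest, *value),
            Instr::Add { dest, lhs, rhs } => match (env[*lhs], env[*rhs]) {
                (Value::Int(a), Value::Int(b)) => (*dest, Value::Int(a.wrapping_add(b))),
                _ => unreachable!("ruled out by check"),
            },
            Instr::Not { dest, arg } => match env[*arg] {
                Value::Bool(b) => (*dest, Value::Bool(!b)),
                _ => unreachable!("ruled out by check"),
            },
        };
        env.insert(dest, result);
    }
    env
}
```

With this split, a caller could put `check` behind a flag or move it into a separate executable, and the interpreter loop stays free of per-instruction type tests.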
"../benchmarks/sum-bits.json" "../benchmarks/sum-sq-diff.json"
)
args=( "3 6" "128" "50" "7" "645634654" "8" "" "10" "101" "4 20" "8" "50 109658" "96 false" "496" "125" \
"-5 8 21" "" "8" "100" "" "42" "100")
It is a little bit too bad that this script needs to re-list all the benchmarks and all their arguments instead of just using ../benchmarks/*.bril. Maybe Brench could help with this—it could even wrap a call to Hyperfine? Not sure.
I haven't really tried to use Brench yet but I will look into it. I was mostly going off of the blog post so I threw something together.
The closest I've gotten is the following one-liner as a single-stage pipeline for Brench: (export bril=$(bril2json </dev/stdin); export arg='3 6'; hyperfine -s basic --warmup 5 'echo $bril | brili -p $arg'). Here I'm only trying to get the ackermann.bril benchmark to run. Running it under Brench fails with ValueError: I/O operation on closed file., and I'm not sure why. It runs as expected on its own (with the ackermann file provided on stdin).
Weird. Any chance it's a bash/sh difference?
#!/bin/sh
ps -p $$
(export bril=$(bril2json </dev/stdin); export arg='3 6'; hyperfine -s basic --warmup 5 'echo $bril | brili -p $arg') < ../benchmarks/ackermann.bril
This appears to show the above line using /bin/sh and running without issue.
Instruction::Value {
    op: ValueOps::And | ValueOps::Or,
    dest,
    op_type,
    args,
    funcs,
    labels,
}
over
Instruction::Value {
    op: ValueOps::And,
    dest,
    op_type,
    args,
    funcs,
    labels,
} |
Instruction::Value {
    op: ValueOps::Or,
    dest,
    op_type,
    args,
    funcs,
    labels,
}
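For readers unfamiliar with the feature being compared here, the following is a small self-contained sketch of a nested or-pattern inside a struct-variant match. The `ValueOps` and `Instruction` types below are simplified stand-ins with fewer fields than the real ones, and `is_bool_binop` is a hypothetical helper. Nested or-patterns were stabilized in Rust 1.53, so on toolchains from that release onward this form compiles without nightly.

```rust
// Simplified stand-ins for illustration; the real types carry more fields.
#[derive(Debug, Clone, Copy, PartialEq)]
enum ValueOps {
    And,
    Or,
    Add,
}

enum Instruction {
    Value { op: ValueOps, dest: String, args: Vec<String> },
    Effect { args: Vec<String> },
}

/// Hypothetical helper: true for a two-argument boolean operation.
fn is_bool_binop(instr: &Instruction) -> bool {
    match instr {
        // One arm covers both operators; without nested or-patterns the
        // whole `Instruction::Value { .. }` pattern is written out twice.
        Instruction::Value {
            op: ValueOps::And | ValueOps::Or,
            args,
            ..
        } => args.len() == 2,
        _ => false,
    }
}
```

The two spellings match exactly the same values; the nested form only removes the duplicated field list.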
Cool! Not sure a
Two more changes have been made. The first being the previously mentioned
Neat!
To bring things full circle, I've reimplemented a version of the original optimization which replaces string variable names with an int identifier. This provides another considerable improvement in performance.
I implemented a first run of error messages by implementing the
I've updated the setup instructions for
Some final numbers on my limited benchmark suite. Seeing the changes in
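The idea of replacing string variable names with integer identifiers can be sketched roughly as follows. This is an assumption about the general technique, not the code in this PR: `Interner` and `Env` are hypothetical names, and values are plain `i64` for brevity. Names are mapped to dense ids once, before execution, so the interpreter's environment can be a `Vec` indexed by id instead of a map keyed by `String`.

```rust
use std::collections::HashMap;

/// Hypothetical helper: assigns each distinct variable name a dense integer id.
#[derive(Default)]
struct Interner {
    ids: HashMap<String, usize>,
}

impl Interner {
    /// Returns the existing id for `name`, or allocates the next free one.
    fn intern(&mut self, name: &str) -> usize {
        if let Some(&id) = self.ids.get(name) {
            return id;
        }
        let id = self.ids.len();
        self.ids.insert(name.to_string(), id);
        id
    }

    /// Number of distinct names seen so far.
    fn len(&self) -> usize {
        self.ids.len()
    }
}

/// Environment indexed by id: a lookup is an array access, with no string
/// hashing or comparison on the hot path.
struct Env {
    slots: Vec<i64>,
}

impl Env {
    fn new(num_vars: usize) -> Self {
        Env { slots: vec![0; num_vars] }
    }

    fn set(&mut self, id: usize, value: i64) {
        self.slots[id] = value;
    }

    fn get(&self, id: usize) -> i64 {
        self.slots[id]
    }
}
```

In this sketch the string hashing happens only while the program is being loaded; during execution every variable access is a bounds-checked index.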
This all looks great. I fear I may have done the wrong thing by merging #117 first; it looks like there may be a conflict to resolve now? In any case, it seems cool to merge this if you're interested. I'm still dreaming of getting rid of the nightly requirement, but I guess we can revisit that when the next Rust release rolls around.
f99b2dd to f4b5be7
Merged
sampsyo
left a comment
LGTM; thanks!
This PR picks some of the low-hanging fruit to bring brilirs to about 7.5x-11x faster than brili. Dependent on #117.

All but three benchmarks do not run for long enough to be accurately reported by hyperfine (hyperfine reports ~5ms for all of them with brilirs, and about 40-60ms when running with brili).

[Results table with brili and brilirs columns; values not preserved.]

This isn't attempting to be a rigorous benchmarking of brili vs brilirs, but rather to show a significant improvement in the performance of brilirs (which started at ~1x-1.5x over brili on these benchmarks).

Two of the most significant optimizations are a drastic reduction in the number of clone()/String::from() calls and switching from HashMap to FxHashMap with an initialized capacity.

Other optimizations relate to the release profile in Cargo.toml, switching to the MiMalloc allocator, and annotating many of the functions with #[inline(always)].

Typechecking has been pulled out and is now done first, before executing main, for a small but noticeable performance increase. It should now also do a mostly complete validation of the Bril code.

I made attempts to use async code, the smallvec crate, unsafe argument indexing, and program parallelism, but there was no immediate improvement.

brilirs now uses nightly Rust for or-patterns in match statements and a slight performance increase.
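The map change described above (a cheaper hasher plus a capacity chosen up front) can be sketched with only the standard library. This is an illustration under assumptions, not the PR's code: `Fnv1a`, `FastMap`, and `make_env` are hypothetical names, and FNV-1a is used purely as a dependency-free stand-in for a fast non-cryptographic hasher. It is not the algorithm FxHashMap uses.

```rust
use std::collections::HashMap;
use std::hash::{BuildHasherDefault, Hasher};

/// FNV-1a (64-bit), a stand-in for a fast non-cryptographic hasher.
/// It replaces the default SipHash-based hasher, trading DoS resistance for speed.
struct Fnv1a(u64);

impl Default for Fnv1a {
    fn default() -> Self {
        Fnv1a(0xcbf29ce484222325) // FNV-1a 64-bit offset basis
    }
}

impl Hasher for Fnv1a {
    fn write(&mut self, bytes: &[u8]) {
        for &b in bytes {
            self.0 ^= u64::from(b);
            self.0 = self.0.wrapping_mul(0x100000001b3); // FNV 64-bit prime
        }
    }

    fn finish(&self) -> u64 {
        self.0
    }
}

/// A std HashMap with the hasher swapped out (hypothetical alias).
type FastMap<K, V> = HashMap<K, V, BuildHasherDefault<Fnv1a>>;

/// Builds a variable environment sized up front, so that inserting up to
/// `num_vars` entries does not force the table to reallocate.
fn make_env(num_vars: usize) -> FastMap<String, i64> {
    FastMap::with_capacity_and_hasher(num_vars, BuildHasherDefault::default())
}
```

Because the alias is still a `std::collections::HashMap`, the rest of the code keeps using the usual `insert` and `get` calls; only the construction site changes.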