rust - Is `iter().map().sum()` as fast as `iter().fold()`?
Does the compiler generate the same code for `iter().map().sum()` and `iter().fold()`? In the end they achieve the same goal, but the first code would iterate two times, once for the `map` and once for the `sum`.
Here is an example. Which version would be faster in `total`?
    pub fn square(s: u32) -> u64 {
        match s {
            // `..=` is the current spelling of the older `...` range pattern
            s @ 1..=64 => 2u64.pow(s - 1),
            _ => panic!("Square must be between 1 and 64"),
        }
    }

    pub fn total() -> u64 {
        // A fold
        (0..64).fold(0u64, |r, s| r + square(s + 1))
        // or a map
        (1..64).map(square).sum()
    }
What tools would I use to look at the assembly or to benchmark this?
For them to generate the same code, they'd first have to do the same thing. Your two examples do not:
    fn total_fold() -> u64 {
        (0..64).fold(0u64, |r, s| r + square(s + 1))
    }

    fn total_map() -> u64 {
        (1..64).map(square).sum()
    }

    fn main() {
        println!("{}", total_fold());
        println!("{}", total_map());
    }

    18446744073709551615
    9223372036854775807
Let's assume you meant
    fn total_fold() -> u64 {
        (1..64).fold(0u64, |r, s| r + square(s + 1))
    }

    fn total_map() -> u64 {
        (1..64).map(|i| square(i + 1)).sum()
    }
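As a quick sanity check, the sketch below asserts that the two corrected versions really compute the same value. It repeats `square` from the question so that it compiles on its own; the `u64::MAX - 1` comparison is my own arithmetic, not part of the original post.

```rust
// Helper from the question, using the `..=` range pattern.
pub fn square(s: u32) -> u64 {
    match s {
        1..=64 => 2u64.pow(s - 1),
        _ => panic!("Square must be between 1 and 64"),
    }
}

fn total_fold() -> u64 {
    (1..64).fold(0u64, |r, s| r + square(s + 1))
}

fn total_map() -> u64 {
    (1..64).map(|i| square(i + 1)).sum()
}

fn main() {
    // Both versions visit 1..=63 and add 2^1 + ... + 2^63 = 2^64 - 2.
    assert_eq!(total_fold(), total_map());
    assert_eq!(total_fold(), u64::MAX - 1);
    println!("{}", total_fold());
}
```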
There are a few avenues to check:
- The generated LLVM IR
- The generated assembly
- A benchmark
The easiest source for the IR and the assembly is one of the playgrounds (official or alternate). Both have buttons to view the assembly or the IR. You can also pass `--emit=llvm-ir` or `--emit=asm` to the compiler to generate these files.
Make sure to generate the assembly or IR in release mode. The attribute `#[inline(never)]` is also useful to keep the functions separate, which makes them easier to find in the output.
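A minimal sketch of such a setup is below. The file name `totals.rs` and the exact command lines in the comments are my own illustration. Because every input here is a compile-time constant, the optimizer may well reduce both functions to a single constant, which would itself answer the question for this particular example.

```rust
// Hypothetical file name: totals.rs
// Compile with optimizations and emit assembly or LLVM IR, for example:
//   rustc -O --emit=asm totals.rs
//   rustc -O --emit=llvm-ir totals.rs

fn square(s: u32) -> u64 {
    match s {
        1..=64 => 2u64.pow(s - 1),
        _ => panic!("Square must be between 1 and 64"),
    }
}

// `#[inline(never)]` keeps each function as its own symbol in the output,
// so `total_fold` and `total_map` can be located and compared side by side.
#[inline(never)]
pub fn total_fold() -> u64 {
    (1..64).fold(0u64, |r, s| r + square(s + 1))
}

#[inline(never)]
pub fn total_map() -> u64 {
    (1..64).map(|i| square(i + 1)).sum()
}

fn main() {
    println!("{}", total_fold());
    println!("{}", total_map());
}
```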
Benchmarking is documented in The Rust Programming Language, so there's no need to repeat all of that valuable information.
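For a rough first impression without a benchmarking harness, a crude timing loop might look like the sketch below. It is not a statistically sound benchmark. The `time_it` helper, the `end` parameter, and the iteration count are hypothetical names and values chosen for illustration; `std::hint::black_box` requires a reasonably recent stable compiler. Build it in release mode before reading anything into the numbers.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

fn square(s: u32) -> u64 {
    match s {
        1..=64 => 2u64.pow(s - 1),
        _ => panic!("Square must be between 1 and 64"),
    }
}

// The upper bound is a parameter so the optimizer cannot precompute the sum.
fn total_fold(end: u32) -> u64 {
    (1..end).fold(0u64, |r, s| r + square(s + 1))
}

fn total_map(end: u32) -> u64 {
    (1..end).map(|i| square(i + 1)).sum()
}

/// Calls `f` `iters` times and returns the total elapsed wall-clock time.
fn time_it(iters: u32, f: fn(u32) -> u64) -> Duration {
    let start = Instant::now();
    for _ in 0..iters {
        // `black_box` hides the input and the result from the optimizer.
        black_box(f(black_box(64)));
    }
    start.elapsed()
}

fn main() {
    let iters = 1_000_000;
    println!("fold: {:?}", time_it(iters, total_fold));
    println!("map:  {:?}", time_it(iters, total_map));
}
```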
Before Rust 1.14, these do not produce the exact same assembly. I'd wait for benchmarking / profiling data to see if there's any meaningful impact on performance before I worried.
As of Rust 1.14, they do produce the same assembly! This is one reason I love Rust. You can write clear and idiomatic code, and smart people come along and make it equally fast.
"but the first code would iterate two times, once for the `map` and once for the `sum`."
This is incorrect, and I'd love to know what source told you this so we can go correct it at that point and prevent future misunderstandings. An iterator operates on a pull basis; one element is processed at a time. The core method is `next`, which yields a single value, running just enough computation to produce that value.
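The pull behaviour can be observed directly. The sketch below is my own illustration, not code from the original answer: it logs each step and uses `fold` in place of `sum` so that the summing step can be logged too. The `trace` function and the log messages are hypothetical names.

```rust
use std::cell::RefCell;

/// Records the order in which the mapping step and the summing step
/// touch each element.
fn trace() -> Vec<String> {
    let log = RefCell::new(Vec::new());
    let total: u32 = (1..4u32)
        .map(|x| {
            log.borrow_mut().push(format!("map {}", x));
            x * 10
        })
        .fold(0, |acc, y| {
            log.borrow_mut().push(format!("add {}", y));
            acc + y
        });
    log.borrow_mut().push(format!("total {}", total));
    log.into_inner()
}

fn main() {
    // Prints: map 1, add 10, map 2, add 20, map 3, add 30, total 60
    // (one entry per line).
    for line in trace() {
        println!("{}", line);
    }
}
```

The `map` and `add` entries alternate: each element is mapped and then immediately consumed, so there is only one pass over the range.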