update information on benchmark tests (785eebef) · Commits · Anton Lazarev / enum_dispatch

README.md

+12 −0

Original line number	Diff line number	Diff line
		@@ -55,6 +55,18 @@ Notice the differences:
		3. Add an `#[enum_dispatch(FirstBlockName)]` attribute to the remaining definition. This will "link" it with the previously registered definition.
		4. Update your dynamic types to use the new enum instead. You can use `.into()` from any trait implementor to automatically turn it into an enum variant.
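For readers curious what the enum from step 4 amounts to, here is a minimal hand-written sketch, not the macro's actual output. The names `ReturnsValue`, `Zero`, `One`, and `EnumDispatched` are borrowed from the benchmark sources; the `usize` return type and the `dispatch_both` helper are assumptions made for illustration.

```rust
// Hand-written stand-in for enum-based dispatch. The `usize` return type
// is an assumption; the real trait in the benchmarks may differ.
trait ReturnsValue {
    fn return_value(&self) -> usize;
}

struct Zero;
struct One;

impl ReturnsValue for Zero {
    fn return_value(&self) -> usize {
        0
    }
}

impl ReturnsValue for One {
    fn return_value(&self) -> usize {
        1
    }
}

// One variant per implementor, each wrapping the concrete type.
enum EnumDispatched {
    Zero(Zero),
    One(One),
}

// `From` impls are what allow `.into()` on each implementor.
impl From<Zero> for EnumDispatched {
    fn from(inner: Zero) -> Self {
        EnumDispatched::Zero(inner)
    }
}

impl From<One> for EnumDispatched {
    fn from(inner: One) -> Self {
        EnumDispatched::One(inner)
    }
}

// The trait is implemented for the enum by matching on the variant and
// forwarding the call, so no vtable lookup is involved.
impl ReturnsValue for EnumDispatched {
    fn return_value(&self) -> usize {
        match self {
            EnumDispatched::Zero(inner) => inner.return_value(),
            EnumDispatched::One(inner) => inner.return_value(),
        }
    }
}

// Hypothetical helper: converts both implementors and calls through the enum.
fn dispatch_both() -> (usize, usize) {
    let zero: EnumDispatched = Zero.into();
    let one: EnumDispatched = One.into();
    (zero.return_value(), one.return_value())
}
```

Calling `dispatch_both()` exercises both `.into()` conversions and both `match` arms.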

		## performance

		More information on performance can be found in the [docs](https://docs.rs/enum_dispatch/), and benchmarks are available in the `benches` directory.
		The following benchmark results give a taste of what can be achieved using `enum_dispatch`.
		They compare the speed of repeatedly calling a method on the elements of a `Vec` of 1024 trait objects with randomized concrete types, using `Box`ed trait objects, `&`-referenced trait objects, or `enum_dispatch`ed enum types.

		```text
		test benches::boxdyn_homogeneous_vec ... bench: 5,900,191 ns/iter (+/- 95,169)
		test benches::refdyn_homogeneous_vec ... bench: 5,658,461 ns/iter (+/- 137,128)
		test benches::enumdispatch_homogeneous_vec ... bench: 479,630 ns/iter (+/- 3,531)
		```

		## troubleshooting

		### no impls created?

benches/blackbox.rs

+17 −4

		//! The following benchmark tests create two trait objects, access them through one of the four
		//! tested methods, and use the result in a `test::black_box` call, repeating one million times.
		//!
		//! Unlike the `compiler-optimized` benchmark tests, the return values cannot be assumed unused
		//! because of the `test::black_box` call. This results in virtually no change in performance for
		//! the dynamic dispatched method calls, with `enum_dispatch` starting to show its real access
		//! speed -- still several times faster than the alternatives.
		//!
		//! Real code with dynamic dispatch will likely use multiple trait objects whose types are
		//! determined at runtime. That use-case is tested in the `homogeneous-vec` benchmarks.

		#![feature(test)]
		extern crate test;

		@@ -9,13 +20,15 @@ mod benches {
		use super::*;
		use test::Bencher;

		const ITERATIONS: usize = 1000000;

		#[bench]
		fn enumdispatch_blackbox(b: &mut Bencher) {
		let dis0 = EnumDispatched::from(Zero);
		let dis1 = EnumDispatched::from(One);

		b.iter(|| {
		-for _ in 0..1000000 {
		+for _ in 0..ITERATIONS {
		test::black_box(dis0.return_value());
		test::black_box(dis1.return_value());
		}
		@@ -28,7 +41,7 @@ mod benches {
		let dis1 = DynamicDispatched::from(One);

		b.iter(|| {
		-for _ in 0..1000000 {
		+for _ in 0..ITERATIONS {
		test::black_box(dis0.inner().return_value());
		test::black_box(dis1.inner().return_value());
		}
		@@ -42,7 +55,7 @@ mod benches {
		let dis1: Box<dyn ReturnsValue> = Box::new(One);

		b.iter(|| {
		-for _ in 0..1000000 {
		+for _ in 0..ITERATIONS {
		test::black_box(dis0.return_value());
		test::black_box(dis1.return_value());
		}
		@@ -55,7 +68,7 @@ mod benches {
		let dis1: &dyn ReturnsValue = &One;

		b.iter(|| {
		-for _ in 0..1000000 {
		+for _ in 0..ITERATIONS {
		test::black_box(dis0.return_value());
		test::black_box(dis1.return_value());
		}
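
The role of `test::black_box` in these benchmarks can be reproduced on stable Rust with `std::hint::black_box`. The sketch below is an illustration only: it times `Box`ed calls with `std::time::Instant` rather than the nightly `Bencher`, and the `usize` return type, the trait impls, and the `time_boxed_calls` helper are assumptions, not code from this repository.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

// Assumed shape of the benchmark trait; the real return type may differ.
trait ReturnsValue {
    fn return_value(&self) -> usize;
}

struct Zero;
struct One;

impl ReturnsValue for Zero {
    fn return_value(&self) -> usize {
        0
    }
}

impl ReturnsValue for One {
    fn return_value(&self) -> usize {
        1
    }
}

const ITERATIONS: usize = 1_000_000;

// Hypothetical helper: performs the same pair of calls as the `Box`ed
// benchmark and reports how many calls were made and how long they took.
fn time_boxed_calls() -> (usize, Duration) {
    let dis0: Box<dyn ReturnsValue> = Box::new(Zero);
    let dis1: Box<dyn ReturnsValue> = Box::new(One);

    let start = Instant::now();
    let mut calls = 0;
    for _ in 0..ITERATIONS {
        // `black_box` makes the optimizer treat each result as used, so the
        // calls cannot be discarded as dead code.
        black_box(dis0.return_value());
        black_box(dis1.return_value());
        calls += 2;
    }
    (calls, start.elapsed())
}
```

A single wall-clock measurement like this is far noisier than `Bencher`'s repeated sampling, so it only demonstrates the mechanism and is not a substitute for the real benchmarks.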

benches/compiler-optimized.rs

+18 −4

		//! The following benchmark tests create two trait objects and access them through one of the four
		//! tested methods, repeating one million times.
		//!
		//! The result for `enum_dispatch` should be instant, since the return value is never used. Even
		//! though this is not very representative of real code, this was done deliberately to demonstrate
		//! the optimization opportunities available when using `enum_dispatch`. When using dynamic
		//! dispatch, the compiler cannot perform optimizations like inlining or code removal -- those
		//! become possible when using `match`-based dispatch.
		//!
		//! The `blackbox` benchmarks provide an example where the compiler is not able to remove code as
		//! an optimization.

		#![feature(test)]
		extern crate test;

		@@ -9,13 +21,15 @@ mod benches {
		use super::*;
		use test::Bencher;

		const ITERATIONS: usize = 1000000;

		#[bench]
		fn enumdispatch_compiler_optimized(b: &mut Bencher) {
		let dis0 = EnumDispatched::from(Zero);
		let dis1 = EnumDispatched::from(One);

		b.iter(|| {
		-for _ in 0..1000000 {
		+for _ in 0..ITERATIONS {
		dis0.return_value();
		dis1.return_value();
		}
		@@ -28,7 +42,7 @@ mod benches {
		let dis1 = DynamicDispatched::from(One);

		b.iter(|| {
		-for _ in 0..1000000 {
		+for _ in 0..ITERATIONS {
		dis0.inner().return_value();
		dis1.inner().return_value();
		}
		@@ -41,7 +55,7 @@ mod benches {
		let dis1: Box<dyn ReturnsValue> = Box::new(One);

		b.iter(|| {
		-for _ in 0..1000000 {
		+for _ in 0..ITERATIONS {
		dis0.return_value();
		dis1.return_value();
		}
		@@ -54,7 +68,7 @@ mod benches {
		let dis1: &dyn ReturnsValue = &One;

		b.iter(|| {
		-for _ in 0..1000000 {
		+for _ in 0..ITERATIONS {
		dis0.return_value();
		dis1.return_value();
		}
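
One way to see how transparent `match`-based dispatch is to the compiler is that it can even be evaluated at compile time. The sketch below is a hedged illustration, not code from this repository: it uses an inherent `const fn` because trait methods cannot be `const` on stable Rust, and the `usize` return values, `FOLDED`, and `folded_sum` are assumptions.

```rust
struct Zero;
struct One;

// Hand-written enum standing in for an `enum_dispatch`-style type.
enum EnumDispatched {
    Zero(Zero),
    One(One),
}

impl EnumDispatched {
    // The `match` is fully visible to the compiler, so the call can be
    // resolved statically. A call through `dyn Trait` is resolved via a
    // vtable at runtime and is not usable in a `const` context like this.
    const fn return_value(&self) -> usize {
        match self {
            EnumDispatched::Zero(_) => 0,
            EnumDispatched::One(_) => 1,
        }
    }
}

// Both calls are evaluated during compilation; no dispatch remains at runtime.
const FOLDED: usize =
    EnumDispatched::Zero(Zero).return_value() + EnumDispatched::One(One).return_value();

// Hypothetical helper that simply returns the precomputed constant.
fn folded_sum() -> usize {
    FOLDED
}
```

This does not show the dead-code removal measured by the benchmark, but it rests on the same property: the dispatch target is known to the compiler.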

benches/homogeneous-vec.rs

+14 −4

		//! The following benchmark tests create a `Vec` of 1024 trait objects whose concrete types are
		//! determined randomly at runtime, then perform one million accesses that cycle through the `Vec`
		//! using one of the four tested methods, passing each result to a `test::black_box` call.
		//!
		//! This test is most representative of real code -- it doesn't make sense to use dynamic trait
		//! calls on a single object of known type! The dynamic methods take about twice as long to access
		//! as in the `blackbox` benchmarks, but the performance for `enum_dispatch` is about the same as
		//! in those benchmarks. This provides a significant speed-up.

		#![feature(test)]
		extern crate test;

		@@ -11,6 +20,7 @@ mod benches {
		extern crate rand;
		use self::rand::Rng;

		const ITERATIONS: usize = 1000000;
		const VEC_SIZE: usize = 1024;

		#[bench]
		@@ -28,7 +38,7 @@ mod benches {
		}

		b.iter(|| {
		-for i in 0..1000000 {
		+for i in 0..ITERATIONS {
		test::black_box(dispatches[i % VEC_SIZE].return_value());
		}
		})
		@@ -49,7 +59,7 @@ mod benches {
		}

		b.iter(|| {
		-for i in 0..1000000 {
		+for i in 0..ITERATIONS {
		test::black_box(dispatches[i % VEC_SIZE].inner().return_value());
		}
		})
		@@ -70,7 +80,7 @@ mod benches {
		}

		b.iter(|| {
		-for i in 0..1000000 {
		+for i in 0..ITERATIONS {
		test::black_box(dispatches[i % VEC_SIZE].return_value());
		}
		})
		@@ -94,7 +104,7 @@ mod benches {
		}

		b.iter(|| {
		-for i in 0..1000000 {
		+for i in 0..ITERATIONS {
		test::black_box(dispatches[i % VEC_SIZE].return_value());
		}
		})
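
The setup elided from these hunks can be approximated as follows. This is a sketch under stated assumptions, not the repository's code: a small xorshift generator stands in for the `rand` crate, the enum is written by hand instead of generated, and `next_random`, `build_dispatches`, and `sum_accesses` are hypothetical helpers.

```rust
use std::hint::black_box;

// Assumed shape of the benchmark trait; the real return type may differ.
trait ReturnsValue {
    fn return_value(&self) -> usize;
}

struct Zero;
struct One;

impl ReturnsValue for Zero {
    fn return_value(&self) -> usize {
        0
    }
}

impl ReturnsValue for One {
    fn return_value(&self) -> usize {
        1
    }
}

enum EnumDispatched {
    Zero(Zero),
    One(One),
}

impl ReturnsValue for EnumDispatched {
    fn return_value(&self) -> usize {
        match self {
            EnumDispatched::Zero(inner) => inner.return_value(),
            EnumDispatched::One(inner) => inner.return_value(),
        }
    }
}

const VEC_SIZE: usize = 1024;

// Xorshift step used as a dependency-free stand-in for `rand`.
// The state must be non-zero.
fn next_random(state: &mut u64) -> u64 {
    *state ^= *state << 13;
    *state ^= *state >> 7;
    *state ^= *state << 17;
    *state
}

// Fills a `Vec` with `VEC_SIZE` values whose variant is chosen pseudo-randomly.
fn build_dispatches(seed: u64) -> Vec<EnumDispatched> {
    let mut state = seed;
    (0..VEC_SIZE)
        .map(|_| {
            if next_random(&mut state) % 2 == 0 {
                EnumDispatched::Zero(Zero)
            } else {
                EnumDispatched::One(One)
            }
        })
        .collect()
}

// Cycles through the slice with `i % len`, as the benchmark loop does with
// `i % VEC_SIZE`, and sums the results so the work is observable.
fn sum_accesses(dispatches: &[EnumDispatched], iterations: usize) -> usize {
    let mut total = 0;
    for i in 0..iterations {
        total += black_box(dispatches[i % dispatches.len()].return_value());
    }
    total
}
```

Because the variant of each element is only known at runtime, the `match` cannot be folded away here, which is why this setup is closer to real dynamic-dispatch workloads than the two-object benchmarks.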

src/lib.rs

+2 −0

		@@ -256,6 +256,8 @@
		//!
		//! ## The benchmarks
		//!
		//! The following benchmark results were measured on a Ryzen 7 2700x CPU.
		//!
		//! ### compiler_optimized
		//!
		//! The first set of benchmarks creates trait objects and measures the speed of accessing a method