rust - What's the performance penalty of creating a closure vs struct?

I have done something that looks very smart. Trying to get better with error handling, I've written my DB operations like this:

    fn handle_delete(&self, id: &Id) -> Result<(), SqliteError> {
        db_op(
            || DbOperationInfo::new("filerequests", OperationType::Delete),
            || {
                self.conn
                    .execute("DELETE FROM filerequests WHERE id = ?", [id.to_string()])?;
                Ok(())
            },
        )
    }

The idea is that my SqliteError enum holds a fairly small struct with some info about the attempted operation, plus the rusqlite::error::Error that caused it.

The db_op function accepts two closures: the first generates that small struct containing the operation info, and the second performs the actual operation. If (and only if) an error occurs, the first closure is executed to generate the operation info, which is then combined with the causing error to create a SqliteError.
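For readers who want something concrete, here is a minimal sketch of what such a db_op could look like. The question does not show its definition, so everything here is an assumption: the field names, the `Operation` variant, the `delete` helper, and `DriverError`, which is only a stand-in for the rusqlite error so that the sketch compiles without dependencies.

```rust
// Hypothetical stand-ins; the question does not show these definitions.
#[derive(Debug, Clone, Copy, PartialEq)]
enum OperationType {
    Delete,
}

#[derive(Debug, PartialEq)]
struct DbOperationInfo {
    table: &'static str,
    op: OperationType,
}

impl DbOperationInfo {
    fn new(table: &'static str, op: OperationType) -> Self {
        Self { table, op }
    }
}

// Stand-in for the rusqlite error type (assumption, not the real type).
#[derive(Debug, PartialEq)]
struct DriverError(String);

#[derive(Debug, PartialEq)]
enum SqliteError {
    Operation {
        info: DbOperationInfo,
        source: DriverError,
    },
}

/// Runs `op`. Only if it fails is `info` called to describe the operation.
fn db_op<T>(
    info: impl FnOnce() -> DbOperationInfo,
    op: impl FnOnce() -> Result<T, DriverError>,
) -> Result<T, SqliteError> {
    op().map_err(|source| SqliteError::Operation {
        info: info(),
        source,
    })
}

/// Hypothetical caller; `fail` simulates a failing database call.
fn delete(fail: bool) -> Result<(), SqliteError> {
    db_op(
        || DbOperationInfo::new("filerequests", OperationType::Delete),
        || {
            if fail {
                Err(DriverError("disk I/O error".to_string()))
            } else {
                Ok(())
            }
        },
    )
}

fn main() {
    println!("{:?}", delete(false));
    println!("{:?}", delete(true));
}
```

Because `map_err` only invokes its closure on the `Err` path, the info closure is never called when the operation succeeds.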

So the operation info creation is lazy.

But do I actually gain something here? Or is the creation of the closure similarly expensive? Or, perhaps the compiler will optimize the closure creation and do that only once? Or maybe it would optimize struct creation just the same (the contents are kinda static)?

Looking for some guidance here: Does this optimization (likely) actually help?

Asked 22 hours ago by NoBullsh1t

Profile it and you'll know. – cafce25 Commented 22 hours ago
Your info closure does not appear to capture any variables, so its size will be zero, so it will literally have zero cost to create it. – kmdreko Commented 22 hours ago
@cafce25 How would I actually profile this? Can't remember reading about how to do this anywhere in the docs (yet). Is there some good resource you could point me to? – NoBullsh1t Commented 22 hours ago
There's the perf book that contains information about benchmarking in Rust (among other things). – Finomnis Commented 21 hours ago

1 Answer

You can benchmark this situation using divan (or any other benchmarking tool in Rust, like criterion).

You need to minimize the code first to isolate the part you actually want to measure. Then wrap the relevant inputs and outputs in black_box so the compiler cannot optimize the code under test away.

This is my attempt to minimize and benchmark your situation. Note that I did not know what exactly is going on in DbOperationInfo::new(), so I just put a String allocation in there to demonstrate some kind of heap-based initialization. If DbOperationInfo does not perform a heap allocation, the result might be a little less obvious.

    #![allow(dead_code)]

    use std::hint::black_box;

    enum OperationType {
        Delete,
        Insert,
    }

    struct DbOperationInfo {
        s: String,
        t: OperationType,
    }

    impl DbOperationInfo {
        // Converting into a String forces a heap allocation on every call.
        pub fn new(s: impl Into<String>, t: OperationType) -> Self {
            Self { s: s.into(), t }
        }
    }

    // Lazy variant: the info is only constructed if `generate` is true.
    fn db_op_closure(generate: bool, f: impl FnOnce() -> DbOperationInfo) {
        if black_box(generate) {
            black_box(f());
        }
    }

    // Eager variant: the caller has already constructed the info.
    fn db_op_value(generate: bool, val: DbOperationInfo) {
        if black_box(generate) {
            black_box(val);
        }
    }

    #[divan::bench(args = [true, false], min_time = 0.5)]
    fn handle_delete_closure(generate_value: bool) {
        db_op_closure(generate_value, || {
            DbOperationInfo::new("filerequests", OperationType::Delete)
        })
    }

    #[divan::bench(args = [true, false], min_time = 0.5)]
    fn handle_delete_value(generate_value: bool) {
        db_op_value(
            generate_value,
            DbOperationInfo::new("filerequests", OperationType::Delete),
        )
    }

    fn main() {
        // Run registered benchmarks.
        divan::main();
    }

And here is the result:

    closure_vs_direct         fastest       │ slowest       │ median        │ mean          │ samples │ iters
    ├─ handle_delete_closure                │               │               │               │         │
    │  ├─ false               1.011 ns      │ 5.796 ns      │ 1.023 ns      │ 1.04 ns       │ 22881   │ 187441152
    │  ╰─ true                28.83 ns      │ 476.6 ns      │ 29.41 ns      │ 31.12 ns      │ 29588   │ 15149056
    ╰─ handle_delete_value                  │               │               │               │         │
       ├─ false               28.24 ns      │ 194.2 ns      │ 29.02 ns      │ 30.34 ns      │ 60225   │ 15417600
       ╰─ true                28.63 ns      │ 114.7 ns      │ 29.41 ns      │ 30.54 ns      │ 30153   │ 15438336

So what does that mean?

My version of DbOperationInfo::new() takes about 28 ns.
When the argument is a closure and the closure does not get called, the time is only about 1 ns.

The value version pays the full 28 ns even when the result is never used, so a single heap allocation is enough to make wrapping it in a closure well worth it.

Just to see the difference, I removed the heap allocation from DbOperationInfo::new():

    struct DbOperationInfo {
        s: &'static str,
        t: OperationType,
    }

    impl DbOperationInfo {
        pub fn new(s: &'static str, t: OperationType) -> Self {
            Self { s, t }
        }
    }

And now it does not matter if you wrap it in a closure or not:

    closure_vs_direct         fastest       │ slowest       │ median        │ mean          │ samples │ iters
    ├─ handle_delete_closure                │               │               │               │         │
    │  ├─ false               1.291 ns      │ 122 ns        │ 1.303 ns      │ 1.387 ns      │ 19944   │ 163381248
    │  ╰─ true                1.291 ns      │ 7.199 ns      │ 1.303 ns      │ 1.36 ns       │ 20790   │ 170311680
    ╰─ handle_delete_value                  │               │               │               │         │
       ├─ false               1.291 ns      │ 6.955 ns      │ 1.303 ns      │ 1.371 ns      │ 20484   │ 167804928
       ╰─ true                1.291 ns      │ 22.99 ns      │ 1.303 ns      │ 1.367 ns      │ 20653   │ 169189376

That tells me that as soon as constructing the value involves a heap allocation or some other real computation, you should wrap it in a closure.
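The standard library offers the same eager-versus-lazy choice, for example `Option::ok_or` (eager) and `Option::ok_or_else` (lazy). The sketch below counts how often a constructor runs on the success path. `make_message`, `eager`, and `lazy` are hypothetical helpers made up for this illustration and are not part of the benchmark above.

```rust
use std::cell::Cell;

/// Hypothetical "expensive" constructor: counts its calls and allocates a String.
fn make_message(calls: &Cell<u32>) -> String {
    calls.set(calls.get() + 1);
    String::from("operation failed on table filerequests")
}

/// Eager: the message is built before `ok_or` runs, even on the success path.
fn eager(value: Option<i32>, calls: &Cell<u32>) -> Result<i32, String> {
    value.ok_or(make_message(calls))
}

/// Lazy: the closure only runs when `value` is `None`.
fn lazy(value: Option<i32>, calls: &Cell<u32>) -> Result<i32, String> {
    value.ok_or_else(|| make_message(calls))
}

fn main() {
    let eager_calls = Cell::new(0);
    let lazy_calls = Cell::new(0);
    for _ in 0..3 {
        let _ = eager(Some(1), &eager_calls);
        let _ = lazy(Some(1), &lazy_calls);
    }
    // prints "eager: 3, lazy: 0"
    println!("eager: {}, lazy: {}", eager_calls.get(), lazy_calls.get());
}
```

The eager version pays for the allocation on every call, which mirrors the roughly 28 ns seen in the `handle_delete_value` rows of the first benchmark.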

If your object is small and can be constructed without a heap allocation, it does not matter. In theory, removing the closure could be marginally faster in this situation, but the compiler will most likely optimize both versions to identical code.
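As kmdreko noted in the comments on the question, a closure that captures nothing is a zero-sized type, so creating it costs nothing. A small sketch to check this with `std::mem::size_of_val` follows. The helper `closure_sizes` and the example closures are made up for illustration, and the exact size of a capturing closure is an implementation detail, so no expected output is shown for it.

```rust
use std::mem::size_of_val;

/// Returns the sizes of a non-capturing and a capturing closure (illustrative helper).
fn closure_sizes() -> (usize, usize) {
    // Captures nothing, like the info closure in the question.
    let no_capture = || ("filerequests", 1u8);

    let id = String::from("42");
    // Captures `id` by reference, so the closure has to store something.
    let by_ref = || id.len();

    (size_of_val(&no_capture), size_of_val(&by_ref))
}

fn main() {
    let (empty, with_ref) = closure_sizes();
    println!("no captures: {empty} bytes, one captured reference: {with_ref} bytes");
}
```

If the info closure ever starts capturing data, for example the `id`, it stops being zero-sized, but storing a reference is still far cheaper than the allocation it defers.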
