最新消息:雨落星辰是一个专注网站SEO优化、网站SEO诊断、搜索引擎研究、网络营销推广、网站策划运营及站长类的自媒体原创博客

unsafe - Rust Uninitialized Vec Causes Double Free Error - Stack Overflow

programmeradmin0浏览0评论

This is a problem of potential undefined behavior of uninitialized memory in unsafe rust.

  • first, allocating vector by usual safe way (type Vec<Vec<usize>>, or must be vector of something on heap instead of stack; in other words, Vec<usize> type will not panic here);
  • second, clone the vector in a new scope;
  • third, allocating a new vector by unsafe uninitialized way (without using MaybeUninit); then a double free occurs.

The code (playground) listed as follows:

#[deny(clippy::uninit_vec)]
unsafe fn uninitialized_vec<T>(size: usize) -> Vec<T> {
    let mut v: Vec<T> = Vec::with_capacity(size);
    unsafe { v.set_len(size) };
    v
}

fn case_1() {
    println!("=== Case 1 ===");
    let vec_a: Vec<Vec<usize>> = vec![vec![0]; 4];
    let _ = vec_a.clone();
    let _: Vec<Vec<usize>> = unsafe { uninitialized_vec(4) };
}

fn case_2() {
    println!("=== Case 2 ===");
    let vec_a: Vec<Vec<usize>> = vec![vec![0]; 4];
    {
        let _ = vec_a.clone();
    }
    let _: Vec<Vec<usize>> = unsafe { uninitialized_vec(4) };
}

fn main() {
    case_1();
    case_2();
}
  • debug mode: signal 6 (SIGABRT): abort program, double free detected in tcache 2; case_2 will stuck here;
  • release mode: signal 4 (SIGILL): illegal instruction; even case_1 will stuck here.

From my common sense, the code itself does not intended to double free any variable. We just declared an uninitialized variable, but not using that variable. If this is not a bug, compiler optimization is probably the only reason that can explain this problem.

What I'm curious is that: whether this code actually triggers undefined behavior in unsafe rust, that compiler optimization may cause double free or other problems. And

  • if it is actually a UB, I feel this code can really confuses rust newcomers (that is saying myself).
  • if it is not a UB, is that a bug of unsafe rust?

Additionally, it is known that by using MaybeUninit properly may avoid double-free runtime error like this:

use std::mem::MaybeUninit;

unsafe fn uninitialized_vec<T>(size: usize) -> Vec<MaybeUninit<T>> {
    let mut v: Vec<MaybeUninit<T>> = Vec::with_capacity(size);
    unsafe { v.set_len(size) };
    v
}

fn main() {
    println!("=== This will not cause error ===");
    let vec_a: Vec<Vec<usize>> = vec![vec![0]; 12];
    {
        let _ = vec_a.clone();
    }
    let _: Vec<MaybeUninit<Vec<usize>>> = unsafe { uninitialized_vec(12) };
}

Still, MaybeUninit can be inconvenient in many circumstances. Subjectively, I prefer using Vec<T> instead of Vec<MaybeUninit<T>>, especially when I can make sure values of this uninitialized vector will be correctly filled later.

This is a problem of potential undefined behavior of uninitialized memory in unsafe rust.

  • first, allocating vector by usual safe way (type Vec<Vec<usize>>, or must be vector of something on heap instead of stack; in other words, Vec<usize> type will not panic here);
  • second, clone the vector in a new scope;
  • third, allocating a new vector by unsafe uninitialized way (without using MaybeUninit); then a double free occurs.

The code (playground) listed as follows:

#[deny(clippy::uninit_vec)]
unsafe fn uninitialized_vec<T>(size: usize) -> Vec<T> {
    let mut v: Vec<T> = Vec::with_capacity(size);
    unsafe { v.set_len(size) };
    v
}

fn case_1() {
    println!("=== Case 1 ===");
    let vec_a: Vec<Vec<usize>> = vec![vec![0]; 4];
    let _ = vec_a.clone();
    let _: Vec<Vec<usize>> = unsafe { uninitialized_vec(4) };
}

fn case_2() {
    println!("=== Case 2 ===");
    let vec_a: Vec<Vec<usize>> = vec![vec![0]; 4];
    {
        let _ = vec_a.clone();
    }
    let _: Vec<Vec<usize>> = unsafe { uninitialized_vec(4) };
}

fn main() {
    case_1();
    case_2();
}
  • debug mode: signal 6 (SIGABRT): abort program, double free detected in tcache 2; case_2 will stuck here;
  • release mode: signal 4 (SIGILL): illegal instruction; even case_1 will stuck here.

From my common sense, the code itself does not intended to double free any variable. We just declared an uninitialized variable, but not using that variable. If this is not a bug, compiler optimization is probably the only reason that can explain this problem.

What I'm curious is that: whether this code actually triggers undefined behavior in unsafe rust, that compiler optimization may cause double free or other problems. And

  • if it is actually a UB, I feel this code can really confuses rust newcomers (that is saying myself).
  • if it is not a UB, is that a bug of unsafe rust?

Additionally, it is known that by using MaybeUninit properly may avoid double-free runtime error like this:

use std::mem::MaybeUninit;

unsafe fn uninitialized_vec<T>(size: usize) -> Vec<MaybeUninit<T>> {
    let mut v: Vec<MaybeUninit<T>> = Vec::with_capacity(size);
    unsafe { v.set_len(size) };
    v
}

fn main() {
    println!("=== This will not cause error ===");
    let vec_a: Vec<Vec<usize>> = vec![vec![0]; 12];
    {
        let _ = vec_a.clone();
    }
    let _: Vec<MaybeUninit<Vec<usize>>> = unsafe { uninitialized_vec(12) };
}

Still, MaybeUninit can be inconvenient in many circumstances. Subjectively, I prefer using Vec<T> instead of Vec<MaybeUninit<T>>, especially when I can make sure values of this uninitialized vector will be correctly filled later.

Share Improve this question edited Mar 7 at 10:48 Joël 2,8411 gold badge21 silver badges38 bronze badges asked Mar 6 at 11:33 ajz34ajz34 1738 bronze badges 2
  • 1 What I'm curious is that: whether this code actually triggers undefined behavior in unsafe rust – There is a tool for that, miri, you can access it from the playground under TOOLS at the top right and it correctly detects this UB. It's not comprehensive though so if it doesn't detect something doesn't mean there is no UB, just that miri didn't find it. – cafce25 Commented Mar 6 at 13:28
  • @cafce25 Ah thanks! I didn't noticed that tool before. It detects UB for this case. Seems this tool can be useful when diving into nomicon. – ajz34 Commented Mar 6 at 14:01
Add a comment  | 

2 Answers 2

Reset to default 8

We just declared an uninitialized variable, but not using that variable

This is incorrect. We are using this variable - namely, we are dropping the outer Vec, therefore dropping all the Vecs in it, and since they are uninitialized, we're trying to drop arbitrary memory.

whether this code actually triggers undefined behavior in unsafe rust

It does, according to documentation for set_len:

Safety
<...>
The elements at old_len..new_len must be initialized.

This requirement is explicitly violated.

undefined behavior in unsafe rust, that compiler optimization may cause double free or other problems

If you trigger undefined behavior, compilation process may do to your program anything at all, by definition. It's not "this code may be misoptimized in this way" - it's "you lied to the compiler, all bets are off".

Disclaimer: Cerberus answered the main question, I'll focus on best practices/implicit questions.

Correct usage of set_len

Subjectively, I prefer using Vec<T> instead of Vec<MaybeUninit<T>>, especially when I can make sure values of this uninitialized vector will be correctly filled later.

As mentioned in @Cerberus answer, set_len should only be used after the elements have been initialized.

That is, the correct way to use set_len is:

fn main() {
    const SIZE: usize = 5;

    let mut vec: Vec<Vec<usize>> = Vec::with_capacity(SIZE);

    {
        let spare = vec.spare_capacity_mut();

        (0..SIZE).for_each(|i| { spare[i].write(Vec::new()); });
    }

    //  Safety:
    //  - All elements in 0..SIZE have been initialized.
    unsafe { vec.set_len(SIZE); }
}

Pre-pooping your pants with Rust

Gankra1, one of the lead designer of unsafe Rust in the early days, wrote an article called Pre-pooping your pants with Rust, which is very a-propos here.

The gist of the article, is that one key difficulty for writing sound unsafe code is anticipating all the ways that something could potentially go wrong: any early return, any panic, etc... which could occur before you "fulfill" a promise.

Hint: it's basically impossible, in the face of ever-changing code.

The solution that Grankra proposed is thus to "pre-poop" your pants. For example, when you call vec.drain(start..end), all the elements in start..end should be removed, and only the elements in 0..start and end.. (catenated together) remain. But who knows what could occur while doing all that?

Thus, the code for drain will:

  1. Set the length of vec to start. It will not leave it at the current length, nor set it at the final length.
  2. Poke a hole in the vec, until all elements in start..end have been moved out.
  3. Move the elements in end.. to start.. (filling the hole, partially).
  4. And ONLY THEN set the length of vec to the final number of elements that actually remain.

If for whatever reason the Drain iterator is leaked in the middle of all this, leaving a hole with uninitialized elements... it's all sound, because all the elements in 0..start are (still) initialized, and that's the only promise that vec is making until (4), and at (4) all is good again.

Or put another way: you never promise that you will do something with unsafe, instead you first do it, and then announce it's done.

1 You may know Gankra from Learn Rust With Entirely Too Many Linked Lists.

MaybeUninit is inconvenient

Yes. Yes it is.

In fact, properly written unsafe code, with all the // Safety annotations, is tedious to write. And verbose. That's what it takes to write verifiably sound unsafe code.

If you can't be bothered, then don't use unsafe!

I would note that there's no benefit to using unsafe in the scenarios you've showcased here:

fn defaulted_vec<T>(size: usize) -> Vec<T>
where
   T: Default,
{
    (0..size).map(T::default).collect()
}

fn case_1() {
    println!("=== Case 1 ===");
    let vec_a: Vec<Vec<usize>> = vec![vec![0]; 4];
    let _ = vec_a.clone();
    let _: Vec<Vec<usize>> = defaulted_vec(4);
}

fn case_2() {
    println!("=== Case 2 ===");
    let vec_a: Vec<Vec<usize>> = vec![vec![0]; 4];
    {
        let _ = vec_a.clone();
    }
    let _: Vec<Vec<usize>> = defaulted_vec(4);
}

fn main() {
    case_1();
    case_2();
}

defaulted_vec will create a Vec of 4 default values, which are cheap -- because Vec::new() doesn't allocate anything -- and you can then use them/overwrite them at leisure.

发布评论

评论列表(0)

  1. 暂无评论