floating point - What are GCC and Clang options to generate Intel DL Boost bfloat16 instructions? - Stack Overflow

For this code:

#include <stdfloat>

std::bfloat16_t foo(std::float32_t f)
{
    return f;
}

GCC generates this code:

foo(_Float32):
        sub     rsp, 8
        call    __truncsfbf2
        add     rsp, 8
        ret

Here we see a call to __truncsfbf2, which appears to be a software implementation (libgcc/soft-fp/truncsfbf2.c).
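For context, here is a rough sketch of the kind of work a software float-to-bfloat16 conversion has to do. This is not libgcc's actual code; the function name float_to_bf16_bits is hypothetical, and the sketch assumes round-to-nearest-even and a 32-bit IEEE 754 float:

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper (not part of libgcc): returns the 16-bit bfloat16
// encoding of a 32-bit float, rounding to nearest with ties to even.
std::uint16_t float_to_bf16_bits(float f)
{
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);  // reinterpret the float's bit pattern

    // NaN: keep the sign and top payload bits, and force the quiet bit so
    // the result cannot collapse into an infinity.
    if ((bits & 0x7fffffffu) > 0x7f800000u)
        return static_cast<std::uint16_t>((bits >> 16) | 0x0040u);

    // bfloat16 keeps the upper 16 bits of the float. Adding 0x7fff plus the
    // lowest kept bit rounds the discarded half to nearest, ties to even.
    std::uint32_t rounding = 0x7fffu + ((bits >> 16) & 1u);
    return static_cast<std::uint16_t>((bits + rounding) >> 16);
}
```

A hardware conversion instruction would replace this sequence of integer operations with a single instruction, which is why the library call is undesirable here.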

It is known that:

Hence, when targeting Cascade Lake, I expect to see an Intel DL Boost bfloat16 instruction VCVT... (instead of call __truncsfbf2).

I've already tried adding -march=cascadelake. However, GCC still generates call __truncsfbf2.

Are there any GCC options to generate Intel DL Boost bfloat16 instructions?

The same question goes for Clang.
