floating point - What are GCC and Clang options to generate Intel DL Boost bfloat16 instructions? - Stack Overflow

For this code:

#include <stdfloat>

std::bfloat16_t foo(std::float32_t f)
{
    return f;
}

GCC generates this code:

foo(_Float32):
        sub     rsp, 8
        call    __truncsfbf2
        add     rsp, 8
        ret

Here we see a call to __truncsfbf2, which appears to be a software implementation (libgcc/soft-fp/truncsfbf2.c).
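To make concrete what a software float32-to-bfloat16 conversion involves, here is a minimal sketch. It is not the libgcc source: the function name float_to_bf16_bits is hypothetical, and the sketch assumes round-to-nearest-even and IEEE 754 binary32 for float. The actual soft-fp routine may differ in details such as NaN payload handling.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper (NOT the libgcc implementation): returns the 16-bit
// bfloat16 bit pattern for a float32 value, rounding to nearest, ties to even.
std::uint16_t float_to_bf16_bits(float f)
{
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);  // well-defined way to read the bit pattern

    // NaN: keep the top 16 bits and set a mantissa bit so the result stays a NaN.
    if ((bits & 0x7FFFFFFFu) > 0x7F800000u)
        return static_cast<std::uint16_t>((bits >> 16) | 0x0040u);

    // bfloat16 keeps the upper 16 bits of float32. Adding 0x7FFF plus the
    // lowest kept bit rounds to nearest and resolves exact ties to even.
    std::uint32_t rounding_bias = 0x7FFFu + ((bits >> 16) & 1u);
    bits += rounding_bias;
    return static_cast<std::uint16_t>(bits >> 16);
}
```

A hardware conversion instruction would do this in a single operation, which is why the library call looks avoidable.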

It is known that:

  • bfloat16 instructions are part of Intel DL Boost.
  • "DL Boost features were introduced in the Cascade Lake architecture."

Hence, when targeting Cascade Lake, I expect to see an Intel DL Boost bfloat16 instruction, VCVT..., instead of the call to __truncsfbf2.

I've already tried adding -march=cascadelake. However, GCC still generates the call to __truncsfbf2.

Are there any GCC options to generate Intel DL Boost bfloat16 instructions?

The same question goes for Clang.
