c - Is printf's %a formatting for floating-points not unique?

C23 defines the %a conversion specifier in § 7.23.6.1.8 (page 333) as:

A double argument representing a floating-point number is converted in the style [-]0xh.hhhhp±d, where there is one hexadecimal digit (which is nonzero if the argument is a normalized floating-point number and is otherwise unspecified) before the decimal-point character and the number of hexadecimal digits after it is equal to the precision; if the precision is missing and FLT_RADIX is a power of 2, then the precision is sufficient for an exact representation of the value

[…]

The letters abcdef are used for a conversion and the letters ABCDEF for A conversion. […] The exponent always contains at least one digit, and only as many more digits as necessary to represent the decimal exponent of 2. If the value is zero, the exponent is zero.

(emph. mine)

Does this mean that the representation need not be unique?

For instance, could printf("%a", 1.0) output 0x1p+0, as well as 0x2p-1, 0x4p-2, 0x8p-3? They all have 1 digit in the exponent, so they should all be equivalent as per the requirement above.

Asked Feb 17 at 14:07 by peppe; edited 2 days ago by Ian Abbott.

1 Seems legal according to the spec. Their example in the footnotes with 123 as 0x1.fp+6 and 124 as 0xf.6p+3 and "implementations can choose" suggests a level of freedom there. – teapot418 Commented Feb 17 at 14:38
1 might be relevant: C printf %a and %La – Turtlefight Commented Feb 17 at 14:44

2 Answers

You are correct: the representation need not be unique, because the first hex digit, before the ., is unspecified and can therefore represent 1 to 4 bits of the mantissa. This means the number 1.0 can be represented as 0x1p+0, 0x2p-1, 0x4p-2 or 0x8p-3.

The highlighted phrase, "and only as many more digits as necessary to represent the decimal exponent of 2", means that a nonzero exponent cannot have extra leading zeros, which excludes representations of 1.0 such as 0x2p-01 or 0x2p-001. The next sentence, "If the value is zero, the exponent is zero.", excludes representations such as 0x1p+00. Note that for this case the specification should have been more explicit and specified +0, excluding 0x1p-0.

Note also that the C Standard does not specify whether the '.' must be omitted when the precision is missing and no hexadecimal digits are required after it to represent the number. Hence 0x1.p+0 and 0x2.p-1 seem as compliant as 0x1p+0 and 0x2p-1. The C Standard does specify that "if the precision is zero and the # flag is not specified, no decimal-point character appears", but this does not cover the case where the precision is missing and no digits are necessary. Omitting the . unless # is specified seems consistent and is indeed the observed behavior on various POSIX systems.

For illustration, printf("%a", 1.0) with the default C library produces 0x1p+0 on macOS and FreeBSD, but 0x8p-3 on Debian Linux and OpenBSD.

The case of printf("%a", 3.0) is somewhat consistent: 0x1.8p+1 and 0xcp-2 respectively, yet these representations do not even have the same length.

The rationale on macOS and FreeBSD seems to be to always have 1 as the initial digit, whereas the default libc on Debian Linux (the GNU libc) and that of OpenBSD pack 4 bits of the mantissa into the initial digit. This minimizes the total number of digits in 75% of cases and, more importantly, crams more precision into the requested number of places when a precision is specified in the format, which is valuable and IMHO better.

I interpret the text to mean that a C implementation may choose how many bits (one to four) are in the first digit, after which the exponent is determined. It is not required to choose the number of bits so as to minimize the exponent length. Thus, if a significand is 1.011…, and the number is positive, the implementation may choose to start the conversion with “0x1.”, “0x2.”, “0x5.”, or “0xb.”

After that choice, the exponent is a function of the value of the number and the relationship between those first few bits and the “.”. That exponent must be formatted to have at least one digit and only as many as necessary.

Thus, if the value to be converted is 1.01100001111₂•2³, then choices for the conversion include:

0x1.61ep+3
0x2.c3cp+2
0x5.878p+1
0xb.0fp+0
