In trying to write my sinh(x)
: double my_sinh(double x)
, I came across the corner case.
When
x
is in the 709.782... to 710.475... range, how to calculateexp(x)/2
?
Hyperbolic sine sinh(x)
and cosine cosh(x)
are both well approximated by exp(x)/2
when x >= 20
(or so, depending on the type details of x
).
With common double, the largest finite value DBL_MAX
is about 1.798...x10308.
The most positive x
with a finite sinh(x)
and cosh(x)
is about 710.475....
The most positive x
with a finite exp(x)
is about 709.782....
Since exp(710.0)
overflows, the following /2
can be too late to achieve a finite result.
The code could use a wider precision/range, like long double
, yet long double
is not certainly any better than double
, and I am looking for a double
-only solution.
#include <assert.h>
#include <math.h>
#include <stdio.h>
double my_sinh_f1(double x) {
assert(x >= 709); // Only concerned about large values for now.
return (double) (expl(x)/2); // For now, call a long double function.
}
/*
* Over a range of values:
* Compute the difference between f(x) and higher precision sinhl.
* Scale difference to the unit-in-the-last-place (ULP) of the double result.
* Return the worst case ULP.
*/
double test_sinh(double (f)(double), double x0, double x1, unsigned n) {
double error_max = 0.0;
double dx = (x1 - x0)/n;
for (double x = x0; x <= x1; x += dx) {
long double y0_as_ld = sinhl(x);
double y0 = (double) y0_as_ld;
double ulp = y0 > y0_as_ld ? y0 - nextafter(y0,0) : nextafter(y0,INFINITY) - y0;
double y1 = f(x);
double error = (double) ((y1 - y0_as_ld)/ulp);
error = fabs(error);
if (error > error_max) {
error_max = error;
}
}
return error_max;
}
int main(void) {
double (*f)(double) = my_sinh_f1;
double x0= log(DBL_MAX);
double x1 = asinh(DBL_MAX);
unsigned n = 1000000007; //1000003; // Some prime
printf("x0:%.3f, x1:%.3f n:%10u error:%g(ulp)\n", x0, x1, n, test_sinh(f, x0, x1, n));
}
The above returns 0.5 for the ULP error, which is not surprising, given the extra wide sinhl()
.
In trying to write my sinh(x)
: double my_sinh(double x)
, I came across the corner case.
When
x
is in the 709.782... to 710.475... range, how to calculateexp(x)/2
?
Hyperbolic sine sinh(x)
and cosine cosh(x)
are both well approximated by exp(x)/2
when x >= 20
(or so, depending on the type details of x
).
With common double, the largest finite value DBL_MAX
is about 1.798...x10308.
The most positive x
with a finite sinh(x)
and cosh(x)
is about 710.475....
The most positive x
with a finite exp(x)
is about 709.782....
Since exp(710.0)
overflows, the following /2
can be too late to achieve a finite result.
The code could use a wider precision/range, like long double
, yet long double
is not certainly any better than double
, and I am looking for a double
-only solution.
#include <assert.h>
#include <math.h>
#include <stdio.h>
double my_sinh_f1(double x) {
assert(x >= 709); // Only concerned about large values for now.
return (double) (expl(x)/2); // For now, call a long double function.
}
/*
* Over a range of values:
* Compute the difference between f(x) and higher precision sinhl.
* Scale difference to the unit-in-the-last-place (ULP) of the double result.
* Return the worst case ULP.
*/
double test_sinh(double (f)(double), double x0, double x1, unsigned n) {
double error_max = 0.0;
double dx = (x1 - x0)/n;
for (double x = x0; x <= x1; x += dx) {
long double y0_as_ld = sinhl(x);
double y0 = (double) y0_as_ld;
double ulp = y0 > y0_as_ld ? y0 - nextafter(y0,0) : nextafter(y0,INFINITY) - y0;
double y1 = f(x);
double error = (double) ((y1 - y0_as_ld)/ulp);
error = fabs(error);
if (error > error_max) {
error_max = error;
}
}
return error_max;
}
int main(void) {
double (*f)(double) = my_sinh_f1;
double x0= log(DBL_MAX);
double x1 = asinh(DBL_MAX);
unsigned n = 1000000007; //1000003; // Some prime
printf("x0:%.3f, x1:%.3f n:%10u error:%g(ulp)\n", x0, x1, n, test_sinh(f, x0, x1, n));
}
The above returns 0.5 for the ULP error, which is not surprising, given the extra wide sinhl()
.
5 Answers
Reset to default 48How to calculate exp(x)/2?
To avoid overflow of exp(x)
yet ex/2 is in double
range, use exp(x - v)*exp(v)/2
with a carefully chosen v
(example: 0x1.62e4307c58800p-1
) to minimize error generation.
v
has two special properties:
Subtraction of
x
in this range byv
is exact.Mathematical ev and as a
double
are close (within 0.001 ULP or better).
Using the math identity:
ex/2 = e(x - v)*ev/2
We could select constant v = loge(2), (or v == 0.693...) then
ex/2 = e(x - log(2))*2/2
double my_sinh_f2(double x) {
assert(x >= 709);
static const double v = 0.69314718055994530941; // log(2)
static const double scale = 2.0/2;
return exp(x - v)*scale;
}
x0:709.783, x1:710.476 n:1000000007 error:495.895(ulp)
Then we will have succeeded is eliminating long double
, yet the ULP error jumps to 495.
This is because the error in the subtraction is multiplied by the magnitude of x
for the exp(x)
function.
Instead, let us select a v
that is at least loge(2), yet a multiple of the ULP of x
or next_power_of_2(709)/pow(2,53)
or 1024/9,007,199,254,740,992
.
ulp = 1024.0/pow(2,53);
v = ceil(log(2)/ulp)*ulp; // 0.69314718056000401...
double my_sinh_f3(double x) {
assert(x >= 709);
// Note 10 LSbits are zero -----------vvv
static const double v = 0x1.62e42fefa3c00p-1; // 0.69314718056000401...; // ceil(log(2)*pow(2,43))/pow(2,43)
static const double scale = 2.000000000000117415215/2; // exp(v)/2
return exp(x - v)*scale;
}
Now the ULP error has reduced to 1.792 as the subtraction is exact. The final error is exp()
(hopefully quite small like < 1.0) plus the conversion of ev to a double
(about 0.5) and the multiplication error (another 0.5).
x0:709.783, x1:710.476 n:1000000007 error:1.79199(ulp)
Yet can we do better?
What if the extended precision value of exp(v2)
was of the form 0x1.000000xxxxxxx000...p+1
?
With v1 = 0.69314718056000401...
, ev1 is 2.0000000000001174... or written in hexadecimal with extended precision as 0x1.0000000000108654...p+1
, yet it is only the significant (53 binary digits) or 0x1.0000000000109p+1
that is used as a double
constant and that remaining -0x0.0000000000000346...p+1
is the error in forming the scale
constant.
We could choose an offset v
such that its ev has much less error in its double
representation. Then the value used as a double
would be 0x1.000000xxxxxxxp+1
, which is more than a 1000 times closer to exp(v2)/2
than 0x1.0000000000109p+1
is to exp(v1
)/2.
After testing for values above log(2)
and are a multiple of the ULP of 709 (so the subtraction is exact), a value of v == 0.69314719694034466
leads to:
scale == 0x1.000000465a709000p+1/2
which has 11 mores bits as zero as compared to the double
precision:
scale == 0x1.000000465a709p+1/2
.
double my_sinh_f4(double x) {
assert(x >= 709);
// Note 10 LSbits are zero -----------vvv
static const double v = 0x1.62e4307c58800p-1; // 0.69314719694034466;
static const double scale = 0x1.000000465a709000p+1/2; // exp(v)/2
// ^^^
// Note next 11 bits of e**v, found using long double, are zero.
return exp(x - v)*scale;
}
So with this v
offset, no subtraction error occurs and the scale
used is 1000 times closer to ev than our prior attempt.
Boo-yah!
x0:709.783, x1:710.476 n:1000000007 error:1.00781(ulp).
How does our 1.00781(ulp) compare to the library function sinh()
? Below reports we are nearly a whole 1.0 ULP better.
// sinh
x0:709.783, x1:710.476 n:1000000007 error:1.91992(ulp)
Further testing of library sinh()
indicates for large x
up to 709.783
or so, its ULP is quite good: 0.6 and then has a ULP jump to 1.920 in the range of this question. I suspect my gcc library sinh()
would benefit with the similar constant change.
Notes:
float
: A similar investigation for float
resulted in:
v = 0x1.639900p-1f;
scale = 0x1.005a780p+0f;
and a ULP better than my gcc library sinhf()
in the end-range.
sinhl()
is used as the reference function for a 64-bit extended precision correct answer. A deeper analysis would use a proven reference function.
Compiler notes:
Building file: ../sinh.c
Invoking: Cygwin C Compiler
gcc -std=c11 -O0 -g3 -pedantic -Wall -Wextra -Wconversion -Wlogical-op -Wsign-conversion -Wwrite-strings -c -fmessage-length=0 -v -MMD -MP -MF"sinh.d" -MT"sinh.o" -o "sinh.o" "../sinh.c"
Using built-in specs.
COLLECT_GCC=gcc
Target: x86_64-pc-cygwin
Configured with: /mnt/share/cygpkgs/gcc/gcc.x86_64/src/gcc-12.4.0/configure --srcdir=/mnt/share/cygpkgs/gcc/gcc.x86_64/src/gcc-12.4.0 --prefix=/usr --exec-prefix=/usr --localstatedir=/var --sysconfdir=/etc --docdir=/usr/share/doc/gcc --htmldir=/usr/share/doc/gcc/html -C --build=x86_64-pc-cygwin --host=x86_64-pc-cygwin --target=x86_64-pc-cygwin --without-libiconv-prefix --without-libintl-prefix --libexecdir=/usr/lib --with-gcc-major-version-only --enable-shared --enable-shared-libgcc --enable-static --enable-version-specific-runtime-libs --enable-bootstrap --enable-__cxa_atexit --enable-clocale=newlib --with-dwarf2 --with-tune=generic --enable-languages=ada,c,c++,fortran,lto,objc,obj-c++,jit --enable-graphite --enable-threads=posix --enable-libatomic --enable-libgomp --enable-libquadmath --enable-libquadmath-support --disable-libssp --enable-libada --disable-symvers --disable-multilib --with-gnu-ld --with-gnu-as --with-cloog-include=/usr/include/cloog-isl --without-libiconv-prefix --without-libintl-prefix --with-system-zlib --enable-linker-build-id --with-default-libstdcxx-abi=gcc4-compatible --enable-libstdcxx-filesystem-ts
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 12.4.0 (GCC)
COLLECT_GCC_OPTIONS='-std=c11' '-O0' '-g3' '-Wpedantic' '-Wall' '-Wextra' '-Wconversion' '-Wlogical-op' '-Wsign-conversion' '-Wwrite-strings' '-c' '-fmessage-length=0' '-v' '-MMD' '-MP' '-MF' 'sinh.d' '-MT' 'sinh.o' '-o' 'sinh.o' '-mtune=generic' '-march=x86-64'
/usr/lib/gcc/x86_64-pc-cygwin/12/cc1.exe -quiet -v -MMD sinh.d -MF sinh.d -MP -MT sinh.o -dD -idirafter /usr/lib/gcc/x86_64-pc-cygwin/12/../../../../lib/../include/w32api -idirafter /usr/lib/gcc/x86_64-pc-cygwin/12/../../../../x86_64-pc-cygwin/lib/../lib/../../include/w32api ../sinh.c -quiet -dumpbase sinh.c -dumpbase-ext .c -mtune=generic -march=x86-64 -g3 -O0 -Wpedantic -Wall -Wextra -Wconversion -Wlogical-op -Wsign-conversion -Wwrite-strings -std=c11 -version -fmessage-length=0 -o /cygdrive/c/Users/TPC/AppData/Local/Temp/ccNNmUyM.s
GNU C11 (GCC) version 12.4.0 (x86_64-pc-cygwin)
compiled by GNU C version 12.4.0, GMP version 6.3.0, MPFR version 4.2.1, MPC version 1.3.1, isl version isl-0.27-GMP
GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
ignoring nonexistent directory "/usr/local/include"
ignoring nonexistent directory "/usr/lib/gcc/x86_64-pc-cygwin/12/include-fixed"
ignoring nonexistent directory "/usr/lib/gcc/x86_64-pc-cygwin/12/../../../../x86_64-pc-cygwin/include"
ignoring duplicate directory "/usr/lib/gcc/x86_64-pc-cygwin/12/../../../../x86_64-pc-cygwin/lib/../lib/../../include/w32api"
#include "..." search starts here:
#include <...> search starts here:
/usr/lib/gcc/x86_64-pc-cygwin/12/include
/usr/include
/usr/lib/gcc/x86_64-pc-cygwin/12/../../../../lib/../include/w32api
End of search list.
GNU C11 (GCC) version 12.4.0 (x86_64-pc-cygwin)
compiled by GNU C version 12.4.0, GMP version 6.3.0, MPFR version 4.2.1, MPC version 1.3.1, isl version isl-0.27-GMP
Further improvements
@njuffa offered v = 0x1.62e433718c000p-1
, which is a ev that is even closer to its double
.
double my_sinh_fn1(double x) {
assert(x >= 709);
static const double v = 0x1.62e433718c000p-1;
static const double scale = 0x1.000001c0f43210000p+1/2; // exp(v)/2
return exp(x - v)*scale;
}
Which, per my testing, resulted in
x0:709.783, x1:710.476 n:1000000007 error:1.00732(ulp)
Deeper testing may reveal additional improvements.
If you just want to avoid overflow and can tolerate error in the last couple of digits, you can calculate exp(x)/2
as exp(x - log(2))
.
As various answers and comments suggested alternatives, below is a test harness.
Please feel open to add your findings.
#include <assert.h>
#include <float.h>
#include <math.h>
#include <stdbool.h>
#include <stdio.h>
#include <time.h>
double akanice_sinh_f1(double x) {
double s = exp(x / 2) / 2;
return s * s * 2;
}
double akanice_sinh_f2(double x) {
static const double v = 0x1.62E4307C5FCp-01;
static const double scale = 2.0000000327673927991156324020994 / 2; // exp(v)/2
return exp(x - v) * scale;
}
double alias_sinh_f1(double x) {
return 0.5 / exp(-x);
}
double dan4_sinh_f1(double x) {
static const double v = 0.69314718055994530941723212145818;
return exp(x - v);
}
double my_sinh_f1(double x) {
return (double) (expl(x) / 2);
}
double my_sinh_f2(double x) {
static const double v = 0.69314718055994530941723212145818; // log(2)
static const double scale = 2.0 / 2; // exp(v)/2
return exp(x - v) * scale;
}
double my_sinh_f3(double x) {
// Note 10 LSbits are zero -----------vvv
static const double v = 0x1.62e42fefa3c00p-1; // 0.69314718056000401...; // ceil(log(2)*pow(2,43))/pow(2,43)
static const double scale = 2.0000000000001174152150690680826 / 2; // exp(v)/2
return exp(x - v) * scale;
}
double my_sinh_f4(double x) {
static const double v = 0.69314719694034466;
static const double scale = 0x1.000000465a709000p+1 / 2; // exp(v)/2
return exp(x - v) * scale;
}
double my_sinh_f5(double x) {
// Variation on akanice_sinh_f1()
double s = exp(x / 2);
return 0.5 * s * s;
}
double njuffa_sinh_f1(double x) {
static const double v = 0x1.62e433718c000p-1;
static const double scale = 0x1.000001c0f43210000p+1 / 2; // exp(v)/2
return exp(x - v) * scale;
}
double wim_sinh_f1(double x) {
static const double v = 0x1.62e42fefa3c00p-1;
// Here scale_minus_1 was computed as exp(v)/2-1 using multiple precision arithmetic.
static const double scale_minus_1 = 5.870760753453405e-14;
double exv = exp(x - v);
return fma(exv, scale_minus_1, exv);
}
double wim_sinh_f2(double x) {
static const double v = 0x1.62e42fefa3c00p-1;
// Here scale_minus_1 was computed as exp(v)/2-1 using multiple precision arithmetic.
static const double scale_minus_1 = 5.870760753453405e-14;
double exv = exp(x - v);
return exv * scale_minus_1 + exv; // No fma()
}
/*
* Over a range of values:
* Compute the difference between f(x) and higher precision sinhl.
* Scale to the unit-in-the-last-place (ULP) of the double result.
* Return the worst case ULP.
*/
double test_sinh(const char *s, double (f)(double), double x0, double x1,
unsigned n) {
double error_max = 0.0;
double error_max_x = 0.0;
double dx = (x1 - x0) / n;
for (double x = x0; x <= x1; x += dx) {
long double y0_as_ld = sinhl(x);
double y0 = (double) y0_as_ld;
double ulp =
y0 > y0_as_ld ? y0 - nextafter(y0, 0) : nextafter(y0, INFINITY) - y0;
double y1 = f(x);
double error = (double) ((y1 - y0_as_ld) / ulp);
error = fabs(error);
if (error > error_max) {
error_max = error;
error_max_x = x;
}
}
clock_t c0 = clock();
for (double x = x0; x <= x1; x += dx) {
f(x);
}
clock_t c1 = clock();
printf(
"%-15s, f(%#.17g) --> error:%#7.3fg(ulp), avg:% .4f usec\n", //
s, error_max_x, error_max,
(double) (c1 - c0) / CLOCKS_PER_SEC / n * 1000000.0);
fflush(stdout);
return error_max;
}
#define TEST_SINH(f) test_sinh(#f, (f), x0, x1, n)
void tests_sinh(unsigned n) {
time_t t0, t1;
time(&t0);
printf("FLT_EVAL_METHOD %d\n", FLT_EVAL_METHOD);
printf("DBL_MAX %g\n", DBL_MAX);
printf("LDBL_MAX %Lg\n", LDBL_MAX);
double x0 = log(DBL_MAX);
x0 = nextafter(x0, INFINITY);
double x1 = asinh(DBL_MAX);
printf("x0: %.17g\n", x0);
printf("x1: %.17g\n", x1);
printf("n: %u\n", n);
printf("\nEmploys special derived constants.\n");
TEST_SINH(akanice_sinh_f2);
TEST_SINH(wim_sinh_f1);
TEST_SINH(wim_sinh_f2);
TEST_SINH(njuffa_sinh_f1);
TEST_SINH(my_sinh_f4);
TEST_SINH(my_sinh_f3);
printf("\nNo special constants.\n");
TEST_SINH(my_sinh_f1);
TEST_SINH(akanice_sinh_f1);
TEST_SINH(my_sinh_f5);
TEST_SINH(sinh);
TEST_SINH(alias_sinh_f1);
TEST_SINH(my_sinh_f2);
TEST_SINH(dan4_sinh_f1);
time(&t1);
printf("\nRun time %g secs\n", difftime(t1, t0));
}
int main(void) {
//tests_sinh(1000003); // Quick test.
tests_sinh(1000000007 /* Some large prime */ );
return 0;
}
Output
FLT_EVAL_METHOD 0
DBL_MAX 1.79769e+308
LDBL_MAX 1.18973e+4932
x0: 709.78271289338409
x1: 710.47586007394398
n: 1000000007
Employs special derived constants.
akanice_sinh_f2, f(710.44541883690090) --> error: 1.005g(ulp), avg: 0.0197 usec
wim_sinh_f1 , f(710.23474935199545) --> error: 1.006g(ulp), avg: 0.0249 usec
wim_sinh_f2 , f(710.23474935199545) --> error: 1.006g(ulp), avg: 0.0197 usec
njuffa_sinh_f1 , f(710.45175735066948) --> error: 1.007g(ulp), avg: 0.0198 usec
my_sinh_f4 , f(710.45192802604777) --> error: 1.007g(ulp), avg: 0.0197 usec
my_sinh_f3 , f(710.47301410218324) --> error: 1.791g(ulp), avg: 0.0196 usec
No special constants.
my_sinh_f1 , f(709.78271491252610) --> error: 0.500g(ulp), avg: 0.0829 usec
akanice_sinh_f1, f(710.46986937340364) --> error: 1.918g(ulp), avg: 0.0189 usec
my_sinh_f5 , f(710.46986937340364) --> error: 1.918g(ulp), avg: 0.0168 usec
sinh , f(710.46986937340364) --> error: 1.918g(ulp), avg: 0.0234 usec
alias_sinh_f1 , f(710.47583691748071) --> error: 8.492g(ulp), avg: 0.2650 usec
my_sinh_f2 , f(710.47585990575567) --> error:495.708g(ulp), avg: 0.0196 usec
dan4_sinh_f1 , f(710.47585990575567) --> error:495.708g(ulp), avg: 0.0201 usec
Run time 3270 secs
Instead of dividing and conquering with a constant, wouldn't this be simple enough:
(exp( x/2 ) / 2) squared * 2
OK, we cumulate error of exponential with that of squaring, so probably more than 1 ulp, but it has a merit of simplicity.
AFTER THOUGHTS
Hum, we pay the error on exp(x/2) twice, so it's more like 1.5 ulp or more, probably not a so good idea...
(Summarizing my comments)
This answer builds further upon chux’s alternative my_sinh_f3()
. There sinh()
is computed as:
v = 0x1.62e42fefa3c00p-1 /* 0.69314718056000401 */
scale = exp(v) / 2 /* scale = 2.0000000000001174152150690680826 / 2 */
y = exp(x - v)
sinh(x) = y * scale
Instead of multiplying by the double precision factor scale
, we write scale
as the sum scale = 1 + scale_minus_1
. Here 1 + scale_minus_1
is a more accurate approximation of exp(v) / 2
than scale
alone as double, if the constant scale_minus_1 = exp(v) / 2 -1 = 5.870760753453405e-14
is computed with sufficiently high (e.g. quadruple) precision.
Now sinh(x)
can be calculated in the numerically favorable form value + small correction
:
sinh(x) = y * scale
sinh(x) = y * (1 + scale_minus_1)
sinh(x) = y + y * scale_minus_1
This leads to:
double wim_sinh_f2(double x) {
static const double v = 0x1.62e42fefa3c00p-1;
static const double scale_minus_1 = 5.870760753453405e-14;
double y = exp(x - v);
return y * scale_minus_1 + y;
// Alternative for platforms with fma() hardware support:
// return fma(y, scale_minus_1, y);
}
chux found a ulp error of 1.006 for both y * scale_minus_1 + y
and the fma()
alternative fma(y, scale_minus_1, y)
.