While upgrading our project from Java 17 to Java 21, we noticed an increase in memory consumption. After dumping the heap and analyzing the differences, I found that there are thousands of empty strings stored in memory.
I succeeded in reproducing the issue with the following code:
import java.lang.management.ManagementFactory;
import java.text.DecimalFormat;

public class DecimalFormating {
    // Two DecimalFormat instances, kept alive for the whole run by the static fields
    static DecimalFormat decimalFormat = new DecimalFormat("#.##");
    static DecimalFormat decimalFormat2 = new DecimalFormat();

    public static void main(String[] args) {
        // On shutdown, dump the heap of this JVM with jmap (path and file name are machine-specific)
        Runtime.getRuntime().addShutdownHook(new Thread(() -> {
            try {
                String pid = ManagementFactory.getRuntimeMXBean().getName().split("@")[0];
                Process p = Runtime.getRuntime().exec("D:\\JAVA\\jdk-17.0.2\\bin\\jmap.exe -dump:format=b,file=heapdump_string_decimal_17.hprof " + pid);
                p.waitFor();
            } catch (Exception e) {
                e.printStackTrace();
            }
        }));
    }
}
The code above is straightforward: it defines two instances of DecimalFormat, which in turn define multiple empty strings, as seen here and here. It then dumps the heap into a file.
I compiled and ran the code with both Java 17.0.2 and Java 21.0.6, and here is what the memory looks like:
- For Java 17, you can see that all the strings that should be empty point to the same memory address, which is expected behavior due to the String Constant Pool:
- For Java 21, each string has a different memory address, so the empty string exists six times and takes six times as much memory as under Java 17:
Is this behavior normal? I can't find any mention of this kind of change in the release notes of Java between versions 18 and 21.
Asked 19 hours ago by VoyVoy, edited 18 hours ago by Mark Rotteveel

2 Answers
TL;DR: This will be fixed in Java 21.0.7(*), and has been fixed in Java 22.0.2 and Java 23 and later.
The problem is that the initialization with an empty string that you link to is not actually used, as those fields are overwritten with the result of the StringBuffer.toString() (Java 17) / StringBuilder.toString() (Java 21) calls in the applyPattern method, which is called from the DecimalFormat constructors. What the toString() method of StringBuffer/StringBuilder returns for an empty buffer changed significantly between these versions.
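To see which behaviour a given JDK has, a quick identity check is enough. This is only a minimal sketch: the class name EmptyToStringCheck and its method are made up for illustration, and the result depends on the JDK you run it on, so I'm not listing an expected output.

```java
// Hypothetical helper for illustration; not part of the JDK or of the question's code.
class EmptyToStringCheck {

    // Returns true if two empty StringBuilders hand back the very same String object.
    static boolean emptyResultsAreShared() {
        String first = new StringBuilder().toString();
        String second = new StringBuilder().toString();
        return first == second; // reference comparison on purpose
    }

    public static void main(String[] args) {
        System.out.println("Java " + Runtime.version());
        System.out.println("empty toString() results shared: " + emptyResultsAreShared());
    }
}
```

Running it on the JDKs you are comparing shows whether each empty buffer produces its own String object there.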
In Java 17 (17.0.14) it does:
@Override
@IntrinsicCandidate
public synchronized String toString() {
    if (toStringCache == null) {
        return toStringCache =
                isLatin1() ? StringLatin1.newString(value, 0, count)
                           : StringUTF16.newString(value, 0, count);
    }
    return new String(toStringCache);
}
(Though oddly enough, if you called toString() twice without modifying the buffer in between, the second call would return a new instance.)
This calls StringLatin1.newString for an empty buffer, which returns the same empty string each time:
public static String newString(byte[] val, int index, int len) {
    if (len == 0) {
        return "";
    }
    return new String(Arrays.copyOfRange(val, index, index + len),
                      LATIN1);
}
In Java 21 (or at least, after Java 17), the applyPattern implementation switched to StringBuilder, and in Java 21 (21.0.6) the toString() of StringBuilder does:
@Override
@IntrinsicCandidate
public String toString() {
    // Create a copy, don't share the array
    return new String(this);
}
This returns a new instance each and every time (though I didn't check whether there is an intrinsic and, if so, whether it might do something else).
This was addressed in later Java versions. In Java 24 (24.0.0), StringBuilder.toString() does:
@Override
@IntrinsicCandidate
public String toString() {
    if (length() == 0) {
        return "";
    }
    // Create a copy, don't share the array
    return new String(this, null);
}
This is bug fix JDK-8325730 for Java 23, backported to Java 22.0.2 and Java 21.0.7(*) (which hasn't been released yet). This issue has triggered additional discussion, see JDK-8332282 and JDK-8138614, as the documentation of StringBuilder.toString() explicitly says:
"A new String object is allocated and initialized to contain the character sequence currently represented by this object."
And the fix doesn't actually conform to that, as it doesn't return a new instance for an empty string builder. The documentation will change in Java 25 to no longer require a new instance.
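For code you control yourself, the same length check can be applied by hand on JDKs without the fix. The following is just a sketch under that assumption: SharedEmptyStrings and toStringSharingEmpty are hypothetical names, and this cannot change what DecimalFormat does internally.

```java
// Hypothetical utility; the names are assumptions for illustration only.
final class SharedEmptyStrings {

    private SharedEmptyStrings() {}

    // Mirrors the length check shown above: empty input yields the "" literal,
    // which is always the same interned instance.
    static String toStringSharingEmpty(CharSequence chars) {
        return chars.length() == 0 ? "" : chars.toString();
    }

    public static void main(String[] args) {
        String a = toStringSharingEmpty(new StringBuilder());
        String b = toStringSharingEmpty(new StringBuffer());
        System.out.println(a == b); // true: both are the "" literal
        System.out.println(toStringSharingEmpty(new StringBuilder("1.5"))); // 1.5
    }
}
```

Taking a CharSequence keeps the helper usable for both StringBuffer and StringBuilder.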
As far as I can tell from a quick look, before Java 15, the behaviour was similar to the Java 21 behaviour (returning new empty instances).
For example, Java 8 (8.0.442) does this in StringBuffer.toString():
@Override
public synchronized String toString() {
    if (toStringCache == null) {
        toStringCache = Arrays.copyOfRange(value, 0, count);
    }
    return new String(toStringCache, true);
}
And while the Java 11 implementation of StringBuffer.toString() was the same as Java 17's, in Java 11 (11.0.26), StringLatin1.newString always returned a new copy:
public static String newString(byte[] val, int index, int len) {
    return new String(Arrays.copyOfRange(val, index, index + len),
                      LATIN1);
}
*: The backport issue JDK-8331299 lists the fix version as 21.0.7-oracle, so I'm not sure if this fix will also land in OpenJDK, or only in the Oracle builds.
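If you want to check at runtime whether you are on one of the versions listed above, something like the following could work. Treat it as a sketch: ToStringFixCheck and listedAsFixed are hypothetical names, the thresholds are simply the versions named in this answer, and the 21.0.7 entry carries the caveat from the footnote (it may only apply to Oracle builds).

```java
// Hypothetical helper; class and method names are assumptions for illustration.
final class ToStringFixCheck {

    private ToStringFixCheck() {}

    // True if the version is one this answer lists as containing JDK-8325730:
    // 23 and later, 22.0.2 and later, or 21.0.7 and later (see the footnote caveat).
    // It says nothing about whether versions before 21 were affected.
    static boolean listedAsFixed(int feature, int update) {
        if (feature >= 23) return true;
        if (feature == 22) return update >= 2;
        if (feature == 21) return update >= 7;
        return false;
    }

    public static void main(String[] args) {
        Runtime.Version v = Runtime.version();
        System.out.println(v + " listed as fixed: " + listedAsFixed(v.feature(), v.update()));
    }
}
```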
I've also run into this issue on versions after Java 19. After analyzing it, I found that, starting from Java 19, the DecimalFormat class uses an internal CompactNumberFormat even if you're not using compact formatting directly. Internally, CompactNumberFormat allocates a number of arrays for different formatting patterns, and most of these arrays get filled with empty strings as placeholders. These empty strings stay around as long as your DecimalFormat instance lives, and if you're using static instances like in your example, they essentially live for the lifetime of your application.
In Java 17 and earlier, this behavior didn't exist, so the extra empty strings simply weren't allocated.
To resolve this, avoid static DecimalFormat instances unless absolutely needed.
Instead of using:
static DecimalFormat df = new DecimalFormat("#.##");
change it to:
DecimalFormat df = new DecimalFormat("#.##");
or even better:
NumberFormat nf = NumberFormat.getNumberInstance();
nf.setMaximumFractionDigits(2);
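For completeness, here is how the two non-static variants could look as local variables. This is a sketch only: LocalFormatDemo and its methods are hypothetical, and Locale.ROOT is my own choice to make the output predictable. It only shows that both variants produce the same output; it does not measure memory.

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.text.NumberFormat;
import java.util.Locale;

// Hypothetical demo class; Locale.ROOT is an assumption made for predictable output.
class LocalFormatDemo {

    // Formatter created per call from an explicit pattern.
    static String withPattern(double value) {
        DecimalFormat df = new DecimalFormat("#.##", DecimalFormatSymbols.getInstance(Locale.ROOT));
        return df.format(value);
    }

    // Formatter obtained from the NumberFormat factory, limited to two fraction digits.
    static String withFactory(double value) {
        NumberFormat nf = NumberFormat.getNumberInstance(Locale.ROOT);
        nf.setMaximumFractionDigits(2);
        return nf.format(value);
    }

    public static void main(String[] args) {
        System.out.println(withPattern(3.14159)); // 3.14
        System.out.println(withFactory(3.14159)); // 3.14
    }
}
```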
Comments:
[...] DecimalFormat, the applyPattern method is called, and that calls toString() on a StringBuffer, overwriting the initial values, and it looks like somewhere the encoder used changed from returning the same empty string instance to returning unique empty string instances. – Mark Rotteveel, commented 18 hours ago
[...] StringBuffer/StringBuilder's toString() method when empty with different Java versions. Note that the applyPattern method has been changed from StringBuffer to StringBuilder between these versions, but the behavior applies to both of them, so this has no effect. – Holger, commented 18 hours ago
[...] StringBuilder.toString() returns "" instead of a new String if its length is 0 – user85421, commented 17 hours ago