A few days ago ##java happened to discuss sets, bit patterns, and similar things, and I mentioned EnumSet and that I find it useful. The rest of the gang wanted to know how it actually measures up, so this is a short evaluation of how EnumSet stacks up for some operations. We are going to look at the implementation classes, memory usage, and CPU performance.
EnumSet classes
There are two different implementations of EnumSet:
* RegularEnumSet, used when the enum has 64 or fewer values
* JumboEnumSet, used when the enum has more than 64 values
Looking at the code, it is easy to see that RegularEnumSet stores the bit pattern in one long and that JumboEnumSet uses a long[]. This of course means that JumboEnumSets are quite a lot more expensive, both in memory usage and CPU usage (at least one extra level of memory access).
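To make the bit-pattern idea concrete, here is a minimal sketch of the two storage schemes. It is not the JDK source: the class BitPatternSketch, its methods, and the Color enum are hypothetical names used only for illustration. The single-long variant uses the ordinal as a bit index, while the long[] variant first picks a word (ordinal / 64) and then a bit inside it, which is where the extra memory access comes from.

```java
// Illustrative sketch only, not the JDK implementation.
// BitPatternSketch, Color, and the helper methods are hypothetical names.
class BitPatternSketch {
    enum Color { RED, GREEN, BLUE }

    // Single-long variant: bit number ordinal() is set when the constant is present.
    static long add(long bits, Enum<?> e) {
        return bits | (1L << e.ordinal());
    }

    static boolean contains(long bits, Enum<?> e) {
        return (bits & (1L << e.ordinal())) != 0;
    }

    // long[] variant: ordinal >>> 6 (ordinal / 64) selects the word.
    // For a long, Java only uses the low 6 bits of the shift count,
    // so 1L << ordinal selects bit (ordinal % 64) inside that word.
    static void add(long[] words, int ordinal) {
        words[ordinal >>> 6] |= 1L << ordinal;
    }

    static boolean contains(long[] words, int ordinal) {
        return (words[ordinal >>> 6] & (1L << ordinal)) != 0;
    }

    public static void main(String[] args) {
        long bits = 0L;
        bits = add(bits, Color.RED);
        bits = add(bits, Color.BLUE);
        System.out.println(Long.toBinaryString(bits));   // prints 101
        System.out.println(contains(bits, Color.GREEN)); // prints false

        long[] words = new long[2]; // room for up to 128 constants
        add(words, 70);
        System.out.println(contains(words, 70));         // prints true
    }
}
```

In both variants a membership test is a shift, a mask, and a comparison; the array variant just adds an index computation and one more load.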
Memory usage
I created a little program that just holds one million Sets with a few values in each of them.
Note: the enumproject.zip was built by your editor, not your author – any problems with it are the fault of dreamreal and not ernimril. Note that the project is mostly for source reference and not actually running the benchmark.
List<Set<Token>> tokens = new ArrayList<> ();
for (int i = 0; i < 1_000_000; i++) {
    Set<Token> s = new HashSet<> ();
    s.add (Token.LF);
    s.add (Token.CR);
    s.add (Token.CRLF);
    tokens.add (s);
}
Heap memory usage for this program was about 250 MB according to JVisualVM.
Changing new HashSet<> (); into EnumSet.noneOf (Token.class); brings heap memory usage down to about 70 MB.
With SmallEnum instead of Token, the HashSet version still uses about 250 MB, but the EnumSet version drops to 39 MB. I find it quite nice to save that much memory.
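The swap described above only touches the line that creates the set. The following sketch shows that with a factory parameter, so the same fill loop can be run with either implementation. It assumes a small hypothetical enum, LineEnd, in place of Token and SmallEnum, whose definitions are not shown here, and it uses a much smaller count than the one million sets from the measurement.

```java
import java.util.ArrayList;
import java.util.EnumSet;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Supplier;

// Illustrative sketch: SetFactorySketch and LineEnd are hypothetical names.
class SetFactorySketch {
    // Hypothetical stand-in for the enums used in the measurement.
    enum LineEnd { LF, CR, CRLF, NONE }

    // Builds 'count' sets; the factory decides which Set implementation is used.
    static List<Set<LineEnd>> fill(int count, Supplier<Set<LineEnd>> factory) {
        List<Set<LineEnd>> sets = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            Set<LineEnd> s = factory.get();
            s.add(LineEnd.LF);
            s.add(LineEnd.CR);
            s.add(LineEnd.CRLF);
            sets.add(s);
        }
        return sets;
    }

    public static void main(String[] args) {
        List<Set<LineEnd>> hashSets = fill(1_000, HashSet::new);
        List<Set<LineEnd>> enumSets = fill(1_000, () -> EnumSet.noneOf(LineEnd.class));

        // Same contents, different storage.
        System.out.println(hashSets.get(0).equals(enumSets.get(0))); // prints true
        System.out.println(enumSets.get(0) instanceof EnumSet);      // prints true
    }
}
```

Because both factories return a Set, the rest of the program does not need to know which implementation it holds; only the heap footprint differs.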
CPU performance
I constructed two simple tests, shown below, that call a few methods on a Set that is either an EnumSet or a HashSet, depending on the run. The enums have a few Sets that contain different subsets of the enum's values, and the isX methods only do return xSet.contains(this);
@Benchmark
public void testRegular() throws InterruptedException {
    SmallEnum s = SmallEnum.A;
    boolean isA = s.isA ();
    boolean isB = s.isB ();
    boolean isC = s.isC ();
    boolean res = isA | isB | isC;
}

@Benchmark
public void testJumbo() throws InterruptedException {
    Token t = Token.WHITESPACE;
    boolean isWhitespace = t.isWhitespace ();
    boolean isIdentifier = t.isIdentifier ();
    boolean isKeyword = t.isKeyword ();
    boolean isLiteral = t.isLiteral ();
    boolean isSeparator = t.isSeparator ();
    boolean isOperator = t.isOperator ();
    boolean isBitOrShiftOperator = t.isBitOrShiftOperator ();
    boolean res =
        isWhitespace | isIdentifier | isKeyword | isLiteral |
        isSeparator | isOperator | isBitOrShiftOperator;
}
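The enums themselves are not shown here, so the following is only a guess at their shape: a minimal sketch of an enum whose isX methods delegate to xSet.contains(this). The constants and the membership of each set are made up for illustration. Switching a set from EnumSet.of(...) to a HashSet is a change to the field initializer only.

```java
import java.util.EnumSet;
import java.util.Set;

// Illustrative sketch: the constants and set contents are assumptions,
// not the enums used in the benchmark.
class IsMethodSketch {
    enum SmallEnum {
        A, B, C, D;

        // Static fields of an enum are initialized after its constants,
        // so the constants can safely be referenced here.
        private static final Set<SmallEnum> A_SET = EnumSet.of(A, D);
        private static final Set<SmallEnum> B_SET = EnumSet.of(B);
        private static final Set<SmallEnum> C_SET = EnumSet.of(C, D);

        boolean isA() { return A_SET.contains(this); }
        boolean isB() { return B_SET.contains(this); }
        boolean isC() { return C_SET.contains(this); }
    }

    public static void main(String[] args) {
        SmallEnum s = SmallEnum.A;
        System.out.println(s.isA()); // prints true
        System.out.println(s.isB()); // prints false
        System.out.println(SmallEnum.D.isA() & SmallEnum.D.isC()); // prints true
    }
}
```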
I did the benchmarking using jmh in order to find out how fast this is.
Using HashSet:
Benchmark                      Mode   Cnt          Score         Error  Units
EnumSetBenchmark.testJumbo     thrpt   20   46787074.985 ± 2373288.078  ops/s
EnumSetBenchmark.testRegular   thrpt   20  124474882.016 ± 2165015.166  ops/s
Using EnumSet:
Benchmark                      Mode   Cnt          Score         Error  Units
EnumSetBenchmark.testJumbo     thrpt   20  112456096.790 ±  320582.588  ops/s
EnumSetBenchmark.testRegular   thrpt   20  563668720.636 ±  594323.541  ops/s
This is of course quite a silly test, and one can argue that it does not do much that is useful, but it still gives a good indication that the performance gains are there. For this kind of operation, using EnumSet is about 2.4 times faster for jumbo enums and about 4.5 times faster for small (regular) enums.
I do not claim that your usage will see the same speedup, but it might be worth checking out.
Final thoughts
Does it really matter if you use EnumSet or another Set? In most cases: no, the set will only be one field and not a significant part of memory usage or CPU consumption. Depending on your use case, though, it can be a nice memory saver while also being faster. I recommend that you use it.