- The Apache Software Foundation has announced the release of PDFBox 2.0. Apache PDFBox allows for the “creation of new PDF documents, manipulation, rendering, signing of existing documents and the ability to extract content from documents.”
- TechEmpower‘s Web Framework Benchmarks features Rapidoid as the fastest web framework – and Rapidoid is written in Java. It’s worth noting that the benchmark, being a benchmark, isn’t exactly “real world” – and Rapidoid doesn’t win every category – but it’s still pretty impressive to see Java, with its (ancient and outmoded) reputation for lack of speed, featuring so highly here.
- AngularBeans: A Fresh New Take on AngularJS and JavaEE discusses the use of the AngularBeans project to expose functionality from CDI beans.
- Docker Commands and Best Practices Cheat Sheet, from our friends at ZeroTurnaround, is pretty useful.
- Functional Programming: Concepts, Idioms and Philosophy is an attempt to sum up functional programming for people who might not be familiar with the idiom. Not bad, but if you want to really dig in deep, you might check Manning‘s Functional Programming in Scala, which does a fine job exposing you not only to the idea, but its strengths and weaknesses.
- Chronicle Map is, according to the project site, an in-memory key-value store designed for low-latency and/or multi-process applications, notably trading and financial market applications. Looks interesting – there are plenty of distributed key/value stores around, so it might be interesting to see how this one compares to things like Apache Ignite, GigaSpaces’ community edition, Oracle Coherence, Terracotta DSO, and other such candidates.
- Markov Chains explains, well, Markov chains. Basically, Markov chains are a state transition method that predicts the “next state” using probabilities – you can build conversations using Markov chains to predict likely responses. (For example, “vote Trump for President” has likely responses of “Gosh, why” or “heck yeah, let’s build us a wall!”) The reference link is actually a really nice explanation.
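To make the Markov chain item above a little more concrete, here is a minimal sketch of a first-order chain over words. It is not taken from the linked explanation; the class and method names (MarkovSketch, train, next, successors) are made up for illustration, and it assumes that picking uniformly from the list of observed successors is a good enough stand-in for real transition probabilities.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

// Hypothetical illustration: a first-order Markov chain over words.
class MarkovSketch {
    // For each word, every word seen immediately after it. Duplicates are kept,
    // so a uniform pick from the list follows the observed frequencies.
    private final Map<String, List<String>> transitions = new HashMap<>();

    // Records each adjacent pair of words in the text as a transition.
    void train(String text) {
        String[] words = text.trim().split("\\s+");
        for (int i = 0; i + 1 < words.length; i++) {
            transitions.computeIfAbsent(words[i], k -> new ArrayList<>()).add(words[i + 1]);
        }
    }

    // Picks a likely "next state" for the word, or null if none was ever observed.
    String next(String word, Random random) {
        List<String> candidates = transitions.get(word);
        if (candidates == null) {
            return null;
        }
        return candidates.get(random.nextInt(candidates.size()));
    }

    // Exposes the raw successor list, mostly so the model can be inspected.
    List<String> successors(String word) {
        return transitions.getOrDefault(word, Collections.emptyList());
    }
}
```

Train it on a corpus of conversations and call next() repeatedly, feeding each result back in, and you get the "likely response" behavior described above.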
Interesting Links, 17 Mar 2016
This list was originally supposed to be published over a week ago, but life’s been busy. Sorry, folks! Happy St. Patrick’s Day!
- A successful Git branching model considered harmful is a response to another article, A successful Git branching model. Both models can work; which one works better for you depends on a lot of factors that are likely to be unique to your development environment. (I’ve used both: I find the “cactus model” better, personally.)
- The Four Software Engineering Personality Types describes four personalities (surprise) in a development environment: Iron Man, Michaelangelo (the sculptor, not the Teenage Mutant Ninja Turtle), Yoda, and Captain America.
- Iron Man is a tinkerer – get 90% of the project done, really quickly.
- Michaelangelo is the detail-oriented, deep-diving programmer – the one who spends years on a given project, working out every detail. Michaelangelos’ projects tend to be unusable until they’re done – then they’re mission-critical and awesome.
- Yoda is a teacher (or, if you like, a puppet with a hand up his… I mean, “a teacher.”) These are the guys who know tons of stuff, and show it to others, growing an organization and providing wisdom – and a great lever when they focus on doing specific tasks.
- Captain America is the workhorse, the one who’ll roll up his sleeves and do the unpleasant work. Like in the comics, Captain America and Iron Man go well together; Iron Man rockets through the stratosphere, flashy and quick, and Captain America cleans everything up and makes it work well.
- The Deep Roots of Javascript Fatigue goes into the rather chaotic waters of JavaScript development. Java’s in a great place: it’s dynamic enough that the community finds new and interesting ways to develop software all the time, but it’s also stable enough that you’re not having to relearn how to do everything every year, which is the situation you find in JavaScript. Excellent writeup, even if JavaScript development isn’t quite as dire as it might sound on the surface.
- And now into our selection of excellent DZone content: Abstraction Considered Harmful..? has a bit to say about abstractions: they’re good, but sometimes they’re leaky (and therefore can be bad). But mostly they’re useful. From the article: “Abstraction, in and of itself, is not harmful. On the contrary, it’s necessary for progress. What’s harmful is relying on impenetrable barriers to protect our precious programmers from hard problems. After all, the 21st-century engineer understands that in order to play in the sand, we all need to be comfortable getting our feet a little wet from time to time.”
- In Anatomy of a Good Java Test, Sam Atkinson (who will show up again in this same collection of interesting links) walks through a simple recipe for good testing. It looks like it’s based around JUnit4 and Hamcrest – hardly awful choices, but also not necessarily the state of the art (or the only way to write good tests). Good baseline, though.
- In In Defense of the Fifth Year Developer, Matthew Casperson argues for some of the abstraction discussed earlier – the point’s not very clear, but complex code laden with abstractions is easier to test and verify, because it breaks problems down into identifiable units.
- And back to Sam Atkinson: In Constructor vs. Getter: A Better Way he discusses the use of no-operation classes to wrap optional behavior (thus: NoOpNotifier, with methods that do nothing, instead of a null that has to be checked). This simplifies the code path (a good thing), and also helps with that pesky abstraction thing. Good article.
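As a rough illustration of the no-operation idea from the last item, here is a small sketch. Only the NoOpNotifier name comes from the item above; the Notifier interface, its send() method, RecordingNotifier, and OrderService are assumptions invented for this example, not code from the article.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical collaborator interface; the method name is an assumption.
interface Notifier {
    void send(String message);
}

// The no-operation implementation: safe to call, does nothing.
class NoOpNotifier implements Notifier {
    @Override
    public void send(String message) {
        // intentionally empty
    }
}

// A "real" implementation, here one that just remembers what it was told.
class RecordingNotifier implements Notifier {
    final List<String> messages = new ArrayList<>();

    @Override
    public void send(String message) {
        messages.add(message);
    }
}

class OrderService {
    private final Notifier notifier;

    // Callers that do not care about notifications get the no-op, never a null.
    OrderService() {
        this(new NoOpNotifier());
    }

    OrderService(Notifier notifier) {
        this.notifier = notifier;
    }

    String place(String item) {
        notifier.send("ordered " + item); // single code path, no null check
        return "ok:" + item;
    }
}
```

The point is in place(): there is exactly one code path, whether or not anyone is listening.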
Interesting Links, 1 Mar 2016
Happy March 1, it’s April Fool’s Day! Oh, wait…
- From ##java itself:
Anthaas_> 99.7% of people who say C++ is faster are not capable of using the highly-skilled techniques required to make that true.
Now, about how he collected the data to validate that statement…
- Gradle.org posted “Gradle vs Maven Feature Comparison”, with a description of “At long last, a comprehensive feature comparison of Maven vs Gradle that shows in detail what Build Automation requires in the Age of Continuous Delivery.” Surprisingly – or not – Gradle comes out well ahead, but most of the features sound more useful than they actually are for most users. (Until, that is, you really need that feature.)
- Maven Testing Module describes using a Maven module solely for holding resources used for testing. It’s a module that’s included in other project modules at test scope; it has the testing frameworks and other dependencies in it, so your other modules will no longer be cluttered by test resources or artifacts. Cool idea. (For example, you can put H2 in your test project, along with some stored procedures and a test schema for it, and import them into your application for validation… just kidding, avoid stored procedures unless they’re used for every last bit of your data manipulation. And don’t do that.)
- Heinz Kabutz is back, with “Checking HashMaps with MapClashInspector” – which walks through some of the things you should, and could, think about when designing hash codes for your objects. Highly recommended. Precis: “Java 8 HashMap has been optimized to avoid denial of service attacks with many distinct keys containing identical hash codes. Unfortunately performance might degrade if you use your own keys. In this newsletter we show a tool that you can use to inspect your HashMap and view the key distribution within the buckets.”
- Of course the announcement propagates right after the links get published… but Flyway 4.0 has been released. This is a database migration tool – if your schema changes during development (or for any other reason), tools like Flyway are beyond valuable in terms of keeping your schema versioned. Highly recommended. The main alternative to Flyway is Liquibase – that’s not an endorsement of either project, just a plea to save your devops people by using tools designed to help them, instead of making them issue manual SQL to update a schema.
External Program Invocation in Java
Users who wish to shell out from a Java program may be tempted to use Runtime.exec(), which yields a Process. They probably should use zt-exec instead. However, for those who think that using a separate library for something so “simple” is overkill, please read on.
General Notes
Java does not invoke a shell – Java uses execve(). This means that variables like ~, %HOME%, and “$JAVA_HOME” will not be expanded, nor will you be able to use shell built-ins such as cd or while. However, Java can invoke a bash or cmd shell, which can then be fed input and output. For more information on shelling out to an actual shell, please see “Invoking a Shell.”
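Since nothing expands ~ for you, one option is to do the expansion yourself before building the argument list. The following is a minimal sketch, assuming that the JVM's user.home system property is an acceptable stand-in for what a shell would substitute; the HomeExpansion class and expandHome method are hypothetical names.

```java
// Hypothetical helper: expands a leading "~" the way a shell would,
// using the JVM's user.home property instead of a shell.
class HomeExpansion {
    static String expandHome(String path) {
        String home = System.getProperty("user.home");
        if (path.equals("~")) {
            return home;
        }
        if (path.startsWith("~/")) {
            // keep the slash: "~/docs" becomes "<home>/docs"
            return home + path.substring(1);
        }
        // "~otheruser" forms and tildes in the middle are left alone in this sketch
        return path;
    }
}
```

Environment variables can be handled in the same spirit with System.getenv("JAVA_HOME"), which returns null when the variable is not set.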
Invocation with Runtime
Runtime is the most accessible way to execute an external process. Processes are started with Runtime.exec(String), Runtime.exec(String[]), and variants of these that accept an environment variable array and an optional directory. Of the Runtime.exec() overloads, only the String[] variant should be used. Each element of the String[] is a separate argument, with the 0th being the command to be run. For the functionality of an environment variable array and a directory, please see “Invocation with ProcessBuilder”.
No variant of Runtime.exec(String) should be used, because Runtime.exec(String) splits the incoming string on whitespace and pays no attention to quotes. This may not sound bad, as the shell also splits on spaces, but the shell honors quoting. Take the following:
sed 's/a b/c d/gi' "My Documents/foo.txt"
A shell would interpret this as: ["sed", "s/a b/c d/gi", "My Documents/foo.txt"]
Runtime.exec(String) would interpret this as: ["sed", "'s/a", "b/c", "d/gi'", "\"My", "Documents/foo.txt\""]
Always use the Runtime.exec(String[]) variant.
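The splitting problem is easy to reproduce without launching anything. Runtime.exec(String) tokenizes its argument with a plain StringTokenizer, so the sketch below does the same; ExecSplitDemo and naiveSplit are made-up names for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical demo: tokenizes a command line on whitespace only,
// with no understanding of quotes, as Runtime.exec(String) does.
class ExecSplitDemo {
    static List<String> naiveSplit(String command) {
        List<String> parts = new ArrayList<>();
        StringTokenizer tokenizer = new StringTokenizer(command);
        while (tokenizer.hasMoreTokens()) {
            parts.add(tokenizer.nextToken());
        }
        return parts;
    }

    public static void main(String[] args) {
        // six tokens come out, not the three arguments a shell would produce
        System.out.println(naiveSplit("sed 's/a b/c d/gi' \"My Documents/foo.txt\""));
    }
}
```

With the String[] variant you hand over the three arguments yourself, quotes removed, and nothing gets re-split.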
After performing the invocation with Runtime, please see “Using Process”.
Invocation with ProcessBuilder
ProcessBuilder takes a String... or a List<String> as its command argument. Each element is a separate argument, with the 0th being the command to be run. In order to manipulate the environment, ProcessBuilder.environment() returns a mutable Map<String,String> of environment variables. Modify this map directly (it is not a read-only view, as pointed out by the summary javadoc). Setting the directory can be done with a .directory() call.
Other common operations are setting standard output to a file, with redirectOutput(File), and merging standard error into standard output, with redirectErrorStream(true). Starting the process can be done by calling .start().
After performing this invocation, please see “Using Process“.
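Here is a small sketch that pulls the ProcessBuilder calls described above into one place. The command, the LC_ALL variable, and the BuilderSketch class are placeholders chosen for illustration; the sketch deliberately stops short of calling start(), so nothing is actually launched.

```java
import java.io.File;
import java.util.Map;

// Hypothetical example of configuring (but not starting) a ProcessBuilder.
class BuilderSketch {
    static ProcessBuilder configure(File workingDirectory) {
        // each argument is its own element; no quoting or escaping needed
        ProcessBuilder builder =
                new ProcessBuilder("sed", "s/a b/c d/gi", "My Documents/foo.txt");

        // environment() returns the live map, so changes apply to the child process
        Map<String, String> environment = builder.environment();
        environment.put("LC_ALL", "C");

        // run the child in the given directory
        builder.directory(workingDirectory);

        // fold stderr into stdout so there is only one stream to read
        builder.redirectErrorStream(true);

        return builder; // call start() on the result to actually launch it
    }
}
```

Merging the streams this way also sidesteps the need for a second reader thread later on.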
Invoking a Shell
This must be done by actually starting the shell process, and then the shell will interpret variables as normal. This is done with ProcessBuilder, starting from the following parameter list:
List<String> parameters = new ArrayList<>();
if (System.getProperty("os.name").toLowerCase().contains("windows")) {
    parameters.add("cmd");
    parameters.add("/C");
} else {
    parameters.add("/bin/bash");
    parameters.add("-c");
}
This will start the OS-specific shell: cmd.exe on Windows, bash on other systems. Other shells may be substituted on Linux by changing “bash” to the appropriate shell in the above. Those familiar with cmd may want to switch /C to /X – this is improper unless streams are redirected, with ProcessBuilder.inheritIO().
Note that Windows’ PowerShell apparently introduces some difficulties: how it is invoked is different enough that these instructions aren’t enough.
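The parameter list above still needs the actual command line appended as one final argument. A minimal sketch of the whole thing, with the OS name passed in so both branches can be exercised, might look like this; ShellCommand and forOs are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper: builds the argument list for running a command line
// through the platform's shell.
class ShellCommand {
    static List<String> forOs(String osName, String commandLine) {
        List<String> parameters = new ArrayList<>();
        if (osName.toLowerCase(Locale.ROOT).contains("windows")) {
            parameters.add("cmd");
            parameters.add("/C");
        } else {
            parameters.add("/bin/bash");
            parameters.add("-c");
        }
        // the entire command line is ONE argument; the shell does the parsing
        parameters.add(commandLine);
        return parameters;
    }
}
```

Usage would be along the lines of new ProcessBuilder(ShellCommand.forOs(System.getProperty("os.name"), "echo $HOME")).inheritIO().start(), at which point the shell, not Java, expands $HOME.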
Using Process
After a Process is obtained with either of the above methods, the process has been started and will run until its natural death. However, there are several common pitfalls.
You must reap the process with Process.waitFor(), even if you do not care about its input and output. Otherwise, the process will exist as a zombie until the parent process (the JVM) exits.
You do not have to read the output from the process, but if you read either stdout or stderr, you must read both. On some systems, if there is output in stderr, stdout is blocked until this output is consumed. This must be done in a multithreaded fashion if these streams are not joined.
If you want to terminate the process before it finishes, you can call Process.destroy(). Whether this kills the process forcibly (the equivalent of the KILL signal) or merely asks it to stop (TERM) is implementation-dependent, so use it cautiously.
Putting it all together, the following is a sample of shelling out to an external process:
import java.io.IOException;
import java.io.InputStream;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class Example {
    public static void main(String[] args) throws Exception {
        ExecutorService service = Executors.newSingleThreadExecutor();
        Process p = Runtime.getRuntime().exec(new String[]{"echo", "Hello world"});
        // drain stderr on its own thread so the child never blocks on it
        new Thread(new ErrorConsumer(p.getErrorStream())).start();
        //java is the center of the universe: the child's stdout is our input stream
        Future<String> output = service.submit(new OutputConsumer(p.getInputStream()));
        p.waitFor();
        System.out.println(output.get());
        // the executor's worker is a non-daemon thread; shut it down so the JVM can exit
        service.shutdown();
    }

    // reads and throws away everything the child writes to stderr
    private static class ErrorConsumer implements Runnable {
        private final InputStream toDiscard;

        public ErrorConsumer(final InputStream toDiscard) {
            this.toDiscard = toDiscard;
        }

        @Override
        public void run() {
            byte[] buf = new byte[1024];
            try {
                while (toDiscard.read(buf) != -1) ;
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    }

    // collects everything the child writes to stdout into a String
    private static class OutputConsumer implements Callable<String> {
        private final InputStream toRead;

        public OutputConsumer(final InputStream toRead) {
            this.toRead = toRead;
        }

        @Override
        public String call() throws Exception {
            StringBuilder sb = new StringBuilder();
            byte[] buf = new byte[1024];
            int read;
            while ((read = toRead.read(buf)) != -1) {
                sb.append(new String(buf, 0, read));
            }
            return sb.toString();
        }
    }
}
Editor’s note: this code may or may not work as expected. It runs, but may block on some operations – consider yourself warned, prepare to hit ^C, and use it as a point of emphasis on why you should be using zt-exec, shown immediately below…
Or, with zt-exec:
import org.zeroturnaround.exec.ProcessExecutor;
import java.io.IOException;
import java.util.concurrent.TimeoutException;

public class Example2 {
    public static void main(String[] args)
            throws InterruptedException, TimeoutException, IOException {
        String output = new ProcessExecutor()
                .command("echo", "Hello World")
                .readOutput(true)
                .execute()
                .outputUTF8();
        System.out.println(output);
    }
}
Interesting Links, 24 Feb 2016
It’s been a while, and I’m pretty sure I missed some fun stuff, but here goes with a few things:
- Blogger Sam Atkinson has a few here, some good, some bad. I admire how prolific he is.
- “Don’t Rewrite Your Old Application; Refactor!” has some advice for people migrating to new products. It’s got some good thinking in it (rewriting is going to miss stuff, it’s going to take longer than you think) but not a lot of deep reasoning (and misses some possible points, like the resentment from the original architects which has happened to me when I tried to rewrite rather than refactor). Good post.
- Then there’s “Kill Your Dependencies: Java/Maven Edition“, which says not to introduce dependencies until you have no other choice. That’s not terrible advice on the surface, but … it’s terrible advice. Use what you need. If wasting 3MB of disk space gets you one method from a library that saves you time to write or think or test, well, that 3MB of space is cheaper than you are. YMMV, but it’s not good advice – worth reading, though.
- jOOQ‘s “The Mute Design Pattern” shows how you can use Java 8’s lambdas to hide checked exceptions for situations where you Just Don’t Care, leading to code like mute( () -> { doStuff(); } ), which is actually pretty neat. Very handy to have in your coding toolbox, much like Binkley‘s “Java 8 AutoCloseable trick”.
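For anyone who wants to see the shape of the mute idea without clicking through, here is one way it could be written. This is a sketch under my own assumptions, not jOOQ's actual code: the Mute class and the CheckedRunnable interface are invented names.

```java
// Hypothetical sketch of a "mute" helper that swallows checked exceptions.
class Mute {
    // Like Runnable, but allowed to throw checked exceptions.
    @FunctionalInterface
    interface CheckedRunnable {
        void run() throws Exception;
    }

    // Runs the action and deliberately ignores any exception it throws.
    static void mute(CheckedRunnable action) {
        try {
            action.run();
        } catch (Exception ignored) {
            // we Just Don't Care
        }
    }
}
```

With a static import, a call site reads mute(() -> stream.close()); with no try/catch in sight. Use it only where ignoring the failure really is the right answer.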
By the way, feel free to send in stuff you think belongs here!
Programmatic Reload of Logback Configurations
Logback has the capability to programmatically and explicitly load various configurations. This can be useful when you need to adjust logging levels at runtime, and it’s actually pretty easy to do, as well.
You’d want to use something like this for a long-running application, or one that has an extensive load process: imagine a production environment, where you want to see details that would be hidden by convention.
For example, imagine you track a given method invocation, but your production logs don’t include the tracking, because it’s too verbose. But if a problem occurs, you want to be able to see the invocation. Changing the logging configuration and redeploying (or restarting) is an option, but it’s expensive and embarrassing, when all you really need to do is see more information.
The core operative code looks like this:
LoggerContext context = (LoggerContext) LoggerFactory.getILoggerFactory();
context.reset();
JoranConfigurator configurator = new JoranConfigurator();
configurator.setContext(context);
configurator.doConfigure(this.getClass().getResourceAsStream("/logback.xml"));
Note that doConfigure can throw a JoranException if the configuration is invalid somehow.
I built a project (called logback-reloader) to demonstrate this. The project has a LogThing interface, which provides a simple doSomething() method along with an accessor for a Logger; the doSomething() method simply issues a series of calls to generate log entries at different levels.
public interface LogThing {
    Logger getLog();

    default void doSomething() {
        getLog().trace("trace message");
        getLog().debug("debug message");
        getLog().warn("warn message");
        getLog().info("info message");
        getLog().error("error message");
    }
}
I then created two different implementations – ‘FineLogThing’ and ‘CoarseLogThing’ which are identical except that they’re named differently (so that I can easily tune the logging levels).
It would have been easy to use a single implementation and declare two components with Spring, but then I’d be deriving the logger from the Spring name and not the package of the classes. This was just a short path to completion, not necessarily a great design.
Why Spring? Because I’m using Spring at work, and I wanted my test code to be reusable.
Then I created a custom Appender (InMemoryAppender) to provide easy access to logged information. I wanted to do this because I wanted to programmatically check that the logging levels were being changed; the idea is that my custom appender actually maintains a list of logged entries internally so I can query it later. The list of logged entries is a static List because Spring doesn’t maintain the Appenders – logback does – so again, this was a short path to completion, not necessarily a “great design.”
So to put it together, I created a TestNG test class with two tests in it. The only difference in the tests is that one uses “logback.xml” – the default configuration, loaded by default but explicitly included here to remove dependency on order of execution in the tests – and the other uses “logback-other.xml”. (I could have parameterized the tests as well – again, shortest path, not “great design.”)
Our default logback configuration is pretty simple, albeit slightly longer than I’d like:
<configuration>
<appender name="MEMORY" class="com.autumncode.logger.InMemoryAppender"></appender>
<logger name="com.autumncode.components.fine" level="TRACE"></logger>
<logger name="com.autumncode.components.coarse" level="INFO"></logger>
<root level="WARN">
<appender-ref ref="MEMORY"/>
</root>
</configuration>
Note that the element name is appender-ref, with no space before the dash (this site’s Markdown implementation has been known to insert one).
The “other” logback configuration is almost identical. The only difference is in the level for the coarse package:
<configuration>
<appender name="MEMORY" class="com.autumncode.logger.InMemoryAppender"></appender>
<logger name="com.autumncode.components.fine" level="TRACE"></logger>
<logger name="com.autumncode.components.coarse" level="TRACE"></logger>
<root level="WARN">
<appender-ref ref="MEMORY"/>
</root>
</configuration>
Here’s the first test:
@Test
public void testBaseConfiguration() throws JoranException {
    LoggerContext context = (LoggerContext) LoggerFactory.getILoggerFactory();
    context.reset();
    JoranConfigurator configurator = new JoranConfigurator();
    configurator.setContext(context);
    configurator.doConfigure(this.getClass().getResourceAsStream("/logback.xml"));

    appender.reset();
    fineLogThing.doSomething();
    assertEquals(appender.getLogMessages().size(), 5);

    appender.reset();
    coarseLogThing.doSomething();
    assertEquals(appender.getLogMessages().size(), 3);
}
This verifies that the coarse logger doesn’t include as many elements as the fine logger, because the default logback configuration has a more coarse logging granularity set for its package.
The other test is almost identical, as stated: the only differences are in the logback configuration file and the number of messages the coarse logger is expected to have created.
So there you have it: a simple example of reloading logback configuration at runtime.
It’s worth noting that this isn’t “new information.” It’s actually shown pretty well at sites like “Obscured by Clarity,” for example. The only contribution here is the building of a project with running code, as well as loading the configuration from the classpath as opposed to from a filesystem.
Interesting Links, 15 Feb 2016
- A great quote from ##java:
< surial> maven/gradle are to ant, as svn is to cvs.
- JavaCPP is a new project that attempts to bridge a gap between C++ and Java, entering the muddy waters along with JNI and JNA (as well as a few other such projects). It actually looks pretty well done – and targets Android as well as the JVM, which seems like a neat trick.
- First in a couple from DZone: “Reactive Programming by Example: Hands-On with RxJava and Reactor” is a presentation (thus a video) of a use of RxJava. Reactive programming is one way to introduce a scalable processing model to your code, although it’s hardly the only one (and it’s not flawless, either, so if you’re one of the anti-reactive people, cool your jets, it’s okay). If you’ve been wondering what this whole reactive thing is, here’s another chance to learn.
- Speaking of learning: “Monads: It’s OK to Feel Stupid” punts on the idea of describing what a monad is, saying that it’s okay if you don’t understand them – you can use them anyway. (Java’s streams provide a lot of access to functionality through monads, which present “computations represented as sequences of steps.”)
- “The 5 Golden Rules of Giving Awesome Customer Support” goes through some basic things to think about for, of all things, customer support. (Surprise!) The things are topics, not good headings, but one thing they didn’t point out was that people who use your open source software library are customers, too. You’ll want to read the article to get more relevance out of the headings. The points are:
- All users are customers
- Your customer took the time
- Your customer is abrasive and demanding
- Your customer doesn’t understand your product
- Your customer doesn’t care about your workflows
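Circling back to the monads item in the list above: you really can use them without a formal definition. Here is a minimal sketch of “computations represented as sequences of steps” using java.util.Optional; the OptionalSteps class and its methods are made up for illustration, and the arithmetic is arbitrary.

```java
import java.util.Optional;

// Hypothetical example: each step only runs if the previous one produced a value.
class OptionalSteps {
    // Step that may fail: turn text into a number, or produce "nothing".
    static Optional<Integer> parse(String text) {
        try {
            return Optional.of(Integer.parseInt(text.trim()));
        } catch (NumberFormatException e) {
            return Optional.empty();
        }
    }

    // Chains the steps: null check, parse, reject zero, then divide.
    static Optional<Integer> percentShare(String input) {
        return Optional.ofNullable(input)
                .flatMap(OptionalSteps::parse)
                .filter(n -> n != 0)
                .map(n -> 100 / n);
    }
}
```

Notice that there is not a single if statement in percentShare(); the failure cases simply fall out of the chain as an empty Optional.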
Interesting Links, 9 Feb 2016
- From Parks Computing, a short word of advice in “On Recruiting” for the movers and shakers (and those who want to be movers and shakers): “The quality of your company’s software will never exceed the quality of your company’s software developers.”
- DZone is back with a few interesting posts: “OpenJDK – Is Now the Time?” starts by wondering if OpenJDK is reaching critical mass to the point where it should be considered instead of the standard Oracle JDK. It’s an odd post.
- It points out that if Google had used OpenJDK instead of Oracle’s libraries, the lawsuit might not have happened (Editor’s note: it might have!). This is a good point.
- It says that the deployment options might open up, with standard package management instead of a custom update process specific to Java. This is also a good point.
- It points out that OpenJDK’s performance and scalability are the same as the Oracle JDK’s. This is… not a good point. The codebases are the same (they’re routinely synchronized: code in one will be in the other eventually.) Oracle’s JDK is effectively OpenJDK with some closed-source libraries, so Oracle’s JVM can write JPEGs natively (and some other features like that.)
- It also points out community improvements to OpenJDK – “As open source developer’s continue to provide insight into the source code, it is likely that OpenJDK could begin to outperform the version released by Oracle.” Um… since the codebases are the same, that’s not likely to happen much at all.
- From ##java, cheeser had a beautiful expression of reference equivalence. Someone was asking about how two references (A and B) pointing to the same object work – cheeser said, “If B is your *name*, A would be a nickname. Both of them mean you so anything said to either name or nickname both go to you.”
- “Fix PATH environment variable for IntelliJ IDEA on Mac OS X” describes a way for OSX users to provide the OS’s PATH to the popular IDE. It turns out that programs installed via brew aren’t necessarily available to IDEA unless you start IDEA from the shell – which few do. It’s easy to fix; this post shows you how.
- Another from DZone – they’re on fire! – Per-Åke Minborg posted “Overview of Java’s Escape Analysis”, which discusses what escape analysis is (it’s a way of determining the visibility of an object) and what it means for performance. (If an object isn’t used outside of a method or a block, it can be allocated on the stack rather than on the JVM heap – and as fast as the heap can be in Java, the stack is much faster.)
- Pippo is a new, very small microframework based on services. The example looks … easy enough; take a look, see what you think.
- Yet one more from DZone: Exceptions in Java: You’re (Probably) Doing It Wrong advocates the use of RuntimeException to get rid of those pesky throws clauses and forced try/catch blocks in your Java code. It’s an argument Spring advocates, and checked exceptions aren’t part of languages like Scala… but I personally find the over-reliance on unchecked exceptions to be terrible. The core argument against checked exceptions from the article: “The old argument is that (the use of checked exceptions) “forces” developers to handle exceptions properly. Anyone on a real code base knows that this does not happen, and Exceptions are routinely ignored, or printed out and then ignored. This isn’t some sign of a terrible developer; it is such a common occurrence that it is a sign that checked Exceptions are broken.” Except no, it’s such a common occurrence that it’s a sign that developers are terrible. This article was so terrible that I’ll probably write up a better response as soon as I get some time.
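To put cheeser’s name-and-nickname analogy from the list above into code, here is a tiny sketch (AliasDemo is a made-up class). Two variables hold the same reference, so a change made through one is visible through the other.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo: two references, one object.
class AliasDemo {
    static boolean nicknameSeesChange() {
        List<String> b = new ArrayList<>(); // b is the "name"
        List<String> a = b;                 // a is the "nickname": same list, no copy
        a.add("hello");                     // said to the nickname...
        // ...and heard by the name: same identity, same contents
        return a == b && b.contains("hello");
    }
}
```

Assignment copies the reference, not the object; to get an independent list you would have to ask for one, for example with new ArrayList<>(b).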
Interesting Links, 5 Feb 2016
- “O Java EE 7 Application Servers, Where Art Thou?” is a humorously-titled summary of the state of Java EE 7 deployment options, covering the full and web profiles for Java EE 7. It’s the sort of thing one wants to know, honestly: great job, Antonio.
- From Stack Overflow, “How to get started with Akka streams?” is a Scala question, not a Java one, but Akka has a Java implementation as well. The first answer (accepted, upvoted) is a fantastic explanation. I may port it to pure Java just for example’s sake…
- From our friends at DZone, Orson Charts 1.5 is Open Source announces that Orson Charts 1.5 has been released, and it’s available under the GPLv3 (a commercial license is available for people who don’t want the restrictions of the GPL). It’s a 3D charting library, not a 2D charting library, and they say if you need 2D charts, you should use JFreeChart – Orson Charts looks great on first impressions, though. (It’s worth noting that apparently both Orson Charts and JFreeChart were from the same author.)
- More from DZone: Application Security for Java Developers is a summary of security concerns. It’s really more of a short “have you thought of this?” post – useful, but not very deep.
The case of EnumSet
A few days ago ##java happened to discuss sets and bit patterns and things like that, and I happened to mention EnumSet and that I find it useful. The rest of the gang wanted to know how it actually measures up, so this is a short evaluation of how EnumSet stacks up for some operations. We are going to look at a few different things.
EnumSet classes
There are two different versions of EnumSet:
* RegularEnumSet, used when the enum has 64 or fewer values
* JumboEnumSet, used when the enum has more than 64 values
Looking at the code, it is easy to see that RegularEnumSet stores the bit pattern in one long and that JumboEnumSet uses a long[]. This of course means that JumboEnumSets are quite a lot more expensive, both in memory usage and CPU usage (at least one extra level of memory access).
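To show what “stores the bit pattern in one long” means in practice, here is a stripped-down sketch of the idea. It is not the JDK’s RegularEnumSet source; BitSetSketch and its Color enum are invented for illustration, and the sketch only works for enums with at most 64 constants.

```java
// Hypothetical illustration of a set of enum constants kept in a single long.
class BitSetSketch {
    enum Color { RED, GREEN, BLUE }

    // Bit n is set when the constant with ordinal n is in the set.
    private long bits;

    void add(Color c) {
        bits |= 1L << c.ordinal();
    }

    void remove(Color c) {
        bits &= ~(1L << c.ordinal());
    }

    boolean contains(Color c) {
        return (bits & (1L << c.ordinal())) != 0;
    }

    int size() {
        return Long.bitCount(bits); // number of set bits
    }
}
```

Every operation is a shift and a mask on one field, which goes a long way towards explaining the numbers further down. A jumbo variant has to pick the right element of a long[] first.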
Memory usage
I created a little program to just hold one million Sets with a few values in each of them.
Note: the enumproject.zip was built by your editor, not your author – any problems with it are the fault of dreamreal and not ernimril. Note that the project is mostly for source reference and not actually running the benchmark.
List<Set<Token>> tokens = new ArrayList<> ();
for (int i = 0; i < 1_000_000; i++) {
    Set<Token> s = new HashSet<> ();
    s.add (Token.LF);
    s.add (Token.CR);
    s.add (Token.CRLF);
    tokens.add (s);
}
Heap memory usage for this program was about 250 MB according to JVisualVM.
Changing the new HashSet<> (); into EnumSet.noneOf (Token.class); we instead get 70 MB of heap memory usage.
Using the SmallEnum instead causes the HashSet to still use about 250 MB, but drops the EnumSet usage down to 39 MB. I find it quite nice to save that much memory.
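The swap really is a one-line change, because both classes are plain java.util.Set implementations. A minimal sketch follows; the four-constant Token enum here is a hypothetical stand-in for the much larger one in the project, and SetSwap is a made-up class name.

```java
import java.util.EnumSet;
import java.util.HashSet;
import java.util.Set;

// Hypothetical demo: the same set contents built with HashSet and with EnumSet.
class SetSwap {
    enum Token { LF, CR, CRLF, WHITESPACE } // stand-in for the real Token enum

    static Set<Token> lineEndingsWithHashSet() {
        Set<Token> s = new HashSet<>();
        s.add(Token.LF);
        s.add(Token.CR);
        s.add(Token.CRLF);
        return s;
    }

    static Set<Token> lineEndingsWithEnumSet() {
        Set<Token> s = EnumSet.noneOf(Token.class); // the only line that changes
        s.add(Token.LF);
        s.add(Token.CR);
        s.add(Token.CRLF);
        return s;
    }
}
```

Code that only sees Set<Token> cannot tell the difference, so the memory saving comes without touching any callers.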
CPU performance
I constructed two simple tests, shown below, that call a few methods on a Set that is either an EnumSet or a HashSet, depending on the run. The enums have a few Sets that contain different allocations of the enum, and the isX methods only do return xSet.contains(this);
@Benchmark
public void testRegular() throws InterruptedException {
    SmallEnum s = SmallEnum.A;
    boolean isA = s.isA ();
    boolean isB = s.isB ();
    boolean isC = s.isC ();
    boolean res = isA | isB | isC;
}

@Benchmark
public void testJumbo() throws InterruptedException {
    Token t = Token.WHITESPACE;
    boolean isWhitespace = t.isWhitespace ();
    boolean isIdentifier = t.isIdentifier ();
    boolean isKeyword = t.isKeyword ();
    boolean isLiteral = t.isLiteral ();
    boolean isSeparator = t.isSeparator ();
    boolean isOperator = t.isOperator ();
    boolean isBitOrShiftOperator = t.isBitOrShiftOperator ();
    boolean res =
        isWhitespace | isIdentifier | isKeyword | isLiteral |
        isSeparator | isOperator | isBitOrShiftOperator;
}
I did the benchmarking using jmh in order to find out how fast this is.
Using HashSet:
Benchmark Mode Cnt Score Error Units
EnumSetBenchmark.testJumbo thrpt 20 46787074.985 ± 2373288.078 ops/s
EnumSetBenchmark.testRegular thrpt 20 124474882.016 ± 2165015.166 ops/s
Using EnumSet:
Benchmark Mode Cnt Score Error Units
EnumSetBenchmark.testJumbo thrpt 20 112456096.790 ± 320582.588 ops/s
EnumSetBenchmark.testRegular thrpt 20 563668720.636 ± 594323.541 ops/s
This is of course quite a silly test, and one can argue that it does not do much that is useful, but it still gives us quite a good indication that performance gains are there. For this kind of operation, using EnumSet is 2.4 times faster for jumbo enums and 4.5 times faster for small (regular) enums.
I do not claim that your usage will notice the same speedup, but it might be worth checking out.
Final thoughts
Does it really matter if you use EnumSet or Set? In most cases: no, the enum will only be one field and not part of memory usage or CPU consumption, but depending on your use case it can be a nice memory saver while also being faster. I recommend that you use it.