Code Complexity

This morning, a user asked a question about determining the equality of three values on ##java. The code he offered as a test was as follows, roughly:

boolean a=false, b=false, c=false;
System.out.println(a == b == c);

Rather than determining if the three values are equivalent, this code checks to see if a is the same as b – with the result of true – and then checks to see if this result is the same as c – so it tries to see if a == b is false. It’s not, so the result of the expression is false. The disassembled code shows it, too:

       0: iconst_0
       1: istore_1
       2: iconst_0
       3: istore_2
       4: iconst_0
       5: istore_3
       6: getstatic     #2    // Field java/lang/System.out:Ljava/io/PrintStream;
       9: iload_1
      10: iload_2
      11: if_icmpne     18
      14: iconst_1
      15: goto          19
      18: iconst_0
      19: iload_3
      20: if_icmpne     27
      23: iconst_1
      24: goto          28
      27: iconst_0
      28: invokevirtual #3    // Method java/io/PrintStream.println:(Z)V
      31: return

This is all well and good, and the original poster was shown code that would work: (a==b && b==c). However… there were other solutions offered. They include:

  • (a == b) == c
  • (a ^ b) == c
  • a ^ b ^ c
  • isSame(a,b,c)

All together now: ugh.
But why? Which ones of those work? Which ones don’t work? (You should probably try to give this some thought before continuing. Be honest with yourself about your answers: nobody else is watching.)
It doesn’t matter.
The reason comes down to code complexity. The simplest solution (a==b && b==c) lacks a certain elegance, I suppose: it’s very straightforward and very, very simple. The other solutions appeal to a certain mentality, the one that says that you have to know something to use this code; you have to think about them some.
You might not have to think much – but only one has the chance of being right, the isSame() method, and that assumes it works properly.
Smart coders will code simply; gauge code by the reward it should give. isSame(), if it accepts multiple types of sequences and has variable arity, might be okay if you can reuse it in multiple scenarios (and it’s needed quite a bit in your code, I guess) — but the others are too complex to really pass a good code review.

Tags on javachannel.org

Articles on javachannel.org are going to start using Odersky’s “Scala Levels” as tags for new content (with old content being rated as they get maintained over time, possibly).
The levels are in two groups: A1, A2, A3, and L1, L2, and L3. The “A” stands for “application;” the “L” stands for “library.” Put simply, the features that tend to be found in applications and libraries are different; an application can prefer concrete types, whereas a library tends not to (if able); the skills and knowledge for writing applications or libraries are different.
An A1 programmer is a beginner at Java; an L3 programmer is expected to really know the language and its features well.
The skills are grouped like: A1, A2/L1, A3/L2, L3. The idea being expressed here is that before you should design a library, you need to be moderately skilled at Java – a beginner shouldn’t bother worrying about expressing ideas in a library.
Odersky actually grouped concepts for Scala in each level (for which he’s gotten some scathing criticism from Tony Morris, for example); eventually, I’d like to have the same kind of groupings for Java, if only to establish a baseline (which is what Odersky was doing, and what Morris apparently missed, in his quest for overreaction. Tony’s a brilliant guy, but like so many other brilliant people in this field, he’s desperate to find points of contention, and then wring them dry. Astute readers might note that Tony’s critique has no way through which to offer commentary…)
So: What you’ll start seeing is tips on javachannel.org being grouped by these levels by tags; the higher the level, the more complex the content is.

Link: The reason to use a buffered stream in Java

The use of buffered streams in Java has come up a few times on ##java recently, so it might be worth reading something that one of the channel residents wrote up about the topic late last year: “The reason to use a buffered stream in Java.”
Aside from being generally useful information, it actually has a test; the conclusion is that using a buffered stream is usually wise because it helps avoid system calls. The test shows this by spending four minutes making system calls, and only one and a half minutes in the actual program code.
Well done.

Detecting Roots in a Graph, and a challenge

How would you detect roots in a simple directed graph? I was recently asked this question, with varying parameters, and there’s an aspect of the problem for which I have no satisfying solution.
A root is a node which points to other nodes, but has no nodes pointing to it; in our simple problem, a root has to point to something, so we don’t have to worry about handling nodes hanging out in space. The relationship will be referred to as a root and a leaf, such that a leaf is a node that is being “pointed to” in a given relationship.
We can visualize the nodes like this:

0,1
0,2
1,2
3,1
4,3

Graphed, it would look like this:
simpledirectedgraph
Our root nodes would be 0 and 4 – the only nodes for which there are no inbound edges.
Without digging too much into rationales, let’s create a simple interface with default behaviors that accomplish our task very simply:

public interface GraphUtility {
  public default Set arrayToSet(int[] data) {
    Set set = new HashSet<>();
    Arrays.stream(data).forEach(set::add);
    return set;
  }
  public default Set findRoots(int[][] graph) {
    Set roots = new HashSet<>();
    Set leaves = new HashSet<>();
    Arrays.stream(graph).forEach(d -> {
      roots.add(d[0]);
      leaves.add(d[1]);
    });
    roots.removeAll(leaves);
    return roots;
  }
}

We’re also going to write a wrapper that keeps track of elapsed time. It’s not great code – multithreading with an instance would be horrible – but it’ll do for what we want.

public class TimingWrapper {
  long elapsedTime=0;
  public Set time(Function> findRoots, int[][] graph) {
    Set set;
    long start=System.nanoTime();
    set=findRoots.apply(graph);
    long end=System.nanoTime();
    elapsedTime=end-start;
    return set;
  }
  @Override
  public String toString() {
    return "TimingWrapper{" +
        "elapsedTime=" + elapsedTime +
        '}';
  }
}

Now let’s build our test. We’re using an anonymous implementation of GraphUtility because the interface has our desired functionality. The wrapper’s time is output on System.out, and isn’t part of the actual test; the numbers would vary based on the actual runtime platform, so we don’t want to rely on the runtime as a metric for success.

GraphUtility util = new GraphUtility() {
};
@Test
public void firstTest() {
  TimingWrapper wrapper = new TimingWrapper();
  int[][] graph = {{0, 1}, {0, 2}, {1, 2}, {3, 1}, {4, 3}};
  Set knownRoots = util.arrayToSet(new int[]{0, 4});
  Set roots = wrapper.time(util::findRoots, graph);
  assertEquals(roots, knownRoots);
  System.out.println(wrapper);
}

Invoke this method enough times, and you’ll see some interesting fluctuations in runtime; on my system, the runtime was around 2500 nanoseconds, which is simply amazing. (There were some outliers, largely based on garbage collection.)
However, that’s a tiny, tiny graph. What happens when we store more nodes? Let’s try it with .. oh, 62,000 nodes:

@Test
public void secondTest() {
  TimingWrapper wrapper = new TimingWrapper();
  int[][] graph = new int[62000][];
  for (int i = 0; i < 62000; i++) {
    graph[i] = new int[]{r.nextInt(7000), r.nextInt(7000)};
  }
  //Set knownRoots = util.arrayToSet(new int[]{0, 4});
  Set roots = wrapper.time(util::findRoots, graph);
  sum += roots.size();
  System.out.println(wrapper);
}

Repeating this test yielded more valuable numbers, and more variant ones, too: we’re now looking at four million nanoseconds for an invocation – so we’re still able to pick out the root nodes in four milliseconds, which isn’t too bad at all.
We can do more with this, too, before we even get to our challenge. Let’s see what happens if we use a database. (If you want to skip ahead to the challenge, feel free to do so; this next section fell out of the sky just because I was interested in seeing how well databases might handle the process.)
We’re going to use a constant graph – built using a Random with a consistent seed – that happens to have ten root nodes in it. First, the code that builds the graph:

public class ConstantGraph {
  int[][] dataSet;
  private ConstantGraph() {
    // DO NOT CHANGE THIS SEED!
    Random random = new Random(1029);
    dataSet = new int[62000][];
    for (int i = 0; i < dataSet.length; i++) {
      int[] datum = new int[2];
      datum[0] = random.nextInt(9000);
      datum[1] = random.nextInt(9000);
      dataSet[i] = datum;
    }
  }
  public int[][] getDataSet() {
    return dataSet;
  }
  public final static ConstantGraph instance = new ConstantGraph();
}

Now we'll build an enum that has all of our SQL isolated for three configurations: embedded Derby, embedded H2, and external Derby. This is really ugly, so we're going to leave it up to the Github repository to hold the actual source and we'll only echo the structure here:

public enum DatabaseInterface {
  DERBYEMBEDDED(...),
  DERBY(...),
  H2(...);
  final String url;
  final String[] ddl;
  final String insert;
  final String dbRootSelection;
  final String dbRowSelection;
}

Now let’s build a test that walks through the enums, testing two different mechanisms of determining the root nodes.
The simple mechanism mirrors what we’ve already seen, simply retrieving the roots and leaves and storing them in two Sets; it then removes the leaves from the set of roots. Therefore, it’s mostly testing how quickly the database can retrieve the data.
The other method is a little more interesting; it does everything in SQL. Here’s the query (which happens to be the same for H2 and Derby):

select distinct root from graph g
  where root not in (select distinct leaf from graph)

Exciting stuff – and chances are that a good DBA can optimize this, honestly (and I’d love to see what a DBA would suggest.) It works pretty well so far, although it’s not quick, as we’ll see.
Here’s our test:

public class ITDatabaseGraphTest {
  ConstantGraph graph = ConstantGraph.instance;
  @Test
  public void testAgainstDatabaseSimple() throws SQLException {
    System.out.println("Simple query test----------------------");
    for (DatabaseInterface database : DatabaseInterface.values()) {
      System.out.println("Testing against "+database.url);
      populateDatabase(database);
      try (Connection conn = DriverManager.getConnection(database.url, "sa", "sa")) {
        try (PreparedStatement ps = conn.prepareStatement(database.dbRowSelection)) {
          Set roots=new HashSet<>();
          Set leaves=new HashSet<>();
          long start = System.nanoTime();
          ResultSet rs=ps.executeQuery();
          while(rs.next()) {
            roots.add(rs.getInt(1));
            leaves.add(rs.getInt(2));
          }
          roots.removeAll(leaves);
          assertEquals(roots, graph.arrayToSet(graph.getRoots()));
          long end = System.nanoTime();
          System.out.println((end-start)+" nanoseconds ("+(end-start)/1000000000.0+" ms)");
        }
      }
    }
  }
  @Test
  public void testAgainstDatabaseComplex() throws SQLException {
    System.out.println("Complex query test---------------------");
    for (DatabaseInterface database : DatabaseInterface.values()) {
      System.out.println("Testing against "+database.url);
      populateDatabase(database);
      try (Connection conn = DriverManager.getConnection(database.url, "sa", "sa")) {
        try (PreparedStatement ps = conn.prepareStatement(database.dbRootSelection)) {
          Set roots=new HashSet<>();
          long start = System.nanoTime();
          ResultSet rs=ps.executeQuery();
          while(rs.next()) {
            roots.add(rs.getInt(1));
          }
          assertEquals(roots, graph.arrayToSet(graph.getRoots()));
          long end = System.nanoTime();
          System.out.println((end-start)+" nanoseconds ("+(end-start)/1000000000.0+" ms)");
        }
      }
    }
  }
  private void populateDatabase(DatabaseInterface database) throws SQLException {
    try (Connection conn = DriverManager.getConnection(database.url, "sa", "sa")) {
      conn.setAutoCommit(false);
      for (String ddl : database.ddl) {
        try (PreparedStatement ps = conn.prepareStatement(ddl)) {
          ps.executeUpdate();
        } catch (SQLException ignored) {
          System.out.println(ignored.getMessage());
        }
      }
      int counter=0;
      try (PreparedStatement ps = conn.prepareStatement(database.insert)) {
        for (int[] data : graph.getDataSet()) {
          ps.setInt(1, data[0]);
          ps.setInt(2, data[1]);
          try {
            ps.executeUpdate();
            counter++;
            if(counter%250==0) {
              conn.commit();
            }
          } catch(SQLException ignored) {
            // we know we have 27 duplicates.
          }
        }
      } catch (SQLException ignored) {
        System.out.println(ignored.getMessage());
      }
      conn.setAutoCommit(true);
    }
  }
}

On my system, I got this output, with the insert process taking roughly 40 seconds:

Complex query test---------------------
Testing against jdbc:derby:foo;create=true
576525951 nanoseconds (0.576525951 ms)
Testing against jdbc:derby://localhost:1527/foo;create=true
527347131 nanoseconds (0.527347131 ms)
Testing against jdbc:h2:./foo
3659439329 nanoseconds (3.659439329 ms)
Simple query test----------------------
Testing against jdbc:derby:foo;create=true
38152214 nanoseconds (0.038152214 ms)
Testing against jdbc:derby://localhost:1527/foo;create=true
67817224 nanoseconds (0.067817224 ms)
Testing against jdbc:h2:./foo
115023709 nanoseconds (0.115023709 ms)

The Challenge

So what we were able to see is that it’s fairly easy to process 62,000 edges with a straight linear process, even factoring in the time a database adds to the mechanism. However, how would you make the root detection multithreaded? What would be a good way to get the detection time down, based on how many cores you could throw at it?
At 620,000 edges, the simple test runs in around 27,000,000 nanoseconds (or 27ms); could you shrink that down? What about 6.2 million edges? What would happen if you tried to add map/reduce? Could you do so?

Watching HotSpot in action, deductively

Today someone asked about the size of java source files, because they wanted to know if they could unroll a loop without making javac throw up, I think. It’s an interesting question, but the main thing that caught the channel’s attention was the claim that an unrolled loop was orders of magnitude faster than the loop itself, such that the unrolled loop took 0.1 millisecond and the normal loop took 0.5 milliseconds – for a loop that simply ran 10000 times.
For one thing, the idea that a loop that runs 10000 times takes 0.1 ms sounds absurd; computers are fast. A Commodore 64 could run that loop in ten seconds. (Actually, after testing it ended up running in roughly twelve seconds. This is the first BASIC code I’ve written in close to thirty years.)
c64timing
Let’s clear one thing out first: java methods – not classes – can be up to 64K. (if you want to know the actual details, see Limitations of the Java Virtual Machine.) I don’t know of a source limitation, although the idea of writing a 64K source file sounds horrifying and rather difficult to manage; I’m sure some enterprising soul can try to determine the largest java source file that can be written such that it still compiles to a valid .class file.
So let’s dig in for a bit, and see what Java does, from the standpoint of observation. Let’s write a loop and time it, with a number of iterations, to see what the JVM actually does to the code as it runs.

This was run on an eight-core Windows 8.1 machine, using the Java 8 virtual machine. Your results may vary based on your clock speed, memory, system load, operating system, and JVM. This should be obvious, but it’s worth remembering.

Here’s our source code, which contains two classes:

package org.javachannel;
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;
import java.util.OptionalDouble;
import java.util.function.Consumer;
public class Thing {
  public static void main(String[] args) {
    Thing t = new Thing();
    for (int i = 0; i < 1500; i++) {
      t.measure(i);
    }
    t.display();
  }
  private void display() {
    SlidingAverage sa = new SlidingAverage();
    slide.stream().forEach(sa);
    sa.display();
    System.out.println("sum: " +
        artifact.stream().mapToInt(Integer::intValue).sum());
  }
  List slide = new ArrayList<>(2000);
  List artifact = new ArrayList<>(2000);
  private void measure(int i) {
    int sum = 0;
    long start = System.nanoTime();
    for (int j = 0; j < 10000; j++) {
      sum += j % 3;
    }
    long end = System.nanoTime();
    slide.add((end - start));
    artifact.add(sum);
  }
}
class SlidingAverage implements Consumer {
  List integers = new LinkedList<>();
  List averages = new ArrayList<>();
  @Override
  public void accept(Long value) {
    integers.add(value);
    if (integers.size() > 30) {
      integers.remove(0);
      OptionalDouble od = integers.stream().mapToLong(Long::longValue).average();
      if (od.isPresent()) {
        averages.add((long) od.getAsDouble());
      }
    }
  }
  public void display() {
    int count = 1;
    for (Long average : averages) {
      if (count % 30 == 0) {
        System.out.printf("%-5d %-7d%n", count, average);
      }
      count++;
    }
  }
}

The most important part of all of this is the loop itself:

  private void measure(int i) {
    int sum = 0;
    long start = System.nanoTime();
    for (int j = 0; j < 10000; j++) {
      sum += j % 3;
    }
    long end = System.nanoTime();
    slide.add((end - start));
    artifact.add(sum);
  }

This runs the inner block 10000 times, adding to the result based on the loop counter; this should prevent the JVM from optimizing the loop out of existence. It also adds a good bit to our runtime, as we'll see, but this is less of an impact than you might think.
The rest of the code is really me trying to be fancy, showing a sliding scale of averages; basically, every "average" is based on the prior thirty samples (where each sample is an iteration through the loop.)
So what did this show me when it ran? Here's some output:

30    56051
60    53855
90    54397
120   55338
150   54269
180   55965
210   55495
...
600   56820
630   64889
660   64304
690   50177
720   9536
750   9536
780   10448
810   9522
840   9536
870   10235
...
1440  9508
1470  9508
sum: 14998500

So what this is showing us is that the average of our first thirty times through the loop was ~56000 nanoseconds. A nanosecond is one billionth of a second, so there are 1000000 nanoseconds in a single millisecond; it’s saying that it took 0.056 milliseconds to run the loop the first time through, on average.
It stabilized for a while, until it ran the loop roughly 700 times; then the average dropped down to roughly 10000 nanoseconds.
If you remove the sum += j%3 from the loop, replacing it with sum += 1, the results are dramatic to say the very least: after 180 times through the loop, the average drops down to around 15 nanoseconds.

So what can we conclude?

I was actually surprised by the results; I expected the averages to fall, but not as quickly in the run as they actually did. I expected Hotspot to compile the code to machine language pessimistically, then run through two more levels of optimization before finally settling in on an optimal variation.
Instead, we saw three levels of optimization (as opposed to four). The first is difficult to see – it’s that very first value, which hides a number of calls (roughly ten) that actually took 100000 nanoseconds. We then stabilized around 56000 nanoseconds, and then dropped 80% of our runtime for each loop.
It’s safe to say that Hotspot did a good job on this code – maybe not perfect, but pretty darned close. I didn’t run this on a JVM with debugging enabled, so I don’t know what actual phases of compilation were used.
It’s also safe to say that these numbers are not reliable. Benchmarks are notoriously difficult to write, and even more notorious in Java (because it compiles and optimizes in multiple stages, as we see here), and even more notorious when it’s a microbenchmark like this one. ~10000 ns sounds like an incredibly long time for such a simple loop (“it’s only four or five lines!”) — but it really isn’t. (For reference, a normal blink of an eye takes around 300-400 milliseconds – or 300,000,000 nanoseconds.)

Tomcat and JNDI

Let’s consider the standard baseline Java EE application: it’s a web application, with some kind of framework, using JDBC or something that wraps it.
In canonical Java EE form, it would be a JSF front end and request routing mechanism, with JPA as the data framework, with resources acquired via JNDI. (Note: “canonical Java EE form” doesn’t imply in any way that this is what people actually do, because normally people avoid JSF like the plague.)
With Tomcat, however, you are more likely to see Spring MVC or Struts 2, with Hibernate or MyBatis, talking directly to the database with an embedded database driver and connection pool.

This is anecdata with respect to the typical Tomcat deployment. We don’t pretend otherwise, and neither should you – nor should you be offended that we see this sort of thing so often that the anecdata seems pretty valid, even without formal substantiation.

One of the problems is that while Tomcat has a proper JNDI container, it’s not very easy to use. It even has documentation on setting up a JNDI datasource (which you should be doing) but it’s not complete; I followed the documentation and it did not work.
I’m not even sure why it didn’t work. With that said, I created an example that does work, and it’s on github.
So let’s walk through this sample project. I’m going to include a JNDI resource, and a servlet that uses it; the servlet will show some metadata about the JDBC driver acquired from JNDI, which should be enough of a smoke test that it validates the deployment.
The container I’m using is Tomcat 8.
My Maven configuration is quite simple, and comes directly from a Maven archetype, with the addition of the database driver and the modification of the Java version to 1.8. (Changes are shown in bold.)

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://maven.apache.org/POM/4.0.0"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <parent>
        <artifactId>examples-parent</artifactId>
        <groupId>org.javachannel.examples</groupId>
        <version>1.0-SNAPSHOT</version>
    </parent>
    <groupId>org.javachannel.examples</groupId>
    <artifactId>servlet-jndi-example</artifactId>
    <version>1.0-SNAPSHOT</version>
    <packaging>war</packaging>
    <name>servlet-jndi-example</name>
    <properties>
        <endorsed.dir>${project.build.directory}/endorsed</endorsed.dir>
        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
    </properties>
    <dependencies>
        <dependency>
            <groupId>javax</groupId>
            <artifactId>javaee-web-api</artifactId>
            <version>7.0</version>
            <scope>provided</scope>
        </dependency>
        <dependency>
            <groupId>org.hsqldb</groupId>
            <artifactId>hsqldb</artifactId>
            <version>2.3.2</version>
        </dependency>
    </dependencies>
    <build>
        <plugins>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-compiler-plugin</artifactId>
                <version>3.1</version>
                <configuration>
                    <source>1.8</source>
                    <target>1.8</target>
                    <compilerArguments>
                        <endorseddirs>${endorsed.dir}</endorseddirs>
                    </compilerArguments>
                </configuration>
            </plugin>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-war-plugin</artifactId>
                <version>2.3</version>
                <configuration>
                    <failOnMissingWebXml>false</failOnMissingWebXml>
                </configuration>
            </plugin>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-dependency-plugin</artifactId>
                <version>2.6</version>
                <executions>
                    <execution>
                        <phase>validate</phase>
                        <goals>
                            <goal>copy</goal>
                        </goals>
                        <configuration>
                            <outputDirectory>${endorsed.dir}</outputDirectory>
                            <silent>true</silent>
                            <artifactItems>
                                <artifactItem>
                                    <groupId>javax</groupId>
                                    <artifactId>javaee-endorsed-api</artifactId>
                                    <version>7.0</version>
                                    <type>jar</type>
                                </artifactItem>
                            </artifactItems>
                        </configuration>
                    </execution>
                </executions>
            </plugin>
        </plugins>
    </build>
</project>

All that’s well and good; now let’s take a look at our servlet, the thing that serves as our smoke test. We’re not including the import or package statements, for brevity.

@WebServlet(name = "example", urlPatterns = "/exampleservlet")
public class ExampleServlet extends HttpServlet {
    @Resource(name = "java:comp/env/jdbc/hsqldb")
    DataSource dataSource;
    @Override
    protected void doGet(HttpServletRequest req, HttpServletResponse resp)
            throws ServletException, IOException {
        PrintWriter out = resp.getWriter();
        resp.setContentType("text/plain");
        try (Connection conn = dataSource.getConnection()) {
            DatabaseMetaData dmd = conn.getMetaData();
            out.printf("We hit the servlet.%nCan we get a data source? %s%n"
                    + "   (instance is %s)%n",
                    dataSource == null ? "no" : "yes",
                    dataSource == null ? "null" : dataSource.toString());
            out.printf("Database version: %d.%d%n",
                    dmd.getDatabaseMajorVersion(),
                    dmd.getDatabaseMinorVersion());
        } catch (SQLException e) {
            e.printStackTrace(out);
        }
    }
}

Note the @Resource in that servlet – it says that the container is responsible for doing a JNDI lookup (using the name “java:comp/env/jdbc/hsqldb“) with the right type (javax.sql.DataSource). We don’t have any other magic to perform… except that we don’t have that name configured anywhere yet.
The way this is done is via a file called context.xml, which – as you might surmise – defines things for a specific context. You can specify contexts in a lot of ways; one way is to modify the container’s main context.xml (which handles global container changes), and you can also modify contexts on a per-host or per-application basis.
Because this is a developer-targeted web application, we’re going to make it an application-specific context; this is going to be held in a resource named /META-INF/context.xml in our deployable WAR file.
Context files are XML, so we need a root node, <Context />, because nothing’s good enough without mixed-case in XML. (Yay, Tomcat!)
What we need to do is define the actual JDBC resource, which is done with the following XML:

<Resource name="jdbc/hsqldb" auth="Container"
          type="javax.sql.DataSource"
          maxTotal="100" maxIdle="30"
          maxWaitMillis="10000"
          username="sa" password=""
          driverClassName="org.hsqldb.jdbcDriver"
          url="jdbc:hsqldb:mem:testdb"/>

This says to use the org.hsqldb.jdbcDriver class as a DataSource, located at the name “jdbc/hsqldb“. In Tomcat, if this is at the web application’s level, it gets placed in the local context (at “java:comp/env/jdbc/hsqldb“, and on deployment, our servlet will have the resource located at that name injected, and we get the exciting output of:

We hit the servlet.
Can we get a data source? yes
   (instance is org.apache.tomcat.dbcp.dbcp2.BasicDataSource@25dc1971)
Database version: 2.3

Now, this isn’t horribly useful, honestly, because it ties the resource directly into the local namespace.
You can (and probably should) put it at the general context.xml (held in $CATALINA_HOME/conf/context.xml), which should be done rarely (because the container needs to be restarted when this file is changed). Then you’d need a ResourceLink that took the global name (which in this example would be “jdbc/hsqldb“) and link it to the local name (“java:comp/env/jdbc/hsqldb“), which would fit the way most Java EE containers would (and should) do it.

Relevant Links

How to execute a group of tasks

Someone on the channel today asked about how to run a series of tasks, with a terminal condition of sorts. Basically, they wanted to run a number of tasks, and know when they were done.
The question was originally built around having a List of results (List<integer>, maybe?), counting the number of results until the number of results matched the number of tasks.
The channel suggested using an ExecutorService, particularly the shutdown() and awaitTermination() methods, with another suggestion being invokeAll() – which seemed particularly apt because the person asking didn’t want to have to build a new ExecutorService over and over again (which the shutdown() call would necessitate, although the cost of doing this shouldn’t be very high.)
Here’s an example with Java 8, of using the invokeAll() to execute a series of tasks; the task is made-up (it basically rolls three six-sided dice to determine a histogram of results, as a callback to Dungeons and Dragons). It has one dependency, on Apache’s commons-lang3.

package org.javachannel.examples.executors;
import org.apache.commons.lang3.StringUtils;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.locks.LockSupport;
import java.util.stream.IntStream;
class Worker implements Callable {
    int roll() {
        LockSupport.parkNanos((long) (Math.random() * 10000000));
        return (int) (Math.random() * 6) + 1;
    }
    @Override
    public Integer call() throws Exception {
        return roll() + roll() + roll();
    }
}
public class Main {
    public static void main(String[] args) throws InterruptedException {
        List workerList = new ArrayList<>();
        IntStream.range(0, 100000).forEach(i -> workerList.add(new Worker()));
        ExecutorService service = Executors.newCachedThreadPool();
        int[] count = new int[19];
        service.invokeAll(workerList).stream().map(f -> {
            try {
                return f.get();
            } catch (Throwable e) {
                return 0;
            }
        }).forEach(m -> count[m]++);
        IntStream.rangeClosed(3, 18).forEach(r -> {
            System.out.printf("%02d %s%n", r, StringUtils.repeat("*", count[r]/100));
        });
    }
}

This code creates a list of tasks, populates it, and then submits all of them in one call (service.invokeAll(workerList)), then uses the Stream API to populate the array of results. It looks a little more complicated than it should because of the lambda to pull the int out of the Future</integer><integer>, which is what return value for call() becomes when it’s submitted to the Executor.
It’s not a perfect example; surely others can come up with better examples. Feel free to submit them!

Returning void type from a method call

Someone on ##java asked about how to return a void value. This is a terrible question, but let’s look at it anyway, just because.
Here’s some example code they asked about, translated:

void foo(boolean bar) {
  if(bar) {
    return baz();
  }
}
void baz() {
  // do stuff here
}

The intent was to call baz() as a replacement for foo()‘s execution; the one with the question didn’t include code after the if() block, but further questions indicated that the remaining part of foo() was to be avoided.
There’s a lot of monstrosity here. The worst is “return baz();“, which won’t compile, and shouldn’t compile; void‘s whole purpose is to avoid being put on the call stack, so “return void” makes no sense.
One way to rewrite this code is obvious:

void foo(boolean bar) {
  if(bar) {
    baz();
    return;
  }
  // extra code goes here,
  // not to be executed if bar is true
}
void baz() {
  // do stuff here
}

This calls baz() and terminates the method execution immediately after. You could also do something else that’s obvious:

void foo(boolean bar) {
  if(bar) {
    baz();
  } else {
    // extra code goes here
  }
}

This has the advantage of a single termination point for the method. From “Code Complete:”

17.1 return

Minimize the number of returns in each routine. It’s harder to understand a routine if, reading it at the bottom, you’re unaware of the possibility that it returned somewhere above.
Use a return when it enhances readability. In certain routines, once you know the answer, you want to return it to the calling routine immediately. If the routine is defined in such a way that it doesn’t require any cleanup, not returning immediately means that you have to write more code.
(Content copied shamelessly from an excellent answer on StackOverflow.)

Initializing arrays of arrays: avoid fill()

A long, long time ago, on a blog far, far away, this horror was posted by someone who really should have known better (and who has since removed this section from the original post):

BTW: Ricky Clarkson pointed out that I missed a golden opportunity to use Arrays.fill() to fill in the “multiple dimensions.” Even here, you get to loop some, but here’s some grin-worthy code:

int[][][] arr = new int[10][10][10];
Arrays.fill(arr[0][0], 5);
Arrays.fill(arr[0], arr[0][0]);
Arrays.fill(arr, arr[0]);

Hey, it works. It’s retarded, but works. 🙂

This… doesn’t work. It actually copies references, so altering any of the deep references changes all of the deep references. You can see this in action with this code:

import java.util.Arrays;
public class ArrayFillBad {
    public static void main(String[] args) {
        int[][] arr = new int[2][4];
        Arrays.fill(arr[0], 5);
        Arrays.fill(arr, arr[0]);
        display(arr);
        arr[1][2] = 1024;
        display(arr);
    }
    private static void display(int[][] arr) {
        for (int[] a1 : arr) {
            String sep = "";
            for (int a2 : a1) {
                System.out.print(sep + a2);
                sep = ", ";
            }
            System.out.println();
        }
        System.out.println();
    }
}

When run, this offers this output:

5, 5, 5, 5
5, 5, 5, 5
5, 5, 1024, 5
5, 5, 1024, 5

See the two new values? Only one is supposed to have been changed.
So what’s happening here? We’re not actually copying array contents, we’re copying array references – so we have multiple copies of one array, copied across each deep reference.
Chances are very good that this is not what you wanted. We can do better than that (where “better” means “actually works,” as opposed to “causes you to fail code reviews.”)
Here’s working code; it’s “uglier” but ugly trumps broken:

public class ArrayFillGood {
    public static void main(String[] args) {
        int[][] arr = new int[2][4];
        for (int[] anArr : arr) {
            Arrays.fill(anArr, 8);
        }
        display(arr);
        arr[1][2] = 6;
        display(arr);
    }
private static void display(int[][] arr) {
        for (int[] a1 : arr) {
            String sep = "";
            for (int a2 : a1) {
                System.out.print(sep + a2);
                sep = ", ";
            }
            System.out.println();
        }
        System.out.println();
    }
}