How (not) to extend standard collection classes

Today in ##java, someone had a problem: They wanted a bounded-size PriorityQueue. One solution offered was extending the PriorityQueue class to add this behavior1.
This article will talk about why that is not a good solution.
Here is an example implementation of this approach:

public class BoundedPriorityQueue<E> extends PriorityQueue<E> {
    private final int bound;
    public BoundedPriorityQueue(Comparator<? super E> comparator, int bound) {
        super(comparator);
        this.bound = bound;
    }
    @Override
    public boolean add(E e) {
        if (size() >= bound) { return false; }
        return super.add(e);
    }
}

Now… This looks somewhat correct, if you don’t pay attention. But it isn’t! The important thing missed here is that the Queue.offer(E) method is not overridden, and gives the user a way to put this queue into an invalid state (too many items).
In this particular case this is not a huge issue. There are only two methods, offer and add, that increase the size of a PriorityQueue, and add happens to delegate to offer. The important thing to get out of this example is that this bug can happen in the first place, and that the compiler – or runtime – won’t warn you about it.

Extending ArrayList

Let’s get to a more problematic example. We want to have a List that can only be added to, but never removed from.

public class AddOnlyArrayList<E> extends ArrayList<E> {
    @Override public E remove(int index) { throw new UnsupportedOperationException(); }
    @Override public boolean remove(Object o) { throw new UnsupportedOperationException(); }
    @Override public boolean removeAll(Collection<?> c) { throw new UnsupportedOperationException(); }
    @Override public void clear() { throw new UnsupportedOperationException(); }
    @Override public Iterator<E> iterator() { return Iterators.unmodifiableIterator(super.iterator()); }
    @Override public List<E> subList(int fromIndex, int toIndex) { return ...; }
    @Override public ListIterator<E> listIterator() { return ...; }
    @Override public ListIterator<E> listIterator(int index) { return ...; }
}

Hey, this is good, right? All remove methods and clear accounted for, iterators and subList also implemented, should be fine! Well, that’d be right. In Java 7.
With Java 8, Collection.removeIf(Predicate<E>) was added. This is a default method on Collection that normally only delegates to iterator, but ArrayList has its own, optimized version of it.
Suddenly, just with a JDK update, our class breaks and we may not even notice it. The issue is pretty subtle – it may never appear, and when it does happen, the only way we’ll notice is that our “AddOnly”ArrayList suddenly is shrinking because someone elsewhere decided to use that fancy new removeIf method.
The core problem with both these examples is that extension is fragile. A class has to be designed with extension in mind if it’s supposed to be extensible this way. While most standard library classes aren’t final, many of them are not intended to be extended, or only offer hooks for some specific forms of extension.

How to do it properly

You hear this repeated often, but many do not really seem to understand why: Prefer composition over inheritance. Here’s our AddOnlyArrayList with composition:

public class AddOnlyArrayList<E> extends AbstractList<E> {
    private final List<E> elements = new ArrayList<>();
    @Override public E get(int index) { return elements.get(index); }
    @Override public int size() { return elements.size(); }
    @Override public E set(int index, E element) { return elements.set(index, element); }
    @Override public void add(int index, E element) { elements.add(index, element); }
}

remove is unsupported by this class, but importantly, this can never change. The elements list we’re delegating to is fully encapsulated, and we control all access to it. There’s no way for a future change in List, AbstractList or even ArrayList itself to allow removal of elements. This approach is a lot less fragile than extending ArrayList itself.
Here’s our implementation of BoundedPriorityQueue following the same principle:

public class BoundedPriorityQueue<E> extends AbstractQueue<E> {
    private final Queue<E> elements;
    private final int bound;
    public BoundedPriorityQueue(Comparator<? super E> comparator, int bound) {
        this.elements = new PriorityQueue<>(comparator);
        this.bound = bound;
    }
    @Override public Iterator<E> iterator() { return elements.iterator(); }
    @Override public int size() { return elements.size(); }
    @Override public E poll() { return elements.poll(); }
    @Override public E peek() { return elements.peek(); }
    @Override
    public boolean offer(E e) {
        if (elements.size() > bound) return false;
        return elements.offer(e);
    }
}

Disadvantages

Of course, delegation has its disadvantages.

Memory overhead

There’s an additional object for the wrapper. This isn’t really a problem in practice, though.

Performance

You can’t take advantage of all optimizations a class may introduce in the future. ArrayList.removeIf is a great example again – the delegating class will only delegate a subset of methods, falling back to the potentially unoptimial behavior of AbstractList for others. But importantly, while this may be slower, it’s always correct – if it becomes a problem, just delegate that method as well. At least it won’t suddenly lead to broken behavior.

Missing Optional Methods

It’s possible to miss certain optional methods when delegating. In the standard AbstractList example, it’s very easy to forget to delegate the set(int, E) method since it isn’t used very often.
However, this is usually fail-fast – it’ll throw an Exception and you’ll notice it – and much preferrable to the unexpected and subtle results wrong extension may produce.

Additional Code

Especially with large interfaces and interfaces where you don’t have a handy stub class like AbstractList available, there can be quite a lot of methods to delegate. This is probably the biggest problem with delegation and a good reason to keep interfaces small.

Tooling

Your IDE can assist you with generating delegation methods.
If you are a fan of lombok, it offers an experimental @Delegate annotation. But beware: Using this annotation with only excludes defined can lead to the very same issue! Take the example from the lombok website:

class ExcludesDelegateExample {
  long counter = 0L;
  private interface Add {
    boolean add(String x);
    boolean addAll(Collection<? extends String> x);
  }
  @Delegate(excludes=Add.class)
  private final Collection<String> collection = new ArrayList<String>();
  public boolean add(String item) {
    counter++;
    return collection.add(item);
  }
  public boolean addAll(Collection<? extends String> col) {
    counter += col.size();
    return collection.addAll(col);
  }
}

Here we are missing proper handling for the listIterator.add and corresponding subList methods – we can enter invalid states and can have subtle bugs. Be very careful when using this annotation!

Conclusion

Don’t extend classes that weren’t meant to be extended. A class not being final does not give us a free pass – non-final is the default in java, and many programmers do not design their classes with extension in mind (and that’s fine).
Before extending a class with actual logic in it, think about alternatives. Can you use delegation instead? With collections and their convenient Abstract* stub implementations it’s easy, but with small interfaces you may not even need that.


Immutability in Java

Recently in ##java, we had a discussion on what constitutes “immutability” and how to implement it in java properly. Immutability is a core concept for functional programming, and as mainstream languages and ecosystems “catch up” with the safer, less error-prone ideas from functional programming, immutability is wanted in Java projects as well.

What is immutability?

Asking programmers for their definition of immutability, you will usually receive an answer such as this:

An object is immutable if its state never changes.

In this article we will look at different interpretations of that statement and see how to implement it in java.

Unmodifiable vs. Immutable

An unmodifiable object is an object that cannot be mutated. This does not mean it cannot change through other means:

List<String> inner = new ArrayList<>();
inner.add("foo");
List<String> unmodifiable = Collections.unmodifiableList(inner);
System.out.println(unmodifiable); // [foo]
inner.add("bar");
System.out.println(unmodifiable); // [foo, bar]

Unmodifiable objects are generally not considered “truly” immutable. An immutable version of the code above (using Guava‘s ImmutableList) would be:

List<String> inner = new ArrayList<>();
inner.add("foo");
List<String> immutable = ImmutableList.copyOf(inner);
System.out.println(immutable); // [foo]
inner.add("bar");
System.out.println(immutable); // [foo]

Polymorphism

Designing an immutable class to be extended is tricky business. The base class must either relax its immutability or make basically all of its methods final which makes polymorphism basically useless. Relaxing immutability guarantees will often lead to only an unmodifiable class:

public class Vec2 {
    private final int x;
    private final int y;
    public Vec2(int x, int y) {
        this.x = x;
        this.y = y;
    }
    public int getX() { return x; }
    public int getY() { return y; }
    @Override public int hashCode() { return getX() * 31 + getY(); }
    @Override public boolean equals(Object o) { ... }
}

This class appears immutable at first glance, but is it really?

public class MutableVec2 extends Vec2 {
    // redundant fields because super fields are private and final
    private int x;
    private int y;
    public MutableVec2(int x, int y) {
        super(x, y);
        this.x = x;
        this.y = y;
    }
    @Override public int getX() { return x; }
    @Override public int getY() { return y; }
    public void setX(int x) { this.x = x; }
    public void setY(int y) { this.y = y; }
}

We’ve suddenly introduced mutability, and not just that, we also override the two getters, so users of the original “immutable” Vec2 class cannot rely on instances of Vec2 never changing state when using them. This makes Vec2 effectively unmodifiable only, not immutable. Immutable classes should be final.
Limited polymorphism is possible when necessary, for example through a package-private constructor so immutability can be strictly controlled. This is error-prone and should be reserved for exceptional cases.

Defensive copies

Immutability itself does not require immutability on class members (in most definitions). However, in many cases where immutability is used, making sure that members also stay unmodified is desired. This can be done through “defensive copies”:

public class ImmutableClass {
    private final List<String> list;
    public ImmutableClass(List<String> list) {
        this.list = Collections.unmodifiableList(new ArrayList<>(list));
        // when using guava:
        //this.list = ImmutableList.copyOf(list);
    }
    // getter, equals, hashCode
}

Internal state

Internal mutable state is allowed in immutable classes. One use case of this is caching. In OpenJDKs java.lang.String class, which is immutable, the hashCode is cached using this method:

public class String {
    ...
    private int hash; // defaults to 0. not explicitly initialized to avoid race condition (see section below)
    @Override
    public int hashCode() {
        int h = hash;
        if (h == 0) {
            hash = h = computeHashCode();
        }
        return h;
    }
}

(This isn’t the actual implementation but it’s the same principle. Want to see this as implemented in an actual Java runtime library? Here it is.)
To users of this class, the internal state is never visible – they do not see a change in behavior. Because of this, the class still counts as immutable.

Concurrency

Java concurrency is a fickle beast, and immutability is an important tool to avoid the shared mutable state that makes concurrency so difficult. However, immutability has to be implemented carefully to work reliably in concurrent applications.
On top of the basic “state never changes” requirement we must introduce two more requirements: (from Java Concurrency In Practice. Read it, it’s a great book.)

  • All fields must be final.
  • The this reference must never escape the constructor. This is called “proper construction”.

To explore why these requirements are necessary, we will take an example straight from the Java Language Specification:

final class MaybeImmutable {
    private int x;
    public MaybeImmutable(int x) {
        this.x = x;
    }
    public int getX() { return x; }
    // the two threads
    static MaybeImmutable sharedField; // this field isn't volatile
    static void thread1() {
        sharedField = new MaybeImmutable(5);
    }
    static void thread2() {
        MaybeImmutable tmp = sharedField;
        if (tmp != null) {
            System.out.println(tmp.getX()); // could be 5... but could also be 0!
        }
    }
}

At first glance the MaybeImmutable class appears immutable – it does not provide any API to modify its state. The problem with this class is that the field x is not final. Let’s take a look at what both threads do:

// thread1
tmp = allocate MaybeImmutable
tmp.x = 5
write sharedField = tmp
// thread2
read tmp = sharedField
check tmp != null
print tmp.x

In this case, the constructor call is “just” setting the field. If you’re already familiar with the java memory model, you will notice that there is no happens-before ordering between those two threads! This makes the following execution order legal:

// thread1
tmp = allocate MaybeImmutable
write sharedField = tmp
tmp.x = 5
// thread2
// same as above

In that case, thread2 could legally print out “0”.
(Reordering is only one reason why this could happen – caches are another, for example.)
This fact makes our MaybeImmutable class not immutable: From thread2, the state of the same exact instance of MaybeImmutable can change over time. How can we fix this? By making the field x final. Looking at the JLS, this guarantees that when we can see the MyImmutable instance in the sharedField, we can also see the correctly initialized x field. thread2 will always print out 5, as we’d expect.
This visibility guarantee is only given if our class does not leak its this reference before construction is complete, which brings us to our second requirement (“proper construction”):

final class NotImmutable {
    public static NotImmutable instance;
    private final int x;
    public NotImmutable(int x) {
        this.x = x;
        instance = this;
    }
    public int getX() { return x; }
    static void thread1() {
        new NotImmutable(5);
    }
    static void thread2() {
        NotImmutable tmp = instance;
        if (tmp != null) {
            System.out.println(tmp.getX()); // could still be 0!
        }
    }
}

Even though the x field is final in this case, thread2 can see an instance of NotImmutable before its constructor completes, meaning it can still see 0 as the value of x.
Interestingly, setting the x field to volatile in doesn’t help in either example (NotImmutable or MaybeImmutable). The guarantees of final fields are actually different than those of volatile fields here. This is covered in Aleksey Shipilëv’s article “Close Encounters of The Java Memory Model Kind.” This article, together with “Java Memory Model Pragmatics,” is a great resource to learn more about the Java memory model.

Builders

Immutable classes, especially those with many fields, can be difficult to construct – the Java language lacks named parameters. This can be made somewhat easier through the builder pattern, which the following example demonstrates (though only with a single field):

public class ImmutableClass {
    private final String url;
    private ImmutableClass(String url) {
        this.url = url;
    }
    public String getX() {
        return url;
    }
    public static class Builder {
        private String url;
        public Builder url(String url) {
            this.url = url;
            return this;
        }
        public ImmutableClass build() {
            return new ImmutableClass(url);
        }
    }
}
ImmutableClass myInstance = new ImmutableClass.Builder().url("https://yawk.at/").build();

This pattern provides an easy way of constructing very large immutable classes – builders are infinitely more readable than constructor calls such as new ImmutableClass("GET", 6, true, true, false, 5, null). Most IDEs also include utilities to quickly generate such builders.

Lombok

The wonderful project lombok provides a very easy way of writing immutable classes in a few lines of code via the @Value annotation:

@Value
public class ImmutableClass {
    private final String url;
}

This makes the fields private final (if they aren’t already), generates getters and a constructor, and makes the class final. As an added bonus it also generates an equals and hashCode for you. On top of that, you can use the @Builder annotation to also let it generate a builder for you, and the experimental @Wither annotation to generate copy mutators.
To see what kind of code Lombok can generate you can either view the examples on their website or use my javap pastebin to see decompiled lombok-generated code (make sure to select “Procyon” on the bottom-right).

Serialization

Immutable classes do not follow the traditional Java bean layout: They do not have a zero-argument constructor and they do not have setters. This makes serialization using reflection difficult. There are solutions, though – an immutable class constructor can be annotated with serialization-framework-specific annotations to give the framework information on which parameter corresponds to which object property.
We will look at the jackson-databind annotations first:

public final class Vec2 {
    private final int x;
    private final int y;
    @JsonCreator
    public Vec2(@JsonProperty("x") int x, @JsonProperty("y") int y) {
        this.x = x;
        this.y = y;
    }
    public int getX() { return x; }
    public int getY() { return y; }
}

A more framework-agnostic solution exists using the java.beans.ConstructorProperties annotation (not present in android) which is supported by Jackson since version 2.7.0:

public final class Vec2 {
    private final int x;
    private final int y;
    @ConstructorProperties({ "x", "y" })
    public Vec2(int x, int y) {
        this.x = x;
        this.y = y;
    }
    public int getX() { return x; }
    public int getY() { return y; }
}

This annotation is also generated on Lombok constructors by default making serializing Lombok-generated immutable classes hassle-free with jackson 2.7+.
Gson takes another approach to deserialization of objects without a zero-argument constructor: sun.misc.Unsafe provides a method to allocate an instance of a class without calling its constructor. Gson then proceeds to reflectively set fields of the immutable object. However, Unsafe is unsupported API and the OpenJDK project is attempting to phase out the class and may remove it entirely in a future JDK version, so this solution should be avoided.
Immutability in Java is not as simple in Java as it might appear at first but with the right tools and patterns, it can be very beneficial to your code base, making your application easier to read, debug and extend and making parallelism much simpler. Be careful of accidentally adding mutability and make use of tools like Lombok to reduce boilerplate associated with immutable classes in Java.

Interesting Links – 26 Sep 2016

  • Oh, shit, git! is a site that shows a number of common things git users say, as well as how to fix the situations that cause them. Note that the title indicates that the readers are expected to be of age. Some examples:
    • Oh shit, I did something terribly wrong, please tell me git has a magic time machine!?!
    • Oh shit, I committed and immediately realized I need to make one small change!
    • Oh shit, I need to change the message on my last commit!
    • Oh shit, I accidentally committed something to master that should have been on a brand new branch!
    • Oh shit, I accidentally committed to the wrong branch!
    • Oh shit, I tried to run a diff but nothing happened?!
    • F*ck this noise, I give up.
      It’s worth noting that your editor has used at least a variant of all of these statements, most of them even in recent memory.
  • User whaley, in a moment when he wasn’t accompanied by roosters, cows, and goats, pointed out “Suffering-oriented programming“, saying that it convinced him that “make it work, make it pretty, make it fast” is the right approach, as compared to “make it work, make it fast, make it pretty.” Good read.
  • From Reddit: “Recaf is an open-source framework for authoring extensions (dialects) as libraries for Java. You can redefine every major syntactic element of the language, either add new ones or create your own flavor of Java that matches your needs. It can be used to give syntactic support to libraries, to generate and instrument code. Last but not least you can experiment with the design and implementation of Java extensions in plain Java.” Seems like another tool to compete (sort of) with Lombok and Autovalue.