Recently, The Java Programmer published “Difference between ArrayList and LinkedList in Java“, asserting different cases of when to prefer one List
implementation over another. It’s a link full of conventional wisdom, and while it has some good information, it’s also wrong.
Here’s the channel’s factoid on LinkedList
, as of 2017/Apr/25 (prior to it having been changed to point to the page you’re reading):
LinkedList is almost always slower than ArrayList/ArrayDeque. It might be best to use ArrayList/ArrayDeque to start. But if you have an actual performance problem, you can compare the two for your specific case. Otherwise, just use ArrayDeque. (It too has O(1) performance for insert-at-beginning, and unlike LinkedList is actually fast).
Implementation
The “Difference” page says that ArrayList
uses an internal array to represent stored objects, while a LinkedList
uses a doubly-linked list internally. In this, it’s correct.
Searching
The next point of comparison is entitled “Searching.” Here’s what it says about ArrayList
:
An elements can be retrieved or accessed in O(1) time. ArrayList is faster because it uses array data structure and hence index based system is used to access elements.
And about LinkedList
:
An element can be accessed in O(N) time. LinkedList is slower because it uses doubly linked list and for accessing an element it iterates from start or end (whichever is closer).
First off, in the interest of honesty, the last bit of information – that it iterates from either forward or backward, depending on which is closer – was a surprise to me (although it’s perfectly logical.) It’s also accurate.
However, this isn’t searching – it’s element access. Access and searching are different things; it’s get(42)
vs. list.contains(matchingObject)
. This is important; contains()
will have O(N) time for either List
implementation, whereas ArrayList
will have O(1) for get()
, and LinkedList
will have O(N) for get()
.
The thing is: LinkedList
is not only O(N) for get()
but the nature of the internal implementation makes it slower than one might think, because of cache locality. In an ArrayList
, the element references are stored sequentially in memory; a LinkedList
potentially splatters the references all over the heap. (This is not entirely likely, but it’s possible.) Thus, a LinkedList
not only has O(N) performance for indexed access (where the ArrayList
has O(1)), but the O(N) is worse than one might expect.
For accessing elements, there’s no situation apart from accessing the first or last element where a LinkedList
competes with an ArrayList
‘s speed.
Note that programming hates absolutes. It’s fully possible to create situations in which that last claim is not entirely true… but they’re rare and unlikely.
Insertion
About ArrayList
:
Normally the insertion time is O(1). But in case array is full then array is resized by copying elements to new array, which makes insertion time O(N).
And about LinkedList
:
Insertion of an element in LinkedList is faster, it takes O(1) time.
Incomplete data is provided here. For one thing, the author conflates mutation with “insertion.” There are two different kinds of additive list mutations (where data is added to a List
): insertion (meaning that elements are prepended to other elements) and addition (where an element “follows” the last element in the list prior to mutation.)
Where the elements go as part of the additive mutation is really important, and it’s also important to note that the claim of LinkedList
‘s O(1) time is very much a single type of insertion.
For an ArrayList
, a list append (i.e., calling add()
, which adds the provided element at the end of the list) has O(N) performance, but typically has O(1) performance. The O(N) complexity is because the internal array size might be exceeded by the addition of the element, in which case a new internal array is allocated, and the array references are copied over to the new internal array, and then the new element is appended. Thus, in the worst case, it is O(N), but this is misleading because:
- It’s fairly rare (the list resizes with a formula of
currentSize*1.5
(the actual line isint newCapacity = oldCapacity + (oldCapacity >> 1);
, if you’re interested.) - Java’s blits are very, very, very fast; Java uses the blit mechanism constantly, and let’s be real, modern CPUs are really good at this anyway.
So is it actually O(N), then? … given what conventional Big O notation means, yes, it is; it’s just that O(N) is a lot less expensive than one might think in this case. And typically, the add()
will be O(1) in actuality.
But what about LinkedList
?
Insertion at the beginning or the end of the list is, in fact, O(1). This is even the common case (add(Object)
calls linkLast(Object)
by default.) However, if positional notation is used at all (i.e., add(int, Object)
you degrade to O(N), because the LinkedList has to find the position at which the new element has to be inserted – and finding that position is O(N).
So what’s the conclusion? ArrayList
does in fact have O(N) performance for adds, but it’s close enough to O(1) in real world conditions that we can colloquially speak of it being likely as fast as LinkedList
‘s similar case… and if we add any kind of positional insertion, ArrayList
‘s degradation under the worst case is far better than LinkedList
‘s degradation.
As usual, we can create situations under which this is not the case (for example, if we always insert after the first position in a LinkedList
, which will “search” one element and only that element) but these are fairly rare.
Deletion
Let’s see about ArrayList
:
Deletion of an element is slower as all the elements have to be shifted to fill the space created by deleted element.
And LinkedList
:
Deletion of an element is faster in LinkedList because no elements shifting are required.
Well, um, no.
Here, the author is confusing complexity – big O notation – with speed, for ArrayList
. (He or she is wrong about LinkedList
in any event.)
For an ArrayList
, deleting an element by index is indeed O(N) because the JVM does have to blit the elements after the removed element (it has to shift them all in the internal array.)
However, for a LinkedList
, it also has to find the elements around the element to remove – and if you recall, accessing a specific element for a LinkedList
is an O(N) complexity itself. Even though the removal is simple (it’s simply relinking the nodes before and after the removed element), actually finding those nodes is O(N), thus deletion is O(N) for LinkedList – and slower than ArrayList
, to boot.
Memory and Initial Capacity
The author says that an ArrayList
has a default internal capacity of 10. This is incorrect for Java 8. See this code.
For the rest of it, the author is correct; a LinkedList
is “empty” (there are no linked nodes internally); an ArrayList
has capacity
references in the internal list, while a LinkedList
has a chain of objects, which will have higher memory consumption, but that’s not likely to matter unless your lists are huge.
The Comparison’s Conclusion
- As search or get operation is faster in ArrayList so it is used in the scenario where more search or get operations are required.
- As insertion or deletion of an element is faster in LinkedList so it is used in the scenario where more insertion and deletion operations are required.
The thing that the conclusion is missing is that “insertion and deletion” operations in LinkedList
typically involve searches and gets – which are terrible for LinkedList
– and therefore for even the operations that LinkedList
is potentially better, ArrayList
will typically perform better.
The factoid stands. If you need a List
, use ArrayList
first, then ArrayDeque
if your access patterns demand… use LinkedList
only after exhausting other possibilities.
One thought on “LinkedList vs. ArrayList”