One line at a time

Another Java technology blog from a developer far away from home

Collections.sort, the Java8u20 modification

Share Button

At Skiddoo, we were very excited about migrating from Java 7 to Java 8. We have the chance to be able to move forward very quickly and easily with the latest technologies. We’re using the latest play framework version, and the use of lambda feature, combined to the stream from Java 8, felt like the natural next step to us. I was in charge of this migration. And, you might think it’s just about updating the JDK, recompiling, updating the JRE on the server and push, well it’s a little bit more than that.

First, it all started locally on my own computer. Before modifying anything on our Jenkins and staging environment, we wanted to make sure all the projects were, at least, compiling on a local machine with the latest Java 8 version. Because it might not, for multiple reason, let’s just take one as an example.

The interface java.sql.Connection has been modified in Java 7. If one of your classes was implementing it in Java 6, you would get a compilation error while migrating. This is just an example, the JDK evolves, and classes or methods get deprecated, interfaces might change, which forced us to change our code.

This was the easy part of our migration. Most of the time, those are easy fix. Luckily everything was compiling fine in our case. Then, it was the time to run our unit-tests. This is where the first problems appeared. One of our dependencies was not compatible, they had a couple of bugs running under a Java 8 runtime. Again, the fix was easy in that case, we just had to upgrade to a more recent version of our dependency.

Everything looked fine in the end, we upgraded our Jenkins and staging environment, and we pushed to production after more tests. And the real problem arrived…

It just hit us, the kind of exception you don’t want to see in a multi-threaded environment:

ConcurrentModificationException

The first thing that every single engineer is telling to himself is “Why ? How is it possible ?”. And obviously, after the effect of surprise, we looked into the direction of the last thing we changed. It was not a minor change, we just switched from Java 7 to Java 8.

We immediately rolled back our servers to Java 7 and switched back to the most recent Java 7 build.

Here is the beast:

Caused by: java.util.ConcurrentModificationException: null
  at java.util.ArrayList.sort(ArrayList.java:1456) ~[na:1.8.0_25]
  at java.util.Collections.sort(Collections.java:175) ~[na:1.8.0_25]

It was clear, a collection was getting sorted while it was getting modified somewhere else. We found the problem very quickly: one of our immutable object had a getter returning a collection. Instead of returning a copy of the collection (to maintain the immutability of the object), it was returning the collection itself.

It was great, we found a problem we had for ages that we were not even aware of. But that was not enough for me: why didn’t we run into this problem earlier? We’ve been processing hundred of thousands of requests every day, for many years, and we’ve never, ever had this problem before. It couldn’t be just a bad timing for a race condition, it had to be something else.

Of course, it was. After a little bit of Googling, we found that they changed the Collections.sort implementation starting Java8u20:

Area: core-libs/java.util.collections
Synopsis: Collection.sort defers now defers to List.sort

Previously Collection.sort copied the elements of the list to sort into an array, sorted that array, then updated list, in place, with those elements in the array, and the default method List.sort deferred to Collection.sort. This was a non-optimal arrangement.

From 8u20 release onwards Collection.sort defers to List.sort. This means, for example, existing code that calls Collection.sort with an instance of ArrayList will now use the optimal sort implemented by ArrayList.

See 8032636.

Before Java8u20, the Collections.sort method was getting a copy of the collection in an array (with the toArray() method) and was sorting this array with Arrays.sort. The list itself wasn’t modified yet. Once the array was sorted, the algorithm iterating over the list and resetting each element:

public static <T extends Comparable<? super T>> void sort(List<T> list) {
  Object[] a = list.toArray();
  Arrays.sort(a);
  ListIterator<T> i = list.listIterator();
  for (int j=0; j<a.length; j++) {
    i.next();
    i.set((T)a[j]);
  }
}

This implementation would never throw a ConcurrentModificationException if you don’t add or remove an element while it’s getting sorted. It doesn’t matter if the collection is getting sorted at the same time by multiple threads. Because the set(E e) method from the Iterator (line 7 above) doesn’t modify the expectedModCount of the iterator, neither the modCount of the collection. That means the structure itself of the collection (number of elements for example) doesn’t change.

Those internal fields are the one used to determine if a modification has occurred, somewhere else, while iterating. It’s just a comparaison between the expectedModCount of the iterator and the real modCount of the collection. If they are different, that means the collection has been modified somewhere else, and then, the iterator would raise a ConcurrentModificationException.

When a remove or add method is called on the iterator, it would check if modCount and expectedModCount are the same, apply the operation and synchronize its expectedModCount with the new modCount from the collection.

This ConcurrentModificationException situation couldn’t happen to us before, because the collection was never modified, only sorted multiple times by different threads.

So why now in Java 8? Because obviously it’s not using this algorithm anymore. The Collections.sort method now relies immediately on the sort(Comparator<E> c) method of the java.util.List object and more precisely in our case, on the ArrayList implementation of sort:

public void sort(Comparator<? super E> c) {
  final int expectedModCount = modCount;
  Arrays.sort((E[]) elementData, 0, size, c);
  if (modCount != expectedModCount) {
    throw new ConcurrentModificationException();
  }
  modCount++;
}

Contrary to the old Collections.sort, this implementation modifies the modCount of the collection (line 7 above) once the list has been sorted, even if the structure itself didn’t really change (still the same number of elements). This single line, makes a big difference. While one thread was expecting a certain value of modCount, another one had modified modCount. They did this because instead of working on a copy of the list, they are working directly on the internal array of ArrayList. The member elementData, in the code above, is the internal Object[] array of the collection.

Today, everything is running smoothly under Java 8, and we are (very, very) happy to be able to use lambdas and streams. Our “vertical problem” is getting way better thanks to these amazing features.

Share Button

4 Comments

  1. The topic seems to stop at a point why the issue happens ? But why such a change internally in the core.

    Thanks,
    Vimal

    • ekaiser

      July 5, 2016 at 11:51 pm

      Hi Vimal,
      I think they just wanted to clean the code and only use one “sorting” algorithm implementation.
      Cheers,

  2. How was the issue resolved?What should we do to resolve this concurrent modification exception in java 8?

    • ekaiser

      September 30, 2016 at 1:13 am

      Hi Vanaja, just make sure the collection you are trying to sort is not used by another thread at the same time.

Leave a Reply to ekaiser Cancel reply

Your email address will not be published.

*

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>

© 2017 One line at a time

Theme by Anders NorenUp ↑