Serializing Tasks

Background

You have written a nice distributed Java application that allows you to submit tasks to a remote server for execution. It relies on a method execute that may look something like this:[1]

  public <T> T execute(Callable<T> task) throws ExecutionException { ... }

This method works by serializing the task to send it to a remote server. The task is then executed on the remote site and the value it produces is serialized and sent back to the caller site. The whole scheme can be implemented in terms of ObjectOutputStream and ObjectInputStream (to send and receive objects in serialized form) or can use a middleware like Java’s Remote Method Invocation (RMI).

For this to work, tasks (and the results they produce) need to be serializable. In the simplest scenario, regular, full-fledged classes are used to implement tasks:

class MyTask implements Callable<Integer>, Serializable {
  public Integer call() {
    return 42;
  }
}

class Main {
  void runTask() throws ExecutionException {
    RemoteExecutor exec = new RemoteExecutor();
    exec.execute(new MyTask());  // works fine
  }
}

In this note, we are looking at alternative scenarios, especially the use of Java 8’s lambdas to represent tasks.

Nested classes

Tasks implemented as static, serializable member classes pose no problem:

class Main {

  static class MyTask implements Callable<Integer>, Serializable {
    public Integer call() {
      return 42;
    }
  }

  void runTask() throws ExecutionException {
    RemoteExecutor exec = new RemoteExecutor();
    exec.execute(new MyTask());  // works fine
  }
}

A possible mistake would be to forget to implement the Serializable interface. This would result in a runtime error:[2] ObjectOutputStream.writeObject fails with a NotSerializableException when trying to serialize an instance of MyTask. Another mistake would be to make the class non-static:

class Main {

  class MyTask implements Callable<Integer>, Serializable {
    public Integer call() {
      return 42;
    }
  }

  void runTask() throws ExecutionException {
    RemoteExecutor exec = new RemoteExecutor();
    exec.execute(new MyTask());  // fails at runtime
  }
}

This also fails at runtime, but with a slightly different error. The instance of MyTask is of type Serializable but contains a (synthetic) field of type Main (referred to as Main.this when used in code) that points back to the enclosing Main object, which is not serializable. Serialization fails at the level of ObjectOutputStream.defaultWriteFields when trying to serialize this field.
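The two failure modes can be told apart by the class name that NotSerializableException reports. The following is a minimal sketch, not the code used for these notes: EnclosingDemo, PlainTask, InnerTask and failingClass are hypothetical names, serialization goes to an in-memory buffer, and the inner task deliberately reads a field of its enclosing object so that the back reference is certainly present.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.concurrent.Callable;

// Hypothetical stand-in for Main; deliberately not Serializable.
class EnclosingDemo {

  int base = 41;

  // Mistake 1: the task class does not implement Serializable.
  static class PlainTask implements Callable<Integer> {
    public Integer call() {
      return 42;
    }
  }

  // Mistake 2: a serializable inner class that uses its enclosing instance.
  class InnerTask implements Callable<Integer>, Serializable {
    public Integer call() {
      return base + 1;  // reads EnclosingDemo.this.base
    }
  }

  InnerTask newInnerTask() {
    return new InnerTask();
  }

  // Returns null if o can be serialized, otherwise the class name
  // reported by NotSerializableException.
  static String failingClass(Object o) {
    try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
      out.writeObject(o);
      return null;
    } catch (NotSerializableException e) {
      return e.getMessage();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    // Reports the task class itself.
    System.out.println(failingClass(new PlainTask()));
    // Reports the enclosing class, reached through the inner task.
    System.out.println(failingClass(new EnclosingDemo().newInnerTask()));
  }
}
```

In the first case the reported class is the task; in the second it is the enclosing class, even though the task itself is declared Serializable.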

The same thing happens when trying to use an anonymous class, which is a form of non-static nested class:

class MyTask implements Callable<Integer>, Serializable {
  public Integer call() {
    return 42;
  }
}

public class Main {

  void runTask() throws ExecutionException {
    RemoteExecutor exec = new RemoteExecutor();
    exec.execute(new MyTask() {
      @Override
      public Integer call() {
        return super.call() + 1;
      }
    });
  }
}

This fails at runtime with the same error as when a non-static member class is used. Note that this would work if method runTask were static.
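The difference between the two contexts can be checked with a small sketch. All names here (AnonymousContextDemo, BaseTask, failingClass) are hypothetical, and the instance-method variant reads a field of the enclosing object on purpose, so that it certainly holds a reference to it.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.concurrent.Callable;

// Hypothetical stand-in for Main; deliberately not Serializable.
class AnonymousContextDemo {

  int offset = 1;

  static class BaseTask implements Callable<Integer>, Serializable {
    public Integer call() {
      return 42;
    }
  }

  // Anonymous class created in an instance method: tied to `this`.
  Callable<Integer> fromInstanceMethod() {
    return new BaseTask() {
      @Override
      public Integer call() {
        return super.call() + offset;  // reads AnonymousContextDemo.this.offset
      }
    };
  }

  // Same shape in a static method: there is no enclosing instance.
  static Callable<Integer> fromStaticMethod() {
    return new BaseTask() {
      @Override
      public Integer call() {
        return super.call() + 1;
      }
    };
  }

  // Returns null if o can be serialized, otherwise the class name
  // reported by NotSerializableException.
  static String failingClass(Object o) {
    try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
      out.writeObject(o);
      return null;
    } catch (NotSerializableException e) {
      return e.getMessage();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    // Fails: names the enclosing class.
    System.out.println(failingClass(new AnonymousContextDemo().fromInstanceMethod()));
    // Succeeds: prints null.
    System.out.println(failingClass(fromStaticMethod()));
  }
}
```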

Lambdas

Java 8 introduced a form of function literals (lambdas). As a later add-on, Java’s lambdas are not defined as cleanly as they are in other languages like ML or Scala. In particular, a lambda expression has no type of its own. Instead, it takes its type from the context in which it appears (its target type), which must be a functional interface, that is, an interface with a single abstract method (like Callable). Still, Java’s lambdas can be very convenient and are often used to implement values of type Callable or Runnable:

  Runnable printTask = () -> System.out.println(42);
  Callable<Integer> valueTask = () -> 42;

Therefore, it is natural to want to use lambdas to create tasks submitted to execute, but this raises the question of their serialization. The call:

  exec.execute(() -> 42);  // fails at runtime

fails at runtime because the object created for the lambda implements Callable but not Serializable.
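This is easy to confirm in isolation. The sketch below uses hypothetical names (PlainLambdaDemo, plainLambda, failingClass) and an in-memory buffer instead of a remote executor.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.concurrent.Callable;

class PlainLambdaDemo {

  // The target type is Callable<Integer> only.
  static Callable<Integer> plainLambda() {
    return () -> 42;
  }

  static boolean isMarkedSerializable() {
    return plainLambda() instanceof Serializable;
  }

  // Returns null if o can be serialized, otherwise the class name
  // reported by NotSerializableException.
  static String failingClass(Object o) {
    try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
      out.writeObject(o);
      return null;
    } catch (NotSerializableException e) {
      return e.getMessage();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println(isMarkedSerializable());                // false
    System.out.println(failingClass(plainLambda()) != null);   // true
  }
}
```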

Targeting a user-defined type

One possibility is to define a functional interface[3] that represents serializable tasks and to use it as the target type of the lambdas. This can be achieved by changing the signature of method execute:

  @FunctionalInterface
  public interface SerializableCallable<T> extends Serializable, Callable<T> {}
  
  public <T> T execute(SerializableCallable<T> task) throws ExecutionException { /* exact same code as before */ }

The call exec.execute(() -> 42) now works because the lambda is converted into an object with type SerializableCallable instead of simply Callable and serialization succeeds. Note that the pre-lambda equivalent:

  // fails at runtime
  exec.execute(new SerializableCallable<Integer>() {
    public Integer call() {
      return 42;
    }
  });

would fail at runtime when called from an instance method such as runTask. This is because this code uses an anonymous class, which includes an implicit reference to the (non-serializable) enclosing Main instance. It is interesting that the lambda does not include such a reference.
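As a quick check that such a lambda survives a full round trip, here is a minimal sketch. TargetTypeDemo, roundTrip and runCopy are hypothetical names; the nested SerializableCallable mirrors the interface introduced above, and the bytes stay in memory rather than going to a server.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.concurrent.Callable;

// Hypothetical stand-in for Main; deliberately not Serializable.
class TargetTypeDemo {

  interface SerializableCallable<T> extends Serializable, Callable<T> {}

  // Created in an instance method, but the body never touches `this`.
  SerializableCallable<Integer> lambdaTask() {
    return () -> 42;
  }

  // Serializes value to a byte array and reads a copy back.
  @SuppressWarnings("unchecked")
  static <T> T roundTrip(T value) {
    try {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
        out.writeObject(value);
      }
      try (ObjectInputStream in =
          new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
        return (T) in.readObject();
      }
    } catch (IOException | ClassNotFoundException e) {
      throw new IllegalStateException(e);
    }
  }

  // Runs the deserialized copy, not the original task.
  static Integer runCopy(Callable<Integer> task) {
    try {
      return roundTrip(task).call();
    } catch (Exception e) {
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println(runCopy(new TargetTypeDemo().lambdaTask()));  // 42
  }
}
```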

But what if the lambda captures data in a closure? Is it still serializable? It depends on what is being captured:

class Main {

  String aField = "a field";

  static String aStaticField = "a static field";

  String aMethod() {
    return "a method";
  }

  static String aStaticMethod() {
    return "a static method";
  }

  void runTask() throws ExecutionException {
    RemoteExecutor exec = new RemoteExecutor();
    String aLocalVariable = "a local variable";

    // the following work fine:
    exec.execute(() -> aLocalVariable);
    exec.execute(() -> aStaticField);
    exec.execute(() -> aStaticMethod());
    
    // the following don't
    exec.execute(() -> aField);
    exec.execute(() -> aMethod());
  }
}

Not too surprisingly, as soon as the lambda closes over Main.this, it stops being serializable.
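The same experiment can be reproduced without a remote executor. In this sketch, CaptureDemo, its methods and failingClass are hypothetical names, and the nested SerializableCallable mirrors the interface defined earlier.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.concurrent.Callable;

// Hypothetical stand-in for Main; deliberately not Serializable.
class CaptureDemo {

  interface SerializableCallable<T> extends Serializable, Callable<T> {}

  String aField = "a field";
  static String aStaticField = "a static field";

  // Captures a String, which is serializable.
  SerializableCallable<String> capturesLocal() {
    String aLocalVariable = "a local variable";
    return () -> aLocalVariable;
  }

  // Captures nothing: the static field is read when the task runs.
  SerializableCallable<String> readsStatic() {
    return () -> aStaticField;
  }

  // Captures `this` in order to reach aField.
  SerializableCallable<String> readsField() {
    return () -> aField;
  }

  // Returns null if o can be serialized, otherwise the class name
  // reported by NotSerializableException.
  static String failingClass(Object o) {
    try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
      out.writeObject(o);
      return null;
    } catch (NotSerializableException e) {
      return e.getMessage();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    CaptureDemo demo = new CaptureDemo();
    System.out.println(failingClass(demo.capturesLocal()));  // null
    System.out.println(failingClass(demo.readsStatic()));    // null
    // Names the enclosing class, captured as `this`.
    System.out.println(failingClass(demo.readsField()));
  }
}
```

One caveat for real remote execution: the value of a static field is not part of the serialized lambda. The deserialized task reads whatever value the field has in the JVM that runs it.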

Avoiding SerializableCallable

A drawback of the previous approach is that the signature of method execute uses a user-defined interface SerializableCallable. In particular, the method cannot be called with instances of other serializable callables like MyTask. This problem can easily be resolved by overloading the execute method:

  public <T> T execute(Callable<T> task) throws ExecutionException {
    // same code as before
  }

  public <T> T execute(SerializableCallable<T> task) throws ExecutionException {
    return execute((Callable<T>) task);
  }

With these methods available, the following calls both succeed:

  exec.execute(new MyTask());  // uses the first execute method
  exec.execute(() -> 42);      // uses the second execute method

This works because the lambda is given the most specific type SerializableCallable and not the broader type Callable.
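Which overload is chosen can be observed directly. This sketch is only an illustration: OverloadDemo is a hypothetical class, and its execute methods return a label naming the selected overload instead of running the task.

```java
import java.io.Serializable;
import java.util.concurrent.Callable;

class OverloadDemo {

  interface SerializableCallable<T> extends Serializable, Callable<T> {}

  static class MyTask implements Callable<Integer>, Serializable {
    public Integer call() {
      return 42;
    }
  }

  // The two overloads only report which one was selected.
  static <T> String execute(Callable<T> task) {
    return "Callable";
  }

  static <T> String execute(SerializableCallable<T> task) {
    return "SerializableCallable";
  }

  // MyTask is a Callable but not a SerializableCallable.
  static String callWithTask() {
    return execute(new MyTask());
  }

  // Both overloads apply to the lambda; the more specific one wins.
  static String callWithLambda() {
    return execute(() -> 42);
  }

  public static void main(String[] args) {
    System.out.println(callWithTask());    // Callable
    System.out.println(callWithLambda());  // SerializableCallable
  }
}
```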

Basically, the role of the second execute method is to force the lambda into the SerializableCallable type before branching into the first method. This can also be achieved by typecasting, thus sidestepping the second method entirely:

  exec.execute((SerializableCallable<Integer>) () -> 42);

This works even if there is only one execute method, with a Callable signature.

Can SerializableCallable be avoided altogether? Since Java 8, typecasts can use intersection types.[4] The previous call can be written without referring to SerializableCallable:

  exec.execute((Serializable & Callable<Integer>) () -> 42);

The lambda is given a synthetic type—similar to class MyTask—that is both a subtype of Serializable and Callable<Integer>. A (Serializable & Callable) typecast also works but not (Serializable & Callable<?>).

Unfortunately, this typecast must be on the caller site; it cannot be moved inside method execute. Without the caller site typecast, the lambda would be given a type that extends Callable but not Serializable and the typecast inside execute would fail.
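The following sketch illustrates why the position of the cast matters. CastPlacementDemo and its methods are hypothetical names, and castInsideSucceeds stands in for a cast that execute might attempt on its argument.

```java
import java.io.Serializable;
import java.util.concurrent.Callable;

class CastPlacementDemo {

  // Stand-in for a cast placed inside execute: by the time it runs, the
  // lambda object already exists with the interfaces it was given.
  static <T> boolean castInsideSucceeds(Callable<T> task) {
    try {
      Serializable checked = (Serializable) task;
      return checked != null;
    } catch (ClassCastException e) {
      return false;
    }
  }

  // Target type is Callable only, so the later cast fails.
  static boolean withoutCallSiteCast() {
    return castInsideSucceeds(() -> 42);
  }

  // The call-site cast makes the lambda object Serializable from the start.
  static boolean withCallSiteCast() {
    return castInsideSucceeds((Serializable & Callable<Integer>) () -> 42);
  }

  public static void main(String[] args) {
    System.out.println(withoutCallSiteCast());  // false
    System.out.println(withCallSiteCast());     // true
  }
}
```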

This approach also works with method (or constructor) references:

public class Main {

  static int compute() {
    return 42;
  }

  void runTask() throws ExecutionException {
    RemoteExecutor exec = new RemoteExecutor();
    exec.execute((Serializable & Callable<Integer>) Main::compute);
  }
}

Is there a way to avoid the typecast entirely? The intersection type cannot be used directly as the signature of method execute:

  // not allowed in Java
  public <T> T execute(Serializable & Callable<T> task) throws ExecutionException { ... }

In order to write a method that requires its argument to be both Callable and Serializable without introducing a SerializableCallable type, one can rely on generics:

  public <R, T extends Callable<R> & Serializable> R execute(T task) throws ExecutionException { ... }

Here, type T represents the serializable task and type R the value returned by the task. Therefore T must be a subtype of both Serializable and Callable<R>, as specified by the extends clause.

Unfortunately, this does not help dispense with the typecast: execute(() -> 42) is rejected at compile time. A lambda needs a functional interface as its target type, and a type variable such as T is not one; for the call to work, the Java compiler would have to infer a synthetic type, which it refuses to do. So, in the end, the calling site typecast has to stay.
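What the generic signature does buy is a compile-time check that the argument is serializable. Below is a minimal sketch under stated assumptions: BoundedExecuteDemo is a hypothetical class, and its execute method simulates the remote side with an in-memory round trip and wraps failures in an unchecked exception to keep the example short.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.concurrent.Callable;

class BoundedExecuteDemo {

  static class MyTask implements Callable<Integer>, Serializable {
    public Integer call() {
      return 42;
    }
  }

  // Accepts only arguments that are both Callable<R> and Serializable.
  @SuppressWarnings("unchecked")
  static <R, T extends Callable<R> & Serializable> R execute(T task) {
    try {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
        out.writeObject(task);
      }
      try (ObjectInputStream in =
          new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
        Callable<R> copy = (Callable<R>) in.readObject();
        return copy.call();  // run the deserialized copy
      }
    } catch (Exception e) {
      throw new IllegalStateException(e);
    }
  }

  // A named serializable class needs no cast.
  static Integer viaNamedClass() {
    return execute(new MyTask());
  }

  // A lambda still needs the call-site cast.
  static Integer viaCastLambda() {
    return execute((Serializable & Callable<Integer>) () -> 42);
  }

  public static void main(String[] args) {
    System.out.println(viaNamedClass());  // 42
    System.out.println(viaCastLambda());  // 42
  }
}
```

With this signature, passing a Callable that is not Serializable is a compile-time error rather than a runtime failure.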

Test code

To help write these notes, I used Java code to simulate a remote execution server:

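import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;

@SuppressWarnings("unchecked")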
public class RemoteExecutor {

  public <T> T execute(Callable<T> task) throws ExecutionException {
    try {
      File tmp = File.createTempFile("serialized-data-", ".tmp");
      tmp.deleteOnExit();

      // sending the task
      ObjectOutputStream out = new ObjectOutputStream(new FileOutputStream(tmp));
      out.writeObject(task);
      out.close();

      // receiving the task
      ObjectInputStream in = new ObjectInputStream(new FileInputStream(tmp));
      task = (Callable<T>) in.readObject();
      in.close();

      // executing the task
      T output = task.call();

      // sending the output
      out = new ObjectOutputStream(new FileOutputStream(tmp));
      out.writeObject(output);
      out.close();

      // receiving the output
      in = new ObjectInputStream(new FileInputStream(tmp));
      output = (T) in.readObject();
      in.close();

      return output;

    } catch (Exception e) {
      throw new ExecutionException(e);
    }
  }
}

This code serializes the task and stores it into a file to simulate sending it to a remote server. It then deserializes the task from the file to simulate the server receiving it. The task is executed and its return value is processed in the same way (serialized into a file, then deserialized).


  1. This is a form of synchronous execution. A more complex alternative would allow asynchronous execution and produce a Future<T> instead of a T.

  2. It could also be caught at compile-time by rewriting execute to require a serializable argument. See below.

  3. The @FunctionalInterface annotation is optional.

  4. Intersection type typecasts were added in Java 8 specifically for this purpose.