Type Erasure

Table of contents

  1. What is type erasure?
  2. How are type parameters replaced?
    1. Are all type parameters changed from type parameter, T, to Object?
  3. What are bridge methods and why are these needed?
    1. Which version of the method is invoked?
  4. How does type erasure affect return types?
  5. How does type erasure affect return types?
    1. Is this only related to generics?
    2. Can we manually overload a method by changing its return type?

What is type erasure?

Generics came late to the language (in Java 1.5) and were implemented in a way that keeps them compatible with code that does not use generics (raw types).

Type erasure is the process that happens at compile time to make sure that our code works as expected and remains compatible with raw types, while still providing us with compile-time safety. Type erasure performs three things:

  • Replace all type parameters in generic types with their upper bounds. The type parameter T is replaced by its upper bound if one is defined, or by Object otherwise.
  • Insert type casts if necessary, to preserve type safety.
  • Generate bridge methods to preserve polymorphism in extended generic types.

Each aspect of type erasure is explored in more detail in the following sections.
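
As a quick illustration of what erasure means at runtime, the following sketch checks that two lists with different type arguments share the same runtime class. The class name ErasureAtRuntime and its helper method are made up for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical class name, used only for this illustration.
final class ErasureAtRuntime {

  // Both lists are plain ArrayList objects at runtime; the type arguments are erased.
  static boolean sameRuntimeClass() {
    final List<String> strings = new ArrayList<>();
    final List<Integer> numbers = new ArrayList<>();
    return strings.getClass() == numbers.getClass();
  }

  public static void main( final String[] args ) {
    System.out.println( sameRuntimeClass() );
  }
}
```

Running it should print true, as both variables refer to ordinary ArrayList instances once the type arguments are erased.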

How are type parameters replaced?

Consider the following example.

package demo;

public class Box<T> {

  private T item;

  public void put( final T item ) {
    this.item = item;
  }
}

The Box class is a generic type that has one type parameter, T. It has one method that takes a parameter of type T and stores it in the property named item. T is not an actual type, such as String or Object, that the Java runtime can work with, so the compiler has to replace it with Object.

The generated bytecode shows how the class will look after it is compiled (View > Show Bytecode)

// class version 58.65535 (-65478)
// access flags 0x21
// signature <T:Ljava/lang/Object;>Ljava/lang/Object;
// declaration: demo/Box<T>
public class demo/Box {

  // compiled from: Box.java

  // access flags 0x2
  // signature TT;
  // declaration: item extends T
  private Ljava/lang/Object; item

  // access flags 0x1
  public <init>()V
   L0
    LINENUMBER 3 L0
    ALOAD 0
    INVOKESPECIAL java/lang/Object.<init> ()V
    RETURN
   L1
    LOCALVARIABLE this Ldemo/Box; L0 L1 0
    // signature Ldemo/Box<TT;>;
    // declaration: this extends demo.Box<T>
    MAXSTACK = 1
    MAXLOCALS = 1

  // access flags 0x1
  // signature (TT;)V
  // declaration: void put(T)
  public put(Ljava/lang/Object;)V
   L0
    LINENUMBER 8 L0
    ALOAD 0
    ALOAD 1
    PUTFIELD demo/Box.item : Ljava/lang/Object;
   L1
    LINENUMBER 9 L1
    RETURN
   L2
    LOCALVARIABLE this Ldemo/Box; L0 L2 0
    // signature Ldemo/Box<TT;>;
    // declaration: this extends demo.Box<T>
    LOCALVARIABLE item Ljava/lang/Object; L0 L2 1
    // signature TT;
    // declaration: item extends T
    MAXSTACK = 2
    MAXLOCALS = 2
}

Let’s break the bytecode, shown above, into smaller pieces.

  1. Our Box class has one property, which is of type Object

      private T item;
    

    The bytecode for the above property.

      // access flags 0x2
      // signature TT;
      // declaration: item extends T
      private Ljava/lang/Object; item
    

    The Java compiler has one Box class, which can contain any type. We can create a Box of type String or of type Double or any other type. Yet we have only one class defined. To accommodate all possible types, Java has to store our value in a property of type Object.

    ⓘ Note: This may hint at why generics do not support primitives. The compiler replaces the type parameter T with Object, and a primitive is not an Object.
  2. The generic put() method takes one parameter of type T

      public void put( final T item ) {
        this.item = item;
      }
    

    The bytecode for the above method.

      // access flags 0x1
      // signature (TT;)V
      // declaration: void put(T)
      public put(Ljava/lang/Object;)V
       L0
        LINENUMBER 8 L0
        ALOAD 0
        ALOAD 1
        PUTFIELD demo/Box.item : Ljava/lang/Object;
       L1
        LINENUMBER 9 L1
        RETURN
       L2
        LOCALVARIABLE this Ldemo/Box; L0 L2 0
        // signature Ldemo/Box<TT;>;
        // declaration: this extends demo.Box<T>
        LOCALVARIABLE item Ljava/lang/Object; L0 L2 1
        // signature TT;
        // declaration: item extends T
        MAXSTACK = 2
        MAXLOCALS = 2
    

    The method parameter is changed from T to Object as shown in the above bytecode. When called, this method sets the property to the given value.
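
The same observation can be made without reading bytecode. The following sketch uses reflection to compare the erased types with the generic information that is still kept as metadata (the signature comments in the bytecode above). The class ErasedBoxInspector and its helper methods are hypothetical names, and Box is nested only to keep the example self-contained.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Method;

// Hypothetical inspector class; Box is nested so the sketch is self-contained.
final class ErasedBoxInspector {

  static class Box<T> {
    private T item;

    public void put( final T item ) {
      this.item = item;
    }
  }

  private static Field itemField() {
    try {
      return Box.class.getDeclaredField( "item" );
    } catch ( final NoSuchFieldException e ) {
      throw new IllegalStateException( e );
    }
  }

  // The erased type, as the JVM sees the property.
  static Class<?> erasedFieldType() {
    return itemField().getType();
  }

  // The declared type, recovered from the signature metadata.
  static String genericFieldType() {
    return itemField().getGenericType().getTypeName();
  }

  // put(Object) can be looked up because that is the erased signature.
  static Class<?> erasedPutParameter() {
    try {
      final Method put = Box.class.getDeclaredMethod( "put", Object.class );
      return put.getParameterTypes()[0];
    } catch ( final NoSuchMethodException e ) {
      throw new IllegalStateException( e );
    }
  }

  public static void main( final String[] args ) {
    System.out.println( erasedFieldType().getName() );    // java.lang.Object
    System.out.println( genericFieldType() );             // T
    System.out.println( erasedPutParameter().getName() ); // java.lang.Object
  }
}
```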

Are all type parameters changed from type parameter, T, to Object?

No.

Type erasure replaces the type parameter with its upper bound, depending on its definition. Consider the Box example that we discussed before.

package demo;

public class Box<T> {

  private T item;

  public void put( final T item ) {
    this.item = item;
  }
}

The type parameter T can be any object as we are not providing an upper bound. We can write the above example as shown next.

package demo;

public class Box<T extends Object> { /* ... */ }

Both generic definitions are equivalent. We can restrict the type parameter to be of a type we need. Consider the following interface.

package demo;

public interface BoxItem {
}

We can use the above interface (or any type that can be extended) to limit the types that we can use together with the Box generic type, thus specifying an upper bound, as shown next.

package demo;

public class Box<T extends BoxItem> { /* ... */ }

If we build these two classes and view the bytecode of the Box class (View > Show Bytecode), we will see that the compiler has now replaced our type parameter T with the BoxItem type, as shown next.

// class version 58.65535 (-65478)
// access flags 0x21
// signature <T::Ldemo/BoxItem;>Ljava/lang/Object;
// declaration: demo/Box<T extends demo.BoxItem>
public class demo/Box {

  // compiled from: Box.java

  // access flags 0x2
  // signature TT;
  // declaration: item extends T
  private Ldemo/BoxItem; item

  // access flags 0x1
  public <init>()V
   L0
    LINENUMBER 3 L0
    ALOAD 0
    INVOKESPECIAL java/lang/Object.<init> ()V
    RETURN
   L1
    LOCALVARIABLE this Ldemo/Box; L0 L1 0
    // signature Ldemo/Box<TT;>;
    // declaration: this extends demo.Box<T>
    MAXSTACK = 1
    MAXLOCALS = 1

  // access flags 0x1
  // signature (TT;)V
  // declaration: void put(T)
  public put(Ldemo/BoxItem;)V
   L0
    LINENUMBER 8 L0
    ALOAD 0
    ALOAD 1
    PUTFIELD demo/Box.item : Ldemo/BoxItem;
   L1
    LINENUMBER 9 L1
    RETURN
   L2
    LOCALVARIABLE this Ldemo/Box; L0 L2 0
    // signature Ldemo/Box<TT;>;
    // declaration: this extends demo.Box<T>
    LOCALVARIABLE item Ldemo/BoxItem; L0 L2 1
    // signature TT;
    // declaration: item extends T
    MAXSTACK = 2
    MAXLOCALS = 2
}

The following two parts were affected by our change.

  1. The property is now of type BoxItem

      // access flags 0x2
      // signature TT;
      // declaration: item extends T
      private Ldemo/BoxItem; item
    
  2. The put() method’s parameter is now of type BoxItem

      // access flags 0x1
      // signature (TT;)V
      // declaration: void put(T)
      public put(Ldemo/BoxItem;)V
       L0
        LINENUMBER 8 L0
        ALOAD 0
        ALOAD 1
        PUTFIELD demo/Box.item : Ldemo/BoxItem;
       L1
        LINENUMBER 9 L1
        RETURN
       L2
        LOCALVARIABLE this Ldemo/Box; L0 L2 0
        // signature Ldemo/Box<TT;>;
        // declaration: this extends demo.Box<T>
        LOCALVARIABLE item Ldemo/BoxItem; L0 L2 1
        // signature TT;
        // declaration: item extends T
        MAXSTACK = 2
        MAXLOCALS = 2
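
A minimal sketch of the same check done with reflection follows. It assumes a nested copy of the bounded Box and BoxItem types, and the class name BoundedErasureInspector is made up for this example.

```java
import java.lang.reflect.Method;

// Hypothetical inspector class; Box and BoxItem are nested copies of the types above.
final class BoundedErasureInspector {

  interface BoxItem { }

  static class Box<T extends BoxItem> {
    private T item;

    public void put( final T item ) {
      this.item = item;
    }
  }

  // Returns the erased parameter type of the put() method declared by Box.
  static Class<?> erasedPutParameter() {
    for ( final Method method : Box.class.getDeclaredMethods() ) {
      if ( method.getName().equals( "put" ) ) {
        return method.getParameterTypes()[0];
      }
    }
    throw new IllegalStateException( "put() not found" );
  }

  public static void main( final String[] args ) {
    System.out.println( erasedPutParameter().getSimpleName() ); // BoxItem
  }
}
```

With the upper bound in place, the parameter type reported at runtime is BoxItem rather than Object.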
    

What are bridge methods and why are these needed?

Consider our generic Box class, shown below.

package demo;

public class Box<T> {

  private T item;

  public void put( final T item ) {
    this.item = item;
  }
}

When our generic class is compiled, the type parameter T is replaced by the Object type, as we saw before. Our code will look like the following.

ⓘ Note: The following is an example of what the code looks like after type erasure.
package demo;

public class Box {

  private Object item;

  public void put( final Object item ) {
    this.item = item;
  }
}

Now consider the following StringBox class, that extends our generic type Box.

package demo;

public class StringBox extends Box<String> {
  @Override
  public void put( final String item ) {
    System.out.printf( "Adding string: %s%n", item );
    super.put( item );
  }
}

After type erasure, the generic Box class will contain a method with the following signature.

  public void put( final Object item )

The subtype, StringBox, does not have such a method.

How will polymorphism work when generics are used?

If we list the methods of the StringBox class, using the javap command, we will see that the Java compiler introduces a new (bridge) method.

$ javap -p build/classes/java/main/demo/StringBox.class
Compiled from "StringBox.java"
public class demo.StringBox extends demo.Box<java.lang.String> {
  public demo.StringBox();
  public void put(java.lang.String);
  public void put(java.lang.Object);
}

The put() method is overloaded. The new method is called a bridge method, as it was introduced by type erasure.

If we analyse the bytecode of the overloaded method (the method generated by type erasure), we will find that it simply casts the input to String and calls our method, as shown in the following bytecode fragment.

  // access flags 0x1041
  public synthetic bridge put(Ljava/lang/Object;)V
   L0
    LINENUMBER 3 L0
    ALOAD 0
    ALOAD 1
    CHECKCAST java/lang/String
    INVOKEVIRTUAL demo/StringBox.put (Ljava/lang/String;)V
    RETURN
   L1
    LOCALVARIABLE this Ldemo/StringBox; L0 L1 0
    MAXSTACK = 2
    MAXLOCALS = 2

The put() (bridge) method, generated by the type erasure, has the following flags.

Flag Name       Description
ACC_BRIDGE      A bridge method, generated by the compiler.
ACC_SYNTHETIC   Declared synthetic; not present in the source code.

The above table was copied from: Table 4.6-A. Method access and property flags in Chapter 4. The class File Format in The Java® Virtual Machine Specification.

The generated put() (bridge) method is invoking our put() as indicated by the INVOKEVIRTUAL bytecode, shown next.

    INVOKEVIRTUAL demo/StringBox.put (Ljava/lang/String;)V

Type erasure is creating a new (bridge) method, that simply calls our method while ensuring polymorphism, as shown next.

ⓘ Note: The following is an example of what the code looks like after type erasure.
package demo;

public class StringBox extends Box<String> {

  public void put( final Object item ) {
    put( (String) item );
  }

  public void put( final String item ) {
    System.out.printf( "Adding string: %s%n", item );
    super.put( item );
  }
}

Adding a new feature to a programming language can be a massive undertaking. Bridge methods were required in order to support both generics and polymorphism.
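
The bridge and synthetic flags are also visible through reflection. The following sketch lists the put() methods declared by a nested copy of StringBox together with their flags. The class BridgeMethodInspector and the helper describePutMethods() are hypothetical names used only here.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical inspector class; Box and StringBox are nested copies of the types above.
final class BridgeMethodInspector {

  static class Box<T> {
    private T item;

    public void put( final T item ) {
      this.item = item;
    }
  }

  static class StringBox extends Box<String> {
    @Override
    public void put( final String item ) {
      super.put( item );
    }
  }

  // Describes each put() method declared by StringBox. The result is sorted
  // because getDeclaredMethods() makes no promise about ordering.
  static List<String> describePutMethods() {
    final List<String> descriptions = new ArrayList<>();
    for ( final Method method : StringBox.class.getDeclaredMethods() ) {
      if ( method.getName().equals( "put" ) ) {
        descriptions.add( String.format( "put(%s) bridge=%b synthetic=%b",
          method.getParameterTypes()[0].getSimpleName(),
          method.isBridge(),
          method.isSynthetic() ) );
      }
    }
    Collections.sort( descriptions );
    return descriptions;
  }

  public static void main( final String[] args ) {
    describePutMethods().forEach( System.out::println );
  }
}
```

The method taking Object should be reported as both bridge and synthetic, while the method we wrote, which takes a String, is neither.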

Which version of the method is invoked?

After type erasure, the StringBox class will have two methods.

  • put(Object) method that takes an Object
  • put(String) method that takes a String

With that said, only the put(String) method is available for us to work with when working with the StringBox type. The bridge method put(Object) is added by type erasure during compilation and is only there to retrofit generics.

When we have a variable of type StringBox, we can only invoke the put(String) method. The Java compiler will prevent us from invoking the bridge put() method, so there is no way for us to invoke the StringBox’s put(Object) method directly.

We can, however, reach the StringBox’s put(Object) method through the supertype, Box, using polymorphism. Consider the following example.

⚠ The following example will compile but will throw ClassCastException!!
package demo;

public class App {

  public static void main( final String[] args ) {
    final Box box = new StringBox();
    box.put( Integer.valueOf( 42 ) );
  }
}

The above example makes use of raw types on purpose. Please avoid raw types!! This means that the compiler will not provide us with type checks, and we can try to put anything we want into our box. The above example fails at runtime with a ClassCastException, as an Integer is not a String.

Exception in thread "main" java.lang.ClassCastException: class java.lang.Integer cannot be cast to class java.lang.String (java.lang.Integer and java.lang.String are in module java.base of loader 'bootstrap')
	at demo.StringBox.put(StringBox.java:3)
	at demo.App.main(App.java:7)

The Java runtime environment tried to type cast our value to a String and failed as this was an Integer.

Now consider the following example, that too makes use of raw types.

package demo;

public class App {

  public static void main( final String[] args ) {
    final Object item = "Ball";
    final Box box = new StringBox();
    box.put( item );
  }
}

In this example, we are calling the put() method that takes an Object (as the variable item is of type Object, despite the fact that it is pointing to a String). The bridge method casts the given reference to a String and then calls the put() method that takes a String, as shown next.

[Figure: Raw Types]

When using raw types we cannot access the put() method that takes a String directly. That’s because the variable box is of type Box (and not StringBox) which only has one put() method that takes an Object, even when we pass a String, as shown next.

package demo;

public class App {

  public static void main( final String[] args ) {
    final Box box = new StringBox();
    box.put( "Ball" );
  }
}

Inheritance adds complexity to generics. Consider the following example.

package demo;

public class Box<T> {

  private T item;

  public void put( final T item ) { /* ... */ }

  public void clear() {
    put( null );
  }
}

A new method, named clear(), is added to the Box class which simply invokes the put() method and passes null. Now consider the following example.

package demo;

public class App {

  public static void main( final String[] args ) {
    final Box<String> box = new StringBox();
    box.clear();
  }
}

The clear() method invokes the put() method that takes an Object. The subtype, StringBox, has its own version of this method, which overrides the put() method in the Box class. This is the bridge method generated by type erasure. The bridge method calls our put() method, which in turn calls the parent’s method, as shown next.

[Figure: Raw Types]
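
The call chain described in this section can be sketched in code. In the following example, a nested StringBox records each call to put(String) instead of printing it, so we can observe that clear() ends up in our method and that a raw put() of an Integer fails in the bridge method's cast. The class BridgeDispatchDemo, the calls list and the helper methods are hypothetical names introduced for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; Box and StringBox are nested variants of the types above.
final class BridgeDispatchDemo {

  static class Box<T> {
    private T item;

    public void put( final T item ) {
      this.item = item;
    }

    public void clear() {
      put( null );
    }
  }

  // Records every call instead of printing, so the effect is easy to check.
  static class StringBox extends Box<String> {
    final List<String> calls = new ArrayList<>();

    @Override
    public void put( final String item ) {
      calls.add( "put(String): " + item );
      super.put( item );
    }
  }

  // Box.clear() -> bridge put(Object) -> StringBox.put(String)
  static List<String> callsMadeByClear() {
    final StringBox box = new StringBox();
    box.clear();
    return box.calls;
  }

  // A raw reference lets an Integer reach the bridge method, whose cast then fails.
  @SuppressWarnings( { "rawtypes", "unchecked" } )
  static boolean rawPutOfIntegerFails() {
    final Box box = new StringBox();
    try {
      box.put( Integer.valueOf( 42 ) );
      return false;
    } catch ( final ClassCastException e ) {
      return true;
    }
  }

  public static void main( final String[] args ) {
    System.out.println( callsMadeByClear() );
    System.out.println( rawPutOfIntegerFails() );
  }
}
```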

How does type erasure affect return types?

So far, we only placed items into our Box. Let’s add a new method to the Box class that returns the item in the box, as shown next.

package demo;

public class Box<T> {

  private T item;

  public void put( final T item ) { /* ... */ }

  public T get() {
    return item;
  }

  public void clear() { /* ... */ }
}

The get() method returns the type parameter T, which is replaced by the Object type, as shown in the following bytecode fragment.

  // access flags 0x1
  // signature ()TT;
  // declaration: T get()
  public get()Ljava/lang/Object;
   L0
    LINENUMBER 12 L0
    ALOAD 0
    GETFIELD demo/Box.item : Ljava/lang/Object;
    ARETURN
   L1
    LOCALVARIABLE this Ldemo/Box; L0 L1 0
    // signature Ldemo/Box<TT;>;
    // declaration: this extends demo.Box<T>
    MAXSTACK = 1
    MAXLOCALS = 1

With that said, we can still retrieve the item without having to cast it ourselves, as shown in the following example.

package demo;

public class App {

  public static void main( final String[] args ) {
    final Box<String> box = new StringBox();
    box.put( "Bicycle" );

    final String item = box.get();
    System.out.printf( "The box contains a: %s%n", item );
  }
}

The Java compiler will introduce a type cast at the consumer side for us, as shown in the following bytecode fragment.

   L2
    LINENUMBER 9 L2
    ALOAD 1
    INVOKEVIRTUAL demo/Box.get ()Ljava/lang/Object;
    CHECKCAST java/lang/String
    ASTORE 2

The Java compiler verifies that the right type is passed to put(), so we can assume that the returned value can be safely cast back to the required type.
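
This assumption only holds while the compiler can check every put(). The following sketch, which uses a raw alias on purpose, shows that the cast really happens at the consumer side: the put() succeeds and the failure only appears when the value is read back as a String. The class CallSiteCastDemo and its helper method are hypothetical names.

```java
// Hypothetical demo class; Box is a nested copy of the type above.
final class CallSiteCastDemo {

  static class Box<T> {
    private T item;

    public void put( final T item ) {
      this.item = item;
    }

    public T get() {
      return item;
    }
  }

  @SuppressWarnings( { "rawtypes", "unchecked" } )
  static boolean castFailsAtCallSite() {
    final Box<String> box = new Box<>();
    final Box raw = box;               // raw alias of the same object
    raw.put( Integer.valueOf( 42 ) );  // unchecked: no cast is executed here

    try {
      final String item = box.get();   // the compiler-inserted cast to String runs here
      return false;
    } catch ( final ClassCastException e ) {
      return true;
    }
  }

  public static void main( final String[] args ) {
    System.out.println( castFailsAtCallSite() );
  }
}
```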

How does type erasure affect return types?

Consider the following example of the StringBox class.

package demo;

public class StringBox extends Box<String> {
  @Override
  public void put( final String item ) {
    System.out.printf( "Adding string: %s%n", item );
    super.put( item );
  }

  @Override
  public String get() {
    System.out.println( "Returning string" );
    return super.get();
  }
}

In the above example, the StringBox overrides the generic get() method defined in the Box class. If we list all methods defined by the StringBox class, we will see the following.

$ javap -p build/classes/java/main/demo/StringBox.class
Compiled from "StringBox.java"
public class demo.StringBox extends demo.Box<java.lang.String> {
  public demo.StringBox();
  public void put(java.lang.String);
  public void put(java.lang.Object);
  public java.lang.String get();
  public java.lang.Object get();
}

The StringBox class now has two get() methods.

Before Java 1.5, when a method overrides another method, it needed to match the method signature (parameters) and the return type. If the method defined in the supertype returns an Object, the overriding method in the subtype must return an Object. It cannot return a String, for example.

Generics break this rule, as the subtype StringBox returns a String, when the overridden method in the Box class returns an Object.

As of Java 1.5, this rule was relaxed and covariant return types (JLS-8.4.5) were introduced.

Return types may vary among methods that override each other if the return types are reference types. The notion of return-type-substitutability supports covariant returns, that is, the specialization of the return type to a subtype.
(Reference: JLS-8.4.5)

Internally, the JVM identifies a method by its name, parameter types and return type, so a call compiled against the get() method of Box needs a method in StringBox that returns Object. This was made possible by overloading the get() method with a new bridge method in StringBox, as shown in the following bytecode fragment.

  // access flags 0x1041
  public synthetic bridge get()Ljava/lang/Object;
   L0
    LINENUMBER 3 L0
    ALOAD 0
    INVOKEVIRTUAL demo/StringBox.get ()Ljava/lang/String;
    ARETURN
   L1
    LOCALVARIABLE this Ldemo/StringBox; L0 L1 0
    MAXSTACK = 1
    MAXLOCALS = 1

Similar to other bridge methods, the method shown above simply calls the original method. This technique kept Java method linking happy while supporting generics.

Without the bridge method, the JVM would look in StringBox for a get() method with the same parameters and the same return type (Object) and find none. The call would then fall back to the inherited get() method of Box, silently bypassing our override.
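
The effect can be observed with a small sketch. Here a nested StringBox marks the value it returns, so we can tell that a call made through a Box<String> reference reaches the override, and reflection lists both get() methods. The class CovariantBridgeDemo, the "string:" marker and the helper methods are assumptions made for this example.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical demo class; Box and StringBox are nested variants of the types above.
final class CovariantBridgeDemo {

  static class Box<T> {
    private T item;

    public void put( final T item ) {
      this.item = item;
    }

    public T get() {
      return item;
    }
  }

  static class StringBox extends Box<String> {
    @Override
    public String get() {
      // The prefix makes it visible that this override was executed.
      return "string:" + super.get();
    }
  }

  // The call site targets get() returning Object; the bridge forwards it to our override.
  static String getThroughSupertype() {
    final Box<String> box = new StringBox();
    box.put( "ball" );
    return box.get();
  }

  // Lists the get() methods declared by StringBox, sorted for a stable order.
  static List<String> describeGetMethods() {
    final List<String> descriptions = new ArrayList<>();
    for ( final Method method : StringBox.class.getDeclaredMethods() ) {
      if ( method.getName().equals( "get" ) ) {
        descriptions.add( method.getReturnType().getSimpleName()
          + " get() bridge=" + method.isBridge() );
      }
    }
    Collections.sort( descriptions );
    return descriptions;
  }

  public static void main( final String[] args ) {
    System.out.println( getThroughSupertype() );
    describeGetMethods().forEach( System.out::println );
  }
}
```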

Is this only related to generics?

No.

Consider the following two classes.

  1. A supertype class that defines one method

    package demo;
    
    public class Supertype {
    
      public Object getIt() {
        return "The supertype";
      }
    }
    
  2. A subtype, extending Supertype, and overriding the getIt() method

    package demo;
    
    public class Subtype extends Supertype {
    
      @Override
      public String getIt() {
        return "It's me, the subtype";
      }
    }
    

These two classes do not use generics and the Subtype class method makes use of covariant return types. Listing the methods in the Subtype class will show both methods.

$ javap -p build/classes/java/main/demo/Subtype.class
Compiled from "Subtype.java"
public class demo.Subtype extends demo.Supertype {
  public demo.Subtype();
  public java.lang.String getIt();
  public java.lang.Object getIt();
}

The bridge method is a method that has the same signature, but a different return type. While this is permitted here, we cannot manually create such a method.
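
The javap listing can be cross-checked with reflection. The following sketch counts the getIt() methods declared by a nested copy of Subtype and how many of them are flagged as bridge methods. The class CovariantReturnInspector and its helper methods are hypothetical names.

```java
import java.lang.reflect.Method;

// Hypothetical inspector class; Supertype and Subtype are nested copies of the classes above.
final class CovariantReturnInspector {

  static class Supertype {
    public Object getIt() {
      return "The supertype";
    }
  }

  static class Subtype extends Supertype {
    @Override
    public String getIt() {
      return "It's me, the subtype";
    }
  }

  // Counts the getIt() methods declared by Subtype, optionally only the bridge ones.
  static int countGetIt( final boolean bridgesOnly ) {
    int count = 0;
    for ( final Method method : Subtype.class.getDeclaredMethods() ) {
      if ( method.getName().equals( "getIt" ) && ( !bridgesOnly || method.isBridge() ) ) {
        count++;
      }
    }
    return count;
  }

  // A call through the supertype still reaches the overriding method.
  static Object callThroughSupertype() {
    final Supertype instance = new Subtype();
    return instance.getIt();
  }

  public static void main( final String[] args ) {
    System.out.println( countGetIt( false ) );    // both getIt() methods
    System.out.println( countGetIt( true ) );     // only the bridge method
    System.out.println( callThroughSupertype() );
  }
}
```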

Can we manually overload a method by changing its return type?

No.

We cannot write two or more methods within the same class that have the same name and parameters but different return types. Bridge methods, which are generated by the compiler, are the exception to this rule.