Type Erasure
Table of contents
- What is type erasure?
- How are type parameters replaced?
- What are bridge methods and why are these needed?
- How does type erasure affect return types?
- How does type erasure affect return types?
What is type erasure?
Generics came late to the language (in Java 1.5) and were implemented in a way that keeps them compatible with code that does not use generics (raw types).
Type erasure is the process, performed at compile time, that makes sure our code works as expected and stays compatible with raw types, while still giving us compile-time safety. Type erasure performs three things:
- Replace all type parameters in generic types with their upper bounds. The generic type T is replaced by the upper bound if one is defined, or by Object otherwise.
- Insert type casts if necessary, to preserve type safety.
- Generate bridge methods to preserve polymorphism in extended generic types.
Each aspect of type erasure is explored in more detail in the following sections.
How are type parameters replaced?
Consider the following example.
package demo;
public class Box<T> {
private T item;
public void put( final T item ) {
this.item = item;
}
}
The Box class is a generic type that has one type parameter, T. It has one method that takes a parameter of type T and stores it in the property named item. T is not an actual type, such as String or Object, that the Java runtime can work with, so the compiler has to replace it with Object.
The generated bytecode shows how the class will look after it is compiled (View > Show Bytecode)
// class version 58.65535 (-65478)
// access flags 0x21
// signature <T:Ljava/lang/Object;>Ljava/lang/Object;
// declaration: demo/Box<T>
public class demo/Box {
// compiled from: Box.java
// access flags 0x2
// signature TT;
// declaration: item extends T
private Ljava/lang/Object; item
// access flags 0x1
public <init>()V
L0
LINENUMBER 3 L0
ALOAD 0
INVOKESPECIAL java/lang/Object.<init> ()V
RETURN
L1
LOCALVARIABLE this Ldemo/Box; L0 L1 0
// signature Ldemo/Box<TT;>;
// declaration: this extends demo.Box<T>
MAXSTACK = 1
MAXLOCALS = 1
// access flags 0x1
// signature (TT;)V
// declaration: void put(T)
public put(Ljava/lang/Object;)V
L0
LINENUMBER 8 L0
ALOAD 0
ALOAD 1
PUTFIELD demo/Box.item : Ljava/lang/Object;
L1
LINENUMBER 9 L1
RETURN
L2
LOCALVARIABLE this Ldemo/Box; L0 L2 0
// signature Ldemo/Box<TT;>;
// declaration: this extends demo.Box<T>
LOCALVARIABLE item Ljava/lang/Object; L0 L2 1
// signature TT;
// declaration: item extends T
MAXSTACK = 2
MAXLOCALS = 2
}
Let’s break the bytecode, shown above, into smaller pieces.
Our Box class has one property, which is of type Object.
private T item;
The bytecode for the above property.
// access flags 0x2
// signature TT;
// declaration: item extends T
private Ljava/lang/Object; item
The Java compiler produces only one Box class, which can contain any type. We can create a Box of type String, of type Double, or of any other type, yet we have only one class defined. To accommodate all possible types, Java has to store our value in a property of type Object.
Note: This may be a hint to you why generics do not support primitives. The compiler replaces the type parameter T with Object, and a primitive value such as an int is not an Object.
The generic put() method takes one parameter of type T.
public void put( final T item ) {
this.item = item;
}
The bytecode for the above method.
// access flags 0x1
// signature (TT;)V
// declaration: void put(T)
public put(Ljava/lang/Object;)V
L0
LINENUMBER 8 L0
ALOAD 0
ALOAD 1
PUTFIELD demo/Box.item : Ljava/lang/Object;
L1
LINENUMBER 9 L1
RETURN
L2
LOCALVARIABLE this Ldemo/Box; L0 L2 0
// signature Ldemo/Box<TT;>;
// declaration: this extends demo.Box<T>
LOCALVARIABLE item Ljava/lang/Object; L0 L2 1
// signature TT;
// declaration: item extends T
MAXSTACK = 2
MAXLOCALS = 2
The method parameter is changed from T to Object, as shown in the above bytecode. When called, this method sets the property to the given value.
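The same erasure can also be observed at runtime through reflection, without opening the bytecode viewer. The following is a minimal sketch: the ErasedBoxDemo class and its helper methods are hypothetical names used only for illustration, and Box is nested (with no package) only to keep everything in one file.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Method;

class ErasedBoxDemo {

    // Same shape as the Box class above, nested to keep the sketch in one file.
    static class Box<T> {
        private T item;

        public void put( final T item ) {
            this.item = item;
        }
    }

    // Type of the 'item' property as recorded in the compiled class.
    static Class<?> fieldType() {
        try {
            final Field field = Box.class.getDeclaredField( "item" );
            return field.getType();
        } catch ( final NoSuchFieldException e ) {
            throw new IllegalStateException( e );
        }
    }

    // Type of the only parameter of put() as recorded in the compiled class.
    static Class<?> putParameterType() {
        for ( final Method method : Box.class.getDeclaredMethods() ) {
            if ( method.getName().equals( "put" ) ) {
                return method.getParameterTypes()[0];
            }
        }
        throw new IllegalStateException( "put() not found" );
    }

    public static void main( final String[] args ) {
        System.out.println( fieldType() );        // class java.lang.Object
        System.out.println( putParameterType() ); // class java.lang.Object
    }
}
```

Both lookups report Object, which matches the Ljava/lang/Object; descriptors in the bytecode.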
Are all type parameters changed from the type parameter, T, to Object?
No.
Type erasure replaces the type parameter with its upper bound, depending on its definition. Consider the Box example that we discussed before.
package demo;
public class Box<T> {
private T item;
public void put( final T item ) {
this.item = item;
}
}
The type parameter T can be any object as we are not providing an upper bound. We can write the above example as shown next.
package demo;
public class Box<T extends Object> { /* ... */ }
Both generic definitions are equivalent. We can restrict the type parameter to be of a type we need. Consider the following interface.
package demo;
public interface BoxItem {
}
We can use the above interface (or any type that can be extended) to limit the types that we can use together with the Box generic type, thus specifying an upper bound, as shown next.
package demo;
public class Box<T extends BoxItem> { /* ... */ }
If we build these two classes and view the Box class’s bytecode (View > Show Bytecode), we will see that the compiler has now replaced our type parameter T with the BoxItem type, as shown next.
// class version 58.65535 (-65478)
// access flags 0x21
// signature <T::Ldemo/BoxItem;>Ljava/lang/Object;
// declaration: demo/Box<T extends demo.BoxItem>
public class demo/Box {
// compiled from: Box.java
// access flags 0x2
// signature TT;
// declaration: item extends T
private Ldemo/BoxItem; item
// access flags 0x1
public <init>()V
L0
LINENUMBER 3 L0
ALOAD 0
INVOKESPECIAL java/lang/Object.<init> ()V
RETURN
L1
LOCALVARIABLE this Ldemo/Box; L0 L1 0
// signature Ldemo/Box<TT;>;
// declaration: this extends demo.Box<T>
MAXSTACK = 1
MAXLOCALS = 1
// access flags 0x1
// signature (TT;)V
// declaration: void put(T)
public put(Ldemo/BoxItem;)V
L0
LINENUMBER 8 L0
ALOAD 0
ALOAD 1
PUTFIELD demo/Box.item : Ldemo/BoxItem;
L1
LINENUMBER 9 L1
RETURN
L2
LOCALVARIABLE this Ldemo/Box; L0 L2 0
// signature Ldemo/Box<TT;>;
// declaration: this extends demo.Box<T>
LOCALVARIABLE item Ldemo/BoxItem; L0 L2 1
// signature TT;
// declaration: item extends T
MAXSTACK = 2
MAXLOCALS = 2
}
The following two parts were affected by our change.
The property is now of type BoxItem.
// access flags 0x2
// signature TT;
// declaration: item extends T
private Ldemo/BoxItem; item
The put() method’s parameter is now of type BoxItem.
// access flags 0x1
// signature (TT;)V
// declaration: void put(T)
public put(Ldemo/BoxItem;)V
L0
LINENUMBER 8 L0
ALOAD 0
ALOAD 1
PUTFIELD demo/Box.item : Ldemo/BoxItem;
L1
LINENUMBER 9 L1
RETURN
L2
LOCALVARIABLE this Ldemo/Box; L0 L2 0
// signature Ldemo/Box<TT;>;
// declaration: this extends demo.Box<T>
LOCALVARIABLE item Ldemo/BoxItem; L0 L2 1
// signature TT;
// declaration: item extends T
MAXSTACK = 2
MAXLOCALS = 2
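The effect of the upper bound can also be checked with reflection. This is only a sketch: BoundedErasureDemo and fieldType() are hypothetical names, and BoxItem and Box are nested (with no package) to keep the example in a single file.

```java
import java.lang.reflect.Field;

class BoundedErasureDemo {

    interface BoxItem { }

    // T is bounded, so the compiler erases it to BoxItem instead of Object.
    static class Box<T extends BoxItem> {
        private T item;

        public void put( final T item ) {
            this.item = item;
        }
    }

    // Type of the 'item' property as recorded in the compiled class.
    static Class<?> fieldType() {
        try {
            final Field field = Box.class.getDeclaredField( "item" );
            return field.getType();
        } catch ( final NoSuchFieldException e ) {
            throw new IllegalStateException( e );
        }
    }

    public static void main( final String[] args ) {
        System.out.println( fieldType().getSimpleName() ); // BoxItem
    }
}
```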
What are bridge methods and why are these needed?
Consider our generic Box class, shown below.
package demo;
public class Box<T> {
private T item;
public void put( final T item ) {
this.item = item;
}
}
When our generic class is compiled, the type parameter T is replaced by the Object type, as we saw before. Our code will look like the following.
package demo;
public class Box {
private Object item;
public void put( final Object item ) {
this.item = item;
}
}
Now consider the following StringBox class, which extends our generic type Box.
package demo;
public class StringBox extends Box<String> {
@Override
public void put( final String item ) {
System.out.printf( "Adding string: %s%n", item );
super.put( item );
}
}
After type erasure, the generic Box class will contain a method with the following signature.
public void put( final Object item )
The subtype, StringBox, does not declare such a method.
How will polymorphism work when generics are used?
If we list the methods of the StringBox class, using the javap command, we will see that the Java compiler introduces a new (bridge) method.
$ javap -p build/classes/java/main/demo/StringBox.class
Compiled from "StringBox.java"
public class demo.StringBox extends demo.Box<java.lang.String> {
public demo.StringBox();
public void put(java.lang.String);
public void put(java.lang.Object);
}
The put() method is overloaded. The new method is called a bridge method, as it was introduced by type erasure.
If we analyse the bytecode of the overloaded method (the method generated by type erasure), we will find that it simply casts the input to String and calls our method, as shown in the following bytecode fragment.
// access flags 0x1041
public synthetic bridge put(Ljava/lang/Object;)V
L0
LINENUMBER 3 L0
ALOAD 0
ALOAD 1
CHECKCAST java/lang/String
INVOKEVIRTUAL demo/StringBox.put (Ljava/lang/String;)V
RETURN
L1
LOCALVARIABLE this Ldemo/StringBox; L0 L1 0
MAXSTACK = 2
MAXLOCALS = 2
The put() (bridge) method, generated by type erasure, has the following flags.
Flag Name | Description |
---|---|
ACC_BRIDGE | A bridge method, generated by the compiler. |
ACC_SYNTHETIC | Declared synthetic; not present in the source code. |
The above table was copied from: Table 4.6-A. Method access and property flags in Chapter 4. The class File Format in The Java® Virtual Machine Specification.
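These two flags are also exposed at runtime by java.lang.reflect.Method, through isBridge() and isSynthetic(). The sketch below assumes a simplified StringBox (nested, no package, no printing); BridgeFlagsDemo and putMethod() are hypothetical names used only for illustration.

```java
import java.lang.reflect.Method;

class BridgeFlagsDemo {

    static class Box<T> {
        private T item;

        public void put( final T item ) {
            this.item = item;
        }
    }

    static class StringBox extends Box<String> {
        @Override
        public void put( final String item ) {
            super.put( item );
        }
    }

    // Looks up the put() method declared by StringBox with the given parameter type.
    static Method putMethod( final Class<?> parameterType ) {
        try {
            return StringBox.class.getDeclaredMethod( "put", parameterType );
        } catch ( final NoSuchMethodException e ) {
            throw new IllegalStateException( e );
        }
    }

    public static void main( final String[] args ) {
        for ( final Class<?> type : new Class<?>[] { String.class, Object.class } ) {
            final Method method = putMethod( type );
            System.out.printf( "put(%s): bridge=%b, synthetic=%b%n",
                    type.getSimpleName(), method.isBridge(), method.isSynthetic() );
        }
        // put(String): bridge=false, synthetic=false
        // put(Object): bridge=true, synthetic=true
    }
}
```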
The generated put() (bridge) method invokes our put() method, as indicated by the INVOKEVIRTUAL bytecode, shown next.
INVOKEVIRTUAL demo/StringBox.put (Ljava/lang/String;)V
Type erasure creates a new (bridge) method that simply calls our method while preserving polymorphism. Conceptually, the compiled class looks as shown next.
package demo;
public class StringBox extends Box<String> {
public void put( final Object item ) {
put( (String) item );
}
public void put( final String item ) {
System.out.printf( "Adding string: %s%n", item );
super.put( item );
}
}
Adding a new feature to a programming language can be a massive undertaking. Bridge methods were required in order to support both generics and polymorphism.
Which version of the method is invoked?
After type erasure, the StringBox class will have two methods.
- put(Object), a method that takes an Object
- put(String), a method that takes a String
With that said, only the put(String) method is available to us when working with the StringBox type. The bridge method put(Object) is added by type erasure during compilation and is only there to retrofit generics.
When we have a variable of type StringBox, we can only invoke the put(String) method. The Java compiler will prevent us from invoking the bridge put() method. There is no way for us to invoke the StringBox’s put(Object) method directly.
We can reach the StringBox’s put(Object) method through the supertype, Box, using polymorphism. Consider the following example.
Warning: the following example throws a ClassCastException!!
package demo;
public class App {
public static void main( final String[] args ) {
final Box box = new StringBox();
box.put( Integer.valueOf( 42 ) );
}
}
The above example makes use of raw types on purpose (please avoid raw types!!). This means that the compiler will not provide us with type checks, and we can try to put anything we want into our box. The above example fails at runtime with a ClassCastException, as an Integer is not a String.
Exception in thread "main" java.lang.ClassCastException: class java.lang.Integer cannot be cast to class java.lang.String (java.lang.Integer and java.lang.String are in module java.base of loader 'bootstrap')
at demo.StringBox.put(StringBox.java:3)
at demo.App.main(App.java:7)
The bridge method tried to cast our value to a String and failed, as the value was an Integer.
Now consider the following example, which also makes use of raw types.
package demo;
public class App {
public static void main( final String[] args ) {
final Object item = "Ball";
final Box box = new StringBox();
box.put( item );
}
}
In this example, we are calling the put() method that takes an Object (as the variable item is of type Object, despite the fact that it is pointing to a String). This bridge method casts the given reference to a String and then calls the put() method that takes a String.
When using raw types we cannot access the put() method that takes a String directly. That’s because the variable box is of type Box (and not StringBox), which only has one put() method, the one that takes an Object. This holds even when we pass a String, as shown next.
package demo;
public class App {
public static void main( final String[] args ) {
final Box box = new StringBox();
box.put( "Ball" );
}
}
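To see which methods actually run in this case, the calls can be recorded in a list. This is a sketch under a few assumptions: RawDispatchDemo, the calls list, and putThroughRawBox() are hypothetical names, the classes are nested in one file, and the printing in StringBox is replaced by logging. The bridge method itself is generated by the compiler, so it cannot log anything and does not show up in the list.

```java
import java.util.ArrayList;
import java.util.List;

class RawDispatchDemo {

    // Records the methods that run, in order.
    static final List<String> calls = new ArrayList<>();

    static class Box<T> {
        private T item;

        public void put( final T item ) {
            calls.add( "Box.put" );
            this.item = item;
        }
    }

    static class StringBox extends Box<String> {
        @Override
        public void put( final String item ) {
            calls.add( "StringBox.put(String)" );
            super.put( item );
        }
    }

    // Raw types are used on purpose, as in the example above.
    @SuppressWarnings( { "rawtypes", "unchecked" } )
    static List<String> putThroughRawBox() {
        calls.clear();
        final Box box = new StringBox();
        box.put( "Ball" ); // compiled as a call to put(Object)
        return new ArrayList<>( calls );
    }

    public static void main( final String[] args ) {
        System.out.println( putThroughRawBox() ); // [StringBox.put(String), Box.put]
    }
}
```

Even though the call site only knows about put(Object), the bridge method in StringBox forwards the call to put(String), which in turn calls the parent’s method.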
Inheritance adds complexity to generics. Consider the following example.
package demo;
public class Box<T> {
private T item;
public void put( final T item ) { /* ... */ }
public void clear() {
put( null );
}
}
A new method, named clear(), is added to the Box class, which simply invokes the put() method and passes null. Now consider the following example.
package demo;
public class App {
public static void main( final String[] args ) {
final Box<String> box = new StringBox();
box.clear();
}
}
The clear() method invokes the put() method that takes an Object. The subtype, StringBox, has its own version of that put() method, which overrides the put() method in the Box class. This version is the bridge method generated by type erasure. The bridge method calls our put() method, which in turn calls the parent’s method.
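The chain of calls described above can be traced by recording each call in a list. The sketch below is only an illustration: ClearDispatchDemo, the calls list, and clearThroughSupertype() are hypothetical names, and the classes are nested in one file. The compiler-generated bridge method sits between Box.clear and StringBox.put(String) but cannot be logged.

```java
import java.util.ArrayList;
import java.util.List;

class ClearDispatchDemo {

    // Records the methods that run, in order.
    static final List<String> calls = new ArrayList<>();

    static class Box<T> {
        private T item;

        public void put( final T item ) {
            calls.add( "Box.put" );
            this.item = item;
        }

        public void clear() {
            calls.add( "Box.clear" );
            put( null ); // compiled as a virtual call to put(Object)
        }
    }

    static class StringBox extends Box<String> {
        @Override
        public void put( final String item ) {
            calls.add( "StringBox.put(String)" );
            super.put( item );
        }
    }

    static List<String> clearThroughSupertype() {
        calls.clear();
        final Box<String> box = new StringBox();
        box.clear();
        return new ArrayList<>( calls );
    }

    public static void main( final String[] args ) {
        System.out.println( clearThroughSupertype() );
        // [Box.clear, StringBox.put(String), Box.put]
    }
}
```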
How does type erasure affect return types?
So far, we only placed items into our Box. Let’s add a new method to the Box class that returns the item in the box, as shown next.
package demo;
public class Box<T> {
private T item;
public void put( final T item ) { /* ... */ }
public T get() {
return item;
}
public void clear() { /* ... */ }
}
The get() method returns the type parameter T, which is replaced by the Object type, as shown in the following bytecode fragment.
// access flags 0x1
// signature ()TT;
// declaration: T get()
public get()Ljava/lang/Object;
L0
LINENUMBER 12 L0
ALOAD 0
GETFIELD demo/Box.item : Ljava/lang/Object;
ARETURN
L1
LOCALVARIABLE this Ldemo/Box; L0 L1 0
// signature Ldemo/Box<TT;>;
// declaration: this extends demo.Box<T>
MAXSTACK = 1
MAXLOCALS = 1
With that said, we can still retrieve the item without having to cast it ourselves, as shown in the following example.
package demo;
public class App {
public static void main( final String[] args ) {
final Box<String> box = new StringBox();
box.put( "Bicycle" );
final String item = box.get();
System.out.printf( "The box contains a: %s%n", item );
}
}
The Java compiler will introduce a type cast at the consumer side for us, as shown in the following bytecode fragment.
L2
LINENUMBER 9 L2
ALOAD 1
INVOKEVIRTUAL demo/Box.get ()Ljava/lang/Object;
CHECKCAST java/lang/String
ASTORE 2
The Java compiler verifies that the right type is passed to put(), so we can safely assume that the returned value can be cast back to the required type.
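One way to make the inserted cast visible is to bypass the compiler's checks with a raw type and watch the failure appear at the call to get(), and not at the call to put(). This is a sketch for illustration only: InsertedCastDemo and castFailsAtCallSite() are hypothetical names, and Box is nested to keep everything in one file.

```java
class InsertedCastDemo {

    static class Box<T> {
        private T item;

        public void put( final T item ) {
            this.item = item;
        }

        public T get() {
            return item;
        }
    }

    // Returns true if reading the item back as a String fails with a ClassCastException.
    @SuppressWarnings( { "rawtypes", "unchecked" } )
    static boolean castFailsAtCallSite() {
        final Box<String> strings = new Box<>();

        // The raw reference bypasses the compile-time check, so put() accepts an Integer.
        final Box raw = strings;
        raw.put( Integer.valueOf( 42 ) );

        try {
            // The compiler inserts a cast to String here, and that cast fails.
            final String item = strings.get();
            System.out.println( "Got: " + item );
            return false;
        } catch ( final ClassCastException e ) {
            return true;
        }
    }

    public static void main( final String[] args ) {
        System.out.println( castFailsAtCallSite() ); // true
    }
}
```

The put() call succeeds because, after erasure, it simply stores an Object. The exception only surfaces where the compiler added the cast on the consumer side.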
How does type erasure affect return types?
Consider the following example of the StringBox class.
package demo;
public class StringBox extends Box<String> {
@Override
public void put( final String item ) {
System.out.printf( "Adding string: %s%n", item );
super.put( item );
}
@Override
public String get() {
System.out.println( "Returning string" );
return super.get();
}
}
In the above example, the StringBox overrides the generic get() method defined in the Box class. If we list all methods defined by the StringBox class, we will see the following.
$ javap -p build/classes/java/main/demo/StringBox.class
Compiled from "StringBox.java"
public class demo.StringBox extends demo.Box<java.lang.String> {
public demo.StringBox();
public void put(java.lang.String);
public void put(java.lang.Object);
public java.lang.String get();
public java.lang.Object get();
}
The StringBox class now has two get() methods.
Before Java 1.5, when a method overrode another method, it needed to match the method signature (parameters) and the return type. If the method defined in the supertype returned an Object, the overriding method in the subtype had to return an Object. It could not return a String, for example.
Generics break this rule, as the subtype StringBox returns a String, when the overridden method in the Box class returns an Object.
As of Java 1.5, this rule was relaxed and covariant return types (JLS-8.4.5) were introduced.
“Return types may vary among methods that override each other if the return types are reference types. The notion of return-type-substitutability supports covariant returns, that is, the specialization of the return type to a subtype.”
(Reference)
Internally, Java needs to match a method with the same return type. This was made possible by simply overloading the get() method by creating a new bridge method in the StringBox, as shown in the following bytecode fragment.
// access flags 0x1041
public synthetic bridge get()Ljava/lang/Object;
L0
LINENUMBER 3 L0
ALOAD 0
INVOKEVIRTUAL demo/StringBox.get ()Ljava/lang/String;
ARETURN
L1
LOCALVARIABLE this Ldemo/StringBox; L0 L1 0
MAXSTACK = 1
MAXLOCALS = 1
Similar to other bridge methods, the method shown above simply calls the original method. This technique kept Java method linking happy while supporting generics.
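The effect of this bridge can be observed by calling get() through the supertype and recording which methods run. This is a minimal sketch: GetBridgeDemo, the calls list, and getThroughSupertype() are hypothetical names, and the classes are nested in one file.

```java
import java.util.ArrayList;
import java.util.List;

class GetBridgeDemo {

    // Records the methods that run, in order.
    static final List<String> calls = new ArrayList<>();

    static class Box<T> {
        private T item;

        public void put( final T item ) {
            this.item = item;
        }

        public T get() {
            calls.add( "Box.get" );
            return item;
        }
    }

    static class StringBox extends Box<String> {
        @Override
        public String get() {
            calls.add( "StringBox.get" );
            return super.get();
        }
    }

    static List<String> getThroughSupertype() {
        calls.clear();
        final Box<String> box = new StringBox();
        box.put( "Bicycle" );
        box.get(); // compiled as a call to get() returning Object
        return new ArrayList<>( calls );
    }

    public static void main( final String[] args ) {
        System.out.println( getThroughSupertype() ); // [StringBox.get, Box.get]
    }
}
```

The call site is compiled against the get() that returns Object, yet the String-returning get() in StringBox runs first, because the bridge method forwards to it.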
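Without bridge methods, the JVM would look for a method with the same name, parameters, and return type as the get() method in the Box class. The StringBox’s get() method, which returns a String, does not match, so a call made through a Box reference would not reach it.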
Is this only related to generics?
NO.
Consider the following two classes.
A supertype class that defines one method.
package demo;
public class Supertype {
public Object getIt() {
return "The supertype";
}
}
A subtype, extending Supertype, and overriding the getIt() method.
package demo;
public class Subtype extends Supertype {
@Override
public String getIt() {
return "It's me, the subtype";
}
}
These two classes do not use generics, and the Subtype class method makes use of covariant return types. Listing the methods in the Subtype class will show both methods.
$ javap -p build/classes/java/main/demo/Subtype.class
Compiled from "Subtype.java"
public class demo.Subtype extends demo.Supertype {
public demo.Subtype();
public java.lang.String getIt();
public java.lang.Object getIt();
}
The bridge method is a method that has the same signature, but a different return type. While this is permitted here, we cannot manually create such a method.
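The same listing can be produced with reflection, which also tells us which of the two methods is the bridge. The sketch below nests the two classes in a hypothetical CovariantBridgeDemo class (no package) and uses a hypothetical getItMethods() helper.

```java
import java.lang.reflect.Method;
import java.util.Map;
import java.util.TreeMap;

class CovariantBridgeDemo {

    static class Supertype {
        public Object getIt() {
            return "The supertype";
        }
    }

    static class Subtype extends Supertype {
        @Override
        public String getIt() {
            return "It's me, the subtype";
        }
    }

    // Maps the return type of each getIt() declared by Subtype to whether it is a bridge.
    static Map<String, Boolean> getItMethods() {
        final Map<String, Boolean> result = new TreeMap<>();
        for ( final Method method : Subtype.class.getDeclaredMethods() ) {
            if ( method.getName().equals( "getIt" ) ) {
                result.put( method.getReturnType().getSimpleName(), method.isBridge() );
            }
        }
        return result;
    }

    public static void main( final String[] args ) {
        System.out.println( getItMethods() ); // {Object=true, String=false}
    }
}
```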
Can we manually overload a method by changing its return type?
NO.
We cannot declare two or more methods within the same class that have the same name and parameters but different return types. Bridge methods, which only the compiler can generate, are an exception to this rule.