No More Type Erasure: Scala's Ultimate JVM Hack

Recently, while maintaining one of my old Java libraries, I noticed something odd about how the Stream API constructs arrays of generic types. Here's a simple example to illustrate:

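import java.util.*;
import java.util.stream.*;

// Minimal Pet class so the example compiles on its own.
class Pet {
    final String name;
    final int weight;
    Pet(String name, int weight) { this.name = name; this.weight = weight; }
    @Override public String toString() { return "Pet(" + name + ", " + weight + ")"; }
}

public class PetDemo {
    public static void main(String[] args) {
        List<String> names = Arrays.asList("Louis", "Buddy", "Milo", "Luna", "Bella");
        List<Integer> weights = Arrays.asList(10, 8, 12, 9);

        // The lists differ in length, so stop at the shorter one to avoid an out-of-bounds access.
        Pet[] pets = IntStream.range(0, Math.min(names.size(), weights.size()))
                .mapToObj(i -> new Pet(names.get(i), weights.get(i)))
                .toArray(Pet[]::new);

        System.out.println(Arrays.toString(pets));
    }
}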

The call .toArray(Pet[]::new) caught my attention. Why do we need to specify this? It's already clear from the declaration Pet[] pets that we want an array of Pet, so .toArray(Pet[]::new) feels like boilerplate. Digging into the implementation of .toArray, I realized that the method <A> A[] toArray(IntFunction<A[]> generator) in Java Streams can't determine what A actually is at runtime because of Java's type erasure, a limitation of Java's generics. The type parameter A exists only at compile time, so the method has no way to allocate a Pet[] by itself. Without the generator, the no-argument toArray() can only hand back an Object[], not a Pet[].
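To make the effect of erasure concrete, here is a small sketch. The class and method names (ErasureDemo and friends) are made up for illustration; only the JDK calls are real.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

public class ErasureDemo {
    // Both lists share one runtime class: the element type is erased.
    public static boolean sameRuntimeClass() {
        List<String> strings = new ArrayList<>();
        List<Integer> numbers = new ArrayList<>();
        return strings.getClass() == numbers.getClass();
    }

    // Without a generator, the stream can only build an Object[].
    public static Class<?> componentTypeWithoutGenerator() {
        Object[] result = Stream.of("Louis", "Buddy").toArray();
        return result.getClass().getComponentType();
    }

    // With a generator, the caller supplies the concrete array type.
    public static Class<?> componentTypeWithGenerator() {
        String[] result = Stream.of("Louis", "Buddy").toArray(String[]::new);
        return result.getClass().getComponentType();
    }

    public static void main(String[] args) {
        System.out.println(sameRuntimeClass());              // true
        System.out.println(componentTypeWithoutGenerator()); // class java.lang.Object
        System.out.println(componentTypeWithGenerator());    // class java.lang.String
    }
}
```

Arrays, unlike generic collections, do keep their component type at runtime, which is exactly why someone has to tell the stream which array class to allocate.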


Can We Remove the Pet[]::new?

Because of type erasure, we need a way to pass type information so it's available at runtime. In the example above, we could write a method that lets us pass the type explicitly:

public static <T> T[] createArray(List<String> list1, List<Integer> list2, Class<T> clazz)

Then we can create a Pet array like this:

Pet[] pets = createArray(names, weights, Pet.class);

The implementation would use Java reflection. Problem solved? Not really. Now, instead of passing Pet[]::new, we have to pass Pet.class everywhere. We've just swapped one kind of boilerplate for another. It appears that we can't design a better API this way. Still, it shows that one way to deal with type erasure is to pass the type information as an extra argument.
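For the curious, here is one way such a method could be written. This is only a sketch under assumptions of my own: the wrapper class CreateArrayDemo, the nested Pet class, and the requirement that T has a public (String, int) constructor are all invented for the example.

```java
import java.lang.reflect.Array;
import java.lang.reflect.Constructor;
import java.util.Arrays;
import java.util.List;

public class CreateArrayDemo {
    // Hypothetical Pet class, included so the sketch is self-contained.
    public static final class Pet {
        public final String name;
        public final int weight;
        public Pet(String name, int weight) { this.name = name; this.weight = weight; }
        @Override public String toString() { return "Pet(" + name + ", " + weight + ")"; }
    }

    // Assumption: T exposes a public (String, int) constructor.
    public static <T> T[] createArray(List<String> list1, List<Integer> list2, Class<T> clazz) {
        // Stop at the shorter list, like a zip would.
        int size = Math.min(list1.size(), list2.size());
        try {
            Constructor<T> ctor = clazz.getConstructor(String.class, int.class);
            // The Class token is what lets us allocate a real T[] despite erasure.
            @SuppressWarnings("unchecked")
            T[] result = (T[]) Array.newInstance(clazz, size);
            for (int i = 0; i < size; i++) {
                result[i] = ctor.newInstance(list1.get(i), list2.get(i));
            }
            return result;
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("No usable (String, int) constructor on " + clazz, e);
        }
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("Louis", "Buddy", "Milo", "Luna", "Bella");
        List<Integer> weights = Arrays.asList(10, 8, 12, 9);
        Pet[] pets = createArray(names, weights, Pet.class);
        System.out.println(Arrays.toString(pets));
        // prints [Pet(Louis, 10), Pet(Buddy, 8), Pet(Milo, 12), Pet(Luna, 9)]
    }
}
```

Note that the constructor lookup can only fail at runtime, so on top of the extra argument we also lose some compile-time safety.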


How Does Scala Handle This?

Let's see how the same thing looks in Scala:

case class Pet(name: String, weight: Int)

val names = List("Louis", "Buddy", "Milo", "Luna", "Bella")
val weights = List(10, 8, 12, 9)

val pets: Array[Pet] = names.zip(weights).map(Pet.apply).toArray

println(pets.mkString("[", ", ", "]"))

Wow. No boilerplate! But how?

If you look at the signature of toArray, you'll see: def toArray[B >: A: ClassTag]: Array[B]. This is syntactic sugar for def toArray[B >: A](implicit ct: ClassTag[B]): Array[B]. So Scala also needs type information, in the form of a ClassTag. The difference is that the compiler resolves the ClassTag at compile time and passes it as an implicit argument. That's the magic: Scala removes the boilerplate by passing the type info for you.

At this point, you might say: "Scala doesn't really solve type erasure, as the title of this article claims." And you'd be right! Scala is compiled to JVM bytecode, so it can't fix type erasure at the root. In the end, generic type information still needs to be passed to methods or functions if you want to access it at runtime. However, implicit parameters are a great way to mitigate the issues caused by type erasure.


Bonus: Mirror in Scala 3 – Robust Generic Programming at Compile Time

Implicit parameters are just one example of how Scala makes generic programming easier. Here's a bonus: in Scala 3, you can do generic type manipulation at compile time using Mirror together with inline macros.

Problem: Write a method to check at compile time whether a type T contains a field named ssn. This kind of check has no direct equivalent in plain Java code, because Java’s reflection only allows inspection of fields at runtime, not at compile time. The closest option is a separate annotation processor.
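For contrast, here is roughly what the runtime version of this check looks like with Java reflection. It is a sketch with invented names (RuntimeFieldCheck, hasField, requireFieldSsn), and a violation only surfaces when the code actually runs.

```java
import java.util.Arrays;

public class RuntimeFieldCheck {
    // Hypothetical classes mirroring the ones used in this article.
    public static final class Person { String name; String ssn; }
    public static final class Pet { String name; int weight; }

    // True if the class itself declares a field with the given name.
    public static boolean hasField(Class<?> type, String fieldName) {
        return Arrays.stream(type.getDeclaredFields())
                .anyMatch(field -> field.getName().equals(fieldName));
    }

    // Throws at runtime; the compiler cannot catch a missing field here.
    public static void requireFieldSsn(Class<?> type) {
        if (!hasField(type, "ssn")) {
            throw new IllegalArgumentException(
                    type.getSimpleName() + " must declare a field named 'ssn'");
        }
    }

    public static void main(String[] args) {
        requireFieldSsn(Person.class); // passes
        requireFieldSsn(Pet.class);    // throws IllegalArgumentException
    }
}
```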

File FieldCheckDemo.scala:

// file FieldCheckDemo.scala
package example

import scala.deriving.Mirror
import scala.quoted.*

object FieldCheckDemo {

  inline def requireFieldSSN[T](using Mirror.ProductOf[T]): Unit =
    ${ requireFieldSSNImpl[T] }

  def requireFieldSSNImpl[T: Type](using Quotes): Expr[Unit] = {
    import quotes.reflect.*
    val fields = TypeRepr.of[T].typeSymbol.caseFields.map(_.name)
    if (!fields.contains("ssn"))
      report.errorAndAbort("Hey bro, your class must have a field named 'ssn' 😅")
    '{ () }
  }
}

File Main.scala:

// file Main.scala
package example
import example.FieldCheckDemo.requireFieldSSN

object Main {
  // Example 1: This will compile
  case class Person(name: String, ssn: String)
  requireFieldSSN[Person] // OK

  // Example 2: This will NOT compile
  case class Pet(name: String, weight: Int)
  requireFieldSSN[Pet] // compile error
}

Compile error shown in the IDE:

[Screenshot of the compile error; image not reproduced here]