Separate type-information derivation into auto and semiauto

Before I start explaining: I'm willing to work on the PR if you're interested, but I thought it better to discuss it with you first :-)

So, we're using flink-scala-api for type-information (I work with @arnaud-daroussin). One thing we've noticed is that if we use it "as intended" (by just importing `org.apache.flinkx.api.serializers._` everywhere), it leads to very high compilation times. With the old Flink API, a full clean compile took around 160 seconds, and with flink-scala-api it rose to 200 seconds. However, we managed to cut quite a lot of that by using semi-auto derivation instead of full-auto derivation: we got the time down to 140 seconds, even less than before the migration.

I'm not sure how familiar you are with semi-auto vs full-auto derivation? The idea is that instead of importing the macro everywhere, we declare implicit `TypeInformation` vals in the companion objects of all classes, and they're automatically found (hence semi-auto: they're declared manually, but found automatically). In addition to faster compile times, semi-auto also has the advantage of letting us create custom TypeInformations for certain classes where the macro would have worked, but wouldn't have been as optimized for runtime performance. In short, you trade convenience for control.

So for example, instead of:
```scala
import org.apache.flink.api.common.typeinfo.TypeInformation
import org.apache.flinkx.api.serializers._

final case class Alert(message: String)

final case class Notification(alerts: List[Alert])

object Job {
  val info = implicitly[TypeInformation[Notification]]
}
```

We have:
```scala
import org.apache.flink.api.common.typeinfo.TypeInformation
// Don't import deriveTypeInformation
import org.apache.flinkx.api.serializers.{deriveTypeInformation => _, _}

final case class Alert(message: String)

object Alert {
  implicit val alertInfo: TypeInformation[Alert] = org.apache.flinkx.api.serializers.deriveTypeInformation
}

final case class Notification(alerts: List[Alert])

object Notification {
  implicit val notificationInfo: TypeInformation[Notification] = ??? // placeholder: some custom, hand-written TypeInformation
}

object Job {
  val info = implicitly[TypeInformation[Notification]]
}
```

**The issue is that flink-scala-api doesn't really support semi-auto derivation natively.**

So, we had to jump through some hoops. As you can see, we have to be careful never to import `deriveTypeInformation`: being already in scope, it would take priority as an implicit over the one in the entity's companion object. That's very error-prone. The mistake is easy to miss (we made it a few times), because when you make it, everything still seems to work "mostly" fine. So instead, we created our own class that copies everything from `org.apache.flinkx.api.serializers` except `deriveTypeInformation`.
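
To illustrate the priority problem without pulling in Flink, here is a minimal sketch. `Show`, `auto` and `derived` are made-up names: `derived` plays the role of `deriveTypeInformation`, and nothing here is part of flink-scala-api.

```scala
// Hypothetical type class, standing in for TypeInformation.
trait Show[A] { def show(a: A): String }

object auto {
  // Plays the role of deriveTypeInformation: an implicit that can produce any Show[A].
  implicit def derived[A]: Show[A] = new Show[A] { def show(a: A): String = "derived" }
}

final case class Alert(message: String)

object Alert {
  // The hand-written instance we actually want to be picked up.
  implicit val custom: Show[Alert] = new Show[Alert] { def show(a: Alert): String = "custom" }
}

object WithoutImport {
  // Nothing suitable in lexical scope, so the companion object is searched.
  val result: String = implicitly[Show[Alert]].show(Alert("x")) // "custom"
}

object WithImport {
  import auto._
  // The imported implicit is found first; the companion is never consulted.
  val result: String = implicitly[Show[Alert]].show(Alert("x")) // "derived"
}
```

Both objects compile without any warning, which is why the stray import is so easy to miss.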

Another issue is that a missing type-information goes unnoticed, because `deriveTypeInformation` ends up calling itself for nested types if necessary. So for example, this shouldn't compile in semi-auto, but it does:
```scala
import org.apache.flink.api.common.typeinfo.TypeInformation
// Don't import deriveTypeInformation
import org.apache.flinkx.api.serializers.{deriveTypeInformation => _, _}

final case class Alert(message: String)

object Alert {
  // No TypeInformation declared
}

final case class Notification(alerts: List[Alert])

object Notification {
  // note that deriveTypeInformation is not in the implicit context, we call it by its full name
  // so it shouldn't find a way to get a TypeInformation[Alert]
  implicit val notificationInfo: TypeInformation[Notification] = org.apache.flinkx.api.serializers.deriveTypeInformation
}

object Job {
  val info = implicitly[TypeInformation[Notification]]
}
```

OK, that was a wall of text, sorry 😅 

So: **what do you think about supporting both auto and semi-auto derivation?**

That's something projects like Circe are already doing. The idea would be to have two separate packages for the derivation of serializers and type-informations, called `auto` and `semiauto`. The generic type-informations (for stuff like `Option`, `List`, etc.) would be in a parent trait, inherited by both `auto` and `semiauto`, and the macro would be the only difference between the two. Note that with semi-auto derivation, the cache is not necessary, because the declared type-information vals already do that job.
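
To make the proposed layout a bit more concrete, here is a rough sketch with toy types. Everything in it is an assumption for illustration: `Info` stands in for `TypeInformation`, `GenericInstances`, `AutoDerivation` and `deriveInfo` are hypothetical names, and `Stub.derive` is a placeholder for the real macro (it only records the class name, so it can't show the missing-field check a real semi-auto macro would do).

```scala
import scala.reflect.ClassTag

// Toy stand-in for TypeInformation, so the sketch has no dependencies.
final case class Info[A](description: String)

// Hypothetical parent trait: instances for generic types, shared by both flavours.
trait GenericInstances {
  implicit val stringInfo: Info[String] = Info("String")
  implicit def listInfo[A](implicit a: Info[A]): Info[List[A]] =
    Info(s"List[${a.description}]")
}

// Stub standing in for the real macro; it only records the class name.
object Stub {
  def derive[A](implicit ct: ClassTag[A]): Info[A] =
    Info(s"derived(${ct.runtimeClass.getName})")
}

// semiauto: derivation is a plain method, so it only runs when called explicitly.
object semiauto extends GenericInstances {
  def deriveInfo[A: ClassTag]: Info[A] = Stub.derive[A]
}

// auto: the same derivation, but implicit. It lives in its own trait so that
// the more specific generic instances are still preferred for List, String, etc.
trait AutoDerivation {
  implicit def deriveInfo[A: ClassTag]: Info[A] = Stub.derive[A]
}
object auto extends AutoDerivation with GenericInstances

final case class Alert(message: String)
object Alert {
  implicit val alertInfo: Info[Alert] = semiauto.deriveInfo[Alert]
}

final case class Undeclared(value: String) // no instance declared anywhere

object SemiAutoJob {
  import semiauto._
  // listInfo comes from the import, Info[Alert] from Alert's companion.
  val info: Info[List[Alert]] = implicitly[Info[List[Alert]]]
  // implicitly[Info[Undeclared]] would not compile here: nothing provides it.
}

object AutoJob {
  import auto._
  // Derived on the spot by the implicit deriveInfo.
  val info: Info[Undeclared] = implicitly[Info[Undeclared]]
}
```

With this split, users pick one import per file, and the semi-auto flavour never puts a catch-all implicit in scope.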



Separate type-information derivation into auto and semiauto #282
