Scala as backend language. Tips, tricks and pain

I inherited a legacy service written in Scala. The stack was Play2, Scala, Slick and Postgres.

This post describes why this technology stack is not the best option, what can be done to make it work better with less effort, and how to avoid the hidden pitfalls.

For the impatient:
If you have a choice, don’t use Slick.
If you have more freedom, don’t use Play.
And finally, try to avoid Scala on the back-end. It might be good for Spark applications, but not for back-ends.

Data layer

Every backend with persistent data needs a data layer.

In my experience the best way to organize this code is the repository pattern. You have an entity (DAO) and a repository, which you call whenever you need to manipulate data. Modern ORMs are your friends here: they do a lot of the work for you.
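
A minimal, library-free sketch of the pattern described above. The names `CrudRepository` and `InMemoryUserRepository` are hypothetical, and an in-memory map stands in for the database that an ORM would normally talk to:

```scala
import scala.collection.mutable

// Entity: a plain data holder. `id` stays None until the record is stored.
final case class User(id: Option[Long], firstName: String, lastName: String)

// Hypothetical generic repository: the only way the rest of the code touches storage.
trait CrudRepository[T] {
  def insert(item: T): T
  def find(id: Long): Option[T]
  def listAll(): List[T]
}

// In-memory stand-in for an ORM-backed implementation (illustration only).
final class InMemoryUserRepository extends CrudRepository[User] {
  private val rows = mutable.LinkedHashMap.empty[Long, User]
  private var nextId = 0L

  // Assigns the next id, stores the row and returns the stored copy.
  def insert(item: User): User = {
    nextId += 1
    val stored = item.copy(id = Some(nextId))
    rows.update(nextId, stored)
    stored
  }

  def find(id: Long): Option[User] = rows.get(id)

  // LinkedHashMap keeps insertion order, so the list is ordered by id.
  def listAll(): List[User] = rows.values.toList
}
```

Business code would then depend only on `CrudRepository[User]`, for example `repo.insert(User(None, "Ada", "Lovelace"))`, and never on the storage details.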

Slick – back in 2010

That was my first thought when I started using it. In Java you can use Spring Data, which generates a repository implementation for you. All you need to do is annotate your entity with JPA annotations and write a repository interface.

Slick is a different thing. It can work in two ways.

Manual definition

You define your entity as a case class, listing all the fields and their types:

case class User(
    id: Option[Long],
    firstName: String,
    lastName: String
)

And then you manually repeat all the fields and their types when defining the schema:

class UserTable(tag: Tag) extends Table[User](tag, "user") {
    def id = column[Long]("id", O.PrimaryKey, O.AutoInc)
    def firstName = column[String]("first_name")
    def lastName = column[String]("last_name")

    def * = (id.?, firstName, lastName) <> (User.tupled, User.unapply)
}

Nice. Like in ancient times. Forget about @Column automapping. If you have a DTO and need to add a field, you must always remember to add it to the DTO, the DAO and the schema: three places.

And have you seen the insert method implementation?

def create(name: String, age: Int): Future[Person] = db.run {
    (people.map(p => (p.name, p.age))
        returning people.map(_.id)
        into ((nameAge, id) => Person(id, nameAge._1, nameAge._2))
    ) += (name, age)
}

I am used to having a save method defined only once, somewhere in an abstract repository, and having it in one line, something like myFavouriteOrm.insert(new User(name, age)).

The full example is here: https://github.com/playframework/play-scala-slick-example

I don’t understand why Play’s authors say ORMs “will quickly become counter-productive”. Writing manual mappings on real projects becomes a pain much faster than abstract “ORM counter-productivity”.

Code generation

The second approach is code generation. It scans your DB and generates code based on it, like a reversed migration. I didn’t like this approach at all (it was in the legacy code I inherited).

First, to make it work you need DB access at compile time, which is not always possible.

Second, if the backend owns the data, it should be responsible for the schema. That means either the schema is generated from code, or code changes and the migrations with schema changes live in the same repository.

Third, have you seen the generated code? Lots of unnecessary classes, no formatting (400-600 characters per line), and no way to modify these classes by adding logic or extending an interface. I had to create my own data layer around this generated data layer 🙁

Ebean and some efforts to make it work

So, after fighting with Slick, I decided to remove it together with the whole data layer and use another technology. I chose Ebean, as it is the official ORM for Play2 + Java. It looks like the Play developers don’t like Hibernate for some reason.

An important thing to note: it is a Java ORM, and Scala is not officially supported (its support was dropped a few years ago). So you need to put in some effort to make it work.

First of all, add the JAXB libraries to your dependencies. The JAXB modules were deprecated and no longer resolved by default in Java 9, and removed entirely in Java 11. So on Java 9+ your app will crash at runtime without them.

libraryDependencies ++= Seq(
  "javax.xml.bind" % "jaxb-api" % "2.2.11",
  "com.sun.xml.bind" % "jaxb-core" % "2.2.11",
  "com.sun.xml.bind" % "jaxb-impl" % "2.2.11",
  "javax.activation" % "activation" % "1.1.1"
)

Next, do not forget to add the JDBC library and the driver library for your database.

After that you are ready to set up your data layer.

Entity

Write your entities as normal Java entities:

import java.util
import javax.persistence._

@Table(name = "master")
@Entity
class Master {
  @Id
  @GeneratedValue(strategy = GenerationType.AUTO)
  @Column(name = "master_id")
  var masterId: Int = _

  @Column(name = "master_name")
  var masterName: String = _

  @OneToMany(cascade = Array(CascadeType.MERGE))
  var pets: util.List[Pet] = new util.ArrayList[Pet]()
}

Basic Scala types are supported, but with several limitations:

  • You have to use java.util.List for one-to-many and many-to-many relationships. Scala’s ListBuffer is not supported, as Ebean doesn’t know how to de/serialize it. Scala’s List isn’t either, as it is immutable and Ebean can’t populate it.
  • Primitives like Int or Double must not be nullable in the database. If the column is nullable, use java.lang.Double (or java.lang.Integer), or you will get an exception as soon as you try to load such an object from the database, because Scala’s Double compiles to the double primitive, which can’t be null.
    Scala’s Option[Double] won’t work, as the ORM will return null instead of None.
  • Relations are supported, including the bridge table, which is also created automatically. But, because of a bug, @JoinColumn can’t be specified.
  • Ebean uses Java lists, so you need scala.collection.JavaConverters every time you pass a list into a query (like where.in) and every time you get a list back (like findList).
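
The nullability and list limitations above can be kept at the edge of the data layer. The sketch below uses only the standard library. `Measurement` and `MeasurementOps` are hypothetical names, and the class only mimics the shape of an entity (boxed nullable field, java.util.List relation) without any Ebean annotations:

```scala
import scala.collection.JavaConverters._ // same converters the article uses; deprecated in Scala 2.13+

// Hypothetical entity shape: nullable column as a boxed type, relation as a Java list.
class Measurement {
  var value: java.lang.Double = null // may legitimately be null after loading
  var tags: java.util.List[String] = new java.util.ArrayList[String]()
}

object MeasurementOps {
  // Wrap the nullable boxed value so the rest of the code sees Option[Double].
  def valueOpt(m: Measurement): Option[Double] =
    Option(m.value).map(_.doubleValue)

  // Java list -> immutable Scala list when returning data.
  def tagList(m: Measurement): List[String] = m.tags.asScala.toList

  // Scala list -> mutable Java list when handing data back to the ORM.
  def setTags(m: Measurement, tags: List[String]): Unit =
    m.tags = new java.util.ArrayList[String](tags.asJava)
}
```

`Option(x)` returns `None` for a null reference, which is why the wrapping has to happen in your own code rather than in the entity field type.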

Repository

This is (the only) nice thing in Scala that is useful here: a trait can extend an abstract class. It means you can create an abstract CRUD repository and reuse it in your business repositories, like you get out of the box in Spring Data 🙂

1. Create your abstract repository:

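// Imports assume Ebean 10+ (io.ebean package); older versions use com.avaje.ebean
import io.ebean.{Ebean, EbeanServer}
import javax.inject.Inject
import play.db.ebean.EbeanConfig
import scala.collection.JavaConverters._
import scala.reflect.{ClassTag, classTag}

class AbstractRepository[T: ClassTag] {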
  var ebeanServer: EbeanServer = _

  @Inject()
  def setEbeanServer(ebeanConfig: EbeanConfig): Unit = {
    ebeanServer = Ebean.getServer(ebeanConfig.defaultServer())
  }

  def insert(item: T): T = {
    ebeanServer.insert(item)
    item
  }

  def update(item: T): T = {
    ebeanServer.update(item)
    item
  }

  def saveAll(items: List[T]): Unit = {
    ebeanServer.insertAll(items.asJavaCollection)
  }

  def listAll(): List[T] = {
    ebeanServer.find(classTag[T].runtimeClass.asInstanceOf[Class[T]])
      .where().findList().asScala.toList
  }

  def find(id: Any): Option[T] = {
    Option(ebeanServer.find(classTag[T].runtimeClass.asInstanceOf[Class[T]], id))
  }
}

You need classTag here to determine the entity class at runtime, since the type parameter T itself is erased.

2. Create your business repository trait, extending this abstract repository:

@ImplementedBy(classOf[MasterRepositoryImpl])
trait MasterRepository extends AbstractRepository[Master] {
}

Here you can also declare special methods that will be used only in this repository.

In the implementation you need to define only the methods from MasterRepository. If there are none, just leave it empty. Methods from the AbstractRepository will be accessible anyway.

@Singleton
class MasterRepositoryImpl extends MasterRepository {
}
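
The classTag trick used in AbstractRepository can be seen in isolation, without Ebean. In this sketch `EntityClassResolver` and `MasterResolver` are hypothetical names, and `Master` is an empty stand-in for the entity:

```scala
import scala.reflect.{ClassTag, classTag}

// The context bound [T: ClassTag] makes the compiler pass the runtime class along,
// which would otherwise be lost to type erasure.
class EntityClassResolver[T: ClassTag] {
  def entityClass: Class[T] = classTag[T].runtimeClass.asInstanceOf[Class[T]]
}

// Empty stand-in for the entity class.
class Master

// The ClassTag[Master] is resolved here, where the concrete type is known.
class MasterResolver extends EntityClassResolver[Master]
```

Calling `new MasterResolver().entityClass` gives back `classOf[Master]`, which is the value the repository passes to `ebeanServer.find`.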

After the data layer refactoring ~70% of the code was removed. The main point here: functional stuff (FRM and other “modern” things) can be useful only when you don’t have business objects. For example, you are building a telecom back-end whose main purpose is to parse network packets, do something with their data and pass them on to the next stage of your data pipeline. In all other cases, when your business logic touches the real world, you need object-oriented design.

DTO layer

This is another part of the system that disappointed me. This layer’s purpose is to receive messages from outside (usually over REST) and run actions based on the message type. Usually it means that you get a message, parse it (usually from JSON) and pass it to the service layer. Then you take the service layer’s result and send it back as an encoded response. Encoding and decoding messages (DTOs) is the main thing here.

For some reason working with JSON is unfriendly in Scala. And super unfriendly in Play2.

Json serialization – is not automated anymore

In normal frameworks specifying the type of the object to be parsed is all you need to do. You specify the root object, and the request body is parsed and deserialized into it, including all sub-objects. For example, build(@RequestBody RepositoryDTO body), taken from one of my open-source projects.

In Play you need to set up an implicit reader for every sub-object used in your DTO. If your MasterDTO contains PetDTO, which contains RoleDTO, you have to set up a reader for each of them:

def createMaster: Action[AnyContent] = Action.async { request =>
    implicit val formatRole: OFormat[RoleDTO] = Json.format[RoleDTO]
    implicit val formatPet: OFormat[PetDTO] = Json.format[PetDTO]
    implicit val format: OFormat[MasterDTO] = Json.format[MasterDTO]
    val parsed = Json.fromJson(request.body.asJson.get)(format)
    val body: MasterDTO = parsed.getOrElse(null)
    // …
}

Maybe there is some automated way, but I haven’t found it. All approaches end up with getting the request body as JSON and parsing it manually.
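
One option that may cut down the per-action boilerplate, assuming Play 2.6+ with play-json: put each format in the DTO’s companion object, where implicit resolution finds it automatically, and let the `parse.json[A]` body parser do the decoding. The DTO fields below are made up for the sketch, and this is not how the original service was written:

```scala
import javax.inject.Inject
import play.api.libs.json.{Json, OFormat}
import play.api.mvc.{AbstractController, Action, ControllerComponents}

// Hypothetical DTO fields; each companion carries its own format,
// so it is declared once instead of inside every action.
final case class RoleDTO(name: String)
object RoleDTO {
  implicit val format: OFormat[RoleDTO] = Json.format[RoleDTO]
}

final case class PetDTO(name: String, role: RoleDTO)
object PetDTO {
  implicit val format: OFormat[PetDTO] = Json.format[PetDTO]
}

final case class MasterDTO(name: String, pets: Seq[PetDTO])
object MasterDTO {
  implicit val format: OFormat[MasterDTO] = Json.format[MasterDTO]
}

class MasterController @Inject()(cc: ControllerComponents) extends AbstractController(cc) {
  // parse.json[MasterDTO] decodes the body with the implicit Reads
  // and responds with 400 Bad Request if decoding fails.
  def createMaster: Action[MasterDTO] = Action(parse.json[MasterDTO]) { request =>
    val body: MasterDTO = request.body // already typed, no manual fromJson
    Ok(Json.toJson(body))
  }
}
```

The formats still have to exist for every nested type, so this reduces repetition rather than removing the mapping step.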

Json validation – more boilerplate for the god of boilerplate!

Play has its own modern functional way of data validation. In three steps only:

  1. Forget about javax.validation.
  2. Define your DTO as a case class. Here you write your field names and their types.
  3. Manually write a Form mapping, listing all the DTO’s field names and writing their types once again.

After Slick’s manual schema definition I expected something shitty. But it exceeded my expectations.

The example:

import play.api.data.Form
import play.api.data.format.Formats.doubleFormat

case class SomeDTO(id: Int, text: String, option: Option[Double])

def validationForm: Form[SomeDTO] = {
  import play.api.data.Forms._
  Form(
    mapping(
      "id" -> number,
      "text" -> nonEmptyText,
      "option" -> optional(of(doubleFormat))
    )(SomeDTO.apply)(SomeDTO.unapply)
  )
}

It is used like this:

    def failure(badForm: Form[_]) = {
      BadRequest(badForm.errorsAsJson(messagesProvider))
    }

    def success(input: SomeDTO) = {
      // your business logic here 
    }

    validationForm.bindFromRequest()(request).fold(failure, success)
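
The fold call at the end takes one handler per outcome, and exactly one of them runs. The same control flow can be sketched without Play using Either from the standard library. `SomeDtoValidation`, its error messages and the string "responses" are hypothetical and only illustrate the shape:

```scala
import scala.util.Try

final case class SomeDTO(id: Int, text: String, option: Option[Double])

object SomeDtoValidation {
  // Hand-rolled validator: Left carries every error message, Right the valid DTO.
  def validate(raw: Map[String, String]): Either[List[String], SomeDTO] = {
    val id = raw.get("id").flatMap(s => Try(s.toInt).toOption)
    val text = raw.get("text").filter(_.trim.nonEmpty)
    // Outer Option: was the field sent? Inner Option: did it parse as a number?
    val option = raw.get("option").map(s => Try(s.toDouble).toOption)

    val errors = List(
      if (id.isEmpty) Some("id: number expected") else None,
      if (text.isEmpty) Some("text: must not be empty") else None,
      if (option.exists(_.isEmpty)) Some("option: number expected") else None
    ).flatten

    if (errors.isEmpty) Right(SomeDTO(id.get, text.get, option.flatten))
    else Left(errors)
  }

  // Same shape as Form.fold: a failure handler and a success handler with one result type.
  def handle(raw: Map[String, String]): String =
    validate(raw).fold(
      errors => "400: " + errors.mkString("; "),
      dto => "200: " + dto.text
    )
}
```

Even in this reduced form the field names appear once in the case class and again in the validator, which is the duplication the section complains about.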

Json deserialization – forget about heterogeneity

This was the main problem with Play’s JSON implementation and the main reason I decided to get rid of it. Unfortunately I didn’t find a quick way to remove it completely (it looks like it is hardcoded). So I use Play JSON for validation and DTO decoding, and json4s for DTO encoding.

Why?

All my DTOs implement my JsonSerializable trait, and I have a few services that work with generic objects. Imagine DogDTO and CatDTO: they are different business entities, but some actions are common. To avoid code duplication I just pass them via the Pet trait to those services (like FeedPetService). They do their job and return just a List of JsonSerializable objects (either Cat or Dog DTOs, depending on the input type).

It turned out that Play can’t serialize a trait if it is not sealed. It requires an implicit writer to be set up explicitly. So after googling a bit I switched to json4s.

Now I have 2 lines of implementation for any DTO:

def toJson(elements: List[JsonSerializable]): String = {
  implicit val formats: AnyRef with Formats = Serialization.formats(NoTypeHints)
  Serialization.write(elements)
}

It is defined in a trait. Every companion object that extends this trait gets JSON serialization of its class’s objects out of the box.
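
As for why a sealed trait makes a difference: with a sealed trait the compiler knows every subtype up front, so a writer can be chosen at compile time, while an open trait can gain new subtypes anywhere. A library-free sketch of that idea, with hypothetical DTO fields and hand-written JSON strings standing in for real writers:

```scala
// Sealed: all subtypes must be declared in this file, so the full set is known.
sealed trait Pet
final case class DogDTO(name: String, barks: Boolean) extends Pet
final case class CatDTO(name: String, lives: Int) extends Pet

object PetJson {
  // Exhaustive match: adding a new Pet subtype without a case triggers a compiler warning.
  // No string escaping is done here; this is an illustration, not a JSON library.
  def write(pet: Pet): String = pet match {
    case DogDTO(name, barks) => s"""{"type":"dog","name":"$name","barks":$barks}"""
    case CatDTO(name, lives) => s"""{"type":"cat","name":"$name","lives":$lives}"""
  }

  // A heterogeneous list is fine because every element is a known Pet subtype.
  def writeAll(pets: List[Pet]): String = pets.map(write).mkString("[", ",", "]")
}
```

Reflection-based serializers such as json4s inspect the runtime class instead, which is why they do not need the hierarchy to be closed.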

Summing up

  • Slick’s creators call Slick a “Functional Relational Mapper” (FRM) and claim minimal configuration as one of its advantages. As far as I can see, it is yet another unsuccessful attempt to build something around the “Functional” buzzword. Of my 10 years of experience I spent around 4 in functional programming (Erlang) and saw a lot of dead projects that started as a “New Innovative Functional Approach”.
  • Scala’s implicits are something magical that breaks the KISS principle and makes the code messy. Here is a very good thread about Scala implicits + Slick
  • Working with JSON in Play2 is a pain.