Migration to Scala 2.11 to run on Spark 2


#1

Hi,

At Simply Business we are currently using Snowplow integrated with Spark 1.6, deployed on EMR 4.6.

We are in the process of migrating to Spark 2 on EMR 5.2, as it will help us in many areas. However, this version of Spark on EMR is built on top of Scala 2.11. We talked with AWS support and they don’t have an equivalent Spark 2 build on Scala 2.10 for EMR that they can offer.

So we are migrating our projects to Scala 2.11 in order to be able to run on Spark 2 on EMR. We obviously have a dependency on the Snowplow libraries, which are all built against Scala 2.10. (We mainly depend on common-enrich.)

Before I embark on updating libraries and finding out by trial and error what goes wrong with each update, I was wondering what kind of progress you have made in this area, and whether a joint effort (at least for the projects we depend on) would be beneficial. I am aware of this ticket: https://github.com/snowplow/snowplow/issues/2824 but it doesn’t give much detail on the roadmap or progress.

Thanks a lot,


#2

Hello @calo81,

There’s some work going on in the bumps branch. There’s no ETA yet and this is more of an experiment, but what we can see so far is that:

  1. This branch compiles with Scala 2.11
  2. We need to publish two more of our projects with 2.11 support (scala-forex and scala-util)
  3. The only feature I found not working as expected is the API Request Enrichment (its tests just hang)

So, depending on how urgent this is for you, you may want to publish scala-forex-M1 and scala-util-M1 locally (sbt publishLocal) and compile Common Enrich against them.
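
As a rough sketch of what that could look like on the Common Enrich side: after running sbt publishLocal in each of the two projects, the build would need to ask for the 2.11 artifacts. The fragment below is only an illustration. The version strings are placeholders (the exact M1 version numbers are not given in this thread), the Scala patch version is an assumption, and the real Snowplow build may declare its dependencies in a different file.

```scala
// build.sbt fragment (sketch, not the actual Snowplow build definition).

// Assumption: any 2.11.x release; pick the one matching your Spark 2 build.
scalaVersion := "2.11.8"

// `sbt publishLocal` writes artifacts to the local Ivy repository
// (~/.ivy2/local), which sbt checks by default, so no extra resolver is needed.
libraryDependencies ++= Seq(
  // %% appends the Scala binary version (_2.11) to the artifact name.
  // FOREX_M1_VERSION and UTIL_M1_VERSION are placeholders for whatever
  // versions the locally published M1 builds declare.
  "com.snowplowanalytics" %% "scala-forex" % "FOREX_M1_VERSION",
  "com.snowplowanalytics" %% "scala-util"  % "UTIL_M1_VERSION"
)
```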

Any feedback on this would be highly appreciated.


#3

Hi Anton,
thanks a lot for the reply and the pointer to the branch.

I will pick up that branch as you suggest (and discard the one I was changing myself), try it out, and let you know if I run into any problems or have any feedback.

Cheers,


#4

Hi @anton, sorry, but where is the M1 branch of scala-util? I can’t find it; I think it is not in the repo… I also found this issue: https://github.com/snowplow/scala-util/issues/20 where it is mentioned that the plan is to kill this project. How should I proceed?

Thanks,


#5

Hi @calo81,

Sorry for not mentioning it earlier. I think a better way would be to move the single, very simple class from scala-util to somewhere like com.snowplowanalytics.snowplow.enrich.common.utils, remove the dependency entirely, and replace every import com.snowplowanalytics.util.Tap._ with import com.snowplowanalytics.snowplow.enrich.common.utils.Tap._.
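
For illustration, the relocated class might look roughly like the sketch below. This assumes Tap is the usual "tap" helper (run a side effect on a value, then return the value unchanged); the name TapOps is made up for this example, and the actual source in scala-util should be copied rather than this version. In Common Enrich the file would start with a package com.snowplowanalytics.snowplow.enrich.common.utils clause, which is left out here so the snippet runs standalone.

```scala
// Sketch of a tap-style helper; an assumption about what scala-util's
// Tap provides, not a copy of its source.
object Tap {
  // TapOps is a hypothetical name. The implicit class adds `.tap` to any value.
  implicit class TapOps[A](val underlying: A) extends AnyVal {
    // Apply a side-effecting function, then return the original value.
    def tap(f: A => Unit): A = {
      f(underlying)
      underlying
    }
  }
}
```

Call sites would then only need the import swapped, e.g. import Tap._ followed by someValue.tap(v => println(v)).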


#6

Hi,

When do you expect the Scala 2.11 branch to be merged in?