Author: Val Tikhonov

Senior backend developer.
End-to-end from front-end to back-end with Catcher


Catcher’s external modules 5.1.0 have finally been released. This is great news, as it adds a Selenium step for front-end testing!

What should a proper e2e test look like?

Imagine you have a user service with a nice UI which allows you to get information about users registered in your system. Deep in the back-end you also have an audit log which saves all actions.

Before 5.1.0 you could use HTTP calls to mimic front-end behavior to trigger some actions on the back-end side.

Your test probably looked like this:
– call an http endpoint to search for a user
– check that the search event was saved to the database
– compare the found user with the search event saved in the database

This test checks 100% of the back-end functionality. But most likely the front-end is part of your system too! So a proper end-to-end test should start at the front-end application and end up in the back-end.

Without touching the front-end you could get false-positive results in your e2e tests. For example: a user has some special symbols in his name. All back-end tests pass and you deploy your application to production. After the deploy your users start to complain that the front-end part of the application crashes. The reason: the front-end can’t handle the back-end’s response when rendering user details with special symbols in the name.

With the new Catcher version you can include the front-end in your test. So, instead of calling http, you can use the selenium step.

The test

Let’s write a test which searches for a user and checks that the search attempt was logged.

Every test starts with variables. To catch false positives we need to save multiple users and then check that only the correct one is returned. Let’s compose our users: each will have a random email and a random name, thanks to the random built-in function.

variables:
  users:
    - name: '{{ random("name") }}'
      email: '{{ random("email") }}'
    - name: '{{ random("name") }}'
      email: '{{ random("email") }}'
    - name: '{{ random("name") }}'
      email: '{{ random("email") }}'

Now we are ready to write our steps.

Populate the data

The first thing we need to do is populate the data with the prepare step.

Let’s prepare a users.sql which will create all back-end tables (in the case of a clean run we don’t have them yet).

CREATE TABLE if not exists users_table(
                    email varchar(36) primary key,
                    name varchar(36) NOT NULL
);

Next we need to fill our table with test data. users.csv will use our users variable to prepare the data for this step.

{%- for user in users -%}
{{ user.email }},{{ user.name }}
{%- endfor -%}

The step itself will take users.sql and create the database tables if needed. Then it will populate them using users.csv, which is based on the users variable.

- prepare:
    populate:
      postgres:
        conf: '{{ postgres }}'
        schema: users.sql
        data:
          users: users.csv
    name: Populate postgres with {{ users|length }} users

Select a user to search for

The next (small) step is to select a user for our search. The echo step will randomly select a user from the users variable and register their email as a new variable.

- echo: 
    from: '{{ random_choice(users).email }}'
    register: {search_for: '{{ OUTPUT }}'}
    name: 'Select {{ search_for }} for search'

Search front-end for our user

With the selenium step we can use our front-end to search for the user. The selenium step runs a JS/Java/Jar/Python script from the resources directory.

It passes Catcher’s variables as environment variables to the script, so you can access them within Selenium. It also greps the script’s output, so you can access everything in Catcher’s next steps.
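This variable-passing mechanism can be sketched roughly like this (a toy model, not Catcher’s actual code): export the test variables into the child process’s environment and capture its stdout for later steps.

```python
import os
import subprocess
import sys


def run_script_with_vars(cmd: list, variables: dict) -> str:
    """Rough sketch of the mechanism described above: test variables are
    exported as environment variables and the script's stdout is captured."""
    env = {**os.environ, **{k: str(v) for k, v in variables.items()}}
    result = subprocess.run(cmd, env=env, capture_output=True, text=True)
    return result.stdout
```

For example, `run_script_with_vars(['node', 'register_user.js'], {'site_url': ..., 'search_for': ...})` would make both values visible to the script as environment variables.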

- selenium:
    test:
      file: register_user.js
      driver: '/usr/lib/geckodriver'
    register: {title: '{{ OUTPUT.title }}'}

The step will run register_user.js, which searches for our selected user and registers the page’s title.

Check the search log

After the search we need to check that it was logged. Imagine our back-end uses MongoDB, so we’ll use the mongo step.

- mongo:
    request:
      conf: '{{ mongo }}'
      collection: 'search_log'
      find: {'text': '{{ search_for }}'}
    register: {search_log: '{{ OUTPUT }}'}

This step searches the MongoDB search_log collection for any search attempt with our user in its text.

Compare results

The final steps deal with comparing results. First, we’ll use echo again to transform our users so that we can look them up by email.

- echo:
    from: '{{ users|groupby("email")|asdict }}'
    register: {users_kv: '{{ OUTPUT }}'}

Second, we will compare the front-end page title obtained from selenium with the MongoDB search log and the user’s name.

- check:
    and:
      - equals: {the: '{{ users_kv[search_for][0].name }}', is: '{{ title }}'}
      - equals: {the: '{{ title }}', is: '{{ }}'}

The selenium resource

Let’s add a Selenium test resource. It will go to your site and search for your user. If everything is OK, the page title will be the result of this step.


Selenium step supports Java, JS, Python and Jar archives. In this article I’ll show you all of them (except Jar, it is the same as Java, but without compilation). Let’s start with JavaScript.

const {Builder, By, Key, until} = require('selenium-webdriver');

async function basicExample() {
    let driver = await new Builder().forBrowser('firefox').build();
    try {
        await driver.get(process.env.site_url);
        await driver.findElement(By.name('q')).sendKeys(process.env.search_for, Key.RETURN);
        await driver.wait(until.titleContains(process.env.search_for), 1000);
        const title = await driver.getTitle();
        console.log('{"title":"' + title + '"}');
    } catch (err) {
        process.exitCode = 1;
    } finally {
        await driver.quit();
    }
}

basicExample();
Catcher passes all its variables as environment variables, so you can access them from JS/Java/Python. process.env.site_url in this example takes site_url from Catcher’s variables, and process.env.search_for takes the user email to search for.

Everything you write to STDOUT is caught by Catcher. In the case of JSON it will be returned as a dictionary. For example, with the console.log('{"title":"' + title + '"}') statement, OUTPUT.title will be available on Catcher’s side. If Catcher can’t parse the JSON, it will return the text as OUTPUT.
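This JSON-or-text behavior can be sketched in a few lines (a toy model of the described behavior, not Catcher’s actual implementation):

```python
import json


def parse_step_output(stdout: str):
    """Valid JSON on stdout becomes a dictionary, anything else stays plain text."""
    try:
        return json.loads(stdout)
    except ValueError:  # json.JSONDecodeError subclasses ValueError
        return stdout
```

So `parse_step_output('{"title": "John"}')` yields a dict whose title key backs OUTPUT.title, while `parse_step_output('some log line')` simply returns the raw text.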


Here is the Python implementation of the same resource. It should also be placed in the resources directory. To use it instead of the JavaScript implementation you only need to change the file parameter in the selenium step.

import os

from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.firefox.options import Options

options = Options()
options.headless = True
driver = webdriver.Firefox(options=options)
try:
    driver.get(os.environ['site_url'])
    elem = driver.find_element_by_name("q")
    elem.send_keys(os.environ['search_for'], Keys.RETURN)
    assert "No results found." not in driver.page_source
    print('{"title":"' + driver.title + '"}')
finally:
    driver.close()


Java is a bit more complex since (if you are not using an already compiled Jar) Catcher has to compile the Java source before running it. For this you need Java and the Selenium libraries installed in your system.

Luckily, Catcher comes with a Docker image where the libraries (JS, Java, Python), Selenium drivers (Firefox, Chrome, Opera) and tools (NodeJS, JDK, Python) are installed.

package selenium;

import org.openqa.selenium.By;
import org.openqa.selenium.Keys;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.firefox.FirefoxBinary;
import org.openqa.selenium.firefox.FirefoxDriver;
import org.openqa.selenium.firefox.FirefoxOptions;

public class MySeleniumTest {

    public static void main(String[] args) {
        FirefoxBinary firefoxBinary = new FirefoxBinary();
        firefoxBinary.addCommandLineOptions("--headless");
        FirefoxOptions options = new FirefoxOptions();
        options.setBinary(firefoxBinary);
        WebDriver driver = new FirefoxDriver(options);
        try {
            driver.get(System.getenv("site_url"));
            WebElement element = driver.findElement(By.name("q"));
            element.sendKeys(System.getenv("search_for"), Keys.RETURN);
            System.out.println("{\"title\":\"" + driver.getTitle() + "\"}");
        } finally {
            driver.quit();
        }
    }
}


Catcher’s 5.1.0 update unites front-end and back-end testing, allowing both to live in one testcase. It improves the coverage and makes the test really end-to-end.

Catcher e2e tests tool for beginners


What is Catcher?

It is an end-to-end test tool, specially designed to test systems containing many components. Initially developed as an end-to-end microservices test tool, it also fits the needs of data pipeline testing perfectly. Here we describe the basic Catcher concepts.

Catcher is a modular system. It has core modules which are shipped with Catcher itself and additional modules which can be installed separately on demand.


In Catcher terminology, module and step are the same thing.
Every Catcher test can contain the following types of steps:

  • built-in core Catcher steps. They include basic steps like checks, sh, loops, http, wait and others;
  • extended Catcher steps. They have a more complex implementation and sometimes require dependencies or drivers to be installed in the system. For example: kafka, elastic, docker, mongo, s3 and others;
  • your custom steps, written in Python, Java, sh or any other language you prefer.

A simple example of what a test can look like:

steps:
  - http:  # load answers via post body
      post:
        url: ''
        body_from_file: "data/answers.json"
  - elastic:  # check in the logs that answers were loaded
      search:
        url: ''
        index: test
        query:
          match: {payload: "answers loaded successfully"}


Steps with hard-coded values are not that useful. Variables are one of Catcher’s key features: Jinja2 templates are fully supported. Use as many variables as you can to avoid code duplication and keep things flexible.

Test-local variables

Every step can use the variables defined in the test’s variables section. They can be either static (like token) or computed (like user_email).

variables:
  user_email: '{{ random("email") }}'
  token: 'my_secret_token'
steps:
  - http:
      post:
        headers: {Content-Type: 'application/json', Authorization: '{{ token }}'}
        url: '{{ user_email }}'
        body: {'foo': bar}

Every step can also register its output (or part of it) as a new variable:

variables:
  user_email: '{{ random("email") }}'
  token: 'my_secret_token'
steps:
  - mongo:  # search MongoDB for the user
      request:
        conf:
          database: test
          username: test
          password: test
          host: localhost
          port: 27017
        collection: 'users'
        find_one: {'user_id': '{{ user_email }}'}
      register: {found_user: '{{ OUTPUT }}'}
  - check:  # check the token was saved
      equals: {the: '{{ found_user["token"] }}', is: "{{ token }}"}

Let’s take a closer look at this line: register: {found_user: '{{ OUTPUT }}'}. Here the OUTPUT system variable stores the mongo step’s output, and we register it as a new variable found_user.

OUTPUT is the system variable which stores every step’s output.

System variables

Catcher can also access your system environment variables. For example, run export MY_ENV_HOST=localhost && catcher my_test.yml and your environment variable will be picked up. You can disable this behavior by running Catcher with the -s flag: catcher -s false my_test.yml.

steps:
  - http:  # load answers via post body
      post:
        url: 'http://{{ MY_ENV_HOST }}/load'

To override variables use -e param while running tests:

catcher -e var=value -e other_var=other_value tests/

Override priority is:

  1. command line variables override everything below
  2. newly registered variables override everything below
  3. test-local variables override everything below
  4. inventory variables override everything below
  5. environment variables override nothing
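The priority order above can be illustrated with a small sketch (a toy model, not Catcher’s code): merge the layers from lowest to highest priority, so a later update wins on key conflicts.

```python
def resolve_variables(environment, inventory, test_local, registered, cmd_line):
    """Toy model of the override order: each dict is merged over the
    lower-priority ones, so command-line variables win in the end."""
    resolved = {}
    for layer in (environment, inventory, test_local, registered, cmd_line):
        resolved.update(layer)
    return resolved
```

For example, `resolve_variables({'host': 'env'}, {'host': 'inv'}, {}, {}, {'host': 'cli'})` resolves host to 'cli'.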


As you may have noticed, hard-coding service configuration either in the test or in the test’s variables is not that flexible, as it differs for every environment. Catcher uses inventory files for environment-specific configuration. Inventories are stored in the inventory folder and also support templates.

You can create inventory file: local.yml and set variables there:

airflow_web: ''
s3:
    key_id: minio
    secret_key: minio123

And create develop_aws.yml with variables:

airflow_web: 'http://my_airflow:8080'
s3:
    key_id: my_real_aws_key_id
    secret_key: my_real_aws_secret

When you run Catcher you can specify inventory via -i param:

catcher -i inventory/local.yml test


Sometimes, when you write your test, you need to see what is going on in your system after each step. At the moment Catcher supports only json output; run it with the -p json option:

catcher -p json my_test.yml

Then check the reports directory for a json step-by-step report. It contains all in and out variables for every step.



For a quick start you can use Catcher Docker image. It includes catcher-core, all extended modules, drivers and dependencies. It comes ready to use – all you need to do is to mount your local directories with your tests, inventories, etc.
The full command to run it is:

docker run -v $(pwd)/inventory:/opt/catcher/inventory \
           -v $(pwd)/resources:/opt/catcher/resources \
           -v $(pwd)/test:/opt/catcher/tests \
           -v $(pwd)/steps:/opt/catcher/steps \
           -v $(pwd)/reports:/opt/catcher/reports \
           catcher -i inventory/my_inventory.yml tests

Catcher in Docker is usually used in CI while developers prefer Catcher local installation for writing tests.


For a local installation, ensure you have Python 3.5+. You can use miniconda if you don’t have it.

Run pip install catcher to install catcher-core. It installs Catcher and the basic steps. If you need any extended steps, install them separately. For example, for kafka and postgres you have to run pip install catcher_modules[kafka,postgres].

Some extended steps also require drivers or system packages to be installed first. For example, the libclntsh.dylib library for the Oracle database.

Rollback for microservices with Ansible and Jenkins


Imagine your project consists of 4 microservices (3 back-ends, 1 front-end). Yesterday you introduced several new features and made a release. Unfortunately, your users have just reported a bug: some of the old important features are not working. You need to roll back all services. Ah, if only it could be done with one button.

Actually it can be. In this article I’ll show you how.

Tech stack:

  • Jenkins for rollback automation
  • Ansible + Python for rollback script
  • Docker registry for storing release images
  • DC/OS for running apps


We will have a python script, called via Ansible from Jenkins, as described in this article. The only difference is that we now have two different tags to run: the first one gathers all available images, the second runs the rollback.

The get algorithm:

  1. Request all images from the docker registry. Filter them by environment, sort them by date and take the last 10 for every repository.
  2. Form a json with repositories, images and dates and write it to the file system.

The run algorithm:

  1. Read the json from get’s second step and create a Jenkins input
  2. Take all available images for the selected date and do a rollback

The rollback itself:

  1. Modify the docker image section in marathon json config
  2. Start a deploy with modified config
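Step 1 of the rollback can be sketched as a small helper that rewrites the image in a Marathon app definition (Marathon keeps the image under the standard container.docker.image field; the deploy wiring itself is out of scope here):

```python
import copy


def set_rollback_image(marathon_config: dict, image: str) -> dict:
    """Return a copy of a Marathon app definition pointing at the rollback
    image (a sketch of step 1 above)."""
    config = copy.deepcopy(marathon_config)  # don't mutate the caller's config
    config['container']['docker']['image'] = image
    return config
```

The modified config can then be submitted to Marathon to start the deploy (step 2).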

Special case

Imagine a service which wasn’t changed in this release. This means there won’t be any rollback image available for it. But you still need to roll it back, because of compatibility issues. You can find an example of this situation in the picture below.

If you select Today-1, only Repo1 and Repo3 will be rolled back, as there are no images for Repo2. Perhaps it wasn’t changed.

The problem here is that the N-1 versions of Repo1 or Repo3 could be incompatible with the latest version of Repo2. So you need to find the nearest version of Repo2 before the rollback date. It is the Today-2 version.
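This “nearest earlier version” selection can be sketched on its own (assuming dates are ISO yyyy-mm-dd strings, which sort lexicographically):

```python
def nearest_earlier(available_dates, rollback_date):
    """Pick the newest date strictly before the rollback date, or None.
    This is the idea behind picking the Today-2 image for Repo2 above."""
    earlier = [d for d in available_dates if d < rollback_date]
    return max(earlier) if earlier else None
```

For example, `nearest_earlier(['2020-01-05', '2020-01-07'], '2020-01-07')` returns '2020-01-05'.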

Get rollbacks

We will have two actions for a rollback:

  • We gather all rollback dates and images available for the current environment.
  • The user selects the date and we perform a rollback.

Ansible side

The Ansible changes are minor. Just add the two tags to the common steps (like requirements installation):

- name: "Copy requirements.txt"
  copy:
    src: "requirements.txt"
    dest: "/tmp/{{ role_name }}/"
  tags:
    - get
    - run

Don’t forget to add the tags to the always step, or your clean-up will be ignored. Using the run tag only is preferred.

It would be useful to register the rollbacks in the get output and debug them. In this case you can use Ansible even without Jenkins.

- name: "Get rollbacks"
  shell: "source activate /tmp/{{ role_name }}/{{ conda_env }} ; {{ item }}"
  with_items:
    - pip install -r /tmp/{{ role_name }}/requirements.txt
    - "python /tmp/{{ role_name }}/ get
      --repo={{ repo }}
      --dump={{ dump_path }}
      --env={{ env }}"
  args:
    executable: /bin/bash
  register: rollbacks
  tags:
    - get

- debug:
    var: rollbacks.results[1].stdout
  tags:
    - get

Python side

With docopt you can use a single entry point with two options: one for get and one for run.

Usage:
  get --repo=<r> --env=<e> [--dump=<dr>]
  run --date=<d> --env=<e> --slack=<s> --user=<u> --pwd=<p> [--dump=<dr>]

The fork itself:

if arguments['get']:
    return get(repo, env, dump)
if arguments['run']:
    return run(date, env, slack, user, pwd, dump)

To get the rollbacks you need to call your Docker registry’s API first.
I assume that you use this image naming schema:

You need to get all tags for the current repo, filter them by environment, then sort them by date and return the last 10.

def get_rollbacks(repo: str, env: str):
    r = requests.get(f'{DOCKER_REGISTRY}/v2/{repo}/tags/list', verify=False)
    if r.status_code != 200:
        raise Exception(f"Failed to fetch tags {r.status_code}")
    releases = list(filter(lambda x: x.endswith(env), r.json()['tags']))
    all_rollbacks = [(get_manifest(repo, r), {r: repo}) for r in releases[-10:]]
    return dict(all_rollbacks)

Where repo is your `service-name` and env is the current branch.

Sorting by date is a bit more complex: the date is not included in the tags information. The only way to get it is to fetch the manifest and check the history.

def get_manifest(repo, tag):
    r = requests.get(f'{DOCKER_REGISTRY}/v2/{repo}/manifests/{tag}', verify=False)
    if r.status_code != 200:
        raise Exception(f"Failed to fetch manifest {r.raw}")
    history = r.json()['history']
    sort = sorted([json.loads(h['v1Compatibility'])['created'] for h in history])
    return sort[-1][:10]  # date part (YYYY-MM-DD) of the newest layer's timestamp

The full get function:

def get(repo: str, env: str, dump: str):
    rollbacks = {}
    repos = repo.split(',')
    for r in repos:
        for date, rb in get_rollbacks(r, env).items():
            if date in rollbacks:
                rollbacks[date] += [rb]
            else:
                rollbacks[date] = [rb]
    print(rollbacks)  # printed, so the Ansible debug step can show available rollbacks
    if dump is not None:
        with open(path.join(dump, "rollback.json"), mode='w') as rb:
            json.dump({'all': repos, 'rollbacks': rollbacks}, rb)
    return rollbacks.keys()

Where repo is a comma-separated list of your service names, e.g. repo1,repo2,repo3. You also need to print the rollbacks for the Ansible debug.

Jenkins side

Let’s start the Jenkins pipeline with an environment input.

parameters {
  choice(choices: 'dev\nstage\nprod', description: 'Which environment should I rollback?', name: 'environment')
}


If you use the master environment instead of prod, you don’t need to do anything. Otherwise you need to create a static variable rollback_env outside of the pipeline and fill it during the first step.

script {
    // need this as env names don't match each other: develop/master/stage in docker vs dev/stage/prod in marathon
    if (params.environment == 'prod') {
        rollback_env = "master"
    } else if (params.environment == 'stage') {
        rollback_env = "stage"
    } else {
        rollback_env = "develop"
    }
}

Then just download your git repo with the ansible playbook and run it.

git branch: 'master',
    credentialsId: '<your git user credentials id>',
    url: "<your ansible repo>"

ansiblePlaybook(
        playbook: "${env.PLAYBOOK_ROOT}/rollback_service.yaml",
        inventory: "inventories/dev/hosts.ini",
        credentialsId: '<your git user credentials id>',
        extras: '-e "repo=' + "${env.REPOS}" + ' env=' + "${rollback_env}" + ' slack=' + "${env.SLACK_CALLBACK}" + ' dump_path=' + "/tmp" + '" -v',
        tags: "get")

Pay attention to the dump_path: it tells the python script to create the json directly in /tmp, so that we can read it from Jenkins. Let’s do it.

import groovy.json.JsonSlurper

def gather_rollback_dates() {
    def inputFile = readFile("/tmp/rollback.json")
    def InputJSON = new JsonSlurper().parseText(inputFile)
    return InputJSON['rollbacks'].keySet().join("\n")
}

This function reads the rollback file, gets all dates and forms a string with a \n separator, which is required to generate an input with a dropdown.

stage('Select rollback date') {
  steps {
    script {
          def userInput = false
          try {
            timeout(time: 120, unit: 'SECONDS') {
                userInput = input(id: 'userInput',
                                  message: 'Select a date to rollback',
                                  parameters: [
                                    choice(name: 'rollback_date',
                                           choices: gather_rollback_dates(),
                                           description: 'One or more services have rollback at this date')])
            }
          } catch(err) {
            // timeout or abort - userInput stays false
          }
          if (userInput) {
            print('Performing rollback')
            env.DATE = userInput
          } else {
            print('Skip rollback')
          }
    }
  }
}

It looks like this:

Perform a rollback

We have 5 actions for a rollback:

  • Read json from previous step
  • Find missing images for the selected date
  • Get marathon service ids from docker ids
  • Change marathon app’s config
  • Update app in marathon

Ansible side

Nothing special here. Just call python.

- name: "Perform rollbacks"
  shell: "source activate /tmp/{{ role_name }}/{{ conda_env }} ; {{ item }}"
  with_items:
    - pip install -r /tmp/{{ role_name }}/requirements.txt
    - "python /tmp/{{ role_name }}/ run
      --date={{ date }}
      --env={{ env }}
      --slack={{ slack }}
      --user={{ dcos_user }}
      --dump={{ dump_path }}
      --pwd={{ dcos_password }}"
  args:
    executable: /bin/bash
  tags:
    - run

Python side

Let’s start with the run method. It reads the json and selects all available images for the selected date.

def run(date, env, slack, user, pwd, dump):
    json_data = read_rollbacks(dump)
    all_rollbacks = OrderedDict(sorted(json_data['rollbacks'].items(), key=lambda x: x[0]))
    repos = json_data['all']
    images = all_rollbacks[date]

If images for some repos are missing, we need to find their older versions. Add this to your run method:

if len(repos) > 1 and len(repos) > len(images):
    get_missing_images(date, repos, all_rollbacks)

Where get_missing_images just goes through all_rollbacks and selects the image with the nearest date for each missing image.

def get_missing_images(date, repos, all_rollbacks):
    images = all_rollbacks[date]  # select available images
    found_services = [list(rb.values())[0] for rb in images]  # get services from images
    missing = list(set(repos) - set(found_services))  # subtract to get missing
    for service in missing:  # populate images with a rollback for every missing service
        rollback = get_nearest_date(service, date, all_rollbacks)
        if rollback is None:
            print(f"Previous rollback for {service} not found")
        else:
            images += [rollback]


def get_nearest_date(repo, date, all_rollbacks):
    for d, images in reversed(all_rollbacks.items()):
        if d < date:
            for rollback, image in images[0].items():
                if image == repo:
                    return {rollback: image}
    return None

After our images are populated, we need to get the marathon service ids. Our marathon ids use the standard /<department>/<environment>/<project>/<service-name>. At this step we only have the service-name, so we should create a binding to the Marathon id.

We can do it by listing all applications running in Marathon and filtering them by environment and service name (I haven’t found a better solution).

def get_service_ids(env: str, images: list, user: str, pwd: str) -> dict:
    ids_only = get_marathon_ids_for_env(env, user, pwd)  # all running services for env
    services = {}
    for rollback in images:
        tag = list(rollback.keys())[0]
        id_part = rollback[tag]
        real_id = list(filter(lambda x: x.endswith(id_part), ids_only))  # filter by service-name
        if not real_id:
            raise Exception(f"Id {id_part} not found")
        services[real_id[0]] = tag
    return services


def get_marathon_ids_for_env(env: str, user: str, pwd: str):
    res = call_with_output(f'dcos auth login --username={user} --password={pwd}')
    if res.decode().strip() != 'Login successful!':
        raise Exception("Can't login to dcos cli")
    all_services = call_with_output('dcos marathon app list')
    matched = list(filter(lambda x: x.startswith(f"/ds/{env}"),
                          all_services.decode().split('\n')))
    return [m.split(' ')[0] for m in matched]

After we have the service ids we can iterate over them and do a rollback for each. Add this to your run method:

services = get_service_ids(env, images, user, pwd)
for service_id, service_tag in services.items():
    if slack is not None:
        notify_slack(slack, f"Rollback { service_id }: { service_tag }")
    print(do_deploy(service_id, service_tag))

Well, that’s all. Don’t forget to add slack notifications for the rollback.

Jenkins side

The Python part was the most complex. On the Jenkins side you just need to call Ansible with the run tag and the selected date.

stage('Rollback') {
  when {
    expression {
         return env.DATE != null
    }
  }
  steps {
    ansiblePlaybook(
            playbook: "${env.PLAYBOOK_ROOT}/rollback_service.yaml",
            inventory: "inventories/dev/hosts.ini",
            credentialsId: '<your git user credentials id>',
            extras: '-e "date=' + "${env.DATE}" + ' env=' + "${params.environment}" + ' slack=' + "${env.SLACK_CALLBACK}" + ' dump_path=' + "/tmp" + '" -v',
            tags: "run")
  }
}

Summing up

The current solution is quite complex, but it allows you to run rollbacks both from Ansible via the cli and from Jenkins. The latter is preferred, as you can see the user who approved the rollback.

Have a nice firefighting, and I hope you’ll never need a rollback!

Microservices and continuous delivery


Imagine a typical situation: yesterday your devops engineer was eaten by a tiger. You are very sad, because he didn’t finish the release system for your project. It contains 4 repositories: 2 back-end, 1 front-end, 1 data pipeline.

And now it is you who must set up a deploy pipeline for the project by tomorrow.

In this article you’ll learn how to set up Jenkins, Ansible and Catcher to build a multi-environment, production-ready CI/CD with E2E tests and minimum effort.

Individual pipeline

The first step is to set up an individual pipeline for every service. I assume that you are a good developer and have a separate git repository for each service.

All you need to do here is write a Jenkins pipeline and feed it to Jenkins via the organization plugin, manually or automatically. The pipeline will be triggered on every commit and will run tests for every branch. In the case of an environment branch (develop, stage or master) it will also build a docker image and deploy it to the right environment.

Set up an agent

The agent is the starting point of every Jenkins pipeline. The most common is agent any, unless you need something special.

Set up triggers

Your pipeline should be triggered on every commit. If your Jenkins is not accessible from the external network, use pollSCM.

Set up environment variables

They make your life much easier, as they allow you to copy-paste your Jenkinsfile with minimum changes.
The environment should include the docker image names.

environment {
    IMAGE_NAME = "<your_docker_registry_url:port>/<your_project>:${env.BUILD_NUMBER}-${env.BRANCH_NAME}"
    LATEST_IMAGE_NAME = "<your_docker_registry_url:port>/<your_project>:latest-${env.BRANCH_NAME}"
}

Set up common steps

Common steps are steps that should be called on every branch, even if it is a feature branch.

steps {
    sh "make test"
}

Remember that keeping to a standard is a wise decision (or you will be eaten by a tiger too). So ensure you have a Makefile in your repository. It is your friend here, as it allows you to build a language-agnostic pipeline. Even if your new devops don’t know your programming language or build system, they will understand that calling `make test` will test your project.

It is also the right place for notifications. Use slackSend to send a notification to your project’s Slack channel.

slackSend color: "warning", message: "Started: ${env.JOB_NAME} - ${env.BUILD_NUMBER} (<${env.BUILD_URL}|Open>)"

Set up special build steps

Special steps are steps that should run only when changes are made to a special branch. Jenkins allows you to use a when condition:

stage('Build') {
   when {
     expression {
        return env.BRANCH_NAME == 'master' || env.BRANCH_NAME == 'develop' || env.BRANCH_NAME == 'stage'
     }
   }
   steps {
      sh "docker build -t ${env.IMAGE_NAME} ."
      sh "docker push ${env.IMAGE_NAME}"
      sh "docker tag ${env.IMAGE_NAME} ${env.LATEST_IMAGE_NAME}"
      sh "docker push ${env.LATEST_IMAGE_NAME}"
   }
}

Set up environment-specific deploy

Besides the when condition, you should also select the proper image and configuration to deploy to the right environment. I use Marathon, and my dev/stage/prod use different CPU limitations, secrets and other configuration, stored in marathon/marathon_<env>.json. So before the deploy you should select the proper configuration file. Use script for this:


stage('Deploy') {
  when {
    expression {
       return env.BRANCH_NAME == 'master' || env.BRANCH_NAME == 'develop' || env.BRANCH_NAME == 'stage'
    }
  }
  steps {
    script {
        if (env.BRANCH_NAME == 'master') {
            env.MARATHON = "marathon/marathon_prod.json"
        } else if (env.BRANCH_NAME == 'stage') {
            env.MARATHON = "marathon/marathon_stage.json"
        } else {
            env.MARATHON = "marathon/marathon_dev.json"
        }
    }
    marathon(
      url: 'http://leader.mesos:8080',
      docker: "${env.IMAGE_NAME}",
      filename: "${env.MARATHON}")
  }
}

Ansible promote role

The easiest way to set up promotion from one environment to another is to trigger the individual pipeline configured previously.

In the previous article I showed that it is much better to use Jenkins together with Ansible. There is no exception here (just imagine that the tiger also ate your Jenkins machine).

We will use a python script wrapped in an Ansible role. For those who haven’t read my previous article: a groovy Jenkins shared library can be used instead, but it is not recommended, as:

  • it is difficult to develop and debug such libraries, because of the different versions of Jenkins, the Jenkins groovy plugin and the groovy installed locally;
  • it makes your release highly dependent on your Jenkins, which is OK until you decide to move to another CI, or your Jenkins is down and you need to do a release.

Python script

To trigger the promotion from develop to stage you should merge develop into stage and push it. That’s all. After the push, the service’s internal pipeline will be triggered.

The python script itself:

  1. Clone the repository
  2. Checkout to the branch you are going to promote
  3. Merge the previous environment’s branch
  4. Push it!

That looks very easy, although here are some tips.

Prefer your system’s git to a python git library. This way the script uses your own keys when run locally.

def call_with_output(cmd: str, directory='.'):
    output = subprocess.Popen(cmd.split(' '),
                              cwd=directory,
                              stdout=subprocess.PIPE)
    stdout, stderr = output.communicate()
    if stderr is None:
        return stdout
    raise Exception(stderr)

If your repository is not public, you should clone it using a token. Notice that git_user, git_token and company are ansible variables. They don’t change too often, so I store them in the role’s default variables.

call_with_output(f'git clone https://{{ git_user }}:{{ git_token }}@{{ company }}/{ repo }.git')

It is better not to call push when there are no changes. But not all git versions produce the same output: up-to-date differs from up to date. It took me a while to notice this.

changes = call_with_output(f"git merge { from_branch }", repo).decode("utf-8").strip()
if changes != "Already up to date." and changes != "Already up-to-date.":
    call_with_output(f"git push origin HEAD:{ to_branch }", repo)
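A more defensive variant (my own helper, not part of the original script) normalizes the punctuation difference instead of listing every known phrasing:

```python
def merge_made_changes(merge_output: bytes) -> bool:
    """True if `git merge` reported anything other than 'Already up to date'."""
    text = merge_output.decode('utf-8').strip().rstrip('.').lower()
    # old git prints 'Already up-to-date.', newer git 'Already up to date.'
    return text.replace('-', ' ') != 'already up to date'
```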

Sending a slack notification to your project’s channel is also a good idea. You can do it via a slack webhook.

def notify_slack(callback, message):
    response =,
                             data=json.dumps({'text': message}),
                             headers={'Content-Type': 'application/json'})
    if response.status_code != 200:
        raise ValueError('Request to slack returned an error %s, the response is:\n%s'
                         % (response.status_code, response.text))

Jenkins shared pipeline

Now you have your Ansible promote role. It’s time to create a Jenkins pipeline for the whole project, which will call Ansible for you. This pipeline can be triggered manually by you or automatically by any of the project’s services.

Start with adding a parameter:

parameters {
    choice(choices: 'develop\nstage\nmaster', description: 'Which environment should I check?', name: 'environment')
}

The deploy step:

stage('Promote dev to stage') {
    when {
        expression {
            return params.environment == 'develop'
        }
    }
    steps {
        deploy_all('develop', 'stage')
    }
}

Where deploy_all downloads your ansible repository with the role you’ve created and calls it for every service of the project being deployed.

def deploy_all(from, to) {
    git branch: 'master',
        credentialsId: "${env.GIT_USER_ID}",
        url: "<your_company>/<your_ansible_repo>"
    deploy('repo_1', from, to)
    deploy('repo_2', from, to)
    deploy('repo_3', from, to)
}

def deploy(repo, from, to) {
    ansiblePlaybook(
        playbook: "${env.PLAYBOOK_ROOT}/deploy_service.yaml",
        inventory: "inventories/dev/hosts.ini",
        credentialsId: "${env.SSH_USER_ID}",
        extras: '-e "to=' + "${to}" + ' from=' + "${from}" + ' repo=' + "${repo}" + ' slack=' + "${env.SLACK_CALLBACK}" + '" -vvv')
}

Now you have the deploy pipeline for all services and can call it manually. It is three times faster than manually calling each of the 3 projects’ pipelines. But it is not our goal yet.

We need this pipeline to be triggered by any of our internal pipelines.

Add this step to all 3 Jenkinsfiles of your services:

stage('Trigger promotion pipeline') {
  when {
    expression {
         return env.BRANCH_NAME == 'master' || env.BRANCH_NAME == 'develop' || env.BRANCH_NAME == 'stage'
    }
  }
  steps {
    build job: "../<jenkins_promote_project_pipeline_name>/master",
          wait: false,
          parameters: [
            string(name: 'environment', value: String.valueOf(env.BRANCH_NAME))
          ]
  }
}


The automation part is done now. After you’ve merged your feature branch, the local service’s tests are run and the service is deployed to the develop environment. The pipeline then immediately triggers the promotion pipeline for the whole project, and all services which were changed will be deployed to the next environment.

Add end-to-end test

Automatic promotion is good, but what is the point of it? It just moves your changes from environment to environment without any high-level acceptance tests.

In Catcher’s article I’ve already mentioned that green service-level tests don’t guarantee that your services can interact with each other properly. To ensure that the whole system is working, you need to add end-to-end tests to your promotion pipeline.

To add Catcher end-to-end tests, just create an inventory and tests in your Jenkins shared pipeline’s repository (I assume you have a separate git repository where you store the pipeline, a readme with the deployment description, etc).

In the inventory you should mention all the project’s services, for every environment. E.g. for develop:

backend1: ""
frontend: ""
backend2: ""
database: ""

In tests you should put your end-to-end tests. The simplest one is checking their healthchecks; it will show you that the services are at least up.

  - http:
      name: 'Check frontend is up'
      get:
        url: '{{ frontend }}'
  - http:
      name: 'Check backend1 is up'
      post:
        url: '{{ backend1 }}/graphql'
        body: '{ __schema { types { name } } }'
        headers:
          Content-Type: "application/graphql"
  - http:
      name: 'Check backend2 is up'
      get:
        url: '{{ backend2 }}/healthcheck'
  - postgres:
      request:
        conf: '{{ database }}'
        query: 'select 1'

Add a test step to your Jenkins pipeline just before the deploy.
Do not forget to create a Makefile.

stage('Prepare') {
     steps {
       sh "make conda"
       sh "make requirements"
     }
}

Make sure you’ve selected the proper environment. You should always test the same environment which is specified in params.environment.

stage('Test') {
     steps {
        script {
            if (params.environment == 'develop') {
                env.INVENTORY = "dev.yml"
            } else {
                env.INVENTORY = "stage.yml"
            }
        }
        sh "make test INVENTORY=${env.INVENTORY}"
     }
}

Piece of the Makefile:

CONDA_ENV_NAME ?= my_e2e_env
ACTIVATE_ENV = source activate ./$(CONDA_ENV_NAME)

.PHONY: conda
conda:
	conda create -p $(CONDA_ENV_NAME) --copy -y python=$(PY_VERSION)

.PHONY: requirements
requirements:
	$(ACTIVATE_ENV) && python -s -m pip install -r requirements.txt

.PHONY: test
test:
	$(ACTIVATE_ENV) && catcher script/tests -i inventory/${INVENTORY}

Disable automatic prod promotion

End-to-end tests are good, but not perfect. You shouldn’t let every change be deployed to prod in real time, unless you like to work at night.

Add an input step to the master pipeline’s promote stage. If nobody presses the button before the timeout, the promotion is skipped.

stage('Promote stage to prod') {
     when {
        expression {
             return params.environment == 'stage'
        }
     }
     steps {
        script {
          def userInput = false
          try {
            timeout(time: 60, unit: 'SECONDS') {
                userInput = input(id: 'userInput',
                                  message: 'Promote current stage to prod?',
                                  parameters: [
                                      [$class: 'BooleanParameterDefinition', defaultValue: false, description: '', name: 'Promote']
                                  ])
            }
          } catch(err) {
            // timeout reached or input aborted - just skip the deploy
          }
          if (userInput) {
            print('Deploying prod')
            deploy_all('stage', 'master')
          } else {
            print('Skip deploy')
          }
        }
     }
}
In this case prod will be deployed only after stage’s e2e tests are successful and a user decides the changes are ready to be promoted.


Such a pipeline allows you to deploy a bunch of microservices at once with minimal changes to the existing infrastructure, as we re-use each service’s internal deploy pipeline, which you probably already have.

It is not perfect, as it doesn’t take into consideration a broken build or red service-level tests. But it allows you to save time during the deploy and remove the human error factor by keeping all dependent services in one place.

In my next article I’ll show you an example of a rollback pipeline for a set of microservices.

End-to-end microservices testing with Catcher


I would like to introduce a new tool for end-to-end testing – Catcher.

What is an e2e test?

An end-to-end test usually answers questions like: “Was this user really created, or did the service just return 200 without doing anything?”.

In comparison with project-level tests (unit/functional/integration), e2e tests run against the whole system. They can call your back-end’s http endpoints, check values written to the database or message queue, ask other services about changes and even emulate external service behaviour.

E2E tests are the highest-level tests. They are usually intended to verify that the system meets the requirements and all components can interact with each other.

Why do we need e2e tests?

Why do we need to write these tests at all? Even M. Fowler recommends avoiding them in favor of simpler ones.

However, the higher the abstraction level of your tests, the less often they have to be rewritten. In case of refactoring, unit tests are usually rewritten completely, and you will also spend a fair amount of time on functional tests during code changes. But end-to-end tests check your business logic, which is unlikely to change very often.

Besides that, even full coverage of every microservice doesn’t guarantee their correct in-between interaction. Developers may implement the protocol incorrectly (naming or data type errors), or develop new features relying on the data schema from the documentation. Either way you can get a surprise in the prod environment when the schemas mismatch: a mess in the data, or someone forgot to update the schema.

And each service’s tests would still be green.

Why do we need to automate tests?

Indeed. In my previous company it was decided not to spend effort on setting up automated tests, because it takes time. Our system wasn’t big at that time (10-15 microservices with a common Kafka). The CTO said that “tests are not important, the main thing is that the system should work”. So we were doing manual tests on multiple environments.

How it looked:

  1. Discuss with the owners of other microservices what should be deployed to test a new feature.
  2. Deploy all services.
  3. Connect to remote kafka (double ssh via a gateway).
  4. Connect to the k8s logs.
  5. Manually form and send a kafka message (thank god it was plain json).
  6. Check the logs in an attempt to understand whether it worked or not.

And now let’s add a fly to this ointment: the majority of tests required fresh users, because it was difficult to reuse existing ones.

How user sign-up looked:

  1. Enter various data (name, email, etc).
  2. Enter personal data (address, phone, various tax data).
  3. Enter bank data.
  4. Answer 20-40 questions.
  5. Pass IdNow (there was a mock for dev, but stage took 5+ minutes, because their sandbox was sometimes overloaded).
  6. This step requires opening a bank account, which you can’t do via the front-end. You have to go to kafka via ssh and act as a mock service (send a message that the account was opened).
  7. Go to the moderator’s account on another front-end and approve the user you’ve just created.

Super, the user has just been created! Now let’s add another fly: some tests require more than one user, and when tests fail you have to start again by registering new users.

How do new features pass the business team’s checks? The same actions need to be done in the next environment.

After some time you start feeling like a monkey, clicking these numerous buttons, registering users and performing manual steps. Also, some developers had problems with the kafka connection, or didn’t know about tmux and faced this bug with the default terminal and its 80 char limit.

Pros:

  • No need to do a set up. Just test on an existing environment.
  • Doesn’t need high qualification; can be done by cheap specialists.

Cons:

  • Takes much time (the further, the more).
  • Usually only new features are tested (without ensuring that all previously tested features still work).
  • Usually manual testing is performed by qualified developers (expensive developers are utilized for a cheap job).

How to automate?

If you’ve read up to this point and are still sure that manual testing is ok and everything was done right in this company, then the rest of my article won’t be interesting to you.

Developers have two ways to automate repeating actions, depending on the type of programmer who had enough time:

  • A standalone back-end service which lives in your environment. Tests are hardcoded inside and triggered via endpoints. May be partly automated with CI.
  • A script with a hardcoded test. It differs only in the way it is run: you need to connect somewhere (probably via ssh) and call this script. It can be put into a Docker image and may also be automated with CI.

Sounds good. Any problems?

Yes. Such tests are usually created using technologies the author knows. Usually it is a scripting language such as python or ruby, which allows you to write a test quickly and easily.

However, sometimes you can stumble upon a bunch of bash scripts, C or something more exotic. Once I spent a week rewriting such a reinvented wheel of bash scripts in python, because the scripts were no longer extensible and no one really knew how they work or what they test. An example of a self-made end-to-end test is here.

Pros:

  • They are automated!

Cons:

  • Additional requirements on the developers’ qualification (e.g. the main language is Java, but the tests are written in Python).
  • You write code to test code (and who will test the tests?)

Is there anything out of the box?

Of course. Just look at BDD. There are Cucumber and Gauge.

In short: the developer describes the business scenario in a special language and writes the implementation later. This language is usually human-readable. It is assumed that it will be read/written not only by developers, but also by project managers.

Together with the implementation, the scenario is stored in a standalone project and is run by third-party tools (Cucumber, Gauge…).

The scenario:

Customer sign-up

* Go to sign up page

Customer sign-up
tags: sign-up, customer

* Sign up a new customer with name "John" email "" and "password"
* Check if the sign up was successful

The implementation:

@Step("Sign up as <customer> with email <email> and <password>")
    public void signUp(String customer, String email, String password) {
        WebDriver webDriver = Driver.webDriver;
        WebElement form = webDriver.findElement("new_user"));
        // ... fill in and submit the form
    }

    @Step("Check if the sign up was successful")
    public void checkSignUpSuccessful() {
        WebDriver webDriver = Driver.webDriver;
        WebElement message = webDriver.findElement(By.className("message"));
        assertThat(message.getText(), is("You have been signed up successfully!"));
    }

The full project can be found here.

Pros:

  • Business logic is described in a human-readable language and stored in one place (can be used as documentation).
  • Existing solutions are used; developers only need to know how to use them.

Cons:

  • Managers won’t actually read/write these specs.
  • You have to maintain both the specifications and the implementations.

Why do we need Catcher?

Of course, to simplify the process.

The developer just writes test scenarios in json or yaml and catcher executes them. A scenario is just a set of consecutive steps, e.g.:

    - http:
          post:
              url: ''
              body: {key: '1', data: 'foo'}
    - postgres:
          request:
              conf: 'dbname=test user=test host=localhost password=test'
              query: 'select * from test where id=1'

Catcher supports Jinja2 templates, so you can use variables instead of hardcoded values. You can also store global variables in inventory files (as in ansible), fetch them from the environment or register new ones.

variables:
  bonus: 5000
  initial_value: 1000
steps:
    - http:
          post:
              url: '{{ user_service }}/sign_up'
              body: {username: 'test_user_{{ RANDOM_INT }}', data: 'stub'}
          register: {user_id: '{{ OUTPUT.uuid }}'}
    - kafka:
          consume:
              server: '{{ kafka }}'
              topic: '{{ new_users_topic }}'
              where:
                  equals: {the: '{{ MESSAGE.uuid }}', is: '{{ user_id }}'}
          register: {balance: '{{ OUTPUT.initial_balance }}'}

Additionally, you can run verification steps:

- check: # check user's initial balance
    equals: {the: '{{ balance }}', is: '{{ initial_value + bonus }}'}

You can also run one test from another, which allows you to reuse code and keep it logically separated.

include:
    file: register_user.yaml
    as: sign_up
steps:
    # .... some steps
    - run:
        include: sign_up
    # .... some steps

Catcher also has a tag system – you can run only some specific steps from an included test.

Besides the built-in steps and the additional repository, it is possible to write your own modules in python (simply by inheriting ExternalStep) or in any other language:

#!/bin/bash
one=$(echo ${1} | jq -r '.add.the')
two=$(echo ${1} | jq -r '')
echo $((${one} + ${two}))
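The same module can be sketched in python, mirroring the bash version above: the step’s body arrives as a JSON string in the first argument (function and variable names here are illustrative):

```python
import json
import sys

def math_add(step_body: str) -> int:
    """Implement the 'math' step: add the 'the' and 'to' fields of the body."""
    body = json.loads(step_body)
    return int(body['add']['the']) + int(body['add']['to'])

if __name__ == '__main__' and len(sys.argv) > 1:
    # the step's body is passed as the first command-line argument
    print(math_add(sys.argv[1]))
```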

And executing it:

variables:
  one: 1
  two: 2
steps:
    - math:
        add: {the: '{{ one }}', to: '{{ two }}'}
        register: {sum: '{{ OUTPUT }}'}

It is recommended to put the tests in a docker image and run them via CI.

The docker image can also be used in Marathon / K8s to test an existing environment. At the moment I am working on a backend (an analogue of AnsibleTower) to make the testing process even easier and more convenient.

An example of an e2e test for a group of microservices is here.
A working example of an e2e test with Travis integration is here.

Pros:

  • No need to write any code (except for custom modules).
  • Switching environments via inventory files (like in ansible).
  • Easily extendable with custom modules (in any language).
  • Ready-to-use modules.

Cons:

  • The developer has to know a DSL which is not very human-readable (in comparison with other BDD tools).

Instead of conclusion

You can use standard technologies or write something of your own. But I am talking about microservices here, which are characterized by a wide variety of technologies and a big number of teams. If junit + testcontainers is an excellent choice for a JVM team, an Erlang team will select common test. When your department grows, all e2e tests will be given to a dedicated team – infrastructure or qa. Imagine how happy they will be with this zoo.

When I was writing this tool, I just wanted to reduce the time I usually spend on tests. In every new company I usually have to write (or rewrite) such a test system.

However, this tool turned out to be more flexible than I expected. For example, Catcher can also be used for organizing centralized migrations and updates of microservice systems, or for data pipeline integration testing.

Ansible and Jenkins – automate your scripts


The topic I’d like to cover in this article may seem obvious, but I was surprised how many companies don’t follow this best practice.

For impatient:

  • Automate every action you’ve done more than once.
  • Don’t use Jenkins static groovy library.
  • Use Jenkins + Ansible + Python for automation.

The problem

Any developer constantly faces situations when some action needs to be repeated. Sometimes these actions are urgent and need to be done very quickly. E.g. your prod is down and you need to rebuild the indexes on your database, repopulate images on your dashboard, or re-elect a new leader in your distributed back-end.

It is good to remember these 3 golden rules, which can make your life easier:

  • If you repeat an action more than twice – it should be automated.
  • If there are several steps to be done – they should be put into one script.
  • When there is a complex set up before running these actions – everything should be documented.

Following these rules will decrease the time you usually spend on firefighting. It may seem unnecessary from a business perspective to spend time on such automation, but in real life you free up time for developing new features, and reduce the time needed to fix a problem.

Another problem is the bus factor. When you have manual actions, there will always be a person who holds critical and unique knowledge. If this person leaves your company, you won’t be able to fix the problem quickly, as the knowledge is lost. Documented scripts are your friends here.

Custom scripts

At some point all developers come to the rules mentioned above. They start to automate their actions by creating scripts. This is good, but here lies the danger: such scripts are usually written in different programming languages and stored in many repositories.

It is hard to maintain such a zoo, and sometimes even hard to find the script for a particular problem. Some scripts may even get re-implemented several times. Be ready for it.

Another problem is the environment. Such scripts tend to run only in their creator’s environment. Now imagine you’ve found an old script written in a language you don’t have installed on your system. What should you do to quickly run it and fix the problem?

Jenkins shared libraries

One solution is to make Jenkins solve your problem. You have groovy shared libraries with the fix scripts, and a Jenkins job for each problem you need to fix. Everything in one repository.

The approach is good, but not the implementation.

It is really hard to develop such scripts. I’ve faced a lot of problems with this, because there is no guarantee that code you’ve tested locally will work in Jenkins. The main reason lies in the different Groovy versions.

Python scripts

To solve the versioning problem one can use Python + Conda/venv. Python itself is very good for scripting and quite widespread; there is a higher chance somebody in your team knows Python than Groovy.

With the help of Conda you can use the same Python version everywhere.

I also highly recommend docopt for Python. Do you remember the third rule of automation? It is much better when your documentation comes together with the code, because it reduces the maintenance burden.

Comments in a script are not always able to explain why and how the script should be run and what the argument values mean. docopt will handle the parameters and default values for you, as well as print the help message on every wrong argument provided, or just on demand.

#!/usr/bin/env python
"""
Very important script. It should be run when our prod freezes for some seconds.
It will get all missed transactions, ask for confirmation and process results.

Usage: --issuer=<i> --bank=<b> [--slack=<s>] -h | --help

Options:
  -h --help                     show this help message and exit
  --issuer=<i>                  Which issuer to use for transaction confirmation.  [default: primary]
  --bank=<b>                    Which bank's backend to use.
  --slack=<s>                   slack callback to notify
"""

Ansible + Python

After the previous stage you have a self-documented, version-independent script. A developer’s dream. What can be improved?

First of all, the scripts are still coupled with their python dependencies. If you are going to use them as a company standard, you have to force everybody to install conda in order to run them.

Second, you still need a central storage for such scripts: the single source of truth where a fix for ideally any problem can be found.

To solve both issues use Ansible and keep a single repository for its scripts (in huge companies you should prefer a per-department repository).

Every problem which can be solved with a script turns into a role. Each role has its Readme, where the problem and the solution are described. The root Readme points to each role’s readme with a small comment on the problem it solves.

## Problem
Sometimes we have spikes of high load. During this load our slaves can lose the master and some transactions won't be processed.
## Solution
Such missed transactions are saved by the slaves in a special queue. This role gets these transactions, asks the bank's confirmation for each and processes the results.
## Run
ansible-playbook resolve_transactions.yaml -i inventory/prod/hosts.ini --extra-vars "i='primary' b='my_bank'"

This doesn’t replace your Python scripts, as plain ansible playbooks are harder to debug and develop. Instead, all python scripts go into files or templates inside the role and are called as part of the play.

The minimal scenario usually contains conda env creation and dependency installation, as well as running the script itself (for simplicity this role assumes conda is already installed):

- block:
    - name: "Copy requirements.txt"
      copy:
        src: "requirements.txt"
        dest: "/tmp/{{ role_name }}/"

    - name: "Copy python executable"
      copy:
        src: ""
        dest: "/tmp/{{ role_name }}/"

    - name: "Create Conda Env for {{ python_version }}"
      shell: "conda create -p /tmp/{{ role_name }}/{{ conda_env }} --copy -y python={{ python_version }}"

    - name: "Run my script"
      shell: "source activate /tmp/{{ role_name }}/{{ conda_env }} && {{ item }}"
      with_items:
        - pip install -r /tmp/{{ role_name }}/requirements.txt
        - "python /tmp/{{ role_name }}/
          --issuer={{ issuer }}
          --bank={{ bank }}
          --slack={{ slack }}"
      args:
        executable: /bin/bash

  always:
    - name: "Clean up"
      file:
        state: absent
        path: "/tmp/{{ role_name }}/"

Here you can benefit from the ansible variable system:

Group variables are stored per environment, as well as global variables, which are symlinked to all.

Each role can also have its specific `default` variables, which are overridden by the ansible input to the script.
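For example, a role’s defaults might live in defaults/main.yml like this (the role name and variable values are illustrative, not from the original project):

```yaml
# roles/resolve_transactions/defaults/main.yml
python_version: "3.7"
conda_env: "conda_env"
issuer: "primary"
bank: "my_bank"
slack: ""
```

Anything passed via --extra-vars (or by Jenkins, as shown below in the article) overrides these defaults.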

Now you can transfer the first-line support to another department, just by pointing them to a single repository with full documentation and scripts. All they need to know is how to run ansible.

Jenkins + Ansible + Python

The problem with first-line support is that they are usually cheaper and less qualified than regular developers. They may also run Windows and have no idea what Ansible is. The ideal solution for them is a document with rules like “if you suspect this problem – push that button”. And you can do it with the help of Jenkins.
First of all, ensure you have the ansible plugin installed.

Second – create credentials for ssh usage.

Third – write a pipeline for every role you wish to create a button for. You can place it in the role’s root directory next to the Readme, and make the repository root’s Jenkins pipeline scan roles/ for pipelines and create child Jenkins pipelines if necessary.

Your typical pipeline would have input params:

parameters {
    choice(choices: ['dev', 'prod', 'stage'], description: 'On which environment should I run this script?', name: 'environment')
}

The first step should clone your repo with ansible:

stage('Clone Git repository') {
   steps {
        git branch: 'master',
            credentialsId: '<some-uuid>',
            url: "${env.PROJECT_REPO}"
   }
}

Then call the ansible playbook itself:

stage('Run ansible script') {
   steps {
       script {
         if (params.environment == 'prod') {
            env.INVENTORY = "inventories/prod/hosts.ini"
            env.ISSUER = "primary"
         } else if (params.environment == 'dev') {
            env.INVENTORY = "inventories/dev/hosts.ini"
            env.ISSUER = "secondary"
         } else if (params.environment == 'stage') {
            env.INVENTORY = "inventories/stage/hosts.ini"
            env.ISSUER = "secondary"
         } else {
            throw new Exception("Unknown environment: ${params.environment}")
         }
       }
       ansiblePlaybook(
            playbook: "${env.PLAYBOOK_ROOT}/deploy_service.yaml",
            inventory: "${env.PLAYBOOK_ROOT}/${env.INVENTORY}",
            credentialsId: '<your-credentials-id>',
            extras: '-e "i=' + "${env.ISSUER}" + ' b=my_bank' + '" -v')
   }
}

After creating the Jenkins jobs, all you need is to link them in each role’s readme, as well as in any connected project’s readme.

Summing up

  • Automated scripts allow you to fix problems much faster, but they also require some effort to make them easy to use and platform independent.
  • Self-documented scripts reduce the bus factor and the onboarding time for newcomers.
  • A centralized repository with standardized tools allows a quick responsibility handover to another team in the future.
  • Ansible + Jenkins allow you to fix a problem by pressing a single Jenkins button (even when you are on vacation with only your mobile phone) or by running an ansible script when your Jenkins is down.
  • Jenkins buttons reduce the qualification requirements, and the price, of the first-line support.

Have a happy firefighting! 🙂

Scala as backend language. Tips, tricks and pain


I’ve got a legacy service written in Scala. The stack was: Play2, Scala, Slick, Postgres.

Here I describe why such a technology stack is not the best option, what should be done to make it work better with less effort, and how to avoid the pitfalls.

For the impatient:
If you have a choice – don’t use Slick.
If you have more freedom – don’t use Play.
And finally – try to avoid Scala on the back-end. It might be good for Spark applications, but not for back-ends.

Data layer

Every backend with persistent data needs a data layer.

From my experience, the best way to organize the code is the repository pattern: you have your entity (DAO) and a repository, which you access when you need to manipulate the data. Modern ORMs are your friends here – they do a lot of things for you.

Slick – back in 2010

That was my first thought when I started using it. In Java you can use Spring Data, which generates a repository implementation for you. All you need is to annotate your entity with JPA annotations and write a repository interface.

Slick is another thing. It can work in two ways.

Manual definition

You define your entity as a case class, mentioning all needed fields and their types:

case class User(
    id: Option[Long],
    firstName: String,
    lastName: String)

And then you manually repeat all the fields and their types defining the schema:

class UserTable(tag: Tag) extends Table[User](tag, "user") {
    def id = column[Long]("id", O.PrimaryKey, O.AutoInc)
    def firstName = column[String]("first_name")
    def lastName = column[String]("last_name")

    def * = (id.?, firstName, lastName) <> (User.tupled, User.unapply)
}
Nice. Like in ancient times. Forget about @Column automapping. If you have a DTO and need to add a field, you must always remember to add it to the DTO, the DAO and the schema. Three places.

And have you seen insert method implementation?

def create(name: String, age: Int): Future[Person] = db.run {
    (people.map(p => (, p.age))
        returning into ((nameAge, id) => Person(id, nameAge._1, nameAge._2))
    ) += (name, age)
}

I used to have a save method defined only once in an abstract repository, in one line: something like myFavouriteOrm.insert(new User(name, age)).

Full example is here:

I don’t understand why Play’s authors say ORMs “will quickly become counter-productive“. Writing manual mappings on a real project becomes a pain much faster than the abstract “ORM counter-productivity“.

Code generation

The second approach is code generation: it scans your DB and generates the code based on it, like a reversed migration. I didn’t like this approach at all (it was in the legacy code I inherited).

First, to make it work you need db access at compile time, which is not always possible.

Second, if the backend owns the data, it should be responsible for the schema. This means the schema should come from the code, or code changes plus a migration with the schema changes should live in the same repository.

Third, have you seen the generated code? Lots of unnecessary classes, no formatting (400-600 characters in a line), no ability to modify these classes by adding some logic or extending an interface. I had to create my own data layer around this generated data layer 🙁

Ebean and some efforts to make it work

So, after fighting with Slick I decided to remove it together with the data layer completely and use another technology. I selected Ebean, as it is the official ORM for Play2 + Java. It looks like Play’s developers don’t like Hibernate for some reason.

An important thing to notice: it is a Java ORM, and Scala is not officially supported (its support was dropped a few years ago). So you need to apply some effort to make it work.

First of all, add the jaxb libraries to your dependencies. They were removed from the JDK in Java 9, so on Java 9+ your app will crash at runtime without them.

libraryDependencies ++= Seq(
"javax.xml.bind" % "jaxb-api" % "2.2.11",
"com.sun.xml.bind" % "jaxb-core" % "2.2.11",
"com.sun.xml.bind" % "jaxb-impl" % "2.2.11",
"javax.activation" % "activation" % "1.1.1"
)

Next, do not forget to add the jdbc library and the driver library for your database.

After it you are ready to set up your data layer.


Write your entities as normal java entities:

@Entity
@Table(name = "master")
class Master {
  @Id
  @GeneratedValue(strategy = GenerationType.AUTO)
  @Column(name = "master_id")
  var masterId: Int = _

  @Column(name = "master_name")
  var masterName: String = _

  @OneToMany(cascade = Array(CascadeType.MERGE))
  var pets: util.List[Pet] = new util.ArrayList[Pet]()
}

Basic Scala types are supported, but with several limitations:

  • You have to use java.util.List for one-to-many / many-to-many relationships. Scala’s ListBuffer is not supported, as Ebean doesn’t know how to de/serialize it. Neither is Scala’s List, as it is immutable and Ebean can’t populate it.
  • Primitives like Int or Double must not be nullable in the database. If a column is nullable, use java.lang.Double (/ Integer), or you will get an exception as soon as you try to load such an object from the database, because Scala’s Double is compiled to the double primitive, which can’t be null.
    Scala’s Option[Double] won’t work either, as the ORM will return null instead of an Option.
  • Relations are supported, including the bridge table, which is also created automatically. But because of a bug, @JoinColumn can’t be specified.
  • Ebean uses Java lists, so you need scala.collection.JavaConverters every time you pass a list to a query and every time you return a list (like findList).

There is one nice thing in Scala (perhaps the only one) which is useful here: a trait can extend an abstract class. It means you can create your own abstract CRUD repository and reuse it in business repositories, like you get out of the box in Spring Data 🙂

1. Create your abstract repository:

import scala.collection.JavaConverters._

class AbstractRepository[T: ClassTag] {
  var ebeanServer: EbeanServer = _

  def setEbeanServer(ebeanConfig: EbeanConfig): Unit = {
    ebeanServer = Ebean.getServer(ebeanConfig.defaultServer())
  }

  def insert(item: T): T = {
    ebeanServer.insert(item)
    item
  }

  def update(item: T): T = {
    ebeanServer.update(item)
    item
  }

  def saveAll(items: List[T]): Unit = {
    ebeanServer.saveAll(items.asJava)
  }

  def listAll(): List[T] = {
    ebeanServer.find(classTag[T].runtimeClass.asInstanceOf[Class[T]]).findList().asScala.toList
  }

  def find(id: Any): Option[T] = {
    Option(ebeanServer.find(classTag[T].runtimeClass.asInstanceOf[Class[T]], id))
  }
}
You need to use classTag here to determine the class of the entity.

2. Create your business repository trait, extending this abstract repository:

trait MasterRepository extends AbstractRepository[Master] {
  // repository-specific methods go here
}

Here you can also set up some special methods, which will be used only in this repository.

In the implementation you need to define only the methods from MasterRepository. If there are none, just leave it empty. The methods from AbstractRepository will be accessible anyway.

class MasterRepositoryImpl extends MasterRepository {
}

After the data layer refactoring, ~70% of the code was removed. The main point here: functional stuff (FRM and other “modern” things) can be useful only when you don’t have business objects. F.e. you are creating a telecom back-end whose main intent is to parse network packets, do something with their data and fire them to the next point of your data pipeline. In all other cases, when your business logic touches the real world, you need object-oriented design.

Bugs and workarounds

I’ve recently faced a bug, which I would like to mention.

Sometimes the application fails to start because it can’t find an Ebean class. It is somehow connected with logback.xml, but I am not sure how. My breaking change was adding Sentry’s logback appender.

There are two solutions:

  • some people fix it just by playing with logback.xml, removing or changing appenders. That doesn’t look very stable.
  • another workaround is to inject EbeanDynamicEvolutions into your repository (AbstractRepository is the best place). You don’t need to use it. I think it is connected with Play’s attempts to run evolutions on start. The connection to logback is still unclear.

DTO layer

Another part of the system that disappointed me. This layer’s intent is to receive messages from outside (usually REST) and run actions based on the message type. Usually it means you get a message, parse it (usually from JSON) and pass it to the service layer. Then you take the service layer’s return value and send it outside as an encoded answer. Encoding and decoding messages (DTOs) is the main thing here.

For some reason working with JSON is unfriendly in Scala. And super unfriendly in Play2.

Json deserialization – not automated anymore

In normal frameworks, specifying the type of the object to be parsed is all you need to do. You specify the root object, and the request body is parsed and deserialized into it, including all sub-objects. F.e. build(@RequestBody RepositoryDTO body), taken from one of my open-source projects.

In Play you need to set up implicit reader for every sub-object, used in your DTO. In case your MasterDTO contains PetDTO, which contains RoleDTO you have to set up reader for all of them:

def createMaster: Action[AnyContent] = Action.async { request =>
    implicit val formatRole: OFormat[RoleDTO] = Json.format[RoleDTO]
    implicit val formatPet: OFormat[PetDTO] = Json.format[PetDTO]
    implicit val format: OFormat[MasterDTO] = Json.format[MasterDTO]
    val parsed = Json.fromJson(request.body.asJson.get)(format)
    val body: MasterDTO = parsed.getOrElse(null)
    // …
}

Maybe there is some automated way, but I haven’t found it. All approaches end up with getting request’s body as json and parsing it manually.

Finally I ended up with json4s, parsing objects like this:


What I still don’t like here is that you have to get the body as JSON, convert it to a string and parse it one more time. I am lucky this project is not realtime, but if yours is – think twice before doing so.

Json validation – more boilerplate for the god of boilerplate!

Play has its own modern functional way of data validation. In three steps only:

  1. Forget about javax.validation
  2. Define your DTO as a case class. Here you write your field names and their types.
  3. Manually write a Form mapping, mentioning all the DTO’s field names and writing their types once again.

After Slick’s manual schema definition, I expected something shitty. But this exceeded my expectations.

The example:

case class SomeDTO(id: Int, text: String, option: Option[Double])

def validationForm: Form[SomeDTO] = {
  Form(
    mapping(
      "id" -> number,
      "text" -> nonEmptyText,
      "option" -> optional(of(doubleFormat))
    )(SomeDTO.apply)(SomeDTO.unapply)
  )
}

It is used like this:

    def failure(badForm: Form[_]) = {
      // handle validation errors
    }

    def success(input: SomeDTO) = {
      // your business logic here
    }

    validationForm.bindFromRequest()(request).fold(failure, success)

Json serialization – forget about heterogeneity

It was the main problem with Play’s JSON implementation and the main reason I decided to get rid of it. Unfortunately, I haven’t found a quick way to remove it completely (it looks hardcoded) and replace it with json4s.

All my DTOs implement my JsonSerializable trait, and I have a few services which work with generic objects. Imagine DogDTO and CatDTO: they are different business entities, but some actions are common. To avoid code duplication I just send them via the Pet trait to those services (like FeedPetService). They do their job and return just a List of JsonSerializable objects (either Cat or Dog DTOs, based on the input type).

It turned out that Play can’t serialize a trait if it is not sealed: it requires an implicit writer to be set up explicitly. So after googling a bit I switched to json4s.

Now I have 2 lines of implementation for any DTO:

def toJson(elements: List[JsonSerializable]): String = {
    implicit val formats: AnyRef with Formats = Serialization.formats(NoTypeHints)
    Serialization.write(elements)
}

It is defined in trait. Every companion object, which extends this trait has json serialization of class-objects out of the box.

Summing up

  • Slick’s creators call Slick a “Functional Relational Mapper” (FRM) and claim it has minimal-configuration advantages. As far as I can see, it is yet another unsuccessful attempt to create something with the “functional” buzzword. Out of 10 years of my experience I spent around 4 in functional programming (Erlang) and saw a lot of dead projects which started as a “New Innovative Functional Approach”.
  • Scala’s implicits are something magical that breaks the KISS principle and makes the code messy. There is a very good thread about Scala implicits + Slick.
  • Working with JSON in Play2 is pain.
Python & Graphql. Tips, tricks and performance improvements.

Recently I finished another back-end with GraphQL, this time in Python. In this article I’d like to tell you about the difficulties I faced and the narrow places which can affect performance.

Technology stack: graphene + flask and sqlalchemy integration. Here is a piece of requirements.txt:

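Based on the stack described above, a minimal set of packages would look roughly like this (names only; the exact pinned versions from the original list are omitted, so treat these as assumptions):

```
graphene
graphene-sqlalchemy
Flask
Flask-GraphQL
SQLAlchemy
Flask-SQLAlchemy
```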

This allows me to map my database entities directly to GraphQL.

It looks like this:

The model:

class Color(db.Model):
  """color table"""
  __tablename__ = 'colors'

  color_id = Column(BigInteger().with_variant(sqlite.INTEGER(), 'sqlite'), primary_key=True)
  color_name = Column(String(50), nullable=False)
  color_r = Column(SmallInteger)
  color_g = Column(SmallInteger)
  color_b = Column(SmallInteger)

The node:

class ColorNode(SQLAlchemyObjectType):
  class Meta:
    model = colours.Color
    interfaces = (relay.Node,)

  color_id = graphene.Field(BigInt)

Everything is simple and nice.

But what are the problems?

Flask context

At the time of writing this article I was unable to pass my context to GraphQL:

app.add_url_rule('/graphql',
                 view_func=GraphQLView.as_view('graphql',
                                               schema=schema,
                                               context_value={'session': db.session}))

This didn’t work for me, as the context in the flask-graphql integration was replaced by the flask request.

Maybe this is fixed now, but I had to subclass GraphQLView to save the context:

class ContexedView(GraphQLView):
  context_value = None

  def get_context(self):
    context = super().get_context()
    if self.context_value:
      for k, v in self.context_value.items():
        setattr(context, k, v)
    return context

CORS support

It is always a thing I forget to add 🙂

For Python Flask, just add flask-cors to your requirements and set it up in your create_app method via CORS(app). That’s all.

Bigint type

I had to create my own bigint type, as I use it in the database as a primary key in some columns, and there were graphene errors when I tried to use the int type.

class BigInt(Scalar):
  @staticmethod
  def serialize(num):
    return num

  @staticmethod
  def parse_literal(node):
    if isinstance(node, (ast.StringValue, ast.IntValue)):
      return int(node.value)

  @staticmethod
  def parse_value(value):
    return int(value)

Compound primary key

Also, graphene_sqlalchemy doesn’t support compound primary keys out of the box. I had one table with an (Int, Int, Date) primary key. To make it resolvable by id via Relay’s Node interface I had to override the get_node method:

@classmethod
def get_node(cls, info, id):
  import datetime
  return super().get_node(info, eval(id))

The datetime import and eval are very important here: without them the date field would be just a string and nothing would work when querying the database.
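To see why both pieces matter, here is a small standalone sketch (the id string is a hypothetical example of a serialized compound key, not the exact format graphene produces):

```python
import ast
import datetime

# A hypothetical serialized compound key for an (Int, Int, Date) primary key.
raw_id = "(1, 2, datetime.date(2020, 1, 31))"

# ast.literal_eval only accepts plain literals, so the datetime.date(...) call fails:
try:
    parsed = ast.literal_eval(raw_id)
except ValueError:
    parsed = None  # "malformed node or string"

# eval handles it, but only because datetime is imported in the enclosing scope:
key = eval(raw_id)
```

Without the import, eval would raise a NameError and the date component would never become a real date object.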

Mutations with authorization

It was really easy to add authorization for queries: all I needed was to add a Viewer object and write get_token and get_by_token methods, as I did many times in Java before.

But mutations are called bypassing the Viewer, and that’s natural for GraphQL.

I didn’t want to add authorization code to every mutation’s header, as it leads to code duplication and is a little bit dangerous: I might create a backdoor simply by forgetting to add this code.

So I subclassed mutation and reimplemented its mutate_and_get_payload like this:

class AuthorizedMutation(relay.ClientIDMutation):
  class Meta:
    abstract = True

  @classmethod
  def mutate_authorized(cls, root, info, **kwargs):
    raise NotImplementedError

  @classmethod
  def mutate_and_get_payload(cls, root, info, **kwargs):
    # authorize user using info.context.headers.get('Authorization')
    return cls.mutate_authorized(root, info, **kwargs)

All my mutations subclass AuthorizedMutation and just implement their business logic in mutate_authorized. It is called only if the user was authorized.

Sortable and Filterable connections

To have my data automatically sorted via query in a connection (with sort options added to the schema) I had to subclass relay’s connection and implement the get_query method (it is called in graphene_sqlalchemy):

class SortedRelayConnection(relay.Connection):
  class Meta:
    abstract = True

  @classmethod
  def get_query(cls, info, **kwargs):
    return SQLAlchemyConnectionField.get_query(cls._meta.node._meta.model, info, **kwargs)

Then I decided to add dynamic filtering over every field, also extending the schema.

Out of the box graphene can’t do it, so I had to make a PR and subclass the connection once again:

class FilteredRelayConnection(relay.Connection):
  class Meta:
    abstract = True

  @classmethod
  def get_query(cls, info, **kwargs):
    return FilterableConnectionField.get_query(cls._meta.node._meta.model, info, **kwargs)

Where FilterableConnectionField was introduced in the PR.

Sentry middleware

We use Sentry as our error notification system, and it was hard to make it work with graphene. Sentry has good flask integration, but the problem with graphene is that it swallows exceptions, returning them as errors in the response.

I had to use my own middleware:

class SentryMiddleware(object):

  def __init__(self, sentry) -> None:
    self.sentry = sentry

  def resolve(self, next, root, info, **args):
    promise = next(root, info, **args)
    if promise.is_rejected:
      promise = promise.then(None, self.log_and_return)
    return promise

  def log_and_return(self, e):
    try:
      raise e
    except Exception:
      if self.sentry.is_configured:
        if not issubclass(type(e), NotImportantUserError):
          self.sentry.captureException()
    return e

It is registered on GraphQL route creation:

app.add_url_rule('/graphql',
                 view_func=ContexedView.as_view('graphql',
                                                schema=schema,
                                                context_value={'session': db.session},
                                                middleware=[SentryMiddleware(sentry)]))

Low performance with relations

Everything went well, tests were green, and I was happy till my application reached the dev environment with real amounts of data. Everything was super slow.

The problem was in sqlalchemy’s relations. They are lazy by default.

It means that if you have a graph with 3 relations, Master -> Pet -> Food, and query them all, the first query will fetch all masters (select * from masters). Say you’ve received 20. Then for each master there will be a query (select * from pets where master_id = ?) – 20 queries. And finally N food queries, based on the pets returned.

My advice here: if you have complex relations and lots of data (I was writing a back-end for the big data world), make all relations eager. The query itself will be heavier, but there will be only one, reducing response time dramatically.
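The difference is easy to demonstrate outside the ORM. A small sketch with plain sqlite3 (toy tables, not the real models) counting how many queries each style issues:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE masters (master_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE pets (pet_id INTEGER PRIMARY KEY, master_id INTEGER, name TEXT);
    INSERT INTO masters VALUES (1, 'Alice'), (2, 'Bob');
    INSERT INTO pets VALUES (1, 1, 'Rex'), (2, 1, 'Tom'), (3, 2, 'Max');
""")

query_count = 0

def run(sql, args=()):
    global query_count
    query_count += 1
    return conn.execute(sql, args).fetchall()

# Lazy relations: 1 query for masters + 1 query per master for pets (1 + N)
masters = run("SELECT master_id, name FROM masters")
for master_id, _name in masters:
    run("SELECT name FROM pets WHERE master_id = ?", (master_id,))
lazy_queries = query_count  # 1 + 2 masters = 3 queries

# Eager loading: a single JOIN brings everything at once
query_count = 0
rows = run("SELECT m.name, p.name FROM masters m "
           "JOIN pets p ON p.master_id = m.master_id")
eager_queries = query_count  # 1 query
```

With 20 masters the lazy style already issues 21 queries, and a third relation level multiplies it further.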

Performance improvement with custom queries

After I made my critical relations eager (not all of them – I had to study the front-end app to understand what it queries and how) everything worked faster, but not fast enough. I looked at the generated queries and was a bit frightened: they were monstrous! I had to write my own optimized queries for some nodes.

F.e. I have a PlanMonthly entity with several OrderColorDistributions, each of them having one Order.

I can use subqueries to limit the data (remember, I am writing a back-end for big data) and populate relations with existing data (I already had this data in the query, so there was no need for the eager joins generated by the ORM). It makes the request lighter.


  1. Mark subqueries with with_labels=True
  2. Use root’s (for this request) entity as return one:
    Order.query \
      .filter(<low level filtering here>) \
      .join(<join another table, which you can use later>) \
      .join(ocr_query, Order.order_id == ocr_query.c.order_color_distribution_order_id) \
      .join(date_limit_query,
            and_(ocr_query.c.order_color_distribution_color_id == date_limit_query.c.plans_monthly_color_id,
                 ocr_query.c.order_color_distribution_date == date_limit_query.c.plans_monthly_date,
                 <another table joined previously> == date_limit_query.c.plans_monthly_group_id))
  3. Use contains_eager on all first level relations.
    query = query.options(contains_eager(Order.color_distributions, alias=ocr_query))
  4. If you have second layer of relations (Order -> OrderColorDistribution -> PlanMonthly) chain contains_eager:
    query = query.options(contains_eager(Order.color_distributions, alias=ocr_query)
                 .contains_eager(OrderColorDistribution.plan, alias=date_limit_query))

Reducing number of calls to the database

Besides the data rendering level I have my service layer, which knows nothing about GraphQL. And I am not going to introduce it there, as I don’t like high coupling.

But each service needs the fetched months data. To fetch the data only once and share it between all services, I use injector with @request scope. Remember this scope – it is your friend in GraphQL.

It works like a singleton, but only within one request to /graphql. In my connection I just populate it with the plans found via the GraphQL query (including all custom filters and ranges from the front-end):
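A minimal sketch of such a cache (the class name follows the FutureMonthCache injected below; the populate/get_months method names are illustrative, and the injector wiring itself is omitted):

```python
class FutureMonthCache:
    """Request-scoped holder for months data.

    With injector's @request scope one instance is created per /graphql
    request, so the connection can populate it once and every service
    reads the same data without extra database round trips.
    """

    def __init__(self) -> None:
        self._months = None

    def populate(self, months) -> None:
        # called once, from the connection, with the plans found via the query
        self._months = list(months)

    def get_months(self):
        return self._months


# within a single request every service sees the same populated instance
cache = FutureMonthCache()
cache.populate(["2020-01", "2020-02"])
```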


Then in all services, which need to access this data I just use this cache:

def __init__(self,
             prediction_service: PredictionService,
             price_calculator: PriceCalculator,
             future_month_cache: FutureMonthCache) -> None:
  self._prediction_service = prediction_service
  self._price_calculator = price_calculator
  self._future_month_cache = future_month_cache

Another nice thing: all my services which manipulate data and form the response also have @request scope, so I don’t need to recalculate predictions for every month. I take them all from the cache, do one query and store the results. Moreover, one service can rely on another service’s calculated data. Request scope helps a lot here, as it allows me to calculate everything only once.

On the Node side I call my request-scoped services via a resolver:

def resolve_predicted_pieces(self, _info):
  return app.injector.get(PredictionCalculator).get_original_future_value(self)

It allows me to run heavy calculations only if predicted_pieces was specified in the GraphQL query.

Summing up

Those are all the difficulties I’ve faced. I haven’t tried websocket subscriptions, but from what I’ve learned I can say that Python’s GraphQL is more flexible than Java’s, because of Python’s flexibility itself. But if I were going to work on a high-load back-end, I would prefer not to use GraphQL, as it is harder to optimize.

GraphQL with Spring: Query and Pagination

In this article I’ll show you how to use GraphQL with Spring with this library. A full example is available here.

Why annotations?

From my point of view, the schema should not be written manually, simply because it is easy to make a mistake. The schema should be generated from code instead. And your IDE can help you here, checking types and typos in names.

Nearly always the GraphQL schema has the same structure as the back-end data models, because the back-end is closer to the data. So it is much easier to annotate your data models and keep the existing schema in sync than to write the schema manually (maybe on the front-end side) and then create bridges between that schema and the existing data models.

Add library and create core beans

The first thing to do is to add the library to your Spring Boot project. I assume you’ve already added web, so just add graphql to your build.gradle:


The GraphQL object is the entry point for executing GraphQL queries. To build it we need to provide a schema and an execution strategy.
Let’s create a schema bean in your Configuration class:

@Bean
public GraphQLSchema schema() {
    GraphQLAnnotations.register(new ZonedDateTimeTypeFunction());
    return newSchema()
            .query(GraphQLAnnotations.object(QueryDto.class))
            .build();
}

Here we register a custom ZonedDateTime type function to convert ZonedDateTime from Java to a string with the format yyyy-MM-dd’T’HH:mm:ss.SSSZ and back.

Then we use a builder to create new schema with query, mutation and subscription. This tutorial covers query only.

Building a schema is not cheap, so it should be done only once. GraphQLAnnotations will scan your source tree starting from QueryDto, going through its properties and methods and building a schema for you.

After schema is ready you can create a GraphQL bean:

@Bean
public GraphQL graphQL(GraphQLSchema schema) {
    return GraphQL.newGraphQL(schema)
            .queryExecutionStrategy(new EnhancedExecutionStrategy())
            .build();
}

According to the documentation, building a GraphQL object is cheap and can be done per request if required. I don’t need that, but you can add prototype scope to it.

I’ve used EnhancedExecutionStrategy to have ClientMutationId inserted automatically, to support Relay mutations.

Create controller with CORS support

You will receive your graphql request as ordinary POST request on /graphql:

@RequestMapping(path = "/graphql", method = RequestMethod.POST)
public CompletableFuture<ResponseEntity<?>> getTransaction(@RequestBody String query) {
    CompletableFuture<?> respond = graphqlService.executeQuery(query);
    return respond.thenApply(r -> new ResponseEntity<>(r, HttpStatus.OK));
}

It should always return an ExecutionResult and HTTP OK, even if there is an error!
It is also very important to support the OPTIONS request: some front-end GraphQL frameworks send it before sending the POST with data.
In Spring, all you need is to add the @CrossOrigin annotation.

Execute graphql with spring application context

You can get your query in two formats: json with variables:

{"query":"query SomeQuery($pagination: InputPagination) { viewer { someMethod(pagination: $pagination) { data { inner data } } } }","variables":{"pagination":{"pageSize":50,"currentPage":1}}}

or plain GraphQL query:

query SomeQuery {
  viewer {
    someMethod(pagination: {pageSize: 50, currentPage: 1}) {
      data { inner data }
    }
  }
}

The best way to convert both formats to one is to use this inner class:

private class InputQuery {
    String query;
    Map<String, Object> variables = new HashMap<>();

    InputQuery(String query) {
        ObjectMapper mapper = new ObjectMapper();
        try {
            Map<String, Object> jsonBody = mapper.readValue(query, new TypeReference<Map<String, Object>>() {});
            this.query = (String) jsonBody.get("query");
            this.variables = (Map<String, Object>) jsonBody.get("variables");
        } catch (IOException ignored) {
            this.query = query;
        }
    }
}
Here we parse the JSON first. If it parses, we provide the query with variables. If not, we just assume the input string is a plain query.
To execute the query, construct a GraphQL execution input and pass it to the execute method of your GraphQL object.

public CompletableFuture<ExecutionResult> executeQuery(String query) {
    InputQuery queryObject = new InputQuery(query);
    ExecutionInput executionInput = ExecutionInput.newExecutionInput()
            .query(queryObject.query)
            .variables(queryObject.variables)
            .context(appContext)
            .build();
    return CompletableFuture.completedFuture(graphQL.execute(executionInput));
}


private ApplicationContext appContext;
private GraphQL graphQL;
private MutationDto mutationDto;

appContext is the Spring application context. It is used as the execution input context in order to access Spring beans from GraphQL objects.
GraphQL is your bean, created earlier.
MutationDto is your mutation. I’ll cover it in another tutorial.

The query

Query is a start point for your GraphQL request.

public class QueryDto

I used the Dto suffix for all GraphQL objects to separate them from data objects. However, this suffix is redundant in the schema, so the @GraphQLName annotation is used.

@GraphQLField
public static TableDto getFreeTable(DataFetchingEnvironment environment) {
    ApplicationContext context = environment.getContext();
    DeskRepositoryService repositoryService = context.getBean(DeskRepositoryService.class);
    return repositoryService.getRandomFreeDesk().map(TableDto::new).orElse(null);
}

Every public static method in QueryDto annotated with @GraphQLField will be available for querying:

query {
  getFreeTable {
    tableId name
  }
}

GraphQL Objects

Your query returns TableDto which is your GraphQL object.

The difference between QueryDto and a normal TableDto is that the first one is always static, while ordinary objects are instantiated. In the listing above it is created from a Desk.

To make fields and methods of created object visible for the query you should make them public and annotate with @GraphQLField.

In the case of properties you can leave them private; the GraphQL library will access them anyway:

@GraphQLField
private Long tableId;
@GraphQLField
private String name;

@GraphQLField
public String getWaiterName(DataFetchingEnvironment environment) {
    // TODO use context to retrieve waiter.
    return "default";
}

The DataFetchingEnvironment will be filled in automatically by the GraphQL Annotations library if added to the function’s arguments. You can skip it if it’s not needed:

public String getWaiterName() {
    return "default";
}

You can also use any other arguments including objects:

public String getWaiterName(DataFetchingEnvironment environment, String string, MealDto meal) {
    // TODO use context to retrieve waiter.
    return "default";
}

You can use @GraphQLNonNull to make any argument required.

Relay compatibility

Every object should implement the Node interface, which has a non-null id:

public interface Node {
    String id();
}

ClassTypeResolver lets GraphQL include the interface in your schema.
I usually use the class name + class id for the Node id. Here is the AbstractId class every object extends.

Then in TableDto constructor I will use: super(TableDto.class, desk.getDeskId().toString());

For the ability to get a Table by its Node id, let’s use this:

public static TableDto getById(DataFetchingEnvironment environment, String id) {
    ApplicationContext context = environment.getContext();
    DeskRepositoryService repositoryService = context.getBean(DeskRepositoryService.class);
    return repositoryService.findById(Long.valueOf(id)).map(TableDto::new).orElse(null);
}

It is called from QueryDto:

@GraphQLField
public static Node node(DataFetchingEnvironment environment, @GraphQLNonNull String id) {
    String[] decoded = decodeId(id);
    if (decoded[0].equals(TableDto.class.getName()))
        return TableDto.getById(environment, decoded[1]);
    if (decoded[0].equals(ReservationDto.class.getName()))
        return ReservationDto.getById(environment, decoded[1]);
    log.error("Don't know how to get {}", decoded[0]);
    throw new RuntimeException("Don't know how to get " + decoded[0]);
}

by this query: query {node(id: "unique_graphql_id") {... on Table { reservations {edges {node {guest from to}} }}}}

The pagination

To support pagination your method should return PaginatedData<YourClass> and have additional annotation @GraphQLConnection:

@GraphQLField
@GraphQLConnection
public static PaginatedData<TableDto> getAllTables(DataFetchingEnvironment environment) {
    ApplicationContext context = environment.getContext();
    DeskRepositoryService repositoryService = context.getBean(DeskRepositoryService.class);
    Page page = new Page(environment);
    List<Desk> allDesks;
    if (page.applyPagination()) {
        allDesks = repositoryService.findAll(); // TODO apply pagination!
    } else {
        allDesks = repositoryService.findAll();
    }
    List<TableDto> tables = allDesks.stream().map(TableDto::new).collect(Collectors.toList());
    return new AbstractPaginatedData<TableDto>(false, false, tables) {
        @Override
        public String getCursor(TableDto entity) {
            return entity.id();
        }
    };
}

Here I create a Page using the default GraphQL pagination variables, but you can use any other pagination input you like.

To implement pagination you have two options:

  • In-memory pagination. You retrieve all objects from your repository and then paginate, filter and sort them in memory. For this solution it is much better to create your own implementation of PaginatedData and pass the environment, as well as the pagination/sorting/filtering input, there.
  • Actions in the repository. This is much better, as you won’t load lots of objects into memory, but it requires you to generate complex queries in the repository.

All GraphQL objects can have methods with pagination. In TableDto you can retrieve a paginated list of all reservations for the current table with:

public PaginatedData<ReservationDto> reservations(DataFetchingEnvironment environment) {

Next steps

In the next article I will cover mutations and subscriptions.