Implementation, Testing And Support – Our Ongoing Quest To Deliver The Best Value Possible, Always.

Dave Sampson — 19 June, 2020

Welcome back to our series on the challenges and changes we’ve made delivering a middleware product to the healthcare industry on behalf of our federal government customer. In the first article we introduced some of the changes we’ve applied to our relationship and working practices, and built on that in our second article to describe corresponding changes to how we develop, build and deploy. It’s now time to conclude our journey, starting with our testing practices.

How We Test

Our testing practices have historically been a mixture of sporadically implemented and maintained unit and integration tests and mostly manual system and regression testing. In the last couple of years we’ve invested (and will continue to invest) in more consistent use of test automation.

Our legacy unit tests had been written using the MSTest framework across many years by many developers. As tends to happen, different developers had different approaches, experience and appetites for writing coded tests, resulting in tests that were inconsistent and in many cases assumed dependencies (such as databases, or data in those databases) that were long, long gone! As a result the tests couldn't be executed reliably, so they became poorly maintained, and the cycle continued.

We’re now partway through migrating our existing unit and integration tests from the MSTest framework to xUnit instead (this was the subject of a recent team swarm day), and all new tests are written in xUnit. For us, xUnit provides much better support for the explicit handling of dependencies through its fixtures, and we use them extensively to perform actions for each test run such as:

  • Creating a clean application database (using our database upgrade mechanism) and populating it with seed data in SQL Server LocalDB for each test run
  • Ensuring that client certificates required for interacting with external dependencies are installed appropriately (more on that to come)
  • Ensuring our logging configuration is available during unit tests
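As a rough sketch of the pattern (the class, database and test names here are illustrative, not from our actual codebase), an xUnit class fixture can own the lifecycle of a per-run LocalDB database and hand tests a ready-to-use connection string:

```csharp
using System;
using Microsoft.Data.SqlClient;
using Xunit;

// Hypothetical sketch: creates a uniquely named LocalDB database for the
// test run and drops it afterwards, so every run starts from a known state.
public class DatabaseFixture : IDisposable
{
    public string ConnectionString { get; }
    private readonly string _dbName;

    public DatabaseFixture()
    {
        _dbName = $"AppTests_{Guid.NewGuid():N}";
        ConnectionString =
            $@"Server=(localdb)\MSSQLLocalDB;Database={_dbName};Integrated Security=true";

        using var master = new SqlConnection(
            @"Server=(localdb)\MSSQLLocalDB;Integrated Security=true");
        master.Open();
        using var create = master.CreateCommand();
        create.CommandText = $"CREATE DATABASE [{_dbName}]";
        create.ExecuteNonQuery();

        // In the real fixture we'd invoke the product's database upgrade
        // mechanism here to build the schema, then apply the seed data.
    }

    public void Dispose()
    {
        // Clean up so LocalDB doesn't accumulate orphaned test databases.
        using var master = new SqlConnection(
            @"Server=(localdb)\MSSQLLocalDB;Integrated Security=true");
        master.Open();
        using var drop = master.CreateCommand();
        drop.CommandText =
            $"ALTER DATABASE [{_dbName}] SET SINGLE_USER WITH ROLLBACK IMMEDIATE; " +
            $"DROP DATABASE [{_dbName}];";
        drop.ExecuteNonQuery();
    }
}

// Tests declare the dependency explicitly and xUnit supplies the fixture.
public class PatientLookupTests : IClassFixture<DatabaseFixture>
{
    private readonly DatabaseFixture _db;

    public PatientLookupTests(DatabaseFixture db) => _db = db;

    [Fact]
    public void SeededPatientCanBeLoaded()
    {
        // Act against _db.ConnectionString, asserting on known seed data.
    }
}
```

The same mechanism (or a collection fixture, for state shared across test classes) covers the certificate and logging setup mentioned above.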

Using just these fixtures we've eliminated many of the static "test helper" classes (shudder) we previously had throughout our codebase, and provided a framework that can consistently and reliably get tests into a known state before running and clean up after themselves. This lets our developers focus on what they actually need to test instead of worrying about dependencies as a barrier to even getting started! By making our test dependencies explicit and easing how they're fulfilled, we can now focus on writing repeatable unit and integration tests that will run anywhere, including any data or configuration that's specific to the test.

While this approach is working just fine for our unit tests, we're actually reasonably constrained in how widely we can implement unit tests before we reach the land of integration tests instead. Being middleware, we have dependencies on a number of external services, and the vast majority of our middleware logic revolves around interactions with these services. In a sound architecture these dependencies would be abstracted behind interfaces, and we'd use dependency injection to supply either the concrete dependency or a mock we could test against. Alas, echoing my earlier observation about doing the right things from the start (from now on), the legacy codebase has no interfaces and no dependency injection. While that's certainly something we'll progressively work towards, in the meantime we're left exploring alternatives for satisfying dependencies in unit and integration tests:
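To sketch where we're heading (all type names here are hypothetical, for illustration only): once an external service sits behind an interface, production code takes the real client while tests substitute a double, with no external environment involved.

```csharp
using System.Threading.Tasks;

// Hypothetical types for illustration; our real services differ.
public record Document(string Id, byte[] Content);
public record UploadResult(bool Succeeded, string Reference);

// The abstraction the middleware logic would depend on.
public interface IDocumentService
{
    Task<UploadResult> UploadAsync(Document document);
}

// The dependency arrives via the constructor, so at runtime we inject the
// concrete service client...
public class DocumentForwarder
{
    private readonly IDocumentService _service;

    public DocumentForwarder(IDocumentService service) => _service = service;

    public async Task<UploadResult> ForwardAsync(Document document)
    {
        // Real logic would validate, transform and log before forwarding.
        return await _service.UploadAsync(document);
    }
}

// ...and in unit tests we inject a stub instead, avoiding the shared
// environments and simulators described below.
public class StubDocumentService : IDocumentService
{
    public Task<UploadResult> UploadAsync(Document document)
        => Task.FromResult(new UploadResult(true, "stub-ref"));
}
```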

  • We have access to an externally hosted “system vendor test” environment for each external service dependency, and historically this has been the environment we’ve targeted with our integration tests.

    • The Good: The environment exactly replicates the service logic implemented in the corresponding production environment, meaning if our tests fail in this environment, it’s significant.

    • The Bad: The environment intermittently and inconsistently returns error responses to service calls. The very next time the same test case is executed it's likely to succeed, but a different test case may fail, or not. So we waste time investigating test failures that turn out to be transient failures in our dependency rather than in our system under test, and consequently lose confidence in our integration tests all over again.

    • The Bad: We have limited control over the data in this environment, which is shared with a number of other vendors. So we can very well (and do!) execute a test using test data we thought was in one state, only to find that someone else has modified it in the meantime!

  • We have access to a “simulator” for each of our external service dependencies that can be run alongside our system under test.

    • The Good: We’re in complete control of the environment and the data in it.

    • The Bad: The simulator doesn’t implement all service operations supplied by the external service dependencies, so we can’t execute some test cases.

    • The Bad: The simulator is based on a different codebase to the main codebase for the external services, so may not always reflect the current release.

  • We’ve developed a set of “mocks” for each external service dependency and the service operations we invoke.

    • The Good: We’re in complete control again.

    • The Bad: The mocks don’t implement all service operations supplied by the external service dependencies, so we can’t execute some test cases.

    • The Bad: We implement minimal logic in the mocks, so they’re not really suitable for testing interaction logic. But they are great for performance testing – more on this later!

I’m not sure there’s a great answer to this; everywhere we turn there are constraints and deficiencies. In the interim we’re working with the supplier of the “system vendor test” environments to understand why we encounter intermittent failures, and we’re introducing increased resiliency into the code that interacts with these dependencies. This is an area we’ll continue to invest in, because it’s important to get right.

As our products become more widely adopted we’re seeing a greater emphasis from customers on performance, in both the core and user interface products. Performance testing is not something we’ve historically been great at; we’ve tended to do just enough to meet the requirements of a specific project, without really considering repeatability or frameworks. To address our customers’ increasing requirements we’re now investing heavily in automated performance testing.

We’ve composed a reasonably robust framework that uses JMeter running inside Docker containers as our test clients, which enables us to scale the clients to simulate any load we desire. To remove the dependency on the external services and provide the responsiveness we need during performance testing, we’ve used SoapUI to mock each service operation we depend on, with the mocks deployed to Tomcat. We use Azure DevOps Pipelines to deploy a target release to our performance test environment and kick off the performance test run, capturing performance metrics via Telegraf into InfluxDB and reporting on them using Grafana.

Using this approach we’ve finally been able to benchmark the performance of our core and UI products, with a view to identifying changes that introduce performance regressions in the future. We’ve also successfully used the framework to identify and eliminate some long-standing performance bottlenecks, in particular improving the responsiveness of a high-use screen in our UI by at least 5x.
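As a rough sketch of the client side (the Docker image, test plan and property names here are illustrative assumptions, not our actual pipeline configuration), a single containerised JMeter client can be launched in non-GUI mode like this; scaling out is then a matter of launching more containers against the same target:

```shell
# Illustrative only: image name, plan and output paths are assumptions.
# -n runs JMeter headless, -t names the test plan, -l the results file,
# and -J passes properties the plan can read (e.g. thread count, ramp-up).
docker run --rm \
  -v "$(pwd)/plans:/plans" \
  -v "$(pwd)/results:/results" \
  justb4/jmeter \
  -n -t /plans/core-api.jmx \
  -l /results/core-api.jtl \
  -Jthreads=200 -Jrampup=60
```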

We’re actively working on maturing our testing capability in other areas too, including:

  • Continuing to automate our regression test suite using tools such as Selenium and Katalon

  • Continuing to evolve our performance testing framework, including consideration of alternative mocks such as Mountebank, and scheduling our performance test suite to run on a regular basis so we can compare the results with previous runs

  • Automating and scheduling vulnerability scans

  • Simplifying and standardising our test reporting requirements, particularly in the areas of test strategy and test summaries