Tuesday, 18 June 2019

Local Microservices: First-Class Procedures

This is the third article in a three part series on local microservices.  The first two articles looked at the Object Orientation behaviour coupling problem and at how microservices break up the resulting jigsaw.

These articles identified that the method coupling of Object Orientation creates a monolithic jigsaw of different shaped objects.  Microservices are breaking these into more manageable, smaller jigsaws that appear similar in shape.

This article continues the breaking down to consider local (pass by reference) microservices.

Part Three: Local Microservices via First-Class Procedures

The first two articles in this series identified:
  1. Object references are a nice graph of nodes (objects) and lines (fields)
  2. Object methods have a significant coupling problem creating a jigsaw of behaviour
  3. Microservices break the method coupling to return behaviour to a graph of nodes (microservices) and lines (HTTP requests / Queue messages)
There is an underlying pattern to representing this decoupled behaviour.   It is HTTP URL / Queue name and payload / message type.   This decoupled client calling pattern can be represented with the following general interface:

interface ClientCall<T> {
  void invokeService(T singleObject);
}

This client calling interface is then implemented by the appropriate HTTP request service(...) method or Queue onMessage(...) method.  These methods are usually found on the following objects:

public class SomeHttpServicerImpl {
  @Inject SomeRepository someRepository;
  @Inject AnotherRepository anotherRepository;
  @Inject ClientCall<SomeArgument> anotherMicroservice;
  // other dependencies

  public void service(SomeObject httpRequestEntity) {
    // service HTTP request with injected dependencies
  }
}
public class SomeQueueConsumerImpl {
  @Inject SomeRepository someRepository;
  @Inject AnotherRepository anotherRepository;
  @Inject ClientCall<SomeArgument> anotherMicroservice;
  // other dependencies

  public void onMessage(SomeQueueMessage message) {
    // service Queue message with injected dependencies
  }
}

Furthermore, what is not shown clearly is the threading model.   As the HTTP servicer or Queue consumer are in their own process, they are run with their own threads.

The result is the following pattern for implementing the microservice:
  • Single object provided by client
  • Remaining objects are dependency injected
  • Thread used is based on service/consumer implementation
  • Interaction with other microservices is via single parameter ClientCall
The issue with this pattern is that all calls to other microservices require the microservice to be executed by another thread.  As the microservice resides behind HTTP requests / Queues, there are process boundaries preventing the calling thread from executing the microservice.
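As a rough illustration of this thread hand-off, the sketch below backs a ClientCall with an in-memory queue.  The queue, the thread name and the helper methods are assumptions made for the example, standing in for a real Queue or HTTP transport:

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicReference;

public class QueueBackedCall {

  // The general client calling interface from the article.
  interface ClientCall<T> {
    void invokeService(T singleObject);
  }

  // Hypothetical stand-in for a queue: the caller only enqueues the payload.
  static <T> ClientCall<T> viaQueue(BlockingQueue<T> queue) {
    return queue::add;
  }

  // Sends one message and reports which thread serviced it.
  static String serviceOnConsumerThread(String message) {
    BlockingQueue<String> queue = new LinkedBlockingQueue<>();
    AtomicReference<String> serviced = new AtomicReference<>();
    CountDownLatch done = new CountDownLatch(1);

    // The consumer owns its thread, much like a separate process would.
    Thread consumer = new Thread(() -> {
      try {
        String received = queue.take(); // onMessage(...) logic would go here
        serviced.set(received + " serviced by " + Thread.currentThread().getName());
        done.countDown();
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    }, "queue-consumer");
    consumer.setDaemon(true);
    consumer.start();

    ClientCall<String> call = viaQueue(queue);
    call.invokeService(message); // the calling thread only enqueues
    try {
      done.await();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
    return serviced.get();
  }

  public static void main(String[] args) {
    System.out.println("caller thread: " + Thread.currentThread().getName());
    System.out.println(serviceOnConsumerThread("hello"));
  }
}
```

The calling thread never executes the servicing logic; it can only hand the payload over.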

The process boundary separation provides a bounded context, so that the microservices are isolated from each other.  However, this separation puts a lot of communication overheads and network error handling into microservice solutions.  Plus it disallows microservices from being executed by the same thread.

So can we have the microservice called and executed by the same thread, and still continue to provide the microservice advantages of bounded contexts?  (in other words, the smaller jigsaws)

Local Bounded Context

To see how local (same thread calling/executing) microservices can be achieved, we need to transform the above implementations slightly.

Rather than field/setter injection, let's look at using constructor injection.  We could turn the above implementation into the following:

public class SomeMicroserviceImpl {
  private final SomeRepository someRepository;
  private final AnotherRepository anotherRepository;
  private final ClientCall<SomeArgument> anotherMicroservice;

  @Inject
  public SomeMicroserviceImpl(
            SomeRepository someRepository,
            AnotherRepository anotherRepository,
            ClientCall<SomeArgument> anotherMicroservice) {
    this.someRepository = someRepository;
    this.anotherRepository = anotherRepository;
    this.anotherMicroservice = anotherMicroservice;
  }

  public void service(SomeObject httpRequestEntity) {
    // service HTTP request with injected dependencies
  }
}

However, that's a lot of code!

Rather, why not just inject the dependencies directly into the method:

  public static void service(
            SomeObject httpRequestEntity,
            SomeRepository someRepository,
            AnotherRepository anotherRepository,
            ClientCall<SomeArgument> anotherMicroservice) {
    // service HTTP request with injected dependencies
  }

The method has effectively become a procedure.   The object and all its fields are no longer necessary.  The above procedure links the required objects together by taking them as parameters.

This execution is now:
  1. ClientCall used to invoke a procedure
  2. Procedure pulls in appropriate dependencies
  3. Procedure then invokes other procedures via the ClientCall interface
The execution is no longer methods navigating the Object references, locking you into a monolithic jigsaw.  It is now procedures invoking each other, pulling in only the required dependencies for the procedure.

As the procedure pulls in only its required objects, it provides a bounded context.  One procedure may pull in a particular set of objects, while another procedure may pull in a totally different set of objects. As the procedure joins the objects, we no longer have to create a big graph of all objects referencing each other.   We can separate the objects into smaller graphs.  This break down allows separation of objects into bounded contexts.
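To make this concrete, here is a minimal sketch of two procedures wired together through ClientCall on the calling thread.  The repository class, the audit procedure and the manual wiring in run(...) are hypothetical; a framework would normally do the injection:

```java
import java.util.ArrayList;
import java.util.List;

public class LocalProcedure {

  interface ClientCall<T> {
    void invokeService(T singleObject);
  }

  // Hypothetical dependency standing in for SomeRepository.
  static class SomeRepository {
    final List<String> saved = new ArrayList<>();

    void save(String entity) {
      saved.add(entity);
    }
  }

  // First procedure: one client-provided object, the rest are injected.
  static void service(String request, SomeRepository repository, ClientCall<String> another) {
    repository.save(request);
    another.invokeService(request + " (forwarded)");
  }

  // Second procedure: pulls in a different dependency (the audit log only).
  static void audit(String message, List<String> auditLog) {
    auditLog.add(message + " on " + Thread.currentThread().getName());
  }

  // Manually wires the procedures together and returns the audit log.
  static List<String> run(String request) {
    SomeRepository repository = new SomeRepository();
    List<String> auditLog = new ArrayList<>();
    ClientCall<String> auditCall = message -> audit(message, auditLog);
    ClientCall<String> serviceCall = req -> service(req, repository, auditCall);
    serviceCall.invokeService(request); // executed by the calling thread
    return auditLog;
  }

  public static void main(String[] args) {
    System.out.println(run("order"));
  }
}
```

Each procedure sees only the objects passed to it, and the thread that invokes the first ClientCall also executes the second procedure.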

Now the question is how we can implement this so the procedures run within the same process space.

First-Class Procedure

Well, this procedure is remarkably similar to the First-Class Procedure.

What First-Class Procedures allow is containerising small snippets of logic within a procedure.  These procedures communicate via loosely coupled continuations that require only a single object (payload message).  The remaining objects are dependency injected.  Furthermore, threading models can be specific to each procedure.

The two bounded context approaches have similar characteristics:
  • HTTP/Queue communication can be considered a single argument Continuation
  • Threading models can be different within each first-class procedure / microservice process
  • Dependency Injection of both allows access to only the required object graph allowing smaller object jigsaw puzzles (no monoliths).  In other words, bounded contexts.
The difference is that the same thread can call and execute the First-Class Procedure.  In other words, First-Class Procedures run locally with each other.

Remote vs Local

But don't microservices want to be process separated to allow different release cycles and scalability?
Yes, that is absolutely true once up and running in production with a heavy load of users.  However, what about getting started with microservices?

For me this falls into the problem of being opinionated too early.  To get the right mix of microservices takes a significant amount of requirements gathering and architecture.  Why? Because refactoring microservice architectures can be expensive.  Microservices involve a lot of overhead in typically different code repositories, build pipelines, network failure handling, etc.  Finding you got the microservice mix wrong involves a lot of effort to change.

By starting out with first-class procedures, you get to try a local microservice mix.  If the mixture is wrong, it is very quick to change them.  First-class procedures are woven together graphically.  Therefore, to change the mixture is simply rewriting the procedures and then drawing the new connections between them. Yep, that's it.  No code moving between repositories.  No changing build pipelines.  No extra error handling because of network failures.  You can get on with trying out various mixes of local microservices (first-class procedures) all on your local development machine.

Once you find a mix you are happy with, deploy all of them in the one container.  Why?  Because unless you have a large user base, you can run your first-class procedures in just one node (possibly two for redundancy).  Having fewer deployed nodes means fewer cloud instances.  Fewer cloud instances means, well, fewer dollars.

Then as your load increases, you can split out the first-class procedures into separate nodes.  Just change the continuation link between them to either an HTTP call or Queue.  Furthermore, this split can then be for various reasons you may discover along the way:
  • differing functionality change cycles
  • differing team responsibilities (e.g. Conway's Law)
  • data governance may mean geographic restrictions
  • security may require some to run on premise
  • on premise capacity limits may mean pushing some to public clouds
The above is not an exhaustive list.
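The idea of changing only the link when splitting can be sketched as below.  This is an assumption-laden illustration: localLink and splitLink are made-up helpers, and a single-thread executor stands in for the separate node that a real HTTP call or Queue would reach:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicReference;

public class SwappableLink {

  interface ClientCall<T> {
    void invokeService(T singleObject);
  }

  // Local link: the calling thread executes the target procedure itself.
  static <T> ClientCall<T> localLink(ClientCall<T> target) {
    return target;
  }

  // Stand-in for an HTTP / Queue link: a thread owned by the "other node"
  // executes the target. It blocks only to keep this demo deterministic.
  static <T> ClientCall<T> splitLink(ClientCall<T> target, ExecutorService otherNode) {
    return payload -> {
      try {
        otherNode.submit(() -> target.invokeService(payload)).get();
      } catch (Exception e) {
        throw new IllegalStateException(e);
      }
    };
  }

  // Runs the same procedure through either link and reports the executing thread.
  static String executedBy(boolean split) {
    AtomicReference<String> thread = new AtomicReference<>();
    ClientCall<String> procedure = payload -> thread.set(Thread.currentThread().getName());
    ExecutorService otherNode = Executors.newSingleThreadExecutor(r -> new Thread(r, "other-node"));
    try {
      ClientCall<String> link = split ? splitLink(procedure, otherNode) : localLink(procedure);
      link.invokeService("payload");
    } finally {
      otherNode.shutdown();
    }
    return thread.get();
  }

  public static void main(String[] args) {
    System.out.println("local: " + executedBy(false));
    System.out.println("split: " + executedBy(true));
  }
}
```

The procedure itself is untouched in both cases; only the link chosen for it differs.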

Having to gather requirements and architect the microservice mix given all of the above could get quite exhausting, especially as some aspects are quite fluid (e.g. teams change, companies buying other companies, capacity limits on in house data centres, etc).  There are significant factors making it difficult to find the right mix of microservices up front.

Plus this also works in reverse.  As things change and some aspects don't experience higher loads or significant functional changes, they can be combined back into single instances.  This reduces the number of cloud instances required, and again reduces dollars.

Summary

For me, local microservices (i.e. pass by reference microservices) are going to eventuate.  This is similar to local (pass by reference) EJBs being introduced because the EJB 1.0 specification of only remote calls was just too heavy.  Yes, we have better infrastructure and networks than 20 years ago.  However, the financial overhead costs of only remote microservices may soon be considered heavy weight and expensive given that local (pass by reference) First-Class Procedures are available.

So if you are finding the "remote" microservice architecture a heavy weight and expensive consideration, try out First-Class Procedures as a local microservice solution.   This gets you many of the benefits of microservices without the expense.  Then, as your load increases and your revenues increase, scale into a remote microservice architecture.  But do this only where you see real need, so you can keep your infrastructure costs down.

Thursday, 6 June 2019

Local Microservices: Breaking up the Jigsaw

This is part two in a three part series looking at local microservices running in the same process.  Part one identified the coupling problems in Object Orientation behaviour.  This article will look at how microservices are helping reduce the coupling problem.

Part Two: Breaking up the Jigsaw

Part one identified that object behaviour coupling is similar to a jigsaw puzzle. This is a highly coupled jigsaw of varying shaped objects. These different shaped objects make their re-use and refactoring difficult within monolithic applications.

So, you may be asking how are microservices helping this problem?

For me, I see the evolution of microservices to be an evolution to break down the rigid behavioural coupling imposed by object methods.

We had applications grow to become monoliths:


This became unmanageable to enhance or re-use, as everything was tightly coupled together in a rigid jigsaw.

Our first attempt was to try re-using parts of the jigsaw with Service-Oriented Architectures.   This looked like the following:



The service-oriented architecture was, in my opinion, a doomed attempt to expose the method connectors outside the system for re-use.  Yay, we can now call into our monoliths to re-use aspects of them.  Oh, wait, there was so much coupling to that method that it was just too hard to separate it from the rest of the monolith.

Ok, we could put in governance and some great coding practices to avoid this.  However:
  • deadlines
  • shortcuts
  • new team members
  • occasional bad design decisions
  • etc 
allow the coupling to increase over time.  As much as we wanted to believe in the ideals, the behaviour coupling of methods just took over to create the monolith jigsaw.

We needed to break things down and keep them isolated.

So we split the jigsaw up into smaller puzzles.  The result is the following smaller puzzles joined by HTTP requests / Queues:


And for me, microservices were born.

At first glance, this looks very similar to the original EJB 1.0 specification of only remote calls.  Now one might argue that microservices are not single objects like EJBs typically were.  This, however, is not why I see microservices as an improvement over remote EJBs.

EJBs use remote procedure calls that allow multiple parameters and varying exceptions to be thrown. This does nothing to decouple you from the varying shape of the method call. EJBs only enable methods to be called remotely.  These remote method calls continue to have all the jigsaw coupling issues of method calls.  Except, now they are less reliable due to network failures.  Hence, EJBs suffer the method coupling problems that lead you to a distributed monolithic jigsaw.

Microservices, on the other hand, standardise calls to other microservices via HTTP requests / queue messages. Looking at the five coupling aspects of the method call (Inversion of Coupling Control), HTTP requests / queue messages remove most of the method call coupling:

| Method Coupling Aspect | HTTP Request | Queue |
| --- | --- | --- |
| Method Name | URL. Though, can be decoupled indirection via lookup in service directory/registry. | Outbound queue. Provides decoupled indirection to target consumer (via message routing). |
| Return Type | May provide return entity. However, typically this is only for front-ends with users waiting. Note: be careful of synchronous returns between microservices, as they can create synchronous coupling that can lead to distributed monoliths. | N/A, as decoupled from any synchronous return values. |
| Parameters | Only single payload. Typically JSON/XML serialised object. | Only single payload. Typically JSON/XML serialised object. |
| Exceptions | Standardised REST status codes. Note: be careful of relying on error response payloads, as they may be indications of cohesiveness problems in the microservice. | N/A, as decoupled from any synchronous processing. |
| Threading | Decoupled as separate process. Any threading model enabling synchronous response. | Decoupled as separate process. No restriction on threading model. |


Therefore, using HTTP requests / queues, the microservice calling shape is effectively standardised. Given the payload is typically a serialised object (e.g. JSON/XML), this removes the ability to pass multiple parameters.  Standardising the HTTP status codes disallows varying exceptions.  Plus, my preferred microservice communication of queues is decoupled by its very nature.

So by making the communication HTTP requests / queue messages, microservices are removing the method coupling problems.  It makes it easier to call different microservices, as the shape is only a name (HTTP URL / Queue name) and a single parameter (payload).
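A small sketch of this standardised shape, assuming a hypothetical in-memory Router in place of real HTTP or queue infrastructure.  Every call is just a name plus a single serialised payload:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class NamedPayloadCall {

  interface ClientCall<T> {
    void invokeService(T singleObject);
  }

  // Hypothetical router: a name (URL / Queue name) maps to a handler
  // that accepts one serialised payload.
  static class Router {
    private final Map<String, ClientCall<String>> routes = new HashMap<>();

    void register(String name, ClientCall<String> handler) {
      routes.put(name, handler);
    }

    void send(String name, String payload) {
      ClientCall<String> handler = routes.get(name);
      if (handler == null) {
        throw new IllegalArgumentException("No service named " + name);
      }
      handler.invokeService(payload);
    }
  }

  static List<String> demo() {
    List<String> received = new ArrayList<>();
    Router router = new Router();
    router.register("/orders", json -> received.add("orders got " + json));
    router.register("billing-queue", json -> received.add("billing got " + json));

    // Both calls have the same shape, whatever the target does internally.
    router.send("/orders", "{\"id\":1}");
    router.send("billing-queue", "{\"amount\":10}");
    return received;
  }

  public static void main(String[] args) {
    demo().forEach(System.out::println);
  }
}
```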

This, actually, is similar to the object reference shape of type and reference discussed in part one.  However, now type is parameter type (payload serialised object) and reference is name (URL / Queue name).  This allows microservice behaviour to somewhat be represented like the idealised object graph at the start of part one - lines to standard shaped microservice call connectors.

Stay tuned for part three where we look at further isolating the microservice container to the servicing method to effectively enable local microservices.

Tuesday, 28 May 2019

Local Microservices: Object Orientation Behaviour Coupling Problem

This is the first in a three part series looking at creating "local" microservices.

What I mean by local microservices is pass by reference microservices running in the same process.  Right now I see microservices similar to the original EJB 1.0 specification.  All calls between the EJBs were remote, just like microservices are now.  Due to remote call overheads, the EJB 2.0 specification brought in local (pass by reference) interfaces for EJBs running in the same process.

Now, before getting to local microservices, we need to look at identifying the Object Orientation behaviour coupling problem.  This, to me, identifies a big problem that microservices resolve.   It is also what enables local microservices.

Part One: Object Orientation Behaviour Coupling Problem

Structurally Object Orientation joins objects together in a very nice graph. These typically look something like the following:


These graphs represent the object relationships in your application.  Each line is a field reference to another object.  This, for example, allows modelling your personal details to have addresses, phone number and various other objects containing information.

However, I can't call an object reference.  I use the object reference to access a method on the object.  Behaviour of your application actually follows the method calls.  In other words, program control follows method calls on the thread stack.  Yes, the object references make methods accessible to be called, however the methods calling each other is how behaviour of your application is achieved.

So, we can just draw the above graph with the method calls, right?  No, the method coupling is a lot more complicated than just an object reference that varies at most by type (see Inversion of Coupling Control).   Methods have:
  • a particular return type
  • a name
  • varying number of parameters with differing types
  • varying number of exceptions with differing types
  • thread to execute them
Drawing a simple line between objects to represent behavioural relationships (program control flow) does not respect the method coupling complexity.  In the object structural graph, changing a line is simply changing the reference (memory address) to another object.  Yes, restrictions apply based on the type of object referenced, but it is a single address value that couples the two objects into a structural relationship.  This is not the case with methods.

Each method call is a different shape.  In terms of object structural connectors the shape of the object reference is object type and memory address.   This is consistent across all object references.  However, for methods the call (connectors) are different shapes due to the above 5 aspects.  In other words, one method may have 1 parameter, while another has 3 and throws 2 exceptions.  The shape of each method call connector varies significantly.
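The contrast can be sketched in a few lines of Java.  The classes and methods below are invented purely for illustration: re-pointing a reference is a single assignment, whereas each method imposes its own shape on the call site:

```java
import java.io.IOException;
import java.util.concurrent.TimeoutException;

public class MethodShapes {

  static class Address {
    final String street;

    Address(String street) {
      this.street = street;
    }
  }

  // Structural link: the shape is always "type + reference".
  static class Person {
    Address address;
  }

  // Shape 1: one parameter, returns a value, no checked exceptions.
  static String label(Address address) {
    return "To: " + address.street;
  }

  // Shape 2: three parameters, no return value, two checked exceptions.
  static void deliver(Address address, String parcel, int attempts)
      throws IOException, TimeoutException {
    if (attempts < 1) {
      throw new TimeoutException("no attempts left");
    }
    if (parcel.isEmpty()) {
      throw new IOException("nothing to deliver");
    }
  }

  static String demo() {
    Person person = new Person();
    person.address = new Address("1 First St");
    person.address = new Address("2 Second St"); // re-linking is one assignment

    String result = label(person.address); // call site fits shape 1
    try {
      deliver(person.address, "book", 3); // shape 2 needs a different call site
    } catch (IOException | TimeoutException e) {
      result = "failed: " + e.getMessage();
    }
    return result;
  }

  public static void main(String[] args) {
    System.out.println(demo());
  }
}
```

Swapping label(...) for deliver(...) means rewriting the caller: extra arguments, no return value to use, and two new exceptions to handle.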

Drawing the behaviour diagram of object method calls is more akin to the following:


Each call (connector) between the methods is a different shape, just like a jigsaw.

We piece objects together into a jigsaw to achieve the behaviour of the application.  This is why refactoring your application is so difficult.  It's like trying to move pieces of a jigsaw around to make a new picture.   We end up having to change the pieces significantly just so they even attempt to fit into the new picture.

It is also the reason, I believe, why object orientation never provided that ideal class re-use.   Re-using objects is like trying to re-use jigsaw pieces of one puzzle to complete another puzzle.   Now if the puzzles are somewhat alike, you can kinda jam them together.   However, more often than not you have to change the piece (object) to fit into the other jigsaw (application) losing re-use.

Hence, the method coupling problem creates a significant behavioural coupling problem in monolithic Object Oriented systems.  Refactoring is an expensive exercise, as it's not redrawing lines between nice round objects.  Refactoring is reshaping pieces to fit into a new jigsaw picture.

Now I can continue the analogy further with much of API development attempting to standardise jigsaw pieces and finding generic pictures that represent the majority of systems.  However, I'm hoping you can join me in seeing that much of the Object Orientation behaviour coupling problem is not with the objects but rather the use of methods.

And yes, for Object Orientation purists, Object Orientation is more about loosely coupled message passing.  In other words, languages such as Erlang may be closer to what the intention of Object Orientation really was supposed to be.  However, my focus here is the "mainstream" understanding of Object Orientation being objects and methods.

Stay tuned for my next part on how microservices are helping with relieving the Object Orientation method coupling problem.

Sunday, 28 April 2019

Function IoC for First Class Procedure

This is the second article of two to introduce the term I'm suggesting of "first class procedure". The first article provided a working example of the first class procedure to see them in action. This article delves into the detail and some theory on how the first class procedure has evolved.

The evolution of the "first class procedure" starts with looking at the function. The function takes in parameters to then produce the result:

  type Function = Array[Any] => Any

  // Example of first class function
  def exampleFunction(paramOne: Int, paramTwo: Double): String = s"$paramOne, $paramTwo" // illustrative body
  val firstClassFunction: Function = (parameters) => exampleFunction(parameters(0).asInstanceOf[Int], parameters(1).asInstanceOf[Double])

However, before being able to obtain any results, the function also requires a thread to be run. This is what I refer to as the "implicit thread". The function does not define details regarding the thread to use, so defaults to the invoking thread of the function (the implicit thread).

Ideally, we should enhance the function signature to indicate the thread more explicitly so we can allow more control in how the composed functions are executed:

  type Executor = (Function, Array[Any]) => Any

  def invoke(executor: Executor, function: Function, parameters: Array[Any]) =
    executor(function, parameters)

Now the thread to execute the function is made explicit. Note that the executor is the means to invoke with either the implicit thread or a separate explicit thread via a thread pool.
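For readers more at home in Java, a rough counterpart might look like the following.  FunctionExecutor, IMPLICIT and pooled(...) are names assumed for this sketch, not part of any framework:

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Function;

public class ExplicitExecutor {

  // Hypothetical Java counterpart of the Executor type above.
  interface FunctionExecutor {
    Object execute(Function<Object[], Object> function, Object[] parameters);
  }

  // Runs the function on the invoking (implicit) thread.
  static final FunctionExecutor IMPLICIT =
      (function, parameters) -> function.apply(parameters);

  // Runs the function on a pool thread and waits for its result.
  static FunctionExecutor pooled(ExecutorService pool) {
    return (function, parameters) -> {
      Callable<Object> task = () -> function.apply(parameters);
      try {
        return pool.submit(task).get();
      } catch (Exception e) {
        throw new IllegalStateException(e);
      }
    };
  }

  static Object invoke(FunctionExecutor executor,
      Function<Object[], Object> function, Object[] parameters) {
    return executor.execute(function, parameters);
  }

  // Runs the same function with both executors and reports the threads used.
  static String[] demo() {
    Function<Object[], Object> whoRanMe =
        parameters -> parameters[0] + " ran on " + Thread.currentThread().getName();
    ExecutorService pool = Executors.newSingleThreadExecutor(r -> new Thread(r, "pool-thread"));
    try {
      Object implicit = invoke(IMPLICIT, whoRanMe, new Object[] {"first"});
      Object explicit = invoke(pooled(pool), whoRanMe, new Object[] {"second"});
      return new String[] {implicit.toString(), explicit.toString()};
    } finally {
      pool.shutdown();
    }
  }

  public static void main(String[] args) {
    for (String line : demo()) {
      System.out.println(line);
    }
  }
}
```

The function is identical in both invocations; only the executor passed alongside it decides which thread does the work.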

An argument might be that threading is hard and should be left to compilers/frameworks/etc. However, I believe that I still want some control over the execution profile of the application. A reason for this would be the following example.

I'm running a multi-user application on a large server with multiple processors that does various blocking I/O and expensive CPU computations to service a request. I really want to optimise the CPU computations to have them have affinity to a core to avoid cache misses and thread context switching overheads. Plus I want to isolate the blocking I/O to a separate thread pool so the CPU computations do not block causing idle CPUs. Given the load of I/O I might want to isolate one core to do all I/O leaving the remaining cores free for the CPU computations. Conversely, I might find the I/O is minimal and can time slice with the CPU computations so I can get an extra core for the CPU computations to gain increased throughput. Ideally, I would like means to tweak this for the application.

Then requirements change and I now want to use the same application to service a single user that is running the application on a single core (e.g. a cheap portable embedded system). In this circumstance, I'm happy for the CPU computations to block while the I/O is underway. Basically, I just want one implicit thread to run all the functions.

In both the above, it is the same application logic but just different execution profiles based on the environment the application is running within.

So taking our above explicit function signature, we now require the higher order functions to provide appropriate threading information to invoke each composed function:

  def higher(executorIO: Executor, executorCpuIntensive: Executor, parameters: Array[Any]) =
    executorIO(functionIO, Array(executorCpuIntensive(functionCpuIntensive, parameters)) ++ parameters)

This now requires the higher order function to determine the thread for each contained function. This can blow out the higher order function's signature. Therefore, let's for now create a function that can determine the executor for a function:

  type ExecutorLocator = Function => Executor

So this now changes the higher order function to be:

  def higher(executorLocator: ExecutorLocator, parameters: Array[Any]) =
    executorLocator(functionIO)(functionIO, Array(executorLocator(functionCpuIntensive)(functionCpuIntensive, parameters)) ++ parameters)

So we've stopped the blow out of the higher order function signature, but it starts to highlight a problem of passing results between the composed functions. Well not so much in passing results, but in which parts of the logic to execute with which thread.

To get the result from functionA to pass to functionB, we have two approaches:
  • As functionA completes, the system returns the result to the higher order function's thread that then passes it onto functionB
  • As functionA completes, it carries on to execute functionB
The difference is very subtle, however significant for thread context switching overheads.

Note: in my naive understanding of functional programming, I believe the first approach can be considered turning each function into an Actor, while the second approach is a Continuation (continuation passing style).

Anyway, before misunderstanding too much of the functional programming literature, the problem with the first approach is excessive thread context switching. If the higher order function is on a different executor to its composed functions, it creates two thread context switches for every composed function. The execution would be:
  1. Higher order function executes
  2. Invokes composed function on another thread
  3. Result returned to higher order function that thread context switches back to itself
  4. Invokes next composed function on another thread (possibly the same thread required for step 2)
The thread context switch that happens in step 3 is not really required. Furthermore, the cost of switching the thread would be more than the few operations to pass the result of the first function to the second function.

Note: I refer to thread context switching overheads assuming threads need to be scheduled in to handle each part. However, even if threads were running continuously on separate cores, there are overheads in having to get the messages between the threads.

There is also a second problem of exceptions. Yes, I understand exceptions are not very functional, however I'll come to this later in regards to composition.

So if we take approach 2 of continuations, then the execution would be as follows:
  1. Higher order function executes
  2. Invokes composed functionA on another thread (passing in the next composed functionB)
  3. On result, the composed functionA continues with the next composed functionB
We have eliminated the extra context switch in the first approach by letting the composed functions continue on with each other.
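The two approaches can be compared with a small Java sketch that records which thread runs each step.  The function names follow the text; the single "worker" thread, the trace list and the hand-back future are assumptions for the example:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Consumer;

public class ContinuationHops {

  static void functionA(List<String> trace, Consumer<String> next) {
    trace.add("functionA on " + Thread.currentThread().getName());
    next.accept("resultA");
  }

  static void functionB(String input, List<String> trace) {
    trace.add("functionB(" + input + ") on " + Thread.currentThread().getName());
  }

  static ExecutorService newWorker() {
    return Executors.newSingleThreadExecutor(r -> new Thread(r, "worker"));
  }

  // Approach 1: the result is handed back to the higher order function's thread.
  static List<String> returnToCaller() {
    List<String> trace = Collections.synchronizedList(new ArrayList<>());
    ExecutorService worker = newWorker();
    try {
      CompletableFuture<String> handBack = new CompletableFuture<>();
      worker.submit(() -> functionA(trace, handBack::complete));
      String result = handBack.get(); // switch back to the higher order function
      trace.add("higher order function on " + Thread.currentThread().getName());
      worker.submit(() -> functionB(result, trace)).get();
    } catch (Exception e) {
      throw new IllegalStateException(e);
    } finally {
      worker.shutdown();
    }
    return trace;
  }

  // Approach 2: functionA continues straight on to functionB.
  static List<String> continuation() {
    List<String> trace = Collections.synchronizedList(new ArrayList<>());
    ExecutorService worker = newWorker();
    try {
      worker.submit(() -> functionA(trace, result -> functionB(result, trace))).get();
    } catch (Exception e) {
      throw new IllegalStateException(e);
    } finally {
      worker.shutdown();
    }
    return trace;
  }

  public static void main(String[] args) {
    System.out.println(returnToCaller());
    System.out.println(continuation());
  }
}
```

In the first trace the higher order function's thread appears between the two composed functions; in the second it does not appear at all.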

Now this is not overly special. This is just Continuation Passing Style with the ability to execute the continuation with the implicit thread or delegating to a different thread. In doing so, we try to use the implicit thread as much as possible to reduce threading overheads. However when functions have different execution characteristics (such as blocking I/O, expensive CPU computations) we swap threads to enable appropriate execution of the functions to keep the overall application performant.

Another way of thinking about this threading is to consider the threads running on separate cores. The first approach is very much synchronous communication. Composed function is invoked and the higher order function effectively waits until the result is available. Continuations, on the other hand, are more akin to asynchronous communication. Higher order function triggers the composed function and then is free to continue other functions. The composed function will itself continue with the next composed function.

But continuations do not come easily.

Continuation passing has implications for the function signature, as we must pass the next continuation into all functions. This now has our functions looking like this:

  type Continuation = (Any) => Unit

  def function(executorLocator: ExecutorLocator, parameters: Array[Any], continuation: Continuation)

Now might I dare say there is more than one outcome to a function - well to me anyway. Yes, we could return a data type defining both success and error. Case statements then handle these various outcomes. However, for each new error type we're having to add new case statement handling. This reminds me of reflection problems of having to interrogate for each error type. Personally, I like to handle these outcomes separately.

Rather than combining the result and error into the one next continuation, we can provide a continuation for each. In my understanding, this is not too different to try/catch blocks (except that we can now execute the catch block with the implicit thread or different thread). In other words, we provide multiple continuations:

  def function(executorLocator: ExecutorLocator, parameters: Array[Any],
      successfulContinuation: Continuation,
      errorOneContinuation: Continuation,
      errorTwoContinuation: Continuation)

But why stop there. We can also have different paths through the function for say if statements. If condition is true follow the first continuation, else follow the second continuation.

  def function(executorLocator: ExecutorLocator, parameters: Array[Any],
      trueContinuation: Continuation,
      falseContinuation: Continuation,
      errorOneContinuation: Continuation,
      errorTwoContinuation: Continuation)

This is starting to blow out the function signature again, especially when composing into higher order functions that have to handle many exceptions. This is also the reason I tend to find many frameworks are moving away from checked exceptions. However, with first class procedures, we very much like checked exceptions (though this is probably not as you typically know them). But we'll come to this soon.
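As a Java illustration of one continuation per outcome, the sketch below uses a made-up parseQuantity function with a success continuation and two error continuations.  Exactly one of them is followed for any input:

```java
import java.util.function.Consumer;

public class OutcomeContinuations {

  // Hypothetical function: parses a quantity and follows exactly one continuation.
  static void parseQuantity(String input,
      Consumer<Integer> success,
      Consumer<NumberFormatException> notANumber,
      Consumer<IllegalArgumentException> negative) {
    int value;
    try {
      value = Integer.parseInt(input);
    } catch (NumberFormatException e) {
      notANumber.accept(e);
      return;
    }
    if (value < 0) {
      negative.accept(new IllegalArgumentException("negative: " + value));
      return;
    }
    success.accept(value);
  }

  // Each outcome is handled separately; no inspection of a combined result type.
  static String describe(String input) {
    StringBuilder outcome = new StringBuilder();
    parseQuantity(input,
        value -> outcome.append("ok ").append(value),
        e -> outcome.append("not a number"),
        e -> outcome.append("negative"));
    return outcome.toString();
  }

  public static void main(String[] args) {
    System.out.println(describe("3"));
    System.out.println(describe("x"));
    System.out.println(describe("-1"));
  }
}
```

Every additional outcome adds another parameter, which is the signature growth described above.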

So to avoid the signature blow out, let's do what we did with the choice of executor and wrap the continuation decision into a function. For now, let's assume we can have some key to aid the function to determine the appropriate continuation. This function would look as follows:

  type ContinuationLocator = (Function, Any) => Continuation

This then turns all our functions into the following:

  def function(
      executorLocator: ExecutorLocator,
      parameters: Array[Any],
      continuationLocator: ContinuationLocator)

So now we have this very flexible execution model that minimises the thread context switching.

However, how do we implement the executorLocator and continuationLocator functions?

Well the naming of the functions is deliberate, as they follow the ServiceLocator pattern. Given a key provide back the dependency. However, in this case, it is not an object but rather an executor for thread choice and continuation for invoking the next function.
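A naive version of such a locator, sketched in Java with a plain map as the key/value configuration.  MapExecutorLocator, FunctionExecutor and the two sample functions are hypothetical names for the example:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

public class ExecutorLocatorSketch {

  interface FunctionExecutor {
    Object execute(Function<Object[], Object> function, Object[] parameters);
  }

  // Key/value configuration: given the function (key), provide its executor.
  static class MapExecutorLocator {
    private final Map<Function<Object[], Object>, FunctionExecutor> configuration =
        new HashMap<>();

    void configure(Function<Object[], Object> function, FunctionExecutor executor) {
      configuration.put(function, executor);
    }

    FunctionExecutor locate(Function<Object[], Object> function) {
      FunctionExecutor executor = configuration.get(function);
      if (executor == null) {
        throw new IllegalStateException("No executor configured for function");
      }
      return executor;
    }
  }

  static final Function<Object[], Object> FUNCTION_IO =
      parameters -> "io(" + parameters[0] + ")";
  static final Function<Object[], Object> FUNCTION_CPU =
      parameters -> "cpu(" + parameters[0] + ")";

  static String run(MapExecutorLocator locator,
      Function<Object[], Object> function, Object parameter) {
    try {
      Object result = locator.locate(function).execute(function, new Object[] {parameter});
      return String.valueOf(result);
    } catch (IllegalStateException e) {
      return "failed: " + e.getMessage();
    }
  }

  static String[] demo() {
    MapExecutorLocator locator = new MapExecutorLocator();
    // Only FUNCTION_IO is configured; FUNCTION_CPU is deliberately left out.
    locator.configure(FUNCTION_IO, (function, parameters) -> function.apply(parameters));
    return new String[] {
      run(locator, FUNCTION_IO, "read"),
      run(locator, FUNCTION_CPU, "hash")
    };
  }

  public static void main(String[] args) {
    for (String line : demo()) {
      System.out.println(line);
    }
  }
}
```

Note that a function missing from the map is only discovered when it is looked up at runtime.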

Yay, we can now go create key/value configuration for every function in the system. Well, maybe not.

This granularity of configuration ends up being a nightmare. I can certainly say my first versions of implementing first class procedures showed this very much to be the case. Plus given we have an assumed key to identify the continuation, how do we know we have every continuation configured for a complete system? In other words, how can we make this compile safe?

To solve this problem, we do what we typically always do in software, add more indirection.

Huh, more dynamic indirection to create a type safe compiled solution? Well, yes.

To explain how indirection has helped, let's start with the Continuation Injection.

Firstly, we are going to give the function state (or possibly better described as meta-data). Arguably this is potentially turning the function into an object, but I'm going to avoid trying to relate things to the literature right now and focus on how the first class procedure has evolved.

So we now have a wrapping object:

  class ManagedFunction(
    val logic: Function,
    val continuations: Map[Any, Continuation])

So we've associated the continuations to the function, but this really is only a dynamic map based on key. Therefore, to be compile safe, we need to give the key some meaning the compiler/framework can understand.

Well to do this, let's go back to the exploded function signature:

  def function(parameters: Array[Any],
      trueContinuation: Continuation,
      falseContinuation: Continuation)

Given the parameters are always ordered, we can use the index of the continuation as the key. This has the ManagedFunction look as follows:

  class ManagedFunction(
    val logic: Function,
    val continuations: Map[Any, Continuation]) {

    def cont(key: Any): Continuation = continuations.get(key) match { case Some(cont) => cont }

    def run(parameters: Array[Any]) = logic(parameters ++ Array(cont(1), cont(2)))
  }

Now, I mentioned first class procedures actually like using checked exceptions. The reason is that the checked exception is stated on the signature. This would look like:

  function(Object[] parameters) throws ErrorOne, ErrorTwo;

Checked exceptions are not ordered, but their types are unique. Therefore, we can use the checked exception's type as the key. This now turns the ManagedFunction into the following:

  class ManagedFunction(
    val logic: Function,
    val continuations: Map[Any, Continuation]) {

    def cont(key: Any): Continuation = continuations.get(key) match { case Some(cont) => cont }

    def run(parameters: Array[Any]) = {
      try {
        logic(parameters ++ Array(cont(1), cont(2)))
      } catch {
        case ex: Throwable => cont(ex.getClass())(ex)
      }
    }
  }

Now, to make compile/framework safe, we provide a function that produces the required list of keys for the logic function:

  def extractContinuations(logic: Function): Array[Any] = {
    var parameterIndex = 0
    extractParameterTypes(logic)
      .filter((parameterType) => classOf[Continuation].isAssignableFrom(parameterType)).map((paramContinuation) => { parameterIndex += 1; parameterIndex }) ++
      extractExceptionTypes(logic)
  }

The compiler/framework will then confirm that configuration mappings are provided for each key. This allows validating that all continuations are configured for the function to operate. Furthermore, doing this across all ManagedFunction instances within the application, we can confirm a complete configured application. We now have compile/framework startup safe validation that all continuations are configured.
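
As a rough illustration of this validation step, the following Java sketch reflects over a method signature to derive the required keys (1-based index for each continuation parameter, then each declared checked exception type) and reports any key without a configured continuation. The Continuation interface and the requiredKeys, missingKeys and example names are assumptions made for this sketch and are not taken from the actual implementation.

```java
import java.io.IOException;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class ContinuationValidation {

    /** Hypothetical continuation type: invokes the next procedure with one argument. */
    interface Continuation {
        void invoke(Object argument);
    }

    /** Keys are the 1-based indexes of Continuation parameters, then the declared exception types. */
    static List<Object> requiredKeys(Method logic) {
        List<Object> keys = new ArrayList<>();
        int continuationIndex = 0;
        for (Class<?> parameterType : logic.getParameterTypes()) {
            if (Continuation.class.isAssignableFrom(parameterType)) {
                keys.add(++continuationIndex);
            }
        }
        for (Class<?> exceptionType : logic.getExceptionTypes()) {
            keys.add(exceptionType);
        }
        return keys;
    }

    /** Returns the keys that have no configured continuation (empty means fully configured). */
    static List<Object> missingKeys(Method logic, Map<Object, Continuation> configured) {
        List<Object> missing = new ArrayList<>();
        for (Object key : requiredKeys(logic)) {
            if (!configured.containsKey(key)) {
                missing.add(key);
            }
        }
        return missing;
    }

    /** Example logic: two continuations and one checked exception, so three keys are required. */
    static void example(String parameter, Continuation onTrue, Continuation onFalse) throws IOException {
    }

    /** Looks up the example method reflectively. */
    static Method exampleMethod() {
        try {
            return ContinuationValidation.class.getDeclaredMethod(
                    "example", String.class, Continuation.class, Continuation.class);
        } catch (NoSuchMethodException ex) {
            throw new IllegalStateException(ex);
        }
    }

    public static void main(String[] args) {
        // Only the two indexed continuations are configured, so the exception key is reported
        Map<Object, Continuation> configured = new HashMap<>();
        configured.put(1, argument -> { });
        configured.put(2, argument -> { });
        System.out.println("Missing: " + missingKeys(exampleMethod(), configured));
    }
}
```

Running the same check over every ManagedFunction at startup is what would give the application-wide confirmation described above.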

However, now we are having problems of passing state between the functions contained within the ManagedFunction. As only one argument can be passed with the continuation, how can a function have more than one parameter?

Ideally, we want to have each ManagedFunction have the following run method:

  abstract class ManagedFunction {
    def run(parameter: Any) // not, parameters: Array[Any]
  }

So, how can we provide the additional parameters for the function?

Before answering this, we need to consider how the first continuation is triggered to start the chain of ManagedFunction executions. As the application is now being realised as a mapping of continuations to ManagedFunctions, we need a means to trigger a continuation from outside a ManagedFunction.

Well, why can't we give objects continuations?

We can create a ManagedObject that contains continuations:

  class ManagedObject(
    @Inject val continuationOne: Continuation,
    @Inject val continuationTwo: Continuation) {
    // ... object methods
  }

This now allows objects to trigger logic. Why is this useful? Well, for example, we can have an HTTP socket listener object that receives a request and services the request by invoking a continuation. Further ManagedFunction instances will then use details of the request to route it via continuations to the appropriate handling ManagedFunction instances, which then service the request.

The HTTP example actually points us to a design pattern already solving the issue of multiple parameters for a function. A typical thread-per-request web server has a request, session and application context. Now let's ignore session and application contexts as they are not concurrency safe. It is the request context pattern that helps us.

A request context allows passing objects between controllers and view rendering components. What are the controllers and view renders? They are snippets of logic that take in a request scope to access/mutate the request scope to capture enough state to provide a response (with possible side effects of saving state in databases, logging details, etc).

These snippets of logic fit well into the ManagedFunction, with request scopes created for each continuation tree invoked from a ManagedObject. ManagedObjects are created in the application that are hooked into the network of continuations to ManagedFunctions. When the ManagedObject receives an event (HTTP request, queue message, etc), the following happens:
  1. The ManagedObject starts a new request scope
  2. The ManagedObject triggers the first continuation with the scope, which carries through for all further continuations triggered
  3. ManagedFunctions can now grab their required parameters from the scope
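
The steps above can be sketched in Java as follows. This is only a minimal illustration: the ScopedContinuation and Listener types are hypothetical, and the scope is modelled as a plain map rather than whatever the real implementation uses.

```java
import java.util.HashMap;
import java.util.Map;

final class RequestScopeTrigger {

    /** Hypothetical continuation that carries one argument plus the request scope. */
    interface ScopedContinuation {
        void invoke(Object argument, Map<String, Object> scope);
    }

    /** Hypothetical managed object (e.g. a socket listener) hooked to the first continuation. */
    static final class Listener {
        private final ScopedContinuation firstContinuation;

        Listener(ScopedContinuation firstContinuation) {
            this.firstContinuation = firstContinuation;
        }

        void onEvent(Object event) {
            // 1. Start a new request scope for this event
            Map<String, Object> scope = new HashMap<>();
            // 2. Trigger the first continuation, passing the scope along
            firstContinuation.invoke(event, scope);
        }
    }

    public static void main(String[] args) {
        // 3. A downstream procedure grabs what it needs from the scope
        ScopedContinuation respond = (argument, scope) ->
                System.out.println("Responding to " + scope.get("request"));

        // The first procedure stores the request in the scope, then continues
        ScopedContinuation route = (argument, scope) -> {
            scope.put("request", argument);
            respond.invoke(null, scope);
        };

        Listener listener = new Listener(route);
        listener.onEvent("GET /one");
        listener.onEvent("GET /two");
    }
}
```

Each event gets its own scope, so state from one request is not visible to another.
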
This can be taken further to include dependency injection. Rather than the ManagedFunction being responsible for managing the request scope, the request scope objects are provided through dependency injection. This is the following dependency context for the ManagedFunction:

  type ServiceLocator = String => Any

  class DependencyContext(val serviceLocator: ServiceLocator) {
    val objects = scala.collection.mutable.Map[String, Any]()

    def getObject(name: String) = {
      objects.get(name) match {
        case Some(obj) => obj
        case None => {
          val obj = serviceLocator(name)
          objects(name) = obj
          obj
        }
      }
    }
  }

A side benefit of providing the DependencyContext is that we can re-use existing Dependency Injection frameworks for managing objects. For example, the ServiceLocator can be a Spring BeanFactory. Furthermore, we can also dependency inject ManagedObject implementations to allow objects to maintain state, but also trigger continuations in the background (e.g. background polling for JWT key changes in providing JWT authentication state).

The ManagedFunction now becomes:

  type ContinuationFactory = DependencyContext => Continuation

  class ManagedFunction(
    val logic: Function,
    val parameterScopeNames: List[String],
    val continuations: Map[Any, ContinuationFactory]) {

    def obj(index: Int, context: DependencyContext): Any = context.getObject(parameterScopeNames(index))
    def cont(key: Any, context: DependencyContext): Continuation = continuations.get(key) match { case Some(factory) => factory(context) }

    def run(parameterFromContinuation: Any, context: DependencyContext) = {
      try {
        logic(Array(parameterFromContinuation, obj(1, context), obj(2, context), cont(1, context), cont(2, context)))
      } catch {
        case ex: Throwable => cont(ex.getClass(), context)(ex)
      }
    }
  }

To populate the scope names, we can again use reflection on the logic signature. However, rather than having to provide explicit configuration, we can use auto-wire configuration based on parameter type and possible qualifier. This then becomes normal dependency injection for constructors, except that we are injecting into the logic function.

We can now have database connections, HTTP clients, etc provided to the logic, however it does not answer the problem of passing state across the continuation boundaries.

To solve passing state, we just create a state object. This object acts much like a variable. Its value can be set and retrieved. However, this introduces mutability and timing concerns regarding the flow of continuations. It is unclear whether a ManagedFunction is only safely accessing the variable's value, or is unsafely mutating the variable. Therefore, for variables we provide additional support within the ManagedFunction to identify the use of the variable.

For variable state objects, we allow the ManagedFunction to use various interfaces to identify the nature of using the variable. This allows the following interfaces for a variable state:

  trait Out[T] { def set(value: T) }
  trait In[T] { def get(): T }
  trait Var[T] extends Out[T] with In[T]

ManagedFunctions can then use the appropriate interface to identify their intention on the state of the variable.

Note that it is now possible to traverse the graph from ManagedObject continuations to confirm variable state outputs of ManagedFunctions are always upstream of respective inputs. This creates an ability for compile safe state generation. Furthermore, if all objects loaded to scope variables are immutable it allows reasoning for identifying the ManagedFunction producing incorrect state (just look for the ManagedFunctions requiring the Out of the variable).
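
To make the upstream/downstream check concrete, here is a small Java sketch. It simplifies the graph of continuations to a single linear flow and describes each procedure only by the variables it sets (Out) and gets (In). The Procedure class and the unsatisfiedInputs method are hypothetical names for this illustration.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class VariableFlowCheck {

    /** Hypothetical summary of a procedure: the variables it sets (Out) and gets (In). */
    static final class Procedure {
        final String name;
        final List<String> outputs;
        final List<String> inputs;

        Procedure(String name, List<String> outputs, List<String> inputs) {
            this.name = name;
            this.outputs = outputs;
            this.inputs = inputs;
        }
    }

    /** Walks the flow in order and reports inputs read before any upstream procedure set them. */
    static List<String> unsatisfiedInputs(List<Procedure> flow) {
        Set<String> available = new HashSet<>();
        List<String> problems = new ArrayList<>();
        for (Procedure procedure : flow) {
            for (String input : procedure.inputs) {
                if (!available.contains(input)) {
                    problems.add(procedure.name + " reads " + input + " before it is set");
                }
            }
            // Outputs only become available to procedures further downstream
            available.addAll(procedure.outputs);
        }
        return problems;
    }

    public static void main(String[] args) {
        List<Procedure> flow = List.of(
                new Procedure("validate", List.of("requestIdentifier"), List.of()),
                new Procedure("registerRequest", List.of("weavedRequest"), List.of("requestIdentifier")),
                new Procedure("retrieveData", List.of(), List.of("weavedRequest")));
        System.out.println("Problems: " + unsatisfiedInputs(flow));
    }
}
```

A real flow branches, so the same idea would need to be applied along every path from a ManagedObject continuation rather than over one list.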

What this now also provides is multiple inputs and multiple outputs. Composition is no longer derived by output of one function being passed as input to the next function. State is maintained in scope with ManagedFunctions pulling/pushing state as appropriate to the scope. Continuations are now separated from having to be concerned with all the state needed to invoke a ManagedFunction.

Now, the above implementation assumes some order of parameters followed by continuations in invoking the logic. As this information is reflectively retrieved from the logic function, the order is not necessary. We can then have a ManagedFunction look as follows:

  class ManagedFunction(
    val logic: Function,
    val parameterScopeNames: List[String],
    val continuations: Map[Any, ContinuationFactory]) {

    def obj(index: Int, context: DependencyContext): Any = context.getObject(parameterScopeNames(index))
    def cont(key: Any, context: DependencyContext): Continuation = continuations.get(key) match { case Some(factory) => factory(context) }

    def run(parameterFromContinuation: Any, context: DependencyContext) = {
      var continuationIndex = 0
      var objectIndex = 0
      val arguments = extractParameterTypes(logic).map(_ match {
        case p if p.isAnnotationPresent(classOf[Parameter]) => parameterFromContinuation
        case c if classOf[Continuation].isAssignableFrom(c) => cont({ continuationIndex += 1; continuationIndex }, context)
        case _ => obj({ objectIndex += 1; objectIndex }, context)
      })
      try {
        logic(arguments)
      } catch {
        case ex: Throwable => cont(ex.getClass(), context)(ex)
      }
    }
  }

Notice that the return value from the function (logic) is no longer necessary. Hence why we're calling these "first class procedures".

This can then be represented as follows:

  class ManagedFunction(
    val procedure: Array[Any] => Unit,
    val parameterScopeNames: List[String],
    val continuations: Map[Any, ContinuationFactory]) {

    def obj(index: Int, context: DependencyContext): Any = context.getObject(parameterScopeNames(index))
    def cont(key: Any, context: DependencyContext): Continuation = continuations.get(key) match { case Some(factory) => factory(context) }

    def run(parameterFromContinuation: Any, context: DependencyContext): Unit = {
      var continuationIndex = 0
      var objectIndex = 0
      val arguments = extractParameterTypes(procedure).map(_ match {
        case p if p.isAnnotationPresent(classOf[Parameter]) => parameterFromContinuation
        case c if classOf[Continuation].isAssignableFrom(c) => cont({ continuationIndex += 1; continuationIndex }, context)
        case _ => obj({ objectIndex += 1; objectIndex }, context)
      })
      try {
        procedure(arguments)
      } catch {
        case ex: Throwable => cont(ex.getClass(), context)(ex)
      }
    }
  }

So we've provided composition of logic with state management, but we've not solved the original implicit thread problem that sparked this.

To solve specifying explicit threads, we need to implement the ExecutorLocator. This is achieved by looking at the parameter types of the function. As all state (objects) are now injected from the DependencyContext, we can determine the execution characteristics from the parameters. In other words, if the logic depends on a Database connection, it is likely to be making blocking calls. Therefore, we can use the parameter types to implement the ExecutorLocator:

  class ManagedFunction(
    val procedure: Array[Any] => Unit,
    val parameterScopeNames: List[String],
    val continuations: Map[Any, ContinuationFactory],
    val executorConfiguration: Map[Class[_], Executor]) {

    def obj(index: Int, context: DependencyContext): Any = context.getObject(parameterScopeNames(index))
    def cont(key: Any, context: DependencyContext): Continuation = continuations.get(key) match { case Some(factory) => factory(context) }
    def executorLocator(): Executor = {
      var executor: Executor = (logic, arguments) => logic(arguments) // default executor is synchronous (implicit thread)
      extractParameterTypes(procedure).map((parameterType) => executorConfiguration.get(parameterType) match {
        case Some(e) => { executor = e; e } // matched so override
        case None => executor
      })
      executor
    }

    def run(parameterFromContinuation: Any, context: DependencyContext): Unit = {
      var continuationIndex = 0
      var objectIndex = 0
      val arguments = extractParameterTypes(procedure).map(_ match {
        case p if p.isAnnotationPresent(classOf[Parameter]) => parameterFromContinuation
        case c if classOf[Continuation].isAssignableFrom(c) => cont({ continuationIndex += 1; continuationIndex }, context)
        case _ => obj({ objectIndex += 1; objectIndex }, context)
      })
      executorLocator()((arguments) => {
        try {
          procedure(arguments)
        } catch {
          case ex: Throwable => cont(ex.getClass(), context)(ex)
        }
      }, arguments)
    }
  }

This enables choice of Executor to be managed within configuration. This separates it from concerns of composition and state management.
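
The same idea can be sketched in Java using java.util.concurrent.Executor. This is an illustration under assumptions: the BlockingRepository and RequestRepository marker interfaces and the locate method are hypothetical, and this sketch returns the first matching executor whereas the code above lets a later match override an earlier one.

```java
import java.lang.reflect.Method;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class ExecutorSelection {

    /** Default: run on the calling thread (the implicit thread). */
    static final Executor SYNCHRONOUS = Runnable::run;

    /** Hypothetical marker for a dependency that makes blocking calls. */
    interface BlockingRepository { }

    /** Hypothetical concrete repository; matched through its super type. */
    interface RequestRepository extends BlockingRepository { }

    /** Picks the executor configured for the first parameter type that matches. */
    static Executor locate(Method procedure, Map<Class<?>, Executor> configuration) {
        for (Class<?> parameterType : procedure.getParameterTypes()) {
            for (Map.Entry<Class<?>, Executor> entry : configuration.entrySet()) {
                if (entry.getKey().isAssignableFrom(parameterType)) {
                    return entry.getValue();
                }
            }
        }
        return SYNCHRONOUS;
    }

    /** Example procedures: only the second depends on a repository. */
    static void quick(String request) { }

    static void blocking(String request, RequestRepository repository) { }

    /** Reflective lookup helper for the example procedures. */
    static Method method(String name, Class<?>... parameterTypes) {
        try {
            return ExecutorSelection.class.getDeclaredMethod(name, parameterTypes);
        } catch (NoSuchMethodException ex) {
            throw new IllegalStateException(ex);
        }
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            Map<Class<?>, Executor> configuration = new HashMap<>();
            configuration.put(BlockingRepository.class, pool);

            Method quick = method("quick", String.class);
            Method blocking = method("blocking", String.class, RequestRepository.class);
            System.out.println("quick uses pool: " + (locate(quick, configuration) == pool));
            System.out.println("blocking uses pool: " + (locate(blocking, configuration) == pool));
        } finally {
            pool.shutdown();
        }
    }
}
```

Because the mapping lives in configuration, swapping the pool for another Executor would not require touching either procedure.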

And now you are up to speed with the general concepts behind the first class procedure.

Though, do note that the actual implementation uses a lot more memoization, as function signatures are static allowing the reflection to be done at compile/startup time.
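
As a hint of what that memoization might look like, the following Java sketch caches the reflected parameter types per method so the reflection runs at most once per signature. The SignatureCache class and its counter are hypothetical and exist only to show the caching; the actual implementation is not shown in this article.

```java
import java.lang.reflect.Method;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

final class SignatureCache {

    private static final Map<Method, Class<?>[]> PARAMETER_TYPES = new ConcurrentHashMap<>();

    /** Counts how often reflection actually ran (for demonstration only). */
    static final AtomicInteger reflectionCount = new AtomicInteger();

    /** Returns the cached parameter types; callers must treat the array as read-only. */
    static Class<?>[] parameterTypes(Method procedure) {
        return PARAMETER_TYPES.computeIfAbsent(procedure, method -> {
            reflectionCount.incrementAndGet();
            return method.getParameterTypes();
        });
    }

    /** Example procedure whose signature is inspected. */
    static void sample(String name, int count) { }

    /** Reflective lookup helper for the example procedure. */
    static Method sampleMethod() {
        try {
            return SignatureCache.class.getDeclaredMethod("sample", String.class, int.class);
        } catch (NoSuchMethodException ex) {
            throw new IllegalStateException(ex);
        }
    }

    public static void main(String[] args) {
        parameterTypes(sampleMethod());
        parameterTypes(sampleMethod());
        System.out.println("Reflection ran " + reflectionCount.get() + " time(s)");
    }
}
```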

Furthermore, the overall effect is that higher order functions do not need to provide all arguments to call the function (procedure). The control is inverted so configuration and the procedure itself defines what is injected into it. Higher order composition need only use the continuations to invoke the first class procedures.

Plus, I find it lifts a constraint in functional programming of having to fit all results through the small keyhole of the function's return type. The return type of the function needs to provide success and error details, and is coupled to having to pass this through the chain of composed functions.  First class procedures, via variables, decouple this so any upstream procedure can output the value for any downstream procedure to consume.  Furthermore, checked exceptions continue error flows, which removes errors from function return types (output variable types).

There are also other concepts built on first class procedures, such as:
  • process, thread, function scoped dependency contexts for concurrency/parallel processing
  • higher order compositions (Sections)
  • thread affinity and other thread management (via Executive)
  • providing context for state, such as transactions (Governance)
  • additional ManagedFunctions inserted into flows similar to aspects (Administration)
However, this article is focused on the first class procedure and is already long enough.

So, in conclusion, the first class procedure is applying inversion of control to the function to inject in state, continuations and thread (via Executor). This means the first class procedure no longer requires composition via return values of the function. This makes it significantly easier to weave impure/pure functionality together. Furthermore, it allows execution strategies for the application to be configured in at deployment time.

And to see all this in action, please see the first article.

For more information, see this paper OfficeFloor: using office patterns to improve software design (free download here).

Saturday, 27 April 2019

OO Functional Imperative Reactive weaved together

This is the first of a two part article to discuss how different paradigms in programming can be weaved together seamlessly via the "First Class Procedure", a term I'm using to best describe the concept.

The working code in this article demonstrates how you can seamlessly weave together the following to service a request:
  1. Validate a request (on socket event loop thread).
  2. Start a transaction and register the request in the database.  This will be on another thread to avoid halting the socket event loop thread.
  3. Make reactive calls to pull in data from other services.
  4. Run some functional code to work out the standard deviation on service times.
  5. Undertake alternate flows to handle special cases (including handling exceptions).  Then if no exceptions causing rollback, store results in the database.  This again is on a different thread to not tie up the reactive event loop thread.
  6. Send the response after committing the transaction
This allows you to use the programming paradigm best suited to the various problems at hand.  Note the request servicing in the demonstration is arbitrary.  The focus is on showing how the various programming paradigms can be weaved together.

Now to write a complete description of how this works with the first class procedure is beyond a single article.  There are many patterns used together to enable the composition through first class procedures.  Therefore, I'm going to provide an introduction to first class procedures in two parts:
  • This article to demonstrate with working code how flexible and easy composition is with first class procedures
  • Next article to provide an explanation more closely aligned to the theory on how the first class procedure has evolved to its current understanding
We'll start with some simple examples and then get to the more interesting above case of weaving multiple programming paradigms together.

First class procedure

Simple event loop

The following first class procedure services a REST request.  This will be run on the HTTP socket event loop thread.

public void service(ObjectResponse<ServicedThreadResponse> response) {
    response.send(new ServicedThreadResponse(Thread.currentThread().getName(), "Event", System.currentTimeMillis()));
}

Simple thread-per-request

The following first class procedure services a REST request by pulling a value from the database and sending it in the response.  This will be run by a separate thread pool.

public void service(ServicedThreadRequest request, ThreadPerRequestRepository repository, ObjectResponse<ServicedThreadResponse> response) {
    int identifier = request.getIdentifier() % 10;
    ThreadPerRequest entity = repository.findById(identifier).get();
    response.send(new ServicedThreadResponse(Thread.currentThread().getName(), entity.getName(), System.currentTimeMillis()));
}


The distinction of thread to use will be discussed later.  However, for now notice that a Spring Repository is used by only the thread-per-request first class procedure.

First Class Procedures weaved together

Ok, the above is a little boring.  We've seen this in web application servers before.  Show us something interesting!

To show something more interesting we are going to weave first class procedures together to achieve the example detailed at the start of this article.

Each step in the request servicing is implemented as a first class procedure.  We'll address each first class procedure in the order specified.

Validate request (on socket event loop)

This is simple validation that the request is correct.  As it is straightforward logic, we use the thread of the socket event loop.  This way we don't pay the overhead of a thread context switch just to reject invalid requests.  The code is as follows:

const HttpException = Java.type("net.officefloor.server.http.HttpException");
const Integer = Java.type("java.lang.Integer")

function validate(identifier, requestIdentifier) {
    if (Number(identifier) <= 0) {
        throw new HttpException(422, "Invalid identifier");
    }
    requestIdentifier.set(Integer.valueOf(identifier))
}
validate.officefloor = [ 
    { httpPathParameter: "identifier" },
    { out: Integer },
    { next : "valid" }
];

Note that the validation is written in JavaScript.  This is so that the client side JavaScript validation rules can be re-used to validate requests to ensure consistency between client and server.

The officefloor attribute added to the function provides meta-data.  This is necessary, as JavaScript does not provide the strongly typed information required of first class procedures.

Imperative to register request in database

After validation, the request identifier is registered in the database.  This also creates a unique number for the request based on an IDENTITY column in the database.

@Next("registered")
public static void registerRequest(@Val int requestIdentifier, WeavedRequestRepository repository, Out<WeavedRequest> weavedRequest) {
    WeavedRequest entity = new WeavedRequest(requestIdentifier);
    repository.save(entity);
    weavedRequest.set(entity);
}

Reactive

The next is some Reactive code to concurrently call the two REST end points detailed at the start of this article (simple event loop and simple thread-per-request).   Because we are using Reactive we can call them concurrently to improve performance.

Note that while waiting on the responses, the flow is effectively idle with threads servicing other functionality.  This is asynchronous handling so that threads are not tied up waiting.   Once both sets of results come back, they notify the respective asynchronous flow to continue processing.

private final static String URL = "http://localhost:7878/{path}";

@Next("useData")
public static void retrieveData(WebClient client,
        AsynchronousFlow eventLoopFlow, @EventLoopResponse Out<ServicedThreadResponse[]> eventLoopResponse,
        @Val WeavedRequest request, AsynchronousFlow threadPerRequestFlow, @ThreadPerRequestResponse Out<ServicedThreadResponse[]> threadPerRequestResponse) {

    Flux.range(1, 10)
        .map((index) -> client.get().uri(URL, "event-loop").retrieve().bodyToMono(ServicedThreadResponse.class))
        .flatMap((response) -> response).collectList().subscribe((responses) -> eventLoopFlow.complete(
            () -> eventLoopResponse.set(responses.stream().toArray(ServicedThreadResponse[]::new))));

    Flux.range(1, 10)
        .map((index) -> client.post().uri(URL, "thread-per-request").contentType(MediaType.APPLICATION_JSON)
            .syncBody(new ServicedThreadRequest(request.getId())).retrieve()
            .bodyToMono(ServicedThreadResponse.class))
        .flatMap((response) -> response).collectList().subscribe((responses) -> threadPerRequestFlow.complete(
            () -> threadPerRequestResponse.set(responses.stream().toArray(ServicedThreadResponse[]::new))));
}
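
To picture the "continue once both sets of results are back" behaviour without Reactor or OfficeFloor on the classpath, here is a small Java sketch. It swaps in the JDK's CompletableFuture as a stand-in for the two asynchronous calls, so it illustrates the joining idea only and is not how AsynchronousFlow is implemented. The whenBoth helper is a hypothetical name.

```java
import java.util.concurrent.CompletableFuture;
import java.util.function.BiConsumer;

final class JoinAsynchronousResults {

    /** Invokes next once both results are available; no thread blocks while waiting. */
    static <A, B> CompletableFuture<Void> whenBoth(
            CompletableFuture<A> first, CompletableFuture<B> second, BiConsumer<A, B> next) {
        return first.thenAcceptBoth(second, next);
    }

    public static void main(String[] args) {
        CompletableFuture<String> eventLoop = new CompletableFuture<>();
        CompletableFuture<String> threadPerRequest = new CompletableFuture<>();

        whenBoth(eventLoop, threadPerRequest, (eventLoopResult, threadPerRequestResult) ->
                System.out.println("Continuing with " + eventLoopResult + " and " + threadPerRequestResult));

        // Nothing is printed until the second result arrives
        eventLoop.complete("event-loop responses");
        threadPerRequest.complete("thread-per-request responses");
    }
}
```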

By now you may be noticing the Out/@Val combinations.  This is how values can be passed from one first class procedure to another first class procedure.  Note that if the type of different values is the same, a qualifier can be used to distinguish them.  The rest of the arguments are provided from dependency injection (in this case Spring).

Functional

Next the reactive responses are provided to Scala functional code to determine the standard deviation of service times.

def mean(timestamps: Iterable[Long]): Double = timestamps.sum.toDouble / timestamps.size

def variance(timestamps: Iterable[Long]): Double = {
    val avg = mean(timestamps)
    timestamps.map(timestamp => math.pow(timestamp.toDouble - avg, 2)).sum / timestamps.size
}

def stdDev(timestamps: Iterable[Long]): Double = math.sqrt(variance(timestamps))

@Next("use")
def standardDeviation(@EventLoopResponse @Val eventLoopResponses: Array[ServicedThreadResponse], @ThreadPerRequestResponse @Val threadPerRequestResponses: Array[ServicedThreadResponse]): Double =
    stdDev((eventLoopResponses ++ threadPerRequestResponses).map(response => response.getTimestamp))

Note that a library could be used to reduce this code.  However, we've done this to demonstrate how functional code can be integrated into first class procedures.
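
For readers who want to sanity check the calculation, the following Java sketch computes the same population standard deviation (dividing by the number of values, as the Scala code does) over a small worked example. The sample values are made up for illustration and are not real service timestamps. Their mean is 5, the squared differences sum to 32, the variance is 32 / 8 = 4, and so the standard deviation is 2.

```java
import java.util.Arrays;

final class StandardDeviationCheck {

    static double mean(long[] values) {
        return Arrays.stream(values).average().orElse(Double.NaN);
    }

    /** Population variance: mean of the squared differences from the mean. */
    static double variance(long[] values) {
        double avg = mean(values);
        return Arrays.stream(values).mapToDouble(value -> Math.pow(value - avg, 2)).sum() / values.length;
    }

    static double stdDev(long[] values) {
        return Math.sqrt(variance(values));
    }

    public static void main(String[] args) {
        long[] sample = { 2, 4, 4, 4, 5, 5, 7, 9 };
        System.out.println("Standard deviation: " + stdDev(sample));
    }
}
```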

Flow control

The next first class procedure triggers a flow to handle special cases.  Should there be no issues with the special cases, then it stores the standard deviation in the database.

@FlowInterface
public static interface Flows {
    void handleSpecialCases(FlowSuccessful callback);
    void stored();
}

public static void store(@Parameter double standardDeviation, Flows flows, @Val WeavedRequest request, WeavedRequestRepository repository, Out<RequestStandardDeviation> stDevOut) {
    flows.handleSpecialCases(() -> {
        request.setRequestStandardDeviation(new RequestStandardDeviation(standardDeviation, request));
        repository.save(request);
        stDevOut.set(request.getRequestStandardDeviation());
        flows.stored();
    });
}

The handling of the special cases is by the following first class procedure.

public static void handleSpecialCase(@Val WeavedRequest request) throws WeavedRollbackException, WeavedCommitException {
    switch (request.getRequestIdentifier()) {
        case 3:
            throw new WeavedRollbackException(request);
        case 4:
            throw new WeavedCommitException(request);
    }
}

Touch of exception handling

The two exception handling first class procedures are as follows.

public static void handle(@Parameter WeavedRollbackException exception, ObjectResponse<WeavedErrorResponse> response) {
    WeavedRequest request = exception.getWeavedRequest();
    response.send(new WeavedErrorResponse(request.getRequestIdentifier(), request.getId()));
}

public static void handle(@Parameter WeavedCommitException exception, WeavedRequestRepository repository, ObjectResponse<WeavedErrorResponse> response) {
    WeavedRequest request = exception.getWeavedRequest();
    request.setWeavedError(new WeavedError("Request Identifier (" + request.getRequestIdentifier() + ") is special case", request));
    repository.save(request);
    response.send(new WeavedErrorResponse(request.getRequestIdentifier(), request.getId()));
}

The second handler works within the transaction, so includes further data stored in the database.

Note that due to first class procedure composition not requiring the caller to catch exceptions, checked exceptions are embraced.  We consider checked exceptions very useful information in flow composition.  However, the distinction is that it should not be the caller's concern but rather the flow's concern.  To me this is a big difference and stops the catch and log exception handling problem.   Exception handling is now a separate concern that can be coded in afterwards.
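
The idea of the flow, rather than the caller, owning exception handling can be sketched in plain Java as follows. This is an illustration only: the ExceptionFlow class, its Procedure interface and the handle method are hypothetical, and handlers are matched on the exact exception class.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Consumer;

final class ExceptionFlow {

    /** Hypothetical procedure that may throw checked exceptions. */
    interface Procedure {
        void run() throws Exception;
    }

    private final Map<Class<?>, Consumer<Exception>> handlers = new HashMap<>();

    /** Configures the handling procedure for one exception type. */
    <E extends Exception> ExceptionFlow handle(Class<E> type, Consumer<E> handler) {
        handlers.put(type, exception -> handler.accept(type.cast(exception)));
        return this;
    }

    /** Runs the procedure; any exception is routed by its type to the configured handler. */
    void run(Procedure procedure) {
        try {
            procedure.run();
        } catch (Exception ex) {
            Consumer<Exception> handler = handlers.get(ex.getClass());
            if (handler == null) {
                throw new IllegalStateException("No handler configured for " + ex.getClass().getName(), ex);
            }
            handler.accept(ex);
        }
    }

    public static void main(String[] args) {
        ExceptionFlow flow = new ExceptionFlow()
                .handle(IOException.class, ex -> System.out.println("Handled: " + ex.getMessage()));

        // The procedure just throws; it neither catches nor knows who handles the failure
        flow.run(() -> {
            throw new IOException("special case");
        });
    }
}
```

The throwing procedure stays free of try/catch blocks, and the handlers can be added or changed later without touching it.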

Successful response

On successful storage of the request details in the database, the following first class procedure sends the response.

public void send(@Val WeavedRequest request, @Val RequestStandardDeviation standardDeviation, @EventLoopResponse @Val ServicedThreadResponse[] eventLoopResponse,
        @ThreadPerRequestResponse @Val ServicedThreadResponse[] threadPerRequestResponse, ObjectResponse<WeavedResponse> response) {
    response.send(new WeavedResponse(request.getRequestIdentifier(), request.getId(), eventLoopResponse, threadPerRequestResponse, standardDeviation.getStandardDeviation()));
}

Kotlin for some OO

Oh, and just for a little bit more polyglot fun, the OO objects used to represent the JSON request/responses are the following.

@HttpObject
data class ServicedThreadRequest(val identifier: Int)

data class ServicedThreadResponse(val threadName: String, val lookupName: String, val timestamp: Long)

data class WeavedErrorResponse(val requestIdentifier: Int, val requestNumber: Int)

data class WeavedResponse(val requestIdentifier: Int
        , val requestNumber: Int
        , val eventLoopResponses: Array<ServicedThreadResponse>
        , val threadPerRequestResponses: Array<ServicedThreadResponse>
        , val standardDeviation: Double)

Proving it works

The following is a test to confirm the flow of first class procedures services the request.

public static final SpringRule spring = new SpringRule();

public static final OfficeFloorRule officeFloor = new OfficeFloorRule();

@ClassRule
public static final RuleChain ordered = RuleChain.outerRule(spring).around(officeFloor);

@Rule
public final HttpClientRule client = new HttpClientRule();

private static final ObjectMapper mapper = new ObjectMapper();
static {
    mapper.registerModule(new KotlinModule());
}

@Test
public void confirmWeavedTogether() throws Exception {
    HttpResponse response = this.client.execute(new HttpPost(this.client.url("/weave/1")));
    assertEquals("Should be successful", 200, response.getStatusLine().getStatusCode());
    WeavedResponse body = mapper.readValue(EntityUtils.toString(response.getEntity()), WeavedResponse.class);
    WeavedRequest entity = spring.getBean(WeavedRequestRepository.class).findById(body.getRequestNumber()).get();
    assertNotNull("Should have standard deviation stored", entity.getRequestStandardDeviation());
}

Weaving together

The following diagram is the configuration to weave the above first class procedures together.


This is the only configuration/code necessary to compose the first class procedures together.  Notice the names represent the first class procedure names and their respective meta-data.

What does this mean?  Check the port on all the calls and tests.  Yes, everything you see above is running off the one port.  Yep, you don't have to choose between a framework that provides only thread-per-request or single threaded event loops.  This is because of the execution strategy provided by Thread Injection of first class procedures.

Thread Injection

The threading configuration is actually the following:

<teams>
    <team source="net.officefloor.frame.impl.spi.team.ExecutorCachedTeamSource" type="org.springframework.data.repository.CrudRepository" />
</teams>

Here we flag all procedures requiring a Spring Repository to be executed by a thread pool.  Remember I said to keep note of the use of the Spring Repository.  Well, the above configuration has any first class procedure requiring a Spring Repository executed by the configured thread pool.  Note that thread pools are named teams, due to the modeling origins of first class procedures coming from Offices.

Therefore, looking at the flow again, the thread execution is as follows:
  1. Validate uses the thread of the socket listener event loop
  2. Register request uses a Spring Repository, so execution is swapped to a thread from the configured thread pool
  3. This thread carries on to trigger the asynchronous reactive calls
  4. The reactive event loop thread then invokes the callbacks.  As the Scala code is quick to execute, the reactive event loop thread carries on to execute the Scala pure function.  Here it is deemed that a thread context switch is too much overhead, and it is more efficient to just invoke the highly optimised Scala pure function.  However, if we want to separate the Scala function to a different thread pool, we can configure in a different thread pool (typically via a marker dependency on the first class procedure).
  5. The remaining imperative code has a switch back to a thread from the configured thread pool, as it depends on a Spring Repository.  Furthermore, the thread locals between the threads are propagated to each used thread, so the Spring Repository transaction is not lost (i.e. the transaction is active for all first class procedures within the transaction bounds).
  6. Response is then sent.
Now all the above is configurable via Thread Injection.  If we have, for example, more than one synchronous data store, we can create a thread pool to interact with each data store to avoid one slow data store tying up all threads of the application.
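
To make the idea of picking a thread pool from a procedure's dependencies a little more concrete, here is a minimal sketch in plain Java.  It is not OfficeFloor's implementation or API.  The names ThreadInjectionSketch, BlockingRepository, CustomerRepository and selectTeam are all hypothetical, and BlockingRepository merely stands in for the Spring Repository marker type in the configuration above.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class ThreadInjectionSketch {

    /** Hypothetical stand-in for a blocking dependency type such as a Spring Repository. */
    interface BlockingRepository {}

    /** Hypothetical concrete repository that a procedure might declare as a parameter. */
    interface CustomerRepository extends BlockingRepository {}

    /** Name used when no team matches and the procedure stays on the calling thread. */
    static final String CALLER_THREAD = "caller";

    /**
     * Returns the name of the first team whose marker type matches one of the
     * procedure's declared dependency types, or CALLER_THREAD if none match.
     */
    static String selectTeam(List<Class<?>> dependencyTypes, Map<Class<?>, String> teamsByMarkerType) {
        for (Map.Entry<Class<?>, String> team : teamsByMarkerType.entrySet()) {
            for (Class<?> dependencyType : dependencyTypes) {
                if (team.getKey().isAssignableFrom(dependencyType)) {
                    return team.getValue();
                }
            }
        }
        return CALLER_THREAD;
    }

    public static void main(String[] args) throws Exception {
        // Configuration: anything depending on a BlockingRepository runs on "repository-pool"
        Map<Class<?>, String> teams = new LinkedHashMap<>();
        teams.put(BlockingRepository.class, "repository-pool");

        ExecutorService repositoryPool = Executors.newCachedThreadPool();
        try {
            String team = selectTeam(List.of(CustomerRepository.class), teams);
            Runnable procedure = () -> System.out.println("running on team: " + team);
            if (CALLER_THREAD.equals(team)) {
                procedure.run(); // no matching team, so no thread context switch
            } else {
                repositoryPool.submit(procedure).get(); // blocking dependency, so swap threads
            }
        } finally {
            repositoryPool.shutdown();
        }
    }
}
```

The point of the sketch is that the procedure itself never names a thread pool.  It only declares what it depends on, and the mapping from dependency type to pool lives in configuration that can change per environment.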

This also means you can configure different threading for different environments without having to change any code.

Disclaimer

In a real world application, I would try to avoid using so many of the above programming languages together.  I'd try to streamline them to just a couple, to avoid too many skill sets driving up the maintenance costs of your application (plus it reduces problems with mixed compiling).  This is only a demonstration of how OO, Functional, Imperative and Reactive code can all be woven together with first class procedures.  Furthermore, it demonstrates how you can write concrete solutions before abstracting.

Also, as you can see we've had to cover a lot of breadth in each programming paradigm.  If the code is not a good representation of the paradigm, we're very happy to take feedback on improvements from those more acquainted with a particular paradigm.

And if we've missed an important paradigm, please let me know so we can consider including it.  When it comes to coding we appreciate diversity to give developers choice.  We're trying to tear down fences between the paradigms to have one big happy coding family.

Summary

We've demonstrated how the first class procedure can weave together polyglot code written in different paradigms to service a request.  The code outlined above in the article is all the code required for the application.  There is no further weaving code required.

Furthermore, to avoid the problem of "it only works on my machine" (in this article), the code for the above is available here.  See the readme on how to run it.

For more understanding of what's going on, see the tutorials, my other articles and in particular my next article.

Monday, 8 April 2019

IT "Floor Plan"

OfficeFloor's graphical configuration is inspired by the construction industry's floor plan.

Now, we did consider auto-wiring flows together based on parameter types between the first class procedures.  This is still possible if there is interest.  However, we went graphical for another reason that has less to do with the technical problems of the system but more in helping build and maintain the correct system.

This inspiration came from having been around house construction with my father and later when building my own house. Building a house is definitely not simply putting a slab down and walls up with a roof, possibly with a door to get in and out.   There is a lot of engineering going into houses to ensure the foundations are correct, the house is not going to fall over and even on the more advanced side of ensuring it fits with the home owner's style of living.

Now, I certainly claim no expertise in the house construction industry, but they have a tool that made understanding and reasoning about the construction easier:

Floor Plan

The floor plan to me was a powerful communication tool within the home construction industry.   Now I had some background in construction, but engineering concerns such as house footings, load bearing walls, etc are beyond me.  However, around the floor plan we could discuss how the house was to be built.

Reflecting on my experience of building a house, I could see this discussion working even for those without a background in construction.  One does not need to be too technical to follow the room layouts and be able to visualise how the house is going to turn out.  Then when asking for something difficult (such as moving a load bearing wall), it is reasonably easy for the construction company to explain using the floor plan why that would incur extra costs.  Plus this same conversation can be had when renovating the house - something in IT we continually do by enhancing systems.

Now in this relationship, I saw myself as what the business is to information technology.   I knew what I wanted out of the house (IT system) and the floor plan helped the communication with the (IT) construction company.  I saw the home construction industry having similar problems of non-technical customers wanting something technical, and solving this with the floor plan.

So it had me ask the question:

What do we produce in the information technology space to be the "floor plan" of IT systems?

I believe we've tried many approaches, but I find they get lost in the complex, intangible details of IT systems. We've tried model driven designs specific to certain domains with some success, however these become difficult to maintain and enhance.  We've used many modelling techniques such as UML and lots of others in the business analysis space, but these are divorced from the code and quickly go stale (especially with the Agile approach of building a room at a time).  We can even incorporate functional programming as a means to demonstrate how code is composed together.   However, I as a customer would not be expected to understand the engineering mathematics of construction, so why should I be subjected to computing maths?

The best I've seen is story boards, but these do not convey the complexity of the internals of systems.  All too often I hear, "it's only a button, why does it cost so much?"   Now if we could bring out the "floor plan" of the internals of the IT system, we could have a conversation around which aspects of the system need changing.  For example, that's load bearing and impacts these rooms (components) of the house (system), so the change is going to be expensive.

Now when I refer to the internals of the system, I'm also not meaning front-end applications that only use the server to store data in databases.  For many of these applications, most things are quite tangible in screens to the user.  What I'm referring to is more complex server side components servicing the requests that are quite intangible to the user.  In other words, the details necessary to explain why adding this new button is 10 times more costly than the button added last week.

Furthermore, I tend to find most approaches are focused on one-way communication of capturing requirements to convey to technical individuals to build.  There is little in going the other way.  In other words, in providing understandable information back to the users on why some changes are harder than others.

This is even evident in Business Process Model and Notation (BPMN) in it becoming an executable specification.   This has been great to model business processes for technical people to implement systems.  In doing so, I'm finding it starting to incorporate the technical details of UML.  This starts becoming very technical and makes it difficult for non-technical individuals to pick up.

So I reflect on how I convey this information to users, and well the whiteboard with boxes and lines is something I keep drawing.  I tend to find drawing boxes representing things with lines indicating flow provides enough visualisation to have a discussion in 5 minutes about why something is easier or harder to change.

So is there a way to extract the boxes and lines out of the system for this conversation with the user?

Now to get technical for a moment, OfficeFloor allows composition through continuations.  For those not familiar with continuations, you can skip the rest of this paragraph and just know that continuations allow connecting code flow by drawing lines between procedures.  For those wanting a little more detail, the state passed by the continuation in OfficeFloor is decoupled from the continuation (state is Dependency Injected into the procedure from Dependency Contexts).   This allows the first class procedures of OfficeFloor to be composed together separate from state (Dependency Injection) and execution strategies (Thread Injection).  Therefore, OfficeFloor composition is merely configuring the output continuations of a procedure to the appropriate handling procedure.  This can be represented as a graph of nodes (procedures) with connectors (continuations) and directed edges (mapping configuration of continuation to procedure) - hence prime for graphical configuration.
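
As a rough illustration of that graph, the following plain Java sketch keeps procedures (nodes) and the mapping of their named outputs to handling procedures (directed edges) in two separate maps.  This is only my simplified model for discussion, not how OfficeFloor stores or executes its configuration.  FlowGraphSketch and the procedure and output names are hypothetical, and state is left out of the links entirely (a real procedure would have it injected).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;

final class FlowGraphSketch {

    // Nodes: procedure name -> body.  The body returns the name of the output
    // (continuation) it triggers, or null when the flow ends at this procedure.
    private final Map<String, Supplier<String>> procedures = new HashMap<>();

    // Directed edges: "procedure.output" -> name of the handling procedure.
    private final Map<String, String> links = new HashMap<>();

    void procedure(String name, Supplier<String> body) {
        procedures.put(name, body);
    }

    void link(String from, String output, String to) {
        links.put(from + "." + output, to);
    }

    /** Follows the configured edges from the start procedure, returning the procedures run in order. */
    List<String> run(String start) {
        List<String> executed = new ArrayList<>();
        String current = start;
        while (current != null) {
            Supplier<String> body = procedures.get(current);
            if (body == null) {
                throw new IllegalStateException("Unknown procedure: " + current);
            }
            executed.add(current);
            String output = body.get();
            // An output with no configured link simply ends the flow
            current = (output == null) ? null : links.get(current + "." + output);
        }
        return executed;
    }

    /** Builds a small hypothetical flow and runs it. */
    static List<String> demo(boolean requestIsValid) {
        FlowGraphSketch flow = new FlowGraphSketch();
        flow.procedure("validate", () -> requestIsValid ? "valid" : "invalid");
        flow.procedure("register", () -> "done");
        flow.procedure("reject", () -> null);
        flow.procedure("respond", () -> null);
        flow.link("validate", "valid", "register");
        flow.link("validate", "invalid", "reject");
        flow.link("register", "done", "respond");
        return flow.run("validate");
    }

    public static void main(String[] args) {
        System.out.println(demo(true));  // prints [validate, register, respond]
        System.out.println(demo(false)); // prints [validate, reject]
    }
}
```

Because the links are just data held apart from the procedure bodies, they can be drawn as boxes and lines and changed without touching the procedures.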

So armed with first class procedures and means to draw lines between them to control program flow, we have the opportunity to represent the application flow composition as boxes and lines.

Therefore, when asked why OfficeFloor uses graphical configuration, the answer is that it just made sense for ease of communicating the system's "floor plan".   This makes it easy for users to understand the system.  It also makes it very easy for new developers to pick up the system.  This is especially useful when having to work between many micro-services.

Therefore, OfficeFloor uses its Continuation Injection graphical configuration as the "floor plan" of how the IT system is constructed, to make communication between user and construction company easier.

Monday, 11 March 2019

Perils of opinionated frameworks, like Spring Boot. Inverting for opinionated code.

We developers like abstraction.  Without it, we could not build applications.  Our programming disciplines even require that we code to abstractions and avoid coupling our code to detailed implementations.

However, what are the right abstractions for your application?

Sadly, the choice of abstractions really comes from our choice of framework.  Frameworks are basically abstract solutions that we extend to solve our problem.

Unfortunately frameworks, like Spring Boot, come opinionated about the threading models you use, interfaces you need to extend, possibly the data repositories applicable and various other assumptions about your problem space.  That's a lot of restrictions before I've even written my first line of code.

What we really want to do is explore the problem space first.  This is what test driven design is all about.   We write tests to define what is successful code.  Then we implement code to pass those tests.  As we go along writing tests to cover off requirements, we subsequently churn out working code for the application.  In time we get enough working code to release as the application.

So this leads me to ask, when do we test the choice of framework?

Opinionated frameworks force abstractions too early in the development process

Well, I guess we pay very experienced senior people to make this choice.  So this choice must be correct.  It would not be for reasons like:
  • I (or our company) only know this framework, so we are using it
  • New shiny with lots of buzz words, we must use it
  • My CV's a little old, let's try something new
  • This one is cheaper
  • Architecture believed what it says on the tin
Regardless of the reason, the only way to test the framework choice is to build the application with it.  And just for those of you who like opinionated frameworks (like Spring Boot), please tell me you write the most risky aspects first.  This is so you can quickly discover if the framework's opinions match with your problem.

Sadly, even if you test with the most risky aspects, finding out the framework decision is wrong can lead to a lot of wasted code.  This arguably wastes a lot of money for the business and can lead to failing projects.

For example, say we choose Spring Reactive.  Yay, we can make concurrent asynchronous calls out to various micro-services.  We can also use the latest in NoSQL data stores.  This was all a great decision.   However, over time we realise we have a small amount of data where integrity of the data is very important.  We find we want to use a relational database to solve this, and then incorporate JPA on this database for easier interaction.  However, our choice of Spring Reactive has disallowed this because it requires all I/O to be asynchronous (JPA makes synchronous database calls).  Ok, yes, we can use Schedulers, but I seem to be continually doing workarounds for lack of transactions.  The data consistency issues are starting to mount up and we're missing deadlines.  I'm now in the position of asking: do I throw out all the Reactive code, or do I keep making workarounds hoping it might all hang together?  I definitely need to swap jobs before this hits production and we start supporting it.  In my next job, I've learnt to use Spring Servlets for this type of problem.

The flip side of this could just as easily be the case.  We start out wanting Spring Servlet for JPA interaction with a database.   However, over time we realise the database interaction is mostly read-only.  What we really wanted was asynchronous I/O from Spring Reactive to collect data from multiple micro-services and data stores concurrently.  Unfortunately, with our up front Spring Servlet choice, the data collection is just too slow.  Our workaround is to use async Servlets and spawn threads to make concurrent requests.  This worked initially, but over time the load increased.  This significantly increased thread counts, resulting in thread scheduling starvation, which resulted in timeouts.  I've really got no way to fix this without significant rewrites of the application.  In my next job, I've learnt to use Spring Reactive for this type of problem.

So can we look to test the framework without having to throw out all our code?

Inverting framework control

Dependency Injection went a long way in inverting control.   When I write my Servlet handling method, I no longer need to pass in all my dependent objects.  I would define dependencies, via @Inject, to have the framework make them available.  The framework, subsequently, no longer dictates what objects my implementation can depend on.

However, there is a lot more to a framework than just the objects.  Frameworks will impose some threading model and require me to extend certain methods.  While dependency injection provides references to objects, the framework still has to call the methods on the objects to do anything useful.  For example, Spring goes a long way to make the methods flexible, but still couples you to Reactive or Servlet coding by the required return type from the method.

As I need the Spring framework to undertake Dependency Injection for my tests, I'm coupled to the particular Spring Servlet/Reactive abstractions before I even write my first line of code.  An upfront choice that could be quite costly to change if I get it wrong!

What I really want to do is:
  1. Write tests for my implementations (as we are always test driven, of course)
  2. Write my implementations
  3. Wire up my implementations together to become the application
Well, the first two are very simple:
  1. Write tests calling a method passing in mock objects
  2. Write implementation of the method to pass the test
The last becomes very hard.  The reason the last becomes very hard is there is no consistent way to call every method.  Methods have different names, different parameters, different exceptions, possibly different threading requirements and different return types.  What we need is some facade over the methods to make them appear the same.

The Inversion of (Coupling) Control (IoC) provides this facade over the method via the ManagedFunction.  The ManagedFunction interface does not indicate what thread to use, what parameters/return types are required, nor what exceptions may be thrown.   This is all specified by the contained method implementation.  The coupling is inverted so the implementation specifies what it requires.
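
To show the kind of facade I mean, here is a minimal sketch using plain Java reflection.  It is not the actual ManagedFunction interface.  UniformFunction, wrap, Greeter and demo are hypothetical names for this illustration only, and parameters are resolved simply by their declared type.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.util.HashMap;
import java.util.Map;

final class FacadeSketch {

    /** Hypothetical uniform facade: every wrapped method is invoked the same way. */
    interface UniformFunction {
        Object execute(Map<Class<?>, Object> dependencies) throws Exception;
    }

    /** Wraps the named method so its own signature decides which dependencies are required. */
    static UniformFunction wrap(Object target, String methodName) {
        for (Method method : target.getClass().getDeclaredMethods()) {
            if (method.getName().equals(methodName)) {
                method.setAccessible(true);
                return dependencies -> {
                    Class<?>[] types = method.getParameterTypes();
                    Object[] arguments = new Object[types.length];
                    for (int i = 0; i < types.length; i++) {
                        arguments[i] = dependencies.get(types[i]);
                        if (arguments[i] == null) {
                            throw new IllegalStateException("No dependency of type " + types[i].getName());
                        }
                    }
                    try {
                        return method.invoke(target, arguments);
                    } catch (InvocationTargetException ex) {
                        // Surface the method's own exception rather than the reflection wrapper
                        if (ex.getCause() instanceof Exception) {
                            throw (Exception) ex.getCause();
                        }
                        throw ex;
                    }
                };
            }
        }
        throw new IllegalArgumentException("No method named " + methodName);
    }

    /** Hypothetical implementation with its own name, parameters and return type. */
    static final class Greeter {
        String greet(String name, Integer times) {
            return ("Hello " + name + " ").repeat(times).trim();
        }
    }

    /** Invokes the wrapped method through the facade with dependencies supplied by type. */
    static String demo() {
        Map<Class<?>, Object> dependencies = new HashMap<>();
        dependencies.put(String.class, "Ann");
        dependencies.put(Integer.class, 2);
        try {
            return (String) wrap(new Greeter(), "greet").execute(dependencies);
        } catch (Exception ex) {
            throw new IllegalStateException(ex);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints Hello Ann Hello Ann
    }
}
```

The caller only ever sees execute(...).  The parameter list of greet(...) is what decides which dependencies must be supplied, which is the inversion being described.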

This inversion of coupling allows framework decisions to be deferred.  As I can have all my methods invoked in a consistent way, I can go ahead and start writing implementations.   These implementations may require Reactive coding to undertake asynchronous calls out to different micro-services.  Some of these implementations may require using JPA to write to relational databases.  I really should not care at the start of building the system.  I'm tackling the concrete problems to gain a better understanding of the real problem space.  I know my methods can be invoked by the framework via wrapping them in a ManagedFunction.  We can deal with determining the right framework later on, once we know more.

Actually, this is allowing the implementations to choose the appropriate abstractions to be provided by the framework. My implementations define what objects they require, what other methods they require calling and what thread models they will require. The implementations are, effectively, defining what abstractions are required from the framework.

Therefore, it is no longer the framework being opinionated.  It is your developer code that is allowed to be opinionated.

This then allows your implementations to be opinionated about the most appropriate framework to use.  No longer do you have to guess the framework based on vague understanding of the problem space.   You can see what abstractions your implementations require and make a more informed choice of framework.

In effect, IoC has deferred choice of the framework to much later in the development process.  This is so you can make the decision much more confidently.  And isn't this what Agile says: defer the commitment until the last responsible moment?

Summary

In summary, why be forced to make too many up front decisions about your application?  In choosing the framework, you are making some significant choices in solving your problem space.  As frameworks are opinionated, they impose a lot of coupling on your solution.

Rather, why can't I just start writing solutions to concrete problems and worry about how they fit together later on?  This allows me to make choices regarding the appropriate abstractions (and subsequently framework) when I know a lot more about the problem space.

Inversion of (Coupling) Control gives this ability to defer abstraction and framework choices to much later in the development process, when you are more informed to make the decision correctly.