Monday, 8 April 2019

IT "Floor Plan"

OfficeFloor's graphical configuration is inspired from the construction industry's floor plan.

Now, we did consider auto-wiring flows together based on parameter types between the first class procedures.  This is still possible if there is interest.  However, we went graphical for another reason that has less to do with the technical problems of the system but more in helping build and maintain the correct system.

This inspiration came from having been around house construction with my father and later when building my own house. Building a house is definitely not simply putting a slab down and walls up with a roof, possibly with a door to get in and out.   There is a lot of engineering going into houses to ensure the foundations are correct, the house is not going to fall over and even on the more advanced side of ensuring it fits with the home owner's style of living.

Now, I certainly claim no expertise in the house construction industry but they a have tool that made understanding and reasoning about the construction easier:

Floor Plan

The floor plan to me was a powerful communication tool within the home construction industry.   Now I had some background in construction, but engineering concerns such as house footings, load bearing walls, etc are beyond me.  However, around the floor plan we could discuss how the house was to be built.

Reflecting on my experience of building a house, I could see this discussion working even for those that had not a background around construction.  One does not need to be too technical to follow the room layouts and be able to visualise how the house is going to turn out.  Then when asking for something difficult (such as moving a load bearing wall), it is reasonably easy for the construction company to explain using the floor plan why that would incur extra costs.  Plus this same conversation can be had when renovating the house - something in IT we continually do by enhancing systems.

Now in this relationship, I saw myself as the business is to information technology.   I knew what I wanted out of the house (IT system) and the floor plan helped the communication with the (IT) construction company.  I saw the home construction industry having similar problems of non-technical customers wanting something technical, and solved this with the floor plan.

So it had me ask the question:

What do we produce in the information technology space to be the "floor plan" of IT systems?

I believe we've tried many approaches, but I find they get lost in the complex, intangible details of IT systems. We've tried model driven designs specific to certain domains with some success, however these become difficult to maintain and enhance.  We've used many modelling techniques such as UML and lots of others in the business analysis space, but these are divorced of the code and quickly stale (especially with the Agile approach of building a room at a time).  We can even incorporate functional programming as means to demonstrate how code is composed together.   However, I as a customer would not be expected to understand engineering mathematics of construction, so why should I be subjected to computing maths.

The best I've seen is story boards, but this does not convey the complexity of the internals of systems.  All too often I hear, it's only a button, why's it cost so much?   Now if we could bring out the "floor plan" of the internals of the IT system we could have a conversation around what aspects of the system is needing changes.  For example, that's load bearing impacting these rooms (components) of the house (system), so cost to change is going to be expensive.

Now when I refer to the internals of the system, I'm also not meaning front-end applications that only use the server to store data in databases.  For many of these applications, most things are quite tangible in screens to the user.  What I'm referring to is more complex server side components servicing the requests that are quite intangible to the user.  In other words, the details necessary to explain why adding this new button is 10 times more costly than the button added last week.

Furthermore, I tend to find most approaches are focused on one-way communication of capturing requirements to convey to technical individuals to build.  There is little in going the other way.  In other words, in providing understandable information back to the users on why some changes are harder than others.

This is even evident in Business Process Model and Notation (BPMN) in it becoming an executable specification.   This has been great to model business processes for technical people to implement systems.  In doing so, I'm finding it starting to incorporate the technical details of UML.  This starts becoming very technical and makes it difficult for non-technical individuals to pick up.

So I reflect on how I convey this information to users, and well the whiteboard with boxes and lines is something I keep drawing.  I tend to find drawing boxes representing things with lines indicating flow provides enough visualisation to have a discussion in 5 minutes about why something is easier or harder to change.

So is there a way to extract the boxes and lines out of the system for this conversation with the user?

Now to get technical for a moment, OfficeFloor allows composition through continuations.  For those not familiar with continuations you can can skip the rest of this paragraph and just know that continuations allow connecting code flow by drawing lines between procedures.  For those wanting a little more detail, the state passed by the continuation in OfficeFloor is decoupled from the continuation (state is Dependency Injected into the procedure from Dependency Contexts).   This allows the first class procedures of OfficeFloor to be composed together separate of state (Dependency Injection) and execution strategies (Thread Injection).  Therefore, OfficeFloor composition is merely configuring the output continuations of a procedure to the appropriate handling procedure.  This can be represented as a graph of nodes (procedures) with connectors (continuations) and directed edges (mapping configuration of continuation to procedure) - hence prime for graphical configuration.

So armed with first class procedures and means to draw lines between them to control program flow, we have the opportunity to represent the application flow composition as boxes and lines.

Therefore, when asked why graphical configuration for OfficeFloor.  It just made sense to use graphical configuration for ease of communicating the system's "floor plan".   This makes it easy for users to understand the system.  It also makes it very easy for new developers to pick up the system.  This is especially useful when having to work between many micro-services.

Therefore, OfficeFloor uses it's Continuation Injection graphical configuration to be the "floor plan" to how the IT system is constructed to make communication between user and construction company easier.

Monday, 11 March 2019

Perils of opinionated frameworks, like Spring Boot. Inverting for opinionated code.

We developers like abstraction.  Without it, we could not build applications.  Our programming disciplines even require that we code to abstractions and avoid coupling our code to detailed implementations.

However, what is the right abstractions for your application?

Sadly, the choice of abstractions really comes from our choice of framework.  Frameworks are basically abstract solutions that we extend to solve our problem.

Unfortunately frameworks, like Spring Boot, come opinionated about the  threading models you use, interfaces you need to extend, possibly the data repositories applicable and various other assumptions about your problem space.  That's a lot of restrictions before I've even written my first line of code.

What we really want to do is explore the problem space first.  This is what test driven design is all about.   We write tests to define what is successful code.  Then we implement code to pass those tests.  As we go along writing tests to cover off requirements, we subsequently churn out working code for the application.  In time we get enough working code to release as the application.

So this leads me to ask, when do we test the choice of framework?

Opinionated frameworks force abstractions too early in the development process

Well, I guess we pay very experienced senior people to make this choice.  So this choice must be correct.  It would not be for reasons like:
  • I (or our company) only know this framework, so we are using it
  • New shiny with lots of buzz words, we must use it
  • My CVs a little old, let's try something new
  • This one is cheaper
  • Architecture believed what it says on the tin
Regardless of the reason, the only way to test the framework choice is to build the application with it.  And just for those of you who like opinionated frameworks (like Spring Boot), please tell me you write the most risky aspects first.  This is so you can quickly discover if the framework's opinions match with your problem.

Sadly, even if you test with the most risky aspects, finding out the framework decision is wrong can lead to a lot of wasted code.  This arguably wastes is a lot of money for the business and can lead to failing projects.

For example, say we choose Spring Reactive.  Yay, we can make concurrent asynchronous calls out to various micro-services.  We can also use the latest in NoSQL data stores.  This was all a great decision.   However, over time we realise we have a small amount of data where integrity of the data is very important.  We find we want to use a relational database to solve this, and then incorporate JPA on this database for easier interaction.  However, our choice of Spring Reactive has disallowed this because it requires all I/O to be asynchronous (JPA is synchronous database calls).  Ok, yes, we can use Schedulers, but I seem to be continually doing work arounds for lack of transactions.  The data consistency issues are starting to mount up and we're missing deadlines.  I'm now in a position of do I throw out all the Reactive code, or do I keep making work arounds hoping it might all hang together.  I definitely need to swap jobs before this hits production and we start supporting it.  In my next job, I've learnt to use Spring Servlets for this type of problem.

The flip side of this could also be easily the case.  We start out wanting Spring Servlet for JPA interaction with a database.   However, over time we realise the database interaction is mostly read-only.  What we really wanted was asynchronous I/O from Spring Reactive to collect data from multiple micro-services and data stores concurrently.  Unfortunately, with our up front Spring Servlet choice, the data collection is just too slow.  Our work around is to use async Servlets and spawn threads to make concurrent requests.  This worked initially, but over time the load increased.  This significantly increased thread counts, resulting in thread scheduling starvation, which resulted in timeouts.  I've really got no way to fix this without significant rewrites of the application.  In my next job, I've learnt to use Spring Reactive for this type of problem.

So can look to test the framework without having to throw out all our code?

Inverting framework control

Dependency Injection went a long way in inverting control.   When I write my Servlet handling method, I no longer need to pass in all my dependent objects.  I would define dependencies, via @Inject, to have the framework make them available.  The framework, subsequently, no longer dictates what objects my implementation can depend on.

However, there is a lot more to a framework than just the objects.  Frameworks will impose some threading model and require me to extend certain methods.  While dependency injection provides references to objects, the framework still has to call the methods on the objects to do anything useful.  For example, Spring goes along way to make the methods flexible, but still couples you to Reactive or Servlet coding by the required return type from the method.

As I need the Spring framework to undertake Dependency Injection for my tests, I'm coupled to the particular Spring Servlet/Reactive abstractions before I even write my first line of code.  An upfront choice that could be quite costly to change if I get wrong!

What I really want to do is:
  1. Write tests for my implementations (as we are always test driven, of course)
  2. Write my implementations
  3. Wire up my implementations together to become the application
Well the first two is very simple:
  1. Write tests calling a method passing in mock objects
  2. Write implementation of the method to pass the test
The last becomes very hard.  The reason the last becomes very hard is there is no consistent way to call every method.  Methods have different names, different parameters, different exceptions, possibly different threading requirements and different return types.  What we need is some facade over the methods to make them appear the same.

The Inversion of (Coupling) Control (IoC) provides this facade over the method via the ManagedFunction.  The ManagedFunction interface does not indicate what thread to use, what parameters/return types are required, nor what exceptions may be thrown.   This is all specified by the contained method implementation.  The coupling is inverted so the implementation specifies what it requires.

This inversion of coupling allows framework decisions to be deferred.  As I can have all my methods invoked in a consistent way, I can go ahead and start writing implementations.   These implementations may require Reactive coding to undertake asynchronous calls out to different micro-services.  Some of these implementations may require using JPA to write to relational databases.  I really should not care at the start of building the system.  I'm tackling the concrete problems to gain a better understanding of the real problem space.  I know my methods can be invoked by the framework via wrapping them in a ManagedFunction.  We can deal with determining the right framework later on, once we know more.

Actually, this is allowing the implementations to choose the appropriate abstractions to be provided by the framework. My implementations define what objects they require, what other methods they require calling and what thread models they will require. The implementations are, effectively, defining what abstractions are required from the framework.

Therefore, it is no longer the framework being opinionated.  It is your developer code that is allowed to be opinionated.

This then allows your implementations to be opinionated about the most appropriate framework to use.  No longer do you have to guess the framework based on vague understanding of the problem space.   You can see what abstractions your implementations require and make a more informed choice of framework.

In effect, IoC has deferred choice of the framework to much later in the development process.  This is so you can can make the decision much more confidently.  And isn't this what Agile says, defer the commitment until the last responsible moment.

Summary

In summary, why be forced to make too many up front decisions about your application?  In choosing the framework, you are making some significant choices is solving your problem space.  As frameworks are opinionated, they impose a lot of coupling on your solution.

Rather, why can't I just start writing solutions to concrete problems and worry about how they fit together later on?  This allows me to make choices regarding the appropriate abstractions (and subsequently framework) when I know a lot more about the problem space.

Inversion of (Coupling) Control gives this ability to defer abstraction and framework choices to much later in the development process, when you are more informed to make the decision correctly.

Wednesday, 6 March 2019

Inversion of Control (explained non-technically)

A design pattern for the Inversion of Control principle was presented in a paper published in 2015 (a free download is available here).  The premise of the paper was "can we learn something from how businesses organise themselves and translate this into software design improvements?" Basically, businesses have been around a lot longer than software systems.  So, how did businesses deal with problems, such as scale, before computers existed?

So running with this analogy, I looked at how businesses evolve. Businesses don't set out day one to be a Fortune 500. Typically, they start with you in your garage (maybe with a friend). Overtime your business grows and you hire people, assign clearer functional responsibilities and start to scale up your business. Businesses have to do this, while also changing quickly to stay competitive.

Within software, we have moved from Waterfall to Agile. Waterfall can be considered your set out to build a Fortune 500 company day one approach. Agile, on the other hand, says build only things of value and evolve your system over time.  Also, Agile focuses on being quicker to react to change. Therefore, Agile is a lot closer to how businesses grow, evolve and stay competitive.

Unfortunately, our software architectures have still stayed a "waterfall" top down approach. Architects will produce technology stack diagrams that indicate how the architectural layers of the system work (those technical pictures full of rectangles on top of each other).  The nature of these layers is always a bureaucratic control from the top layer to the bottom layer.  This is very similar to large companies with many layers of management.  So our software architectures force us to design the Fortune 500 company before developers even get to write the first line of code.

Inversion of Control is like empowering employees in a business.  Rather than the manager dictating exactly how the employees will work, the manager trusts the employees to undertake the goals of the business. In other words, the employees are in control of how the work gets done.  They will decide what help to get (Continuation Injection).  They will decide what business resources they require (Dependency Injection).  They may decide to escalate issues with the work (Continuation Injection).  They may even decide another employee may be better suited to do the work (Thread Injection).

By empowering the employee, we have inverted the control.  The employee is now in control of how the work gets done.  This is similar to the Inversion of Control in software.  The developer is now in control of how they write code.  They are not restricted by bureaucratic top down architecture controls from their managers.  This allows the developer to evolve the business's software quickly so it may grow and stay competitive.

Monday, 4 March 2019

Reactive frameworks "rob Peter to pay Paul"! Avoiding the problem with Dependency Contexts

Reactive frameworks, typically, align to message driven architectures.

The problem with building message driven systems, is the message provides the context.  The nature of message driven architectures is that the handler of the message is disconnected from the producer of the message.  The only information connecting them is the message.  Hence, the message needs to provide all the necessary context and data for the handler to process the message.

This becomes complicated when downstream message handlers require more context (such as transactions).  How do you pass a transaction via a message?  Yes, we can use XA transactions to suspend and resume transactions but this creates a lot of overhead to the performance and scale these Reactive frameworks are chasing.  Furthermore, Spring Data even considers transactions not a good fit for reactive.

So how can we create a Reactive framework that enables context for such things as transactions?

Dependency Contexts

Well the first problem is co-ordination.  Firing off lots of events serviced by different functions in different threads, creates a lot of multi-threading co-ordination.  How does one know when to commit the transaction?  Or possibly decide to rollback the transaction?  Or what if an asynchronous I/O takes too long causing the held transaction to time out.  We would be required to monitor all messages and co-ordinate their results to then make a decision on commit / rollback (along with handling further non-application events, such as transaction timeout events).   This puts a lot of coding decisions on the developer outside the normal application logic.  Gone are the days the developer just threw an exception and the transaction is rolled back.

So let's bring back order to the event handling chaos.  Load the events to a queue and process events in the order they are added to the queue.  As requests on the system should be independent, we create a separate queue for each request to keep requests isolated from each other.

The resulting event loop would look as follows:


  public void startRequest(Function function) {
    this.functions.add(function);
  }

  BlockingQueue<Function> functions = new LinkedBlockingQueue<>();

  public void eventLoop() throws InterruptedException {
    for (;;) {

      // Obtain next function to execute
      Function function = this.functions.take();

      // Determine if new request (no context)
      FunctionContext context = (function instanceof ContextualFunction)
          ? ((ContextualFunction) function).context
          : new FunctionContext();

      // Allow for multi-threading in executing
      synchronized (context) {
        function.doFunction(context);
      }

      // Register the next function to execute
      Function nextFunction = context.functions.pollFirst();
      if (nextFunction != null) {
        this.functions.add(nextFunction);
      }
    }
  }

  public interface Function {
    void doFunction(FunctionContext context);
  }

  public class FunctionContext {

    private Deque<Function> functions = new LinkedList<>();

    public void triggerFunction(Function function) {
      this.functions.add(new ContextualFunction(function, this));
    }
  }

  public class ContextualFunction implements Function {

    private final Function delegate;
    private final FunctionContext context;

    public ContextualFunction(Function delegate, FunctionContext context) {
      this.delegate = delegate;
      this.context = context;
    }

    @Override
    public void doFunction(FunctionContext context) {
      this.delegate.doFunction(context);
    }
  }

Ok, a lot of code but it is now doing two things:
  • ordering all functions to be executed sequentially
  • providing a context that travels with the execution chain of functions
But how do we make use of the FunctionContext to begin and manage the transaction?

We incorporate the ManagedFunction of Invervsion of Control and use a ServiceLocator to enable storing the dependencies within the context:

  public interface ServiceLocator {
    // Uses context for transitive dependencies
    Object locateService(String serviceName, FunctionContext context);
  }

  ServiceLocator serviceLocator = ...; // can, for example, wrap Spring BeanFactory

  public class FunctionContext {

    // ... triggerFunction logic omitted for brevity

    private Map<String, Object> contextDependencies = new HashMap<>();

    public Object getDependency(String dependencyName) {

      // Pull dependency from context cache
      Object dependency = this.contextDependencies.get(dependencyName);
      if (dependency == null) {

        // Not cached, so create new and cache in context
        dependency = serviceLocator.locateService(dependencyName, this);
        this.contextDependencies.put(dependencyName, dependency);
      }

      // Return dependency for Function to use
      return dependency;
    }
  }

The Functions are now able to use the getDependency("name") method to retrieve the same objects for a request.  As the dependency objects are cached within the context, the various Functions involved in servicing the request are able to retrieve the same object.

Therefore, a transaction can be managed across Functions.   The first Function retrieves the Connection and starts the transaction.  Further Functions execute, pulling in the same Connection (with transaction established).  The final Function in the chain then commits the transaction.

Should there be a failure, the injected Exception handlers of the ManagedFunction can rollback the transaction.  Within thread-per-request architectures, exceptions are thrown by the developer's code typically rolling back the transaction.  By having the ManagedFunction injected handlers also rollback the transaction, this reproduces the ease of thread-per-request transaction management for exceptions.

Furthermore, the exception handlers would clear the FunctionContext queue of Functions.  As the transaction has been rolled back, there is no point further executing the remaining Functions.  In typical thread-per-request handling, the remaining part of the try block would be skipped.  By clearing the FunctionContext queue this mimics skipping the remaining logic and goes straight to the catch block.  In the case of ManagedFunction exception handler, this would be triggering a new function to handle the failure.

But we've just reduced Reactive framework to a single sequence of functions, losing the concurrency it can provide!

Well beyond making it easier to code now that there is some order in function execution, we can introduce concurrency by spawning another sequence of functions.  As the FunctionContext ties the sequence of functions together, we just create a new FunctionContext to enable concurrency.  The following code shows this:

  public class FunctionContext {

    private Deque<Function> functions = new LinkedList<>();

    public void triggerFunction(Function function) {
      this.functions.add(new ContextualFunction(function, this));
    }

    public void spawnNewSequence(Function function) {
      this.functions.add(function); // event loop will create new FunctionContext
    }

    // ... getDependency(...) removed for brevity
  }

In actual fact, we have just created Threads running on an event loop.  The sequence of functions are executed in order just like imperative statements are executed in order by a thread.  So we now have Threads without the overheads of a thread-per-request.  The dependencies are bound to the context and subsequently the Thread of execution - making them effectively ThreadLocal.  As ThreadLocals are thread safe, we now have safe multi-threading functional code.

As dependencies are effectively ThreadLocal to the sequence of Functions, they can be mutated for the next Function to pick up the change.  Yes, immutability better removes developer errors, however this should not be reason to restrict the developer from doing it.  This is especially the case if you want to mutate objects via an ORM (e.g. JPA or Spring Repositories) to do updates in the database.

OfficeFloor (http://officefloor.net) implements Dependency Contexts in it's threading models.  OfficeFloor, actually, makes this easier to develop by introducing the concept of Governance.  Governance does the above transaction management by configuration declaration (much like a @Transaction annotation on a method).  However, as OfficeFloor uses graphical configuration, this is done graphically.  Rather than annotating the code directly, OfficeFloor graphically marks functions for transaction governance for better code re-use and flexibility in configuring applications.  An example of this configuration is as follows:


Please try out OfficeFloor (tutorials) and we value all feedback.

Summary

So when Reactive frameworks are tooting their horn, they are actually doing this while restricting you further as a developer.  Because the reactive frameworks can't handle the context problem, they push this problem onto you the developer in avoiding context.

Subsequently, reactive frameworks rob developers (Peter) of context to pay for the message passing problem in their frameworks (Paul).

By incorporating dependency contexts into event loops, you can incorporate context that we have grown to love in thread-per-request for managing things such as transactions.

Monday, 25 February 2019

Is Spring Reactive already obsolete? Inversion of Thread Coupling

Beyond Spring's Dependency Injection only solving 1/5 of the Inversion of Control problem, Spring Reactive bases itself on the event loop. While there are other popular event loop driven solutions (NodeJS, Nginx), the single threaded event loop is a pendulum swing in the other direction from thread-per-request (thread pools). With event loops competing against thread-per-request is there not some pattern that underlies both of them? Well, actually yes!

But before getting to this, let's look at the issues regarding event loops and thread-per-request. If you are more interested in the solution, you can skip the next two sections.

Thread coupling problems

Event loop

First of all, "thread coupling"? Why is this a concern? Well for event loops the single threaded nature requires all I/O to be undertaken asynchronously. Should a database or HTTP call need to block, it will block the single event loop thread and hold up the system. This restriction is in itself a big coupling problem, as to go Reactive all your I/O is coupled to now go asynchronous. This means no more ORMs like JPA to make access to databases easier (as JPA requires blocking database calls). Yep, something that used to remove 40-60% of boiler plate code in applications is now not unusable (enjoy writing this all over again!)

Beyond the restrictive I/O in your decision to use Reactive patterns, the ability to use multiple processors is restricted as there is only one thread. Ok, instances of the Reactive engine are duplicated to each CPU, however they can not share state. The multi-threaded implications of sharing state between two event loops is difficult. Reactive programming is hard enough, let alone adding multi-threading into it. Yes, communication between event loops can be via events. However, using this to keep duplicated copies of shared state in sync across event loops creates problems that are just avoided. Basically you are told to design your Reactive systems to avoid this with immutability.

Therefore, you are stuck coupled to the one thread. So what? Well if you have computationally expensive operations, such as security cryptography (JWT), it creates scheduling problems. By being on a single thread, this operation must be completed before anything else can be undertaken. With multiple threads, other threads can be time sliced in by the operating system to progress other less CPU intensive requests. However, you only have the one thread so all that lovely operating system thread scheduling is now lost. You're stuck waiting for the expensive CPU intensive operations to complete before servicing anything else.

Oh please, just ignore these problems! We developers like performance. Reactive is all in the aim of greater performance and improved scalability. Lesser threads allows reduced overheads to allow improved throughput. Ok, yes, I'll have better performing production systems potentially reducing hardware costs. However, it's going to be a lot slower to build and enhance that production system due to coupling restrictions that comes from single threaded event loops. Not to mention, having to rewrite algorithms to not hog the CPU. Given the scarcity of developers compared to the over abundant supply of cloud hardware, arguing about costs of scale may only be for those rare significantly large systems.

We do loose a lot going Reactive. This is possibly to the point that we have not thought it through enough. Hence, possibly why Reactive frameworks warn against changing to it whole sale. They usually indicate Reactive patterns only work for smaller less complicated systems.

Thread-per-request (thread pools)

On the flip side, thread-per-request patterns (such as Servlet 2.x) use thread pools to handle scale. They assign a thread to service the request and scale out by having multiple (typically pooled) threads.

We can probably read many articles touting Reactive over the scale limitations of thread-per-request, but the main issue with thread-per-request is not actually in performance nor scale. The issue with thread-per-request is lot more permissive to your application and can actually pollute your whole architecture.

To see this problem, just look at invoking a method:

Response result = object.method(identifier);

Should the implementation of the method be as follows:

@Inject Connection connection;
@Inject HttpClient client;

public Result method(Long identifier) {

  // Retrieve synchronous database result
  ResultSet resultSet = connection.createStatement()
    .executeQuery("<some SQL> where id = " + identifier);
  resultSet.next();
  String databaseValue = resultSet.getString("value");

  // Retrieve synchronous HTTP result
  HttpResponse response = client.send("<some URL>/" + databaseValue);

  // Return result requiring synchronous results to complete
  return new Result(response.getEntity());
}

This creates a coupling problem to the thread of the request, that can pollute out to your whole architecture. Yes, you've just placed a coupling on the request thread out to your other systems.

While the database call is synchronous, the HTTP call is also forcing the downstream system to respond synchronously. We can't change the HTTP call to be asynchronous, because the request thread wants to continue with a result to return from the method. This synchronous coupling to the request thread not only limits the call, but also limits the downstream system to have to provide a synchronous response. Hence, the thread-per-request thread coupling can pollute out to your other systems and possibly across your entire architecture. No wonder the REST micro-service pattern of synchronous HTTP calls are so popular! It is a pattern that forces itself top down on your system. Sounds like thread-per-request and Reactive share this same opinion on forcing everything top down to support themselves.

Threading to support I/O

In summary, the problems are as follows.

Single threaded event loops:
  • couple you to asynchronous communication only (simple JPA code is no longer available)
  • just avoids multi-threading, as two threads executing events from the event queue would create considerable synchronisation problems (likely slowing solution and causing concurrency bugs that are hard to code against for the best of developers)
  • loose the advantage of the thread scheduling that operating systems have spent considerable effort optimising
While thread-per-request solutions:
  • couples you to synchronous communication only (as the result is expected immediately; and not some time later via callback)
  • have higher overheads (to single thread event loops) due to managing more threads and therefore less scalable
The pendulum swing between thread pools and Reactive single threaded can actually be considered going from synchronous communication (thread-per-request) to asynchronous communication (single threaded event loops). The remaining problems are actually implementation constraints of a threading model built specifically to support each type of communication. Plus given the coupling on downstream systems that synchronous communication poses, this pendulum swing to asynchronous communication is not all a bad thing.

So the question is, why are we forced to choose only one communication style? Why can't we use synchronous and asynchronous communication styles together?

Well, we can't put asynchronous calls inside synchronous method calls. There is no opportunity for callbacks. Yes, we can block waiting on the callback but Reactive will consider itself superior in scale due to additional threading overheads involved in this. Therefore, we need asynchronous code to allow synchronous calls.

However, we can't put synchronous calls inside event loops, as it halts the event loop thread. Hence, we need extra threads to undertake the synchronous calls to allow the event loop thread to carry on with other events.

Reactive has the answer. Use a Scheduler:

Mono blockingWrapper = Mono.fromCallable(() -> {
  return /* make a remote synchronous call */
}).subscribeOn(Schedulers.elastic());


Yay, now we can do synchronous calls within the event loop. Problem solved (well sort of).

Well it's sorted if you can trust that you properly wrapped all synchronous calls in Callables. Get one wrong, and well you are blocking your event loop thread and halting your application. At least in multi-threaded applications only the particular request suffered, not the whole application.

This seems, to me anyway, more a work around than an actual solution to the problem. Oh wait, everything needs to be Reactive top down so that solves this problem. Just don't do blocking calls and change all your drivers and your whole technology stack to Reactive. The whole "change everything to suit us, in a way that only integrates with us" seems very close to technology vendor lock in - in my opinion anyway.

Therefore, can we consider a solution that allows synchronous calls and does not rely so heavily on the developer getting it right? Why, yes!

Inverting the Thread Coupling

The asynchronous communication driven Reactive single threaded event loop (excuse the mouth full) is identified as the right solution. Synchronous communication is solved by developers using Schedulers. In both cases, the Reactive functions are run with a thread dictated for them:
  • asynchronous functions are executed with the thread of the event loop
  • synchronous functions executed with thread from the Scheduler
The control of the function's executing thread is heavily dependent on the developer getting it right. The developer has enough on their plate focusing on building code to meet feature requirements. Now the developer is intimately involved in the threading of the application (something thread-per-request always somewhat abstracted away from the developer). This intimacy to threading significantly increases the learning curve for building anything Reactive. Plus it will have the developer loose a lot of hair when they pull it out at 2 am trying to get the code working for that deadline or production fix.

So can we remove the developer from having to get the threading right? Or more importantly, where do we give control of selecting the thread?

Let's look at a simple event loop:

public interface AsynchronousFunction {
  void run();
}

public void eventLoop() {
  for (;;) {
    AsynchronousFunction function = getNextFunction();
    function.run();
  }
}

Well, the only thing we can target for control is the asynchronous function itself. Using an Executor to specify the thread, we can enhance the event loop as follows:

public interface AsynchronousFunction {
  Executor getExecutor();
  void run();
}

public void eventLoop() {
  for (;;) {
    AsynchronousFunction function = getNextFunction();
    function.getExecutor().execute(() -> function.run());
  }
}

This now allows the asynchronous function to specify its required threading, as:
  • using the event loop thread is via a synchronous Executor: getExecutor() { return (runnable) -> runnable.run(); }
  • using separate thread for synchronous calls is via Executor backed by thread pool: getExecutor() { return Executors.newCachedThreadPool(); }
Control is inverted so that the developer is no longer responsible for specifying the thread. The function now specifies the thread for executing itself.

But how do we associate an Executor to a function?

We use the ManagedFunction of Inversion of Control:

public interface ManagedFunction {
  void run();
}

public class ManagedFunctionImpl
    implements ManagedFunction, AynchronousFunction {

  @Inject P1 p1;
  @Inject P2 p2;
  @Inject Executor executor;

  @Override
  public void run() {
    executor.execute(() -> implementation(p1, p2));
  }

  private void implementation(P1 p1, P2 p2) {
    // Use injected objects for functionality
  }
}

Note that only the relevant ManagedFunction details have been included. Please see Inversion of (Coupling) Control for more details of the ManagedFunction.

By using the ManagedFunction, we can associate an Executor to each function for the enhanced event loop. (Actually, we can go back to the original event loop, as the Executor is encapsulated within the ManagedFunction).

So now the developer is no longer required to use Schedulers, as the ManagedFunction takes care of which thread to use for executing the function's logic.

But this just moves the problem of the developer getting it right from code to configuration. How can we make it possible to reduce developer error in specifying the correct thread (Executor) for the function?

Deciding the executing thread

One property of the ManagedFunction is that all objects are Dependency Injected. Unless Dependency Injected, there are no references to other aspects of the system (and static references are highly discouraged). Hence, the Dependency Injection meta-data of the ManagedFunction provides details of all the objects used by the ManagedFunction.

Knowing the objects used by a function helps in determining the asynchronous/synchronous nature of the function. To use JPA with the database a Connection (or DataSource) object is required. To make synchronous calls to micro-services a HttpClient object is required. Should none of these be required by the ManagedFunction, it is likely safe to consider no blocking communication is being undertaken. In other words, if the ManagedFunction does not have a HttpClient injected, it can't make HttpClient synchronous blocking calls. The ManagedFunction is, therefore, safe to be executed by the event loop thread and not halt the whole application.

We can, therefore, identify a set of dependencies that indicate if the ManagedFunction requires execution by a separate thread pool. As we know all dependencies in the system, we can categorise them as asynchronous/synchronous. Or more appropriately, whether the dependency is safe to use on the event loop thread. If the dependency is not safe, then the ManagedFunctions requiring that dependency are executed by a separate thread pool. But what thread pool?

Do we just use a single thread pool? Well, Reactive Schedulers give the flexibility to use / re-use varying thread pools for the various functions involving blocking calls. Hence, we need similar flexibility in using multiple thread pools.

We use multiple thread pools by mapping thread pools to dependencies. Ok, this is a little bit to get your head around. So let's illustrate with an example:

public class ManagedFunctionOne implements ManagedFunction {
  // No dependencies
  // ... remaining omitted for brevity
}

public class ManagedFunctionTwo implements ManagedFunction {
  @Inject InMemoryCache cache;
  // ...
}

public class ManagedFunctionThree implements ManagedFunction {
  @Inject HttpClient client;
  // ...
}

public class ManagedFunctionFour implements ManagedFunction {
  @Inject EntityManager entityManager;
  // meta-data also indicates transitive dependency on Connection
  // ...
}

Now, we have the thread configuration as follows:

DependencyThread Pool
HttpClientThread Pool One
ConnectionThread Pool Two

We then use the dependencies to map ManagedFunctions to Thread Pools:

ManagedFunctionDependencyExecutor
ManagedFunctionOne,
ManagedFunctionTwo
(none in thread pool table)Event Loop Thread
ManagedFunctionThreeHttpClientThread Pool One
ManagedFunctionFourConnection (as transitive dependency of EntityManager)Thread Pool Two

The decision of the thread pool (Executor) to use for the ManagedFunction is now just mapping configuration. Should a dependency invoke blocking calls, it is added to the thread pool mappings. The ManagedFunction using this dependency will no longer be executed on the event thread loop, avoiding the application halting.

Furthermore, the likelihood of missing blocking calls is significantly reduced. As it is relatively easy to categorise the dependencies, it leaves less chance of missing blocking calls. Plus if a dependency is missed, it is only a configuration change to the thread pool mappings. It is fixed without code changes. Something especially useful as the application grows and evolves. This is unlike Reactive Schedulers that require code changes and significant thought by the developer.

As the executing thread to execute a ManagedFunction is now controlled by the framework (not the application code), it effectively inverts control of the executing thread. No longer does the developer code threading. The framework configures it based on the dependency characteristics of the ManagedFunctions.

OfficeFloor

This is all good in theory, but show me the working code!

OfficeFloor (http://officefloor.net) is an implementation of the inversion of thread control patterns discussed in this article. We find frameworks are too rigid with their threading models that causes work arounds, such as Reactive Schedulers. We are looking for the underlying patterns to create a framework that does not require such work arounds. Code examples can be found in the tutorials and we value all feedback.

Note that while OfficeFloor follows inversion of thread control, it's actual threading model is more complex to take other aspects into consideration (e.g. dependency context, mutating state, thread locals, thread affinity, back pressure and reduced locking to increase performance). These, however, are topics for other articles. But, as this article highlights, the threading for OfficeFloor applications is a simple configuration file based on dependency mappings.

Conclusion

Inversion of control for the thread allows the function to specify it's own thread. As the thread is controlled by the injected Executor, this pattern is named Thread Injection. By allowing the injection, the choice of thread is determined by configuration rather than code. This relieves the developer of the potentially error prone, buggy task of coding threading into applications.

The side benefit of Thread Injection is that thread mapping configurations can be tailored to the machine the application is running on. On a machine with many CPUs, more thread pools can be configured to take advantage of thread scheduling by the operating system. On smaller machines (e.g. embedded) there can be more re-use of thread pools (potentially even none for single purpose applications that can tollerate blocking to keep thread counts down). This would involve no code changes to your application, just configuration changes.

Furthermore, computationally expensive functions that may tie up the event loop can also be moved to a separate thread pool. Just add in a dependency for this computation to the thread pool mappings and all ManagedFunctions undertaking the computation are now not holding up the event loop thread. The flexibility of Thread Injection is beyond just supporting synchronous/asynchronous communication.

As Thread Injection is all driven from configuration, it does not require code changes. It actually does not require any threading coding by the developer at all. This is something Reactive Schedulers are incapable of providing.

So the question is, do you want to tie yourself to the single threaded event loop that really is just a single purpose implementation for asynchronous I/O? Or do you want to use something a lot more flexible?

Inversion of (Coupling) Control

The question of "What is Inversion of Control?" or "What is Dependency Injection?" is met with code examples, vague explanations and even on StackOverflow identified as low quality answers - https://stackoverflow.com/questions/3058/what-is-inversion-of-control

We use inversion of control and dependency injection and even push it as the correct way to build applications. Yet, we can not clearly articulate why!!!

The reason is we have not clearly identified what "control" is. Once we understand what we are inverting, the concept of Inversion of Control vs Dependency Injection is not actually the question to be asked. It actually becomes the following:

Inversion of Control = Dependency (state) Injection + Thread Injection + Continuation (function) Injection

To explain this, well, let's do some code. (And yes, the apparent problem of using code to explain Inversion of Control is repeating, but bear with me - the answer has always been right before your eyes).

One clear use of Inversion of Control / Dependency Injection is the repository pattern to avoid passing around a connection. Instead of the following:

public class NoDependencyInjectionRepository implements Repository<Entity> {
  public void save(Entity entity, Connection connection) throws SQLException {
    // Use connection to save entity to database
  }
}

Dependency Injection allows the repository to be re-implemented as:

public class DependencyInjectionRepository implements Repository<Entity> {
  @Inject Connection connection;
  public void save(Entity entity) throws SQLException {
    // Use injected connection to save entity to database
  }
}

Now, do you see the problem it just solved?

If you are thinking, I can now change the Connection to say REST calls and this is all flexible to change. Well, you would be close.

To see the problem it solved, do not look at the implementation. Look at the interface. The client calling code has gone from:

repository.save(entity, connection);

to the following:

repository.save(entity);

We have removed the coupling of the client code to provide a connection on calling the method. By removing the coupling, we can substitute a different implementation of the repository (again, boring old news, but bear with me):

public class WebServiceRepository implements Repository<Entity> {
  @Inject WebClient client;
  public void save(Entity entity) {
    // Use injected web client to save entity
  }
}

With the client able to continue to call the method just the same:

repository.save(entity);

The client is unaware that the repository is now calling a micro-service to save the entity, rather than talking directly to a database. (Actually, the client is aware but we will come to that shortly.)

So taking this to an abstract level regarding the method:

  R method(P1 p1, P2 p2) throws E1, E2

  // with dependency injection becomes
  @Inject P1 p1;
  @Inject P2 p2;
  R method() throws E1, E2

The coupling of the client to provide arguments to the method is removed by Dependency Injection.



Now, do you see the four other problems of coupling?



At this point, I warn you that you will never look at code the same again once I show you the coupling problems. This is the point in the Matrix where I ask you if you want to take the red or blue pill, because there is no going back once I show you how far down the rabbit hole this problem really is - say that refactoring is actually not necessary and there are issues in the fundamentals of modelling logic in computer science (ok, big statement but read on and I can't put it any other way).



So you chose the red pill.



Let's prepare you.



To identify the four extra coupling problems, let's look at the abstract method again:

  @Inject P1 p1;
  @Inject P2 p2;
  R method() throws E1, E2

  // and invoking it
  try {
    R result = object.method();
  } catch (E1 | E2 ex) {
    // handle exception
  }

What is coupled by the client code?
  • the return type
  • the method name
  • the handling of exceptions
  • the thread provided to the method
Dependency Injection allowed me to change the objects required by the method without changing the client code calling the method. However, if I want to change my implementing method by:
  • changing it's return type
  • changing it's name
  • throwing a new exception (in the above case of swapping to a micro-service repository, throwing a HTTP exception rather than a SQL exception)
  • using a different thread (pool) to execute the method than the thread provided by the client call
This involves "refactoring" all client code for my method. Why should the caller dictate the coupling when the implementation has the hard job of actually doing the functionality? We should actually invert the coupling so that the implementation can dictate the method signature (not the caller).

This is likely the point you look at me like Neo does in the Matrix going "huh"? Let implementations define their method signatures? But isn't the whole OO principle about overriding and implementing abstract method signature definitions? And that's just chaos because how do I call the method if it's return type, name, exceptions, arguments keep changing as the implementation evolves?

Easy. You already know the patterns. You just have not seen them used together where their sum becomes a lot more powerful than their parts.

So let's walk through the five coupling points (return type, method name, arguments, exceptions, invoking thread) of the method and decouple them.

We have already seen Dependency Injection remove the argument coupling by the client, so one down.

Next let's tackle the method name.

Method Name Decoupling

Many languages, including Java with lambda's, are allowing/have functions as first class citizens of the language. By creating a function reference to a method, we no longer need to know the method name to invoke the method:

Runnable f1 = () -> object.method();
// Client call now decoupled from method name
f1.run()

We can even now pass different implementation of the method around with Dependency Injection:

@Inject Runnable f1;
void clientCode() {
  f1.run(); // to invoke the injected method
}

Ok, a bit of extra code for not much value. But again bear with me. We have decoupled the method's name from the caller.

Next let's tackle the exceptions from the method.

Method Exceptions Decoupling

By using the above technique of injecting functions, we inject functions to handle exceptions:

Runnable f1 = () -> {
  @Inject Consumer<E1> h1;
  @Inject Consumer<E2> h2;
  try {
    object.method();
  } catch (E1 e1) {
    h1.accept(e1);
  } catch (E2 e2) {
    h2.accept(e2);
  }
}
// Note: above is abstract pseudo code to identify the concept (and we will get to compiling code shortly)

Now exceptions are no longer the client caller's problem. Injected methods now handle the exceptions decoupling the caller from having to handle exceptions.

Next let's tackle the invoking thread.

Method's Invoking Thread Decoupling

By using an asynchronous function signature and injecting an Executor, we can decouple the thread invoking the implementing method from that provided by the caller:

Runnable f1 = () -> {
  @Inject Executor executor;
  executor.execute(() -> {
    object.method();
  });
}

By injecting the appropriate Executor, we can have the implementing method invoked by any thread pool we require. To re-use the client's invoking thread we just use a synchronous Exectutor:

Executor synchronous = (runnable) -> runnable.run();

So now we can decouple thread to execute the implementing method from the calling code's thread.

But with no return value, how do we pass state (objects) between methods? Let's combine it all together with Dependency Injection.

Inversion of (Coupling) Control

Let's combine the above patterns together with Dependency Injection to get the ManagedFunction:

public interface ManagedFunction {
  void run();
}

public class ManagedFunctionImpl implements ManagedFunction {

  @Inject P1 p1;
  @Inject P2 p2;
  @Inject ManagedFunction f1; // other method implementations to invoke
  @Inject ManagedFunction f2;
  @Inject Consumer<E1> h1;
  @Inject Consumer<E2> h2;
  @Inject Executor executor;

  @Override
  public void run() {
    executor.execute(() -> {
      try {
        implementation(p1, p2, f1, f2);
      } catch (E1 e1) {
        h1.accept(e1);
      } catch (E2 e2) {
        h2.accept(e2);
      });
  }

  private void implementation(
    P1 p1, P2 p2,
    ManagedFunction f1, ManagedFunction f2
  ) throws E1, E2 {
    // use dependency inject objects p1, p2
    // invoke other methods via f1, f2
    // allow throwing exceptions E1, E2
  }
}

Ok, there's a lot going on here but it's just the patterns above combined together. The client code is now completely decoupled from the method implementation, as it just runs:

@Inject ManagedFunction function;
public void clientCode() {
  function.run();
}

The implementing method is now free to change without impacting the client calling code:

there is no return type from methods (slight restriction always being void, however necessary for asynchronous code)

  • the implementing method name may change, as it is wrapped by the ManagedFunction.run()
  • parameters are no longer required by the ManagedFunction. These are Dependency Injected, allowing the implementing method to select which parameters (objects) it requires
  • exceptions are handled by injected Consumers. The implementing method may now dictate what exceptions it throws, requiring only different Consumers injected. The client calling code is unaware that the implementing method may now be throwing a HTTPException instead of SQLException. Furthermore, Consumers can actually be implemented by ManagedFunctions injecting the Exception.
  • the injection of the Executor allows the implementing method to dictate it's thread of execution by specifying the Executor to inject. This could result in re-using the client's calling thread or have the implementation run by a separate thread or thread pool

All five coupling points of the method by it's caller are now decoupled.

We have actually "Inverted Control of the Coupling". In other words, the client caller no longer dictates what the implementing method can be named, use as parameters, throw as exceptions, which thread to use, etc. Control of coupling is inverted so that the implementing method can dictate what it couples to by specifying it's required injection.

Furthermore, as there is no coupling by the caller there is no need to refactor code. The implementation changes and then configures in it's coupling (injection) to the rest of the system. Client calling code no longer needs to be refactored.

So in effect Dependency Injection only solved 1/5 of the method coupling problem. For something that is so successful for only solving 20% of the problem, it does show how much of a problem coupling of the method really is.

OfficeFloor

Implementing the above patterns would create more code than it's worth in your systems. That's why the OfficeFloor (http://officefloor.net) "true" inversion of control framework has been put together to lessen the burdon of this code. This has been an experiment in the above concepts to see if real systems are easier to build and maintain with "true" inversion of control. Try the tutorials out to see the concepts in action (http://officefloor.net/tutorials). We value your feedback, as we ourselves feel so claustrophobic with the coupling imposed by the method signature.

OfficeFloor is also an implementation of many other innovative patterns that is going to take many more articles to cover. However, the concepts have been published should you like some heavy academic reading (http://doi.acm.org/10.1145/2739011.2739013 with free download available from http://www.officefloor.net/about.html).

Possibly the most interesting aspect of these patterns is that the configuration is no longer in code, but actually done graphically. The follow is an example of this configuration:



Summary

So next time you reach for the Refactor Button/Command, realise that this is brought on by coupling of the method that has been staring us in the face every time we right code.

And really why do we have the method signature? It is because of the thread stack. We need to load memory onto a thread stack, and the method signature follows this behaviour of the computer. However, in the real world modelling of behaviour between objects there is no thread stack. Objects are loosely coupled with very small touch points - not the five coupling aspects imposed by the method.

Furthermore, in computing we strive towards low coupling and high cohesion. One might possibly put forward a case that in comparison to ManagedFunctions, that methods are:

  • high coupling: as methods have five aspects of coupling to the client calling code
  • low cohesion: as the handling of exceptions and return types from methods starts blurring the responsibility of the methods over time. Continuous change and shortcuts can quickly degrade the cohesiveness of the implementation of the method to start handling logic beyond it's responsibility

So given, as a discipline we strive for low coupling and high cohesion, our most fundamental building block (the method and also for that matter the function) may actually go against our most core principles.