Tuesday, 5 January 2021

Performance Improvements for TechEmpower Round 20

OfficeFloor is turning its attention to improving performance now that its underlying concepts have been defined.

OfficeFloor is primarily about Inversion of Coupling Control to avoid the OO Matrix.  This goes beyond dependency injection to include function/continuation injection and thread injection. Through our writings, we've come to understand the fundamental model.

However, while some frameworks claim functionality is more important than performance, we at OfficeFloor believe that's just incomplete solutions leaking cumbersome workarounds.  Take, for example, aspect-oriented programming in Spring, where the implementation is layers of reflective calls that hamper performance. In OfficeFloor, everything is executed as discrete functions, or more correctly first-class procedures. This results in the application being a list of functions executed one after another.  An aspect is just a function inserted into the list.  Therefore, an aspect in OfficeFloor incurs no additional overhead.  But yes, we admit, there are overheads in managing the execution of the list of functions.
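
To make this concrete, here is a minimal sketch of the idea (illustrative only, not OfficeFloor's actual engine): the application is an ordered list of functions, and an aspect is simply another function spliced into that list.

  import java.util.ArrayList;
  import java.util.List;
  import java.util.function.Consumer;

  public class FunctionListSketch {

    public static void main(String[] args) {
      // The application as an ordered list of discrete functions
      List<Consumer<String>> functions = new ArrayList<>();
      functions.add(request -> System.out.println("validate " + request));
      functions.add(request -> System.out.println("service " + request));

      // An "aspect" (e.g. logging) is just another function inserted into the list
      functions.add(1, request -> System.out.println("log " + request));

      // Execution is simply running the list one after another
      for (Consumer<String> function : functions) {
        function.accept("GET /example");
      }
    }
  }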

So now that the inversion of coupling control solution is understood, the focus has been on optimisations.  And what better way to improve performance than competing against other frameworks?  Hence, we very much appreciate the work of the TechEmpower Benchmarks.

However, truth be told, OfficeFloor's focus has been both on optimisations and on increasing the third-party integrations that can be injected into OfficeFloor applications.  So from round 19 to round 20 of the TechEmpower Benchmarks, OfficeFloor has added the ability for asynchronous programming.  This has opened the door to technologies such as Reactor and, in particular, R2DBC for asynchronous database interaction.
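
For readers unfamiliar with R2DBC, an asynchronous query composed with Reactor looks roughly like the following sketch (the connection URL, table and column names are placeholders, not the benchmark's actual configuration):

  import io.r2dbc.spi.ConnectionFactories;
  import io.r2dbc.spi.ConnectionFactory;
  import reactor.core.publisher.Mono;

  public class R2dbcSketch {

    public static void main(String[] args) {
      // Placeholder URL - not the benchmark configuration
      ConnectionFactory factory = ConnectionFactories.get("r2dbc:postgresql://localhost:5432/example");

      Mono<String> message = Mono.from(factory.create())
          .flatMap(connection -> Mono
              .from(connection.createStatement("SELECT content FROM message WHERE id = $1")
                  .bind("$1", 1)
                  .execute())
              .flatMap(result -> Mono.from(result.map((row, meta) -> row.get("content", String.class))))
              // Always release the connection, even on error
              .doFinally(signal -> Mono.from(connection.close()).subscribe()));

      // Block here only for demonstration; nothing executes until subscription
      System.out.println(message.block());
    }
  }

The key point is that nothing blocks while the database works, which is what OfficeFloor's new asynchronous support takes advantage of to resume servicing once the result arrives.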

Asynchronous frameworks seem to dominate the top of the TechEmpower Benchmarks.  Now, with Project Loom claiming negligible threading overheads, this may change in future years.  However, we doubt a model of "negligible threading contention" will outperform the zero threading contention of the single-threaded asynchronous frameworks.  But this is just conjecture without real numbers to back it up.  So let's focus on the real numbers of round 20.

The most interesting inclusions by OfficeFloor for round 20 are:

  • officefloor-raw using R2DBC to provide implementations for the database tests
  • officefloor-async providing an R2DBC solution
  • officefloor-raw using thread affinity to increase throughput
  • ByteBuffer pooling improvements
  • using Parallel GC rather than Java's default garbage collector

OfficeFloor Raw using R2DBC

The officefloor-raw entry is different from the other OfficeFloor entries, as its focus is purely on the HTTP server implementation.  It does not include the OfficeFloor inversion of coupling control framework.  It just provides a custom test implementation using OfficeFloor's HTTP server component.  Hence it is classed as a platform rather than a framework in the tests.  This, however, allows us to see the performance overhead of OfficeFloor's inversion of coupling control framework.

Given we are looking to see the overheads, we provided officefloor-raw implementations using R2DBC for the database tests.  As officefloor-raw does not incur the inversion of control overheads, it is by nature going to be a lot faster than the rest of the OfficeFloor entries.  However, the closer we can get the other entries to officefloor-raw, the more we reduce OfficeFloor's inversion of coupling control overheads.

That said, looking at the Java competitors above OfficeFloor, we will also consider supporting Vert.x's PgClient for round 21.

OfficeFloor Async

As mentioned, the major change from round 19 to round 20 for OfficeFloor is the support for asynchronous programming.  The officefloor-async entry uses this functionality with R2DBC.

Now, OfficeFloor is a multi-threaded framework.  The asynchronous functionality, therefore, has thread synchronisation overheads that single-threaded asynchronous frameworks do not suffer.  Well, the developer suffers with a more complex programming model, but the performance of single-threaded asynchronous frameworks is optimal due to little contention.

Hence, the officefloor-async entry shines in the update test.  The update test is heavily database dependent and requires reducing contention between connections. Pipelining database queries over a lower number of connections decreases the number of parallel updates.  This subsequently reduces contention in the database and improves overall throughput.

The officefloor-async entry outperforms the other OfficeFloor framework entries in the update test because database contention is significantly more important to throughput than threading contention.

Thread Affinity

OfficeFloor supports Thread Affinity in its inversion of coupling control framework to pin the servicing of requests to a single CPU.  This ensures optimal use of CPU caches, as moving threads between CPUs incurs the overhead of migrating data to the other CPU's caches.  With thread affinity, the operating system's scheduler is instructed to only run the thread on a particular CPU.

For round 20, we have used thread affinity in the officefloor-raw entry to pin each socket listener thread to a particular CPU.  This effectively allows running as 32 separate servers on the Citrine hardware, rather than one big congested server grabbing at the work.  As contention is always overhead, using thread affinity to run isolated without contention (OK, memory bus aside) has provided increased throughput.
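
As a hedged sketch of the approach (using the OpenHFT Affinity library for illustration; OfficeFloor's actual affinity support is internal to the framework), pinning a listener thread looks roughly like this:

  import net.openhft.affinity.AffinityLock;

  public class SocketListenerAffinitySketch {

    public static void main(String[] args) throws InterruptedException {
      // Hypothetical socket listener thread pinned to a single CPU for its lifetime
      Thread listener = new Thread(() -> {
        AffinityLock lock = AffinityLock.acquireLock();
        try {
          // While the lock is held the OS scheduler keeps this thread on one CPU,
          // so its caches stay warm with the socket buffers it services
          // ... accept / read / write loop would run here ...
        } finally {
          lock.release();
        }
      }, "socket-listener-0");
      listener.start();
      listener.join();
    }
  }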

ByteBuffer Pooling

The ByteBuffers used to read from and write to the socket have seen improvements in their pooling.  Creating and destroying direct ByteBuffers is expensive, so it is much better to pool them.  Furthermore, as direct ByteBuffers are memory separate from the heap, they allow OfficeFloor to manage memory effectively (rather than relying on the "dark magic" of garbage collection).

OfficeFloor has two levels of pooling the ByteBuffers:
  • single core pool
  • thread local pool

The single core pool is a shared pool of ByteBuffers that all threads can draw from.  Previously, access to this pool was via synchronized methods.  This has been changed to use a ConcurrentLinkedDeque to reduce thread synchronisation overheads, providing improved throughput when accessing the core pool.
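
A minimal sketch of such a lock-free core pool (illustrative only, not OfficeFloor's internal classes):

  import java.nio.ByteBuffer;
  import java.util.concurrent.ConcurrentLinkedDeque;

  public class CoreBufferPool {

    private final ConcurrentLinkedDeque<ByteBuffer> pool = new ConcurrentLinkedDeque<>();
    private final int bufferSize;

    public CoreBufferPool(int bufferSize) {
      this.bufferSize = bufferSize;
    }

    public ByteBuffer acquire() {
      // Lock-free pop; only allocate a new direct buffer if the pool is empty
      ByteBuffer buffer = this.pool.poll();
      return (buffer != null) ? buffer : ByteBuffer.allocateDirect(this.bufferSize);
    }

    public void release(ByteBuffer buffer) {
      buffer.clear();
      this.pool.push(buffer);
    }
  }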

The thread local pools are individual pools per thread.  As each pool is specific to a thread, there is no thread contention in obtaining/returning ByteBuffers from/to the pool.  This further improves the performance of threads heavily involved in reading from/writing to the sockets.  However, it can cause out-of-memory issues if many threads are involved in socket reading/writing (which can particularly be the case for request servicing threads writing responses).

Improvements have also been made to the thread local pooling to keep memory down and allow better distribution of ByteBuffers.  This is achieved by only allowing the socket listener threads to pool ByteBuffers on their threads. As the majority of ByteBuffer interaction is undertaken by a constant set of socket listener threads, they can pool a significantly higher number of ByteBuffers for improved performance.  It does mean servicing threads need to use the core pool for the response ByteBuffers.  However, as servicing threads spend time doing other work, there is less chance of contention in accessing the core pool.  Therefore, with improved core pool access, less contention for the core pool and the socket listener threads now able to hold higher pool sizes, overall throughput is greater.
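
Building on the CoreBufferPool sketch above, the two levels might fit together roughly as follows (again illustrative; the socket listener check is a hypothetical predicate based on thread naming):

  import java.nio.ByteBuffer;
  import java.util.ArrayDeque;
  import java.util.Deque;

  public class TwoLevelBufferPool {

    private final CoreBufferPool corePool = new CoreBufferPool(8192);

    // Thread local pooling is only used by the long-lived socket listener threads
    private final ThreadLocal<Deque<ByteBuffer>> threadLocalPool =
        ThreadLocal.withInitial(ArrayDeque::new);

    private boolean isSocketListenerThread() {
      // Hypothetical check - e.g. by thread naming convention
      return Thread.currentThread().getName().startsWith("socket-listener");
    }

    public ByteBuffer acquire() {
      if (this.isSocketListenerThread()) {
        ByteBuffer buffer = this.threadLocalPool.get().poll();
        if (buffer != null) {
          return buffer; // no contention at all on this path
        }
      }
      return this.corePool.acquire();
    }

    public void release(ByteBuffer buffer) {
      if (this.isSocketListenerThread()) {
        buffer.clear();
        this.threadLocalPool.get().push(buffer); // pooled on the listener thread
      } else {
        this.corePool.release(buffer); // servicing threads return to the core pool
      }
    }
  }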

Parallel GC

While Java is optimising garbage collection for large heaps, OfficeFloor's focus is on reducing garbage.  Creating lots of objects incurs more GC, which is overhead taking away from performance.  Hence, by reducing object creation there is little need for large heaps.  Therefore, while OfficeFloor has moved from Java 11 (round 19) to Java 15 (round 20), we have fallen back to Parallel GC for improved throughput due to the small heap size.
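
Selecting the collector is a single JVM flag (the heap sizes and jar name below are placeholders, not the benchmark settings):

  java -XX:+UseParallelGC -Xms512m -Xmx512m -jar application.jar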

Note we did not notice any significant difference from Java 15 disabling biased locking by default.  However, this was not extensively tested, as we had to do it locally (we did not have enough continuous runs before round 20 was published to properly check this - so we went with Java 15's claim that the impact should be negligible).

The learning from this is that, for Java 15, consider using Parallel GC if your heap sizes don't grow large.


Future Work For Round 21

For round 21 we will be looking at flattening the call stack within OfficeFloor.  Work is already underway on this.  It will reduce the number of methods called to execute each function in the list, ideally bringing the OfficeFloor entries closer to officefloor-raw.


Saturday, 14 November 2020

OfficeFloor: going beyond Dependency Injection

This was a submission to DZone's Computer Science Fare:  https://dzone.com/articles/officefloor-going-beyond-dependency-injection

 
So why another Dependency Injection (DI) framework?  Because Dependency Injection is only part of the Inversion of (Coupling) Control problem.  OfficeFloor provides the complete injection solution.


Dependency Injection Problem

Dependency Injection on its own can actually promote lower cohesion and higher coupling.  Why?  Because Dependency Injection provides undisciplined shortcuts to get references to objects.  I need a repository to retrieve some data, I just inject it.  I need some logic to work out some result, I just inject it.  Over time, with ongoing improvements to systems, everything starts to reference everything else.  Changing the interface of one thing becomes difficult, as so many things just pull it in (higher coupling).  Furthermore, because it is so easy to "just inject" dependencies, convenience starts polluting the single-purpose logic of classes (lower cohesion).

Now, in disciplined development, we use principles such as SOLID to attempt to reduce this problem.  However, we are continually fighting Lehman's laws, which raise two general concerns:

  1. systems need to continually add new functionality to stay relevant
  2. new functionality increases complexity reducing ability to change the system

Too often, due to deadlines, budgets, or even just bad days, we keep enhancing classes without paying much attention to the second problem.  This all builds up and eventually we either:

  • require an expensive rewrite of the system (typically not feasible by the business)
  • live with degraded ability to add new features (potentially becoming the death of the system as it can no longer stay competitive)

Furthermore, Dependency Injection only looks at the static structural concerns of how objects reference each other.  We have not yet added threading into the mix, or how we can compose methods together (effectively gaining the benefits of functional composition).  We find this problem so systemic in DI-only systems that it has actually pulled the OO Matrix veil over our eyes.


Going beyond Dependency Injection

This is where OfficeFloor is taking things beyond Dependency Injection.

OfficeFloor injects dependencies not only into objects but also as arguments into methods.  This means we do not have to structurally link all objects together for a method to traverse those structures.  The method just adds a new parameter for a new dependency.  The result is that we avoid high coupling, because objects no longer require structural references to all other objects.

However, this only addresses the low coupling; we also want higher cohesion.  Well, OfficeFloor not only injects objects into methods, it also injects other methods into methods.  This effectively provides function composition, a concept OfficeFloor has termed Continuation Injection (or, easier to remember, Function Injection).  The result is higher cohesion of methods.  Methods can focus on single purposes and be composed together to achieve more complex functionality.
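
As a hedged sketch of the idea (the interface, class and method names here are illustrative, not OfficeFloor's API; see the OfficeFloor tutorials for the real annotations and configuration), a method declares the continuations it triggers as an injected interface and never references the classes implementing the downstream logic:

  // Illustrative continuation interface - the framework wires each method
  // to another procedure via the graphical configuration
  interface OrderFlows {
    void priceOrder(Order order);
    void rejectOrder(Order order);
  }

  class Order {
    private final int quantity;
    Order(int quantity) { this.quantity = quantity; }
    int getQuantity() { return this.quantity; }
  }

  class ValidateOrderProcedure {

    // Both the order and the continuations are injected; changing what handles
    // pricing or rejection is a re-wire, not a code change here
    public void validate(Order order, OrderFlows flows) {
      if (order.getQuantity() > 0) {
        flows.priceOrder(order); // continue to pricing
      } else {
        flows.rejectOrder(order); // continue to rejection handling
      }
    }
  }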

Note that due to OfficeFloor's containerising the execution of methods, OfficeFloor can also containerise Monads to be composed in with methods.  This allows blending Object Orientation with Functional Programming in the same application.  We are still exploring this aspect, but the composability provided by OfficeFloor keeps methods focused on single purposes, resulting in significantly higher cohesion.

Going even further

Now we could have stopped there, as we are gaining significant benefits over Dependency Injection.  However, this only lets you build more maintainable systems.  We want to build highly performant systems too.

Now multi-threading is hard.  Frameworks usually opt for a single model of threading (thread-per-request / asynchronous single threaded).  Attempting to blend threading models requires senior developer skill sets.  Furthermore, these blended solutions can run into tricky problems as further incremental functionality is added by other less skilled developers.

OfficeFloor allows assigning execution of individual methods to particular thread pools (what it terms Thread Injection).  This allows blending various threading models (even thread affinity patterns) to achieve the right mix of performance and ease of development.  OfficeFloor even allows for different execution profiles based on different hardware.  This provides highly performant applications.
 

Elephant in the room

Regardless of all these injection patterns, the real elephant in the room for applications is growing complexity.  To really address the second general Lehman issue of complexity, we need to be able to see the problem.
 
Lines of code one after another require us to mentally visualise the complexity in the code.  As we practice small incremental changes, complexity becomes hidden in the scroll blindness of the vast lines of code.  Yes, we modularise and use principles.  However, deadlines and budgets focus us on that small new feature and not the growing complexity elephant in the system.

Furthermore, this level of complexity becomes hard to explain to non-technical individuals who generally hold the purse strings or set priorities.  As code is already too complex, I find we start drawing diagrams to communicate the problem.  We then use these diagrams to justify "refactoring" efforts to simplify the code, with promises that we will be able to add new features faster after this.  What's even more "fun" is trying to provide estimates on the effort of the refactoring.  When opening the Pandora's box of tackling the complexity, I find in DI only systems, the uncovered herd of complexity elephants just trample over those estimates.

However, by the time you find the need to draw diagrams to start to communicate the complexity problem, it is all too late. Complexity has already become a problem.  The slow creep of new functionality has added complexity to the point that technical debt may become interest-only payments.  Basically, complexity is showing signs that the system is creaking towards its demise.

This late discovery of complexity goes against modern software practices.   We want early feedback on this complexity risk so that it can be addressed before it becomes a problem.

OfficeFloor's configuration of wiring the methods together is graphical.  This graphical configuration visually demonstrates the complexity of the application.  The more the diagram looks like spaghetti, the more effort needs to be spent simplifying the system.  As this is visual from the start of the project and always up to date (it is the actual configuration of the system), complexity is easily seen.  This means complexity can be addressed before it starts becoming a problem.

So our motivation behind OfficeFloor is to provide the necessary improvements over Dependency Injection to ease the building of systems, ease the maintaining of systems, and avoid increasing complexity, so that developers keep their hair when trying to wrangle in that new bit of functionality.

Inversion of Coupling Control

So you might ask, why aren't Dependency Injection frameworks adopting Function Injection / Thread Injection for a complete Inversion of Coupling Control solution?

Because Function Injection and Thread Injection are significantly harder to implement than just injecting objects into each other.

To achieve method injection, we needed to containerise the execution of each method.  The container:
  1.  Ensures all dependencies are available for the method
  2.  Delegates the execution of the method to the appropriate thread pool
  3.  Handles all exceptions coming from the method
Now, this is only the start of the problem.  Once we containerised the method, we needed to type the method.  To compose methods together, we needed to know which other methods and objects each method depends on.  The result is a typing system for methods that allows them to be visually represented in the graphical configuration.  The wiring of the methods can then be undertaken graphically.
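
To make the container's three responsibilities concrete, here is a minimal sketch (illustrative only; OfficeFloor's real containers also manage typing, state and continuations):

  import java.util.concurrent.Executor;
  import java.util.function.Consumer;
  import java.util.function.Supplier;

  public class MethodContainer<D> {

    private final Supplier<D> dependencyResolver;   // 1. ensures dependencies are available
    private final Executor executor;                // 2. the thread pool assigned to this method
    private final Consumer<Throwable> errorHandler; // 3. handles all exceptions from the method

    public MethodContainer(Supplier<D> dependencyResolver, Executor executor,
        Consumer<Throwable> errorHandler) {
      this.dependencyResolver = dependencyResolver;
      this.executor = executor;
      this.errorHandler = errorHandler;
    }

    public void execute(Consumer<D> method) {
      this.executor.execute(() -> {
        try {
          method.accept(this.dependencyResolver.get());
        } catch (Throwable failure) {
          this.errorHandler.accept(failure);
        }
      });
    }
  }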

The model-driven design issues of using graphical configuration also need to be avoided.  This is achieved by modularising the graphical configuration similarly to class source files.  The graphical configuration files are:
  • modularised into separate files
  • each modular file is then typed, so it can be included like typed methods in other modules
  • within each module there is a flat structure (this avoids hierarchies that cause source code merge issues - merge is just like flat lines of code)
  • the flat structure is sorted on save, so editing graphical configurations results in the same stored XML (avoiding duplications due to source code merging different line orderings) 
  • contain no generated identifiers causing merge differences between developers
Finally, the XML backing the graphical configuration is easily readable for those rare occasions when merge conflicts must be resolved.

Performance requirements dictated that we could not keep swapping threads for each method executed.  Hence, if the next method was to be executed by the same thread pool, the current thread would continue on to execute that method.  This meant that thread context switches only occurred when required and not for every method execution.  This allows OfficeFloor to act as both asynchronous single threaded (no context switch) and thread-per-request (when using thread pools for method execution).  However, as threading can be tailored to specific methods, there are various custom threading models available to improve performance of your application (even including thread affinity).

Building this flexibility into threading allows the containerised methods to be executed by any thread. Hence, memory synchronisation issues become important.  To avoid significant locking and memory synchronisation between threads, the internals of OfficeFloor are written mostly with immutable structures.  This significantly reduces locking when executing the methods within the containers.

Furthermore, as methods are not statically linked like objects, there is a need for fast lookup of dependencies.  OfficeFloor provides a compiler that transforms the modular configuration into a graph representing the entire application.  This graph is checked for type correctness so that invalid application configurations are detected at compile time.  Once deemed valid, the graph is compiled down into an execution engine that uses array index lookup of all dependencies for performance (avoiding hash map lookups by string).
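
The performance difference is simply that of an array index versus a string-keyed map lookup (a toy illustration, not OfficeFloor's compiler output):

  import java.util.Map;

  public class DependencyLookupSketch {

    // What a string-keyed lookup resolves at run time
    static Object byName(Map<String, Object> dependencies, String name) {
      return dependencies.get(name); // hashing and equality checks on every call
    }

    // What the compiled execution engine resolves instead: the compiler has
    // already assigned each dependency a fixed index in the state array
    static Object byIndex(Object[] dependencyState, int index) {
      return dependencyState[index]; // a single array read
    }
  }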

Show me the working code

The following provides working code demonstrations of OfficeFloor:

 

Conclusion

So next time you look to build an application, consider OfficeFloor.  With the enhancements of Function Injection and Thread Injection over Dependency Injection, OfficeFloor encourages significantly lower coupling and higher cohesion in your applications.  This stops complexity from creeping up and can keep you adding new features long after a DI-only system becomes frozen by complexity. See the tutorials to get going with OfficeFloor.


Saturday, 28 March 2020

Migrating Spring Application to OfficeFloor

This series of articles looks at migrating Spring applications to OfficeFloor.

Why?

So the first question comes up.

Why migrate Spring applications to OfficeFloor?

Spring provides dependency injection to create a graph of objects. Spring, however, provides little management over threading and attempts to model behaviour through object composition. This is where OfficeFloor, with its Inversion of Coupling Control, provides injected management of threading and functional behaviour.

Now this is not to say Spring is inferior.  Spring's focus has come from and stays with mainstream Object Orientation.  OfficeFloor has merely taken these Object Orientation concepts and extended them with Functional Programming composition concepts and Process/Threading concepts.

Now if you are happy with mainstream Object Orientation then by all means stick with Spring.  Spring is a great Object Oriented framework that deserves its success.

However, if you want to avoid writing your own threading or would like more seamless support for functional programming, then please read on.

Let's not replace Spring

It would be unreasonable to port a Spring application in its entirety.  Spring has been under active development since the turn of the millennium.  Ok, at the time of writing this article, that is just under two decades. For some industries that could be considered a short time.  However, in the IT industry, where there are new JavaScript frameworks popping up every day, this longevity is a testament to Spring's success.

For me, Spring's success has been because it got something fundamentally right. Now we could argue academically about Object Oriented composition vs Functional Programming composition.  We could argue about libraries vs frameworks.  We could even argue about Java vs new languages like Kotlin. But what I see in these arguments is a focus on finding simpler and easier ways to write software.  And for me, Spring (and more specifically Spring Boot) did that by getting rid of the repetitive plumbing code, so we could get on with writing the important functionality of our applications.

To enable this focus on removing repetitive plumbing code, a few things are required:
  1. Extensible framework
  2. Plugins written for supporting the various infrastructure / existing solutions
  3. Open platform for community addition / management of plugins
Spring, by wiring objects together, enables this extensibility.  New functionality can be included by wiring in the objects that support that functionality.

Plugins then extend the framework to cover including various infrastructure and existing solutions.  The plugins avoid us re-inventing the wheel so our focus can be on achieving the functionality of our application.

The final aspect is supporting the IT ecosystem. There are so many competing technologies and even platforms with the various cloud providers. A single group or company attempting to support this would be overwhelmed. Therefore, Spring has developed a strong open source community to write and maintain all the plugins. Even VMware's acquisition of Spring mentions this open source community.

Therefore, it is unreasonable to rewrite Spring into something new.  There is too much of an ecosystem to replicate.

Integrating Spring

So rewriting everything Spring into OfficeFloor is not a feasible direction.

Rewriting is also a violation of the first point - an extensible framework.

OfficeFloor must subsequently be extensible enough to plug in Spring.  OfficeFloor supports extension via Object Orientation, Functional Programming and Threading.  Therefore, if OfficeFloor is not extensible via Object Oriented Spring, then OfficeFloor fails at being extensible.

OfficeFloor supports plugging in Spring via a SupplierSource.  This extension point of OfficeFloor allows for third party dependency injection frameworks to make their dependencies available for injection in OfficeFloor applications.

Now, OfficeFloor's website provides detailed tutorials on various aspects.  OfficeFloor likes to adhere to the DRY (don't repeat yourself) principle, so this article series will direct you to the relevant tutorials for detailed aspects of Spring to OfficeFloor migration. Therefore, please see the Spring tutorial for how to plug Spring dependencies into OfficeFloor.

The result of plugging Spring into OfficeFloor is that OfficeFloor gets a library of pre-written objects.  Spring focuses on wiring together a graph of objects.  OfficeFloor does not interfere with this, as Spring does it exceptionally well.  What OfficeFloor does is allow reference to the various objects in the Spring graph for injection into its own application.  This effectively allows OfficeFloor to be extended with Spring-managed objects.
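
Once the Spring beans are made available, using one is just parameter injection into an OfficeFloor procedure method (a hedged sketch with hypothetical bean and class names; the actual wiring is covered in the Spring tutorial):

  import org.springframework.stereotype.Repository;

  // An ordinary Spring bean, wired by Spring as usual
  @Repository
  class MessageRepository {
    String findMessage(int id) {
      return "Hello from Spring bean " + id;
    }
  }

  // An OfficeFloor procedure: the Spring bean is simply injected as a method parameter
  class MessageProcedure {
    public String getMessage(MessageRepository repository) {
      return repository.findMessage(1);
    }
  }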

Going beyond Objects

The rest of this article series will look at how to migrate our Object Oriented Spring applications to OfficeFloor to start using OfficeFloor's Functional Programming and Threading extensions.  While objects link data with functionality (methods), objects still provide little in terms of threading.  This article series will show how to further simplify our Spring applications with these Functional Programming and Threading extensions/plugins by migrating to OfficeFloor.

Tuesday, 11 February 2020

DDD Perth 2019 Presentation

The DDD Perth 2019 presentation of Inversion of Coupling Control has been published.  Enjoy!


Saturday, 25 January 2020

Monad with IoCC provides Process/Thread model

This is the final article in the series on looking at Inversion of Coupling Control (IoCC) composition.  The previous articles covered:


This article looks at providing a mathematical model to explain the composition.

Just a little disclaimer that I'm not a mathematics boffin.  I have a degree in computer science, but it did not cover much functional programming. Much of this comes from my self-taught understanding of functional programming and mathematics.  Therefore, I'm happy to take feedback from more capable mathematicians on better ways to express the model.  However, I'm hoping this article reasonably conveys the underlying model for composition with Inversion of Coupling Control.

From Category Theory, we have composition of morphisms:

  f(x) . g(x) = f.g(x)

With this we can introduce dependencies:

  f(x)(d1) . g(x)(d2) = f.g(x)(d1, d2)

where:
  d is a set of dependencies

This makes the program very rigid, as changing d1 to d3 has significant impact on every use of f(x) (d1 becoming d3).  For example, switching from a database connection to a REST end point.

ZIO attempts to reduce the rigidity by the following:

  f(x)(d1) . g(x)(d2) = f.g(x)(D)

where:
  D = d1 + d2
Or, in other words:
  D extends d1 with d2

Now, we can create lots of morphisms and at execution of resulting ZIO, provide a hom(D), which is the set of all required dependencies.

So, this model works.  It is certainly enabling injection of dependencies in functional programs.

Now, I'd like to take another tack to the problem.

The Imperative Functional Programming paper could not see how to remove the continuation type (z) from the signature. The authors did conclude that Monads and CPS are very similar, but due to the extra continuation type on the signature and the authors' intuition, the IO Monad was the direction forward.

Now, I am certainly not taking the tack of replacing the IO Monad with CPS. I'm looking to create a complementary model: a model where continuations decouple the IO Monads.

So introducing dependencies to the IO Monad, we get:

  IO[x](d)

where:
  d is the set of dependencies required

It then follows that, joining two IOs together, we get:

  IO[x](d1, d2)

So, maybe let's keep the IO Monads separate and join them via CPS.  This changes the signature to:

  IO[x](d)(z)

where:
  z = Either[Throwable,x] -> ()

The pesky z that the Imperative Functional Programming paper was talking about.

However, as discussed previously, there is Continuation Injection. This effectively hides the z from the signature, making it an injected function. As it is an injected function, the z becomes an indirection to another function. This indirection can be represented as:

  IO[_](d1) -> (Either[Throwable,y] -> IO[y](d2)) -> IO[y](d2)

Note: the joined IO need only handle y or any of its supertypes. Hence, the relationship indicates the passed type. This makes it easy to inject another IO to handle the continuation.

Now to start isolating the IO Monads from each other, we are going to start with Thread Injection.

  d -> Executor

This represents Thread Injection choosing the appropriate Executor from the dependencies.  Therefore, we can then introduce a Thread Injection Monad to choose the Executor.

  F[_](d)(Executor) -> (d -> Executor) -> TI[F[_](d)] 

where
  TI is the Thread Injection Monad that contains the dependency to Executor mapping to enable executing the IO Monad with the appropriate Executor.

The above continuation relationship between IO Monads then becomes:

  TI[IO[_](d1)] -> (Either[Throwable,y] -> IO[y](d2)(Executor)) -> TI[IO[y](d2)]

Now the IO Monads can be executed by the appropriate Executor via the TI Monad.

Further to this, we can model dependency injection with:

  F[_](d) -> (F[_](d) -> F[_]) -> DI[F[_]]

where
  DI is the Dependency Injection Monad that supplies dependencies to the function.

Note that DI Monad will also manage the life-cycle of the dependencies.  Discussion of how this is managed will be a topic for another article.

So the above IO Monad continuation relationship becomes:

  TI[DI[IO[_]]] -> (Either[Throwable,y] -> IO[y](d)(Executor)) -> TI[DI[IO[y]]]

where:
  DI propagates the same instances of dependencies across the continuation

Now, with Continuation Injection we are not limited to injecting only one continuation.  We can inject many:

  TI[DI[IO[_]]] -> (Either[Throwable,y] -> IO[y](d)(Executor)) -> TI[DI[IO[y]]]
                -> (Either[Throwable,w] -> IO[w](d)(Executor)) -> TI[DI[IO[w]]]
                ...

Note: I'm guessing this can be represented on a single line (possibly as a set of continuations from a particular IO), but I'll leave that to a boffin more mathematical than me.

This means we can remove the Either and have the (possibly many) exceptions handled by separate continuations to get:

  TI[DI[IO[_]]] -> (y -> IO[y](d)(Executor)) -> TI[DI[IO[y]]]
                -> (ex -> IO[ex](d)(Executor)) -> TI[DI[IO[ex]]]
                ...

This demonstrates that an IO may now actually have more than one output. By having the ability to inject multiple continuations, the IO is capable of multiple outputs.

It is also execution safe.  OfficeFloor (Inversion of Coupling Control) ensures the handling of one continuation completes before the next continuation begins executing. This ensures only one IO is ever being executed at one time.

Further to this, we can qualify DI. Originally we had d1, d2 that were hidden by DI. We can qualify the scope of DI as follows:

  DI[P,T,_]

where:
  P is the set of process dependency instances
  T is the set of thread dependency instances

This allows for the following.

  Same thread = DI[P,T,_] -> (_ -> _) -> DI[P,T,_]
  Spawned thread = DI[P,T,_] -> (_ -> _) -> DI[P,S,_]
  New process = DI[P,T,_] -> (_ -> _) -> DI[Q,S,_]

In other words,
  • spawning a thread creates a new set of thread dependency instances
  • interprocess communication is a different set of process dependency instances
Further to this:
  • the set of T may only be the same if the set of P is same
  • context (eg transactions) apply only to dependencies in T
The resulting IO Monad relationship for a same thread continuation becomes.

  TI[DI[P,T,IO[_]]] -> (y -> IO[y](d)(Executor)) -> TI[DI[P,T,IO[y]]]

while a spawned thread continuation relationship is modelled as follows.

  TI[DI[P,T,IO[_]]] -> (y -> IO[y](d)(Executor)) -> TI[DI[P,S,IO[y]]]

What this essentially allows is multi-threaded concurrency. Any continuation may spawn a new thread by starting a new set of thread dependencies. Furthermore, OfficeFloor will asynchronously process the continuation, returning control immediately. This has the effect of spawning a thread.

The same goes for spawning a new process.

  TI[DI[P,T,IO[_]]] -> (y -> IO[y](d)(Executor)) -> TI[DI[Q,S,IO[y]]]

Therefore, with OfficeFloor, processes and threading are a configuration problem, not a programming problem. Developers are no longer required to code thread safety into their possibly imperative code within the IO. As the seldom-used process dependencies are coded thread safe, this relationship introduces the ability for thread-safe mutability within the IO. The isolation of dependencies prevents memory corruption (assuming dependencies respect not sharing static state).

OfficeFloor (Inversion of Coupling Control) is in this sense possibly the dark side. Functional programming strives for purity of functions being the light. Given OfficeFloor can handle:
  • multiple outputs from IOs including exceptions as continuations
  • mutability within the IOs that is thread safe
OfficeFloor enables modeling the darker impurities (or maybe I just watch too much Star Wars).

What we now have is a possible "inversion" of the function:
  • Function: strives to be pure, may have multiple inputs and only a single output.    
  • IoCC: allows impurities, has a single input and may have multiple outputs.
I personally like to think of functions like parts of a machine.  They are highly coupled engine cogs providing always predictable output.

I then like to think of IoCC like signals. This is a more organic structure of loosely coupled events triggering other loosely coupled events. The result is a mostly predictable output. This is more similar to human decision-making.

Regardless, we now have a typed model that can be represented as a directed graph of interactions. The IO Monads are the nodes, with the various continuations as edges between them. An edge in the graph is qualified as follows:

  TI[DI[IO]] == y,p,t ==> TI[DI[IO]]

where:
  y indicates the data type provided to the continuation
  p indicates if a new process is spawned (represented as 0 for no and 1 for yes)
  t indicates if a new thread is spawned (again represented as 0 for no and 1 for yes)

The result is the following example graph, which is essentially the OfficeFloor Continuation Injection configuration.

Summary

All of the above is already implemented in OfficeFloor.

The previous articles demonstrated the type system of Inversion of Coupling Control to enable composition. The type system also enabled encapsulation in First-Class Modules for easy assembly and comprehension of OfficeFloor applications. This was then demonstrated with a simple example application.

What this article has attempted to cover is the core underlying model. It has looked at how injected continuations can be used to join together IO instances. Further, it looked at the dependencies and how they can be used to model processes and threads.

Future Work

At the moment, we're focused on making building non-distributed applications a pleasure with OfficeFloor. This runs on the premise that if you are not enjoying building smaller applications with a toolset, why would you want to build larger, more complex applications with it?

However, we are nearing the completion of the bulk of this work.

We will be looking to simplify building distributed applications soon.

This will be achieved by looking at algorithms that examine the directed graph of continuations to decide on the best places to separate the IOs into different containers.  The algorithms will take into account the above relationships. In particular:
  • the directed graph of continuations
  • isolating sub graphs by the process (and possibly thread) dependency rules
  • identifying which sub graphs to isolate to another container by incorporating run time metrics of the IOs
Note that we can model interprocess communication as:

  Async (e.g. queue) = DI[P,T,_] -> (message -> _) -> DI[Q,S,_] -> ()
  Sync (e.g. REST)  = DI[P,T,_] -> (request -> _) -> DI[Q,S,_] -> (response -> _) -> DI[P,T,_]

This can provide type safe modeling of distributing the IOs within the directed continuation graph.

Note that we may have to mark dependency instances that carry non-replicable state between IO Monads.  In other words, a database connection (not in a transactional context) can be replicated by obtaining another connection from the pool.  However, a dependency that stores a value from one IO Monad for use by the next IO Monad is non-replicable.

In practice, these have only been variable dependencies used to further remove the parameter coupling between IO Monads (see the OfficeFloor tutorials).

However, we are also finding in practice that it is relatively intuitive to find sub graphs to isolate to their own containers.  Endeavours in this work will likely look at automating the dynamic isolation of sub graphs to containers as load changes (effectively providing selective elastic scaling of functions rather than arbitrarily bounded microservices).

Tuesday, 14 January 2020

compose Cats, Reactor, ZIO, ... Effects

This is the third in a series of articles looking at the type system for Inversion of Coupling Control to provide composition.

The previous articles covered:
This article will look at taking the theory into practice.  It will use the concepts to build an application composing Effects from various Effect libraries.

Note that the effects used are kept deliberately simple to focus on the composition of the effects.  This is mainly because this article is not about comparing libraries; it is about composing them.  We show how, using Inversion of Coupling Control, they can be seamlessly composed together in a simple application.  Also, the order of discussing the libraries is nothing more than alphabetical.

To keep matters simple, the effect will be retrieving a message from the database.

Cats

Let's begin with Cats Effect.

  def cats(request: ServerRequest)(implicit repository: MessageRepository): IO[ServerResponse] =
    for {
      message <- catsGetMessage(request.getId)
      response = new ServerResponse(s"${message.getContent} via Cats")
    } yield response

  def catsGetMessage(id: Int)(implicit repository: MessageRepository): IO[Message] =
    IO.apply(repository findById id orElseThrow)

The catsGetMessage function wraps the repository's message retrieval effect within an IO.  This can then be used to service the request and provide a response (as per the cats function).

The use of implicit may be overkill for the single repository dependency.  However, it shows how dependency injection can remove dependency clutter from the servicing logic.  This is especially useful when the number of dependencies grows.

Reactor

Reactor has the following servicing logic.
  def reactor(request: ServerRequest)(implicit repository: MessageRepository): Mono[ServerResponse] =
    reactorGetMessage(request.getId).map(message => new ServerResponse(s"${message.getContent} via Reactor"))

  def reactorGetMessage(id: Int)(implicit repository: MessageRepository): Mono[Message] =
    Mono.fromCallable(() => repository.findById(id).orElseThrow())

Again, there is a reactorGetMessage function wrapping the message retrieval effect into a Mono.  This is then used to service the request.

ZIO

For ZIO the logic is slightly different, as ZIO provides its own dependency injection.

  def zio(request: ServerRequest, repository: MessageRepository): ZIO[Any, Throwable, ServerResponse] = {
    // Service logic
    val response = for {
      message <- zioGetMessage(request.getId)
      response = new ServerResponse(s"${message.getContent} via ZIO")
    } yield response

    // Provide dependencies
    response.provide(new InjectMessageRepository {
      override val messageRepository = repository
    })
  }

  def zioGetMessage(id: Int): ZIO[InjectMessageRepository, Throwable, Message] =
    ZIO.accessM(env => ZIO.effect(env.messageRepository.findById(id).orElseThrow()))

  trait InjectMessageRepository {
    val messageRepository: MessageRepository
  }

The zioGetMessage function again wraps the database message retrieval effect within a ZIO.  However, it accesses the injected trait (the environment) to retrieve the repository.

Encapsulating into a Module

The above functions (cats, reactor, zio) are configured as First-Class Procedures into the following Module.
This module has a single output, the Response, with inputs Cats, Reactor, ZIO and Imperative.

As First-Class Procedures are lazily evaluated, they can also wrap imperative code containing effects.  The imperative function is the following:

  def imperative(request: ServerRequest, repository: MessageRepository): ServerResponse = {
    val message = repository.findById(request.getId).orElseThrow()
    new ServerResponse(s"${message.getContent} via Imperative")
  }

Using the Module

The following configuration uses the module to service REST requests.  It is configured as the Synchronous module.

This demonstrates how easy it is to configure the module into servicing requests.

What is further interesting is that the Asynchronous module has the same interface of inputs/outputs as the Synchronous module.  Now, this could quite possibly be the above module re-used (just badly named).  However, it is not.  The Asynchronous module undertakes the same logic, just asynchronously (code available in the demo project).

What is important for modules is the contractual interface of inputs and outputs.  We could quite happily swap the Synchronous / Asynchronous modules around in the configuration and the application will still continue to work.  This allows the complexity to be encapsulated.

A more real-world example: we could start out with synchronous effects, which are quicker to write and easier to debug.  Then, as the application grows, we may decide to swap in an asynchronous module to better handle scale.  The refactoring to swap the Synchronous module for the Asynchronous module would be:
  1. Drop in new Asynchronous module
  2. Re-wire flows to the Asynchronous module
  3. Delete the Synchronous module
As Inversion of Coupling Control removes the function coupling, there is no code to change except providing the implementation of the new module.

With modules able to contain modules, this provides a means to encapsulate the complexity of the application for easier comprehension.  It also makes importing modules simple: drop them in and wire them up.  This is especially useful when libraries of third-party modules are available, providing ready-to-use functionality for composition.

Composing Effects

This demonstrates First-Class Procedures and First-Class Modules of the previous articles in this series.

Hey, but this article promised composing effects!

Well, I could tell you that the send is an effect, and that composing it after the above effects is that composition.  However, that's taking a lot on my word.

Therefore, the last module in the server configuration is the following.
This module composes an effect from each of the libraries.  The code for each effect is the following.
  def seed: String = "Hi"

  def cats(@Parameter param: String): IO[String] = IO.pure(s"$param, via Cats")

  def reactor(@Parameter param: String): Mono[String] = Mono.just(s"$param, via Reactor")

  def zio(@Parameter param: String): ZIO[Any, Nothing, String] = ZIO.succeed(s"$param, via ZIO")

  def imperative(@Parameter param: String): String = s"$param, via Imperative"

  def response(@Parameter message: String): ServerResponse = new ServerResponse(message)

Each effect just takes the output of the previous one and appends its library name.  The resulting response is a string containing all the effect library names.

No adapters

Astute readers may be thinking that under the hood of OfficeFloor there may be some fabulous adapters between the libraries.  Hmmm, can we extract these and make use of them?

Sadly, and for that matter quite happily, there are no adapters between the libraries.  What actually happens is that each First-Class Procedure unsafely executes its effect and retrieves the resulting output.  With that output, OfficeFloor then invokes the next First-Class Procedure.  By doing so, we do not need to adapt the libraries to each other.  We can run each effect in isolation and interface them via their typed inputs/outputs.

This makes integration of new effect libraries very simple.  Just write a once-off adapter to encapsulate the library's effects within a First-Class Procedure.  The effect library is then able to integrate with all the other effect libraries.  As First-Class Procedures are actually a specialised First-Class Module, this demonstrates the composition capabilities of Inversion of Coupling Control.

Summary

This article has been code and configuration heavy to demonstrate how First-Class Procedures and First-Class Modules compose.

It has demonstrated that the type system of Inversion of Coupling Control makes composition easy (essentially drawing lines).

Now you need not take my word on the code examples in this article.  They are extracted from the demonstration project you can clone and run yourself (found at https://github.com/officefloor/ComposeEffectsDemo).

Also, if we've missed your favourite effects library, please excuse me.  We're happy, if there is enough interest, to work with you to incorporate adapters providing further demonstration of integrating the beloved effect library.  The focus of OfficeFloor is not to be opinionated but rather to provide an open platform for integrating software.

The next article in the series tests my self-taught mathematics in an attempt to explain the underlying model of why this ease of composition is possible.

Wednesday, 25 December 2019

First-Class Module

This is the second in a series looking at the Inversion of Coupling Control type system for composition.  This article discusses a more general Module type system than the previous article's First-Class Procedure Type.

Note: some functional programming languages also attempt to define First-Class Modules.  The First-Class Modules defined in this article are created from inverted functions.

First-Class Procedure

To recap the last article, the First-Class Procedure's type is defined as follows.  Note that we exclude the dependency type, as dependencies are auto-wired.

FirstClassProcedureType {
    Class<?> parameterType;
    ContinuationType[] continuations;
}

ContinuationType {
    String name;
    Class<?> argumentType;
}

This defines the First-Class Procedure to have a single input parameter and multiple continuations out for further composition of logic and handling of exceptions.

Module

Having a single input with multiple outputs is fine for methods, functions, etc. wrapped in First-Class Procedures.  However, as systems grow we don't want the inputs/outputs to suffer similarly increased complexity.  We want the inputs/outputs to provide an interface that encapsulates the complexity of the Module.  Note that without encapsulation, we don't get the ability to modularise the complexity of the application.

To enable an interface to the Module, let's create the following input / output types:

InputType {
    String name;
    Class<?> parameterType;
}

OutputType {
    String name;
    Class<?> argumentType;
}

To understand why these types are created, we are going to use the visual configuration of Inversion of Coupling Control to better aid understanding of what is happening.

The following Module configuration represents a single input, handled by a First-Class Procedure that sends its result to an output:


In the above configuration, the First-Class Procedure is encapsulated in the Module.  All that is exposed from the Module are the Inputs and Outputs.  The resulting type of the above Module would be the following:
  • Input named "Input" with a parameter passed to the First-Class Procedure
  • Output named "Output" with the argument provided by the result of the First-Class Procedure execution
This, however, provides little improvement on the First-Class Procedure interface.

What becomes useful is the encapsulation of multiple First-Class Procedures to undertake functionality of the Module:


While a new procedure was included within the Module, there was no change to the interface of the Module.  Other configuration using the Module would be unaware of the internal addition of another First-Class Procedure.

We also need not limit ourselves to single inputs and outputs.   We could have an arbitrarily complex Module that has multiple Inputs and Outputs:


The resulting Module encapsulates the detail and has the following interface:
  • Input "Input"
  • Input "Input-2"
  • Output "Output"
  • Output "Output-2"
  • Output "Output-3"

Module Type

The resulting type for the Module is the following:

SectionType {
    InputType[] inputs;
    OutputType[] outputs;
}
Note that OfficeFloor's naming is derived from its foundation in business concepts and subsequently calls a Module a "Section".

The Module (Section) has multiple inputs and multiple outputs.  These inputs/outputs can then be connected to respective outputs/inputs of other Modules.
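
Composition then amounts to connecting an OutputType of one module to an InputType of another, with the types checked for compatibility (a toy sketch over the types above, not OfficeFloor's actual wiring API):

  public class ModuleWiring {

    // Connect an output to an input, checking the argument can be passed as the parameter
    public static void connect(OutputType output, InputType input) {
      if (!input.parameterType.isAssignableFrom(output.argumentType)) {
        throw new IllegalStateException(
            "Output " + output.name + " is not compatible with input " + input.name);
      }
      // ... record the link for the compiler to wire together ...
    }
  }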

Furthermore, Modules may themselves contain other Modules.  As inputs/outputs are connected for composition, Modules have the same input/output connectivity as First-Class Procedures.  The following configuration demonstrates embedding the Module at the start of this article within another Module:


Whether the Module contains a single First-Class Procedure or two First-Class Procedures is encapsulated and not of concern within the above configuration.  The use of the Module is only via the Inputs / Outputs it exposes.  The rest of the complexity of the Module is encapsulated. This allows modularising the application's complexity.

First-Class Module

So the title mentioned "First-Class Modules", yet we've only discussed visually wiring together the Modules.

To essentially be "First-Class", the Module needs to be able to be assigned to a variable.  Yes, there are other conditions.  However, for me, this is the simplest way of thinking about something being first class.

Well, the above graphical configuration is built on Sections (Modules) being configured together programmatically.  The graphical configuration is actually a layer above the First-Class Modules (Sections) to enable easier comprehension of how the application is modularised.

You can see this in OfficeFloor's implementation of the graphical configuration used above in this article.  The above graphical configuration is via an Activity. An Activity is a specific specialisation of a Section (ActivityLoaderImpl source here).  The Activity translates the XML from the graphical configuration into the creation of Sections, First-Class Procedures, Inputs and Outputs.  Each of these in the Activity implementation is assigned to variables, stored in data structures, passed to functions, returned from functions, etc.  This makes the Section (Module) essentially "First-Class".

This input / output interface based on continuations is extremely flexible.  So much so that First-Class Procedures themselves are just a specialised implementation of a Section (see ProcedureEmployer).

Summary

We have seen how we can encapsulate First-Class Procedures within First-Class Modules, and even First-Class Modules within other First-Class Modules.

We have shown how the graphical configuration is actually taking advantage of the "First-Class" nature.  The graphical configuration is actually a higher level composition that provides both:
  • easier to comprehend modularising of the application
  • quicker configuration of the application (effectively just draw lines for composition)
Note that it is quite possible to programmatically configure our application.  However, this requires understanding First-Class Procedures / Modules in significantly more depth; much more than junior developers may want to take on initially.

The graphical configuration of First-Class Modules, therefore, provides the simplicity needed for building modularised applications, without having to deal with the complexity of the underlying constructs.  This is something I'm finding other composition strategies are still having trouble with.

In the next article we look at how First-Class Modules can compose various existing composition strategies.  You may find that existing composition strategies only really consider programming in the small, rather than programming in the large, where First-Class Modules become a lot more effective in modularising and simplifying your applications.