CQRS - Keeping it simple
Architecture Series Overview
I have recently become increasingly interested in how we, as developers and architects, structure the systems that we are creating. The demands on on-line systems are significantly different to the demands on desktop only software. Scalability, robustness, performance, resilience are all important factors when thinking about on-line services and there has been a lot of debate within the software world about how we should architect these systems. These posts are my views of interesting architectural patterns and practices that I have been looking at:
- CQRS: Keeping it simple is right here
- Domain Events
CQRS
Command Query Responsibility Segregation, or CQRS from now on, is an architectural pattern developed by Greg Young and Udi Dahan. My aim is to take you through a very brief introduction to the simplicity of CQRS and also show you how it can be used to handle a number of different problems that we face within the software world such as separation of concerns, security and scalability.
CQRS breaks down your interaction with a domain into commands and queries as defined below:
- Command - Can change state but returns nothing (i.e. there is no query aspect to the command processing)
- Query - Can only return values but cannot change state, i.e. has no side-effects
We keep them separate from each other, implement them on different interfaces/endpoints for the simple reason that it allows us greater flexibility on the architecture of the system.
In many respects that is all that CQRS says and it seems quite boring which is why so much that is written about CQRS also delves into the realms of
Domain events,
Event Sourcing and even
Non-relational databases. You do not need any of these in order to build a CQRS system, they are useful concepts that I will delve deeper into in later posts but they are not essential in order to get the benefits of CQRS.
I would advise that anyone attempting to do CQRS also look at
Domain Driven Design (I will create a blog post on this later) to identify your domains and create the boundaries and interfaces between them.
Separation of Concerns
Well, the brief summary above says it all really, doesn't it? How does this really provide any benefit?
Today I'm going to start with a very simple example that I will hopefully expand upon in my later posts. I have chosen to think about what it is I need to do to look at a shop model and think specifically about the domain of the products.
As a store manager I need the ability to be able to add/remove items from the store, I need the ability to change the prices and create discounts and I need the tills to query the product information for the current price based on the barcode of the product (or pick the product from a list in the case of fresh produce).
Traditionally, as a developer I would have a product service that gave us this entire functionality:
public interface ProductService
{
ProductList GetAllProducts();
ProductList SearchProducts(ProductSearchCriteria criteria);
Product GetByBarcode(Barcode barcode);
Product CreateProduct(Product newProduct);
Product UpdateProductDetails(Product newDetails);
Product DeleteProduct(Product productDetails);
Product SetPrice(Price price);
Product AddOffer(Offer offer);
}
I host this service on a PC in the store and I'm quite happy with what I have. However, what models am I using to represent this? Do I have to submit an entire product model just to change the price for that product? Technically not though keeping everything in a single interface does push the developer towards re-using models throughout the interface. Traditionally the add would return the new product including any internally generated identifier that will be uniquely used to identify the product etc.
So what would I produce in a CQRS design?
public interface ProductQueryService
{
ProductList GetAllProducts();
ProductList SearchProducts(ProductSearchCriteria criteria);
Product GetByBarcode(Barcode barcode);
}
public interface ProductCommandService
{
void CreateProduct(NewProductCommand command);
void UpdateProductDetails(ChangeProductDetailsCommand command);
void DeleteProduct(DeleteProductCommand command);
void SetPrice(SetPriceCommand command);
void AddOffer(CreateOfferCommand command);
}
public interface ProductCommandProcessor
{
void CreateProduct(NewProductCommand command);
void UpdateProductDetails(ChangeProductDetailsCommand command);
void DeleteProduct(DeleteProductCommand command);
void SetPrice(SetPriceCommand command);
void AddOffer(CreateOfferCommand command);
}
So, we now have two interfaces which looks like it increases complexity but it has now separated out the two different concerns of our system. Our queries are using different models than the commands and conceptually thinking about a command will allow you to focus in on the exact details that are needed to process that command.
When we are developing the product
Domain we are focussed on the
verbs (the conceptual items that exist within the domain) and
nouns (the actions that can be performed on those items). In a CQRS system the query service is solely concerned with extracting a data structure to represent the conceptual item but the command processor is primarily concerned with hiding the data and applying actions to the concepts. There does not need to be any overlap between the domain of the query (data structures) and the domain of the command (actual objects, using proper OO design).
In the traditional model if I wished to add in one extra piece of data to the Product model, perhaps the date that the price was set (which was already being stored) then I would have to re-compile all of my source code including the parts that solely are issuing commands to the system. In CQRS only the items that query need to be re-compiled.
Security
Given the two models above I want to now create an on-line presence as well as having a number of tills within the shop. I want to host the site within the DMZ and maintain my actual data inside the store for one reason, I don't want hackers to be able to update the details of any of my products. The immediate solution is to clone the primary database inside the LAN and ensure it is kept up to date and...
- In the traditional model I cannot do this easily, everything is linked in a single interface and therefore I have to think about securing specific methods. The command methods can be secured so that they cannot write to the new database for security but I'm having to do more work to ensure that they do not update anything in the cloned database.
- In the CQRS model I can just create an instance of the query service within the DMZ and point it at the cloned database to the website. I can continue hosting the same service inside the LAN for my tills etc. Any hacker does not know the structure of my commands, the command processor is only within the LAN so the chances of them damaging either the Live database or the clone is minimised.
In my opinion it is better to only expose the Query Service to systems that only need to be able to query, you can secure down that interface/endpoint much more easily than you can in a traditional service-based approach.
Scalability
The website is now taking off, I am getting way more visitors than I was expecting but they are now hitting my servers very hard? What do I do?
I realise that I need to increase the number of services that are available to call and place a load balancer between them. Each of those services could have its own clone of the live product data. However, the queries are expensive as there are a lot of joins etc so I need to create a de-normalised reporting database rather than a clone to get the optimum performance.
- In the traditional model I think that I would have to change the structure of the service to always point commands at the LIVE database and queries at this new reporting database. I potentially have to re-deploy the services including everything that is needed to handle the command processing. This also adds new configuration that needs to be tested so that all commands are sent to one database and queries sent to the next.
- In the CQRS model I can purely change the code for the query service and deploy solely that service. I have removed the need for additional configuration. I am keeping the system simple and scalable.
That is scaling of the queries but I would also like to take a look at scaling of commands. I have a number of third parties who wants to sell through my site but I know that my single command processor service cannot support that many requests. What can I do?
I want to off-line the processing of the commands, all I need to do is to create a queue and process them off-line, potentially within the LAN not the DMZ to maintain a single point of truth for my product data:
- In the traditional model I will struggle with this, the very nature of the service suggests that I will immediately return a response with the exact details that I have saved. I can choose to ignore this at my peril or I can have a mechanism that puts the information onto a queue and only returns to the client when it can find that product in the database (which it achieves by polling). This doesn't sound too good does it?
- In the CQRS model I simply re-implement the command interface with a new version that writes to a queue, the fact that an error has not occurred tells the client that we have accepted the command. I can scale this behind a load balancer so I can expose a number of these services. Within the LAN I can provide a background worker process (or processes) that pull data from the queue and writes them to the database.
There may not be too much difference in how we scale the querying side (bar the configuration issue and the tight coupling between the command and the query) but the command side scales much more efficiently.
Summary
In summary, I am really excited about CQRS because it seems to answer a lot of the challenges that we are facing in the software world. It is applying some of the core principles of software development, the separation of concerns and that just by looking at the interface you know whether a method has an impact on the system or whether it is just querying the system. Just one word of warning, I have shown how CQRS
can be used to solve some of these problems but you only need to solve them when you encounter the problem.
It is not a silver bullet, it does not solve all architectural problems and there are a number of other techniques and patterns that we can apply to a system that I will delve into in later posts. However useful the other methods are none of the other techniques are necessary for CQRS and I would seriously advise that you focus on keeping it simple.
Tags: