Architectural tenets

The architecture of Zato reflects several key foundational concepts underlying the design of the platform. Each component of the architecture takes each of the concepts into account.

These tenets drive the design of Zato and lead directly to the shape of its architecture.

Concept	Meaning
A broad usage spectrum	From IoT, through APIs, file transfer, enterprise backend systems and mainframe to AI & Machine Learning. Zato is meant to be used to build a wide range of integrated systems.
Productivity	Developer time is of utmost importance. The design of the platform makes it easy to quickly build both simple and complex integration environments. Python is the most productive tool for integrations and this is why Zato is written in Python.
Operational excellence	Ease of use and monitoring capabilities allow one to constantly improve processes, plans and procedures. Results of your work should be easily reproducible in different contexts, environments or projects.
Scalability	It should be easy to scale environments regardless of one’s preferred deployment approach, be it cloud-based, on premises, hybrid, bare metal, Docker or Kubernetes. Any combination can be used.
High availability	Integrations are the very core of any organisation or project and it is essential that the platform eliminate single points of failure, that it provide redundancy and that it offer convenient means to carry out upgrades or maintenance tasks.
Security	An integration platform needs to expect that it will be routinely attacked by nefarious actors. The choice of Python, a very high-level, secure language, and the platform’s resilience to attacks are integral parts of the design.
Simplicity over complexity	The correct way to build advanced, mission-critical systems is to make them as simple as possible, but no simpler. Individual components and parts should be easy to understand and master.

Runtime architecture

Component	Notes
Clusters	Each Zato environment contains one or more clusters. A cluster is a logical collection of servers and supporting components. Clusters can run on premises, in the cloud, directly under Linux, or using Docker, Kubernetes or Vagrant. It is perfectly fine to have separate clusters for particular purposes, e.g. one for REST and AMQP and another for AI and ML. A special kind of cluster is called a quickstart cluster. Such a cluster can be created within a few seconds, using a single command, to quickly have a working environment. Quickstart is a great way to get started with Zato because it requires no additional work: the result is a fully functional environment running with all the components on a single host. Quickstart clusters can be used for development, testing and production environments. They are real clusters in every sense of the term.
Servers	Servers are containers for services and applications. This is where one’s APIs and publish/subscribe topics run. In the spirit of cloud computing, to scale environments horizontally, more servers can be added to a cluster. To scale servers vertically, more CPUs can be assigned to each server. Servers run in an active-active setup but the load balancer can be configured not to direct requests to a particular one. Servers can be added, removed and reconfigured on the fly. Servers synchronise their configuration automatically, e.g. a new service deployed on one server is auto-synchronised with all the other servers in the cluster.
Load balancer	The high-availability load balancer can be the built-in one or an external one. Usage of a load balancer is optional. If it is not required, e.g. if there is one server in a cluster, or if running under Kubernetes, the load balancer does not need to be created.
Dashboard	A web-based GUI for the management of servers running in clusters.
CLI and API	Everything that can be done using the Dashboard is also available via CLI and API for DevOps automation.

Understanding Zato servers

Active-active

There are no limits as to how many servers there can be in a single cluster.
By default, all servers in a cluster are always active and the load balancer will direct traffic to all of them.
It is possible to take a server offline, e.g. to apply updates, and the load balancer will redirect the traffic to other servers.
As long as a server is running, it synchronises its state with the other members of the cluster, even if the load balancer treats that server as offline. For instance, code deployed to any server will be auto-distributed to all the other servers, including those that are offline from the load balancer’s perspective.
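
The active-active behaviour described in this section can be pictured with a small toy model. It is only a sketch and does not use any Zato API: the Server and Cluster classes, the in_rotation flag and the round-robin routing are all assumptions made for illustration. What it shows is that deployments reach every running server while the load balancer only routes to servers it considers online.

```python
from dataclasses import dataclass, field


@dataclass
class Server:
    """Hypothetical stand-in for a running server (not a Zato class)."""
    name: str
    in_rotation: bool = True  # the load balancer's view of this server
    services: set = field(default_factory=set)


class Cluster:
    """Toy cluster: deployments go everywhere, traffic only to servers in rotation."""

    def __init__(self, names):
        self.servers = [Server(name) for name in names]
        self._next = 0

    def deploy(self, service_name):
        # Every running server receives the code, whether or not
        # the load balancer currently sends traffic to it.
        for server in self.servers:
            server.services.add(service_name)

    def route(self):
        # Round-robin (an assumption) over the servers still in rotation.
        active = [s for s in self.servers if s.in_rotation]
        if not active:
            raise RuntimeError('No servers in rotation')
        server = active[self._next % len(active)]
        self._next += 1
        return server.name


cluster = Cluster(['server1', 'server2', 'server3'])
cluster.servers[1].in_rotation = False   # take server2 offline for maintenance
cluster.deploy('my.api.service')         # server2 still receives the code
routes = [cluster.route() for _ in range(4)]
```

Once server2 is put back in rotation, it can serve requests for my.api.service straight away because it never stopped synchronising.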

Containers for high-performance services

Servers are containers onto which API services are deployed.
There are no limits as to how many services there can be in a single server.
Each idle service consumes up to 1 MB of RAM. Thus, 1 GB of RAM can hold about 1,000 business API or AI services.
A service takes less than 1 ms to deploy. It takes less than 1 second to deploy 1,000 business API or AI services.
All servers in the same cluster are always mirror images of one another in terms of the code and services they execute.
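
The figures above lend themselves to quick back-of-the-envelope estimates. The helpers below are a hypothetical sketch, not part of Zato. They treat the stated numbers (up to 1 MB per idle service, less than 1 ms per deployment) as worst-case values and, like the text, count 1 GB as 1,000 MB.

```python
def min_idle_services(ram_mb, mb_per_service=1.0):
    """Number of idle services that fit in ram_mb if each uses the full 1 MB."""
    return int(ram_mb // mb_per_service)


def max_deploy_seconds(service_count, ms_per_service=1.0):
    """Upper bound on deployment time if each service takes the full 1 ms."""
    return service_count * ms_per_service / 1000


services_in_1gb = min_idle_services(1000)               # 1 GB counted as 1,000 MB
seconds_for_1000 = max_deploy_seconds(services_in_1gb)  # upper bound, in seconds
```

Since both inputs are upper bounds, real environments should do no worse than these estimates. Services that are actively processing requests will need more memory than idle ones.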

Scaling an environment - APIs & AI

The most important factor in deciding whether to add more servers or more clusters, each with its own servers, is the distinction between services that are network-bound and services that are CPU-bound.
Services are network-bound if they primarily wait for the network. For instance, picture a sample REST or AMQP service that may take 100 ms to complete. Of that, 98 ms are spent waiting for a remote endpoint or server to respond while only 2 ms are spent on the actual processing of the data received. This means that the service spends 98% of its time not actively processing anything because it is bound to the network. Hence the name, network-bound.
Services are CPU-bound if they are primarily blocked, waiting for CPUs to compute an expected result. For instance, imagine a service that requires 200 ms to obtain some data and then its AI algorithms require two minutes to complete. In this case, the service spends most of its time waiting for the CPU. Hence the name, CPU-bound. Another example may be parsing and processing of large files, e.g. multi-GB files may require CPU time to parse.
Because each Zato server in the same cluster executes the same set of services, network-bound services should not be mixed with CPU-bound ones. If they are mixed, that is, deployed to the same cluster, CPU-bound services may monopolise the CPUs, leaving no room for network-bound services. For instance, if many AI services require CPU time and a REST (TCP) request arrives, the CPUs may be completely busy with AI calculations, leaving very little or no processing time for network events.
It is perfectly fine and expected to have more than one cluster. Whether this is needed depends on the workload: a uniform workload, e.g. only network-bound or only CPU-bound services, can run in a single cluster, whereas for mixed workloads, containing services of both types, it is recommended to have more than one cluster.
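
One way to apply this distinction is to measure how much of a service's total time is spent on the CPU and group services accordingly. The sketch below is illustrative only: ServiceProfile, classify, plan_clusters and the 50% threshold are assumptions, not part of Zato. The two sample profiles reuse the numbers from the examples above.

```python
from dataclasses import dataclass


@dataclass
class ServiceProfile:
    """Hypothetical timing profile of a single service invocation."""
    name: str
    wait_ms: float  # time spent waiting for remote endpoints
    cpu_ms: float   # time spent actually computing

    @property
    def cpu_share(self):
        total = self.wait_ms + self.cpu_ms
        return self.cpu_ms / total if total else 0.0


def classify(profile, threshold=0.5):
    # The 50% cut-off is an arbitrary assumption; tune it to your workload.
    return 'cpu-bound' if profile.cpu_share >= threshold else 'network-bound'


def plan_clusters(profiles):
    """Group service names so that each type of workload gets its own cluster."""
    plan = {'network-bound': [], 'cpu-bound': []}
    for profile in profiles:
        plan[classify(profile)].append(profile.name)
    return plan


rest = ServiceProfile('rest.customer.get', wait_ms=98, cpu_ms=2)
ai = ServiceProfile('ai.model.score', wait_ms=200, cpu_ms=120_000)  # two minutes of CPU
plan = plan_clusters([rest, ai])
```

If both lists in the plan are non-empty, the workload is mixed and a second cluster is worth considering.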

Scaling a cluster

An individual cluster can be scaled by adding more servers with a smaller number of CPUs each or by adding more CPUs to each server.
Usually, it is more desirable to add more, smaller servers than to add more CPUs to each server. The reason is that, in the true spirit of cloud computing, there are no limits as to how many servers can be added whereas, broadly, the limit of CPUs per server is between 6 and 8, depending on the particular CPU make and model, and adding more CPUs above that limit does not significantly improve performance.
Servers with services using publish/subscribe are a special case in that they always require exactly 1 CPU per server. In this scenario, clusters are scaled by adding more servers, each with 1 CPU.
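
These sizing rules can be turned into a simple calculation. The function below is a hypothetical sketch rather than a Zato tool: plan_servers and its parameters are invented for illustration, and the default cap of 6 CPUs per server is simply the conservative end of the 6 to 8 range given above.

```python
import math


def plan_servers(total_cpus, max_cpus_per_server=6, uses_pubsub=False):
    """Return (server_count, cpus_per_server) for a required number of CPUs."""
    if total_cpus < 1:
        raise ValueError('total_cpus must be at least 1')

    # Publish/subscribe servers always get exactly 1 CPU each.
    cap = 1 if uses_pubsub else max_cpus_per_server

    server_count = math.ceil(total_cpus / cap)
    # Spread the CPUs evenly so that all servers are the same size.
    cpus_per_server = math.ceil(total_cpus / server_count)
    return server_count, cpus_per_server


api_plan = plan_servers(16)                       # regular API services
pubsub_plan = plan_servers(16, uses_pubsub=True)  # publish/subscribe services
small_plan = plan_servers(4)                      # fits on a single server
```

The function only ever adds servers once the per-server cap is reached. Lowering max_cpus_per_server yields more, smaller servers, which is the approach the text prefers.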

Next steps

Start the tutorial to learn more technical details about Zato, including its architecture, installation and usage. After completing it, you will have a multi-protocol service representing a sample scenario often seen in banking systems with several applications cooperating to provide a single and consistent API to its callers.
Visit the support page if you would like to discuss anything about Zato with its creators.
To learn more about Zato integrations and APIs in Spanish, click here.

from Planet Python

Daily Python

Friday, July 9, 2021

Zato Blog: Scalable API and AI architectures
