All the information Warren needs from the DC side can and should be gathered as a set of simple, unambiguous questions. These questions, in turn, can be divided roughly into three subsets, based on the goal they are meant to achieve. Each subset can be expressed as a more general "umbrella question":
What are the current infrastructure and software components in use, and what part do they play in the plans for the future?
What services is the DC offering, and what expectations do they have of Warren?
What is the DC's level of commitment to cooperation with Warren, and which 3rd-party software systems will run alongside Warren?
If these topics are clarified on both sides, ambiguity and misunderstanding of the decisive factors that form the backbone of successful cooperation should be minimized.
These questions are vital for gathering the information that affects the following topics in Warren development:
To enhance the analysis result and make it directly usable as an input to the development process, let's partition the hypothetical DC stack (hardware, firmware, software) into functional domains that share common properties with the corresponding Warren components. Two domains of the DC functional stack that predate Warren adoption are more influential than the others, both for future development and for the adoption process: Network and Storage. They are also tightly coupled, as decisions in one domain heavily depend on the properties of the other. The connection between these two domains is best expressed in the decision-making process as two fundamental trade-offs:
The biggest trade-off lies in multi-site computing, where distributed cache and storage are good and evil at the same time.
This can mostly be described as:
The general tendency is towards the concept of a "software-defined DC", largely because of the automation and management benefits it offers. The exception to this tendency is the popularity of bare-metal provisioning, which can be explained by the still-existing demand for direct control over hardware (required by some types of applications), independence from general software-system failures, and speed.
There are several factors in a DC's network setup that dictate what we need to think through in the Warren application development process. Such factors include:
This aspect defines network traffic between components, servers, and racks, as well as between the DC and the internet. It sets the DC's physical extendability properties; thus, we need to consider:
How the automated discovery process will be handled
What deployment schema to use when implementing new nodes
Which components are involved in such processes
How failures in such processes will be handled.
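The discovery-and-deployment flow above can be sketched as a small state machine. This is a minimal illustration only; the states, the MAC-based identification, and the deployment callback are assumptions for the sketch, not actual Warren interfaces:

```python
from enum import Enum, auto

class NodeState(Enum):
    DISCOVERED = auto()  # node announced itself (e.g. via PXE/DHCP)
    DEPLOYING = auto()   # deployment schema is being applied
    ACTIVE = auto()      # node joined the pool successfully
    FAILED = auto()      # failure case: quarantined for operator review

def provision_node(mac: str, deploy) -> list:
    """Drive a newly discovered node through deployment and return the
    state history, so failure cases stay visible instead of being
    silently retried."""
    history = [NodeState.DISCOVERED, NodeState.DEPLOYING]
    history.append(NodeState.ACTIVE if deploy(mac) else NodeState.FAILED)
    return history

# Usage with stubbed deployment callbacks:
print(provision_node("aa:bb:cc:dd:ee:ff", lambda m: True)[-1])   # NodeState.ACTIVE
print(provision_node("aa:bb:cc:dd:ee:00", lambda m: False)[-1])  # NodeState.FAILED
```

The point of returning the full history rather than a single status is that a non-positive result keeps its context, which is what the handling question above is really about.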
Obviously, we cannot fine-tune our setup for every topology type: topology is not a standalone factor, so the set of variables in such an analysis is large and the analysis too costly compared to the business value of the outcome. But we can target a solution that covers the topologies most used in DCs with a sufficient degree of quality. Metrics of service reliability and availability standards cannot be purely theoretically calculated for a platform under heavy development; they will instead be deduced from the DC adoption process. The current assumption is that the most widely used topologies in the probable target DC group are fat-tree and various forms of Clos. Based on that, most optimizations are made for these two topology types.
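For reference, the capacity of a standard k-ary fat-tree (a special case of a folded Clos) follows directly from the switch port count. A small sketch of the well-known sizing rules, useful when reasoning about a target DC's physical extendability:

```python
def fat_tree_capacity(k: int) -> dict:
    """Sizing of a standard k-ary fat-tree built from k-port switches.

    Each of the k pods holds k/2 edge and k/2 aggregation switches;
    (k/2)^2 core switches connect the pods; every edge switch serves
    k/2 hosts, giving k**3 / 4 hosts with full bisection bandwidth.
    """
    assert k % 2 == 0, "port count must be even"
    half = k // 2
    return {
        "pods": k,
        "core_switches": half * half,
        "edge_switches_per_pod": half,
        "aggregation_switches_per_pod": half,
        "hosts": k ** 3 // 4,
    }

print(fat_tree_capacity(4))   # smallest example: 4 pods, 4 core, 16 hosts
print(fat_tree_capacity(48))  # commodity 48-port switches: 27648 hosts
```

The jump from 16 to 27648 hosts between the two calls is exactly why the topology assumption matters: optimizations tuned for one scale point do not automatically carry over.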
Although both this and the next point seem trivial compared to real problem magnets like network topology, adopting an SDN solution, or, better yet, consolidating different SDN solutions, this has become a major issue in public clouds (and presumably also in private ones, where such issues usually do not materialize as a series of scientific papers). Like almost all network-related considerations (except perhaps SDN), this one is quantity-dependent by nature.
The bigger the data flow between hardware devices, the bigger a problem it tends to be. This traffic (and also in-DC traffic between silos, if a larger DC is under consideration) is what measures the efficiency of the service system (Warren). It is a two-fold problem: first, the traffic generated by the clients; second, the traffic generated by Warren as a management system. The goal of Warren is to reallocate resources to minimize in-DC traffic, and in rare cases it can, by doing so, destabilize the network flow for a short period of time. Management flow must always take precedence when client flow is causing problems, even if it decreases client throughput further, because its purpose is to restore the previous state, or at least to maximize efficiency with the currently limited amount of available resources.
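The precedence rule above can be sketched as a strict priority queue, where pending management flows always drain before client flows. The flow names and the two-level priority scheme are illustrative assumptions, not the actual Warren scheduler:

```python
import heapq

MANAGEMENT, CLIENT = 0, 1  # lower number = higher priority

def drain(flows):
    """Pop flows in strict priority order: management traffic always
    precedes client traffic, matching the rule that restorative
    management actions preempt client throughput under congestion.

    `flows` is a list of (priority, name) pairs; the enumeration index
    breaks ties so equal-priority flows keep their submission order.
    """
    heap = [(prio, i, name) for i, (prio, name) in enumerate(flows)]
    heapq.heapify(heap)
    order = []
    while heap:
        _, _, name = heapq.heappop(heap)
        order.append(name)
    return order

flows = [(CLIENT, "vm-42 replication"), (MANAGEMENT, "rebalance"), (CLIENT, "backup")]
print(drain(flows))  # ['rebalance', 'vm-42 replication', 'backup']
```

Note that strict priority is a deliberate simplification: it encodes "management first, always", which is acceptable here precisely because management flows are short-lived and restorative.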
In general, all SDN systems are based on the same principles and are, in major part, derived from two prevalent frameworks for SDN generation. There are several types of protocols when it comes to network device configuration, among which OpenFlow is still the dominant one. Almost all needed routing protocols are also supported by all major SDN solutions.
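To illustrate the shared principle, here is a toy first-match lookup over an OpenFlow-style flow table. This is a heavy simplification for intuition only: real OpenFlow matches many more header fields, has explicit rule priorities, and richer actions:

```python
def match(flow_table, packet):
    """First-match lookup over a simplified OpenFlow-style flow table.

    `flow_table` is a list of (match_fields, action) pairs; '*'
    wildcards a field. Unmatched packets hit the table-miss default.
    """
    for fields, action in flow_table:
        if all(v == "*" or packet.get(k) == v for k, v in fields.items()):
            return action
    return "drop"  # table-miss default

table = [
    ({"dst_ip": "10.0.0.5", "proto": "*"}, "output:2"),   # pin one host to port 2
    ({"dst_ip": "*", "proto": "tcp"}, "output:1"),        # all other TCP to port 1
]
print(match(table, {"dst_ip": "10.0.0.5", "proto": "udp"}))   # output:2
print(match(table, {"dst_ip": "10.0.0.9", "proto": "icmp"}))  # drop
```

Because all major SDN solutions reduce to some variant of this match/action model, interoperating them at the forwarding level is tractable; the hard part, as noted below, is the security domain.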
To conclude the above, no drastic problems should arise on the connection basis (which doesn't mean it's a trivial task!). However, there is an exception to that hypothetical balance: the security domain. All SDN systems implement one (or more) security domains, whether at the client level or system-wide. Configuring two or more SDN systems to cooperate simultaneously in that domain might be more time-consuming than configuring the whole system to adopt a new one.
The Warren storage domain consists of three options:
Distributed storage
Shared storage
Local storage
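A toy decision rule over the three options can make the factor analysis concrete. The factor names below (live migration, replica count, latency sensitivity) are illustrative assumptions about what matters, not an exhaustive or official checklist:

```python
def pick_storage(needs_live_migration: bool, replicas: int,
                 latency_sensitive: bool) -> str:
    """Illustrative decision rule over the three Warren storage options.

    Live migration needs volumes reachable from many hosts; replicated
    data tolerates node loss; latency-critical workloads favour
    node-local disks.
    """
    if needs_live_migration and replicas > 1:
        return "distributed"  # survives node loss, reachable everywhere
    if needs_live_migration:
        return "shared"       # e.g. a SAN/NFS volume, single copy
    if latency_sensitive:
        return "local"        # fastest path, but tied to one host
    return "distributed"      # safe default when nothing forces a choice

print(pick_storage(True, 3, False))   # distributed
print(pick_storage(False, 1, True))   # local
```

In practice the decision is weighted rather than rule-based, but even this sketch shows why the factors must be gathered from the DC before the storage type is fixed.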
To determine the right solution, one must consider the several factors required to implement a particular storage type. As storage holds the most valuable asset, client data, its impact on reliability and QoS is decisive. After all, a network outage only affects the availability of data, whereas storage problems may lead to permanent data loss.
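The availability-versus-durability distinction can be quantified with a back-of-the-envelope replication model. The failure probability and independence assumption below are illustrative simplifications (correlated failures within a rack break the independence assumption, which is one reason placement matters):

```python
def loss_probability(p_disk_fail: float, replicas: int) -> float:
    """Chance of losing an object entirely: all replicas must fail
    within one rebuild window, assumed independent for this sketch."""
    return p_disk_fail ** replicas

print(loss_probability(0.01, 1))  # 0.01 -- local storage, no copies
print(loss_probability(0.01, 3))  # ~1e-06 -- 3-way replication
```

A network outage leaves all three of those replicas intact and merely unreachable, which is precisely why the text treats storage failures as the more severe class.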
Based on network and storage requirements, several issues can be predicted due to poorly planned placement of Warren control plane components, such as:
On the other hand, keeping such a level of separation between nodes certainly increases in-DC traffic between racks. So there are no absolute rules in component placement; rather, it depends on the already existing setup, the nature of the provided services, and the median/peak traffic levels in racks.
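The trade-off between fault isolation and cross-rack traffic can be expressed as a simple cost model. The penalty factor and traffic figures are made-up illustrative numbers, not measurements:

```python
def placement_cost(cross_rack_pairs: int, same_rack_pairs: int,
                   traffic_per_pair_gb: float,
                   cross_rack_penalty: float = 5.0) -> float:
    """Relative traffic cost of a control-plane placement.

    Component pairs split across racks pay a penalty factor (an assumed
    5x here) for using inter-rack links; spreading components buys
    fault isolation at exactly that traffic price.
    """
    return traffic_per_pair_gb * (same_rack_pairs
                                  + cross_rack_penalty * cross_rack_pairs)

# Spread placement (3 cross-rack pairs) vs packed (3 same-rack pairs):
print(placement_cost(3, 0, 1.0))  # 15.0 -- isolated, but 5x the traffic
print(placement_cost(0, 3, 1.0))  # 3.0  -- cheap traffic, shared failure domain
```

Neither extreme wins in general, which restates the conclusion above: placement must be tuned per DC from its measured median/peak rack traffic, not fixed by rule.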