Warren Hardware Recommendation

1. Server requirements

Datacenter SaaS and on-premises Warren PoC deployments require a minimum of 9 servers (12 recommended) to offer virtualization-based functionality (no bare metal).
Some hardware requirements below are marked as "PoC only" and/or "no redundancy"; a setup built to PoC-only specifications cannot be upgraded to production without downtime.
It is therefore important to define the purpose of the PoC up front, as this may change the hardware requirements.

  • 3 Control nodes for Warren components. This set of nodes serves, on average, a cluster of 100 virtualization hosts.

  • 3 Virtualization hosts - hypervisor for VMs.

  • 2 Routers - site top routers (for a PoC a single router can also be used - no redundancy).

  • 2 Switches - top-of-rack switches (for a PoC a single switch can also be used - no redundancy).

    • 2 additional aggregation switches are needed for a multi-rack setup (they must also be added from the start if you plan to expand later without downtime).

  • 6 Storage nodes - servers for Ceph storage without Object Storage (for a PoC 3 nodes can also be used).

2. Server and networking device models and specifications

All servers must have PXE boot capability and an IPMI remote management interface.

Supported CPU manufacturers are Intel and AMD (Intel is preferred).

The RAM generation is not fixed (both DDR3 and DDR4 are supported for every node type).

  • Control domain servers 

    • CPU: 16 cores per node suffice. For larger clusters, please discuss sizing before the Warren deployment is started.

    • Memory: At least 96 GB.

    • Storage: At least 2 x 500 GB SSD or NVMe drives in RAID 1 as boot storage for each node.

    • NIC: At least 2 extension NICs, one of which must be 10GbE or 25GbE (dual-port recommended) for each node.

    • NIC: For management and hardware monitoring purposes, each server must include at least 1 additional 1GbE NIC.

  • Virtualization domain servers 

    • CPU: Although there are no strictly defined minimum system requirements for virtualization hosts, it is recommended to have 2 x 8-core Xeon E5 (or higher). Make sure that all hypervisors have the same CPU model to support live migrations.

    • Memory: 256 GB RAM (the recommended amount depends heavily on the total number of cores).

    • Storage: At least 2 x 500 GB SSD or NVMe in RAID 1 as boot storage for each node.

    • NIC: At least 2 extension NICs, one of which must be 10GbE or 25GbE (dual-port recommended) for each node.

    • NIC: For management and hardware monitoring purposes, each server must include at least 1 additional 1GbE NIC.

  • Networking domain - Juniper MX-series or Cisco ASR routers are required (for a PoC the virtual vMX is also acceptable). For compatibility with other vendors, make sure the routers have the following features available and configured:

    • Dynamic GRE tunnels (MPLS over UDP/GRE)

    • Multiprotocol Extensions for BGP (MBGP or MP-BGP)

    • L3VPN

    • Public IP pool (large enough to actually run services on the PoC)

3. Storage

Ceph distributed storage is used as user storage for the Virtualization domain.
The minimum requirement for Ceph block storage is 6 nodes: 3 monitor nodes and 3 OSD nodes.
For a PoC, 3 nodes can be used, with OSD and monitor installed on the same node, although this is not recommended. Node specifications:

  • CPU and memory: Xeon X56xx (single or dual CPU) or at least a 6-core Xeon E3, and 64 GB RAM

  • Storage: 2 x 250 GB SSD in RAID 1 (the boot volume is the only RAID in the Ceph servers)

  • Storage: 4 x 1 TB SSD, ideally expandable to at least 8 disks (OSD nodes only)

  • NIC: At least 2 extension NICs, one of which must be 10GbE or 25GbE (dual-port recommended) for each node.

  • NIC: For management and hardware monitoring purposes, each server must include at least 1 additional 1GbE NIC.

Official hardware recommendations for Ceph: https://docs.ceph.com/en/latest/start/hardware-recommendations/

In terms of Ceph performance, it is important to know that the larger the cluster (more OSD nodes and disks), the better the performance, since client I/O is spread across more OSDs in parallel.
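
For rough capacity planning, usable capacity can be derived from the layout above. The sketch below is illustrative only: it assumes the minimum production layout of 6 OSD nodes with 4 x 1 TB disks each and Ceph's default replicated pool size of 3; adjust the numbers to the actual deployment. It is written as a Nix expression (evaluable with nix-instantiate --eval --strict), since the servers run NixOS.

    let
      osdNodes     = 6;   # OSD servers in the minimum production layout above
      disksPerNode = 4;   # 4 x 1 TB SSD per OSD node
      diskTB       = 1;
      replicas     = 3;   # assumption: Ceph's default replicated pool size
      rawTB        = osdNodes * disksPerNode * diskTB;
    in {
      inherit rawTB;                 # 24 TB raw
      usableTB = rawTB / replicas;   # ~8 TB usable, before Ceph/filesystem overhead
    }

In practice, plan to keep the cluster well below full (by default Ceph raises a nearfull warning at roughly 85% utilization), so the effective usable capacity is lower still.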



4. Server inter-connectivity

  • Five VLANs are required:

    • management - This network is used to access physical nodes over VPN using SSH. It should be assigned to every physical node. NAT is enabled so nodes can reach the public internet. A regular private network.

    • storage - This network is used for platform and Ceph communications. It should be assigned to control nodes, hypervisors and Ceph monitors. Ceph usually refers to this as the "Ceph public" or "Ceph client" network. A regular private network.

    • ceph_private - This network is used for Ceph internal communications. It should be assigned only to Ceph nodes.

    • tungsten - This network is used for virtual machine networking. It should be assigned to all control nodes, all hypervisors and to the SDN gateway router (Juniper MX/vMX, Cisco ASR, etc.).

    • public - This network is used for public access to the platform (web UI and API). It should be assigned to control nodes only.

  • One on-board, network-boot-enabled NIC on each node should also be interconnected without VLAN tagging (for example on the native VLAN) to allow PXE booting between nodes.

 


5. Software

  • All servers should come with NixOS 20.09 installed.

  • All servers should have VLANs configured and named as stated in the Server inter-connectivity chapter (the only exception is the ceph_private VLAN, whose name can be chosen freely); see the sketch below.
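
As an illustration, the tagged VLANs from the Server inter-connectivity chapter might be declared on a NixOS 20.09 node roughly as follows. This is a minimal sketch, not a definitive configuration: the parent interface name (eno2), the VLAN IDs and the example address are placeholders to replace with site-specific values, and each node should only carry the networks its role requires (for example, only control nodes carry the public VLAN).

    { ... }:
    {
      # Tagged VLANs on the 10/25GbE extension NIC; names follow this document.
      # The VLAN IDs and the parent interface "eno2" are placeholders.
      networking.vlans = {
        management   = { id = 101; interface = "eno2"; };
        storage      = { id = 102; interface = "eno2"; };
        ceph_private = { id = 103; interface = "eno2"; };  # name may be chosen freely
        tungsten     = { id = 104; interface = "eno2"; };
        public       = { id = 105; interface = "eno2"; };
      };

      # Example address on the management VLAN; the other networks are addressed alike.
      networking.interfaces.management.ipv4.addresses = [
        { address = "10.0.101.11"; prefixLength = 24; }
      ];
    }

The on-board PXE NIC described in the Server inter-connectivity chapter stays untagged and is therefore not listed here.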