Many people are interested in learning more about how clouds are built. But it can be tricky to find a simple, layperson’s view on this. So we thought we’d attempt it ourselves. A word of warning – there’s no escaping from the fact that a technical explanation will involve technical terms, but we’ll do our best to demystify them.
The two most oft-used terms relating to cloud are “compute” and “storage”. Compute is just that – the processing power made available by the cloud platform for the customer’s use. Storage is also common sense, i.e. the raw capacity used by the customer to store and retrieve their data. In the simple model below, the final element is deployment – in other words, layering on the applications that the customer is seeking to use in their cloud.
The core architecture principles are the same for any cloud deployment, be it on-premises, multi-tenant public cloud, single-tenant private cloud, or any other configuration. The only differences relate to scale, ownership, payment model and deployment minutiae. We shall explore each of these layers, bottom-up.
The storage layer consists of an array of disks. These are normally either spinning disks (HDDs), like most PCs or servers in use today, or solid state drives (SSDs), which are increasingly being used in high-spec laptops, small form-factor music players and high-end storage technology (you may have heard of flash memory – to all intents and purposes, flash and SSD technology is the same thing).
The key point here is that the disks themselves are commodity items now. The intelligence exists in the software that creates “logical drives”, i.e. spreads the data across multiple drives for resilience. There are three terms that keep coming up in storage, the first of which is RAID. It stands for “redundant array of independent disks”, but originally meant “redundant array of inexpensive disks” until the industry decided that the market might expect RAID to be cheap (it’s not) if it had the word “inexpensive” in the acronym. There are different categories of RAID (like RAID 5, RAID 10) but the two key concepts to understand are mirroring and striping. Mirroring means the data is copied to more than one disk (for redundancy) and striping means the data is broken down into blocks and saved across more than one disk (to improve speed). There’s more to it than that, but those are the fundamental principles.
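For the more hands-on reader, the mirroring and striping ideas can be sketched in a few lines of Python. This is purely a toy illustration of the concepts – real RAID operates at the block-device level in controllers and drivers, and the function names here are our own invention:

```python
# Toy illustration of two RAID concepts. Mirroring (the RAID 1 idea) copies
# the same data to every disk; striping (the RAID 0 idea) splits data into
# blocks and spreads consecutive blocks across disks.

def mirror(data: bytes, num_disks: int) -> list[bytes]:
    """Mirroring: every disk holds a full copy of the data (redundancy)."""
    return [data for _ in range(num_disks)]

def stripe(data: bytes, num_disks: int, block_size: int = 4) -> list[list[bytes]]:
    """Striping: consecutive blocks land on different disks (speed),
    because reads and writes can then happen on several disks at once."""
    disks: list[list[bytes]] = [[] for _ in range(num_disks)]
    blocks = [data[i:i + block_size] for i in range(0, len(data), block_size)]
    for i, block in enumerate(blocks):
        disks[i % num_disks].append(block)
    return disks

# Striping 16 bytes across 2 disks: disk 0 gets blocks ABCD and IJKL,
# disk 1 gets EFGH and MNOP.
striped = stripe(b"ABCDEFGHIJKLMNOP", num_disks=2)
```

Real-world RAID levels combine these two ideas (RAID 10, for example, stripes across mirrored pairs) and add parity schemes we have not shown here.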
The second frequently used term is SAN. It stands for “Storage Area Network”, following the same naming convention as local area and wide area networks (LAN and WAN respectively). Basically, a SAN “pools” a significant number of disks into a logical network. When an organisation reaches a certain size, it usually builds SANs for its storage, separating the storage from the compute. Smaller businesses would have compute and storage on the same hardware, e.g. a pizza-box server or a tower unit. All cloud providers use some form of SAN, based on RAID technology, with the only differences being scale, management, backup services and the functionality available to buy.
The third frequently used term is NAS. And here’s where things get really unhelpful. For starters, it’s not SAN backwards. It stands for “Network Attached Storage”. In short, it’s storage that has been separated from the compute resource (e.g. a pizza-box server, as above) and connected to the network with its own network address. A NAS resource could, in theory, be as small as a single hard disk drive or as large as a full-scale SAN.
The final point to make on storage, especially once you start talking about RAID/SAN/NAS, is that it’s complicated. But the great thing about cloud is that it’s not your problem anymore. If you visit a cloud service provider’s datacentre, you will be able to see their storage setup and appreciate the scale involved. Storage itself is a major cloud computing driver – companies have realised that if they want a hugely resilient, reliable, geographically diverse, ultra-high-speed storage setup, a cloud solution will offer them functionality they couldn’t hope to afford themselves. And more to the point – they’ve made it somebody else’s problem. The value, to the company and to all users, is not in the magnetic media units used to store the data; it’s in the data itself. Its integrity, and protected access to it, should be the prime concern. Storage is still sourced by the byte, just as any customer – from an individual user to a corporate – sources it today. Generally it is sold in multiples of gigabytes or terabytes, but all the customer has to do is specify how much storage space they’d like and let the cloud provider deliver it.
The same key philosophy with the storage layer applies to compute: the hardware doesn’t matter, the intelligence is in the software. For storage, we pooled disks (storage resource). Now we pool servers (compute resource), with software managing how the compute is utilised. This happens through virtualisation, where an intelligent piece of software known as a hypervisor deploys virtual machines (aka virtual servers) across the pooled physical servers. Each VM does the same job as a physical server but no longer exists as a piece of hardware – it is “virtualised”, existing only in software, with high availability (VM redundancy/resilience) baked into the solution (with actual functionality varying by cloud provider).
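The pooling idea can be sketched as a toy scheduler that places virtual machines onto a pool of physical hosts. This is a deliberately naive first-fit placement of our own devising – real hypervisors and cloud schedulers also juggle memory, storage, live migration and failover – but it shows the essence of “many VMs sharing a pool of servers”:

```python
# Toy sketch of compute pooling: a scheduler assigns each requested VM
# to a physical host with enough spare capacity. Hosts and VMs are
# modelled simply as "name -> number of CPUs" for illustration.

def place_vms(hosts: dict[str, int], vm_requests: dict[str, int]) -> dict[str, str]:
    """First-fit placement: each VM goes on the first host with enough
    free CPUs. Returns a mapping of VM name -> host name."""
    free = dict(hosts)  # remaining spare CPUs per host
    placement: dict[str, str] = {}
    for vm, cpus in vm_requests.items():
        for host, spare in free.items():
            if spare >= cpus:
                placement[vm] = host
                free[host] -= cpus
                break
        else:
            raise RuntimeError(f"No host has {cpus} spare CPUs for {vm}")
    return placement

# Two 8-CPU hosts; three VMs needing 4, 6 and 2 CPUs respectively.
layout = place_vms({"host-1": 8, "host-2": 8}, {"web": 4, "db": 6, "cache": 2})
```

In this example “web” lands on host-1, “db” won’t fit there so goes to host-2, and “cache” slots back into host-1’s remaining capacity – exactly the kind of packing decision a hypervisor’s management layer makes continuously, invisibly to the customer.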
At the user/company level, a virtual machine is specified in much the same way as physical servers have always been. Namely, the customer specifies the number of CPUs (central processing units – the “brains” of the computer), the amount of RAM (random access memory – the “horsepower” of the computer) and the amount of storage (as above). In reality, the detail behind configuring virtual servers differs from physical ones, but that’s what your cloud provider is there to help you with. If you do not have this expertise yourself, you should make sure you are working with a provider that does.
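In code, that customer-facing specification amounts to little more than a handful of numbers. The field names below are purely illustrative – every cloud provider has its own API and terminology – but the shape of the request is broadly this:

```python
# Hypothetical sketch of a customer-facing VM specification: the three
# things the customer actually chooses. Field names are illustrative,
# not any particular provider's API.
from dataclasses import dataclass

@dataclass
class VMSpec:
    name: str
    cpus: int        # number of virtual CPUs
    ram_gb: int      # memory, in gigabytes
    storage_gb: int  # disk space, delivered from the storage layer

# A modest web server: 4 CPUs, 16 GB of RAM, 200 GB of storage.
spec = VMSpec(name="web-server", cpus=4, ram_gb=16, storage_gb=200)
```

Everything else – which physical host the VM lands on, how its storage is striped and mirrored underneath – is the provider’s problem, which is rather the point of cloud.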
The deploy piece is the bit that varies the most. In a “Platform as a Service” or “Software as a Service” scenario, deployment is the responsibility of the provider themselves. In an “Infrastructure as a Service” (IaaS) scenario, platform management is as much about people and processes as it is about technology – it is how the cloud provider manages the deployment of the customer’s application (or website, or any other service). This is often the area that the customer needs to evaluate most closely. Fundamentally, most compute and storage set-ups are the same: a bunch of disks and a bunch of servers that combine to form a bunch of virtual machines. It’s how those VMs are deployed for the customer that counts – high availability, geographical diversity, resilience and so on – and it is here that the variation between cloud providers is most keenly felt.
It is important to note that, as is a common theme on the UP site, there is no right or wrong answer. It’s the combination of who you are and what you are looking to achieve that will help you to decide which provider is right for you. Some are experts at meeting consumer needs; others focus on SMEs, mid-market customers or global corporates. Some providers offer self-service, others don’t. Every provider exists with at least one sector of the market in mind. If nothing else, ask them the question – who are your target customers? They should be able to explain if/why you are the right/wrong customer for them. Remember that one size does not fit all and that not all clouds are equal.
If you are a business customer, you may find our Cloudchoice tool helpful. Indeed, you may even discover that cloud is not the right hosting solution for you. That’s okay too. Always remember that YOU are the customer, and you should choose a model that’s right for you. If a potential hosting provider tries to impose one specific model on you, walk away – there are plenty of choices out there.