SoftNAS Virtual Storage Solutions

Introduction

 

Enterprise-Grade High-Availability with Low Complexity and Low Cost

 
SoftNAS™ SNAP HA™ High Availability delivers a low-cost, low-complexity solution for high-availability clustering that is easy to deploy and manage. A robust set of HA capabilities protects against data center, availability zone, server, network and storage subsystem failures to keep business running without downtime. SNAP HA™ for Amazon Web Services (AWS) includes patent-pending Elastic HA™ technology, which gives NAS clients in any availability zone uninterrupted HA access to the storage cluster across availability zones.
 
SNAP HA™ monitors all critical storage components to ensure they remain operational. When a system component suffers an unrecoverable failure, another storage controller detects the problem and automatically takes over, avoiding downtime and business impact.
 
SNAP HA™ works hand in hand with SoftNAS Cloud® data protection features, including RAID and automatic error detection and recovery. As a result, it reduces operational costs and boosts storage efficiency.
 
High Availability protects companies from the revenue they would otherwise lose when access to their data resources and critical business applications is disrupted. It does so with features that:
1) Protect against unplanned storage outages 24 x 7 x 365
2) Provide disaster recovery capabilities to quickly resume mission-critical operations in the event of a disaster (e.g., across availability zones or data centers)
3) Ensure failed components are quickly and automatically identified and isolated, so they do not cause data loss, application errors or downtime
4) Replicate data so there is an up-to-the-minute copy of any changes that have taken place
5) Prevent outdated or incorrect data from being made available due to multiple failures across the storage environment
6) Assure business owners that applications and IT infrastructure continue operating uninterrupted by unexpected failures in the storage environment
7) Enable IT administrators to take storage systems offline for maintenance and repair, without disrupting production IT systems or applications
 

Minimize Downtime from Host and Storage Failures

 
SoftNAS™ SNAP HA™ High Availability delivers the availability required by mission-critical applications running in virtual machines and cloud computing environments, independent of the operating system and applications running on them. HA provides uniform, cost-effective failover protection against hardware and operating system outages within virtualized IT and cloud computing environments. HA:
 
  • Monitors SoftNAS Cloud® storage servers to detect hardware and storage system failures
  • Automatically detects network and storage outages and re-routes NAS services to keep NFS and Windows servers and clients operational
  • Restarts SoftNAS Cloud® storage services on other hosts in the cluster without manual intervention when a storage outage is detected
  • Reduces application and IT infrastructure downtime by quickly switching NAS clients over to another storage server when an outage is detected
  • Maintains a fully-replicated copy of live production data for disaster recovery
  • Is quick and easy to install by any IT administrator, with just a few mouse clicks using the automatic setup wizard
 

Extend and Enhance Data Protection Across Enterprise Infrastructure

 
Most availability solutions are tied to specialized hardware or require complex setup and configuration. In contrast, an IT administrator configures SoftNAS SNAP HA™ with a few clicks from within the SoftNAS StorageCenter™ client interface. With simple configuration and minimal resource requirements, SNAP HA™ allows administrators to:
 
  • Provide uniform, automated data protection and availability for all applications without modifications to the application or guest operating system
  • Establish a consistent first line of data protection defense for an entire IT infrastructure
  • Protect data and applications that have no other failover options, which might otherwise be left unprotected and subject to extended outages and downtime
  • Provide storage HA across AWS availability zones in the Amazon Web Services cloud environment
  • Apply HA to SoftNAS Cloud® advanced NFS file servers, Windows CIFS file servers, and iSCSI SAN servers
 

Highly-Available NAS Services

 
SoftNAS SNAP HA™ provides NFS, CIFS and iSCSI services via redundant storage controllers. One controller is active, while another is a standby controller. Block replication transmits only the changed data blocks from the source (primary) controller node to the target (secondary) controller. Data is maintained in a consistent state on both controllers using the ZFS copy-on-write filesystem, which ensures data integrity is maintained. In effect, this provides a near real-time backup of all production data (kept current within 1 to 2 minutes).
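The idea of sending only changed blocks can be illustrated with a small sketch. This is not how SNAP HA™ or ZFS is implemented. It compares fixed-size blocks directly, whereas a copy-on-write filesystem can identify changes from its own metadata. The block size and the function names (`split_blocks`, `changed_blocks`, `apply_delta`) are assumptions made for illustration, and the sketch assumes a volume never shrinks.

```python
BLOCK_SIZE = 4  # deliberately tiny so the example is easy to read


def split_blocks(data: bytes, size: int = BLOCK_SIZE) -> list[bytes]:
    """Cut a byte string into fixed-size blocks (the last one may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def changed_blocks(previous: bytes, current: bytes) -> dict[int, bytes]:
    """Return only the blocks of `current` that differ from `previous`."""
    old = split_blocks(previous)
    new = split_blocks(current)
    delta = {}
    for index, block in enumerate(new):
        # A block is sent if it is new or its content changed.
        if index >= len(old) or old[index] != block:
            delta[index] = block
    return delta


def apply_delta(target: bytes, delta: dict[int, bytes]) -> bytes:
    """Apply a delta on the target side to bring it up to date."""
    blocks = split_blocks(target)
    for index in sorted(delta):
        if index < len(blocks):
            blocks[index] = delta[index]
        else:
            blocks.append(delta[index])
    return b"".join(blocks)


primary_before = b"aaaabbbbcccc"
primary_after = b"aaaaXXXXccccdddd"
delta = changed_blocks(primary_before, primary_after)
print(delta)  # {1: b'XXXX', 3: b'dddd'}
print(apply_delta(primary_before, delta) == primary_after)  # True
```

Only two of the four blocks cross the network in this example; the unchanged blocks are never re-sent.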
 

Storage Monitoring

 
A key component of SNAP HA™ is the HA Monitor, which runs on both nodes participating in SNAP HA™. On the secondary node, the HA Monitor checks network connectivity, as well as the primary controller's health and its ability to continue serving storage. Faults in network connectivity or storage services are detected within 10 seconds, and an automatic failover occurs, enabling the secondary controller to pick up and continue serving NAS storage requests without downtime.
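A monitoring loop of this kind can be sketched as follows. This is a minimal illustration, not the actual HA Monitor: the class name, the `probe` callback and the rule of three consecutive failed probes are all assumptions. With a probe every 3 seconds, for example, three consecutive failures would be reached in under 10 seconds, which is consistent with the detection time described above.

```python
from typing import Callable


class HAMonitor:
    """Hypothetical monitor: requests failover after N consecutive failed probes."""

    def __init__(self, probe: Callable[[], bool], max_failures: int = 3):
        self.probe = probe              # returns True when the primary is healthy
        self.max_failures = max_failures
        self.failures = 0

    def tick(self) -> bool:
        """Run one health check; return True if failover should be triggered."""
        if self.probe():
            self.failures = 0           # a healthy probe resets the counter
            return False
        self.failures += 1
        return self.failures >= self.max_failures


# Simulated probe results: a brief glitch, then a sustained outage.
results = iter([True, False, False, True, False, False, False])
monitor = HAMonitor(probe=lambda: next(results))
decisions = [monitor.tick() for _ in range(7)]
print(decisions)  # [False, False, False, False, False, False, True]
```

Requiring several consecutive failures is one common way to avoid failing over on a single dropped probe; the real product may use different criteria.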
 

Storage Failover

 
Once the failover process is triggered, either automatically by the HA Monitor or by a manual takeover initiated by the admin user, NAS client requests for NFS, CIFS and iSCSI storage are quickly re-routed over the network to the secondary controller, which takes over as the new primary storage controller. Takeover on VMware typically occurs within 20 seconds. On AWS, it can take up to 30 seconds, due to the time required for network routing configuration changes to take place.
 

Operation in AWS Virtual Private Cloud

 
In AWS, SNAP HA™ is applied to SoftNAS storage controllers running in a Virtual Private Cloud (VPC). It is recommended to place each controller in a separate AWS Availability Zone (AZ), which provides the highest degree of underlying hardware infrastructure redundancy and availability.
 
Virtual IP Setup
Each AZ operates on a separate subnet, e.g., 10.0.1.0/24 and 10.1.0.0/24 (choose how to organize the subnet addresses in the VPC based on expected requirements). SoftNAS SNAP HA™ can now take advantage of Virtual IPs. One virtual IP address is assigned to each VPC instance, set up within the same CIDR block. A third, standalone IP address is set up on a separate CIDR block to handle NAS client traffic.
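The requirement that the client-facing address sit on a separate CIDR block can be checked before configuration. The sketch below uses Python's standard `ipaddress` module with the example subnets from the text; the function name and the candidate addresses are hypothetical, and the check only illustrates the addressing rule, not any SoftNAS tooling.

```python
import ipaddress


def vip_is_outside_subnets(vip: str, subnet_cidrs: list[str]) -> bool:
    """Return True if the candidate virtual IP falls in none of the given subnets."""
    address = ipaddress.ip_address(vip)
    return all(address not in ipaddress.ip_network(cidr) for cidr in subnet_cidrs)


# Example subnets from the text; the candidate addresses are made up.
subnets = ["10.0.1.0/24", "10.1.0.0/24"]
print(vip_is_outside_subnets("10.99.0.10", subnets))  # True
print(vip_is_outside_subnets("10.0.1.50", subnets))   # False
```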
 
Virtual IPs are isolated from internet traffic completely, increasing the security of your HA VPC setup. For this reason, a Virtual IP driven private HA setup is our recommended best practice.
 
HA storage traffic uses a dedicated network interface (interface 1), which further isolates storage traffic.
 
Elastic IP setup
Traditionally, an elastic IP provided NAS clients across all AZs access to HA storage. Until recently, elastic IPs were the only IPs capable of re-routing network traffic across AZs. SoftNAS SNAP HA™ enhances the standard elastic IP provided by AWS, creating a patent-pending "Elastic HA" IP (EIP). Elastic HA IPs are managed by the HA controller, ensuring NAS client traffic is properly routed to the active primary storage controller at all times.
 
HA storage traffic uses a dedicated network interface (interface 1), which further isolates storage traffic.
 
There's a common misconception that elastic IPs are only useful for Internet-based access to EC2 instances. While that is the most common use case by far, Elastic HA IP addresses are typically configured with a Security Group that restricts access to the AZ private network. This prevents any possible Internet-based access to Elastic HA IPs.
 
VPCs can also be configured for use with VPNs, which enables secure access from an administrator's office location to the private network (no other inbound Internet access is typically available). It is possible to attach optional elastic IP addresses to interface 0 on each SoftNAS controller instance for remote administration (restricted IP range access recommended).
 
 
 

Operation in VMware Private Clouds

 
On VMware, it is common to dedicate a non-routable VLAN to storage traffic. The storage VLAN segregates primary storage traffic (e.g., VMDKs attached to VMs over NFS or iSCSI) from other traffic. Data replication traffic can also be placed on its own separate non-routed VLAN. SoftNAS StorageCenter™ is typically placed on a routable VLAN (the default network), where it can be readily accessed by admins from a web browser from anywhere within the organization (or via a VPN).
 
A Virtual IP (VIP) address is employed to route NAS client traffic to the primary storage controller. In the event of a failover or takeover, the VIP is reassigned to the other controller, which immediately re-routes NAS client traffic to the proper controller.
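The effect of VIP reassignment can be modeled in a few lines. This is a toy model only: the `VirtualIP` class, the controller names and the address are hypothetical, and a real takeover involves network-level changes that are not shown here.

```python
class VirtualIP:
    """Toy model of a VIP that always points at the current primary controller."""

    def __init__(self, address: str, primary: str, secondary: str):
        self.address = address
        self.owner = primary       # controller currently holding the VIP
        self.standby = secondary

    def fail_over(self) -> str:
        """Reassign the VIP to the other controller and return the new owner."""
        self.owner, self.standby = self.standby, self.owner
        return self.owner

    def route(self, request: str) -> str:
        """Clients only know the VIP; the owner decides where the request lands."""
        return f"{request} -> {self.owner}"


vip = VirtualIP("192.168.50.10", primary="controller-a", secondary="controller-b")
print(vip.route("NFS read"))   # NFS read -> controller-a
vip.fail_over()
print(vip.route("NFS read"))   # NFS read -> controller-b
```

Because clients address the VIP rather than a specific controller, they need no reconfiguration after a failover.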
 

High-integrity Data Protection

 
A number of measures are taken to ensure the highest possible data integrity of the HA storage system. An independent "witness" HA controller function ensures there is never a condition that can result in what is known as "split-brain", where a controller with outdated data is accidentally brought online. SNAP HA™ prevents split-brain using a number of industry-standard best practices, including the use of a third-party witness HA control function that tracks which node contains the latest data. On AWS, shared data stored in highly-redundant S3 storage is used. On VMware, a separate HA Controller VM is used.
 
Another HA feature is "fencing". In the event of a node failure or takeover, the downed controller is shut down and fenced off, preventing it from participating in the cluster until any potential issues can be analyzed and corrected, at which point the controller can be admitted back into the cluster.
 
Finally, data synchronization integrity checks prevent accidental failover or manual takeover by a controller whose data is out of date.
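The three safeguards above (witness, fencing and synchronization checks) can be sketched together. This is a simplified illustration under stated assumptions: the `Witness` class, the integer "generation" counter used to represent data freshness, and the node names are all hypothetical and do not describe the product's internal design.

```python
from dataclasses import dataclass, field


@dataclass
class Witness:
    """Hypothetical witness: tracks the newest data generation and fenced nodes."""

    latest_generation: int = 0
    fenced: set = field(default_factory=set)

    def record_write(self, generation: int) -> None:
        """Remember the newest generation any node has committed."""
        self.latest_generation = max(self.latest_generation, generation)

    def fence(self, node: str) -> None:
        """Keep a failed node out of the cluster until it is readmitted."""
        self.fenced.add(node)

    def readmit(self, node: str) -> None:
        self.fenced.discard(node)

    def may_take_over(self, node: str, node_generation: int) -> bool:
        """Allow takeover only for an unfenced node holding the latest data."""
        return node not in self.fenced and node_generation >= self.latest_generation


witness = Witness()
witness.record_write(42)                    # primary committed generation 42
print(witness.may_take_over("node-b", 41))  # False: node-b is behind
print(witness.may_take_over("node-b", 42))  # True: node-b is fully synchronized
witness.fence("node-a")                     # old primary is fenced after failure
print(witness.may_take_over("node-a", 42))  # False: fenced until readmitted
```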
 
The combination of high-integrity features built into SNAP HA™ ensures data is always protected and safe, even in the face of unexpected types of failures or user error.
 

Scales to Hundreds of Millions of Files

 
SNAP HA™ has been validated in real-world enterprise customer environments and is proven to handle hundreds of millions of files efficiently and effectively. The use of block replication instead of file replication supports hundreds of millions of files and directories.
 
 

Paravirtualization (PV) vs. Hardware-Assisted Virtual (HVM) Instances

 
Paravirtualization (PV) is an efficient and lightweight virtualization technique introduced by the Xen Project team and later adopted by other virtualization solutions. PV does not require virtualization extensions from the host CPU and thus enables virtualization on hardware architectures that do not support hardware-assisted virtualization. This has become less of an issue in recent years. With the increase in popularity of virtualization, chip manufacturers like Intel and AMD implemented hardware virtualization support beginning in 2006. Today's hardware platforms, such as Intel's Ivy Bridge used in EC2's R3, C3 and I2 instance types, have very complete technology support for HVM.
 
Unlike PV guests, HVM guests can take advantage of hardware extensions that provide fast access to the underlying hardware on the host system. HVM AMIs are required to take advantage of enhanced networking and GPU processing. In order to pass through instructions to specialized network and GPU devices, the OS needs to be able to have access to the native hardware platform; HVM virtualization provides this access.
 
Traditionally, paravirtual guests performed better for storage and network operations than hardware-assisted guests because they could leverage special drivers for I/O that avoided the overhead of emulating network and disk hardware. In contrast, HVM guests had to translate these instructions to emulated hardware. Recently, this has changed. These PV drivers are now available for HVM guests, so operating systems that could not be ported to run in a paravirtualized environment (such as Windows) can still see performance advantages in storage and network I/O by using them. With these PV-on-HVM drivers, HVM guests can get the same, or better, performance than paravirtual guests, even on workloads that traditionally performed better on PV.
 
In other words, HVM guests now have the best of both worlds, and automatically select the path that provides the best performance. Paravirtualization is slowly being phased out, with the best parts of it being integrated into HVM. PV is still a strong option for legacy hardware scenarios, but will be less and less useful as companies upgrade their hardware to newer chipsets with hardware virtualization support.
Copyright © 2016 SoftNAS, Inc.  All Rights Reserved