Cloud Migration Strategy — Networking (Part 1)
Cloud migration can be a nerve-wracking experience for organizations moving their on-premises resources to the cloud. In this article I will cover some of the most important networking considerations to keep in mind before you develop your own cloud migration strategy (specifically targeted at AWS and GCP). This is a two-part series: the first part focuses on cloud connectivity, and the second on networking in the cloud.
Cloud Migration Background
Organizations today have multiple use cases for choosing the cloud over their own infrastructure. Some examples are:
- Utilizing cloud compute and storage services for big data processing jobs, which can be quite expensive to run on-premises.
- Backing up or archiving data to the cloud for cheap, durable, and highly available storage.
- Building a disaster recovery solution in the cloud to act as an active-active or active-passive replication system.
- Migrating the entire infrastructure to the cloud for the long haul, or employing a hybrid approach.
Whatever your cloud migration use case may be, one thing is clear: networking is the most important capability for making the migration happen seamlessly and, most importantly, securely.
Connectivity and Data Migration to the Cloud
One of the first and most critical pieces of any cloud migration is connectivity between the on-premises data center and the cloud environment. Efficient and reliable connectivity solves the data migration challenges that are central to any cloud migration project. Take the use case where an organization wants to leverage the cloud's compute capability to run big data solutions. In this scenario, possibly petabytes of data need to be transferred from the on-premises network to cloud services like AWS S3 or a Google Cloud Storage bucket. A good connectivity strategy helps ensure that this data is transferred quickly and reliably. Let's first review some of the options that cloud service providers (CSPs) offer today to fulfill such connectivity needs:
- CSP-managed VPN: Both AWS and GCP allow the creation of a managed VPN connection over the public Internet using the IPsec protocol suite. A virtual private gateway is created on the cloud provider's side and is then connected to an on-premises router using the authentication and encryption configurations of IPsec. The cloud provider takes care of redundancy and high availability on its side by automatically replicating VPN endpoints across two different data centers. The virtual private gateway also supports dynamic routing via BGP, so the router automatically learns new routes and does not need to be reconfigured when the network topology changes. AWS also offers VPN CloudHub to create a hub-and-spoke model by connecting multiple on-premises data centers through a single cloud gateway. A minimal provisioning sketch appears after this list.
- Customer-managed VPN: Customers can also deploy their own VPN solution on virtual machines in the cloud to create an IPsec VPN tunnel to their on-premises network. It goes without saying that customers are then responsible for building redundancy into their design by deploying the VPN endpoints across multiple availability zones.
- Private Network Connection: This is the fastest (lowest-latency) and most reliable option for connecting to the CSP network, using a dedicated fiber connection to the CSP's endpoint. Speeds range from 50 Mbps up to 10 Gbps per connection. The AWS service for dedicated network connections is called AWS Direct Connect; in the GCP world, it is Cloud Interconnect. A short ordering sketch follows this list.
- Data transfer via connection with Cloud Storage endpoints: For customers looking to transfer data to cloud storage services like AWS S3 or a GCS bucket, various tools make this a smooth process. Both S3 and GCS offer a GUI and CLI commands to upload files directly to a bucket in your desired cloud region, using SSL to encrypt data in transit. On the AWS side, these include the S3 CLI and Glacier CLI (often combined with tools such as rsync to stage data); in the GCP world, there are gsutil and the Storage Transfer Service. While this works well over short distances, it is not a good solution for transferring data across long distances. Amazon offers S3 Transfer Acceleration, which speeds up long-distance transfers by routing them over an optimized network path through AWS edge locations; an upload sketch follows this list. Amazon also provides Storage Gateway, which establishes a permanent and seamless bridge between your on-premises applications and AWS S3, EBS, and Glacier. This is most suitable for organizations working toward a hybrid cloud storage model.
- Data transfer using physical transport (offline): For customers wanting a fast and secure way to transfer petabyte-scale data without creating a connection to the cloud provider, there are options to ship the data physically to the cloud provider's edge location. Even with the high-speed online connections described above, it can take days or even weeks to transfer the huge volumes of data that are common with big data solutions today. For example, a 1 Gbps connection takes around 12 days to transfer 100 TB of data (see the quick calculation after this list). Using tamper-resistant, encrypted physical devices to securely transport the data can significantly shrink the data migration time frame. The AWS service for physical data migration is called AWS Snowball; in the GCP world, it is Transfer Appliance.
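To make the CSP-managed VPN option concrete, here is a minimal sketch using boto3 that provisions an AWS site-to-site VPN with BGP dynamic routing. The region, ASN, public IP, and VPC ID are placeholder assumptions; substitute your own values, and note your on-premises router still needs matching IPsec/BGP configuration.

```python
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

# Virtual private gateway: the VPN anchor on the AWS side.
vgw = ec2.create_vpn_gateway(Type="ipsec.1")["VpnGateway"]
ec2.attach_vpn_gateway(
    VpcId="vpc-0123456789abcdef0",  # placeholder VPC ID
    VpnGatewayId=vgw["VpnGatewayId"],
)

# Customer gateway: represents the on-premises router.
# BgpAsn and PublicIp are placeholders for your router's ASN and address.
cgw = ec2.create_customer_gateway(
    BgpAsn=65000,
    PublicIp="203.0.113.10",
    Type="ipsec.1",
)["CustomerGateway"]

# Site-to-site VPN connection; StaticRoutesOnly=False enables BGP,
# so routes are learned dynamically instead of being hand-configured.
vpn = ec2.create_vpn_connection(
    Type="ipsec.1",
    CustomerGatewayId=cgw["CustomerGatewayId"],
    VpnGatewayId=vgw["VpnGatewayId"],
    Options={"StaticRoutesOnly": False},
)["VpnConnection"]

print("VPN connection created:", vpn["VpnConnectionId"])
```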
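Ordering an AWS Direct Connect port is also scriptable. A sketch, assuming a hypothetical location code and connection name; the physical cross-connect still has to be completed at the colocation facility before the link carries traffic:

```python
import boto3

dx = boto3.client("directconnect", region_name="us-east-1")

# Request a dedicated 1 Gbps port. "EqDC2" is a placeholder location
# code; list the real ones with dx.describe_locations().
connection = dx.create_connection(
    location="EqDC2",
    bandwidth="1Gbps",
    connectionName="onprem-to-aws",
)
print(connection["connectionId"], connection["connectionState"])
```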
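For transfers to object storage, the sketch below enables S3 Transfer Acceleration on a bucket and then uploads through the accelerated, edge-optimized endpoint. The bucket, file, and key names are placeholders:

```python
import boto3
from botocore.config import Config

# One-time setup: enable Transfer Acceleration on the bucket.
boto3.client("s3").put_bucket_accelerate_configuration(
    Bucket="my-migration-bucket",
    AccelerateConfiguration={"Status": "Enabled"},
)

# Upload through the accelerated endpoint (routed via AWS edge locations).
s3_accel = boto3.client(
    "s3", config=Config(s3={"use_accelerate_endpoint": True})
)
s3_accel.upload_file(
    "bigdata-export.parquet",         # local file (placeholder)
    "my-migration-bucket",
    "ingest/bigdata-export.parquet",  # destination object key
)
```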
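Finally, to sanity-check whether an offline transfer is worthwhile, here is a quick back-of-the-envelope helper in plain Python. It uses decimal units and assumes the link sustains a given fraction of its nominal rate:

```python
def transfer_days(terabytes: float, gbps: float, utilization: float = 0.8) -> float:
    """Days needed to move `terabytes` of data over a `gbps` link.

    Uses decimal units (1 TB = 8e12 bits) and assumes the link
    sustains `utilization` of its nominal rate on average.
    """
    bits = terabytes * 8e12
    seconds = bits / (gbps * 1e9 * utilization)
    return seconds / 86400

# 100 TB over a 1 Gbps link at 80% utilization: ~11.6 days,
# which matches the "around 12 days" figure above.
print(f"{transfer_days(100, 1):.1f} days")
```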
In the next part of this series, we will look at what to keep in mind once connectivity to the cloud is established and we are ready for the next level - networking in the cloud. This is the fun world where vital networking functions are created with a few clicks in the console!