Why it’s time to move your data warehouse to the cloud

For many large organisations, Enterprise Data Warehouses (EDWs) are their lifeblood. EDWs can support a variety of workloads, including financial reporting, customer satisfaction analysis, manufacturing quality, shipping & logistics, as well as ad hoc workloads from individual business units. This ability to support so many departments means EDWs are the go-to tool for any organisation looking to utilise their data effectively.

Although many organisations still keep their EDWs on premises, there is a growing trend towards moving this data to the cloud, and with good reason. As operational data volumes continue to grow at exponential rates, service-level expectations rise, and the need to integrate structured warehouse data with unstructured data in a data lake becomes greater, it’s not a matter of if you go to the cloud to manage your enterprise data, but when.

In this article we highlight some of the benefits of moving your EDW to the cloud and how you can get the process started.

The benefits of a cloud-based data warehouse

Having a cloud-based data warehouse simplifies time-consuming and costly management, administration, and tuning activities that are typical of on-premises data warehouses. But beyond these obvious advantages, there’s more to be gained from moving to the cloud.

All of which leads us to a more detailed list of key advantages of a cloud-based data warehouse:

Scalability – The volume of data in a warehouse generally grows at a steady pace as time passes and information is collected. But infrequent events, like mergers and acquisitions, can cause sudden increases in data volume. Many organisations are also looking at new and different ways to manage and analyse ever-increasing volumes of data arriving in various formats from multiple sources. Your current on-premises EDW was probably not designed for this kind of workload or data. The inherent scalability of a cloud data warehouse allows you to adapt to incremental and sudden growth, as well as sudden contraction. Cloud-based resources allow the data warehouse to quickly expand and contract processing capacity as needed, with no impact on the stability, performance and security of the infrastructure.

Will you ever have less data to worry about in your business? If the answer is “no” (which it almost always will be), then you need to consider changing infrastructure platforms to accommodate increases in data quantities. You should select tools that were built for today’s data challenges rather than for legacy architectures.

Start-up costs are a fraction of those for on-premises appliance solutions – Companies can end up investing millions of dollars in a data warehouse appliance that rapidly becomes outdated. And that’s before the renewal costs needed to keep that technology going.

Instead of paying out huge sums to extend the life of the older technology, it makes more sense to move the data warehouse to the cloud. The cloud offers a utility-based model, allowing you to pay for what you use when you use it, as opposed to what you think you are going to need 2-3 years in the future. Microsoft’s Azure SQL Data Warehouse, for example, scales compute and storage independently, so customers pay only for the query performance they need, and if you pause the service, you pay only for storage. As a result, not only is the cost of entry lower, but you are not risking a huge sum of money to make the move.

Reduce ongoing costs – On-premises data warehouses are extremely expensive not only to build, but also to operate. After the initial cost of implementation you’ll need staff, servers, floor space, power and cooling, amongst other things. When your data warehouse lives in the cloud, the operating expense in each of these areas is substantially reduced or eliminated altogether.

Easily change user numbers – As well as scaling up (and down), cloud data warehouses are able to scale out, allowing your organisation to add more concurrent users as required. Adding more nodes to a new or existing cluster allows more users to access the same data without degrading query performance.

Allows for new capabilities – With the development of new analytic paradigms, like machine learning, you need a platform that allows you to work with both detailed and aggregated data at scales never imagined. Can your platform support a thousand users without concurrency issues? Imagine the difference it could make to your business if it could dynamically adjust to handle those new demands. A cloud-based platform gives you the ability to implement new breakthroughs in data technology.

No disruption to internal users – When moving to the cloud, you want to show incremental success and don’t want to add a lot of unnecessary risk. It’s simple to keep running your existing EDW in parallel with your new cloud DW, giving you a built-in fall-back plan for the early stages. You could also start with a small data mart as a pilot project to provide a bit of extra security.

Access to a virtual team of experts – When you buy a cloud solution, you get a virtual team of engineers, project managers and an SOC team as part of the bargain. These teams are experts in their field and so training can generally be provided without issue or downtime.

Increased security – Cloud service providers, such as Azure, Google Cloud and Snowflake, must meet the stringent security standards set by health, financial and government institutions. This makes it easier to demonstrate compliance with standards and regulations such as SOC 2, ISO 27001, HIPAA, GDPR and PCI. The major cloud platforms also include built-in authentication, authorization, logging and auditing.

Getting your cloud migration started

Like any project, migrating your EDW will require time spent on planning and research. Moving to the cloud offers cost savings, efficiency, scalability and security, but you’ll need to decide which solutions are best suited to your business.

Some important questions to ask are:

What matters to you most?

Each provider has a specific set of strengths and weaknesses. Some offer enormous scalability, while others offer a more personalized set of application management options.

Don’t just select the market leader. Do some research and speak to some of the companies that provide this service to help you find the best provider for your needs.

What are your capacity demands?

Knowing these will help you judge whether a provider can actually deliver the level of service you need. The last thing you want to do is spend time, effort and money migrating to one provider, then have to move to another because the first can’t match the scalability you require.

How much will it actually cost?

Calculating cloud computing costs is never as simple as looking at the figures on the pricing page and making a decision. Make sure you look in detail at exactly how your EDW will operate in the cloud environment. In particular, keep an eye out for hidden charges based on things like processing increases, bandwidth and even geographical location.

It can be very beneficial to model some real-life scenarios and calculate the actual costs for each cloud provider you are considering.
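
As a rough illustration of that kind of scenario modelling, here is a minimal Python sketch. The pricing structure (compute per hour, storage per TB per month, egress per TB) and every figure you feed into it are assumptions made for the example, not any provider's actual price list; real bills usually contain more line items than this.

```python
from dataclasses import dataclass


@dataclass
class PricingModel:
    """Hypothetical provider prices; replace with figures from real quotes."""
    name: str
    compute_per_hour: float      # price per compute-hour while running
    storage_per_tb_month: float  # price per TB stored per month
    egress_per_tb: float         # price per TB transferred out


@dataclass
class Scenario:
    """One month of expected usage for a given workload pattern."""
    compute_hours: float
    storage_tb: float
    egress_tb: float


def monthly_cost(pricing, scenario):
    """Sum the compute, storage and egress charges for one month."""
    return (
        pricing.compute_per_hour * scenario.compute_hours
        + pricing.storage_per_tb_month * scenario.storage_tb
        + pricing.egress_per_tb * scenario.egress_tb
    )


def compare(providers, scenarios):
    """Return {scenario name: {provider name: monthly cost}}."""
    return {
        scenario_name: {
            p.name: round(monthly_cost(p, scenario), 2) for p in providers
        }
        for scenario_name, scenario in scenarios.items()
    }
```

Calling compare() with a list of PricingModel objects and a dictionary of named Scenario objects (say, "steady month" and "post-acquisition spike") gives you a side-by-side view of what each provider might cost under each pattern, which is often more revealing than a single headline price.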

Three steps to cloud success

Now that you’ve picked a solution, you’ll need to plan how to implement it. We’ve broken the implementation down into three phases: Design, Migrate and Validate.

Design

The first thing you’ll need to do is develop your data integration plan. Consider how you will handle your existing solution and define the steps required for migration. You may need to re-architect your data and determine which software to keep and which to retire. There may also be some downtime, so you’ll need to plan how to handle that.

Migrate

We can break this down further into four different stages of migration. Firstly, you’ll need to migrate your schema, including table structures and specifications. This is also a good opportunity to make structural changes as part of the migration, including indexing or partitioning.

Next up is migrating your data. Moving very large volumes of data is processing-intensive, network-intensive and time-consuming. You’ll need to map out how long the migration will take and whether there’s any way you can accelerate the process.
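
For a first back-of-the-envelope estimate of transfer time, a small calculation like the one below can help. It is only a sketch: the function name and the default efficiency of 0.7 (the share of nominal bandwidth you actually get) are assumptions for illustration, and it ignores extract, load and retry time.

```python
def transfer_hours(data_tb, bandwidth_mbps, efficiency=0.7):
    """Estimate the hours needed to move data_tb terabytes over a network link.

    data_tb        -- data volume in decimal terabytes (1 TB = 10**12 bytes)
    bandwidth_mbps -- nominal link speed in megabits per second
    efficiency     -- assumed usable fraction of the nominal bandwidth
    """
    if bandwidth_mbps <= 0:
        raise ValueError("bandwidth_mbps must be positive")
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in the range (0, 1]")

    total_bits = data_tb * 1e12 * 8                        # terabytes to bits
    usable_bits_per_second = bandwidth_mbps * 1e6 * efficiency
    return total_bits / usable_bits_per_second / 3600      # seconds to hours
```

If the estimate for your volume and link runs to weeks rather than hours, that is a signal to look at ways of accelerating the move, such as compressing data, migrating incrementally, or shipping it on physical media.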

Once you’ve migrated the data, you’ll need to migrate your ETL. As part of this, you may need to change the code base to optimize for platform performance and change data transformations to sync with data restructuring. You’ll also need to determine if data flows should remain intact or be reorganized.

Finally, you’ll need to migrate the users and applications without interrupting business processes. Security and access authorizations may need to be created or changed, and BI and analytics tools should be connected.

Validate

Once you’ve completed the migration stage, it’s time to validate that your data was moved successfully. Run exercises to ensure that the new system performs under heavy load and at optimal speeds, and that user permissions and access are correctly implemented. If you do find any errors or anomalies, you should be able to trace them back to one of the stages of migration and see where things went wrong.
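
One simple data check is to compare row counts and a content hash for each table in the old and new systems. The sketch below assumes both sides are reachable through Python DB-API style connections; it uses the standard library's sqlite3 module purely as a stand-in, and the function names are hypothetical. Hashing the Python representation of each row only works if both drivers return the same types, so treat it as a starting point rather than a finished tool.

```python
import hashlib
import sqlite3  # stand-in driver; swap in the drivers for your own systems


def table_fingerprint(conn, table, key_column):
    """Return (row_count, sha256 hex digest) for a table, read in key order."""
    # Table and column names are interpolated into the SQL, so they must
    # come from a trusted list of tables, never from user input.
    cursor = conn.execute(f"SELECT * FROM {table} ORDER BY {key_column}")
    digest = hashlib.sha256()
    row_count = 0
    for row in cursor:
        digest.update(repr(row).encode("utf-8"))
        row_count += 1
    return row_count, digest.hexdigest()


def validate_table(source, target, table, key_column):
    """Compare one table between the source and target connections."""
    src_count, src_hash = table_fingerprint(source, table, key_column)
    tgt_count, tgt_hash = table_fingerprint(target, table, key_column)
    return {
        "table": table,
        "row_count_match": src_count == tgt_count,
        "content_match": src_hash == tgt_hash,
    }
```

Running validate_table() over every migrated table gives you a quick pass/fail list, and any table that fails points you back to the schema, data or ETL stage for a closer look.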

Slow and steady wins the race

A typical EDW contains a large amount of data describing a number and variety of business subject areas. Migrating an entire data warehouse in a single go is usually unrealistic. Migrating your EDW incrementally is the smart approach when an all-in-one migration isn’t practical or possible and is a must when undertaking significant design changes as part of the effort.

Having a hybrid strategy is another viable option. With a hybrid approach, your on-premises data warehouse can remain operating as the cloud data warehouse comes online. During this transition phase, you’ll need to synchronize the data between the old on-premises data warehouse and the new one that’s in the cloud.
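
One common way to keep the two systems in step is an incremental, "high-water mark" sync: on each run, copy only the rows that changed since the last run. The sketch below is illustrative only. The orders table, its columns and the function name are hypothetical, sqlite3 stands in for both warehouses, and the approach assumes every row carries a reliable updated_at value and that id is the primary key on the target. It does not pick up deleted rows, and rows sharing exactly the same timestamp as the watermark could be missed.

```python
import sqlite3  # stand-in for the on-premises and cloud warehouse drivers


def sync_changes(source, target, last_synced):
    """Copy rows changed after last_synced to the target; return the new watermark."""
    rows = source.execute(
        "SELECT id, amount, updated_at FROM orders "
        "WHERE updated_at > ? ORDER BY updated_at",
        (last_synced,),
    ).fetchall()

    # Insert new rows and overwrite existing ones that share the same id.
    target.executemany(
        "INSERT OR REPLACE INTO orders (id, amount, updated_at) VALUES (?, ?, ?)",
        rows,
    )
    target.commit()

    # The latest timestamp copied becomes the starting point for the next run.
    return rows[-1][2] if rows else last_synced
```

You would store the returned watermark between runs and schedule the sync as often as your users need the cloud copy to be current.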

Thankfully, there are a number of services that can help you to migrate your existing data warehouse to the cloud, whether that’s through an Azure SaaS offering or in your own Azure environment. So do your research before settling on the right company to help take your data warehouse forward.