SQL Server has evolved over the years to accommodate increasing volumes and varieties of data (including XML, JSON, spatial, and graph data). It has even added in-memory capabilities to process and analyze huge amounts of data faster. However, relational systems and big data stores have remained in separate silos, making it difficult for enterprises to join and analyze all available data. With SQL Server 2019, it's finally possible to have a single platform that embraces both unstructured big data and relational data and takes advantage of scale-out compute.
When first announced, SQL Server 2019 Big Data Clusters were met with a bit of fanfare – and for good reason. The scale-out data virtualization platform increases flexibility and reduces time to value from data. In this blog post, I show you how.
First, let's look under the hood. SQL Server Big Data Clusters consist of the SQL Server 2019 database engine, Spark, and Hadoop Distributed File System (HDFS) running on Kubernetes. These components run side-by-side in a unified platform, enabling you to read, write, and process big data from either T-SQL or Spark. As a result, you can easily combine and analyze high-value relational data with high-volume big data. And because Big Data Clusters run on Kubernetes, you get a predictable, fast, and elastically scalable deployment.
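To make the "combine relational data with big data from T-SQL" idea a little more concrete, here is a minimal sketch in Python that assembles the kind of T-SQL you might run against a Big Data Cluster: one statement that exposes files in HDFS as an external table, and one query that joins that table to an ordinary relational table. Everything named here is an assumption for illustration only: the table names (`web_clickstream_hdfs`, `dbo.customers`), the columns, the HDFS path `/clickstream_data`, and the `SqlStoragePool` data source and `csv_file` file format, which are assumed to have been created beforehand. The snippet only builds and prints SQL text; it doesn't connect to a server, so check the exact syntax against the SQL Server documentation before running the output.

```python
# Minimal sketch: build T-SQL that exposes HDFS files as an external table
# and joins them to a relational table. All object names are hypothetical,
# and the data source and file format are assumed to exist already.

def external_table_ddl(table, columns, hdfs_path,
                       data_source="SqlStoragePool", file_format="csv_file"):
    """Return a CREATE EXTERNAL TABLE statement for files under hdfs_path."""
    column_list = ",\n    ".join(f"{name} {sql_type}" for name, sql_type in columns)
    return (
        f"CREATE EXTERNAL TABLE {table} (\n"
        f"    {column_list}\n"
        f")\n"
        f"WITH (\n"
        f"    DATA_SOURCE = {data_source},\n"
        f"    LOCATION = '{hdfs_path}',\n"
        f"    FILE_FORMAT = {file_format}\n"
        f");"
    )


def join_query(relational_table, external_table, key):
    """Return a query that counts external (HDFS) rows per relational key."""
    return (
        f"SELECT r.{key}, COUNT(*) AS event_count\n"
        f"FROM {relational_table} AS r\n"
        f"JOIN {external_table} AS e ON e.{key} = r.{key}\n"
        f"GROUP BY r.{key};"
    )


# Hypothetical example: clickstream files in HDFS joined to a customers table.
ddl = external_table_ddl(
    "web_clickstream_hdfs",
    [("customer_id", "BIGINT"), ("clicked_at", "DATETIME2")],
    "/clickstream_data",
)
query = join_query("dbo.customers", "web_clickstream_hdfs", "customer_id")

print(ddl)
print()
print(query)
```

Once the external table exists, the join reads like any other T-SQL query, even though half of the data lives in HDFS. The string formatting here is only suitable for trusted, hard-coded identifiers; it is not a safe way to build SQL from user input.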
One of the great things about SQL Server Big Data Clusters is the number of ways they allow you to interact with your data. Here are a few examples:
SQL Server Big Data Clusters can be managed and monitored through a combination of command-line tools, APIs, and dynamic management views. In addition, Azure Data Studio can be used to perform a variety of tasks on Big Data Clusters. It provides a UX-guided deployment experience, along with monitoring and troubleshooting experiences for Big Data Clusters (including dashboards and a set of notebooks to help with troubleshooting, repair, and so on). Integrated security and high availability are built-in management experiences in Big Data Clusters.
The SQL Server 2019 extension provides:
SQL Server Big Data Clusters allow you to use SQL Server to bring high-value relational data and high-volume big data together on a unified, scalable data platform. The many use cases offer flexibility and reduce your time to value. To learn more about what you can do with SQL Server 2019, check out the free Packt guide, Introducing Microsoft SQL 2019. If you're ready to jump to a fully managed cloud solution, check out The Essential Guide to Data in the Cloud.