Big data requires a way to run big queries, which brings us to the topic of today’s episode: Google BigQuery. Today’s data is complex, and handling it requires a massive investment in system architecture and hardware. It doesn’t end there, though. You need strategies for scalability, and then the system needs to be managed and maintained. The result: a system where queries can still take minutes to hours to run. But what’s more important, developing the infrastructure or finding insights in your data? Well, this is where Google BigQuery comes in.
It’s a fully managed, massive-scale, low-cost enterprise data warehouse running on top of Google’s proven compute, storage, and networking infrastructure. By replacing the typical hardware setup of a traditional data warehouse with the BigQuery service, there’s no infrastructure to manage and no database administrator required. BigQuery can serve as a collective home for all the analytical data in your organization. BigQuery is fantastic for running ad hoc and aggregation queries across extremely large data sets, and it’s really fast.
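To give a flavor of the kind of ad hoc aggregation being described, here is a sketch of a standard SQL query against one of Google’s public sample tables; the table and column names come from the `bigquery-public-data.samples.shakespeare` sample, so adjust them for your own data:

```sql
-- Ad hoc aggregation over a public sample table.
-- The shakespeare sample has columns word, word_count,
-- corpus, and corpus_date.
SELECT
  corpus,
  COUNT(DISTINCT word) AS distinct_words,
  SUM(word_count) AS total_words
FROM
  `bigquery-public-data.samples.shakespeare`
GROUP BY
  corpus
ORDER BY
  total_words DESC
LIMIT 10;
```

You could paste a query like this into the BigQuery web UI and run it interactively, with no cluster to provision first.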
It can scan terabytes in seconds and petabytes in minutes. This makes interactive, self-service exploration of massive data sets viable, which means better analysis, more creativity, and more interesting insights from your data. BigQuery does not replace every enterprise data store, though. For instance, it’s not an online transaction processing system, and it’s not geared toward applying changes as they happen. Since BigQuery is a self-contained, cloud-based service, it’s also not an on-premises solution. Query and storage resources are allocated dynamically based on your usage patterns.
Got a really big query? BigQuery scales for you, using the processing power of Google’s infrastructure. Sharing and collaboration are easy. You can control access to both the project and your data based on your business needs, such as giving others the ability to view or query your data, and because you can use standard SQL queries, anyone can get involved. Replicating data across multiple geographies backs a 99.9% SLA: you’re always going to be able to access your data, and you won’t lose it. BigQuery also encrypts all data at rest and in transit by default.
In terms of pricing, BigQuery separates the concepts of storage and compute, which allows you to scale and pay for each independently. You can choose either a pay-as-you-go model or a flat-rate monthly price. Now, the fun part: Qwik Labs. You can check out the links to start the Qwik Labs yourself here. These labs provide an introductory walkthrough of how to load and query data using both the BigQuery web UI and the command-line tool. Keep in mind that these labs will take about 30 minutes each to complete. At this point in the lab, we’ve loaded a custom data set into a new table. We’re going to preview the table and query the custom data set. Locate the table in BigQuery, open the table, and click Preview to view the data. Then click Compose Query and query the babynames data set.
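As a sketch of the kind of query that last step might run — note that the data set, table, and column names here (`babynames.names_2014` with `name`, `gender`, and `count`) are assumptions for illustration, not confirmed by the transcript:

```sql
-- Hypothetical query over the custom babynames table loaded in the lab.
-- Assumed schema: name STRING, gender STRING, count INT64.
SELECT
  name,
  count
FROM
  `babynames.names_2014`
WHERE
  gender = 'F'
ORDER BY
  count DESC
LIMIT 5;
```

A query like this returns the five most common female names in the loaded data set; swap in your own table and column names as created during the load step.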