How to delete unused storage in BigQuery?
BigQuery is a powerful cloud data warehouse that is easy to use: data engineers, scientists, analysts, and anyone else with basic SQL knowledge can add tables and run queries. However, this ease of use can also lead to data bloat, which drives up storage costs.
Identify unused tables
The first step to optimizing BigQuery storage costs is to identify tables that are not being used. You can do this by looking at the following information:
- Table creation date
- Table modification date
- Data insertion, modification, or deletion date
- Table references in queries
You can do this with the INFORMATION_SCHEMA TABLE_STORAGE and JOBS views, but it is not an easy task. On the other hand, Biq Blue can help you visualize the number of unused tables and their storage costs in just one click, for all your datasets.
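For readers who want to try the INFORMATION_SCHEMA route, here is a minimal sketch of what such a query could look like. It is not how Biq Blue works internally. The helper name `build_unused_tables_query`, the `region` and `lookback_days` parameters, and the 90-day default are assumptions made for illustration, and the column names are taken from the public TABLE_STORAGE and JOBS view documentation, so check them against your project before relying on the result. Note also that `referenced_tables` is only populated for query jobs and that BigQuery keeps job history for a limited time.

```python
import re


def build_unused_tables_query(region: str = "us", lookback_days: int = 90) -> str:
    """Return SQL listing tables that no query job referenced in the lookback window.

    Hypothetical helper: the name, parameters and defaults are illustrative.
    """
    # Both inputs are interpolated into the SQL text, so validate them first.
    if not re.fullmatch(r"[a-z0-9-]+", region):
        raise ValueError(f"unexpected region: {region!r}")
    if not isinstance(lookback_days, int) or lookback_days <= 0:
        raise ValueError("lookback_days must be a positive integer")

    views = f"`region-{region}`.INFORMATION_SCHEMA"
    cutoff = f"TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL {lookback_days} DAY)"
    return f"""
WITH referenced AS (
  -- Every table referenced by a query job in the lookback window.
  SELECT DISTINCT ref.project_id, ref.dataset_id, ref.table_id
  FROM {views}.JOBS AS j, UNNEST(j.referenced_tables) AS ref
  WHERE j.creation_time >= {cutoff}
)
SELECT
  s.project_id,
  s.table_schema AS dataset_id,
  s.table_name AS table_id,
  s.creation_time,
  s.storage_last_modified_time,
  s.total_logical_bytes
FROM {views}.TABLE_STORAGE AS s
LEFT JOIN referenced AS r
  ON r.project_id = s.project_id
  AND r.dataset_id = s.table_schema
  AND r.table_id = s.table_name
WHERE r.table_id IS NULL                          -- never referenced in the window
  AND NOT s.deleted                               -- ignore already deleted tables
  AND s.storage_last_modified_time < {cutoff}     -- no recent writes either
ORDER BY s.total_logical_bytes DESC
""".strip()


if __name__ == "__main__":
    print(build_unused_tables_query(region="us", lookback_days=90))
```

The printed SQL can be pasted into the BigQuery console, or run with the `google-cloud-bigquery` client via `bigquery.Client().query(sql).result()`. The result is a list of candidates to review, not proof that a table is safe to delete.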
Delete unused tables
Once you have identified unused tables, you can delete them to save storage costs. To do this, click the Download list of unused tables button in Biq Blue. This will download a CSV file with a list of all unused tables. You can then delete the tables that you no longer need, and you should see the effect on costs in the GCP billing console within about two days.
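If the list is long, the deletion statements can be generated from the CSV instead of being typed by hand. The sketch below assumes the file has `project_id`, `dataset_id` and `table_id` columns; those header names are hypothetical, so adjust them to match the actual export. The function `drop_statements` is likewise an illustrative helper. It only produces `DROP TABLE IF EXISTS` statements as text, so you can review them before running anything.

```python
import csv
import io
import re
from typing import Iterable, List

# Deliberately conservative: anything outside this set is rejected, not escaped.
_SAFE_NAME = re.compile(r"[A-Za-z0-9_-]+")

# Assumed CSV header names; change them to match the real export.
_COLUMNS = ("project_id", "dataset_id", "table_id")


def drop_statements(csv_lines: Iterable[str]) -> List[str]:
    """Turn CSV rows into DROP TABLE statements, one per listed table."""
    statements = []
    for row in csv.DictReader(csv_lines):
        parts = [(row.get(column) or "") for column in _COLUMNS]
        for part in parts:
            if not _SAFE_NAME.fullmatch(part):
                raise ValueError(f"refusing to quote unusual identifier: {part!r}")
        statements.append(f"DROP TABLE IF EXISTS `{'.'.join(parts)}`;")
    return statements


if __name__ == "__main__":
    # Made-up sample row standing in for the downloaded file.
    sample = io.StringIO(
        "project_id,dataset_id,table_id\n"
        "my-project,staging,tmp_export_2021\n"
    )
    for statement in drop_statements(sample):
        print(statement)
```

With a real file, pass an open file handle instead of the sample, for example `drop_statements(open("unused_tables.csv", newline=""))`, where the file name is again a placeholder.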
Implement a data expiration policy
In addition to deleting unused tables, you can also implement a data expiration policy to automatically delete data that is no longer needed. This can help you to reduce storage costs and improve data management.
If you have a good data retention strategy, your storage costs plotted over time should follow an exponential curve. On the other hand, a cost spike in the past means that tables that have not been used in months are still costing you a lot of money.
BigQuery offers two types of data expiration policies:
- Dataset expiration: This policy sets a default table expiration on a dataset, so that tables created in it afterwards are deleted once they reach the specified age.
- Partition expiration: This policy deletes the partitions of a partitioned table once they reach the specified age.
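Both policies can be set with BigQuery DDL. As a minimal sketch, the helpers below build the corresponding `ALTER SCHEMA` and `ALTER TABLE` statements using the `default_table_expiration_days` and `partition_expiration_days` options. The function names, the dataset and table names, and the day counts are placeholders chosen for illustration, not recommendations.

```python
import re

# Conservative identifier check: reject rather than try to escape.
_SAFE_NAME = re.compile(r"[A-Za-z0-9_-]+")


def _validate(days: int, *names: str) -> None:
    """Raise ValueError if the day count or any identifier looks unsafe."""
    if not isinstance(days, int) or days <= 0:
        raise ValueError("days must be a positive integer")
    for name in names:
        if not _SAFE_NAME.fullmatch(name):
            raise ValueError(f"unexpected identifier: {name!r}")


def dataset_expiration_sql(dataset: str, days: int) -> str:
    """DDL setting the default expiration for tables created in the dataset."""
    _validate(days, dataset)
    return (
        f"ALTER SCHEMA `{dataset}` "
        f"SET OPTIONS (default_table_expiration_days = {days});"
    )


def partition_expiration_sql(dataset: str, table: str, days: int) -> str:
    """DDL making partitions of a partitioned table expire after `days` days."""
    _validate(days, dataset, table)
    return (
        f"ALTER TABLE `{dataset}.{table}` "
        f"SET OPTIONS (partition_expiration_days = {days});"
    )


if __name__ == "__main__":
    # Placeholder names and retention periods.
    print(dataset_expiration_sql("staging", 30))
    print(partition_expiration_sql("analytics", "events", 400))
```

The dataset-level default only affects tables created after it is set, so existing tables still need their own expiration or a manual cleanup.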
Conclusion
By following these tips, you can optimize BigQuery storage costs and save money very quickly.
⭐ Additional tips