Configure access nodes, Databricks permissions, and Azure resources to discover, back up, and restore Databricks objects.
Access node requirements
Deploy access nodes in the same cloud region as the target metastore to avoid egress charges and help ensure backup jobs complete successfully.
Important
Only Linux access nodes are supported.
Auto-scaling for Azure access nodes is supported. For more information, see Configure auto-scaling for Azure Databricks access nodes.
Firewall requirements for access nodes
The following URLs must be accessible from access nodes:
- registry.opentofu.org
- github.com
- release-assets.githubusercontent.com
- api.github.com
- login.microsoftonline.com
- *.azuredatabricks.net
- *.dfs.core.windows.net
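One way to confirm these firewall rules from an access node is a simple TCP probe. The following is a minimal sketch, not a Commvault tool. It assumes the endpoints are reached over HTTPS on port 443 and that you supply one concrete hostname for each wildcard entry. The function names `build_targets` and `check_hosts` are hypothetical.

```python
import socket

# Hosts listed in the firewall requirements. The two wildcard entries cannot be
# probed directly, so the caller supplies one concrete hostname for each.
FIXED_HOSTS = [
    "registry.opentofu.org",
    "github.com",
    "release-assets.githubusercontent.com",
    "api.github.com",
    "login.microsoftonline.com",
]


def build_targets(workspace_host, storage_host):
    """Return the full host list after validating the wildcard substitutes."""
    if not workspace_host.endswith(".azuredatabricks.net"):
        raise ValueError("workspace_host must match *.azuredatabricks.net")
    if not storage_host.endswith(".dfs.core.windows.net"):
        raise ValueError("storage_host must match *.dfs.core.windows.net")
    return FIXED_HOSTS + [workspace_host, storage_host]


def check_hosts(hosts, port=443, timeout=5.0, connect=socket.create_connection):
    """Attempt a TCP connection to each host and map host -> True/False.

    Port 443 is an assumption (HTTPS). The connect parameter exists so the
    probe can be replaced in tests.
    """
    results = {}
    for host in hosts:
        try:
            conn = connect((host, port), timeout)
            conn.close()
            results[host] = True
        except OSError:
            # Covers DNS failures, refused connections, and timeouts.
            results[host] = False
    return results
```

Call `check_hosts(build_targets(<workspace host>, <storage account host>))` on the access node and review any host that maps to `False`. A successful TCP connection shows only that the port is reachable. It does not validate TLS inspection or proxy behavior.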
Azure configuration and requirements
Configure Azure resources and permissions required for Databricks protection.
- Create an access node in Azure. For more information, see Creating an access node for an Azure hypervisor.
- Create an Azure storage account for each metastore region to stage data during backups and restores.
- Create an Azure app registration and do the following:
  - Assign the Storage Blob Data Owner role to the app at the storage account level.
  - Assign the Reader role to the app at the subscription level.
  - Create a custom role by using the DatabricksWorkspaceCreator.json file and assign the role at the subscription level.
  To perform backups directly from the Databricks table storage locations, assign the Storage Blob Data Owner role to the app registration for each storage account that Databricks uses for table storage, and assign the Reader role to the app at the subscription level.
- Create an access connector for Azure Databricks.
- Assign the access connector managed identity to the storage account, and then assign the following roles:
  - Storage Blob Data Contributor
  - Reader
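The role assignments above can be collected into a reviewable plan before anyone applies them. The following is a minimal sketch under stated assumptions: all function names are hypothetical, the custom role name is passed in because it is defined inside DatabricksWorkspaceCreator.json, and both access connector roles are assumed to be scoped to the storage account. Each entry is rendered as an `az role assignment create` command for the Azure CLI.

```python
import shlex


def subscription_scope(subscription_id):
    """Azure resource ID of a subscription."""
    return f"/subscriptions/{subscription_id}"


def storage_account_scope(subscription_id, resource_group, account_name):
    """Azure resource ID of a storage account."""
    return (
        f"{subscription_scope(subscription_id)}/resourceGroups/{resource_group}"
        f"/providers/Microsoft.Storage/storageAccounts/{account_name}"
    )


def role_assignment_plan(subscription_id, resource_group, staging_account,
                         app_id, connector_principal_id, custom_role_name):
    """Return (assignee, role, scope) tuples for the required assignments."""
    sub = subscription_scope(subscription_id)
    account = storage_account_scope(subscription_id, resource_group, staging_account)
    return [
        # App registration.
        (app_id, "Storage Blob Data Owner", account),
        (app_id, "Reader", sub),
        (app_id, custom_role_name, sub),
        # Access connector managed identity. Scoping both roles to the
        # storage account is an assumption of this sketch.
        (connector_principal_id, "Storage Blob Data Contributor", account),
        (connector_principal_id, "Reader", account),
    ]


def to_az_command(assignment):
    """Render one plan entry as a shell-quoted Azure CLI command string."""
    assignee, role, scope = assignment
    return shlex.join([
        "az", "role", "assignment", "create",
        "--assignee", assignee,
        "--role", role,
        "--scope", scope,
    ])
```

Printing `to_az_command(entry)` for each plan entry gives a list that can be reviewed before it is run. For direct backups from table storage locations, the same `storage_account_scope` helper can produce an extra Storage Blob Data Owner entry for each storage account that Databricks uses.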
Databricks configuration and requirements
Configure Databricks identities, permissions, and access required for protection.
- Create a service principal in your Databricks account and do the following:
  - Assign the Admin Account role to the service principal.
  - Record the Client ID and Client Secret, which are used to authenticate your Databricks account with Commvault.
- Make sure that the service principal has the Allow unrestricted cluster creation, Databricks SQL access, and Workspace access entitlements enabled in the workspace.
- Add the storage credential in the Databricks workspace.
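The entitlement requirement can be checked against a service principal record retrieved from the workspace. The following is a minimal sketch that assumes the record has the shape returned by the Databricks workspace SCIM API, where entitlements appear as a list of objects with a `value` field. The mapping from UI labels to the values `allow-cluster-create`, `databricks-sql-access`, and `workspace-access`, and the function name `missing_entitlements`, are assumptions of this sketch.

```python
# Assumed mapping from SCIM entitlement values to the labels used above.
REQUIRED_ENTITLEMENTS = {
    "allow-cluster-create": "Allow unrestricted cluster creation",
    "databricks-sql-access": "Databricks SQL access",
    "workspace-access": "Workspace access",
}


def missing_entitlements(scim_service_principal):
    """Return the labels of required entitlements absent from the record."""
    granted = {
        entry.get("value")
        for entry in scim_service_principal.get("entitlements", [])
    }
    return [
        label
        for value, label in REQUIRED_ENTITLEMENTS.items()
        if value not in granted
    ]
```

An empty list means that all three entitlements are present on the record. Any returned labels name the entitlements to enable for the service principal in the workspace.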