
Analytics on Azure Blog

Announcing general availability of Cross-Cloud Data Governance with Azure Databricks

Jason_Pereira
Microsoft
May 21, 2025

We are thrilled to announce that the ability to access AWS S3 data through Unity Catalog, enabling cross-cloud data governance, is now generally available in Azure Databricks. This release allows teams to directly configure and query AWS S3 data from Azure Databricks without migrating or duplicating datasets, simplifying the standardization of policies, access controls, and auditing across both Azure Data Lake Storage (ADLS) and S3 storage.

Why This Matters

In today's hybrid and multicloud environments, managing data governance can be complex and fragmented. Organizations often face challenges with inconsistent security policies, duplicated governance processes, and increased operational overhead. Unity Catalog on Azure Databricks addresses these challenges by providing a unified and open governance solution for all data and AI assets, enabling security, compliance, and interoperability across clouds.


Key Benefits

  • Unified governance: Manage access policies, security controls, and compliance standards from a single platform, eliminating the need to juggle siloed systems.
  • Frictionless data access: Securely discover, query, and analyze data across clouds in a single workspace, reducing complexity and operational overhead.
  • Enhanced security and compliance: Gain centralized visibility, tagging, lineage, data classification, and auditing across all your cloud storage.

How It Works

Previously, accessing AWS S3 data from Azure Databricks required extracting, transforming, and loading (ETL) the data into ADLS, a process that was both costly and time-consuming. With this GA release, you can now set up an external cross-cloud S3 location directly from Unity Catalog on Azure Databricks, allowing you to read and govern your S3 data without migration or duplication.


Getting Started

To configure this feature, follow these steps:

  1. Set up storage credentials: Create an AWS IAM role (read-only) and set up your storage credential in the Azure Databricks Catalog Explorer.
  2. Create external locations: Navigate to External Locations within the Catalog Explorer and complete the setup using your storage credential.
  3. Apply permissions: Manage and apply consistent permissions across both ADLS and S3 data from within the Catalog Explorer.
  4. Start querying: Query your S3 data directly from your Azure Databricks workspace.
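As a rough sketch, steps 2 through 4 above map to Databricks SQL along the following lines. All object names, bucket paths, and principals here are hypothetical examples; the storage credential itself is created in Catalog Explorer as described in step 1.

```sql
-- Step 2: register the S3 bucket as an external location,
-- referencing the storage credential created in step 1.
CREATE EXTERNAL LOCATION IF NOT EXISTS s3_sales_data
URL 's3://example-bucket/sales/'
WITH (STORAGE CREDENTIAL aws_readonly_cred);

-- Step 3: apply the same style of grants you already use for ADLS locations.
GRANT READ FILES ON EXTERNAL LOCATION s3_sales_data TO `data_analysts`;

-- Step 4: expose the S3 data as an external table and query it in place.
CREATE TABLE IF NOT EXISTS main.sales.orders
USING DELTA
LOCATION 's3://example-bucket/sales/orders';

SELECT order_id, amount FROM main.sales.orders LIMIT 10;
```

Because the external location is read-only (per the IAM role in step 1), the table is queryable but not writable from Azure Databricks.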


Supported Features

The GA release supports accessing external tables and volumes in S3 from Azure Databricks, including:

  • AWS IAM role storage credentials
  • S3 external locations
  • S3 external tables
  • S3 external volumes
  • S3 dbutils.fs access
  • Delta sharing of S3 data from Unity Catalog on Azure
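For the last item, a minimal sketch of sharing an S3-backed external table via Delta Sharing from Unity Catalog on Azure (share, table, and recipient names are hypothetical):

```sql
-- Create a share and add the S3-backed external table to it.
CREATE SHARE IF NOT EXISTS sales_share;
ALTER SHARE sales_share ADD TABLE main.sales.orders;

-- Create a recipient and grant them read access to the share.
CREATE RECIPIENT IF NOT EXISTS partner_org;
GRANT SELECT ON SHARE sales_share TO RECIPIENT partner_org;
```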

For more details on the announcement, visit the Databricks blog post.


Azure continues to be the best destination for Databricks workloads. Learn more here.


Get started with Azure Databricks today!


Join Us at Data + AI Summit 2025

Join us in San Francisco at Data + AI Summit, June 9–12, where Microsoft is proud to be a Legend Sponsor. Meet Azure Databricks experts at the Microsoft booth and learn how Azure Databricks helps you unlock the full power of your data estate.

Register now to secure your spot!

Published May 21, 2025
Version 1.0