Databricks Data + AI Summit 2025: All 15+ Big Product Drops
The Databricks Data + AI Summit 2025 is over, and what a week it was. Beyond all the energy and buzz, the real highlight was the incredible technology announced during the event.
The Databricks Data + AI Summit 2025 went down at the Moscone Center in San Francisco from June 9-12, drawing over 22,000 in-person attendees and an additional 65,000 virtual participants worldwide.
Databricks dropped a series of updates that could transform the industry, including Databricks Lakebase, Databricks Apps, Mosaic AI Agent Bricks, the latest version of MLflow (v3.0), and much more.
In this article, we will dig into the details of everything that was unveiled and give you an in-depth look at what to anticipate in the months to come.
Databricks Summit 2025 — Product Announcements Summary
Need a quick summary? Here's a rundown of the major product announcements from the summit 👇.
📅 Day 1 of Databricks Data + AI Summit
- 🔮 Databricks Free Edition (Public Preview) → Databricks Free Edition provides a no‑cost, serverless workspace for individuals to ingest data, build dashboards, and train AI models on the same platform.
- 🔮 Databricks Lakebase (Public Preview) → Databricks Lakebase is a fully managed, serverless, Postgres‑compatible OLTP database built on Neon. It delivers ACID compliance, separates compute from storage, and offers low‑latency, high‑concurrency transaction processing.
- 🔮 Databricks Apps (GA) → Databricks Apps is generally available, offering a fully managed, serverless runtime within your Databricks workspace. It provides built‑in identity, governance, and observability so you can build, deploy, and scale interactive data‑intelligence apps without managing infrastructure.
- 🔮 Databricks App Builder Ecosystem (GA) → The Databricks App Builder Ecosystem delivers frameworks and serverless runtimes for rapidly authoring, securing, and deploying production-ready apps on the Data Intelligence Platform.
- 🔮 Agent Bricks (Beta) → Databricks Agent Bricks is a no‑code platform for designing, benchmarking, and deploying AI agents optimized on your proprietary data.
- 🔮 MLflow 3.0 (GA) → MLflow 3.0 has been redesigned specifically for generative AI workflows. It unifies monitoring, tracing, prompt versioning, human-in-the-loop evaluation, and cross-platform agent observability to manage the complete AI lifecycle (works on or off Databricks).
- 🔮 Serverless GPU Compute (Beta) → Databricks Serverless GPU Compute offers on‑demand, serverless access to NVIDIA A10 GPUs (with H100s coming soon) for training, inference, and classic ML workloads.
- 🔮 Model Context Protocol (MCP) Support (Beta) → Databricks now integrates Anthropic’s MCP standard: you can host and manage MCP‑compliant servers via Databricks Apps and prototype agents in the AI Playground, enabling LLMs to call external tools and enterprise data through a unified API.
- 🔮 AI Functions in SQL (GA) → AI Functions in SQL now deliver up to 3× faster performance and expanded multi‑modal capabilities to embed generative AI workflows directly into SQL queries.
- 🔮 Storage‑Optimized Vector Search (Public Preview) → Vector Search is rebuilt with separate compute and storage to index and query billions of vectors at scale. It delivers up to 7× lower cost and millisecond latencies for retrieval‑augmented generation and semantic search use cases.
📅 Day 2 of Databricks Data + AI Summit
- 🔮 Databricks Azure Partnership Extension → Databricks and Microsoft have inked a long-term extension of their strategic partnership, locking in Azure Databricks as a key Microsoft service through the 2030s, with plans for more seamless integration within the Azure ecosystem.
- 🔮 Databricks Apache Iceberg™ Support (Public Preview) → Databricks now provides full read/write access and governance for both managed and external Iceberg tables via Unity Catalog’s Iceberg REST Catalog API, complete with Predictive Optimization for Liquid Clustering.
- 🔮 Databricks Unity Catalog Metrics (Public Preview) → Unity Catalog Metrics delivers a semantic layer for defining, storing, and governing standardized business metrics that work seamlessly with SQL Analytics, AI/BI dashboards, and other external tools.
- 🔮 Databricks Unity Catalog Discover (Private Preview) → Unity Catalog Discover gives you a tailored, internal marketplace of vetted data products, complete with AI-driven suggestions and expert curation.
- 🔮 Apache Spark 4.0 → Spark 4.0 introduces Pythonic DataFrame plotting APIs, performance optimizations for PySpark workloads, enhanced SQL analytics, and major under‑the‑hood improvements for both batch and streaming execution.
- 🔮 Real‑time Mode (Private Preview) → A new low‑latency execution mode for Spark Structured Streaming (Project Lightspeed) that delivers p99 latencies under 300 ms for both stateless and stateful queries—now contributed upstream to Apache Spark.
- 🔮 Declarative Pipelines (GA) → Built on the open Spark Declarative Pipelines standard, this framework enables fully managed, serverless ETL pipelines defined in SQL or Python, with deep Unity Catalog integration.
- 🔮 Databricks Lakeflow (GA) → A unified data engineering solution—Lakeflow integrates ingestion, transformation, and orchestration within the Databricks Platform for end‑to‑end pipeline management.
- 🔮 Databricks Lakeflow Connect (GA) → Provides managed, reliable ingestion connectors (including CDC) for enterprise apps (Salesforce, Workday, ServiceNow), file sources (SFTP, SharePoint), databases (SQL Server, Oracle), and warehouses (Snowflake, BigQuery).
- 🔮 Databricks Zerobus (GA) → A Lakeflow Connect API offering high‑throughput direct writes into Unity Catalog with real‑time latency, perfect for event‑driven telemetry ingestion.
- 🔮 New IDE for Data Engineering (GA) → An integrated development environment for Lakeflow Declarative Pipelines featuring code/DAG side‑by‑side views, context‑aware debugging, built‑in Git, and AI‑assisted pipeline authoring.
- 🔮 Databricks Lakeflow Jobs (GA) → A production‑grade orchestrator on the unified Workflows platform, supporting notebooks, SQL, Declarative Pipelines, dbt, conditional execution, and serverless optimization.
- 🔮 Databricks Lakeflow Designer (GA) → Lakeflow Designer lets you build pipelines without writing code, thanks to its visual interface and GenAI-powered suggestions. Just drag and drop to create, set up, and get your ETL pipelines up and running.
- 🔮 Databricks SQL (Next Gen) → DBSQL Serverless has achieved a 5× performance gain on customer dashboards; Predictive Query Execution and Photon Vectorized Shuffle add another 25% boost, turning 20-second queries into roughly 15 seconds.
- 🔮 Databricks AI Functions in SQL (GA) → Built‑in SQL functions (`ai_parse_document`, `ai_text_completion`, embeddings) let you call LLMs and multimodal models directly from SQL queries.
- 🔮 Databricks Lakebridge → An open source, end‑to‑end migration toolkit featuring Analyzer, Converter, and Validator to automate legacy data warehouse‑to‑Databricks SQL migrations.
- 🔮 Databricks & Google Cloud Strategic AI Partnership → Databricks and Google Cloud embed Gemini models natively into the Data Intelligence Platform, enabling you to build and scale AI agents with Gemini’s capabilities on your enterprise data.
- 🔮 AI/BI Genie Enhancements (GA) → AI/BI Genie offers natural‑language Q&A, instant summaries, AI Forecasting, AI Top Drivers, and Deep Research Mode within Dashboards and Genie spaces.
- 🔮 Databricks One (GA) → Databricks One is a redesigned BI experience that provides business users with simple, secure access to AI/BI Dashboards, Genie spaces, and Databricks Apps via a unified, no‑code interface.
Databricks Data + AI Summit Day 1 (June 9)—Opening Keynote & Data Intelligence Kickoff
Databricks Co-founder and CEO Ali Ghodsi kicked off Day 1 with a keynote reflecting on the company's impressive history and the evolution of the Data + AI landscape. He recalled the early days, about a decade ago, when the Spark Summit at the Moscone Center had a relatively small crowd of 3k. Just three years ago, the conference was renamed the Spark + AI Summit and grew to 5k+ attendees, reflecting a growing focus on AI.
This year, the Databricks Data + AI Summit 2025 has grown tremendously. More than 22k individuals participated in person, and over 65k joined online from 150 countries, making it the largest Data + AI conference ever.
The massive turnout shows how central data and AI have become for businesses, and that Databricks is on the right track.
Ghodsi highlighted the foundational importance of open source to Databricks' success. He pointed to the massive adoption of projects like Apache Spark (over 2 billion downloads), Delta Lake (over 1 billion), Iceberg (over 360 million), and MLflow (over 300 million downloads).
Why Databricks Still Cares About Open Data
Ghodsi stated the plain truth: most enterprises, especially the older ones, face significant challenges with databases, data lakes, ETL jobs, BI tools, and governance issues. Each system has its own methods for handling security, metadata, and governance. Databricks has been promoting the Lakehouse for years as a solution; it centralizes data in public cloud storage (S3, ADLS, GCS) using open formats.
He further argued that open governance is equally crucial. This is where Unity Catalog comes in, providing a universal, open source governance layer that supports both Delta and Iceberg formats and integrates with Hive Metastore and REST Catalog APIs.
Data Intelligence for All
The idea of "data intelligence" was a central theme throughout the summit. Ghodsi highlighted that the platform's intelligence layer enables users to interact with data using natural language. He shared some adoption numbers: 81% of Databricks customers already use Genie, and a remarkable 98% have adopted the Databricks Assistant.
🔮 Databricks Free Edition
Ghodsi made a surprising announcement: Databricks now offers a free edition. No credit card or corporate email is required; just sign up and you're ready to go. And the best part? It's yours to use, with no strings attached, forever. To top it off, Databricks is investing a staggering $100 million in training and education.
Customer Success Stories
The keynote featured compelling customer success stories:
Joby Aviation:
Joby Aviation’s presentation demonstrated how they use Databricks to process gigabytes of telemetry data per minute from their electric aircraft. Databricks powers their real-time sandboxes, endpoints, and model serving infrastructure.
Virgin Atlantic:
Richard Masters, VP of Data and AI, shared how Virgin Atlantic uses the Medallion architecture to organize data into bronze, silver, and gold layers, with Unity Catalog providing strict governance over flight, customer, and commercial data. The airline leverages AI for predictive maintenance, pricing optimization, and safety messaging, saving teams hours of work daily. However, he emphasized the importance of human oversight, as AI models can still produce errors.
Ghodsi then shifted focus to transactional databases (OLTP), arguing they haven't changed much in 40 years and are "stuck in the past". He attributed this stagnation to proprietary vendor lock-in, which eliminates incentives for innovation.
🔮 Introducing Lakebase
The solution announced was "Databricks Lakebase", a new architecture for transactional databases.
It splits the database into a "base" layer for processing and a "lake" layer where data lives in open formats on cloud storage. Built on open source PostgreSQL, it has a decoupled storage-from-compute design. This makes it serverless, with autoscaling and the ability to scale to zero. A key feature is instant branching of the database, including both data and schema.
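Because Lakebase speaks the PostgreSQL wire protocol, any standard Postgres client should be able to connect to it. Here's a minimal sketch using psycopg2, assuming a hypothetical endpoint and an inventory-style table; the hostname, database, table, and credentials are placeholders, not documented values:

```python
import psycopg2  # standard PostgreSQL driver; Lakebase is Postgres-compatible

# All connection details below are placeholders for illustration only.
conn = psycopg2.connect(
    host="my-lakebase-instance.cloud.databricks.com",  # hypothetical endpoint
    dbname="inventory",
    user="app_user",
    password="<token-or-secret>",  # prefer short-lived credentials in practice
    sslmode="require",
)

with conn, conn.cursor() as cur:
    # A typical OLTP access pattern: a low-latency point lookup
    cur.execute(
        "SELECT sku, quantity FROM stock WHERE warehouse_id = %s",
        (42,),
    )
    for sku, quantity in cur.fetchall():
        print(sku, quantity)
```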
Ghodsi then called Reynold Xin to give a more in-depth talk about Lakebase. Xin highlighted the historical difficulty of combining analytics and AI with transactional workloads. Lakebase, by building on open formats and integrating with the Lakehouse and Unity Catalog, inherently breaks down these silos.
Reynold Xin announced that the innovations behind Lakebase are powered by Databricks' recent acquisition of Neon, a leading serverless PostgreSQL company. Databricks had a long-standing relationship with Neon, having invested in the company years ago and collaborated on the fundamental storage-compute separation architecture.
Nikita Shamgunov, CEO of Neon, joined the stage and shared a compelling statistic: 80% of databases created on Neon are generated by AI agents. He predicted this figure will reach 99% within a few years, underscoring the need for databases designed for an AI-driven future.
Holly Smith, Staff Developer Advocate at Databricks, provided a live demo of Lakebase in an inventory management application. The demo showcased Lakebase's ability to handle real-time operational data while seamlessly integrating with analytical data in Delta tables. Smith also highlighted the new PostgreSQL SQL editor, now natively available in the Databricks platform. The demonstration achieved nearly 19,000 queries per second with a median latency of just 4.56 milliseconds.
Building Applications and AI Agents
The focus then shifted to building applications and AI agents.
🔮 Databricks Apps
Ghodsi then called up Justin Debrabant, director of product management at Databricks, who announced the general availability of Databricks Apps, a solution designed to simplify the creation of secure and governed data intelligence applications directly within the Databricks environment, with governance managed through Unity Catalog.
Next, Dario Amodei, CEO of Anthropic, joined via video to discuss the Databricks partnership. He highlighted how companies like Block are using Anthropic's models through Databricks to build powerful internal coding agents and stressed the importance of governance frameworks for the coming wave of "agent swarms".
🔮 Agent Bricks
Ghodsi then introduced "Agent Bricks", a new offering for building production-ready AI agents that are "auto-optimized on your data".
Hanlin Tang, CTO of Databricks, provided an in-depth overview of Agent Bricks, followed by a practical demonstration by Kasey Uhlenhuth.
🔮 MLflow 3.0 Announcement
Once the demo session was over, Hanlin Tang joined again and made a major announcement: MLflow 3.0, a new version redesigned specifically for the generative AI era. It offers real-time tracking and observability for production applications, even those hosted outside of Databricks.
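As a taste of what that observability looks like in code, here's a hedged sketch using MLflow's tracing decorator (available in recent MLflow releases; the exact 3.0 surface may differ). The experiment path and the dummy model call are illustrative:

```python
import mlflow

# Traces can be logged to a Databricks workspace or any MLflow tracking
# server, since MLflow 3.0 works on or off Databricks.
mlflow.set_experiment("/Shared/genai-observability")  # hypothetical path

@mlflow.trace  # records inputs, outputs, and latency as a trace span
def answer_question(question: str) -> str:
    # Stand-in for a real LLM or agent call
    return f"Echo: {question}"

answer_question("What did Databricks announce at DAIS 2025?")
```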
🔮 Serverless GPU Compute
Along with MLflow 3.0, he announced Serverless GPU Compute. GPUs, specifically A10s, are now available in beta for serverless compute across notebooks and jobs, with H100s coming soon.
🔮 Model Context Protocol (MCP) support in Databricks
He then announced Model Context Protocol (MCP) support in Databricks, a common protocol for delivering tools and knowledge to large language models. Users can now host their own MCP servers using Databricks Apps, develop and prototype with MCP integrated directly into the Playground as tools, and connect to Databricks-hosted MCP servers for major services like Unity Catalog functions, Genie, and Vector Search.
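Under the hood, MCP is built on JSON-RPC 2.0, so a tool invocation is ultimately just a structured request to a server. The rough sketch below makes that concrete; the server URL, tool name, and auth header are hypothetical, and real clients normally use an MCP SDK with a proper session handshake rather than a raw HTTP call:

```python
import requests

# Hypothetical URL for an MCP server hosted via Databricks Apps.
MCP_URL = "https://my-mcp-server.example.com/mcp"

# MCP uses JSON-RPC 2.0; "tools/call" invokes a named tool with arguments.
payload = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "lookup_customer",               # hypothetical tool
        "arguments": {"customer_id": "C-1042"},  # hypothetical input
    },
}

resp = requests.post(
    MCP_URL,
    json=payload,
    headers={"Authorization": "Bearer <token>"},  # placeholder auth
    timeout=30,
)
print(resp.json())
```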
🔮 Upgraded AI Functions in SQL & Vector Search capability
Hanlin also had some other big news to share: AI Functions in SQL just got a major upgrade. They're now quicker, cheaper, and can handle multiple types of data, like documents and images, all while outperforming the competition by up to three times. Plus, their Vector Search capability has been rebuilt from the ground up to scale to billions of vectors and offers up to seven times lower costs.
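To make the AI Functions upgrade concrete, here's a hedged sketch of calling a model from SQL inside a notebook. `ai_query` is one of Databricks' documented SQL AI functions; the serving endpoint name, catalog, and table are illustrative assumptions:

```python
# Assumes a Databricks notebook where `spark` is already defined.
df = spark.sql("""
    SELECT
        review_text,
        ai_query(
            'databricks-meta-llama-3-3-70b-instruct',   -- illustrative endpoint
            CONCAT('Classify the sentiment of this review as positive, ',
                   'negative, or neutral: ', review_text)
        ) AS sentiment
    FROM main.reviews.customer_reviews                   -- placeholder table
    LIMIT 10
""")
df.show(truncate=False)
```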
Next up, Greg Springer, Chief AI & Data Officer at Mastercard, joined the session and shared how his team built a "Product Onboarding Assistant" on Databricks that reduced customer onboarding time by 30%. He expressed excitement for Agent Bricks’ potential to help scale more use cases with reliable evaluation and trust.
Finally, Day 1 ended with a fireside chat between Ali Ghodsi and Jamie Dimon, Chairman and CEO of JPMorgan Chase, where Dimon revealed that the bank allocates $18 billion yearly to IT and $2 billion to AI, employs 55,000 programmers and a 200-strong AI research team, and runs 600 live AI use cases in production (set to double or triple soon). Dimon's advice: don't debate AI; use it. But he warned that data complexity, not models, is the real challenge—especially after so many mergers and disparate data systems.
Dimon also flagged cybersecurity as the No. 1 issue, with AI both helping and complicating defense. He expects AI to impact every job and process, mostly for the better, but he's realistic about risks and the need for "grit and persistent problem-solving". He also didn't hold back on the need for strong US leadership in tech and a hard-nosed view on global risks.
Products and Features Announced on Day 1
| Product/Feature | Description |
|---|---|
| 🔮 Databricks Free Edition | Free, serverless workspace with limited compute and SQL—no SLAs, SSO, or private networking. |
| 🔮 Databricks Lakebase | Serverless, Postgres-compatible OLTP DB built on Neon with ACID support and high concurrency. |
| 🔮 Databricks Apps (GA) | Secure, governed environment to build and deploy data apps—fully serverless. |
| 🔮 App Builder Ecosystem | Frameworks, templates, and runtimes to build and ship Databricks-native apps fast. |
| 🔮 Agent Bricks (Beta) | No-code platform to build, evaluate, and deploy AI agents on enterprise data. |
| 🔮 MLflow 3.0 (GA) | GenAI-focused release with tracking, evaluation, and agent observability. |
| 🔮 Serverless GPU Compute (Beta) | On-demand A10 GPUs for training and inference—no cluster setup. |
| 🔮 Model Context Protocol Support | Unified API to connect LLMs with tools, data, and prompts using open MCP standard. |
| 🔮 AI Functions in SQL | Native, multi-modal AI functions embedded in SQL for LLM tasks. |
| 🔮 Vector Search (Upgraded) | High-speed, low-cost search at scale—billions of vectors with RAG-ready APIs. |
Check out this video if you want to watch the full summary of the Databricks Data + AI Summit's Day 1 keynote:
Data + AI Summit Keynote Day 1
Databricks Data + AI Summit Day 2 (June 10)—Massive Product Drops
Day 1 was pretty amazing with all those cool announcements, but Day 2 took it to the next level. Ali kicked off the day with a quick rundown of Day 1's top moments, which got the crowd revved up for what was coming next. From there, it was non-stop action with product reveal after product reveal, plus a packed speaker lineup.
🔮 Microsoft Partnership Deepens into the 2030s
Ali Ghodsi opened with key updates. Microsoft CEO Satya Nadella joined via pre-recorded video to celebrate their deepened strategic partnership.
During a fireside chat, Ghodsi and Nadella discussed the "crazy" pace of AI's growth over the past two years. Nadella highlighted AI's evolution from simple chat features to task-specific assistants and now to "digital co-workers" capable of independent operation. The partnership aims to help businesses derive tangible value from these automated systems by building on their existing technology.
While Azure Databricks is already tightly integrated with the Azure ecosystem, this long-term commitment will make it a native part of the Azure experience. For enterprise customers, this simplifies multi-vendor relationships and establishes a single point of accountability.
Nadella emphasized the importance of enterprise-wide platform decisions over piecemeal AI adoption. He argued that companies treating AI as a collection of scattered solutions miss the larger opportunity. A unified platform approach, like the one offered by Databricks and Azure, provides the consistency and data management necessary to scale AI applications across an entire organization.
🔮 Unity Catalog: Now with Iceberg
Databricks co-founder Matei Zaharia took the stage, recalling how Unity Catalog launched four years ago to solve fragmented governance across data tables, files, ML models, and even dashboards.
Unity Catalog brings together users, various data assets, and capabilities like security, access control, data discovery, lineage tracking, and cost management. Zaharia highlighted that 97% of customers now use Unity Catalog for unified security, lineage and cost control.
The major announcement: Full Apache Iceberg support in Unity Catalog. Databricks now offers what they call "the best catalog for Apache Iceberg". It means you get governed reads and writes from any engine, with performance that's 5X faster than some competitors' managed Iceberg offerings, thanks to predictive optimization. It even ties into the Delta Sharing ecosystem.
Michelle Leon then demoed how to access Unity Catalog-managed tables from tools like EMR, Snowflake, and DuckDB, showcasing attribute-based access control that masks sensitive data.
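For a sense of what "any engine" access looks like, here's a hedged sketch of reading a Unity Catalog-managed Iceberg table from plain Python via PyIceberg and the Iceberg REST Catalog API. The endpoint path, token, and table names are assumptions to adapt to your workspace:

```python
from pyiceberg.catalog import load_catalog

# Connect an external engine to Unity Catalog's Iceberg REST Catalog API.
# URI shape and auth are illustrative -- check your workspace configuration.
catalog = load_catalog(
    "unity",
    **{
        "type": "rest",
        "uri": "https://<workspace-host>/api/2.1/unity-catalog/iceberg",
        "token": "<personal-access-token>",
        "warehouse": "main",  # the Unity Catalog catalog to mount
    },
)

table = catalog.load_table("sales.orders")  # <schema>.<table>, placeholder names
print(table.scan(limit=5).to_arrow())       # read a few rows as Arrow
```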
🔮 Expansion of Unity Catalog to Business Users
Matei also announced two new business-user features:
Unity Catalog Metrics (Public Preview): A governed semantic layer in Databricks where you define your dimensions and measures once as metric views, so that everyone (data engineers, data scientists, BI users) can consistently use the same standardized definitions across tools and workloads (see the sketch below).
Unity Catalog Discover (Preview): An internal marketplace for curated data assets. It organizes data by business domain, making it easier for users to find and request access to what they need.
Keegan Dubbs then demonstrated how these features help analysts find certified tables and centralized KPIs.
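To sketch what a metric view might look like in practice, here's a hedged example based on the preview's YAML-in-SQL style; the exact DDL syntax, catalog, and column names are assumptions rather than confirmed specifics:

```python
# Assumes a Databricks notebook where `spark` is already defined.
# Define dimensions and measures once; SQL and BI tools reuse them.
spark.sql("""
CREATE VIEW main.finance.revenue_metrics
WITH METRICS
LANGUAGE YAML
AS $$
version: 0.1
source: main.finance.orders
dimensions:
  - name: order_month
    expr: DATE_TRUNC('MONTH', order_date)
measures:
  - name: total_revenue
    expr: SUM(amount)
$$
""")
```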
🔮 Apache Spark 4.0: Faster, Simpler, and Ready for Anything
Michael Armbrust (creator of Spark SQL, Delta Lake, and Delta Live Tables) took the stage to talk about Apache Spark. He highlighted Spark's impressive growth, from just 60 contributors when the conference started over a decade ago to over 4,000 today.
He described Spark 4.0 as the "greatest release yet", highlighting several key improvements:
- SQL enhancements, including new user-defined functions and pipe syntax for chaining complex transformations.
- Variant data type for semi-structured data (Delta/Iceberg compatible; see the sketch below).
- ANSI mode as the default for more robust queries.
- Spark Connect support for Swift/Rust/Go.
- Python data source API for custom reads/writes.
- Reimagined streaming state store ("transform with state" API).
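The Variant type is easiest to see in action. A small sketch (field names invented for illustration) using `parse_json` and `variant_get`, both of which ship with Spark 4.0:

```python
# Assumes a Spark 4.0 session is available as `spark`.
spark.sql("""
    SELECT parse_json('{"device": "sensor-7", "readings": [21.5, 22.1]}') AS payload
""").createOrReplaceTempView("events")

# variant_get extracts typed values from a VARIANT column via a JSON path.
spark.sql("""
    SELECT
        variant_get(payload, '$.device', 'string')      AS device,
        variant_get(payload, '$.readings[0]', 'double') AS first_reading
    FROM events
""").show()
```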
Armbrust then announced Real-time Mode, a new capability being contributed to Apache Spark. He explained that while structured streaming has been great for ETL with latency in seconds to minutes, there was a gap for operational use cases needing sub-second or millisecond latency. Real-time Mode changes how streaming works by running long-running tasks that constantly pull new data, processing it immediately upon arrival.
Perhaps the biggest Spark news was the open sourcing of Delta Live Tables (DLT) as Spark Declarative Pipelines. Armbrust recounted Spark's history, from the complex Resilient Distributed Datasets (RDDs) to the simpler Spark SQL, Structured Streaming, and Delta Lake. Each step simplified Spark, but building production-ready data pipelines still required significant expertise to manage checkpoints, retries, and dependencies. DLT was created to abstract away this complexity.
Now, by contributing Spark Declarative Pipelines to Apache Spark, Databricks is making it possible to build these end-to-end production pipelines with just a few lines of SQL. Armbrust concluded his presentation by live-merging the code into Apache Spark on GitHub.
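For a flavor of how compact these pipelines are, here's a minimal sketch in the DLT-style Python API that Spark Declarative Pipelines evolved from; the module and decorator names in the open-source release may differ from Databricks' `dlt` package, and the source path is a placeholder:

```python
import dlt  # Databricks' Delta Live Tables API; naming may differ upstream
from pyspark.sql import functions as F

@dlt.table(comment="Raw events ingested from cloud storage")
def raw_events():
    # `spark` is provided by the pipeline runtime; path is a placeholder
    return spark.readStream.format("json").load("/landing/events/")

@dlt.table(comment="Cleaned events with basic data quality enforcement")
@dlt.expect_or_drop("valid_id", "event_id IS NOT NULL")  # drop bad rows
def clean_events():
    return dlt.read_stream("raw_events").withColumn(
        "ingested_at", F.current_timestamp()
    )
```

The framework resolves the dependency between the two tables and handles checkpoints, retries, and orchestration automatically, which is exactly the complexity DLT was created to abstract away.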
🔮 Introducing Databricks Lakeflow
Bilal Aslam then joined the stage and introduced Databricks Lakeflow, announcing its general availability (GA).
Lakeflow is designed to tackle the common problem of fragmented data engineering tools, which often lead to complex, unreliable pipelines that are hard to govern.
Aslam outlined the three core capabilities of Databricks Lakeflow:
- Lakeflow Connect: Pulls structured and unstructured data into Databricks with point-and-click simplicity, backed by powerful APIs.
- Lakeflow Declarative Pipelines: The evolution of Delta Live Tables (DLT), now open-sourced as part of Apache Spark.
- Lakeflow Jobs: Serves as the orchestrator for running production workloads.
A surprise announcement was Zerobus, a direct write API for Unity Catalog designed to handle massive event data ingestion.
Another big announcement was a new IDE for Data Engineering that offers AI assistance, data exploration, and debugging tools designed for production readiness.
🔮 Lakeflow Designer—No Code, No Problem
Ali introduced Michael Piatek to talk about Lakeflow Designer, a solution built specifically for teams who "do not want to code at all".
Lakeflow Designer delivers production-quality ETL (Extract, Transform, Load) with "no code required".
The tool aims to bridge the gap between data engineers who build complex pipelines and business analysts who have deep business knowledge but often lack coding skills. Lakeflow Designer facilitates this with:
- Simplified collaboration, where everyone uses Lakeflow whether they prefer code or visual interfaces.
- A built-in path to production, where visual pipelines run as standard Lakeflow pipelines.
- Enhanced AI productivity, because it understands business context.
🔮 Databricks SQL: Now Faster and More Affordable
Ghodsi also provided updates on Databricks SQL (DBSQL), noting its rapid growth, with adoption doubling in the last year to over 12,000 customers. He highlighted DBSQL's impressive five-fold performance increase over the past three years. He then introduced a new "next-generation" version that delivers a 25% performance boost at no additional cost.
Arsalan Tavakoli then joined the stage to elaborate on the innovations Databricks has in store for DBSQL and why customers are choosing it. Here are some highlights he mentioned:
- 100% open formats (Delta/Iceberg) with Unity Catalog governance.
- Multimodal AI functions (LLMs, image queries) inside SQL.
- Lakebase sync for real‑time analytics + transactional apps.
- AI‑driven auto‑tuning (stats, file sizing, clustering).
🔮 Databricks Lakebridge
Tavakoli then dropped the biggest announcement of the session: Lakebridge, a new, free, open, AI-powered data migration tool.
🔮 Strategic AI Partnership with Google Cloud
Ghodsi announced a strategic AI partnership with Google Cloud to bring Gemini models natively to Databricks. The goal? Make "all the models available in Databricks", with Agent Bricks optimizing across them.
In a pre-recorded video, Google Cloud CEO Thomas Kurian explained that enterprises can use Gemini on Databricks within Google Cloud to cleanse and understand data, as well as assist with data science and analytics. He highlighted Gemini's advanced capabilities, such as strong AI reasoning, breaking down complex tasks, self-critique for finding the best answers, and sophisticated tool selection for generating SQL or Spark instructions.
🔮 AI/BI Reimagined
Ken Wong, the product leader for AI/BI, shared that user adoption of AI/BI on Databricks has surged by 500% in the last year. He then provided an update on the latest features of the AI/BI platform:
- Cross-filtering, one-click drill-down capabilities, and the ability to create calculated measures and dimensions without writing SQL.
- An expanded library of visualizations, such as Sankey diagrams, geospatial maps, and statistical plots.
- Multi-page support for complex operational reports, personalized subscriptions, and embedding options for corporate portals like SharePoint or Atlassian.
- Theming options to match corporate branding.
Wong explained that Genie was designed to work without requiring an upfront semantic model. Instead, it uses "knowledge extraction" from platform usage to identify relevant data assets for context. He then announced three notable updates:
- AI Forecasting is built directly into the SQL warehouse, letting users add accurate forecasts to dashboards with a single click.
- AI Top Drivers handles diagnostic questions like "Why did sales drop?" Users select anomalous chart points and ask AI/BI to explain differences.
- Deep Research Mode (Preview) handles complex, open-ended questions. It uses advanced LLM reasoning and Genie's internal knowledge to generate research plans, execute sub-steps in parallel, and provide summarized results with citations.
🔮 Databricks One
Finally, to cap off the presentation, Wong unveiled Databricks One, a completely redesigned user experience tailored specifically for business users, which was then demonstrated by Miranda Luna.
Products and Features Announced on Day 2
Here is a quick summary of everything announced on Day 2.
| Product/Feature | Description |
|---|---|
| 🔮 Azure Databricks Partnership Extension | Multi‑year extension securing Azure Databricks as a first‑party Microsoft service through the 2030s with deeper AD, Synapse, and Purview integration |
| 🔮 Unity Catalog Apache Iceberg Support | Full read/write/governance of managed and external Iceberg tables via Unity Catalog’s Iceberg REST Catalog API |
| 🔮 Unity Catalog Metrics (Public Preview) | Semantic layer for defining, storing, and governing standardized business metrics across SQL Analytics and BI tools |
| 🔮 Unity Catalog Discover (Preview) | Curated internal marketplace of certified data products—tables, dashboards, metrics—with AI recommendations for discovery |
| 🔮 Apache Spark 4.0 | Major release with Python DataFrame plotting APIs, SQL enhancements, and under‑the‑hood execution optimizations |
| 🔮 Real‑time Mode (Open Source to Spark) | New low‑latency Structured Streaming mode (Project Lightspeed) delivering p99 <300 ms for stateless and stateful workloads |
| 🔮 Declarative Pipelines (DLT Open Sourced) | Evolved Delta Live Tables framework—serverless ETL pipelines in SQL/Python with Unity Catalog governance—now open source |
| 🔮 Databricks Lakeflow (GA) | Unified data engineering stack: ingestion (Lakeflow Connect), transformation (Declarative Pipelines), orchestration (Jobs) |
| 🔮 Lakeflow Connect | Managed connectors (CDC, bulk) for apps (Salesforce, Workday), databases (SQL Server, Oracle), and file sources |
| 🔮 Zerobus | High‑throughput API for direct writes into Unity Catalog (100 MB/s) with <5 s latency for event telemetry ingestion |
| 🔮 IDE for Data Engineering | Code/DAG side‑by‑side IDE for Lakeflow Declarative Pipelines with Git, debugging, previews, and AI assistance |
| 🔮 Lakeflow Jobs | Production orchestrator on unified Workflows—supports notebooks, SQL, DLT, dbt, triggers, and conditional logic |
| 🔮 Lakeflow Designer | No‑code, drag‑and‑drop ETL builder with GenAI‑powered suggestions for analysts to deploy production pipelines |
| 🔮 Databricks SQL (Next‑Gen) | Serverless SQL engine with Photon Vectorized Shuffle and Predictive Query Execution—delivers up to 5× dashboard speed gains |
| 🔮 DBSQL AI Capabilities | Native SQL AI functions (`ai_parse_document`, `ai_translate`, embeddings) for LLM tasks within queries and dashboards |
| 🔮 Lakebridge | Open‑source migration toolkit (Analyzer, Converter, Validator) for automated legacy warehouse to Databricks SQL migrations |
| 🔮 Google Cloud Strategic AI Partnership | Native integration of Google’s Gemini models into Databricks for building and scaling AI agents on enterprise data |
| 🔮 AI/BI Genie (GA) | Generally available AI/BI component offering natural‑language Q&A, AI Forecasting, Top Drivers, and Deep Research Mode |
| 🔮 AI/BI Deep Research Mode (Preview) | Advanced Genie mode for exploratory, open‑ended data research with multi‑modal AI assistance |
| 🔮 Databricks One (GA) | Redesigned, no‑code BI interface unifying Dashboards, Genie spaces, and Apps with role‑based governance and natural‑language querying |
Check out this video if you want to watch the full summary of the Databricks Data + AI Summit's Day 2 keynote:
Data + AI Summit Keynote Day 2
Day 3 (June 11) — Hands-On Sessions
After two days packed with major product announcements, Day 3 shifted from theory to practice. The Databricks Data + AI Summit 2025 transformed into a hands-on laboratory, where attendees could finally work with the tools teased during the keynotes.
Hands-On Sessions With New Tech
The deep-dive workshops allowed attendees to break things, fix them, and truly understand what makes the new features tick. Databricks' product teams set up dedicated labs for each major announcement from the previous days.
For those wanting to test-drive the new serverless GPU capabilities, there was a workshop available. If you were curious about how MLflow 3.0 handles specific use cases, you could join a guided session. These hands-on experiences enabled participants to see how the products behave with real data, not just the curated examples from marketing demos.
Learning Through Experimentation
Day 3 provided an opportunity for experimentation. Attendees could push the limits of the new AI Functions, see how they handle edge cases, and determine where they might fit into existing workflows. The workshops covered everything from basic setup to advanced configuration.
Catch all the keynotes and sessions from the Databricks Data + AI Summit 2025—watch on demand here:
Day 4 (June 12) — Wrap Up
The final day of the Databricks Data + AI Summit 2025 took a different approach. Instead of more product announcements or hands-on coding, Day 4 focused on the ecosystem that makes Databricks work in the real world.
Partner Sessions
Partners such as Monte Carlo, Microsoft, Tredence, Actian, Alation, Deloitte, Infosys, and Tealium shared proven approaches to unify governance, improve data quality, and operationalize trusted AI at scale. These were not sales pitches disguised as technical talks; they were real stories from companies that have deployed Databricks in complex enterprise environments.
Customer Implementation Stories
Customer talks featured organizations using the Databricks Data Intelligence Platform for enterprise AI and governance modernization. Data leaders from 7-Eleven, Acxiom, Corning, FedEx, Nationwide, American Airlines, T-Mobile, PepsiCo, Atlassian, Amgen, Capital One, and more shared their Unity Catalog implementations.
These talks covered catalog migrations, fine-grained access control, cost optimization, and AI governance. The focus was on scaling secure data access, streamlining compliance, and enabling trusted collaboration across business units.
Network Building Sessions
Day 4 also served as the summit's networking finale. With no major announcements to chase, attendees could focus on building relationships. The partner sessions became natural conversation starters.
Day 4 sessions bridged the gap between conference demos and actual deployment. Partners provided concrete implementation paths, while customer stories showed what works in production environments.
The summit concluded with networking and planning sessions. Attendees left with specific next steps rather than just product knowledge.
Conclusion
And that’s a wrap! Databricks Data + AI Summit 2025 delivered a ton of energy, big ambitions, and a clear message: Databricks is creating the future of enterprise data and AI, not just building tools. They're making it easy for business users to access data with Databricks One and AI/BI Genie, and going all in on AI innovation with Agent Bricks and MLflow 3.0, aiming to make AI approachable for everyone and bring data workflows together. The launch of Lakebase is huge: it merges transactional and analytical systems, making data architecture a whole lot simpler. To boost the ecosystem, Databricks is forging partnerships and offering a free edition, attracting new talent along the way. Databricks is on a mission to become the go-to platform where data and AI meet.
Additional Resources
- Databricks Announces 2025 Data + AI Summit Keynote Lineup
- Mosaic AI Announcements at Data + AI Summit 2025
- Your 2025 Data + AI Summit Guide for the Tech Industry Experience
- What’s new in security and compliance at Data + AI Summit 2025
- What’s new with Databricks Unity Catalog at Data + AI Summit 2025
- Introducing Databricks One
- Introducing Lakebridge: Free, Open Data Migration to Databricks SQL
- AI/BI Genie is now Generally Available
- What Is a Lakebase?
- Databricks SQL Accelerates Customer Workloads by 5x in Just Three Years
- Announcing the General Availability of Databricks Lakeflow
- Announcing Lakeflow Designer: No-Code ETL, Powered by the Data Intelligence Platform
- Bringing Declarative Pipelines to the Apache Spark™ Open Source Project
- Announcing full Apache Iceberg™ support in Databricks
Databricks Data + AI Summit 2025 Playlist
Product Launches from Data + AI Summit 2025 Day 1
Product Launches from Data + AI Summit 2025 Day 2