Databricks Data + AI Summit 2025: All 15+ Big Product Drops
The Databricks Data + AI Summit 2025 is over, and what a week it was. Beyond all the energy and buzz, the real highlight was the incredible technology announced during the event.
The Databricks Data + AI Summit 2025 went down at the Moscone Center in San Francisco from June 9-12, drawing over 22,000 in-person attendees and an additional 65,000 virtual participants worldwide.
Databricks dropped a series of updates that could transform the industry, including Databricks Lakebase, Databricks Apps, Mosaic AI Agent Bricks, the latest version of MLflow (v3.0), and much more.
In this article, we will dig into the details of everything that was unveiled and give you an in-depth look at what to anticipate in the months to come.
Databricks Summit 2025 — Product Announcements Summary
Need a quick summary? Here's a rundown of the major product announcements from the summit 👇.
📅 Day 1 of Databricks Data + AI Summit
- 🔮 Databricks Free Edition (Public Preview) → Databricks Free Edition provides a no‑cost, serverless workspace for individuals to ingest data, build dashboards, and train AI models on the same platform.
- 🔮 Databricks Lakebase (Public Preview) → Databricks Lakebase is a fully managed, serverless, Postgres‑compatible OLTP database built on Neon. It delivers ACID compliance, separates compute from storage, and offers low‑latency, high‑concurrency transaction processing.
- 🔮 Databricks Apps (GA) → Databricks Apps is generally available, offering a fully managed, serverless runtime within your Databricks workspace. It provides built‑in identity, governance, and observability so you can build, deploy, and scale interactive data‑intelligence apps without managing infrastructure.
- 🔮 Databricks App Builder Ecosystem (GA) → The Databricks App Builder Ecosystem delivers frameworks and serverless runtimes for rapidly authoring, securing, and deploying production-ready apps on the Data Intelligence Platform.
- 🔮 Agent Bricks (Beta) → Databricks Agent Bricks is a no‑code platform for designing, benchmarking, and deploying AI agents optimized on your proprietary data.
- 🔮 MLflow 3.0 (GA) → MLflow 3.0 has been redesigned specifically for generative AI workflows. It unifies monitoring, tracing, prompt versioning, human-in-the-loop evaluation, and cross-platform agent observability to manage the complete AI lifecycle (works on or off Databricks).
- 🔮 Serverless GPU Compute (Beta) → Databricks Serverless GPU Compute offers on‑demand, serverless access to NVIDIA A10 GPUs (with H100s coming soon) for training, inference, and classic ML workloads.
- 🔮 Model Context Protocol (MCP) Support (Beta) → Databricks now integrates Anthropic’s MCP standard: you can host and manage MCP‑compliant servers via Databricks Apps and prototype agents in the AI Playground, enabling LLMs to call external tools and enterprise data through a unified API.
- 🔮 AI Functions in SQL (GA) → AI Functions in SQL now deliver up to 3× faster performance and expanded multi‑modal capabilities to embed generative AI workflows directly into SQL queries.
- 🔮 Storage‑Optimized Vector Search (Public Preview) → Vector Search is rebuilt with separate compute and storage to index and query billions of vectors at scale. It delivers up to 7× lower cost and millisecond latencies for retrieval‑augmented generation and semantic search use cases.
📅 Day 2 of Databricks Data + AI Summit
- 🔮 Databricks Azure Partnership Extension → Databricks and Microsoft have inked a long-term extension of their strategic partnership, locking in Azure Databricks as a key Microsoft service through the 2030s, with plans for more seamless integration within the Azure ecosystem.
- 🔮 Databricks Apache Iceberg™ Support (Public Preview) → Databricks now provides full read/write access and governance for both managed and external Iceberg tables via Unity Catalog’s Iceberg REST Catalog API, complete with Predictive Optimization for Liquid Clustering.
- 🔮 Databricks Unity Catalog Metrics (Public Preview) → Unity Catalog Metrics delivers a semantic layer for defining, storing, and governing standardized business metrics that work seamlessly with SQL Analytics, AI/BI dashboards, and other external tools.
- 🔮 Databricks Unity Catalog Discover (Private Preview) → Unity Catalog Discover gives you a tailored, internal marketplace of vetted data products, complete with AI-driven suggestions and expert curation.
- 🔮 Apache Spark 4.0 → Spark 4.0 introduces Pythonic DataFrame plotting APIs, performance optimizations for PySpark workloads, enhanced SQL analytics, and major under‑the‑hood improvements for both batch and streaming execution.
- 🔮 Real‑time Mode (Private Preview) → A new low‑latency execution mode for Spark Structured Streaming (Project Lightspeed) that delivers p99 latencies under 300 ms for both stateless and stateful queries—now contributed upstream to Apache Spark.
- 🔮 Declarative Pipelines (GA) → Built on the open Spark Declarative Pipelines standard, this framework enables fully managed, serverless ETL pipelines defined in SQL or Python, with deep Unity Catalog integration.
- 🔮 Databricks Lakeflow (GA) → A unified data engineering solution—Lakeflow integrates ingestion, transformation, and orchestration within the Databricks Platform for end‑to‑end pipeline management.
- 🔮 Databricks Lakeflow Connect (GA) → Provides managed, reliable ingestion connectors (including CDC) for enterprise apps (Salesforce, Workday, ServiceNow), file sources (SFTP, SharePoint), databases (SQL Server, Oracle), and warehouses (Snowflake, BigQuery).
- 🔮 Databricks Zerobus (GA) → A Lakeflow Connect API offering high‑throughput direct writes into Unity Catalog with real‑time latency, perfect for event‑driven telemetry ingestion.
- 🔮 New IDE for Data Engineering (GA) → An integrated development environment for Lakeflow Declarative Pipelines featuring code/DAG side‑by‑side views, context‑aware debugging, built‑in Git, and AI‑assisted pipeline authoring.
- 🔮 Databricks Lakeflow Jobs (GA) → A production‑grade orchestrator on the unified Workflows platform, supporting notebooks, SQL, Declarative Pipelines, dbt, conditional execution, and serverless optimization.
- 🔮 Databricks Lakeflow Designer (GA) → Lakeflow Designer lets you build pipelines without writing code, thanks to its visual interface and GenAI-powered suggestions. Just drag and drop to create, set up, and get your ETL pipelines up and running.
- 🔮 Databricks SQL (Next Gen) → DBSQL Serverless has achieved a 5× performance gain on customer dashboards; Predictive Query Execution and Photon Vectorized Shuffle add another 25% boost, turning 20-second queries into roughly 15 seconds.
- 🔮 Databricks AI Functions in SQL (GA) → Built‑in SQL functions (`ai_parse_document`, `ai_text_completion`, embeddings) let you call LLMs and multimodal models directly from SQL queries.
- 🔮 Databricks Lakebridge → An open source, end‑to‑end migration toolkit featuring Analyzer, Converter, and Validator to automate legacy data warehouse‑to‑Databricks SQL migrations.
- 🔮 Databricks & Google Cloud Strategic AI Partnership → Databricks and Google Cloud embed Gemini models natively into the Data Intelligence Platform, enabling you to build and scale AI agents with Gemini’s capabilities on your enterprise data.
- 🔮 AI/BI Genie Enhancements (GA) → AI/BI Genie offers natural‑language Q&A, instant summaries, AI Forecasting, AI Top Drivers, and Deep Research Mode within Dashboards and Genie spaces.
- 🔮 Databricks One (GA) → Databricks One is a redesigned BI experience that provides business users with simple, secure access to AI/BI Dashboards, Genie spaces, and Databricks Apps via a unified, no‑code interface.
Databricks Data + AI Summit Day 1 (June 9)—Opening Keynote & Data Intelligence Kickoff
Databricks Co-founder and CEO Ali Ghodsi kicked off Day 1 with a keynote reflecting on the company's impressive history and the evolution of the Data + AI landscape. He recalled the early days, about a decade ago, when the Spark Summit at the Moscone Center had a relatively small crowd of 3k. Just three years ago, the conference was renamed the Spark + AI Summit and grew to 5k+ attendees, reflecting a growing focus on AI.
This year, the Databricks Data + AI Summit 2025 has grown tremendously. More than 22k individuals participated in person, and over 65k joined online from 150 countries, making it the largest Data + AI conference ever.
The massive turnout shows how central data and AI have become for businesses, and that Databricks is on the right track.
Ghodsi highlighted the foundational importance of open source to Databricks' success. He pointed to the massive adoption of projects like Apache Spark (over 2 billion downloads), Delta Lake (over 1 billion), Iceberg (over 360 million), and MLflow (over 300 million downloads).
Why Databricks Still Cares About Open Data
Ghodsi stated the plain truth: most enterprises, especially the older ones, face significant challenges with databases, data lakes, ETL jobs, BI tools, and governance issues. Each system has its own methods for handling security, metadata, and governance. Databricks has been promoting the Lakehouse for years as a solution; it centralizes data in public cloud storage (S3, ADLS, GCS) using open formats.
He further argued that open governance is equally crucial. This is where Unity Catalog comes in, providing a universal, open source governance layer that supports both Delta and Iceberg formats and integrates with Hive Metastore and REST Catalog APIs.
Data Intelligence for All
The idea of "data intelligence" was a central theme throughout the summit. Ghodsi highlighted that the platform's intelligence layer enables users to interact with data using natural language. He shared some adoption numbers: 81% of Databricks customers already use Genie, and a remarkable 98% have adopted the Databricks Assistant.
🔮 Databricks Free Edition
Ghodsi made a surprising announcement: Databricks now offers a free edition. No credit card or corporate email is required; just sign up and you're ready to go. And the best part? It's yours to use, with no strings attached, forever. To top it off, Databricks is investing a staggering $100 million in training and education.
Customer Success Stories
The keynote featured compelling customer success stories:
Joby Aviation:
Joby Aviation’s presentation demonstrated how they use Databricks to process gigabytes of telemetry data per minute from their electric aircraft. Databricks powers their real-time sandboxes, endpoints, and model serving infrastructure.
Virgin Atlantic:
Richard Masters, VP of Data and AI, shared how Virgin Atlantic uses the Medallion architecture to organize data into bronze, silver, and gold layers, with Unity Catalog providing strict governance over flight, customer, and commercial data. The airline leverages AI for predictive maintenance, pricing optimization, and safety messaging, saving teams hours of work daily. However, he emphasized the importance of human oversight, as AI models can still produce errors.
Ghodsi then shifted focus to transactional databases (OLTP), arguing they haven't changed much in 40 years and are "stuck in the past". He attributed this stagnation to proprietary vendor lock-in, which eliminates incentives for innovation.
🔮 Introducing Lakebase
The solution announced was "Databricks Lakebase", a new architecture for transactional databases.
It splits the database into a "base" layer for processing and a "lake" layer where data lives in open formats on cloud storage. Built on open source PostgreSQL, it has a decoupled storage-from-compute design. This makes it serverless, with autoscaling and the ability to scale to zero. A key feature is instant branching of the database, including both data and schema.
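Because Lakebase speaks the PostgreSQL wire protocol, any standard Postgres client should be able to connect to it. Here's a minimal sketch using psycopg2, assuming a hypothetical endpoint and an inventory-style table; the hostname, database, table, and credentials are placeholders, not documented values:

```python
import psycopg2  # standard PostgreSQL driver; Lakebase is Postgres-compatible

# All connection details below are placeholders for illustration only.
conn = psycopg2.connect(
    host="my-lakebase-instance.cloud.databricks.com",  # hypothetical endpoint
    dbname="inventory",
    user="app_user",
    password="<token-or-secret>",  # prefer short-lived credentials in practice
    sslmode="require",
)

with conn, conn.cursor() as cur:
    # A typical OLTP access pattern: a low-latency point lookup
    cur.execute(
        "SELECT sku, quantity FROM stock WHERE warehouse_id = %s",
        (42,),
    )
    for sku, quantity in cur.fetchall():
        print(sku, quantity)
```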
Ghodsi then called Reynold Xin to give a more in-depth talk about Lakebase. Xin highlighted the historical difficulty of combining analytics and AI with transactional workloads. Lakebase, by building on open formats and integrating with the Lakehouse and Unity Catalog, inherently breaks down these silos.
Reynold Xin announced that the innovations behind Lakebase are powered by Databricks' recent acquisition of Neon, a leading serverless PostgreSQL company. Databricks had a long-standing relationship with Neon, having invested in the company years ago and collaborated on the fundamental storage-compute separation architecture.
Nikita Shamgunov, CEO of Neon, joined the stage and shared a compelling statistic: 80% of databases created on Neon are generated by AI agents. He predicted this figure will reach 99% within a few years, underscoring the need for databases designed for an AI-driven future.
Holly Smith, Staff Developer Advocate at Databricks, provided a live demo of Lakebase in an inventory management application. The demo showcased Lakebase's ability to handle real-time operational data while seamlessly integrating with analytical data in Delta tables. Smith also highlighted the new PostgreSQL SQL editor, now natively available in the Databricks platform. The demonstration achieved nearly 19,000 queries per second with a median latency of just 4.56 milliseconds.
Building Applications and AI Agents
The focus then shifted to building applications and AI agents.
🔮 Databricks Apps
Ghodsi then called up Justin Debrabant, director of product management at Databricks, who announced the general availability of Databricks Apps, a solution designed to simplify the creation of secure and governed data intelligence applications directly within the Databricks environment, with governance managed through Unity Catalog.
Next, Dario Amodei, CEO of Anthropic, joined via video to discuss the Databricks partnership. He highlighted how companies like Block are using Anthropic's models through Databricks to build powerful internal coding agents and stressed the importance of governance frameworks for the coming wave of "agent swarms".
🔮 Agent Bricks
Ghodsi then introduced "Agent Bricks", a new offering for building production-ready AI agents that are "auto-optimized on your data".
Hanlin Tang, CTO of Databricks, provided an in-depth overview of Agent Bricks, followed by a practical demonstration by Kasey Uhlenhuth.
🔮 MLflow 3.0 Announcement
Once the demo session was over, Hanlin Tang joined again and made a major announcement: MLflow 3.0, a new version redesigned specifically for the generative AI era. It offers real-time tracking and observability for production applications, even those hosted outside of Databricks.
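As a taste of what that observability looks like in code, here's a hedged sketch using MLflow's tracing decorator (available in recent MLflow releases; the exact 3.0 surface may differ). The experiment path and the dummy model call are illustrative:

```python
import mlflow

# Traces can be logged to a Databricks workspace or any MLflow tracking
# server, since MLflow 3.0 works on or off Databricks.
mlflow.set_experiment("/Shared/genai-observability")  # hypothetical path

@mlflow.trace  # records inputs, outputs, and latency as a trace span
def answer_question(question: str) -> str:
    # Stand-in for a real LLM or agent call
    return f"Echo: {question}"

answer_question("What did Databricks announce at DAIS 2025?")
```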
🔮 Serverless GPU Compute
Along with MLflow 3.0, he announced Serverless GPU Compute. GPUs, specifically A10s, are now available in beta for serverless compute across notebooks and jobs, with H100s coming soon.
🔮 Model Context Protocol (MCP) support in Databricks
He then announced Model Context Protocol (MCP) support in Databricks, a common protocol for delivering tools and knowledge to large language models. Users can now host their own MCP servers using Databricks Apps, develop and prototype with MCP integrated directly into the Playground as tools, and connect to Databricks-hosted MCP servers for major services like Unity Catalog functions, Genie, and Vector Search.
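Under the hood, MCP is built on JSON-RPC 2.0, so a tool invocation is ultimately just a structured request to a server. The rough sketch below makes that concrete; the server URL, tool name, and auth header are hypothetical, and real clients normally use an MCP SDK with a proper session handshake rather than a raw HTTP call:

```python
import requests

# Hypothetical URL for an MCP server hosted via Databricks Apps.
MCP_URL = "https://my-mcp-server.example.com/mcp"

# MCP uses JSON-RPC 2.0; "tools/call" invokes a named tool with arguments.
payload = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "lookup_customer",               # hypothetical tool
        "arguments": {"customer_id": "C-1042"},  # hypothetical input
    },
}

resp = requests.post(
    MCP_URL,
    json=payload,
    headers={"Authorization": "Bearer <token>"},  # placeholder auth
    timeout=30,
)
print(resp.json())
```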
🔮 Upgraded AI Functions in SQL & Vector Search capability
Hanlin also had some other big news to share: AI Functions in SQL just got a major upgrade. They're now quicker, cheaper, and can handle multiple types of data, like documents and images, all while outperforming the competition by up to three times. Plus, their Vector Search capability has been rebuilt from the ground up to scale to billions of vectors and offers up to seven times lower costs.
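To make the AI Functions upgrade concrete, here's a hedged sketch of calling a model from SQL inside a notebook. `ai_query` is one of Databricks' documented SQL AI functions; the serving endpoint name, catalog, and table are illustrative assumptions:

```python
# Assumes a Databricks notebook where `spark` is already defined.
df = spark.sql("""
    SELECT
        review_text,
        ai_query(
            'databricks-meta-llama-3-3-70b-instruct',   -- illustrative endpoint
            CONCAT('Classify the sentiment of this review as positive, ',
                   'negative, or neutral: ', review_text)
        ) AS sentiment
    FROM main.reviews.customer_reviews                   -- placeholder table
    LIMIT 10
""")
df.show(truncate=False)
```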
Next up, Greg Springer, Chief AI & Data Officer at Mastercard, joined the session and shared how his team built a "Product Onboarding Assistant" on Databricks that reduced customer onboarding time by 30%. He expressed excitement for Agent Bricks’ potential to help scale more use cases with reliable evaluation and trust.
Finally, Day 1 ended with a fireside chat between Ali Ghodsi and Jamie Dimon, Chairman and CEO of JPMorgan Chase, where Dimon revealed that the bank allocates $18 billion yearly to IT and $2 billion to AI, employs 55,000 programmers and a 200-strong AI research team, and runs 600 live AI use cases in production (set to double or triple soon). Dimon's advice: don't debate AI; use it. But he warned that data complexity, not models, is the real challenge—especially after so many mergers and disparate data systems.
Dimon also flagged cybersecurity as the No. 1 issue, with AI both helping and complicating defense. He expects AI to impact every job and process, mostly for the better, but he's realistic about risks and the need for "grit and persistent problem-solving". He also didn't hold back on the need for strong US leadership in tech and a hard-nosed view on global risks.
Products and Features Announced on Day 1
| Product/Feature | Description |
|---|---|
| 🔮 Databricks Free Edition | Free, serverless workspace with limited compute and SQL—no SLAs, SSO, or private networking. |
| 🔮 Databricks Lakebase | Serverless, Postgres-compatible OLTP DB built on Neon with ACID support and high concurrency. |
| 🔮 Databricks Apps (GA) | Secure, governed environment to build and deploy data apps—fully serverless. |
| 🔮 App Builder Ecosystem | Frameworks, templates, and runtimes to build and ship Databricks-native apps fast. |
| 🔮 Agent Bricks (Beta) | No-code platform to build, evaluate, and deploy AI agents on enterprise data. |
| 🔮 MLflow 3.0 (GA) | GenAI-focused release with tracking, evaluation, and agent observability. |
| 🔮 Serverless GPU Compute (Beta) | On-demand A10 GPUs for training and inference—no cluster setup. |
| 🔮 Model Context Protocol Support | Unified API to connect LLMs with tools, data, and prompts using open MCP standard. |
| 🔮 AI Functions in SQL | Native, multi-modal AI functions embedded in SQL for LLM tasks. |
| 🔮 Vector Search (Upgraded) | High-speed, low-cost search at scale—billions of vectors with RAG-ready APIs. |
Check out this video if you want to watch the full summary of the Databricks Data + AI Summit's Day 1 keynote:
Data + AI Summit Keynote Day 1
Databricks Data + AI Summit Day 2 (June 10)—Massive Product Drops
Day 1 was pretty amazing with all those cool announcements, but Day 2 took it to the next level. Ali kicked off the day with a quick rundown of Day 1's top moments, which got the crowd revved up for what was coming next. From there, it was non-stop action with product reveal after product reveal, plus a packed speaker lineup.
🔮 Microsoft Partnership Deepens into the 2030s
Ali Ghodsi opened with key updates. Microsoft CEO Satya Nadella joined via pre-recorded video to celebrate their deepened strategic partnership.
During a fireside chat, Ghodsi and Nadella discussed the "crazy" pace of AI's growth over the past two years. Nadella highlighted AI's evolution from simple chat features to task-specific assistants and now to "digital co-workers" capable of independent operation. The partnership aims to help businesses derive tangible value from these automated systems by building on their existing technology.
While Azure Databricks is already tightly integrated with the Azure ecosystem, this long-term commitment will make it a native part of the Azure experience. For enterprise customers, this simplifies multi-vendor relationships and establishes a single point of accountability.
Nadella emphasized the importance of enterprise-wide platform decisions over piecemeal AI adoption. He argued that companies treating AI as a collection of scattered solutions miss the larger opportunity. A unified platform approach, like the one offered by Databricks and Azure, provides the consistency and data management necessary to scale AI applications across an entire organization.
🔮 Unity Catalog: Now with Iceberg
Databricks co-founder Matei Zaharia took the stage, recalling how Unity Catalog launched four years ago to solve fragmented governance across data tables, files, ML models, and even dashboards.
Unity Catalog brings together users, various data assets, and capabilities like security, access control, data discovery, lineage tracking, and cost management. Zaharia highlighted that 97% of customers now use Unity Catalog for unified security, lineage and cost control.
The major announcement: Full Apache Iceberg support in Unity Catalog. Databricks now offers what they call "the best catalog for Apache Iceberg". It means you get governed reads and writes from any engine, with performance that's 5X faster than some competitors' managed Iceberg offerings, thanks to predictive optimization. It even ties into the Delta Sharing ecosystem.
Michelle Leon then demoed how to access Unity Catalog-managed tables from tools like EMR, Snowflake, and DuckDB, showcasing attribute-based access control that masks sensitive data.
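For a sense of what "any engine" access looks like, here's a hedged sketch of reading a Unity Catalog-managed Iceberg table from plain Python via PyIceberg and the Iceberg REST Catalog API. The endpoint path, token, and table names are assumptions to adapt to your workspace:

```python
from pyiceberg.catalog import load_catalog

# Connect an external engine to Unity Catalog's Iceberg REST Catalog API.
# URI shape and auth are illustrative -- check your workspace configuration.
catalog = load_catalog(
    "unity",
    **{
        "type": "rest",
        "uri": "https://<workspace-host>/api/2.1/unity-catalog/iceberg",
        "token": "<personal-access-token>",
        "warehouse": "main",  # the Unity Catalog catalog to mount
    },
)

table = catalog.load_table("sales.orders")  # <schema>.<table>, placeholder names
print(table.scan(limit=5).to_arrow())       # read a few rows as Arrow
```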
🔮 Expansion of Unity Catalog to Business Users
Matei also announced two new business-user features:
Unity Catalog Metrics (Public Preview): A governed semantic layer in Databricks where you define your dimensions and measures once as metric views, so that everyone (data engineers, data scientists, BI users) can consistently use the same standardized definitions across tools and workloads (see the sketch below).
Unity Catalog Discover (Preview): An internal marketplace for curated data assets. It organizes data by business domain, making it easier for users to find and request access to what they need.
Keegan Dubbs then demonstrated how these features help analysts find certified tables and centralized KPIs.
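To sketch what a metric view might look like in practice, here's a hedged example based on the preview's YAML-in-SQL style; the exact DDL syntax, catalog, and column names are assumptions rather than confirmed specifics:

```python
# Assumes a Databricks notebook where `spark` is already defined.
# Define dimensions and measures once; SQL and BI tools reuse them.
spark.sql("""
CREATE VIEW main.finance.revenue_metrics
WITH METRICS
LANGUAGE YAML
AS $$
version: 0.1
source: main.finance.orders
dimensions:
  - name: order_month
    expr: DATE_TRUNC('MONTH', order_date)
measures:
  - name: total_revenue
    expr: SUM(amount)
$$
""")
```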
🔮 Apache Spark 4.0: Faster, Simpler, and Ready for Anything
Michael Armbrust (creator of Spark SQL, Delta Lake, and Delta Live Tables) took the stage to talk about Apache Spark. He highlighted Spark's impressive growth, from just 60 contributors when the conference started over a decade ago to over 4,000 today.
He described Spark 4.0 as the "greatest release yet", highlighting several key improvements:
- SQL enhancements, including new user-defined functions and pipe syntax for chaining complex transformations.
- Variant data type for semi-structured data (Delta/Iceberg compatible; see the sketch below).
- ANSI mode as the default for more robust queries.
- Spark Connect support for Swift/Rust/Go.
- Python data source API for custom reads/writes.
- Reimagined streaming state store ("transform with state" API).
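The Variant type is easiest to see in action. A small sketch (field names invented for illustration) using `parse_json` and `variant_get`, both of which ship with Spark 4.0:

```python
# Assumes a Spark 4.0 session is available as `spark`.
spark.sql("""
    SELECT parse_json('{"device": "sensor-7", "readings": [21.5, 22.1]}') AS payload
""").createOrReplaceTempView("events")

# variant_get extracts typed values from a VARIANT column via a JSON path.
spark.sql("""
    SELECT
        variant_get(payload, '$.device', 'string')      AS device,
        variant_get(payload, '$.readings[0]', 'double') AS first_reading
    FROM events
""").show()
```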
Armbrust then announced Real-time Mode, a new capability being contributed to Apache Spark. He explained that while structured streaming has been great for ETL with latency in seconds to minutes, there was a gap for operational use cases needing sub-second or millisecond latency. Real-time Mode changes how streaming works by running long-running tasks that constantly pull new data, processing it immediately upon arrival.
Perhaps the biggest Spark news was the open sourcing of Delta Live Tables (DLT) as Spark Declarative Pipelines. Armbrust recounted Spark's history, from the complex Resilient Distributed Datasets (RDDs) to the simpler Spark SQL, Structured Streaming, and Delta Lake. Each step simplified Spark, but building production-ready data pipelines still required significant expertise to manage checkpoints, retries, and dependencies. DLT was created to abstract away this complexity.
Now, by contributing Spark Declarative Pipelines to Apache Spark, Databricks is making it possible to build these end-to-end production pipelines with just a few lines of SQL. Armbrust concluded his presentation by live-merging the code into Apache Spark on GitHub.
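For a flavor of how compact these pipelines are, here's a minimal sketch in the DLT-style Python API that Spark Declarative Pipelines evolved from; the module and decorator names in the open-source release may differ from Databricks' `dlt` package, and the source path is a placeholder:

```python
import dlt  # Databricks' Delta Live Tables API; naming may differ upstream
from pyspark.sql import functions as F

@dlt.table(comment="Raw events ingested from cloud storage")
def raw_events():
    # `spark` is provided by the pipeline runtime; path is a placeholder
    return spark.readStream.format("json").load("/landing/events/")

@dlt.table(comment="Cleaned events with basic data quality enforcement")
@dlt.expect_or_drop("valid_id", "event_id IS NOT NULL")  # drop bad rows
def clean_events():
    return dlt.read_stream("raw_events").withColumn(
        "ingested_at", F.current_timestamp()
    )
```

The framework resolves the dependency between the two tables and handles checkpoints, retries, and orchestration automatically, which is exactly the complexity DLT was created to abstract away.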
🔮 Introducing Databricks Lakeflow
Bilal Aslam then joined the stage and introduced Databricks Lakeflow, announcing its general availability (GA).
Lakeflow is designed to tackle the common problem of fragmented data engineering tools, which often lead to complex, unreliable pipelines that are hard to govern.
Aslam outlined the three core capabilities of Databricks Lakeflow:
- Lakeflow Connect: Pulls structured and unstructured data into Databricks with point-and-click simplicity, backed by powerful APIs.
- Lakeflow Declarative Pipelines: The evolution of Delta Live Tables (DLT), now open-sourced as part of Apache Spark.
- Lakeflow Jobs: Serves as the orchestrator for running production workloads.
A surprise announcement was Zerobus, a direct write API for Unity Catalog designed to handle massive event data ingestion.
Another big announcement was a new IDE for Data Engineering that offers AI assistance, data exploration, and debugging tools designed for production readiness.
🔮 Lakeflow Designer—No Code, No Problem
Ali introduced Michael Piatek to talk about Lakeflow Designer, a solution built specifically for teams who "do not want to code at all".
Lakeflow Designer delivers production-quality ETL (Extract, Transform, Load) with "no code required".
The tool aims to bridge the gap between data engineers who build complex pipelines and business analysts who have deep business knowledge but often lack coding skills. Lakeflow Designer facilitates this with:
- Simplified collaboration, where everyone uses Lakeflow whether they prefer code or visual interfaces.
- A built-in path to production, where visual pipelines run as standard Lakeflow pipelines.
- Enhanced AI productivity, because it understands business context.
🔮 Databricks SQL: Now Faster and More Affordable
Ghodsi also provided updates on Databricks SQL (DBSQL), noting its rapid growth, with adoption doubling in the last year to over 12,000 customers. He highlighted DBSQL's impressive five-fold performance increase over the past three years. He then introduced a new "next-generation" version that delivers a 25% performance boost at no additional cost.
Arsalan Tavakoli then joined the stage to elaborate on the innovations Databricks has in store for DBSQL and why customers are choosing it. Here are some highlights he mentioned:
- 100% open formats (Delta/Iceberg) with Unity Catalog governance.
- Multimodal AI functions (LLMs, image queries) inside SQL.
- Lakebase sync for real‑time analytics + transactional apps.
- AI‑driven auto‑tuning (stats, file sizing, clustering).
🔮 Databricks Lakebridge
Tavakoli then dropped the biggest announcement of the session: Lakebridge, a new, free, open, AI-powered data migration tool.
🔮 Strategic AI Partnership with Google Cloud
Ghodsi announced a strategic AI partnership with Google Cloud to bring Gemini models natively to Databricks. The goal? Make "all the models available in Databricks", with Agent Bricks optimizing across them.
In a pre-recorded video, Google Cloud CEO Thomas Kurian explained that enterprises can use Gemini on Databricks within Google Cloud to cleanse and understand data, as well as assist with data science and analytics. He highlighted Gemini's advanced capabilities, such as strong AI reasoning, breaking down complex tasks, self-critique for finding the best answers, and sophisticated tool selection for generating SQL or Spark instructions.
🔮 AI/BI Reimagined
Ken Wong, the product leader for AI/BI, shared that user adoption of AI/BI on Databricks has surged by 500% in the last year. He then provided an update on the latest features of the AI/BI platform:
- Cross-filtering, one-click drill-down capabilities, and the ability to create calculated measures and dimensions without writing SQL.
- An expanded library of visualizations, such as Sankey diagrams, geospatial maps, and statistical plots.
- Multi-page support for complex operational reports, personalized subscriptions, and embedding options for corporate portals like SharePoint or Atlassian.
- Theming options to match corporate branding.
Wong explained that Genie was designed to work without requiring an upfront semantic model. Instead, it uses "knowledge extraction" from platform usage to identify relevant data assets for context. He then announced three notable updates:
- AI Forecasting is built directly into the SQL warehouse, letting users add accurate forecasts to dashboards with a single click.
- AI Top Drivers handles diagnostic questions like "Why did sales drop?" Users select anomalous chart points and ask AI/BI to explain differences.
- Deep Research Mode (Preview) handles complex, open-ended questions. It uses advanced LLM reasoning and Genie's internal knowledge to generate research plans, execute sub-steps in parallel, and provide summarized results with citations.
🔮 Databricks One
Finally, to cap off the presentation, Wong unveiled Databricks One, a completely redesigned user experience tailored specifically for business users, which was then demonstrated by Miranda Luna.
Products and Features Announced on Day 2
Here is a quick summary of everything announced on Day 2.
| Product/Feature | Description |
|---|---|
| 🔮 Azure Databricks Partnership Extension | Multi‑year extension securing Azure Databricks as a first‑party Microsoft service through the 2030s with deeper AD, Synapse, and Purview integration |
| 🔮 Unity Catalog Apache Iceberg Support | Full read/write/governance of managed and external Iceberg tables via Unity Catalog’s Iceberg REST Catalog API |
| 🔮 Unity Catalog Metrics (Public Preview) | Semantic layer for defining, storing, and governing standardized business metrics across SQL Analytics and BI tools |
| 🔮 Unity Catalog Discover (Preview) | Curated internal marketplace of certified data products—tables, dashboards, metrics—with AI recommendations for discovery |
| 🔮 Apache Spark 4.0 | Major release with Python DataFrame plotting APIs, SQL enhancements, and under‑the‑hood execution optimizations |
| 🔮 Real‑time Mode (Open Source to Spark) | New low‑latency Structured Streaming mode (Project Lightspeed) delivering p99 <300 ms for stateless and stateful workloads |
| 🔮 Declarative Pipelines (DLT Open Sourced) | Evolved Delta Live Tables framework—serverless ETL pipelines in SQL/Python with Unity Catalog governance—now open source |
| 🔮 Databricks Lakeflow (GA) | Unified data engineering stack: ingestion (Lakeflow Connect), transformation (Declarative Pipelines), orchestration (Jobs) |
| 🔮 Lakeflow Connect | Managed connectors (CDC, bulk) for apps (Salesforce, Workday), databases (SQL Server, Oracle), and file sources |
| 🔮 Zerobus | High‑throughput API for direct writes into Unity Catalog (100 MB/s) with <5 s latency for event telemetry ingestion |
| 🔮 IDE for Data Engineering | Code/DAG side‑by‑side IDE for Lakeflow Declarative Pipelines with Git, debugging, previews, and AI assistance |
| 🔮 Lakeflow Jobs | Production orchestrator on unified Workflows—supports notebooks, SQL, DLT, dbt, triggers, and conditional logic |
| 🔮 Lakeflow Designer | No‑code, drag‑and‑drop ETL builder with GenAI‑powered suggestions for analysts to deploy production pipelines |
| 🔮 Databricks SQL (Next‑Gen) | Serverless SQL engine with Photon Vectorized Shuffle and Predictive Query Execution—delivers up to 5× dashboard speed gains |
| 🔮 DBSQL AI Capabilities | Native SQL AI functions (`ai_parse_document`, `ai_translate`, embeddings) for LLM tasks within queries and dashboards |
| 🔮 Lakebridge | Open‑source migration toolkit (Analyzer, Converter, Validator) for automated legacy warehouse to Databricks SQL migrations |
| 🔮 Google Cloud Strategic AI Partnership | Native integration of Google’s Gemini models into Databricks for building and scaling AI agents on enterprise data |
| 🔮 AI/BI Genie (GA) | Generally available AI/BI component offering natural‑language Q&A, AI Forecasting, Top Drivers, and Deep Research Mode |
| 🔮 AI/BI Deep Research Mode (Preview) | Advanced Genie mode for exploratory, open‑ended data research with multi‑modal AI assistance |
| 🔮 Databricks One (GA) | Redesigned, no‑code BI interface unifying Dashboards, Genie spaces, and Apps with role‑based governance and natural‑language querying |
Check out this video if you want to watch the full summary of the Databricks Data + AI Summit's Day 2 keynote:
Data + AI Summit Keynote Day 2
Day 3 (June 11) — Hands-On Sessions
After two days packed with major product announcements, Day 3 shifted from theory to practice. The Databricks Data + AI Summit 2025 transformed into a hands-on laboratory, where attendees could finally work with the tools teased during the keynotes.
Hands-On Sessions With New Tech
The deep-dive workshops allowed attendees to break things, fix them, and truly understand what makes the new features tick. Databricks' product teams set up dedicated labs for each major announcement from the previous days.
For those wanting to test-drive the new serverless GPU capabilities, there was a workshop available. If you were curious about how MLflow 3.0 handles specific use cases, you could join a guided session. These hands-on experiences enabled participants to see how the products behave with real data, not just the curated examples from marketing demos.
Learning Through Experimentation
Day 3 provided an opportunity for experimentation. Attendees could push the limits of the new AI Functions, see how they handle edge cases, and determine where they might fit into existing workflows. The workshops covered everything from basic setup to advanced configuration.
Catch all the keynotes and sessions from the Databricks Data + AI Summit 2025—watch on demand here:
Day 4 (June 12) — Wrap Up
The final day of the Databricks Data + AI Summit 2025 took a different approach. Instead of more product announcements or hands-on coding, Day 4 focused on the ecosystem that makes Databricks work in the real world.
Partner Sessions
Partners such as Monte Carlo, Microsoft, Tredence, Actian, Alation, Deloitte, Infosys, and Tealium shared proven approaches to unify governance, improve data quality, and operationalize trusted AI at scale. These were not sales pitches disguised as technical talks; they were real stories from companies that have deployed Databricks in complex enterprise environments.
Customer Implementation Stories
Customer talks featured organizations using the Databricks Data Intelligence Platform for enterprise AI and governance modernization. Data leaders from 7-Eleven, Acxiom, Corning, FedEx, Nationwide, American Airlines, T-Mobile, PepsiCo, Atlassian, Amgen, Capital One, and more shared their Unity Catalog implementations.
These talks covered catalog migrations, fine-grained access control, cost optimization, and AI governance. The focus was on scaling secure data access, streamlining compliance, and enabling trusted collaboration across business units.
Network Building Sessions
Day 4 also served as the summit's networking finale. With no major announcements to chase, attendees could focus on building relationships. The partner sessions became natural conversation starters.
Day 4 sessions bridged the gap between conference demos and actual deployment. Partners provided concrete implementation paths, while customer stories showed what works in production environments.
The summit concluded with networking and planning sessions. Attendees left with specific next steps rather than just product knowledge.
Conclusion
And that’s a wrap! Databricks Data + AI Summit 2025 delivered a ton of energy, big ambitions, and a clear message: Databricks is creating the future of enterprise data and AI, not just building tools. They're making it easy for business users to access data with Databricks One and AI/BI Genie, and going all in on AI innovation with Agent Bricks and MLflow 3.0, aiming to make AI approachable for everyone and bring data workflows together. The launch of Lakebase is huge: it merges transactional and analytical systems, making data architecture a whole lot simpler. To boost the ecosystem, Databricks is forging partnerships and offering a free edition, attracting new talent along the way. Databricks is on a mission to become the go-to platform where data and AI meet.
Additional Resources
- Databricks Announces 2025 Data + AI Summit Keynote Lineup
- Mosaic AI Announcements at Data + AI Summit 2025
- Your 2025 Data + AI Summit Guide for the Tech Industry Experience
- What’s new in security and compliance at Data + AI Summit 2025
- What’s new with Databricks Unity Catalog at Data + AI Summit 2025
- Introducing Databricks One
- Introducing Lakebridge: Free, Open Data Migration to Databricks SQL
- AI/BI Genie is now Generally Available
- What Is a Lakebase?
- Databricks SQL Accelerates Customer Workloads by 5x in Just Three Years
- Announcing the General Availability of Databricks Lakeflow
- Announcing Lakeflow Designer: No-Code ETL, Powered by the Data Intelligence Platform
- Bringing Declarative Pipelines to the Apache Spark™ Open Source Project
- Announcing full Apache Iceberg™ support in Databricks
Databricks Data + AI Summit 2025 Playlist
Product Launches from Data + AI Summit 2025 Day 1
Product Launches from Data + AI Summit 2025 Day 2