Open Semantic Interchange (OSI)
An excited enterprise client came back from a conference where Snowflake delighted them with AI demos and semantic views built on the Open Semantic Interchange (OSI) standard. Snowflake even went further to show how their Cortex Analyst tool returns deterministic AI answers. Naturally, given their existing investments in a Snowflake data lake and ODS, the client questioned why we don’t build everything in Snowflake instead of bringing Microsoft Fabric and two vendors into the mix.
What’s OSI?
Reading about the relatively freshly baked OSI, we learn that “the Open Semantic Interchange is an industry-wide specification effort to standardize how we exchange semantic metadata across analytics, AI and BI platforms, providing a vendor neutral, single source of truth for semantic data.” Great, I am all about standardization. If you ask me, the world should adopt the metric system and English as a universal language, and life will be much simpler. But this is about BI so let’s peek under the hood and keep ‘em honest.
Now, like ogres and cakes, a BI architecture has layers. Besides data sources, at minimum I like to see a central repository (let’s call it a data warehouse) with a star schema (if the star is missing, you don’t have a DW but an operational data source, sorry), a semantic layer (don’t skip it!), and of course reports with possibly AI – the cherry on top of the cake. “Modern” medallionists will of course dream of a bigger cake with bronze, silver, and gold layers, and then wonder what to put in them, but I digress.
OSI is an initiative from major Microsoft competitors in the BI space (Snowflake, dbt, Google, Databricks, Salesforce) to standardize the semantic model definition so that vendors with good reporting but bad semantic models, like Tableau and Salesforce, can integrate with vendors who have good backends but bad reporting, like Snowflake and Google. Did I get this right? I believe the main goal here is to compete more effectively against Microsoft, which currently dominates the data analytics space. All that wrapped in an “avoid vendor lock-in and get a single version of truth” story.
About Snowflake semantic views
A Snowflake semantic view is an OSI-based metadata definition described in YAML inside their database. Created similarly to a SQL view, it enumerates the star schema dimension and fact tables, their relationships, and basic metrics with SQL formulas. Inside Snowflake, the semantic views are currently used by their Cortex Analyst tool (analogous to Copilot in Microsoft Fabric) to let users and apps talk to data with natural-language questions. Behind the scenes, the question is translated to SQL, which is also how Microsoft Fabric Data Agent works when connected to a lakehouse or warehouse.
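To make the idea concrete, here is a toy sketch in Python of what such a definition boils down to: tables, relationships, and metrics expressed as SQL formulas, from which a query can be generated. This is not the OSI specification and not Snowflake’s semantic view syntax; the class names, the `to_sql` helper, and the `fact_sales`/`dim_date` tables are all hypothetical, made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Relationship:
    # Join between a fact table key and a dimension table key (hypothetical shape).
    fact_table: str
    fact_key: str
    dim_table: str
    dim_key: str


@dataclass
class SemanticView:
    # Illustrative only: a bag of tables, relationships, and SQL-based metrics.
    name: str
    fact_table: str
    relationships: list[Relationship] = field(default_factory=list)
    dimensions: dict[str, str] = field(default_factory=dict)  # name -> "table.column"
    metrics: dict[str, str] = field(default_factory=dict)  # name -> SQL aggregate

    def to_sql(self, metric: str, by: str) -> str:
        """Generate a grouped query for one metric sliced by one dimension."""
        dim_expr = self.dimensions[by]
        dim_table = dim_expr.split(".")[0]
        # Find the relationship that connects the fact table to this dimension.
        rel = next(r for r in self.relationships if r.dim_table == dim_table)
        return (
            f"SELECT {dim_expr} AS {by}, {self.metrics[metric]} AS {metric}\n"
            f"FROM {self.fact_table}\n"
            f"JOIN {rel.dim_table} "
            f"ON {rel.fact_table}.{rel.fact_key} = {rel.dim_table}.{rel.dim_key}\n"
            f"GROUP BY {dim_expr}"
        )


# Hypothetical star schema: one fact table and one date dimension.
sales_view = SemanticView(
    name="sales",
    fact_table="fact_sales",
    relationships=[Relationship("fact_sales", "date_key", "dim_date", "date_key")],
    dimensions={"calendar_year": "dim_date.calendar_year"},
    metrics={"total_sales": "SUM(fact_sales.sales_amount)"},
)

sql = sales_view.to_sql("total_sales", by="calendar_year")
print(sql)
```

In a tool like Cortex Analyst, the part that picks the metric and the dimension would presumably be the LLM interpreting the question; the metadata then constrains what SQL can be produced. The sketch only covers the second half.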
For the most part, tables, relationships, and metrics are all OSI has defined at this point. And of course, an ontology to glue semantic views together so AI knows how to reason across them. I’m glad Snowflake calls them “views” and not semantic models, which would be a big misnomer. By contrast, Microsoft has a 30+ year head start on semantic modeling, so the two technologies (semantic view vs semantic model) can’t be meaningfully compared by any criteria (features, tooling, etc.).
Shall we standardize?
At this point, Microsoft doesn’t participate in OSI. Although to the best of my knowledge Microsoft hasn’t released official reasons, more than likely it’s because they don’t need to. There is a large distance between Microsoft and the rest of the pack. Further, they spent 20+ years on their engine and DAX tooling. I don’t think it’s even possible to retrofit many of those features into a new, basic SQL-based standard. For example, the OSI metric language is SQL while DAX is an Excel-like language because the thinking back then was to transition Excel users into self-service BI. I remember having discussions with the Analysis Services team about why not use SQL, but alas, Excel prevailed… I wonder if they’ve made a mistake there.
Now, if we are serious about open standards and interoperability, then I would argue that we should start with data formats. Wouldn’t it be nice if Google and Snowflake rewrote their databases to use open formats, such as Delta or Iceberg, before getting to the semantic layer? That would immediately facilitate data integration and virtualization, such as by letting a Fabric user create shortcuts in a lakehouse to Google and Snowflake tables instead of replicating the data, as I mentioned in my “Give me your data” blog. So, if we are serious about making integration easier, let’s start from the bottom up as Microsoft and Databricks did, shall we?
Meanwhile, if you have invested in another database vendor, my advice would be to use the best of both worlds. If you like Snowflake, use their database for lake/warehouse and Power BI/Fabric for its semantic models and reporting capabilities. The best data source for AI is a rich semantic layer (sorry, Snowflake OSI semantic views).
And as for the Cortex AI deterministic answers, that’s pure marketing propaganda; any LLM can vary its answers and is not guaranteed to return the same result for the same question.
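One reason answers vary is that LLMs typically pick the next token by sampling from a probability distribution. Here is a toy Python sketch of that mechanism; it makes no claim about how Cortex Analyst is configured, and the three-token vocabulary and the logits are made up for illustration.

```python
import math
import random


def softmax(logits, temperature):
    # Turn raw scores into probabilities; lower temperature sharpens the peak.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_token(tokens, logits, temperature, rng):
    # Temperature 0 is treated as greedy decoding: always take the top score.
    if temperature == 0:
        return tokens[logits.index(max(logits))]
    probs = softmax(logits, temperature)
    return rng.choices(tokens, weights=probs, k=1)[0]


# Hypothetical next-token candidates while the model writes a SQL aggregate.
tokens = ["SUM", "AVG", "COUNT"]
logits = [2.0, 1.5, 0.5]
rng = random.Random()  # unseeded on purpose, to mimic run-to-run variation

greedy = {sample_token(tokens, logits, 0, rng) for _ in range(200)}
sampled = {sample_token(tokens, logits, 1.0, rng) for _ in range(200)}

print("greedy picks: ", greedy)
print("sampled picks:", sampled)
```

In this toy, greedy decoding always returns the same token while sampling does not. Real systems are messier: even with the temperature set to zero, differences in hardware, batching, and floating-point arithmetic can change the output, so a semantic layer narrows the room for error but doesn’t make the LLM deterministic.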














