Comments on: Finding Duplicates in DAX

By: Prologika

Prologika — Wed, 15 Mar 2017 01:07:00 +0000

In reply to Marty Piecyk. The following modified expression will flag the first occurrence of the duplicated row as 1, assuming the table has a unique identifier and the data is ordered by it, such as the DimProduct table in AdventureWorksDW.

Index = CALCULATE(COUNTROWS(Product), ALLEXCEPT(Product, Product[ProductAlternateKey]), EARLIER(Product[ProductKey]) >= Product[ProductKey])
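
For readers who find the CALCULATE version hard to follow, here is a minimal sketch of what appears to be the same running count written as a plain row-by-row filter, followed by a calculated table that keeps only the first occurrence. It assumes the same Product table and columns as above; the table name FirstOccurrences is hypothetical and not part of the original reply.

```dax
-- Calculated column on Product: counts the rows that share this row's
-- ProductAlternateKey and have a ProductKey less than or equal to this row's.
-- The first occurrence gets 1, the next one 2, and so on.
Index =
COUNTROWS (
    FILTER (
        Product,
        Product[ProductAlternateKey] = EARLIER ( Product[ProductAlternateKey] )
            && Product[ProductKey] <= EARLIER ( Product[ProductKey] )
    )
)

-- Calculated table (hypothetical name): keeps one row per ProductAlternateKey,
-- namely the one with the lowest ProductKey.
FirstOccurrences =
FILTER ( Product, Product[Index] = 1 )
```

Filtering on Index = 1 keeps the first instance of each key, whereas filtering on values of 2 and greater isolates only the later repeats.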

By: Marty Piecyk

Marty Piecyk — Tue, 14 Mar 2017 23:18:00 +0000

In reply to Marty Piecyk.

Here’s one solution. I’m sharing this, since this posting shows up in the top results of search engines.

For example, given this table “Host”:
ID HostName IpAddress CreatedDate
1 duplicatehost 192.168.1.2 2017-03-13
2 duplicatehost 192.168.1.2 2017-01-01
3 singlehost 192.168.1.3 2017-03-13

This DAX expression generates the calculated table shown below it, using “HostName” and “IpAddress” as the key for finding duplicates and keeping the row with the latest CreatedDate:

=FILTER(Host, [CreatedDate]=MAXX(FILTER(Host, [HostName]=EARLIER([HostName]) && [IpAddress]=EARLIER([IpAddress])), [CreatedDate]))

ID HostName IpAddress CreatedDate
1 duplicatehost 192.168.1.2 2017-03-13
3 singlehost 192.168.1.3 2017-03-13

There must not be any duplicate CreatedDate values within a key group, or duplicates will still be present in the output. In my use case, such duplicates are not possible.
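
If ties on CreatedDate are possible, one way to still get a single row per key is to break the tie with a second column. The following is only a sketch under the assumption that ID is unique in the Host table; the table name LatestHost and the variable names are hypothetical, and it uses variables instead of EARLIER, which requires a version of the engine that supports VAR.

```dax
-- Calculated table (hypothetical name): one row per HostName/IpAddress pair.
-- The latest CreatedDate wins; among equal dates, the highest ID wins.
LatestHost =
FILTER (
    Host,
    VAR CurrentHost = Host[HostName]
    VAR CurrentIp = Host[IpAddress]
    -- All rows that share the current row's key
    VAR SameKey =
        FILTER (
            Host,
            Host[HostName] = CurrentHost
                && Host[IpAddress] = CurrentIp
        )
    -- The single winning row for that key (single because ID is assumed unique)
    VAR Winner =
        TOPN ( 1, SameKey, Host[CreatedDate], DESC, Host[ID], DESC )
    RETURN
        Host[ID] = MAXX ( Winner, Host[ID] )
)
```

On the sample Host table above, this should keep the same two rows (ID 1 and ID 3) as the original expression, since that data has no ties.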

By: Marty Piecyk

Marty Piecyk — Tue, 14 Mar 2017 18:10:00 +0000

In reply to Prologika. Thanks for the reply. I'm using Analysis Services and connecting to my server live through Power BI Desktop. So, I cannot edit queries or use the Remove Duplicates feature in Power BI Desktop. Also, I need to create relationships between my tables, but this can only be done after the duplicates are removed, so I would rather do this in my Analysis Services model.

By: Prologika

Prologika — Tue, 14 Mar 2017 14:53:00 +0000

In reply to Marty Piecyk. Hi Marty. Which tool are you using: Power BI Desktop or Excel? As you mentioned, this blog post is two years old. Have you tried the Remove Duplicates feature in the Query Editor, which is available in both Power BI Desktop and Excel?

By: Marty Piecyk

Marty Piecyk — Tue, 14 Mar 2017 00:22:00 +0000

“Once the column is created, you can filter on it in the Data View to find out the duplicate rows with values 2, 3, etc.”
Thanks for this helpful post (although it’s two years old at the moment). By filtering out duplicate rows with values 2 and greater, you’ll remove every instance of a value that has a duplicate. Do you have any advice on how to keep the first instance of a duplicate?

In your example above, the row with ID 2 would be present in your new calculated table, but customers will wonder where the row with ID 1 is.