
“Tableau Prep doesn’t support LOD syntax (at least in version 1).”Īh, but we just matched the logic of the calculation with our flow in Tableau Prep. Filter to keep only true values and you’ve deduped your data set. When the Hired Date matched the LOD result, you have the most recent record for that employee. Imagine you were writing a calculation to find the latest record in Tableau Desktop. Guess what? The solution to Case #3 is an LOD calculation in Tableau Prep! If you were wondering if you could write a FIXED LOD – you can! It’s just visual! And you can even see the red color of the earlier date for Employee 3 that we wanted to exclude. But then, we’ve joined it to the previous step in the flow! Yes, you can join to previous steps in the flow – simply by dragging and dropping!Īnd the join, as you can see above, is done on the Employee ID as well as the Date Hired – which is all dates in the original data set matched to the latest (MAX) dates from the aggregation. The Max Date step is simply a renaming of the Dedup step in case #2.
TABLEAU PREP 2020.1 FULL
But we need to take that information and use it to match to the full data set to identify the entire records with the latest date. We’ll group by Employee ID and Name and find the Maximum hire date. Instead, we’ll extend our solution just as the data set has been extended. And we can’t AGGREGATE by Position either because we’ll end up with an arbitrary value (should it be MIN or MAX – one of those might work in this specific case, but other combinations of Positions will have different results whatever we chose). We can’t use the previous approach to GROUP by Position because we’d get two records. So, we need to be able to keep the record with the latest hire date but we also need to keep the other values for that record intact. Now we know why Walvoord has two hire dates – it represents two different positions. Let’s extend the previous input data set so that not only is there a most recent hire date, but we can’t simply group and aggregate because there is additional data in the record that we need to retain.
TABLEAU PREP 2020.1 HOW TO
I’ve already covered how to get the latest snapshot of records elsewhere, but it fits nicely here because it is a case of duplicate data that needs to be deduped. We remove the unwanted near-duplicate record and end up with this output: That means for every unique set of employees, we’ll get the max hire date. It’s very similar to the previous solution, but here, we’re grouping only by Employee ID and Name while the Date Hired field gets aggregated as a MAX. You’ve probably already jumped to a possible solution for removing this duplicate data. In this case, Walvoord was hired once in 1997 and then subsequently re-hired in 2014 (this data set doesn’t indicate the reason – did he have an intermediate job or was it hire to a new position?) Whatever the cause for this data, let’s say we only care to know the most recent hire date. Let’s modify the data set slightly and consider how we might eliminate the duplicate records in Tableau Prep: (and thank you to Tom Fuller for pointing out this approach!)Ģ: Similar, but not exactly, Duplicate Records You can continue the Tableau Prep flow with a nicely deduped data set. Only unique rows are retained while duplicates will vanish before your eyes. No matter how many duplicate records you had, you could do the following in Tableau Prep:Īll you have to do is add an aggregate and add all fields to the Grouped Fields section.

Everything is the same: the ID, the Name, the Date. You can see the Employee ID 3 has 2 exactly identical duplicate records. Let’s say you’ve got a data set that looks like this: Let’s talk about three possible cases of removing duplicate records in Tableau Prep (each gets a bit harder): 1: Exact Duplicate Records in Tableau Prep (And if you make it to the end, you’ll get a a bonus: An LOD calculation in Tableau Prep!) And not only that, but it’s very EASY to remove duplicates. BUT it is definitely possible to remove duplicate records in Tableau Prep. At first glance, the first version of Tableau Prep doesn’t seem to have this feature. One of the questions that comes up quite often is “How do I remove duplicate records in Tableau Prep?” or “How do I dedup data in Tableau Prep?” Some data preparation tools have a specific feature to do this.
