This paper investigates different techniques that produce the same result with different performance. It shows that you need to do your homework in order to find the right optimization for your model, and it explains how to read query plans and how to choose the best solution.
IMPORTANT: Please note that the goal of this paper is to understand the different query plans and the operations performed by the query engine.
This is not an ultimate guide to distinct count optimization.
The performance of distinct count calculations is affected by many other factors, such as the number of distinct values in the column and in the result set. Your mileage may vary a lot, so test everything in your specific data model. At the end of the paper, we provide a comparison between the same set of queries on two different databases in order to give you an idea of how important testing is.