Today I ran into a nice problem with SQL Server fancy way of sorting GUIDS.
I know, there should be no reason at all to sort GUID columns but, if what you need is run two tables side by side to check whether they are aligned or not (this is the task of TableDifference) and a GUID is the best key you have, then you will end up sorting GUIDS. The question now is: how do you sort GUIDS? The fact is that SQL Server and .NET give different answers. Let’s have a look at it in deeper detail.
I was using .NET guids to read data from SQL Server and then use .Compare to check for the sorting, supposing that a uniqueidentifier in SQL is sorted the same way a guid is sorted in .NET.
This is not true, I discovered here that you should use SqlGuid instead. Let’s not spend too many words on why having two sort algorithm may be clever or not, what is frightening about that article is the fact that it says that SqlGuid uses only the last six bytes to sort data. If it’s working this way then we will end up with a sorting algorithm that does not really sorts. Seemed to be too stupid, so I double checked it and ran into this that says something more interesting about how GUIDS are sorted. Anyway, the algorithm is not still clear. Will I end up with a completely sorted list if I issue an ORDER BY on a uniqueidentifier column? and, how will this data be sorted?
After all, the best way is always that of trying, so I end up with this very simple and funny query (that you can use to show friends that bits are still here, in 2007!):
With UIDs As (— 0 1 2 3 4 5 6 7 8 9 A B C D E F Select ID = 1, UID = cast (‘00000000-0000-0000-0000-010000000000’ as uniqueidentifier) Union Select ID = 2, UID = cast (‘00000000-0000-0000-0000-000100000000’ as uniqueidentifier) Union Select ID = 3, UID = cast (‘00000000-0000-0000-0000-000001000000’ as uniqueidentifier) Union Select ID = 4, UID = cast (‘00000000-0000-0000-0000-000000010000’ as uniqueidentifier) Union Select ID = 5, UID = cast (‘00000000-0000-0000-0000-000000000100’ as uniqueidentifier) Union Select ID = 6, UID = cast (‘00000000-0000-0000-0000-000000000001’ as uniqueidentifier) Union Select ID = 7, UID = cast (‘00000000-0000-0000-0100-000000000000’ as uniqueidentifier) Union Select ID = 8, UID = cast (‘00000000-0000-0000-0010-000000000000’ as uniqueidentifier) Union Select ID = 9, UID = cast (‘00000000-0000-0001-0000-000000000000’ as uniqueidentifier) Union Select ID = 10, UID = cast (‘00000000-0000-0100-0000-000000000000’ as uniqueidentifier) Union Select ID = 11, UID = cast (‘00000000-0001-0000-0000-000000000000’ as uniqueidentifier) Union Select ID = 12, UID = cast (‘00000000-0100-0000-0000-000000000000’ as uniqueidentifier) Union Select ID = 13, UID = cast (‘00000001-0000-0000-0000-000000000000’ as uniqueidentifier) Union Select ID = 14, UID = cast (‘00000100-0000-0000-0000-000000000000’ as uniqueidentifier) Union Select ID = 15, UID = cast (‘00010000-0000-0000-0000-000000000000’ as uniqueidentifier) Union Select ID = 16, UID = cast (‘01000000-0000-0000-0000-000000000000’ as uniqueidentifier) ) Select * From UIDs Order By UID, ID
The result is simple, the ID’s are completely reversed even if they appear as second argument in the SORT request. This means that the evaluation is carried on in byte groups this way:
- 0..3 are evaluated in left to right order and are the less important, then
- 4..5 are evaluated in left to right order, then
- 6..7 are evaluated in left to right order, then
- 8..9 are evaluated in right to left order, then
- A..F are evaluated in right to left order and are the most important
Well, I know it is not very important, but it is funny and maybe useful in some future. :) The really important thing is that ORDER BY GUID works, all bytes are evaluated and sorted, even if in a strange way.