Some of the variation between Bike Radar's data from Wheel Energy, Al Morrison's roller-based data at bikeblather.blogspot.com, and Jarno's at www.bicyclerollingresistance.com may simply come down to the test protocol. Each of them is good at keeping test conditions consistent within their own setup, but the setups differ from each other, most significantly in the drum diameter of the fixture.
Al is using training rollers. I don't recall the exact diameter of his, but it is probably somewhere around 10cm. Jarno is using a 77cm drum, so the difference there is huge, and that is even more true with Wheel Energy, as they use around a 122cm drum if I recall correctly. A smaller roller pushes further into the tire and deflects the casing more than a larger one does. There isn't any reason to believe that casing hysteresis losses increase or decrease in a perfectly linear fashion with deflection, or by the same proportion for every tire, so a smaller roller may produce different results, and different rankings, than a larger one.
Also, Al's rollers are smooth, whereas Jarno's drum has a diamond plate texture. Wheel Energy can test on drums with different surfaces, and as you saw in the Bike Radar article, they got different rankings between their smooth and diamond plate drums, even within the same test facility.
I haven't taken the time to do so, but it would be interesting to see if the Bike Radar rough drum data tracks Jarno's more closely than their smooth drum data does, as that would suggest that the surface texture of the drum was a deciding factor. You could do the same with Al Morrison's smooth roller data to get an idea of whether the drum diameter altered the relative rankings substantially. An easy model to look at would be the GP4000S2, both because it is common and because it jumped around quite a bit in the rankings at Wheel Energy depending on the drum surface.
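If anyone wants to try that comparison, one way to put a number on "tracks more closely" is a Spearman rank correlation between two rankings of the same tires. This is only a rough sketch of the idea, not something any of these testers does as far as I know. The function name, the tire labels, and the rank positions below are all made-up placeholders, not real results; you would have to fill in the actual rankings for the tires the sources have in common. It also assumes no tied ranks.

```python
def spearman_rho(rank_a, rank_b):
    """Spearman rank correlation between two rankings of the same tires.

    rank_a and rank_b map tire name -> rank position (1 = lowest rolling
    resistance). Uses the simple no-ties formula, so ranks must be unique.
    """
    tires = sorted(rank_a)
    if sorted(rank_b) != tires:
        raise ValueError("both rankings must cover exactly the same tires")
    n = len(tires)
    if n < 2:
        raise ValueError("need at least two tires to compare rankings")
    # Sum of squared rank differences, tire by tire.
    d_squared = sum((rank_a[t] - rank_b[t]) ** 2 for t in tires)
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


# PLACEHOLDER rankings for illustration only; these are NOT real test data.
rough_drum_lab = {"tire_1": 1, "tire_2": 2, "tire_3": 3, "tire_4": 4}
smooth_drum_lab = {"tire_1": 3, "tire_2": 1, "tire_3": 2, "tire_4": 4}
other_rough_drum = {"tire_1": 1, "tire_2": 3, "tire_3": 2, "tire_4": 4}

print("rough vs other rough drum: ",
      round(spearman_rho(rough_drum_lab, other_rough_drum), 2))
print("smooth vs other rough drum:",
      round(spearman_rho(smooth_drum_lab, other_rough_drum), 2))
```

A value near 1 means the two sources rank the tires in nearly the same order, and a value near 0 means the orders are essentially unrelated. If the rough drum ranking correlates noticeably better with Jarno's than the smooth drum ranking does, that would point toward surface texture; with only a handful of shared tires, though, treat any such number as a hint rather than proof.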
In addition, there will be some tire-to-tire variation, as every weight weenie who has weighed a batch of "identical" tires will know. Generally speaking, the less rubber in the tire, the faster it will roll, all else being equal. Tires can vary in weight by +/- 10% or so, and a lot of that comes from the rubber thickness. So if Bike Radar happened to get one of the thinnest, lightest examples of tire A and one of the thickest, heaviest examples of tire B, and Jarno got the opposite mix, that could explain some of the difference in rankings.