
And here is a zoomed-in look at the densest area, the Greater Vancouver Area - approximately one third of all the road segments in the province are within this area which is on the order of 1/100th of the area of the province.

I have yet to do empirical tests to determine if the packed index works any better, but from a purely aesthetic point of view I'm happy with my packing algorithm. It works in O( n*log(n) ) time, using an algorithm similar to quick-sort. I am confident there will be a measurable improvement in performance in terms of the number of page-accesses required for the packed index vs. the regularly constructed index, however the real question is, once caching and what-not is factored in, can you see a significant difference in query time in the database.
Lots of other people have thought about packing R-Tree indexes before, STR-Pack and Hilbert-Sort do perform similar operations in similar time, but I feel strongly that Median-Split does a better job than they do, particularly for region queries, which are common for map rendering (as opposed to point queries). I don't yet have any proof of this but it just "feels good". To my knowledge no-one else has published an algorithm that works the way Median-Split does - I may look into publishing it.
Shoehorning an index packing mechanism into PostgreSQL is a daunting task, so before I go down that road I'm going to do some simulations to determine if I'm right about the performance in terms of page-accesses.
1 comment:
Have you had a look at the R*-Tree?
I thought that everybody was using the R*-Tree, but apparently I was wrong, and people have fallen for the "oh, there is a linear time heuristic" trap.
Give it a try.
Post a Comment