The problem
So, say one imagines a 2-d array of integer values which represents a gridded-map, like this:
+-----+------+-----+-----+-----+
| 10 | 2 | 2 | 4 | 656 |
+-----+------+-----+-----+-----+
| 234 | 165 | 724 | 759 | 230 |
+-----+------+-----+-----+-----+
| 843 | 734 | 999 | 143 | 213 |
+-----+------+-----+-----+-----+
| 242 | 2135 | 131 | 24 | 374 |
+-----+------+-----+-----+-----+
| 159 | 464 | 155 | 124 | 151 |
+-----+------+-----+-----+-----+
The 2d indices represent the coordinates of a cell on the map, and the values in the array represent the relative difficulty to traverse the terrain of that cell - so for example 999 might be thick brambles, while 2,3,4 might be a slightly inclining path... or something.
Now we want to find the easiest path from [x,y] on the grid to [q,r] on the grid (where the sum of the steps is the lowest possible, in other words)
The problem domain
This needs to run in a modern browser, where a rather spartan map is rendered, and we'll draw a line from [x,y] to [q,r] through all the interceding vertices, after the user has input [q,r]. Conveniently, [X,Y] is always the same (say [0,0] for simplicity)
So use Dijkstra's algorithm or A*!
So my first instinct was to model the array as a graph, apply Dijkstra's algorithm and work from there. And in the above case, with a 5x5 grid, that works fine. I traverse each array index, and use the value, and adjacent values, to generate a node with weighted edges to all of it's neighbours. This builds up a graph which I can then apply Dijkstra's algorithm to.
However, In practice, I will be working with arrays up to 50,000 x 50,000 in size! That's 250 million!
So obviously building a graph on-the-fly to run Dijkstra’s algorithm isn't applicable. My next idea was to pre-pute the paths (The data-set is fixed), store them on the server and do a callback when we get the destination [q,r]...but this is 250,000,000 paths... even if I made it run in less than a second (which i don't think it will) it'll take years to pute all the paths...
I think I might need to take another approach but I'm not sure, how can I make this work?
The problem
So, say one imagines a 2-d array of integer values which represents a gridded-map, like this:
+-----+------+-----+-----+-----+
| 10 | 2 | 2 | 4 | 656 |
+-----+------+-----+-----+-----+
| 234 | 165 | 724 | 759 | 230 |
+-----+------+-----+-----+-----+
| 843 | 734 | 999 | 143 | 213 |
+-----+------+-----+-----+-----+
| 242 | 2135 | 131 | 24 | 374 |
+-----+------+-----+-----+-----+
| 159 | 464 | 155 | 124 | 151 |
+-----+------+-----+-----+-----+
The 2d indices represent the coordinates of a cell on the map, and the values in the array represent the relative difficulty to traverse the terrain of that cell - so for example 999 might be thick brambles, while 2,3,4 might be a slightly inclining path... or something.
Now we want to find the easiest path from [x,y] on the grid to [q,r] on the grid (where the sum of the steps is the lowest possible, in other words)
The problem domain
This needs to run in a modern browser, where a rather spartan map is rendered, and we'll draw a line from [x,y] to [q,r] through all the interceding vertices, after the user has input [q,r]. Conveniently, [X,Y] is always the same (say [0,0] for simplicity)
So use Dijkstra's algorithm or A*!
So my first instinct was to model the array as a graph, apply Dijkstra's algorithm and work from there. And in the above case, with a 5x5 grid, that works fine. I traverse each array index, and use the value, and adjacent values, to generate a node with weighted edges to all of it's neighbours. This builds up a graph which I can then apply Dijkstra's algorithm to.
However, In practice, I will be working with arrays up to 50,000 x 50,000 in size! That's 250 million!
So obviously building a graph on-the-fly to run Dijkstra’s algorithm isn't applicable. My next idea was to pre-pute the paths (The data-set is fixed), store them on the server and do a callback when we get the destination [q,r]...but this is 250,000,000 paths... even if I made it run in less than a second (which i don't think it will) it'll take years to pute all the paths...
I think I might need to take another approach but I'm not sure, how can I make this work?
Share Improve this question edited Sep 27, 2018 at 10:29 m69 ''snarky and unweling'' 12.3k3 gold badges35 silver badges59 bronze badges asked Sep 26, 2015 at 14:42 PaulPaul 3,58210 gold badges40 silver badges63 bronze badges 11- 1 How did you model the array as a graph? You probably can take advantage of the grid structure... – Stefan Haustein Commented Sep 26, 2015 at 14:59
- check this article: Fast and Accurate Estimation of Shortest Paths in Large Graphs – user784540 Commented Sep 26, 2015 at 17:54
- 1 it's 2 500 000 000, not 250 000 000 – Sopel Commented Sep 26, 2015 at 21:02
- @sopel you're right, sorry! The scope has changed a few times as I've been working on this. – Paul Commented Sep 26, 2015 at 23:02
- @StefanHaustein I generate the graph rather naively. I just traverse the array and create a graph for each index, then generate the weights of the edges based on the difficulty of adjacent indicies...make sense? – Paul Commented Sep 26, 2015 at 23:36
1 Answer
Reset to default 8Don't construct an explicit graph (pointers are expensive) -- use pairs of coordinates to represent nodes in the queue and modify your Dijkstra implementation to operate on your 2d array representation directly.
Use an array similar to the costs array to store the (initially tentative) distances calculated by the algorithm.
Dijkstra will calculate the costs to all nodes in a single run, so if your starting point is fixed, running it once should be sufficient -- there is no need to run it millions of times.
P.S.: Created a Jsfiddle running Dijkstra on images: https://goo.gl/5GWwMF
Computes the distances to all points from a mouse click, where darker pixels are interpreted as more expensive...
It bees slower with larger images but didn't manage to crash it so far, but I think for your data it will run out of memory in the browser.
The Dijkstra implementation uses the following interface to access the graph -- I think this should be straight forward to provide on top of your data structure without explicitly generating a "traditional" graph data structure with explicit nodes and edges in memory:
/**
* The interface the Dijkstra implementation below uses
* to access the graph and to store the calculated final
* and intermediate distance data.
*
* @Interface
*/
Graph = function() {};
/**
* Returns the current distance for the given node.
* @param {Object} node
* @return {number}
*/
Graph.currentDistance = function(node) {};
/**
* Stores the current distance for the given node.
* @param {Object} node
* @param {number} distance
*/
Graph.setCurrentDistance = function(node, distance) {};
/**
* Returns an array of connected nodes for the given node,
* including the distances.
*
* @param {Object}
* @return {Array<{cost:number, node:Object}>}
*/
Graph.connections = function(node) {};
P.P.S.: Added code to display the shortes path on all clicks after the first click. Also fixed a bug permitting diagonal movement: https://goo.gl/wXGwiv
So in conclusion, while this probably doesn't scale to 50k x 50x in the browser, I think this shows that Dijkstra operating on the arrays directly is worth trying on the server side, and that an array identical in size to the data array is all that is needed to store all shortest paths from a single point.