# Single-Source Shortest Paths

## Problem Description

This benchmark computes the shortest path from a source node to all nodes in a directed graph with non-negative edge weights by using a modified Bellman-Ford algorithm [1,2].
[1] http://en.wikipedia.org/wiki/Bellman-Ford_algorithm

[2] http://www.walden-family.com/public/bf-history.pdf

## Algorithm

The computation is implemented in a topology-driven manner; all the graph edges are tried for relaxation in every iteration and another iteration is executed if at least one edge was relaxed in the previous iteration. This continues until a fixed-point.for all nodes src { dsrc = dist[src]; for each edge <src, dst, wt> { ddst = dist[dst]; if ddst > dsrc + wt { dist[dst] = dsrc + wt; anotheriteration = true; } } }The body of the outer for loop is implemented as the kernel. The kernel is repeatedly called from the host until a fixed-point is reached.

## Performance

The benchmark is written in CUDA.Compile as:

`make`

Execute as:

`sssp-wlc <graphfile>`

e.g.,

`sssp-wlc USA-road-d.NY.gr`

Graph | Time (ms) |
---|---|

rmat22 | 649.61 |

USA | 126655.91 |

random23 | 265.95 |

The above performance is observed on a 1.45 GHz Quadro 6000 with 6 GB of main memory and 448 cores distributed over 14 SMs. It has 64 kB of fast memory per SM that is split between the L1 data cache and the shared memory.