# ON THE ACCELERATION OF SHORTEST PATH CALCULATIONS IN TRANSPORTATION NETWORKS

## Abstract

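Shortest path algorithms are a key element of many graph problems. They are used in applications such as online direction finding and navigation, as well as traffic modeling for large-scale simulations of major metropolitan areas. Because shortest path algorithms are an execution bottleneck, it is beneficial to move their execution to parallel hardware such as Field-Programmable Gate Arrays (FPGAs). The hardware implementation uses a small A* core replicated on the order of 20 times on an FPGA device. The objective is to maximize the use of on-board random-access memory bandwidth through multi-threaded latency tolerance. Each shortest path core is responsible for one shortest path calculation; when it finishes, it outputs its result and requests the next source from a queue. One of the innovations of this approach is the use of a small bubble sort core to implement the extract-min function. While bubble sort is not usually considered appropriate for any non-trivial use, it is appropriate in this case because it can produce a single minimum from the list in O(n) cycles, where n is the number of elements in the vertex list. The cost of this min operation does not impact the running time of the architecture, because the queue depth for fetching the next set of edges from memory is roughly equivalent to the number of cores in the system. Additionally, this work provides a collection of simulation results that model the behavior of the node queue in hardware. The results show that a hardware queue implementing a small bubble-type minimum function need only hold on the order of 16 elements to provide both correct and optimal paths. Because the graph database size is measured in the hundreds of megabytes, the Cray SRAM memory is insufficient. In addition to the A* cores, the authors have developed a memory management system that allows round-robin servicing of the nodes as well as virtual memory managed over the HyperTransport bus. With support for a DRAM graph store with SRAM-based caching on the FPGA, the system provides a speedup of roughly 8.9x over the CPU-based implementation.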
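The extract-min idea in the abstract can be illustrated in software. The following is a minimal sketch, not the authors' hardware design: it assumes Dijkstra's algorithm (A* with a zero heuristic) and an unbounded Python list in place of the roughly 16-element hardware queue. The names `bubble_extract_min`, `shortest_paths`, and `example_graph` are hypothetical and chosen only for illustration. A single pass of adjacent compare-and-swap steps moves the minimum to the end of the list in n - 1 comparisons, which is the O(n) behavior the abstract describes.

```python
def bubble_extract_min(entries):
    """Remove and return the smallest (cost, vertex) entry.

    One bubble pass: after comparing positions i and i + 1, the smaller
    entry is kept on the right, so the minimum ends up in the last slot
    after len(entries) - 1 comparisons.
    """
    for i in range(len(entries) - 1):
        if entries[i] < entries[i + 1]:
            entries[i], entries[i + 1] = entries[i + 1], entries[i]
    return entries.pop()


def shortest_paths(graph, source):
    """Dijkstra's algorithm using the bubble-pass extract-min above.

    graph maps a vertex to a list of (neighbor, edge_weight) pairs.
    Assumption: the queue is unbounded here; the fixed hardware queue
    depth discussed in the abstract is not modeled.
    """
    dist = {source: 0}
    queue = [(0, source)]
    done = set()
    while queue:
        cost, u = bubble_extract_min(queue)
        if u in done:
            continue  # stale entry for an already settled vertex
        done.add(u)
        for v, weight in graph.get(u, []):
            new_cost = cost + weight
            if new_cost < dist.get(v, float("inf")):
                dist[v] = new_cost
                queue.append((new_cost, v))
    return dist


# Hypothetical four-vertex road network used only as a usage example.
example_graph = {
    "a": [("b", 4), ("c", 1)],
    "c": [("b", 2)],
    "b": [("d", 5)],
}
print(shortest_paths(example_graph, "a"))
```

In software a binary heap would normally be preferred for extract-min; the linear pass is shown only because it mirrors the simple compare-and-swap structure that the abstract argues is cheap enough when memory latency dominates.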

- Authors:
- BAKER, ZACHARY K.; GOKHALE, MAYA B. (Los Alamos National Laboratory)

- Publication Date:
- 2007-01-08

- Research Org.:
- Los Alamos National Lab. (LANL), Los Alamos, NM (United States)

- OSTI Identifier:
- 1000496

- Report Number(s):
- LA-UR-07-0085

TRN: US201101%%364

- DOE Contract Number:
- AC52-06NA25396

- Resource Type:
- Conference

- Resource Relation:
- Conference: IEEE SYMPOSIUM ON FIELD-PROGRAMMABLE CUSTOM COMPUTING MACH ; 200704 ; NAPA

- Country of Publication:
- United States

- Language:
- English

- Subject:
- 99; ACCELERATION; ALGORITHMS; ARCHITECTURE; BUBBLES; IMPLEMENTATION; MEMORY MANAGEMENT; NAVIGATION; QUEUES; SIMULATION; TOLERANCE; URBAN AREAS

### Citation Formats

```
BAKER, ZACHARY K., and GOKHALE, MAYA B. ON THE ACCELERATION OF SHORTEST PATH CALCULATIONS IN TRANSPORTATION NETWORKS. United States: N. p., 2007. Web.
```

```
BAKER, ZACHARY K., & GOKHALE, MAYA B. ON THE ACCELERATION OF SHORTEST PATH CALCULATIONS IN TRANSPORTATION NETWORKS. United States.
```

```
BAKER, ZACHARY K., and GOKHALE, MAYA B. 2007. "ON THE ACCELERATION OF SHORTEST PATH CALCULATIONS IN TRANSPORTATION NETWORKS". United States. https://www.osti.gov/servlets/purl/1000496.
```

```
@article{osti_1000496,
  title = {ON THE ACCELERATION OF SHORTEST PATH CALCULATIONS IN TRANSPORTATION NETWORKS},
  author = {BAKER, ZACHARY K. and GOKHALE, MAYA B.},
  abstractNote = {Shortest path algorithms are a key element of many graph problems. They are used in applications such as online direction finding and navigation, as well as traffic modeling for large-scale simulations of major metropolitan areas. Because shortest path algorithms are an execution bottleneck, it is beneficial to move their execution to parallel hardware such as Field-Programmable Gate Arrays (FPGAs). The hardware implementation uses a small A* core replicated on the order of 20 times on an FPGA device. The objective is to maximize the use of on-board random-access memory bandwidth through multi-threaded latency tolerance. Each shortest path core is responsible for one shortest path calculation; when it finishes, it outputs its result and requests the next source from a queue. One of the innovations of this approach is the use of a small bubble sort core to implement the extract-min function. While bubble sort is not usually considered appropriate for any non-trivial use, it is appropriate in this case because it can produce a single minimum from the list in O(n) cycles, where n is the number of elements in the vertex list. The cost of this min operation does not impact the running time of the architecture, because the queue depth for fetching the next set of edges from memory is roughly equivalent to the number of cores in the system. Additionally, this work provides a collection of simulation results that model the behavior of the node queue in hardware. The results show that a hardware queue implementing a small bubble-type minimum function need only hold on the order of 16 elements to provide both correct and optimal paths. Because the graph database size is measured in the hundreds of megabytes, the Cray SRAM memory is insufficient. In addition to the A* cores, the authors have developed a memory management system that allows round-robin servicing of the nodes as well as virtual memory managed over the HyperTransport bus. With support for a DRAM graph store with SRAM-based caching on the FPGA, the system provides a speedup of roughly 8.9x over the CPU-based implementation.},
  place = {United States},
  year = {2007},
  month = {1}
}
```