Triangle Counting

Problem Description

This benchmark counts the number of triangles in a given undirected graph. It implements the approach from Polak [1] in IrGL[2].

[1] Adam Polak. Counting triangles in large graphs on GPU. In IPDPS Workshops 2016, pages 740–746, 2016
[2] https://users.ices.utexas.edu/~sreepai/sree-oopsla2016.pdf

Algorithm

The algorithm preprocess the given graph using a filtering step, which removes edges that point from nodes of a higher degree to those of lower degree, breaking ties by using node identifiers. The remaining edges are considered as active edges. Then edges of each node are sorted using ModernGPU implementation. Finally, the triangles are counted by intersecting the list of edges remaining from the first step are intersected to determine. The computation is initially parallelized over the nodes, and IrGL's nested parallelism optimization is used to dynamically parallelize execution over edges at runtime. The pseudo code of the algorithm is listed below.
1 2 3 4 5 6 7 8 9 10 11 12 /* Preprocess the graph to filter the edges */ /* sort edge lists for all nodes */ num_triangles = 0; for (index_type v = 0 + tid; v < num_vertices; v += nthreads) /* Initialization code for Nested Parallelism in IrGL */ for (int j = threadIdx.x; j < num_edges(v) ; j += BLKSIZE) Node n = e.src; Node m = e.dst; NodeSet n_ngh = {s| s is n's neighbor and n < s < m}; NodeSet m_ngh = {t| t is m's neighbor and n < t < m}; num_triangles += |n_ngh intersect m_mgh|; } }

Figure 2: Pseudocode for triangle counting algorithm in IrGL

Performance

GraphTime (ms) No. of Triangles
rmat226820 1989348515
livejournal463 285730264

The above performance is observed on a 1733.500 Mhz GeForce GTX 1080 6000 with 8119 MB of main memory.