# Survey Propagation

## Application Description

This benchmark implements Survey Propagation, a heuristic
SAT-solver based on Bayesian inference [1]. The algorithm
represents the Boolean formula as a *factor graph*,
which is a bipartite graph with variables on one side and
clauses on the other. An edge connects a variable to a
clause if the variable participates in the clause. The
edge is given a value of -1 if the literal in the clause
is negated, and +1 otherwise. The general strategy of
Survey Propagation is to iteratively update each variable
with the likelihood that it should be assigned a truth
value of *true* or *false*. The data parallelism
in this algorithm arises from the worklist of all the
nodes (both variables and clauses) that need to be processed.

[1] A. Braunstein, M. Mezard, and R. Zecchina. Survey Propagation: An Algorithm for Satisfiability. Random Structures and Algorithms, 27:201-226, 2005.

## Algorithm

The algorithm proceeds as follows (pseudocode is provided in Figure 1). Each phase of the algorithm first iterates over the variables and clauses of the formula updating 'surveys' until all updates are below some small epsilon. Then, the surveys are processed to find the most biased variables which are fixed to a specific value. The graph is then reduced by removing fixed variables. If only trivial surveys exist, the problem is passed on to a simpler solver. If remaining terms exist, a new phase is started.

The termination condition for Survey Propagation is fairly complex: when the number of variables is small enough, the Survey Propagation iterations are terminated, and the remaining problem is solved using a local heuristic such as WalkSAT. Alternatively, if there is no progress after some number of iterations, the algorithm may just give up.

The benchmark performs as many phases of the algorithm as necessary until no more variables can be fixed.

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 | FactorGraph f = /* read initial formula */; do { Workset ws = new Workset(); ws.addAll(Variables); foreach (Variable v : ws) { foreach (Clause c : f.getNeighbors(v)) { update_eta_and_bias(v, c); if (/* eta changed more than epsilon */) { ws.push(f.getNeighbors(f.getNeighbors(v))); } } } foreach (Variable v : Variables) { if (/* bias greater than threshold bias */) { // fix v and remove from graph } } } while (/* has interesting surveys */); |

Figure 1: Pseudocode for Survey Propagation.

## Data Structures

There are two key data structures used in Survey Propagation:

- Unordered Set
- Because the algorithm is based on iteratively updating the values of variables chosen at random, the worklist can be represented as an unordered set. There are no ordering constraints on processing the elements of the worklist (although, as discussed above, different orders of processing may lead to different algorithmic behavior).
- Factor Graph
- The bipartite graph representing the Boolean formula.

## Parallelism

The active nodes in Survey Propagation are variable nodes of the factor graph. Initially every variable node is active. As nodes are processed, if the surveys on edges change, the nodes' neighbors' neighbors (other variables in common clauses) are added to the worklist. The neighborhood of each active node is the neighbors and neighbors' neighbors.

Each iteration of survey propagation touches a single node and its neighbors and neighbors' neighbors in the factor graph. Iterations conflict with one another if those neighborhoods overlap. The structure of the graph is largely constant, except for occasionally removing nodes when they are stable. Thus, the available parallelism reflects the connectivity of the graph and remains roughly constant, dropping occasionally as nodes are removed from the graph.

## Performance

Figure 2 and Figure 3 show the execution time in seconds and self-relative speedup (speedup with respect to the single thread performance), respectively, of Survey Propagation for a random input consisting of 1,000,000 variables and 3,000,000 clauses.

Since Survey Propagation is a heuristic to solve SAT, it is non-deterministic and direct comparisons to full SAT solvers would not be appropriate.

## Machine Description

Performance numbers are collected on a 4 package (14 cores per package) Intel(R) Xeon(R) Gold 5120 CPU machine at 2.20GHz from 1 thread to 56 threads in increments of 7 threads. The machine has 192GB of RAM. The operating system is CentOS Linux release 7.5.1804. All runs of the Galois benchmarks used gcc/g++ 7.2 to compile.