CS292C-6: Computer-Aided Reasoning for Software

CS292C-6: Computer-Aided Reasoning for Software
Lecture 6: SAT Solvers - CDCL and Conflict Analysis
YF
by Yu Feng
 
Lecture Roadmap
1
DPLL limitations
Understanding why we need better algorithms
2
CDCL algorithm
Core structure and improvements
3
Implication graphs
Tracking decisions and conflicts
4
Advanced techniques
Clause learning and backtracking
DPLL Pseudocode
DPLL(F):
  F ← BCP(F)
  if F = ⊤ return true.  // 1
  if F = ⊥ return false
  p ← choose variable // 2
  return DPLL(F[p := true]) || DPLL(F[p := false]) // 3
The Davis-Putnam-Logemann-Loveland (DPLL) algorithm forms the foundation of modern SAT solving, combining Boolean Constraint Propagation with recursive search.
DPLL Drawbacks
View more
Drawbacks of DPLL
No learning (1)
Discards work when discovering bad assignments (recursive calls don't share information)
Conflict repetition
May repeatedly encounter conflicts from the same cause (conflicts don't guide future search)
Naive decision heuristics (2)
Variable selection heavily impacts performance (lacks sophisticated selection heuristics)
Chronological backtracking (3)
Only backtracks one decision at a time (recursive structure forces sequential exploration)
The CDCL Algorithm
CDCL(F)
  A ← {}
  if BCP(F,A)=conflict then return ⟘
  level ← 0
  while hasUnassignedVars(F)
    level ← level + 1
    A ← A ∪ { DECIDE(F, A) }
    while BCP(F,A) = conflict
      ⟨b, c⟩ ← ANALYZECONFLICT()
      F ← F ∪ {c}
      if b < 0 then return ⟘
      else BACKTRACK(F, A, b)
      level ← b
  return ⟙
Key improvements over DPLL:
Decision heuristics
Choose the next literal based on search state
Learning from mistakes
Augment F with conflict clauses that summarize root causes
Non-chronological backtracking
Backtrack multiple levels based on conflict analysis
The Power of CDCL: Driving Modern Formal Verification
The chart shows how CDCL-based solvers have greatly expanded what we can verify, making formal methods useful for real-world applications. Microsoft used CDCL algorithms in their tools to reduce bugs in Windows, showing the real impact of these advances.
Citation: Marques-Silva, J. P., & Sakallah, K. A. (1999). GRASP: A Search Algorithm for Propositional Satisfiability. IEEE Transactions on Computers, 48(5), 506-521.
What CDCL Adds
Clause Learning
Remembers conflicts
Implication Graph
Tracks decision dependencies
Non-Chronological Backtracking
Jumps multiple levels back
Decision Heuristics
Smarter variable selection
High-Level CDCL Flow
Decide
Pick a literal
Propagate
Apply unit clauses
Analyze Conflict
If conflict occurs
Learn & Backtrack
Add clause, jump back
Core Concepts in CDCL
Variables
Assigned: Given a value (true or false)
Unassigned: No value yet
Example Formula
F = { c1,c2,c3,c4,...,c9 }
c1: ¬x1 ⋁ x2 ⋁ ¬x4
c2: ¬x1 ⋁ ¬x2 ⋁ x3
…
c9: x9 ⋁ x10 ⋁ x3
Clauses
Each clause in the formula can be:
✅ Satisfied: At least one literal is true
❌ Unsatisfied: All literals are false
⚠️ Unit: One unassigned literal
⏳ Unresolved: Otherwise
Example Formula
F = { c1,c2,c3,c4,c5,c6 }

c1: ¬x1 ⋁ x2 ⋁ ¬x4
c2: ¬x1 ⋁ ¬x2 ⋁ x3
c3: ¬x3 ⋁ ¬x4
c4: x4 ⋁ x5 ⋁ x6
c5: x7 ⋁ ¬x5
c6: ¬x6 ⋁ x7 ⋁ ¬x8
We'll use this formula in our implication graph example
What is an Implication Graph?
An implication graph G = (V, E) is a directed acyclic graph (DAG) that records the history of decisions and resulting deductions derived with Boolean Constraint Propagation (BCP).
Nodes (V): Each v∈V is a literal (or κ) and the decision level at which it entered the current partial assignment (PA)
Edges (E): ⟨v, w⟩ ∈ E iff v ≠ w, ¬v ∈ antecedent(w), and ⟨v, w⟩ is labeled with antecedent(w)
F = { c1,c2,c3,c4,c5,c6 }
c1: ¬x1 ⋁ x2 ⋁ ¬x4
c2: ¬x1 ⋁ ¬x2 ⋁ x3
c3: ¬x3 ⋁ ¬x4
c4: x4 ⋁ x5 ⋁ x6
c5: x7 ⋁ ¬x5
c6: ¬x6 ⋁ x7 ⋁ ¬x8
Implication graph for conflict analysis
A conflict clause is implied by formula F and it blocks partial assignments (PAs) that lead to the current conflict.
Examples of conflict clauses:
¬x1⋁¬x4
¬x1⋁x7⋁¬x8
Every cut that separates sources (decision variables) from the sink (conflict node) defines a valid conflict clause. The conflict clause prevents the solver from exploring the same failing search space repeatedly.
What is a UIP?
A unique implication point (UIP) is any node in the implication graph other than the conflict that is on all paths from the current decision literal (lit@d) to the conflict (κ@d).
UIPs represent "bottleneck" nodes in the implication graph - all paths from the decision to the conflict must pass through them.
A first UIP is the UIP that is closest to the conflict, making it particularly valuable for clause learning.
Clause Learning via Resolution
Conflict analysis derives learned clauses through resolution:
Start with clause labeling incoming edge to conflict node, derive new clauses via resolution until we find literal in first UIP
In current clause c, find last assigned literal l in c
Pick any incoming edge to l labeled with clause c'
Resolve c and c' 
Set current clause be resolvent of c and c′
Repeat until current clause contains negation of the first UIP literal (as the single literal at current decision level)
Resolution Example
1
Initial Conflict
c = c2: ¬x1 ⋁ ¬x2 ⋁ x3
Last assigned: x2
2
Find Edge
c1: ¬x1 ⋁ x2 ⋁ ¬x4
3
Resolve
¬x1 ⋁ x3 ⋁ ¬x4
4
… Final Clause
¬x1 ⋁ ¬x4
The learned clause ¬x1 ⋁ ¬x4 contains the first UIP, preventing exploration of the same failing search space.
Backtracking with Learned Clause
New Clause
(¬x1@1 ∨ ¬x4@3)
Added to formula
Jump Level
Second highest decision level among literals in clause
x1@1 → Jump to level 1
CDCL vs DPLL
Decision Heuristics
Which variable should we decide next?
Strategic Selection
Different heuristics target different problem characteristics
Performance Impact
Good heuristics can reduce search space by orders of magnitude
Modern Approaches
Activity-based and conflict-driven variable selection
CDCL(F)
  A ← {}
  if BCP(F,A)=conflict then return ⟘
  level ← 0
  while hasUnassignedVars(F)
    level ← level + 1
    A ← A ∪ { DECIDE(F, A) }
    while BCP(F,A) = conflict
      ⟨b, c⟩ ← ANALYZECONFLICT()
      F ← F ∪ {c}
      if b < 0 then return ⟘
      else BACKTRACK(F, A, b)
      level ← b
  return ⟙
Dynamic Largest Individual Sum (DLIS):
Choose the literal that satisfies the most unresolved clauses
Expensive: complexity of making a decision is proportional to the number of clauses
This heuristic prioritizes variables that appear frequently in the current set of clauses, potentially satisfying more constraints with each decision. While effective, DLIS requires recounting occurrences after each decision and unit propagation step.
Modern solvers often use variants like VSIDS (Variable State Independent Decaying Sum) that maintain "activity" scores for variables, incrementing them when variables appear in conflicts and periodically decaying all scores.
VSIDS Heuristic
Variable State Independent Decaying Sum
Conflict-based scoring
Assigns higher scores to variables involved in recent conflicts
Activity tracking
Score literals by frequency of appearance in conflict analysis
Temporal relevance
Periodically decay all scores to favor recent conflicts
Decision making
Always select unassigned variable with highest score for next decision
Mahajan, Y.S., Fu, Z., & Malik, S. (2005). Zchaff2004: An Efficient SAT Solver.
Summary
✅ CDCL adds learning and smart search
Conflict-Driven Clause Learning enhances DPLL with powerful heuristics
✅ Implication graphs = backbone of clause learning
Visual representation of variable assignments and their consequences
✅ UIP → resolution → backjump
Process for deriving learned clauses and non-chronological backtracking
✅ Next step: build your own CDCL-based solver!
Apply these concepts to create an efficient SAT solver implementation
What's Next?
Extend your SAT solver (HW2)
Apply CDCL concepts to improve your implementation
Optional: Try implementing VSIDS
Variable State Independent Decaying Sum - modern heuristic for variable selection
Next: SMT and decision procedures
Moving beyond SAT to Satisfiability Modulo Theories