3.7 Building a FOL Agent with Z3

The propositional agent from Section 3.6: Building a Knowledge-Based Agent works well, but its encoding is grounded: one Bool variable and one biconditional per square per predicate. For a \(4 \times 4\) grid this means 48 biconditionals. For a \(100 \times 100\) grid it would mean 30,000. The physics rules are the same at every square—only the neighbors change—yet we must repeat them for each location. First-order logic eliminates this redundancy: a single quantified sentence like \(\forall L\, \text{Creaking}(L) \Leftrightarrow \exists L'\, \text{Adjacent}(L, L') \land \text{Damaged}(L')\) covers every location at once.

In this section we extend the Z3-based agent from Section 3.6: Building a Knowledge-Based Agent to use quantified FOL sentences. Since we already know Z3's Solver, push/pop, and z3_entails from Section 3.6: Building a Knowledge-Based Agent, we can jump straight to the new FOL machinery: DeclareSort, Function, ForAll, and Exists.

3.7.1 From Propositions to Predicates

3.7.1.1 Defining the Domain

In propositional logic, we created one Bool per square: Bool(f'D_{x}_{y}'). In FOL, we instead declare a Location sort and define predicates as Z3 Function objects:

from z3 import DeclareSort, Function, BoolSort, Const, ForAll, Exists, Distinct

Location = DeclareSort('Location')

Damaged_fn  = Function('Damaged',  Location, BoolSort())
Forklift_fn = Function('Forklift', Location, BoolSort())
Creaking_fn = Function('Creaking', Location, BoolSort())
Rumbling_fn = Function('Rumbling', Location, BoolSort())
Safe_fn     = Function('Safe',     Location, BoolSort())
Adjacent_fn = Function('Adjacent', Location, Location, BoolSort())

Each Function maps Location (or pairs of locations) to booleans—exactly the predicates from Section 3.4.5: Predicates: Properties and Relations in Section 3.4: First-Order Logic. We also create a constant for each grid square:

loc = {}
for x in range(1, width + 1):
    for y in range(1, height + 1):
        loc[(x, y)] = Const(f'L_{x}_{y}', Location)

Compare with the propositional approach from Section 3.6: Building a Knowledge-Based Agent, where damaged(3, 1) returned Bool('D_3_1'). Here, Damaged_fn(loc[(3, 1)]) applies the predicate Damaged to the location constant L_3_1—a fundamentally different representation.

3.7.1.2 Quantified Physics Rules

Now we can write the creaking rule as a single quantified sentence:

L  = Const('L',  Location)
Lp = Const('Lp', Location)

solver.add(ForAll(L,
    Creaking_fn(L) == Exists(Lp, And(Adjacent_fn(L, Lp), Damaged_fn(Lp)))
))

This is a direct translation of equation (3.5) from Section 3.4: First-Order Logic: \[\forall L\ \text{Creaking}(L) \Leftrightarrow \exists L'\ \text{Adjacent}(L, L') \land \text{Damaged}(L')\]

One sentence for all locations—no loop over grid squares. The rumbling and safety rules are equally concise:

solver.add(ForAll(L,
    Rumbling_fn(L) == Exists(Lp, And(Adjacent_fn(L, Lp), Forklift_fn(Lp)))
))

solver.add(ForAll(L,
    Safe_fn(L) == And(Not(Damaged_fn(L)), Not(Forklift_fn(L)))
))

Three sentences encode the entire physics of the warehouse, regardless of whether the grid is \(4 \times 4\) or \(100 \times 100\). Compare this with the propositional agent from Section 3.6: Building a Knowledge-Based Agent, which needed 48 grounded biconditionals for a \(4 \times 4\) grid.

3.7.1.3 Structural Facts: Adjacency and Domain Closure

The quantified rules express general physics, but the solver also needs to know the structure of the grid: which locations exist and which pairs are adjacent. These structural facts do require enumeration—but they encode the topology, not the physics.

Adjacency is a closed-world assertion: every pair of grid squares is either adjacent or not.

for x in range(1, width + 1):
    for y in range(1, height + 1):
        adj_set = set(get_adjacent(x, y, width, height))
        for x2 in range(1, width + 1):
            for y2 in range(1, height + 1):
                if (x2, y2) in adj_set:
                    solver.add(Adjacent_fn(loc[(x, y)], loc[(x2, y2)]))
                else:
                    solver.add(Not(Adjacent_fn(loc[(x, y)], loc[(x2, y2)])))

Domain closure is a subtlety that arises because Z3's DeclareSort creates an uninterpreted sort—the solver is free to imagine additional elements beyond our grid constants. Without domain closure, ForAll(L, ...) ranges over these phantom locations too. This breaks process-of-elimination reasoning: if the agent hears creaking at \((2,1)\), the solver should conclude that one of \((2,1)\)'s neighbors has damaged floor. But if phantom locations could also be adjacent to \((2,1)\) and absorb the blame, the solver can construct models where no real grid square is damaged—and the entailment fails.

The fix is a single axiom stating that every Location is one of our grid constants:

solver.add(ForAll(L,
    Or([L == loc[(x, y)]
        for x in range(1, width + 1)
        for y in range(1, height + 1)])
))
solver.add(Distinct(list(loc.values())))

With domain closure, ForAll and Exists range only over actual grid squares, and the FOL encoding produces exactly the same entailment results as the propositional version from Section 3.6: Building a Knowledge-Based Agent.

3.7.1.4 The Complete FOL KB Builder

Putting it all together, build_warehouse_kb_fol creates the solver, declares the domain, adds structural facts, and encodes the quantified rules:

def build_warehouse_kb_fol(width=4, height=4):
    Location = DeclareSort('Location')

    Damaged_fn  = Function('Damaged',  Location, BoolSort())
    Forklift_fn = Function('Forklift', Location, BoolSort())
    Creaking_fn = Function('Creaking', Location, BoolSort())
    Rumbling_fn = Function('Rumbling', Location, BoolSort())
    Safe_fn     = Function('Safe',     Location, BoolSort())
    Adjacent_fn = Function('Adjacent', Location, Location, BoolSort())

    loc = {}
    for x in range(1, width + 1):
        for y in range(1, height + 1):
            loc[(x, y)] = Const(f'L_{x}_{y}', Location)

    solver = Solver()

    # Domain closure
    L = Const('L', Location)
    solver.add(ForAll(L,
        Or([L == loc[(x, y)]
            for x in range(1, width + 1)
            for y in range(1, height + 1)])
    ))
    solver.add(Distinct(list(loc.values())))

    # Adjacency (closed-world)
    for x in range(1, width + 1):
        for y in range(1, height + 1):
            adj_set = set(get_adjacent(x, y, width, height))
            for x2 in range(1, width + 1):
                for y2 in range(1, height + 1):
                    if (x2, y2) in adj_set:
                        solver.add(Adjacent_fn(loc[(x, y)], loc[(x2, y2)]))
                    else:
                        solver.add(Not(Adjacent_fn(loc[(x, y)], loc[(x2, y2)])))

    # Quantified physics rules
    Lp = Const('Lp', Location)

    solver.add(ForAll(L,
        Creaking_fn(L) == Exists(Lp, And(Adjacent_fn(L, Lp), Damaged_fn(Lp)))
    ))
    solver.add(ForAll(L,
        Rumbling_fn(L) == Exists(Lp, And(Adjacent_fn(L, Lp), Forklift_fn(Lp)))
    ))
    solver.add(ForAll(L,
        Safe_fn(L) == And(Not(Damaged_fn(L)), Not(Forklift_fn(L)))
    ))

    # Initial knowledge
    solver.add(Safe_fn(loc[(1, 1)]))

    predicates = {
        'Creaking': Creaking_fn, 'Rumbling': Rumbling_fn,
        'Safe': Safe_fn, 'Damaged': Damaged_fn,
        'Forklift': Forklift_fn, 'Adjacent': Adjacent_fn,
    }
    return solver, loc, predicates

The function returns a (solver, loc, predicates) tuple. The agent stores loc (to map grid coordinates to Z3 location constants) and predicates (to construct TELL and ASK expressions).

3.7.2 The Z3 FOL Agent

The WarehouseZ3Agent class mirrors the WarehouseKBAgent from Section 3.6: Building a Knowledge-Based Agent in decision strategy, path planning, and action conversion. The difference is the reasoning engine: Z3 with quantified FOL rules replaces the propositional grounded encoding.

3.7.2.1 Initialization

The agent builds the FOL KB and stores the location map and predicates:

def __init__(self, env):
    self.env = env
    self.solver, self.loc, self.preds = build_warehouse_kb_fol(
        env.width, env.height
    )
    # ... position, direction, visited, known_safe, etc.

3.7.2.2 TELL: Percepts as FOL Assertions

When the agent perceives creaking or rumbling, it TELLs the solver by applying the predicate function to the current location constant:

def tell_percepts(self, percept):
    L = self.loc[(self.x, self.y)]
    if percept.creaking:
        self.solver.add(self.preds['Creaking'](L))
    else:
        self.solver.add(Not(self.preds['Creaking'](L)))
    if percept.rumbling:
        self.solver.add(self.preds['Rumbling'](L))
    else:
        self.solver.add(Not(self.preds['Rumbling'](L)))

Compare with the propositional agent's solver.add(creaking_at(x, y)) from Section 3.6: Building a Knowledge-Based Agent. Here we write self.preds['Creaking'](L) — applying a FOL predicate to a location constant, rather than looking up a ground propositional symbol.

3.7.2.3 ASK: Safety Queries

The agent queries safety using z3_entails with the Safe predicate:

def update_safety(self):
    for x in range(1, self.env.width + 1):
        for y in range(1, self.env.height + 1):
            pos = (x, y)
            if pos in self.known_safe or pos in self.known_dangerous:
                continue
            L = self.loc[pos]
            if z3_entails(self.solver, self.preds['Safe'](L)):
                self.known_safe.add(pos)
            elif z3_entails(self.solver, Not(self.preds['Safe'](L))):
                self.known_dangerous.add(pos)

The rest of the agent—plan_path, path_to_actions, choose_action, execute_action, run—is identical to Section 3.6: Building a Knowledge-Based Agent.

3.7.3 Complete Implementation

The following module contains the complete FOL agent. It includes the quantified FOL encoding used by the agent and the full agent class.

Source code for file warehouse_z3_agent.py

"""
FOL Agent for the Hazardous Warehouse (Z3 Version)

Uses Z3's SMT solver with quantified first-order logic to reason about
safety and navigate the warehouse to retrieve the package.

This agent implements the same TELL/ASK loop as warehouse_kb_agent.py
but replaces the DPLL-based PropKB with Z3's Solver and expresses the
physics rules as quantified FOL sentences:
  1. TELL the solver about percepts (solver.add)
  2. ASK via entailment check (push/Not(query)/check/pop)
  3. Plan a path through safe squares toward the goal
  4. Execute actions and repeat

The physics rules are expressed as single universally quantified sentences:
  - ForAll L, Creaking(L) <=> Exists L', Adjacent(L,L') & Damaged(L')
  - ForAll L, Rumbling(L) <=> Exists L', Adjacent(L,L') & Forklift(L')
  - ForAll L, Safe(L) <=> ~Damaged(L) & ~Forklift(L)
"""

from collections import deque

from z3 import (
    Or, And, Not, Solver, unsat,
    DeclareSort, Function, BoolSort, Const, ForAll, Exists, Distinct,
)
from hazardous_warehouse_env import (
    HazardousWarehouseEnv,
    Action,
    Direction,
)


# ---------------------------------------------------------------------------
# Z3 Entailment Check
# ---------------------------------------------------------------------------

def z3_entails(solver, query):
    """Check whether the solver's current assertions entail *query*.

    Uses the refutation method: push a checkpoint, assert Not(query),
    and check satisfiability.  If unsat, the negated query is
    inconsistent with the KB --- meaning the KB entails the query.
    Pop restores the solver to its previous state.
    """
    solver.push()
    solver.add(Not(query))
    result = solver.check() == unsat
    solver.pop()
    return result


# ---------------------------------------------------------------------------
# Adjacency
# ---------------------------------------------------------------------------

def get_adjacent(x, y, width=4, height=4):
    """Return the list of (x, y) positions adjacent to (x, y)."""
    result = []
    for dx, dy in [(-1, 0), (1, 0), (0, -1), (0, 1)]:
        nx, ny = x + dx, y + dy
        if 1 <= nx <= width and 1 <= ny <= height:
            result.append((nx, ny))
    return result


# ---------------------------------------------------------------------------
# Knowledge-Base Construction (Quantified FOL Encoding)
# ---------------------------------------------------------------------------

def build_warehouse_kb_fol(width=4, height=4):
    """Build a Z3 Solver using quantified first-order logic.

    The physics rules are expressed as single quantified sentences ---
    one per rule, independent of the grid size:

        ForAll L, Creaking(L) == Exists L', Adjacent(L,L') & Damaged(L')
        ForAll L, Rumbling(L) == Exists L', Adjacent(L,L') & Forklift(L')
        ForAll L, Safe(L) == And(Not(Damaged(L)), Not(Forklift(L)))

    Structural facts (adjacency, domain closure) require enumeration over
    grid squares, but these encode the grid topology, not the physics.

    Returns (solver, loc, predicates) where:
      - solver: Z3 Solver with all constraints
      - loc: dict mapping (x,y) to Z3 Location constants
      - predicates: dict mapping names to Z3 Function objects
    """
    Location = DeclareSort('Location')

    # Uninterpreted functions (predicates)
    Damaged_fn = Function('Damaged', Location, BoolSort())
    Forklift_fn = Function('Forklift', Location, BoolSort())
    Creaking_fn = Function('Creaking', Location, BoolSort())
    Rumbling_fn = Function('Rumbling', Location, BoolSort())
    Safe_fn = Function('Safe', Location, BoolSort())
    Adjacent_fn = Function('Adjacent', Location, Location, BoolSort())

    # Location constants --- one per grid square
    loc = {}
    for x in range(1, width + 1):
        for y in range(1, height + 1):
            loc[(x, y)] = Const(f'L_{x}_{y}', Location)

    solver = Solver()

    # --- Domain closure: every Location is one of our grid constants ---
    # Without this, ForAll could range over phantom locations that absorb
    # damage/forklift blame, breaking process-of-elimination reasoning.
    L = Const('L', Location)
    solver.add(ForAll(L,
        Or([L == loc[(x, y)]
            for x in range(1, width + 1)
            for y in range(1, height + 1)])
    ))

    # All location constants are distinct
    solver.add(Distinct(list(loc.values())))

    # --- Adjacency facts (closed-world) ---
    # Every pair of grid squares is either adjacent or not.
    for x in range(1, width + 1):
        for y in range(1, height + 1):
            adj_set = set(get_adjacent(x, y, width, height))
            for x2 in range(1, width + 1):
                for y2 in range(1, height + 1):
                    if (x2, y2) in adj_set:
                        solver.add(Adjacent_fn(loc[(x, y)], loc[(x2, y2)]))
                    else:
                        solver.add(Not(Adjacent_fn(loc[(x, y)], loc[(x2, y2)])))

    # --- Quantified physics rules ---
    # One sentence each --- no loop over grid squares.
    Lp = Const('Lp', Location)

    # Creaking rule
    solver.add(ForAll(L,
        Creaking_fn(L) == Exists(Lp, And(Adjacent_fn(L, Lp), Damaged_fn(Lp)))
    ))

    # Rumbling rule
    solver.add(ForAll(L,
        Rumbling_fn(L) == Exists(Lp, And(Adjacent_fn(L, Lp), Forklift_fn(Lp)))
    ))

    # Safety rule
    solver.add(ForAll(L,
        Safe_fn(L) == And(Not(Damaged_fn(L)), Not(Forklift_fn(L)))
    ))

    # --- Initial knowledge ---
    solver.add(Safe_fn(loc[(1, 1)]))

    predicates = {
        'Creaking': Creaking_fn,
        'Rumbling': Rumbling_fn,
        'Safe': Safe_fn,
        'Damaged': Damaged_fn,
        'Forklift': Forklift_fn,
        'Adjacent': Adjacent_fn,
    }

    return solver, loc, predicates


# ---------------------------------------------------------------------------
# Turning Helpers
# ---------------------------------------------------------------------------

_DIRECTION_ORDER = [Direction.NORTH, Direction.EAST, Direction.SOUTH, Direction.WEST]


def _direction_index(d):
    return _DIRECTION_ORDER.index(d)


def turns_between(current, target):
    """Return a list of TURN_LEFT / TURN_RIGHT actions to face *target*.

    Chooses the shortest rotation direction.
    """
    if current == target:
        return []
    ci = _direction_index(current)
    ti = _direction_index(target)
    right_steps = (ti - ci) % 4   # clockwise
    left_steps = (ci - ti) % 4    # counter-clockwise
    if right_steps <= left_steps:
        return [Action.TURN_RIGHT] * right_steps
    else:
        return [Action.TURN_LEFT] * left_steps


def delta_to_direction(dx, dy):
    """Map a movement delta to the Direction enum."""
    return {
        (0, 1): Direction.NORTH,
        (0, -1): Direction.SOUTH,
        (1, 0): Direction.EAST,
        (-1, 0): Direction.WEST,
    }[(dx, dy)]


# ---------------------------------------------------------------------------
# Z3 FOL Knowledge-Based Agent
# ---------------------------------------------------------------------------

class WarehouseZ3Agent:
    """A knowledge-based agent using Z3 FOL for the Hazardous Warehouse.

    Mirrors WarehouseKBAgent from warehouse_kb_agent.py with the same
    decision strategy, path planning, and action conversion logic.
    The difference is the reasoning engine: Z3 with quantified FOL
    rules replaces the DPLL-based propositional WarehouseKB.

    Decision strategy (in priority order):
      1. If the beacon is detected, GRAB the package.
      2. If carrying the package, navigate to (1,1) and EXIT.
      3. Otherwise, explore the nearest safe unvisited square.
      4. If no safe unvisited square is reachable, return to (1,1) and EXIT.
    """

    def __init__(self, env):
        self.env = env
        self.solver, self.loc, self.preds = build_warehouse_kb_fol(
            env.width, env.height
        )
        self.x = 1
        self.y = 1
        self.direction = Direction.EAST
        self.has_package = False
        self.visited = {(1, 1)}
        self.known_safe = {(1, 1)}
        self.known_dangerous = set()
        self.action_queue = []
        self.step_count = 0

    # ----- Percepts ----------------------------------------------------------

    def tell_percepts(self, percept):
        """Translate a Percept into Z3 FOL assertions and add to the solver."""
        L = self.loc[(self.x, self.y)]
        if percept.creaking:
            self.solver.add(self.preds['Creaking'](L))
        else:
            self.solver.add(Not(self.preds['Creaking'](L)))
        if percept.rumbling:
            self.solver.add(self.preds['Rumbling'](L))
        else:
            self.solver.add(Not(self.preds['Rumbling'](L)))

    # ----- Safety queries ----------------------------------------------------

    def update_safety(self):
        """Check entailment for every square whose status is still unknown."""
        for x in range(1, self.env.width + 1):
            for y in range(1, self.env.height + 1):
                pos = (x, y)
                if pos in self.known_safe or pos in self.known_dangerous:
                    continue
                L = self.loc[pos]
                if z3_entails(self.solver, self.preds['Safe'](L)):
                    self.known_safe.add(pos)
                elif z3_entails(self.solver, Not(self.preds['Safe'](L))):
                    self.known_dangerous.add(pos)

    # ----- Path planning -----------------------------------------------------

    def plan_path(self, start, goal_set):
        """BFS through known-safe squares from *start* to any cell in *goal_set*.

        Returns a list of (x, y) positions forming the path (including
        *start* and the reached goal), or None if no path exists.
        """
        queue = deque([(start, [start])])
        seen = {start}
        while queue:
            (cx, cy), path = queue.popleft()
            if (cx, cy) in goal_set:
                return path
            for nx, ny in get_adjacent(cx, cy, self.env.width, self.env.height):
                if (nx, ny) not in seen and (nx, ny) in self.known_safe:
                    seen.add((nx, ny))
                    queue.append(((nx, ny), path + [(nx, ny)]))
        return None

    def path_to_actions(self, path):
        """Convert a position path into a sequence of Actions.

        Returns (actions, final_direction) where *actions* is the list of
        TURN_LEFT / TURN_RIGHT / FORWARD actions and *final_direction* is
        the direction the robot faces after executing them all.
        """
        actions = []
        direction = self.direction
        for i in range(1, len(path)):
            dx = path[i][0] - path[i - 1][0]
            dy = path[i][1] - path[i - 1][1]
            target_dir = delta_to_direction(dx, dy)
            actions.extend(turns_between(direction, target_dir))
            actions.append(Action.FORWARD)
            direction = target_dir
        return actions, direction

    # ----- Decision logic ----------------------------------------------------

    def choose_action(self, percept):
        """Select the next action based on the current state of knowledge."""
        # Execute queued actions first (from a multi-step plan).
        if self.action_queue:
            return self.action_queue.pop(0)

        # 1. If the beacon is on, grab the package.
        if percept.beacon and not self.has_package:
            return Action.GRAB

        # 2. If carrying the package, navigate home and exit.
        if self.has_package:
            if (self.x, self.y) == (1, 1):
                return Action.EXIT
            path = self.plan_path((self.x, self.y), {(1, 1)})
            if path and len(path) > 1:
                actions, _ = self.path_to_actions(path)
                self.action_queue = actions[1:]
                return actions[0]
            # Already at (1,1) or can't find path — just exit.
            return Action.EXIT

        # 3. Explore the nearest safe unvisited square.
        safe_unvisited = self.known_safe - self.visited
        if safe_unvisited:
            path = self.plan_path((self.x, self.y), safe_unvisited)
            if path and len(path) > 1:
                actions, _ = self.path_to_actions(path)
                self.action_queue = actions[1:]
                return actions[0]

        # 4. Nothing left to explore — go home and exit.
        if (self.x, self.y) == (1, 1):
            return Action.EXIT
        path = self.plan_path((self.x, self.y), {(1, 1)})
        if path and len(path) > 1:
            actions, _ = self.path_to_actions(path)
            self.action_queue = actions[1:]
            self.action_queue.append(Action.EXIT)
            return actions[0]
        return Action.EXIT

    # ----- Execution ---------------------------------------------------------

    def execute_action(self, action):
        """Send *action* to the environment and update internal bookkeeping."""
        percept, reward, done, info = self.env.step(action)

        if action == Action.FORWARD and not percept.bump:
            dx, dy = self.direction.delta()
            self.x += dx
            self.y += dy
            self.visited.add((self.x, self.y))
        elif action == Action.TURN_LEFT:
            self.direction = self.direction.turn_left()
        elif action == Action.TURN_RIGHT:
            self.direction = self.direction.turn_right()
        elif action == Action.GRAB and info.get("grabbed"):
            self.has_package = True

        self.step_count += 1
        return percept, reward, done, info

    # ----- Main loop ---------------------------------------------------------

    def run(self, verbose=True):
        """Run the full perceive-tell-ask-act loop until the episode ends."""
        # Process the initial percept at (1, 1).
        percept = self.env._last_percept
        self.tell_percepts(percept)
        self.update_safety()

        if verbose:
            print(f"Start at ({self.x},{self.y}) facing {self.direction.name}")
            print(f"  Percept: {percept}")
            print(f"  Known safe: {sorted(self.known_safe)}")

        while True:
            action = self.choose_action(percept)
            percept, reward, done, info = self.execute_action(action)

            if verbose:
                print(f"\nStep {self.step_count}: {action.name}")
                print(f"  Position: ({self.x},{self.y}), Facing: {self.direction.name}")
                print(f"  Percept: {percept}")
                print(f"  Info: {info}")

            if done:
                if verbose:
                    print(f"\n{'=' * 40}")
                    print(f"Episode ended.  Reward: {self.env.total_reward:.0f}")
                    print(f"Steps taken: {self.step_count}")
                    success = info.get("exit") == "success"
                    print(f"Success: {success}")
                return

            # After moving to a new square, tell percepts and re-query safety.
            if action == Action.FORWARD and not percept.bump:
                self.tell_percepts(percept)
                self.update_safety()
                if verbose:
                    print(f"  Known safe: {sorted(self.known_safe)}")
                    print(f"  Known dangerous: {sorted(self.known_dangerous)}")


# ---------------------------------------------------------------------------
# Main — run on the example layout from the textbook
# ---------------------------------------------------------------------------

if __name__ == "__main__":
    from hazardous_warehouse_viz import configure_rn_example_layout

    env = HazardousWarehouseEnv(seed=0)
    configure_rn_example_layout(env)

    print("True state (hidden from the agent):")
    print(env.render(reveal=True))
    print()

    agent = WarehouseZ3Agent(env)
    agent.run(verbose=True)

3.7.4 Running the Agent

To run the Z3 FOL agent on the example layout from Section 3.2: The Hazardous Warehouse Environment:

from hazardous_warehouse_env import HazardousWarehouseEnv
from hazardous_warehouse_viz import configure_rn_example_layout
from warehouse_z3_agent import WarehouseZ3Agent

env = HazardousWarehouseEnv(seed=0)
configure_rn_example_layout(env)

print("True state (hidden from agent):")
print(env.render(reveal=True))

agent = WarehouseZ3Agent(env)
agent.run(verbose=True)

The agent follows the same exploration path as the propositional agent from Section 3.6: Building a Knowledge-Based Agent:

Starts at \((1,1)\), deduces \((2,1)\) and \((1,2)\) are safe.
Explores \((2,1)\) (creaking) and \((1,2)\) (rumbling), deducing that \((3,1)\) is damaged, \((1,3)\) has the forklift, and \((2,2)\) is safe.
Explores \((2,2)\) and then \((2,3)\), detects the beacon, and grabs the package.
Returns to \((1,1)\) through safe squares and exits.

The step count, reward, and exploration order are identical—the same logical conclusions are drawn by both the propositional and FOL encodings.

3.7.5 Comparison: Propositional vs. FOL

Propositional vs. FOL encodings, both using Z3
Aspect	Propositional (`warehouse_kb_agent.py`)	FOL (`warehouse_z3_agent.py`)
Encoding	Grounded `Bool` (one per square)	Quantified `ForAll`/`Exists`
Rule count (\(4 \times 4\))	48 biconditionals	3 quantified sentences
Adjacency	Implicit in `get_adjacent()`	Explicit `Adjacent` facts + domain closure
Solver strategy	SAT (Boolean Constraint Propagation)	MBQI (grounds quantifiers internally)
Variables	80 Bools (\(16 \times 5\) predicates)	5 Functions + 16 constants
Performance (\(4 \times 4\))	Fast	Fast

Both agents use Z3 and make identical decisions on the same layout. For a \(4 \times 4\) grid, performance is comparable—the bottleneck is the environment interaction, not the solver. Z3 internally grounds quantifiers over finite domains via MBQI (Model-Based Quantifier Instantiation), so the two approaches converge at the solver level. The FOL encoding's advantages become more significant for larger grids, richer constraint types (e.g., cardinality bounds like "at most 2 damaged squares"), or when extending the agent with arithmetic reasoning.

3.7.6 Discussion

3.7.6.1 Strengths

The Z3 FOL agent inherits the same soundness guarantee as the propositional agent: it will never enter a square it has not proven safe. Z3 is both sound and complete for the theories it supports, so the refutation-based entailment check is reliable.

The FOL encoding is the key pedagogical payoff. Three quantified sentences replace 48 grounded biconditionals, and the code reads almost identically to the mathematical formulas from Section 3.4: First-Order Logic. Writing ForAll(L, ...) forces you to think about the general rule, not the specific squares—exactly the expressiveness advantage that first-order logic promises.

3.7.6.2 Domain Closure: A FOL Subtlety

The domain closure axiom is worth reflecting on. In propositional logic, the set of symbols is fixed—there are no "phantom" propositions the solver can invent. In FOL with an uninterpreted sort, the solver is free to imagine additional domain elements unless we explicitly close the domain. This is a genuine cost of FOL expressiveness: the generality that lets us write grid-size-independent rules also requires us to specify the boundaries of the domain.

For finite, known domains like our warehouse grid, domain closure is straightforward. For open-ended domains (e.g., "all possible future locations"), it would not apply—and the FOL encoding would need a different approach.

3.7.6.3 Limitations

The agent exhibits the same conservative behavior as the propositional agent from Section 3.6: Building a Knowledge-Based Agent: it only enters provably safe squares. Probabilistic reasoning would require a different framework entirely.

For very large grids, the quantified FOL encoding could hit quantifier instantiation limits, though this is unlikely for warehouse-sized problems. The propositional encoding scales linearly with grid size and remains efficient.

3.7.7 Summary

Building the Z3-based FOL agent required:

A FOL domain: a Location sort with constants for each grid square, and Function objects for each predicate—replacing the Bool variables from Section 3.6: Building a Knowledge-Based Agent.
Three quantified physics rules (ForAll/Exists) that express creaking, rumbling, and safety independently of grid size.
Structural facts: closed-world adjacency and domain closure to ground the FOL encoding in the finite grid.
The same agent logic: perceive-tell-ask-act loop, BFS path planning, and decision priorities—unchanged from Section 3.6: Building a Knowledge-Based Agent.

The result is an agent that produces identical behavior to the propositional agent, with physics rules expressed as quantified FOL sentences—directly executable versions of the formulas from Section 3.4: First-Order Logic.

Drawing Tools

Table of Contents