Solving the Sudoku puzzle by logic only
This page describes how to solve a Sudoku puzzle. It's a puzzle in a 9x9
grid where each number (from 1 to 9) has to appear exactly once in each row,
column and exactly once in each of the nine 3x3 squares. If you've never
heard of it, search the web, there are plenty of examples.
It's an intriguing puzzle because it can seem at once very simple and very,
very difficult. I wrote a solution finder that simply tried each number in
each position but that takes forever. I didn't let it finish, I have a
sneaking suspicion it's end-of-the-universe type stuff.
The solution
So, my final solution ended up being a bit smarter. It employs three kinds of tricks:
- Firstly, it goes though each cell and determines if there is only one
possible number that can be placed there. This is the simplest and most
simple puzzles found in newspapers can be solved with this method alone. It
just takes a bit of time, although it's trivial for a computer.
- Secondly, it checks each cell to see if there is a number possible
there that is not possible in any other cell in the row, column or box.
This is a slightly more sophisticated method although you can apply it very
easily when you're looking at the grid yourself. It gets you out of stuck
spots where the first method can't proceed.
- Thirdly, it tries the method known as locked candidates. It's the
technique where if a number in any row or column is only present in a
single box, you can exclude it from the rest of the box. The site given has
abetter explanation.
- The final method is one that simply count the number of distinct number
in a group of cells and if that number is the same as the number of cells
you've made a subgroup, This restricts the contens of other cells and may
lead to a solution.
The program basically keeps applying the first method until it stops
working. It then applies the second method and tries the first again. It
only tries the later methods if the earlier ones don't make progress. The
solver never guesses though it could easily be made to do so. If it
can't sole it it shows you where it got upto and stops. Usually I can't
solve the resulting grid either which means I can't really fix it either.
The implementation
The grid is stored as a two dimensional array. The cells are referenced by
Row and Col numbers from 1 to 9. The cell either contains a number or is
unbound (represented by '_' in Prolog). The important basic predicates are:
- available( In, Out ) :- Given a list of cells (either bound or unbound)
returns the set of numbers that do not appear in that set. Unbound
cells are ignored.
- possibles( P, Rin, Cin, Out ) :- Given the grid P, return the
set of possible values for the cell (Rin,Cin). It works basically by
extracting the values in the same row, column and box, run the
available/2 predicate on each and return the intersection of those
sets. Note, if the cell already has a value, this predicate fails.
You shouldn't be doing that anyway. It also simplifies coding as filled-in
cells will be automatically ignored.
- row( P, Row ), col( P, Col ), box( P, Box )
:- return the values in the grid associated with the given row, column or
box.
After seeing the above basic predicates you can see how the solving might
proceed. Note, the built-in findall/3 predicate is used a lot to find all
possible solutions.
Discovered values are stored in the form of [Row,Col,Value]. If in
an intermediate stage with multiple possible values, we use
[Row,Col,[Values]]. The predicate assign/3 applies the found
results to the grid. It fails if there is nothing to apply, so each
individual step doesn't need to explicitly check if it didn't find anything.
- Sanity check: Check that there are no cells for which
possibles/4 returns an empty set. This means we screwed up. This
is where it's important for possibles/4 to fail on filled in cells rather
than return an empty set.
- First method: Find all cells where possibles/4 returns a single
element. If found, that's the element to fill in.
- Second method: This is trickier. For each row, column and box we
collect the set of possibles for each cell. Within a single (for example)
row, we take each the possibles for each cell and subtract the possibles
for the other cells. If there is exactly one number left over, that's
number it should be.
We do this search in row, column and boxes one after the other and return
the union of the results. (row|col|box)set/3 are the testing
predicates. (row|col|box)all/2 are the predicates to collect the
results.
- Third method: This is the Locked Candidates test. For each row, column
or box it extract the set of possibles. If a number in that row, column or
box only appears in one section of it, it is excluded in the other
direction also.
Much of the complexity here it caused by the fact that I generalised it to
handle all the possiblities. It choose a major axis (say row) and number
(say 1) and a minor axis (say box). It will then select all the cells in
row 1. It divides these cells by the minor axis giving you a set with the
number used in the intersections of (row 1, box1) (row 1, box 2) (row 1,
box 3). If any of those contain a number not in the other two, that number
can be removed for all other rows in the same box.
A bit difficult to get your head around but it's a very effective solve
method.
- Fourth method: This basically scans each row, column or box and tries
to find a subset of cells where the number of distinct numbers is the same
as the number of cells. Works quite well.
If none of the above work we've either solved the puzzle or got stuck. In
either case, print the result and stop.
Speed
Basically, it solves the puzzle quicker than you can type it in. The
problems in the file I've put in the predicate problem/2 so you basically
say:
> problem(1,X), solve(X).
This will solve the first puzzle in the file. I defined 7. Ideally it should
read from a file but I havn't done that yet.
It prints out each step along the way along with which each type of method it
used. You could turn this off if you want. It's probably fast enough to
create a puzzle setter. solve/2 will fail outright if the puzzle is
inconsistent and return an incomplete grid if there is not enough
information. You could score the different methods of solving to try and
make harder-to-solve puzzles.
Download: sudoku3.plg
By Martijn van Oosterhout (kleptog (at) svana.org)
Copyright © 2000-2006 - Last modified 3/04/2006