In Leon's original paper on backtracking algorithms, he describes his method of working with a binary linear [n,k] code C. The idea is to produce a set of vectors V in F_2^n such that
- V is invariant under Aut(C),
- V is a relatively small set, and
- [Aut(V):Aut(C)] is very small (the runtime rises rapidly as a function of this index).
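A natural candidate for V, and one of the standard choices in practice, is the set of minimum-weight codewords of C, since any automorphism of C permutes that set. Here is a minimal sketch of computing it, assuming the code is given by a k x n generator matrix with each row packed into a Python integer (one bit per column); the function name and representation are just for illustration.

```python
# A minimal sketch, assuming C is given by a k x n generator matrix over GF(2)
# with each row packed into a Python integer (one bit per column).
def minimum_weight_words(gen_rows, n):
    """Brute-force all 2^k codewords and return those of minimum nonzero weight."""
    k = len(gen_rows)
    best_weight = n + 1
    best_words = []
    for m in range(1, 1 << k):            # every nonzero message vector
        word = 0
        for i in range(k):
            if (m >> i) & 1:
                word ^= gen_rows[i]       # add the i-th generator row over GF(2)
        w = bin(word).count("1")          # Hamming weight of the codeword
        if w < best_weight:
            best_weight, best_words = w, [word]
        elif w == best_weight:
            best_words.append(word)
    return best_words

# Example: a [7,4] Hamming code, rows packed as 7-bit integers (column 0 = MSB).
G = [0b1000011, 0b0100101, 0b0010110, 0b0001111]
V = minimum_weight_words(G, 7)
print(len(V))   # 7 -- the Hamming code has 7 words of weight 3
```

Of course the enumeration itself visits all 2^k codewords, so this is the kind of computation one would hope to do only once, up front.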
Another approach to refinement is to never store such large sets in memory. We want to avoid memory usage on the order of 2^k, so we only ever store a basis for the code, and we only keep track of the column partitions. This way memory usage will be around O(nk). However, this will require a different approach to refinement, which will either be effective in refining partitions and run in time O(n·2^k), or won't be very effective but will run in time O(nk).

The idea for the former is to generalize the notion of degree between a point p and a cell: each word that has a 1 at position p has a certain number w of 1's in the cell. The degree is then an (n+1)-tuple (w_0, ..., w_n), where w_i is the number of words that have a 1 at position p and exactly i 1's in the cell. This requires a Gray code traversal of every word in the code, so the time must be O(n·2^k), but the constants will be very nice: for each word, we need a few operations for the Gray code step, a few operations for masking and looking up the Hamming weight, and the n is actually divided by the radix.
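To make the degree computation concrete, here is a minimal sketch under the same packed-integer representation of the generator matrix as above; degree_tuple and its bit-ordering convention (column 0 in the most significant bit) are illustrative choices, not part of any existing library. Each Gray code step flips exactly one generator row, so the per-word cost is one XOR, one mask, and one popcount, matching the constants described above.

```python
# A minimal sketch of the (n+1)-tuple degree, using the same packed-integer
# generator matrix as above.  The code is traversed in Gray code order, so
# consecutive codewords differ by exactly one generator row.
def degree_tuple(gen_rows, n, p, cell):
    """Return (w_0, ..., w_n), where w_i counts the codewords with a 1 at
    position p and exactly i ones inside the cell (a set of column indices)."""
    k = len(gen_rows)
    p_mask = 1 << (n - 1 - p)                   # bit selecting column p
    cell_mask = 0
    for q in cell:                              # bits selecting the cell's columns
        cell_mask |= 1 << (n - 1 - q)

    counts = [0] * (n + 1)
    word = 0                                    # codeword of the zero message
    for m in range(1, 1 << k):
        # Gray code step: the row to flip is indexed by the lowest set bit
        # of m, so each new codeword costs a single XOR.
        word ^= gen_rows[(m & -m).bit_length() - 1]
        if word & p_mask:                       # word has a 1 at position p
            counts[bin(word & cell_mask).count("1")] += 1
    return tuple(counts)

# Example on the Hamming code above: degree of position 0 with respect to
# the cell {4, 5, 6}.
G = [0b1000011, 0b0100101, 0b0010110, 0b0001111]
print(degree_tuple(G, 7, 0, {4, 5, 6}))
```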
So we have three approaches:
- Make a gigantic bipartite graph and analyze that.
- Use refinements of a small (often difficult to find) invariant set whose symmetries are not much bigger than those of our code.
- Use a small memory profile and submit to more frequent Gray code traversals of the code.
None of these really solves the dimension problem (*); however, the second and particularly the third make it feasible to at least ask the question of a large code. Given the potential of these algorithms for parallelization, the third option becomes particularly appealing, since the overhead of copying data from worker to worker would be small, and the underlying time-consuming operations could at least be cut down by the brute force of many processors.

(*) The dimension problem seems to be unavoidable in all of this, since the crucial difference between an effective and an ineffective refinement process is the ability to tell (loosely speaking) the weights of words in the code. The difficulty of problems that reduce to weights of words in codes is well known...