rlm-blog: August 2008

Wednesday, August 20, 2008

How not to write a paper, part I

I have been reading over Jeffrey Leon's paper on partition backtrack methods, titled

Permutation Group Algorithms Based on Partitions, I: Theory and Algorithms.

What Leon accomplishes in this paper is as powerful as it is incomprehensible, and in reading it I have had so much difficulty that I have decided to use it as a model for how not to write a paper, much in the spirit of a lecture by Serre I once saw on video.

First of all, it is a very bad idea to put "part I" in the title of a paper. I have titled this blog that way, because I am pretty certain that I will be posting a follow-up list. Leon never submitted a follow up paper to this one, and in fact references this non-existent paper several times in the paper I am reading.

This paper is truly a maze of which a minotaur would be jealous. Every definition refers back to Definition XX or Notation XX of Section YY. This is no fault of Leon's, since the editors of the Journal of Symbolic Computation control the headers of the article, but you never know what section you are in anyway, so these references are already difficult to follow. To make matters worse (yes, that's right, Notation is on the same footing as Definition and Theorem), here is an example progression of his numbering from section 6:

...
Definition 19
Example 6
Definitions 20-22
Lemma 9
Definition 23
Proposition 4
Definition 24
Proposition 5
Definition 25
Proposition 6
Corollary
Definition 26
...

That's correct, corollaries are only referenced as "the corollary to Proposition 6" or something like it.

Another good idea if you are trying to win the journal equivalent of an obfuscated C contest is to use lots of notation. Leon starts his article with at least three pages of notation, and to make matters worse, for a large part of the notation, one reads "Notation not specified above may be found in Gorenstein (1968) or in Wielandt (1964)." As if one more table of notation would hurt!

More on notation: no latin letters are ever used for notation, except for groups. Everything is either Greek, Fraktur, or some monstrous calligraphication of Fraktur. More on that: your typos can get pretty interesting when your paper is structured this way. An "R"-base is defined as a sequence of things, ("A_1", ..., "A_{n-1}"), where "" means some kind of Fraktur notation. In this definition, another set of notations is defined, and yet a third sequence of things ("U_1", ..., "U_{n-1}") is referenced to in the "such that" clause that never comes up again. After struggle, the reader (if one has not already jumped off the cliff) realizes that the "U_i" and "A_i" are actually the same object, and the scriptyness of the fonts being used has had the reader thinking that those were "A"s, when in fact they were "U"s all along. The "U_i" were actually a typo of not being hard enough to read, using regular Fraktur instead of super scripty Fraktur!

Now keep in mind, we have almost made it through half the paper, and most of what we've been presented with have been definitions of notation, and complicated concepts with what I find to be simple English explanations (lacking in the paper of course). We haven't even gotten to the "Theory and Algorithms" yet!

All this aside, I should mention that Leon's work in this paper is very important, and I wouldn't be struggling with it so hard if anyone else had published anything like it. We also owe him a bit of thanks for releasing his code under GPL, so that when I eventually go crazy over his notation, I can just look at his source code (written in ASCII, with no weird alphabets) to see what he means. Thanks also to Google, once again, for paying me for this self-torture.

Monday, August 18, 2008

Double cosets

I'm now turning to the task of canonical augmentation, or orderly generation. More simply, the question of generating unique isomorphism representatives of a certain class of finite objects, where the notion of isomorphism comes from the action of some finite permutation group.

One key step in the algorithm is to discover whether or not two pairs of objects are isomorphic via the same isomorphism. More specifically, it needs to be computed whether (P, C) ~ (M, g(C)), where g(C) is the canonical relabeling of C (a superobject of P), and M is defined to be the "canonical parent." This means, is there a permutation h such that h(P) = M and h(C) = g(C). First, this means that P and M must be isomorphic, let's say via d: M = d(P). So once we know d, we can say that we are looking for some permutation which is in the coset dAut(P) and also in the coset gAut(C).

There are two places in the code in Sage which does this, in generating graphs and in generating binary codes. Both are currently implemented ad hoc- graphs are written in Python in a pretty naive implementation, and binary codes actually farm the work out to Gap, which takes over half the CPU time.

This is an example of a double-coset problem, which is one of the types of algorithms I was planning on implementing. Since this can also be used to solve the isomorphism problem substantially faster (at least in the negative-answer case; when they are isomorphic the runtime is the same), I have decided to tackle this problem next.

Once more, thanks to Google for funding my work over the summer.

Wednesday, August 6, 2008

Binary Linear Codes of arbitrary degree

Patch #2 for my Google funded summer work is now finished!

Next, I will start thinking about other structures which should be pretty easy to implement (given what is already written), such as matrix automorphism groups, hypergraphs, and linear codes over other fields.

$ ./sage -t devel/sage/sage/groups/perm_gps/partn_ref/ sage -t devel/sage/sage/groups/perm_gps/partn_ref/refinement_binary.pyx [12.7 s] sage -t devel/sage/sage/groups/perm_gps/partn_ref/refinement_graphs.pyx [43.2 s] sage -t devel/sage/sage/groups/perm_gps/partn_ref/tree_traversal.pyx [2.1 s] ---------------------------------------------------------------------- All tests passed! Total time for all tests: 58.0 seconds $ ./sage -t -long devel/sage/sage/groups/perm_gps/partn_ref/ sage -t -long devel/sage/sage/groups/perm_gps/partn_ref/refinement_binary.pyx [201.8 s] sage -t -long devel/sage/sage/groups/perm_gps/partn_ref/refinement_graphs.pyx [244.4 s] sage -t -long devel/sage/sage/groups/perm_gps/partn_ref/tree_traversal.pyx [3.0 s] ---------------------------------------------------------------------- All tests passed! Total time for all tests: 449.2 seconds

rlm-blog