THE PUZZLE Every language has its own idea of what constitutes a "word." In English "mustard" is a word while "moutarde" is not. In French it is the reverse. This puzzle concerns a recently unearthed manuscript found in a cave in the recently liberated country of Lower Elbonia. It appears to be written in a language so strange that the linguists studying it have dubbed it Gobblidook. It has only two letters, I'll call them a and b, though the puzzle could be generalized to more letters. So far every word encountered in the manuscript has been the same length, though the linguists are remaining quiet about what that length is until they have convinced themselves that no other length occurs in the manuscript. They have however disclosed certain regularities they have observed in the portion of the manuscript they have studied thus far. 1. Every word that uses only one letter has been seen in the manuscript. With only the letters a and b, that just means that they've observed the two words aaa...a and bbb...b somewhere in the manuscript. 2. If one forms a crossword without black squares such that the words across and down have all been seen in the manuscript, then the word formed by the letters running down the diagonal from top left to bottom right can also be found in the manuscript. 3. If one starts to make up any word by writing two letters at two word positions, one can always complete it to a word in the manuscript. So here's the puzzle. Knowing that the linguists have confirmed 1-3 on the basis of what they've examined so far, does there exist a word, of the same length as the ones they've found, that they have not yet encountered in the manuscript, or have they now seen every possible word of that length in the manuscript? For words up to a certain length, the answer is: the latter. To see why you will need to read the puzzle history. The easiest case is when the words are finite. See if you can prove this yourself. A somewhat harder case is when every word goes on forever. (These linguists have been trained to pick up speed as they read a word, spending only half as long looking at any given letter as they did looking at the previous letter, otherwise they'd get hung up on the very first word they looked at.) The case that is not yet known is when the letters of the word are as densely packed as the points on a line. That is, every point on a line of length one foot say has its own letter. This is vastly more dense than a word that merely goes on forever, where each letter can still have say three millimeters of space to itself. A line has uncountably many points, even if it is only finitely long. While this may seem unimaginably dense, here's one way to think of it. Since the alphabet has only two letters such a word can be defined as the same thing as the set of those points on the line at which the letter a is written. The points at which the letter b is written are the points not in that set. (What's a set, you ask? Oh, don't they teach that in school these days? Maybe that's a good thing.) In this super-dense case the puzzle is unsolved.

PUZZLE 1.5: Update January 16, 2004 Still no developments. The prize of $2000 for solving this puzzle remains uncollected. Come on, people, this problem about crossword puzzles is easy to state, so it surely can't be that hard, can it? I spoke on this problem and its background at this month's Annual AMS meeting in Phoenix, AZ, in the session on "The Many Lives of Lattice Theory." The talk was appropriately scheduled right after Dana Scott's talk, since both of us were dealing with the question of how to manufacture comprehensive cartesian closed categories, something of a cottage industry these days. For example Martin Escardo, Alex Simpson, and Jimmie Lawson announced three days ago (Jan. 13) their paper "Comparing cartesian closed categories of (core) compactly generated spaces", available at http://www.dcs.ed.ac.uk/home/als/Research/comparing.pdf, which explores a variety of offshoots of Kelley's original notion of compactly generated Hausdorff spaces, which furnishes homotopy with what Kelley considered a very convenient category. Vaughan Pratt pratt@cs.stanford.edu

PUZZLE 1.5 (as of April 18, 2003). There have been no noteworthy developments regarding Puzzle 1.5 in the past month. It looks like April 18 will be the last update on the puzzle until someone has some progress to report! I'd been planning to start a new puzzle series numbered 2.x, but Tiqit has started to consume too much of my time lately so this will have to wait. Here again is the statement of the puzzle, with a minor rewording in terms of sets of sets instead of predicates on sets for the sake of variety. Let D be a set of subsets of a set A with the following three properties. (i) D contains A and the empty set. (ii) Let C be any set of pairs (a,b), a,b in A, such that for all a in A, the sets {b | (a,b) is in C} and {b | (b,a) is in C} are in D. Then {b | (b,b) is in C} is in D. (iii) (T1) For any two distinct elements a,b of A, D contains a set containing a but not b, and another containing b but not a. The set D consisting of all subsets of A evidently satisfies (i)-(iii). Does any other set of subsets of A do so? Conditions (i)-(ii) define the notion of comonoid, while condition (iii) expresses the T1 property, defined as for topological spaces. The question is equivalent to asking whether every T1 comonoid is discrete. This is true for two special cases: countable T1 comonoids, and T1 comonoids that form directed CPOs (DCPOs). On the other hand it is not the case that every T1 topological space is discrete, or topology would be a very different subject. The most recent noteworthy observation to emerge from the theory-edge discussions of this problem is Mark Aujus' example of a comonoid representing the natural numbers N standardly ordered together with an additional point constrained to be less or equal to the sup of N without being constrained to be less or equal to any specific natural number. This is a kind of order information not representable with an ordinary poset. Seehttp://boole.stanford.edu/pub/comonoids.pdf

for my paper on this topic presented at the Coalgebraic Methods in CS workshop held April 5-6 in Warsaw. See below for details about the prize for solving this problem.

MARCH 7-APRIL 18, 2003: No postings during this period. (Earlier content-sparse March 7 posting deleted.)

FEBRUARY 28, 2003: PUZZLE 1.5 No increase this week. However here's a refinement of the prize structure for the proposition TCD = "All T1 comonoids are discrete." A proof that TCD is consistent with ZFC will earn $1000. A proof that its negation ~TCD is consistent with ZFC will earn $500, and more if a plausible application for any nondiscrete T1 comonoids can be demonstrated. More detailed information measuring the distance from ZFC, e.g. showing an implication in either direction between TCD or ~TCD and some well-known axiom known to be independent of ZFC, will earn additional amounts depending on the axioms used. Amounts for such partial results will be taken out of the total. Thus a proof that ~TCD is consistent with ZFC followed by a proof of ~TCD in ZFC would divide up the $2000 as $500 to the first and $1500 to the second. If in the other order however the whole $2000 would of course go to the proof of ~TCD. Vaughan Pratt

FEBRUARY 21, 2003: PUZZLE 1.5 Ok, last increment for a while: one more $500 brings the jackpot to $2,000. It will stay there until someone solves the puzzle. Vaughan Pratt

FEBRUARY 14, 2003: PUZZLE 1.5 At some point I'm going to have stop this, but for the time being the increment is again $500. Puzzle 1.5 now has $1,500 on its head. All definitions, caveats, etc. from previous puzzles continue to apply. Vaughan Pratt

FEBRUARY 7, 2003: PUZZLE 1.5 (still) Read my lips: no new puzzles until this one is solved. The prize is now $1000. Bear in mind that timing of the first correct entry is determined by when it is posted to theory-edge@yahoogroups.com, not by when you solve it, or when you send it to me. Vaughan Pratt

PS (added FEB. 10): To save newcomers the pain of reassembling Puzzle 1.5 from its fragmentary account below, here's a self-contained statement. Let D be a predicate on a power set 2^A with the following three properties. (i) D holds of A and the empty set. (ii) Let C be any binary predicate on A such that for all a in A, D holds of both {b | C(a,b)} and {b | C(b,a)}. Then D also holds of {b | C(b,b)}. (iii) For any two distinct elements a,b of A, D holds of some subset of A containing a but not b, and of one containing b but not a. D = TRUE evidently satisfies (i)-(iii). Does any other predicate on 2^A?

JANUARY 31, 2003: PUZZLE 1.5 Does there exist a nondiscrete T1 binary comonoid? (Definitions below.) The first correct and soundly reasoned answer posted to theory-edge@yahoogroups.com will win its submitter US$500. Definitions. 1. A word over an alphabet S is a pair (A,f) where A is a set called the *length* of the word, and f is a function f: A -> S giving the letters of the word. Example. S = 2 = {0,1}, A is the set of reals, and f: A -> S is the function mapping rationals to 1 and irrationals to 0. 2-4. As for puzzle 1.4 (which is given below).

Hence this week's puzzle asks to extend Mark's result to words of length strictly longer than N, i.e. uncountably long words.

Hint. First solve the problem for A the set of real numbers. Any solution that works for this example of an uncountable length is almost certainly going to work for any length.

A word of length R over {0,1} is a function from R to 2. This is the same thing as a subset of R, which you can visualize as dots along the real line, one dot per real in the subset, totally unrestricted as to density or distribution. You can put no dots, or dots everywhere along the line, or dots just at integers, or just at the rationals, etc.

A dictionary of words of length R is therefore nothing more or less than a set of sets of reals, such as {{1.4, 3}, {2.78, 9.1}} (but of course that's merely a finite example). Equivalently it is a set of arbitrarily dotted lines.

A crossword then amounts to dots in the plane, such that for each vertical or horizontal line, the dots on that line are those of some word in the dictionary. The problem asks about the line y = x: if the dots along it always form a dictionary word, is there a word (subset of R) that the dictionary can omit?

JANUARY 24, 2003: PUZZLE 1.4 For dictionaries with infinite words, does there exist a nondiscrete T1 binary comonoid? (Definitions below.) The first correct and soundly reasoned answer posted to theory-edge@yahoogroups.com will win its submitter US$90. Definitions. 1. An infinite word is an infinite sequence a0,a1,a2,... of letters. Example: 31415926535... is an infinite sequence whose letters are from the alphabet {0,1,2,...,9}. 2. A dictionary of words of a given length over a given alphabet is *discrete* when it contains every word of that length over that alphabet. Examples: Only the dictionary on the right is discrete, the one on the left lacks the word 00. 011 0011 101 0101 3. A *comonoid* over a given alphabet is a dictionary of words all of the same length that includes all constant words, and having the property that for every crossword whose rows and columns are in the dictionary, its main diagonal is also in the dictionary. (See Puzzle 1.2 for examples.) 4. A binary dictionary D is T1 when for any pair a,b of points there are two words in D such at point a, one word is 0 and the other 1, while at b it is the other way round. Example: the dictionary on the left with words 01, 10, and 11, 011 011 101 001 is T1 because we only have two points to worry about, corresponding to the first and second rows, and in the first word (column) we have 0 then 1 while in the second we have 1 then 0, forming a 2x2 checkerboard. The one on the right, with words 00, 10, and 11, is not T1 since the second row is <= the first row; equivalently there is no "checkerboard" pattern of two 0s and two 1s anywhere in those two rows. Vaughan Pratt

Binary of course means over the alphabet {0,1}. So Puzzle 1.2 was about finite binary comonoids. The T1 property ensures that there can be no two points a,b such that either a <= b or b <= a when compared bit by bit. Since finite T0 comonoids are just partial orders, strengthening T0 to T1 prevents nontrivial order, and an unordered set allows all possible cuts, whence every finite T1 binary comonoid is discrete. The puzzle asks whether this extends to infinite comonoids.

*Significance*. Topological spaces can be T1 (have only trivial =
discrete order) without being topologically trivial = discrete
topology.

Complete partial orders (CPOs), on the other hand, by definition obtain any topological structure they might have from their order structure. A CPO is a partial order X every directed set A of which has a least upper bound (in X but not necessarily in A). (A subset A of a partially ordered set X is called *directed* when it is nonempty and for any pair a,b in A there exists c in A such that a <= c and b <= c. Minipuzzle: any finite directed set must have a maximal element, whence all finite directed sets A not only have a least upper bound but contain it. Hence every finite partial order is automatically a CPO.) When the least upper bound a of a directed set A is not in A, only possible for infinite A, the Scott topology on that CPO disallows the cut separating a from A, that is, it insists on a being the limit of A.

The only directed sets of a T1 CPO (one whose order is just the identity relation a = a) are the singletons, and such a CPO therefore cannot have nontrivial topological structure.

So what we're really asking here is, do infinite comonoids go the way of topological spaces or CPOs when the T1 condition is imposed? Can they have structure without order or must they lose it all?

SOLUTION TO PUZZLE 1.4, by Mark (aujus_1066 AT yahoo.com), in 6h 10m. Lemma: If D is a T1 binary comonoid with (countably) infinite words, then for all finite prefixes p, there exists a word w in D with prefix p. Proof: Index the positions of words starting at 1. Let t(i,j) be a word in D whose i-th position is 1 and whose j-th position is 0. Such a word is guaranteed to be in D because of the T1 property. Let k be the length of p, and let i be an index between 1 and k. Define m(i,k) to be the AND of {t(i,j) : 1 <= j <= k, j != i}. Then m(i,k) is in D, because D is closed under finite AND, and the i-th position of m(i,k) is 1 while the other positions between 1 and k are 0. Now, let w be the OR of {m(i,k) : 1 <= i <= k, i-th position of p is 1}. Then for 1 <= i <= k, the i-th position of w is 1 iff the i-th position of p is 1. Hence w has prefix p, and w is in D because D is closed under finite OR. QED Lemma: Let D be a T1 binary comonoid with (countably) infinite words, and let w be an arbitrary (countably) infinite word. Then w is in D. Proof: We construct a (symmetric) crossword whose rows and columns are all in D, and whose diagonal is w. Hence, by the comonoid property, w will be in D. Let r(i) be the i-th row (and also column) of the crossword, and let r(i)_j be the bit at the j-th position of r(i). Construct the crossword as follows: r(1) := word in D with prefix w_1, r(2) := word in D with prefix r(1)_2 w_2, r(3) := word in D with prefix r(1)_3 r(2)_3 w_3, ... r(n) := word in D with prefix r(1)_n r(2)_n ... r(n-1)_n w_n, ... Clearly, by construction, the diagonal of the crossword is w, so we have r(i)_i = w_i for all i. Also, each row of the crossword is in D. It remains to show that the crossword is symmetric, which will guarantee that all the columns of the crossword are also in D. For 1 <= i < n, r(n)_i is the i-th bit of the prefix r(1)_n r(2)_n ... r(n-1)_n w_n, which is r(i)_n. Hence r(n)_i = r(i)_n. QED Corollary: Every T1 binary comonoid with (countably) infinite words is discrete.

JANUARY 17, 2003: PUZZLE 1.3 Square Boolean crosswords. Given an integer n > 0 and a set D of n-bit vectors, is there an nxn crossword whose rows and columns are all in D? Examples (n=3): D = {101,110}: no. D = {001,011,100}: yes, [100,001,011]. This problem is either NP-hard or in P. For US$100, which and why? Answers to theory-edge@yahoogroups.com Vaughan Pratt

Mark solved this puzzle in favor of NP-hard in a mere 82.5 hours, after much time spent "trying to prove this problem is in P by generalizing the weight=2 case and finding an 'augmenting path'-type algorithm and/or a precise characterization a la Hall's theorem."

...here is a reduction from 3DM that took 5 minutes to figure out and an hour to prove, in my sleep-deprived state.(Luckily for Mark he did not "sleep on it", since Mike Robson gave an elegant reduction from the three-letter square case a mere 3.5 hours later.) Continuing with Mark's solution, ...

Given 3n villagers, affectionately called 1,2,..,3n, where the first sex is from 1 to n, the second sex is from n+1 to 2n, and the third sex is from 2n+1 to 3n, and a set of acceptable triplings for them, form a dictionary with words of length 3n over the alphabet {0,1} as follows. The dictionary consists of one word of length 3n per acceptable tripling. This word has three 1's, which are put at the positions corresponding to the three villagers in this tripling (this is why we asked them for their names ahead of time). Claim: There is a 3n x 3n square crossword puzzle compatible with this dictionary if and only if all the villagers can be tripled up acceptably. Proof (only if): The first n rows of the crossword puzzle give an acceptable tripling that we can read off. Because the rows are in the dictionary which only contained acceptable triples, the only thing that remains to be shown is that each villager appears exactly once in this solution. But each column is in the dictionary as well, which has the property that in the first n bits of each word, there is exactly one 1. Hence no villager can appear twice or be omitted from the solution. Proof (if): Given the mayor's assignment, this provides a partition of the all 1 word, which means that the dictionary is 1-coverable, and hence crosswordable. In fact, since every 1-coverable dictionary is symcrosswordable, this means that the problem of finding a symmetric square crossword is NP-complete as well. QED Mark, going to bed for four and a half hours before I have to get up to take my daughter to school.

JANUARY 10, 2003, 21:00 GMT: PUZZLE 1.2 Diagonal crossword problem. Given an integer n > 0 and a dictionary (i.e. a set) D of n-bit binary words, including the two constant words 00...0 and 11...1, is there an nxn crossword whose rows and columns are all in D but whose main diagonal (across and down) is not in D? Examples. D = {000,010,100,111}: yes, diag[100,010,000] = 110. D = {00,01,11}: no. This problem is either NP-hard or in P. For US$70, which and why? Answers to theory-edge@yahoogroups.com Vaughan Pratt

After people had made various "gaming-the-system" guesses as to which of NP-hard or P was correct, Mark (aujus_1066 at yahoo.com) again was the first to submit a correct answer.

It is interesting to compare Puzzles 1.1 and 1.2 for time to solution vs. complexity of answer. 1.1 took Mark 35 minutes, while 1.2 took him 8 hours 55 minutes. The solution to 1.1 was a dynamic programming algorithm similar to Floyd's all-shortest-paths algorithm or Cocke-Kasami-Younger (CKY) context-free parsing. The solution to 1.2 was to prove that the problem was equivalent to asking whether the dictionary was closed under AND and OR.

I'm guessing that Puzzle 1.3 will continue this pattern: even harder to solve than 1.2 (so it will be for more), but with a solution just as short, simple, and obvious in retrospect, as those of 1.2 and 1.1. I'll be delighted if someone proves me wrong on the difficulty of solving it.

From: "aujus_1066" To: theory-edge@yahoogroups.com Date: Sat, 11 Jan 2003 05:54:55 -0000 Subject: [theory-edge] Re: Puzzle 1.2 Here is pseudocode for an O((d^3)*n) algorithm, where d is the cardinality of the dictionary D: bool answer, found_or, found_and; bool w1[n], w2[n], w3[n], word_or[n], word_and[n]; answer := true; for w1 in D do __ for w2 in D do _____ word_or := w1 | w2; // bitwise or _____ word_and := w1 & w2; // bitwise and _____ found_or := false; _____ found_and := false; _____ for w3 in D do ________ if (w3 == word_or) then ___________ found_or := true; ________ end if; ________ if (w3 == word_and) then ___________ found_and := true; ________ end if; _____ end for; _____ if ((!found_or) || (!found_and)) then ________ answer := false; _____ end if; __ end for; end for; return !answer; Basically, the algorithm is testing if the dictionary is closed under "bitwise and" and "bitwise or", and then returning "no" if it is and "yes" if it isn't. It relies on the following three lemmas: Lemma 1: If D is not closed under "bitwise and", then there exists a crossword puzzle whose rows and columns are in D, but whose diagonal is not in D. Proof: Let a[n] and b[n] be two words in D such that a&b is not in D. Then form the crossword c[i][j] = a[i] & b[j]. Each row c[i][*] is either b or the all zero word, depending on whether a[i] is 1 or 0. Similarly, each column c[*][j] is either a or the all zero word, depending on whether b[j] is 1 or 0. Finally, the diagonal is a&b, which is not in D. Lemma 2: If D is not closed under "bitwise or", then there exists a crossword puzzle whose rows and columns are in D, but whose diagonal is not in D. Proof: Let a[n] and b[n] be two words in D such that a|b is not in D. Then form the crossword c[i][j] = a[i] | b[j]. Each row c[i][*] is either b or the all one word, depending on whether a[i] is 0 or 1. Similarly, each column c[*][j] is either a or the all one word, depending on whether b[j] is 0 or 1. Finally, the diagonal is a|b, which is not in D. Lemma 3: If D is closed under "bitwise and" and "bitwise or", then it follows that for any crossword puzzle whose rows and columns are all in D, the diagonal of the crossword puzzle is also in D. Proof: It holds by inspection for n=1 and n=2. For n >= 3, we prove the lemma by induction. Deleting the i-th bit of each word in D, the induction hypothesis shows the diagonal with its i-th bit deleted is in D. Thus, the only way that full diagonal is not in D is if the diagonal with the i-th bit flipped is in D. This must hold for each i. Now, if the diagonal has two 0's, then "bitwise and" of flipping each of those 0's is the original diagonal, and thus in D. If the diagonal has two 1's, then the "bitwise or" of flipping each of those 1's is the original diagonal, and thus in D. But since n >= 3, one of those two cases must occur. Mark

Postscript (by Vaughan). Mark proves Lemma 3 by induction, which is fine. However it raises the question of whether induction is essential. The following proof avoids induction, and moreover works even for dictionaries with infinitely many infinitely long words, provided only that it be closed under AND and OR of *any* (arbitrary) set of words. (So if the dictionary is infinite, the AND or OR of any infinite set of words from the dictionary must be in the dictionary.)

Lemma 3. If dictionary D is closed under arbitrary AND and OR, and contains the two constant words 00...0 and 11...1, then any crossword built from D has its diagonal also in D. DICTIONARY PREPROCESSING CONSIDERATION. For each i in 1..n, D contains a minimum word m^i having a 1 in position i. Proof: Obtain m^i as the AND of all words w in D with w_i = 1. QED (Since 11...1 is in D, we never have to form the AND of the empty set.) MAIN ARGUMENT, ONE DIRECTION. Let M[1..n,1..n] be a crossword on D with diagonal word d (d_i = M[i,i]). Let e be the OR of all m^i for which d_i = 1. Exercise. d <= e (that is, any 1 in d appears in e at the same position). (Note that e was built by ANDing and ORing words of D, so must be in D.) MAIN ARGUMENT, OTHER DIRECTION. It suffices to show e <= d. Let e_j = 1; we want to show d_j = 1. For some k such that d_k = 1, the j-th bit of m^k must be 1 (look at how e was constructed). Hence *************************************************** * every word in the dictionary that has a 1 at * (1) * position k must have a 1 at position j as well. * *************************************************** (Look at how m_k was constructed.) Since d_k = 1, M[k,k] = 1. But then M[j,k] = 1 (apply (1) to the k-th column of M). But then M[j,j] = 1 (apply (1) to the j-th row of M). That is, d_j = 1. QED Vaughan

JANUARY 3, 2003, 21:00 GMT: PUZZLE 1.1 String-cutting Problem. Given positive integers k and d1 < d2 < ... < dn, can a piece of string be cut into k pieces sequentially (i.e. with k-1 cuts one after another) such that each of the 2k-1 pieces of string encountered along the way has length di for some i in 1..n? (Examples: For 1 < 3 < 6 < 8, the maximum possible k is 2---start with 6 and cut in half to yield 2 pieces---but increases to 3 if 8 is reduced to 7---start with 7, cut off 1 then 3, yielding 3 pieces.) Further clarification: each of the k-1 cuts can be applied to any of the pieces of string obtained thus far (as distinct from only cutting pieces off the end of the main piece). Example: For 1 < 3 < 6 < 10 < 14, the maximum possible k is 2---start with 6 and cut in half to yield 2 pieces---but increases to k=6 if 10 is reduced to 7---start with 14, cut in half to yield two 7's, then cut each of the 7's into 1+3+3 as in the previous example. Assuming the numbers are written in binary, determine whether this problem is NP-hard or in P (it is definitely one or the other). Answers to theory-edge@yahoogroups.com US$50 for the first soundly reasoned correct answer, as judged by a reasonable consensus of the theory-edge readership. Vaughan Pratt

The first correct solution (except for a missing end-if, no biggy) was
submitted within 35 minutes by Mark Aujus, aujus_1066 at yahoo.com.
Here's Mark's O(n^{3}) algorithm as coded up in C by "wahchelc":

#includeMark in turn passed his winnings on to the first person to improve Mark's solution to O(n#define ARRAYLENGTH 4 int d[ARRAYLENGTH]={1,3,6,8}; int v[ARRAYLENGTH]; int main(){ int k = 2; int j1,j2; int i; bool answer = false; for(i=0;i < ARRAYLENGTH;i++){ v[i]=1; for(j1=0;j1 <= i-1;j1++){ for(j2= 0;j2 <= i-1;j2++){ if((d[j1]+d[j2]) == d[i]) if(v[i] < v[j1]+v[j2]) v[i] = v[j1]+v[j2]; } } if(v[i] >= k) answer = true; } return 0; }