<h1 id="collatz-like-behavior-of-busy-beavers">Collatz-like behavior of Busy Beavers</h1>
<p><em>2021-07-17</em></p>
<p>Nick Drozd just <a href="https://nickdrozd.github.io/2021/07/11/self-cleaning-turing-machine.html">announced</a> a new 4x2 <a href="/2021/03/06/beeping-busy-beaver/">Beeping Busy Beaver</a> champion (<code class="language-plaintext highlighter-rouge">BBB(4, 2)</code>). This is a 4-state, 2-symbol Turing Machine which runs for <strong>32,779,478</strong> steps only to produce a blank tape and then run off to the left forever in state <code class="language-plaintext highlighter-rouge">C</code>. As far as I can tell, the previous record for <code class="language-plaintext highlighter-rouge">BBB(4, 2)</code> was 66,349 steps, so this is quite an improvement!</p>
<p>This machine is also a textbook example of Collatz-like behavior, which <a href="https://arxiv.org/abs/0906.3749">Pascal Michel has shown</a> to be quite common in BB champions and candidates. I love this sort of analysis so much because it takes these opaque Turing Machines and expresses what they are doing in a simple structured way and shows how these BB candidates are taking advantage of the machinery of an open problem in mathematics (and a little luck) to run for extremely long runtimes.</p>
<h2 id="the-machine">The Machine</h2>
<p>This Turing Machine is defined by the transition table:</p>
<table>
<thead>
<tr>
<th style="text-align: center"> </th>
<th style="text-align: center">0</th>
<th style="text-align: center">1</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: center">A</td>
<td style="text-align: center">1RB</td>
<td style="text-align: center">1LD</td>
</tr>
<tr>
<td style="text-align: center">B</td>
<td style="text-align: center">1RC</td>
<td style="text-align: center">1RB</td>
</tr>
<tr>
<td style="text-align: center">C</td>
<td style="text-align: center">1LC</td>
<td style="text-align: center">1LA</td>
</tr>
<tr>
<td style="text-align: center">D</td>
<td style="text-align: center">0RC</td>
<td style="text-align: center">0RD</td>
</tr>
</tbody>
</table>
<p>Or <code class="language-plaintext highlighter-rouge">1RB 1LD 1RC 1RB 1LC 1LA 0RC 0RD</code> in the one-line format that Nick and I use.</p>
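If you would like to follow along at home, a direct simulator for this one-line format only takes a few lines of Python. This is just a toy sketch of my own (not the actual Visual_Simulator code); all names here are illustrative:

```python
from collections import defaultdict

def parse_tm(one_line):
    """Parse the one-line format into {(state, symbol): (write, move, next_state)}."""
    fields = one_line.split()
    table = {}
    states = "ABCD"  # 4 states, 2 symbols
    for i, field in enumerate(fields):
        state, symbol = states[i // 2], str(i % 2)
        table[(state, symbol)] = (field[0], field[1], field[2])
    return table

def simulate(table, num_steps):
    """Run the TM from a blank tape; return (state, head position, tape dict)."""
    tape = defaultdict(lambda: "0")
    state, pos = "A", 0
    for _ in range(num_steps):
        write, move, state = table[(state, tape[pos])]
        tape[pos] = write
        pos += 1 if move == "R" else -1
    return state, pos, tape

table = parse_tm("1RB 1LD 1RC 1RB 1LC 1LA 0RC 0RD")
state, pos, tape = simulate(table, 2)
# After 2 steps the machine has written two 1s and is in state C just past
# them — the configuration g(2, 0) that will come up again below.
```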
<p>You can see how it performs <a href="http://turingmachinesimulator.com/shared/hrtjbntayc">in an online simulator</a> or using my <a href="https://github.com/sligocki/busy-beaver/blob/master/Code/Visual_Simulator">Visual_Simulator</a> which looks like:</p>
<p><img src="/assets/images/bbb42_vis_sim.png" alt="Visual_Simulator output" /></p>
<p>Here, each line is the tape at the step number specified at left. The upper-case letter is the current state and head position. Red means 1 on the tape, White means 0.</p>
<h2 id="tape-compression">Tape Compression</h2>
<p>From either of those simulations we can see that the machine performs a sort of two-stage process: it moves back and forth between the right side and a middle point, each iteration shifting that middle point towards the left; when the middle point reaches the left side, the machine resets in some way and starts over. This pattern is pretty common among Busy Beaver candidates.</p>
<p>The direct simulations above are great for observing behavior, but become a bit tedious when trying to analyze that behavior. Instead, we often find it useful to use <a href="https://en.wikipedia.org/wiki/Run-length_encoding">run-length encoding</a> on the tape (I first learned about this technique from <a href="http://turbotm.de/~heiner/BB/macro.html#6">Heiner Marxen’s excellent article about Macro Machines</a>). Consider the last line in the image above (Visual_Simulator output at step 71). If we were to write that configuration out in text we could write it as:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>... 0 1 1 1 1 0 0 0 0 0 (C0) 1 1 1 1 1 0 ...
</code></pre></div></div>
<p>In other words, the TM is in state C, reading head over a 0. To the right are 5 1s and then infinitely many 0s. To the left are 5 0s, then 4 1s, then infinitely many 0s. Inspired by this English language description, we can write the tape in a compressed format:</p>
<p><code>
0<sup>∞</sup> 1<sup>4</sup> 0<sup>5</sup> (C0) 1<sup>5</sup> 0<sup>∞</sup>
</code></p>
<p>We will use a slight modification of this format:</p>
<p><code>
0<sup>∞</sup> 1<sup>4</sup> 0<sup>6</sup> <b><C</b> 1<sup>5</sup> 0<sup>∞</sup>
</code></p>
<p>The difference is that we represent the TM head as moving towards the block of 6 0s rather than sitting on top of one of them with 5 to its left.</p>
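As a toy example of this compression (a sketch with names of my own choosing), run-length encoding the step-71 tape is a one-liner with <code>itertools.groupby</code>:

```python
from itertools import groupby

def rle(tape):
    """Compress a raw tape into a list of (symbol, count) blocks."""
    return [(sym, len(list(run))) for sym, run in groupby(tape)]

# The step-71 tape: 1^4 0^6 1^5 (with the infinite 0s on each end omitted).
blocks = rle([1] * 4 + [0] * 6 + [1] * 5)
# → [(1, 4), (0, 6), (1, 5)]
```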
<h2 id="chain-step">Chain Step</h2>
<p>Now that we are using this tape compression technique let’s see what it looks like if we continue our simulation starting at step 71 (from above):</p>
<p><samp>
71: 0<sup>∞</sup> 1<sup>4</sup> 0<sup>6</sup> <b><C</b> 1<sup>5</sup> 0<sup>∞</sup><br />
72: 0<sup>∞</sup> 1<sup>4</sup> 0<sup>5</sup> <b><C</b> 1<sup>6</sup> 0<sup>∞</sup><br />
73: 0<sup>∞</sup> 1<sup>4</sup> 0<sup>4</sup> <b><C</b> 1<sup>7</sup> 0<sup>∞</sup><br />
74: 0<sup>∞</sup> 1<sup>4</sup> 0<sup>3</sup> <b><C</b> 1<sup>8</sup> 0<sup>∞</sup><br />
</samp></p>
<p>Can you see a pattern here? Because we have the transition rule <code class="language-plaintext highlighter-rouge">C0 → 1LC</code>, we can see that: <code>0<sup>n</sup> <b><C</b></code> → <code><b><C</b> 1<sup>n</sup></code> for any <code class="language-plaintext highlighter-rouge">n</code>. In other words, when the Turing Machine is in state <code class="language-plaintext highlighter-rouge">C</code>, traveling left into a block of 0s, it will travel straight through, converting them all to 1s and continuing on in state <code class="language-plaintext highlighter-rouge">C</code> to the left of that block. This is our first Turing Machine simulation acceleration technique! I call this a <strong>chain step</strong>, because it allows us to jump over entire “chains” of symbols.</p>
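This chain rule is easy to mechanize on the compressed tape. Here is a toy sketch of the specific rule <code>0^n &lt;C → &lt;C 1^n</code> (the representation and names are mine, not the actual accelerated simulator):

```python
def chain_step_left_C(left, right):
    """Apply the chain step for `C0 -> 1LC`: the head, moving left in state C,
    eats a whole block of 0s and leaves a block of 1s to its right.
    Tape halves are lists of (symbol, count) blocks, with the block nearest
    the head last in `left` and first in `right`.
    Returns (new_left, new_right, steps_taken)."""
    sym, n = left[-1]
    assert sym == 0, "chain step only applies to a block of 0s"
    new_left = left[:-1]
    # Merge the n new 1s with an adjacent block of 1s, if present.
    if right and right[0][0] == 1:
        new_right = [(1, n + right[0][1])] + right[1:]
    else:
        new_right = [(1, n)] + right
    return new_left, new_right, n

# Step 71 configuration: 0^∞ 1^4 0^6 <C 1^5 0^∞
left, right, steps = chain_step_left_C([(1, 4), (0, 6)], [(1, 5)])
# → left = [(1, 4)], right = [(1, 11)], in 6 single steps
```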
<h2 id="rule-step">Rule Step</h2>
<p>Armed with tape compression and chain steps, we can see some higher-level patterns much more easily. Jumping ahead a bit to step 138:</p>
<p><samp>
138: 0<sup>∞</sup> 1<sup>16</sup> 0<sup>5</sup> <b>C></b> 0<sup>∞</sup><br />
139: 0<sup>∞</sup> 1<sup>16</sup> 0<sup>5</sup> <b><C</b> 1<sup>1</sup> 0<sup>∞</sup><br />
144: 0<sup>∞</sup> 1<sup>16</sup> <b><C</b> 1<sup>6</sup> 0<sup>∞</sup> (via chain step)<br />
145: 0<sup>∞</sup> 1<sup>15</sup> <b><A</b> 1<sup>7</sup> 0<sup>∞</sup><br />
146: 0<sup>∞</sup> 1<sup>14</sup> <b><D</b> 1<sup>8</sup> 0<sup>∞</sup><br />
147: 0<sup>∞</sup> 1<sup>13</sup> 0<sup>1</sup> <b>D></b> 1<sup>8</sup> 0<sup>∞</sup><br />
155: 0<sup>∞</sup> 1<sup>13</sup> 0<sup>9</sup> <b>D></b> 0<sup>∞</sup> (via chain step)<br />
156: 0<sup>∞</sup> 1<sup>13</sup> 0<sup>10</sup> <b>C></b> 0<sup>∞</sup><br />
</samp></p>
<p>So, we are back in the same configuration except that there are 3 fewer 1s and 5 more 0s. This behavior is, in fact, independent of the exact values of the exponents. Let’s see by generalizing the exponents:</p>
<p><samp>
0<sup>∞</sup> 1<sup>n+3</sup> 0<sup>m</sup> <b>C></b> 0<sup>∞</sup><br />
0<sup>∞</sup> 1<sup>n+3</sup> 0<sup>m</sup> <b><C</b> 1<sup>1</sup> 0<sup>∞</sup> (in 1 step)<br />
0<sup>∞</sup> 1<sup>n+3</sup> <b><C</b> 1<sup>m+1</sup> 0<sup>∞</sup> (in m steps, m+1 total)<br />
0<sup>∞</sup> 1<sup>n+2</sup> <b><A</b> 1<sup>m+2</sup> 0<sup>∞</sup> (in 1 step, m+2 total)<br />
0<sup>∞</sup> 1<sup>n+1</sup> <b><D</b> 1<sup>m+3</sup> 0<sup>∞</sup> (in 1 step, m+3 total)<br />
0<sup>∞</sup> 1<sup>n</sup> 0<sup>1</sup> <b>D></b> 1<sup>m+3</sup> 0<sup>∞</sup> (in 1 step, m+4 total)<br />
0<sup>∞</sup> 1<sup>n</sup> 0<sup>m+4</sup> <b>D></b> 0<sup>∞</sup> (in m+3 steps, 2m+7 total)<br />
0<sup>∞</sup> 1<sup>n</sup> 0<sup>m+5</sup> <b>C></b> 0<sup>∞</sup> (in 1 step, 2m+8 total)<br />
</samp></p>
<p>So, we now have a general rule that <code>0<sup>∞</sup> 1<sup>n+3</sup> 0<sup>m</sup> <b>C></b> 0<sup>∞</sup></code> → <code>0<sup>∞</sup> 1<sup>n</sup> 0<sup>m+5</sup> <b>C></b> 0<sup>∞</sup></code> in <code class="language-plaintext highlighter-rouge">2m+8</code> steps (as long as <code class="language-plaintext highlighter-rouge">n, m ≥ 0</code>). Rather than writing out this configuration completely each time, let’s define <code>g(n, m) = 0<sup>∞</sup> 1<sup>n</sup> 0<sup>m</sup> <b>C></b> 0<sup>∞</sup></code>. Now we can restate our rule as:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>g(n+3, m) → g(n, m+5) in 2m+8 steps
</code></pre></div></div>
<p>(Note that this rule only applies for <code class="language-plaintext highlighter-rouge">n, m ≥ 0</code>. Rather than writing that out every time, I will simply use the convention that all rules apply for all variables <code class="language-plaintext highlighter-rouge">≥ 0</code>. This is also why I wrote the rule as <code class="language-plaintext highlighter-rouge">g(n+3, m) → g(n, m+5)</code> rather than <code class="language-plaintext highlighter-rouge">g(n, m) → g(n-3, m+5)</code>.)</p>
<p>I call this a <strong>rule step</strong>.</p>
<h3 id="repeated-application-of-rule-steps">Repeated Application of Rule Steps</h3>
<p>This rule is especially nice because you can apply it repeatedly. So for example, going back to step 138 where our TM configuration is <code>g(16, 5) = 0<sup>∞</sup> 1<sup>16</sup> 0<sup>5</sup> <b>C></b> 0<sup>∞</sup></code> we can see the evaluation:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>g(16, 5) → g(13, 10) → g(10, 15) → g(7, 20) → g(4, 25) → g(1, 30)
</code></pre></div></div>
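Mechanically, repeated rule application looks like the following toy sketch (again my own names, not my actual simulator code):

```python
def apply_rule(n, m):
    """One rule step: g(n+3, m) -> g(n, m+5) in 2m+8 base steps (needs n >= 3)."""
    assert n >= 3
    return n - 3, m + 5, 2 * m + 8

def run_rule(n, m):
    """Apply the rule until n < 3, tracking configurations and total base steps."""
    total, configs = 0, [(n, m)]
    while n >= 3:
        n, m, steps = apply_rule(n, m)
        total += steps
        configs.append((n, m))
    return configs, total

configs, total = run_rule(16, 5)
# configs → [(16, 5), (13, 10), (10, 15), (7, 20), (4, 25), (1, 30)]
```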
<p>In fact, rather than manually applying this function multiple times, we can jump straight from the beginning to the end!</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>g(3k + r, m) → g(r, 5k + m)
</code></pre></div></div>
<p>In other words, applying our rule <code class="language-plaintext highlighter-rouge">k</code> times removes <code class="language-plaintext highlighter-rouge">3k</code> from the first parameter (the 1s count) and adds <code class="language-plaintext highlighter-rouge">5k</code> to the second parameter (the 0s count).</p>
<h4 id="computing-steps">Computing steps</h4>
<p>As a small aside, how many steps does this take? That is a little more grungy. Remember that <code class="language-plaintext highlighter-rouge">g(a+3, b) → g(a, b+5) in 2b+8 steps</code>, but as we repeatedly apply this rule the value of the second parameter (<code class="language-plaintext highlighter-rouge">b</code>) is changing! So, to compute the steps from <code class="language-plaintext highlighter-rouge">g(3k + r, m) → g(r, 5k + m)</code> we must compute this series:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>sum(2b + 8
where b = 5j + m
for j in 0..k-1)
= sum(10j + 2m + 8
for j in 0..k-1)
= 10 * sum(j for j in 0..k-1) + (2m + 8) * k
= 10 * k * (k-1) / 2 + (2m + 8) * k
= 5k^2 + (2m + 3) k
</code></pre></div></div>
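We can sanity-check this closed form against the term-by-term sum with a couple of throwaway functions:

```python
def steps_closed_form(k, m):
    """Total base steps for k applications of the rule from g(3k + r, m)."""
    return 5 * k * k + (2 * m + 3) * k

def steps_by_summing(k, m):
    """The same count, summed term by term: the j-th application costs 2(5j + m) + 8."""
    return sum(2 * (5 * j + m) + 8 for j in range(k))

# e.g. k = 5, m = 5 (the step-138 example, g(16, 5) → g(1, 30)):
# both give 190 steps.
```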
<h2 id="collatz-like-behavior">Collatz-like Behavior</h2>
<p>We have come a long way towards understanding the behavior of this TM; the remaining open question is what happens to <code class="language-plaintext highlighter-rouge">g(n, m)</code> when <code class="language-plaintext highlighter-rouge">n < 3</code>. For this we must go back to the tape representation. We will consider each case separately:</p>
<ul>
<li><code>g(0, m) = 0<sup>∞</sup> 1<sup>0</sup> 0<sup>m</sup> <b>C></b> 0<sup>∞</sup> = 0<sup>∞</sup> <b>C></b> 0<sup>∞</sup></code></li>
</ul>
<p><samp>
0<sup>∞</sup> <b>C></b> 0<sup>∞</sup><br />
0<sup>∞</sup> <b><C</b> 1<sup>1</sup> 0<sup>∞</sup><br />
<b><C</b> 1<sup>∞</sup> 0<sup>∞</sup><br />
</samp></p>
<p>This last step is a bit tongue-in-cheek, but effectively, at this point the TM runs forever to the left in state <code class="language-plaintext highlighter-rouge">C</code> trailing 1s. Since it remains in state <code class="language-plaintext highlighter-rouge">C</code> forever after, it has “quasihalted” at this point!</p>
<ul>
<li><code>g(1, m) = 0<sup>∞</sup> 1<sup>1</sup> 0<sup>m</sup> <b>C></b> 0<sup>∞</sup></code></li>
</ul>
<p><samp>
0<sup>∞</sup> 1<sup>1</sup> 0<sup>m</sup> <b>C></b> 0<sup>∞</sup><br />
0<sup>∞</sup> 1<sup>1</sup> 0<sup>m</sup> <b><C</b> 1<sup>1</sup> 0<sup>∞</sup> (in 1 step)<br />
0<sup>∞</sup> 1<sup>1</sup> <b><C</b> 1<sup>m+1</sup> 0<sup>∞</sup> (in m steps, m+1 total)<br />
0<sup>∞</sup> <b><A</b> 1<sup>m+2</sup> 0<sup>∞</sup> (in 1 step, m+2 total)<br />
0<sup>∞</sup> 1<sup>1</sup> <b>B></b> 1<sup>m+2</sup> 0<sup>∞</sup> (in 1 step, m+3 total)<br />
0<sup>∞</sup> 1<sup>m+3</sup> <b>B></b> 0<sup>∞</sup> (in m+2 steps, 2m+5 total)<br />
0<sup>∞</sup> 1<sup>m+4</sup> <b>C></b> 0<sup>∞</sup> (in 1 step, 2m+6 total)<br />
</samp></p>
<p>So, <code class="language-plaintext highlighter-rouge">g(1, m) → g(m+4, 0)</code> in <code class="language-plaintext highlighter-rouge">2m + 6</code> steps.</p>
<ul>
<li><code>g(2, m) = 0<sup>∞</sup> 1<sup>2</sup> 0<sup>m</sup> <b>C></b> 0<sup>∞</sup></code></li>
</ul>
<p><samp>
0<sup>∞</sup> 1<sup>2</sup> 0<sup>m</sup> <b>C></b> 0<sup>∞</sup><br />
0<sup>∞</sup> 1<sup>2</sup> 0<sup>m</sup> <b><C</b> 1<sup>1</sup> 0<sup>∞</sup> (in 1 step)<br />
0<sup>∞</sup> 1<sup>2</sup> <b><C</b> 1<sup>m+1</sup> 0<sup>∞</sup> (in m steps, m+1 total)<br />
0<sup>∞</sup> 1<sup>1</sup> <b><A</b> 1<sup>m+2</sup> 0<sup>∞</sup> (in 1 step, m+2 total)<br />
0<sup>∞</sup> <b><D</b> 1<sup>m+3</sup> 0<sup>∞</sup> (in 1 step, m+3 total)<br />
0<sup>∞</sup> <b>C></b> 1<sup>m+3</sup> 0<sup>∞</sup> (in 1 step, m+4 total)<br />
0<sup>∞</sup> <b><A</b> 1<sup>m+3</sup> 0<sup>∞</sup> (in 1 step, m+5 total)<br />
0<sup>∞</sup> 1<sup>1</sup> <b>B></b> 1<sup>m+3</sup> 0<sup>∞</sup> (in 1 step, m+6 total)<br />
0<sup>∞</sup> 1<sup>m+4</sup> <b>B></b> 0<sup>∞</sup> (in m+3 steps, 2m+9 total)<br />
0<sup>∞</sup> 1<sup>m+5</sup> <b>C></b> 0<sup>∞</sup> (in 1 step, 2m+10 total)<br />
</samp></p>
<p>So, <code class="language-plaintext highlighter-rouge">g(2, m) → g(m+5, 0)</code> in <code class="language-plaintext highlighter-rouge">2m + 10</code> steps.</p>
<h3 id="collatz-like-function">Collatz-like Function</h3>
<p>Putting all of these together we can see that once the TM gets into any configuration of the form <code class="language-plaintext highlighter-rouge">g(n, m)</code>, we can completely describe its behavior using the above rules! Collecting them all together we have:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>g(3k, 0) → g(0, 5k) → Quasihalt
g(3k + 1, 0) → g(1, 5k) → g(5k + 4, 0)
g(3k + 2, 0) → g(2, 5k) → g(5k + 5, 0)
</code></pre></div></div>
<p>In other words, the behavior of this TM is equivalent to the behavior of this Collatz-like function of type 3 → 5 (using the language from Pascal Michel’s paper <a href="https://arxiv.org/abs/1311.1029">“Problems in Number Theory from Busy Beaver Competition”</a>):</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>f(3k) undefined
f(3k + 1) = 5k + 4
f(3k + 2) = 5k + 5
</code></pre></div></div>
<p>Stated explicitly: if the TM reaches configuration <code>g(n, 0) = 0<sup>∞</sup> 1<sup>n</sup> <b>C></b> 0<sup>∞</sup></code>, then it will quasihalt if and only if there exists a <code class="language-plaintext highlighter-rouge">t</code> such that <code>f<sup>t</sup>(n)</code> is undefined.</p>
<h3 id="difficulty-of-deciding-quasihalting">Difficulty of Deciding Quasihalting</h3>
<p>In other words, the difficulty of the Quasihalting problem for this specific TM (but considering arbitrary initial tapes) is at least as hard as solving the above Collatz-like problem (whether or not iterating <code class="language-plaintext highlighter-rouge">f</code> starting from a specific value ever reaches the undefined value). Collatz-like problems are, in general, believed to be quite difficult to solve. Famously, Erdős said about the original Collatz conjecture: “Mathematics may not be ready for such problems.”</p>
<h2 id="collatz-orbit-from-blank-tape">Collatz Orbit from Blank Tape</h2>
<p>However, for the purposes of the Beeping Busy Beaver, we do not need to solve the quasihalting problem for this machine on all input tapes, only on the blank tape! And that can, in fact, be directly computed. First of all, note that the TM reaches configuration <code class="language-plaintext highlighter-rouge">g(2, 0)</code> on step 2. From there we can calculate the iterations of the function <code class="language-plaintext highlighter-rouge">f</code> (the orbit of 2 via <code class="language-plaintext highlighter-rouge">f</code>):</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>2 → 5 → 10 → 19 → 34 → 59 → 100 → 169 → 284 → 475
→ 794 → 1325 → 2210 → 3685 → 6144 → Quasihalt
</code></pre></div></div>
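This orbit is easy to verify with a few lines of Python implementing <code>f</code> directly:

```python
def f(n):
    """The Collatz-like map: undefined (None) on multiples of 3."""
    k, r = divmod(n, 3)
    if r == 0:
        return None          # undefined → the TM quasihalts
    return 5 * k + 4 if r == 1 else 5 * k + 5

def orbit(n):
    """Iterate f until it becomes undefined; return the values visited."""
    values = [n]
    while (n := f(n)) is not None:
        values.append(n)
    return values

print(orbit(2))
# → [2, 5, 10, 19, 34, 59, 100, 169, 284, 475, 794, 1325, 2210, 3685, 6144]
```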
<p>What makes this TM so successful in the BBB is not only that it simulates a Collatz-like function, but also that it happens to have found a surprisingly long orbit (14 iterations before hitting the undefined state)!</p>
<h3 id="chance-of-long-orbit">Chance of Long Orbit</h3>
<p>The following is definitely hand-wavy, but: to put this orbit into perspective, if you chose random values <code class="language-plaintext highlighter-rouge">n</code>, you’d expect <code class="language-plaintext highlighter-rouge">f(n)</code> to be undefined for 1/3 of them (those divisible by 3). If we assume that the remainders of the output of <code class="language-plaintext highlighter-rouge">f</code> after dividing by 3 are “uniformly distributed”, then the chance of <code>f<sup>t</sup>(n)</code> being defined would be <code>(2/3)<sup>t</sup></code>. Specifically in this case, the chance of <code>f<sup>14</sup>(n)</code> being defined is <code>(2/3)<sup>14</sup> ≈ 0.3%</code>. So it feels like this TM got very “lucky” to find such a long orbit: <code>f<sup>14</sup>(2) = 6144</code>!</p>
<p>Of course, there is a selection bias here. We are searching for the TMs that run a long time before quasihalting and thus are finding these “lucky” TMs out of a pile of “unlucky” ones.</p>
<h2 id="use-for-tm-simulation-acceleration">Use For TM Simulation Acceleration</h2>
<p>I find that modeling a TM by Collatz-like behavior gives me a personal feeling of insight into what it’s doing. But beyond that personal value, our software also uses this behavior to significantly accelerate simulation of long-running TMs. Without it we would never be able to simulate TMs like the current <code class="language-plaintext highlighter-rouge">BB(6, 2)</code> <a href="https://webusers.imj-prg.fr/~pascal.michel/bbc.html">champion</a>, which runs for <code> >10<sup>36,534</sup></code> steps!</p>
<h1 id="wikitree-network-degree-distribution">WikiTree Network Degree Distribution</h1>
<p><em>2021-07-01</em></p>
<p>As promised, I will now present my first discovery based on analysis of the various WikiTree Networks which I <a href="/2021/06/24/wikitree-network-definition.html">described</a> and <a href="/2021/06/24/wikitree-network-corner-cases.html">clarified</a> in previous posts!</p>
<p>Where do we even start when trying to analyze these mega-networks? As Bernard has repeated many times, there is no easy way to visualize these giant networks directly and many network algorithms do not scale well up to networks with millions of nodes. One common place that many researchers seem to start is analyzing the <a href="https://en.wikipedia.org/wiki/Degree_distribution">“degree distribution”</a> of the network.</p>
<h2 id="definitions">Definitions</h2>
<p>The <a href="https://en.wikipedia.org/wiki/Degree_(graph_theory)">“degree”</a> of a node is simply the number of edges connected to it or, equivalently, the number of neighboring nodes (nodes 1 step away). For example, in the below network: Node C has degree 5 (5 edges connect to it from nodes A, B, E, G, & J) and Node D has degree 1 (only 1 edge connects to it from E).</p>
<p><img src="/assets/images/complex_person_network.png" alt="Example Network" /></p>
<p>In the Person Network, the degree is the number of people linked from that person’s profile (in other words, the total number of parents, siblings, spouses and children). In the language of the <a href="https://www.wikitree.com/wiki/Space:100_Circles">100 Circles project</a>, it is the size of the first circle (C1).</p>
<p>The “degree distribution” of a network is a summarization of what percentage of nodes have each degree. For the example network (above), the degree distribution would be:</p>
<table>
<thead>
<tr>
<th>Degree</th>
<th>% Nodes</th>
</tr>
</thead>
<tbody>
<tr>
<td>1</td>
<td>2/9 = 22%</td>
</tr>
<tr>
<td>2</td>
<td>0/9 = 0%</td>
</tr>
<tr>
<td>3</td>
<td>2/9 = 22%</td>
</tr>
<tr>
<td>4</td>
<td>3/9 = 33%</td>
</tr>
<tr>
<td>5</td>
<td>2/9 = 22%</td>
</tr>
</tbody>
</table>
<p>Degree distributions are most often represented as a plot.</p>
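For the example network above, the whole computation fits in a few lines of Python. The edge list here is my reconstruction from the figure’s description (families ABC, ADE, BFGJ, parent H of A, plus the half-sibling links), so treat it as an assumption:

```python
from collections import Counter

# Reconstructed edges of the example Person Network.
edges = [("A", "B"), ("A", "C"), ("B", "C"),          # family A, B, child C
         ("A", "E"), ("D", "E"),                      # co-parents A & D, child E
         ("B", "F"), ("B", "G"), ("B", "J"),
         ("F", "G"), ("F", "J"), ("G", "J"),          # family B, F, children G & J
         ("H", "A"),                                  # H is A's parent
         ("C", "E"), ("C", "G"), ("C", "J")]          # half-sibling links

degree = Counter()
for u, v in edges:
    degree[u] += 1
    degree[v] += 1

n = len(degree)  # 9 nodes
distribution = {d: count / n for d, count in sorted(Counter(degree.values()).items())}
# → degree 1: 2/9, degree 3: 2/9, degree 4: 3/9, degree 5: 2/9
```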
<h2 id="wikitree-degree-distributions">WikiTree Degree Distributions</h2>
<p>So, what do the degree distributions look like for our networks?</p>
<p><img src="/assets/images/degrees_combined_linear.png" alt="Degree Distributions" /></p>
<p>[Notice that for the Bipartite Network, I plotted the two different node types (Person & Family) separately because (as you can see) they have noticeably different distributions.]</p>
<p>OK, so what are we seeing here? Well, they all have similar shapes: tons of nodes with small degree (few neighbors) and then a drastic drop-off. As far as I can tell, this general description fits most real-world networks: most nodes have small degree, and only a few nodes have giant degree. But there are subtle differences in the shapes these distributions can take that only become apparent once we use a different plotting scale. Let’s dig into each one separately:</p>
<h3 id="person-network">Person Network</h3>
<p><img src="/assets/images/degrees_person.png" alt="Person Network Degree Distributions" /></p>
<p>Alright, here we have the exact same data plotted in three different types of plots.</p>
<ul>
<li>On the left is the standard plot called Linear-Linear because the y-axis and x-axis both use linear scaling (spacing between 0% & 2% is the same as between 2% & 4%).</li>
<li>In the middle is the same data plotted on a <a href="https://en.wikipedia.org/wiki/Semi-log_plot">Log-Linear Plot</a>. Here the x-axis (Degree) is still linear, but the y-axis is now using “logarithmic scaling”. First let me clarify how to read the y-axis: 10<sup>-1</sup> = 1/10 = 10%; 10<sup>-2</sup> = 1/100 = 1%; 10<sup>-3</sup> = 1/1000 = 0.1%; … 10<sup>-6</sup> = 1/1,000,000 (one in a million). So, this plot lets us see a lot more detail throughout the whole distribution even though it drops off so quickly! The red line here is a regression line which is the “best” line that fits the data. For a log-linear plot, this is called an “Exponential Regression”. If the data matched the regression line closely we would say that it has an “Exponential Degree Distribution”. In this specific case, it looks like a bit of a match, but not the best.</li>
<li>Finally, on the right is the same data plotted on a <a href="https://en.wikipedia.org/wiki/Log%E2%80%93log_plot">Log-Log Plot</a>. Here the y-axis is logarithmic scale again, but now the x-axis is also logarithmic scale. For log-log plots, the regression line represents a “Power Regression” and if the data matches that line, we say that it has a “Power-law Degree Distribution”. In this specific case, it looks like a terrible match.</li>
</ul>
<p>Why have I chosen these specific plots? Well, it turns out that “Power-law Degree Distributions” are extremely common and well studied in Network Theory. The “Exponential Degree Distribution” is definitely not as famous, but it also seems to come up again and again in the literature.</p>
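To illustrate what the two regressions mean, here is a sketch using numpy on synthetic data (not the actual WikiTree data): an exponential regression is just a straight-line fit on the log-linear axes, and a power regression is a straight-line fit on the log-log axes.

```python
import numpy as np

# Synthetic exponential degree data: p(d) proportional to exp(-0.5 d).
d = np.arange(1, 30)
p = np.exp(-0.5 * d)

# Exponential regression: fit a line to (d, log p), as on a log-linear plot.
slope_exp, _ = np.polyfit(d, np.log(p), 1)

# Power regression: fit a line to (log d, log p), as on a log-log plot.
slope_pow, _ = np.polyfit(np.log(d), np.log(p), 1)

# slope_exp recovers the true decay rate -0.5; slope_pow is merely the best
# straight line through curved data, like the poorly fitting plots above.
```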
<h3 id="bipartite-network">Bipartite Network</h3>
<p>Here are the same plots for the Bipartite Network. I’ve split it up so that we look at only the Person Nodes first and then only the Family Nodes, since their distributions are quite different.</p>
<p><img src="/assets/images/degrees_bi_person.png" alt="Bipartite Network, Person Node Degree Distributions" /></p>
<p>For the Bipartite Network, the degree of a Person node is the number of “Family units” a person is part of. So for a person who never marries or has children, they will have degree 1 (their birth family). If a person marries once, they will have degree 2 (birth and marital family). And each additional marriage adds 1 to the degree.</p>
<p>Neither of these regression lines looks great to me, but it does sort of look like the Power Regression might apply if we ignored the left-most and right-most points.</p>
<p><img src="/assets/images/degrees_bi_family.png" alt="Bipartite Network, Family Node Degree Distributions" /></p>
<p>For Family nodes in the Bipartite Network, the degree is the number of people in that family unit. In other words, the number of parents (2) + the number of children.</p>
<p>The Exponential Regression looks pretty decent, the Power Regression is definitely bad.</p>
<h3 id="family-network">Family Network</h3>
<p>For the Family Network, the degree is the number of “family units” connected to this one. Specifically, the number of childhood families of the parents (2 if they’re documented on WikiTree) + the number of families the children start.</p>
<p><img src="/assets/images/degrees_family.png" alt="Family Network Degree Distributions" /></p>
<p>Wow, wow wow wow. Look how well that Exponential Regression fits the Degree Distribution! Certainly it is not perfect (the blue line wobbles a bit), but compared to all the other regression lines we’ve seen, this one fits like a glove! So this is my first discovery: <strong>The WikiTree Family Network has Exponential Degree Distribution</strong>. In fact, this is what convinced me that the Family Network was worth considering further.</p>
<h2 id="exponential-degree-distribution">Exponential Degree Distribution</h2>
<p>So … what does that mean? Honestly, I’m not sure yet. Ever since I discovered this fact about a month ago, I have been frantically trying to read everything I can about Exponential Degree Distributions. The fit is so good, that I feel like it must say something about our network.</p>
<p>The best I’ve been able to find so far is section 2.1 and 5.1 of the book <a href="https://doi.org/10.1093/acprof:oso/9780198515906.001.0001">“Evolution of Networks”</a> by Dorogovtsev and Mendes in which they discuss how exponential networks can be grown via a process I will call “Random Attachment” in which we grow a network by repeatedly adding nodes and then connecting them to a random pre-existing node (without any preference).</p>
<p>This contrasts with the much more famous process of <a href="https://en.wikipedia.org/wiki/Preferential_attachment">“Preferential Attachment”</a> in which added nodes “preferentially” attach to nodes that have higher degree (the rich get richer). Preferential attachment appears to be an extremely popular theory in the Network Theory literature and it leads to a Power-law Degree Distribution.</p>
<p>Preferential attachment makes a lot of sense in other well studied networks: citations of published research papers; webpages and their hyperlinks; the network of actors who have worked together on films. In all of these cases, a popular node (often cited paper; highly linked to webpage; popular actor) is more likely to be linked to from a new node (new paper; new webpage; new movie). All of these networks have been shown to be power-law networks (also called <a href="https://en.wikipedia.org/wiki/Scale-free_network">“Scale-free networks”</a>).</p>
<p>However, WikiTree is a bit different. Certainly there seems to be some level of preferential treatment (a person with a large family already on WikiTree is probably more likely to get further research as more relatives discover their profile and add details). But the degree of that node (the number of family members) cannot grow forever! At some point there will be no more family to add because they will all already be on WikiTree.</p>
<p>So could it be that the WikiTree Person Network is being created by some sort of “Random Attachment” process? I don’t know. Continue on this adventure with me and perhaps we will discover the answer.</p>
<hr />
<p>(Comments and discussion at <a href="https://www.wikitree.com/g2g/1262580/network-degree-distribution">WikiTree G2G</a>)</p>
<hr />
<p>This article is part of a series on the WikiTree Network:</p>
<ol>
<li>
<p><a href="/2021/06/23/wikitree-and-network-theory.html">WikiTree and Network Theory</a></p>
</li>
<li>
<p><a href="/2021/06/24/wikitree-network-definition.html">The WikiTree Network Definition</a></p>
</li>
<li>
<p><a href="/2021/06/24/wikitree-network-corner-cases.html">WikiTree Network Corner Cases</a></p>
</li>
<li>
<p><strong>WikiTree Network Degree Distribution</strong></p>
</li>
</ol>
<h1 id="wikitree-network-corner-cases">WikiTree Network Corner Cases</h1>
<p><em>2021-06-24</em></p>
<p>I was worried my <a href="/2021/06/24/wikitree-network-definition.html">WikiTree Network Definition</a> post was going to put everyone to sleep, but instead you all just asked me for even more nitty-gritty details! I love it! So I’ve decided to delay my next article in order to add a follow-up Q&A post. Specifically:</p>
<ul>
<li>Eva <a href="https://www.wikitree.com/g2g/1258747/wikitree-network-defined?show=1258755#a1258755">asked</a> how we handle more complicated examples, such as people with multiple marriages.</li>
<li>Bernard <a href="https://www.wikitree.com/g2g/1258747/wikitree-network-defined?show=1258893#a1258893">asked</a> how distance measurements relate between these 3 network models.</li>
</ul>
<p>These are great questions and I think get to the heart of some of the (perhaps arbitrary) decisions that I (and the WikiTree connection finder folks) have made in the precise details of how these models are defined.</p>
<p>These two questions are actually intertwined in the sense that I can answer Bernard’s question (about relating distance measurements) concisely if a network doesn’t have these complicated scenarios. But it becomes more complicated once we consider different corner cases.</p>
<h2 id="complicated-example">Complicated Example</h2>
<p>There are several cases that I know of that lead to some complication between networks:</p>
<ol>
<li>A person with multiple marriages</li>
<li>Unmarried people having children together</li>
<li>People with only one parent listed</li>
<li>People with no parents listed</li>
</ol>
<p>And here’s a tree with all of these wrapped up together:
<img src="/assets/images/complex_tree.png" alt="Complex Tree" /></p>
<p>Let me explain this one to make sure my notation is clear!</p>
<ul>
<li>A married B and they had child C</li>
<li>A did not marry D, but they had a child E together</li>
<li>B also married F and they had children G & J</li>
<li>We only know one parent of A: H</li>
<li>We don’t know either parent of B</li>
</ul>
<h2 id="person-network">Person Network</h2>
<p><img src="/assets/images/complex_person_network.png" alt="Complex Person Network" /></p>
<p>Notable details here:</p>
<ul>
<li>A & D are not connected directly because they never married.</li>
<li>C is connected directly to E, G & J because they are (half-)siblings (they share a parent). Since those connections are not part of any single “Nuclear Family unit”, I color them black.</li>
<li>E & G (step-siblings) are not directly connected because they do not share either parent.</li>
</ul>
<p>I made these choices in order to emulate the WikiTree Connection Finder. If these details differ from how the Connection Finder actually works, let me know and I’ll update my code :)</p>
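<p>To make these edge rules concrete, here is a minimal sketch of the logic in Python. (The record encoding and names are hypothetical, invented for illustration; this is not the actual WikiTree data format or my real code.)</p>

```python
from itertools import combinations

# The complicated example above, as hypothetical records: each child
# maps to its known parents (None marks an unknown parent).
parents = {"C": ("A", "B"), "E": ("A", "D"), "G": ("B", "F"),
           "J": ("B", "F"), "A": ("H", None)}
marriages = [("A", "B"), ("B", "F")]  # A & D never married

edges = set()
for child, pair in parents.items():
    for p in pair:
        if p is not None:
            edges.add(frozenset((child, p)))       # parent-child edges
for x, y in combinations(parents, 2):
    if (set(parents[x]) & set(parents[y])) - {None}:
        edges.add(frozenset((x, y)))               # (half-)sibling edges
for x, y in marriages:
    edges.add(frozenset((x, y)))                   # spouse edges

print(frozenset(("C", "E")) in edges)  # True: half-siblings are connected
print(frozenset(("A", "D")) in edges)  # False: unmarried co-parents are not
print(frozenset(("E", "G")) in edges)  # False: step-siblings are not
```

<p>The three printed checks correspond to the three notable details listed above.</p>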
<h2 id="bipartite-network">Bipartite Network</h2>
<p><img src="/assets/images/complex_bipartite_network.png" alt="Complex Bipartite Network" /></p>
<p>Notable details here:</p>
<ul>
<li>We do not distinguish between a married couple and unmarried co-parents. In other words the DAE subnetwork here is identical to the ABC subnetwork.</li>
<li>Half-siblings are more distant than full siblings: you must travel up through their shared parent and back down.</li>
</ul>
<h2 id="family-network">Family Network</h2>
<p><img src="/assets/images/complex_family_network.png" alt="Complex Family Network" /></p>
<p>Notable details here:</p>
<ul>
<li>The same as in the Bipartite Network, we do not distinguish between a married couple and unmarried co-parents.</li>
<li>There is no direct connection between Family DAE and ABC (or between ABC & BFGJ); instead you have to travel through the parent node (H?A) to connect them. I could have chosen to connect these nodes directly (they share a person in common). In fact, my original Family Network did connect re-marriage families together; however, that led to cliques re-appearing for people who married many times. As an extreme example: <a href="https://www.wikitree.com/wiki/Young-93">Brigham Young</a> ended up causing a 56-node clique (with his 55 wives and one birth family) which had 1540 edges! The whole point of creating the Family Network was to avoid these sorts of giant cliques, so I changed my definition so that there would only be edges between the childhood family and each adult family (not between re-marriages).</li>
<li>I’ve added a virtual parent node for B (Grey node: ??B). Without this node, ABC and BFGJ would be completely disconnected, which totally changes the connectivity of the network. So instead we just imagine that we had a stub profile for B’s parents (with no names or siblings, etc). [Side note: Adding these virtual parent nodes leads to a 2% increase in nodes and a 4% increase in edges.]</li>
</ul>
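<p>The Family Network edge rule above can be sketched in a few lines of Python. (The node labels are the illustrative ones from this example, not real WikiTree data, and the encoding is hypothetical.) Note that re-marriages are deliberately <em>not</em> linked to each other:</p>

```python
# Each person connects their childhood family to each of their adult
# families. B's unknown parents become the virtual node "??B".
childhood = {"A": "H?A", "B": "??B",
             "C": "ABC", "E": "DAE", "G": "BFGJ", "J": "BFGJ"}
adult = {"A": ["ABC", "DAE"], "B": ["ABC", "BFGJ"]}

edges = {frozenset((childhood[person], family))
         for person, families in adult.items()
         for family in families}

print(sorted(sorted(e) for e in edges))
print(frozenset(("ABC", "BFGJ")) in edges)  # False: no re-marriage edge
```

<p>This produces exactly the four edges drawn in the Family Network image: H?A to each of A’s families, and ??B to each of B’s families.</p>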
<h2 id="distances">Distances</h2>
<p>OK, I think that answers Eva’s question; now we are almost ready to tackle Bernard’s: what is the relationship between shortest path distances between two people in these various networks?</p>
<h3 id="bipartite-network-projection">Bipartite Network Projection</h3>
<p>There is one tool that will be very helpful in comparing the distance metrics on these 3 networks: the <a href="https://en.wikipedia.org/wiki/Bipartite_network_projection">Bipartite Network Projection</a>. That’s quite a pile of jargon; what is it? It is a way to turn a bipartite network into a normal network. You can project a bipartite network onto either of its node types. Let me define the projection of our Bipartite Network onto its Person nodes: the new network will have all the Person nodes from the Bipartite Network, and two Person nodes will be connected directly if they were both adjacent to the same Family node. In other words, we connect all members of a family to each other. Let’s call this the “Projected Person Network”. For our example it looks like:</p>
<p><img src="/assets/images/complex_projected_person_network.png" alt="Complex Projected Person Network" /></p>
<p>Notably, this looks almost identical to our Person Network with 2 differences:</p>
<ol>
<li>Half-siblings are no longer connected.</li>
<li>Unmarried co-parents (D & A) are now connected.</li>
</ol>
<p>This was the projection onto the Person nodes, but we can also project the Bipartite Network onto the Family nodes by the analogous method: all Family nodes from the Bipartite Network become nodes in our new network; they are connected if they share a Person in common. We’ll call this one the “Projected Family Network”:</p>
<p><img src="/assets/images/complex_projected_family_network.png" alt="Complex Projected Family Network" /></p>
<p>Notably, this looks very similar to our Family Network with 2 differences:</p>
<ol>
<li>Re-marriages are now connected (DAE to ABC and ABC to BFGJ).</li>
<li>We no longer need the virtual (??B) node to keep the network connected.</li>
</ol>
<h4 id="distance-properties-of-projection">Distance Properties of Projection</h4>
<p>Alright, why have I just introduced you to yet 2 more networks and this whole Network Projection concept? Well comparing distances on a bipartite network to its projections is super-simple: The distance between two points on a projection is exactly half of the distance on the original bipartite network! You can see this because every edge in the projected network represents two edges in the original bipartite network (for example, the edge A->B in the Projected Person Network above represents the edges A->Red->B in the original Bipartite Network).</p>
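<p>This halving is easy to check on a small example. The sketch below (plain-Python breadth-first search, with the illustrative node labels from this post) builds the ABC/DAE corner of the Bipartite Network, projects it onto the Person nodes, and compares distances:</p>

```python
from collections import deque
from itertools import combinations

# Two families from the example: Person nodes attach only to Family nodes.
families = {"ABC": {"A", "B", "C"}, "DAE": {"D", "A", "E"}}

bip = {}   # bipartite adjacency: Person <-> Family edges only
for fam, members in families.items():
    for p in members:
        bip.setdefault(p, set()).add(fam)
        bip.setdefault(fam, set()).add(p)

proj = {}  # projection onto Person nodes: adjacent iff they share a family
for members in families.values():
    for p, q in combinations(members, 2):
        proj.setdefault(p, set()).add(q)
        proj.setdefault(q, set()).add(p)

def dist(adj, src, dst):
    """Shortest-path distance by breadth-first search."""
    seen, frontier = {src: 0}, deque([src])
    while frontier:
        node = frontier.popleft()
        if node == dst:
            return seen[node]
        for nbr in adj[node]:
            if nbr not in seen:
                seen[nbr] = seen[node] + 1
                frontier.append(nbr)

# Every projected edge stands for two bipartite edges, so distances halve:
print(dist(bip, "B", "D"), dist(proj, "B", "D"))  # 4 2
```

<p>B to D is 4 steps in the bipartite network (B→ABC→A→DAE→D) but only 2 in the projection (B→A→D).</p>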
<p>So, if our original Person Network was identical to this new Projected Person Network, then we would know that distances in the Bipartite Network would be exactly twice the distances in the Person Network (the Connection Finder distances). Unfortunately, they are not the same. WikiTree Connection Finder developers decided to make half-siblings directly connected and unmarried co-parents unconnected and so the distance measurements between these two networks are non-trivial to compare. In general, I expect that distances in the Person Network graph will be roughly half as much as in the Bipartite Network, but not always exactly.</p>
<p>Likewise, if my Family Network was identical to this new Projected Family Network, then we would know that distances in the Bipartite Network would be exactly twice the distances in the Family Network. In this case, I am the one to blame. As mentioned above, I have avoided connecting re-marriages to each other to try and avoid cliques in the network (and because of that had to introduce the virtual parent nodes). So once again, in general, I expect that distances in the Family Network graph will be roughly half as much as in the Bipartite Network and thus roughly equal to distances in the Person Network, but not always exactly.</p>
<hr />
<p>(Comments and discussion at <a href="https://www.wikitree.com/g2g/1259092/wikitree-network-corner-cases">WikiTree G2G</a>)</p>
<hr />
<p>This article is part of a series on the WikiTree Network:</p>
<ol>
<li>
<p><a href="/2021/06/23/wikitree-and-network-theory.html">WikiTree and Network Theory</a></p>
</li>
<li>
<p><a href="/2021/06/24/wikitree-network-definition.html">The WikiTree Network Definition</a></p>
</li>
<li>
<p><strong>WikiTree Network Corner Cases</strong></p>
</li>
<li>
<p><a href="/2021/07/01/wikitree-network-degree-distribution.html">WikiTree Network Degree Distribution</a></p>
</li>
</ol>Shawn LigockiI was worried my WikiTree Network Definition post was going to put everyone to sleep, but in contrast, you all just asked me for even more nitty-gritty details! I love it! So I’ve decided to delay my next article in order to add a follow-up Q&A post. Specifically: Eva asked how we handle more complicated examples, such as people with multiple marriages. Bernard asked how distance measurements relate between these 3 network models.The WikiTree Network Definition2021-06-24T06:00:00+00:002021-06-24T06:00:00+00:00https://www.sligocki.com//2021/06/24/wikitree-network-definition<p>In my <a href="/2021/06/23/wikitree-and-network-theory.html">last post</a>, I discussed the motivation for applying Network Theory towards analyzing the WikiTree genealogy database. In this post, I will discuss the nitty-gritty details about how to define such a network. In fact, I have come up with 3 different ways to define such a network, each of which has its own advantages and disadvantages.</p>
<p>Perhaps this does not sound very interesting? “I was promised exciting discoveries not nitty-gritty network definitions!” you might say. Well, I promise, the discoveries are coming! Stick with me through this adventure because I’ve discovered that these details make a big difference on the analysis!</p>
<h2 id="example-family-tree">Example Family Tree</h2>
<p><img src="/assets/images/example_tree.png" alt="Example Family Tree" title="Example Family Tree" />
In order to visualize these different representations, let me use this simple family tree. It contains 15 people (3 generations of descendants from J & H and the spouses of those descendants). Keep this tree in mind when reading about the various network layouts I propose below. Note that I consistently use the same colors to denote each family unit and the same letters to denote individual people.</p>
<h2 id="the-person-network">The Person Network</h2>
<p><img src="/assets/images/example_person_network.png" alt="Example Person Network" title="Example Person Network" />
This is the most natural way to represent the WikiTree database as a network. Each node represents a person and two nodes are connected if they represent a parent-child, sibling or spouse relationship.</p>
<p>Based on the 2021-06-20 data dump: This network has 23.5M nodes and 118.6M edges!</p>
<p>Pro:</p>
<ul>
<li>This is a very natural way to represent the WikiTree dataset. It is the same way that the WikiTree Connection Finder represents it. It is easy to explain to people what it means.</li>
</ul>
<p>Con:</p>
<ul>
<li>This network has a <strong>lot</strong> of edges! Even for this small example, the blue family (2 parents and 3 children) has 10 edges. And as you scale this up to large families, the number of edges grows very quickly (quadratically). For example, a family with 2 parents and 8 children will have 45 edges connecting every single member of the family to each other. Mathematicians call these completely interconnected sections <a href="https://en.wikipedia.org/wiki/Clique_(graph_theory)">“cliques”</a> and in addition to making visualization harder (so many crossing edges), cliques also present a problem for many network analysis algorithms and statistics.</li>
</ul>
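<p>The quadratic growth is easy to verify: a clique of <em>n</em> completely interconnected people has n(n-1)/2 edges. A quick check:</p>

```python
from math import comb

# Number of edges in a clique connecting n family members pairwise.
for n in [5, 10, 56]:
    print(n, comb(n, 2))  # 5 -> 10, 10 -> 45, 56 -> 1540
```

<p>The 10-person family (2 parents and 8 children) gives the 45 edges mentioned above.</p>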
<h2 id="the-bipartite-network">The Bipartite Network</h2>
<p><img src="/assets/images/example_bipartite_network.png" alt="Example Bipartite Network" title="Example Bipartite Network" />
In order to avoid all the cliques present in the Person Network, I thought maybe I would aim to make networks that look more like standard family tree visualizations. In the Example Family Tree above, we don’t draw edges between every person in a family to every other, instead we connect them all through a common central splitting line. To mimic that, we could add a new node type to represent every “nuclear family unit”. So now we have two types of nodes: Person nodes and Family nodes. In this network I will never connect any Person node directly to another Person node (and never Family to Family either). Instead I will connect every Person to all of the Families they are a part of (the one they were born into and one for every marriage/co-parentage they have). Mathematicians call this sort of network a <a href="https://en.wikipedia.org/wiki/Bipartite_graph">Bipartite network</a>.</p>
<p>This network has 30.4M nodes and 30.8M edges.</p>
<p>Pro:</p>
<ul>
<li>To me this network looks closest to how I think about a family network. If you discover a new sibling and add it to the network, you are adding one node and one edge (in contrast to the Person Network where you are adding tons of new edges for every sibling added). This feels closer to what is really happening, you have added one new person and one new connection.</li>
</ul>
<p>Cons:</p>
<ul>
<li>Bipartite networks are more complicated to analyze. Many conventional algorithms do not really work on them out of the box, instead you have to search for bipartite-specific versions.</li>
<li>Although we reduced the edge count 4x, we’ve actually increased the node count (by adding Family nodes). When I tried loading this network on my laptop it still ended up using all 30GB of RAM on my machine, woof.</li>
</ul>
<h2 id="the-family-network">The Family Network</h2>
<p><img src="/assets/images/example_family_network.png" alt="Example Family Network" title="Example Family Network" /></p>
<p>The Person nodes don’t seem to be adding much information in the Bipartite Network image above. They mostly just seem to either connect to two Family nodes (if we know their parents and they were married or had children) or to a single Family node (if they never wed/had children or we don’t know their parents). It seems like the Family nodes are really representing most of the interesting connectivity in this tree. So, what if we built a network with only Family nodes? Since we no longer have Person nodes, I have labeled each Family node with the people it represents (2 parents/spouses at top, children at bottom). Now, how do we draw edges on this network? For every person, we connect their childhood Family node to their adult Family node (if they ever marry/have children).</p>
<p>This network has 7.6M nodes and 8.1M edges.</p>
<p>Pro:</p>
<ul>
<li>This network is <strong>dramatically</strong> smaller and simpler. It has 1/3 as many nodes and 1/15 as many edges as the original Person Network! I almost feel embarrassed about drawing such a small network, it feels like I should have tried to represent a larger family. So this visualization allows us to represent a lot more content in a smaller space!</li>
</ul>
<p>Con:</p>
<ul>
<li>Of all the networks I’ve discussed here, I think this one might be the least intuitive to interpret and explain. Without the Person nodes, you sort of have to re-orient your mind to thinking about family units as objects (nodes) and people as the connections (edges).</li>
</ul>
<h2 id="closing-thoughts">Closing thoughts</h2>
<p>Thanks for sticking with me through the details! In the next post, I’ll share the first exciting discovery I’ve made which wouldn’t have been possible without having these networks in mind.</p>
<hr />
<p>(Comments and discussion at <a href="https://www.wikitree.com/g2g/1258747/wikitree-network-defined">WikiTree G2G</a>)</p>
<hr />
<p>This article is part of a series on the WikiTree Network:</p>
<ol>
<li>
<p><a href="/2021/06/23/wikitree-and-network-theory.html">WikiTree and Network Theory</a></p>
</li>
<li>
<p><strong>The WikiTree Network Definition</strong></p>
</li>
<li>
<p><a href="/2021/06/24/wikitree-network-corner-cases.html">WikiTree Network Corner Cases</a></p>
</li>
<li>
<p><a href="/2021/07/01/wikitree-network-degree-distribution.html">WikiTree Network Degree Distribution</a></p>
</li>
</ol>Shawn LigockiIn my last post, I discussed the motivation for applying Network Theory towards analyzing the WikiTree genealogy database. In this post, I will discuss the nitty-gritty details about how to define such a network. In fact, I have come up with 3 different ways to define such a network, each of which has its own advantages and disadvantages.WikiTree and Network Theory2021-06-23T00:00:00+00:002021-06-23T00:00:00+00:00https://www.sligocki.com//2021/06/23/wikitree-and-network-theory<p>About 3 years ago, I became interested in genealogy research and soon after discovered <a href="https://www.wikitree.com/">WikiTree</a>. WikiTree is a wonderful community of genealogists who collaborate to build and improve a single, shared family tree for all of humanity. At its core is a user-built database of 27 million profiles, each representing a person (users and ancestors) and containing a biography; vital record (birth, death and marriage) details; and connections to family members (parents, children, spouses and siblings).</p>
<p>As a Mathematician, I quickly became very interested in thinking about this dataset as a <a href="https://en.wikipedia.org/wiki/Graph_(discrete_mathematics)">mathematical graph</a> (or what is more commonly known these days as a <a href="https://en.wikipedia.org/wiki/Network_theory">network</a>). For example, we could consider each profile to be a node and connect nodes together with an edge if they are direct family members (parent-child, spouse or sibling relationships, for example).</p>
<p>You can use this network to answer many questions about connectivity in the dataset. For example, the <a href="https://www.wikitree.com/wiki/Special:Connection">WikiTree connection finder</a> is the equivalent of the <a href="https://en.wikipedia.org/wiki/Shortest_path_problem">shortest path</a> in this network between any two profiles. And whether or not a profile is <a href="https://www.wikitree.com/wiki/Help:Unconnected">connected</a> to the main tree is equivalent to the question: is this profile’s node in the largest <a href="https://en.wikipedia.org/wiki/Component_(graph_theory)">connected component</a> of the network?</p>
<p>But there are more complex questions you could answer with a network as well. <a href="https://www.wikitree.com/g2g/612063/mathematical-graph-structure-of-global-tree">For example</a>:</p>
<ul>
<li>What is the average distance between two randomly chosen profiles?</li>
<li>What is the average distance from a specific profile to all others?</li>
<li>Who is most “central” to the network (smallest average distance to all others)?</li>
<li>How well connected is the network? Say how many nodes/edges would have to be removed in order to disconnect two people?</li>
<li>How important are individual people or connections? How does removing a person or a connection degrade the network (disconnect people, increase average distance, etc.)?</li>
</ul>
<p>These are all questions that can be answered by analyzing the network directly and there is a whole field of Network Theory built around answering these types of questions. Of course, before you start, you need access to the full database in order to build such a network. Luckily, WikiTree is run on a very open data model and so you can easily apply for access to weekly <a href="https://www.wikitree.com/wiki/Help:Database_Dumps">Data Dumps</a>. Over the last several years, I’ve spent evenings and weekends analyzing these data dumps, producing networks and learning Network Theory in order to try and figure out what interesting things I can say about the WikiTree Network. In future posts, I will share some of the discoveries I’ve made and difficulties I’ve run into along the way.</p>
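<p>To give a feel for what these computations look like, here is a tiny sketch of the two basic questions (shortest path and connectedness) on a toy stand-in for the network. The profile IDs are made up for illustration; the real networks are built from the data dumps:</p>

```python
from collections import deque

# Toy person network: one edge per direct family relationship.
EDGES = [("Ligocki-1", "Ligocki-2"), ("Ligocki-2", "Smith-7"),
         ("Smith-7", "Smith-3"), ("Jones-9", "Jones-4")]

adj = {}
for a, b in EDGES:
    adj.setdefault(a, set()).add(b)
    adj.setdefault(b, set()).add(a)

def shortest_path(src, dst):
    """Breadth-first search: the computation behind a connection finder."""
    prev, frontier = {src: None}, deque([src])
    while frontier:
        node = frontier.popleft()
        if node == dst:                  # reconstruct the path backwards
            path = []
            while node is not None:
                path.append(node)
                node = prev[node]
            return path[::-1]
        for nbr in adj.get(node, ()):
            if nbr not in prev:
                prev[nbr] = node
                frontier.append(nbr)
    return None                          # different components: no path

print(shortest_path("Ligocki-1", "Smith-3"))
print(shortest_path("Ligocki-1", "Jones-9"))  # None: "unconnected" branch
```

<p>A profile is “connected” in the WikiTree sense exactly when such a path exists between it and the main tree’s component.</p>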
<hr />
<p>(Comments and discussion at <a href="https://www.wikitree.com/g2g/1258039/wikitree-and-network-theory">WikiTree G2G</a>)</p>
<hr />
<p>This article is part of a series on the WikiTree Network:</p>
<ol>
<li>
<p><strong>WikiTree and Network Theory</strong></p>
</li>
<li>
<p><a href="/2021/06/24/wikitree-network-definition.html">The WikiTree Network Definition</a></p>
</li>
<li>
<p><a href="/2021/06/24/wikitree-network-corner-cases.html">WikiTree Network Corner Cases</a></p>
</li>
<li>
<p><a href="/2021/07/01/wikitree-network-degree-distribution.html">WikiTree Network Degree Distribution</a></p>
</li>
</ol>Shawn LigockiAbout 3 years ago, I became interested in genealogy research and soon after discovered WikiTree. WikiTree is a wonderful community of genealogists who collaborate to build and improve a single, shared family tree for all of humanity. At its core is a user-built database of 27 million profiles, each representing a person (users and ancestors) and containing a biography; vital record (birth, death and marriage) details; and connections to family members (parents, children, spouses and siblings).Beeping Busy Beaver2021-03-06T00:00:00+00:002021-03-06T00:00:00+00:00https://www.sligocki.com//2021/03/06/beeping-busy-beaver<p>As a long time <a href="https://en.wikipedia.org/wiki/Busy_beaver">Busy Beaver</a> enthusiast, I was very excited when I heard that Scott Aaronson published a <a href="https://www.scottaaronson.com/blog/?p=4916">Busy Beaver survey</a> last September.<sup id="t1"><a href="#f1">1</a></sup> There are many exciting ideas and conjectures within it, but the one that has stuck with me the most was his proposed “Beeping Busy Beaver” function BBB(n). Quoting from the paper:</p>
<blockquote>
<p>One can define a function BB<sub>1</sub>(n), involving Turing machines with oracles for the original BB function, which grows uncomputably quickly even given an oracle for BB. Could we compute the first values of BB<sub>1</sub>? Alas, this is liable to be uninteresting, just because the details of how a Turing machine queries a BB oracle (by writing n onto an oracle tape, etc.) will involve many kludgy and non-canonical choices, and one might need many states before one saw the effect of those choices.</p>
<p>But there’s an alternative. Define a beeping Turing machine to be exactly the same as one of the Turing machines except that it emits a “beep” as it exits a certain designated state, called the “beep state”. Then given a beeping Turing machine M, let b(M) be the number of the last step on which M beeps (or b(M) := ∞ if M beeps infinitely often), when M is run on an initially all-0 input tape. Finally, define the n th Beeping Busy Beaver number “BBB(n)” to be the maximum (non-infinite) value of b(M) for all n-state Turing Machines M.</p>
<p>Clearly BBB(n) ≥ BB(n) for all n, since we could designate the state from which M halts as its beep state. One can show that, as n gets larger, BBB must grow uncomputably faster than even BB—indeed, it grows at a similar rate to BB<sub>1</sub>, in the sense that BBB and BB<sub>1</sub> are both computable given an oracle for any upper bound on the other.</p>
</blockquote>
<p>Nick Drozd <a href="https://nickdrozd.github.io/2020/08/13/beeping-busy-beavers.html">introduced the term “quasihalt”</a> to denote the moment that a Beeping TM beeps last (TMs which beep infinitely often never quasihalt). This naming is very convenient and evocative because it makes the analogy between the Busy Beaver and Beeping Busy Beaver crystal clear: The Busy Beaver problem is to find the TM which runs longest before halting, the Beeping Busy Beaver problem is to find the TM which runs longest before quasihalting.</p>
<p>I was quite shocked to discover that such a simple modification to the Turing Machine halt condition was enough to give it the strength of an oracle machine! And at first it is not clear exactly why or how it accomplishes this. In the rest of this post I explore various aspects of the power of beeping TMs.</p>
<h2 id="detecting-quasihalting">Detecting Quasihalting</h2>
<p>The first thing to note is that quasihalting is quite a bit more complicated than halting. While the question of whether a TM <strong>will</strong> halt is undecidable, the question of whether it <strong>has</strong> halted is extremely easy: is it in the Halt state? Deciding whether a TM has quasihalted, on the other hand, is no simple task. It amounts to proving that the TM will never again reach a certain state. In fact, detecting whether a TM has quasihalted yet is an undecidable problem itself. We can see that by noting that if we had a reliable method for detecting that we had quasihalted, we could use it to solve the classic halting problem! Specifically, consider any arbitrary TM X, say:<sup id="t2"><a href="#f2">2</a></sup></p>
<table>
<thead>
<tr>
<th style="text-align: center"> </th>
<th style="text-align: center">0</th>
<th style="text-align: center">1</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: center">A</td>
<td style="text-align: center">1RB</td>
<td style="text-align: center">1LA</td>
</tr>
<tr>
<td style="text-align: center">B</td>
<td style="text-align: center">0RC</td>
<td style="text-align: center">1LC</td>
</tr>
<tr>
<td style="text-align: center">C</td>
<td style="text-align: center">1RD</td>
<td style="text-align: center">0LA</td>
</tr>
<tr>
<td style="text-align: center">D</td>
<td style="text-align: center">1RE</td>
<td style="text-align: center">1RB</td>
</tr>
<tr>
<td style="text-align: center">E</td>
<td style="text-align: center">1LC</td>
<td style="text-align: center">1R<strong>H</strong></td>
</tr>
</tbody>
</table>
<p>We would like to know if it ever reaches the halt state <strong>H</strong>. Well, let’s create a second TM Y by replacing the halt state with a beeping state “Q”, make that the new start state and have it simply transition to the previous start state “A”:</p>
<table>
<thead>
<tr>
<th style="text-align: center"> </th>
<th style="text-align: center">0</th>
<th style="text-align: center">1</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: center"><strong>Q</strong></td>
<td style="text-align: center"><strong>0RA</strong></td>
<td style="text-align: center">—</td>
</tr>
<tr>
<td style="text-align: center">A</td>
<td style="text-align: center">1RB</td>
<td style="text-align: center">1LA</td>
</tr>
<tr>
<td style="text-align: center">B</td>
<td style="text-align: center">0RC</td>
<td style="text-align: center">1LC</td>
</tr>
<tr>
<td style="text-align: center">C</td>
<td style="text-align: center">1RD</td>
<td style="text-align: center">0LA</td>
</tr>
<tr>
<td style="text-align: center">D</td>
<td style="text-align: center">1RE</td>
<td style="text-align: center">1RB</td>
</tr>
<tr>
<td style="text-align: center">E</td>
<td style="text-align: center">1LC</td>
<td style="text-align: center">1R<strong>Q</strong></td>
</tr>
</tbody>
</table>
<p>So this new TM starts in state Q, then immediately transitions to the original TM (X)’s start state “A” and starts running on a blank tape. It will then act exactly the same as the original TM (X) until/unless it reaches the <code class="language-plaintext highlighter-rouge">E1 -> 1RQ</code> transition.</p>
<p>Has this new TM (Y) already quasihalted after its first step? If it has, then that means the original TM (X) will never halt. If Y has not already quasihalted, that means that X will halt. So if we could decide whether an arbitrary TM has already quasihalted, we could decide the classical halting problem!</p>
<h2 id="the-power-of-quasicomputing">The Power of Quasicomputing</h2>
<p>OK, so clearly quasihalting is more <em>complicated</em> than halting, but is it more <em>powerful</em>? What does it even mean to be more powerful? Well, for normal TMs, we say that they compute a function if, for any input <code class="language-plaintext highlighter-rouge">n</code>, they can write <code class="language-plaintext highlighter-rouge">f(n)</code> on the tape and then halt. Any function <code class="language-plaintext highlighter-rouge">f</code> which can be computed by a TM is said to be <strong>computable</strong>. Famously, the Busy Beaver function is not computable and, in fact, it grows faster than any computable function. Reasoning about tape contents is a bit annoying and runtime is often more useful. So consider a TM which, given input <code class="language-plaintext highlighter-rouge">n</code>, runs for <code class="language-plaintext highlighter-rouge">g(n)</code> steps and then halts. It can be shown that this <code class="language-plaintext highlighter-rouge">g</code> is also a computable function.</p>
<p>So then, let us say that a function <code class="language-plaintext highlighter-rouge">h</code> is “quasicomputable” if there exists a TM which, when given input <code class="language-plaintext highlighter-rouge">n</code>, runs for <code class="language-plaintext highlighter-rouge">h(n)</code> steps and then quasihalts. Is quasicomputability stronger than computability? Yes! In fact, the classic Busy Beaver function is quasicomputable! (or to be a bit more technical, it is dominated by a quasicomputable function.)</p>
<p>Here is the algorithm for quasicomputing the Busy Beaver function:</p>
<ul>
<li>Given an input <code class="language-plaintext highlighter-rouge">n</code>, enumerate all <code class="language-plaintext highlighter-rouge">n</code>-state 2-symbol Turing machines onto subsections of the tape.</li>
<li>In parallel, simulate each of these TMs (Simulate step 1 of TM 1, then step 1 of TM 2, …, step 2 of TM 1, …).</li>
<li>Whenever any of the simulated TMs halts, enter the beeping state and continue with the other TMs.</li>
</ul>
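<p>The steps above can be sketched as a toy dovetailed simulation (same hypothetical TM encoding as before: (state, symbol) mapped to (write, +1/-1 move, next state), with "A" as start and "H" as halt). A real quasicomputation would enumerate every n-state machine on the tape and interleave forever; this sketch just takes a fixed list of machines and caps the number of rounds:</p>

```python
def tm_step(table, config):
    """Advance one step; the tape is a sparse dict defaulting to 0."""
    state, tape, pos = config
    write, move, nxt = table[(state, tape.get(pos, 0))]
    tape[pos] = write
    return (nxt, tape, pos + move)

def dovetail_beeps(machines, rounds):
    """Round-robin simulate machines on blank tapes, recording the
    global step count ("beeping") each time one of them halts."""
    configs = [("A", {}, 0) for _ in machines]
    alive = [True] * len(machines)
    beeps, t = [], 0
    for _ in range(rounds):
        for i, table in enumerate(machines):
            if alive[i]:
                t += 1
                configs[i] = tm_step(table, configs[i])
                if configs[i][0] == "H":   # a simulated TM halted: beep!
                    beeps.append(t)
                    alive[i] = False
    return beeps

# Two toy machines (not BB champions!) halting after 1 and 2 of their
# own steps respectively:
m1 = {("A", 0): (1, +1, "H")}
m2 = {("A", 0): (1, +1, "B"), ("B", 0): (1, +1, "H")}
print(dovetail_beeps([m1, m2], rounds=5))  # the last beep is the quasihalt
```

<p>Since each simulated halt produces one beep and there are finitely many halting machines, the beeps eventually stop: the outer machine quasihalts at the final beep.</p>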
<p>So, this program will beep exactly once for every <code class="language-plaintext highlighter-rouge">n</code>-state TM that halts. And, as we know, there are only a finite number of <code class="language-plaintext highlighter-rouge">n</code>-state halting TMs. Therefore, this program will only beep a finite number of times and thus it quasihalts! But how many steps does it run before quasihalting? Well, to know the exact number would require me to give a much more detailed description, but for sure it will run for more steps than every single one of the halting <code class="language-plaintext highlighter-rouge">n</code>-state TMs runs for (because it is simulating every one of those steps before quasihalting). So this program computes a function <code class="language-plaintext highlighter-rouge">h(n) > BB(n)</code>. And thus we can see that quasicomputability is stronger than computability (because <code class="language-plaintext highlighter-rouge">BB(n)</code> grows faster than any computable function)! Perhaps we should have called it “hyper-computability”.</p>
<h2 id="future-work">Future work</h2>
<p>This demonstration that the Busy Beaver function is quasicomputable seems very powerful, but we have still not completely shown that BBB(n) grows at the same rate as BB<sub>1</sub>(n) (the oracle-Busy Beaver). Likewise, Scott mentions in a footnote that we could have defined a similar function for Nondeterministic TMs and that that function would also grow at the same rate. In a future post I plan to dig into those comparisons.</p>
<h2 id="footnotes">Footnotes</h2>
<ul>
<li><b id="f1"><a href="#t1">1</a></b> A small aside about Scott Aaronson: I first learned of Scott over a decade ago when I read his excellent 1998 essay <a href="https://www.scottaaronson.com/writings/bignumbers.html">Who Can Name the Biggest Number?</a> which really captures the excitement I feel about the Busy Beaver problem so beautifully. I heard of him again in 2016 when his master’s student Adam Yedidia constructed an explicit upper bound to the Busy Beaver number that ZFC can prove (<a href="https://www.scottaaronson.com/blog/?p=2725">The 8000th Busy Beaver number eludes ZF set theory</a>) which was quite exciting.</li>
<li><b id="f2"><a href="#t2">2</a></b> This is one of the 8031 holdout 5-state, 2-symbol TMs that we believe never halts, but have been unable to prove that it runs infinitely. If only we could use a quasihalt detection algorithm on these, we could declare once and for all that BB(5) = 47,176,870.</li>
</ul>Shawn LigockiAs a long time Busy Beaver enthusiast, I was very excited when I heard that Scott Aaronson published a Busy Beaver survey last September.1 There are many exciting ideas and conjectures within it, but the one that has stuck with me the most was his proposed “Beeping Busy Beaver” function BBB(n). Quoting from the paper:Bubbly Puzzle2019-02-08T00:00:00+00:002019-02-08T00:00:00+00:00https://www.sligocki.com//2019/02/08/bubbly<p>Want to play a game? Pop any bubble you like. When you pop a bubble, all the bubbles that contain it will also be popped (your needle must pierce them to get the target bubble, of course).</p>
<p><img src="http://www.mit.edu/~puzzle/2019/assets/puzzles/bubbly/image-2.png" alt="Bubbly example" title="Bubbly example" /></p>
<p>For example, if you want to pop the bubble containing <code class="language-plaintext highlighter-rouge">U</code> you will also pop the bubbles <code class="language-plaintext highlighter-rouge">N</code>, <code class="language-plaintext highlighter-rouge">F</code> and <code class="language-plaintext highlighter-rouge">I</code>. Then your opponent gets to pop a bubble of their choosing. The player who pops the last bubble wins. How do you know which bubble to pop?</p>
<p>This is the <a href="http://www.mit.edu/~puzzle/2019/puzzle/bubbly.html">Bubbly</a> puzzle from <a href="http://www.mit.edu/~puzzle/2019/">this year’s (2019) MIT Mystery Hunt</a>. The goal of this puzzle was to find the correct winning moves for the first player in a number of increasingly complex bubble arrangements.</p>
<p>Our team ended up solving this puzzle by writing a brute-force computer program which used the <a href="https://en.wikipedia.org/wiki/Minimax">Min-Max algorithm</a> to find all the winning moves. But while working on this we noticed several interesting properties that kept me interested in the problem over the next few weeks. It reminded me of the classic mathematical game <a href="https://en.wikipedia.org/wiki/Nim">Nim</a> and I wondered if I could find some elegant way to analyze Bubbly positions faster using <a href="https://en.wikipedia.org/wiki/Combinatorial_game_theory">Combinatorial Game Theory</a> which I discovered a couple years ago in the book <a href="https://en.wikipedia.org/wiki/Winning_Ways_for_your_Mathematical_Plays"><em>Winning Ways for your Mathematical Plays</em></a>.</p>
<p>If you like these sorts of games, I suggest you play around with it for a bit and see if you can figure out the winning first moves in some of the example games (every game has more than one winning move). Then read on for some mathematical shortcuts.</p>
<h2 id="nim">Nim</h2>
<p>Before analyzing Bubbly, let’s consider a simpler game: <strong>Nim</strong>. In the game <a href="https://en.wikipedia.org/wiki/Nim">Nim</a>, there are some number of stacks of coins. On your turn you may remove as many coins as you like from one stack (but you must remove at least one coin). As in Bubbly, the player who removes the last coin wins (the game is sometimes played with the opposite rule, but we will not consider that version in this article). If you have not heard of Nim before, try playing some games and see if you can figure out any patterns.</p>
<p>It turns out that there is an elegant mathematical rule for how to play Nim optimally. You take the sizes of all the stacks and add them up using <a href="https://en.wikipedia.org/wiki/Nimber#Addition">“Nimber addition”</a>. One way to think about Nimber addition is to convert all the stack sizes into base 2 and then add them <strong>without</strong> carrying. So, for example, if the stack sizes were 3, 4, and 5 then the Nimber sum is:</p>
<figure class="highlight"><pre><code class="language-txt" data-lang="txt"> 11
100
⊕ 101
-----
010</code></pre></figure>
<p>If this Nimber sum is 0 on your turn, then you lose: any move you make can be countered by your opponent. If the sum is non-0, you can win by removing enough coins from one of the stacks to make the Nimber sum 0. (How do we know that there is always a way to remove coins from a single stack to make the Nimber sum 0? Try it out yourself and then read <a href="http://web.mit.edu/sp.268/www/nim.pdf">this article</a> for a more rigorous proof.)</p>
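<p>Since Nimber addition is just bitwise XOR, both the sum and the winning moves take only a few lines of code. Here is a minimal sketch in Python (the function names are mine, not from any particular solver):</p>

```python
from functools import reduce

def nim_sum(stacks):
    """Nimber addition: XOR all the stack sizes (binary addition without carrying)."""
    return reduce(lambda a, b: a ^ b, stacks, 0)

def winning_moves(stacks):
    """All (stack index, new size) moves that leave the opponent a Nimber sum of 0."""
    total = nim_sum(stacks)
    moves = []
    for i, size in enumerate(stacks):
        target = size ^ total  # the size this stack must shrink to
        if target < size:      # only legal if we actually remove coins
            moves.append((i, target))
    return moves

print(nim_sum([3, 4, 5]))        # 2, matching the worked example above
print(winning_moves([3, 4, 5]))  # [(0, 1)]: shrink the stack of 3 down to 1
```

<p>Shrinking the stack of 3 down to 1 leaves stacks of 1, 4, and 5, whose Nimber sum is 1 ⊕ 4 ⊕ 5 = 0.</p>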
<h2 id="sprague-grundy-theorem">Sprague-Grundy theorem</h2>
<p>It turns out that Nimbers will be quite useful to us. In fact, the <a href="https://en.wikipedia.org/wiki/Sprague%E2%80%93Grundy_theorem">Sprague-Grundy theorem</a> tells us that every “impartial” 2-player “sequential game” of “perfect information” using the “normal play convention” can be evaluated as a single Nimber and multiple games can be added together using Nimber addition!</p>
<p>Let’s unpack this with a few definitions. A 2-player <strong>sequential game</strong> is one in which the players alternate turns; no player gets multiple turns in a row. Nim, Bubbly, Chess and most other games are sequential. <a href="https://en.wikipedia.org/wiki/Dots_and_Boxes">Dots and Boxes</a> is an example of a game which is not sequential because filling a box gives the player another turn (although it could be converted into a sequential game by considering the whole sequence of extra turns as one move).</p>
<p>A game of <strong>perfect information</strong> means that there is no luck and no “hidden information” which one player knows, but the other does not. Nim, Bubbly, and Chess are all perfect information games. Poker, Backgammon, and Go Fish are examples of games without perfect information.</p>
<p>An <strong>impartial game</strong> is a game where the list of valid moves for either player is the same in all game positions. Nim, Bubbly and Dots and Boxes are all impartial games because the rules specify no difference between the moves that either player can make in a given position. In contrast, Chess, Go and Tic-Tac-Toe are not impartial games (they are called “partisan games”) because each player can only place (or move) their own pieces and not their opponent’s pieces.</p>
<p>The <strong>normal play convention</strong> is that the player who makes the last move (and thus leaves their opponent with no moves) is the winner. As discussed above, Nim and Bubbly both follow this convention. The other games mentioned above do not strictly follow it, having more complicated winning conditions (in fact, leaving your opponent with no moves results in a draw in Chess). (You may question whether this is really “normal” play given that so few games actually use this convention, but we digress.)</p>
<p>It turns out that every game satisfying these conditions can be evaluated to be worth a specific Nimber, and if you combine multiple games together, you can evaluate the total value by Nimber addition on the individual games (just as we did for the stack sizes in Nim). Let’s call these games Nimber games.</p>
<h2 id="adding-two-games">Adding Two Games</h2>
<p>What does it mean to add two games together? Well, it means that you play all the games at the same time, but on each person’s turn they must make a move in exactly one of the games. So, for example, if you break Nim up so that each stack is a separate game, you can see that the full game of Nim is exactly the sum of the individual stack games because the player must remove coins from exactly one stack/game.</p>
<p>Likewise in Bubbly, if you look at all the outside bubbles (ones not contained in any others: E, H and I in the picture above) you can see that each of those can be broken off as its own game, and the sum of those games will have the same value as the original game (because on every move in Bubbly, you must pick one outermost bubble and play a move inside of it).</p>
<h2 id="nimber-evaluation">Nimber Evaluation</h2>
<p>This is all well and good, but now that we have separated the Bubbly example above into 3 sub-games, how do we evaluate the Nimber value of each sub-game?</p>
<p>Well, 2 of those sub-games are actually rather easy to evaluate. For example, consider the top-left bubble (<code class="language-plaintext highlighter-rouge">E</code> with <code class="language-plaintext highlighter-rouge">G</code> inside), if we make a move in this bubble, there are only two options:</p>
<ol>
<li>Pop <code class="language-plaintext highlighter-rouge">E</code> and leave a single bubble <code class="language-plaintext highlighter-rouge">G</code></li>
<li>Pop <code class="language-plaintext highlighter-rouge">G</code> and leave nothing</li>
</ol>
<p>This is exactly the same as a Nim stack of size 2 (which has only two options, remove one coin and leave a single coin; or remove all the coins and leave nothing). So this bubble group has Nimber value 2.</p>
<p>Likewise the group in the bottom left is 4 nested bubbles and allows us four options: Leave (3, 2, 1, or 0) nested bubbles. So this is the same as a Nim stack of size 4 and so it has Nimber value 4.</p>
<p>The big bubble on the right is clearly going to be the hardest and this one will require the MEX-rule to help us out.</p>
<h2 id="mex-rule">MEX rule</h2>
<p>For a given Nimber game position, consider the set of possible positions that can be reached after one move (this will be the same set for either player since we’re only talking about impartial games). For example, given a Nim stack of size 3, the possible next positions are <code class="language-plaintext highlighter-rouge">{0, 1, 2}</code> (meaning you can reduce this to a stack of size 0, 1 or 2).</p>
<p>The MEX rule says that in any Nimber game, the Nimber value is equal to the smallest Nimber not in the set of possible next moves (the Minimum EXcluded or <strong>MEX</strong> Nimber).</p>
<p>So the Nim stack of size 3 from the example above has MEX 3, since 3 is the smallest Nimber not in the set of next positions <code class="language-plaintext highlighter-rouge">{0, 1, 2}</code>.</p>
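<p>The MEX computation is a short loop; a minimal sketch in Python:</p>

```python
def mex(nimbers):
    """Minimum EXcluded value: the smallest non-negative integer not in the set."""
    n = 0
    while n in nimbers:
        n += 1
    return n

print(mex({0, 1, 2}))        # 3: a Nim stack of size 3
print(mex({0, 4, 5, 6, 7}))  # 1: only the smallest missing value matters
```

<p>Note from the second example that large reachable values can be irrelevant: all that counts is the smallest Nimber that is missing.</p>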
<h2 id="applying-mex-to-bubbly">Applying MEX to Bubbly</h2>
<p>Let’s consider a more interesting example, the <code class="language-plaintext highlighter-rouge">F</code> bubble in the image above. This is one outside bubble with two stacks inside: A stack of size 1 (<code class="language-plaintext highlighter-rouge">S</code>) and a stack of size 2 (<code class="language-plaintext highlighter-rouge">N</code>-<code class="language-plaintext highlighter-rouge">U</code>). What are the possible next positions here:</p>
<ul>
<li>Pop <code class="language-plaintext highlighter-rouge">U</code>: Leaves a single bubble (<code class="language-plaintext highlighter-rouge">S</code>). Nimber value = 1</li>
<li>Pop <code class="language-plaintext highlighter-rouge">N</code>: Leaves two single bubbles (<code class="language-plaintext highlighter-rouge">S</code> & <code class="language-plaintext highlighter-rouge">U</code>). Nimber value = 1 ⊕ 1 = 0</li>
<li>Pop <code class="language-plaintext highlighter-rouge">S</code>: Leaves a single stack of two bubbles (<code class="language-plaintext highlighter-rouge">N</code>-<code class="language-plaintext highlighter-rouge">U</code>). Nimber value = 2</li>
<li>Pop <code class="language-plaintext highlighter-rouge">F</code>: Leaves a stack of one (<code class="language-plaintext highlighter-rouge">S</code>) & a stack of two (<code class="language-plaintext highlighter-rouge">N</code>-<code class="language-plaintext highlighter-rouge">U</code>). Nimber value = 1 ⊕ 2 = 3</li>
</ul>
<p>So the Nimber values of next positions are <code class="language-plaintext highlighter-rouge">{0, 1, 2, 3}</code> and so the MEX rule tells us that the <code class="language-plaintext highlighter-rouge">F</code> bubble with everything inside of it has Nimber value 4. For the remainder of this article, I’ll use <code class="language-plaintext highlighter-rouge">X+</code> to denote the <code class="language-plaintext highlighter-rouge">X</code> bubble with everything inside of it and I’ll use <code class="language-plaintext highlighter-rouge">nim(X+)</code> to refer to the Nimber value of <code class="language-plaintext highlighter-rouge">X+</code>. We just showed that <code class="language-plaintext highlighter-rouge">nim(F+) = 4</code>.</p>
<p>We have enough tools now to evaluate the Nimber value for <code class="language-plaintext highlighter-rouge">I+</code> (the entire <code class="language-plaintext highlighter-rouge">I</code> bubble and everything inside of it). The possible next positions are:</p>
<ul>
<li>Pop <code class="language-plaintext highlighter-rouge">I</code>: Leaves the <code class="language-plaintext highlighter-rouge">F+</code> & <code class="language-plaintext highlighter-rouge">T+</code>. <code class="language-plaintext highlighter-rouge">value = nim(F+) ⊕ nim(T+) = 4 ⊕ 4 = 0</code></li>
<li>Pop something in <code class="language-plaintext highlighter-rouge">F+</code>: Leaves <code class="language-plaintext highlighter-rouge">T+</code> and one of the positions listed above (<code class="language-plaintext highlighter-rouge">S</code>, <code class="language-plaintext highlighter-rouge">S</code> ⊕ <code class="language-plaintext highlighter-rouge">U</code>, <code class="language-plaintext highlighter-rouge">N+</code> or <code class="language-plaintext highlighter-rouge">S</code> ⊕ <code class="language-plaintext highlighter-rouge">N+</code>). We’ve already evaluated the Nimber values for all those sub-<code class="language-plaintext highlighter-rouge">F+</code> positions, those values are <code class="language-plaintext highlighter-rouge">{0, 1, 2, 3}</code>. So the value options when considering <code class="language-plaintext highlighter-rouge">T+</code> are <code class="language-plaintext highlighter-rouge">nim(T+) ⊕ {0, 1, 2, 3} = 4 ⊕ {0, 1, 2, 3} = {4, 5, 6, 7}</code></li>
<li>Pop something in <code class="language-plaintext highlighter-rouge">T+</code>: Leaves <code class="language-plaintext highlighter-rouge">F+</code> and one of (<code class="language-plaintext highlighter-rouge">C+</code>, <code class="language-plaintext highlighter-rouge">A+</code>, <code class="language-plaintext highlighter-rouge">L</code> or Empty). Value options = <code class="language-plaintext highlighter-rouge">nim(F+) ⊕ nim({C+, A+, L, ∅}) = 4 ⊕ {3, 2, 1, 0} = {4, 5, 6, 7}</code></li>
</ul>
<p>So the Nimber values of the next positions are <code class="language-plaintext highlighter-rouge">{0, 4, 5, 6, 7}</code> and thus the MEX rule evaluates <code class="language-plaintext highlighter-rouge">nim(I+) = 1</code>. (Note that the 4, 5, 6, 7 don’t affect this computation; we’re just looking for the minimum excluded number, which is 1.)</p>
<p>Finally we can evaluate the entire game’s Nimber value, which is: <code class="language-plaintext highlighter-rouge">nim(I+) ⊕ nim(H+) ⊕ nim(E+) = 1 ⊕ 4 ⊕ 2 = 7</code>.</p>
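<p>This whole recursive evaluation fits in a few lines of Python. The sketch below represents each bubble as the list of the bubbles directly inside it (the nesting is reconstructed from the picture as described above); the key subtlety, from the rules, is that popping a bubble also pops every bubble containing it:</p>

```python
def mex(nimbers):
    """Smallest non-negative integer not in the set."""
    n = 0
    while n in nimbers:
        n += 1
    return n

def moves(bubble):
    """Nimber values reachable in one pop from this bubble and its contents.
    A bubble is represented as the list of bubbles directly inside it."""
    child_nims = [nim(child) for child in bubble]
    total = 0
    for cn in child_nims:
        total ^= cn
    reachable = {total}  # popping this bubble itself frees its children
    for cn, child in zip(child_nims, bubble):
        rest = total ^ cn  # Nimber sum of the untouched sibling subtrees
        # Popping anything inside `child` also pops `child` and this bubble.
        reachable |= {rest ^ v for v in moves(child)}
    return reachable

def nim(bubble):
    return mex(moves(bubble))

E = [[]]              # E containing G
H = [[[[]]]]          # H containing V containing two more nested bubbles
I = [[[], [[]]],      # F containing S, and N containing U
     [[[[]]]]]        # T containing C-A-L
print(nim(E), nim(H), nim(I))    # 2 4 1
print(nim(E) ^ nim(H) ^ nim(I))  # 7
```

<p>This reproduces every value computed above: a single bubble is worth 1, a stack of n is worth n, <code class="language-plaintext highlighter-rouge">nim(I+) = 1</code>, and the full game is worth 7.</p>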
<h2 id="how-to-win-bubbly">How to win Bubbly</h2>
<p>Ok, so now what? We know the Nimber value of this position is 7. Woohoo … wait … what does that mean again?</p>
<p>This value tells you whether or not the current player has a winning strategy in this position. If the Nimber value is 0, the current player has no winning strategy (assuming perfect opponent play). If the Nimber value is not 0, then the current player does have a winning strategy. So what is it? The trick is to figure out how to pop a bubble and leave the Nimber value at 0 for your opponent. You could do this the brute-force way, by iterating through each bubble and seeing what Nimber value popping it leaves you with … or you could go about it systematically. For example:</p>
<ul>
<li>Pop something inside of <code class="language-plaintext highlighter-rouge">E+</code>: Leaves <code class="language-plaintext highlighter-rouge">H+</code> ⊕ <code class="language-plaintext highlighter-rouge">I+</code> ⊕ (something inside <code class="language-plaintext highlighter-rouge">E+</code>). <code class="language-plaintext highlighter-rouge">nim(H+) ⊕ nim(I+) = 4 ⊕ 1 = 5</code>, so the only way to get to 0 would be for the Nimber value of something inside <code class="language-plaintext highlighter-rouge">E+</code> to be 5, but the possible values after popping something in <code class="language-plaintext highlighter-rouge">E+</code> are <code class="language-plaintext highlighter-rouge">{0, 1}</code>, so that won’t work.</li>
<li>Pop something inside of <code class="language-plaintext highlighter-rouge">H+</code>: Leaves <code class="language-plaintext highlighter-rouge">E+</code> ⊕ <code class="language-plaintext highlighter-rouge">I+</code> ⊕ (something inside <code class="language-plaintext highlighter-rouge">H+</code>). <code class="language-plaintext highlighter-rouge">nim(E+) ⊕ nim(I+) = 2 ⊕ 1 = 3</code>. Aha: if we pop <code class="language-plaintext highlighter-rouge">H</code> we are left with <code class="language-plaintext highlighter-rouge">V+</code> and <code class="language-plaintext highlighter-rouge">nim(V+) = 3</code>, so popping <code class="language-plaintext highlighter-rouge">H</code> is a winning move because it leaves a position with Nimber value 0 (<code class="language-plaintext highlighter-rouge">nim(E+) ⊕ nim(I+) ⊕ nim(V+) = 0</code>).</li>
<li>Pop something inside of <code class="language-plaintext highlighter-rouge">I+</code>: Leaves <code class="language-plaintext highlighter-rouge">E+</code> ⊕ <code class="language-plaintext highlighter-rouge">H+</code> ⊕ (something inside <code class="language-plaintext highlighter-rouge">I+</code>). <code class="language-plaintext highlighter-rouge">nim(E+) ⊕ nim(H+) = 2 ⊕ 4 = 6</code>. If we look back at the evaluation of <code class="language-plaintext highlighter-rouge">nim(I+)</code> we can see that there are two ways to get to a Nimber value of 6 by popping something inside of <code class="language-plaintext highlighter-rouge">I</code>: Either by popping something inside of <code class="language-plaintext highlighter-rouge">F+</code> (<code class="language-plaintext highlighter-rouge">S</code>) or something inside of <code class="language-plaintext highlighter-rouge">T+</code> (<code class="language-plaintext highlighter-rouge">C</code>).</li>
</ul>
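<p>The brute-force approach can also be sketched with the Nimber machinery rather than the Min-Max search our team actually used: try every pop and keep the ones that leave a total Nimber value of 0. In the sketch below each bubble is a <code class="language-plaintext highlighter-rouge">(name, contents)</code> pair; <code class="language-plaintext highlighter-rouge">x1</code> and <code class="language-plaintext highlighter-rouge">x2</code> are placeholder names for the two bubbles inside <code class="language-plaintext highlighter-rouge">V</code> whose letters aren’t given here:</p>

```python
def mex(values):
    n = 0
    while n in values:
        n += 1
    return n

def pops(name, contents):
    """Yield (popped bubble, Nimber value of what remains of this subtree).
    A bubble is a (name, contents) pair; contents is a list of bubbles."""
    child_nims = [bubble_nim(b) for b in contents]
    total = 0
    for cn in child_nims:
        total ^= cn
    yield name, total  # popping this bubble itself frees its children
    for cn, child in zip(child_nims, contents):
        rest = total ^ cn  # Nimber sum of the untouched siblings
        for popped, v in pops(*child):  # deeper pops also pop this bubble
            yield popped, rest ^ v

def bubble_nim(bubble):
    return mex({v for _, v in pops(*bubble)})

def winning_pops(forest):
    """Names of bubbles whose popping leaves the opponent a Nimber value of 0."""
    nims = [bubble_nim(b) for b in forest]
    total = 0
    for n in nims:
        total ^= n
    wins = []
    for bubble, bn in zip(forest, nims):
        rest = total ^ bn  # Nimber sum of the other outermost bubbles
        for popped, v in pops(*bubble):
            if rest ^ v == 0:
                wins.append(popped)
    return sorted(wins)

game = [
    ('E', [('G', [])]),
    ('H', [('V', [('x1', [('x2', [])])])]),
    ('I', [('F', [('S', []), ('N', [('U', [])])]),
           ('T', [('C', [('A', [('L', [])])])])]),
]
print(winning_pops(game))  # ['C', 'H', 'S']
```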
<p>So there are 3 winning moves from this position: <code class="language-plaintext highlighter-rouge">C</code>, <code class="language-plaintext highlighter-rouge">H</code> or <code class="language-plaintext highlighter-rouge">S</code>. If you find the rest of the winning moves on the MIT Mystery Hunt Puzzle, maybe you’ll spell something interesting :)</p>Shawn LigockiWant to play a game? Pop any bubble you like. When you pop a bubble, all the bubbles that contain it will also be popped (your needle must pierce them to get the target bubble, of course).