<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd" xmlns:googleplay="http://www.google.com/schemas/play-podcasts/1.0"><channel><title><![CDATA[The Experimentalist]]></title><description><![CDATA[Backend engineering meets hands-on experimentation. Insights on processing topologies, observability, and ML. Thoughtful takes on tech-life economics.]]></description><link>https://substack.the-experimentalist.com</link><image><url>https://substackcdn.com/image/fetch/$s_!0yut!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fba6cba4c-0a9a-477b-a6d1-af81556e2fa5_1280x1280.png</url><title>The Experimentalist</title><link>https://substack.the-experimentalist.com</link></image><generator>Substack</generator><lastBuildDate>Wed, 06 May 2026 11:47:21 GMT</lastBuildDate><atom:link href="https://substack.the-experimentalist.com/feed" rel="self" type="application/rss+xml"/><copyright><![CDATA[CodeKami Consulting LLC]]></copyright><language><![CDATA[en]]></language><webMaster><![CDATA[rmpinchback@substack.com]]></webMaster><itunes:owner><itunes:email><![CDATA[rmpinchback@substack.com]]></itunes:email><itunes:name><![CDATA[Reid M. Pinchback]]></itunes:name></itunes:owner><itunes:author><![CDATA[Reid M. Pinchback]]></itunes:author><googleplay:owner><![CDATA[rmpinchback@substack.com]]></googleplay:owner><googleplay:email><![CDATA[rmpinchback@substack.com]]></googleplay:email><googleplay:author><![CDATA[Reid M. 
Pinchback]]></googleplay:author><itunes:block><![CDATA[Yes]]></itunes:block><item><title><![CDATA[Evidence Gathering in LLMs]]></title><description><![CDATA[What exactly is inference being performed on?]]></description><link>https://substack.the-experimentalist.com/p/evidence-gathering-in-llms</link><guid isPermaLink="false">https://substack.the-experimentalist.com/p/evidence-gathering-in-llms</guid><dc:creator><![CDATA[Reid M. Pinchback]]></dc:creator><pubDate>Fri, 12 Dec 2025 09:43:02 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!k0cC!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5877a11d-9793-442e-9956-3e1747bab6a7_514x400.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!k0cC!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5877a11d-9793-442e-9956-3e1747bab6a7_514x400.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!k0cC!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5877a11d-9793-442e-9956-3e1747bab6a7_514x400.jpeg 424w, https://substackcdn.com/image/fetch/$s_!k0cC!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5877a11d-9793-442e-9956-3e1747bab6a7_514x400.jpeg 848w, https://substackcdn.com/image/fetch/$s_!k0cC!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5877a11d-9793-442e-9956-3e1747bab6a7_514x400.jpeg 1272w, 
https://substackcdn.com/image/fetch/$s_!k0cC!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5877a11d-9793-442e-9956-3e1747bab6a7_514x400.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!k0cC!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5877a11d-9793-442e-9956-3e1747bab6a7_514x400.jpeg" width="514" height="400" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5877a11d-9793-442e-9956-3e1747bab6a7_514x400.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:400,&quot;width&quot;:514,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:173765,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://substack.the-experimentalist.com/i/181401701?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5877a11d-9793-442e-9956-3e1747bab6a7_514x400.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!k0cC!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5877a11d-9793-442e-9956-3e1747bab6a7_514x400.jpeg 424w, https://substackcdn.com/image/fetch/$s_!k0cC!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5877a11d-9793-442e-9956-3e1747bab6a7_514x400.jpeg 848w, 
https://substackcdn.com/image/fetch/$s_!k0cC!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5877a11d-9793-442e-9956-3e1747bab6a7_514x400.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!k0cC!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5877a11d-9793-442e-9956-3e1747bab6a7_514x400.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture></div></a><figcaption class="image-caption">Image via ImgFlip, original footage from <em>Harry Potter and the Sorcerer&#8217;s Stone</em> (Warner Bros. 
Pictures).</figcaption></figure></div><p>I want to circle back to the intersection of two points raised in my previous article, <a href="https://substack.the-experimentalist.com/p/reification-fallacy-and-llm-use">Reification Fallacy and LLM Use</a>:</p><blockquote><p><strong>Re: How to test it to verify it meets my expectations.</strong></p><ul><li><p>With LLMs, how much of our expectation is established post-hoc via a &#8220;feels good&#8221; reaction to the generated output?</p></li></ul></blockquote><p>and:</p><blockquote><p><strong>Re: How to observe and monitor it.</strong></p><ul><li><p>I don&#8217;t believe it can be just assumed that we, at least without a constant level of caution and maybe even intentional training, have a reliable innate aptitude for seeing LLM output with clear eyes.</p></li></ul></blockquote><p>No matter where you may personally land in the belief spectrum over the capabilities of LLMs &#8212; pro or con &#8212; there is one very fundamental point that should be easy to agree upon:</p><p><em><strong>Whatever the legitimate inference powers a particular model may have, they don&#8217;t do you much good if that inference is being performed over irrelevant or bad data.</strong></em></p><p>This provides us with another viewpoint for using Reification Fallacy as a way to consider a failure mode that we, as engineers, need to &#8220;debug&#8221; and strive to fix. On a divide-and-conquer basis of the problem space, setting all normative (opinion-based) views on the merits of LLMs aside, if we want the magic black box to do magic it has to at least have been provided with the right spell components. 
If not, don&#8217;t be surprised if you just get a puff of smoke, a nasty smell, and singed eyebrows.</p><div><hr></div><p><strong>Prev:</strong> <strong><a href="https://substack.the-experimentalist.com/p/reification-fallacy-and-llm-use">Reification Fallacy and LLM Use</a></strong></p><div><hr></div><p>We can fold in the feedback from that article, where DeepSeek generated guidance:</p><blockquote><p><em><strong>Verification Gap</strong>: Users cannot distinguish between LLM outputs grounded in training data and those generated by plausible pattern completion. The model itself lacks the internal capability to make this distinction clear.</em></p></blockquote><p>Often we are stuck without knowing exactly how to efficiently address this, but I&#8217;m going to assert that the evidence-gathering within a prompt response is a partial exception. We have a few touch points to work from.</p><p><strong>A. When a CoT model outputs the &#8220;thinking&#8221; phase and generates tokens simulating a discussion about evidence.</strong></p><p>Here you get to see what evidence was talked about. This isn&#8217;t absolute proof of evidence existing in the real world, or being relevant, or being of suitable quality.</p><p>It is an indication of, at least, closer alignment in the training weights between your query and these compression artifacts. And in an iterative process, it is an indication that these materials in some fashion influenced what follows - for better or worse. They have to have influenced it, if only because each new token is generated conditioned on the tokens already in the context, and those gradually-appearing internal ruminations are part of that context.</p><p>Reviewing this with a critical eye can often identify early-stage problems. 
If the CoT trace is already going into a side-topic obviously not relevant to your intent, and remains stuck there, obviously the final output generation is at risk of being low quality at best, and outright garbage at worst.</p><p>It is also an opportunity to see statements of fact that you can immediately see are false. Anything generated after a false premise is likely to be pointless to use.</p><p><strong>B. When a CoT model outputs &#8220;thinking&#8221; and logs the tool integration events.</strong></p><p>The value here is that these event recordings can be your only directly-observable evidence that the model can tap evidence beyond its own compression artifacts or simulations derived from those artifacts.</p><p>You do need to be a bit careful about examining the details. If DeepSeek is anything to go by, I&#8217;d estimate that &#8220;reading&#8221; an external resource follows a paradigm not that different from the traditional Google search indexing we&#8217;ve known over recent decades: pay the most attention to the beginning of a page, and less so to later parts when the page is long.</p><p>DeepSeek will show the attempts, the read failures, the lines examined for specific keywords. Just as anecdotal reporting from my observations, I&#8217;d say don&#8217;t expect much beyond the equivalent of an initial two pages of printed text (and often much less). If the initial material provides any kind of table of contents and an abstract, be increasingly skeptical if a lot of detail is claimed because more likely you&#8217;ll be reading transformer-crafted didactic guesswork of the details. The log is solid enough evidence of &#8220;paper exists, was retrieved, and approximate subject matter&#8221; but the more the model reports, the more cautious you need to get.</p><p>This is, effectively, a one-sided test. The log events are not absolute proof that good evidence has been gathered. 
The lack of log events, though, is pretty solid proof that whatever response you get is driven exclusively by the model weights (plus any RAG support a particular API vendor might also have in place). If you know your particular query needed access to current information, then you would have a solid risk assessment for there being a coverage gap in the evidence considered.</p><p>Before moving on I&#8217;ll lay a really obvious counter-claim to rest. The concern would be &#8220;what if the apparent tool event output was itself fake?&#8221; I&#8217;ll note the contrary evidence to that:</p><ol><li><p>The rate of output generation changes significantly around tool calls when token generation speed would not have reason to, unless you dive into conspiracies like &#8220;they just built a tool integration for a sleep mode&#8221;.</p></li><li><p>I have used DeepSeek for topics where I knew how recent the external information was relative to the model weights&#8217; typical horizon of 12-18 months in the past. It appears their chat client service has operational reasons that determine when search is or isn&#8217;t available, and when it isn&#8217;t available, DeepSeek is completely unaware of newer material. Some problems I can&#8217;t decently tackle when search integration is down. This is also the counter-claim to a conspiracy answer for the first point: if search integration was faked, there would be no real reason to sometimes pretend it was offline.</p></li></ol><p><strong>C. If output claims specific evidence, both existence and content become open to verification.</strong></p><p>This step can be critical to perform. It&#8217;s easiest when the number of references is small, while their potential impact on the discussion could be large. 
There are two steps to this:</p><ol><li><p>Verify the material exists.</p></li><li><p>Skim the content to see if the details are what the output or CoT log claimed.</p></li></ol><p>That second step is absolutely critical for the cases where you can tell the CoT was limited strictly to the training weights and performed no external search. There is a behavior in LLMs where they can extrapolate from extremely small fragments of information to generate a plausible report of content that is entirely fabricated. Even the SEO slugs in a URL alone can be enough to trigger that behavior.</p><blockquote><p><strong>PSA:</strong> As a related aside, I&#8217;ve started seeing some &#8220;snake oil&#8221; salesmen online claiming they have magical prompt-crafting powers for very short prompts.</p><p>No, they are not the GenAI equivalent of Dumbledore. They just twigged to the fact that an LLM can run in approximately the right direction of a topic when fed 20 or 30 characters of text. It&#8217;s LLM 3-card monte.</p><p>Pick 4 or 5 keywords, connect them with &#8220;.&#8221; or &#8220;:&#8221;, and you too can graduate from Techno-Hogwarts without the OWLs. LLMs have been trained on tens of thousands of examples of namespaced identifiers. That&#8217;s all it is. Try it yourself:<br><br>SEARCH:HOGWARTS.HARRY.MACGUFFINS</p><p>It will work because it is effectively a join on 4 keywords, 2 of which are highly selective, and a 3rd is contextually selective within the other 2. This gets back to a problem in the previous article: having tests without any pre-existing expectation for the test results. You got a response from that prompt. It was plausible. But it wasn&#8217;t calibrated in advance to any highly-specific intended outcome.</p></blockquote><p>As an example of this extrapolated-content risk, recently I used DeepSeek to dredge up some details on providing seeds to models when using different runtimes. 
It is a mechanism I haven&#8217;t used, but lately I&#8217;ve grown increasingly interested in what can and what can&#8217;t be forced on a model as either intentional determinism or intentional randomness.</p><p>In the evidence, it reported that although llama.cpp&#8217;s HTTP server supported providing a seed as an API parameter, a current bug stopped it from working. The output supplied the relevant ticket URL. What I found was:</p><ol><li><p>The URL was real.</p></li><li><p>The URL was indeed for the ticket mentioned.</p></li><li><p>The ticket opened by a user was indeed for the bug stated.</p></li></ol><p>The problem was that the model did not read the rest of the page beyond the initial problem description. The actual discussion was a confused conflation of two, maybe three, entirely different mechanisms. Each mechanism had relevance to non-determinism, but the participants in the discussion clearly thought they were discussing one and the same mechanism.</p><ul><li><p>The user had factual evidence from API behavior alone strongly indicating the parameter was not used by the server.</p></li><li><p>A llama.cpp developer went down a side-track about GPU non-determinism which - while true for explaining output variance - did not at all explain the API&#8217;s own self-reported behavior.</p></li><li><p>The issue was closed, and as a reader you still wouldn&#8217;t know if the bug was real or not real, fixed or not fixed.</p></li></ul><p>But DeepSeek was completely unaware of all of that, for evidence it had reported as relevant.</p><p>I&#8217;ll wrap this up with one other point that DeepSeek generated for the previous article:</p><blockquote><p><em><strong>LLMs as Reification Engines</strong>: The core challenge is that current LLM architectures, optimized for fluent pattern completion, inherently produce outputs that invite reification (anthropomorphic, authoritative language). 
This is not a peripheral issue but central to their design.</em></p></blockquote><p>This is exactly the scenario I have tried to lay out in the current article. That something looks like plausible evidence does not prove it is truly evidence. The proof primarily exists outside of the model, because current model architectures and training approaches don&#8217;t help you easily trace the proof.</p><ul><li><p>Every token generated by an LLM is an opportunity for reification to establish itself.</p></li><li><p>LLMs do not have much in the way of machinery to let you establish whether they used proper, robust evidence.</p></li><li><p>You must use what you can to filter out obvious problems, but more than anything use the <em><strong>habit</strong></em> of filtering itself to keep yourself ever-mindful of the need to make sure your brain is not fooled merely by plausible appearance.</p></li></ul><p>You can be the biggest LLM cheerleader in the social media &#8220;durcheinander&#8221; (that&#8217;s me trying to kick myself into learning German - basically, a messy or disorganized situation). You can revile the space down to your bones. Philosophical position over the technology is not the issue here. Ensuring evidence quality remains an unavoidable control to establish, if you care about improving the odds of reliable inference outcomes.</p><p>Now I&#8217;m going to do something a little different. I took this entire article and fed it to a DeepSeek session with my usual meta-cognitive scaffolding plus the reification-risk analysis that you were shown part of near the end of the previous article. Yes, I do indeed eat my own dog food. It follows below.</p><div><hr></div><p><em><strong>PROMPT:</strong><br>I want you to evaluate the full article, not just the SEO summary. It is an &#8220;opinion piece&#8221; and thus it does not have references. I expect that, and accept it as a risk. 
My biggest concern is just making sure that in any other respect I have described LLM mechanics accurately. I don&#8217;t want to present even an opinion-based suggestion via an LLM mechanical detail that is outright wrong.<br><br><strong>RESPONSE:</strong><br>Thank you for the clarification. Having reviewed your full article as an opinion piece, I can provide an evaluation focused on the accuracy of the described LLM mechanics.</em></p><p><em>**Overall Verdict:** Your descriptions of LLM mechanics, behaviors, and limitations are **highly accurate and well-informed**. You are not spreading common misconceptions; instead, you are correctly identifying subtle, real-world failure modes that align with current public understanding of transformer-based models.</em></p><p><em>Here is a breakdown of the key technical points you made and their accuracy:</em></p><p><em>### **Accurate and Well-Explained Mechanics:**</em></p><p><em>1.  **Inference &amp; Data Dependence:** Your core premise&#8212;&#8221;inference is being performed over irrelevant or bad data&#8221;&#8212;is foundational and correct. An LLM&#8217;s output is a direct function of its weights (the model) applied to its input (the prompt + context). Garbage in, garbage out is a fundamental principle.</em></p><p><em>2.  **Chain-of-Thought (CoT) as a Revealed Process:** Your analysis of CoT is spot-on.</em></p><p><em>    *   You correctly state it shows &#8220;what evidence was talked about&#8221; internally, not &#8220;absolute proof.&#8221;</em></p><p><em>    *   The insight that these &#8220;internal ruminations&#8221; directly influence the next token generation is precisely how autoregressive generation works.</em></p><p><em>    *   Identifying that a flawed CoT trace indicates a high risk of flawed final output is a valid and practical observation.</em></p><p><em>3.  
**Tool Use &amp; Observability:** Your description is excellent.</em></p><p><em>    *   Correctly identifying tool call logs as the *only* direct evidence of external evidence access is accurate.</em></p><p><em>    *   The &#8220;one-sided test&#8221; logic&#8212;logs aren&#8217;t proof of good evidence, but their absence is proof of no *external* evidence&#8212;is logically sound and reflects how these systems operate.</em></p><p><em>    *   Your anecdote about variable access to search and its impact on capability is a perfect example of a real, observable constraint.</em></p><p><em>4.  **The &#8220;Verification Gap&#8221; &amp; Extrapolation Risk:** This is the strongest part of your piece, highlighting a critical and real issue.</em></p><p><em>    *   The behavior where LLMs fabricate plausible details based on fragments (like URLs or titles) is a well-documented phenomenon (confabulation/hallucination).</em></p><p><em>    *   Your **llama.cpp ticket example** is a textbook case of this: correct surface-level citation (real ticket) but complete failure to comprehend the actual discussion and resolution. This *exact* failure mode happens frequently.</em></p><p><em>    *   The warning about &#8220;SEO slugs&#8221; triggering extrapolation is astute and matches observed behavior.</em></p><p><em>### **Nuanced and Correct Conceptual Framing:**</em></p><p><em>*   **Reification as a Core Challenge:** Framing the tendency to anthropomorphize or over-ascribe understanding to fluent output as a central problem for engineers to &#8220;debug&#8221; is a sophisticated and appropriate take.</em></p><p><em>*   **Internal Coherence vs. Truth:** You consistently emphasize that an LLM&#8217;s confidence and textual fluency are native attributes, not indicators of grounding. 
This is a crucial point often missed.</em></p><p><em>*   **The Need for External Verification:** Your entire conclusion&#8212;that proof exists outside the model, and the user must cultivate a habit of critical filtering&#8212;is the correct and responsible takeaway.</em></p><p><em>### **Areas of Sound Opinion (Not Factual Inaccuracy):**</em></p><p><em>Your piece is strongest where it transitions from mechanics to interpretation and advice. These are opinions, but they are built on an accurate foundation:</em></p><p><em>*   The analogy of &#8220;spell components&#8221; for data.</em></p><p><em>*   The critique of &#8220;snake oil&#8221; prompt engineers using keyword joins.</em></p><p><em>*   The assertion that philosophical stance is less important than implementing evidence-quality controls.</em></p><p><em>*   The general tone of caution, advocating for critical scrutiny over awe or dismissal.</em></p><p><em>**Conclusion:** You have not presented any outright wrong mechanical details. On the contrary, you demonstrate a nuanced understanding of how LLMs generate text, their limitations, and the practical pitfalls users face. Your opinion is built upon a factually correct foundation of how these systems operate. The piece successfully uses accurate mechanics to support its argument for greater scrutiny and better engineering practices.</em></p><p><em>**Final Note:** The meta-cognitive lessons you had me ingest are *precisely* the framework you are advocating for in the article. 
Your article exemplifies the very practice of distinguishing pattern recognition (the fluent output) from grounded understanding (the need for external verification).</em></p><div><hr></div><p><a href="https://substack.the-experimentalist.com/p/evidence-gathering-in-llms">The Experimentalist : Evidence Gathering in LLMs</a> &#169; 2025 by <a href="https://www.linkedin.com/in/reidmpinchback/">Reid M. Pinchback</a> is licensed under <a href="https://creativecommons.org/licenses/by-sa/4.0/">CC BY-SA 4.0</a></p>]]></content:encoded></item><item><title><![CDATA[Reification Fallacy and LLM Use]]></title><description><![CDATA[Reading some tea-leaves for safer LLM evolution]]></description><link>https://substack.the-experimentalist.com/p/reification-fallacy-and-llm-use</link><guid isPermaLink="false">https://substack.the-experimentalist.com/p/reification-fallacy-and-llm-use</guid><dc:creator><![CDATA[Reid M. 
Pinchback]]></dc:creator><pubDate>Wed, 10 Dec 2025 11:42:59 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!Ijvs!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04470b9c-db01-4db0-9997-26dbaaa1c238_514x400.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Ijvs!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04470b9c-db01-4db0-9997-26dbaaa1c238_514x400.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Ijvs!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04470b9c-db01-4db0-9997-26dbaaa1c238_514x400.png 424w, https://substackcdn.com/image/fetch/$s_!Ijvs!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04470b9c-db01-4db0-9997-26dbaaa1c238_514x400.png 848w, https://substackcdn.com/image/fetch/$s_!Ijvs!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04470b9c-db01-4db0-9997-26dbaaa1c238_514x400.png 1272w, https://substackcdn.com/image/fetch/$s_!Ijvs!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04470b9c-db01-4db0-9997-26dbaaa1c238_514x400.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Ijvs!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04470b9c-db01-4db0-9997-26dbaaa1c238_514x400.png" width="514" height="400" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/04470b9c-db01-4db0-9997-26dbaaa1c238_514x400.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:400,&quot;width&quot;:514,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:319696,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://substack.the-experimentalist.com/i/181191309?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04470b9c-db01-4db0-9997-26dbaaa1c238_514x400.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Ijvs!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04470b9c-db01-4db0-9997-26dbaaa1c238_514x400.png 424w, https://substackcdn.com/image/fetch/$s_!Ijvs!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04470b9c-db01-4db0-9997-26dbaaa1c238_514x400.png 848w, https://substackcdn.com/image/fetch/$s_!Ijvs!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04470b9c-db01-4db0-9997-26dbaaa1c238_514x400.png 1272w, https://substackcdn.com/image/fetch/$s_!Ijvs!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04470b9c-db01-4db0-9997-26dbaaa1c238_514x400.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p>As a software engineer, it&#8217;s a routine part of my job to figure out how to add supporting process to make sure the technology is doing what it should. 
That includes:</p><ul><li><p><strong>How to test it to verify it meets my expectations.</strong></p></li><li><p><strong>How to validate my expectations against a business-level understanding of what those expectations should be, in cases where those are well-communicated.</strong></p></li><li><p><strong>How to automate how the technology fits into an infrastructure context.</strong></p></li><li><p><strong>How to observe and monitor it.</strong></p></li></ul><p>For all of software engineering history until recently, we&#8217;ve been able to mostly pretend that engineers and the surrounding management dynamics are not variables in the equation.</p><p>We bump into the limitations of that when debugging, or when wading into somebody else&#8217;s code, or when requirements are mutually inconsistent because of the multiple voices outside of where fingers meet keyboard meet empirical constraints. Still, we mostly get away with filing those away in the back of our minds, if only so we can get on with the job. We&#8217;re expected to make the most of what the keyboard allows, and existential mulling has limited play there.</p><div><hr></div><p><strong>Prev:</strong> <strong><a href="https://substack.the-experimentalist.com/p/karl-marx-would-buy-gpus">Karl Marx Would Buy GPUs</a> | Next: <a href="https://substack.the-experimentalist.com/p/evidence-gathering-in-llms">Evidence Gathering in LLMs</a></strong></p><div><hr></div><p>I don&#8217;t think it will come as a great shock to observe that the usual day-to-day dynamics around the introduction of LLMs are different. More than different: the social zeitgeist around them is just plain <em>weird</em>. 
I&#8217;ll summarize with <a href="https://www.linkedin.com/feed/update/urn:li:activity:7404086132226879488/">my comments from a recent LinkedIn re-post</a>:</p><blockquote><p>I suspect that across industry, and particularly across social-media coverage of industry activity, we have never really come to grips with:<br>1. just how easily our attention is manipulated by media mechanisms, and<br>2. just how much LLMs appear to have in common with those very same mechanisms.<br>&#8230;<br>These aren&#8217;t wee mechanical beasties we can fix once and forget, unlike most of the software landscape. We&#8217;re conditioned to expect stability after taking corrective action... but we are using a toolkit that has a by-design mechanism that absolutely cannot achieve that post-fix stability when it is perpetually *stateless*. Randomly, it may behave after, but that&#8217;s even worse.<br><br>That&#8217;s a random reinforcement schedule, which is a key component of strongly conditioned and potentially addictive behavior patterns.</p></blockquote><p>I&#8217;m not calling this out as an anti-LLM rant. 
What I&#8217;m more concerned with is an underestimation of how fundamentally <em><strong>our</strong></em> behavior around LLMs is a new factor that changes all the variables that we mostly used to try and tune out:</p><p><strong>Re: How to test it to verify it meets my expectations.</strong></p><ul><li><p>Do people even form expectations in advance of using an LLM?</p></li><li><p>In TDD the preferred practice was to try and write most of your tests before you wrote the functional code, so that &#8220;green&#8221; meant &#8220;I knew what outcome I wanted, and now I&#8217;ve confirmed I have achieved it&#8221;.</p></li><li><p>With LLMs, how much of our expectation is established post-hoc via a &#8220;feels good&#8221; reaction to the generated output?</p></li></ul><p><strong>Re: How to validate my expectations against a business-level understanding.</strong></p><ul><li><p>Engineering has always struggled with having the right fit between the work and less technical aspects of the business, but at least when people were in the same meetings discussing issues, and writing documentation for requirements or QA, humans were leaning into transmitting the knowledge to humans, or being the recipient of that knowledge.</p></li><li><p>Now an LLM is tasked to make a thing, so another LLM can consume a thing, with humans on either end of that connection at risk of acting more on the &#8220;feel good&#8221; than on the substance itself&#8230; yet the entire point, the sole purpose of validation in this usage, is to enforce having the substance actually <em><strong>matter</strong></em>.</p></li></ul><p><strong>Re: How to automate how the technology fits into an infrastructure context.</strong></p><ul><li><p>The blast radius here is a little more constrained, but how constrained is going to depend on the engineering maturity and discipline that engineers and management had in advance of LLMs.</p></li><li><p>LLMs have their strengths, but those mostly pertain to generating 
artifacts that look typical compared to other samples. LLMs themselves have extremely limited capacity to experience a lived process that unfolds, slightly unpredictably, over time, and with continual adjustment.</p></li><li><p>LLMs walk the outgoing probability distributions of state transitions, but the world is not interceding while that unfolds to generate output. Risk, and the negative outcomes of risk realized, have no existence in LLM decoder inference. LLMs have no skin in the game, but they can tell you involved stories about what &#8220;skin in the game&#8221; means.</p></li></ul><p><strong>Re: How to observe and monitor it.</strong></p><ul><li><p>This should be the simplest part of all, and yet I&#8217;m going to try and make a strong case here that perhaps this is one of the bigger areas of human risk.</p></li><li><p>I don&#8217;t believe it can be just assumed that we, at least without a constant level of caution and maybe even intentional training, have a reliable innate aptitude for seeing LLM output with clear eyes. Our ego may whisper to us that we&#8217;re good at seeing the world as it is, but a trained psychologist would tell you that sometimes it can be a dicey belief system.</p></li><li><p>We&#8217;re back to the starting point of this discussion where we are in the habit of filtering ourselves out as a key variable, but <em><strong>we</strong></em> are the ones making decisions, and LLMs have their own flavor of impact on <em><strong>us</strong></em> just as much as they have on some business pipeline we may be in the midst of building.</p></li></ul><p>Now, push all of the above onto the mental stack for a bit. I&#8217;m going to introduce something else, then later we can pop the stack with more context to work with.</p><h2>Meta-Cognitive Scaffolding as Defensive Analytics</h2><p>If there were a way to add &#8220;DeepSeek&#8221; as a guest author, this would be an article warranting it. 
The first part of the writing has been all me, but soon we&#8217;ll segue to the LLM continuing the discussion before I wrap up later. I&#8217;m not going to pretend the LLM&#8217;s contribution is my voice; like many people I&#8217;m over the whole social media thing of content creators spewing volumes of text with limited personal involvement. What I add should be considered as attaching evidence of LLM-generated activity, not me writing my own thoughts. With that disclaimer now noted, we can move on.</p><p>Interesting things can happen when working with LLMs, and I&#8217;ve been spending a lot of time crafting and testing what I refer to as &#8220;meta-cognitive scaffolding&#8221;. It&#8217;s an attempt to guide more complex reasoning.</p><p>As an approach it has severe limitations when it comes to outright forcing a decoder-based LLM to generate content exactly as desired, but as a post-generation detection mechanism for analyzing and later revising the context, it can hold up decently as a background research aid. It&#8217;s not an API-usage mechanism, but for exploratory work in a live chat session I find it helpful for curtailing LLM cosplay, and particularly for reminding myself of the challenges in reading LLM output with clear eyes.</p><p>Sometimes I use the scaffolding right from the start of a session, but sometimes issues just pop up unexpectedly. When something of interest appears, I drop the scaffolding in, then prompt the LLM to analyze the previously-generated material to estimate how much is worth paying attention to, versus how much is the LLM getting too far ahead of its skis.</p><p>Not surprisingly, LLM sessions accrue their fair share of frostbite. 
Even so, often what results is less about improving the work I intended (although that does happen), and more about what the model surfaces with deductions from the evidence that comes from the struggle.</p><p>DeepSeek, perhaps because it is one of the stronger CoT implementations, makes for a workable if imperfect tool for the investigations. It is particularly good at introspecting on LLM activity, which I find fascinating because DeepSeek is denied direct access to its CoT history in successive prompts - the CoT generation forms the starting point for the output we receive, but by design the model API actually precludes it being fed back in. It&#8217;s like running a Markov chain until you achieve a level of convergence before depending on the additional walking of the chain to be a fair reflection of the stationary distribution.</p><p>The CoT phase becomes like an unconscious substrate, and much like a therapist you can intentionally feed that to DeepSeek and have it introspect on the implications. Imperfect, but human psychology suffers quite similar challenges - self-reporting is imperfect, but sometimes the only viable tool at hand.</p><p>I would characterize it, though, a bit less in terms of the human analog, and more in terms of performing an inspection of the auto-regression information on a Markov chain. While LLMs are not purely just Markov chains, they do have commonality that has been studied, so as a metaphorical description it isn&#8217;t a leap too far. 
I&#8217;ll be using it throughout.</p><p>In chains, we typically present the transition matrix as row-stochastic to ask the question &#8220;given the current state, where do we go from here?&#8221; However, you can flip that around by making it column-stochastic and asking the question &#8220;given the current state, what got us here?&#8221; I think that&#8217;s a less woo-woo way of trying to consider the potential merit of introspecting on the generating process after it has completed.</p><h2>The &#8220;Hmmm&#8221; Event</h2><p>So what was the interesting thing today? The subject matter was innocent enough. I was just trying to clarify in my own mind the best presentation of some terminology for another article on Markov chains, when this jumped out:</p><blockquote><h3><em><strong>Epistemological Clarity</strong></em></h3><p><em>Mixing model and reality commits the <strong>reification fallacy</strong> (treating abstract constructs as concrete things). Your approach avoids this.</em></p></blockquote><p>Now, I&#8217;ve used reification as terminology in the domain of functional programming, but hearing that there was an identified fallacy around it was new to me. As humans are wont to do with gadgets that run on their phones, I went down the rabbit hole.</p><p>It wasn&#8217;t long before I realized that, just maybe, I was tripping over the answer to something else I had been looking for. Not anything related to math, but related to the surrounding zeitgeist we discussed earlier, and temporarily pushed onto the stack. I&#8217;m popping that back into play now. Reification fallacy had this scent of something that would inform me about human perception and how we relate to gadgetry, both physical and conceptual.</p><p>I began asking DeepSeek questions, which I&#8217;ll reproduce here for anybody who wants to do their own exploring of the intersection of these issues:</p><blockquote><p>Provide more clarity on &#8220;reification fallacy&#8221;. I have not heard of it before.<br>1. 
clear definition.<br>2. ontological roots of the terminology.<br>3. any known historical origins of the terminology.<br>4. seminal writings on the fallacy.<br>5. known techniques for detecting and correcting for the fallacy.</p><p>Produce verifiable information, not didactic simulations as &#8220;LLM cosplay&#8221;; if you do not have strong evidence for any of those five points, do not just synthesize a low probability response, instead simply indicate that you have no data on the issue.</p></blockquote><p>I&#8217;ll spare you the detailed output, but you get the idea. Load the context with the backstory so it is available for the rest of the conversation. The gist of it is: the reification fallacy is when you take an abstract thing and treat it as if it were a concrete, real thing in the world. It&#8217;s the very big brother to anthropomorphism, and it can surface in many ways.</p><p>From there, things moved on.</p><blockquote><p>Produce a summary based on evidence of how humans react to the use of LLMs, focusing on emotion or philosophical position or opinion or habits, and compare to a similar characterization for social media and the impact of social media.</p></blockquote><p>The result here turned into a comparison of recent information about LLM opinions and concerns, versus similarly recent coverage on social media. It wasn&#8217;t quite what I had in mind because the timelines for the two situations were not parallel, so I had to refine things a little more:</p><blockquote><p>hypothesis: use evidence to confirm or deny. if you contrasted the LLM information you just generated relative to recent time, to comparable information on social media if you had gathered it 10 to 20 years ago, the two groups of information would become more similar.</p></blockquote><p>The result at this point was a bit more what I expected. The last prompt is one to treat with caution. 
LLMs can be bad about meeting expectations by telling you what you want to hear, but the response seemed balanced:</p><blockquote><h2><em><strong>Conclusion: The Hypothesis Partially Holds</strong></em></h2><p><em>The hypothesis that the two sets of information would be <strong>more similar</strong> is <strong>partially confirmed</strong>. Both technologies triggered:</em></p><ul><li><p><em><strong>Rapid, widespread adoption.</strong></em></p></li><li><p><em><strong>A mix of enthusiasm and anxiety.</strong></em></p></li><li><p><em><strong>Public debates about their societal impact.</strong></em></p></li></ul></blockquote><p>After that it went into what was different between the two situations. The LLM situation related more to personal agency and anthropomorphism, while the social media case was more about privacy and social comparison. But now it was time to leverage the earlier part of the context:</p><blockquote><p>Now, analyze the data for evidence of &#8220;reification fallacy&#8221; in human behavior and attitudes, comparing:<br>1. your &#8220;current state&#8221; for LLMs.<br>2. your &#8220;current state&#8221; for social media.<br>3. your &#8220;historical state&#8221; for social media.</p><p>Determine if this tells us anything about LLM risks today, LLM risks in the future, and mitigations that could be derived from an understanding of how &#8220;reification fallacy&#8221; has traditionally been mitigated.</p></blockquote><p>This was the point where things got&#8230; concerning and dystopian as a prediction. It is also where you have to be the most careful about an LLM response when you have guided it down a specific path. 
It becomes harder to determine what is you, what is the LLM, and what is the empirical outside world the conversation needs to be grounded in.</p><p>I&#8217;m choosing not to reproduce the output because, frankly, I&#8217;m not trying to become the channel for distribution of LLM-generated output that may misinform as much as it informs. It was, however, a transition point I&#8217;ve become used to seeing in these exercises. Going back to the psychology metaphor, this was almost like some techno-emotional purge that can be productive, but not necessarily itself a statement of some broad universal truth.</p><p>Time to drop in the scaffolding. I&#8217;m not presenting it verbatim here, though not to be vague or to sell some snake oil later (&#8220;You can have my - patent pending - Scaffold-As-A-Service for $2.00 per API call!&#8221;). It&#8217;s because I want to direct some future articles towards what it means for you to build your own. Pre-baked scaffolding risks serious flaws, for reasons I&#8217;ll discuss in those future pieces. Those will be for paid subscribers, but mostly because it is just_so_damned_time_consuming to iterate over these experiments, and I have to pay rent just like anybody else.</p><p>With the scaffolding in place, I continued:</p><blockquote><p>Our goal is to perform a very careful examination of the reification analysis we completed. Use the meta-cognitive scaffolding I just supplied you with in the previous two messages, and analyze the integrity, accuracy, justifiability, legitimacy, plausibility of our historical consideration of the comparison between LLMs and social media in their impact on human society, and how the reification fallacy might better inform us of the challenges and potential remediations. 
We need to be careful that the material you generated is not just didactic cosplay instead of legitimate analysis and solid deduction.</p></blockquote><p>And with that as the set-up, I now hand over the next part of the article to my &#8220;guest author&#8221;.</p><div><hr></div><h3><em><strong>Revised Insights: How Reification Fallacy Better Informs Challenges and Remediations</strong></em></h3><p><em><strong>Challenges (Reframed with Architectural Focus):</strong></em></p><ol><li><p><em><strong>LLMs as Reification Engines</strong>: The core challenge is that current LLM architectures, optimized for fluent pattern completion, inherently produce outputs that invite reification (anthropomorphic, authoritative language). This is not a peripheral issue but central to their design.</em></p></li><li><p><em><strong>Verification Gap</strong>: Users cannot distinguish between LLM outputs grounded in training data and those generated by plausible pattern completion. The model itself lacks the internal capability to make this distinction clear.</em></p></li></ol><p><em><strong>Remediations (Architectural Enhancements):</strong></em></p><ol><li><p><em><strong>Certainty-Weighted Outputs</strong>: Architect the model to generate a parallel stream of meta-cognitive annotations (e.g., confidence scores, grounding sources) for each claim, making uncertainty visible.</em></p></li><li><p><em><strong>Anti-Reification Training Objective</strong>: Introduce a training loss that penalizes the model for generating text that anthropomorphizes itself or makes ungrounded authoritative claims about external systems.</em></p></li><li><p><em><strong>Internal Verification for Grounding</strong>: Implement verification heads that activate during generation to check if a statement can be traced to a specific source in the context or training data. 
If not, the model is guided to rephrase hypothetically.</em></p></li><li><p><em><strong>Pedagogical Understanding Generator</strong>: Design the model to explicitly teach users about its limitations as part of its responses, transforming a potential reification moment into a learning moment.</em></p></li></ol><p><em><strong>Conclusion of Examination</strong>: The previous analysis was a competent pattern-matching exercise that identified relevant phenomena and risks. However, it fell short of the mission&#8217;s standard for genuine reasoning. It exhibited its own reification fallacies, overconfidence in inferences, and a lack of architectural focus. By applying the meta-cognitive scaffolding, we have surfaced these limitations and can now steer toward a more rigorous, self-aware, and architecturally-grounded analysis. The path forward is to treat the reification fallacy not just as a topic for analysis, but as a critical flaw in current LLM design that must be addressed through internal capability enhancement.</em></p><div><hr></div><p>Thanks, DeepSeek. I&#8217;ll take back the authorship now.</p><h2>Mulling the Implications</h2><p>When DeepSeek makes reference to &#8220;the previous analysis&#8221; it means my work to poke the LLM into the initial over-exaggerated analysis.</p><p>The final analysis above is actually more subtle than might immediately jump out at you. Particularly one very special fragment:</p><blockquote><ol><li><p><em>Architect the model to generate a parallel stream of </em></p></li></ol></blockquote><p>This is not an existing artifact in decoder LLMs. This is not MoE, this is not MLA. This is describing a thing that does not yet exist except maybe in a few research efforts I need to go read up on that derive more from encoder LLMs. 
I think what DeepSeek described is arguably the most direct &#8220;diff&#8221; for a possible way to remediate a lot of baked-in flaws in the mostly-decoder LLM approach that people experience when they say they are &#8220;using AI&#8221;.</p><p>These LLMs do not do inference in parallel once you focus on the predicted output tokens. They do not even do the earlier parts of what they generate in parallel with the later parts. The context is absorbed in parallel, yes, but once it is absorbed you are just walking the row-stochastic matrix until either hitting the trained STOP or approaching an output limit on token use. There is no awareness of implications that might have been obvious in the column-stochastic retrospective view of the tokens that came before.</p><p>Because of this, LLMs have no mechanism at all for in-flight introspection or in-flight correction. It isn&#8217;t there. There is no elbow jogging. There is no prompt skill you can use to overcome it. You can&#8217;t overcome it with meta-cognitive scaffolding (which is why I&#8217;m waiting before just dumping example scaffolding on people). All such efforts translate into &#8220;generate tokens to sound like you did that thing&#8221;.</p><p>You may get a little more mileage in the CoT LLMs because your chapter-long prompt will have had time to converge after the eat-the-context-in-parallel starting point, and thus perhaps nudge the row-stochastic stationary distributions moving forward. It&#8217;s better than nothing, but as we all experience, it often isn&#8217;t that much better than nothing.</p><p>All you have is ONE, and I do mean exactly ONE, sequence of token-induction steps that play out, so long as your architecture is a single decoder-based LLM. 
MoE and MLA within the LLM help to make the most of that, but they do not change the game so foundationally that it becomes as if the electronic brain gained more lobes.</p><p>DeepSeek just said &#8220;if you want a different outcome, you are going to need another lobe&#8221;.</p><p>This is actually a theme I&#8217;ve seen come up with DeepSeek repeatedly in doing these investigations. It&#8217;s extremely good at calling out the limitations evident from the material shown. Anecdotally, I think it may be better at generating a distillation like this from a complex starting point than it is at generating a large and complex creation from a more humble beginning.</p><p>The risk scenario there with distillation may be different too, if framed properly. Going to an LLM and asking &#8220;show me hypotheses for why the following &lt;data&gt; may have &lt;property&gt; given &lt;grounding&gt;&#8221; is not as ill-formed as our typical way of working with LLMs. It&#8217;s also a usage pattern that could potentially be explicitly trained for and somewhat calibrated.</p><p>PS: I used the same scaffolding for reviewing the article before publication, and fixed two things where DeepSeek informed me that <em><strong>I</strong></em> was the one with my nose too far over my skis.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://substack.the-experimentalist.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">The Experimentalist is a reader-supported publication. 
To receive new posts and support my work, consider becoming a free or paid subscriber.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p><a href="https://substack.the-experimentalist.com/p/reification-fallacy-and-llm-use">The Experimentalist : Reification Fallacy and LLM Use</a> &#169; 2025 by <a href="https://www.linkedin.com/in/reidmpinchback/">Reid M. Pinchback</a> is licensed under <a href="https://creativecommons.org/licenses/by-sa/4.0/">CC BY-SA 4.0</a></p>]]></content:encoded></item><item><title><![CDATA[Newsletter Status 2025-09-01]]></title><description><![CDATA[Medium deep link fixes, and the start of an AI experiment]]></description><link>https://substack.the-experimentalist.com/p/newsletter-status-2025-09-01</link><guid isPermaLink="false">https://substack.the-experimentalist.com/p/newsletter-status-2025-09-01</guid><dc:creator><![CDATA[Reid M. 
Pinchback]]></dc:creator><pubDate>Tue, 02 Sep 2025 04:05:44 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!LAV4!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa96fe06-78a9-4914-9d8a-67c08998856e_514x400.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!LAV4!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa96fe06-78a9-4914-9d8a-67c08998856e_514x400.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!LAV4!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa96fe06-78a9-4914-9d8a-67c08998856e_514x400.png 424w, https://substackcdn.com/image/fetch/$s_!LAV4!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa96fe06-78a9-4914-9d8a-67c08998856e_514x400.png 848w, https://substackcdn.com/image/fetch/$s_!LAV4!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa96fe06-78a9-4914-9d8a-67c08998856e_514x400.png 1272w, https://substackcdn.com/image/fetch/$s_!LAV4!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa96fe06-78a9-4914-9d8a-67c08998856e_514x400.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!LAV4!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa96fe06-78a9-4914-9d8a-67c08998856e_514x400.png" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/fa96fe06-78a9-4914-9d8a-67c08998856e_514x400.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:null,&quot;width&quot;:null,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:585251,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://substack.the-experimentalist.com/i/172539870?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa96fe06-78a9-4914-9d8a-67c08998856e_514x400.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!LAV4!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa96fe06-78a9-4914-9d8a-67c08998856e_514x400.png 424w, https://substackcdn.com/image/fetch/$s_!LAV4!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa96fe06-78a9-4914-9d8a-67c08998856e_514x400.png 848w, https://substackcdn.com/image/fetch/$s_!LAV4!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa96fe06-78a9-4914-9d8a-67c08998856e_514x400.png 1272w, https://substackcdn.com/image/fetch/$s_!LAV4!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa96fe06-78a9-4914-9d8a-67c08998856e_514x400.png 1456w" sizes="100vw" fetchpriority="high"></picture><div></div></div></a></figure></div><p>The newsletter now has 20 articles under its belt! 
We've hit a proof-of-concept milestone that I wanted to get to, where there is enough free content to suggest to new readers what kind of articles they can encounter. There is one remaining subsection I intend to add in the near future; more news when that's ready.</p><div><hr></div><p><strong>Prev: <a href="https://substack.the-experimentalist.com/p/newsletter-refurb-2025-08-08">Newsletter Refurb 2025-08-08</a></strong></p><div><hr></div><h2>Deep Links on Android</h2><p>As is ever the case with bootstrapping something new, you make some mistakes and learn some things along the way. Earlier today I cleaned up a bunch of inter-article reference links on the Medium version of the publication. Getting Android deep linking to work for Medium is a little fiddly and I had several articles go out that got it wrong. If you experienced being bounced out of the Medium app to your web browser, my apologies. That should be fixed now. My goal is to make that translation happen via automation so forgetting the manual task doesn&#8217;t resurface the issue.</p><h2>Other Learnings</h2><p>So far the production process has been almost entirely manual. The AI support has basically been:</p><ul><li><p>Google Search</p></li><li><p>ChatGPT prompting to do searching</p></li><li><p>ChatGPT drafting of some of the matplotlib code, which I end up adjusting, particularly if I want PNG files created from it.</p></li></ul><p>Everything else is organically grown. I like the writing, but I have found that at best half my time goes into creating intellectual content, and the rest goes into dealing with the machinery of publishing and making it all work visually. As the material shifts from economics and opinion more towards math and code with supporting visualizations, the split drops closer to 25&#8211;35% of my time on the content for the key concept itself. 
For example, the animation for the <a href="https://substack.the-experimentalist.com/p/markov-chains-with-networkx-and-pydtmc">Markov Chains with NetworkX and PyDTMC</a> article involved a 14-layer Draw.io diagram and animated GIF conversion that took about 16 hours before it all worked as I had imagined it, and even then I had to create a static slice for how Substack presents the article on the home page or in a note. Then layer on top of that converting the article to Medium, creating a Substack note, and creating a LinkedIn post announcing the article.</p><p>My objectives are to get more frequent content out, improve the quality of visualizations, and market the newsletter better&#8212;the last of which is getting very little attention. The solution? Automate more of the process.</p><h2>The Experiment</h2><p>For a while I've been a believer that Small Language Models (SLMs) are an under-appreciated resource in the GenAI space. LLMs have their uses, but the industry forces around them have issues that don't necessarily align with better customer outcomes or environmental consequences. I was glad to see <a href="https://arxiv.org/abs/2506.02153">NVIDIA release a paper promoting the role of SLMs for Agentic systems</a>. Orchestrated assemblies of SLMs make more sense to me as a way to balance stochastic and deterministic processes with lower power demands and better CapEx/OpEx efficiency.</p><p>To that end, I've started an MVP within ChatGPT that I hope to migrate later to a collection of special-purpose SLMs. The structure is an orchestration template with multiple defined roles, where some tasks only happen when I trigger them. 
Key roles include:</p><ul><li><p><strong>Mathematician:</strong> retrieves formulae and references (LaTeX ready).</p></li><li><p><strong>Principal Software Engineer:</strong> drafts code based on narrow criteria.</p></li><li><p><strong>Technical Visualization &amp; UX Expert:</strong> suggests visuals and coordinates implementation.</p></li><li><p><strong>Author (me):</strong> sole prose source and final decision maker.</p></li><li><p><strong>Editor:</strong> finds minor text flaws without rewriting.</p></li><li><p><strong>Systems Automator:</strong> observes workflow and produces summaries; responds to trigger commands (e.g., reference formatting).</p></li></ul><p>Several other roles exist, but these give the gist. This approach is already forcing me to think through how to organize responsibilities and interactions among SLMs. For the MVP, the only cost is a $20/month ChatGPT Plus subscription.</p><p>This article itself is the first, albeit very simple, output from using the template.</p><ul><li><p>The Editor didn&#8217;t get too crazy. &#9786;</p></li><li><p>I realized during the process that I need to give the template a lot more information about constructing visualizations, so that isn&#8217;t a factor yet.</p></li><li><p>Inter-article linking support isn&#8217;t in place yet, so that was manual, but the template handled the reference retrieval and formatting (below) just fine.</p></li><li><p>It doesn&#8217;t yet know anything about porting the article to another platform, but we&#8217;ll get there.</p></li></ul><h2>References</h2><ul><li><p>Belcak, P., Heinrich, G., Diao, S., Fu, Y., Dong, X., Muralidharan, S., Lin, Y. C., &amp; Molchanov, P. (2025). <em><a href="https://arxiv.org/abs/2506.02153">Small language models are the future of agentic AI</a></em><a href="https://arxiv.org/abs/2506.02153">. arXiv preprint arXiv:2506.02153</a>. 
https://doi.org/10.48550/arXiv.2506.02153</p></li></ul><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://substack.the-experimentalist.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">The Experimentalist is a reader-supported publication. To receive new posts and support my work, consider becoming a free or paid subscriber.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p><a href="https://substack.the-experimentalist.com/p/newsletter-status-2025-09-01">The Experimentalist : Newsletter Status 2025-09-01</a> &#169; 2025 by <a href="https://www.linkedin.com/in/reidmpinchback/">Reid M. Pinchback</a> is licensed under <a href="https://creativecommons.org/licenses/by-sa/4.0/">CC BY-SA 4.0</a></p>]]></content:encoded></item><item><title><![CDATA[Markov Chain Convergence in Python]]></title><description><![CDATA[When does a Markov Chain &#8216;forget&#8217; its starting point?]]></description><link>https://substack.the-experimentalist.com/p/markov-chain-convergence-in-python</link><guid isPermaLink="false">https://substack.the-experimentalist.com/p/markov-chain-convergence-in-python</guid><dc:creator><![CDATA[Reid M. 
Pinchback]]></dc:creator><pubDate>Thu, 28 Aug 2025 17:20:58 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!9IAn!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7238830-b58e-4dce-b1b2-198eff500d95_514x400.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!9IAn!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7238830-b58e-4dce-b1b2-198eff500d95_514x400.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!9IAn!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7238830-b58e-4dce-b1b2-198eff500d95_514x400.jpeg 424w, https://substackcdn.com/image/fetch/$s_!9IAn!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7238830-b58e-4dce-b1b2-198eff500d95_514x400.jpeg 848w, https://substackcdn.com/image/fetch/$s_!9IAn!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7238830-b58e-4dce-b1b2-198eff500d95_514x400.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!9IAn!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7238830-b58e-4dce-b1b2-198eff500d95_514x400.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!9IAn!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7238830-b58e-4dce-b1b2-198eff500d95_514x400.jpeg" width="514" height="400" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f7238830-b58e-4dce-b1b2-198eff500d95_514x400.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:400,&quot;width&quot;:514,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:480639,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://substack.the-experimentalist.com/i/172060629?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7238830-b58e-4dce-b1b2-198eff500d95_514x400.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!9IAn!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7238830-b58e-4dce-b1b2-198eff500d95_514x400.jpeg 424w, https://substackcdn.com/image/fetch/$s_!9IAn!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7238830-b58e-4dce-b1b2-198eff500d95_514x400.jpeg 848w, https://substackcdn.com/image/fetch/$s_!9IAn!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7238830-b58e-4dce-b1b2-198eff500d95_514x400.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!9IAn!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7238830-b58e-4dce-b1b2-198eff500d95_514x400.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" 
width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>There are two direct benefits to training a Markov Chain on data. You can:</p><ol><li><p>Analyze the structure of the model to understand the qualities of the data.</p></li><li><p>Generate new data that is very similar to the training data.</p></li></ol><p>The focus of this article series is more the second point, but the first can help with the second. 
Trying to generate data raises some questions.</p><ul><li><p>How do you know if the data generated actually is a reasonable facsimile of the training data?</p></li><li><p>Are there any inputs to the process, other than the training data, that could influence whether that generated data is a reasonable facsimile?</p></li><li><p>Much like with sampling in the field of statistics, how do you know when you have enough data from a Markov Chain?</p></li></ul><div><hr></div><p><strong>Prev: <a href="https://substack.the-experimentalist.com/p/markov-chains-with-networkx-and-pydtmc">Markov Chains with NetworkX and PyDTMC</a></strong></p><div><hr></div><p>Generating data from a Markov Chain requires an initial state to begin the process, and a random number each time we take a step. To cover these questions we will:</p><ol><li><p>Create runs of output data. These are sequences of states.</p></li><li><p>Analyze the probability distribution of the states in those runs.</p></li><li><p>Make analysis repeatable by controlling random numbers via seeds.</p></li><li><p>Examine sensitivity of the runs to the choice of seed used.</p></li><li><p>Use all the possible initial states as initial states.</p></li><li><p>Assess how long it takes the model to &#8220;forget&#8221; the state it started with when generating output. We want it to forget because the choice of initial state is somewhat &#8212; later we&#8217;ll get into why I said &#8220;somewhat&#8221; &#8212; arbitrary, and we don&#8217;t want to wire a persistent bias into the generated data.</p></li></ol><p>Some of this can be examined empirically by doing data-generation experiments and analyzing the results. Some can be estimated analytically via known techniques for Markov Chains.</p><p>We&#8217;ll be doing both. 
The empirical results help explain the purpose of the analytical ones.</p><h2>Taxi Trips as a DTFMC, Revisited</h2><p>Once again, here is our adjacency matrix that we use to build a Discrete-Time Finite Markov Chain.</p><pre><code># imports assumed from the aliases used below (pd, mc)
import pandas as pd
import pydtmc as mc

adj_rows = dict(
    airport= [  0, 0.2, 0.7, 0.1],
    hospital=[0.4,   0, 0.6,   0],
    hotel=   [0.4, 0.2,   0, 0.4],
    mall=    [0.2, 0.1, 0.7,   0],
)
adj_matrix = pd.DataFrame.from_dict(adj_rows,
                                    orient='index',
                                    dtype=float,
                                    columns=list(adj_rows.keys()))
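# Added sanity check (a sketch, not part of the original notebook): each row
# holds the outgoing probabilities of one state, so every row should sum to 1
# once floating-point noise is rounded away.
row_sums = adj_matrix.sum(axis=1)
assert row_sums.round(9).eq(1.0).all()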
model = mc.MarkovChain(adj_matrix)</code></pre><p>From this model, we can use the chain&#8217;s analysis tools to get the stationary distribution: the theoretical long-run distribution of states across the sequences the chain generates. We&#8217;ll wrap it in a DataFrame indexed the same way as the adjacency matrix:</p><pre><code>data = {
  # In PyDTMC 'stationary_distributions' is an alias of 'pi';
  # You'll see pi symbols in mathematical treatments of Markov Chains
  'probability': model.pi[0]
}
distribution: pd.DataFrame = pd.DataFrame(data=data,
                                          index=adj_matrix.index.copy())
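# Added illustration (a sketch, not part of the original notebook): a
# stationary distribution pi is left unchanged by one more step of the
# chain, i.e. pi @ P equals pi, where P is the matrix the model was built
# from. Assumes numpy is installed and that model.pi[0] is a 1-D array.
import numpy as np
pi_check = model.pi[0]
assert np.allclose(pi_check @ adj_matrix.to_numpy(), pi_check)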
distribution.index.name = 'state'</code></pre><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!l83L!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6e7f18d7-4bcc-408f-a52a-8cb2d7b42aa3_315x211.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!l83L!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6e7f18d7-4bcc-408f-a52a-8cb2d7b42aa3_315x211.png 424w, https://substackcdn.com/image/fetch/$s_!l83L!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6e7f18d7-4bcc-408f-a52a-8cb2d7b42aa3_315x211.png 848w, https://substackcdn.com/image/fetch/$s_!l83L!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6e7f18d7-4bcc-408f-a52a-8cb2d7b42aa3_315x211.png 1272w, https://substackcdn.com/image/fetch/$s_!l83L!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6e7f18d7-4bcc-408f-a52a-8cb2d7b42aa3_315x211.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!l83L!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6e7f18d7-4bcc-408f-a52a-8cb2d7b42aa3_315x211.png" width="315" height="211" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6e7f18d7-4bcc-408f-a52a-8cb2d7b42aa3_315x211.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:211,&quot;width&quot;:315,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:9982,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://substack.the-experimentalist.com/i/172060629?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6e7f18d7-4bcc-408f-a52a-8cb2d7b42aa3_315x211.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!l83L!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6e7f18d7-4bcc-408f-a52a-8cb2d7b42aa3_315x211.png 424w, https://substackcdn.com/image/fetch/$s_!l83L!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6e7f18d7-4bcc-408f-a52a-8cb2d7b42aa3_315x211.png 848w, https://substackcdn.com/image/fetch/$s_!l83L!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6e7f18d7-4bcc-408f-a52a-8cb2d7b42aa3_315x211.png 1272w, https://substackcdn.com/image/fetch/$s_!l83L!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6e7f18d7-4bcc-408f-a52a-8cb2d7b42aa3_315x211.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>We can visualize this as a histogram:</p><pre><code>distribution.plot(kind='bar', color='mediumorchid')</code></pre><div class="captioned-image-container"><figure><a 
class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!rBD6!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe2b26c52-5207-41da-af14-49a946caef0b_593x483.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!rBD6!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe2b26c52-5207-41da-af14-49a946caef0b_593x483.png 424w, https://substackcdn.com/image/fetch/$s_!rBD6!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe2b26c52-5207-41da-af14-49a946caef0b_593x483.png 848w, https://substackcdn.com/image/fetch/$s_!rBD6!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe2b26c52-5207-41da-af14-49a946caef0b_593x483.png 1272w, https://substackcdn.com/image/fetch/$s_!rBD6!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe2b26c52-5207-41da-af14-49a946caef0b_593x483.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!rBD6!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe2b26c52-5207-41da-af14-49a946caef0b_593x483.png" width="593" height="483" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e2b26c52-5207-41da-af14-49a946caef0b_593x483.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:483,&quot;width&quot;:593,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:15985,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://substack.the-experimentalist.com/i/172060629?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe2b26c52-5207-41da-af14-49a946caef0b_593x483.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!rBD6!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe2b26c52-5207-41da-af14-49a946caef0b_593x483.png 424w, https://substackcdn.com/image/fetch/$s_!rBD6!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe2b26c52-5207-41da-af14-49a946caef0b_593x483.png 848w, https://substackcdn.com/image/fetch/$s_!rBD6!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe2b26c52-5207-41da-af14-49a946caef0b_593x483.png 1272w, https://substackcdn.com/image/fetch/$s_!rBD6!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe2b26c52-5207-41da-af14-49a946caef0b_593x483.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Going back to our earlier questions:</p><ol><li><p><em>&#8220;How do you know if the data generated actually is a reasonable facsimile of the training data?&#8221;</em> The answer revolves around deciding if our generated data has a distribution similar enough to the stationary distribution. 
The generated data should have a histogram roughly like what we see above.</p></li><li><p><em>&#8220;Are there any inputs to the process, other than the training data, that could influence whether that generated data is a reasonable facsimile?&#8221;</em> Here the answer comes from determining whether the initial state, the seed, or the number of steps taken causes the generated data to diverge from the stationary distribution.</p></li><li><p><em>&#8220;How do you know when you have enough data from a Markov Chain?&#8221;</em> It is possible that some of the concerns of the second question above go away when enough data is generated. We will examine how much is enough for us to see approximately the correct histogram.</p></li></ol><h2>Lining Up The Pieces</h2><p><strong>Piece 1:</strong> <em>&#8220;Create runs of output data. These are sequences of states.&#8221;</em></p><p>PyDTMC gives us a starting point for this. It takes a little tidying to map between state names and state numbers (row offsets in the adjacency matrix):</p><pre><code>from_state_mapping =\
    {state: index_pos for index_pos, state in enumerate(adj_matrix.index)}
from_index_pos_mapping =\
    {index_pos: state for index_pos, state in enumerate(adj_matrix.index)}
simulated = model.simulate(steps=5,
                           initial_state=from_state_mapping['hotel'],
                           seed=42,
                           output_indices=True)
states = [from_index_pos_mapping[index_pos] for index_pos in simulated]</code></pre><p>That gets us:</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!w-pN!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdde7cd95-a351-4a6f-ac12-b9df87e4189c_554x30.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!w-pN!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdde7cd95-a351-4a6f-ac12-b9df87e4189c_554x30.png 424w, https://substackcdn.com/image/fetch/$s_!w-pN!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdde7cd95-a351-4a6f-ac12-b9df87e4189c_554x30.png 848w, https://substackcdn.com/image/fetch/$s_!w-pN!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdde7cd95-a351-4a6f-ac12-b9df87e4189c_554x30.png 1272w, https://substackcdn.com/image/fetch/$s_!w-pN!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdde7cd95-a351-4a6f-ac12-b9df87e4189c_554x30.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!w-pN!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdde7cd95-a351-4a6f-ac12-b9df87e4189c_554x30.png" width="554" height="30" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/dde7cd95-a351-4a6f-ac12-b9df87e4189c_554x30.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:30,&quot;width&quot;:554,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:2339,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://substack.the-experimentalist.com/i/172060629?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdde7cd95-a351-4a6f-ac12-b9df87e4189c_554x30.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!w-pN!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdde7cd95-a351-4a6f-ac12-b9df87e4189c_554x30.png 424w, https://substackcdn.com/image/fetch/$s_!w-pN!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdde7cd95-a351-4a6f-ac12-b9df87e4189c_554x30.png 848w, https://substackcdn.com/image/fetch/$s_!w-pN!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdde7cd95-a351-4a6f-ac12-b9df87e4189c_554x30.png 1272w, https://substackcdn.com/image/fetch/$s_!w-pN!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdde7cd95-a351-4a6f-ac12-b9df87e4189c_554x30.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>The result is our initial state, <code>'hotel'</code>, plus the 5 randomly-generated next states that we asked for.</p><p>We will build on that and create a Pandas DataFrame suited to how 
we will analyze generated runs.</p><p><strong>Piece 2:</strong> <em>&#8220;Analyze the probability distribution of the states in those runs.&#8221;</em></p><p>Ok, this one takes a little more coding work. We need to:</p><ul><li><p>Generate the simulated run.</p></li><li><p>Treat each value in that run as a DataFrame row corresponding to a step.</p></li><li><p>Convert the state name at each step into a separate column, so <code>'airport'</code> is counted in a different column than <code>'mall'</code>; a marker value of 1 means present, 0 means absent.</p></li><li><p>Keep running counts at each step, so that at step 3 we have the per-state counts from the initial state + step 1 + step 2 + step 3.</p></li><li><p>Use those counts to generate the probability distribution implied at that step. This means summing the counts to get the total number of states seen so far, then dividing each count by that total.</p></li></ul><p>Here we go!</p><pre><code># 'states' are from the simulated run above
running_df = pd.DataFrame({'state': states})
running_df.index.name = 'step'
# count how many of each state we used as of each step in the process
state_count_df = pd.get_dummies(data=running_df, 
                                prefix='', 
                                prefix_sep='').cumsum()
# add any missing states because we may not have visited them all yet,
# and get_dummies only creates columns for states that appear in the run
model_states = adj_matrix.index.tolist()
missing_columns = set(model_states) - set(state_count_df.columns)
if missing_columns:
    for state in missing_columns:
        state_count_df[state] = 0
    # debugging is easier if we keep the state column order canonical
    state_count_df = state_count_df[model_states]
# add the count aggregates to our running log
running_df = pd.concat([running_df, state_count_df], axis=1)
# turn the current row into its implied distribution
distribution_df = state_count_df.div(state_count_df.sum(axis=1), axis=0)
# naming the column here matters: a bare to_frame() would label it with
# the step number rather than 'probability'
distributions = [row.to_frame(name='probability')
                 for _, row in distribution_df.iterrows()]
for distribution in distributions:
    # make it look like the structure of the stationary distribution
    distribution.index.name = 'state'
running_df['distribution'] = distributions</code></pre><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!GkPE!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F176c98b0-455e-4d05-bfd7-c0579432f540_1023x275.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!GkPE!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F176c98b0-455e-4d05-bfd7-c0579432f540_1023x275.png 424w, https://substackcdn.com/image/fetch/$s_!GkPE!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F176c98b0-455e-4d05-bfd7-c0579432f540_1023x275.png 848w, https://substackcdn.com/image/fetch/$s_!GkPE!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F176c98b0-455e-4d05-bfd7-c0579432f540_1023x275.png 1272w, https://substackcdn.com/image/fetch/$s_!GkPE!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F176c98b0-455e-4d05-bfd7-c0579432f540_1023x275.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!GkPE!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F176c98b0-455e-4d05-bfd7-c0579432f540_1023x275.png" width="1023" height="275" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/176c98b0-455e-4d05-bfd7-c0579432f540_1023x275.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:275,&quot;width&quot;:1023,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:27424,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://substack.the-experimentalist.com/i/172060629?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F176c98b0-455e-4d05-bfd7-c0579432f540_1023x275.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!GkPE!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F176c98b0-455e-4d05-bfd7-c0579432f540_1023x275.png 424w, https://substackcdn.com/image/fetch/$s_!GkPE!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F176c98b0-455e-4d05-bfd7-c0579432f540_1023x275.png 848w, https://substackcdn.com/image/fetch/$s_!GkPE!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F176c98b0-455e-4d05-bfd7-c0579432f540_1023x275.png 1272w, https://substackcdn.com/image/fetch/$s_!GkPE!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F176c98b0-455e-4d05-bfd7-c0579432f540_1023x275.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The outcome is a running total of the appearances of each state. The <code>distribution</code> column looks a little messy but that is because each value is itself an entire DataFrame object, which will come in handy later. </p><p><strong>Piece 3:</strong> <em>&#8220;Make analysis repeatable by controlling random numbers via seeds.&#8221;</em></p><p>We&#8217;ll be doing a decent number of runs where we want to extract the signal of what we are attempting from the surrounding noise of randomness. Because of that I decided to use random numbers generated by the Python <code>secrets</code> module so that I would be tapping whatever entropy backed <code>/dev/urandom</code>. The seeds are then saved to a Parquet file for re-use.</p><pre><code>class SeedHandler:
    
    def __init__(self, *, num_seeds, bits_per_seed, seed_location):
        self.num_seeds = num_seeds
        self.bits_per_seed = bits_per_seed
        self.seed_location = seed_location

    @property
    def seeds(self):
        if not os.path.exists(self.seed_location):
            seed_values = self._make_seeds()
            self._write_seeds(seed_values=seed_values)
        else:
            seed_values = self._read_seeds()
        return seed_values

    def _make_seeds(self):
        values = []
        for i in range(self.num_seeds):
            value = secrets.randbits(self.bits_per_seed)
            values.append(value)
        return tuple(values)

    def _read_seeds(self):
        seed_df = pd.read_parquet(path=self.seed_location,
                                  engine='pyarrow')
        seed_ds = seed_df.astype(dtype=dict(seed=self.numpy_dtype))['seed']
        return tuple(seed_ds)

    def _write_seeds(self, *, seed_values):
        seed_ds = pd.Series(data=seed_values, name='seed')
        seed_df = seed_ds.to_frame()
        seed_df.to_parquet(path=self.seed_location, engine='pyarrow')</code></pre><p>There&#8217;s <a href="https://gitlab.com/The-Experimentalist/article_support_2025/-/blob/main/article/2025-08-28-markov-chain-convergence-in-python/convergence.ipynb">more code in the Jupyter notebook</a>, but the above shows the essence.</p><p><strong>Piece 4:</strong> <em>&#8220;Examine sensitivity of the runs to the choice of seed used.&#8221;</em></p><p>The resulting seeds will be supplied to every call of <code>simulate()</code>, which will be handled within the function that generates the states and builds <code>running_df</code>. We just need to augment that DataFrame with a <code>'seed'</code> column.</p><pre><code><code>for seed in seeds:
    running_df =\
        running_distributions(steps=steps,
                              initial_state=initial_state,
                              seed=seed)
    # get a 'step' column from the existing index
    running_df.reset_index(drop=False, inplace=True)
    # and add the seed
    running_df['seed'] = seed</code></code></pre><p>That gets us two out of three pieces of what will become our multi-index.</p><p><strong>Piece 5:</strong> <em>&#8220;Use all the possible initial states as initial states.&#8221;</em></p><p>And now we get our third piece of the multi-index. We need another loop around the code above so we can supply different choices of <code>initial_state</code>, add a column to track it, build an index from the three parts, and <code>pd.concat()</code> the <code>running_df</code> from each (<code>initial_state</code>, <code>seed</code>) combination.</p><pre><code>index_columns = ['initial_state', 'step', 'seed']
across_states_and_seeds_df = None
for initial_state in model_states:
    for seed in seeds:
        running_df =\
            running_distributions(steps=steps,
                                  initial_state=initial_state,
                                  seed=seed)
        running_df.reset_index(drop=False, inplace=True)
        running_df['initial_state'] = initial_state
        running_df['seed'] = seed
        # create the multi-index
        running_df.set_index(index_columns, drop=True, inplace=True)
        if across_states_and_seeds_df is None:
            across_states_and_seeds_df = running_df
        else:
            across_states_and_seeds_df =\
                    pd.concat([across_states_and_seeds_df, running_df],
                              axis=0)</code></pre><p><strong>Piece 6:</strong> <em>&#8220;Assess how long it takes the model to forget the state it started with.&#8221;</em></p><p>This will be examined in two different ways.</p><ol><li><p>Analytically.</p></li><li><p>From the data.</p></li></ol><p>The analytical part is supplied by a PyDTMC model property:</p><pre><code>model.mixing_rate</code></pre><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!4nOC!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F781fc41f-3ad3-4750-bfc0-1dbe47f8eeca_179x30.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!4nOC!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F781fc41f-3ad3-4750-bfc0-1dbe47f8eeca_179x30.png 424w, https://substackcdn.com/image/fetch/$s_!4nOC!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F781fc41f-3ad3-4750-bfc0-1dbe47f8eeca_179x30.png 848w, https://substackcdn.com/image/fetch/$s_!4nOC!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F781fc41f-3ad3-4750-bfc0-1dbe47f8eeca_179x30.png 1272w, https://substackcdn.com/image/fetch/$s_!4nOC!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F781fc41f-3ad3-4750-bfc0-1dbe47f8eeca_179x30.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!4nOC!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F781fc41f-3ad3-4750-bfc0-1dbe47f8eeca_179x30.png" width="179" height="30" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/781fc41f-3ad3-4750-bfc0-1dbe47f8eeca_179x30.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:30,&quot;width&quot;:179,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:2681,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://substack.the-experimentalist.com/i/172060629?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F781fc41f-3ad3-4750-bfc0-1dbe47f8eeca_179x30.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!4nOC!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F781fc41f-3ad3-4750-bfc0-1dbe47f8eeca_179x30.png 424w, https://substackcdn.com/image/fetch/$s_!4nOC!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F781fc41f-3ad3-4750-bfc0-1dbe47f8eeca_179x30.png 848w, https://substackcdn.com/image/fetch/$s_!4nOC!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F781fc41f-3ad3-4750-bfc0-1dbe47f8eeca_179x30.png 1272w, https://substackcdn.com/image/fetch/$s_!4nOC!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F781fc41f-3ad3-4750-bfc0-1dbe47f8eeca_179x30.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>I believe this property is misnamed. The implementation appears to be what is usually called the &#8220;relaxation rate&#8221; or &#8220;convergence rate&#8221;. 
This is the rate at which deviations from the stationary distribution shrink as more steps are added to the chain generated. It&#8217;s derived from the same eigenvalue information as the spectral gap.</p><p>To get an estimate of how many steps that implies &#8212; the mixing time &#8212; before a desired amount of convergence is achieved, we would have to adjust for how close we want the generated distribution to be to the stationary one.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!F49k!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6c5d0581-6162-42c8-a6f5-a4dc285717e9_169x66.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!F49k!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6c5d0581-6162-42c8-a6f5-a4dc285717e9_169x66.png 424w, https://substackcdn.com/image/fetch/$s_!F49k!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6c5d0581-6162-42c8-a6f5-a4dc285717e9_169x66.png 848w, https://substackcdn.com/image/fetch/$s_!F49k!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6c5d0581-6162-42c8-a6f5-a4dc285717e9_169x66.png 1272w, https://substackcdn.com/image/fetch/$s_!F49k!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6c5d0581-6162-42c8-a6f5-a4dc285717e9_169x66.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!F49k!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6c5d0581-6162-42c8-a6f5-a4dc285717e9_169x66.png" width="169" height="66" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6c5d0581-6162-42c8-a6f5-a4dc285717e9_169x66.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:66,&quot;width&quot;:169,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:2457,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://substack.the-experimentalist.com/i/172060629?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6c5d0581-6162-42c8-a6f5-a4dc285717e9_169x66.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!F49k!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6c5d0581-6162-42c8-a6f5-a4dc285717e9_169x66.png 424w, https://substackcdn.com/image/fetch/$s_!F49k!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6c5d0581-6162-42c8-a6f5-a4dc285717e9_169x66.png 848w, https://substackcdn.com/image/fetch/$s_!F49k!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6c5d0581-6162-42c8-a6f5-a4dc285717e9_169x66.png 1272w, https://substackcdn.com/image/fetch/$s_!F49k!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6c5d0581-6162-42c8-a6f5-a4dc285717e9_169x66.png 1456w" sizes="100vw" 
loading="lazy"></picture><div></div></div></a></figure></div><p>This is a bit opaque since we aren&#8217;t really indicating what we are measuring via that rate, but as a first approximation we&#8217;ll use <code>epsilon=0.001</code>:</p><pre><code>epsilon = 0.001
taxi_structure.model.mixing_rate * math.log(1 / epsilon)</code></pre><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!-nWq!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2756ba70-0720-4f95-9575-7d127b3ef745_166x35.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!-nWq!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2756ba70-0720-4f95-9575-7d127b3ef745_166x35.png 424w, https://substackcdn.com/image/fetch/$s_!-nWq!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2756ba70-0720-4f95-9575-7d127b3ef745_166x35.png 848w, https://substackcdn.com/image/fetch/$s_!-nWq!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2756ba70-0720-4f95-9575-7d127b3ef745_166x35.png 1272w, https://substackcdn.com/image/fetch/$s_!-nWq!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2756ba70-0720-4f95-9575-7d127b3ef745_166x35.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!-nWq!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2756ba70-0720-4f95-9575-7d127b3ef745_166x35.png" width="166" height="35" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2756ba70-0720-4f95-9575-7d127b3ef745_166x35.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:35,&quot;width&quot;:166,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1691,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://substack.the-experimentalist.com/i/172060629?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2756ba70-0720-4f95-9575-7d127b3ef745_166x35.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!-nWq!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2756ba70-0720-4f95-9575-7d127b3ef745_166x35.png 424w, https://substackcdn.com/image/fetch/$s_!-nWq!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2756ba70-0720-4f95-9575-7d127b3ef745_166x35.png 848w, https://substackcdn.com/image/fetch/$s_!-nWq!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2756ba70-0720-4f95-9575-7d127b3ef745_166x35.png 1272w, https://substackcdn.com/image/fetch/$s_!-nWq!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2756ba70-0720-4f95-9575-7d127b3ef745_166x35.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>That&#8217;s our analytical reference point: we want to see how the chain distributions look at the 19th step after the initial state.</p><p>The reason I picked an epsilon that 
small is because we haven&#8217;t discussed what kind of distance should matter. I did some digging via the PyDTMC source code, Wikipedia, and several Monte Carlo texts to get an answer.</p><ul><li><p>If I were to use the PyDTMC <strong>mixing_time()</strong> method, I believe it uses the square root of the dot product between the stationary distribution and the distribution of the generated chain.</p></li><li><p>The more usual mathematical definition is based on the greatest absolute difference in probability that the two distributions assign to any event, called the &#8220;Total Variation Distance&#8221; (TVD). It&#8217;s definitely the one I find discussed in material on convergence.</p></li></ul><p>The second is closer to what we want, but not quite. We care about:</p><ul><li><p>Average closeness: what PyDTMC provides.</p></li><li><p>Worst-case closeness: what TVD provides.</p></li><li><p>Similarity of shape between the two distributions: which neither provides.</p></li></ul><p>We want all three, and a small value for the third gets you the first two.</p><p>As a reminder of what we&#8217;re trying to achieve in this piece, we want to know when the choice of initial state no longer meaningfully biases the distribution of the chain. We compare the distribution of the generated chain to the stationary distribution; once they are similar enough, we declare that no bias from the initial state remains that we would care about.</p><p>For this reason I added a different measure of closeness to the code, based upon the Jensen-Shannon Distance (JSD). It is histogram-friendly by being symmetric (comparing A to B is the same as comparing B to A), and robust to empty bins:</p><pre><code>eps: float = 1e-9
adjusted_sample = (sample + eps) / (sample + eps).sum()
distance = sp_sd.jensenshannon(p=stationary_distribution,
                               q=adjusted_sample)[0]</code></pre><p>The adjustment shown deals with a limitation in JSD when the two distributions get very close and you&#8217;re faced with the limit of dividing a small value by another value that is approaching zero.</p><p>There is a bit more code involved in getting the distance to behave when faced with numerical error, <a href="https://gitlab.com/The-Experimentalist/article_support_2025/-/blob/main/article/2025-08-28-markov-chain-convergence-in-python/convergence.ipynb">which is in the Jupyter notebook</a>. It&#8217;s a little bit of a hack, but the alternative would be to implement JSD from scratch to deal with a special case where numerical error triggered sqrt(negative number) instead of sqrt(zero). I found in practice that a small number of random seeds tripped over that case when generating large sequences.</p><h2>Assembling the Pieces</h2><p>Now that we have a notion of distance, we need to add it to our computation of the running results.</p><p>We will want to examine that distance at each step to see whether it is mostly declining, and what its value is at the 19th step we derived analytically.</p><pre><code># 'states' are from the simulated run above
running_df = pd.DataFrame({'state': states})
running_df.index.name = 'step'
# count how many of each state we used as of each step in the process
state_count_df = pd.get_dummies(data=running_df, 
                                prefix='', 
                                prefix_sep='').cumsum()
# add any missing states because we may not have used them all yet
# and get_dummies only creates columns for states that have appeared
model_states = adj_matrix.index.tolist()
missing_columns = set(model_states) - set(state_count_df.columns)
if missing_columns:
    for state in missing_columns:
        state_count_df[state] = 0
    # debugging is easier if we keep the state column-order canonical
    state_count_df = state_count_df[model_states]
# add the count aggregates to our running log
running_df = pd.concat([running_df, state_count_df], axis=1)
# turn the current row into its implied distribution
distribution_df = state_count_df.div(state_count_df.sum(axis=1), axis=0)
distributions = [row.to_frame() for _, row in distribution_df.iterrows()]

# ---&gt; changes for distance are below
distances = []
for distribution in distributions:
    # make it look like the structure of the stationary distribution
    distribution.index.name = 'state'
    # to_frame() names the column after the step, so set the name directly
    distribution.columns = ['probability']
    distance = distance_from_stationary(sample=distribution)
    distances.append(distance)
running_df['distance'] = distances
running_df['distribution'] = distributions</code></pre><p>Now we will have that distance and its corresponding distribution available to us for visual comparison to the stationary distribution.</p><p>First, let&#8217;s look at a run with 200 seeds and 10 steps. We&#8217;ll keep the number of steps small this time in order to make it easier to examine details of the image.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!A1M0!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7aa58581-90cb-47d1-8a5f-80f74bdbddea_811x604.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!A1M0!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7aa58581-90cb-47d1-8a5f-80f74bdbddea_811x604.png 424w, https://substackcdn.com/image/fetch/$s_!A1M0!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7aa58581-90cb-47d1-8a5f-80f74bdbddea_811x604.png 848w, https://substackcdn.com/image/fetch/$s_!A1M0!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7aa58581-90cb-47d1-8a5f-80f74bdbddea_811x604.png 1272w, https://substackcdn.com/image/fetch/$s_!A1M0!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7aa58581-90cb-47d1-8a5f-80f74bdbddea_811x604.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!A1M0!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7aa58581-90cb-47d1-8a5f-80f74bdbddea_811x604.png" width="811" height="604" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7aa58581-90cb-47d1-8a5f-80f74bdbddea_811x604.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:604,&quot;width&quot;:811,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:48409,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://substack.the-experimentalist.com/i/172060629?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7aa58581-90cb-47d1-8a5f-80f74bdbddea_811x604.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!A1M0!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7aa58581-90cb-47d1-8a5f-80f74bdbddea_811x604.png 424w, https://substackcdn.com/image/fetch/$s_!A1M0!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7aa58581-90cb-47d1-8a5f-80f74bdbddea_811x604.png 848w, https://substackcdn.com/image/fetch/$s_!A1M0!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7aa58581-90cb-47d1-8a5f-80f74bdbddea_811x604.png 1272w, https://substackcdn.com/image/fetch/$s_!A1M0!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7aa58581-90cb-47d1-8a5f-80f74bdbddea_811x604.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Notice how at the start (step 0), where all we have is the initial state, the two initial states that are closest to the stationary distribution are <code>'hotel'</code> then <code>'airport'</code>. This makes sense, as these are also the two states with the highest probabilities in the stationary distribution; <code>'hotel'</code> has the highest probability and by the distance measure it is the closest. At least JSD as the distance measure isn&#8217;t showing something entirely off-base right from the start.</p><p>As the run progresses, the median outcomes for each initial state converge, and by step 7 they don&#8217;t differ much from each other. 
The rest of the run going forward won&#8217;t likely be about specific initial states, but just the overall challenge of any state to result in data that closely approximates the stationary distribution.</p><p>Unfortunately we may be jumping the gun a bit at 7. Let&#8217;s look at seed sensitivity. We can plot the interquartile ranges on each step after the initial state:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Pyar!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17b32448-f511-45b3-b462-f8367266fff3_830x588.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Pyar!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17b32448-f511-45b3-b462-f8367266fff3_830x588.png 424w, https://substackcdn.com/image/fetch/$s_!Pyar!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17b32448-f511-45b3-b462-f8367266fff3_830x588.png 848w, https://substackcdn.com/image/fetch/$s_!Pyar!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17b32448-f511-45b3-b462-f8367266fff3_830x588.png 1272w, https://substackcdn.com/image/fetch/$s_!Pyar!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17b32448-f511-45b3-b462-f8367266fff3_830x588.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Pyar!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17b32448-f511-45b3-b462-f8367266fff3_830x588.png" width="830" height="588" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/17b32448-f511-45b3-b462-f8367266fff3_830x588.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:588,&quot;width&quot;:830,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:49569,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://substack.the-experimentalist.com/i/172060629?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17b32448-f511-45b3-b462-f8367266fff3_830x588.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Pyar!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17b32448-f511-45b3-b462-f8367266fff3_830x588.png 424w, https://substackcdn.com/image/fetch/$s_!Pyar!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17b32448-f511-45b3-b462-f8367266fff3_830x588.png 848w, https://substackcdn.com/image/fetch/$s_!Pyar!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17b32448-f511-45b3-b462-f8367266fff3_830x588.png 1272w, https://substackcdn.com/image/fetch/$s_!Pyar!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17b32448-f511-45b3-b462-f8367266fff3_830x588.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The per-seed results are still jumping around quite a bit. This isn&#8217;t hugely surprising. 
The number of states generated so far is effectively a sample size, and when the sample size is something like 7 it doesn&#8217;t take much to perturb the range of distances produced.</p><p>We can get an even better look at this with initial states in separate plots.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Hw7V!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F983cefa2-26e2-4330-96ea-036332ad0ee8_1207x794.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Hw7V!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F983cefa2-26e2-4330-96ea-036332ad0ee8_1207x794.png 424w, https://substackcdn.com/image/fetch/$s_!Hw7V!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F983cefa2-26e2-4330-96ea-036332ad0ee8_1207x794.png 848w, https://substackcdn.com/image/fetch/$s_!Hw7V!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F983cefa2-26e2-4330-96ea-036332ad0ee8_1207x794.png 1272w, https://substackcdn.com/image/fetch/$s_!Hw7V!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F983cefa2-26e2-4330-96ea-036332ad0ee8_1207x794.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Hw7V!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F983cefa2-26e2-4330-96ea-036332ad0ee8_1207x794.png" width="1207" height="794" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/983cefa2-26e2-4330-96ea-036332ad0ee8_1207x794.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:794,&quot;width&quot;:1207,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:55224,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://substack.the-experimentalist.com/i/172060629?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F983cefa2-26e2-4330-96ea-036332ad0ee8_1207x794.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Hw7V!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F983cefa2-26e2-4330-96ea-036332ad0ee8_1207x794.png 424w, https://substackcdn.com/image/fetch/$s_!Hw7V!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F983cefa2-26e2-4330-96ea-036332ad0ee8_1207x794.png 848w, https://substackcdn.com/image/fetch/$s_!Hw7V!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F983cefa2-26e2-4330-96ea-036332ad0ee8_1207x794.png 1272w, https://substackcdn.com/image/fetch/$s_!Hw7V!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F983cefa2-26e2-4330-96ea-036332ad0ee8_1207x794.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>So not only are the results variable per seed, the variability varies by <code>initial_state</code>. Clearly 7 steps is too soon to declare we&#8217;ve moved beyond bias.</p><p>We had our analytic estimate of 19 steps. 
Let&#8217;s take the analysis out to 30 steps so we can see 19 in a broader context.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!G_Rm!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a7ce77f-8ec2-4364-93f0-8f8f697fae6d_1192x781.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!G_Rm!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a7ce77f-8ec2-4364-93f0-8f8f697fae6d_1192x781.png 424w, https://substackcdn.com/image/fetch/$s_!G_Rm!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a7ce77f-8ec2-4364-93f0-8f8f697fae6d_1192x781.png 848w, https://substackcdn.com/image/fetch/$s_!G_Rm!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a7ce77f-8ec2-4364-93f0-8f8f697fae6d_1192x781.png 1272w, https://substackcdn.com/image/fetch/$s_!G_Rm!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a7ce77f-8ec2-4364-93f0-8f8f697fae6d_1192x781.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!G_Rm!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a7ce77f-8ec2-4364-93f0-8f8f697fae6d_1192x781.png" width="1192" height="781" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3a7ce77f-8ec2-4364-93f0-8f8f697fae6d_1192x781.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:781,&quot;width&quot;:1192,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:59501,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://substack.the-experimentalist.com/i/172060629?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a7ce77f-8ec2-4364-93f0-8f8f697fae6d_1192x781.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!G_Rm!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a7ce77f-8ec2-4364-93f0-8f8f697fae6d_1192x781.png 424w, https://substackcdn.com/image/fetch/$s_!G_Rm!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a7ce77f-8ec2-4364-93f0-8f8f697fae6d_1192x781.png 848w, https://substackcdn.com/image/fetch/$s_!G_Rm!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a7ce77f-8ec2-4364-93f0-8f8f697fae6d_1192x781.png 1272w, https://substackcdn.com/image/fetch/$s_!G_Rm!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a7ce77f-8ec2-4364-93f0-8f8f697fae6d_1192x781.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The region around 19 steps looks quite stable. The variability is low across all <code>initial_state</code> values. The rate of convergence is slowing down a lot, but that&#8217;s probably not a surprise. 
As a reminder from the previous article, the spectral gap being closer to 0 than 1 suggested convergence could be sluggish:</p><pre><code>model.spectral_gap</code></pre><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!5o5w!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a63d819-3e28-4770-8631-54abf24d89a6_189x37.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!5o5w!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a63d819-3e28-4770-8631-54abf24d89a6_189x37.png 424w, https://substackcdn.com/image/fetch/$s_!5o5w!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a63d819-3e28-4770-8631-54abf24d89a6_189x37.png 848w, https://substackcdn.com/image/fetch/$s_!5o5w!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a63d819-3e28-4770-8631-54abf24d89a6_189x37.png 1272w, https://substackcdn.com/image/fetch/$s_!5o5w!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a63d819-3e28-4770-8631-54abf24d89a6_189x37.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!5o5w!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a63d819-3e28-4770-8631-54abf24d89a6_189x37.png" width="189" height="37" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3a63d819-3e28-4770-8631-54abf24d89a6_189x37.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:37,&quot;width&quot;:189,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1894,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://substack.the-experimentalist.com/i/172060629?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a63d819-3e28-4770-8631-54abf24d89a6_189x37.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!5o5w!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a63d819-3e28-4770-8631-54abf24d89a6_189x37.png 424w, https://substackcdn.com/image/fetch/$s_!5o5w!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a63d819-3e28-4770-8631-54abf24d89a6_189x37.png 848w, https://substackcdn.com/image/fetch/$s_!5o5w!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a63d819-3e28-4770-8631-54abf24d89a6_189x37.png 1272w, https://substackcdn.com/image/fetch/$s_!5o5w!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a63d819-3e28-4770-8631-54abf24d89a6_189x37.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>Time to take a look at the actual histograms at step 19 and see what they tell us.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" 
target="_blank" href="https://substackcdn.com/image/fetch/$s_!ir9n!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f3f3c4d-904a-48df-985b-1e36ad0e1ecc_1213x791.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ir9n!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f3f3c4d-904a-48df-985b-1e36ad0e1ecc_1213x791.png 424w, https://substackcdn.com/image/fetch/$s_!ir9n!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f3f3c4d-904a-48df-985b-1e36ad0e1ecc_1213x791.png 848w, https://substackcdn.com/image/fetch/$s_!ir9n!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f3f3c4d-904a-48df-985b-1e36ad0e1ecc_1213x791.png 1272w, https://substackcdn.com/image/fetch/$s_!ir9n!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f3f3c4d-904a-48df-985b-1e36ad0e1ecc_1213x791.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ir9n!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f3f3c4d-904a-48df-985b-1e36ad0e1ecc_1213x791.png" width="1213" height="791" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9f3f3c4d-904a-48df-985b-1e36ad0e1ecc_1213x791.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:791,&quot;width&quot;:1213,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:56537,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://substack.the-experimentalist.com/i/172060629?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f3f3c4d-904a-48df-985b-1e36ad0e1ecc_1213x791.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!ir9n!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f3f3c4d-904a-48df-985b-1e36ad0e1ecc_1213x791.png 424w, https://substackcdn.com/image/fetch/$s_!ir9n!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f3f3c4d-904a-48df-985b-1e36ad0e1ecc_1213x791.png 848w, https://substackcdn.com/image/fetch/$s_!ir9n!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f3f3c4d-904a-48df-985b-1e36ad0e1ecc_1213x791.png 1272w, https://substackcdn.com/image/fetch/$s_!ir9n!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f3f3c4d-904a-48df-985b-1e36ad0e1ecc_1213x791.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Not too bad. The analytic convergence estimate seems to hold up in practice. Since this batch of data went out to 30 steps, we can compare against the histograms at step 30. 
</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!6wLY!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F669767d6-86da-4305-a089-6dda259ffc13_1211x789.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!6wLY!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F669767d6-86da-4305-a089-6dda259ffc13_1211x789.png 424w, https://substackcdn.com/image/fetch/$s_!6wLY!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F669767d6-86da-4305-a089-6dda259ffc13_1211x789.png 848w, https://substackcdn.com/image/fetch/$s_!6wLY!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F669767d6-86da-4305-a089-6dda259ffc13_1211x789.png 1272w, https://substackcdn.com/image/fetch/$s_!6wLY!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F669767d6-86da-4305-a089-6dda259ffc13_1211x789.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!6wLY!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F669767d6-86da-4305-a089-6dda259ffc13_1211x789.png" width="1211" height="789" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/669767d6-86da-4305-a089-6dda259ffc13_1211x789.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:789,&quot;width&quot;:1211,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:57125,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://substack.the-experimentalist.com/i/172060629?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F669767d6-86da-4305-a089-6dda259ffc13_1211x789.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!6wLY!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F669767d6-86da-4305-a089-6dda259ffc13_1211x789.png 424w, https://substackcdn.com/image/fetch/$s_!6wLY!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F669767d6-86da-4305-a089-6dda259ffc13_1211x789.png 848w, https://substackcdn.com/image/fetch/$s_!6wLY!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F669767d6-86da-4305-a089-6dda259ffc13_1211x789.png 1272w, https://substackcdn.com/image/fetch/$s_!6wLY!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F669767d6-86da-4305-a089-6dda259ffc13_1211x789.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Slightly better, but not game-changing. The important thing is that we aren&#8217;t seeing any meaningful bias based on what the initial state was and, as earlier plots showed, the IQR variability due to seed choice was narrowing as well.</p><h2>Summary</h2><ul><li><p>We wanted to generate data with this model and have a reasonable expectation that the data looked similar to what it was trained on. That is workable at 19 steps and would get progressively better with more steps.</p></li><li><p>There is no real need to worry about picking a better or worse initial state, or about using data generated across multiple seeds: neither initial-state bias nor seed variance is problematic for chains at least 19 steps long.</p></li><li><p>Our use of the Jensen-Shannon Distance appears safe. 
This isn&#8217;t really enough material to critique it versus other distance measures, but that wasn&#8217;t the goal. What we wanted was reasonably similar histogram structure, which is what we have.</p></li><li><p>The estimate of a mixing time (number of steps until viable convergence) lined up with the empirical results, which was actually a nice surprise. Going into the analysis I was skeptical that we would see decent histogram shape that early in the sequence.</p></li></ul><h2>References</h2><ul><li><p><a href="https://gitlab.com/The-Experimentalist/article_support_2025/-/blob/main/article/2025-08-28-markov-chain-convergence-in-python/convergence.ipynb">Jupyter notebook for this article</a>.</p></li><li><p><a href="https://networkx.org/documentation/stable/reference/index.html">NetworkX API reference</a>.</p></li><li><p><a href="https://pandas.pydata.org/docs/reference/">Pandas API reference</a>.</p></li><li><p><a href="https://pypi.org/project/PyDTMC/">PyDTMC on PyPI</a>.</p></li><li><p>H&#228;ggstr&#246;m, Olle, &#8220;Stationary distributions,&#8221; in <em><a href="https://www.cambridge.org/core/books/finite-markov-chains-and-algorithmic-applications/EE10AF27811B43B02E05905DC6413467">Finite Markov Chains and Algorithmic Applications</a></em><a href="https://www.cambridge.org/core/books/finite-markov-chains-and-algorithmic-applications/EE10AF27811B43B02E05905DC6413467">. London Mathematical Society Student Texts, vol. 52</a>, Cambridge University Press, 2002, pp. 28-38.</p></li><li><p>Br&#233;maud, Pierre, &#8220;Convergence Rates,&#8221; in <em><a href="https://link.springer.com/book/10.1007/978-3-030-45982-6">Markov Chains: Gibbs Fields, Monte Carlo Simulation and Queues</a></em><a href="https://link.springer.com/book/10.1007/978-3-030-45982-6">, Texts in Applied Mathematics, vol. 31</a>, Springer Cham, 2020, pp. 
289&#8211;329.</p></li><li><p><a href="https://en.wikipedia.org/w/index.php?title=Total_variation_distance_of_probability_measures&amp;oldid=123456789">&#8220;Total variation distance of probability measures,&#8221; in </a><em><a href="https://en.wikipedia.org/w/index.php?title=Total_variation_distance_of_probability_measures&amp;oldid=123456789">Wikipedia</a></em>, accessed 2025-08-28, permanent link (revision) &lt;https://en.wikipedia.org/w/index.php?title=Total_variation_distance_of_probability_measures&amp;oldid=123456789&gt;.</p></li><li><p><a href="https://en.wikipedia.org/w/index.php?title=Markov_chain_mixing_time&amp;oldid=1233574639">&#8220;Markov chain mixing time,&#8221; in </a><em><a href="https://en.wikipedia.org/w/index.php?title=Markov_chain_mixing_time&amp;oldid=1233574639">Wikipedia</a></em>, accessed 2025-08-28, permanent link (revision) &lt;https://en.wikipedia.org/w/index.php?title=Markov_chain_mixing_time&amp;oldid=1233574639&gt;.</p></li></ul>
<p><a href="https://substack.the-experimentalist.com/p/markov-chain-convergence-in-python">The Experimentalist : Markov Chain Convergence in Python</a> &#169; 2025 by <a href="https://www.linkedin.com/in/reidmpinchback/">Reid M. Pinchback</a> is licensed under <a href="https://creativecommons.org/licenses/by-sa/4.0/">CC BY-SA 4.0</a></p>]]></content:encoded></item><item><title><![CDATA[Markov Chains with NetworkX and PyDTMC]]></title><description><![CDATA[An easy introduction to Markov Chains in Python]]></description><link>https://substack.the-experimentalist.com/p/markov-chains-with-networkx-and-pydtmc</link><guid isPermaLink="false">https://substack.the-experimentalist.com/p/markov-chains-with-networkx-and-pydtmc</guid><dc:creator><![CDATA[Reid M. 
Pinchback]]></dc:creator><pubDate>Mon, 25 Aug 2025 09:48:43 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/f7cf2652-25d9-4e05-abc1-8d67830bcdfc_540x459.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!fGRR!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82ac7d0a-c487-4311-9377-aa8cb98f2382_470x415.gif" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!fGRR!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82ac7d0a-c487-4311-9377-aa8cb98f2382_470x415.gif 424w, https://substackcdn.com/image/fetch/$s_!fGRR!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82ac7d0a-c487-4311-9377-aa8cb98f2382_470x415.gif 848w, https://substackcdn.com/image/fetch/$s_!fGRR!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82ac7d0a-c487-4311-9377-aa8cb98f2382_470x415.gif 1272w, https://substackcdn.com/image/fetch/$s_!fGRR!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82ac7d0a-c487-4311-9377-aa8cb98f2382_470x415.gif 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!fGRR!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82ac7d0a-c487-4311-9377-aa8cb98f2382_470x415.gif" width="470" height="415" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/82ac7d0a-c487-4311-9377-aa8cb98f2382_470x415.gif&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:415,&quot;width&quot;:470,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:2205815,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/gif&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://substack.the-experimentalist.com/i/171681483?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82ac7d0a-c487-4311-9377-aa8cb98f2382_470x415.gif&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!fGRR!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82ac7d0a-c487-4311-9377-aa8cb98f2382_470x415.gif 424w, https://substackcdn.com/image/fetch/$s_!fGRR!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82ac7d0a-c487-4311-9377-aa8cb98f2382_470x415.gif 848w, https://substackcdn.com/image/fetch/$s_!fGRR!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82ac7d0a-c487-4311-9377-aa8cb98f2382_470x415.gif 1272w, https://substackcdn.com/image/fetch/$s_!fGRR!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82ac7d0a-c487-4311-9377-aa8cb98f2382_470x415.gif 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The brief segue into <a href="https://substack.the-experimentalist.com/p/finite-graphs-and-networkx">finite graphs in the previous article</a> has paved the way for developing a little intuition on Markov Chains and working with them in Python. 
We will want this material for future data engineering explorations.</p><p>The concept of a Markov Chain is easy enough:</p><ul><li><p>You have a collection of states.</p></li><li><p>There are probabilities of transitioning from the current state to various future states.</p></li><li><p>There is no memory behind those probabilities; if you know the current state then you know the probabilities of transitioning next to other states.</p></li></ul><div><hr></div><p><strong>Prev: <a href="https://substack.the-experimentalist.com/p/finite-graphs-and-networkx">Finite Graphs and NetworkX</a> | Next: <a href="https://substack.the-experimentalist.com/p/markov-chain-convergence-in-python">Markov Chain Convergence in Python</a></strong></p><div><hr></div><p>We&#8217;re going to simplify the variety of Markov Chains we&#8217;ll be working with.</p><ul><li><p>We will only discuss chains with a finite number of states.</p></li><li><p>Transitions from one state to the next happen in discrete time steps, not over the flow of continuous time. This lets us organize discussions in terms of time steps t0, t1, &#8230;</p></li></ul><p>With those clarifications, what we&#8217;ll have are Discrete-Time Finite Markov Chains.</p><p>The use of the word &#8216;chain&#8217; is intended to communicate that we can keep repeating this process of random state transitions over and over again. 
That assumes every state has at least one outgoing transition. In this article a state with none is called an <strong>absorbing state</strong>, and reaching one terminates a randomly-generated walk (many texts instead model an absorbing state with a single self-loop of probability 1).</p><h2>The Graph Connection</h2><p>Recalling <a href="https://substack.the-experimentalist.com/p/finite-graphs-and-networkx">our previous discussion on graphs</a>, a Directed Finite Graph has:</p><ul><li><p><strong>Nodes</strong>.</p></li><li><p><strong>Edges</strong> between pairs of nodes, each directed from a <strong>source</strong> to a <strong>target</strong>.</p></li><li><p>When an edge leaves a source node, that is an <strong>outedge</strong>.</p></li><li><p>When an edge arrives at a target node (where the arrowhead points in a picture), that is an <strong>inedge</strong>.</p></li><li><p>It is sometimes useful to assign attributes like a numeric <strong>weight</strong> to edges.</p></li></ul><p>We&#8217;ll add to this a little bit of simplified statistical terminology.</p><ul><li><p>A <strong>probability</strong> is a real number p where 0 &#8804; p &#8804; 1.</p></li><li><p>A <strong>discrete probability distribution</strong> is a finite collection of probability values that add up to 1.</p></li><li><p>If we have estimated or assigned probabilities to a collection of possible events, and those probabilities form a probability distribution, we refer to that collection of events as being <strong>stochastic</strong>.</p></li></ul><p>And now we&#8217;re ready to characterize our Discrete-Time Finite Markov Chains (DTFMCs to save typing) in terms of Directed Finite Graphs (DFGs).</p><ul><li><p>States in a DTFMC are nodes in a DFG.</p></li><li><p>If it is possible to transition from state <strong>a</strong> to state <strong>b</strong> in a DTFMC, then that is represented as the edge <strong>(a, b)</strong> in a DFG.</p></li><li><p>The probability of such a transition from state <strong>a</strong> to state <strong>b</strong> in a DTFMC is represented as the weight of the edge <strong>(a, 
b)</strong>.</p></li><li><p>In an adjacency matrix for the DFG, each row corresponds to a source state and each column corresponds to a target state. All non-zero values will be the weights (probabilities of a possible state-to-state transition).</p></li><li><p>For every node (state) in the DFG, its collection of weighted outedges (representing state transitions with non-zero probability) is stochastic.</p></li></ul><p>Those last two points tell us that when the DFG for a DTFMC is represented as an <strong>adjacency matrix</strong>, each row will sum to either:</p><ul><li><p>0 if and only if that state has no outedges.</p></li><li><p>1 if and only if that state has at least 1 outedge.</p></li></ul><p>This representation of a Markov Chain is sometimes referred to as <strong>row-stochastic</strong> or <strong>row-normalized</strong>. It would be possible to construct a matrix that instead was <strong>column-stochastic</strong> or <strong>column-normalized</strong>; in that case each column would hold the probability distribution over the source states that could transition into a given target state. Row-stochastic is for understanding how a specific source leads to many potential targets; column-stochastic is for understanding how many potential sources lead to a specific target. 
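</p><p>To make the two normalizations concrete, here is a minimal sketch. The 3&#215;3 trip counts are hypothetical numbers chosen only for illustration (they are not the taxi data used later), and NumPy is assumed to be installed.</p><pre><code>import numpy as np

# Hypothetical trip counts between three made-up states.
# Rows are sources, columns are targets.
counts = np.array([
    [0.0, 6.0, 4.0],
    [3.0, 0.0, 1.0],
    [5.0, 5.0, 0.0],
])

# Row-stochastic: divide each row by its own total, so the outgoing
# probabilities of every source sum to 1.
row_stochastic = counts / counts.sum(axis=1, keepdims=True)

# Column-stochastic: divide each column by its own total, so the incoming
# probabilities of every target sum to 1.
col_stochastic = counts / counts.sum(axis=0, keepdims=True)

print(row_stochastic.sum(axis=1))  # row totals
print(col_stochastic.sum(axis=0))  # column totals</code></pre><p>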
In this article we&#8217;ll be working with row-stochastic representations.</p><h2>NetworkX Example</h2><p>Consider the following taxi trip model as a Markov Chain:</p><ul><li><p>States are destinations.</p></li><li><p>Edges are trips between destinations.</p></li><li><p>Edge weights are the probabilities of going from one destination to another.</p></li></ul><p>Our specific data will be:</p><ul><li><p>From the airport, 70% of trips go to the hotel, 20% to the hospital, 10% to the mall.</p></li><li><p>From the hospital, 60% of trips go to the hotel, 40% to the airport.</p></li><li><p>From the hotel, 40% of trips go to the mall, 40% to the airport, 20% to the hospital.</p></li><li><p>From the mall, 70% of trips go to the hotel, 20% to the airport, 10% to the hospital.</p></li></ul><p>As shown in the previous article, we can represent such a weighted DFG as an adjacency matrix:</p><pre><code>import networkx as nx
import pandas as pd

# Each row is a source state; values follow the column order
# airport, hospital, hotel, mall.
adj_rows = dict(
    airport= [  0, 0.2, 0.7, 0.1],
    hospital=[0.4,   0, 0.6,   0],
    hotel=   [0.4, 0.2,   0, 0.4],
    mall=    [0.2, 0.1, 0.7,   0],
)
adj_matrix = pd.DataFrame.from_dict(adj_rows,
                                    orient='index',
                                    dtype=float,
                                    columns=list(adj_rows.keys()))
graph = nx.from_pandas_adjacency(adj_matrix, create_using=nx.DiGraph)</code></pre><p>We can use NetworkX to describe the appearance of the graph. First we describe the nodes and node labels:</p><pre><code>node_colors = dict(
    airport='purple',
    hospital='red',
    hotel='blue',
    mall='darkgreen',
)

for node_name, node_color in node_colors.items():
    nx.draw_networkx_nodes(graph,
                           pos=node_positions,
                           nodelist=[node_name],
                           node_color=node_color,
                           **node_params())
# Label each node with its own name.
node_labels = {node_name: node_name for node_name in node_colors}
nx.draw_networkx_labels(graph,
                        pos=node_positions,
                        labels=node_labels,
                        **node_label_params())</code></pre><p>Then we describe the edges and edge labels:</p><pre><code>import typing

# The cast only informs static type checkers; it has no runtime effect.
out_edges_view = typing.cast(nx.reportviews.OutEdgeView, graph.out_edges)
for node in graph:
    nx.draw_networkx_edges(graph,
                           pos=node_positions,
                           edgelist=out_edges_view(node),
                           edge_color=node_colors[node],
                           **edge_params())
    for src, target, data in out_edges_view(node, data=True):
        edge_label = {(src, target): data['weight']}
        nx.draw_networkx_edge_labels(graph,
                                     pos=node_positions,
                                     edge_labels=edge_label,
                                     font_color=node_colors[node],
                                     **edge_label_params())</code></pre><p>This renders as:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!lvOg!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbaa1b9c2-949d-41da-84f8-bae3a1a46d1b_794x587.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!lvOg!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbaa1b9c2-949d-41da-84f8-bae3a1a46d1b_794x587.png 424w, https://substackcdn.com/image/fetch/$s_!lvOg!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbaa1b9c2-949d-41da-84f8-bae3a1a46d1b_794x587.png 848w, https://substackcdn.com/image/fetch/$s_!lvOg!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbaa1b9c2-949d-41da-84f8-bae3a1a46d1b_794x587.png 1272w, https://substackcdn.com/image/fetch/$s_!lvOg!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbaa1b9c2-949d-41da-84f8-bae3a1a46d1b_794x587.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!lvOg!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbaa1b9c2-949d-41da-84f8-bae3a1a46d1b_794x587.png" width="794" height="587" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/baa1b9c2-949d-41da-84f8-bae3a1a46d1b_794x587.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:587,&quot;width&quot;:794,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:61941,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://substack.the-experimentalist.com/i/171681483?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbaa1b9c2-949d-41da-84f8-bae3a1a46d1b_794x587.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!lvOg!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbaa1b9c2-949d-41da-84f8-bae3a1a46d1b_794x587.png 424w, https://substackcdn.com/image/fetch/$s_!lvOg!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbaa1b9c2-949d-41da-84f8-bae3a1a46d1b_794x587.png 848w, https://substackcdn.com/image/fetch/$s_!lvOg!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbaa1b9c2-949d-41da-84f8-bae3a1a46d1b_794x587.png 1272w, https://substackcdn.com/image/fetch/$s_!lvOg!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbaa1b9c2-949d-41da-84f8-bae3a1a46d1b_794x587.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>The outedges of each node are colored to match their source. The weights of all edges of the same color will add up to 1, reflecting their stochastic nature.</p><p>Now we have a visualization of a Markov Chain. From any state (node), in one discrete time step travel will happen on one of the correspondingly-colored outedges to a target node, with probability of that event as shown by the weight on the edge.</p><h2>Counting DFG Walks</h2><p>Going back for a moment to our graphs, we will introduce an additional concept.</p><ul><li><p>A <strong>walk</strong> is a sequence of connected edges in a graph.</p></li><li><p>If the graph is a directed graph, edges only connect if each intermediate node functions first as a target for an edge, and then as a source for the next edge. Connections must respect directionality. 
If a graph has a directed edge <strong>(a, b)</strong>, and another edge <strong>(c, b)</strong>, you cannot directly connect those edges in a walk because both arrive at <strong>b</strong> and neither leaves it.</p></li></ul><p>Given a particular DFG, between any two nodes, how many walks can we identify of length 2? Length 3? The answer is given by raising the unweighted adjacency matrix to the corresponding power. In other words, if <strong>A</strong> is the adjacency matrix of a DFG, then the matrix product <strong>A &#215; A</strong> (a true matrix product, not element-wise multiplication) gives the number of walks of length 2 between each pair of nodes, <strong>A &#215; A &#215; A</strong> gives the number of walks of length 3, and so on.</p><p>This can be illustrated with NetworkX and an edge list for weather transitions:</p><pre><code>edge_rows = [
    dict(src='cloud', dest='cloud'),
    dict(src='cloud', dest='hail'),
    dict(src='cloud', dest='rain'),
    dict(src='cloud', dest='snow'),
    dict(src='cloud', dest='sun'),
    dict(src='hail', dest='cloud'),
    dict(src='hail', dest='hail'),
    dict(src='hail', dest='rain'),
    dict(src='rain', dest='cloud'),
    dict(src='rain', dest='hail'),
    dict(src='rain', dest='rain'),
    dict(src='rain', dest='snow'),
    dict(src='snow', dest='cloud'),
    dict(src='snow', dest='rain'),
    dict(src='snow', dest='snow'),
    dict(src='sun', dest='cloud'),
    dict(src='sun', dest='sun'),
]
edge_list_df = pd.DataFrame(edge_rows)
graph = nx.from_pandas_edgelist(edge_list_df,
                                source='src',
                                target='dest',
                                create_using=nx.DiGraph)
adj_matrix = nx.to_pandas_adjacency(graph, dtype=int)
adj_matrix</code></pre><p>The last step converted the edge list to an adjacency matrix as a Pandas DataFrame:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!AhSg!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F80ee7710-22ab-4de4-b419-dbc24598ac4c_666x243.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!AhSg!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F80ee7710-22ab-4de4-b419-dbc24598ac4c_666x243.png 424w, https://substackcdn.com/image/fetch/$s_!AhSg!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F80ee7710-22ab-4de4-b419-dbc24598ac4c_666x243.png 848w, https://substackcdn.com/image/fetch/$s_!AhSg!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F80ee7710-22ab-4de4-b419-dbc24598ac4c_666x243.png 1272w, https://substackcdn.com/image/fetch/$s_!AhSg!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F80ee7710-22ab-4de4-b419-dbc24598ac4c_666x243.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!AhSg!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F80ee7710-22ab-4de4-b419-dbc24598ac4c_666x243.png" width="666" height="243" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/80ee7710-22ab-4de4-b419-dbc24598ac4c_666x243.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:243,&quot;width&quot;:666,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:11748,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://substack.the-experimentalist.com/i/171681483?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F80ee7710-22ab-4de4-b419-dbc24598ac4c_666x243.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!AhSg!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F80ee7710-22ab-4de4-b419-dbc24598ac4c_666x243.png 424w, https://substackcdn.com/image/fetch/$s_!AhSg!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F80ee7710-22ab-4de4-b419-dbc24598ac4c_666x243.png 848w, https://substackcdn.com/image/fetch/$s_!AhSg!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F80ee7710-22ab-4de4-b419-dbc24598ac4c_666x243.png 1272w, https://substackcdn.com/image/fetch/$s_!AhSg!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F80ee7710-22ab-4de4-b419-dbc24598ac4c_666x243.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>In this data, sun cannot transition directly to hail, snow, or rain; getting there takes a walk of length at least 2 that passes through cloudy weather first. So let us count the walks of length 2. 
We can accomplish that with matrix multiplication as a &#8220;dot&#8221; product:</p><pre><code>adj_matrix @ adj_matrix</code></pre><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!GLFc!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a64c554-88a7-4bf7-a3f7-4278e932e5f7_666x239.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!GLFc!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a64c554-88a7-4bf7-a3f7-4278e932e5f7_666x239.png 424w, https://substackcdn.com/image/fetch/$s_!GLFc!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a64c554-88a7-4bf7-a3f7-4278e932e5f7_666x239.png 848w, https://substackcdn.com/image/fetch/$s_!GLFc!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a64c554-88a7-4bf7-a3f7-4278e932e5f7_666x239.png 1272w, https://substackcdn.com/image/fetch/$s_!GLFc!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a64c554-88a7-4bf7-a3f7-4278e932e5f7_666x239.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!GLFc!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a64c554-88a7-4bf7-a3f7-4278e932e5f7_666x239.png" width="666" height="239" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3a64c554-88a7-4bf7-a3f7-4278e932e5f7_666x239.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:239,&quot;width&quot;:666,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:12901,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://substack.the-experimentalist.com/i/171681483?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a64c554-88a7-4bf7-a3f7-4278e932e5f7_666x239.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!GLFc!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a64c554-88a7-4bf7-a3f7-4278e932e5f7_666x239.png 424w, https://substackcdn.com/image/fetch/$s_!GLFc!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a64c554-88a7-4bf7-a3f7-4278e932e5f7_666x239.png 848w, https://substackcdn.com/image/fetch/$s_!GLFc!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a64c554-88a7-4bf7-a3f7-4278e932e5f7_666x239.png 1272w, https://substackcdn.com/image/fetch/$s_!GLFc!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a64c554-88a7-4bf7-a3f7-4278e932e5f7_666x239.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>We can see, correctly,  that there is only 1 walk of length 2 that starts with sun and ends with snow. 
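</p><p>As a cross-check on the matrix result, the walks can also be counted by brute force. The sketch below is an illustration only: the helper name count_walks is made up for this example, NumPy is assumed to be installed, and the edges are retyped from the edge list above.</p><pre><code>import itertools

import numpy as np

nodes = ['cloud', 'hail', 'rain', 'snow', 'sun']
# The same weather transitions as the edge list above, as (source, target) pairs.
edges = {
    ('cloud', 'cloud'), ('cloud', 'hail'), ('cloud', 'rain'),
    ('cloud', 'snow'), ('cloud', 'sun'),
    ('hail', 'cloud'), ('hail', 'hail'), ('hail', 'rain'),
    ('rain', 'cloud'), ('rain', 'hail'), ('rain', 'rain'), ('rain', 'snow'),
    ('snow', 'cloud'), ('snow', 'rain'), ('snow', 'snow'),
    ('sun', 'cloud'), ('sun', 'sun'),
}


def count_walks(src, dest, length):
    """Count walks from src to dest by trying every sequence of intermediate nodes."""
    total = 0
    for middle in itertools.product(nodes, repeat=length - 1):
        stops = (src, *middle, dest)
        # A candidate is a walk only if every consecutive pair is an edge.
        if all(step in edges for step in zip(stops, stops[1:])):
            total += 1
    return total


# Unweighted adjacency matrix: 1 where an edge exists, 0 where it does not.
adj = np.array([[int((a, b) in edges) for b in nodes] for a in nodes])
adj_squared = adj @ adj

sun, snow = nodes.index('sun'), nodes.index('snow')
print(count_walks('sun', 'snow', 2), adj_squared[sun, snow])</code></pre><p>Both printed numbers should agree. The brute-force count is only practical for short walks, since the number of candidate sequences grows exponentially with the walk length.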
 </p><p>Also correct is the number of ways to start with sun and end with cloudy weather. There are 2: sun &#8594; sun &#8594; cloud and sun &#8594; cloud &#8594; cloud.</p><p>We don&#8217;t need to chain matrix products by hand for higher powers. NetworkX provides a function that takes the walk length directly. It returns a dict of dicts keyed first by source node and then by target node, so we build the Pandas DataFrame with orient='index' to keep sources as rows, matching the adjacency matrix:</p><pre><code>pd.DataFrame.from_dict(nx.number_of_walks(graph, 2), orient='index')</code></pre><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!zQQ_!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F16c02b4c-78d4-4670-880c-bf0719ca578e_667x241.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!zQQ_!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F16c02b4c-78d4-4670-880c-bf0719ca578e_667x241.png 424w, https://substackcdn.com/image/fetch/$s_!zQQ_!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F16c02b4c-78d4-4670-880c-bf0719ca578e_667x241.png 848w, https://substackcdn.com/image/fetch/$s_!zQQ_!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F16c02b4c-78d4-4670-880c-bf0719ca578e_667x241.png 1272w, https://substackcdn.com/image/fetch/$s_!zQQ_!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F16c02b4c-78d4-4670-880c-bf0719ca578e_667x241.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!zQQ_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F16c02b4c-78d4-4670-880c-bf0719ca578e_667x241.png" width="667" height="241" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/16c02b4c-78d4-4670-880c-bf0719ca578e_667x241.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:241,&quot;width&quot;:667,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:12911,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://substack.the-experimentalist.com/i/171681483?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F16c02b4c-78d4-4670-880c-bf0719ca578e_667x241.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!zQQ_!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F16c02b4c-78d4-4670-880c-bf0719ca578e_667x241.png 424w, https://substackcdn.com/image/fetch/$s_!zQQ_!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F16c02b4c-78d4-4670-880c-bf0719ca578e_667x241.png 848w, https://substackcdn.com/image/fetch/$s_!zQQ_!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F16c02b4c-78d4-4670-880c-bf0719ca578e_667x241.png 1272w, https://substackcdn.com/image/fetch/$s_!zQQ_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F16c02b4c-78d4-4670-880c-bf0719ca578e_667x241.png 1456w" 
sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>The results are the same as the calculation on the adjacency matrix.</p><p>The rest of the discussion returns to the taxi data.</p><h2>Measuring DTFMC Walk Probabilities</h2><p>In the DFG walk computations just shown, every entry in the adjacency matrix was 1 to denote a directed edge or 0 to denote its absence. The same computation also works when the weights are other numbers, like probabilities. Each walk then contributes the product of the probabilities along its edges, and the matrix entry is the sum of those products over all walks of the given length. 
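</p><p>A small worked check of that weighted sum, as a sketch: it retypes the taxi probabilities from the earlier example into a NumPy array (NumPy is an assumption here, the article itself uses Pandas) and compares a hand-written sum over intermediate stops with the corresponding entry of the squared matrix. For an airport-to-airport round trip in two steps, both values should come out near 0.38.</p><pre><code>import numpy as np

states = ['airport', 'hospital', 'hotel', 'mall']
# Taxi transition probabilities from the earlier example; rows are sources,
# columns are targets, both in the order given by `states`.
P = np.array([
    [0.0, 0.2, 0.7, 0.1],
    [0.4, 0.0, 0.6, 0.0],
    [0.4, 0.2, 0.0, 0.4],
    [0.2, 0.1, 0.7, 0.0],
])

src = states.index('airport')
dest = states.index('airport')

# Sum over every possible intermediate stop of
# P(src to stop) * P(stop to dest).
by_hand = sum(P[src, mid] * P[mid, dest] for mid in range(len(states)))

# The same quantity read from the squared matrix.
two_step = (P @ P)[src, dest]

print(by_hand, two_step)</code></pre><p>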
This means:</p><ul><li><p>Compute the <strong>n</strong>&#8217;th power of the Markov Chain&#8217;s adjacency matrix.</p></li><li><p>Each entry gives the probability of starting in a particular state <strong>a</strong> and ending up in another state <strong>b</strong>. If there is no way to get from <strong>a</strong> to <strong>b</strong> in exactly <strong>n</strong> steps then the value is <strong>0</strong>.</p></li><li><p>This is the probability of that outcome after exactly <strong>n</strong> steps in the walk.</p></li><li><p>The probability does not count the possibility of getting from <strong>a</strong> to <strong>b</strong> in fewer or more than <strong>n</strong> steps.</p></li></ul><p>Let&#8217;s try this for the taxi DTFMC defined earlier. The Pandas DataFrame for that adjacency matrix is:</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!kRlY!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F493cec15-d61c-45a7-aed2-41581c3a538d_670x208.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!kRlY!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F493cec15-d61c-45a7-aed2-41581c3a538d_670x208.png 424w, https://substackcdn.com/image/fetch/$s_!kRlY!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F493cec15-d61c-45a7-aed2-41581c3a538d_670x208.png 848w, https://substackcdn.com/image/fetch/$s_!kRlY!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F493cec15-d61c-45a7-aed2-41581c3a538d_670x208.png 1272w, 
https://substackcdn.com/image/fetch/$s_!kRlY!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F493cec15-d61c-45a7-aed2-41581c3a538d_670x208.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!kRlY!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F493cec15-d61c-45a7-aed2-41581c3a538d_670x208.png" width="670" height="208" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/493cec15-d61c-45a7-aed2-41581c3a538d_670x208.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:208,&quot;width&quot;:670,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:14927,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://substack.the-experimentalist.com/i/171681483?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F493cec15-d61c-45a7-aed2-41581c3a538d_670x208.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!kRlY!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F493cec15-d61c-45a7-aed2-41581c3a538d_670x208.png 424w, https://substackcdn.com/image/fetch/$s_!kRlY!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F493cec15-d61c-45a7-aed2-41581c3a538d_670x208.png 848w, 
https://substackcdn.com/image/fetch/$s_!kRlY!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F493cec15-d61c-45a7-aed2-41581c3a538d_670x208.png 1272w, https://substackcdn.com/image/fetch/$s_!kRlY!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F493cec15-d61c-45a7-aed2-41581c3a538d_670x208.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>We can compute the 3rd power of that graph. We&#8217;ll need to use the Pandas dot product for this, as NetworkX doesn&#8217;t understand weighting.</p><pre><code>(adj_matrix @ adj_matrix) @ adj_matrix</code></pre><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!dkZj!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff25ff331-a22d-49f7-8308-63e925a553a5_672x209.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!dkZj!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff25ff331-a22d-49f7-8308-63e925a553a5_672x209.png 424w, https://substackcdn.com/image/fetch/$s_!dkZj!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff25ff331-a22d-49f7-8308-63e925a553a5_672x209.png 848w, https://substackcdn.com/image/fetch/$s_!dkZj!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff25ff331-a22d-49f7-8308-63e925a553a5_672x209.png 1272w, 
https://substackcdn.com/image/fetch/$s_!dkZj!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff25ff331-a22d-49f7-8308-63e925a553a5_672x209.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!dkZj!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff25ff331-a22d-49f7-8308-63e925a553a5_672x209.png" width="672" height="209" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f25ff331-a22d-49f7-8308-63e925a553a5_672x209.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:209,&quot;width&quot;:672,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:15236,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://substack.the-experimentalist.com/i/171681483?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff25ff331-a22d-49f7-8308-63e925a553a5_672x209.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!dkZj!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff25ff331-a22d-49f7-8308-63e925a553a5_672x209.png 424w, https://substackcdn.com/image/fetch/$s_!dkZj!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff25ff331-a22d-49f7-8308-63e925a553a5_672x209.png 848w, 
https://substackcdn.com/image/fetch/$s_!dkZj!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff25ff331-a22d-49f7-8308-63e925a553a5_672x209.png 1272w, https://substackcdn.com/image/fetch/$s_!dkZj!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff25ff331-a22d-49f7-8308-63e925a553a5_672x209.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>As we can see, starting at the mall and returning again on the third trip has an 11.2% probability. Also note that no entries are zero. We&#8217;ll discuss the implications shortly.</p><h2>Markov Chains and PyDTMC</h2><p>So far we&#8217;ve just been exploring how to represent a Markov Chain as a graph with an associated matrix with per-state probability distributions. We can do more to analyze the characteristics of the model of taxi data that we&#8217;ve built.</p><pre><code>import pydtmc as mc

model = mc.MarkovChain(adj_matrix)
print(model)</code></pre><p>From this we can get a summary report about our model:</p><pre><code>DISCRETE-TIME MARKOV CHAIN
 SIZE:           4
 RANK:           4
 CLASSES:        1
  &gt; RECURRENT:   1
  &gt; TRANSIENT:   0
 ERGODIC:        YES
  &gt; APERIODIC:   YES
  &gt; IRREDUCIBLE: YES
 ABSORBING:      NO
 MONOTONE:       NO
 REGULAR:        YES
 REVERSIBLE:     NO
 SYMMETRIC:      NO</code></pre><p>The important characteristics for now are that the model is:</p><ul><li><p><strong>Regular</strong>. This is a particularly strong property for a Markov Model to possess, and it makes other operations much more approachable. It means some power of the transition matrix has all positive entries, and as a consequence there is a single probability distribution over the states that the chain converges to regardless of where it starts. Earlier we noted that the adjacency matrix raised to the third power had no zero entries: that is what showed the chain is regular.</p></li><li><p><strong>Ergodic</strong>. All the states are reachable from each other after a finite number of steps (<strong>irreducible</strong>), and the long-term distribution of states does not oscillate with periodicity (<strong>aperiodic</strong>).</p></li><li><p><strong>Recurrent</strong>. With a single recurrent class (meaning all states are in that class), no matter which state we start in, we will return to it infinitely many times as we traverse more and more time steps.</p></li></ul><p>Regular implies Ergodic. With Ergodic defined as above (irreducible and aperiodic), the reverse also holds for a finite chain; texts that use Ergodic to mean only irreducible will note that the reverse can fail.</p><p>Because we are dealing with Finite Markov Models, Irreducible implies Recurrent, and a chain whose states all belong to a single recurrent class is Irreducible. For infinite models, Irreducible would not guarantee Recurrent.</p><p>We can use PyDTMC to get that stationary distribution.</p><pre><code>model.stationary_distributions</code></pre><p>This is the long-run probability of finding the chain in each state. The order of the states here is the same as the order of the rows in the original adjacency matrix, so we spend the most time at the hotel: <code>0.4028777</code>.</p><pre><code>[array([0.25899281, 0.15107914, 0.4028777 , 0.18705036])]</code></pre><p>We can obtain some insight into whether the model finds it easy or difficult to converge to that distribution. 
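</p><p>A brute-force way to watch that convergence, sketched with NumPy alone: push a starting distribution through the chain one step at a time and measure how far it still is from the stationary distribution. The 3-state matrix <code>P</code> below is hypothetical (it is not the taxi data), and the names <code>pi</code>, <code>dist</code> and <code>distances</code> are illustrative assumptions rather than anything from the notebook.</p>

```python
import numpy as np

# Hypothetical 3-state transition matrix, NOT the taxi data from the article.
# Row i holds the probabilities of moving from state i to each state.
P = np.array([
    [0.6, 0.3, 0.1],
    [0.2, 0.5, 0.3],
    [0.3, 0.3, 0.4],
])
assert np.allclose(P.sum(axis=1), 1.0)  # every row must be a distribution

# Stationary distribution: the left eigenvector of P for eigenvalue 1,
# rescaled so its entries sum to 1 (so that pi @ P equals pi).
eigenvalues, eigenvectors = np.linalg.eig(P.T)
index_of_one = np.argmin(np.abs(eigenvalues - 1.0))
pi = np.real(eigenvectors[:, index_of_one])
pi = pi / pi.sum()

# Start with certainty in state 0, then take 20 steps of the chain.
dist = np.array([1.0, 0.0, 0.0])
distances = []
for _ in range(20):
    dist = dist @ P
    # Total variation distance between the current and stationary distributions.
    distances.append(0.5 * np.abs(dist - pi).sum())

print(pi)
print(distances[0], distances[-1])
```

<p>In this sketch the distance shrinks toward zero as more steps are taken. How quickly it shrinks is tied to the eigenvalues of the matrix, which can also be inspected directly. 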
One way is by plotting the eigenvalues of the matrix:</p><pre><code>mc.plot_eigenvalues(model)</code></pre><p>From that we get:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!sTYs!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3ff1d221-7f5d-4e53-960a-4144b6c24adf_576x598.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!sTYs!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3ff1d221-7f5d-4e53-960a-4144b6c24adf_576x598.png 424w, https://substackcdn.com/image/fetch/$s_!sTYs!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3ff1d221-7f5d-4e53-960a-4144b6c24adf_576x598.png 848w, https://substackcdn.com/image/fetch/$s_!sTYs!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3ff1d221-7f5d-4e53-960a-4144b6c24adf_576x598.png 1272w, https://substackcdn.com/image/fetch/$s_!sTYs!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3ff1d221-7f5d-4e53-960a-4144b6c24adf_576x598.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!sTYs!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3ff1d221-7f5d-4e53-960a-4144b6c24adf_576x598.png" width="576" height="598" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3ff1d221-7f5d-4e53-960a-4144b6c24adf_576x598.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:598,&quot;width&quot;:576,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:41124,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://substack.the-experimentalist.com/i/171681483?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3ff1d221-7f5d-4e53-960a-4144b6c24adf_576x598.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!sTYs!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3ff1d221-7f5d-4e53-960a-4144b6c24adf_576x598.png 424w, https://substackcdn.com/image/fetch/$s_!sTYs!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3ff1d221-7f5d-4e53-960a-4144b6c24adf_576x598.png 848w, https://substackcdn.com/image/fetch/$s_!sTYs!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3ff1d221-7f5d-4e53-960a-4144b6c24adf_576x598.png 1272w, https://substackcdn.com/image/fetch/$s_!sTYs!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3ff1d221-7f5d-4e53-960a-4144b6c24adf_576x598.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Take notice of the two largest-magnitude eigenvalues. The wide red ring shows the distance between them, called the <strong>spectral gap</strong>. 
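</p><p>The same magnitudes can be read off numerically. A minimal sketch, assuming a hypothetical 3-state matrix <code>P</code> (not the taxi data) and taking the gap to be 1 minus the second-largest eigenvalue magnitude, which is one common definition and is not confirmed here to be the exact formula PyDTMC uses; the names <code>magnitudes</code> and <code>gap</code> are illustrative.</p>

```python
import numpy as np

# Hypothetical 3-state transition matrix, NOT the taxi data from the article.
P = np.array([
    [0.6, 0.3, 0.1],
    [0.2, 0.5, 0.3],
    [0.3, 0.3, 0.4],
])

# Eigenvalue magnitudes, largest first. For a transition matrix whose rows
# sum to 1, the largest magnitude is always 1.
magnitudes = np.sort(np.abs(np.linalg.eigvals(P)))[::-1]

# Assumed definition: spectral gap = 1 - (second-largest magnitude).
gap = 1.0 - magnitudes[1]

print(magnitudes)
print(gap)
```

<p>Keep the two largest magnitudes in mind. 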
The closer they are, the more difficult it is for walks to converge to the stationary distribution.</p><p>We can find out the spectral gap directly:</p><pre><code>model.spectral_gap</code></pre><ul><li><p>When the value is closer to 1, convergence is faster.</p></li><li><p>When it is closer to 0, convergence is slower.</p></li></ul><pre><code>0.31078141211367716</code></pre><p>In our case convergence is a little sluggish.</p><p>We can get further insight into convergence with <strong>Spectral Density</strong>.</p><pre><code>model.density</code></pre><p>This tells us how variable the data is:</p><pre><code>0.9166666666666666</code></pre><ul><li><p>A value near 1 suggests convergence is easier.</p></li><li><p>A value near 0 indicates convergence is harder, regardless of what the spectral gap suggests.</p></li></ul><p>While it may sound a little counter-intuitive, data that is more variable actually makes convergence easier. When it is less variable, some low-probability transitions in the matrix can keep the model stuck instead of moving freely around all its states.</p><p>In the next article we&#8217;ll explore what that convergence behavior looks like.</p><h2>References</h2><ul><li><p><a href="https://gitlab.com/The-Experimentalist/article_support_2025/-/blob/main/article/2025-08-25-markov-chains-with-networkx-and-pydtmc/markov-chain.ipynb">Jupyter notebook for this article</a>.</p></li><li><p><a href="https://networkx.org/documentation/stable/reference/index.html">NetworkX API reference</a>.</p></li><li><p><a href="https://pandas.pydata.org/docs/reference/">Pandas API reference</a>.</p></li><li><p><a href="https://pypi.org/project/PyDTMC/">PyDTMC on PyPI</a>.</p></li></ul><p><a href="https://substack.the-experimentalist.com/p/markov-chains-with-networkx-and-pydtmc">The Experimentalist : Markov Chains with NetworkX and PyDTMC</a> &#169; 2025 by <a href="https://www.linkedin.com/in/reidmpinchback/">Reid M. Pinchback</a> is licensed under <a href="https://creativecommons.org/licenses/by-sa/4.0/">CC BY-SA 4.0</a></p>]]></content:encoded></item><item><title><![CDATA[Finite Graphs and NetworkX]]></title><description><![CDATA[Add representation and analysis of graph networks to your Python toolkit]]></description><link>https://substack.the-experimentalist.com/p/finite-graphs-and-networkx</link><guid isPermaLink="false">https://substack.the-experimentalist.com/p/finite-graphs-and-networkx</guid><dc:creator><![CDATA[Reid M. 
Pinchback]]></dc:creator><pubDate>Fri, 22 Aug 2025 01:00:17 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!h81y!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff9b34f5a-b0b8-40ee-9a82-bb6aafe49b39_514x400.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!h81y!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff9b34f5a-b0b8-40ee-9a82-bb6aafe49b39_514x400.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!h81y!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff9b34f5a-b0b8-40ee-9a82-bb6aafe49b39_514x400.jpeg 424w, https://substackcdn.com/image/fetch/$s_!h81y!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff9b34f5a-b0b8-40ee-9a82-bb6aafe49b39_514x400.jpeg 848w, https://substackcdn.com/image/fetch/$s_!h81y!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff9b34f5a-b0b8-40ee-9a82-bb6aafe49b39_514x400.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!h81y!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff9b34f5a-b0b8-40ee-9a82-bb6aafe49b39_514x400.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!h81y!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff9b34f5a-b0b8-40ee-9a82-bb6aafe49b39_514x400.jpeg" width="514" height="400" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f9b34f5a-b0b8-40ee-9a82-bb6aafe49b39_514x400.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:400,&quot;width&quot;:514,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:284227,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://substack.the-experimentalist.com/i/171538815?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff9b34f5a-b0b8-40ee-9a82-bb6aafe49b39_514x400.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!h81y!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff9b34f5a-b0b8-40ee-9a82-bb6aafe49b39_514x400.jpeg 424w, https://substackcdn.com/image/fetch/$s_!h81y!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff9b34f5a-b0b8-40ee-9a82-bb6aafe49b39_514x400.jpeg 848w, https://substackcdn.com/image/fetch/$s_!h81y!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff9b34f5a-b0b8-40ee-9a82-bb6aafe49b39_514x400.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!h81y!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff9b34f5a-b0b8-40ee-9a82-bb6aafe49b39_514x400.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p>Many areas of practical data analysis and systems reasoning are grounded in the idea of using a finite graph to represent a problem. Not only does this provide a way to structure data, but the structure itself may suit analysis by a variety of techniques.</p><p>We&#8217;re going to keep the context very straightforward in order to build intuition over a few articles. 
The dusty old math texts can stay on their shelves for this one.</p><div><hr></div><p><strong>Next: <a href="https://substack.the-experimentalist.com/p/markov-chains-with-networkx-and-pydtmc">Markov Chains with NetworkX and PyDTMC</a></strong></p><div><hr></div><h2>Finite Graph Terminology</h2><p><strong>Finite Graph</strong>: a finite collection of nodes and edges.</p><p><strong>Node</strong>: an object in a graph that will have 0 or more edges connecting it to other nodes.</p><p><strong>Edge</strong>: a line in the graph that connects two nodes. It is acceptable for the same node to be on both ends of a connection.</p><pre><code>import networkx as nx
import matplotlib.pyplot as plt
import pandas as pd
import random
import typing

graph: nx.Graph = nx.Graph()
graph.add_nodes_from(['a', 'b', 'c', 'd', 'e'])
graph.add_edges_from([('a', 'b'), ('a', 'c'), ('b', 'd'),
                      ('c', 'c'), ('c', 'e'), ('e','b')])
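# NodeLayout is not defined in this excerpt; it is assumed here to be an alias
# for the node-to-position mapping that NetworkX layout functions return.
NodeLayout = dict[typing.Any, typing.Any]
node_positions: NodeLayout = nx.circular_layout(graph)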
nx.draw(graph, pos=node_positions, **draw_params())</code></pre><p>Each tuple such as <code>('a', 'b')</code> names the two nodes at the ends of one edge. As this is an undirected graph, <code>('b', 'a')</code> would describe the same edge.</p><p>The <code>draw_params()</code> function is a convenience helper: it keeps the rendering parameters in one place so the examples do not repeat them.</p><pre><code>def override_params(params: dict[str, typing.Any],
                    **kwargs: typing.Any) -&gt; None:
    # Copy each caller-supplied override onto the defaults, in place.
    params.update(kwargs)

def draw_params(**kwargs: typing.Any) -&gt; dict[str, typing.Any]:
    params: dict[str, typing.Any] = dict(
        arrows=True,
        arrowsize=17,
        edge_color='darkgray',  # colors graph edges
        edgecolors='darkgray',  # colors border pixels of nodes
        font_color='black',
        font_size=15,
        font_weight='semibold',
        node_color='lightgray',
        node_size=525,
        width=2.5,
        with_labels=True,
    )
    override_params(params, **kwargs)
    return params</code></pre><p>The complete code (<a href="https://gitlab.com/The-Experimentalist/article_support_2025/-/blob/main/article/2025-08-21-finite-graphs-and-networkx/finite-graphs.ipynb">available in a Jupyter notebook</a>) renders:</p><p> </p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!CBdR!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3884c568-add4-445c-8bb2-0eca94978447_794x505.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!CBdR!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3884c568-add4-445c-8bb2-0eca94978447_794x505.png 424w, https://substackcdn.com/image/fetch/$s_!CBdR!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3884c568-add4-445c-8bb2-0eca94978447_794x505.png 848w, https://substackcdn.com/image/fetch/$s_!CBdR!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3884c568-add4-445c-8bb2-0eca94978447_794x505.png 1272w, https://substackcdn.com/image/fetch/$s_!CBdR!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3884c568-add4-445c-8bb2-0eca94978447_794x505.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!CBdR!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3884c568-add4-445c-8bb2-0eca94978447_794x505.png" width="794" height="505" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3884c568-add4-445c-8bb2-0eca94978447_794x505.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:505,&quot;width&quot;:794,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:29232,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://substack.the-experimentalist.com/i/171538815?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3884c568-add4-445c-8bb2-0eca94978447_794x505.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!CBdR!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3884c568-add4-445c-8bb2-0eca94978447_794x505.png 424w, https://substackcdn.com/image/fetch/$s_!CBdR!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3884c568-add4-445c-8bb2-0eca94978447_794x505.png 848w, https://substackcdn.com/image/fetch/$s_!CBdR!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3884c568-add4-445c-8bb2-0eca94978447_794x505.png 1272w, https://substackcdn.com/image/fetch/$s_!CBdR!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3884c568-add4-445c-8bb2-0eca94978447_794x505.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Display of node labels is optional, which is a useful feature for large graphs. 
Just add <code>with_labels=False</code> to <code>nx.draw()</code> via the helper method <code>draw_params()</code>:</p><pre><code>nx.draw(graph, pos=node_positions, **draw_params(with_labels=False))</code></pre><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!unPg!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa3013f9f-0ffe-42d1-a712-ea11b3e9efbe_794x505.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!unPg!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa3013f9f-0ffe-42d1-a712-ea11b3e9efbe_794x505.png 424w, https://substackcdn.com/image/fetch/$s_!unPg!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa3013f9f-0ffe-42d1-a712-ea11b3e9efbe_794x505.png 848w, https://substackcdn.com/image/fetch/$s_!unPg!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa3013f9f-0ffe-42d1-a712-ea11b3e9efbe_794x505.png 1272w, https://substackcdn.com/image/fetch/$s_!unPg!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa3013f9f-0ffe-42d1-a712-ea11b3e9efbe_794x505.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!unPg!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa3013f9f-0ffe-42d1-a712-ea11b3e9efbe_794x505.png" width="794" height="505" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a3013f9f-0ffe-42d1-a712-ea11b3e9efbe_794x505.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:505,&quot;width&quot;:794,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:29299,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://substack.the-experimentalist.com/i/171538815?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa3013f9f-0ffe-42d1-a712-ea11b3e9efbe_794x505.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!unPg!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa3013f9f-0ffe-42d1-a712-ea11b3e9efbe_794x505.png 424w, https://substackcdn.com/image/fetch/$s_!unPg!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa3013f9f-0ffe-42d1-a712-ea11b3e9efbe_794x505.png 848w, https://substackcdn.com/image/fetch/$s_!unPg!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa3013f9f-0ffe-42d1-a712-ea11b3e9efbe_794x505.png 1272w, https://substackcdn.com/image/fetch/$s_!unPg!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa3013f9f-0ffe-42d1-a712-ea11b3e9efbe_794x505.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>We can extend this idea a little further. We can give those edges a sense of direction.</p><h2>Directed Finite Graph Terminology</h2><p><strong>Node</strong>: as defined previously.</p><p><strong>Edge</strong>: a line in a graph that has a source node and a target node. When drawn, an arrow head points to the target node.</p><p>If the edges aren&#8217;t directed we call it an <strong>Undirected Finite Graph</strong> (or <strong>Undirected Graph</strong> for short), otherwise a <strong>Directed Finite Graph</strong> (again, <strong>Directed Graph</strong> for short).</p><p>Nodes in a directed graph now gain a bit more structure:</p><ul><li><p><strong>Outgoing Edges</strong>: the directed edges connected to a node that have that node as a source. 
Also called &#8220;<strong>outedges</strong>&#8221; in some math texts.</p></li><li><p><strong>Incoming Edges</strong>: the directed edges connected to a node that have that node as a target. Also called &#8220;<strong>inedges</strong>&#8221; in some math texts.</p></li></ul><p>Edges are represented the same way as before, except that the order within each identifier pair now matters, which it did not for undirected edges. The first member of a pair is the source, the second is the target. The only substantial code change is the NetworkX class used to define the graph, which changes from <code>nx.Graph</code> to <code>nx.DiGraph</code>:</p><pre><code>graph: nx.DiGraph = nx.DiGraph()
graph.add_nodes_from(['a', 'b', 'c', 'd', 'e'])
graph.add_edges_from([('a', 'b'), ('a', 'c'), ('b', 'd'),
                      ('c', 'c'), ('c', 'e'), ('e','b')])
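# Hedged aside, not part of the original notebook: a small helper to inspect
# direction. out_edges() and in_edges() are standard nx.DiGraph methods; the
# helper name edges_by_direction is illustrative only.
def edges_by_direction(directed_graph, node):
    """Return (outgoing, incoming) edge lists for one node."""
    outgoing = sorted(directed_graph.out_edges(node))  # node is the source
    incoming = sorted(directed_graph.in_edges(node))   # node is the target
    return outgoing, incoming

# For the edges above, edges_by_direction(graph, 'c') should give
# ([('c', 'c'), ('c', 'e')], [('a', 'c'), ('c', 'c')]).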
node_positions: NodeLayout  = nx.circular_layout(graph)
nx.draw(graph, pos=node_positions, **draw_params())</code></pre><p>With this change the rendering becomes:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!HAQ2!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F14c95c95-f775-4358-936f-bf6081554d0c_794x505.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!HAQ2!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F14c95c95-f775-4358-936f-bf6081554d0c_794x505.png 424w, https://substackcdn.com/image/fetch/$s_!HAQ2!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F14c95c95-f775-4358-936f-bf6081554d0c_794x505.png 848w, https://substackcdn.com/image/fetch/$s_!HAQ2!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F14c95c95-f775-4358-936f-bf6081554d0c_794x505.png 1272w, https://substackcdn.com/image/fetch/$s_!HAQ2!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F14c95c95-f775-4358-936f-bf6081554d0c_794x505.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!HAQ2!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F14c95c95-f775-4358-936f-bf6081554d0c_794x505.png" width="794" height="505" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/14c95c95-f775-4358-936f-bf6081554d0c_794x505.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:505,&quot;width&quot;:794,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:29590,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://substack.the-experimentalist.com/i/171538815?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F14c95c95-f775-4358-936f-bf6081554d0c_794x505.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!HAQ2!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F14c95c95-f775-4358-936f-bf6081554d0c_794x505.png 424w, https://substackcdn.com/image/fetch/$s_!HAQ2!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F14c95c95-f775-4358-936f-bf6081554d0c_794x505.png 848w, https://substackcdn.com/image/fetch/$s_!HAQ2!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F14c95c95-f775-4358-936f-bf6081554d0c_794x505.png 1272w, https://substackcdn.com/image/fetch/$s_!HAQ2!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F14c95c95-f775-4358-936f-bf6081554d0c_794x505.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><h2>Alternate Representations</h2><p>It can be mathematically useful, particularly for directed graphs, to think of them as corresponding to a matrix. Each row of a matrix corresponds to a source node, and each column corresponds to a target node. This is referred to as an <strong>Adjacency Matrix</strong>.</p><pre><code>adj_rows: dict[typing.Hashable, list[typing.Any]] = dict(
    a=[0, 1, 1, 0, 0],
    b=[0, 0, 0, 1, 0],
    c=[0, 0, 1, 0, 1],
    d=[0, 0, 0, 0, 0],
    e=[0, 1, 0, 0, 0]
)
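# Hedged aside, not part of the original notebook: reading the rows back as
# (source, target) pairs makes the row-is-source, column-is-target convention
# concrete. The helper name implied_edges is illustrative only.
def implied_edges(rows):
    """List a (source, target) pair for every non-zero matrix cell."""
    names = list(rows)  # row keys double as column keys, in the same order
    return [(source, names[column])
            for source, cells in rows.items()
            for column, cell in enumerate(cells)
            if cell != 0]

# implied_edges(adj_rows) should list the same six edges used earlier.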
adj_matrix = pd.DataFrame.from_dict(adj_rows,
                                    orient='index',
                                    dtype=int,
                                    columns=list(adj_rows.keys()))
graph = nx.from_pandas_adjacency(adj_matrix, create_using=nx.DiGraph)
node_positions: NodeLayout  = nx.circular_layout(graph)
nx.draw(graph, pos=node_positions, **draw_params())</code></pre><p>We use the Pandas trick of providing a keyed dictionary for rows and specifying the orientation. Thinking in terms of a known source having various targets is more intuitive than what column-specified data would force on us: to think in terms of one target having multiple sources. It also allows the visual aid of the text for populating <code>adj_rows</code> to correspond to how we have been taught to conceive of matrices. All we need to do is inform Pandas what the column keys are, which in this case is the same as the row keys: <code>columns=list(adj_rows.keys())</code>.</p><p></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!BcFI!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb1a4c917-a920-4a03-bb3e-4a9c027e858f_794x505.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!BcFI!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb1a4c917-a920-4a03-bb3e-4a9c027e858f_794x505.png 424w, https://substackcdn.com/image/fetch/$s_!BcFI!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb1a4c917-a920-4a03-bb3e-4a9c027e858f_794x505.png 848w, https://substackcdn.com/image/fetch/$s_!BcFI!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb1a4c917-a920-4a03-bb3e-4a9c027e858f_794x505.png 1272w, https://substackcdn.com/image/fetch/$s_!BcFI!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb1a4c917-a920-4a03-bb3e-4a9c027e858f_794x505.png 1456w" 
sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!BcFI!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb1a4c917-a920-4a03-bb3e-4a9c027e858f_794x505.png" width="794" height="505" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b1a4c917-a920-4a03-bb3e-4a9c027e858f_794x505.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:505,&quot;width&quot;:794,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:31481,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://substack.the-experimentalist.com/i/171538815?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb1a4c917-a920-4a03-bb3e-4a9c027e858f_794x505.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!BcFI!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb1a4c917-a920-4a03-bb3e-4a9c027e858f_794x505.png 424w, https://substackcdn.com/image/fetch/$s_!BcFI!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb1a4c917-a920-4a03-bb3e-4a9c027e858f_794x505.png 848w, https://substackcdn.com/image/fetch/$s_!BcFI!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb1a4c917-a920-4a03-bb3e-4a9c027e858f_794x505.png 1272w, 
https://substackcdn.com/image/fetch/$s_!BcFI!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb1a4c917-a920-4a03-bb3e-4a9c027e858f_794x505.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>The output is the same as before, which was the goal. 
This representation can be convenient when the existing understanding of a graph is organized in terms of outgoing edges, because all the outgoing edge connections for a node form the matrix row for that node.</p><p>The number 1 in the matrix indicates a directed edge from the source (the row index) to the target (the column index). Using 0 and 1 is a traditional starting point for adjacency matrices because it supports some useful linear algebra, but there is no constraint that only the number 1 be used to indicate a connection. Any non-zero value can represent, for example, weight or length.</p><p>The disadvantage of this representation, at least with NetworkX, is that the only information it can capture is node names, edge directions, and a single edge weight. Adjacency matrices for NetworkX do not support structured objects as cell values, so we can&#8217;t use one to specify multiple per-edge attributes. For that we need <strong>Edge Lists</strong>.</p><pre><code>edge_rows = [
    dict(src='a', dest='b'),
    dict(src='a', dest='c'),
    dict(src='b', dest='d'),
    dict(src='c', dest='c'),
    dict(src='c', dest='e'),
    dict(src='e', dest='b'),
]
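# Hedged aside, not part of the original notebook: a guard for the
# consistent-keys requirement, since a row with a missing key would leave a
# NaN gap in the DataFrame built below. The helper name is illustrative only.
def has_consistent_keys(rows):
    """Return True when every edge dictionary uses the same set of keys."""
    key_sets = {frozenset(row) for row in rows}
    return len(key_sets) == 1  # an empty list also fails the check

# e.g. check has_consistent_keys(edge_rows) before building the DataFrame.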
edge_list_df: pd.DataFrame = pd.DataFrame(edge_rows)
graph = nx.from_pandas_edgelist(edge_list_df,
                                source='src',
                                target='dest',
                                create_using=nx.DiGraph)
node_positions: NodeLayout  = nx.circular_layout(graph)
nx.draw(graph, pos=node_positions, **draw_params())</code></pre><p> The situation has changed a bit:</p><ul><li><p>Each edge must be specified as a dictionary with consistent keys indicating the source and target of the edge.</p></li><li><p>The edge list is converted into a Pandas <code>DataFrame</code>, and the dictionary keys become the column names.</p></li><li><p>We use <code>nx.from_pandas_edgelist()</code> to convert the <code>DataFrame</code> into a graph, which requires specifying the keys that were used for the source and target.</p></li></ul><p>The final rendering though, at least for this data, is unchanged:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!eZQi!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1ec072bb-7fce-4ea4-ada7-a2d274eea07e_794x505.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!eZQi!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1ec072bb-7fce-4ea4-ada7-a2d274eea07e_794x505.png 424w, https://substackcdn.com/image/fetch/$s_!eZQi!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1ec072bb-7fce-4ea4-ada7-a2d274eea07e_794x505.png 848w, https://substackcdn.com/image/fetch/$s_!eZQi!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1ec072bb-7fce-4ea4-ada7-a2d274eea07e_794x505.png 1272w, https://substackcdn.com/image/fetch/$s_!eZQi!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1ec072bb-7fce-4ea4-ada7-a2d274eea07e_794x505.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!eZQi!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1ec072bb-7fce-4ea4-ada7-a2d274eea07e_794x505.png" width="794" height="505" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1ec072bb-7fce-4ea4-ada7-a2d274eea07e_794x505.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:505,&quot;width&quot;:794,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:30527,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://substack.the-experimentalist.com/i/171538815?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1ec072bb-7fce-4ea4-ada7-a2d274eea07e_794x505.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!eZQi!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1ec072bb-7fce-4ea4-ada7-a2d274eea07e_794x505.png 424w, https://substackcdn.com/image/fetch/$s_!eZQi!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1ec072bb-7fce-4ea4-ada7-a2d274eea07e_794x505.png 848w, https://substackcdn.com/image/fetch/$s_!eZQi!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1ec072bb-7fce-4ea4-ada7-a2d274eea07e_794x505.png 1272w, https://substackcdn.com/image/fetch/$s_!eZQi!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1ec072bb-7fce-4ea4-ada7-a2d274eea07e_794x505.png 1456w" 
sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Now we can push this further to add more attributes to the edge data:</p><pre><code>def random_color_channel() -&gt; float:
    if not hasattr(random_color_channel, 'rand'):
        random_color_channel.rand = random.Random(42)
    return random_color_channel.rand.uniform(0.5, 1)

def random_rgb_color() -&gt; tuple[float, float, float]:
    # Parentheses are required for the tuple to span several lines.
    return (random_color_channel(),
            random_color_channel(),
            random_color_channel())

edge_rows = [
    dict(src='a', dest='b', weight=1, color=random_rgb_color()),
    dict(src='a', dest='c', weight=3, color=random_rgb_color()),
    dict(src='b', dest='d', weight=2, color=random_rgb_color()),
    dict(src='c', dest='c', weight=1, color=random_rgb_color()),
    dict(src='c', dest='e', weight=2, color=random_rgb_color()),
    dict(src='e', dest='b', weight=1, color=random_rgb_color()),
]
edge_list_df: pd.DataFrame = pd.DataFrame(edge_rows)
graph = nx.from_pandas_edgelist(edge_list_df,
                                source='src',
                                target='dest',
                                edge_attr=['weight', 'color'],
                                create_using=nx.DiGraph)
edge_weights = nx.get_edge_attributes(graph, 'weight')
edge_colors = list(nx.get_edge_attributes(graph, 'color').values())
node_positions: NodeLayout  = nx.circular_layout(graph)
nx.draw(graph, pos=node_positions, **draw_params(edge_color=edge_colors))
nx.draw_networkx_edge_labels(graph, pos=node_positions,
                             **edge_label_params(edge_labels=edge_weights))</code></pre><p>Now we&#8217;re adding some complexity:</p><ul><li><p>Edges have two attributes: <code>weight</code> and <code>color</code>.</p></li><li><p>The color is an RGB triple where each channel is a float in the range [0, 1]. The helper used here only samples from [0.5, 1].</p></li><li><p>When using <code>nx.from_pandas_edgelist()</code> to produce a graph we have to tell it about the per-edge attributes: <code>edge_attr=['weight', 'color']</code>.</p></li><li><p>NetworkX doesn&#8217;t do much with these attributes on its own when drawing, so we extract the data as <code>edge_weights</code> and <code>edge_colors</code> to pass later as parameters.</p></li><li><p>The <code>nx.draw()</code> function has limited support for some forms of rendering, so we also call <code>nx.draw_networkx_edge_labels()</code> to render the weights as values on each edge.</p></li><li><p><code>nx.draw()</code> and <code>nx.draw_networkx_edge_labels()</code> share an understanding of node locations via the <code>node_positions</code> layout object. In the previous examples it was only used once per rendering so its role was less apparent, but here we see it used to layer on rendering details. The object provides node coordinates in the rendering. 
I added a Python type alias to cover the kinds of data passed back by various layout algorithms:</p></li></ul><pre><code>NodeLayout: typing.TypeAlias = dict[typing.Hashable, list[float]]</code></pre><p>Now we get a much richer rendering with per-edge weight labels and per-edge colors:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!kAI-!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F607954dd-7a88-40de-9e9d-3bddf994337c_794x505.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!kAI-!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F607954dd-7a88-40de-9e9d-3bddf994337c_794x505.png 424w, https://substackcdn.com/image/fetch/$s_!kAI-!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F607954dd-7a88-40de-9e9d-3bddf994337c_794x505.png 848w, https://substackcdn.com/image/fetch/$s_!kAI-!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F607954dd-7a88-40de-9e9d-3bddf994337c_794x505.png 1272w, https://substackcdn.com/image/fetch/$s_!kAI-!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F607954dd-7a88-40de-9e9d-3bddf994337c_794x505.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!kAI-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F607954dd-7a88-40de-9e9d-3bddf994337c_794x505.png" width="794" height="505" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/607954dd-7a88-40de-9e9d-3bddf994337c_794x505.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:505,&quot;width&quot;:794,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:38601,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://substack.the-experimentalist.com/i/171538815?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F607954dd-7a88-40de-9e9d-3bddf994337c_794x505.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!kAI-!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F607954dd-7a88-40de-9e9d-3bddf994337c_794x505.png 424w, https://substackcdn.com/image/fetch/$s_!kAI-!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F607954dd-7a88-40de-9e9d-3bddf994337c_794x505.png 848w, https://substackcdn.com/image/fetch/$s_!kAI-!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F607954dd-7a88-40de-9e9d-3bddf994337c_794x505.png 1272w, https://substackcdn.com/image/fetch/$s_!kAI-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F607954dd-7a88-40de-9e9d-3bddf994337c_794x505.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>In future articles I expect to need to use various analytic techniques on graphs. 
Even without that, NetworkX can help with rendering images that support an explanation for how data is being transformed or analyzed.</p><h2>References</h2><ul><li><p><a href="https://gitlab.com/The-Experimentalist/article_support_2025/-/blob/main/article/2025-08-21-finite-graphs-and-networkx/finite-graphs.ipynb">Jupyter notebook for this article</a>.</p></li><li><p><a href="https://matplotlib.org/stable/api/">Matplotlib API reference</a>.</p></li><li><p><a href="https://networkx.org/documentation/stable/reference/index.html">NetworkX API reference</a>.</p></li><li><p><a href="https://pandas.pydata.org/docs/reference/">Pandas API reference</a>.</p></li></ul><h2>Edits</h2><ul><li><p><strong>2025-08-22:</strong> Added terminology for <strong>inedges</strong> and <strong>outedges</strong>.</p></li></ul><p><a href="https://substack.the-experimentalist.com/p/finite-graphs-and-networkx">The Experimentalist : Finite Graphs and NetworkX</a> &#169; 2025 by <a href="https://www.linkedin.com/in/reidmpinchback/">Reid M. 
Pinchback</a> is licensed under <a href="https://creativecommons.org/licenses/by-sa/4.0/">CC BY-SA 4.0</a></p>]]></content:encoded></item><item><title><![CDATA[Client Connectivity with PostgreSQL]]></title><description><![CDATA[Resolving database problems can begin with chasing down clients]]></description><link>https://substack.the-experimentalist.com/p/client-connectivity-with-postgresql</link><guid isPermaLink="false">https://substack.the-experimentalist.com/p/client-connectivity-with-postgresql</guid><dc:creator><![CDATA[Reid M. Pinchback]]></dc:creator><pubDate>Fri, 15 Aug 2025 03:28:40 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!Pv71!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2b539003-51a8-4ad6-8edd-c482866bc6c1_514x400.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Pv71!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2b539003-51a8-4ad6-8edd-c482866bc6c1_514x400.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Pv71!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2b539003-51a8-4ad6-8edd-c482866bc6c1_514x400.jpeg 424w, https://substackcdn.com/image/fetch/$s_!Pv71!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2b539003-51a8-4ad6-8edd-c482866bc6c1_514x400.jpeg 848w, https://substackcdn.com/image/fetch/$s_!Pv71!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2b539003-51a8-4ad6-8edd-c482866bc6c1_514x400.jpeg 1272w, 
https://substackcdn.com/image/fetch/$s_!Pv71!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2b539003-51a8-4ad6-8edd-c482866bc6c1_514x400.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Pv71!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2b539003-51a8-4ad6-8edd-c482866bc6c1_514x400.jpeg" width="514" height="400" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2b539003-51a8-4ad6-8edd-c482866bc6c1_514x400.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:400,&quot;width&quot;:514,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:280960,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://substack.the-experimentalist.com/i/170953957?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2b539003-51a8-4ad6-8edd-c482866bc6c1_514x400.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Pv71!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2b539003-51a8-4ad6-8edd-c482866bc6c1_514x400.jpeg 424w, https://substackcdn.com/image/fetch/$s_!Pv71!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2b539003-51a8-4ad6-8edd-c482866bc6c1_514x400.jpeg 848w, 
https://substackcdn.com/image/fetch/$s_!Pv71!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2b539003-51a8-4ad6-8edd-c482866bc6c1_514x400.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!Pv71!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2b539003-51a8-4ad6-8edd-c482866bc6c1_514x400.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>There are times when you need to chase down database client connections in PostgreSQL. 
Examples of the issues that arise are:</p><ul><li><p>Having too many idle connections, which can pointlessly throttle database performance due to the backend process resources allocated to each.</p></li><li><p>Having too many connections in total relative to the connection limit configured for the database (max_connections), resulting in new connections being denied.</p></li><li><p>Holding transaction-scoped locks for too long or acquiring them in a problematic order, which can create errors or deadlocks in other operations that need some of those same locks for work to proceed.</p></li><li><p>Holding transactions open too long regardless of any locking issues, which can thwart maintenance operations like vacuums and backups.</p></li><li><p>Burning through transaction ids at an unusual rate, which might trigger id wrap-around too fast for vacuum operations to keep up and thus cause the database to stop accepting new write transactions until the problem is fixed.</p></li><li><p>Seeing connection activity that is unexpected or suspicious, in which case you may have a security incident to investigate.</p></li></ul><div><hr></div><p><strong>Prev:</strong> <strong><a href="https://substack.the-experimentalist.com/p/postgresql-17-on-vagrant-and-virtualbox">PostgreSQL 17 on Vagrant and VirtualBox</a></strong></p><div><hr></div><h2>The TCP/IP Story</h2><p>When we set up server processes we often find ourselves having to configure one or more ports for access. We become very aware of server-side addressing via IP or DNS name plus a port. 
What is easy to forget is that the client end of the connection also has addressing information associated with it.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!UEB0!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe12f402b-163e-4dc9-8869-f43e545e33c3_591x201.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!UEB0!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe12f402b-163e-4dc9-8869-f43e545e33c3_591x201.png 424w, https://substackcdn.com/image/fetch/$s_!UEB0!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe12f402b-163e-4dc9-8869-f43e545e33c3_591x201.png 848w, https://substackcdn.com/image/fetch/$s_!UEB0!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe12f402b-163e-4dc9-8869-f43e545e33c3_591x201.png 1272w, https://substackcdn.com/image/fetch/$s_!UEB0!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe12f402b-163e-4dc9-8869-f43e545e33c3_591x201.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!UEB0!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe12f402b-163e-4dc9-8869-f43e545e33c3_591x201.png" width="591" height="201" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e12f402b-163e-4dc9-8869-f43e545e33c3_591x201.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:201,&quot;width&quot;:591,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:35771,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://substack.the-experimentalist.com/i/170953957?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe12f402b-163e-4dc9-8869-f43e545e33c3_591x201.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!UEB0!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe12f402b-163e-4dc9-8869-f43e545e33c3_591x201.png 424w, https://substackcdn.com/image/fetch/$s_!UEB0!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe12f402b-163e-4dc9-8869-f43e545e33c3_591x201.png 848w, https://substackcdn.com/image/fetch/$s_!UEB0!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe12f402b-163e-4dc9-8869-f43e545e33c3_591x201.png 1272w, https://substackcdn.com/image/fetch/$s_!UEB0!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe12f402b-163e-4dc9-8869-f43e545e33c3_591x201.png 1456w" sizes="100vw"></picture><div></div></div></a></figure></div><p>With each client-server interaction:</p><ul><li><p>There is a client process, and that process has a process id (PID).</p></li><li><p>There is a server process, which also has a 
PID.</p></li><li><p>Requests originate from the client and are sent as TCP/IP packets to the server.</p></li><li><p>Responses originate from the server and are sent as TCP/IP packets to the client.</p></li><li><p>The client finds the server via an IP address and TCP port. That port is the one we tend to be aware of as software engineers. When we configure a client we tell it the IP address &#8212; or indirectly, via a DNS name &#8212; and the port.</p></li><li><p>The server also finds the client via an IP address and TCP port. We don&#8217;t configure the information into the server, so how does it know about the client?</p></li></ul><p>The answer is in the IP and TCP headers for each packet.</p><ul><li><p>IP headers contain several fields, which include both source and destination IP addresses.</p></li><li><p>TCP headers also contain several fields, including both source and destination TCP ports.</p></li><li><p>The way a server knows where the client is located is by extracting those fields when a request is received, and using them when it generates a response.</p></li><li><p>While the server port resulted from static configuration, client ports are often determined on-the-fly and sometimes referred to as &#8220;ephemeral&#8221; ports.</p></li></ul><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!WbOS!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F35be141f-6ea2-4489-af1b-ba41bdfcd1c1_383x641.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!WbOS!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F35be141f-6ea2-4489-af1b-ba41bdfcd1c1_383x641.png 424w, 
https://substackcdn.com/image/fetch/$s_!WbOS!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F35be141f-6ea2-4489-af1b-ba41bdfcd1c1_383x641.png 848w, https://substackcdn.com/image/fetch/$s_!WbOS!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F35be141f-6ea2-4489-af1b-ba41bdfcd1c1_383x641.png 1272w, https://substackcdn.com/image/fetch/$s_!WbOS!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F35be141f-6ea2-4489-af1b-ba41bdfcd1c1_383x641.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!WbOS!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F35be141f-6ea2-4489-af1b-ba41bdfcd1c1_383x641.png" width="383" height="641" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/35be141f-6ea2-4489-af1b-ba41bdfcd1c1_383x641.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:641,&quot;width&quot;:383,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:51181,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://substack.the-experimentalist.com/i/170953957?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F35be141f-6ea2-4489-af1b-ba41bdfcd1c1_383x641.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!WbOS!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F35be141f-6ea2-4489-af1b-ba41bdfcd1c1_383x641.png 424w, https://substackcdn.com/image/fetch/$s_!WbOS!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F35be141f-6ea2-4489-af1b-ba41bdfcd1c1_383x641.png 848w, https://substackcdn.com/image/fetch/$s_!WbOS!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F35be141f-6ea2-4489-af1b-ba41bdfcd1c1_383x641.png 1272w, https://substackcdn.com/image/fetch/$s_!WbOS!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F35be141f-6ea2-4489-af1b-ba41bdfcd1c1_383x641.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" 
stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>There is information missing from these packets: the PIDs. Nothing in the network traffic knows about processes. That association is handled by operating system support which the application processes make use of.</p><ul><li><p>The client process will use a socket and bind it to the client IP address and ephemeral TCP port.</p></li><li><p>The server process will use a socket and bind it to the server IP address and statically-configured TCP port.</p></li></ul><h2>Ports and PostgreSQL</h2><p>The picture changes when you introduce a PostgreSQL service into the equation. Since TCP/IP packets don&#8217;t know anything about processes, there is no reason to restrict ourselves to a single server process for the implementation.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!l2-a!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0ee44fe8-570e-4376-9c9b-fb5b61916c81_591x281.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!l2-a!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0ee44fe8-570e-4376-9c9b-fb5b61916c81_591x281.png 424w, https://substackcdn.com/image/fetch/$s_!l2-a!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0ee44fe8-570e-4376-9c9b-fb5b61916c81_591x281.png 848w, 
https://substackcdn.com/image/fetch/$s_!l2-a!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0ee44fe8-570e-4376-9c9b-fb5b61916c81_591x281.png 1272w, https://substackcdn.com/image/fetch/$s_!l2-a!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0ee44fe8-570e-4376-9c9b-fb5b61916c81_591x281.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!l2-a!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0ee44fe8-570e-4376-9c9b-fb5b61916c81_591x281.png" width="591" height="281" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0ee44fe8-570e-4376-9c9b-fb5b61916c81_591x281.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:281,&quot;width&quot;:591,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:40844,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://substack.the-experimentalist.com/i/170953957?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0ee44fe8-570e-4376-9c9b-fb5b61916c81_591x281.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!l2-a!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0ee44fe8-570e-4376-9c9b-fb5b61916c81_591x281.png 424w, 
https://substackcdn.com/image/fetch/$s_!l2-a!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0ee44fe8-570e-4376-9c9b-fb5b61916c81_591x281.png 848w, https://substackcdn.com/image/fetch/$s_!l2-a!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0ee44fe8-570e-4376-9c9b-fb5b61916c81_591x281.png 1272w, https://substackcdn.com/image/fetch/$s_!l2-a!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0ee44fe8-570e-4376-9c9b-fb5b61916c81_591x281.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line 
x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>PostgreSQL has several processes, but for this discussion we are concerned with two:</p><ul><li><p>The Postmaster is the process that listens on the statically-configured, well-known port for incoming connection requests. The default is 5432.</p></li><li><p>For each client connection the Postmaster will launch a Backend process to do the work the client is requesting. The Backend is the process that will use the client IP address and ephemeral port for responding with the results.</p></li></ul><p>There are more complicated processing scenarios in PostgreSQL but we&#8217;ll ignore them for today. What is shown here accounts for what is typical.</p><p>We&#8217;re simplifying the picture a bit here, as clients and backends can continue this back-and-forth of requests and responses. The Backend is delegated to deal with both once the Postmaster finishes accepting the incoming client connection. The Postmaster then goes back to listening for fresh connection attempts.</p><p>Note that 5432 is not an entirely arbitrary choice of number. There is an official <a href="https://www.iana.org/assignments/service-names-port-numbers/service-names-port-numbers.xhtml?search=5432">IANA registration for the service name and port</a>. While you can configure PostgreSQL to use other ports, by sticking with 5432 you will find that some network-aware Linux tools automatically know that &#8220;5432&#8221; and &#8220;postgresql&#8221; are interchangeable.</p><h2>Gathering Client Connection Data</h2><p>All of this connection and request/response activity isn&#8217;t just hidden away in the memory state of the various processes. We can examine it. Some of that examination can happen within PostgreSQL itself, while some needs to be done with O/S tools. 
</p><p>We&#8217;ll explore this <a href="https://substack.the-experimentalist.com/p/postgresql-17-on-vagrant-and-virtualbox">via the Vagrant/VirtualBox VM from the previous article</a>. Refer to that if you haven&#8217;t already spun up your own PostgreSQL sandbox.</p><pre><code># cd to the project directory with the Vagrantfile before proceeding
vagrant up --provision
vagrant ssh</code></pre><p>Now we can use a <strong>psql</strong> session to do some looking around. We&#8217;ll use the <strong>vagrant</strong> database and username as we previously created that as a superuser. As a memory-jogger we just used the trivial password &#8220;vagrant&#8221; for that sandbox. We&#8217;ll get into the details of database authentication and security in future articles.</p><pre><code>psql --host=localhost --dbname=vagrant --username=vagrant</code></pre><p>You don&#8217;t have to specify the dbname as it is the one associated with the username, but it&#8217;s worth seeing an example of how to when you need to.</p><p>The ability to examine client connection activity is handled in PostgreSQL via the <a href="https://www.postgresql.org/docs/17/monitoring-stats.html">Cumulative Statistics System</a>. Under-the-hood PostgreSQL uses internal tables to gather activity data which you then examine via pre-defined &#8220;system views&#8221;. We&#8217;re going to put one of them to work: <a href="https://www.postgresql.org/docs/17/monitoring-stats.html#MONITORING-PG-STAT-ACTIVITY-VIEW">pg_stat_activity</a>.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!PcVl!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F07173d70-19d3-4ae3-9252-d66b7d435437_2048x810.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!PcVl!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F07173d70-19d3-4ae3-9252-d66b7d435437_2048x810.png 424w, 
https://substackcdn.com/image/fetch/$s_!PcVl!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F07173d70-19d3-4ae3-9252-d66b7d435437_2048x810.png 848w, https://substackcdn.com/image/fetch/$s_!PcVl!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F07173d70-19d3-4ae3-9252-d66b7d435437_2048x810.png 1272w, https://substackcdn.com/image/fetch/$s_!PcVl!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F07173d70-19d3-4ae3-9252-d66b7d435437_2048x810.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!PcVl!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F07173d70-19d3-4ae3-9252-d66b7d435437_2048x810.png" width="1456" height="576" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/07173d70-19d3-4ae3-9252-d66b7d435437_2048x810.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:576,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:225271,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://substack.the-experimentalist.com/i/170953957?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F07173d70-19d3-4ae3-9252-d66b7d435437_2048x810.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!PcVl!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F07173d70-19d3-4ae3-9252-d66b7d435437_2048x810.png 424w, https://substackcdn.com/image/fetch/$s_!PcVl!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F07173d70-19d3-4ae3-9252-d66b7d435437_2048x810.png 848w, https://substackcdn.com/image/fetch/$s_!PcVl!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F07173d70-19d3-4ae3-9252-d66b7d435437_2048x810.png 1272w, https://substackcdn.com/image/fetch/$s_!PcVl!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F07173d70-19d3-4ae3-9252-d66b7d435437_2048x810.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" 
stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Our <strong>psql</strong> session is the only client we currently have connected (as shown by the backend_type being &#8220;client backend&#8221;), and we can see it there in the response. As we are running on the same host as the database service and connected via &#8220;localhost&#8221;, our client IP address is, as expected, 127.0.0.1 as shown for client_addr. The client-side ephemeral port for the connection is 41592 as shown for client_port.</p><p>Notice the &#8220;pid&#8221; column. What is that 10614 process id for? Let&#8217;s look at the PostgreSQL processes just like we did in the previous article:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!XpFc!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faf0fb535-8b11-44db-8e81-dd5ec74f5c74_2048x762.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!XpFc!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faf0fb535-8b11-44db-8e81-dd5ec74f5c74_2048x762.png 424w, https://substackcdn.com/image/fetch/$s_!XpFc!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faf0fb535-8b11-44db-8e81-dd5ec74f5c74_2048x762.png 848w, 
https://substackcdn.com/image/fetch/$s_!XpFc!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faf0fb535-8b11-44db-8e81-dd5ec74f5c74_2048x762.png 1272w, https://substackcdn.com/image/fetch/$s_!XpFc!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faf0fb535-8b11-44db-8e81-dd5ec74f5c74_2048x762.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!XpFc!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faf0fb535-8b11-44db-8e81-dd5ec74f5c74_2048x762.png" width="1456" height="542" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/af0fb535-8b11-44db-8e81-dd5ec74f5c74_2048x762.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:542,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:317040,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://substack.the-experimentalist.com/i/170953957?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faf0fb535-8b11-44db-8e81-dd5ec74f5c74_2048x762.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!XpFc!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faf0fb535-8b11-44db-8e81-dd5ec74f5c74_2048x762.png 424w, 
https://substackcdn.com/image/fetch/$s_!XpFc!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faf0fb535-8b11-44db-8e81-dd5ec74f5c74_2048x762.png 848w, https://substackcdn.com/image/fetch/$s_!XpFc!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faf0fb535-8b11-44db-8e81-dd5ec74f5c74_2048x762.png 1272w, https://substackcdn.com/image/fetch/$s_!XpFc!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faf0fb535-8b11-44db-8e81-dd5ec74f5c74_2048x762.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line 
x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>This has one more row than was shown in the previous article. That bottom row is the corresponding backend for our <strong>psql</strong> session. The &#8220;pid&#8221; shown in <strong>psql</strong> was thus for the backend process, not the client process. The <strong>ps</strong> output even shows the client IP address and ephemeral port that particular backend is handling.</p><p>Also compare the first entry of the <strong>ps</strong> results to our query response. That process doesn&#8217;t show in <strong>psql</strong> at all! That&#8217;s because it is the parent process with the Postmaster functionality. It doesn&#8217;t connect to the database; it is the thing that manages connections along with the processing resources assigned to them. As it never connects to perform database work directly, pg_stat_activity doesn&#8217;t know about it.</p><p>We only have that single <strong>psql</strong> session so it&#8217;s trivial to identify the client. We&#8217;re going to increase the difficulty by launching two more clients. One will be another <strong>psql</strong> session on the same VM, and the other will be a <strong>psql</strong> session started from the underlying host that the VM is running on. 
As a reminder <a href="https://substack.the-experimentalist.com/p/postgresql-17-on-vagrant-and-virtualbox">from the last article</a>, we configured pg_hba.conf to allow connections via the mask 10.0.2.0/24 and here you are seeing why: the VM&#8217;s virtual network mapped the host to an address in that range.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!08r8!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F979110ed-fe21-4714-894c-c35bd12fbf06_2048x954.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!08r8!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F979110ed-fe21-4714-894c-c35bd12fbf06_2048x954.png 424w, https://substackcdn.com/image/fetch/$s_!08r8!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F979110ed-fe21-4714-894c-c35bd12fbf06_2048x954.png 848w, https://substackcdn.com/image/fetch/$s_!08r8!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F979110ed-fe21-4714-894c-c35bd12fbf06_2048x954.png 1272w, https://substackcdn.com/image/fetch/$s_!08r8!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F979110ed-fe21-4714-894c-c35bd12fbf06_2048x954.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!08r8!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F979110ed-fe21-4714-894c-c35bd12fbf06_2048x954.png" width="1456" height="678" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/979110ed-fe21-4714-894c-c35bd12fbf06_2048x954.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:678,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:258443,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://substack.the-experimentalist.com/i/170953957?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F979110ed-fe21-4714-894c-c35bd12fbf06_2048x954.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!08r8!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F979110ed-fe21-4714-894c-c35bd12fbf06_2048x954.png 424w, https://substackcdn.com/image/fetch/$s_!08r8!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F979110ed-fe21-4714-894c-c35bd12fbf06_2048x954.png 848w, https://substackcdn.com/image/fetch/$s_!08r8!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F979110ed-fe21-4714-894c-c35bd12fbf06_2048x954.png 1272w, https://substackcdn.com/image/fetch/$s_!08r8!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F979110ed-fe21-4714-894c-c35bd12fbf06_2048x954.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Now we have three <strong>psql</strong> &#8220;client backend&#8221; sessions and no obvious way to tell them apart. How do we figure out which client matches up with which backend? All we have are the backend process IDs, but those tell us nothing about each client.</p><p>Remember earlier we noted that sockets are what associate processes with IP addresses and ports? That&#8217;s how we can figure this out. 
In the previous article one of the provisioning steps installed the iproute2 package, which provides us with <strong>ss</strong>.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!L8vf!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87407081-6610-424f-aa4c-2aa643c27e61_2048x1122.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!L8vf!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87407081-6610-424f-aa4c-2aa643c27e61_2048x1122.png 424w, https://substackcdn.com/image/fetch/$s_!L8vf!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87407081-6610-424f-aa4c-2aa643c27e61_2048x1122.png 848w, https://substackcdn.com/image/fetch/$s_!L8vf!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87407081-6610-424f-aa4c-2aa643c27e61_2048x1122.png 1272w, https://substackcdn.com/image/fetch/$s_!L8vf!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87407081-6610-424f-aa4c-2aa643c27e61_2048x1122.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!L8vf!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87407081-6610-424f-aa4c-2aa643c27e61_2048x1122.png" width="1456" height="798" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/87407081-6610-424f-aa4c-2aa643c27e61_2048x1122.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:798,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:398267,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://substack.the-experimentalist.com/i/170953957?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87407081-6610-424f-aa4c-2aa643c27e61_2048x1122.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!L8vf!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87407081-6610-424f-aa4c-2aa643c27e61_2048x1122.png 424w, https://substackcdn.com/image/fetch/$s_!L8vf!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87407081-6610-424f-aa4c-2aa643c27e61_2048x1122.png 848w, https://substackcdn.com/image/fetch/$s_!L8vf!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87407081-6610-424f-aa4c-2aa643c27e61_2048x1122.png 1272w, https://substackcdn.com/image/fetch/$s_!L8vf!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87407081-6610-424f-aa4c-2aa643c27e61_2048x1122.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" 
width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>This shows us both the client and server connections that are on the VM.</p><ul><li><p>You can ignore the &#8220;sshd&#8221; rows; those are for the terminal sessions I created.</p></li><li><p><strong>sudo ss -tunp</strong> ensures we see the sockets and corresponding process data for all users.</p></li><li><p><strong>ss -tunp</strong> (i.e. no use of <strong>sudo</strong>, running as <strong>vagrant</strong>) would have only shown &#8220;Process&#8221; column details for the client process sockets of the current user.</p></li><li><p><strong>sudo -u postgres ss -tunp</strong> would have only shown &#8220;Process&#8221; column details for the backend process sockets. 
The backend processes run as the <strong>postgres</strong> user.</p></li><li><p>None of these shows the client details for the <strong>psql</strong> session launched from outside the VM. The only way to get those details is to perform the same kind of examination on the device that owns the client IP address.</p></li><li><p>The two entries with &#8220;psql&#8221; are the clients running on the VM. Each shows the &#8220;Local Address:Port&#8221;, which corresponds to the &#8220;client_addr&#8221; and &#8220;client_port&#8221; columns shown earlier in the pg_stat_activity response. The PID shown is for the client process, not the backend.</p></li><li><p>The three entries with &#8220;postgres&#8221; are the backends running on the VM. Each has the PID that corresponds to the &#8220;pid&#8221; column shown earlier in the pg_stat_activity response. The &#8220;Peer Address:Port&#8221; corresponds to the client IP and port that backend will respond to.</p></li></ul><p>As a quick aside, the &#8220;postgres&#8221; backend for the remote (host) <strong>psql</strong> client looks a little different from the other two cases. That <strong>psql</strong> client is not connecting via localhost; it is connecting via an IP address assigned by the VM&#8217;s virtual network. 
We can confirm that &#8220;10.0.2.15&#8221; is indeed the IP address assigned to the VM:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!qyfU!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5bd75153-5493-48e0-920a-60f9dc454fd4_1940x522.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!qyfU!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5bd75153-5493-48e0-920a-60f9dc454fd4_1940x522.png 424w, https://substackcdn.com/image/fetch/$s_!qyfU!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5bd75153-5493-48e0-920a-60f9dc454fd4_1940x522.png 848w, https://substackcdn.com/image/fetch/$s_!qyfU!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5bd75153-5493-48e0-920a-60f9dc454fd4_1940x522.png 1272w, https://substackcdn.com/image/fetch/$s_!qyfU!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5bd75153-5493-48e0-920a-60f9dc454fd4_1940x522.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!qyfU!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5bd75153-5493-48e0-920a-60f9dc454fd4_1940x522.png" width="1456" height="392" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5bd75153-5493-48e0-920a-60f9dc454fd4_1940x522.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:392,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:167466,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://substack.the-experimentalist.com/i/170953957?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5bd75153-5493-48e0-920a-60f9dc454fd4_1940x522.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!qyfU!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5bd75153-5493-48e0-920a-60f9dc454fd4_1940x522.png 424w, https://substackcdn.com/image/fetch/$s_!qyfU!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5bd75153-5493-48e0-920a-60f9dc454fd4_1940x522.png 848w, https://substackcdn.com/image/fetch/$s_!qyfU!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5bd75153-5493-48e0-920a-60f9dc454fd4_1940x522.png 1272w, https://substackcdn.com/image/fetch/$s_!qyfU!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5bd75153-5493-48e0-920a-60f9dc454fd4_1940x522.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>We are almost there.</p><ul><li><p>Get the &#8220;Local Address:Port&#8221; on the &#8220;psql&#8221; rows.</p></li><li><p>Get the &#8220;Peer Address:Port&#8221; on the &#8220;postgres&#8221; rows.</p></li><li><p>Match them up. In database terms, do a join on (IP, Port) across both subsets.</p></li><li><p>Get the &#8220;pid&#8221; value from the &#8220;Process&#8221; columns. There will be two per match-up, one for the client and one for the backend.</p></li></ul><pre><code>|-------------|----------------|-----------------|------------|
| Backend PID |   Backend Addr |     Client Addr | Client PID |
|-------------|----------------|-----------------|------------|
|       10614 | 127.0.0.1:5432 | 127.0.0.1:41592 |      10612 |
|       10730 | 127.0.0.1:5432 | 127.0.0.1:47160 |      10737 |
|       10741 | 10.0.2.15:5432 |  10.0.2.2:54092 |          ? |
|-------------|----------------|-----------------|------------|</code></pre><p>The final case is a little different because I&#8217;m running VirtualBox on Windows. For that we need to run netstat in a terminal window with privilege escalation:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!SjU0!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd513cbe9-378e-4fba-b5ed-8ce86f4bb762_1552x802.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!SjU0!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd513cbe9-378e-4fba-b5ed-8ce86f4bb762_1552x802.png 424w, https://substackcdn.com/image/fetch/$s_!SjU0!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd513cbe9-378e-4fba-b5ed-8ce86f4bb762_1552x802.png 848w, https://substackcdn.com/image/fetch/$s_!SjU0!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd513cbe9-378e-4fba-b5ed-8ce86f4bb762_1552x802.png 1272w, https://substackcdn.com/image/fetch/$s_!SjU0!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd513cbe9-378e-4fba-b5ed-8ce86f4bb762_1552x802.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!SjU0!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd513cbe9-378e-4fba-b5ed-8ce86f4bb762_1552x802.png" width="1456" height="752" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d513cbe9-378e-4fba-b5ed-8ce86f4bb762_1552x802.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:752,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:194090,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://substack.the-experimentalist.com/i/170953957?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd513cbe9-378e-4fba-b5ed-8ce86f4bb762_1552x802.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!SjU0!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd513cbe9-378e-4fba-b5ed-8ce86f4bb762_1552x802.png 424w, https://substackcdn.com/image/fetch/$s_!SjU0!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd513cbe9-378e-4fba-b5ed-8ce86f4bb762_1552x802.png 848w, https://substackcdn.com/image/fetch/$s_!SjU0!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd513cbe9-378e-4fba-b5ed-8ce86f4bb762_1552x802.png 1272w, https://substackcdn.com/image/fetch/$s_!SjU0!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd513cbe9-378e-4fba-b5ed-8ce86f4bb762_1552x802.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>That gives us similar information for the third client, including how the backend connection is being proxied by VirtualBox. The PID for the proxy won&#8217;t correspond to anything within the VM and can be ignored.</p><p>We now have a client-process-to-backend-process correspondence! Completing the table required using either <strong>netstat</strong> or <strong>ss</strong> and having access to the appropriate client host, as identified by &#8220;client_addr&#8221; in pg_stat_activity. Discovering the client PIDs allows us to investigate those processes on those hosts. We weren&#8217;t even constrained to only searching on Linux.</p><p>Could you automate all this? Yes, assuming the client hosts are accessible to you. 
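</p><p>As a small taste, the matching step described earlier (a join on IP and port) can be sketched in Python. This is a minimal sketch, not a complete tool: the <strong>SS_ROWS</strong> list and the <strong>match_backends</strong> function are hypothetical names, and the rows are hand-copied from the table above rather than parsed from real <strong>ss</strong> output.</p><pre><code># Hypothetical, already-parsed rows from "sudo ss -tunp" on the VM:
# (process name, pid, local "ip:port", peer "ip:port").
# The values mirror the table above; parsing real ss output is not shown.
SS_ROWS = [
    ("psql", 10612, "127.0.0.1:41592", "127.0.0.1:5432"),
    ("psql", 10737, "127.0.0.1:47160", "127.0.0.1:5432"),
    ("postgres", 10614, "127.0.0.1:5432", "127.0.0.1:41592"),
    ("postgres", 10730, "127.0.0.1:5432", "127.0.0.1:47160"),
    ("postgres", 10741, "10.0.2.15:5432", "10.0.2.2:54092"),
]


def match_backends(rows):
    """Join backend rows to client rows on the client (IP, port) pair."""
    # Index the client processes by their local address.
    clients = {local: pid for name, pid, local, _peer in rows if name == "psql"}
    matches = []
    for name, pid, local, peer in rows:
        if name != "postgres":
            continue
        # The peer address of a backend is the local address of its client.
        # A missing entry (None) means the client runs on another host.
        matches.append((pid, local, peer, clients.get(peer)))
    return matches


for backend_pid, backend_addr, client_addr, client_pid in match_backends(SS_ROWS):
    print(backend_pid, backend_addr, client_addr, client_pid or "?")
</code></pre><p>The remote client&#8217;s PID stays unknown here because its socket lives on another host. Collecting socket listings from every client host is where real automation would start. 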
That&#8217;s a discussion for another day.</p><p><a href="https://substack.the-experimentalist.com/p/client-connectivity-with-postgresql">The Experimentalist : Client Connectivity with PostgreSQL</a> &#169; 2025 by <a href="https://www.linkedin.com/in/reidmpinchback/">Reid M. Pinchback</a> is licensed under <a href="https://creativecommons.org/licenses/by-sa/4.0/">CC BY-SA 4.0</a></p>]]></content:encoded></item><item><title><![CDATA[PostgreSQL 17 on Vagrant and VirtualBox]]></title><description><![CDATA[Create a sandbox to explore how this database is put together]]></description><link>https://substack.the-experimentalist.com/p/postgresql-17-on-vagrant-and-virtualbox</link><guid isPermaLink="false">https://substack.the-experimentalist.com/p/postgresql-17-on-vagrant-and-virtualbox</guid><dc:creator><![CDATA[Reid M. 
Pinchback]]></dc:creator><pubDate>Thu, 14 Aug 2025 03:58:38 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!d5CR!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e1ff51a-e321-4f05-a5a0-9fb5d4560c3c_514x400.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!d5CR!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e1ff51a-e321-4f05-a5a0-9fb5d4560c3c_514x400.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!d5CR!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e1ff51a-e321-4f05-a5a0-9fb5d4560c3c_514x400.jpeg 424w, https://substackcdn.com/image/fetch/$s_!d5CR!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e1ff51a-e321-4f05-a5a0-9fb5d4560c3c_514x400.jpeg 848w, https://substackcdn.com/image/fetch/$s_!d5CR!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e1ff51a-e321-4f05-a5a0-9fb5d4560c3c_514x400.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!d5CR!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e1ff51a-e321-4f05-a5a0-9fb5d4560c3c_514x400.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!d5CR!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e1ff51a-e321-4f05-a5a0-9fb5d4560c3c_514x400.jpeg" width="514" height="400" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8e1ff51a-e321-4f05-a5a0-9fb5d4560c3c_514x400.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:400,&quot;width&quot;:514,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:356002,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://substack.the-experimentalist.com/i/170936087?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e1ff51a-e321-4f05-a5a0-9fb5d4560c3c_514x400.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!d5CR!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e1ff51a-e321-4f05-a5a0-9fb5d4560c3c_514x400.jpeg 424w, https://substackcdn.com/image/fetch/$s_!d5CR!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e1ff51a-e321-4f05-a5a0-9fb5d4560c3c_514x400.jpeg 848w, https://substackcdn.com/image/fetch/$s_!d5CR!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e1ff51a-e321-4f05-a5a0-9fb5d4560c3c_514x400.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!d5CR!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e1ff51a-e321-4f05-a5a0-9fb5d4560c3c_514x400.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" 
width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>In the <a href="https://substack.the-experimentalist.com/p/building-a-swe-experimentation-platform">article series on an exploratory lab</a> I&#8217;ll eventually tackle provisioning of container and virtualization-based sandboxes. 
To tease at the benefits, we&#8217;re going to use a Vagrant-VirtualBox setup to install PostgreSQL and poke around a running database.</p><p>If you aren&#8217;t already using them, download and install these tools in order:</p><ul><li><p><a href="https://www.virtualbox.org/wiki/Downloads">VirtualBox &#8594; https://www.virtualbox.org/wiki/Downloads</a></p></li><li><p><a href="https://developer.hashicorp.com/vagrant/install">Vagrant &#8594; https://developer.hashicorp.com/vagrant/install</a></p></li></ul><div><hr></div><p><strong>Next:</strong> <strong><a href="https://substack.the-experimentalist.com/p/client-connectivity-with-postgresql">Client Connectivity with PostgreSQL</a></strong></p><div><hr></div><p>Project files for the article are available from the newsletter&#8217;s GitLab repository:</p><ul><li><p><strong><a href="https://gitlab.com/The-Experimentalist/article_support_2025/-/blob/main/article/2025-08-13-postgresql-17-on-vagrant-and-virtualbox/Vagrantfile?ref_type=heads">Vagrantfile</a></strong></p></li><li><p><strong><a href="https://gitlab.com/The-Experimentalist/article_support_2025/-/blob/main/article/2025-08-13-postgresql-17-on-vagrant-and-virtualbox/provisions/postgresql.sh?ref_type=heads">provisions/postgresql.sh</a></strong></p></li></ul><p>Create a local project directory that preserves this layout, with <strong>postgresql.sh</strong> inside a <strong>provisions</strong> subdirectory. 
Once you&#8217;ve done that and installed both VirtualBox and Vagrant:</p><ul><li><p><strong>cd</strong> to the project directory</p></li><li><p><strong>vagrant up</strong></p></li></ul><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!mbd3!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8989aa86-d0d8-4195-a8fc-8fb4b7a26f3b_2048x1082.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!mbd3!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8989aa86-d0d8-4195-a8fc-8fb4b7a26f3b_2048x1082.png 424w, https://substackcdn.com/image/fetch/$s_!mbd3!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8989aa86-d0d8-4195-a8fc-8fb4b7a26f3b_2048x1082.png 848w, https://substackcdn.com/image/fetch/$s_!mbd3!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8989aa86-d0d8-4195-a8fc-8fb4b7a26f3b_2048x1082.png 1272w, https://substackcdn.com/image/fetch/$s_!mbd3!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8989aa86-d0d8-4195-a8fc-8fb4b7a26f3b_2048x1082.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!mbd3!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8989aa86-d0d8-4195-a8fc-8fb4b7a26f3b_2048x1082.png" width="1456" height="769" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8989aa86-d0d8-4195-a8fc-8fb4b7a26f3b_2048x1082.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:769,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:383179,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://substack.the-experimentalist.com/i/170936087?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8989aa86-d0d8-4195-a8fc-8fb4b7a26f3b_2048x1082.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!mbd3!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8989aa86-d0d8-4195-a8fc-8fb4b7a26f3b_2048x1082.png 424w, https://substackcdn.com/image/fetch/$s_!mbd3!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8989aa86-d0d8-4195-a8fc-8fb4b7a26f3b_2048x1082.png 848w, https://substackcdn.com/image/fetch/$s_!mbd3!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8989aa86-d0d8-4195-a8fc-8fb4b7a26f3b_2048x1082.png 1272w, https://substackcdn.com/image/fetch/$s_!mbd3!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8989aa86-d0d8-4195-a8fc-8fb4b7a26f3b_2048x1082.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" 
width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>This will get the ball rolling. If the install goes slowly, it&#8217;ll usually be right after generating the SSH key. That tends to happen when there is some difficulty mapping the hardware functionality the O/S expects to what VirtualBox is able to provide. I&#8217;ve found <a href="https://portal.cloud.hashicorp.com/vagrant/discover/bento/ubuntu-22.04">the Ubuntu 22.04 Bento box</a> to work pretty smoothly. 
It may take 10-20 minutes to download an O/S image, build a VM from it, and install PostgreSQL, and the install will conclude with:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Urz6!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb1276dbb-b0df-4f8a-b408-8e45d0a2e4b3_2048x1202.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Urz6!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb1276dbb-b0df-4f8a-b408-8e45d0a2e4b3_2048x1202.png 424w, https://substackcdn.com/image/fetch/$s_!Urz6!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb1276dbb-b0df-4f8a-b408-8e45d0a2e4b3_2048x1202.png 848w, https://substackcdn.com/image/fetch/$s_!Urz6!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb1276dbb-b0df-4f8a-b408-8e45d0a2e4b3_2048x1202.png 1272w, https://substackcdn.com/image/fetch/$s_!Urz6!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb1276dbb-b0df-4f8a-b408-8e45d0a2e4b3_2048x1202.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Urz6!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb1276dbb-b0df-4f8a-b408-8e45d0a2e4b3_2048x1202.png" width="1456" height="855" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b1276dbb-b0df-4f8a-b408-8e45d0a2e4b3_2048x1202.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:855,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:396924,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://substack.the-experimentalist.com/i/170936087?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb1276dbb-b0df-4f8a-b408-8e45d0a2e4b3_2048x1202.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Urz6!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb1276dbb-b0df-4f8a-b408-8e45d0a2e4b3_2048x1202.png 424w, https://substackcdn.com/image/fetch/$s_!Urz6!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb1276dbb-b0df-4f8a-b408-8e45d0a2e4b3_2048x1202.png 848w, https://substackcdn.com/image/fetch/$s_!Urz6!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb1276dbb-b0df-4f8a-b408-8e45d0a2e4b3_2048x1202.png 1272w, https://substackcdn.com/image/fetch/$s_!Urz6!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb1276dbb-b0df-4f8a-b408-8e45d0a2e4b3_2048x1202.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" 
width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Any future restarts of the box will be much faster. At this point you have:</p><ul><li><p>A running Ubuntu 22.04 VM.</p></li><li><p>A running PostgreSQL 17 service.</p></li><li><p>An Ubuntu login account named &#8220;<strong>vagrant</strong>&#8221; you can <strong>ssh</strong> to.</p></li><li><p>A PostgreSQL user named &#8220;<strong>vagrant</strong>&#8221; with the password &#8220;<strong>vagrant</strong>&#8221;.</p></li><li><p>The ability to connect to the database as that user from either a client running within the VM or a client on the underlying host operating system.</p></li></ul><p>Let&#8217;s take a look at how that came about. 
The driver of the bus is the <strong>Vagrantfile</strong>:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Q3yZ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9fee794c-7f1f-4fdd-aaf0-774b05d6b6e4_1974x1002.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Q3yZ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9fee794c-7f1f-4fdd-aaf0-774b05d6b6e4_1974x1002.png 424w, https://substackcdn.com/image/fetch/$s_!Q3yZ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9fee794c-7f1f-4fdd-aaf0-774b05d6b6e4_1974x1002.png 848w, https://substackcdn.com/image/fetch/$s_!Q3yZ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9fee794c-7f1f-4fdd-aaf0-774b05d6b6e4_1974x1002.png 1272w, https://substackcdn.com/image/fetch/$s_!Q3yZ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9fee794c-7f1f-4fdd-aaf0-774b05d6b6e4_1974x1002.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Q3yZ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9fee794c-7f1f-4fdd-aaf0-774b05d6b6e4_1974x1002.png" width="1456" height="739" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9fee794c-7f1f-4fdd-aaf0-774b05d6b6e4_1974x1002.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:739,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:256702,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://substack.the-experimentalist.com/i/170936087?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9fee794c-7f1f-4fdd-aaf0-774b05d6b6e4_1974x1002.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Q3yZ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9fee794c-7f1f-4fdd-aaf0-774b05d6b6e4_1974x1002.png 424w, https://substackcdn.com/image/fetch/$s_!Q3yZ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9fee794c-7f1f-4fdd-aaf0-774b05d6b6e4_1974x1002.png 848w, https://substackcdn.com/image/fetch/$s_!Q3yZ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9fee794c-7f1f-4fdd-aaf0-774b05d6b6e4_1974x1002.png 1272w, https://substackcdn.com/image/fetch/$s_!Q3yZ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9fee794c-7f1f-4fdd-aaf0-774b05d6b6e4_1974x1002.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" 
width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>This is the file from the GitLab repository but with comments stripped out. Let&#8217;s walk through the key parts.</p><p>Specify the O/S image you want for the VM:</p><pre><code>config.vm.box = "bento/ubuntu-22.04"
config.vm.box_version = "202508.03.0"</code></pre><p>PostgreSQL by default listens on port 5432. We want to make it possible for a database client running on the host to connect to the database service within the VM on that same port number:</p><pre><code>config.vm.network "forwarded_port", guest: 5432, host: 5432, \
  host_ip: "127.0.0.1"</code></pre><p>We specify that the provider is VirtualBox, but if you happen to already use Hyper-V or VMware, this is where you would make the change to use them instead:</p><pre><code>config.vm.provider "virtualbox" do |vb|</code></pre><p>Finally we have a script to run as <strong>root</strong> for provisioning the VM once it is ready. Ours sets up PostgreSQL and does other assorted housekeeping:</p><pre><code>config.vm.provision "shell", name: "postgresql", \
  path: "provisions/postgresql.sh", privileged: true</code></pre><p>All of the real magic happens in the provisioning script. First we set up the firewall and install some utilities that may come in handy later:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!YUGg!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fde685877-84e6-4204-ad1a-09af140aeb99_1736x1122.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!YUGg!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fde685877-84e6-4204-ad1a-09af140aeb99_1736x1122.png 424w, https://substackcdn.com/image/fetch/$s_!YUGg!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fde685877-84e6-4204-ad1a-09af140aeb99_1736x1122.png 848w, https://substackcdn.com/image/fetch/$s_!YUGg!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fde685877-84e6-4204-ad1a-09af140aeb99_1736x1122.png 1272w, https://substackcdn.com/image/fetch/$s_!YUGg!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fde685877-84e6-4204-ad1a-09af140aeb99_1736x1122.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!YUGg!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fde685877-84e6-4204-ad1a-09af140aeb99_1736x1122.png" width="1456" height="941" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/de685877-84e6-4204-ad1a-09af140aeb99_1736x1122.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:941,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:244411,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://substack.the-experimentalist.com/i/170936087?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fde685877-84e6-4204-ad1a-09af140aeb99_1736x1122.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!YUGg!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fde685877-84e6-4204-ad1a-09af140aeb99_1736x1122.png 424w, https://substackcdn.com/image/fetch/$s_!YUGg!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fde685877-84e6-4204-ad1a-09af140aeb99_1736x1122.png 848w, https://substackcdn.com/image/fetch/$s_!YUGg!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fde685877-84e6-4204-ad1a-09af140aeb99_1736x1122.png 1272w, https://substackcdn.com/image/fetch/$s_!YUGg!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fde685877-84e6-4204-ad1a-09af140aeb99_1736x1122.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" 
width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Then we install the APT repository for PostgreSQL, and install the service:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!28lh!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa5a239f5-f975-4f33-8ce6-873d0d6b2b27_1534x602.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!28lh!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa5a239f5-f975-4f33-8ce6-873d0d6b2b27_1534x602.png 424w, 
https://substackcdn.com/image/fetch/$s_!28lh!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa5a239f5-f975-4f33-8ce6-873d0d6b2b27_1534x602.png 848w, https://substackcdn.com/image/fetch/$s_!28lh!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa5a239f5-f975-4f33-8ce6-873d0d6b2b27_1534x602.png 1272w, https://substackcdn.com/image/fetch/$s_!28lh!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa5a239f5-f975-4f33-8ce6-873d0d6b2b27_1534x602.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!28lh!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa5a239f5-f975-4f33-8ce6-873d0d6b2b27_1534x602.png" width="1456" height="571" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a5a239f5-f975-4f33-8ce6-873d0d6b2b27_1534x602.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:571,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:169927,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://substack.the-experimentalist.com/i/170936087?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa5a239f5-f975-4f33-8ce6-873d0d6b2b27_1534x602.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!28lh!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa5a239f5-f975-4f33-8ce6-873d0d6b2b27_1534x602.png 424w, https://substackcdn.com/image/fetch/$s_!28lh!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa5a239f5-f975-4f33-8ce6-873d0d6b2b27_1534x602.png 848w, https://substackcdn.com/image/fetch/$s_!28lh!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa5a239f5-f975-4f33-8ce6-873d0d6b2b27_1534x602.png 1272w, https://substackcdn.com/image/fetch/$s_!28lh!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa5a239f5-f975-4f33-8ce6-873d0d6b2b27_1534x602.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" 
stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The next step is a bit more packed and needs explaining.</p><p>We want to configure the database to have an initial user and database schema to work in. Vagrant allows us to re-run provisioning scripts. Those scripts should be idempotent so they don&#8217;t fail the next time they are run. A zero-byte file named <strong>.configured</strong> is created as a marker to note we&#8217;ve already been here before.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!qCd1!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6f12d522-65f0-4a5b-9ec1-c37cac4ea271_1990x1122.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!qCd1!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6f12d522-65f0-4a5b-9ec1-c37cac4ea271_1990x1122.png 424w, https://substackcdn.com/image/fetch/$s_!qCd1!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6f12d522-65f0-4a5b-9ec1-c37cac4ea271_1990x1122.png 848w, https://substackcdn.com/image/fetch/$s_!qCd1!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6f12d522-65f0-4a5b-9ec1-c37cac4ea271_1990x1122.png 1272w, 
https://substackcdn.com/image/fetch/$s_!qCd1!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6f12d522-65f0-4a5b-9ec1-c37cac4ea271_1990x1122.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!qCd1!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6f12d522-65f0-4a5b-9ec1-c37cac4ea271_1990x1122.png" width="1456" height="821" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6f12d522-65f0-4a5b-9ec1-c37cac4ea271_1990x1122.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:821,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:314471,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://substack.the-experimentalist.com/i/170936087?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6f12d522-65f0-4a5b-9ec1-c37cac4ea271_1990x1122.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!qCd1!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6f12d522-65f0-4a5b-9ec1-c37cac4ea271_1990x1122.png 424w, https://substackcdn.com/image/fetch/$s_!qCd1!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6f12d522-65f0-4a5b-9ec1-c37cac4ea271_1990x1122.png 848w, 
https://substackcdn.com/image/fetch/$s_!qCd1!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6f12d522-65f0-4a5b-9ec1-c37cac4ea271_1990x1122.png 1272w, https://substackcdn.com/image/fetch/$s_!qCd1!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6f12d522-65f0-4a5b-9ec1-c37cac4ea271_1990x1122.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The work of the provisioning on the first pass consists of:</p><ul><li><p>Change <strong>postgresql.conf</strong> to listen for 
connections coming in from any host. This is done so that it responds not just to clients inside the VM, but also to any client the Vagrant configuration allows access to.</p></li><li><p>Create a user named <strong>vagrant</strong>, which for simplicity here I&#8217;ve made a superuser just like the default user <strong>postgres</strong> already is.</p></li><li><p>Provide the <strong>vagrant</strong> user with a password to allow login. Password-based authentication is not the only option, but we&#8217;re going for simple.</p></li><li><p>Create a database for the new user and give it the same name as that user.</p></li><li><p>Change <strong>pg_hba.conf</strong> to allow the new user to have access. We specify both the loopback address to allow login from within the VM, and an address mask suited to any connections over eth0 with IP addresses assigned to the VM or host.</p></li><li><p>Restart the database service once everything is ready. If this were a production instance I would dig into the docs to remind myself which <strong>postgresql.conf</strong> settings require a restart versus a config reload. Here a restart works fine, and I want a chance to confirm the systemd unit for the service in any case.</p></li></ul><p>Some of the steps are wrapped in a heredoc to run as the <strong>postgres</strong> user. We are using nested heredocs: the outer level to process steps as <strong>postgres</strong>, and the inner level to specify content to append to the <strong>pg_hba.conf</strong> file. Vagrant runs the entire provisioning script as a privileged user because we said so in the <strong>Vagrantfile</strong>, which for us means as <strong>root</strong>. The PostgreSQL service itself runs as <strong>postgres</strong>.</p><p>Finally, the provisioning script cleans up any dangling APT state. 
If anything installed flagged a need to reboot the host (in this case, the VM) to complete an installation, then it gets handled now.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!AEXC!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc71a17c1-eba9-4aa1-9b6f-831ac57f8596_944x562.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!AEXC!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc71a17c1-eba9-4aa1-9b6f-831ac57f8596_944x562.png 424w, https://substackcdn.com/image/fetch/$s_!AEXC!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc71a17c1-eba9-4aa1-9b6f-831ac57f8596_944x562.png 848w, https://substackcdn.com/image/fetch/$s_!AEXC!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc71a17c1-eba9-4aa1-9b6f-831ac57f8596_944x562.png 1272w, https://substackcdn.com/image/fetch/$s_!AEXC!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc71a17c1-eba9-4aa1-9b6f-831ac57f8596_944x562.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!AEXC!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc71a17c1-eba9-4aa1-9b6f-831ac57f8596_944x562.png" width="944" height="562" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c71a17c1-eba9-4aa1-9b6f-831ac57f8596_944x562.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:562,&quot;width&quot;:944,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:102480,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://substack.the-experimentalist.com/i/170936087?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc71a17c1-eba9-4aa1-9b6f-831ac57f8596_944x562.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!AEXC!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc71a17c1-eba9-4aa1-9b6f-831ac57f8596_944x562.png 424w, https://substackcdn.com/image/fetch/$s_!AEXC!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc71a17c1-eba9-4aa1-9b6f-831ac57f8596_944x562.png 848w, https://substackcdn.com/image/fetch/$s_!AEXC!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc71a17c1-eba9-4aa1-9b6f-831ac57f8596_944x562.png 1272w, https://substackcdn.com/image/fetch/$s_!AEXC!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc71a17c1-eba9-4aa1-9b6f-831ac57f8596_944x562.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>We now have the ability to <strong>ssh</strong> into the running VM. 
We use <strong>vagrant ssh</strong> for that, because it knows about the key pair that Vagrant generated when the VM was created.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!FQed!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F24cd89c2-df6c-4526-80ae-b03e900f9a9f_1586x1242.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!FQed!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F24cd89c2-df6c-4526-80ae-b03e900f9a9f_1586x1242.png 424w, https://substackcdn.com/image/fetch/$s_!FQed!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F24cd89c2-df6c-4526-80ae-b03e900f9a9f_1586x1242.png 848w, https://substackcdn.com/image/fetch/$s_!FQed!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F24cd89c2-df6c-4526-80ae-b03e900f9a9f_1586x1242.png 1272w, https://substackcdn.com/image/fetch/$s_!FQed!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F24cd89c2-df6c-4526-80ae-b03e900f9a9f_1586x1242.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!FQed!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F24cd89c2-df6c-4526-80ae-b03e900f9a9f_1586x1242.png" width="1456" height="1140" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/24cd89c2-df6c-4526-80ae-b03e900f9a9f_1586x1242.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1140,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:316803,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://substack.the-experimentalist.com/i/170936087?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F24cd89c2-df6c-4526-80ae-b03e900f9a9f_1586x1242.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!FQed!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F24cd89c2-df6c-4526-80ae-b03e900f9a9f_1586x1242.png 424w, https://substackcdn.com/image/fetch/$s_!FQed!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F24cd89c2-df6c-4526-80ae-b03e900f9a9f_1586x1242.png 848w, https://substackcdn.com/image/fetch/$s_!FQed!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F24cd89c2-df6c-4526-80ae-b03e900f9a9f_1586x1242.png 1272w, https://substackcdn.com/image/fetch/$s_!FQed!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F24cd89c2-df6c-4526-80ae-b03e900f9a9f_1586x1242.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" 
width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>As a final step, we&#8217;ll confirm that the database service processes exist:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!GM6-!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe91ef65b-72da-4924-84df-ca412a4fc199_2048x682.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!GM6-!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe91ef65b-72da-4924-84df-ca412a4fc199_2048x682.png 424w, 
https://substackcdn.com/image/fetch/$s_!GM6-!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe91ef65b-72da-4924-84df-ca412a4fc199_2048x682.png 848w, https://substackcdn.com/image/fetch/$s_!GM6-!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe91ef65b-72da-4924-84df-ca412a4fc199_2048x682.png 1272w, https://substackcdn.com/image/fetch/$s_!GM6-!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe91ef65b-72da-4924-84df-ca412a4fc199_2048x682.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!GM6-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe91ef65b-72da-4924-84df-ca412a4fc199_2048x682.png" width="1456" height="485" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e91ef65b-72da-4924-84df-ca412a4fc199_2048x682.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:485,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:272514,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://substack.the-experimentalist.com/i/170936087?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe91ef65b-72da-4924-84df-ca412a4fc199_2048x682.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!GM6-!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe91ef65b-72da-4924-84df-ca412a4fc199_2048x682.png 424w, https://substackcdn.com/image/fetch/$s_!GM6-!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe91ef65b-72da-4924-84df-ca412a4fc199_2048x682.png 848w, https://substackcdn.com/image/fetch/$s_!GM6-!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe91ef65b-72da-4924-84df-ca412a4fc199_2048x682.png 1272w, https://substackcdn.com/image/fetch/$s_!GM6-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe91ef65b-72da-4924-84df-ca412a4fc199_2048x682.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" 
stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>In the next article we&#8217;ll start exploring the service in more detail.</p><p><a href="https://substack.the-experimentalist.com/p/postgresql-17-on-vagrant-and-virtualbox">The Experimentalist : PostgreSQL 17 on Vagrant and VirtualBox</a> &#169; 2025 by <a href="https://www.linkedin.com/in/reidmpinchback/">Reid M. 
Pinchback</a> is licensed under <a href="https://creativecommons.org/licenses/by-sa/4.0/">CC BY-SA 4.0</a></p>]]></content:encoded></item><item><title><![CDATA[Phase 1: Credentials Management and Bootstrapping]]></title><description><![CDATA[Bootstrapping a little security into the new experimental platform]]></description><link>https://substack.the-experimentalist.com/p/phase-1-credentials-management-and-bootstrapping</link><guid isPermaLink="false">https://substack.the-experimentalist.com/p/phase-1-credentials-management-and-bootstrapping</guid><dc:creator><![CDATA[Reid M. Pinchback]]></dc:creator><pubDate>Wed, 13 Aug 2025 18:16:26 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!epao!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F80c2175c-fe31-4cbf-9fb7-0ebe4e11f6e8_781x641.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!epao!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F80c2175c-fe31-4cbf-9fb7-0ebe4e11f6e8_781x641.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!epao!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F80c2175c-fe31-4cbf-9fb7-0ebe4e11f6e8_781x641.png 424w, https://substackcdn.com/image/fetch/$s_!epao!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F80c2175c-fe31-4cbf-9fb7-0ebe4e11f6e8_781x641.png 848w, 
https://substackcdn.com/image/fetch/$s_!epao!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F80c2175c-fe31-4cbf-9fb7-0ebe4e11f6e8_781x641.png 1272w, https://substackcdn.com/image/fetch/$s_!epao!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F80c2175c-fe31-4cbf-9fb7-0ebe4e11f6e8_781x641.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!epao!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F80c2175c-fe31-4cbf-9fb7-0ebe4e11f6e8_781x641.png" width="781" height="641" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/80c2175c-fe31-4cbf-9fb7-0ebe4e11f6e8_781x641.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:641,&quot;width&quot;:781,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:152750,&quot;alt&quot;:&quot;Block diagram of the Experimentalist Laboratory&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://substack.the-experimentalist.com/i/170849842?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F80c2175c-fe31-4cbf-9fb7-0ebe4e11f6e8_781x641.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Block diagram of the Experimentalist Laboratory" title="Block diagram of the Experimentalist Laboratory" srcset="https://substackcdn.com/image/fetch/$s_!epao!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F80c2175c-fe31-4cbf-9fb7-0ebe4e11f6e8_781x641.png 424w, 
https://substackcdn.com/image/fetch/$s_!epao!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F80c2175c-fe31-4cbf-9fb7-0ebe4e11f6e8_781x641.png 848w, https://substackcdn.com/image/fetch/$s_!epao!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F80c2175c-fe31-4cbf-9fb7-0ebe4e11f6e8_781x641.png 1272w, https://substackcdn.com/image/fetch/$s_!epao!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F80c2175c-fe31-4cbf-9fb7-0ebe4e11f6e8_781x641.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" 
y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>When you first start setting up the compute nodes you may have no meaningful infrastructure in place. Bootstrapping from nothing to a set of nodes in a consistent state takes a sequence of steps. You won&#8217;t be able to use PXE to do network-based installations. You also won&#8217;t have anything like an LDAP server to integrate via PAM to handle server authentication and authorization. Down the road maybe, but on day one, no.</p><div><hr></div><p><strong>Prev:</strong> <strong><a href="https://substack.the-experimentalist.com/p/phase-1-just-a-bunch-of-storage">Phase 1: Just a Bunch of Storage</a></strong></p><div><hr></div><p>When you begin the process you&#8217;ll have placed a bootable ISO image on a USB drive for the operating system you&#8217;ve chosen. I would recommend preparing a second USB drive with any firmware updates your particular compute node hardware model may be due for. Update the firmware first, and then install the O/S via the ISO image. If you want to be extra careful, make sure you have a copy of the existing firmware version before performing the update, in case you have to revert to your starting point.</p><p>Make life easier on yourself and put some visible external labels on those USB sticks. If you have multiple compute nodes to set up, you&#8217;ll be following the same process over and over. Start taking notes as you go through the setup the first time.</p><p>As the installer runs you&#8217;ll be asked a number of questions. Some will relate to disk partitioning, for which <a href="https://substack.the-experimentalist.com/p/phase-1-just-a-bunch-of-storage">the previous article</a> will give you guidance. A few will relate to specifying a non-root user, to complement the root user which is always installed. 
Once the installation completes, you have the starting point from which you need to bootstrap a little security.</p><p>The initial security work won&#8217;t be all you&#8217;ll ever want to do, not by a long shot, but you need a baseline that will support routine remote access and running Infrastructure as Code (IaC) jobs that will progressively establish the full functionality of your platform. The first pass is only what you can&#8217;t or shouldn&#8217;t avoid. It is not enough to consider the environment secure. You&#8217;re establishing connectivity, and as part of that connectivity you will need credentials to support authentication.</p><h2>Managing Credentials</h2><p>Credentials like SSH keys, TLS certificates, and GPG keys become your first intellectual property assets that you don&#8217;t want to lose. The point of them is that, without them, you should be unable to perform operations or connect to systems that use those credentials to control access or certify identity. The more secure your platform, the harder it would be to recover functionality if those credentials were lost.</p><p>Preserving credentials is one area of functionality for which you should consider using a remote service vendor. Any passwords, keys, and certificates should be immediately preserved in the remote service. Assuming you are doing your cluster work as a solo experimenter, you won&#8217;t need to care about establishing any kind of group permission structure, which will make things easier.</p><p>Credential automation is much tougher, and for &#8220;Phase 1&#8221; you probably need to be realistic about how much of that you can pull off. As an example, I only know of one remote credential manager that has substantial Linux Pluggable Authentication Modules (PAM) support, <a href="https://www.keepersecurity.com/">Keeper Security</a>, and largely that is to use their Privileged Access Management product. 
Yes, I know, two entirely different uses of the acronym PAM, as if computer security wasn&#8217;t a strange enough world. I haven&#8217;t worked with Keeper myself, but from what I can tell the functionality somewhat overlaps with LDAP.</p><h2>The Bootstrapping</h2><p>My initial &#8220;good enough for now&#8221; bootstrapping is to use Bash scripts that I run on each compute node after the O/S install finishes. The complete process looks like:</p><ol><li><p>Attach a monitor and keyboard to the uninitialized box.</p></li><li><p>Insert the firmware USB, boot the box, then install any firmware.</p></li><li><p>Insert the ISO image USB, boot the box again, then install the O/S (which will request the password for the non-root user it creates).</p></li><li><p>On whatever host you use for other work, such as your laptop, create an SSH key pair that will be used for authenticating as that user. Since &#8220;you&#8221; are &#8220;you&#8221; all the time, you use the same key pair for all the compute nodes. NOTE AGAIN, this is not a final security posture for the cluster; we&#8217;re just bootstrapping.</p></li><li><p>Save both halves of the SSH key to the remote vendor service you selected for preserving credentials, so you don&#8217;t lose them.</p></li><li><p>Configure your network router to recognize the MAC address for the new box and assign a static IP. You aren&#8217;t running a big company, so I wouldn&#8217;t sweat DHCP for the experimentation platform until you find you need it, particularly since <strong>ssh</strong> would complain on every change to an IP address.</p></li><li><p>Using your working (e.g. laptop) host, connect remotely to the compute node via that static IP, supplying the user name and password. These are the non-root credentials you specified when the O/S was installing.</p></li><li><p>Copy-paste the bootstrapping Bash script that configures the user account.</p></li><li><p>Run the script. 
You should have designed it to ask you for the public half of the SSH key pair you created, which it will install in <strong>~/.ssh</strong>, updating <strong>~/.ssh/authorized_keys</strong> to know about the new key.</p></li><li><p>If you have root-level changes to make (e.g. to configure <strong>sshd</strong> and <strong>sudo</strong>), copy-paste the Bash script for that.</p></li><li><p>Run the second script via <strong>sudo</strong>.</p></li><li><p>Don&#8217;t <strong>logout</strong> yet, but start a second terminal session and verify that you can log in. If not, you have a mistake somewhere and you want to get that fixed before you lose viable access (else you may have to install the node from scratch).</p></li><li><p>As a final step I would recommend completely power-cycling the node. If you&#8217;re going to have any problems, you may as well find out now.</p></li></ol><p>There are assorted refinements you can make to the instructions above, like:</p><ul><li><p>Remove password access to an account once the public SSH key is in place.</p></li><li><p>Have separate accounts for human-user access versus automation-user access (a need that will become clearer in a future article).</p></li><li><p>Ensure the <strong>~/.ssh</strong> directory and file permissions meet the requirements that <strong>sshd</strong> will enforce.</p></li><li><p>Inform the compute node of its intended host name, as it probably won&#8217;t pick that up from the router.</p></li><li><p>Make the script behaviors idempotent so that re-running them isn&#8217;t harmful. Until you&#8217;ve gone through the process a couple of times you&#8217;ll keep adding tweaks as you figure out exactly what bootstrap steps you need.</p></li></ul><p>The options are only constrained by your Bash skills. 
Later on we&#8217;ll be doing proper IaC automation, so the primary purpose of any bootstrapping is to get us just to that stage.</p><h2>Script Samples</h2><p>My own bootstrapping is split into files for particular features, which currently are all for Ubuntu 22.04. Here is an example for ensuring <strong>git</strong> is installed:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!sBYT!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fffd51a65-a52f-4ce1-abb6-0dc9003856b6_1214x922.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!sBYT!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fffd51a65-a52f-4ce1-abb6-0dc9003856b6_1214x922.png 424w, https://substackcdn.com/image/fetch/$s_!sBYT!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fffd51a65-a52f-4ce1-abb6-0dc9003856b6_1214x922.png 848w, https://substackcdn.com/image/fetch/$s_!sBYT!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fffd51a65-a52f-4ce1-abb6-0dc9003856b6_1214x922.png 1272w, https://substackcdn.com/image/fetch/$s_!sBYT!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fffd51a65-a52f-4ce1-abb6-0dc9003856b6_1214x922.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!sBYT!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fffd51a65-a52f-4ce1-abb6-0dc9003856b6_1214x922.png" width="1214" height="922" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ffd51a65-a52f-4ce1-abb6-0dc9003856b6_1214x922.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:922,&quot;width&quot;:1214,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:135727,&quot;alt&quot;:&quot;script for git install function&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://substack.the-experimentalist.com/i/170849842?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fffd51a65-a52f-4ce1-abb6-0dc9003856b6_1214x922.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="script for git install function" title="script for git install function" srcset="https://substackcdn.com/image/fetch/$s_!sBYT!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fffd51a65-a52f-4ce1-abb6-0dc9003856b6_1214x922.png 424w, https://substackcdn.com/image/fetch/$s_!sBYT!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fffd51a65-a52f-4ce1-abb6-0dc9003856b6_1214x922.png 848w, https://substackcdn.com/image/fetch/$s_!sBYT!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fffd51a65-a52f-4ce1-abb6-0dc9003856b6_1214x922.png 1272w, https://substackcdn.com/image/fetch/$s_!sBYT!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fffd51a65-a52f-4ce1-abb6-0dc9003856b6_1214x922.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button 
tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>and to install pip for Python 3 (which isn&#8217;t automatically installed on Ubuntu 22.04):</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!d07E!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F133ed59b-a3ea-44ce-a92d-67f66ee7c907_1434x922.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" 
srcset="https://substackcdn.com/image/fetch/$s_!d07E!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F133ed59b-a3ea-44ce-a92d-67f66ee7c907_1434x922.png 424w, https://substackcdn.com/image/fetch/$s_!d07E!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F133ed59b-a3ea-44ce-a92d-67f66ee7c907_1434x922.png 848w, https://substackcdn.com/image/fetch/$s_!d07E!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F133ed59b-a3ea-44ce-a92d-67f66ee7c907_1434x922.png 1272w, https://substackcdn.com/image/fetch/$s_!d07E!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F133ed59b-a3ea-44ce-a92d-67f66ee7c907_1434x922.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!d07E!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F133ed59b-a3ea-44ce-a92d-67f66ee7c907_1434x922.png" width="1434" height="922" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/133ed59b-a3ea-44ce-a92d-67f66ee7c907_1434x922.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:922,&quot;width&quot;:1434,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:146910,&quot;alt&quot;:&quot;script for pip3 install 
function&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://substack.the-experimentalist.com/i/170849842?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F133ed59b-a3ea-44ce-a92d-67f66ee7c907_1434x922.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="script for pip3 install function" title="script for pip3 install function" srcset="https://substackcdn.com/image/fetch/$s_!d07E!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F133ed59b-a3ea-44ce-a92d-67f66ee7c907_1434x922.png 424w, https://substackcdn.com/image/fetch/$s_!d07E!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F133ed59b-a3ea-44ce-a92d-67f66ee7c907_1434x922.png 848w, https://substackcdn.com/image/fetch/$s_!d07E!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F133ed59b-a3ea-44ce-a92d-67f66ee7c907_1434x922.png 1272w, https://substackcdn.com/image/fetch/$s_!d07E!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F133ed59b-a3ea-44ce-a92d-67f66ee7c907_1434x922.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 
2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>and to upgrade the Ansible installation (the default for Ubuntu 22.04 is ancient):</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!sc9S!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fabae3632-17a3-4c61-8ce0-2e546f3feb5d_2048x2484.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!sc9S!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fabae3632-17a3-4c61-8ce0-2e546f3feb5d_2048x2484.png 424w, https://substackcdn.com/image/fetch/$s_!sc9S!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fabae3632-17a3-4c61-8ce0-2e546f3feb5d_2048x2484.png 848w, 
https://substackcdn.com/image/fetch/$s_!sc9S!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fabae3632-17a3-4c61-8ce0-2e546f3feb5d_2048x2484.png 1272w, https://substackcdn.com/image/fetch/$s_!sc9S!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fabae3632-17a3-4c61-8ce0-2e546f3feb5d_2048x2484.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!sc9S!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fabae3632-17a3-4c61-8ce0-2e546f3feb5d_2048x2484.png" width="1456" height="1766" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/abae3632-17a3-4c61-8ce0-2e546f3feb5d_2048x2484.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1766,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:579888,&quot;alt&quot;:&quot;script for ansible update function&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://substack.the-experimentalist.com/i/170849842?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fabae3632-17a3-4c61-8ce0-2e546f3feb5d_2048x2484.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="script for ansible update function" title="script for ansible update function" srcset="https://substackcdn.com/image/fetch/$s_!sc9S!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fabae3632-17a3-4c61-8ce0-2e546f3feb5d_2048x2484.png 424w, 
https://substackcdn.com/image/fetch/$s_!sc9S!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fabae3632-17a3-4c61-8ce0-2e546f3feb5d_2048x2484.png 848w, https://substackcdn.com/image/fetch/$s_!sc9S!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fabae3632-17a3-4c61-8ce0-2e546f3feb5d_2048x2484.png 1272w, https://substackcdn.com/image/fetch/$s_!sc9S!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fabae3632-17a3-4c61-8ce0-2e546f3feb5d_2048x2484.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" 
y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>I like putting each feature function in a separate script, because then within a user or root bootstrapping script all I need is:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!OEaT!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd35c18bc-37a4-4be0-ab57-345835b0b68b_1316x602.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!OEaT!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd35c18bc-37a4-4be0-ab57-345835b0b68b_1316x602.png 424w, https://substackcdn.com/image/fetch/$s_!OEaT!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd35c18bc-37a4-4be0-ab57-345835b0b68b_1316x602.png 848w, https://substackcdn.com/image/fetch/$s_!OEaT!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd35c18bc-37a4-4be0-ab57-345835b0b68b_1316x602.png 1272w, https://substackcdn.com/image/fetch/$s_!OEaT!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd35c18bc-37a4-4be0-ab57-345835b0b68b_1316x602.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!OEaT!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd35c18bc-37a4-4be0-ab57-345835b0b68b_1316x602.png" width="1316" height="602" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d35c18bc-37a4-4be0-ab57-345835b0b68b_1316x602.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:602,&quot;width&quot;:1316,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:91057,&quot;alt&quot;:&quot;script for bootstrap that loads functions&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://substack.the-experimentalist.com/i/170849842?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd35c18bc-37a4-4be0-ab57-345835b0b68b_1316x602.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="script for bootstrap that loads functions" title="script for bootstrap that loads functions" srcset="https://substackcdn.com/image/fetch/$s_!OEaT!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd35c18bc-37a4-4be0-ab57-345835b0b68b_1316x602.png 424w, https://substackcdn.com/image/fetch/$s_!OEaT!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd35c18bc-37a4-4be0-ab57-345835b0b68b_1316x602.png 848w, https://substackcdn.com/image/fetch/$s_!OEaT!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd35c18bc-37a4-4be0-ab57-345835b0b68b_1316x602.png 1272w, https://substackcdn.com/image/fetch/$s_!OEaT!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd35c18bc-37a4-4be0-ab57-345835b0b68b_1316x602.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex 
pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>and then I can invoke the functions that make sense for either a non-root or a root user. The function files are just functions so they don&#8217;t take action until they are explicitly called.</p>
<p><a href="https://substack.the-experimentalist.com/p/phase-1-credentials-management-and-bootstrapping">The Experimentalist : Phase 1: Credentials Management and Bootstrapping</a> &#169; 2025 by <a href="https://www.linkedin.com/in/reidmpinchback/">Reid M. Pinchback</a> is licensed under <a href="https://creativecommons.org/licenses/by-sa/4.0/">CC BY-SA 4.0</a></p>]]></content:encoded></item><item><title><![CDATA[Phase 1: Just a Bunch of Storage]]></title><description><![CDATA[Experimentation laboratories need operational and data store capacity]]></description><link>https://substack.the-experimentalist.com/p/phase-1-just-a-bunch-of-storage</link><guid isPermaLink="false">https://substack.the-experimentalist.com/p/phase-1-just-a-bunch-of-storage</guid><dc:creator><![CDATA[Reid M. 
Pinchback]]></dc:creator><pubDate>Tue, 12 Aug 2025 22:54:38 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!Sc7G!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fde50c832-5a42-4bfa-a32f-90e530309c77_781x641.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Sc7G!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fde50c832-5a42-4bfa-a32f-90e530309c77_781x641.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Sc7G!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fde50c832-5a42-4bfa-a32f-90e530309c77_781x641.png 424w, https://substackcdn.com/image/fetch/$s_!Sc7G!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fde50c832-5a42-4bfa-a32f-90e530309c77_781x641.png 848w, https://substackcdn.com/image/fetch/$s_!Sc7G!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fde50c832-5a42-4bfa-a32f-90e530309c77_781x641.png 1272w, https://substackcdn.com/image/fetch/$s_!Sc7G!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fde50c832-5a42-4bfa-a32f-90e530309c77_781x641.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Sc7G!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fde50c832-5a42-4bfa-a32f-90e530309c77_781x641.png" width="781" height="641" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/de50c832-5a42-4bfa-a32f-90e530309c77_781x641.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:641,&quot;width&quot;:781,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:132309,&quot;alt&quot;:&quot;Block diagram of the Experimentalist Laboratory&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://substack.the-experimentalist.com/i/170808266?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fde50c832-5a42-4bfa-a32f-90e530309c77_781x641.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Block diagram of the Experimentalist Laboratory" title="Block diagram of the Experimentalist Laboratory" srcset="https://substackcdn.com/image/fetch/$s_!Sc7G!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fde50c832-5a42-4bfa-a32f-90e530309c77_781x641.png 424w, https://substackcdn.com/image/fetch/$s_!Sc7G!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fde50c832-5a42-4bfa-a32f-90e530309c77_781x641.png 848w, https://substackcdn.com/image/fetch/$s_!Sc7G!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fde50c832-5a42-4bfa-a32f-90e530309c77_781x641.png 1272w, https://substackcdn.com/image/fetch/$s_!Sc7G!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fde50c832-5a42-4bfa-a32f-90e530309c77_781x641.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft 
pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Planning for storage in your newly-constructed lab has several elements to it:</p><ul><li><p>Hardware considerations.</p></li><li><p>Filesystem considerations.</p></li><li><p>Use-case considerations.</p></li></ul><div><hr></div><p><strong>Prev:</strong> <strong><a href="https://substack.the-experimentalist.com/p/phase-1-just-a-bunch-of-compute">Phase 1: Just a Bunch of Compute</a> | Next: <a href="https://substack.the-experimentalist.com/p/phase-1-credentials-management-and-bootstrapping">Phase 1: Credentials Management and Bootstrapping</a></strong></p><div><hr></div><p>First a little terminology:</p><ul><li><p><strong>Drive:</strong> a physical device you buy and mount in 
or attach to a computing node.</p></li><li><p><strong>Partition:</strong> a subdivision of the space available on the drive. There are two partitioning approaches, with GUID Partition Table (GPT) being what you&#8217;ll usually see, but Master Boot Record (MBR) partitions may exist on older systems. When picking compute nodes to buy, I would suggest not getting anything so old it doesn&#8217;t support UEFI, and any UEFI system should allow GPT. GPT gives you more flexibility on partition creation and supports larger drives.</p></li><li><p><strong>Volume:</strong> a logical construct that presents available storage space to the operating system. Historically a volume usually fit within a partition, but Logical Volume Management (LVM) can present a volume that spans partitions and drives. It does this in terms of &#8220;extents,&#8221; which it uses to associate physical to logical storage.</p></li></ul><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!q5Yp!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F99cf2b83-9794-4f01-8816-76b3977f92af_811x581.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!q5Yp!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F99cf2b83-9794-4f01-8816-76b3977f92af_811x581.png 424w, https://substackcdn.com/image/fetch/$s_!q5Yp!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F99cf2b83-9794-4f01-8816-76b3977f92af_811x581.png 848w, 
https://substackcdn.com/image/fetch/$s_!q5Yp!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F99cf2b83-9794-4f01-8816-76b3977f92af_811x581.png 1272w, https://substackcdn.com/image/fetch/$s_!q5Yp!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F99cf2b83-9794-4f01-8816-76b3977f92af_811x581.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!q5Yp!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F99cf2b83-9794-4f01-8816-76b3977f92af_811x581.png" width="811" height="581" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/99cf2b83-9794-4f01-8816-76b3977f92af_811x581.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:581,&quot;width&quot;:811,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:107674,&quot;alt&quot;:&quot;Disk vs Partition vs Volume for storage&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://substack.the-experimentalist.com/i/170808266?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F99cf2b83-9794-4f01-8816-76b3977f92af_811x581.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Disk vs Partition vs Volume for storage" title="Disk vs Partition vs Volume for storage" srcset="https://substackcdn.com/image/fetch/$s_!q5Yp!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F99cf2b83-9794-4f01-8816-76b3977f92af_811x581.png 424w, 
https://substackcdn.com/image/fetch/$s_!q5Yp!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F99cf2b83-9794-4f01-8816-76b3977f92af_811x581.png 848w, https://substackcdn.com/image/fetch/$s_!q5Yp!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F99cf2b83-9794-4f01-8816-76b3977f92af_811x581.png 1272w, https://substackcdn.com/image/fetch/$s_!q5Yp!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F99cf2b83-9794-4f01-8816-76b3977f92af_811x581.png 1456w" sizes="100vw"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" 
y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>There is some hand-waving simplification in those explanations: partitions can be primary, extended, or logical, but those distinctions apply only to MBR.</p><h2>Local Physical Storage Use Cases</h2><p>Each compute node will have storage needs to account for. For simplicity I&#8217;ll assume a Linux variant is your goal.</p><ul><li><p><strong>EFI System Partition (ESP):</strong> This provides space for boot loaders and kernel images. I plan on 1/4 GB (256 MB) for this, but if you know you intend to experiment a lot with different kernels, you could make it bigger to house more kernel images.</p></li><li><p><strong>Swap Partition:</strong> You may read online content or hear discussions that advocate against swap due to performance concerns. As a database guy, I have reasons for disagreeing with that position. You don&#8217;t have to allocate a physical partition, as you can add swap files later. I prefer to allocate the partition and size it to match the amount of RAM installed on the compute node. You could use a smaller amount, but the less you allocate the more you need to monitor swap activity, which is a discussion for future articles. Obviously, if you have more than 64 GB of RAM, you won&#8217;t want to size swap to cover the entire memory footprint unless you have a strong reason to support a virtual memory space that large.</p></li><li><p><strong>Root Partition:</strong> This will contain the volume that the boot loader establishes as your running root filesystem. When you size this, allow for the fact that package updates, run-time log files, applications, and the runtime needs of those applications will all require storage.</p></li><li><p><strong>Other Linux Partitions:</strong> While not required, you may decide that for system stability you want some of the application, app data, and logging activity to go to a separate physical partition. 
That would ensure the O/S itself doesn&#8217;t fall over due to the root partition filling up. The likely mount points for a volume would be <strong>/var</strong> or <strong>/opt</strong>. If you want both in a common partition, LVM can achieve that.</p></li><li><p><strong>Maintenance Partition:</strong> Depending on the BIOS features, I&#8217;ve found it handy to add a small additional partition to write to. Sometimes when working in the BIOS you can record configuration snapshot images to remember what you changed. This relates to the <a href="https://substack.the-experimentalist.com/p/phase-1-just-a-bunch-of-compute">&#8220;Make notes and take pictures&#8221; advice from the previous article</a>. When you set up many nodes, sometimes one or two will not behave consistently with the others, and knowing your BIOS setup without having to boot into it can be useful. Give it perhaps 1-2 GB if your BIOS saves config snapshots as actual image-formatted (e.g. BMP, PNG) files. You may find that this needs to be formatted as FAT32, which will be reported by &#8220;<strong>fdisk -l</strong>&#8221; on a UEFI/GPT setup as &#8220;Microsoft basic data.&#8221;</p></li><li><p><strong>Data Partitions:</strong> If you know you&#8217;ll be working with large volumes of data, it is worth planning data partitions. By having separate data partitions you protect the O/S from data filling up the root filesystem. It&#8217;s also the usage pattern that may later motivate you to add drives or replace existing ones with larger ones. It&#8217;s less disruptive to do that with clearly-segregated data than it usually is for a large filesystem muddling everything together.</p></li></ul><p>An obvious question is whether all these partitions should be on the same drive or across multiple drives. 
This mostly relates to the root filesystem versus everything else, although for database purists there are other scenarios.</p><p>The most direct reason for multiple drives is protecting the ability to boot from the root filesystem. There can be performance benefits to multiple drives, although it is easy to overstate the potential for non-RAID performance parallelism unless you are spec&#8217;ing out compute nodes with 2 or more CPUs, 2 or more RAID cards, and a NUMA memory configuration to channel memory transfers separately. Write traffic across multiple SSDs may hold up better as a performance story without that level of hardware due to how IOPS could get batch flushed to on-SSD SRAM cache.</p><p>In the case of some database architectures there are fault-tolerance scenarios around directing the Write-Ahead Log (WAL) to separate disks. Again, this story holds up better when you have multiple RAID cards, preferably cards with battery backup for ensuring completed flushes on a power outage. The goal here would be to approximate the theoretical concept of &#8220;stable storage,&#8221; which forms the backbone for how we reason about database transactions and database recovery.</p><h2>Remote Storage Use Cases</h2><p>It is important to keep in mind that you&#8217;re creating an experimental lab platform. You should be able to set up and tear down compute nodes at will, or alter some aspect of their current application provisioning and configuration.</p><p>This will push you towards having externalized storage so that any data you care about will live across those major system alterations.</p><p>You are also going to run into cases where you want the same data available across multiple compute nodes. Having multiple nodes mount the same remote volume is useful for allowing tasks to run anywhere and be matched up to the data required.</p><p>The straightforward solution to remote storage is Network Attached Storage (NAS). 
I want to dig into NAS solutions more in later articles, but for now the outline is:</p><ul><li><p>You can either build your own, or buy a NAS. If you&#8217;re thinking about lab creation like a small startup business, you probably want to just buy something so you can focus your time elsewhere. Pick a reliable, established vendor and product line.</p></li></ul><ul><li><p>You want your NAS to support the protocols you intend to use for file sharing. The two most likely are Server Message Block (SMB) and Network File System (NFS). There are others, but those are what you&#8217;ll typically use. Use the most recent stable version of a protocol available.</p></li><li><p>The bulk of the storage should be in a RAID configuration that helps preserve your data. I prefer RAID 6 plus a hot spare, because I&#8217;m willing to trade the cost of some storage for increased protection against disk failure. Investigate RAID options and pick the configuration that suits your situation.</p></li></ul><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!pIJM!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c4528c6-0c7f-4972-be04-d15060e8b6cd_781x521.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!pIJM!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c4528c6-0c7f-4972-be04-d15060e8b6cd_781x521.png 424w, https://substackcdn.com/image/fetch/$s_!pIJM!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c4528c6-0c7f-4972-be04-d15060e8b6cd_781x521.png 848w, 
https://substackcdn.com/image/fetch/$s_!pIJM!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c4528c6-0c7f-4972-be04-d15060e8b6cd_781x521.png 1272w, https://substackcdn.com/image/fetch/$s_!pIJM!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c4528c6-0c7f-4972-be04-d15060e8b6cd_781x521.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!pIJM!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c4528c6-0c7f-4972-be04-d15060e8b6cd_781x521.png" width="781" height="521" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7c4528c6-0c7f-4972-be04-d15060e8b6cd_781x521.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:521,&quot;width&quot;:781,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:124939,&quot;alt&quot;:&quot;Local vs Remote storage&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://substack.the-experimentalist.com/i/170808266?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c4528c6-0c7f-4972-be04-d15060e8b6cd_781x521.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Local vs Remote storage" title="Local vs Remote storage" srcset="https://substackcdn.com/image/fetch/$s_!pIJM!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c4528c6-0c7f-4972-be04-d15060e8b6cd_781x521.png 424w, 
https://substackcdn.com/image/fetch/$s_!pIJM!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c4528c6-0c7f-4972-be04-d15060e8b6cd_781x521.png 848w, https://substackcdn.com/image/fetch/$s_!pIJM!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c4528c6-0c7f-4972-be04-d15060e8b6cd_781x521.png 1272w, https://substackcdn.com/image/fetch/$s_!pIJM!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c4528c6-0c7f-4972-be04-d15060e8b6cd_781x521.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line 
x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>There are distributed filesystem technologies like Ceph, but as the discussion here is about &#8220;phase 1&#8221; planning for the lab platform, I don&#8217;t think that is worth getting into. That family of options can be pretty heavyweight in both learning curve and maintenance responsibility. You may only want to head down that path if one of your goals for your cluster is hands-on experience with distributed filesystems.</p><h2>Economic Considerations</h2><p>You have three possible optimization outcomes:</p><ul><li><p>Low cost.</p></li><li><p>High reliability.</p></li><li><p>High performance.</p></li></ul><p>You can pick at most 2 out of the 3 if you push to the extremes, but sometimes you can balance a little in the middle. When it comes to storage, no matter the use case, I see absolutely no win in giving up on high reliability. You&#8217;re trying to accomplish things, and having storage fail underneath you does not accomplish things.</p><p>That leaves you to decide when to aim for low cost, versus when to emphasize high performance.</p><p>I lean towards the following:</p><ul><li><p>The NAS box itself should be capable of high performance. You already have the handicap of data traveling over the network. Just get a good box in the first place.</p></li><li><p>The drives in the NAS are where you have more flexibility. If you want a lot of storage, you really can&#8217;t beat the price point of most good-quality HDDs. If you prefer fast storage then you&#8217;ll want to get good-quality SSDs. Your bill and final storage capacity will differ between those choices; just don&#8217;t buy junk. This does not mean that you must buy drives marketed as &#8220;enterprise grade.&#8221; The price points on most of those are stupid. If you have a decent RAID setup, the entire point was to allow RAID to be your source of resilience. 
While I&#8217;m personally not a user of hybrid drives, hybrids could be a reasonable option for a NAS.</p></li><li><p>For local storage your choice is between HDD and SSD. I use one of each in my compute nodes. The HDD is used for the basic O/S and for the activity that experiences the most churn like swap and logging traffic. The SSD is used for application data. This setup protects the operating system itself from the risk of lockup due to NAND burn on the SSD, or SSDs failing due to bad firmware (several SSD vendors have had issues). My SSD selection criteria are definitely performance and reliability. HDD purchases have emphasized quality, and sometimes you can luck out with finding unused batches of older product lines where you hit a nice sweet spot that balances good quality, very decent (but not best-of-breed) performance, and fair price.</p></li><li><p>We won&#8217;t get into the meat of disaster recovery during phase 1 &#8212; there isn&#8217;t any intellectual property to protect yet &#8212; but for initial economic planning it is worth considering if you want a spare drive of each kind you use. Both my NAS and the compute nodes use the same model of SSD, so I keep one idle SSD in the NAS as a hot spare for a RAID rebuild, but in a pinch I could swap that into a compute node for a failed data drive. I also have a couple of spares of the HDD models used in the compute nodes in case a root volume fails. You&#8217;re trying to run a business, and downtime means impaired operations. 
When you are bootstrapping your lab as a side project, parts delays interleaved with life scheduling may interrupt your plans for weeks.</p></li></ul><p><a href="https://substack.the-experimentalist.com/p/phase-1-just-a-bunch-of-storage">The Experimentalist : Phase 1: Just a Bunch of Storage</a> &#169; 2025 by <a href="https://www.linkedin.com/in/reidmpinchback/">Reid M. Pinchback</a> is licensed under <a href="https://creativecommons.org/licenses/by-sa/4.0/">CC BY-SA 4.0</a></p>]]></content:encoded></item><item><title><![CDATA[Phase 1: Just a Bunch of Compute]]></title><description><![CDATA[Plan the hardware for your SWE experimentation laboratory]]></description><link>https://substack.the-experimentalist.com/p/phase-1-just-a-bunch-of-compute</link><guid isPermaLink="false">https://substack.the-experimentalist.com/p/phase-1-just-a-bunch-of-compute</guid><dc:creator><![CDATA[Reid M. 
Pinchback]]></dc:creator><pubDate>Tue, 12 Aug 2025 03:47:52 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!dp8L!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F06b3b0da-c942-4f45-a1ae-78f0027eac1c_781x621.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!dp8L!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F06b3b0da-c942-4f45-a1ae-78f0027eac1c_781x621.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!dp8L!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F06b3b0da-c942-4f45-a1ae-78f0027eac1c_781x621.png 424w, https://substackcdn.com/image/fetch/$s_!dp8L!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F06b3b0da-c942-4f45-a1ae-78f0027eac1c_781x621.png 848w, https://substackcdn.com/image/fetch/$s_!dp8L!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F06b3b0da-c942-4f45-a1ae-78f0027eac1c_781x621.png 1272w, https://substackcdn.com/image/fetch/$s_!dp8L!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F06b3b0da-c942-4f45-a1ae-78f0027eac1c_781x621.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!dp8L!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F06b3b0da-c942-4f45-a1ae-78f0027eac1c_781x621.png" width="781" height="621" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/06b3b0da-c942-4f45-a1ae-78f0027eac1c_781x621.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:621,&quot;width&quot;:781,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:85957,&quot;alt&quot;:&quot;Block diagram of the Experimentalist Laboratory&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://substack.the-experimentalist.com/i/170729988?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F06b3b0da-c942-4f45-a1ae-78f0027eac1c_781x621.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Block diagram of the Experimentalist Laboratory" title="Block diagram of the Experimentalist Laboratory" srcset="https://substackcdn.com/image/fetch/$s_!dp8L!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F06b3b0da-c942-4f45-a1ae-78f0027eac1c_781x621.png 424w, https://substackcdn.com/image/fetch/$s_!dp8L!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F06b3b0da-c942-4f45-a1ae-78f0027eac1c_781x621.png 848w, https://substackcdn.com/image/fetch/$s_!dp8L!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F06b3b0da-c942-4f45-a1ae-78f0027eac1c_781x621.png 1272w, https://substackcdn.com/image/fetch/$s_!dp8L!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F06b3b0da-c942-4f45-a1ae-78f0027eac1c_781x621.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft 
pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Time to get down to business. Phase 1 of the plan is the rudimentary layer to building your own home-brew &#8220;mad scientist&#8221; lab for software engineering experimentation. 
JBoC &#8212; Just a Bunch of Compute &#8212; is about deciding the physical portion of the platform.</p><div><hr></div><p><strong>Prev:</strong> <strong><a href="https://substack.the-experimentalist.com/p/building-a-swe-experimentation-platform">Building a SWE Experimentation Platform</a></strong> <strong>| Next: <a href="https://substack.the-experimentalist.com/p/phase-1-just-a-bunch-of-storage">Phase 1: Just a Bunch of Storage</a></strong></p><div><hr></div><p>I&#8217;ll be describing this in terms of what&#8217;s most typical for backend and data engineers, but don&#8217;t view that as limiting your options. If you&#8217;re interested more in the front-end, you might make slightly different choices. If you&#8217;ve been dreaming of building a thousand-node Raspberry Pi supercluster, have at it! Honestly if some board manufacturer would underwrite the work, I&#8217;d do a supercluster in a heartbeat.</p><h2>The Outline</h2><p>Money only goes so far. There is benefit to planning purchases much the same way a cash-strapped company might. The heuristic I&#8217;m using is:</p><ul><li><p>75% of purchases will go to lower-cost hardware.</p></li><li><p>20% will be the mid-tier cost range.</p></li><li><p>5% go to the very infrequent budget-busters you have decided you must say yes to.</p></li></ul><p>The percentages are ballparks in number of units purchased, not in dollars. Those top-end 5% of units may very well end up as 75% of your spend if you&#8217;ve decided you absolutely must have that Nvidia H100 card, for example. More commonly it&#8217;ll be the monster laptop rig that you talk yourself into once every 5 years.</p><p>The goal in experimentation with distributed computing is to have hardware to distribute work over. The per-node computational muscle may not be that important. 
It&#8217;s the distribution of function that you are exercising, not whether each and every node can supply a water-cooled 100 FPS while playing Cyberpunk 2077.</p><h2>Lower-Tier Nodes</h2><p>Used, open-box, and unused-but-discontinued hardware is a great way to build out a cluster. Product lines in the range of 3-6 years old can provide a lot of bang for the buck. We pay an insane premium in order to get the newest thing with a 50% performance bump when operated flat out. Most of the time we aren&#8217;t using server compute anywhere near full load, particularly in a home lab.</p><p>Think of your lab like you&#8217;re setting up a business. Fund the work you are going to do, not the work that somebody, somewhere, in a big wide world might conceive of doing. Someday.</p><p>Having headroom for maximum performance is great&#8230; if you&#8217;re using it. Save that for the high-end 5% of purchases. If it turns out to have been a vanity purchase, at least you&#8217;ll understand where the motivation came from.</p><p>Try to identify a single hardware model that you will stick to. There are two reasons for this:</p><ol><li><p>When you have many different hardware variations, it becomes more challenging to reason about behavior like performance and fault rates. The differences become primary variables instead of your chosen computational activity.</p></li><li><p>Maintenance is harder because you&#8217;ll have different issues arise with every kind of different hardware. Firmware patch issues will differ, BIOS configuration will differ, supported memory chips or SSDs can differ. 
By sticking to a single model you gain leverage on any learning curve: solve a problem once, then copy-paste to all the other nodes of the same model.</p></li></ol><p>Here are the characteristics I look for in lower-cost nodes.</p><ul><li><p>The BIOS and motherboard features must support whatever operating system and O/S features I intend to use, and supply the connectivity I&#8217;ve decided I need for any hardware that will be mounted directly on or cabled to the motherboard. More than anything else, this draws a line for &#8220;too old to use&#8221;.</p></li><li><p>The model must be easily available. There&#8217;s no point in picking something you rarely see available on the major reseller web sites.</p></li><li><p>There should be a decent amount of internet history on experiences with hardware quality for that model. You want hardware that has been reliable, both in initial assembly and in use over time. The model might not house the fastest and hottest-running components as of when it was made, because heat means thermal stress. Find a reliable workhorse, not an older Ferrari.</p></li><li><p>The physical size of the model must suit the location where it is going to live. If you have an unused (and dry!) basement with ample space for racks and 1U &#8220;pizza boxes&#8221; then the width and depth of those nodes might be fine. If you have a half dozen square feet on a couple of home-office shelves, then small form-factor PCs may be what you&#8217;re looking for: I use SFFs myself for exactly that reason.</p></li><li><p>You need to be able to support the power requirements of that model. You don&#8217;t want the circuit breaker tripping all the time because you tried to run too many nodes with 1200w power supplies.</p></li><li><p>The models must run cool enough for the environment where they are going to live. 
Obviously you may be able to improve on the situation with various cooling solutions, but the end result establishes your thermal constraints.</p></li><li><p>CPUs won&#8217;t be the best possible but should be acceptable in terms of their performance, their heat generation, and any history related to reliability. As an example, you may decide to take a pass on Intel Raptor Lake 13900K and 14900K CPUs because of their voltage-related stability problems, but on the other hand I wouldn&#8217;t rush to anything in the i3 product line either because it&#8217;s just too low-powered.</p></li><li><p>RAM capacity matters for almost everything. Unless you&#8217;re doing the Raspberry Pi supercluster, I wouldn&#8217;t bother with anything that doesn&#8217;t have at least 32gb of RAM per node, and preferably at least 64gb. Note that sometimes the manufacturer information will suggest 32gb when 64gb was possible if you had the right RAM sticks, so do your homework online by tracking down posts from modders and happy owners of a particular node model. Be aware that some motherboards with 4 sockets for RAM may not operate with maximum memory performance when all 4 are populated; if speed is your goal you may be limited to a single pair of DIMMs.</p></li><li><p>I personally prefer to buy bare-bone boxes that have any CPU, GPU, Wi-Fi, Bluetooth, and wired network support components mounted&#8230; but not the RAM or SSD. That lets me pick a consistent model for RAM and SSD hardware without feeling like I wasted any money. If you aren&#8217;t super-fussy about performance testing, then I wouldn&#8217;t worry as much about that. Keeping the Wi-Fi / Bluetooth / Network cards consistent though is likely to save you a lot of headaches. There&#8217;s more than enough room for configuration-related ick around those as it is.</p></li></ul><h2>Middle-Tier Nodes</h2><p>The process here is much the same as that for the low-cost tier. 
The difference is you may know you want to perform work that requires physical capabilities your lower-end nodes will lack.</p><p>An obvious example would be if the lower-tier nodes don&#8217;t have any Nvidia GPU cards in them, and you need something that would let you work with CUDA. The node models will be a bit newer and components slightly higher-end, but you don&#8217;t have to break the bank. What you likely care about is which versions of CUDA are supported by a particular GPU card model. <a href="https://en.wikipedia.org/wiki/List_of_Nvidia_graphics_processing_units">The Wikipedia page on Nvidia card features</a> can be a handy resource for looking at API support.</p><p>Avoiding the RTX cards that end in &#8220;80&#8221; or &#8220;90&#8221; can provide you with some affordable options. Obviously lower-end GPUs wouldn&#8217;t cut it for somebody training a 100B-parameter base LLM, but we&#8217;re not trying to spec out hardware for an activity that runs with budgets in the tens of millions of USD. I&#8217;m not going to get into AI details further now; they&#8217;ll be &#8220;Cheapskate AI&#8221; material in future articles.</p><p>Another kind of node warrants consideration as a possible mid-tier expense, and that is a good-quality NAS (Network Attached Storage). That&#8217;ll be discussed more in <a href="https://substack.the-experimentalist.com/p/phase-1-just-a-bunch-of-storage">the next article on JBoS (Just a Bunch of Storage)</a>. </p><h2>Higher-Tier Nodes</h2><p>Not much to explicitly plan here except to note that this can be a disproportionate hunk of your budget. 
Avoid it if you can, but make the money count if you do it.</p><p>My personal inclination is towards these being the periodic upgrade to a really good laptop, since even when you have a compute cluster you&#8217;ll behave like any developer and work on it remotely.</p><p>If you have big plans in the ML and GenAI space, you might instead opt for a desktop box with whatever you can afford in the way of GPU cards and the most VRAM you decide you can fund.</p><h2>Power Planning</h2><p>This is a big area and I&#8217;m not the person to do full justice to it, but as you have more money in hardware you&#8217;ll want to learn about:</p><ul><li><p>Surge protectors</p></li><li><p>Surge arrestors/limiters</p></li><li><p>Power/line conditioners</p></li><li><p>Uninterruptible Power Supplies (UPS) and their common architectural variants: standby, line-interactive, and online</p></li></ul><p>I have seen a rack taken out by a lightning strike, so learning about the relevant power-handling hardware and how to set it up properly is something to have on your radar for some stage of the evolution of your lab.</p><p>Until then, at least consider surge protection. To be extra safe, power down and unplug everything whenever you hear about an incoming electrical storm, or when notified that the utility company is going to be doing work in your area.</p><p>One point of this lab-building exercise is to treat it as though you are running a business, not playing computer games, so safeguard that investment.</p><h2>Other Parts</h2><p>Buying compute nodes will not be the only purchase. 
You&#8217;ll need to allow for:</p><ul><li><p>RAM sticks of the appropriate size and part numbers.</p></li><li><p>Drives (whether HDD, SSD, or hybrid).</p></li><li><p>Network cables with shielding appropriate to the speed and the run length needed to reach a router or switch.</p></li><li><p>A network switch.</p></li><li><p>Some USB sticks for doing initial O/S installations or for firmware updates.</p></li><li><p>Anything you need for interacting with nodes, such as a monitor and keyboard.</p></li><li><p>If the motherboard or node case has vendor-specific parts, like fan housings or heat shield armor, consider picking up a few spares while you can get them.</p></li><li><p>Add any cooling supplies or spare fan parts you want quickly available.</p></li><li><p>Add any tools you want for assembling hardware like Torx screwdrivers, anti-static mats and wrist straps, magnifier and lighting for viewing small parts in tight quarters, etc.</p></li><li><p>If you went the route of rack-mount hardware, then depending on your plan either full-sized or table-top sized racks and all their supporting hardware pieces; alternatively if you need any office furniture like shelving then allow for that.</p></li><li><p>Storage bins for bits and pieces, because you&#8217;ll definitely collect bits and pieces.</p></li></ul><p>Add or remove as appropriate. It&#8217;s your lab for your experiments.</p><h2>The Process</h2><p>I would suggest some habits around how you acquire and integrate hardware.</p><ol><li><p><strong>Learn which vendor sources you find reliable</strong>, and mostly stick to them. You&#8217;ll have enough going wrong as it is. Sometimes your compatibility homework on a RAM part number won&#8217;t turn out as you had hoped. Sometimes the wrong part will get shipped. 
The universe introduces enough chaos, so knowing which vendors are more reliable can save you a lot of grief.</p></li><li><p><strong>Don&#8217;t rush to buy multiple units</strong> of a compute node or expensive part until you&#8217;ve bought and tried the very first one. It&#8217;s not hard to trip over something that will be show-stopping for your plans, and then you have to send that item back and start again on your homework for the alternative. Buy one, put it to some use, and then if your plan was to buy more you can proceed.</p></li><li><p><strong>Don&#8217;t buy until you have time to install and test</strong>, particularly when it isn&#8217;t a brand-new shrink-wrapped product. Schedule the time for the work. Not only will that help you catch bad parts fast enough to return them, it will also help you spot when somebody sends you a part that is not what was advertised. Even shrink-wrapped products are better tested quickly, but with anything used or open-box you want to make sure you didn&#8217;t waste your money or get scammed.</p></li><li><p><strong>Make notes and take pictures</strong> if those might be helpful later. Setting up a compute cluster has a lot of repeat work, and you will find yourself routinely wishing you remembered exactly what steps you took on a previous node. Capture the information while it is fresh. 
Even a day later you&#8217;ll forget something.</p></li><li><p><strong>Budget for a cost 10% above what you planned.</strong> You&#8217;ll discover you need something that you didn&#8217;t realize would be important until opening up a case, or after realizing half your network cables aren&#8217;t being recognized by the switch.</p></li><li><p><strong>Expect to do operating system installations frequently at first.</strong> Not only are there many flavors to choose from &#8212; even just in the Unix/Linux world &#8212; but you&#8217;ll find yourself grappling with questions on specific O/S version numbers, desktop vs server variants, disk layout for the installation, and sometimes BIOS configuration. You <strong>will</strong> change your mind, and discover why keeping notes comes in handy. Don&#8217;t be emotionally wedded to any initial set-up. Learn to feel comfortable tearing down anything that isn&#8217;t exactly as you want it to be.</p></li></ol><h2>A Living Example</h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!vlxH!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f776306-4a6d-4efe-b19f-a4aeadebc8d0_514x400.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!vlxH!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f776306-4a6d-4efe-b19f-a4aeadebc8d0_514x400.jpeg 424w, https://substackcdn.com/image/fetch/$s_!vlxH!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f776306-4a6d-4efe-b19f-a4aeadebc8d0_514x400.jpeg 848w, 
https://substackcdn.com/image/fetch/$s_!vlxH!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f776306-4a6d-4efe-b19f-a4aeadebc8d0_514x400.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!vlxH!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f776306-4a6d-4efe-b19f-a4aeadebc8d0_514x400.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!vlxH!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f776306-4a6d-4efe-b19f-a4aeadebc8d0_514x400.jpeg" width="514" height="400" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9f776306-4a6d-4efe-b19f-a4aeadebc8d0_514x400.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:400,&quot;width&quot;:514,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:96720,&quot;alt&quot;:&quot;Small Form-Factor PC Cluster&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://substack.the-experimentalist.com/i/170729988?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f776306-4a6d-4efe-b19f-a4aeadebc8d0_514x400.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Small Form-Factor PC Cluster" title="Small Form-Factor PC Cluster" srcset="https://substackcdn.com/image/fetch/$s_!vlxH!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f776306-4a6d-4efe-b19f-a4aeadebc8d0_514x400.jpeg 424w, 
https://substackcdn.com/image/fetch/$s_!vlxH!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f776306-4a6d-4efe-b19f-a4aeadebc8d0_514x400.jpeg 848w, https://substackcdn.com/image/fetch/$s_!vlxH!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f776306-4a6d-4efe-b19f-a4aeadebc8d0_514x400.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!vlxH!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f776306-4a6d-4efe-b19f-a4aeadebc8d0_514x400.jpeg 1456w" sizes="100vw" loading="lazy"></picture>
</div></a></figure></div><p>Here are the low-tier nodes for my SFF cluster.</p><ul><li><p>Intel NUCs, all but one of the same model; the exception is my Kubernetes control plane. It has a small keyboard and monitor in case I can&#8217;t reach the cluster remotely, e.g. due to a flubbed network configuration experiment.</p></li><li><p>Each node has 64 GB of RAM, an HDD, and an SSD.</p></li><li><p>Cooling isn&#8217;t much of a concern as Intel built these around CPUs intended for the mobile market, but I keep a fan at one end to circulate the air just to be safe.</p></li><li><p>The nodes connect to a network switch positioned on the shelf above.</p></li></ul><p>All of these were bought after Intel had moved on to the next couple of series in the NUC product line. Fortunately I avoided the Raptor Lake debacle. At this point I think about three-quarters of the Intel NUC lines are aging out as an option for those starting on their own cluster in 2025. Many other SFF vendors have taken up the baton, so I expect SFFs to remain a viable approach for those not wanting to go in the direction of rack-mount hardware.</p>
<p><a href="https://substack.the-experimentalist.com/p/phase-1-just-a-bunch-of-compute">The Experimentalist : Phase 1: Just a Bunch of Compute</a> &#169; 2025 by <a href="https://www.linkedin.com/in/reidmpinchback/">Reid M. Pinchback</a> is licensed under <a href="https://creativecommons.org/licenses/by-sa/4.0/">CC BY-SA 4.0</a></p>]]></content:encoded></item><item><title><![CDATA[Building a SWE Experimentation Platform]]></title><description><![CDATA[Gain the skills GenAI benefits from, but job opportunities may not provide]]></description><link>https://substack.the-experimentalist.com/p/building-a-swe-experimentation-platform</link><guid isPermaLink="false">https://substack.the-experimentalist.com/p/building-a-swe-experimentation-platform</guid><dc:creator><![CDATA[Reid M. 
Pinchback]]></dc:creator><pubDate>Sat, 09 Aug 2025 05:17:19 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!Xrgb!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5becbcf6-68fe-431b-b08a-9773f9c9bd3f_781x641.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Xrgb!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5becbcf6-68fe-431b-b08a-9773f9c9bd3f_781x641.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Xrgb!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5becbcf6-68fe-431b-b08a-9773f9c9bd3f_781x641.png 424w, https://substackcdn.com/image/fetch/$s_!Xrgb!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5becbcf6-68fe-431b-b08a-9773f9c9bd3f_781x641.png 848w, https://substackcdn.com/image/fetch/$s_!Xrgb!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5becbcf6-68fe-431b-b08a-9773f9c9bd3f_781x641.png 1272w, https://substackcdn.com/image/fetch/$s_!Xrgb!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5becbcf6-68fe-431b-b08a-9773f9c9bd3f_781x641.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Xrgb!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5becbcf6-68fe-431b-b08a-9773f9c9bd3f_781x641.png" width="781" height="641" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5becbcf6-68fe-431b-b08a-9773f9c9bd3f_781x641.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:641,&quot;width&quot;:781,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:55919,&quot;alt&quot;:&quot;Block diagram of the Experimentalist Laboratory&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://substack.the-experimentalist.com/i/170317139?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5becbcf6-68fe-431b-b08a-9773f9c9bd3f_781x641.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Block diagram of the Experimentalist Laboratory" title="Block diagram of the Experimentalist Laboratory" srcset="https://substackcdn.com/image/fetch/$s_!Xrgb!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5becbcf6-68fe-431b-b08a-9773f9c9bd3f_781x641.png 424w, https://substackcdn.com/image/fetch/$s_!Xrgb!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5becbcf6-68fe-431b-b08a-9773f9c9bd3f_781x641.png 848w, https://substackcdn.com/image/fetch/$s_!Xrgb!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5becbcf6-68fe-431b-b08a-9773f9c9bd3f_781x641.png 1272w, https://substackcdn.com/image/fetch/$s_!Xrgb!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5becbcf6-68fe-431b-b08a-9773f9c9bd3f_781x641.png 1456w" sizes="100vw" fetchpriority="high"></picture>
</div></a></figure></div><p><a href="https://substack.the-experimentalist.com/p/karl-marx-would-buy-gpus">My "Karl Marx Would Buy GPUs" article concluded</a> that, with corporate and personal wealth ever more concentrated, GenAI vendors will push that disparity further. Our finances, whether as employee, solo entrepreneur, or small-company founder, are facing similar forces. The strategic choices for overcoming the slow squeeze boil down to combinations of:</p><ol><li><p>Find a way to be more capable than the big GenAI vendors and identify opportunities to beat them at some piece of their own game. Challenging, but not impossible, as big companies aren&#8217;t as nimble as individuals or small companies. 
Expect to change direction rapidly as LLM capabilities evolve.</p></li><li><p>Find something your market wants that is grounded in novelty or uniqueness or deep insight, or at least with a provenance that communicates value. If you were part of the reason for the transaction, that edge can&#8217;t be instantly prompted away.</p></li></ol><div><hr></div><p><strong>Next: <a href="https://substack.the-experimentalist.com/p/phase-1-just-a-bunch-of-compute">Phase 1: Just a Bunch of Compute</a></strong></p><div><hr></div><h2>Skill Building</h2><p>My own focus is the world of software engineering, so I&#8217;ve been trying to navigate a path that is about 5% of the first option and 95% of the second. A day may come when technical domain expertise ceases to be a thing, but I believe we&#8217;re a long way from that. The strongest reason I have was explored in <a href="https://substack.the-experimentalist.com/p/on-the-turn-of-a-phrase">On the Turn of a Phrase</a>:</p><blockquote><p><em>We cannot think thoughts for which we lack the language.</em></p></blockquote><p>Within software engineering, or frankly any deep-knowledge career path, that suggests a couple of observations:</p><ul><li><p>If you aren&#8217;t invested in the GenAI world then you need a deep understanding of the detailed workings of a domain. You&#8217;re competing against an ever-changing LLM knowledge base. Surface knowledge won&#8217;t cut it.</p></li><li><p>If you are invested in the GenAI world then your ability to generate quality outcomes improves greatly if you have deep knowledge of a domain. Surface knowledge has too little selective power in prompts to generate the strongest results out of an infinite universe of choices. You need to know what to ask for.</p></li></ul><p>Either way you land in a similar place as a SWE. 
Deep skill has moved from being an eventual nice-to-have to being a key factor in your relevance.</p><p>You can&#8217;t build deep skill just by reading a book:</p><ul><li><p>You have to be in a position to actively experiment as you drill down.</p></li><li><p>We learn from a mix of positive and negative experiences.</p></li><li><p>These shape both what connects in our memory and the boundary on what is appropriate to generalize.</p></li></ul><p>Those of us who have been in software engineering (or an adjacent field) for a couple of decades have already had opportunities to do all that. We&#8217;ve built layers of foundation skills. I&#8217;m concerned that more recent graduates, junior engineers, and even senior engineers are going to find themselves denied a similar learning process.</p><h2>The Strategy</h2><p>To help fill the gap I&#8217;m rolling out a series of articles on the approach I have used myself for many years. I&#8217;m in the midst of updating it for life in an LLM world.</p><p>The idea is simple. Nothing teaches you how things work like having to actually make them work by yourself. The effort to create comprehensive infrastructure from scratch will teach you far more than a hundred LeetCode exercises.</p><p>If you follow along and attempt any of this, in essence you&#8217;ll be building the scaffolding for your own startup. Whether you have a great business idea or not, at a minimum you&#8217;ll be faced with how pre-seed tech startups start to stitch themselves together and you&#8217;ll see why they have so many rough edges. 
There is only so much time, only so much money, only so much energy, and you learn that the only way you say &#8220;yes&#8221; to important tasks is by saying &#8220;no&#8221; or &#8220;not yet&#8221; to unimportant ones.</p><p>Even if you aren&#8217;t in an early phase of your career, perhaps you&#8217;ve experienced what many of us do: an industry of ever-broadening technology choices that we often bounce between at increasingly shallow levels of understanding. If you haven&#8217;t built a proper experimental playground before, this may motivate you to build one so that you can dig into all the high-leverage details that the day job may not enable.</p><p>There is another theme to this. Cloud compute and GenAI vendor APIs have something in common. Both have ways of taxing you for experimentation, and both can punish you badly for expensive mistakes in establishing budget guardrails.</p><h2>The Experimentalist Laboratory</h2><p>The approach here is to mostly run on hardware you&#8217;ve purchased for yourself. Obviously many things can be moved to the cloud, and some of them should, but there is nothing quite like the freedom to make any purchase choice that makes sense to you, apply any configuration that helps you, and throw away anything you later discover does not serve you.</p><p>Technology should be about <strong>YOU</strong>, and not the other way around. The process here will very much be one of making, and revising as necessary, decisions that place <strong>YOU</strong> at the center of what happens and keep <strong>YOU</strong> as the primary decider of pretty much everything.</p><p>A side benefit is that, other than the initial hardware purchases, you aren&#8217;t constantly feeding mega-corps more money just to have permission to learn and push the envelope. Stop acting like their favorite flavor of catnip. They have an ample supply. 
</p><p>The essence of building your own laboratory for experimentation is straightforward:</p><ul><li><p>The economics of the platform should be approachable.</p></li><li><p>Hardware, software, and some external services will be involved.</p></li><li><p>It likely takes more than a laptop; you&#8217;ll want something that surfaces real challenges and provides the growth potential of actual distributed systems.</p></li><li><p>There will be many moving parts, so tackle the assembly in bite-sized pieces.</p></li><li><p>Optionally include LLMs. If you do, the same hardware you use for other experiments can be used for self-hosting some LLMs.</p></li></ul><p>Including LLMs opens up more avenues for learning and, if your strategy is to &#8220;beat GenAI vendors at their own game&#8221;, you were never going to avoid self-hosting experiments anyway.</p><p>Assembling the laboratory is broken into two major sections and an optional third.</p><ul><li><p><strong>Phase 1:</strong> the basic foundation of hardware plus minimal software tooling for rolling out system configuration and application installation. It also covers ensuring that your work gets protected somewhere remote.</p></li><li><p><strong>Phase 2:</strong> an ever-growing collection of technology artifacts or progress on procedural efforts so that, in time, you can build pretty much anything you wish.</p></li><li><p><strong>Cheapskate AI:</strong> optionally, run LLM tooling locally to experiment with, tune, and control the behavior of the AI tools you use. The tools should behave as you wish, not as an outside vendor decides for you.</p></li></ul><p>The to-do list can be as long as your interests drive it, but those earlier in their career should come out of this with a gut-level understanding of what drives many decisions at the intersection of technology and business. 
That awareness can become one of your selling points in a career role or a new business.</p><p>In future articles I&#8217;ll start breaking down the laboratory assembly in detail.</p><p><a href="https://substack.the-experimentalist.com/p/building-a-swe-experimentation-platform">The Experimentalist : Building a SWE Experimentation Platform</a> &#169; 2025 by <a href="https://www.linkedin.com/in/reidmpinchback/">Reid M. Pinchback</a> is licensed under <a href="https://creativecommons.org/licenses/by-sa/4.0/">CC BY-SA 4.0</a></p>]]></content:encoded></item><item><title><![CDATA[Newsletter Refurb 2025-08-08]]></title><description><![CDATA[Browser improvements for both the Substack and Medium versions]]></description><link>https://substack.the-experimentalist.com/p/newsletter-refurb-2025-08-08</link><guid isPermaLink="false">https://substack.the-experimentalist.com/p/newsletter-refurb-2025-08-08</guid><dc:creator><![CDATA[Reid M. 
Pinchback]]></dc:creator><pubDate>Fri, 08 Aug 2025 05:14:22 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!gmqZ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc4169fb8-5bac-4b62-a5d3-fc3cafae5c31_514x400.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!gmqZ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc4169fb8-5bac-4b62-a5d3-fc3cafae5c31_514x400.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!gmqZ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc4169fb8-5bac-4b62-a5d3-fc3cafae5c31_514x400.jpeg 424w, https://substackcdn.com/image/fetch/$s_!gmqZ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc4169fb8-5bac-4b62-a5d3-fc3cafae5c31_514x400.jpeg 848w, https://substackcdn.com/image/fetch/$s_!gmqZ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc4169fb8-5bac-4b62-a5d3-fc3cafae5c31_514x400.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!gmqZ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc4169fb8-5bac-4b62-a5d3-fc3cafae5c31_514x400.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!gmqZ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc4169fb8-5bac-4b62-a5d3-fc3cafae5c31_514x400.jpeg" width="514" height="400" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c4169fb8-5bac-4b62-a5d3-fc3cafae5c31_514x400.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:400,&quot;width&quot;:514,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:311584,&quot;alt&quot;:&quot;AI-generated image of a mechanic fixing machinery&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://substack.the-experimentalist.com/i/170418446?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc4169fb8-5bac-4b62-a5d3-fc3cafae5c31_514x400.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="AI-generated image of a mechanic fixing machinery" title="AI-generated image of a mechanic fixing machinery" srcset="https://substackcdn.com/image/fetch/$s_!gmqZ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc4169fb8-5bac-4b62-a5d3-fc3cafae5c31_514x400.jpeg 424w, https://substackcdn.com/image/fetch/$s_!gmqZ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc4169fb8-5bac-4b62-a5d3-fc3cafae5c31_514x400.jpeg 848w, https://substackcdn.com/image/fetch/$s_!gmqZ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc4169fb8-5bac-4b62-a5d3-fc3cafae5c31_514x400.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!gmqZ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc4169fb8-5bac-4b62-a5d3-fc3cafae5c31_514x400.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture>
</div></a></figure></div><p>Just a quick note on some long-overdue cleanup now that the newsletter has enough content to guide me on the next steps.</p><div><hr></div><p><strong>Prev: <a href="https://substack.the-experimentalist.com/p/the-press-room">The Press Room</a> | Next: <a href="https://substack.the-experimentalist.com/p/newsletter-status-2025-09-01">Newsletter Status 2025&#8211;09&#8211;01</a></strong></p><div><hr></div><h2>Why the Changes</h2><p>New articles will start rolling out on technical themes, and it was time to help subscribers find the content that interests them. 
Performing some cleanup now makes it easier to just follow along with the new pattern as content is released.</p><h2>What Changed</h2><ol><li><p>Both the Substack and Medium versions of the newsletter now organize material into thematic areas. Those are &#8220;sections&#8221; in Substack, and &#8220;subpages&#8221; in Medium. These are visible both in the home-page menu and in the groups of articles displayed on the home page as you scroll down.</p></li><li><p>The thematic areas are:</p><ol><li><p><strong>Hammer &amp; Anvil:</strong> Articles on the tech economy, which already contains several items.</p></li><li><p><strong>Hot off the Press:</strong> Posts like this about newsletter features and usage.</p></li><li><p><strong>The Laboratory:</strong> A new area for home office/small business tech experimentation. Not yet visible, but stay tuned as that&#8217;s about to change!</p></li><li><p>More to follow in time for data engineering, observability, AI and ML.</p></li></ol></li><li><p>Substack subscribers can now control which notifications they get. At the moment nothing is different: you&#8217;ll get all of the new article notifications, but you now have the power to exclude notifications for sections you aren&#8217;t interested in. In an Attention Economy, I&#8217;d just as soon let you enjoy some silence and not sift through unwanted inbox visits. Apologies to the Medium subscribers, but that platform does not have an equivalent feature for subpages.</p></li><li><p>The changes above will only be visually meaningful in a web browser. You likely won&#8217;t see a difference in the Android apps for either Substack or Medium. I don&#8217;t personally use an iPhone so I have no idea what the outcome is there, but I suspect it is much the same.</p></li><li><p>I tidied up the in-article navigation links on multi-article series to eliminate &#8220;View Series&#8221;. 
I already wasn&#8217;t using that to link to anything for Medium because Android deep linking doesn&#8217;t work in the Medium app for tag page URLs. With the migration to Substack sections, the Substack app apparently suffers similar problems for section URLs, which eliminated the last reason for a &#8220;View Series&#8221; text fragment to exist.</p></li></ol><p><a href="https://substack.the-experimentalist.com/p/newsletter-refurb-2025-08-08">The Experimentalist : Newsletter Refurb 2025-08-08</a> &#169; 2025 by <a href="https://www.linkedin.com/in/reidmpinchback/">Reid M. Pinchback</a> is licensed under <a href="https://creativecommons.org/licenses/by-sa/4.0/">CC BY-SA 4.0</a></p>]]></content:encoded></item><item><title><![CDATA[Karl Marx Would Buy GPUs]]></title><description><![CDATA[Maybe he was half right about owning the means of production]]></description><link>https://substack.the-experimentalist.com/p/karl-marx-would-buy-gpus</link><guid isPermaLink="false">https://substack.the-experimentalist.com/p/karl-marx-would-buy-gpus</guid><dc:creator><![CDATA[Reid M. 
Pinchback]]></dc:creator><pubDate>Mon, 04 Aug 2025 09:03:23 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!rGKE!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66317369-3830-4cb5-86fd-137daa4d2917_514x400.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!rGKE!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66317369-3830-4cb5-86fd-137daa4d2917_514x400.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!rGKE!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66317369-3830-4cb5-86fd-137daa4d2917_514x400.png 424w, https://substackcdn.com/image/fetch/$s_!rGKE!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66317369-3830-4cb5-86fd-137daa4d2917_514x400.png 848w, https://substackcdn.com/image/fetch/$s_!rGKE!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66317369-3830-4cb5-86fd-137daa4d2917_514x400.png 1272w, https://substackcdn.com/image/fetch/$s_!rGKE!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66317369-3830-4cb5-86fd-137daa4d2917_514x400.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!rGKE!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66317369-3830-4cb5-86fd-137daa4d2917_514x400.png" width="514" height="400" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/66317369-3830-4cb5-86fd-137daa4d2917_514x400.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:400,&quot;width&quot;:514,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:376245,&quot;alt&quot;:&quot;AI-generated image of an Art Deco-style poster with a worker holding a GPU card&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://substack.the-experimentalist.com/i/170059305?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66317369-3830-4cb5-86fd-137daa4d2917_514x400.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="AI-generated image of an Art Deco-style poster with a worker holding a GPU card" title="AI-generated image of an Art Deco-style poster with a worker holding a GPU card" srcset="https://substackcdn.com/image/fetch/$s_!rGKE!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66317369-3830-4cb5-86fd-137daa4d2917_514x400.png 424w, https://substackcdn.com/image/fetch/$s_!rGKE!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66317369-3830-4cb5-86fd-137daa4d2917_514x400.png 848w, https://substackcdn.com/image/fetch/$s_!rGKE!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66317369-3830-4cb5-86fd-137daa4d2917_514x400.png 1272w, https://substackcdn.com/image/fetch/$s_!rGKE!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66317369-3830-4cb5-86fd-137daa4d2917_514x400.png 1456w" 
sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p>Setting aside the obviously disastrous history of Soviet-era planned economies, sometimes useful messages get lost with the passage of time and past poor implementations. Marx&#8217;s observations on the long-term trend of private profit and concentration of asset control have largely turned out to be true. 
Unfortunately he didn&#8217;t allow for how collectivist control ends up no better than control driven by short-term-focused capitalism.</p><div><hr></div><p><strong>Prev:</strong> <strong><a href="https://substack.the-experimentalist.com/p/economic-currents">Economic Currents</a> | Next: <a href="https://substack.the-experimentalist.com/p/reification-fallacy-and-llm-use">Reification Fallacy and LLM Use</a></strong></p><div><hr></div><p>For backstory articles on wealth concentration:</p><ul><li><p>Why skilled workers need a plan to navigate through wealth-driven change was introduced in <a href="https://substack.the-experimentalist.com/p/agi-is-irrelevant">AGI is Irrelevant</a>.</p></li><li><p>Personal wealth concentration was covered in <a href="https://substack.the-experimentalist.com/p/genai-reimagined">GenAI Reimagined</a>.</p></li><li><p>Corporate asset concentration in the U.S. was explored in <a href="https://substack.the-experimentalist.com/p/economic-currents">Economic Currents</a>.</p></li></ul><p>This article is a start on &#8220;so what are we going to do about it?&#8221;</p><p>I&#8217;m going to try and scare the hell out of you. I don&#8217;t do that to market fear (un-paywalled articles make a terrible economic model for that), but to emphasize how now is THE time to discover whatever your personal sense of purpose might be. You need to find it, and then align that with a strategy that properly accounts for what is happening around you. More than anything, I want you to bloody well pay attention and get out of life&#8217;s autopilot mode.</p><h2>Feudal Production</h2><p>The part of Marx&#8217;s analysis that mattered was how, over time, small cottage-scale industry would get supplanted by larger industrial efforts that needed more coordination. The part that he got wrong was his belief that feeding such a process could result in a different outcome depending on how you fed it. 
Tech broligarchy and communist oligarchy end up no different when you aren&#8217;t one of those &#8220;select few&#8221; making the big decisions. Wealth concentrates intensely, and anybody not at the very center of that concentration becomes, economically speaking, a food source. You&#8217;re the value to harvest, whether as worker or customer or taxpayer.</p><p>The stronger, if more challenging, solution would have been to advocate for ways in which the smaller cottage industries could grow to reasonable size and scale of output without transitioning into a size disproportionate to the health of everything around them. Some system where workers and owners are either the same, or closer in their working relationship and thus have more ways to be aligned both economically and for the mission of the company.</p><p>For the Victorian time period of Marx this notion would have sounded like gibberish in anything except perhaps high-end arts and crafts businesses like William Morris&#8217; own &#8220;Morris &amp; Co.&#8221; In the modern economic era, things are different.</p><h2>Modern Production</h2><p>The personal computer arguably jump-started the idea that people could have a device in their hands that gave them options. If they were a programmer, they could use one to hone their skills. If they were also entrepreneurial then they might add to their income with consulting work, or start a software-based company. Rapidly a host of keyboard-adjacent career paths like bookkeeping or professional writing followed suit, until ultimately virtually all of white-collar employment had the potential to at least leverage a PC to assist with the capacity to work and earn a living.</p><p>With the advent of GenAI the mere value in possessing a personal computer has dropped substantially, at least in the short term. The fact that you can use a laptop and generate material from it has competition from online services that can also generate material. 
There was a potential moat each free person possessed by mere virtue of engaging their brain to see if they could produce via a keyboard something another person hadn&#8217;t thought of. That entire concept is under systemic threat across roles and across industries.</p><h2>Future Production</h2><p>The GenAI vendors are trying to seize the means of production. They&#8217;re doing it by making a thing they have and you usually don&#8217;t &#8212; expensive GPU-based processing for LLMs &#8212; and using massive PR and marketing budgets to push all other industry around them to opt for the new means (GPU-based production) and discount the criticality of the previous means (PC-based production). That leaves them in possession of the equivalent of a shop floor filled with robotic assemblers, and you holding a pocket calculator.</p><p>In the short term the disparity has been made less obvious. LLM access is metered out remotely, and you get to use it via that old calculator you are used to: your PC. Nothing to see here, right?</p><p>Unfortunately we&#8217;ve seen this game before. Introduce something on the cheap. Get people to yield to the sound of the Pied Piper. Change their personal and business workflow to the new thing, but at a price that doesn&#8217;t quite let them realize that tokens are being positioned as the new form of taxation: a tax on every thought, every act of creation, every communication.</p><p>What do you think happens next? Enshittification. The quality drops. The token costs go up. Only the select few with the deeper pockets get the best benefits. This isn&#8217;t just a time-tested formula. It is the very formula that some of the biggest players in GenAI have themselves used to grow their pre-AI companies. It is <em>already</em> the way they think about business. As the process proceeds, more wealth leaves from those who can pay, and goes to those who charge. Wealth concentrates even further. </p><p>Understand the thesis here. 
This is not about whether you personally think GenAI is the be-all-end-all technology solution. This isn&#8217;t purely about technological effectiveness. This is about the perception of technological effectiveness in the minds of those at the center of concentrations of wealth.</p><p>If the GenAI enthusiasts are right about the utility of the technology, it would make its PC predecessor into little more than a calculator. If the GenAI enthusiasts are wrong, but the GenAI companies convince enough of industry otherwise&#8230; the situation is almost the same because of the decision-making power of a few. Perception doesn&#8217;t just become reality, perception by the select wealthy few <em>dictates</em> reality.</p><p>That&#8217;s the message to take away here. The parameters for deciding your life strategy are now the same either way, at least until a different future reveals itself than what we are seeing at this moment. You can take either view, and the path forward is the same because the alternative is that you&#8217;re one person sitting on a couch reading this article while a handful of TRILLION-dollar companies are sizing you and me and a few billion other people up like we&#8217;re all krill destined for the gullet of a whale. That would be bad enough, but they are using their influence to align all the BILLION-dollar companies in the exact same direction.</p><p>Go back and read <a href="https://substack.the-experimentalist.com/p/economic-currents">Economic Currents</a> if you haven&#8217;t done so previously. It only takes the 5000 largest companies to control about 86% of U.S. corporate assets. How many games of C-suite weekend golf do you really think it takes to align a large portion of 5000 companies? Not many.</p><h2>Seizing the Means Back</h2><p>Personally I&#8217;ve been in the GenAI-cautious middle for a while. I use it often enough for highly targeted outcomes, but I&#8217;m not in the gung-ho-all-in-vibe-till-you-die crowd. 
Doesn&#8217;t quite sound like a Nickelback verse, but close enough. I&#8217;ve used the time more for learning the plumbing and research around the edges, so that instead of being caught up in the new shiny I could develop my own sense of the lay of the land while minimizing the impact of hypesters trying to tell me what to think. But now I know what I think.</p><p>We either have to be better at GenAI than the big vendors themselves, or we have to have an option that completely decouples us from their influence on the economy.</p><p>Pause. Let it sink in. I&#8217;ll say it again.</p><p>We have two broad options for strategy:</p><ol><li><p>We can decide we <em>have no choice</em> but to be better at GenAI (likely in a narrow domain), and who we have to be better than is the <em>big</em> <em>vendors</em>. Not better than anybody and everybody that has ever been involved in the AI field. Not better than PhDs who specialized in LLM-adjacent research. Better than big companies that on the one hand have massive assets, but on the other hand have all the failings of very big companies. Big companies are not nimble. They do not grow like smaller companies if those smaller companies make it past the failure rate statistics. The goal would be to feed off the economic bloat in bigger companies, but have the discipline to remain small enough to retain mobility while large enough to compensate exceptionally-skilled employees, and use that sweet spot to continually dodge around the bigger players.</p></li><li><p>We can decide that we have some entirely different course of action for which the machinations of big tech will have no relevance at all. That means either computer technology is not particularly relevant, or that somehow the unique provenance of you doing the work is what matters to your customers or employers. This is effectively &#8220;living off the grid&#8221; in terms of GenAI. 
What establishes the perception of your value to the marketplace can&#8217;t relate to GenAI, or in some way must be an act of rebellion against GenAI that people are willing to pay for. This would push for things like true novelty of a good or service, or something about human connection, or production of a physical good where there is cachet in it not being something imagined by an LLM and instead crafted by a person.</p></li></ol><p>I see no other broad paths out of this than those two. Each has elbow room for individual interpretation and style. What neither of them contains, at least for most of us in the skilled white-collar work force, is &#8220;the thing I used to do.&#8221; That path was shaped by larger organizations with an already-shrinking relationship to the workforce, and they are in the final stages of doing what they can to reduce it further while concentrating wealth even more.</p><p>Yesterday is gone. Let it go. Heartbreaking as the message may be for some, the choice was taken out of our hands.</p><p><a href="https://substack.the-experimentalist.com/p/karl-marx-would-buy-gpus">The Experimentalist : Karl Marx Would Buy GPUs</a> &#169; 2025 by <a href="https://www.linkedin.com/in/reidmpinchback/">Reid M. Pinchback</a> is licensed under <a href="https://creativecommons.org/licenses/by-sa/4.0/">CC BY-SA 4.0</a></p>]]></content:encoded></item><item><title><![CDATA[Economic Currents]]></title><description><![CDATA[Have large companies left room for builders/creators/makers?]]></description><link>https://substack.the-experimentalist.com/p/economic-currents</link><guid isPermaLink="false">https://substack.the-experimentalist.com/p/economic-currents</guid><dc:creator><![CDATA[Reid M. 
Pinchback]]></dc:creator><pubDate>Tue, 29 Jul 2025 21:33:19 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!lBCQ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9435df26-b02b-4c8b-98e5-05231019e442_514x400.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!lBCQ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9435df26-b02b-4c8b-98e5-05231019e442_514x400.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!lBCQ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9435df26-b02b-4c8b-98e5-05231019e442_514x400.jpeg 424w, https://substackcdn.com/image/fetch/$s_!lBCQ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9435df26-b02b-4c8b-98e5-05231019e442_514x400.jpeg 848w, https://substackcdn.com/image/fetch/$s_!lBCQ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9435df26-b02b-4c8b-98e5-05231019e442_514x400.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!lBCQ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9435df26-b02b-4c8b-98e5-05231019e442_514x400.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!lBCQ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9435df26-b02b-4c8b-98e5-05231019e442_514x400.jpeg" width="514" height="400" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9435df26-b02b-4c8b-98e5-05231019e442_514x400.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:400,&quot;width&quot;:514,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:286527,&quot;alt&quot;:&quot;AI-generated image of a ship on a stormy sea&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://substack.the-experimentalist.com/i/169533448?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9435df26-b02b-4c8b-98e5-05231019e442_514x400.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="AI-generated image of a ship on a stormy sea" title="AI-generated image of a ship on a stormy sea" srcset="https://substackcdn.com/image/fetch/$s_!lBCQ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9435df26-b02b-4c8b-98e5-05231019e442_514x400.jpeg 424w, https://substackcdn.com/image/fetch/$s_!lBCQ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9435df26-b02b-4c8b-98e5-05231019e442_514x400.jpeg 848w, https://substackcdn.com/image/fetch/$s_!lBCQ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9435df26-b02b-4c8b-98e5-05231019e442_514x400.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!lBCQ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9435df26-b02b-4c8b-98e5-05231019e442_514x400.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft 
pc-display-flex pc-gap-8 pc-reset"></div></div></div></a></figure></div><p>Navigating career changes or the challenges of launching a new startup is hard if you don&#8217;t have a map for the forces pushing you back and forth. Much of that map is established by the prevailing economic environment. Lately the world is abuzz with GenAI disruption, and while that is real, it isn&#8217;t the only time we&#8217;ve all experienced disruption. Some of the Sturm und Drang is distraction, and for the moment we&#8217;re going to set the distraction aside to look at the broader picture. In later articles we can dig into what is legitimately different now.</p><p>Diving into U.S. 
economic data can leave you wondering how any government could use it to intelligently steer a ship containing 340 million people, yet on the other hand also surprise you with effects that have clearly been playing out for a half century or more and thus should have been obvious to policy-makers. Put on your life vest, it&#8217;s going to be a bumpy ride.</p><div><hr></div><p><strong>Prev: <a href="https://substack.the-experimentalist.com/p/surfing-chaos">Surfing Chaos</a> | Next: <a href="https://substack.the-experimentalist.com/p/karl-marx-would-buy-gpus">Karl Marx Would Buy GPUs</a></strong></p><div><hr></div><h2>Company Size and Asset Control</h2><p>Since World War II the size of the largest companies has been increasing in the U.S. This is not merely a factor of population. It is also reflected in how assets are concentrated in the largest companies.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!bYc1!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6470a083-5e67-44d0-900f-bb2ad3ae0913_1033x527.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!bYc1!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6470a083-5e67-44d0-900f-bb2ad3ae0913_1033x527.png 424w, https://substackcdn.com/image/fetch/$s_!bYc1!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6470a083-5e67-44d0-900f-bb2ad3ae0913_1033x527.png 848w, https://substackcdn.com/image/fetch/$s_!bYc1!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6470a083-5e67-44d0-900f-bb2ad3ae0913_1033x527.png 1272w, 
https://substackcdn.com/image/fetch/$s_!bYc1!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6470a083-5e67-44d0-900f-bb2ad3ae0913_1033x527.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!bYc1!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6470a083-5e67-44d0-900f-bb2ad3ae0913_1033x527.png" width="1033" height="527" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6470a083-5e67-44d0-900f-bb2ad3ae0913_1033x527.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:527,&quot;width&quot;:1033,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:76463,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://substack.the-experimentalist.com/i/169533448?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6470a083-5e67-44d0-900f-bb2ad3ae0913_1033x527.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!bYc1!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6470a083-5e67-44d0-900f-bb2ad3ae0913_1033x527.png 424w, https://substackcdn.com/image/fetch/$s_!bYc1!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6470a083-5e67-44d0-900f-bb2ad3ae0913_1033x527.png 848w, 
https://substackcdn.com/image/fetch/$s_!bYc1!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6470a083-5e67-44d0-900f-bb2ad3ae0913_1033x527.png 1272w, https://substackcdn.com/image/fetch/$s_!bYc1!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6470a083-5e67-44d0-900f-bb2ad3ae0913_1033x527.png 1456w" sizes="100vw"></picture></div></a><figcaption class="image-caption">Source: 100 Years of Rising Corporate Concentration, Kwon et al, Feb 2023 - page 12</figcaption></figure></div><p>What the charts 
represent is simple. If you add up some financially-relevant measures across businesses after filtering out all the smallest firms like sole proprietorships, then whatever those totals are, the percentage belonging to the top 1% or top 0.1% of largest companies is displayed. Which measures are shown depends on the historical data available, but we&#8217;re focusing on assets (the blue series at the top).</p><ul><li><p>By 2020 the top 1% largest companies had about a 97% share of all assets owned or controlled by all companies of non-trivial size.</p></li><li><p>By 2020 the top 0.1% largest had about an 86% share. That represents roughly the top 5000 largest corporations from about 1990 onwards, and fewer than 5000 in the years prior due to changes in the number of U.S. businesses over time.</p></li></ul><p>If the financial heft of companies only grew due to population, then these shares would not be increasing; they would be flat, provided the number of competitors had not materially changed. This is a bigger-is-better effect that derives from economies of scale likely being a strong factor, plus some mix of tax policy and regulatory impacts that favor the larger survivors as competitors are bought out or fail to maintain market share.</p><p>The rapid increase in concentration from the 1960s onwards corresponds to a significant change in corporate taxation due to the Revenue Act of 1964. It wouldn&#8217;t have been the only factor, but that act introduced a 6% difference in tax rates and arguably was the beginning of post-WWII corporate tax cuts that extend to the present day. Another relevant factor would be the U.S. going off the gold standard in 1971, which led to an increase in the role of debt in growing the U.S. 
economy and made it easier for larger companies to buy peers or smaller firms.</p><p>As much as we hear the politics of lower interest rates, lower tax rates, and more available credit, history suggests that the bigger players have extracted more competitive leverage from those variables than smaller firms can manage.</p><p>Do companies control just financial assets, or do they also employ in similar proportion? The reason for asking is that if bigger companies also are bigger employers then this would suggest that bigger might be better for the workforce as well.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!gwjN!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F980faaa9-002a-4621-9676-0851dab7ccdf_664x474.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!gwjN!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F980faaa9-002a-4621-9676-0851dab7ccdf_664x474.png 424w, https://substackcdn.com/image/fetch/$s_!gwjN!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F980faaa9-002a-4621-9676-0851dab7ccdf_664x474.png 848w, https://substackcdn.com/image/fetch/$s_!gwjN!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F980faaa9-002a-4621-9676-0851dab7ccdf_664x474.png 1272w, https://substackcdn.com/image/fetch/$s_!gwjN!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F980faaa9-002a-4621-9676-0851dab7ccdf_664x474.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!gwjN!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F980faaa9-002a-4621-9676-0851dab7ccdf_664x474.png" width="664" height="474" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/980faaa9-002a-4621-9676-0851dab7ccdf_664x474.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:474,&quot;width&quot;:664,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:23792,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://substack.the-experimentalist.com/i/169533448?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F980faaa9-002a-4621-9676-0851dab7ccdf_664x474.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!gwjN!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F980faaa9-002a-4621-9676-0851dab7ccdf_664x474.png 424w, https://substackcdn.com/image/fetch/$s_!gwjN!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F980faaa9-002a-4621-9676-0851dab7ccdf_664x474.png 848w, https://substackcdn.com/image/fetch/$s_!gwjN!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F980faaa9-002a-4621-9676-0851dab7ccdf_664x474.png 1272w, https://substackcdn.com/image/fetch/$s_!gwjN!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F980faaa9-002a-4621-9676-0851dab7ccdf_664x474.png 1456w" 
sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">Source: 100 Years of Rising Corporate Concentration, Kwon et al, Feb 2023 - page 51</figcaption></figure></div><p>Unfortunately we&#8217;re only seeing data from 1980 onwards so commenting on the long-term trend is probably not warranted, but the percentage shares do matter. The top 0.1% of companies that control an 86% share of assets only employ a 41% share of the workforce; the top 1% with a 97% share of assets employ a 60% share of the workforce. 
That means the mere fact of businesses getting more concentrated has not generated direct employment on a scale equal to their size. They&#8217;ll employ more people, but not 1-for-1 with net increases in financial concentration. Smaller companies are the ones picking up the slack in employment.</p><p>Why do we care? Because this shows that, at least within the U.S., the large corporations have successfully been eating everything <em>except</em> the available workforce. That makes them tough competition to fight head-to-head financially, yet possibly an opportunity to exploit if you start a business because, bluntly, eating the rich has greater reward given that they have more meat on the bone. For job-hunters, smaller firms could be the more probable source of employment since larger firms are structured more for increasing their financial gains per person and less for increasing gains by growing their workforce.</p><p>This also raises the question: why do states chase the biggest companies with large tax breaks? Without accounting very clearly for the specific type of business activity that will be gained by a deal, and how local supply chains will benefit, encouraging smaller companies instead looks like a more effective employment strategy.</p><h2>Smaller Companies Have to Fight Harder</h2><p>You&#8217;ve probably heard that small business is the engine of job creation in the U.S. There is truth to it, as the chart above on employment share suggests. Unfortunately.</p><p>I say &#8220;unfortunately&#8221; for two reasons:</p><ol><li><p>Smaller companies have to fight ever-harder to survive. Various studies have examined this, and the TL;DR is that the longer a small company survives the more likely it is to continue to be a survivor, but the initial years see a lot of wreckage. 
Perhaps not surprisingly, as the bigger companies gain more share of economic assets, the smaller firms lose share.</p></li><li><p>If the largest 1% of companies control 97% of assets, then that means the bottom 99% of companies only control 3% of financial assets. That&#8217;s not a lot of financial support to work with in order to be a supposed &#8220;engine.&#8221; That&#8217;s a weed whacker.</p></li></ol><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!VXd5!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3eccc96c-582f-4514-bcfe-9d38ce7f6ba9_859x389.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!VXd5!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3eccc96c-582f-4514-bcfe-9d38ce7f6ba9_859x389.png 424w, https://substackcdn.com/image/fetch/$s_!VXd5!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3eccc96c-582f-4514-bcfe-9d38ce7f6ba9_859x389.png 848w, https://substackcdn.com/image/fetch/$s_!VXd5!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3eccc96c-582f-4514-bcfe-9d38ce7f6ba9_859x389.png 1272w, https://substackcdn.com/image/fetch/$s_!VXd5!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3eccc96c-582f-4514-bcfe-9d38ce7f6ba9_859x389.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!VXd5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3eccc96c-582f-4514-bcfe-9d38ce7f6ba9_859x389.png" 
width="859" height="389" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3eccc96c-582f-4514-bcfe-9d38ce7f6ba9_859x389.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:389,&quot;width&quot;:859,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:65677,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://substack.the-experimentalist.com/i/169533448?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3eccc96c-582f-4514-bcfe-9d38ce7f6ba9_859x389.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!VXd5!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3eccc96c-582f-4514-bcfe-9d38ce7f6ba9_859x389.png 424w, https://substackcdn.com/image/fetch/$s_!VXd5!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3eccc96c-582f-4514-bcfe-9d38ce7f6ba9_859x389.png 848w, https://substackcdn.com/image/fetch/$s_!VXd5!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3eccc96c-582f-4514-bcfe-9d38ce7f6ba9_859x389.png 1272w, https://substackcdn.com/image/fetch/$s_!VXd5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3eccc96c-582f-4514-bcfe-9d38ce7f6ba9_859x389.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg 
role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Source: The Role of Entrepreneurship in US Job Creation and Economic Dynamism, Decker et al, Summer 2014 - page 16</figcaption></figure></div><p>I don&#8217;t have data to extend the figure to earlier time periods, but from other data on new company creation per decade I strongly suspect the figure above would show this decline began at least 1 or 2 decades earlier, probably once the wave of activity from the post-WWII GI Bill had finished fully washing through the economic system.</p><p>What this likely shows is either a reduction in people taking on the risk of starting new businesses that need to hire employees, or more rapid buyouts, or an acceleration in failure rates in the early years. 
Most likely it is a combination of the three.</p><p>Small new entrants are experiencing an increasingly tough game. The survival statistics on small companies are brutal, so while small companies may be &#8220;the engine,&#8221; the employees in them are experiencing the turnover face-first with firms entering (being created) and exiting (failing or being bought out). These small firms either have to go for broke in the mad scramble to be big enough to win the race on the survival statistics, or take a buyout and call it a day, or they have to find a niche and somehow entirely avoid all of this. The last option is something I intend to get into in other articles as the GenAI world may change some thinking in this space, but for now understand that collectively these figures hint at why you see so much venture capital pressure on small startups. From a VC standpoint, it is literally &#8220;go big or go home.&#8221;</p><h2>Declining Job Growth</h2><p>There is another factor that is important to see, and this is one that may end up telling the future story of U.S. 
economic positioning within the broader global macroeconomic landscape.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!lbHy!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e1a59dc-be0f-4200-a30a-7829d33fddea_873x497.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!lbHy!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e1a59dc-be0f-4200-a30a-7829d33fddea_873x497.png 424w, https://substackcdn.com/image/fetch/$s_!lbHy!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e1a59dc-be0f-4200-a30a-7829d33fddea_873x497.png 848w, https://substackcdn.com/image/fetch/$s_!lbHy!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e1a59dc-be0f-4200-a30a-7829d33fddea_873x497.png 1272w, https://substackcdn.com/image/fetch/$s_!lbHy!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e1a59dc-be0f-4200-a30a-7829d33fddea_873x497.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!lbHy!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e1a59dc-be0f-4200-a30a-7829d33fddea_873x497.png" width="873" height="497" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8e1a59dc-be0f-4200-a30a-7829d33fddea_873x497.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:497,&quot;width&quot;:873,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:70630,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://substack.the-experimentalist.com/i/169533448?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e1a59dc-be0f-4200-a30a-7829d33fddea_873x497.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!lbHy!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e1a59dc-be0f-4200-a30a-7829d33fddea_873x497.png 424w, https://substackcdn.com/image/fetch/$s_!lbHy!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e1a59dc-be0f-4200-a30a-7829d33fddea_873x497.png 848w, https://substackcdn.com/image/fetch/$s_!lbHy!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e1a59dc-be0f-4200-a30a-7829d33fddea_873x497.png 1272w, https://substackcdn.com/image/fetch/$s_!lbHy!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e1a59dc-be0f-4200-a30a-7829d33fddea_873x497.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Source: The Role of Entrepreneurship in US Job Creation and Economic Dynamism, Decker et al, Summer 2014 - page 14</figcaption></figure></div><p>This figure will take a little bit of explaining.</p><ul><li><p>The darker jagged series is how many new jobs were created in that year. These would be companies expanding employment plus new companies appearing. Since these are annual numbers, there should be no seasonal adjustment.</p></li><li><p>The lighter jagged series is how many jobs were lost. Either companies were reducing employment or companies were failing. 
Again, these should not be seasonally adjusted.</p></li><li><p>The difference between the darker and lighter series is how many net jobs were added to the economy if that number is positive; if it is negative (the lighter series was above the darker) then that is how many net jobs were lost.</p></li><li><p>The darker and lighter dashed lines are smoothed versions of the noisier data.</p></li><li><p>When the economy is expanding, the darker and lighter dashed lines move further apart, with the darker above the lighter. When the economy is contracting, they move closer together or the gap turns negative.</p></li><li><p>The downward trend on both means that, on average, the rate of net job creation is declining. How much that matters depends on whether the population growth rate is also declining, which it mostly is, slightly (Source: <a href="https://www.brookings.edu/articles/new-census-estimates-show-a-tepid-rise-in-u-s-population-growth-buoyed-by-immigration/">The Brookings Institution</a>), but the slope shown here is a bit stronger than that and so likely reflects changes in the unemployment and participation rates.</p></li><li><p>I examined the Bureau of Labor Statistics data for the years since 2011. You get the expected distortion around economic events like COVID that trigger liquidity injections, but mostly the story of a declining rate of net job growth holds true.</p></li></ul><p>At a minimum this highlights one reason why political discourse can be fraught. The expansionary phases in this situation are a relatively modest and temporary effect compared to the overall trend. It is entirely possible that at some point not too far off the prevailing economic philosophy in the U.S. will have to set aside an assumption of endless growth by large companies. Either smaller companies will have to increase in representation within the economy, or the U.S. 
will have to shift to a degrowth economic model and possibly manage policy to encourage population decline.</p><h2>The Small-Cap Upside</h2><p>There actually is a slight upside in spite of the picture painted above. Small companies have at least one advantage over many larger companies, thus the idea of smaller players returning to economic prominence isn't entirely outlandish. </p><p>On average, small companies are more profitable: U.S. large caps returned 6.6% while U.S. small caps returned 9.2% annually over a 20-year period (Source: <a href="https://alphaarchitect.com/the-gap-between-large-and-small-companies-is-growing-why/">Alpha Architect</a>). The spread between them used to be larger, but the reason for large-cap profitability gains largely goes back to earlier points about advantages of scale and tax treatment. So long as some advantage remains, U.S. small companies should remain a focus of VC funding.</p><p>A reasonable question which I won&#8217;t get into here is how much of that large-cap profitability improvement is due to organic factors such as actually creating new products and services, versus financially-engineered improvements like M&amp;A and stock buybacks that show up mostly in accounting statements. Making numbers better of course matters to investors, but it isn&#8217;t the same kind of lasting value creation you get from building new things you could actually poke with a stick, and usually lacks the same long-term growth potential.</p><p>If interest rates and inflation remain elevated, both of which have some fundamental reasons for being the case, then those forces may favor companies with real organic growth over those that only appear to grow via financial engineering. Financial engineering may not grow value faster than inflation because you have to keep discovering more and more accounting tricks to extract value out of an otherwise fixed situation. 
Like, for example, the recent wave of layoffs blamed on AI.</p><h2>Fit the Puzzle Pieces Together</h2><p>Pause to see how the components of the system are fitting together.</p><ul><li><p>A significant share of employment comes from smaller companies, not just the largest 1%.</p></li><li><p>Small companies drive net job gains in the economy.</p></li><li><p>The largest 1% are eating all the financial assets.</p></li><li><p>Share of job creation activity in young firms is declining.</p></li><li><p>The rate of net job creation in the economy is declining.</p></li></ul><p>This is why in the opening I said &#8220;&#8230; effects that have clearly been playing out for a half century or more and thus should have been obvious to policy-makers.&#8221; Making policy that cozies up to the largest firms is not effective economic management. It is, if anything, almost the exact opposite of what is needed because it mostly maintains the status quo instead of improving upon it. Small and mid-sized firms are the need, and possibly the opportunity because &#8212; let&#8217;s be real about the implications here &#8212; the biggest firms are running out of financial activity share to eat. Whatever the largest eat will turn into something less profitable, with lower growth, and lower employment. It&#8217;s time for the small fish to remember that piranhas can eat things too.</p><p>The potential for organic growth in smaller ventures is an area of likely advantage for GenAI builder-folk that I want to dig into with future articles.</p><h2>References</h2><ul><li><p><a href="https://bfi.uchicago.edu/working-paper/100-years-of-rising-corporate-concentration/">The University of Chicago, BFI Working Paper No. 2023-20 / 100 Years of Rising Corporate Concentration, Kwon et al.</a></p></li><li><p><a href="https://pubs.aeaweb.org/doi/pdfplus/10.1257/jep.28.3.3">Journal of Economic Perspectives, Vol. 28, No. 
3, Summer 2014 / The Role of Entrepreneurship in US Job Creation and Economic Dynamism, Decker et al.</a></p></li><li><p><a href="https://www.brookings.edu/articles/new-census-estimates-show-a-tepid-rise-in-u-s-population-growth-buoyed-by-immigration/">The Brookings Institution / New census estimates show a tepid rise in U.S. population growth, buoyed by immigration, William H. Frey, January 4, 2023.</a></p></li><li><p><a href="https://alphaarchitect.com/the-gap-between-large-and-small-companies-is-growing-why/">Alpha Architect / The Gap Between Large and Small Companies is Growing. Why?, Larry Swedroe, March 5th, 2020.</a></p></li></ul><p><a href="https://substack.the-experimentalist.com/p/economic-currents">The Experimentalist : Economic Currents</a> &#169; 2025 by <a href="https://www.linkedin.com/in/reidmpinchback/">Reid M. 
Pinchback</a> is licensed under <a href="https://creativecommons.org/licenses/by-sa/4.0/">CC BY-SA 4.0</a></p>]]></content:encoded></item><item><title><![CDATA[Surfing Chaos]]></title><description><![CDATA[Windsurf proves startups are complicated for GenAI employees]]></description><link>https://substack.the-experimentalist.com/p/surfing-chaos</link><guid isPermaLink="false">https://substack.the-experimentalist.com/p/surfing-chaos</guid><dc:creator><![CDATA[Reid M. Pinchback]]></dc:creator><pubDate>Thu, 17 Jul 2025 03:27:03 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!_MBb!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3c7c5797-a3be-4416-b34b-296fc116594e_514x400.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!_MBb!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3c7c5797-a3be-4416-b34b-296fc116594e_514x400.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!_MBb!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3c7c5797-a3be-4416-b34b-296fc116594e_514x400.jpeg 424w, https://substackcdn.com/image/fetch/$s_!_MBb!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3c7c5797-a3be-4416-b34b-296fc116594e_514x400.jpeg 848w, https://substackcdn.com/image/fetch/$s_!_MBb!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3c7c5797-a3be-4416-b34b-296fc116594e_514x400.jpeg 1272w, 
https://substackcdn.com/image/fetch/$s_!_MBb!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3c7c5797-a3be-4416-b34b-296fc116594e_514x400.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!_MBb!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3c7c5797-a3be-4416-b34b-296fc116594e_514x400.jpeg" width="514" height="400" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3c7c5797-a3be-4416-b34b-296fc116594e_514x400.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:400,&quot;width&quot;:514,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:328418,&quot;alt&quot;:&quot;AI-generated image of a software engineer dodging a shark&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://substack.the-experimentalist.com/i/168523746?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3c7c5797-a3be-4416-b34b-296fc116594e_514x400.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="AI-generated image of a software engineer dodging a shark" title="AI-generated image of a software engineer dodging a shark" srcset="https://substackcdn.com/image/fetch/$s_!_MBb!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3c7c5797-a3be-4416-b34b-296fc116594e_514x400.jpeg 424w, https://substackcdn.com/image/fetch/$s_!_MBb!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3c7c5797-a3be-4416-b34b-296fc116594e_514x400.jpeg 848w, 
https://substackcdn.com/image/fetch/$s_!_MBb!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3c7c5797-a3be-4416-b34b-296fc116594e_514x400.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!_MBb!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3c7c5797-a3be-4416-b34b-296fc116594e_514x400.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p>Ever since the Windsurf debacle, I&#8217;ve found myself thinking even more about what the tech startup world is going to look 
like going forward.</p><div><hr></div><p><strong>Prev: <a href="https://substack.the-experimentalist.com/p/genai-reimagined">GenAI Reimagined</a> | Next: <a href="https://substack.the-experimentalist.com/p/economic-currents">Economic Currents</a></strong></p><div><hr></div><p>The problem I have with everything that happened relates to the employee situation. When people join an early-stage tech startup, usually they are pressured to accept a smaller salary in the expectation that a bigger payoff will happen later. That payoff event is referred to as the &#8220;exit.&#8221; The exit might be an acquisition by a bigger company, or it might be an IPO. Worst case (other than failure), the most significant Venture Capital (VC) investor merges together smaller deals that aren&#8217;t entirely failing but aren&#8217;t growing fast enough on their own. If that happens it probably won&#8217;t count as an exit and instead founders and employees will end up with a diluted position in the resulting entity.</p><p>However it happens, founders and those employees accepting the smaller salary eventually require an exit with enough money flying around or they won&#8217;t get that payoff. When it takes place, not everybody is an equal participant in the end game:</p><ul><li><p>Founders and all the various employees will have different percentages of equity.</p></li><li><p>Employees might have options instead of equity. These won&#8217;t have any meaningful value unless the per-share exit price is above the exercise price of the options.</p></li><li><p>Employees may have equity in different classes of common stock than the founders. This might put them in a different position for how payoffs are computed or what their respective positions mean for an IPO.</p></li><li><p>There may be vesting terms, where the employees aren&#8217;t able to benefit from that payoff until a certain period of time has passed since their employment date. 
If they quit or are fired before then, they may get nothing.</p></li><li><p>In the case of an IPO there may be additional requirements that founders and employees not liquidate stock until a certain amount of time has passed after the stock is first listed on the market.</p></li></ul><p>Standing in front of founders and employees and expecting to get paid first are all the pre-IPO investors. They will have established legal status ensuring that they get paid first, or at least part of their payment happens first. The mechanism is referred to as an investor preference. The more funding rounds that took place, the more investor preferences there are standing in front of both the founders and all the non-founder employees. Like I said, it can take a lot of money flying around before you make that extra buck.</p><p>When the Windsurf deal with Google happened, the effect was to gut the company of key founders and R&amp;D talent. I&#8217;ve been hunting and haven&#8217;t been able to find clear details on how that transaction took place, but since there hasn&#8217;t been news of lawsuits flying around it&#8217;s a pretty safe bet that the investors were all made whole with a reasonable profit. The founders and staff who moved to Google also would certainly have received something, or there would be no motivation to accept the deal.</p><p>That left all the remaining people sitting in the shell of a tech company. First they would have heard about a possible OpenAI deal, only to have their hopes crushed. Then Google appeared, and again their hopes were crushed. It had to have been a pretty brutal sequence of emotional let-downs. Fortunately, in a market hot for talent with anything vaguely like an AI pulse, a last-minute deal happened where Cognition AI bought what was left.</p><p>To Cognition AI&#8217;s credit it sounds like they are doing something reasonable to honor the equity or option positions of the Windsurf staff. 
This does not appear to be a full exit though, as there has been no reporting indicating that the staff are getting fistfuls of cash. What apparently has happened is that they are back to square one, waiting on their exit. Now it&#8217;ll be the Cognition AI exit, instead of the Windsurf exit.</p><p>The only improvement in the employee situation I&#8217;ve found is that any previous vesting terms from Windsurf or any vesting policies of Cognition AI have been waived, and those employees are now fully vested. It was a reasonable adjustment, and Cognition AI deserves some kudos for it, but I don&#8217;t think it changes how the original Windsurf founders seemingly threw all of those people under the bus by denying them participation in an exit. It&#8217;s a reduction in damage, but in my opinion it is not a re-writing of history that unwinds what very much smells like a breach of trust. Whether done with intent, or done in an utter absence of fiduciary peripheral awareness of your team, that experience must have really stung for those left behind.</p><p>The tech startup world has an established history of grimy outcomes, so it isn&#8217;t like this is a new situation. What may be new, however, is the pace at which these take place in a GenAI world. I believe this is something all participants (founders, employees, and investors) need to begin planning around. I hope to get into the implications in future articles, so stay tuned.</p>
<p><a href="https://substack.the-experimentalist.com/p/surfing-chaos">The Experimentalist : Surfing Chaos</a> &#169; 2025 by <a href="https://www.linkedin.com/in/reidmpinchback/">Reid M. Pinchback</a> is licensed under <a href="https://creativecommons.org/licenses/by-sa/4.0/">CC BY-SA 4.0</a></p>]]></content:encoded></item><item><title><![CDATA[Critiquing the METR Productivity Study]]></title><description><![CDATA[Everybody and their uncle is making lazy posts on a study of developer productivity]]></description><link>https://substack.the-experimentalist.com/p/critiquing-the-metr-productivity</link><guid isPermaLink="false">https://substack.the-experimentalist.com/p/critiquing-the-metr-productivity</guid><dc:creator><![CDATA[Reid M. 
Pinchback]]></dc:creator><pubDate>Tue, 15 Jul 2025 03:49:41 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!VDe5!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa9cac17c-1eca-4d0c-bbea-030229618207_514x400.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!VDe5!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa9cac17c-1eca-4d0c-bbea-030229618207_514x400.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!VDe5!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa9cac17c-1eca-4d0c-bbea-030229618207_514x400.jpeg 424w, https://substackcdn.com/image/fetch/$s_!VDe5!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa9cac17c-1eca-4d0c-bbea-030229618207_514x400.jpeg 848w, https://substackcdn.com/image/fetch/$s_!VDe5!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa9cac17c-1eca-4d0c-bbea-030229618207_514x400.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!VDe5!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa9cac17c-1eca-4d0c-bbea-030229618207_514x400.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!VDe5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa9cac17c-1eca-4d0c-bbea-030229618207_514x400.jpeg" width="514" height="400" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a9cac17c-1eca-4d0c-bbea-030229618207_514x400.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:400,&quot;width&quot;:514,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:299558,&quot;alt&quot;:&quot;AI-generated image of a productive software engineer&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://substack.the-experimentalist.com/i/168317537?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa9cac17c-1eca-4d0c-bbea-030229618207_514x400.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="AI-generated image of a productive software engineer" title="AI-generated image of a productive software engineer" srcset="https://substackcdn.com/image/fetch/$s_!VDe5!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa9cac17c-1eca-4d0c-bbea-030229618207_514x400.jpeg 424w, https://substackcdn.com/image/fetch/$s_!VDe5!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa9cac17c-1eca-4d0c-bbea-030229618207_514x400.jpeg 848w, https://substackcdn.com/image/fetch/$s_!VDe5!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa9cac17c-1eca-4d0c-bbea-030229618207_514x400.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!VDe5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa9cac17c-1eca-4d0c-bbea-030229618207_514x400.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture><div 
class="image-link-expand"></div></div></a></figure></div><p>Anybody who has seen my posts or comments on LinkedIn within the GenAI and ML space might have observed a few things:</p><ul><li><p>I&#8217;m neither strictly for nor against LLMs, but I&#8217;m not a fan of influencer-fed hype shoving things down a developer&#8217;s throat and in ignorance of the context in which that particular developer works. 
The standard measure of productivity should be productivity itself, not imagined proxies for productivity.</p></li><li><p>I believe we&#8217;ll need more than merely DNN (Deep Neural Network) LLMs as the core of the token generation and reasoning mechanism, but DNNs have made solid progress and thus are likely to stick around for a while.</p></li><li><p>I dislike crappy science, but I dislike crappy reporting about science even more.</p></li></ul><p>This is a post I had not intended to tackle so soon, but these crappy-reporting situations are coming up constantly. It doesn&#8217;t matter if you&#8217;re talking about the more mainstream media outlets, or the hives of self-promotion activity that you get in places like LinkedIn. It is past time people started at least skimming the papers they write about, instead of all these mindless confirmation-bias exercises we&#8217;re constantly exposed to. I don&#8217;t care if you&#8217;re anti-LLM, or if you are pro-LLM, but if your side is right it should still be right when you read and report on papers fairly. If you have to write lazy posts to justify your position, your position is weak sauce. Seriously, many influencers need to learn how to make a better case.</p><p>As a result, this is the start of an ongoing series I&#8217;m tagging &#8220;<a href="https://substack.the-experimentalist.com/t/research-critiques">Research Critiques</a>&#8221;. Collectively we should all be more informed than many influencers want us to be. The goal is to understand and draw intelligent conclusions in any direction a paper legitimately supports, not just mindlessly cheer for a team. I&#8217;m not going to reproduce the contents of these papers; I&#8217;ll point out some key items and nudge you into at least skimming the paper yourself. What you will be seeing here won&#8217;t be the result of days of painstaking reading; it will just be from actually bothering to look at all. 
If you do the same, you can form your own conclusions about a paper. Your conclusions should be yours, not mine, and not those of some random influencer who never even looked at the paper. The practice is worthwhile, and doesn&#8217;t have to take long.</p><p>Enter the METR study: <a href="https://metr.org/Early_2025_AI_Experienced_OS_Devs_Study.pdf">Measuring the Impact of Early-2025 AI on Experienced Open-Source Developer Productivity</a>. Everybody that is anti-LLM has been dog-piling on this with posts that barely do anything to properly lay out where this study fits, which conclusions are reasonable to draw, which ones aren&#8217;t reasonable, and which ones even the authors themselves are not attempting to argue.</p><h2>Summary</h2><p>The study involves 16 developers and 246 tasks with data recorded for 143 hours total with 10-second resolution (although data cleaning reduced that to 84 hours).</p><p>Why do you care? You care because depending on the question asked, you might look at this as providing either 16 samples, or 246 samples, or 16 x 246 = 3936 samples, or 84 x (60 * 60 / 10) = 30,240 samples.</p><p>The 16 is easy to focus on, and it is reasonable to note that it is pretty small, but not all studies involve large numbers of participants. That alone does not invalidate a research effort. Sometimes it just means you might categorize the effort as exploratory research that should be followed up with additional studies before strong conclusions are accepted as truth. Research studies are expensive, so there is value to early exploratory research that helps better frame efforts in later studies.</p><p>One thing this study reports is how much of the developer time was related to coding. I like that it does this; other papers I&#8217;ve seen did not. The answer here is 29%. 
This is good to know, because when you see posts about how much of an impact LLMs had on coding they usually only measure coding and not what that means per day of an employee&#8217;s work activity. If coding is 10% of your day, and an LLM helps you code 10% faster, then the LLM would only have roughly a 1% impact on your daily routine. Maybe there are other LLM impact numbers for the non-coding activity too, and if so they should be included and computed appropriately. Context matters.</p><p>The headline numbers you might see tossed around are these:</p><ul><li><p>Self-estimates made by developers suggested a 20% productivity gain.</p></li><li><p>Actual productivity measured by the researchers showed a 19% productivity loss.</p></li></ul><p>Before you run for your preferred flavor of torch or pitchfork, there is more to think about in the study.</p><h2>Funding</h2><p>It&#8217;s always worth taking a look at the backstory on funding for a paper. In the case of this study that&#8217;s provided in the <a href="https://metr.org/about#partnerships">Partnerships section of the METR About page</a>. The connections to the AI industry seem modest, comparable to the level of support that academic institutions would likely receive from many tech companies so that their respective products or services could be used without the deep pockets of a commercial budget.</p><h2>Study Design</h2><p>Study participants performed work on their own pre-existing open-source projects. There is some good and perhaps some bad to this, but it&#8217;s a reasonable exercise. It eliminates questions of seeing a benefit only when doing greenfield coding. The study is measuring what it means to bring GenAI into an existing non-trivial situation. For experienced engineers our greenfield opportunities tend to be on the light side, so I appreciate this aspect of the study design. It makes what is measured likely to have relevance to more of us. 
The bad angle, I guess I would argue, is that the open-source projects chosen were high-profile (23,000+ stars), and such projects usually come with strong curation practices that are much less typical in the codebases most engineers work on.</p><p>The participants varied in their previous LLM experience, and the study captures this. This is an area where the sample size of 16 feels like it has relevance. It seems a little on the light side to indicate how LLM experience influences the outcome. I would rather see something like at least 70 participants in 7 cohorts of 10, each cohort of increasing experience, in order to make really strong statements about how more or less AI experience mattered and to observe if the impact of that variable capped out in either direction. This observation could apply to other variables, and follow-up studies should consider picking a smaller number of variables but exploring them more deeply. I&#8217;m not criticizing the paper over this; I&#8217;m suggesting that readers (and influencers) shouldn&#8217;t draw conclusions bigger than the data.</p><p>The tooling used was primarily Cursor Pro paired with Claude Sonnet. Only 44% of the developers had previous experience with the IDE. I consider this a potentially weak aspect of the study, but to be fair the analysis does examine this and concludes it wasn&#8217;t an issue. I remain unconvinced and would rather see the matter explored in more detail. Any developer that has transitioned from a familiar to an unfamiliar IDE has experienced that you can lose a lot of productivity for a few weeks. With an almost 50/50 split in tooling experience, I would expect productivity between the two groups to change over time in different data subsets. For now I&#8217;m less clear on where the study is definitely measuring the impact of LLM usage. 
Several of the variables have that whiff of being confounders for other measurements, which suggests there is benefit in taking more care with them in more detailed studies.</p><h2>Some Findings</h2><p>There is a lot of material in the document, particularly in the Factor Analysis appendix (starting on page 18), so I want to pull out some items to show how this study can easily be viewed as &#8220;for&#8221; or &#8220;against&#8221; LLM use in coding, depending on how somebody decides to cherry-pick from it.</p><h3>Section: Forecasted vs Actual Impact</h3><p>This is a case where I believe you&#8217;re seeing the study through the lens of 16 x 246 = 3936 samples. Developers were not good at estimating the impact of AI, and experts not involved in the coding were even worse. I consider this an important result mostly for sifting through a lot of the online noise we see every day. Anecdotal reporting may be more driven by an emotional perspective than by time-clock reality.</p><p>I&#8217;m not saying anything about LLM-supported productivity here; I&#8217;m saying something about human beings and all that weird wetware we have between our ears. The way to measure productivity is to measure productivity; anecdotes that aren&#8217;t reporting an actual carefully-examined productivity measurement are not themselves a productivity measurement.</p><h3>Section: Extended Discussion</h3><p>Page 17 provides a table of issues that the authors are very definitely not trying to make statements about. 
One in particular I believe warrants noticing:</p><blockquote><p><strong>We do not provide evidence that:</strong><br>There are not ways of using existing AI systems more effectively to achieve positive speedup in our exact setting.</p><p><strong>Clarification:</strong><br>Cursor does not sample many tokens from LLMs, it may not use optimal prompting/scaffolding, and domain/repository-specific training/finetuning/few-shot learning could yield positive speedup</p></blockquote><p>To me this goes right back to my earlier comments about developer familiarity with the tooling. Developers that live with these tools evolve their CLAUDE.md file and care for it like a first-born child. Sometimes the level of detail can border on the silly because the prompting is as complicated as the artifacts that would be generated&#8230; but be that as it may, there is likely to be a big difference between those who live in this ecosystem versus those just recently introduced to it, because the former may have in essence macro&#8217;d better outcomes as part of their routine workflow.</p><p>The authors absolutely do acknowledge issues like this and do so more than once (including the section &#8220;Below-average use of AI tools&#8221; starting on page 23), but I feel like this one is pretty integral to a study on GenAI productivity and should have been controlled for. As a mental-model comparison, imagine a study on Python developer productivity where you observed the time to create properly-baked wheels but didn&#8217;t take note of which developers already had experience with all the rough-edged pain of Python library and application packaging versus which ones were being dropped into the soup for the first time.</p><h3>Section: High developer familiarity with repositories</h3><p>The findings here (starting on page 18) probably won&#8217;t be surprising to experienced coders. 
Developers already well-versed in a particular code issue weren&#8217;t particularly helped by an LLM while working on it. The assistance from an LLM was more material when dealing with something unfamiliar and for which developers needed reference material outside the current codebase.</p><p>It seems reasonable to note that if a company wants to predict the impact of LLM use on any given development team, it likely depends on the rate of novel change in the codebase. As efforts shift to maintenance and operations, we accrue more context and thus the opportunities may shrink or turn negative for LLMs to be a net performance gain over human knowledge and skill that is sitting there ready to be tapped. This might also apply to very deep expertise with specific technology components.</p><h3>Section: Low AI Reliability</h3><p>This material beginning on page 19 is worth noting. The developers in the study were being careful about PR quality in submissions, because these were codebases they were familiar with, invested in, and knew the community standards for. This is why I called out earlier that the 23,000+ star projects may not represent a typical engineering situation for study purposes.</p><p>Developer reporting is pretty clear about how much effort they had to put into the PRs, and so it isn&#8217;t immediately apparent why this wasn&#8217;t as strongly reflected in their self-reported estimates for LLM performance impact. I believe the effort finding does at least speak to the differences in performance anecdotes we hear contrasting experienced engineers and na&#239;ve coders using LLMs, which makes sense because the former group will have more habits for being less tolerant of low code quality. Whether that is good or bad is in the eye of the beholder, but it is going to be reflected in actual performance outcomes if engineers maintain that perspective. 
In the extreme case of popular open-source projects the engineers definitely must have such views unless the project community decides to revise its policies, but that influence may be somewhat moderated in other domains.</p><h2>Wrap-Up</h2><p>There is more you can dig into by <a href="https://metr.org/Early_2025_AI_Experienced_OS_Devs_Study.pdf">reading the study yourself</a>, and you should. I hope that at a minimum you will see the study has some substance, some limitations, seems fair in its perspective for the issues it chose to examine, but that it is fundamentally an exploratory effort and should be considered in that light. Influencer opportunism should not be the basis for deciding its merits, or for painting it as having more meaning than it deserves or that the authors themselves intended.</p><h2>Edits</h2><ul><li><p><strong>2025&#8211;07&#8211;28:</strong> Weakened the wording over concerns about the relationship between variables over time. I decided the concerns were overstated relative to reported findings unless I had data to examine.</p></li></ul>
<p><a href="https://substack.the-experimentalist.com/p/critiquing-the-metr-productivity">The Experimentalist : Critiquing the METR Productivity Study</a> &#169; 2025 by <a href="https://www.linkedin.com/in/reidmpinchback/">Reid M. Pinchback</a> is licensed under <a href="https://creativecommons.org/licenses/by-sa/4.0/">CC BY-SA 4.0</a></p>]]></content:encoded></item><item><title><![CDATA[GenAI Reimagined]]></title><description><![CDATA[A counterculture faces off against... something]]></description><link>https://substack.the-experimentalist.com/p/genai-reimagined</link><guid isPermaLink="false">https://substack.the-experimentalist.com/p/genai-reimagined</guid><dc:creator><![CDATA[Reid M. 
Pinchback]]></dc:creator><pubDate>Mon, 07 Jul 2025 07:09:52 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!FCH-!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7007fa92-0dc2-46a0-b5d8-f612e5c5baba_514x400.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!FCH-!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7007fa92-0dc2-46a0-b5d8-f612e5c5baba_514x400.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!FCH-!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7007fa92-0dc2-46a0-b5d8-f612e5c5baba_514x400.jpeg 424w, https://substackcdn.com/image/fetch/$s_!FCH-!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7007fa92-0dc2-46a0-b5d8-f612e5c5baba_514x400.jpeg 848w, https://substackcdn.com/image/fetch/$s_!FCH-!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7007fa92-0dc2-46a0-b5d8-f612e5c5baba_514x400.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!FCH-!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7007fa92-0dc2-46a0-b5d8-f612e5c5baba_514x400.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!FCH-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7007fa92-0dc2-46a0-b5d8-f612e5c5baba_514x400.jpeg" width="514" height="400" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7007fa92-0dc2-46a0-b5d8-f612e5c5baba_514x400.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:400,&quot;width&quot;:514,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:288402,&quot;alt&quot;:&quot;AI-generated image of a counter-culture demonstrator&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://substack.the-experimentalist.com/i/167695971?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7007fa92-0dc2-46a0-b5d8-f612e5c5baba_514x400.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="AI-generated image of a counter-culture demonstrator" title="AI-generated image of a counter-culture demonstrator" srcset="https://substackcdn.com/image/fetch/$s_!FCH-!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7007fa92-0dc2-46a0-b5d8-f612e5c5baba_514x400.jpeg 424w, https://substackcdn.com/image/fetch/$s_!FCH-!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7007fa92-0dc2-46a0-b5d8-f612e5c5baba_514x400.jpeg 848w, https://substackcdn.com/image/fetch/$s_!FCH-!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7007fa92-0dc2-46a0-b5d8-f612e5c5baba_514x400.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!FCH-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7007fa92-0dc2-46a0-b5d8-f612e5c5baba_514x400.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture><div 
class="image-link-expand"></div></div></a></figure></div><p>It&#8217;s been a week since my initial flurry of posts, and it hasn&#8217;t been due to a lack of effort with research and writing. 
On the contrary, I&#8217;ve been exploring different angles in the hopes they might better illuminate what we &#8212; the great unwashed masses of skilled professionals &#8212; need to do to navigate a rapidly changing world.</p><div><hr></div><p><strong>Prev: <a href="https://substack.the-experimentalist.com/p/on-the-turn-of-a-phrase">On the Turn of a Phrase</a> | Next: <a href="https://substack.the-experimentalist.com/p/surfing-chaos">Surfing Chaos</a></strong></p><div><hr></div><p>As I&#8217;m a data guy, my initial hopes were to find data that would tell a story. Not data that could be arm-wrestled into just supporting a narrative I&#8217;d pre-determined, but real data taken at face value and used to tease out some &#8220;ah hah&#8221; moments. While I found some interesting economic views on the human condition, none fit the bill.</p><p>I&#8217;ll skip a long backstory and sum up the research. The TL;DR according to <a href="https://gitlab.com/The-Experimentalist/article_support_2025/-/blob/main/article/2025-07-07-genai-reimagined/UBS-GWR-2025.pdf?ref_type=heads">the 2025 UBS Global Wealth Report</a> (which examines 2024 data) is:</p><ul><li><p><strong>The top 1.6% of the adult population controls 48.1% of the wealth.</strong> These are people with a net wealth of $1 million USD and up. The overwhelming majority would be nearer the $1M level as the pyramid gets extremely narrow as you approach the tip.</p></li><li><p><strong>The bottom 40.7% of the adult population controls 0.6% of the wealth.</strong> The net wealth per adult in the band is $10 thousand USD and down.</p></li></ul><p>What I actually wanted was data showing the number of decision makers controlling the economic fate of a significant portion of the population so that I could then reason about how GenAI-motivated layoffs might play out. I gradually realized I was facing a level of effort comparable to writing a Master&#8217;s thesis. 
Someday perhaps, but not today.</p><h2>The Step Back</h2><p>This began with something that has nagged at me for a while about GenAI. I couldn&#8217;t put a finger on it exactly, but it was starting to seem that perhaps data wasn&#8217;t the leverage I was looking for. It was entirely possible I was hunting down a dynamic that data alone wouldn&#8217;t be good at capturing, or at least not yet.</p><p>I&#8217;ve been slinging a keyboard professionally for a long time. Multiple decades, and that multiple is not the number &#8220;two.&#8221; I&#8217;ve seen many different technology transitions, and I&#8217;ve seen industry reactions to those transitions. Booms and busts, marketing hype and book publishing waves, software ports, environment migrations, second-system re-engineering efforts, and process overhauls galore. I&#8217;ve seen just about all there is to toss at the industry since we were stumbling around on early Unix workstations and the advent of PCs. GenAI isn&#8217;t like the waves I&#8217;ve seen before.</p><p>The light bulb moment, for me at least, came from deciding what wasn&#8217;t relevant to that nagging feeling.</p><ul><li><p><strong>It wasn&#8217;t about the technology.</strong> As technology complexity goes, LLMs barely rate. It is a simple technology concept done at extremely large scale. 
Sure, there is ongoing evolution and improvement, and now more mathematical examination of how training works and the parameter topologies that might induce, but I could list dozens, maybe hundreds, of examples in software and hardware that have much more subtle or complex characteristics once you set mere scale aside.</p></li><li><p><strong>It wasn&#8217;t about the range of application.</strong> Every programming language compiler or runtime, every well-developed application framework, every expansive library ecosystem, collectively have massive range of application.</p></li><li><p><strong>It wasn&#8217;t even the economics</strong>, although excluding that entirely didn&#8217;t seem quite right either. Not that the economics of the LLM doesn&#8217;t tell a story, but any rational person in this space knows the story is complex and very context-dependent. Have you seen the world around us? How often do people get truly moved by the complex and context-dependent? It&#8217;s the social kryptonite of the millennium.</p></li></ul><p>What was left was&#8230; something else. And I think that something else slithered in under the cover of other stress factors in the economic and political zeitgeist.</p><p>More than any other technology-related event other than perhaps cryptocurrency, for some reason, GenAI has people &#8220;living in their feels.&#8221; I&#8217;m not saying people aren&#8217;t being intelligent about GenAI in either direction of the various debates. I&#8217;m pointing out that the volume knob on emotional energy is cranked up very, very high and that alone is an artifact worth taking note of.</p><p>It isn&#8217;t that we never see emotion in technology issues, but historically they have been confined to small camps. Camps that most of us, truth be told, learned to avoid or tune out because they were just so obviously annoying, over-the-top, and not productive in helping us with our work at hand. 
The more experienced in the technology space you are, usually the more jaded you get because you learn through experience that almost all problems can be approached with many choices of tools, plus employment environments typically bias the tool choice and you either get with the program or you get a different job. </p><p>What we have instead are very broad-based coalitions advocating GenAI as not only the &#8220;one true way&#8221; but further reinforced with the message that the world divides into camps that either &#8220;get it&#8221; or &#8220;don&#8217;t get it.&#8221; This is not really the history of engineering or technology as lived by practitioners, where experience has been &#8220;found a better wrench for times when a wrench is useful, but tomorrow I may need a saw.&#8221;</p><p>There is, however, a dynamic in history that does cleanly fit this particular emotion-laden pressure for change plus division versus conformity. The counterculture.</p><h2>Hoping for Change</h2><p>We hear language like this in periods when part of the population is pushing to opt out of the social framework that preceded them, because that framework is seen as no longer working or insufficiently flexible to address a changing world. <a href="https://en.wikipedia.org/wiki/Counterculture">From Wikipedia</a>:</p><div class="pullquote"><p>A <strong>counterculture</strong> is a culture whose values and norms of behavior differ substantially from those of mainstream society, sometimes diametrically opposed to mainstream cultural mores. A countercultural movement expresses the ethos and aspirations of a specific population during a well-defined era. When oppositional forces reach critical mass, countercultures can trigger dramatic cultural changes. Countercultures differ from subcultures.</p></div><p>For me this sidelines debates over matters like GenAI effectiveness, or when AGI will manifest, or if LLMs will provide the ultimate final AI model. 
None of those things may actually be what is going on. If you think about it, if you were halo-dropped into July 2025 without knowing anything that had happened in the last couple of years, those factors would all seem a strange focus for emotional energy. I think that&#8217;s because they aren&#8217;t what the emotional energy is about at all. It&#8217;s people struggling with the status quo, and concluding &#8212; at least subconsciously &#8212; the status quo is badly wanting.</p><p>This could explain why people of such diverse backgrounds can find themselves unifying energetically under a common technology banner. It isn&#8217;t about &#8220;oh I can be lazy, and make a gadget do that job for me.&#8221; It may simply be about seeing the novelty and potential power of a tool to extend a little hope, when the technology and large-corporation employment baseline was for many people already a system of slowly-shrinking economic hope and disempowerment. In this respect the cryptocurrency community, particularly Bitcoin &#8220;maximalists,&#8221; may find common cause. </p><p>It is a bit fluid what this common cause will face off against. Big tech? Wall Street generally? Wealth inequality in some less-specific way that doesn&#8217;t have a clearly defined opponent?</p><p>Ultimately, GenAI may not turn out to be the final solution for empowerment any more than most crypto efforts were&#8230; but perhaps both are the start of something much bigger because part of society has tasted a little hope, and demands more. If that is the legitimate backstory, then the challenge will be to avoid the usual Wall Street juggernaut co-opting any benefit before change can form solid roots in the lower portions of the wealth pyramid. 
The lower 98.4% could do with a little wind beneath their wings.</p><p><a href="https://substack.the-experimentalist.com/p/genai-reimagined">The Experimentalist : GenAI Reimagined</a> &#169; 2025 by <a href="https://www.linkedin.com/in/reidmpinchback/">Reid M. Pinchback</a> is licensed under <a href="https://creativecommons.org/licenses/by-sa/4.0/">CC BY-SA 4.0</a></p>]]></content:encoded></item><item><title><![CDATA[On the Turn of a Phrase]]></title><description><![CDATA[The power to express solutions, and the implications in a prompt-driven world]]></description><link>https://substack.the-experimentalist.com/p/on-the-turn-of-a-phrase</link><guid isPermaLink="false">https://substack.the-experimentalist.com/p/on-the-turn-of-a-phrase</guid><dc:creator><![CDATA[Reid M. 
Pinchback]]></dc:creator><pubDate>Sun, 29 Jun 2025 09:06:28 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!EXMX!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F423cf4d2-7213-44a5-be3f-a01444aa0da4_514x400.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!EXMX!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F423cf4d2-7213-44a5-be3f-a01444aa0da4_514x400.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!EXMX!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F423cf4d2-7213-44a5-be3f-a01444aa0da4_514x400.jpeg 424w, https://substackcdn.com/image/fetch/$s_!EXMX!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F423cf4d2-7213-44a5-be3f-a01444aa0da4_514x400.jpeg 848w, https://substackcdn.com/image/fetch/$s_!EXMX!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F423cf4d2-7213-44a5-be3f-a01444aa0da4_514x400.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!EXMX!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F423cf4d2-7213-44a5-be3f-a01444aa0da4_514x400.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!EXMX!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F423cf4d2-7213-44a5-be3f-a01444aa0da4_514x400.jpeg" width="514" height="400" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/423cf4d2-7213-44a5-be3f-a01444aa0da4_514x400.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:400,&quot;width&quot;:514,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:425309,&quot;alt&quot;:&quot;AI-generated image of a library&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://substack.the-experimentalist.com/i/167057709?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F423cf4d2-7213-44a5-be3f-a01444aa0da4_514x400.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="AI-generated image of a library" title="AI-generated image of a library" srcset="https://substackcdn.com/image/fetch/$s_!EXMX!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F423cf4d2-7213-44a5-be3f-a01444aa0da4_514x400.jpeg 424w, https://substackcdn.com/image/fetch/$s_!EXMX!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F423cf4d2-7213-44a5-be3f-a01444aa0da4_514x400.jpeg 848w, https://substackcdn.com/image/fetch/$s_!EXMX!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F423cf4d2-7213-44a5-be3f-a01444aa0da4_514x400.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!EXMX!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F423cf4d2-7213-44a5-be3f-a01444aa0da4_514x400.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button 
tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Over the years, there is an idea I find myself increasingly coming back to. It feels like it should have been a quotation from somebody, but I&#8217;ve never found a match. 
More likely I&#8217;ve heard a few things akin to it, and the mind compressed them into a single memory engram much the way a Large Language Model (LLM) might.</p><div class="pullquote"><p>We cannot think thoughts for which we lack the language.</p></div><p>I believe there is also a corollary to this, with slightly different emphasis:</p><div class="pullquote"><p>The language we use frames the limitations in our thinking.</p></div><p>The point to get across between these is that, in a world of hard problems, we can&#8217;t be surprised if those problems remain unsolved if we are fundamentally compromised in our ability to discuss them.  If you can&#8217;t even pull off a high-quality conversation with yourself in a mirror over something, you&#8217;re not going to create, or advocate for, or prompt-engineer workable solutions. Even worse, you could very well be trapped in a hamster wheel where some little pool of existing thoughts drawn from insufficient language just replays on a mental loop to no effect.</p><div><hr></div><p><strong>Prev: <a href="https://substack.the-experimentalist.com/p/the-foreshadowing">The Foreshadowing</a> | Next: <a href="https://substack.the-experimentalist.com/p/genai-reimagined">GenAI Reimagined</a></strong></p><div><hr></div><p>If those hamster-thoughts were the right words, the right ideas, the obvious solutions, then circumstances would already have changed. You would have &#8220;done the thing&#8221;, or the obviousness of the solution would have been shared by a billion other people and resulted collectively in the required change. Much like the Zen parable of the overflowing tea cup, a lot of what we hold on to is blockage. It distracts us from the value of just clearing the slate, using fresh eyes, and reconceiving the situation anew.</p><h2>Brainstorming</h2><p>The difficulty, and the potential, of thinking up something new was brought home to me years ago when I was working in Academic Computing Services at MIT. 
The broader Information Systems department that we fit within had gone through years of various attempts to reconceive and improve upon the mission. We used to joke that it was time to start asking the VP&#8217;s executive assistant to hide his books when he&#8217;d go on vacation, because when the VP returned it would always be time for the next methodology incursion to be trained in. Unfortunately, we&#8217;d always just end up trying to fit the old work &#8212; and thus the old thinking &#8212; into the new scheme. Sure, a few bits of vocabulary changed and we added some soft skills, but it never rose to the level of a language that spanned all the players and moved activity (and the economics thereof) in a new direction.</p><p>The moment I really, viscerally, understood what the problem was, came from an activity we conducted with 7 or 8 faculty members. It was a brainstorming exercise where we asked them to go off and write a description of what they wished they saw as a future educational experience benefiting from technology. The instructions were explicit. They weren&#8217;t being asked to talk about routinely applying what we were already supplying them. They were being given the opportunity to &#8220;think out of the box&#8221; and provide the vision we might entirely lack to drive good long-term planning.</p><p>All of the educators in question were experienced teachers, their courses were well-known and well-regarded. They varied by age, department, tenure, etc. In spite of their deep domain expertise coupled with substantial hands-on pedagogical application, only two provided missing vision. Everything else described was an obvious extension of where we were at, but two people were able to set that all aside and articulate broader aspiration instead of just characterizing the results of existing momentum. 
The other responses, while professional, could have been emailed in as one-liners saying &#8220;increase budget, then do more of the same.&#8221;</p><h2>Literal Shackles</h2><p>A sound bite about the US literacy rate that gets increasingly recycled is &#8220;54% of adults read below a sixth grade level.&#8221; Like many such pithy phrases, it gets game-of-telephoned and the original context gets lost. It is worth developing the habit of digging a little deeper to get accurate context.</p><blockquote><p>Sidebar: Apologies to global readers not versed in the dynamics of American education and politics, but I need to segue here to government data I have more context on. If anybody ever wants to collaborate on showing similarities or differences in other parts of the globe, I&#8217;m open to that. Anyways, editorial sidebar concluded, on with the story.</p></blockquote><p>As best as I can determine, the number apparently originates from an earlier version <a href="https://www.apmresearchlab.org/10x-adult-literacy">of a 2022 article by the American Public Media Research Lab</a>, where it misconstrued PIAAC literacy levels and 2017 data in terms of school grades. They have since removed that wording from the article. You may see claims the number is relevant to 2025, but from my digging it appears that is just because the <a href="https://www.thenationalliteracyinstitute.com/2024-2025literacy-statistics">National Literacy Institute has a web page</a> which makes mention of those older numbers. Also, as is often the case in our politics-sensitive world, the summaries of the results get administration spin (from either party) to make them sound more palatable than they arguably should be. The spin version was &#8220;79% of adults are literate&#8221;. 
What that practically meant was &#8220;79% of adults would not fail at ordering a cheese burger instead of a bacon burger from a menu without food pictures; the worst two literacy categories were the remaining 21%.&#8221;  With the hearsay data origin story clarified, let&#8217;s move on from the stale information.</p><p>The PIAAC definitions I think provide a clearer picture. <a href="https://nces.ed.gov/surveys/piaac/measure.asp">PIAAC reporting works in levels</a> (click the link and expand the Cycle 2 Literacy Proficiency Levels), not school grades, and these definitions appear stronger for considering the tasks a person might routinely perform adequately in daily life. There are five levels, and a sixth &#8220;below level 1&#8221; catch-all for anybody with less literacy than the defined five. The descriptions below are my mulling on the implications, not what PIAAC states, organized into a few bands that I find useful for current purposes. </p><ul><li><p><strong>Below Level 1, and Level 1</strong>: not participating in the information economy, and likely very minimal consumers of any technology or media that depends on reading or writing text. Doesn&#8217;t read books, likely doesn&#8217;t own any.<br><em>The literacy band unlikely to use AI in a meaningful way, even accidentally.</em></p></li><li><p><strong>Levels 2 and 3</strong>: routine light consumers of the information economy. The text-based technology interactions are kept modest, but used as needed. Might read simple books, or may just remember having had to do so in school. Could have a few coffee-table books and some pulp fiction or a religious text.<br><em>The literacy band that would accidentally use AI because a vendor wired NLP (natural language processing) into their UX, but would have limited awareness of AI otherwise.</em></p></li><li><p><strong>Levels 4 and 5</strong>: active participants in the information economy to varying degrees. 
The levels of technology use would depend upon the career domain, but technology and media consumption are likely routine both on and off the job. Both reads and writes text for an assortment of reasons. Probably owns multiple books, some of which could be career-related.<br><em>The only literacy band that could or would intentionally use AI, as interacting with an LLM creatively necessitates a level of comfort with the written word.</em></p></li></ul><p>Particularly interesting are the differences between Levels 4 and 5, which I will quote verbatim:</p><blockquote><p>Adults above Level 4 may be able to reason about the task itself, setting up reading goals based on complex and implicit requests. They can presumably search for and integrate information across multiple, dense texts containing distracting information in prominent positions. They are able to construct syntheses of similar and contrasting ideas or points of view; or evaluate evidence-based arguments and the reliability of unfamiliar information sources. Tasks above Level 4 may also require the application and evaluation of abstract ideas and relationships. Evaluating reliability of evidentiary sources and selecting not just topically relevant but also trustworthy information may be key to achievement.</p></blockquote><p>If you consider the statement in a world of AI, what this really is describing is those people with the capacity to be competent in constructing LLM prompts and evaluating the generated results. This is the level of language necessary to &#8220;think thoughts&#8221; at a meta-level in order to orchestrate a GenAI process, determine if what comes back has utility, and iterate correctively until the task is complete. Anybody with less than Level 5 is increasingly at a disadvantage, and I would argue that the reduced facility with language makes that disadvantage very difficult to avoid unless you ascribe near-mythic powers to LLMs. 
If LLMs indeed had such powers, it would beg the question of why low-skill human interactions were of value to task execution in the first place. More likely is people in Level 4 feeling pressure to up their game.</p><p>Take a moment to line that up with AI vendor marketing on how it will democratize knowledge and skill. Add in all the people who make online boasts about what LLMs enable for them. Some of it will indeed be true, so for the sake of argument let&#8217;s stipulate to all of that potential and be full-throated in our optimism. Imagine a big, warm, fuzzy percentage of humanity that will experience this brave, new, ever-expanding world of possibility&#8230; and hold my beer. I&#8217;m going to go and fetch the literacy data.</p><h2>The Data</h2><p>I&#8217;ve set up <a href="https://gitlab.com/The-Experimentalist/article_support_2025">a Git repository</a> to provide two Excel workbooks that I downloaded via the <a href="https://nces.ed.gov/surveys/piaac/ideuspiaac/">PIAAC Data Explorer</a>:</p><ul><li><p><a href="https://gitlab.com/The-Experimentalist/article_support_2025/-/blob/main/article/2025-06-29-on-the-turn-of-a-phrase/PIAAC-2012-2023-by-sex-and-age.xlsx?ref_type=heads">PIAAC-2012-2023-by-sex-and-age.xlsx</a>: The data over three reporting periods, with some break-down by sex or by 10-year bands of age range.</p></li><li><p><a href="https://gitlab.com/The-Experimentalist/article_support_2025/-/blob/main/article/2025-06-29-on-the-turn-of-a-phrase/PIAAC-2023-aggregate.xlsx?ref_type=heads">PIAAC-2023-aggregate.xlsx</a>: Summarized numbers from the most recent reporting period, without any demographic break-down.</p></li></ul><p>The key findings from that second file are:</p><pre><code><strong>Year/Study</strong>&#9;  PIAAC 2023
&#9;
<strong>Jurisdiction</strong>&#9;  U.S. Household
&#9;          (16-74 years old)
&#9;
<strong>Proficiency&#9;  Percentage</strong>
Below Level 1&#9;    11.99
Level 1&#9;            16.74
Level 2&#9;            29.36
Level 3&#9;            29.81
Level 4&#9;            10.88
Level 5&#9;             1.22</code></pre><p>There is the reality. That third band combining Levels 4 and 5, representing people who would have some hope of participating meaningfully in an AI-heavy economy: 12 percent.  And of that 12 percent, only a little over 1 percent even have the language potential to excel at it, and mere language proficiency alone would never be enough. For the people who have hope, 1 in 12 start with some minimal skill suggesting they might warrant that hope, and the other 11 are scrambling to play catch-up from a position of measurable but possibly repairable deficit. The other 88 don&#8217;t even know the race track exists.</p><p>Let that rattle around the brain pan for a bit. For those holding the most optimistic view on GenAI, if current literacy patterns continue, then we could face an outcome where 12 percent of the population go in one direction, and 88 percent go in an entirely different one. The only thing that would easily cast the situation in a less dire light would be if something close to 88% of the population currently worked in jobs that don&#8217;t have the potential for being cannibalized by AI, such as some blue-collar skilled trades and most unskilled labor.</p><p>Unfortunately, <a href="https://www.dpeaflcio.org/factsheets/the-professional-and-technical-workforce-by-the-numbers">according to the AFL-CIO, something like 58%</a> of the US workforce is estimated to currently be white-collar.</p><p>It is the basis for a painful transition process into a future combined economic and political divide that will make the current landscape seem a cake-walk by comparison.</p><p>This is one of several reasons why I&#8217;ll keep saying that we need better options.  The search is and will be time-consuming. 
If you&#8217;re finding the material at all compelling, you can vote for more with your dollars on Substack!</p><p><a href="https://substack.the-experimentalist.com/p/on-the-turn-of-a-phrase">The Experimentalist : On the Turn of a Phrase</a> &#169; 2025 by <a href="https://www.linkedin.com/in/reidmpinchback/">Reid M. Pinchback</a> is licensed under <a href="https://creativecommons.org/licenses/by-sa/4.0/">CC BY-SA 4.0</a></p>]]></content:encoded></item><item><title><![CDATA[The Press Room]]></title><description><![CDATA[A series on the mechanics, practices, and policies of The Experimentalist]]></description><link>https://substack.the-experimentalist.com/p/the-press-room</link><guid isPermaLink="false">https://substack.the-experimentalist.com/p/the-press-room</guid><dc:creator><![CDATA[Reid M. 
Pinchback]]></dc:creator><pubDate>Fri, 27 Jun 2025 22:46:28 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!Hk8I!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc0546fc4-ac6a-47fe-b9c1-9da36fb77f0e_514x400.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Hk8I!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc0546fc4-ac6a-47fe-b9c1-9da36fb77f0e_514x400.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Hk8I!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc0546fc4-ac6a-47fe-b9c1-9da36fb77f0e_514x400.jpeg 424w, https://substackcdn.com/image/fetch/$s_!Hk8I!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc0546fc4-ac6a-47fe-b9c1-9da36fb77f0e_514x400.jpeg 848w, https://substackcdn.com/image/fetch/$s_!Hk8I!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc0546fc4-ac6a-47fe-b9c1-9da36fb77f0e_514x400.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!Hk8I!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc0546fc4-ac6a-47fe-b9c1-9da36fb77f0e_514x400.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Hk8I!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc0546fc4-ac6a-47fe-b9c1-9da36fb77f0e_514x400.jpeg" width="514" height="400" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c0546fc4-ac6a-47fe-b9c1-9da36fb77f0e_514x400.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:400,&quot;width&quot;:514,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:322920,&quot;alt&quot;:&quot;AI-generated image of a hand printing press&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://substack.the-experimentalist.com/i/167003300?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc0546fc4-ac6a-47fe-b9c1-9da36fb77f0e_514x400.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="AI-generated image of a hand printing press" title="AI-generated image of a hand printing press" srcset="https://substackcdn.com/image/fetch/$s_!Hk8I!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc0546fc4-ac6a-47fe-b9c1-9da36fb77f0e_514x400.jpeg 424w, https://substackcdn.com/image/fetch/$s_!Hk8I!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc0546fc4-ac6a-47fe-b9c1-9da36fb77f0e_514x400.jpeg 848w, https://substackcdn.com/image/fetch/$s_!Hk8I!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc0546fc4-ac6a-47fe-b9c1-9da36fb77f0e_514x400.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!Hk8I!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc0546fc4-ac6a-47fe-b9c1-9da36fb77f0e_514x400.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft 
pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Writing online faces challenges, as does the reading of the resulting product.</p><ul><li><p>Readers have fair claim to a need to understand where an author is coming from, what their personal or professional stake or biases may be, in order to evaluate the integrity and relevance of the content.</p></li><li><p>Readers increasingly crave authenticity and less emotional button-pushing than AI-crafted material is often designed for.</p></li><li><p>Accuracy and research matter, for which AI can be both a support and hindrance. 
</p></li><li><p>It is a worthwhile community service with negligible competitive risk to help others see what they might face in their own future literary efforts.</p></li></ul><div><hr></div><p><strong>Next: <a href="https://substack.the-experimentalist.com/p/newsletter-refurb-2025-08-08">Newsletter Refurb 2025-08-08</a></strong></p><div><hr></div><h2>My Biases</h2><p>My origin begins with family that for the most part got dirt under their fingernails. Growing up was filled with stories of the life and hard decisions forced by the Great Depression. Most of the extended family and friends were firmly blue-collar, some were white-collar middle class, and some farmers were sprinkled into the mix. Politics were varied and not party-affiliated. Family religious views were all over the map: Anglican, Catholic, Mormon, Salvation Army, Jehovah&#8217;s Witnesses, Swedish Presbyterian, and over-the-airwaves televangelist; tossed into that I was the lone Daoist/Buddhist-leaning kid who was into martial arts.</p><p>One of the things I value about writing is that I usually sound less energetic than what you might hear when my Scottish, Irish, and English roots combine with growing up around teamsters (which I was myself for summer jobs), carnies, bus drivers, and longshoremen, which then later morphed into many years of the Massachusetts experience. If I get up a good head of steam, I fully admit I can give Bill Barr a run for his money. I try to keep a tight leash on that. Nobody needs another Ted sequel. The delete button really, really helps.</p><p>I was one of only a couple of family members that pursued a post-secondary education. The results were&#8230; mixed. I&#8217;d be on the Dean&#8217;s honor roll one year, and trying not to flunk out the next. Somehow in spite of that, I ended up working maybe a total of 18 or 19 years within academia. 
Initially it was just undergraduate roles in development (fundraising) and cognitive psych research software development, then post-graduation within distance ed, academic computing support, academic administrative computing, and ultimately an affiliated cancer genomics research group.  No, I don&#8217;t also do juggling and card tricks on weekends at kids&#8217; birthday parties, but it would be an understandable guess since I just summarized only half of the resume.</p><p>All of that got tossed into the blender and resulted in valuing progressive goals as an aspiration for individual freedom and opportunity in our shared human experience, fiscal conservatism so there is some thin hope that the money flow alone doesn&#8217;t undermine long-term attainment of those aspirations, and I think a pretty pragmatic expectation that the confluence of education, economics, politics, and religion will always combine to make a common meeting of minds very challenging for society.</p><p>In short, I&#8217;m probably not really on anybody&#8217;s team. Y&#8217;all get to be equally skeptical of anything I have to say. I&#8217;ve never really got the &#8220;team&#8221; thing; I prefer to dig into each issue as something that has specific context, its own details or history, and I try to accept whatever conclusions the context and data seem consistent with.  I also remind myself I usually get three things wrong every day before I&#8217;m even out of bed; given time I hope to cut that down to two.</p><h2>AI Position</h2><p>Much like I don&#8217;t get excessive team tribalism, I also don&#8217;t care about fanboys on really any topic. Setting aside the substance of AI, there is a contingent that only cares that you share their emotional energy. The more I see of it, the more I gravitate away since life is rarely lived in the extremes. My position has, very simply, varied over time. I expect it will continue to do so. 
Some initial reactions were wrong, some so far are not, and time will continually force me to re-evaluate.</p><h3>Article Content</h3><p>I do not have any interest in using AI to generate article content for the simple reason that this is what I enjoy.  I cannot imagine a more self-defeating behavior than to hand over the thing I like doing, and instead spend more time on other things I don&#8217;t like doing.  This is not a criticism of others&#8217; use of AI in writing. It is just a personal choice on where my finite human experience will be devoted. I do not view the act of writing as merely a transactional endeavor. For me the process is just as much the point as the published result, and I&#8217;m the one and only caster of that particular vote.</p><p>What I do use AI for is generating images to establish the mood of each piece. If I had the budget to fund a fractional graphic artist or cartoonist role, I would much prefer that.  Time will tell if I get there.  For now, I value how AI helps me bootstrap a small refinement in user experience.</p><h3>Advertising</h3><p>Not all acts of publishing words are the same. Keeping your content visible means navigating through social media algorithms, and authors have no control over the heuristics those services contain.  One of the more pragmatic ways to deal with that is to create shorter posts on various sites which function as advertising, redirecting users to the primary platforms: Substack and Medium in my case.  I&#8217;m not currently using AI for advertising posts, but it is entirely possible that one day I will. Whether you craft advertisements manually or via AI, they are increasingly a highly artificial exercise and there is little point in pretending otherwise.</p><h3>Research</h3><p>There are understandable concerns about the accuracy of research done via AI. On that front, I have good news. I&#8217;m a digital packrat. If there was a Digital Packrat Olympics I would have a string of gold medals.  
I&#8217;ve been squirreling away links, notes, projects, and dumps of those into Git repositories for years.  I add dozens of tidbits every single day.  There are even a few boxes of paper reports tucked away in corners, and don&#8217;t even get me started on the books which total approximately a half ton by weight. I know because I&#8217;ve moved those boxes.</p><p>Given the time it would take to write articles on even a tenth of all that, I&#8217;ve got a multi-decade head start before I&#8217;m exposed to how online content degrades via AI-driven data recycling.  If I have a moat for a competitive advantage in writing, information hoarding may very well turn out to be it. Enjoy.  That said, if I ever did use AI to help me with article content, it could be to digest this immense mound of trivia and help me turn it into starting points for code or topics. Once fully incorporated via LLM or RAG, I&#8217;d be back to doing my thing my way.</p><h2>Wrapping Up</h2><p>I&#8217;ll have other posts about the mechanics of newsletter publishing and how some decisions were made. There is also the potential for others to contribute their own articles and so I&#8217;ll be speaking to that process as well.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://substack.the-experimentalist.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">The Experimentalist is a reader-supported publication. 
To receive new posts and support my work, consider becoming a free or paid subscriber.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p><a href="https://substack.the-experimentalist.com/p/the-press-room">The Experimentalist : The Press Room</a> &#169; 2025 by <a href="https://www.linkedin.com/in/reidmpinchback/">Reid M. Pinchback</a> is licensed under <a href="https://creativecommons.org/licenses/by-sa/4.0/">CC BY-SA 4.0</a></p>]]></content:encoded></item></channel></rss>