<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="3.10.0">Jekyll</generator><link href="https://sprestridge.net/feed.xml" rel="self" type="application/atom+xml" /><link href="https://sprestridge.net/" rel="alternate" type="text/html" /><updated>2026-05-22T20:44:02+00:00</updated><id>https://sprestridge.net/feed.xml</id><title type="html">On the edge of tomorrow</title><subtitle>The latest incarnation of my little piece of the world wide web. Writing about anything that interests me...I&apos;m a business consultant, lacrosse addict, runner, beer geek, and artist - in no particular order.
</subtitle><entry><title type="html">2026 Monte Carlo Final Four</title><link href="https://sprestridge.net/data%20science/python/lacrosse/2026/05/22/Monte-Carlo-Lacrosse.html" rel="alternate" type="text/html" title="2026 Monte Carlo Final Four" /><published>2026-05-22T15:00:00+00:00</published><updated>2026-05-22T15:00:00+00:00</updated><id>https://sprestridge.net/data%20science/python/lacrosse/2026/05/22/Monte-Carlo-Lacrosse</id><content type="html" xml:base="https://sprestridge.net/data%20science/python/lacrosse/2026/05/22/Monte-Carlo-Lacrosse.html"><![CDATA[<h3 id="predicting-the-championship-from-the-final-four-field">Predicting the championship from the final four field</h3>

<p>I am revisiting last year’s Monte Carlo posts on predicting the outcomes of lacrosse games (<a href="/data%20science/python/lacrosse/2025/05/18/Monte-Carlo-Lacrosse.html">here</a> and <a href="/data%20science/python/lacrosse/2025/05/23/Monte-Carlo-Redux.html">here</a>). Heading into Memorial Day weekend 2026, the <a href="https://goheels.com/sports/womens-lacrosse">UNC Women’s Lacrosse</a> team has to be feeling confident. They are the defending champions and the favorites, with about a 5 in 10 chance of taking this year’s championship.</p>

<p>On the Men’s side, Notre Dame has about a 4 in 10 chance of winning the championship, with Princeton at about 2.8 in 10.</p>

<p>Read on to learn more.</p>

<!--more-->

<p>Ratings are sourced from <a href="https://laxmath.com/laxpower/wom/list_prr001.php">LaxPower2</a> and are <a href="https://laxmath.com/laxpower/wom/ex_pr.php">based on a margin-of-victory</a> calculation. The difference between two ratings is the number of goals by which the higher rated team is expected to win. Monte Carlo models are used to generate the odds of each of the four teams advancing and winning the championship. If 1,000 tournaments were played, UNC would win 502 championships, Northwestern 270, Maryland 124, and <a href="https://hopkinssports.com/sports/womens-lacrosse/">JHU</a> 104 championships.</p>

<p><strong>Women’s D1</strong></p>

<table>
  <thead>
    <tr>
      <th style="text-align: left">Team</th>
      <th style="text-align: right">Record</th>
      <th style="text-align: right">Rating</th>
      <th style="text-align: right">AGD</th>
      <th style="text-align: right">GF</th>
      <th style="text-align: right">GA</th>
      <th style="text-align: right">Odds</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td style="text-align: left">UNC</td>
      <td style="text-align: right">18-1</td>
      <td style="text-align: right">99.9000</td>
      <td style="text-align: right">10.31</td>
      <td style="text-align: right">341</td>
      <td style="text-align: right">145</td>
      <td style="text-align: right">50.2%</td>
    </tr>
    <tr>
      <td style="text-align: left">Northwestern</td>
      <td style="text-align: right">17-3</td>
      <td style="text-align: right">98.2876</td>
      <td style="text-align: right">5.90</td>
      <td style="text-align: right">295</td>
      <td style="text-align: right">177</td>
      <td style="text-align: right">27.0%</td>
    </tr>
    <tr>
      <td style="text-align: left">Maryland</td>
      <td style="text-align: right">18-3</td>
      <td style="text-align: right">97.2975</td>
      <td style="text-align: right">3.14</td>
      <td style="text-align: right">272</td>
      <td style="text-align: right">206</td>
      <td style="text-align: right">12.4%</td>
    </tr>
    <tr>
      <td style="text-align: left">JHU</td>
      <td style="text-align: right">17-4</td>
      <td style="text-align: right">96.8323</td>
      <td style="text-align: right">4.90</td>
      <td style="text-align: right">319</td>
      <td style="text-align: right">216</td>
      <td style="text-align: right">10.4%</td>
    </tr>
  </tbody>
</table>
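
<p>The AGD column is not defined in the table, but it appears to be average goal differential: goals for minus goals against, divided by games played. The sketch below checks that assumption against the rows above. The function name <code class="language-plaintext highlighter-rouge">average_goal_differential</code> is my own label for illustration, not something taken from LaxPower2.</p>

```python
# Rows copied from the Women's D1 table: (team, wins, losses, GF, GA, listed AGD).
rows = [
    ("UNC", 18, 1, 341, 145, 10.31),
    ("Northwestern", 17, 3, 295, 177, 5.90),
    ("Maryland", 18, 3, 272, 206, 3.14),
    ("JHU", 17, 4, 319, 216, 4.90),
]


def average_goal_differential(wins, losses, goals_for, goals_against):
    """Assumed definition of AGD: net goals per game played."""
    return (goals_for - goals_against) / (wins + losses)


for team, wins, losses, gf, ga, listed in rows:
    agd = average_goal_differential(wins, losses, gf, ga)
    print(f"{team}: computed {agd:.3f}, listed {listed:.2f}")
```

<p>Under that assumption, each computed value lands within 0.01 of the listed figure.</p>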

<p>In the semifinal against <a href="https://umterps.com/sports/womens-lacrosse">Maryland</a>, UNC is expected to win <strong>724 games out of 1,000</strong> and if <a href="https://nusports.com/sports/womens-lacrosse">Northwestern</a> were to advance to the final, UNC would be expected to win the championship game over Northwestern <strong>659 times out of 1,000</strong>.</p>

<p><strong>Women’s D3</strong></p>

<p>Because we have friends whose daughter plays for a D3 program, I ran the numbers for that bracket as well.</p>

<table>
  <thead>
    <tr>
      <th style="text-align: left">Team</th>
      <th style="text-align: right">Record</th>
      <th style="text-align: right">Rating</th>
      <th style="text-align: right">AGD</th>
      <th style="text-align: right">GF</th>
      <th style="text-align: right">GA</th>
      <th style="text-align: right">Odds</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td style="text-align: left">Middlebury</td>
      <td style="text-align: right">21-0</td>
      <td style="text-align: right">99.9000</td>
      <td style="text-align: right">9.47</td>
      <td style="text-align: right">303</td>
      <td style="text-align: right">104</td>
      <td style="text-align: right">38.0%</td>
    </tr>
    <tr>
      <td style="text-align: left">Wesleyan</td>
      <td style="text-align: right">18-3</td>
      <td style="text-align: right">98.2876</td>
      <td style="text-align: right">5.66</td>
      <td style="text-align: right">245</td>
      <td style="text-align: right">126</td>
      <td style="text-align: right">25.7%</td>
    </tr>
    <tr>
      <td style="text-align: left">Tufts</td>
      <td style="text-align: right">17-3</td>
      <td style="text-align: right">97.2975</td>
      <td style="text-align: right">7.75</td>
      <td style="text-align: right">313</td>
      <td style="text-align: right">158</td>
      <td style="text-align: right">19.6%</td>
    </tr>
    <tr>
      <td style="text-align: left">Salisbury</td>
      <td style="text-align: right">20-0</td>
      <td style="text-align: right">96.8323</td>
      <td style="text-align: right">7.60</td>
      <td style="text-align: right">295</td>
      <td style="text-align: right">143</td>
      <td style="text-align: right">16.8%</td>
    </tr>
  </tbody>
</table>

<p><strong>Middlebury</strong> is expected to win the championship <strong>380 times out of 1,000</strong> tournaments. In the semifinal against Tufts, Middlebury is expected to win 606 times out of 1,000 games; Middlebury would be expected to win the championship over Wesleyan <strong>612 times out of 1,000 finals</strong>.</p>

<p>Wesleyan’s record includes a win and a loss against Tufts and two losses to Middlebury.</p>

<p>Tufts has previously lost to Middlebury and Wesleyan but also has a win over Wesleyan.</p>

<p>Salisbury has the worst odds, winning just <strong>168 championships out of 1,000</strong> simulated tournaments. Salisbury is expected to get past Wesleyan just 243 times out of 1,000 and would be expected to defeat Middlebury in a championship game 360 times out of 1,000.</p>

<p><strong>Men’s D1</strong></p>

<table>
  <thead>
    <tr>
      <th style="text-align: left">Team</th>
      <th style="text-align: right">Record</th>
      <th style="text-align: right">Rating</th>
      <th style="text-align: right">AGD</th>
      <th style="text-align: right">GF</th>
      <th style="text-align: right">GA</th>
      <th style="text-align: right">Odds</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td style="text-align: left">Notre Dame</td>
      <td style="text-align: right">12-2</td>
      <td style="text-align: right">99.9000</td>
      <td style="text-align: right">4.71</td>
      <td style="text-align: right">186</td>
      <td style="text-align: right">120</td>
      <td style="text-align: right">40.2%</td>
    </tr>
    <tr>
      <td style="text-align: left">Princeton</td>
      <td style="text-align: right">15-2</td>
      <td style="text-align: right">99.2868</td>
      <td style="text-align: right">4.94</td>
      <td style="text-align: right">252</td>
      <td style="text-align: right">168</td>
      <td style="text-align: right">27.5%</td>
    </tr>
    <tr>
      <td style="text-align: left">Duke</td>
      <td style="text-align: right">11-4</td>
      <td style="text-align: right">98.8380</td>
      <td style="text-align: right">6.00</td>
      <td style="text-align: right">221</td>
      <td style="text-align: right">131</td>
      <td style="text-align: right">18.9%</td>
    </tr>
    <tr>
      <td style="text-align: left">Syracuse</td>
      <td style="text-align: right">13-5</td>
      <td style="text-align: right">98.3347</td>
      <td style="text-align: right">2.44</td>
      <td style="text-align: right">235</td>
      <td style="text-align: right">191</td>
      <td style="text-align: right">13.4%</td>
    </tr>
  </tbody>
</table>

<p>Princeton faces Duke in the first semifinal and Notre Dame faces Syracuse in the second. According to the model, <strong>Notre Dame should win 402 of 1,000 championship weekends</strong>. Syracuse has the longest odds, winning just 134 of 1,000 championship weekends. Head-to-head, Notre Dame should defeat Syracuse 662 of 1,000 semifinals. In a championship against Princeton, Notre Dame is expected to be victorious 665 of 1,000 times.</p>
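
<p>A head-to-head number like the Notre Dame vs Syracuse semifinal can also be cross-checked without simulation by summing Poisson probabilities directly. The sketch below assumes this year’s run uses the same model as last year’s script: the rating difference is the expected goal margin, each team’s score is an independent Poisson draw, and a tie goes to the higher-rated team. It also assumes the 24.5 average total goals from the command shown in this post. The helper names <code class="language-plaintext highlighter-rouge">poisson_pmf</code> and <code class="language-plaintext highlighter-rouge">win_probability</code> are hypothetical and not part of <code class="language-plaintext highlighter-rouge">mc_final_four.py</code>.</p>

```python
import math


def poisson_pmf(k, lam):
    """Probability of exactly k goals when the expected count is lam."""
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))


def win_probability(rating_a, rating_b, avg_total_goals, max_goals=80):
    """Exact chance that team A beats team B under the assumed model."""
    diff = rating_a - rating_b
    # Split the average total goals around the expected margin (floor of 0.1).
    lam_a = max(0.1, (avg_total_goals + diff) / 2)
    lam_b = max(0.1, (avg_total_goals - diff) / 2)
    p_win = 0.0
    p_b_scores_fewer = 0.0  # running P(B scores fewer than k goals)
    for k in range(max_goals + 1):
        p_a = poisson_pmf(k, lam_a)
        p_b = poisson_pmf(k, lam_b)
        p_win += p_a * p_b_scores_fewer  # A scores k, B scores fewer
        if rating_a > rating_b:
            p_win += p_a * p_b  # tied score goes to the higher-rated team
        p_b_scores_fewer += p_b
    return p_win


# Notre Dame (99.90) vs Syracuse (98.3347) with 24.5 average total goals.
print(f"{win_probability(99.90, 98.3347, 24.5):.3f}")
```

<p>If the assumptions hold, the result should land close to the simulated 662 wins per 1,000 semifinals. Simulation is still the easier route for the full bracket, since it handles every path to the title at once.</p>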

<p>For posterity, the code used for the Men’s final four:</p>

<div class="language-zsh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% python3.10 mc_final_four.py 100000 24.5 99.90 98.3347 99.2868 98.8380
Running 100000 tournament simulations...

Tournament Win Probabilities:
Team  1: 40.18% - Notre Dame
Team  2: 13.38% - Syracuse
Team  3: 27.53% - Princeton
Team  4: 18.91% - Duke
</code></pre></div></div>]]></content><author><name></name></author><category term="data science" /><category term="python" /><category term="lacrosse" /><summary type="html"><![CDATA[Predicting the championship from the final four field I am revisiting last year’s Monte Carlo posts on predicting the outcomes of lacrosse games (here and here). Heading into Memorial Day weekend 2026, the UNC Women’s Lacrosse team has to be feeling confident. Defending champions and favored to win with 5 in 10 chances to take this year’s championship. On the Men’s side, Notre Dame has 4 in 10 chances to win the championship with Princeton having 2.8 chances in 10. Read on to learn more.]]></summary></entry><entry><title type="html">Pour-over coffee</title><link href="https://sprestridge.net/coffee/2025/06/16/Pour-Over-Coffee.html" rel="alternate" type="text/html" title="Pour-over coffee" /><published>2025-06-16T15:00:00+00:00</published><updated>2025-06-16T15:00:00+00:00</updated><id>https://sprestridge.net/coffee/2025/06/16/Pour-Over-Coffee</id><content type="html" xml:base="https://sprestridge.net/coffee/2025/06/16/Pour-Over-Coffee.html"><![CDATA[<h3 id="the-4-best-pour-over-coffee-makers-of-2025--reviews-by-wirecutter"><a href="https://www.nytimes.com/wirecutter/reviews/gear-for-making-great-coffee/">The 4 Best Pour-Over Coffee Makers of 2025 | Reviews by Wirecutter</a></h3>

<p>I know. Another coffee snob writes a post about coffee but I wanted to document my current use. After being a die-hard AeroPress aficionado for years, I think I am converted to simple (ha!) pour-over. We’ve been using a Bean Envy pour-over carafe and a Cilio V60 style brewer. I like both. Found the following ratios interesting:</p>

<blockquote>
  <p>In head-to-head tests, we kept coffee-to-water ratios and brew water temperature consistent (<strong>1 gram of coffee for every 17 grams water, heated to 206 °F</strong>). We also adjusted grind size, to best accommodate the flow rates for each dripper, and we landed on brew times common across multiple published recipes.</p>
</blockquote>

<!--more-->

<p>Tonx, from <a href="https://www.yesplz.coffee">YesPlz</a>–our favorite coffee subscription service–was a writer for the article. It doesn’t cover technique but does provide these comments in the <strong>How-to</strong> section:</p>

<blockquote>
  <p>But your recipe and technique also matter. The most important factors for getting consistent, delicious results are your coffee-to-water ratio, the coarseness of your grounds, the temperature of the water, and the speed of your pour.</p>

  <p>You can find numerous recipes and guides online, but most suggest <strong>using between a 1:15 to 1:17 ratio of coffee to water, in grams</strong>. The coffee should be ground medium-fine (a little finer than coarse sea salt). And the water should be heated to between 195 and 205 degrees Fahrenheit (depending on your preferences and on the coffee, with lighter or more delicate roasts preferring higher temperatures).</p>
</blockquote>

<p>The 1:15 ratio is about 0.067 in decimal and 1:17 ratio about 0.059 in decimal.</p>

<p>This <a href="https://youtu.be/X-fXQKqkYxI">Coffee with April video</a> has a small cup ratio of 0.065 (13 grams coffee to 200 grams water) and a medium of 0.067 (20 grams coffee to 300 grams water) with water temperature at 94 C (~201 F).</p>

<p><strong>Small</strong>: 200 grams is about 7 ounces.</p>

<p><strong>Medium</strong>: 300 grams is about 10.6 ounces.</p>

<p>April recommends <strong>pours of 100 grams</strong> at a time split between <strong>30 grams in a circle</strong> and <strong>70 grams in the center</strong>.</p>

<p>Their small cup is just 2 pours, while the medium is 3 pours.</p>

<p>I have been making my pour-over with about 9 ounces (255 grams) of water, using the fridge’s measurement rather than a weighed measure. So I need to be using about <strong>15 - 17 grams of coffee</strong> to match these ratios. To really dial it in, a kitchen scale would be needed.</p>
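
<p>For anyone who wants to redo that arithmetic for a different cup size, here is a minimal sketch. It assumes 28.35 grams per ounce and, as in my 9 ounce figure, treats an ounce of water as a weight rather than a volume. The function names are made up for illustration.</p>

```python
GRAMS_PER_OUNCE = 28.35  # assumed conversion factor (avoirdupois ounce)


def ounces_to_grams(ounces):
    """Convert a weight in ounces to grams."""
    return ounces * GRAMS_PER_OUNCE


def coffee_dose_grams(water_grams, water_per_gram_of_coffee):
    """Coffee needed for a 1:N coffee-to-water ratio, in grams."""
    return water_grams / water_per_gram_of_coffee


water = 255  # grams, roughly 9 ounces
for ratio in (15, 17):
    dose = coffee_dose_grams(water, ratio)
    print(f"1:{ratio} ratio with {water} g water: {dose:.1f} g coffee")
```

<p>The 1:15 ratio gives the stronger cup (17 grams of coffee) and 1:17 the lighter one (15 grams).</p>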

<p>Wondering about the circle pour and center pour method employed by April? So was I. This <a href="https://youtu.be/PRZ-lD5si0M">more recent video</a> from them shows three identical cups brewed with a full circle pour, a full center pour, and a 50 circle / 50 center method with the 50/50 FTW. Mind, these are flat bottom filters / brewers and I am using a V60 style brewer (the Cilio Number 2).</p>]]></content><author><name></name></author><category term="coffee" /><summary type="html"><![CDATA[The 4 Best Pour-Over Coffee Makers of 2025 | Reviews by Wirecutter I know. Another coffee snob writes a post about coffee but I wanted to document my current use. After being a die-hard AeroPress aficionado for years, I think I am converted to simple (ha!) pour-over. We’ve been using a Bean Envy pour-over carafe and a Cilio V60 style brewer. I like both. Found the following ratios interesting: In head-to-head tests, we kept coffee-to-water ratios and brew water temperature consistent (1 gram of coffee for every 17 grams water, heated to 206 °F). We also adjusted grind size, to best accommodate the flow rates for each dripper, and we landed on brew times common across multiple published recipes.]]></summary></entry><entry><title type="html">Predicting the final four</title><link href="https://sprestridge.net/data%20science/python/lacrosse/2025/05/23/Monte-Carlo-Redux.html" rel="alternate" type="text/html" title="Predicting the final four" /><published>2025-05-23T17:00:00+00:00</published><updated>2025-05-23T17:00:00+00:00</updated><id>https://sprestridge.net/data%20science/python/lacrosse/2025/05/23/Monte-Carlo-Redux</id><content type="html" xml:base="https://sprestridge.net/data%20science/python/lacrosse/2025/05/23/Monte-Carlo-Redux.html"><![CDATA[<h3 id="who-is-going-to-win-it-all">Who is going to win it all?</h3>

<p>Heading into Memorial Day weekend the <a href="https://cornellbigred.com/sports/mens-lacrosse">Cornell Big Red</a> have to feel pretty confident. They’ve won their first two playoff games over UAlbany by a score of 15-6, and Richmond by a score of 13-12, and have <strong>increased their odds of winning the whole tournament from near 30% to 44%</strong>.</p>

<p>How did I determine the odds of each team winning the tournament? Monte Carlo analysis, <a href="https://laxnumbers.com/ratings.php?y=2025&amp;v=401">power ratings</a>, and <a href="https://en.wikipedia.org/wiki/Poisson_distribution">Poisson distributions</a> to the rescue. Read on to learn more.</p>

<!--more-->

<table>
  <thead>
    <tr>
      <th style="text-align: left">Team</th>
      <th style="text-align: right">Record</th>
      <th style="text-align: right">Rating</th>
      <th style="text-align: right">AGD</th>
      <th style="text-align: right">GF</th>
      <th style="text-align: right">GA</th>
      <th style="text-align: right">Odds</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td style="text-align: left">Cornell</td>
      <td style="text-align: right">15-1</td>
      <td style="text-align: right">99.99</td>
      <td style="text-align: right">5.82</td>
      <td style="text-align: right">275</td>
      <td style="text-align: right">176</td>
      <td style="text-align: right">45%</td>
    </tr>
    <tr>
      <td style="text-align: left">Penn State</td>
      <td style="text-align: right">11-5</td>
      <td style="text-align: right">98.36</td>
      <td style="text-align: right">3.25</td>
      <td style="text-align: right">194</td>
      <td style="text-align: right">146</td>
      <td style="text-align: right">15%</td>
    </tr>
    <tr>
      <td style="text-align: left">Maryland</td>
      <td style="text-align: right">13-3</td>
      <td style="text-align: right">98.44</td>
      <td style="text-align: right">3.18</td>
      <td style="text-align: right">177</td>
      <td style="text-align: right">127</td>
      <td style="text-align: right">18%</td>
    </tr>
    <tr>
      <td style="text-align: left">Syracuse</td>
      <td style="text-align: right">12-5</td>
      <td style="text-align: right">98.51</td>
      <td style="text-align: right">3.72</td>
      <td style="text-align: right">250</td>
      <td style="text-align: right">183</td>
      <td style="text-align: right">23%</td>
    </tr>
  </tbody>
</table>

<p>For this analysis, I am using the ratings to calculate the expected goal difference for two teams playing one another. Goal counts in a game are commonly modeled with a Poisson distribution. If we know the expected goals for each team, we can sample from Poisson distributions to get a simulated score and determine a winner.</p>
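
<p>To make the first step concrete, here is a small sketch of how two ratings turn into expected goals. It uses the same split as the script later in this post (half of the average total goals, plus or minus half of the rating difference, with a floor of 0.1), applied to Cornell and Penn State from the table above. The function name <code class="language-plaintext highlighter-rouge">expected_goals</code> is just for illustration.</p>

```python
def expected_goals(rating_a, rating_b, avg_total_goals, floor=0.1):
    """Split the average total goals around the expected goal margin."""
    margin = rating_a - rating_b  # rating difference = expected goal difference
    goals_a = max(floor, (avg_total_goals + margin) / 2)
    goals_b = max(floor, (avg_total_goals - margin) / 2)
    return goals_a, goals_b


# Cornell (99.99) vs Penn State (98.36) with the men's average of 22.54 goals.
cornell, penn_state = expected_goals(99.99, 98.36, 22.54)
print(f"Cornell: {cornell:.3f} expected goals")
print(f"Penn State: {penn_state:.3f} expected goals")
print(f"Expected margin: {cornell - penn_state:.2f}")
```

<p>That works out to roughly 12.1 expected goals for Cornell and 10.5 for Penn State. Those two numbers are the rates fed to the Poisson sampler for each simulated game.</p>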

<p>If you are having trouble figuring out what that means in practical terms, the <a href="https://ezcalc.me/poisson-distribution-calculator/">Poisson Distribution Calculator</a> has several good examples explaining usage. A few lacrosse examples may also be instructive.</p>

<p>Use the men’s average goals per game of 22.54 as the average rate (lambda) in the calculator and enter 25 for occurrences (k)–in this case, goals scored. With the type of Poisson probability set to <em>no more than “k” occurrences</em>, the calculator will output a probability (P) of 0.7406, which means that in 74.1% of games we can expect the total goals scored by both teams to be 25 or fewer.</p>

<p>The Maryland vs Georgetown game from last weekend is also instructive. The final score was 9-6, which means 15 total goals were scored. Using the calculator again tells us that we would expect just 6.2% of games to have 15 total goals or fewer. Yes, the game was slower paced and far lower scoring than most.</p>
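
<p>The same two calculator lookups can be reproduced in a few lines of Python. This is a sketch using only the standard library; <code class="language-plaintext highlighter-rouge">poisson_cdf</code> is a hypothetical helper written for this example (with SciPy installed, <code class="language-plaintext highlighter-rouge">scipy.stats.poisson.cdf</code> does the same job).</p>

```python
import math


def poisson_cdf(k, lam):
    """P(X is at most k) for a Poisson variable with mean lam."""
    term = math.exp(-lam)  # P(X = 0)
    total = term
    for i in range(1, k + 1):
        term *= lam / i  # P(X = i) from P(X = i - 1)
        total += term
    return total


MENS_AVG_TOTAL_GOALS = 22.54

# Share of games expected to finish with 25 or fewer total goals.
print(f"25 or fewer: {poisson_cdf(25, MENS_AVG_TOTAL_GOALS):.4f}")
# Share of games expected to finish with 15 or fewer total goals.
print(f"15 or fewer: {poisson_cdf(15, MENS_AVG_TOTAL_GOALS):.4f}")
```

<p>The two values should agree with the calculator figures quoted above, about 0.74 and 0.06.</p>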

<p>The power ratings in the table above likely differ slightly from those published by <a href="https://laxnumbers.com/ratings.php?y=2025&amp;v=401">LaxNumbers</a>. I have calculated the ratings myself.</p>

<p>Using the data in the table above and the script included below, the following command in the terminal will return the tournament win probabilities for each of the four teams.</p>

<p><code class="language-plaintext highlighter-rouge">python3.10 mc_final_four.py 10000 22.54 99.99 98.36 98.44 98.51</code></p>

<pre><code class="language-Markdown">Tournament Win Probabilities:
Team 1 - Cornell: 44.68%
Team 2 - Penn State: 14.69%
Team 3 - Maryland: 17.93%
Team 4 - Syracuse: 22.70%
</code></pre>

<p>Regarding the average total goals per game, I did an analysis for the men’s and women’s teams remaining to calculate the average across all games played. For men, it was <strong>22.54 goals per game</strong> and for women it was <strong>24.69 goals per game</strong>.</p>

<h3 id="womens-final-four">Women’s Final Four</h3>

<p>Also, here are the odds for the Women’s bracket. North Carolina (48%) and Boston College (47%) are heavily favored to win the weekend:</p>

<table>
  <thead>
    <tr>
      <th style="text-align: left">Team</th>
      <th style="text-align: right">Record</th>
      <th style="text-align: right">Rating</th>
      <th style="text-align: right">AGD</th>
      <th style="text-align: right">GF</th>
      <th style="text-align: right">GA</th>
      <th style="text-align: right">Odds</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td style="text-align: left">North Carolina</td>
      <td style="text-align: right">20-0</td>
      <td style="text-align: right">99.79</td>
      <td style="text-align: right">10.40</td>
      <td style="text-align: right">343</td>
      <td style="text-align: right">135</td>
      <td style="text-align: right">48%</td>
    </tr>
    <tr>
      <td style="text-align: left">Florida</td>
      <td style="text-align: right">20-2</td>
      <td style="text-align: right">93.54</td>
      <td style="text-align: right">7.63</td>
      <td style="text-align: right">364</td>
      <td style="text-align: right">196</td>
      <td style="text-align: right">1%</td>
    </tr>
    <tr>
      <td style="text-align: left">Boston College</td>
      <td style="text-align: right">19-2</td>
      <td style="text-align: right">99.99</td>
      <td style="text-align: right">9.80</td>
      <td style="text-align: right">364</td>
      <td style="text-align: right">158</td>
      <td style="text-align: right">47%</td>
    </tr>
    <tr>
      <td style="text-align: left">Northwestern</td>
      <td style="text-align: right">18-2</td>
      <td style="text-align: right">96.17</td>
      <td style="text-align: right">7.75</td>
      <td style="text-align: right">322</td>
      <td style="text-align: right">167</td>
      <td style="text-align: right">4%</td>
    </tr>
  </tbody>
</table>

<p><code class="language-plaintext highlighter-rouge">python3.10 mc_final_four.py 10000 24.68 99.79 93.54 99.99 96.17</code></p>

<pre><code class="language-Markdown">Tournament Win Probabilities:
Team  1: 46.98%
Team  2: 0.93%
Team  3: 47.33%
Team  4: 4.76%
</code></pre>

<hr />

<h3 id="how-to-run-from-the-command-line">How to Run from the Command Line</h3>

<ol>
  <li>
    <p><strong>Save the code:</strong> Save the script below as a Python file, for example, <code class="language-plaintext highlighter-rouge">mc_final_four.py</code>.</p>
  </li>
  <li>
    <p><strong>Open your terminal or command prompt.</strong></p>
  </li>
  <li>
    <p><strong>Navigate to the directory</strong> where you saved the file.</p>
  </li>
  <li>
    <p><strong>Run the script</strong> using the <code class="language-plaintext highlighter-rouge">python</code> command, followed by the script name and the arguments:</p>
  </li>
</ol>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>python mc_final_four.py &lt;num_simulations&gt; &lt;average_total_goals_per_game&gt; &lt;Team1_rating&gt; &lt;Team2_rating&gt; &lt;Team3_rating&gt; &lt;Team4_rating&gt;
</code></pre></div></div>

<p><strong>Example using the Women’s ratings above (with <code class="language-plaintext highlighter-rouge">average_total_goals_per_game</code> at 23):</strong></p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>python mc_final_four.py 10000 23 99.79 93.54 99.99 96.17
</code></pre></div></div>

<p>Here’s what each argument represents:</p>

<ul>
  <li><code class="language-plaintext highlighter-rouge">10000</code>: The number of tournament simulations to run.</li>
  <li><code class="language-plaintext highlighter-rouge">23</code>: The <code class="language-plaintext highlighter-rouge">average_total_goals_per_game</code> for the sport.</li>
  <li><code class="language-plaintext highlighter-rouge">99.79</code>: Team 1’s power rating (North Carolina).</li>
  <li><code class="language-plaintext highlighter-rouge">93.54</code>: Team 2’s power rating (Florida).</li>
  <li><code class="language-plaintext highlighter-rouge">99.99</code>: Team 3’s power rating (Boston College).</li>
  <li><code class="language-plaintext highlighter-rouge">96.17</code>: Team 4’s power rating (Northwestern).</li>
</ul>

<hr />

<h3 id="notes">Notes</h3>

<p>Taking the parameters from the command line makes repeated analysis with different values convenient, with no need to edit the code directly.</p>

<ul>
  <li><strong><code class="language-plaintext highlighter-rouge">argparse</code> Module:</strong> This is Python’s standard library for parsing command-line arguments.</li>
  <li><code class="language-plaintext highlighter-rouge">parser = argparse.ArgumentParser(...)</code>: Creates an argument parser object.</li>
  <li><code class="language-plaintext highlighter-rouge">parser.add_argument(...)</code>: Defines each expected argument (its name, type, and help text).</li>
  <li><code class="language-plaintext highlighter-rouge">args = parser.parse_args()</code>: Parses the arguments provided on the command line.</li>
</ul>

<p><strong>Input Handling:</strong></p>
<ul>
  <li>The <code class="language-plaintext highlighter-rouge">input()</code> prompts have been removed.</li>
  <li>The values for <code class="language-plaintext highlighter-rouge">num_simulations</code>, <code class="language-plaintext highlighter-rouge">average_total_goals_per_game</code>, and the four team ratings are now read directly from <code class="language-plaintext highlighter-rouge">args.&lt;argument_name&gt;</code>.</li>
  <li><strong><code class="language-plaintext highlighter-rouge">average_total_goals_per_game</code> as a parameter:</strong> This crucial value is no longer hardcoded within <code class="language-plaintext highlighter-rouge">simulate_game</code>. It’s now passed down from the <code class="language-plaintext highlighter-rouge">main</code> function through <code class="language-plaintext highlighter-rouge">run_monte_carlo_analysis</code> and <code class="language-plaintext highlighter-rouge">simulate_tournament</code> to <code class="language-plaintext highlighter-rouge">simulate_game</code>, allowing you to easily adjust it from the command line.</li>
</ul>

<p><strong>Input Validation:</strong></p>
<ul>
  <li>Basic checks are added to ensure <code class="language-plaintext highlighter-rouge">num_simulations</code> and <code class="language-plaintext highlighter-rouge">average_total_goals_per_game</code> are positive values.</li>
</ul>

<h3 id="script">Script</h3>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">import</span> <span class="nn">random</span>
<span class="kn">import</span> <span class="nn">math</span>
<span class="kn">import</span> <span class="nn">argparse</span>
<span class="kn">from</span> <span class="nn">scipy.stats</span> <span class="kn">import</span> <span class="n">poisson</span>

<span class="k">def</span> <span class="nf">simulate_game</span><span class="p">(</span><span class="n">team1_power</span><span class="p">,</span> <span class="n">team2_power</span><span class="p">,</span> <span class="n">average_total_goals_per_game</span><span class="p">):</span>
    <span class="s">"""
    Simulates a single game between two teams based on their power ratings,
    modeling the difference in ratings as an expected goal difference,
    and then using Poisson distribution for scores.
    Returns True if team1 wins, False if team2 wins.
    """</span>
    <span class="n">expected_goal_difference</span> <span class="o">=</span> <span class="n">team1_power</span> <span class="o">-</span> <span class="n">team2_power</span>

    <span class="n">team1_expected_goals</span> <span class="o">=</span> <span class="p">(</span><span class="n">average_total_goals_per_game</span> <span class="o">+</span> <span class="n">expected_goal_difference</span><span class="p">)</span> <span class="o">/</span> <span class="mi">2</span>
    <span class="n">team2_expected_goals</span> <span class="o">=</span> <span class="p">(</span><span class="n">average_total_goals_per_game</span> <span class="o">-</span> <span class="n">expected_goal_difference</span><span class="p">)</span> <span class="o">/</span> <span class="mi">2</span>

    <span class="c1"># Ensure expected goals are positive and reasonable for Poisson distribution.
</span>    <span class="n">min_expected_goals</span> <span class="o">=</span> <span class="mf">0.1</span>
    <span class="n">team1_expected_goals</span> <span class="o">=</span> <span class="nb">max</span><span class="p">(</span><span class="n">min_expected_goals</span><span class="p">,</span> <span class="n">team1_expected_goals</span><span class="p">)</span>
    <span class="n">team2_expected_goals</span> <span class="o">=</span> <span class="nb">max</span><span class="p">(</span><span class="n">min_expected_goals</span><span class="p">,</span> <span class="n">team2_expected_goals</span><span class="p">)</span>

    <span class="c1"># Simulate scores using Poisson distribution
</span>    <span class="n">team1_score</span> <span class="o">=</span> <span class="n">poisson</span><span class="p">.</span><span class="n">rvs</span><span class="p">(</span><span class="n">team1_expected_goals</span><span class="p">)</span>
    <span class="n">team2_score</span> <span class="o">=</span> <span class="n">poisson</span><span class="p">.</span><span class="n">rvs</span><span class="p">(</span><span class="n">team2_expected_goals</span><span class="p">)</span>

    <span class="c1"># Determine the winner
</span>    <span class="k">if</span> <span class="n">team1_score</span> <span class="o">&gt;</span> <span class="n">team2_score</span><span class="p">:</span>
        <span class="k">return</span> <span class="bp">True</span>  <span class="c1"># Team 1 wins
</span>    <span class="k">elif</span> <span class="n">team2_score</span> <span class="o">&gt;</span> <span class="n">team1_score</span><span class="p">:</span>
        <span class="k">return</span> <span class="bp">False</span> <span class="c1"># Team 2 wins
</span>    <span class="k">else</span><span class="p">:</span>
        <span class="c1"># Tie-breaker: If scores are tied, the team with the higher power rating wins.
</span>        <span class="k">return</span> <span class="n">team1_power</span> <span class="o">&gt;</span> <span class="n">team2_power</span>

<span class="k">def</span> <span class="nf">simulate_tournament</span><span class="p">(</span><span class="n">initial_round_teams_with_ids</span><span class="p">,</span> <span class="n">average_total_goals_per_game</span><span class="p">):</span>
    <span class="s">"""
    Simulates a single 4-team tournament given the initial list of (team_id, power_rating) tuples
    for the first round matchups (1 vs 2, 3 vs 4).
    Returns the ID of the winning team.
    """</span>
    <span class="c1"># Round 1: Semifinals
</span>    <span class="c1"># Game 1: Team at index 0 vs Team at index 1
</span>    <span class="n">team1_id</span><span class="p">,</span> <span class="n">team1_power</span> <span class="o">=</span> <span class="n">initial_round_teams_with_ids</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span>
    <span class="n">team2_id</span><span class="p">,</span> <span class="n">team2_power</span> <span class="o">=</span> <span class="n">initial_round_teams_with_ids</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span>
    
    <span class="k">if</span> <span class="n">simulate_game</span><span class="p">(</span><span class="n">team1_power</span><span class="p">,</span> <span class="n">team2_power</span><span class="p">,</span> <span class="n">average_total_goals_per_game</span><span class="p">):</span>
        <span class="n">winner1</span> <span class="o">=</span> <span class="p">(</span><span class="n">team1_id</span><span class="p">,</span> <span class="n">team1_power</span><span class="p">)</span>
    <span class="k">else</span><span class="p">:</span>
        <span class="n">winner1</span> <span class="o">=</span> <span class="p">(</span><span class="n">team2_id</span><span class="p">,</span> <span class="n">team2_power</span><span class="p">)</span>

    <span class="c1"># Game 2: Team at index 2 vs Team at index 3
</span>    <span class="n">team3_id</span><span class="p">,</span> <span class="n">team3_power</span> <span class="o">=</span> <span class="n">initial_round_teams_with_ids</span><span class="p">[</span><span class="mi">2</span><span class="p">]</span>
    <span class="n">team4_id</span><span class="p">,</span> <span class="n">team4_power</span> <span class="o">=</span> <span class="n">initial_round_teams_with_ids</span><span class="p">[</span><span class="mi">3</span><span class="p">]</span>

    <span class="k">if</span> <span class="n">simulate_game</span><span class="p">(</span><span class="n">team3_power</span><span class="p">,</span> <span class="n">team4_power</span><span class="p">,</span> <span class="n">average_total_goals_per_game</span><span class="p">):</span>
        <span class="n">winner2</span> <span class="o">=</span> <span class="p">(</span><span class="n">team3_id</span><span class="p">,</span> <span class="n">team3_power</span><span class="p">)</span>
    <span class="k">else</span><span class="p">:</span>
        <span class="n">winner2</span> <span class="o">=</span> <span class="p">(</span><span class="n">team4_id</span><span class="p">,</span> <span class="n">team4_power</span><span class="p">)</span>

    <span class="c1"># Round 2: Championship
</span>    <span class="n">finalist1_id</span><span class="p">,</span> <span class="n">finalist1_power</span> <span class="o">=</span> <span class="n">winner1</span>
    <span class="n">finalist2_id</span><span class="p">,</span> <span class="n">finalist2_power</span> <span class="o">=</span> <span class="n">winner2</span>

    <span class="k">if</span> <span class="n">simulate_game</span><span class="p">(</span><span class="n">finalist1_power</span><span class="p">,</span> <span class="n">finalist2_power</span><span class="p">,</span> <span class="n">average_total_goals_per_game</span><span class="p">):</span>
        <span class="k">return</span> <span class="n">finalist1_id</span>
    <span class="k">else</span><span class="p">:</span>
        <span class="k">return</span> <span class="n">finalist2_id</span>

<span class="k">def</span> <span class="nf">run_monte_carlo_analysis</span><span class="p">(</span><span class="n">initial_teams_with_ids</span><span class="p">,</span> <span class="n">num_simulations</span><span class="p">,</span> <span class="n">average_total_goals_per_game</span><span class="p">):</span>
    <span class="s">"""
    Runs a Monte Carlo simulation to estimate each team's chance of winning.
    'initial_teams_with_ids' should be a list of (team_id, power_rating) tuples.
    """</span>
    <span class="n">all_team_ids</span> <span class="o">=</span> <span class="nb">sorted</span><span class="p">(</span><span class="nb">list</span><span class="p">(</span><span class="nb">set</span><span class="p">([</span><span class="n">team_id</span> <span class="k">for</span> <span class="n">team_id</span><span class="p">,</span> <span class="n">_</span> <span class="ow">in</span> <span class="n">initial_teams_with_ids</span><span class="p">])))</span>
    <span class="n">win_counts</span> <span class="o">=</span> <span class="p">{</span><span class="n">team_id</span><span class="p">:</span> <span class="mi">0</span> <span class="k">for</span> <span class="n">team_id</span> <span class="ow">in</span> <span class="n">all_team_ids</span><span class="p">}</span>

    <span class="k">print</span><span class="p">(</span><span class="sa">f</span><span class="s">"Running </span><span class="si">{</span><span class="n">num_simulations</span><span class="si">}</span><span class="s"> tournament simulations..."</span><span class="p">)</span>
    <span class="k">for</span> <span class="n">_</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="n">num_simulations</span><span class="p">):</span>
        <span class="n">winner_id</span> <span class="o">=</span> <span class="n">simulate_tournament</span><span class="p">(</span><span class="n">initial_teams_with_ids</span><span class="p">,</span> <span class="n">average_total_goals_per_game</span><span class="p">)</span>
        <span class="n">win_counts</span><span class="p">[</span><span class="n">winner_id</span><span class="p">]</span> <span class="o">+=</span> <span class="mi">1</span>

    <span class="k">print</span><span class="p">(</span><span class="s">"</span><span class="se">\n</span><span class="s">Tournament Win Probabilities:"</span><span class="p">)</span>
    <span class="k">for</span> <span class="n">team_id</span> <span class="ow">in</span> <span class="nb">sorted</span><span class="p">(</span><span class="n">win_counts</span><span class="p">.</span><span class="n">keys</span><span class="p">()):</span>
        <span class="n">probability</span> <span class="o">=</span> <span class="p">(</span><span class="n">win_counts</span><span class="p">[</span><span class="n">team_id</span><span class="p">]</span> <span class="o">/</span> <span class="n">num_simulations</span><span class="p">)</span> <span class="o">*</span> <span class="mi">100</span>
        <span class="k">print</span><span class="p">(</span><span class="sa">f</span><span class="s">"Team </span><span class="si">{</span><span class="n">team_id</span><span class="si">:</span><span class="mi">2</span><span class="n">d</span><span class="si">}</span><span class="s">: </span><span class="si">{</span><span class="n">probability</span><span class="si">:</span><span class="p">.</span><span class="mi">2</span><span class="n">f</span><span class="si">}</span><span class="s">%"</span><span class="p">)</span>

<span class="k">if</span> <span class="n">__name__</span> <span class="o">==</span> <span class="s">"__main__"</span><span class="p">:</span>
    <span class="n">parser</span> <span class="o">=</span> <span class="n">argparse</span><span class="p">.</span><span class="n">ArgumentParser</span><span class="p">(</span><span class="n">description</span><span class="o">=</span><span class="s">"Run a Monte Carlo simulation for a 4-team tournament."</span><span class="p">)</span>
    <span class="n">parser</span><span class="p">.</span><span class="n">add_argument</span><span class="p">(</span><span class="s">"num_simulations"</span><span class="p">,</span> <span class="nb">type</span><span class="o">=</span><span class="nb">int</span><span class="p">,</span> <span class="n">help</span><span class="o">=</span><span class="s">"Number of simulations to run."</span><span class="p">)</span>
    <span class="n">parser</span><span class="p">.</span><span class="n">add_argument</span><span class="p">(</span><span class="s">"average_total_goals_per_game"</span><span class="p">,</span> <span class="nb">type</span><span class="o">=</span><span class="nb">float</span><span class="p">,</span> 
                        <span class="n">help</span><span class="o">=</span><span class="s">"Average total goals scored in a game for the sport being simulated."</span><span class="p">)</span>
    <span class="n">parser</span><span class="p">.</span><span class="n">add_argument</span><span class="p">(</span><span class="s">"team1_rating"</span><span class="p">,</span> <span class="nb">type</span><span class="o">=</span><span class="nb">float</span><span class="p">,</span> <span class="n">help</span><span class="o">=</span><span class="s">"Power rating for Team 1."</span><span class="p">)</span>
    <span class="n">parser</span><span class="p">.</span><span class="n">add_argument</span><span class="p">(</span><span class="s">"team2_rating"</span><span class="p">,</span> <span class="nb">type</span><span class="o">=</span><span class="nb">float</span><span class="p">,</span> <span class="n">help</span><span class="o">=</span><span class="s">"Power rating for Team 2."</span><span class="p">)</span>
    <span class="n">parser</span><span class="p">.</span><span class="n">add_argument</span><span class="p">(</span><span class="s">"team3_rating"</span><span class="p">,</span> <span class="nb">type</span><span class="o">=</span><span class="nb">float</span><span class="p">,</span> <span class="n">help</span><span class="o">=</span><span class="s">"Power rating for Team 3."</span><span class="p">)</span>
    <span class="n">parser</span><span class="p">.</span><span class="n">add_argument</span><span class="p">(</span><span class="s">"team4_rating"</span><span class="p">,</span> <span class="nb">type</span><span class="o">=</span><span class="nb">float</span><span class="p">,</span> <span class="n">help</span><span class="o">=</span><span class="s">"Power rating for Team 4."</span><span class="p">)</span>

    <span class="n">args</span> <span class="o">=</span> <span class="n">parser</span><span class="p">.</span><span class="n">parse_args</span><span class="p">()</span>

    <span class="c1"># Validate inputs
</span>    <span class="k">if</span> <span class="n">args</span><span class="p">.</span><span class="n">num_simulations</span> <span class="o">&lt;=</span> <span class="mi">0</span><span class="p">:</span>
        <span class="k">print</span><span class="p">(</span><span class="s">"Error: Number of simulations must be positive."</span><span class="p">)</span>
        <span class="nb">exit</span><span class="p">(</span><span class="mi">1</span><span class="p">)</span>
    <span class="k">if</span> <span class="n">args</span><span class="p">.</span><span class="n">average_total_goals_per_game</span> <span class="o">&lt;=</span> <span class="mi">0</span><span class="p">:</span>
        <span class="k">print</span><span class="p">(</span><span class="s">"Error: Average total goals per game must be positive."</span><span class="p">)</span>
        <span class="nb">exit</span><span class="p">(</span><span class="mi">1</span><span class="p">)</span>

    <span class="n">power_ratings</span> <span class="o">=</span> <span class="p">[</span>
        <span class="n">args</span><span class="p">.</span><span class="n">team1_rating</span><span class="p">,</span>
        <span class="n">args</span><span class="p">.</span><span class="n">team2_rating</span><span class="p">,</span>
        <span class="n">args</span><span class="p">.</span><span class="n">team3_rating</span><span class="p">,</span>
        <span class="n">args</span><span class="p">.</span><span class="n">team4_rating</span>
    <span class="p">]</span>

    <span class="n">initial_tournament_teams</span> <span class="o">=</span> <span class="p">[]</span>
    <span class="k">for</span> <span class="n">i</span><span class="p">,</span> <span class="n">power</span> <span class="ow">in</span> <span class="nb">enumerate</span><span class="p">(</span><span class="n">power_ratings</span><span class="p">):</span>
        <span class="n">initial_tournament_teams</span><span class="p">.</span><span class="n">append</span><span class="p">((</span><span class="n">i</span> <span class="o">+</span> <span class="mi">1</span><span class="p">,</span> <span class="n">power</span><span class="p">))</span>

    <span class="n">run_monte_carlo_analysis</span><span class="p">(</span>
        <span class="n">initial_tournament_teams</span><span class="p">,</span>
        <span class="n">args</span><span class="p">.</span><span class="n">num_simulations</span><span class="p">,</span>
        <span class="n">args</span><span class="p">.</span><span class="n">average_total_goals_per_game</span>
    <span class="p">)</span>

</code></pre></div></div>]]></content><author><name></name></author><category term="data science" /><category term="python" /><category term="lacrosse" /><summary type="html"><![CDATA[Who is going to win it all? Heading into Memorial Day weekend the Cornell Big Red have to feel pretty confident. They’ve won their first two playoff games over UAlbany by a score of 15-6, and Richmond by a score of 13-12, and have increased their odds of winning the whole tournament from near 30% to 44%. How did I determine the odds of each team winning the tournament? Monte Carlo analysis, power ratings, and Poisson distributions to the rescue. Read on to learn more.]]></summary></entry><entry><title type="html">Predicting outcomes of lacrosse games</title><link href="https://sprestridge.net/data%20science/python/lacrosse/2025/05/18/Monte-Carlo-Lacrosse.html" rel="alternate" type="text/html" title="Predicting outcomes of lacrosse games" /><published>2025-05-18T17:00:00+00:00</published><updated>2025-05-18T17:00:00+00:00</updated><id>https://sprestridge.net/data%20science/python/lacrosse/2025/05/18/Monte-Carlo-Lacrosse</id><content type="html" xml:base="https://sprestridge.net/data%20science/python/lacrosse/2025/05/18/Monte-Carlo-Lacrosse.html"><![CDATA[<p>This weekend is the NCAA Division I Men’s Lacrosse Quarterfinals. I wrote a Python script to perform <strong>Monte Carlo analysis on game outcomes for two teams</strong> based on the <strong>ratings</strong> and <strong>average goal differential (AGD)</strong> provided by <a href="https://www.laxnumbers.com?ref=sprestridge.net">LaxNumbers</a>. I’m not sold on the approach taken in the script, but it takes into account the two teams’ relative ratings and their ability to score more goals than their opponents (via the AGD).</p>

<p>Predicted winners are Cornell, Syracuse, Notre Dame, and Maryland.</p>

<p>Click through for details.</p>

<!--more-->

<p>Monte Carlo simulation is a statistical technique that approximates the probability of an outcome by generating many random samples from a probability distribution. Here, I am using it to estimate how often each team wins over thousands of simulated games.</p>
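
<p>As a minimal sketch of the idea (not the script used for the numbers in this post), the snippet below estimates how often one side outscores the other when both scores are Poisson draws. The names <code class="language-plaintext highlighter-rouge">poisson_sample</code> and <code class="language-plaintext highlighter-rouge">estimate_win_share</code> are hypothetical, and the hand-rolled sampler stands in for <code class="language-plaintext highlighter-rouge">np.random.poisson</code> so the example needs only the standard library.</p>

```python
import math
import random


def poisson_sample(lam, rng):
    """Draw one Poisson(lam) value using Knuth's multiplication method."""
    threshold = math.exp(-lam)
    count = 0
    product = rng.random()
    # Keep multiplying uniform draws until the product falls to the threshold.
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def estimate_win_share(lam_a, lam_b, num_trials, seed=0):
    """Fraction of simulated games side A wins; ties count as half a win."""
    rng = random.Random(seed)
    wins_a = 0.0
    for _ in range(num_trials):
        score_a = poisson_sample(lam_a, rng)
        score_b = poisson_sample(lam_b, rng)
        if score_a > score_b:
            wins_a += 1.0
        elif score_a == score_b:
            wins_a += 0.5
    return wins_a / num_trials


# Illustrative scoring rates only; these are not taken from the table below.
print(estimate_win_share(15.0, 10.0, 2000))
```

<p>Raising <code class="language-plaintext highlighter-rouge">num_trials</code> makes the estimate less noisy at the cost of a longer run.</p>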

<p>Here are the quarterfinal teams sorted by rating, high to low. AGD is average goal differential, GF is goals for, and GA is goals against:</p>

<table>
  <thead>
    <tr>
      <th style="text-align: left">Team</th>
      <th style="text-align: right">Record</th>
      <th style="text-align: right">Rating</th>
      <th style="text-align: right">AGD</th>
      <th style="text-align: right">GF</th>
      <th style="text-align: right">GA</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td style="text-align: left">Cornell</td>
      <td style="text-align: right">15-1</td>
      <td style="text-align: right">99.99</td>
      <td style="text-align: right">6.12</td>
      <td style="text-align: right">275</td>
      <td style="text-align: right">176</td>
    </tr>
    <tr>
      <td style="text-align: left">Notre Dame</td>
      <td style="text-align: right">9-4</td>
      <td style="text-align: right">99.52</td>
      <td style="text-align: right">5.00</td>
      <td style="text-align: right">179</td>
      <td style="text-align: right">114</td>
    </tr>
    <tr>
      <td style="text-align: left">Maryland</td>
      <td style="text-align: right">13-3</td>
      <td style="text-align: right">98.43</td>
      <td style="text-align: right">3.12</td>
      <td style="text-align: right">177</td>
      <td style="text-align: right">127</td>
    </tr>
    <tr>
      <td style="text-align: left">Syracuse</td>
      <td style="text-align: right">12-5</td>
      <td style="text-align: right">98.42</td>
      <td style="text-align: right">3.88</td>
      <td style="text-align: right">250</td>
      <td style="text-align: right">183</td>
    </tr>
    <tr>
      <td style="text-align: left">Penn State</td>
      <td style="text-align: right">11-5</td>
      <td style="text-align: right">97.96</td>
      <td style="text-align: right">3.00</td>
      <td style="text-align: right">194</td>
      <td style="text-align: right">146</td>
    </tr>
    <tr>
      <td style="text-align: left">Princeton</td>
      <td style="text-align: right">13-3</td>
      <td style="text-align: right">97.90</td>
      <td style="text-align: right">3.06</td>
      <td style="text-align: right">234</td>
      <td style="text-align: right">186</td>
    </tr>
    <tr>
      <td style="text-align: left">Richmond</td>
      <td style="text-align: right">14-3</td>
      <td style="text-align: right">97.27</td>
      <td style="text-align: right">5.88</td>
      <td style="text-align: right">246</td>
      <td style="text-align: right">147</td>
    </tr>
    <tr>
      <td style="text-align: left">Georgetown</td>
      <td style="text-align: right">11-5</td>
      <td style="text-align: right">95.20</td>
      <td style="text-align: right">3.12</td>
      <td style="text-align: right">196</td>
      <td style="text-align: right">146</td>
    </tr>
  </tbody>
</table>

<p><strong>Sunday</strong></p>

<ul>
  <li>
    <p>Notre Dame, 9-4, rating 99.52, AGD 5.00, GF 179, GA 114</p>
  </li>
  <li>
    <p>Penn State, 11-5, rating 97.96, AGD 3.00, GF 194, GA 146</p>
  </li>
  <li>
    <p>Maryland, 13-3, rating 98.43, AGD 3.12, GF 177, GA 127</p>
  </li>
  <li>
    <p>Georgetown, 11-5, rating 95.20, AGD 3.12, GF 196, GA 146</p>
  </li>
</ul>

<p>I have several versions of Python installed on my system, so the examples below call <code class="language-plaintext highlighter-rouge">python3.10</code> explicitly.</p>

<p>For <strong>ND vs Penn State</strong>:</p>

<p><code class="language-plaintext highlighter-rouge">python3.10 mc_teams.py 5000 99.52 5.0 97.96 3.0</code></p>

<p><strong>ND wins</strong> 2,882 games or <em>57.64% of the time</em>.</p>

<p>For <strong>MD vs Gtown</strong>:</p>

<p><code class="language-plaintext highlighter-rouge">python3.10 mc_teams.py 5000 98.43 3.12 95.20 3.12</code></p>

<p><strong>Maryland wins</strong> 2,817 games or <em>56.34% of the time</em>.</p>

<p><strong>Saturday</strong></p>

<p>Cornell and Syracuse each won yesterday by a single goal, but let’s look at the numbers.</p>

<ul>
  <li>
    <p>Cornell, 15-1, rating 99.99, AGD 6.12, GF 275, GA 176</p>
  </li>
  <li>
    <p>Richmond, 14-3, rating 97.27, AGD 5.88, GF 246, GA 147</p>
  </li>
  <li>
    <p>Syracuse, 12-5, rating 98.42, AGD 3.88, GF 250, GA 183</p>
  </li>
  <li>
    <p>Princeton, 13-3, rating 97.90, AGD 3.06, GF 234, GA 186</p>
  </li>
</ul>

<p>For <strong>Cornell vs Richmond</strong>:</p>

<p><code class="language-plaintext highlighter-rouge">python3.10 mc_teams.py 5000 99.99 6.12 97.27 5.88</code></p>

<p><strong>Cornell wins</strong> 2,926 games or <em>58.52% of the time</em>.</p>

<p>Let’s try entering the GF per game instead of the ratings for each team. So <strong>Cornell</strong> has 275 / 16 = <strong>17.188 goals for per game</strong> and <strong>Richmond</strong> has 246 / 17 = <strong>14.471 goals for per game</strong>:</p>

<p><code class="language-plaintext highlighter-rouge">python3.10 mc_teams.py 5000 17.188 3.88 14.471 3.06</code></p>

<p>In this case, Cornell wins 3,494 games or <em>69.88% of the time</em>. This method favors Cornell far more heavily, and I do not think it is as good as the rating-based one.</p>
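
<p>The goals-for-per-game figures used in this variant can be reproduced from the table. This is a small sketch; the helper names <code class="language-plaintext highlighter-rouge">games_played</code> and <code class="language-plaintext highlighter-rouge">goals_for_per_game</code> are hypothetical and are not part of <code class="language-plaintext highlighter-rouge">mc_teams.py</code>.</p>

```python
def games_played(record):
    """Total games from a 'wins-losses' record string such as '15-1'."""
    wins, losses = record.split("-")
    return int(wins) + int(losses)


def goals_for_per_game(goals_for, record):
    """Season goals for divided by the number of games in the record."""
    return goals_for / games_played(record)


cornell = goals_for_per_game(275, "15-1")   # 275 goals over 16 games
richmond = goals_for_per_game(246, "14-3")  # 246 goals over 17 games
print(f"Cornell: {cornell:.3f}, Richmond: {richmond:.3f}")
# prints "Cornell: 17.188, Richmond: 14.471"
```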

<p>For <strong>Syracuse vs Princeton</strong>:</p>

<p><code class="language-plaintext highlighter-rouge">python3.10 mc_teams.py 5000 98.42 3.88 97.90 3.06</code></p>

<p><strong>Syracuse wins</strong> 2,640 games or <em>52.80% of the time</em>.</p>
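
<p>Each win share above comes from 5,000 random trials, so it carries some sampling noise. The sketch below puts a rough 95% band around each share using the normal approximation to the binomial. The function name <code class="language-plaintext highlighter-rouge">win_share_interval</code> is hypothetical, and the band reflects simulation noise only, not whether the rating-plus-AGD model itself is right.</p>

```python
import math


def win_share_interval(wins, num_trials, z=1.96):
    """Return (share, low, high) using the normal approximation to the binomial."""
    share = wins / num_trials
    std_error = math.sqrt(share * (1.0 - share) / num_trials)
    return share, share - z * std_error, share + z * std_error


# Win counts reported above, each out of 5,000 trials.
results = {"Notre Dame": 2882, "Maryland": 2817, "Cornell": 2926, "Syracuse": 2640}
for team, wins in results.items():
    share, low, high = win_share_interval(wins, 5000)
    print(f"{team}: {share:.2%} (about {low:.2%} to {high:.2%})")
```

<p>At 5,000 trials the band is roughly 1.4 percentage points either side, so all four favorites stay above 50% even at the low end.</p>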

<p>The Python script, <code class="language-plaintext highlighter-rouge">mc_teams.py</code>, is shown below. I’m not sold on using rating plus AGD as a team’s “expected goals” (it is really more of a total expected rating), but the approach of using the Poisson distribution is sound.</p>

<h3 id="code-explanation">Code Explanation</h3>

<p>The code consists of two functions, <code class="language-plaintext highlighter-rouge">monte_carlo_simulation</code> and the <code class="language-plaintext highlighter-rouge">simulate_match</code> helper nested inside it, plus a <code class="language-plaintext highlighter-rouge">__main__</code> block.</p>

<ul>
  <li>
    <p>The <code class="language-plaintext highlighter-rouge">simulate_match</code> function takes the ratings and AGD of the two teams as input and returns the number of goals scored by each team, drawn from a Poisson distribution whose mean is that team’s expected goals. It uses the following formulas to calculate the expected number of goals:</p>

    <ul>
      <li><code class="language-plaintext highlighter-rouge">team1_expected_goals = team1_rating + team1_agd</code></li>
      <li><code class="language-plaintext highlighter-rouge">team2_expected_goals = team2_rating + team2_agd</code></li>
    </ul>
  </li>
  <li>
    <p>The <code class="language-plaintext highlighter-rouge">monte_carlo_simulation</code> function takes the number of trials and the ratings and AGD of the two teams as input and returns the number of wins for each team. It repeats the simulated match for the specified number of trials and tallies the wins for each team, breaking tied scores with a coin flip.</p>
  </li>
  <li>
    <p>The <code class="language-plaintext highlighter-rouge">__main__</code> block reads the command line arguments, calls the <code class="language-plaintext highlighter-rouge">monte_carlo_simulation</code> function, and prints the results.</p>
  </li>
</ul>
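
<p>To see what the expected-goals formula implies, the sketch below computes the Poisson mean and its standard deviation (the square root of the mean) for Cornell’s inputs from the table. The helper name <code class="language-plaintext highlighter-rouge">expected_goals</code> is hypothetical; it only restates the formula listed above.</p>

```python
import math


def expected_goals(rating, agd):
    """Poisson mean used by simulate_match: rating plus average goal differential."""
    return rating + agd


lam = expected_goals(99.99, 6.12)  # Cornell: rating 99.99, AGD 6.12
# For a Poisson distribution the variance equals the mean.
print(f"mean = {lam:.2f}, standard deviation = {math.sqrt(lam):.2f}")
```

<p>A mean of about 106 with a standard deviation of about 10.3 is far above any real lacrosse score, which is why the value behaves more like a total expected rating than a goal count.</p>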

<h3 id="example-use-case">Example Use Case</h3>

<p>To use this code, you would need to run it from the command line and provide the following arguments:</p>

<ul>
  <li><code class="language-plaintext highlighter-rouge">num_trials</code>: The number of trials to run the simulation for.</li>
  <li><code class="language-plaintext highlighter-rouge">team1_rating</code>: The rating of the first team.</li>
  <li><code class="language-plaintext highlighter-rouge">team1_agd</code>: The average goal differential (AGD) of the first team.</li>
  <li><code class="language-plaintext highlighter-rouge">team2_rating</code>: The rating of the second team.</li>
  <li><code class="language-plaintext highlighter-rouge">team2_agd</code>: The average goal differential (AGD) of the second team.</li>
</ul>
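
<p>The script reads these five values straight from <code class="language-plaintext highlighter-rouge">sys.argv</code>. As an alternative sketch (an assumption on my part, not what <code class="language-plaintext highlighter-rouge">mc_teams.py</code> does), the same positional arguments could be declared with <code class="language-plaintext highlighter-rouge">argparse</code>, which adds type checking and a generated help message. The <code class="language-plaintext highlighter-rouge">build_parser</code> name is hypothetical.</p>

```python
import argparse


def build_parser():
    """Hypothetical argparse front end mirroring the five positional arguments."""
    parser = argparse.ArgumentParser(description="Monte Carlo win counts for two teams.")
    parser.add_argument("num_trials", type=int, help="Number of trials to simulate.")
    parser.add_argument("team1_rating", type=float, help="Rating of the first team.")
    parser.add_argument("team1_agd", type=float, help="AGD of the first team.")
    parser.add_argument("team2_rating", type=float, help="Rating of the second team.")
    parser.add_argument("team2_agd", type=float, help="AGD of the second team.")
    return parser


# Parse the ND vs Penn State inputs from an explicit list instead of sys.argv.
args = build_parser().parse_args(["5000", "99.52", "5.0", "97.96", "3.0"])
print(args.num_trials, args.team1_rating, args.team1_agd, args.team2_rating, args.team2_agd)
```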

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">#mc_teams.py
</span><span class="kn">import</span> <span class="nn">numpy</span> <span class="k">as</span> <span class="n">np</span>
<span class="kn">import</span> <span class="nn">sys</span>

<span class="k">def</span> <span class="nf">monte_carlo_simulation</span><span class="p">(</span><span class="n">num_trials</span><span class="p">,</span> <span class="n">team1_rating</span><span class="p">,</span> <span class="n">team1_agd</span><span class="p">,</span> <span class="n">team2_rating</span><span class="p">,</span> <span class="n">team2_agd</span><span class="p">):</span>
    <span class="k">def</span> <span class="nf">simulate_match</span><span class="p">(</span><span class="n">team1_rating</span><span class="p">,</span> <span class="n">team1_agd</span><span class="p">,</span> <span class="n">team2_rating</span><span class="p">,</span> <span class="n">team2_agd</span><span class="p">):</span>
        <span class="n">team1_expected_goals</span> <span class="o">=</span> <span class="n">team1_rating</span> <span class="o">+</span> <span class="n">team1_agd</span>
        <span class="n">team2_expected_goals</span> <span class="o">=</span> <span class="n">team2_rating</span> <span class="o">+</span> <span class="n">team2_agd</span>
        
        <span class="n">team1_goals</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">random</span><span class="p">.</span><span class="n">poisson</span><span class="p">(</span><span class="n">team1_expected_goals</span><span class="p">)</span>
        <span class="n">team2_goals</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">random</span><span class="p">.</span><span class="n">poisson</span><span class="p">(</span><span class="n">team2_expected_goals</span><span class="p">)</span>
        
        <span class="k">return</span> <span class="n">team1_goals</span><span class="p">,</span> <span class="n">team2_goals</span>

    <span class="n">team1_wins</span> <span class="o">=</span> <span class="mi">0</span>
    <span class="n">team2_wins</span> <span class="o">=</span> <span class="mi">0</span>
    
    <span class="k">for</span> <span class="n">_</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="n">num_trials</span><span class="p">):</span>
        <span class="n">team1_goals</span><span class="p">,</span> <span class="n">team2_goals</span> <span class="o">=</span> <span class="n">simulate_match</span><span class="p">(</span><span class="n">team1_rating</span><span class="p">,</span> <span class="n">team1_agd</span><span class="p">,</span> <span class="n">team2_rating</span><span class="p">,</span> <span class="n">team2_agd</span><span class="p">)</span>
        
        <span class="k">if</span> <span class="n">team1_goals</span> <span class="o">&gt;</span> <span class="n">team2_goals</span><span class="p">:</span>
            <span class="n">team1_wins</span> <span class="o">+=</span> <span class="mi">1</span>
        <span class="k">elif</span> <span class="n">team2_goals</span> <span class="o">&gt;</span> <span class="n">team1_goals</span><span class="p">:</span>
            <span class="n">team2_wins</span> <span class="o">+=</span> <span class="mi">1</span>
        <span class="k">else</span><span class="p">:</span>
            <span class="k">if</span> <span class="n">np</span><span class="p">.</span><span class="n">random</span><span class="p">.</span><span class="n">rand</span><span class="p">()</span> <span class="o">&lt;</span> <span class="mf">0.5</span><span class="p">:</span>
                <span class="n">team1_wins</span> <span class="o">+=</span> <span class="mi">1</span>
            <span class="k">else</span><span class="p">:</span>
                <span class="n">team2_wins</span> <span class="o">+=</span> <span class="mi">1</span>
            
    <span class="k">return</span> <span class="n">team1_wins</span><span class="p">,</span> <span class="n">team2_wins</span>

<span class="k">if</span> <span class="n">__name__</span> <span class="o">==</span> <span class="s">"__main__"</span><span class="p">:</span>
    <span class="c1"># Read command line arguments
</span>    <span class="k">if</span> <span class="nb">len</span><span class="p">(</span><span class="n">sys</span><span class="p">.</span><span class="n">argv</span><span class="p">)</span> <span class="o">!=</span> <span class="mi">6</span><span class="p">:</span>
        <span class="k">print</span><span class="p">(</span><span class="s">"Usage: mc_teams.py &lt;num_trials&gt; &lt;team1_rating&gt; &lt;team1_agd&gt; &lt;team2_rating&gt; &lt;team2_agd&gt;"</span><span class="p">)</span>
        <span class="n">sys</span><span class="p">.</span><span class="nb">exit</span><span class="p">(</span><span class="mi">1</span><span class="p">)</span>

    <span class="n">num_trials</span> <span class="o">=</span> <span class="nb">int</span><span class="p">(</span><span class="n">sys</span><span class="p">.</span><span class="n">argv</span><span class="p">[</span><span class="mi">1</span><span class="p">])</span>
    <span class="n">team1_rating</span> <span class="o">=</span> <span class="nb">float</span><span class="p">(</span><span class="n">sys</span><span class="p">.</span><span class="n">argv</span><span class="p">[</span><span class="mi">2</span><span class="p">])</span>
    <span class="n">team1_agd</span> <span class="o">=</span> <span class="nb">float</span><span class="p">(</span><span class="n">sys</span><span class="p">.</span><span class="n">argv</span><span class="p">[</span><span class="mi">3</span><span class="p">])</span>
    <span class="n">team2_rating</span> <span class="o">=</span> <span class="nb">float</span><span class="p">(</span><span class="n">sys</span><span class="p">.</span><span class="n">argv</span><span class="p">[</span><span class="mi">4</span><span class="p">])</span>
    <span class="n">team2_agd</span> <span class="o">=</span> <span class="nb">float</span><span class="p">(</span><span class="n">sys</span><span class="p">.</span><span class="n">argv</span><span class="p">[</span><span class="mi">5</span><span class="p">])</span>

    <span class="n">team1_wins</span><span class="p">,</span> <span class="n">team2_wins</span> <span class="o">=</span> <span class="n">monte_carlo_simulation</span><span class="p">(</span><span class="n">num_trials</span><span class="p">,</span> <span class="n">team1_rating</span><span class="p">,</span> <span class="n">team1_agd</span><span class="p">,</span> <span class="n">team2_rating</span><span class="p">,</span> <span class="n">team2_agd</span><span class="p">)</span>

    <span class="k">print</span><span class="p">(</span><span class="sa">f</span><span class="s">"After </span><span class="si">{</span><span class="n">num_trials</span><span class="si">}</span><span class="s"> trials:"</span><span class="p">)</span>
    <span class="k">print</span><span class="p">(</span><span class="sa">f</span><span class="s">"Team 1 Wins, %: </span><span class="si">{</span><span class="n">team1_wins</span><span class="si">}</span><span class="s">, </span><span class="si">{</span><span class="n">team1_wins</span> <span class="o">/</span> <span class="n">num_trials</span><span class="si">:</span><span class="p">.</span><span class="mi">2</span><span class="o">%</span><span class="si">}</span><span class="s">"</span><span class="p">)</span>
    <span class="k">print</span><span class="p">(</span><span class="sa">f</span><span class="s">"Team 2 Wins, %: </span><span class="si">{</span><span class="n">team2_wins</span><span class="si">}</span><span class="s">, </span><span class="si">{</span><span class="n">team2_wins</span> <span class="o">/</span> <span class="n">num_trials</span><span class="si">:</span><span class="p">.</span><span class="mi">2</span><span class="o">%</span><span class="si">}</span><span class="s">"</span><span class="p">)</span>
</code></pre></div></div>]]></content><author><name></name></author><category term="data science" /><category term="python" /><category term="lacrosse" /><summary type="html"><![CDATA[This weekend is the NCAA Division I Men’s Lacrosse Quarterfinals. I wrote a Python script to perform Monte Carlo analysis on game outcomes for two teams based on the ratings and average goal differential (AGD) provided by LaxNumbers. I’m not sold on the approach taken in the script, but it takes into account the two teams’ relative ratings and their ability to score more goals than their opponents (via the AGD). Predicted winners are Cornell, Syracuse, Notre Dame, and Maryland. Click through for details.]]></summary></entry><entry><title type="html">No More Paper Books</title><link href="https://sprestridge.net/sci-fi/books/2025/01/06/No-More-Paper-Books.html" rel="alternate" type="text/html" title="No More Paper Books" /><published>2025-01-06T17:00:00+00:00</published><updated>2025-01-06T17:00:00+00:00</updated><id>https://sprestridge.net/sci-fi/books/2025/01/06/No-More-Paper-Books</id><content type="html" xml:base="https://sprestridge.net/sci-fi/books/2025/01/06/No-More-Paper-Books.html"><![CDATA[<h2 id="return-from-the-stars-by-stanislaw-lem---wikipedia"><a href="https://en.m.wikipedia.org/wiki/Return_from_the_Stars">Return from the Stars, by Stanislaw Lem - Wikipedia</a></h2>

<p><a href="https://en.wikipedia.org/wiki/Time_dilation">Time dilation</a> fascinated me from a young age, before I even learned about the special theory of relativity or studied physics. I learned about time dilation from the sci-fi books of my youth, like this one.</p>

<blockquote>
  <p>Written in 1961, it is the story of a <a href="https://en.m.wikipedia.org/wiki/Cosmonaut">cosmonaut</a> returning to his homeworld, Earth, after more than a century in Earth time, but just 10 years for him, finding it a completely different place, with many developments he dislikes.</p>
</blockquote>

<p>Even wilder, the author predicts the disappearance of paper books and describes a reading device like a tablet or an e-ink reader (Kindle): a single page on which successive touches reveal the next page.</p>

<p><a href="/sci-fi/books/2025/01/06/No-More-Paper-Books.html">–more–</a></p>

<!--more-->

<p>Lem was early, too: the book came out in 1961, seven years before <a href="https://en.wikipedia.org/wiki/2001:_A_Space_Odyssey">2001: A Space Odyssey</a> reached theaters in 1968. Kubrick’s film famously shows the astronauts aboard the Discovery One using an IBM Newspad, a device that allows space-faring humans to keep up with the news back home. Samsung famously cited the film as prior art in a patent battle with Apple over the iPad.</p>

<p>Technology predictions aside, imagine being gone only 10 years from your point of view while the home planet you left behind has experienced more than 100 years.</p>

<p>What wonders would you experience?</p>

<p>What social norms would you not understand?</p>

<blockquote>
  <p>The books were crystals with recorded contents. They could be read with the aid of an opton, which was similar to a book but had only one page between the covers. At a touch, successive pages of the text appeared on it.</p>
</blockquote>

<p><img src="/img/2025-01-06-kindle-no-paper-books.jpg" alt="" title="A screenshot of an e-ink reader, in this case a Kindle, with a quote from the book, Return from the Stars, by Stanislaw Lem, that reads 'The books were crystals with recorded contents. They could be read with the aid of an Opton, which was similar to a book but had only one page between the covers. At a touch, successive pages of the text appeared on it.'" /></p>

<p><cite>Return from the Stars, by Stanislaw Lem, 1961</cite></p>

<p>If you would like to read <em>Return from the Stars</em>, the <a href="https://archive.org/details/returnfromstars0000lems/page/n5/mode/2up">book is available on archive.org</a>.</p>]]></content><author><name></name></author><category term="sci-fi" /><category term="books" /><summary type="html"><![CDATA[Return from the Stars, by Stanislaw Lem - Wikipedia Time dilation fascinated me from a young age, before I even learned about the special theory of relativity or studied physics. I learned about time dilation from the sci-fi books of my youth, like this one. Written in 1961, it is the story of a cosmonaut returning to his homeworld, Earth, after more than a century in Earth time, but just 10 years for him, finding it a completely different place, with many developments he dislikes. An even more wild thing is the author predicts the disappearance of paper books and describes a reading device like a tablet or an e-ink reader (Kindle) with a single page and successive touches reveal the next page. –more–]]></summary></entry><entry><title type="html">When the press is not free</title><link href="https://sprestridge.net/antifa/2025/01/05/Press-Not-Free.html" rel="alternate" type="text/html" title="When the press is not free" /><published>2025-01-05T17:00:00+00:00</published><updated>2025-01-05T17:00:00+00:00</updated><id>https://sprestridge.net/antifa/2025/01/05/Press-Not-Free</id><content type="html" xml:base="https://sprestridge.net/antifa/2025/01/05/Press-Not-Free.html"><![CDATA[<h2 id="washington-post-cartoonist-ann-telnaes-quits-after-paper-rejects-sketch-of-bezos-bowing-to-trump">Washington Post Cartoonist, Ann Telnaes, quits after paper rejects sketch of Bezos bowing to Trump</h2>

<p>First there was Jeff Bezos’ decision to block the Washington Post endorsement of Vice President Kamala Harris for president, and now this. I already <a href="https://www.npr.org/2024/10/28/nx-s1-5168416/washington-post-bezos-endorsement-president-cancellations-resignations">canceled my subscription</a> but that did not stop the Post from spending the holidays begging me to return with offers of reduced subscription pricing.</p>

<p>Via Ann’s blog post on the matter:</p>

<blockquote>
  <p>“I’ve worked for the Washington Post since 2008 as an editorial cartoonist. … I’ve never had a cartoon killed because of who or what I chose to aim my pen at. Until now.”</p>
</blockquote>

<p><img src="/img/ann-telnaes-cartoon-2025-01.webp" alt="" title="Draft of the cancelled cartoon showing Mark Zuckerberg-Facebook &amp; Meta founder and CEO, Sam Altman-OpenAI CEO, Patrick Soon-Shiong-LA Times publisher, the Walt Disney Company (as symbolized by Mickey Mouse)/ABC News, and Jeff Bezos-Washington Post owner praising and offering gifts to a big statue or person only seen from the belly down but easily identifiable as Trump by the large belly, small hands, and too long tie." /></p>

<p><cite>Ann Telnaes, former Washington Post Political Cartoonist</cite></p>

<p><cite><em>Alt text</em>: Draft of the cancelled cartoon showing Mark Zuckerberg-Facebook &amp; Meta founder and CEO, Sam Altman-OpenAI CEO, Patrick Soon-Shiong-LA Times publisher, the Walt Disney Company (as symbolized by Mickey Mouse)/ABC News, and Jeff Bezos-Washington Post owner praising and offering gifts to a big statue or person only seen from the belly down but easily identifiable as Trump by the large belly, small hands, and too long tie.</cite></p>

<p><a href="/antifa/2025/01/05/Press-Not-Free.html">–more–</a></p>

<!--more-->

<p>But sure, <a href="https://en.wikipedia.org/wiki/Democracy_Dies_in_Darkness">democracy dies in darkness</a>. Ann is at least holding truth to power through her cartooning and decision to quit her job.</p>

<p>I am not linking to the post because it is on <a href="https://www.theatlantic.com/ideas/archive/2023/11/substack-extremism-nazi-white-supremacy-newsletters/676156/">Substack (you know - if it walks like a duck)</a>, but Ann Telnaes’ entire post about the situation follows.</p>

<hr />

<p>2025-01-03</p>

<p><strong>By Ann Telnaes</strong></p>

<blockquote>
  <p>I’ve worked for the <em>Washington Post</em> since 2008 as an editorial cartoonist. I have had editorial feedback and productive conversations–and some differences–about cartoons I have submitted for publication, but in all that time I’ve never had a cartoon killed because of who or what I chose to aim my pen at. Until now.</p>

  <p>The cartoon that was killed criticizes the billionaire tech and media chief executives who have been doing their best to curry favor with incoming President-elect Trump. There have been multiple <a href="https://www.washingtonpost.com/politics/2024/12/19/trump-bezos-musk-dinner/">articles</a> recently about these men with lucrative government contracts and an interest in eliminating regulations making their way to Mar-a-lago. The group in the cartoon included Mark Zuckerberg/Facebook &amp; Meta founder and CEO, Sam Altman/AI CEO, Patrick Soon-Shiong/<em>LA Times</em> publisher, the Walt Disney Company/ABC News, and Jeff Bezos/<em>Washington Post</em> owner.</p>

  <p>While it isn’t uncommon for editorial page editors to object to visual metaphors within a cartoon if it strikes that editor as unclear or isn’t correctly conveying the message intended by the cartoonist, such editorial criticism was not the case regarding this cartoon. To be clear, there have been instances where sketches have been rejected or revisions requested, but never because of the point of view inherent in the cartoon’s commentary. That’s a game changer…and dangerous for a free press.</p>

  <p><img src="/img/ann-telnaes-cartoon-2025-01.webp" alt="" /></p>

  <p>(rough of cartoon killed)</p>

  <p>Over the years I have watched my overseas colleagues risk their livelihoods and sometimes even their lives to expose injustices and hold their countries’ leaders accountable. As a member of the <a href="https://freedomcartoonists.com/about/#governance">Advisory board</a> for the Geneva based <a href="https://freedomcartoonists.com/">Freedom Cartoonists Foundation</a> and a former board member of <a href="https://cartoonistsrights.org/">Cartoonists Rights</a>, I believe that editorial cartoonists are vital for civic debate and have an essential role in journalism.</p>

  <p>There will be people who say, “Hey, you work for a company and that company has the right to expect employees to adhere to what’s good for the company”. That’s true except we’re talking about news organizations that have public obligations and who are obliged to nurture a free press in a democracy. Owners of such press organizations are responsible for safeguarding that free press– and trying to get in the good graces of an autocrat-in-waiting will only result in undermining that free press.</p>

  <p>As an editorial cartoonist, my job is to hold powerful people and institutions accountable. For the first time, my editor prevented me from doing that critical job. So I have decided to leave the <em>Post</em>. I doubt my decision will cause much of a stir and that it will be dismissed because I’m just a cartoonist. But I will not stop holding truth to power through my cartooning, because as they say, “<em>Democracy dies in darkness</em>”.</p>

  <p>Thank you for reading this.</p>
</blockquote>]]></content><author><name></name></author><category term="antifa" /><summary type="html"><![CDATA[Washington Post Cartoonist, Ann Telnaes, quits after paper rejects sketch of Bezos bowing to Trump First there was Jeff Bezos’ decision to block the Washington Post endorsement of Vice President Kamala Harris for president, and now this. I already canceled my subscription but that did not stop the Post from spending the holidays begging me to return with offers of reduced subscription pricing. Via Ann’s blog post on the matter: “I’ve worked for the Washington Post since 2008 as an editorial cartoonist. … I’ve never had a cartoon killed because of who or what I chose to aim my pen at. Until now.” Ann Telnaes, former Washington Post Political Cartoonist Alt text: Draft of the cancelled cartoon showing Mark Zuckerberg-Facebook &amp; Meta founder and CEO, Sam Altman-OpenAI CEO, Patrick Soon-Shiong-LA Times publisher, the Walt Disney Company (as symbolized by Mickey Mouse)/ABC News, and Jeff Bezos-Washington Post owner praising and offering gifts to a big statue or person only seen from the belly down but easily identifiable as Trump by the large belly, small hands, and too long tie. –more–]]></summary></entry><entry><title type="html">Write Better Code</title><link href="https://sprestridge.net/ai/data%20science/2025/01/04/Write-Better-Code.html" rel="alternate" type="text/html" title="Write Better Code" /><published>2025-01-04T17:00:00+00:00</published><updated>2025-01-04T17:00:00+00:00</updated><id>https://sprestridge.net/ai/data%20science/2025/01/04/Write-Better-Code</id><content type="html" xml:base="https://sprestridge.net/ai/data%20science/2025/01/04/Write-Better-Code.html"><![CDATA[<p>Did you know that continually asking an LLM to “<em>write better code</em>” works?</p>

<p>In <a href="https://minimaxir.com/2025/01/write-better-code/">Can LLMs write better code if you keep asking them to “write better code”?</a>, Max Woolf investigates, kicking off with a brief review of the short-lived meme where users repeatedly asked ChatGPT to regenerate an image and “<em>make it more bro</em>” (really, go look at the hero image on the blog post).</p>

<p>Ultimately, Max showed that LLMs can indeed make better code just by being asked.</p>

<p><a href="/ai/data%20science/2025/01/04/Write-Better-Code.html">–more–</a></p>

<!--more-->

<blockquote>
  <p>In all, asking an LLM to “write code better” does indeed make the code better, depending on your definition of better. Through the use of the generic iterative prompts, the code did objectively improve from the base examples, both in terms of additional features and speed. Prompt engineering improved the performance of the code much more rapidly and consistently, but was more likely to introduce subtle bugs as LLMs are not optimized to generate high-performance code. As with any use of LLMs, your mileage may vary, and in the end it requires a human touch to fix the inevitable issues no matter how often AI hypesters cite LLMs as magic.</p>

  <p>There are a few optimizations that I am very surprised Claude 3.5 Sonnet did not identify and implement during either experiment. Namely, it doesn’t explore the statistical angle: since we are generating 1,000,000 numbers uniformly from a range of 1 to 100,000, there will be a significant amount of duplicate numbers that will never need to be analyzed. The LLM did not attempt to dedupe, such as casting the list of numbers into a Python <code class="language-plaintext highlighter-rouge">set()</code> or using numpy’s <code class="language-plaintext highlighter-rouge">unique()</code>. I was also expecting an implementation that involves sorting the list of 1,000,000 numbers ascending: that way the algorithm could search the list from the start to the end for the minimum (or the end to the start for the maximum) without checking every number, although sorting is slow and a vectorized approach is indeed more pragmatic.</p>
</blockquote>

<p>The best part is that Max’s code is available on GitHub, including the full, unedited conversation threads for the two methods of prompting he covers:</p>

<ul>
  <li>
    <p><a href="https://github.com/minimaxir/llm-write-better-code/blob/main/python_30_casual_use.md">Casual prompting</a></p>
  </li>
  <li>
    <p><a href="https://github.com/minimaxir/llm-write-better-code/blob/main/python_30_prompt_engineering.md">Prompt engineering</a></p>
  </li>
</ul>

<p>After an initial bout of hard work maybe someday the LLMs will just say “<a href="https://en.wikipedia.org/wiki/Bartleby%2C_the_Scrivener">I would prefer not to</a>”.</p>]]></content><author><name></name></author><category term="ai" /><category term="data science" /><summary type="html"><![CDATA[Did you know that continually asking an LLM to “write better code” works? In Can LLMs write better code if you keep asking them to “write better code”? Max Woolf investigates and kicks-off with a brief review of the short-lived meme where users asked a Stable Diffusion model to “make it more bro” (really, go look at the hero image on the blog post). Ultimately, Max showed that LLMs can indeed make better code just by being asked. –more–]]></summary></entry><entry><title type="html">Unemployment - Chart Recreations</title><link href="https://sprestridge.net/datavis/data/python/excel/2024/11/07/Unemployment-Chart-Recreation.html" rel="alternate" type="text/html" title="Unemployment - Chart Recreations" /><published>2024-11-07T14:00:00+00:00</published><updated>2024-11-07T14:00:00+00:00</updated><id>https://sprestridge.net/datavis/data/python/excel/2024/11/07/Unemployment-Chart-Recreation</id><content type="html" xml:base="https://sprestridge.net/datavis/data/python/excel/2024/11/07/Unemployment-Chart-Recreation.html"><![CDATA[<p><strong>TIL</strong>: Shaded area on a line graph in Matplotlib using pyplot.</p>

<p>Dr. Drang had two posts on <a href="https://leancrew.com/all-this/">All This</a> about the jobless rate (unemployment) and his use of Matplotlib and Python for plotting. I decided to experiment. I always love a good chart recreation exercise. <a href="https://leancrew.com/all-this/2024/11/ticks-tricks/">Ticks tricks</a> provides the details while <a href="https://leancrew.com/all-this/2024/11/rescaling-a-graph/">Rescaling a graph</a> was the original post.</p>

<p><a href="/datavis/data/python/excel/2024/11/07/Unemployment-Chart-Recreation.html">–more–</a></p>

<!--more-->

<h3 id="1-obtain-the-same-data">1. Obtain the same data</h3>

<p>Use the BLS link:</p>

<p><a href="https://www.bls.gov/charts/employment-situation/civilian-unemployment-rate.htm">https://www.bls.gov/charts/employment-situation/civilian-unemployment-rate.htm</a></p>

<p>Click on the <a href="https://www.bls.gov/charts/employment-situation/civilian-unemployment-rate.htm#">Show Table</a> link on that page.</p>

<p>Copy that data, paste it as values only (Paste Special in Excel) into an Excel or Numbers file, and save it as a CSV.</p>

<p>Dr. Drang cleans up the file first:</p>

<ol>
  <li>
    <p>Extract January 2014 to October 2024.</p>
  </li>
  <li>
    <p>Delete everything except the first two columns.</p>
  </li>
  <li>
    <p>Relabel them <code class="language-plaintext highlighter-rouge">Date</code> and <code class="language-plaintext highlighter-rouge">Rate</code>.</p>
  </li>
  <li>
    <p>Save as a CSV named <code class="language-plaintext highlighter-rouge">unemployment.csv</code>.</p>
  </li>
  <li>
    <p>Deal with June, July, and September, which need to be converted to their three-letter abbreviations. Drang uses regular expressions in BBEdit; Sublime works too. His notes:</p>
  </li>
</ol>

<blockquote>
  <p>The regular expression in the Find section is</p>

  <p><code class="language-plaintext highlighter-rouge">^(\w\w\w)\w </code></p>

  <p>If you select that line you’ll see there’s a space at the end of it, which is important. The Replace regex is</p>

  <p><code class="language-plaintext highlighter-rouge">\1 </code></p>

  <p>and there’s a space character at the end of it, too.</p>
</blockquote>
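
<p>The same find-and-replace can be sketched in Python with the standard <code class="language-plaintext highlighter-rouge">re</code> module. This is my own illustration, not Drang’s method (he works in BBEdit): the function name and the sample rows are made up, and the rates are placeholders rather than BLS figures. Note that the pattern only matches a four-character token followed by a space, so the sample assumes September shows up as <code class="language-plaintext highlighter-rouge">Sept</code>; a fully spelled out month would be left alone.</p>

```python
import re

# Drang's Find pattern: three captured word characters, a fourth one, then a space.
# re.MULTILINE makes ^ match at the start of every line, not just the first.
FOUR_LETTER_MONTH = re.compile(r"^(\w\w\w)\w ", flags=re.MULTILINE)


def abbreviate_months(text):
    """Trim a four-letter month at the start of each line to three letters."""
    # r"\1 " keeps the captured three letters and puts the trailing space back.
    return FOUR_LETTER_MONTH.sub(r"\1 ", text)


# Hypothetical rows; the rates are placeholders, not real data.
sample = "May 2020,1.0\nJune 2020,2.0\nJuly 2020,3.0\nSept 2020,4.0\n"
print(abbreviate_months(sample))
```

<p>Three-letter months such as <code class="language-plaintext highlighter-rouge">May</code> pass through untouched because the fourth character is the space itself, which <code class="language-plaintext highlighter-rouge">\w</code> does not match.</p>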

<p>It doesn’t look like I need the month fix: I did some EDA in Excel, and my final CSV already has only three-character months. Still, the trailing space in the regex and the use of the capture group are good tricks to remember.</p>
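
<p>Steps 1 through 4 of the cleanup could also be scripted instead of done by hand in a spreadsheet. The sketch below is only one way to do it, using the standard <code class="language-plaintext highlighter-rouge">csv</code> module. It assumes the pasted table has the month label in the first column and the rate in the second; the helper names and the sample rows are hypothetical, and the values are placeholders.</p>

```python
import csv
import io


def clean_rows(raw_rows, first="Jan 2014", last="Oct 2024"):
    """Keep the first two columns of the rows from `first` through `last`."""
    labels = [row[0] for row in raw_rows]
    start = labels.index(first)    # raises ValueError if the label is missing
    stop = labels.index(last) + 1  # slice ends are exclusive, so add one
    cleaned = [["Date", "Rate"]]   # step 3: the relabeled header row
    # Steps 1 and 2: keep only the wanted months and the first two columns.
    cleaned.extend(row[:2] for row in raw_rows[start:stop])
    return cleaned


def to_csv_text(rows):
    """Render rows as CSV text (step 4 would write this to unemployment.csv)."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


# Made-up sample rows standing in for the pasted table.
sample = [
    ["Month", "Total", "Extra"],
    ["Dec 2013", "1.0", "a"],
    ["Jan 2014", "2.0", "b"],
    ["Feb 2014", "3.0", "c"],
    ["Oct 2024", "4.0", "d"],
    ["Nov 2024", "5.0", "e"],
]
print(to_csv_text(clean_rows(sample)))
```

<p>Slicing by label rather than parsing dates means the four-letter months do not need to be fixed before this step runs.</p>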

<h3 id="2-plot-the-data">2. Plot the data</h3>

<p>Dr. Drang’s code with line numbers:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="mi">1</span>  <span class="c1">#!/usr/bin/env python3
</span> <span class="mi">2</span>  
 <span class="mi">3</span>  <span class="kn">import</span> <span class="nn">pandas</span> <span class="k">as</span> <span class="n">pd</span>
 <span class="mi">4</span>  <span class="kn">import</span> <span class="nn">matplotlib.pyplot</span> <span class="k">as</span> <span class="n">plt</span>
 <span class="mi">5</span>  <span class="kn">from</span> <span class="nn">datetime</span> <span class="kn">import</span> <span class="n">datetime</span>
 <span class="mi">6</span>  <span class="kn">from</span> <span class="nn">matplotlib.ticker</span> <span class="kn">import</span> <span class="n">MultipleLocator</span>
 <span class="mi">7</span>  <span class="kn">from</span> <span class="nn">matplotlib.dates</span> <span class="kn">import</span> <span class="n">DateFormatter</span><span class="p">,</span> <span class="n">YearLocator</span>
 <span class="mi">8</span>  
 <span class="mi">9</span>  <span class="c1"># Import data
</span><span class="mi">10</span>  <span class="n">df</span> <span class="o">=</span> <span class="n">pd</span><span class="p">.</span><span class="n">read_csv</span><span class="p">(</span><span class="s">'unemployment.csv'</span><span class="p">)</span>
<span class="mi">11</span>  <span class="n">x</span> <span class="o">=</span> <span class="n">pd</span><span class="p">.</span><span class="n">to_datetime</span><span class="p">(</span><span class="n">df</span><span class="p">.</span><span class="n">Date</span><span class="p">,</span> <span class="nb">format</span><span class="o">=</span><span class="s">"%b %Y"</span><span class="p">)</span>
<span class="mi">12</span>  <span class="n">y</span> <span class="o">=</span> <span class="n">df</span><span class="p">.</span><span class="n">Rate</span>
<span class="mi">13</span>  
<span class="mi">14</span>  <span class="c1"># Create the plot with a given size in inches
</span><span class="mi">15</span>  <span class="n">fig</span><span class="p">,</span> <span class="n">ax</span> <span class="o">=</span> <span class="n">plt</span><span class="p">.</span><span class="n">subplots</span><span class="p">(</span><span class="n">figsize</span><span class="o">=</span><span class="p">(</span><span class="mi">5</span><span class="p">,</span><span class="mi">5</span><span class="p">))</span>
<span class="mi">16</span>  
<span class="mi">17</span>  <span class="c1"># Add a line
</span><span class="mi">18</span>  <span class="n">ax</span><span class="p">.</span><span class="n">plot</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">,</span> <span class="s">'-'</span><span class="p">,</span> <span class="n">color</span><span class="o">=</span><span class="s">'black'</span><span class="p">,</span> <span class="n">lw</span><span class="o">=</span><span class="mi">2</span><span class="p">)</span>
<span class="mi">19</span>  
<span class="mi">20</span>  <span class="c1"># Set the limits
</span><span class="mi">21</span>  <span class="n">plt</span><span class="p">.</span><span class="n">xlim</span><span class="p">(</span><span class="n">xmin</span><span class="o">=</span><span class="n">datetime</span><span class="p">(</span><span class="mi">2014</span><span class="p">,</span><span class="mi">1</span><span class="p">,</span><span class="mi">1</span><span class="p">),</span> <span class="n">xmax</span><span class="o">=</span><span class="n">datetime</span><span class="p">(</span><span class="mi">2025</span><span class="p">,</span><span class="mi">1</span><span class="p">,</span><span class="mi">1</span><span class="p">))</span>
<span class="mi">22</span>  <span class="n">plt</span><span class="p">.</span><span class="n">ylim</span><span class="p">(</span><span class="n">ymin</span><span class="o">=</span><span class="mi">2</span><span class="p">,</span> <span class="n">ymax</span><span class="o">=</span><span class="mi">8</span><span class="p">)</span>
<span class="mi">23</span>  
<span class="mi">24</span>  <span class="c1"># Set the major and minor ticks and add a grid
</span><span class="mi">25</span>  <span class="n">ax</span><span class="p">.</span><span class="n">xaxis</span><span class="p">.</span><span class="n">set_major_locator</span><span class="p">(</span><span class="n">YearLocator</span><span class="p">(</span><span class="mi">1</span><span class="p">))</span>
<span class="mi">26</span>  <span class="n">ax</span><span class="p">.</span><span class="n">xaxis</span><span class="p">.</span><span class="n">set_major_formatter</span><span class="p">(</span><span class="n">DateFormatter</span><span class="p">(</span><span class="s">'        ’%y'</span><span class="p">))</span>
<span class="mi">27</span>  <span class="n">plt</span><span class="p">.</span><span class="n">setp</span><span class="p">(</span><span class="n">ax</span><span class="p">.</span><span class="n">get_xticklabels</span><span class="p">()[</span><span class="o">-</span><span class="mi">1</span><span class="p">],</span> <span class="n">visible</span><span class="o">=</span><span class="bp">False</span><span class="p">)</span>
<span class="mi">28</span>  <span class="n">ax</span><span class="p">.</span><span class="n">yaxis</span><span class="p">.</span><span class="n">set_major_locator</span><span class="p">(</span><span class="n">MultipleLocator</span><span class="p">(</span><span class="mi">2</span><span class="p">))</span>
<span class="mi">29</span>  <span class="n">ax</span><span class="p">.</span><span class="n">yaxis</span><span class="p">.</span><span class="n">set_major_formatter</span><span class="p">(</span><span class="s">'{x:.0f}%'</span><span class="p">)</span>
<span class="mi">30</span>  <span class="n">ax</span><span class="p">.</span><span class="n">yaxis</span><span class="p">.</span><span class="n">set_minor_locator</span><span class="p">(</span><span class="n">MultipleLocator</span><span class="p">(</span><span class="mi">1</span><span class="p">))</span>
<span class="mi">31</span>  <span class="n">ax</span><span class="p">.</span><span class="n">grid</span><span class="p">(</span><span class="n">linewidth</span><span class="o">=</span><span class="p">.</span><span class="mi">5</span><span class="p">,</span> <span class="n">axis</span><span class="o">=</span><span class="s">'y'</span><span class="p">,</span> <span class="n">which</span><span class="o">=</span><span class="s">'both'</span><span class="p">,</span> <span class="n">color</span><span class="o">=</span><span class="s">'#dddddd'</span><span class="p">,</span> <span class="n">linestyle</span><span class="o">=</span><span class="s">'-'</span><span class="p">)</span>
<span class="mi">32</span>  
<span class="mi">33</span>  <span class="c1"># Title and axis labels
</span><span class="mi">34</span>  <span class="n">plt</span><span class="p">.</span><span class="n">title</span><span class="p">(</span><span class="s">'Civilian Unemployment'</span><span class="p">)</span>
<span class="mi">35</span>  
<span class="mi">36</span>  <span class="c1"># Annotations
</span><span class="mi">37</span>  <span class="n">plt</span><span class="p">.</span><span class="n">text</span><span class="p">(</span><span class="n">datetime</span><span class="p">(</span><span class="mi">2020</span><span class="p">,</span> <span class="mi">4</span><span class="p">,</span> <span class="mi">1</span><span class="p">),</span> <span class="mf">2.45</span><span class="p">,</span> <span class="s">"Peak of 14.8% in April 2020"</span><span class="p">,</span> <span class="n">ha</span><span class="o">=</span><span class="s">'center'</span><span class="p">)</span>
<span class="mi">38</span>  <span class="n">plt</span><span class="p">.</span><span class="n">arrow</span><span class="p">(</span><span class="n">datetime</span><span class="p">(</span><span class="mi">2020</span><span class="p">,</span> <span class="mi">4</span><span class="p">,</span> <span class="mi">1</span><span class="p">),</span> <span class="mf">2.75</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="p">.</span><span class="mi">45</span><span class="p">,</span> <span class="n">head_width</span><span class="o">=</span><span class="mi">50</span><span class="p">,</span> <span class="n">head_length</span><span class="o">=</span><span class="p">.</span><span class="mi">25</span><span class="p">,</span> <span class="n">lw</span><span class="o">=</span><span class="p">.</span><span class="mi">75</span><span class="p">,</span> <span class="n">fc</span><span class="o">=</span><span class="s">'black'</span><span class="p">,</span> <span class="n">zorder</span><span class="o">=</span><span class="mi">100</span><span class="p">)</span>
<span class="mi">39</span>  
<span class="mi">40</span>  <span class="c1"># Make the border and tick marks 0.5 points wide
</span><span class="mi">41</span>  <span class="p">[</span> <span class="n">i</span><span class="p">.</span><span class="n">set_linewidth</span><span class="p">(</span><span class="mf">0.5</span><span class="p">)</span> <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="n">ax</span><span class="p">.</span><span class="n">spines</span><span class="p">.</span><span class="n">values</span><span class="p">()</span> <span class="p">]</span>
<span class="mi">42</span>  <span class="n">ax</span><span class="p">.</span><span class="n">tick_params</span><span class="p">(</span><span class="n">which</span><span class="o">=</span><span class="s">'both'</span><span class="p">,</span> <span class="n">width</span><span class="o">=</span><span class="p">.</span><span class="mi">5</span><span class="p">)</span>
<span class="mi">43</span>  
<span class="mi">44</span>  <span class="c1"># Save as PNG
</span><span class="mi">45</span>  <span class="n">plt</span><span class="p">.</span><span class="n">savefig</span><span class="p">(</span><span class="s">'20241103-Improved unemployment graph.png'</span><span class="p">,</span> <span class="nb">format</span><span class="o">=</span><span class="s">'png'</span><span class="p">,</span> <span class="n">dpi</span><span class="o">=</span><span class="mi">200</span><span class="p">)</span>
</code></pre></div></div>

<p>The <a href="https://leancrew.com/all-this/2024/11/ticks-tricks/">Ticks tricks</a> post covers the specifics about the code well. Review that post for details.</p>

<p>My final version is shown below. I had assistance from DuckDuckGo and Copilot on the title, subtitle, labels, pandemic shading, and a few other formatting options.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># my final version, thanks to both DuckDuckGo and CoPilot for assistance with labels, the Pandemic shading, and formatting of the Title and Sub-title
</span><span class="kn">import</span> <span class="nn">pandas</span> <span class="k">as</span> <span class="n">pd</span>
<span class="kn">import</span> <span class="nn">matplotlib.pyplot</span> <span class="k">as</span> <span class="n">plt</span>
<span class="kn">from</span> <span class="nn">datetime</span> <span class="kn">import</span> <span class="n">datetime</span>
<span class="kn">from</span> <span class="nn">matplotlib.ticker</span> <span class="kn">import</span> <span class="n">MultipleLocator</span>
<span class="kn">from</span> <span class="nn">matplotlib.dates</span> <span class="kn">import</span> <span class="n">DateFormatter</span><span class="p">,</span> <span class="n">YearLocator</span>

<span class="c1"># Import data
</span><span class="n">df</span> <span class="o">=</span> <span class="n">pd</span><span class="p">.</span><span class="n">read_csv</span><span class="p">(</span><span class="s">'unemployment.csv'</span><span class="p">)</span>
<span class="n">x</span> <span class="o">=</span> <span class="n">pd</span><span class="p">.</span><span class="n">to_datetime</span><span class="p">(</span><span class="n">df</span><span class="p">.</span><span class="n">Date</span><span class="p">,</span> <span class="nb">format</span><span class="o">=</span><span class="s">"%b-%Y"</span><span class="p">)</span>
<span class="n">y</span> <span class="o">=</span> <span class="n">df</span><span class="p">.</span><span class="n">Rate</span>

<span class="c1"># Create the plot with a given size in inches
</span><span class="n">fig</span><span class="p">,</span> <span class="n">ax</span> <span class="o">=</span> <span class="n">plt</span><span class="p">.</span><span class="n">subplots</span><span class="p">(</span><span class="n">figsize</span><span class="o">=</span><span class="p">(</span><span class="mi">9</span><span class="p">,</span> <span class="mi">9</span><span class="p">))</span>

<span class="c1"># Add a line
</span><span class="n">ax</span><span class="p">.</span><span class="n">plot</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">,</span> <span class="s">'-'</span><span class="p">,</span> <span class="n">color</span><span class="o">=</span><span class="s">'black'</span><span class="p">,</span> <span class="n">lw</span><span class="o">=</span><span class="mi">2</span><span class="p">)</span>

<span class="c1"># Set the limits
</span><span class="n">plt</span><span class="p">.</span><span class="n">xlim</span><span class="p">(</span><span class="n">xmin</span><span class="o">=</span><span class="n">datetime</span><span class="p">(</span><span class="mi">2014</span><span class="p">,</span><span class="mi">1</span><span class="p">,</span><span class="mi">1</span><span class="p">),</span> <span class="n">xmax</span><span class="o">=</span><span class="n">datetime</span><span class="p">(</span><span class="mi">2025</span><span class="p">,</span><span class="mi">1</span><span class="p">,</span><span class="mi">1</span><span class="p">))</span>
<span class="n">plt</span><span class="p">.</span><span class="n">ylim</span><span class="p">(</span><span class="n">ymin</span><span class="o">=</span><span class="mi">2</span><span class="p">,</span> <span class="n">ymax</span><span class="o">=</span><span class="mi">8</span><span class="p">)</span>

<span class="c1"># Set the major and minor ticks and add a grid
</span><span class="n">ax</span><span class="p">.</span><span class="n">xaxis</span><span class="p">.</span><span class="n">set_major_locator</span><span class="p">(</span><span class="n">YearLocator</span><span class="p">(</span><span class="mi">1</span><span class="p">))</span>
<span class="n">ax</span><span class="p">.</span><span class="n">xaxis</span><span class="p">.</span><span class="n">set_major_formatter</span><span class="p">(</span><span class="n">DateFormatter</span><span class="p">(</span><span class="s">'        ’%y'</span><span class="p">))</span>
<span class="n">plt</span><span class="p">.</span><span class="n">setp</span><span class="p">(</span><span class="n">ax</span><span class="p">.</span><span class="n">get_xticklabels</span><span class="p">()[</span><span class="o">-</span><span class="mi">1</span><span class="p">],</span> <span class="n">visible</span><span class="o">=</span><span class="bp">False</span><span class="p">)</span>
<span class="n">ax</span><span class="p">.</span><span class="n">yaxis</span><span class="p">.</span><span class="n">set_major_locator</span><span class="p">(</span><span class="n">MultipleLocator</span><span class="p">(</span><span class="mi">2</span><span class="p">))</span>
<span class="n">ax</span><span class="p">.</span><span class="n">yaxis</span><span class="p">.</span><span class="n">set_major_formatter</span><span class="p">(</span><span class="s">'{x:.0f}%'</span><span class="p">)</span>
<span class="n">ax</span><span class="p">.</span><span class="n">yaxis</span><span class="p">.</span><span class="n">set_minor_locator</span><span class="p">(</span><span class="n">MultipleLocator</span><span class="p">(</span><span class="mi">1</span><span class="p">))</span>
<span class="n">ax</span><span class="p">.</span><span class="n">grid</span><span class="p">(</span><span class="n">linewidth</span><span class="o">=</span><span class="p">.</span><span class="mi">5</span><span class="p">,</span> <span class="n">axis</span><span class="o">=</span><span class="s">'y'</span><span class="p">,</span> <span class="n">which</span><span class="o">=</span><span class="s">'both'</span><span class="p">,</span> <span class="n">color</span><span class="o">=</span><span class="s">'#dddddd'</span><span class="p">,</span> <span class="n">linestyle</span><span class="o">=</span><span class="s">'-'</span><span class="p">)</span>

<span class="c1"># Title and axis labels
</span><span class="n">plt</span><span class="p">.</span><span class="n">suptitle</span><span class="p">(</span><span class="s">'Jobless rate'</span><span class="p">,</span> <span class="n">x</span><span class="o">=</span><span class="mf">0.1</span><span class="p">,</span> <span class="n">ha</span><span class="o">=</span><span class="s">'left'</span><span class="p">,</span> <span class="n">fontsize</span><span class="o">=</span><span class="mi">18</span><span class="p">,</span> <span class="n">fontweight</span><span class="o">=</span><span class="s">'bold'</span><span class="p">)</span>
<span class="n">plt</span><span class="p">.</span><span class="n">title</span><span class="p">(</span><span class="s">'Percent of civilian workforce that is unemployed, by month, </span><span class="se">\n</span><span class="s">seasonally adjusted'</span><span class="p">,</span>
           <span class="n">x</span><span class="o">=</span><span class="mf">0.0</span><span class="p">,</span>
           <span class="n">ha</span><span class="o">=</span><span class="s">'left'</span><span class="p">,</span> 
           <span class="n">fontstyle</span><span class="o">=</span><span class="s">'italic'</span><span class="p">,</span>
           <span class="n">fontsize</span><span class="o">=</span><span class="mi">14</span><span class="p">)</span>

<span class="c1"># Annotations
</span><span class="n">plt</span><span class="p">.</span><span class="n">text</span><span class="p">(</span><span class="n">datetime</span><span class="p">(</span><span class="mi">2020</span><span class="p">,</span> <span class="mi">4</span><span class="p">,</span> <span class="mi">1</span><span class="p">),</span> <span class="mf">2.6</span><span class="p">,</span> <span class="s">"Peak of 14.8% in April 2020"</span><span class="p">,</span> <span class="n">ha</span><span class="o">=</span><span class="s">'center'</span><span class="p">)</span>
<span class="n">plt</span><span class="p">.</span><span class="n">arrow</span><span class="p">(</span><span class="n">datetime</span><span class="p">(</span><span class="mi">2020</span><span class="p">,</span> <span class="mi">4</span><span class="p">,</span> <span class="mi">1</span><span class="p">),</span> <span class="mf">2.75</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="p">.</span><span class="mi">45</span><span class="p">,</span> <span class="n">head_width</span><span class="o">=</span><span class="mi">50</span><span class="p">,</span> <span class="n">head_length</span><span class="o">=</span><span class="p">.</span><span class="mi">25</span><span class="p">,</span> <span class="n">lw</span><span class="o">=</span><span class="p">.</span><span class="mi">75</span><span class="p">,</span> <span class="n">fc</span><span class="o">=</span><span class="s">'black'</span><span class="p">,</span> <span class="n">zorder</span><span class="o">=</span><span class="mi">100</span><span class="p">)</span>

<span class="n">plt</span><span class="p">.</span><span class="n">text</span><span class="p">(</span><span class="n">datetime</span><span class="p">(</span><span class="mi">2024</span><span class="p">,</span> <span class="mi">10</span><span class="p">,</span> <span class="mi">1</span><span class="p">),</span> <span class="mf">3.0</span><span class="p">,</span> <span class="s">"October '24:</span><span class="se">\n</span><span class="s"> 4.1%"</span><span class="p">,</span> <span class="n">ha</span><span class="o">=</span><span class="s">'right'</span><span class="p">)</span>
<span class="n">plt</span><span class="p">.</span><span class="n">arrow</span><span class="p">(</span><span class="n">datetime</span><span class="p">(</span><span class="mi">2024</span><span class="p">,</span> <span class="mi">10</span><span class="p">,</span> <span class="mi">1</span><span class="p">),</span> <span class="mf">3.3</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="p">.</span><span class="mi">45</span><span class="p">,</span> <span class="n">head_width</span><span class="o">=</span><span class="mi">50</span><span class="p">,</span> <span class="n">head_length</span><span class="o">=</span><span class="p">.</span><span class="mi">25</span><span class="p">,</span> <span class="n">lw</span><span class="o">=</span><span class="p">.</span><span class="mi">75</span><span class="p">,</span> <span class="n">fc</span><span class="o">=</span><span class="s">'black'</span><span class="p">,</span> <span class="n">zorder</span><span class="o">=</span><span class="mi">100</span><span class="p">)</span>

<span class="c1"># Make the border and tick marks 0.5 points wide
</span><span class="k">for</span> <span class="n">spine</span> <span class="ow">in</span> <span class="n">ax</span><span class="p">.</span><span class="n">spines</span><span class="p">.</span><span class="n">values</span><span class="p">():</span>
    <span class="n">spine</span><span class="p">.</span><span class="n">set_linewidth</span><span class="p">(</span><span class="mf">0.5</span><span class="p">)</span>
<span class="n">ax</span><span class="p">.</span><span class="n">tick_params</span><span class="p">(</span><span class="n">which</span><span class="o">=</span><span class="s">'both'</span><span class="p">,</span> <span class="n">width</span><span class="o">=</span><span class="p">.</span><span class="mi">5</span><span class="p">)</span>

<span class="c1"># Shaded area for the pandemic (March 2020 through August 2021)
</span><span class="n">ax</span><span class="p">.</span><span class="n">axvspan</span><span class="p">(</span><span class="s">'2020-03-01'</span><span class="p">,</span> <span class="s">'2021-08-31'</span><span class="p">,</span> <span class="n">color</span><span class="o">=</span><span class="s">'yellow'</span><span class="p">,</span> <span class="n">alpha</span><span class="o">=</span><span class="mf">0.5</span><span class="p">,</span> <span class="n">label</span><span class="o">=</span><span class="s">'Pandemic'</span><span class="p">)</span>

<span class="c1"># Add label within the shaded area
# pd.Timestamp specifies the X coordinate and 4.0 specifies the Y
</span><span class="n">ax</span><span class="p">.</span><span class="n">text</span><span class="p">(</span><span class="n">pd</span><span class="p">.</span><span class="n">Timestamp</span><span class="p">(</span><span class="s">'2020-03-15'</span><span class="p">),</span> <span class="mf">4.0</span><span class="p">,</span> <span class="s">'Pandemic'</span><span class="p">,</span> <span class="n">color</span><span class="o">=</span><span class="s">'black'</span><span class="p">,</span> <span class="n">fontsize</span><span class="o">=</span><span class="mi">12</span><span class="p">,</span> <span class="n">ha</span><span class="o">=</span><span class="s">'left'</span><span class="p">)</span>

<span class="c1">#plt.show()
</span>
<span class="c1"># Save as PNG
</span><span class="n">plt</span><span class="p">.</span><span class="n">savefig</span><span class="p">(</span><span class="s">'2024-11-06-Jobless_Rate.png'</span><span class="p">,</span> <span class="nb">format</span><span class="o">=</span><span class="s">'png'</span><span class="p">,</span> <span class="n">dpi</span><span class="o">=</span><span class="mi">200</span><span class="p">)</span>
</code></pre></div></div>

<p>Looks pretty good, if I do say so myself. Certainly some things could be improved (the peak, Pandemic, and October ’24 labels) or debated (the span of the shading for the “Pandemic” era), but overall I am pleased. TIL how to add a shaded area to a line plot in Matplotlib.</p>

<p><img src="/img/2024-11-06-Jobless_Rate.png" alt="" /></p>

<p>For good measure, I tested myself and did it in Excel as well. I had to make a few different choices on labels, and I got tired of trying to get the call-out shape on Oct ’24 just right. Again, overall I’m pleased.</p>

<p><img src="/img/2024-11-08-Jobless_Rate_Excel.png" alt="" /></p>]]></content><author><name></name></author><category term="datavis" /><category term="data" /><category term="python" /><category term="excel" /><summary type="html"><![CDATA[TIL: Shaded area on a line graph in Matplotlib using pyplot. Dr. Drang had two posts on All This about the jobless rate (unemployment) and his use of Matplotlib and Python for plotting. I decided to experiment. I always love a good chart recreation exercise. Ticks tricks provides the details while Rescaling a graph was the original post. –more–]]></summary></entry><entry><title type="html">ChatGPT Prompt Frameworks</title><link href="https://sprestridge.net/llm/chatgpt/ai/data%20science/2024/05/23/ChatGPT-Frameworks.html" rel="alternate" type="text/html" title="ChatGPT Prompt Frameworks" /><published>2024-05-23T14:00:00+00:00</published><updated>2024-05-23T14:00:00+00:00</updated><id>https://sprestridge.net/llm/chatgpt/ai/data%20science/2024/05/23/ChatGPT-Frameworks</id><content type="html" xml:base="https://sprestridge.net/llm/chatgpt/ai/data%20science/2024/05/23/ChatGPT-Frameworks.html"><![CDATA[<p>Unlock the full potential of <a href="https://chat.openai.com/">ChatGPT</a> and LLMs. Learn these four simple prompt frameworks to improve the responses from the model.</p>

<!--more-->

<p><code class="language-plaintext highlighter-rouge">R-T-F</code>: <strong>Role - Task - Format</strong>. Act as a ROLE, Create a TASK, Show as FORMAT.</p>

<p><code class="language-plaintext highlighter-rouge">T-A-G</code>: <strong>Task - Action - Goal</strong>. Define the TASK, State the ACTION, Clarify the GOAL.</p>

<p><code class="language-plaintext highlighter-rouge">B-A-B</code>: <strong>Before - After - Bridge</strong>. Explain problem BEFORE, State outcome AFTER, ask the BRIDGE.</p>

<p><code class="language-plaintext highlighter-rouge">C-A-R-E</code>: <strong>Context - Action - Result - Example</strong>. Give the CONTEXT, Describe the ACTION, Clarify the RESULT, Give the EXAMPLE.</p>
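
<p>As a rough illustration, here is how the first framework, R-T-F, might be filled in with a small Python helper. The function name <code class="language-plaintext highlighter-rouge">rtf_prompt</code>, its parameters, and the sample values are all made up for this sketch; the same idea should carry over to the other frameworks by swapping the slots.</p>

```python
def rtf_prompt(role, task, output_format):
    """Assemble a Role - Task - Format prompt as one string.

    Hypothetical helper for illustration; not part of any library.
    """
    return f"Act as {role}. {task} Show the result as {output_format}."


# Sample values are invented for this sketch
prompt = rtf_prompt(
    role="a lacrosse analyst",
    task="Create a summary of this weekend's games.",
    output_format="a bulleted list",
)
print(prompt)
```

<p>The resulting string can be pasted into ChatGPT as-is. Keeping the three slots separate makes it easy to change one (say, the FORMAT) without rewriting the rest.</p>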

<p>Eager to use other strategic planning frameworks like SWOT analysis from the business world? That works too.</p>

<p><code class="language-plaintext highlighter-rouge">S-W-O-T</code>: <strong>Strengths - Weaknesses - Opportunities - Threats</strong>. Analyze STRENGTHS, acknowledge WEAKNESSES, explore OPPORTUNITIES, and consider THREATS.</p>

<p>How do these frameworks work for you? Others to share?</p>]]></content><author><name></name></author><category term="llm" /><category term="chatgpt" /><category term="ai" /><category term="data science" /><summary type="html"><![CDATA[Unlock the full potential of ChatGPT and LLMs. Learn these four simple prompt frameworks to improve the responses from the model.]]></summary></entry><entry><title type="html">TIL: SORT, UNIQUE, VSTACK</title><link href="https://sprestridge.net/excel/2024/04/03/SORT-UNIQUE-VSTACK.html" rel="alternate" type="text/html" title="TIL: SORT, UNIQUE, VSTACK" /><published>2024-04-03T17:00:00+00:00</published><updated>2024-04-03T17:00:00+00:00</updated><id>https://sprestridge.net/excel/2024/04/03/SORT-UNIQUE-VSTACK</id><content type="html" xml:base="https://sprestridge.net/excel/2024/04/03/SORT-UNIQUE-VSTACK.html"><![CDATA[<h2 id="how-to-use-three-formulas-to-combine-and-sort-the-unique-values-from-two-different-lists-arrays">How to use three formulas to combine and sort the unique values from two different lists (arrays)</h2>

<p>Imagine two very long lists of unique codes (names, ID numbers, any unique identifier). You need a single list of the unique codes. There are several approaches, but I learned about <code class="language-plaintext highlighter-rouge">VSTACK</code> recently, have wanted to use it, and had to look it up again to apply it, so I am writing this as a TIL - today I learned.</p>

<p>Combine the two lists (<code class="language-plaintext highlighter-rouge">VSTACK</code>), keep only the unique values (<code class="language-plaintext highlighter-rouge">UNIQUE</code>), and sort the result (<code class="language-plaintext highlighter-rouge">SORT</code>).</p>

<!--more-->

<p>In the screenshot, the array formula in cell D3 combines these three Excel functions to produce the sorted list of unique alpha codes. Two adjacent Boolean columns show a 1 or a 0 depending on whether the alpha code appears in list one or list two. <code class="language-plaintext highlighter-rouge">ISNUMBER</code> and <code class="language-plaintext highlighter-rouge">MATCH</code> are wrapped in a double unary operator (<code class="language-plaintext highlighter-rouge">--</code>) to convert TRUE and FALSE into 1s and 0s. A value of 1 indicates the code is in that list; a value of 0 indicates it is not.</p>

<p><img src="/img/2024-04-03_SORT-UNIQUE-VSTACK.png" alt="" title="Screenshot showing the two lists and the formula combining SORT, UNIQUE, and VSTACK described in this post." /></p>

<p>The final column, <strong>Source</strong>, uses an <code class="language-plaintext highlighter-rouge">IF</code> formula and the Boolean columns to indicate where the alpha code appeared: list 1, list 2, or both lists. Careful readers will note that <strong>EVER appears in both lists</strong>, and the <code class="language-plaintext highlighter-rouge">IF</code> formula correctly identifies that in column G.</p>

<h3 id="formulas">Formulas</h3>

<p><strong>D3</strong>: <code class="language-plaintext highlighter-rouge">= SORT( UNIQUE( VSTACK(A3:A7, B3:B8)))</code></p>

<p><strong>E3</strong>: <code class="language-plaintext highlighter-rouge">= --ISNUMBER( MATCH( D3#, $A$3:$A$7, 0))</code></p>

<p><strong>F3</strong>: <code class="language-plaintext highlighter-rouge">= --ISNUMBER( MATCH( D3#, $B$3:$B$8, 0))</code></p>

<p><strong>G3</strong>: <code class="language-plaintext highlighter-rouge">=IF(AND(E3, F3), "Both", IF(E3, "List 1", "List 2"))</code></p>
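
<p>For anyone who thinks more easily in code, the same logic can be sketched in plain Python. This is only an analogy, not what Excel does internally. The alpha codes below are invented for illustration (the post only confirms that EVER sits in both lists), and the names <code class="language-plaintext highlighter-rouge">list_one</code>, <code class="language-plaintext highlighter-rouge">list_two</code>, and <code class="language-plaintext highlighter-rouge">source</code> are hypothetical.</p>

```python
# Hypothetical alpha codes; only EVER is known from the post to be in both lists
list_one = ["DELT", "ABLE", "EVER", "CHAR", "BAKE"]          # stands in for A3:A7
list_two = ["FOXT", "EVER", "GOLF", "HOTL", "INDI", "JULI"]  # stands in for B3:B8

# VSTACK: append the second list below the first
stacked = list_one + list_two

# UNIQUE then SORT: drop duplicates, then order alphabetically
unique_sorted = sorted(set(stacked))

# --ISNUMBER(MATCH(...)): 1 if the code is found in the list, otherwise 0
in_one = [int(code in list_one) for code in unique_sorted]
in_two = [int(code in list_two) for code in unique_sorted]


def source(flag_one, flag_two):
    """Mirror the nested IF in G3."""
    if flag_one and flag_two:
        return "Both"
    return "List 1" if flag_one else "List 2"


sources = [source(a, b) for a, b in zip(in_one, in_two)]

for row in zip(unique_sorted, in_one, in_two, sources):
    print(row)
```

<p>One difference to keep in mind: Python compares strings case-sensitively, while Excel’s <code class="language-plaintext highlighter-rouge">SORT</code> and <code class="language-plaintext highlighter-rouge">UNIQUE</code> do not, so mixed-case codes could come out differently. With all-uppercase codes like these, the two should agree.</p>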

<h3 id="functions">Functions</h3>

<p><a href="https://support.microsoft.com/en-us/office/vstack-function-a4b86897-be0f-48fc-adca-fcc10d795a9c">VSTACK</a> - Appends arrays vertically and in sequence to return a larger array.</p>

<p><a href="https://support.microsoft.com/en-us/office/sort-function-22f63bd0-ccc8-492f-953d-c20e8e44b86c">SORT</a> - The SORT function sorts the contents of a range or array.</p>

<p><a href="https://support.microsoft.com/en-us/office/unique-function-c5ab87fd-30a3-4ce9-9d1a-40204fb85e1e">UNIQUE</a> - The UNIQUE function returns a list of unique values in a list or range.</p>

<p><a href="https://support.microsoft.com/en-us/office/is-functions-0f2d7971-6019-40a0-a171-f2d869135665">ISNUMBER</a> - checks whether a specified value is a number (TRUE) or not (FALSE)</p>

<p><a href="https://support.microsoft.com/en-us/office/match-function-e8dffd45-c762-47d6-bf89-533f4a37673a">MATCH</a> - The MATCH function searches for a specified item in a range of cells, and then returns the relative position of that item in the range.</p>

<p><a href="https://support.microsoft.com/en-us/office/if-function-69aed7c9-4e8a-4755-a9bc-aa8bbff73be2">IF</a> - makes logical comparisons and can have two results, one if the comparison is TRUE and the other if the comparison is FALSE.</p>]]></content><author><name></name></author><category term="excel" /><summary type="html"><![CDATA[How to use three formulas to combine and sort the unique values from two different lists (arrays) Imagine two very long lists of unique codes (names, id numbers, any unique identifier). You need a single list of the unique codes. There are several approaches but I learned about VSTACK recently, have wanted to use it, and had to look it up again to apply it, so I am writing this as a TIL - today I learned. Use the two lists to combine (VSTACK) them into a single list of unique values (UNIQUE) that is sorted (SORT).]]></summary></entry></feed>