This post is a scratchpad demonstrating the building blocks available when writing on this blog. To start your own post, drop a new YYYY-MM-DD-slug.md into _posts/ and copy whatever frontmatter you need from the top of this file.
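The naming convention can be sketched in a few lines of Python. The helper name `post_filename` and its slug rules (lowercase, hyphens in place of anything non-alphanumeric) are assumptions for illustration, not part of Jekyll or this theme.

```python
import re
from datetime import date


def post_filename(title: str, day: date) -> str:
    """Build a YYYY-MM-DD-slug.md filename (hypothetical helper)."""
    # Lowercase, then collapse every run of non-alphanumerics into one hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    return f"{day.isoformat()}-{slug}.md"


print(post_filename("Hello, World!", date(2024, 1, 15)))  # 2024-01-15-hello-world.md
```

Save the resulting file under _posts/ and Jekyll should pick it up on the next build.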

Markdown basics

Plain prose, italic, bold (note the accent color), and inline code all render as you’d expect. Hyperlinks like the Tokyo Night repo get their own accent and a hover state.

Premature optimization is the root of all evil (or at least most of it) in programming. — Donald Knuth

Bullets and numbered lists:

  • Step one is to read the spec.
  • Step two is to ignore the spec.
  • Step three is to draft a postmortem.

  1. Reproduce the bug.
  2. Isolate the smallest failing case.
  3. Fix it; add a regression test.

Code blocks

Triple-backtick fences with a language hint get full syntax highlighting via Rouge. The colors come from the Tokyo Night palette baked into assets/css/jekyll-pygments-themes-native.css.

def fibonacci(n: int) -> list[int]:
    """Classic, iterative: O(n) time, O(1) extra space beyond the output list."""
    a, b = 0, 1
    out = []
    for _ in range(n):
        out.append(a)
        a, b = b, a + b
    return out


print(fibonacci(8))  # [0, 1, 1, 2, 3, 5, 8, 13]

fn dot(a: &[f64], b: &[f64]) -> f64 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

# Spin up the local Jekyll dev server
export PATH="/opt/homebrew/lib/ruby/gems/4.0.0/bin:/opt/homebrew/opt/ruby/bin:$PATH"
bundle install
bundle exec jekyll serve --livereload

Math (MathJax 3)

Inline math sits between $$ ... $$ inside a paragraph — for example, the Bellman equation \(V^\pi(s) = \mathbb{E}_\pi\!\left[ R_{t+1} + \gamma V^\pi(S_{t+1}) \mid S_t = s \right]\) is short enough to flow with prose.
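To make the recursion concrete, here is a minimal sketch of iterative policy evaluation, which applies the Bellman equation repeatedly until the values stop changing. The two-state chain, its rewards, and the function name are made up for illustration.

```python
def evaluate_policy(transitions, gamma, sweeps=500):
    """Repeat V(s) <- sum_p p * (r + gamma * V(s')) for a fixed number of sweeps."""
    values = {s: 0.0 for s in transitions}
    for _ in range(sweeps):
        # Each sweep rebuilds every state's value from the previous sweep's values.
        values = {
            s: sum(p * (r + gamma * values[s2]) for p, s2, r in outcomes)
            for s, outcomes in transitions.items()
        }
    return values


# Hypothetical chain under a fixed policy: (probability, next state, reward).
chain = {
    "a": [(1.0, "b", 1.0)],
    "b": [(1.0, "a", 0.0)],
}
print(evaluate_policy(chain, gamma=0.5))
```

With \(\gamma = 0.5\) the values settle near \(V(a) = 4/3\) and \(V(b) = 2/3\), which satisfy the equation exactly.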

Display math goes in its own paragraph:

\[J(\theta) \;=\; \mathbb{E}_{\tau \sim \pi_\theta}\!\left[ \sum_{t=0}^{T} \gamma^t r(s_t, a_t) \right]\]
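For a single sampled trajectory, the sum inside the expectation is a discounted return. A minimal sketch, assuming the rewards are already collected in a list; the function name and sample values are hypothetical.

```python
def discounted_return(rewards: list[float], gamma: float) -> float:
    """Sum of gamma**t * r_t over one trajectory, with t starting at 0."""
    total = 0.0
    # Walk backwards so each step costs one multiply and one add.
    for r in reversed(rewards):
        total = r + gamma * total
    return total


print(discounted_return([1.0, 1.0, 1.0], gamma=0.5))  # 1.75
```

Averaging this quantity over many trajectories sampled from the policy gives a Monte Carlo estimate of \(J(\theta)\).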

Numbered equations work via the LaTeX equation environment, and you can \eqref them later:

\begin{equation} \label{eq:policy-gradient} \nabla_\theta J(\theta) \;=\; \mathbb{E}_{\tau \sim \pi_\theta}\!\left[ \sum_{t=0}^{T} \nabla_\theta \log \pi_\theta(a_t \mid s_t) \, \hat{A}_t \right] \end{equation}

The policy gradient theorem in equation \eqref{eq:policy-gradient} is the workhorse behind REINFORCE, A2C, PPO, and friends.
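As a rough illustration of the estimator inside that expectation, the sketch below computes a single-trajectory gradient for a softmax policy. It assumes a state-independent policy parameterized directly by its logits, so that the gradient of \(\log \pi_\theta(a)\) has the closed form "one-hot of the action minus the action probabilities"; the function names and sample numbers are hypothetical.

```python
import math


def softmax(logits: list[float]) -> list[float]:
    """Numerically stable softmax."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    return [e / z for e in exps]


def grad_log_softmax(logits: list[float], action: int) -> list[float]:
    """Gradient of log pi(action) w.r.t. the logits: one_hot(action) - pi."""
    probs = softmax(logits)
    return [(1.0 if i == action else 0.0) - p for i, p in enumerate(probs)]


def policy_gradient_estimate(logits, actions, advantages):
    """Single-trajectory estimate: sum over t of grad log pi(a_t) * A_hat_t."""
    grad = [0.0] * len(logits)
    for a, adv in zip(actions, advantages):
        for i, g in enumerate(grad_log_softmax(logits, a)):
            grad[i] += g * adv
    return grad


# Two equally likely actions; action 0 did better than expected, action 1 worse.
print(policy_gradient_estimate([0.0, 0.0], actions=[0, 1], advantages=[1.0, -1.0]))
# [1.0, -1.0]
```

The result pushes the logit of action 0 up and the logit of action 1 down, which is the behaviour the advantage weighting is meant to produce.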

Images

The simplest form is plain Markdown:

[Image: Tokyo Night palette swatch]

For richer behaviour — captions, click-to-zoom, responsive sizing — use al-folio’s figure.html include:

[Figure: A subset of the Tokyo Night palette this site is themed with.]

Side-by-side images use a Bootstrap row with col-sm for each cell:

<div class="row">
  <div class="col-sm mt-3 mt-md-0">
    {% include figure.html path="assets/img/posts/sample/palette.svg" class="img-fluid rounded" %}
  </div>
  <div class="col-sm mt-3 mt-md-0">
    {% include figure.html path="assets/img/posts/sample/palette.svg" class="img-fluid rounded" %}
  </div>
</div>

Tables

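| Algorithm | On-/Off-policy | Action space          | Notes                             |
|-----------|----------------|-----------------------|-----------------------------------|
| REINFORCE | on             | discrete + continuous | High variance; use a baseline.    |
| DQN       | off            | discrete              | Replay buffer + target network.   |
| PPO       | on             | discrete + continuous | Clipped surrogate objective.      |
| SAC       | off            | continuous            | Entropy-regularized actor-critic. |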

Wrapping up

Once you’re comfortable with the format, delete this file and start populating _posts/ with the real stuff. Each post becomes a card on /blog/ and its own permalink at /blog/<year>/<slug>/.
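The filename-to-URL mapping described above can be sketched as follows. The real permalink is produced by Jekyll from the site configuration; this hypothetical helper only mirrors the /blog/<year>/<slug>/ pattern, assuming the slug is taken straight from the filename.

```python
import re


def permalink(filename: str) -> str:
    """Map 'YYYY-MM-DD-slug.md' to '/blog/<year>/<slug>/' (hypothetical helper)."""
    match = re.fullmatch(r"(\d{4})-\d{2}-\d{2}-(.+)\.md", filename)
    if match is None:
        raise ValueError(f"not a post filename: {filename!r}")
    year, slug = match.groups()
    return f"/blog/{year}/{slug}/"


print(permalink("2024-01-15-hello-world.md"))  # /blog/2024/hello-world/
```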