Building My Academic Homepage with Astro

4 min read
Tags: Astro, Web Development, Tutorial

Why Astro?

When building an academic homepage, the priorities are clear: fast load times, excellent SEO, and clean typography for math and code. After evaluating Jekyll, Hugo, Next.js, and others, I chose Astro for several reasons:

  • Zero JavaScript by default — pages ship as pure HTML + CSS
  • Content Collections — type-safe Markdown/MDX with Zod schema validation
  • Built-in i18n — first-class support for bilingual routing
  • Island Architecture — add interactivity only where needed

The best framework is the one that gets out of your way and lets the content shine.

Project Architecture

The site follows a conventional Astro layout with one directory per concern. Here’s the key structure:

src/
├── components/     # Reusable UI components
├── content/        # Markdown/MDX content collections
│   ├── blog/       # Blog posts (en/ + zh/)
│   └── research/   # Research papers (en/ + zh/)
├── i18n/           # Translation strings
├── layouts/        # Page layout wrappers
├── lib/            # Utilities and constants
├── pages/          # File-based routing
│   ├── en/         # English pages
│   └── zh/         # Chinese pages
└── styles/         # Global CSS

Each content collection is validated against a Zod schema. For example, the blog collection schema ensures every post has the required frontmatter:

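import { defineCollection, z } from 'astro:content';
import { glob } from 'astro/loaders';

const blog = defineCollection({
  // Load every Markdown/MDX file under src/content/blog (en/ and zh/)
  loader: glob({ pattern: '**/*.{md,mdx}', base: './src/content/blog' }),
  schema: z.object({
    title: z.string(),
    description: z.string(),
    pubDate: z.coerce.date(), // date strings in frontmatter become Date objects
    updatedDate: z.coerce.date().optional(),
    tags: z.array(z.string()).default([]),
    draft: z.boolean().default(false),
  }),
});

// Other collections (e.g. research) are exported alongside blog
export const collections = { blog };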
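To make the effect of the coercion and defaults concrete, here is a hand-written TypeScript mirror of what a post's data looks like after validation. It is only an illustrative sketch: in a real Astro project the type is inferred from the schema, and `BlogData`, `RawFrontmatter`, `applyDefaults`, and the sample post below are hypothetical names and values, not part of the site.

```typescript
// Hand-written mirror of the validated blog frontmatter (illustrative only;
// in an Astro project this type is inferred from the Zod schema).
type BlogData = {
  title: string;
  description: string;
  pubDate: Date;
  updatedDate?: Date;
  tags: string[];
  draft: boolean;
};

// Raw frontmatter before validation (hypothetical shape, dates as strings).
type RawFrontmatter = {
  title: string;
  description: string;
  pubDate: string;
  updatedDate?: string;
  tags?: string[];
  draft?: boolean;
};

// Mimics the schema's date coercion and default values by hand.
function applyDefaults(raw: RawFrontmatter): BlogData {
  const pubDate = new Date(raw.pubDate);
  if (isNaN(pubDate.getTime())) {
    throw new Error('Invalid pubDate: ' + raw.pubDate);
  }
  const data: BlogData = {
    title: raw.title,
    description: raw.description,
    pubDate: pubDate,
    tags: raw.tags !== undefined ? raw.tags : [],
    draft: raw.draft !== undefined ? raw.draft : false,
  };
  if (raw.updatedDate !== undefined) {
    data.updatedDate = new Date(raw.updatedDate);
  }
  return data;
}

// Hypothetical post that omits tags, draft, and updatedDate.
const samplePost = applyDefaults({
  title: 'Building My Academic Homepage with Astro',
  description: 'Sample description for illustration.',
  pubDate: '2025-01-15',
});
```

A post that leaves out `tags` and `draft` still ends up with an empty tag list and `draft` set to `false`, so page code can read both fields without null checks.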

Math Rendering with KaTeX

One of the key requirements for an academic site is proper math support. With remark-math and rehype-katex, I can write LaTeX directly in Markdown.
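A minimal sketch of how the two plugins are typically wired into an Astro config. It assumes `remark-math`, `rehype-katex`, and `katex` are installed as dependencies; the actual config of this site may contain more options.

```typescript
// astro.config.ts (sketch; assumes remark-math and rehype-katex are installed)
import { defineConfig } from 'astro/config';
import remarkMath from 'remark-math';
import rehypeKatex from 'rehype-katex';

export default defineConfig({
  markdown: {
    // remark-math parses $...$ and $$...$$ in Markdown;
    // rehype-katex renders the parsed math to HTML at build time.
    remarkPlugins: [remarkMath],
    rehypePlugins: [rehypeKatex],
  },
});
```

The rendered markup also needs KaTeX's stylesheet loaded on the page, otherwise the formulas appear unstyled.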

Inline Math

The loss function of logistic regression is $\mathcal{L}(\theta) = -\frac{1}{N}\sum_{i=1}^{N}[y_i \log(\hat{y}_i) + (1-y_i)\log(1-\hat{y}_i)]$, where $\hat{y}_i = \sigma(\mathbf{x}_i^T\theta)$.
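As a worked illustration of the formula, the following sketch evaluates the loss for a small data set. The function names `sigmoid` and `logisticLoss` and the clamping constant `eps` are assumptions made for this example, not something taken from the site's code.

```typescript
// Numerically stable logistic sigmoid.
function sigmoid(z: number): number {
  return z >= 0 ? 1 / (1 + Math.exp(-z)) : Math.exp(z) / (1 + Math.exp(z));
}

// Mean binary cross-entropy over N samples.
// X is N x d, y holds labels in {0, 1}, theta has length d.
function logisticLoss(X: number[][], y: number[], theta: number[]): number {
  if (X.length === 0 || X.length !== y.length) {
    throw new Error('X and y must be non-empty and have the same length');
  }
  const eps = 1e-12; // assumed clamp that keeps log() finite
  let total = 0;
  for (let i = 0; i < X.length; i++) {
    let z = 0;
    for (let j = 0; j < theta.length; j++) {
      z += X[i][j] * theta[j]; // x_i^T theta
    }
    const p = Math.min(Math.max(sigmoid(z), eps), 1 - eps);
    total += y[i] * Math.log(p) + (1 - y[i]) * Math.log(1 - p);
  }
  return -total / X.length;
}

// With theta = 0 every prediction is 0.5, so the loss equals ln 2.
const lossAtZero = logisticLoss([[1, 2], [3, 4]], [1, 0], [0, 0]);
```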

Display Math

The cosine similarity between two asset embedding vectors, used in my research on representation learning for financial assets:

$$
\text{sim}(\mathbf{v}_i, \mathbf{v}_j) = \frac{\mathbf{v}_i \cdot \mathbf{v}_j}{\|\mathbf{v}_i\| \cdot \|\mathbf{v}_j\|}
$$
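The same quantity can be computed in a few lines. This is a minimal sketch; `cosineSimilarity` is a hypothetical helper name, and the choice to throw on zero vectors (where the ratio is undefined) is an assumption of the example.

```typescript
// Cosine similarity: dot product divided by the product of the norms.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error('Vectors must have the same dimension');
  }
  let dot = 0;
  let normASq = 0;
  let normBSq = 0;
  for (let k = 0; k < a.length; k++) {
    dot += a[k] * b[k];
    normASq += a[k] * a[k];
    normBSq += b[k] * b[k];
  }
  if (normASq === 0 || normBSq === 0) {
    throw new Error('Cosine similarity is undefined for zero vectors');
  }
  return dot / (Math.sqrt(normASq) * Math.sqrt(normBSq));
}

// Parallel vectors score 1, orthogonal vectors score 0.
const simParallel = cosineSimilarity([1, 2, 3], [2, 4, 6]);
const simOrthogonal = cosineSimilarity([1, 0], [0, 1]);
```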

A more complex example — the attention mechanism in Transformers:

$$
\text{Attention}(Q, K, V) = \text{softmax}\left(\frac{QK^T}{\sqrt{d_k}}\right)V
$$
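For readers who prefer code to notation, here is a small sketch of the formula on plain nested arrays. It assumes a single head, no masking, and no batching; the helper names (`matmul`, `transpose`, `softmaxRows`, `attention`) are made up for this example.

```typescript
type Matrix = number[][];

// Plain matrix product A (n x m) times B (m x p).
function matmul(A: Matrix, B: Matrix): Matrix {
  return A.map((row) =>
    B[0].map((_, j) => row.reduce((sum, a, k) => sum + a * B[k][j], 0))
  );
}

function transpose(A: Matrix): Matrix {
  return A[0].map((_, j) => A.map((row) => row[j]));
}

// Row-wise softmax; subtracting the row maximum avoids overflow in exp().
function softmaxRows(A: Matrix): Matrix {
  return A.map((row) => {
    const max = row.reduce((m, v) => Math.max(m, v), -Infinity);
    const exps = row.map((v) => Math.exp(v - max));
    const sum = exps.reduce((s, v) => s + v, 0);
    return exps.map((v) => v / sum);
  });
}

// softmax(Q K^T / sqrt(d_k)) V
function attention(Q: Matrix, K: Matrix, V: Matrix): Matrix {
  const dk = K[0].length;
  const scores = matmul(Q, transpose(K)).map((row) =>
    row.map((v) => v / Math.sqrt(dk))
  );
  return matmul(softmaxRows(scores), V);
}

// A zero query attends equally to both keys, so the output is the mean of V's rows.
const attnOut = attention([[0, 0]], [[1, 0], [0, 1]], [[1, 2], [3, 4]]);
```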

Code Highlighting

Astro uses Shiki for syntax highlighting with dual theme support. Here’s an example of querying content collections:

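---
import { getCollection } from 'astro:content';
// Assumed path for a page under src/pages/en/; adjust to the actual location
import BlogPostCard from '../../components/BlogPostCard.astro';

const allPosts = await getCollection('blog');
// Keep published English posts, newest first
const posts = allPosts
  .filter((entry) => entry.id.startsWith('en/'))
  .filter((entry) => !entry.data.draft)
  .sort((a, b) => b.data.pubDate.getTime() - a.data.pubDate.getTime());
---

{posts.map((post) => (
  <BlogPostCard title={post.data.title} />
))}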

And here’s a Python snippet that trains asset embeddings from portfolio holdings:

from gensim.models import KeyedVectors, Word2Vec

def train_asset_embeddings(portfolios: list[list[str]], dim: int = 64) -> KeyedVectors:
    """Train Word2Vec-style embeddings for financial assets.

    Each portfolio is treated as a "sentence" of asset identifiers.
    """
    model = Word2Vec(
        sentences=portfolios,
        vector_size=dim,
        window=5,
        min_count=1,  # keep every asset, even ones that appear once
        workers=4,
        sg=1,  # skip-gram
    )
    return model.wv  # the trained asset vectors

Dark Mode Support

The code blocks automatically switch themes based on the site’s dark mode setting — github-light for light mode and one-dark-pro for dark mode. This is achieved through Shiki’s CSS variables:

.dark .astro-code,
.dark .astro-code span {
  color: var(--shiki-dark) !important;
  background-color: var(--shiki-dark-bg) !important;
}
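For context, the CSS variables above only exist when Shiki is configured with two themes. A minimal sketch of that configuration, shown in isolation with other `markdown` options omitted (the site's real config may differ):

```typescript
// astro.config.ts (sketch; other options omitted)
import { defineConfig } from 'astro/config';

export default defineConfig({
  markdown: {
    shikiConfig: {
      // The light theme is emitted as inline colors; the dark theme is
      // emitted as --shiki-dark / --shiki-dark-bg CSS variables.
      themes: {
        light: 'github-light',
        dark: 'one-dark-pro',
      },
    },
  },
});
```

The `.dark` selector then only has to swap each token's color to the variable that Shiki already attached to it.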

Bilingual Support

The i18n system uses a simple but effective approach:

  1. Routing: Both /en/ and /zh/ prefixes via Astro’s built-in i18n
  2. UI Strings: Separate translation files (en.ts, zh.ts) for all interface text
  3. Content: Parallel content directories (content/blog/en/, content/blog/zh/)
  4. Language Switch: Preserves the current page path when toggling locale
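Points 2 and 4 above can be sketched in plain TypeScript. This is an illustrative version only: the `ui` dictionary, its keys, and the helpers `t` and `switchLocalePath` are hypothetical stand-ins for whatever lives in the site's `en.ts`, `zh.ts`, and language-switch component.

```typescript
type Locale = 'en' | 'zh';

const LOCALES: Locale[] = ['en', 'zh'];

// Hypothetical UI strings; the real site keeps these in en.ts and zh.ts.
const ui: Record<Locale, Record<string, string>> = {
  en: { 'nav.blog': 'Blog', 'nav.research': 'Research' },
  zh: { 'nav.blog': '博客', 'nav.research': '研究' },
};

// Looks up a UI string, falling back to English, then to the key itself.
function t(locale: Locale, key: string): string {
  const localized = ui[locale][key];
  if (localized !== undefined) return localized;
  const fallback = ui.en[key];
  return fallback !== undefined ? fallback : key;
}

// Swaps the leading locale segment and keeps the rest of the path intact.
function switchLocalePath(pathname: string, target: Locale): string {
  const segments = pathname.split('/').filter((s) => s.length > 0);
  if (segments.length > 0 && LOCALES.indexOf(segments[0] as Locale) !== -1) {
    segments[0] = target;
  } else {
    // No locale prefix found: add one instead of replacing a real segment.
    segments.unshift(target);
  }
  const hadTrailingSlash =
    pathname.length > 1 && pathname.charAt(pathname.length - 1) === '/';
  return '/' + segments.join('/') + (hadTrailingSlash ? '/' : '');
}

// Example: the Chinese counterpart of an English blog post URL.
const zhPath = switchLocalePath('/en/blog/astro-homepage/', 'zh');
```

Because the content directories are parallel, swapping the prefix is enough to land on the translated page, provided the translation exists under the same slug.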

Performance Results

The final build reports the following metrics:

| Metric | Result |
| --- | --- |
| First Contentful Paint | < 0.8 s |
| Total Page Weight | < 150 KB |
| Lighthouse Performance | 98+ |
| JavaScript Shipped | 0 KB (static pages) |

What’s Next

There are several enhancements planned for future iterations:

  • RSS feed for blog subscribers
  • Search using Pagefind for client-side full-text search
  • Comments via Giscus (GitHub Discussions)
  • View Transitions for smooth page navigation

This post was written to test the blog system’s rendering capabilities — KaTeX math, Shiki code highlighting, tables, lists, blockquotes, and responsive typography.