Purpose
Content Without Context
You’ve written a 5,000-word technical deep dive on Kubernetes networking. A reader lands on your blog post and sees… nothing indicating how long it will take to read. They might be on a coffee break with 5 minutes to spare, or settling in for a long evening study session. Without a reading time estimate, they can’t make an informed decision about whether to start now or bookmark for later.
This isn’t just a nice-to-have feature—it’s a user experience problem that affects engagement metrics. Medium, Dev.to, and every major content platform displays reading time because it reduces bounce rates and sets expectations. But how do you add this to a static site generator like Astro without manually calculating and hardcoding the time for every article?
A Remark Plugin That Calculates Automatically
Remark is a Markdown processor that transforms Markdown into an Abstract Syntax Tree (AST), applies transformations, and outputs HTML. By writing a custom Remark plugin, we can intercept the AST during the build process, extract all text content, calculate reading time using word count heuristics, and inject the result into the page’s frontmatter—all before the HTML is generated.
The specific code we’re analyzing is remark-plugins/remark-reading-time.mjs, a 10-line plugin that:
- Receives the Markdown AST
- Converts it to plain text (stripping formatting)
- Calculates reading time using the reading-time library
- Injects the result into Astro’s frontmatter data
Understanding the Plugin Lifecycle
This tutorial demonstrates three critical skills:
- AST Manipulation: Understanding that Markdown isn’t just text—it’s a structured tree that can be programmatically traversed
- Build-Time Computation: Recognizing when to calculate data at build time vs. runtime
- Plugin Architecture: Learning how to extend tools through their plugin APIs rather than forking codebases
🔵 Deep Dive: Remark plugins follow the “unified” ecosystem pattern, where processors (remark, rehype, retext) all share the same plugin interface. Learning this pattern once unlocks the ability to write plugins for dozens of tools.
Prerequisites & Tooling
Knowledge Base
Required:
- Basic JavaScript (functions, imports, object destructuring)
- Understanding of what Markdown is (headings, lists, bold text)
- Familiarity with npm packages
Helpful (but not required):
- What an Abstract Syntax Tree (AST) is conceptually
- How static site generators work
Environment
From the project’s package.json, we’re using (the "type": "module" entry enables ES modules):
{
"type": "module",
"dependencies": {
"astro": "^5.5.4",
"reading-time": "^1.5.0",
"mdast-util-to-string": "^4.0.0"
}
}
Setup Steps:
# Install dependencies
npm install reading-time mdast-util-to-string
# Verify Node.js version (ES modules alone work from Node 14, but Astro 5 needs a newer release)
node --version # Should be at least v18; check Astro's install docs for the exact minimum
Key Libraries:
- reading-time: Calculates reading time from text using an average reading speed (200 words/min by default)
- mdast-util-to-string: Extracts plain text from a Markdown AST (mdast = Markdown Abstract Syntax Tree)
High-Level Architecture
Data Flow Diagram
graph LR
A[Markdown File] --> B[Remark Parser]
B --> C[Markdown AST]
C --> D[remarkReadingTime Plugin]
D --> E[mdast-util-to-string]
E --> F[Plain Text String]
F --> G[reading-time Library]
G --> H[Reading Time Object]
H --> I[Inject into Frontmatter]
I --> J[Astro Component]
J --> K[Rendered HTML]
style D fill:#a855f7
style I fill:#10b981
The Assembly Line
Think of Remark as a car assembly line:
- Raw Materials (Markdown): Your blog post arrives as text with # symbols and **bold** markers
- Disassembly (Parser): The text is broken down into structured parts (headings, paragraphs, lists)
- Inspection Station (Your Plugin): A worker (your plugin function) examines the disassembled parts, counts all the words, and attaches a label: “3 min read”
- Reassembly (Renderer): The parts are put back together as HTML, now with the reading time label attached
The key insight: Your plugin runs right after the disassembly phase, while the content is in its most manipulable form (the AST), not when it’s raw text or final HTML.
The Implementation
Understanding the Plugin Signature
Every Remark plugin follows this pattern:
// Naive Approach: What beginners might try first
function myPlugin() {
// This function runs ONCE when the plugin is registered
console.log("Plugin initialized!");
// BUG: This doesn't actually process any content!
}
Why This Fails: Remark needs a transformer function that processes each file. The plugin must return a function, not execute logic directly.
Refined Solution (From Repo):
export function remarkReadingTime() {
// This outer function runs once during plugin registration
// It returns the actual transformer function
return function (tree, { data }) {
// This inner function runs for EACH Markdown file
// tree = the AST (Abstract Syntax Tree)
// data = metadata object where we can store results
};
}
🔴 Danger: Forgetting to return the transformer function is the #1 mistake when writing Remark plugins. The outer function is just a factory—the inner function does the work.
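To make the factory/transformer split concrete, here is a minimal sketch that needs no libraries. The `registrations` and `filesSeen` counters and the hand-built `tree` and `files` objects are illustrative assumptions, not part of the repository; the only point is that the outer function runs once while the returned function runs once per file.

```javascript
// Hypothetical counters, added only to observe when each function runs.
let registrations = 0;
let filesSeen = 0;

function remarkReadingTimeSketch() {
  registrations += 1; // outer factory: runs once, at registration
  return function transformer(tree, file) {
    filesSeen += 1; // inner transformer: runs once per Markdown file
    file.data.nodeCount = tree.children.length;
  };
}

// Register once, then reuse the returned transformer for every file.
const transformer = remarkReadingTimeSketch();

const files = [{ data: {} }, { data: {} }, { data: {} }];
const tree = { type: 'root', children: [{ type: 'paragraph', children: [] }] };
for (const file of files) {
  transformer(tree, file);
}

console.log(registrations, filesSeen); // 1 3
```

If the factory forgot its `return`, `transformer` would be `undefined` and the loop above would throw on the first file.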
Extracting Text from the AST
The Markdown AST is a nested tree structure. For example, this Markdown:
# Hello
This is **bold** text.
Becomes this AST (simplified):
{
  type: 'root',
  children: [
    {
      type: 'heading',
      depth: 1,
      children: [{ type: 'text', value: 'Hello' }]
    },
    {
      type: 'paragraph',
      children: [
        { type: 'text', value: 'This is ' },
        { type: 'strong', children: [{ type: 'text', value: 'bold' }] },
        { type: 'text', value: ' text.' }
      ]
    }
  ]
}
Naive Approach: Manual Traversal
function extractText(node) {
let text = '';
if (node.type === 'text') {
text += node.value;
}
if (node.children) {
for (const child of node.children) {
text += extractText(child); // Recursive traversal
}
}
return text;
}
// Usage
const textOnPage = extractText(tree);
This works, but it’s 10+ lines of boilerplate you’ll write for every plugin.
Refined Solution (From Repo):
import { toString } from 'mdast-util-to-string';
// Inside the transformer function:
const textOnPage = toString(tree);
// Result: "HelloThis is bold text." (nodes are joined with no separator, so words at block boundaries run together)
🔵 Deep Dive: mdast-util-to-string handles cases you might forget: code blocks and image alt text are included, while link URLs are left out. Raw HTML nodes, comments included, are also kept by default; version 4 documents an includeHtml option for turning that off. It’s battle-tested across thousands of projects.
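As a rough illustration of how an extractor can keep image alt text while never touching link URLs, the sketch below is a dependency-free stand-in, not the library's source. It assumes a simple precedence (a node's `value`, then its `alt`, then its children); the helper name `plainText` and the sample nodes are made up for the example.

```javascript
// Stand-in for a text extractor: prefer `value`, then `alt`, then children.
// This mirrors the general idea only; check the library for exact behavior.
function plainText(node) {
  if (typeof node.value === 'string') return node.value;
  if (typeof node.alt === 'string') return node.alt;
  if (Array.isArray(node.children)) {
    return node.children.map(plainText).join('');
  }
  return '';
}

const paragraph = {
  type: 'paragraph',
  children: [
    { type: 'text', value: 'See ' },
    {
      type: 'link',
      url: 'https://example.com/docs', // never read, so it is not counted
      children: [{ type: 'text', value: 'the docs' }],
    },
    { type: 'text', value: ' and ' },
    { type: 'image', url: 'diagram.png', alt: 'a diagram' },
    { type: 'text', value: '.' },
  ],
};

console.log(plainText(paragraph)); // "See the docs and a diagram."
```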
Calculating Reading Time
The reading-time library uses a simple but effective algorithm:
import getReadingTime from 'reading-time';
const textOnPage = "Your article text here...";
const readingTime = getReadingTime(textOnPage);
console.log(readingTime);
// Output:
// {
// text: '4 min read', // 3.2 minutes, rounded up
// minutes: 3.2,
// time: 192000, // milliseconds
// words: 640
// }
How It Works Internally:
- Word Count: Splits text on whitespace, counts tokens
- Reading Speed: Assumes 200 words per minute (adjustable via the wordsPerMinute option)
- Rounding: In the 1.x releases, rounds the minutes up to a whole number and adds the “min read” suffix
From the Repo:
const readingTime = getReadingTime(textOnPage);
// We only need the friendly string: "3 min read"
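The steps under "How It Works Internally" can be sketched without the library. This is not the reading-time source: the function name `estimateReadingTime` is made up, splitting on whitespace is a simplification, and rounding the label up with `Math.ceil` is an assumption worth verifying against the version you have installed.

```javascript
// Library-free sketch of the arithmetic; not the reading-time source.
// Assumptions: 200 words per minute by default, and the label rounds up.
function estimateReadingTime(text, wordsPerMinute = 200) {
  const words = text.split(/\s+/).filter(Boolean).length;
  const minutes = words / wordsPerMinute;
  return {
    text: `${Math.ceil(minutes)} min read`,
    minutes,
    time: Math.round(minutes * 60 * 1000), // milliseconds
    words,
  };
}

const sample = 'word '.repeat(600);
console.log(estimateReadingTime(sample));
// { text: '3 min read', minutes: 3, time: 180000, words: 600 }
```

Passing a second argument, for example `estimateReadingTime(sample, 300)`, shows how strongly the assumed reading speed drives the result.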
Injecting into Astro Frontmatter
Astro provides a special data object that plugins can modify. This object becomes available to your Astro components.
return function (tree, { data }) {
const textOnPage = toString(tree);
const readingTime = getReadingTime(textOnPage);
// Inject into Astro's frontmatter
data.astro.frontmatter.minutesRead = readingTime.text;
// Now any Astro component can access: frontmatter.minutesRead
};
🔴 Danger: The path data.astro.frontmatter is Astro-specific. If you’re using Remark with Gatsby or Next.js, the injection point differs. Always check your framework’s documentation.
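One way to soften that framework dependency is to create the path when it is missing. The sketch below is a hedged illustration: `setFrontmatterValue` is a hypothetical helper, and the two file objects are hand-built mocks of what a pipeline might pass in, not real Astro or unified objects.

```javascript
// Hypothetical helper: write a value under file.data.astro.frontmatter,
// creating the intermediate objects when a non-Astro pipeline omits them.
function setFrontmatterValue(file, key, value) {
  file.data ??= {};
  file.data.astro ??= {};
  file.data.astro.frontmatter ??= {};
  file.data.astro.frontmatter[key] = value;
  return file;
}

// Shape this tutorial describes Astro as providing.
const astroFile = { data: { astro: { frontmatter: { title: 'My Post' } } } };
// Shape a bare pipeline might provide (assumed for illustration).
const bareFile = { data: {} };

setFrontmatterValue(astroFile, 'minutesRead', '3 min read');
setFrontmatterValue(bareFile, 'minutesRead', '3 min read');

console.log(astroFile.data.astro.frontmatter);
// { title: 'My Post', minutesRead: '3 min read' }
console.log(bareFile.data.astro.frontmatter);
// { minutesRead: '3 min read' }
```

Creating the path avoids a TypeError outside Astro, but another framework would still need to read the value from wherever it expects plugin output.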
Complete Implementation
Here’s the full plugin from the repository:
import getReadingTime from 'reading-time';
import { toString } from 'mdast-util-to-string';

export function remarkReadingTime() {
  // Plugin factory function (runs once at build start)
  return function (tree, { data }) {
    // Transformer function (runs for each Markdown file)
    // Step 1: Convert AST to plain text
    const textOnPage = toString(tree);
    // Step 2: Calculate reading time
    const readingTime = getReadingTime(textOnPage);
    // Step 3: Inject into frontmatter
    // readingTime.text gives us "3 min read" format
    data.astro.frontmatter.minutesRead = readingTime.text;
  };
}
That’s it. Ten lines of code that run automatically for every blog post, tutorial, and project page.
Under the Hood
Memory & Performance
When Does This Run?
This plugin executes during the build phase, not at runtime. When you run npm run build:
- Astro reads all Markdown files
- Remark parses each file into an AST (a tree of ordinary JavaScript objects held in memory)
- Your plugin runs, creating one temporary string that holds the page’s text
- The result is serialized into static HTML
- All AST data is garbage collected
Performance Characteristics:
- Time Complexity: O(n) where n = number of characters in the document
  - toString(): Single tree traversal
  - getReadingTime(): Single pass over text for word counting
- Space Complexity: O(n) for the text string
  - The AST itself is already in memory (Remark created it)
  - We create one additional string copy
Real-World Impact:
For a 5,000-word article (rough figures that vary by machine):
- AST traversal: ~2ms
- Reading time calculation: ~1ms
- Total overhead: ~3ms per article
Even with 1,000 blog posts, that is about 3 seconds of added build time (1,000 × ~3ms), which is negligible compared to image optimization or bundle generation.
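Timings like these depend on hardware and Node version, so it can be worth measuring on your own machine. The sketch below is one way to do that under stated assumptions: the tree is synthetic, `extract` is a stand-in for a real text extractor, and it relies on the global `performance.now()` available in recent Node releases.

```javascript
// Build a synthetic tree: 500 paragraphs of 10 words each (5,000 words).
function buildSyntheticTree(paragraphs = 500, wordsEach = 10) {
  const children = [];
  for (let i = 0; i < paragraphs; i++) {
    const value = Array(wordsEach).fill('word').join(' ');
    children.push({ type: 'paragraph', children: [{ type: 'text', value }] });
  }
  return { type: 'root', children };
}

// Stand-in extractor that separates sibling nodes with a space.
function extract(node) {
  if (typeof node.value === 'string') return node.value;
  return (node.children ?? []).map(extract).join(' ');
}

// Time one extraction plus one word count.
function measure(tree) {
  const start = performance.now();
  const text = extract(tree);
  const words = text.split(/\s+/).filter(Boolean).length;
  const elapsedMs = performance.now() - start;
  return { words, elapsedMs };
}

const result = measure(buildSyntheticTree());
console.log(result.words); // 5000
console.log(`${result.elapsedMs.toFixed(2)} ms`); // varies by machine
```

A single run is noisy; repeating the measurement and taking a median gives a steadier number.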
How Remark Processes Plugins
Remark uses a middleware chain pattern:
// Simplified Remark internals
class RemarkProcessor {
  constructor() {
    this.plugins = [];
  }
  use(plugin) {
    this.plugins.push(plugin()); // Call factory, store transformer
  }
  process(markdown) {
    let tree = this.parse(markdown); // Markdown → AST
    let data = { astro: { frontmatter: {} } };
    // Run each transformer in sequence
    for (const transformer of this.plugins) {
      transformer(tree, { data }); // Plugin modifies tree or data
    }
    return this.stringify(tree); // AST → HTML
  }
}
Your plugin is one link in this chain. It receives the tree, modifies data, and passes control to the next plugin.
🔵 Deep Dive: This resembles the Chain of Responsibility design pattern, with one difference: every plugin in the chain runs, rather than a single handler claiming the request. Each plugin can modify the tree or data, then pass it along. Plugins are composable: you can combine reading time, syntax highlighting, and table of contents generation without them knowing about each other.
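The composability claim can be tried out with a toy pipeline. This is a sketch, not unified's real implementation: `runPipeline`, `countParagraphs`, and `countHeadings` are hypothetical names, and the tree is hand-built rather than parsed from Markdown.

```javascript
// Toy pipeline, not unified's real implementation: call each factory once,
// then run the returned transformers in order against the same tree and file.
function runPipeline(tree, pluginFactories) {
  const file = { data: { astro: { frontmatter: {} } } };
  const transformers = pluginFactories.map((factory) => factory());
  for (const transform of transformers) {
    transform(tree, file);
  }
  return file.data.astro.frontmatter;
}

// Two hypothetical plugins that know nothing about each other.
function countParagraphs() {
  return (tree, { data }) => {
    data.astro.frontmatter.paragraphs = tree.children.filter(
      (node) => node.type === 'paragraph'
    ).length;
  };
}

function countHeadings() {
  return (tree, { data }) => {
    data.astro.frontmatter.headings = tree.children.filter(
      (node) => node.type === 'heading'
    ).length;
  };
}

const tree = {
  type: 'root',
  children: [
    { type: 'heading', depth: 1, children: [] },
    { type: 'paragraph', children: [] },
    { type: 'paragraph', children: [] },
  ],
};

console.log(runPipeline(tree, [countParagraphs, countHeadings]));
// { paragraphs: 2, headings: 1 }
```

Each plugin writes to its own key, which is what lets them share one data object without coordinating.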
Edge Cases & Pitfalls
Code Blocks Skew Word Count
Problem: A tutorial with 500 words of explanation and 2,000 words of code examples is counted as 2,500 words, so at 200 words per minute it shows “13 min read” when the prose alone is closer to a 3-minute read.
Current Behavior: toString() includes code block content in the word count.
Solution (Not Implemented): Filter code nodes before counting:
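import { visit, SKIP } from 'unist-util-visit'; // extra dependency, not in the package.json above
import { toString } from 'mdast-util-to-string';

function toStringWithoutCode(tree) {
  // Work on a deep copy so the rendered page keeps its code blocks
  const filtered = structuredClone(tree);
  visit(filtered, 'code', (node, index, parent) => {
    parent.children.splice(index, 1);
    return [SKIP, index]; // Resume at the sibling that shifted into this slot
  });
  return toString(filtered);
}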
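An alternative that avoids touching the tree at all is to skip `code` nodes while collecting text. The sketch below is dependency-free and only illustrative: `proseText` is a made-up helper, it treats fenced blocks (`code`) as non-prose but keeps inline code, and it joins siblings with a space, which can double up whitespace that word counting tolerates.

```javascript
// Collect prose text while skipping fenced code blocks (`code` nodes).
// The tree is only read, never modified, so rendering is unaffected.
function proseText(node) {
  if (node.type === 'code') return '';
  if (typeof node.value === 'string') return node.value;
  return (node.children ?? []).map(proseText).filter(Boolean).join(' ');
}

const tree = {
  type: 'root',
  children: [
    { type: 'paragraph', children: [{ type: 'text', value: 'Install it first.' }] },
    { type: 'code', lang: 'sh', value: 'npm install reading-time' },
    { type: 'paragraph', children: [{ type: 'text', value: 'Then import it.' }] },
  ],
};

console.log(proseText(tree)); // "Install it first. Then import it."
console.log(tree.children.length); // 3 (the code node is still in the tree)
```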
Non-English Languages
Problem: The reading-time library assumes English word boundaries (spaces). Languages like Chinese/Japanese don’t use spaces between words.
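Current Behavior: Counting space-separated tokens can undercount CJK text, depending on how the installed version of the library detects word boundaries.
Solution: Pass language-specific options. Note that wordsPerMinute only changes the assumed speed, and a higher value shortens the estimate, so choose it after checking how your text is counted:
const readingTime = getReadingTime(textOnPage, {
wordsPerMinute: 500 // Illustrative value; adjust for language
});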
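If the library's counting does not suit your language, one option is to count CJK characters separately from space-delimited words. The sketch below is an assumption-laden illustration rather than the library's method: both rates are placeholder values, the script list covers only Han, Hiragana, Katakana, and Hangul, and the function names are made up.

```javascript
// Assumed rates for illustration only; tune them for your audience.
const WORDS_PER_MINUTE = 200;
const CJK_CHARS_PER_MINUTE = 500;

// Han, Hiragana, Katakana, and Hangul characters, matched one at a time.
const CJK_CHAR =
  /[\p{Script=Han}\p{Script=Hiragana}\p{Script=Katakana}\p{Script=Hangul}]/gu;

function countReadingUnits(text) {
  const cjkChars = (text.match(CJK_CHAR) ?? []).length;
  // Replace CJK characters with spaces, then count the remaining tokens
  // that contain at least one letter or digit.
  const words = text
    .replace(CJK_CHAR, ' ')
    .split(/\s+/)
    .filter((token) => /[\p{L}\p{N}]/u.test(token)).length;
  return { cjkChars, words };
}

function estimateMinutes(text) {
  const { cjkChars, words } = countReadingUnits(text);
  return cjkChars / CJK_CHARS_PER_MINUTE + words / WORDS_PER_MINUTE;
}

console.log(countReadingUnits('你好世界 hello world'));
// { cjkChars: 4, words: 2 }
```

Treating each character as one unit is a simplification; a proper segmenter would group characters into words, at the cost of more code.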
Frontmatter Already Has minutesRead
Problem: What if the author manually sets minutesRead in the Markdown frontmatter?
---
title: "My Post"
minutesRead: "10 min read" # Manual override
---
Current Behavior: The plugin overwrites the manual value.
Better Approach:
// Only calculate if not manually set
if (!data.astro.frontmatter.minutesRead) {
const textOnPage = toString(tree);
const readingTime = getReadingTime(textOnPage);
data.astro.frontmatter.minutesRead = readingTime.text;
}
Plugin Order Matters
If you have multiple plugins, order affects behavior:
// astro.config.mjs
export default defineConfig({
markdown: {
remarkPlugins: [
remarkReadingTime, // Runs first
remarkTOC, // Runs second
remarkCodeHighlight // Runs third
]
}
});
If remarkCodeHighlight wraps code in HTML before remarkReadingTime runs, the word count will include HTML tags. Always run content-analysis plugins early in the chain.
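The effect of ordering can be reproduced with two toy plugins. Everything here is a stand-in chosen for illustration: `fakeHighlight` swaps code nodes for a hard-coded HTML string, `wordCounter` uses a whitespace split, and neither reflects how a real highlighter or the reading-time library works.

```javascript
// Stand-in extractor and word counter (assumptions, not library code).
function extract(node) {
  if (typeof node.value === 'string') return node.value;
  return (node.children ?? []).map(extract).join(' ');
}
const countWords = (text) => text.split(/\s+/).filter(Boolean).length;

// Hypothetical "highlighter": replaces code nodes with raw HTML nodes.
function fakeHighlight() {
  const markup = '<pre><code><span class="kw">let</span> x = 1</code></pre>';
  return (tree) => {
    tree.children = tree.children.map((node) =>
      node.type === 'code' ? { type: 'html', value: markup } : node
    );
  };
}

function wordCounter() {
  return (tree, file) => {
    file.data.words = countWords(extract(tree));
  };
}

// Run the given plugin factories in order against a fresh one-node tree.
function run(plugins) {
  const tree = { type: 'root', children: [{ type: 'code', value: 'let x = 1' }] };
  const file = { data: {} };
  for (const plugin of plugins) plugin()(tree, file);
  return file.data.words;
}

console.log(run([wordCounter, fakeHighlight])); // 4
console.log(run([fakeHighlight, wordCounter])); // 5
```

The extra token comes from the `class` attribute, which the whitespace split treats as a word once the markup is in the tree.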
Security: Malicious Markdown
Scenario: A user submits Markdown with deeply nested structures to cause stack overflow:
> > > > > > > > > > (1000 levels deep)
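Protection: Don’t rely on toString() to cap recursion depth; it simply walks whatever tree the parser produced. Validate input yourself. A size check like this one rejects oversized documents, though it does not limit nesting on its own: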
const textOnPage = toString(tree);
if (textOnPage.length > 1_000_000) { // 1 million chars
console.warn('Suspiciously large document, skipping reading time');
return;
}
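A length check does not say anything about nesting, so a depth check that runs before text extraction may be a useful complement. The sketch below walks the tree with an explicit stack; `maxDepth` and `MAX_ALLOWED_DEPTH` are hypothetical names, and the limit of 100 is an arbitrary placeholder.

```javascript
// Measure nesting depth with an explicit stack, so the check itself
// cannot overflow the call stack on a deeply nested tree.
function maxDepth(root) {
  let deepest = 0;
  const stack = [{ node: root, depth: 1 }];
  while (stack.length > 0) {
    const { node, depth } = stack.pop();
    if (depth > deepest) deepest = depth;
    for (const child of node.children ?? []) {
      stack.push({ node: child, depth: depth + 1 });
    }
  }
  return deepest;
}

// Hypothetical limit; choose one that suits your content.
const MAX_ALLOWED_DEPTH = 100;

// Build a 1,000-level chain of blockquotes iteratively for the demo.
let nested = { type: 'text', value: 'deep' };
for (let i = 0; i < 1000; i++) {
  nested = { type: 'blockquote', children: [nested] };
}
const tree = { type: 'root', children: [nested] };

const depth = maxDepth(tree);
console.log(depth); // 1002
console.log(depth > MAX_ALLOWED_DEPTH); // true
```

In a transformer, a result above the limit could trigger the same warn-and-return path as the size check.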
Conclusion
Skills Acquired
You’ve learned:
- Plugin Architecture: How to extend tools through their plugin APIs rather than modifying source code
- AST Manipulation: Understanding that structured data (ASTs) are easier to work with than raw text
- Build-Time Optimization: Recognizing when to compute data once at build time vs. repeatedly at runtime
- Functional Composition: Writing small transformer functions that compose into a pipeline (this one has a single, deliberate side effect: writing to frontmatter)
The Proficiency Marker: Most developers treat Markdown as opaque text. You now understand it as a structured data format that can be programmatically analyzed. This mental model transfers to:
- Writing ESLint rules (JavaScript AST manipulation)
- Creating Babel plugins (code transformation)
- Building custom Webpack loaders (asset processing)
Using This Plugin
In your Astro config:
// astro.config.mjs
import { defineConfig } from 'astro/config';
import { remarkReadingTime } from './remark-plugins/remark-reading-time.mjs';
export default defineConfig({
markdown: {
remarkPlugins: [remarkReadingTime]
}
});
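In your Astro component (this shape fits a layout applied to Markdown pages; with content collections, Astro exposes plugin-injected values as remarkPluginFrontmatter on the result of render()):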
---
// src/pages/blog/[slug].astro
const { frontmatter } = Astro.props;
---
<article>
<h1>{frontmatter.title}</h1>
<p class="text-gray-500">{frontmatter.minutesRead}</p>
<slot />
</article>
Next Challenge: Extend this plugin to calculate code-to-prose ratio for technical tutorials, helping readers know if an article is theory-heavy or code-heavy.