Font Subsetting: How to Strip Unused Glyphs and Cut Font File Size by Up to 90%

You picked a beautiful typeface for your site. Downloaded it, dropped it into your CSS, and now your Lighthouse report is flagging a 480KB font file like it personally offends Google. You scroll through the font’s character map and realize it’s packed with Cyrillic, Greek, Arabic, and Devanagari glyphs — and your site is entirely in English. You’re shipping hundreds of kilobytes of characters your visitors will never once see rendered.

I’m Rohan Ratnayake. I’ve spent the last five years as a frontend performance engineer specializing in web asset optimization, and I’ve watched this exact problem quietly tank load times for client after client. Most developers either don’t know font subsetting exists, or they’ve heard of it but assume it’s some complex typography workflow that’s not worth the trouble. I learned the hard way that it absolutely is worth it: I once spent three days hunting down why a beautifully designed e-commerce site was consistently failing Core Web Vitals, and the culprit was a single serif font that weighed 612KB because nobody had ever touched its glyph set.

This guide is a practical, tool-focused walkthrough. No theory fluff. By the end, you’ll know exactly which tools to use, what commands to run, and what numbers to expect when you strip a bloated font down to only the characters your project actually needs.


Why Your Font File Is So Much Bigger Than It Needs to Be

Type foundries and font services distribute fonts with broad Unicode coverage because they’re building for everyone. A family like Noto is designed to cover virtually every writing system on Earth; that’s its whole mission. Even a single file from a family like that often bundles several scripts, and the result is a .ttf or .otf file that can hit 300KB to over 1MB before it ever touches your server.

If your website is in English, you need Latin characters, a handful of punctuation marks, numbers, and maybe a few special symbols. That’s roughly 200–300 glyphs out of the 800–2,000+ that might live in that font file. The rest is deadweight that every single one of your visitors downloads whether they need it or not.

The fix is font subsetting — the process of physically opening the font binary and removing every glyph that isn’t in your defined character set. This isn’t a CSS trick. You’re not hiding characters or lazy-loading them. You’re cutting them out of the file permanently for your production build.


The Tools That Actually Do the Job

There are four tools worth knowing. Each has a different use case, and picking the wrong one wastes time.

| Tool | Best For | Requires | Output Formats |
| --- | --- | --- | --- |
| pyftsubset (fonttools) | Precise, scriptable subsetting | Python | TTF, OTF, WOFF, WOFF2 |
| Glyphhanger | Crawl-based, automatic subsetting | Node.js (plus fonttools for --subset) | TTF, WOFF, WOFF2 |
| FontForge | Manual GUI editing | Install (GUI app) | Most formats |
| Transfonter | Quick one-off jobs online | Browser only | WOFF, WOFF2, TTF |

For production workflows, pyftsubset is what I reach for every time. It’s part of the fonttools library, it’s scriptable, and it gives you exact control over what stays and what goes.


Using pyftsubset: The Command That Does the Heavy Lifting

Install fonttools first:

pip install fonttools brotli

The brotli package is needed if you want WOFF2 output, which you almost always do for web use.

Basic subset command for English-only Latin:

pyftsubset YourFont.ttf \
  --unicodes="U+0020-007E,U+00A0-00FF" \
  --output-file="YourFont-subset.woff2" \
  --flavor="woff2"

That Unicode range covers:

  • U+0020–007E: Basic Latin (all standard English characters, numbers, punctuation)
  • U+00A0–00FF: Latin-1 Supplement (accented characters like é, ñ, ü — useful if your content includes European names or loanwords)

Here’s what that looks like on a real font. I ran this on a copy of Source Serif 4 that a client was using:

| Version | File Size | Reduction |
| --- | --- | --- |
| Original .ttf (full) | 498KB | - |
| Subsetted WOFF2 (Latin only) | 44KB | 91.2% |

That’s not a cherry-picked example. A 90%+ reduction is typical when you’re cutting Cyrillic, Greek, CJK, and other extended Unicode ranges from a font that was designed for global distribution.
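
The percentage in that table is plain arithmetic, and it’s worth scripting so you can check your own fonts the same way. Here’s a minimal Python sketch; the function name reduction_percent is made up for illustration, and the sizes are the ones from the table above:

```python
def reduction_percent(original_kb: float, subset_kb: float) -> float:
    """Return how much smaller the subset is, as a percentage of the original."""
    if original_kb <= 0:
        raise ValueError("original size must be positive")
    return round((1 - subset_kb / original_kb) * 100, 1)


# Sizes from the Source Serif 4 example above.
print(reduction_percent(498, 44))  # 91.2
```

Keep in mind that this compares an uncompressed .ttf with a compressed WOFF2, so the number reflects both the glyph removal and the format change.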


The Unicode Range You Actually Need (And What to Skip)

A lot of tutorials just say “use the Basic Latin range” and call it done. That leaves you with broken rendering the first time someone’s name has an accent or your CMS auto-inserts a curly quote.

Here’s a more practical set of ranges for an English-language site:

  • U+0020–007E — Basic Latin (A–Z, a–z, 0–9, standard punctuation)
  • U+00A0–00FF — Latin-1 Supplement (accented Latin, common symbols like © and ®)
  • U+2000–206F — General Punctuation (em dashes, ellipses, curly quotes)
  • U+20AC — Euro sign (€), if your site shows prices
  • U+2192, U+2190 — Arrow symbols, if your UI uses them
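
To see why Basic Latin alone falls short, you can check a piece of real content against a range list before you subset. This is a rough Python sketch (3.9+), not part of fonttools; parse_unicodes and missing_characters are hypothetical helper names, and the sample string is invented:

```python
def parse_unicodes(spec: str) -> set[int]:
    """Expand a spec such as 'U+0020-007E,U+20AC' into a set of code points."""
    covered: set[int] = set()
    for part in spec.split(","):
        part = part.strip().upper()
        if part.startswith("U+"):
            part = part[2:]
        if not part:
            continue
        start, _, end = part.partition("-")
        first = int(start, 16)
        last = int(end, 16) if end else first
        covered.update(range(first, last + 1))
    return covered


def missing_characters(text: str, spec: str) -> list[str]:
    """Return the distinct characters in text that the spec does not cover."""
    covered = parse_unicodes(spec)
    skip = {"\n", "\r", "\t"}  # layout characters that need no glyph
    return sorted({ch for ch in text if ch not in skip and ord(ch) not in covered})


BASIC_ONLY = "U+0020-007E"
PRACTICAL = "U+0020-007E,U+00A0-00FF,U+2000-206F,U+20AC,U+2190,U+2192"

sample = "Zoë’s café: “€12” → checkout"
print(missing_characters(sample, BASIC_ONLY))  # accents, curly quotes, euro, arrow
print(missing_characters(sample, PRACTICAL))   # []
```

Anything the first call reports is a character that would fall back to another font if you shipped the Basic Latin subset only.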

What you’re almost certainly safe to drop for an English site:

  • U+0400–04FF — Cyrillic
  • U+0370–03FF — Greek
  • U+0600–06FF — Arabic
  • U+4E00–9FFF — CJK Unified Ideographs (Chinese/Japanese/Korean)

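Cyrillic, Greek, or Arabic coverage can each add tens to hundreds of kilobytes to your font file. CJK is in a different league: fonts with full CJK coverage commonly run to several megabytes.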


Glyphhanger: The Smarter Option When You Don’t Want to Guess

The downside of manually specifying Unicode ranges is that you might miss something. Glyphhanger, built by Zach Leatherman, takes a different approach: it loads the pages you point it at, collects every character that appears in the rendered content, and builds a subset based exactly on what’s used. By default it only reads the URLs you pass; add --spider if you want it to follow links and cover more of the site.

Install it:

npm install -g glyphhanger

Run it against a local file or a URL:

glyphhanger http://localhost:3000 --subset=YourFont.ttf --formats=woff2

Glyphhanger outputs a --unicodes string you can inspect, and it produces the subsetted file automatically. What I like about it is that it catches edge cases — a stray Greek letter in a math formula, a currency symbol in a footer — that you’d miss if you were just typing Unicode ranges by hand.

The tradeoff is that you need to run it after your content is populated. If you run it against a template with placeholder text, you’ll get a subset that’s too small and your real content will render in a fallback font for missing characters.
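
If you’re curious what "collect every character and build a range list" looks like, here’s a simplified Python sketch (3.9+) of the idea. To be clear, this is not Glyphhanger’s implementation: Glyphhanger works on the page as rendered in a browser, while this only reads static HTML, so text injected by JavaScript or CSS is missed. The class and function names are my own:

```python
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Collect the characters of text nodes, skipping <script> and <style>."""

    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)
        self.chars: set[str] = set()
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chars.update(data)


def to_unicodes(chars: set[str]) -> str:
    """Collapse characters into a compact 'U+XX-YY,U+ZZ' string."""
    points = sorted(ord(c) for c in chars if c not in "\n\r\t")
    ranges: list[list[int]] = []
    for cp in points:
        if ranges and cp == ranges[-1][1] + 1:
            ranges[-1][1] = cp       # extend the current run
        else:
            ranges.append([cp, cp])  # start a new run
    return ",".join(
        f"U+{a:X}" if a == b else f"U+{a:X}-{b:X}" for a, b in ranges
    )


def unicodes_for_html(html: str) -> str:
    """Return the range string for the text found in one HTML document."""
    parser = TextCollector()
    parser.feed(html)
    parser.close()
    return to_unicodes(parser.chars)


page = "<h1>Abc</h1><style>p{color:red}</style><p>a &euro;5</p>"
print(unicodes_for_html(page))  # U+20,U+35,U+41,U+61-63,U+20AC
```

The output string has the same shape as the value you pass to --unicodes, which makes it easy to compare against a hand-written range list.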


A Specific Thing That Will Bite You If You’re Not Careful

There’s a failure mode I’ve seen three times with clients who were new to subsetting: they’d subset the font correctly, deploy it, and then a month later get a bug report that certain characters were showing as rectangles or question marks. Every time, the cause was the same — someone updated the CMS content and added a character that wasn’t in the subset.

The fix isn’t complicated but it is deliberate: make subsetting part of your build process, not a one-time manual step. If you’re using a static site generator or a bundler, automate the pyftsubset command so it runs against your finalized content. That way, the subset always reflects what’s actually on the page.
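
Here’s one way that automation could look, as a Python sketch (3.9+) you might call at the end of a static build. Everything about the layout is an assumption: the function names are hypothetical, it expects finished .html files on disk, and it only uses the pyftsubset options already shown in this guide. It reads raw file text, so HTML entities such as &amp;euro; are not decoded and text generated by JavaScript is not seen:

```python
import shutil
import subprocess
from pathlib import Path


def unicodes_for_site(site_dir: Path) -> str:
    """List every code point found in the built HTML files.

    Markup characters are counted too; they are plain ASCII, so the
    subset ends up slightly generous rather than too small.
    """
    chars = set()
    for page in sorted(site_dir.rglob("*.html")):
        chars.update(page.read_text(encoding="utf-8"))
    chars -= {"\n", "\r", "\t"}
    return ",".join(f"U+{ord(c):04X}" for c in sorted(chars))


def build_subset_command(font: Path, unicodes: str, out: Path) -> list[str]:
    """Assemble the same pyftsubset call used earlier in this guide."""
    return [
        "pyftsubset",
        str(font),
        f"--unicodes={unicodes}",
        f"--output-file={out}",
        "--flavor=woff2",
    ]


def subset_for_site(site_dir: Path, font: Path, out: Path) -> None:
    """Subset font to the characters in site_dir, failing the build on errors."""
    if shutil.which("pyftsubset") is None:
        raise RuntimeError("pyftsubset not found; run: pip install fonttools brotli")
    unicodes = unicodes_for_site(site_dir)
    if not unicodes:
        raise RuntimeError("no HTML content found; refusing to build an empty subset")
    out.parent.mkdir(parents=True, exist_ok=True)
    subprocess.run(build_subset_command(font, unicodes, out), check=True)
```

A build step would then call something like subset_for_site(Path("dist"), Path("fonts/YourFont.ttf"), Path("dist/fonts/YourFont-subset.woff2")), with the paths swapped for your own. Because check=True raises on a non-zero exit, a failed subset stops the build instead of silently shipping a stale font.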

If your content is truly dynamic — like a user-generated comment section — subsetting may not be the right solution for that specific font. Use a WOFF2 with a broad Latin range for variable content, and subset aggressively only for fonts used in fixed UI text like headings and navigation.

Comparing Output Formats After Subsetting

Once you’ve stripped the glyphs, your output format choice matters too. Here’s how the same subsetted character set compares across formats:

| Format | Browser Support | Compression | Typical Size (subsetted Latin) |
| --- | --- | --- | --- |
| TTF | All | None | ~80–120KB |
| WOFF | All modern + IE9+ | zlib | ~50–70KB |
| WOFF2 | All modern | Brotli | ~30–50KB |

WOFF2 is the right choice for virtually every modern web project. The W3C WOFF2 specification uses Brotli compression, which typically beats the zlib compression in WOFF by 20–30% on font data. The savings stack: a smaller glyph set, then better compression on what remains.

Your @font-face declaration in CSS should look like this after subsetting. The .woff fallback is optional and needs a second pyftsubset run with --flavor="woff":

@font-face {
  font-family: 'YourFont';
  src: url('/fonts/YourFont-subset.woff2') format('woff2'),
       url('/fonts/YourFont-subset.woff') format('woff');
  font-display: swap;
}

The font-display: swap is worth including — it tells the browser to show text in a fallback font immediately while the custom font loads, rather than showing invisible text. It won’t affect your file size but it will affect your Cumulative Layout Shift score if the subset and fallback fonts differ significantly in metrics.


The One Step Most Guides Skip: Subsetting Variable Fonts

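Variable fonts are increasingly common and they’re a separate problem. A variable font encodes an entire design space (weight, width, optical size) as a single file with interpolation data. Subsetting them with pyftsubset works, and the variation data for the glyphs you keep is carried over. I also add the --no-hinting and --desubroutinize flags. Neither one operates on the variation tables directly: --no-hinting drops hinting instructions, and --desubroutinize flattens subroutines in CFF-based outlines, so it has no effect on TrueType-flavored fonts. Treat them as a precaution from my own experience, not a documented requirement: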

pyftsubset YourVariableFont.ttf \
  --unicodes="U+0020-007E,U+00A0-00FF,U+2000-206F" \
  --output-file="YourVariableFont-subset.woff2" \
  --flavor="woff2" \
  --no-hinting \
  --desubroutinize

Without those flags, I’ve seen variable fonts that subset to the right file size but display interpolation artifacts at certain weights. It’s a subtle bug that only shows up visually, and it’s easy to miss in testing.


FAQs

Can I re-add characters to a subsetted font later if I need them? No — once glyphs are removed, they’re gone from that file. Keep the original font file in your project repository and re-run the subsetting process whenever your character requirements change. Treating subsetting as part of your build pipeline makes this automatic.

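Does font subsetting affect kerning or OpenType features? Standard subsetting with pyftsubset preserves kerning pairs for the characters that remain. A default set of common OpenType features, including standard ligatures (liga) and contextual alternates (calt), is preserved too. Optional features such as small caps or stylistic sets are dropped unless you ask for them with --layout-features, for example --layout-features='*' to keep everything. The same option strips features if you want to save additional kilobytes, for example --layout-features-=liga, but test carefully: some typefaces look noticeably worse without their built-in ligatures.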

Is subsetting legal for the fonts I’m using? It depends on the license. Most open-source fonts (Google Fonts, OFL-licensed typefaces) explicitly permit subsetting. Commercial fonts vary — some licenses restrict modification of the font file. Check the EULA or license file that came with the font before subsetting it for production use.


Wrap-Up

Font files are one of the most overlooked render-blocking resources on the web. A 480KB font serving Cyrillic characters to an English-only audience isn’t a font problem — it’s a decision that was never made consciously. Subsetting forces that decision.

Run pyftsubset on your heaviest fonts this week. Check the before and after sizes. If you’re seeing the typical 80–90% reduction, that’s real bandwidth you’re no longer charging your visitors. For a site doing 50,000 page views a month, cutting a 400KB font to 40KB is roughly 18GB less data transferred every single month, assuming each view downloads the font uncached.
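
If you want to plug in your own traffic numbers, the calculation is short. A Python sketch with a made-up function name, assuming one uncached font download per page view (repeat visitors with a warm cache will bring the real figure down):

```python
def monthly_savings_gb(original_kb: float, subset_kb: float, views: int) -> float:
    """Data saved per month in GB, assuming one uncached font download per view."""
    saved_kb = (original_kb - subset_kb) * views
    return saved_kb / 1_000_000  # decimal units: 1 GB = 1,000,000 KB


print(monthly_savings_gb(400, 40, 50_000))  # 18.0
```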

Start with your heading font. It’s usually the heaviest and the most character-restricted.