Who doesn’t like a nice font! Even if what you’re writing about is as dull as a guide on font subsetting for developers, a good font can inject some much-needed visual spice! The client and the designer love it, adding variations and weights to perfect the visual design and hierarchy (as they should). This would all be great if it wasn’t for the atrocious file sizes of these fonts direct from the foundry. Let’s work with the Google font Nunito today to demonstrate this. Straight from the foundry, a single variation (Nunito-Regular.ttf) is a nice chunky 132 KB. Even a conservative estimate of 6 variations on your site means sending ~800 KB (6 × 132 KB = 792 KB) in fonts alone. Tsk tsk.

There is a good reason, however, why original font files tend to be large: the foundry doesn’t know what you’re going to write with them. An English-language cooking blog might publish a post titled “How to make the best Crêpes 🥞”, using accented Latin characters, and a SaaS platform in the US might want to use international currency symbols ($ £ € ₹) for billing. For this reason, foundries typically package their fonts by script (Latin, Devanagari, Arabic, etc.), adding some common characters and symbols on top. But as end users, we can do better with two tools at our disposal: font subsetting and compression.

Existing Resources

If you’re working with Google fonts, you should definitely check out the Google Webfonts Helper built by Mario Ranftl. I’ve gone years without having to dive further into the rabbit hole than this! If you need to work with non-Google fonts, you can get equivalent functionality by chaining together the font subsetter and font converter from Everything Fonts.

You can also find my font book in the Resources section of this site. It has previews of some of the fonts I use the most, with pre-optimised download packages and a nifty tool to generate the required CSS!

However, if you’re looking to understand font optimisation on a deeper level, read on! The purpose of this blog is to help you build a richer understanding of what the above tools are doing so that you can do it better and open up new possibilities in terms of the design and performance of your web applications.

What’s In A Font?

To oversimplify a bit, a font file tells the computer what the character shapes look like and provides a table mapping the shapes to their corresponding character codes that your computer can understand.
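
To make that oversimplification concrete, here is a toy sketch in plain Python. It is not a real font parser: the dictionaries, the `glyphs_for` helper and the placeholder "outlines" are all made up for illustration. It only shows the two ideas above, a set of shapes and a table that maps character codes to them.

```python
# Toy model of a font: NOT a real font parser, just the two ideas from the text.
# Glyph names follow the style seen in real fonts (e.g. 'dollar'), but the
# "outlines" are placeholder strings rather than real curve data.
TOY_CMAP = {0x24: "dollar", 0x41: "A", 0x61: "a"}  # code point -> glyph name
TOY_GLYPHS = {
    ".notdef": "<empty box>",  # conventional fallback glyph for unmapped characters
    "dollar": "<outline of $>",
    "A": "<outline of A>",
    "a": "<outline of a>",
}


def glyphs_for(text: str) -> list[str]:
    """Look up the glyph name for each character, falling back to .notdef."""
    return [TOY_CMAP.get(ord(ch), ".notdef") for ch in text]


names = glyphs_for("Aa$\u20ac")  # the euro sign is not in the toy table
print(names)
print([TOY_GLYPHS[name] for name in names])
```

Subsetting, in this picture, is simply deleting entries from both dictionaries for characters you never plan to type.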

Project Setup

To begin with, we’re going to use a bit of Python code to look inside our Nunito font file to help us make more informed decisions about subsetting it.

bash
mkdir fonts && cd fonts
python -m venv .venv
source .venv/bin/activate
pip install fonttools brotli zopfli
Folder Structure
  • fonts
    • chars.txt
    • checkfont.py
    • nunito # Downloaded from Google Fonts
      • subset # For optimised files
      • Nunito-Black.ttf
      • Nunito-BlackItalic.ttf
      • Nunito-Bold.ttf
      • Nunito-BoldItalic.ttf
      • Nunito-ExtraBold.ttf
      • Nunito-ExtraBoldItalic.ttf
      • Nunito-ExtraLight.ttf
      • Nunito-ExtraLightItalic.ttf
      • Nunito-Light.ttf
      • Nunito-LightItalic.ttf
      • Nunito-Medium.ttf
      • Nunito-MediumItalic.ttf
      • Nunito-Regular.ttf
      • Nunito-Italic.ttf
      • Nunito-SemiBold.ttf
      • Nunito-SemiBoldItalic.ttf

Python Utility

Copy the following code into checkfont.py!

python
import sys
from fontTools.ttLib import TTFont

with TTFont(sys.argv[1]) as ttf:
    # Sorted list of (code point, glyph name) tuples for each character
    # in the font, like (36, 'dollar'). getBestCmap() picks the most
    # complete Unicode cmap subtable rather than whichever is listed first.
    chars = sorted(ttf.getBestCmap().items())
    chars_length = len(chars)

    # python checkfont.py font.ttf
    # prints number of chars in font
    if len(sys.argv) == 2:
        print(f"{chars_length} characters in font")

    # python checkfont.py font.ttf printall
    # writes the printable characters to chars.txt
    elif len(sys.argv) == 3 and sys.argv[2] == 'printall':
        # Explicit UTF-8 so the output does not depend on the platform default
        with open('chars.txt', 'w', encoding='utf-8') as f:
            nprintable = []
            for code_point, glyph_name in chars:
                char = chr(code_point)
                if char.isprintable():
                    f.write(char)
                else:
                    nprintable.append(glyph_name)
            print(f"printed {chars_length - len(nprintable)} out of {chars_length} characters")
            print(f"non printable characters: {nprintable}")
bash
du -sh nunito/Nunito-Regular.ttf
python checkfont.py nunito/Nunito-Regular.ttf
python checkfont.py nunito/Nunito-Regular.ttf printall
console output
132K    nunito/Nunito-Regular.ttf
938 characters in font
printed 928 out of 938 characters
non printable characters: ['NULL', 'CR', 'uni00A0', 'uni00AD', 'uni2007', 'uni2008', 'uni2009', 'uni200A', 'uni200B', 'uniF8FF']

That explains it 😅

No wonder our uncompressed Nunito-Regular.ttf file is 132 KB: it maps a whopping 938 characters! The output shows that we’ve printed all but 10 of them to chars.txt (some Unicode characters, like NULL and the non-breaking space, don’t have printable glyphs). Let’s open up chars.txt and take a look at what characters the foundry has provided glyphs for.
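
If you’re curious why those 10 were skipped, the small sketch below checks each one with Python’s `str.isprintable()` and prints its Unicode category. I’m assuming the glyph names NULL and CR correspond to U+0000 and U+000D; the other code points are read straight off the `uni…` glyph names in the console output.

```python
import unicodedata

# Code points behind the glyph names reported as non-printable.
# Assumption: 'NULL' is U+0000 and 'CR' is U+000D.
NON_PRINTABLE = [
    0x0000, 0x000D, 0x00A0, 0x00AD, 0x2007,
    0x2008, 0x2009, 0x200A, 0x200B, 0xF8FF,
]


def describe(code_point: int) -> tuple[str, str, bool]:
    """Return (U+XXXX label, Unicode category, isprintable) for a code point."""
    ch = chr(code_point)
    return (f"U+{code_point:04X}", unicodedata.category(ch), ch.isprintable())


for cp in NON_PRINTABLE:
    print(*describe(cp))
```

Python treats everything in the "Other" and "Separator" categories (apart from the plain ASCII space) as non-printable, which is why spaces of various widths and the private-use U+F8FF end up in that list.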

txt
# chars.txt
-----------
 !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[]^_`abcdefghijklmnopqrstuvwxyz{|}~¡¢£¤¥¦§¨©ª«¬®¯°±²³´µ¶·¸¹º»¼½¾¿ÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖרÙÚÛÜÝÞßàáâãäåæçèéêëìíîïðñòóôõö÷øùúûüýþÿĀāĂ㥹ĆćĈĉĊċČčĎďĐđĒēĔĕĖėĘęĚěĜĝĞğĠġĢģĤĥĦħĨĩĪīĬĭĮįİıIJĴĵĶķĸĹĺĻļĽľĿŀŁłŃńŅņŇňʼnŊŋŌōŎŏŐőŒœŔŕŖŗŘřŚśŜŝŞşŠšŢţŤťŦŧŨũŪūŬŭŮůŰűŲųŴŵŶŷŸŹźŻżŽžƁƊƏƒƠơƯưƳƴDŽDždžLJLjljNJNjnjǍǎǔǥǦǧǩǪǫǯǺǻǼǽǾǿȀȁȂȃȄȅȆȇȈȉȊȋȌȍȎȏȐȑȒȓȔȕȖȗȘșȚțȟȦȧȨȪȫȬȭȰȱȲȳȷɓɗəʒʹʺʻʼʾʿˆˇˈˉˊˋˌ˘˙˚˛˜˝̵̷̸̧̨̛̣̤̦̮̱̀́̂̃̄̆̇̈̉̊̋̌̏̑̒ΔΣΩμπЀЁЂЃЄЅІЇЈЉЊЋЌЍЎЏАБВГДЕЖЗИЙКЛМНОПРСТУФХЦЧШЩЪЫЬЭЮЯабвгдежзийклмнопрстуфхцчшщъыьэюяѐёђѓєѕіїјљњћќѝўџѢѣѪѫѲѳѴѵҊҋҌҍҎҏҐґҒғҔҕҖҗҘҙҚқҜҝҞҟҠҡҢңҤҥҨҩҪҫҬҭҮүҰұҲҳҴҵҶҷҸҹҺһҼҽҾҿӀӁӂӃӄӅӆӇӈӉӊӋӌӍӎӏӐӑӒӓӔӕӖӗӘәӚӛӜӝӞӟӠӡӢӣӤӥӦӧӨөӪӫӬӭӮӯӰӱӲӳӴӵӶӷӸӹӺӻӼӽӾӿԐԑԒԓԚԛԜԝԤԥԦԧԨԩԮԯḈḉḌḍḎḏḔḕḖḗḜḝḠḡḤḥḪḫḮḯḶḷḺḻṂṃṄṅṆṇṈṉṌṍṎṏṐṑṒṓṚṛṞṟṠṡṢṣṤṥṦṧṨṩṬṭṮṯṸṹṺṻẀẁẂẃẄẅẎẏẒẓẗẞẠạẢảẤấẦầẨẩẪẫẬậẮắẰằẲẳẴẵẶặẸẹẺẻẼẽẾếỀềỂểỄễỆệỈỉỊịỌọỎỏỐốỒồỔổỖỗỘộỚớỜờỞởỠỡỢợỤụỦủỨứỪừỬửỮữỰựỲỳỴỵỶỷỸỹ‐‒–—―‘’‚“”„†‡•…‰′″‹›⁄⁒⁰⁴⁵⁶⁷⁸⁹₀₁₂₃₄₅₆₇₈₉₡₣₤₦₧₩₫€₭₮₱₲₴₵₸₹₺₼₽ℓ№™Ω℮∂∅∆∏∑−∕∙√∞∫≈≠≤≥◊⟨⟩fifl

o_o Your mileage may vary, but I don’t think I’d use more than a couple of lines’ worth of these characters when writing in English. That sounds like A LOT of wasted bandwidth, especially once you start adding variations. What if we could go into the font file and surgically snip out all those glyphs we’re not planning to use?
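
To put rough numbers on that hunch, here is a minimal sketch that tallies characters by Unicode block. The block table is a hand-picked, non-exhaustive subset that I chose for this example (anything outside it lands in "Other"), and the `sample` string is a stand-in for the real file contents.

```python
from collections import Counter

# A hand-picked subset of Unicode block ranges (start, end, label); not exhaustive.
BLOCKS = [
    (0x0000, 0x007F, "Basic Latin"),
    (0x0080, 0x00FF, "Latin-1 Supplement"),
    (0x0100, 0x024F, "Latin Extended-A/B"),
    (0x0370, 0x03FF, "Greek and Coptic"),
    (0x0400, 0x052F, "Cyrillic (+ Supplement)"),
    (0x1E00, 0x1EFF, "Latin Extended Additional"),
    (0x2000, 0x206F, "General Punctuation"),
    (0x20A0, 0x20CF, "Currency Symbols"),
]


def block_of(ch: str) -> str:
    """Return the label of the block containing ch, or 'Other'."""
    cp = ord(ch)
    for start, end, name in BLOCKS:
        if start <= cp <= end:
            return name
    return "Other"


def tally_blocks(text: str) -> Counter:
    """Count how many characters of text fall into each block."""
    return Counter(block_of(ch) for ch in text)


# Stand-in sample: A, e-acute, A-macron, Cyrillic Zhe, A-dot-below,
# em dash, euro sign, trade mark sign.
sample = "A\u00e9\u0100\u0416\u1ea0\u2014\u20ac\u2122"
for name, count in tally_blocks(sample).most_common():
    print(f"{count:4d}  {name}")
```

To run it on the real thing, pass in the file contents instead of `sample`, for example `tally_blocks(open("chars.txt", encoding="utf-8").read())`.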

Enter: Glyphhanger

Glyphhanger is a really cool npm package that does a few things —

  1. If given a url, it can give you a list of all the characters used in it.
  2. If given font files, it can subset and compress them.
  3. If given both, it can subset and compress the font file to include all the characters used on the site.

Let’s take it for a spin! If you already have Node installed without nvm (Node Version Manager), skip the first command shown below.

bash
nvm install --lts --save
npm install -g glyphhanger

My go-to subset settings for glyphhanger, for English-language sites, include all the printable ASCII characters plus a few explicitly whitelisted characters from beyond ASCII: the non-breaking space (which helps keep things like numbers and units on the same line, like 5 KB), directional quotes (“ ” ‘ ’), em and en dashes (— –), pretty ellipses (…), the bullet mark (•) and a few other useful symbols (like ™©®°£€₹·). Feel free to add to this whitelist before running the command shown below.

bash
glyphhanger --subset='nunito/*.ttf' --output=nunito/subset --formats=woff2 --US_ASCII --whitelist='“”‘’—–…•™©®°£€₹· '
console output
U+20-7E,U+A0,U+A3,U+A9,U+AE,U+B0,U+B7,U+2013,U+2014,U+2018,U+2019,U+201C,U+201D,U+2022,U+2026,U+20AC,U+20B9,U+2122
Subsetting nunito/Nunito-Black.ttf to Nunito-Black-subset.woff2 (was 128.93 KB, now 11.32 KB)
Subsetting nunito/Nunito-BlackItalic.ttf to Nunito-BlackItalic-subset.woff2 (was 132.16 KB, now 11.94 KB)
Subsetting nunito/Nunito-Bold.ttf to Nunito-Bold-subset.woff2 (was 129.05 KB, now 11.16 KB)
Subsetting nunito/Nunito-BoldItalic.ttf to Nunito-BoldItalic-subset.woff2 (was 132.22 KB, now 11.78 KB)
Subsetting nunito/Nunito-ExtraBold.ttf to Nunito-ExtraBold-subset.woff2 (was 128.97 KB, now 11.21 KB)
Subsetting nunito/Nunito-ExtraBoldItalic.ttf to Nunito-ExtraBoldItalic-subset.woff2 (was 132.23 KB, now 11.88 KB)
Subsetting nunito/Nunito-ExtraLight.ttf to Nunito-ExtraLight-subset.woff2 (was 128.89 KB, now 10.29 KB)
Subsetting nunito/Nunito-ExtraLightItalic.ttf to Nunito-ExtraLightItalic-subset.woff2 (was 132.17 KB, now 10.73 KB)
Subsetting nunito/Nunito-Italic.ttf to Nunito-Italic-subset.woff2 (was 132.35 KB, now 11.69 KB)
Subsetting nunito/Nunito-Light.ttf to Nunito-Light-subset.woff2 (was 129.12 KB, now 10.98 KB)
Subsetting nunito/Nunito-LightItalic.ttf to Nunito-LightItalic-subset.woff2 (was 132.48 KB, now 11.55 KB)
Subsetting nunito/Nunito-Medium.ttf to Nunito-Medium-subset.woff2 (was 129.2 KB, now 11.2 KB)
Subsetting nunito/Nunito-MediumItalic.ttf to Nunito-MediumItalic-subset.woff2 (was 132.41 KB, now 11.75 KB)
Subsetting nunito/Nunito-Regular.ttf to Nunito-Regular-subset.woff2 (was 129.1 KB, now 11.09 KB)
Subsetting nunito/Nunito-SemiBold.ttf to Nunito-SemiBold-subset.woff2 (was 129.05 KB, now 11.3 KB)
Subsetting nunito/Nunito-SemiBoldItalic.ttf to Nunito-SemiBoldItalic-subset.woff2 (was 132.23 KB, now 11.83 KB)

Whoa! With that single command, we got our file sizes down by a factor of more than 10, from ~130 KB to ~11 KB. Once you get the hang of it (no pun intended), subsetting opens up so many possibilities.
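
The first line of that console output is the subset expressed as Unicode ranges. As a sanity check on your own whitelist, the sketch below rebuilds the same string in plain Python. The `to_unicode_range` helper is my own illustration, not glyphhanger’s code; it collapses runs of three or more consecutive code points into a range simply because that matches the shape of the output above.

```python
def to_unicode_range(chars: str) -> str:
    """Format a set of characters as a comma-separated list of U+ values.

    Runs of three or more consecutive code points become a range
    (like U+20-7E); shorter runs are listed one by one.
    """
    code_points = sorted({ord(ch) for ch in chars})
    parts = []
    i = 0
    while i < len(code_points):
        j = i
        # Extend j to the end of the current run of consecutive code points.
        while j + 1 < len(code_points) and code_points[j + 1] == code_points[j] + 1:
            j += 1
        if j - i >= 2:
            parts.append(f"U+{code_points[i]:X}-{code_points[j]:X}")
        else:
            parts.extend(f"U+{cp:X}" for cp in code_points[i:j + 1])
        i = j + 1
    return ",".join(parts)


# Printable ASCII (U+20 to U+7E) plus the whitelist, written as escapes so
# that the non-breaking space (U+00A0) at the end is unambiguous.
ascii_printable = "".join(chr(cp) for cp in range(0x20, 0x7F))
whitelist = (
    "\u201c\u201d\u2018\u2019\u2014\u2013\u2026\u2022"
    "\u2122\u00a9\u00ae\u00b0\u00a3\u20ac\u20b9\u00b7\u00a0"
)

print(to_unicode_range(ascii_printable + whitelist))
```

If the printed string differs from what glyphhanger reports for your command, a character in your whitelist probably isn’t the one you think it is (a regular space standing in for a non-breaking one is a likely suspect).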

Creative Subsets

Want to base a subset on the text copy of an existing website? Glyphhanger can do that too!

Let’s first verify the subset that would be generated. Run the following command after changing the URL to the site you’d like to crawl.

bash
glyphhanger https://bererblog.com/ --onlyVisible --spider-limit=10 --string
  • Modify the --spider-limit value to control how many additional internal links are crawled to generate the subset. Setting it to 0 removes the limit, which can make the command take much longer to run.
  • If you encounter errors, the site you’re trying to reference may have paywalls or other measures to block crawlers. Try a different site.
console output
 !"%&'()+,-./0123456789:;<=?ABCDEFGHIJKLMNOPQRSTUVWYZabcdefghijklmnopqrstuvwxyz| ©í –—’“”

Oops! The capital letter X is missing from our crawled site, along with several symbols such as # and @. A nice way to protect against such unexpected gaps is to add the --US_ASCII and --whitelist flags to our subset command. The following command builds a subset based on the crawled site and our whitelist.

bash
glyphhanger https://bererblog.com/ --onlyVisible --spider-limit=10 --subset='nunito/*.ttf' --output=nunito/subset --formats=woff2 --US_ASCII --whitelist='“”‘’—–…•™©®°£€₹· '
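
To see exactly which gaps --US_ASCII papers over, here is a quick sketch that diffs the crawled characters against printable ASCII. The `crawled` string is copied from the console output above, with non-ASCII characters written as escapes and the whitespace other than the plain space left out (it doesn’t affect the ASCII comparison).

```python
# Characters reported for the crawled site, copied from the output above.
crawled = (
    " !\"%&'()+,-./0123456789:;<=?"
    "ABCDEFGHIJKLMNOPQRSTUVWYZ"
    "abcdefghijklmnopqrstuvwxyz|"
    "\u00a9\u00ed\u2013\u2014\u2019\u201c\u201d"
)

# Printable ASCII runs from U+20 (space) to U+7E (tilde).
printable_ascii = {chr(cp) for cp in range(0x20, 0x7F)}

# Anything printable that the crawl never saw, in code point order.
missing = sorted(printable_ascii - set(crawled))
print(f"{len(missing)} printable ASCII characters missing: {''.join(missing)}")
```

It’s not just the X: the first time someone types an @ or a # into that site, a crawl-only subset would fall back to another font for that one character.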

Let’s see what we’re left with!

bash
du -sh nunito/subset/Nunito-Regular-subset.woff2
python checkfont.py nunito/subset/Nunito-Regular-subset.woff2
python checkfont.py nunito/subset/Nunito-Regular-subset.woff2 printall
console output
12K    nunito/subset/Nunito-Regular-subset.woff2
114 characters in font
txt
# chars.txt
# ---------
 !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[]^_`abcdefghijklmnopqrstuvwxyz{|}~£©®°·í–—‘’“”•…€₹™

I’d say this is a pretty impressive result!
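
To actually use the subset files you need matching @font-face rules. Below is a minimal sketch that derives one from a file name. The `/fonts/` URL prefix, the `font_face` helper and the choice of `font-display: swap` are my assumptions for illustration; the weight numbers are the standard CSS values for those weight names.

```python
# Standard CSS numeric weights for the weight names in the Nunito file names.
WEIGHTS = {
    "ExtraLight": 200, "Light": 300, "Regular": 400, "Medium": 500,
    "SemiBold": 600, "Bold": 700, "ExtraBold": 800, "Black": 900,
}


def font_face(filename: str, family: str = "Nunito", base_url: str = "/fonts/") -> str:
    """Build an @font-face rule from a name like 'Nunito-BoldItalic-subset.woff2'.

    base_url is a hypothetical location; point it at wherever you host the files.
    """
    style_name = filename.removeprefix(f"{family}-").removesuffix("-subset.woff2")
    is_italic = style_name.endswith("Italic")
    # 'Italic' on its own means the regular weight in italic style.
    weight_name = style_name.removesuffix("Italic") or "Regular"
    return (
        "@font-face {\n"
        f'  font-family: "{family}";\n'
        f"  font-style: {'italic' if is_italic else 'normal'};\n"
        f"  font-weight: {WEIGHTS[weight_name]};\n"
        "  font-display: swap;\n"
        f'  src: url("{base_url}{filename}") format("woff2");\n'
        "}"
    )


print(font_face("Nunito-Regular-subset.woff2"))
print(font_face("Nunito-BoldItalic-subset.woff2"))
```

Only generate rules for the variations you really use. Every rule you add is another ~11 KB download waiting to happen.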

Such granular control over the characters you ship in a font means you can do some really unique things:

  • Using two entirely different fonts for numbers and letters on your site.
  • Supplementing a font that lacks some crucial glyphs with a subsetted font (defined as the fallback in CSS) that only includes those missing glyphs.
  • Subsetting an expansive script like Devanagari or Arabic to only include the characters required for your site’s language, optionally adding a whitelist of just the English alphabet for the occasional English word.
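
As a rough sketch of the first idea, the snippet below emits two @font-face rules that share one family name but cover different characters through the CSS `unicode-range` descriptor, so the browser picks a font per character. The family name `SiteText` and both file paths are hypothetical; swap in your own subset files.

```python
def face(family: str, url: str, unicode_range: str) -> str:
    """Build a minimal @font-face rule restricted to the given unicode-range."""
    return (
        "@font-face {\n"
        f'  font-family: "{family}";\n'
        f'  src: url("{url}") format("woff2");\n'
        f"  unicode-range: {unicode_range};\n"
        "}"
    )


DIGITS = "U+30-39"             # 0-9
LETTERS = "U+41-5A, U+61-7A"   # A-Z and a-z

# Hypothetical file names: one font subset to digits, another to letters.
css = "\n\n".join([
    face("SiteText", "/fonts/digits-subset.woff2", DIGITS),
    face("SiteText", "/fonts/letters-subset.woff2", LETTERS),
])
print(css)
```

With both rules in place, `font-family: "SiteText"` would draw digits from one file and letters from the other, and each file only needs to contain the glyphs in its own range.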

The permutations and combinations are endless once you start thinking about it.

If you end up using this workflow for an interesting use case, I’d love to hear from you! Drop me an email and share your experience!