Fonts (for computers)

Around a hundred years ago, a font was a collection of metal blocks on which characters (or rather, glyphs) were engraved. Today, when we speak of computer fonts, we mean characters represented by shapes stored in the computer's memory. A character, say "a", is an abstract entity. Inside a computer, it is represented by shapes called glyphs (which most users simply call "letters"). Always keep in mind that a character can have several different glyphs. For example, "A" can be represented by the following glyphs:

Figure 1.1. Different glyphs for a single character

Different glyphs for a single character "A".

Moreover, a single glyph can also represent several characters at once, i.e., a glyph can be composed of two (or more) glyphs. For example, in Latin text we often come across a slightly different-looking combination of "f" and "i", as displayed below:

Figure 1.2. The "fi" ligature

"f" and "i" combine to form the "fi" ligature.

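As a small illustration of the "one glyph, several characters" idea, Unicode happens to carry a legacy compatibility character for the "fi" ligature, and Python's standard `unicodedata` module can show that it stands for the two characters "f" and "i". This is only a sketch of the relationship: fonts normally produce the ligature by substituting glyphs at rendering time, not by using this code point.

```python
import unicodedata

# U+FB01 is a legacy compatibility character for the "fi" ligature.
ligature = "\ufb01"
print(unicodedata.name(ligature))

# Compatibility decomposition (NFKD) splits it back into its two characters.
decomposed = unicodedata.normalize("NFKD", ligature)
print([f"U+{ord(ch):04X} {ch}" for ch in decomposed])
```
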
Fonts usually come in three varieties. The simplest is the bitmapped font, which contains all its glyphs as small pictures within the font file. This usually gives the best quality, but such fonts are quite large, and moreover they usually look good only at one particular size. Stroked fonts, on the other hand, represent each glyph as a set of stems, with the font storing only the lines passing through the center of each stem. During on-screen or on-paper rendering (drawing), each glyph is drawn with a fixed-width "pen" along each line. This does not result in very good quality, but the savings in file size are considerable. Outline fonts sit somewhere between bitmapped fonts and stroked fonts. They represent glyphs as contours. In terms of file size and visual quality, they fall between the other two categories. However, rendering this kind of font can be quite resource intensive.
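
To make the difference between storing pictures and storing lines a little more concrete, here is a minimal sketch. The 5x5 bitmap, the stroke coordinates, and the helper names `scale_bitmap` and `scale_strokes` are all made up for illustration and do not correspond to any real font format.

```python
# A hypothetical 5x5 bitmap glyph for "A"; '#' marks an inked pixel.
GLYPH_A = [
    "..#..",
    ".#.#.",
    "#####",
    "#...#",
    "#...#",
]

def scale_bitmap(rows, factor):
    """Enlarge a bitmap by an integer factor by repeating pixels."""
    scaled = []
    for row in rows:
        wide = "".join(ch * factor for ch in row)
        scaled.extend([wide] * factor)
    return scaled

# A hypothetical stroked form of the same glyph: only the center lines
# of the stems, as (start, end) points in a 0..1 coordinate square.
STROKES_A = [
    ((0.0, 0.0), (0.5, 1.0)),  # left stem
    ((0.5, 1.0), (1.0, 0.0)),  # right stem
    ((0.2, 0.4), (0.8, 0.4)),  # crossbar
]

def scale_strokes(strokes, size):
    """Scale the center lines to a target size; the pen width is applied later."""
    return [
        ((x0 * size, y0 * size), (x1 * size, y1 * size))
        for (x0, y0), (x1, y1) in strokes
    ]

# The enlarged bitmap is just a blockier copy of the same picture.
print("\n".join(scale_bitmap(GLYPH_A, 2)))
# The strokes scale cleanly, but they carry no stem-width information.
print(scale_strokes(STROKES_A, 10))
```

The bitmap needs one stored pixel per grid cell and only looks right at its native size, while the stroked form is just three line segments. An outline font would store the closed contour around each stem instead of its center line.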

When a computer encounters some text, it moves through the text character by character (or rather, code point by code point). As soon as it encounters a code point, it looks up that point in the currently active font, extracts the glyph mapped to that point, and renders it on screen. This is the usual process in Latin land. However, as soon as you start to process Indic text (or Arabic, or any of the Far Eastern scripts), things become slightly more complicated. This happens because in Indic (and many other non-Latin) scripts, the relation between code points and glyphs is not always one-to-one. For example, the Devanagari conjunct ksha is composed of three code points, as shown below:

Figure 1.3. The Devanagari conjunct ksha

Formation of the Devanagari conjunct ksha
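
The three code points behind this conjunct can be inspected with Python's standard `unicodedata` module. The sketch below assumes the usual Unicode encoding of ksha as KA, VIRAMA, SSA.

```python
import unicodedata

# The conjunct ksha as it is stored in memory: three separate code points.
ksha = "\u0915\u094d\u0937"

for ch in ksha:
    print(f"U+{ord(ch):04X}", unicodedata.name(ch))

# One conjunct on screen, three code points in the text.
print(len(ksha))
```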

This kind of substitution information can be embedded in a special kind of font called an OpenType font (OTF for short). OpenType fonts contain tables that store this substitution information. These are called GSUB tables (SUB standing for substitution). Another OpenType table that a typical Indic font (especially a high-quality one) uses extensively is the GPOS table, which is used for positioning the various matras and bindis. For example:

Figure 1.4. Positioning of ukaar on ka (Bengali)

Positioning of ukaar on ka (Bengali) with GPOS tables
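
The substitution and positioning steps can be sketched as a toy "shaper". Everything here is an assumption made for illustration: the glyph names, the offset values, and the dictionary layout are invented, and real GSUB and GPOS tables are binary structures with far richer lookup types.

```python
# Toy character-to-glyph map. Glyph names are hypothetical; a real font
# stores numeric glyph IDs.
CMAP = {
    "\u0915": "deva_ka",
    "\u094d": "deva_virama",
    "\u0937": "deva_ssa",
    "\u0995": "bn_ka",
    "\u09c1": "bn_ukaar",
}

# GSUB-like table: a sequence of glyphs is replaced by one conjunct glyph.
LIGATURES = {("deva_ka", "deva_virama", "deva_ssa"): "deva_ksha"}

# GPOS-like table: (base glyph, mark glyph) -> (dx, dy) shift for the mark.
# The numbers are made up.
MARK_OFFSETS = {("bn_ka", "bn_ukaar"): (-120, -40)}

def shape(text):
    """Map code points to glyphs, apply substitutions, then position marks."""
    glyphs = [CMAP[ch] for ch in text]

    # Substitution pass: try the longest sequence first at each position.
    substituted = []
    longest = max(len(seq) for seq in LIGATURES)
    i = 0
    while i < len(glyphs):
        for n in range(longest, 1, -1):
            seq = tuple(glyphs[i:i + n])
            if seq in LIGATURES:
                substituted.append(LIGATURES[seq])
                i += n
                break
        else:
            # No substitution matched here; keep the glyph unchanged.
            substituted.append(glyphs[i])
            i += 1

    # Positioning pass: shift a mark relative to the glyph before it.
    positioned = []
    for idx, glyph in enumerate(substituted):
        offset = (0, 0)
        if idx > 0:
            offset = MARK_OFFSETS.get((substituted[idx - 1], glyph), (0, 0))
        positioned.append((glyph, offset))
    return positioned

print(shape("\u0915\u094d\u0937"))  # three code points, one conjunct glyph
print(shape("\u0995\u09c1"))        # ukaar shifted relative to ka
```

The point of the sketch is the order of operations: code points are first mapped to glyphs one by one, then sequences are substituted, and only then are the remaining glyphs positioned.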

It may be worthwhile to point out here that the glyphs for the various conjuncts, which are essentially composed of two or more glyphs, are often called "ligatures".

However, when you work on OpenType fonts, always remember that the fonts alone cannot do the magic. The software that does the heavy lifting of drawing the actual glyphs on screen (or on paper), based on the text in the computer's memory, also needs to "understand" or, as we often put it, "support" OpenType tables. This software (often referred to as the rendering engine) is usually the same across a given desktop (or even a whole operating system). Some special-purpose applications use their own rendering engine. A very commonly used example is Yudit, which is often used on older systems (which do not have native OpenType support) to deal with non-Latin text. In the Free Software world, GNOME (or, to be more precise, GTK) based applications use Pango as their rendering engine (a notable exception being the AbiWord word processor). KDE-based applications use the renderer that comes with the Qt GUI toolkit. On Microsoft Windows, the Uniscribe engine and some associated components are usually used.