§ Fonts experimented with at this site

  1. System font
  2. Pan-Unicode Fonts
  3. Middle English fonts
  4. Blackletter
  5. Arabic
  6. Irish

I. System font

I thought that using the "System" font on the fly might work in most cases, but testing it on a sample (see right ⇒) showed that it did not properly display "yogh" and similar characters.

Another problem is that on my Japanese system, a character displayed in the System font gets treated as part of a Japanese word, so the line may break (word-wrap) immediately before or after that character when it falls near the end of a line.

II. Pan-Unicode Fonts (all-around general-purpose fonts)

stylesheet class: ipa
* Only a dozen or so font sets strive to be Pan-Unicode fonts. The general practice is to specify a choice of several fonts.
  1. Lucida Sans Unicode
    (316 KB. standard on Windows, can be
    * A font that early on covered the "Latin Extended-B" subset, which allowed the display of the Norse "o-hook" (see sidebar, on right).
    * It also included the "IPA Extensions" subset useful for displaying pronunciation symbols.

  2. Arial Unicode MS
    (23.0 MB. Bundled with MS Office, but may require installing.)
    * The large file size reflects the fact that it includes the CJK glyphs (Chinese, Japanese and Korean characters), which number in the tens of thousands.
    * It also includes a more complete selection of "Latin Extended Additional" characters. For example, "k with dot below" is needed for the romanization of Arabic/Persian words. (stylesheet classes: farsilat, arabiclat)

  3. TITUS Cyberbit Basic
    (1.8 MB. → TITUS: Software / Fonts)
    * TITUS houses a text database covering a truly wide assortment of Indo-Germanic languages. It has the Poetic Edda using o-hook (see sidebar, on right); the Septuagint and other Greek texts using variants of the letters beta, theta, pi and phi; and exotic texts such as the Georgian version of the Shah-Nameh, to name a few.
    * The font also includes some symbols (e.g. the Middle English yogh, see next section) that are omitted from Arial Unicode MS.

§ Sidebar on "O-hook"
  • The Norse "hooked o" is usually substituted by "ö" (umlauted o, or o with diaeresis) in electronic texts.
    The TITUS site, however, uses the "hooked o" character ("&#x01EB;", i.e. Unicode hexadecimal 01EB):
    Ǫ / ǫ
    The above should display correctly if you have any of:
      1) Lucida Sans Unicode,
      2) Arial Unicode MS,
      3) TITUS Cyberbit Basic

  • To view the TITUS pages with Norse texts properly, you may have to go to the browser options under Fonts, and select one of the above for the default "Latin 1S" font.
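
As a rough illustration of the code point quoted in this sidebar, the short Python sketch below prints the hexadecimal HTML character reference and the official Unicode name for both cases of the "hooked o". The helper name to_ncr is made up for this example and is not anything used on the TITUS site.

```python
import unicodedata

def to_ncr(ch: str) -> str:
    """Return the hexadecimal HTML character reference for one character."""
    return f"&#x{ord(ch):04X};"

# Capital and small "hooked o" (Unicode calls the hook an ogonek).
for ch in "\u01ea\u01eb":
    print(ch, to_ncr(ch), unicodedata.name(ch))
```

The same helper can be pointed at any other character on this page to obtain the reference to type into an HTML source.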

III. Middle-English

stylesheet class: middleeng, middleeng+ me
lang. abbr.: Middle English = [ME].
* The "eth" (ð) and "thorn" (þ) characters don't pose much of a problem. For these and the regular alphabet, I am going to use an ordinary font ("Palatino Linotype").
* However, special fonts are needed for such characters as "yogh" (ȝ Unicode hex 021D), "small letter l with bar" (ƚ Unicode hex 019A), "combining horn" (̛ Unicode hex 031B) and "small letter y with dot above" (ẏ Unicode hex 1E8F).
* I am going to use a medieval-compliant font just for those characters, and not for all characters, because the homespun medieval-compliant font sets I have downloaded off the internet look jaggy and do not have as clean an appearance.
  • Junicode junicode.sourceforge.net
    * A font for medievalists. It also serves for OE/Anglo-Saxon, covering "ae ligature with acute accent" (ǽ Unicode hex 01FD), which appears in Boswell.

  • Times Old English (Mac) babel.uoregon.edu/yamada/fonts/english.html
    * Western Michigan University, home of the Corpus of Middle English Prose & Verse site. Suggested by the mideng page of The Medieval Review site.

    There is also good info on Anglo-Saxon fonts at the Anglo-Saxon Type page for EEBO.

    In Middle Welsh, not only the "letter l with bar" is used but also the "letter d with stroke" (đ Unicode hex 0111), which stands for "dd", pronounced in Welsh like "ð".
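
The idea of using the medieval font only for the special characters could be automated along the following lines. This is only a sketch under assumptions: it supposes that the stylesheet class "me" is the one carrying the medieval font, that the special characters are the ones listed in this section, and the function name wrap_special is invented for the example. It does no HTML escaping.

```python
import re

# Characters that get the medieval font: yogh (small and capital),
# l with bar, y with dot above. The list is an assumption; extend as needed.
SPECIAL = "\u021d\u021c\u019a\u1e8f"
HORN = "\u031b"  # combining horn, which attaches to the preceding letter

# A run of special letters, or of any letter followed by a combining horn.
# The horn alternative comes first so a base letter and its horn stay together.
PATTERN = re.compile("(?:." + HORN + "|[" + SPECIAL + "])+")

def wrap_special(text: str, css_class: str = "me") -> str:
    """Wrap only the special characters in <span class="...">."""
    return PATTERN.sub(
        lambda m: f'<span class="{css_class}">{m.group(0)}</span>', text
    )

print(wrap_special("kni\u021dt"))
```

Everything outside the span keeps the ordinary font, so only the few awkward glyphs are drawn from the rougher-looking font.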
  • § Sidebar on "yogh"

    IV. Blackletter / Fraktur

    stylesheet class: middleeng, middleeng+ me
    Blackletter is the type of font used in early English books by Caxton, Wynkyn de Worde etc. (the earliest such printed books are referred to as incunabula), spanning the 15th to 17th centuries or so.

    Perhaps an easy way to describe these fonts is to say they are similar to the typefaces used in newspaper titles such as "The New York Times".

    Similar types of font, current in some books of the German-speaking countries into the 19th and 20th centuries, are usually referred to as Fraktur.

  • JSL Blackletter [JSL Blackletter] www.shipbrook.com/jeff/typograf.html
  • Old English [Old English]
  • Black Chancery [BlackChancery]
  • Faustus [Faustus]

    The blackletter s-z ligature (ß) looks like a "hooked z":
  • Capital Z with hook - Ȥ (Unicode hex 0224)
  • Small z with hook - ȥ (Unicode hex 0225)

    However, these characters are not in many font sets. The character is used in modern type in e.g. Ettmüller's edition of Heinrich von Veldeke's Eneide. I have also seen it represented by a z with an inverted triangle (▼) hanging from the bottom right tip in The Penguin Book of German Verse.
  • § Sidebar on "elongated s"
    • The "Latin small letter long s" (ſ Unicode hex 017F) is mapped. This is an older form of "s" that occurs in both print and handwriting (cursive) and remained common in English until about the end of the 18th century. You'll find it in Elizabethan plays, in Colonial Period American letters, etc. A capital form of this letter, however, is not included in the Unicode set.
    • One possible substitute is the integral symbol ∫ (Unicode hex 222B).
    • Another substitute is the phonetic symbol "esh" ʃ (Unicode hex 0283), which represents the pronunciation of the "sh" sound.
    • I've even read about someone using a character from the Windows "Symbol" font set, in which the character with the one-byte code 242 decimal (= F2 hexadecimal) is mapped to an integral symbol. (But if the reader does not have the font, the character will revert to "o with grave accent" ò, which is what that code normally represents.)
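
The candidates in this sidebar can be compared with a few lines of Python. The sketch below lists the long s and its two stand-ins with their Unicode names, and shows why the Symbol-font trick degrades to ò. The function name symbol_font_fallback is invented here, and reading the byte as Latin-1 is an assumption about how a browser falls back when the font is missing.

```python
import unicodedata

# The long s itself, followed by the two look-alike substitutes.
CANDIDATES = ["\u017f", "\u222b", "\u0283"]

def symbol_font_fallback(code: int) -> str:
    """Character shown for a one-byte code when the Symbol font is missing,
    assuming the byte is then read as Latin-1."""
    return bytes([code]).decode("latin-1")

for ch in CANDIDATES:
    print(f"U+{ord(ch):04X}", ch, unicodedata.name(ch))

# 242 decimal is F2 hexadecimal, which Latin-1 maps to o with grave accent.
print(hex(242), symbol_font_fallback(242))
```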

    V. Arabic / Farsi

    stylesheet class: arabic, arabiclat; farsi, farsilat

    * Properly displaying the Latin transliteration of the Arabic language requires a font set that includes the necessary characters from the "Latin Extended" subset, e.g. "ḏ" ("underlined d" 1E0F) to represent the Arabic letter dhal, "ṣ" ("s with dot below" 1E63) to represent the Arabic letter sad, and so on.
    Arabic Script Spice Index has an excellent tabulation of the transliterations.

    * At least on my browser, Arabic generally appears correctly (right to left), though it may be advisable to enclose the text between the "right-to-left" mark (200F) and its converse, the "left-to-right" mark (200E).

    * The most widely used e-text convention represents the Arabic ain as a "grave accent" and the Arabic hamza as the simple apostrophe (see table below).
    * In U of Chicago's online version of Steingass's Persian-English dictionary, I think the above system is used for the "non-Unicode" view. For the Unicode view, it uses a "turned comma" rather than a "grave accent", etc. (see table).
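
Enclosing Arabic text between the two directional marks (200F and 200E) could be done with a small helper like the one below. This is a minimal sketch: the function names are made up for the example, and whether the marks are needed at all depends on the browser.

```python
RLM = "\u200f"  # RIGHT-TO-LEFT MARK
LRM = "\u200e"  # LEFT-TO-RIGHT MARK

def wrap_rtl(text: str) -> str:
    """Enclose text between a right-to-left mark and a left-to-right mark."""
    return f"{RLM}{text}{LRM}"

def wrap_rtl_html(text: str) -> str:
    """Same idea, but with the marks spelled as HTML character references."""
    return f"&#x200F;{text}&#x200E;"

# The Arabic letter ain, wrapped for use inside a left-to-right sentence.
print(repr(wrap_rtl("\u0639")))
print(wrap_rtl_html("\u0639"))
```

The marks are invisible, so repr() is used to make them show up in the first line of output.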

    • Transliteration of the Arabic "ain" and "hamza"

      ain:
        Simplified 1-byte (non-Unicode): [ ` ] "grave accent" hex 0060 / chr$(96)
        digital Steingass (Unicode):     [ ʻ ] "modifier letter turned comma" 02BB
        rigorous notation (Unicode):     [ ʿ ] "modifier letter left half ring" 02BF
      hamza:
        Simplified 1-byte (non-Unicode): [ ' ] "apostrophe" hex 0027 / chr$(39)
        digital Steingass (Unicode):     [ ʼ ] "modifier letter apostrophe" 02BC
        rigorous notation (Unicode):     [ ʾ ] "modifier letter right half ring" 02BE
      farsi yeh (see Endnote 1):
        [ ʼi ] "modifier letter apostrophe" and "i" (02BC + 0069)

    * The Arabic letter "ain" (‏ع‎) changes form to ("‏ـعـ‎") in the "medial position", and in this case is represented by a standalone "`".
    When "ain" begins a word it looks like ("‏عـ‎") and is represented by "`A". For example, the name "`Amr" is styled this way since "`mr" appears decidedly unnatural.
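
Converting text typed in the simplified 1-byte convention into either of the Unicode notations is a simple character substitution. The sketch below uses the code points given in this section (02BB/02BC for the Steingass-style view, 02BF/02BE for the rigorous notation). The scheme names and the function name are invented for the example, and note that every apostrophe is converted, including any used as ordinary punctuation.

```python
# Replacement characters for the grave accent (ain) and apostrophe (hamza).
SCHEMES = {
    "steingass": {"`": "\u02bb", "'": "\u02bc"},  # turned comma / modifier apostrophe
    "rigorous": {"`": "\u02bf", "'": "\u02be"},   # left half ring / right half ring
}

def to_unicode_translit(text: str, scheme: str = "rigorous") -> str:
    """Replace the 1-byte ain and hamza signs with their Unicode counterparts."""
    return text.translate(str.maketrans(SCHEMES[scheme]))

print(to_unicode_translit("`Amr", "steingass"))
print(to_unicode_translit("`Amr", "rigorous"))
```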

    * (Endnote 1 - Farsi yeh) The ی (Farsi yeh) is an extended character and is frequently substituted by ى (Arabic Alef Maksura) or by ي (Arabic Yeh). See The Persian letter Yeh page at the "How to Type Persian" site.

    * (Endnote 2 - yeh above / hamza above) When a word such as gurza, ending in ه (heh), is followed by an "-'i" ("-ye") suffix [represented by ی (Farsi yeh) and basically meaning "~ of"] to produce the form gurza-'i (gurza-ye), this is represented in several ways:
    1. The "-'i" (" -ye") is omitted altogether in writing (though it is meant to be pronounced); or
    2. Written traditionally as a "minuscule Farsi yeh above" which ceases to look like هkیl and resembles هٔ  "hamza above" (so that this substitution is used; valid in Tahoma font but not in Nazanin font); or
    3. Written as a single-standing ی (Farsi yeh), which can be substituted with the approximate Arabic letters as described in Endnote 1 above.
    See The Persian Heh=Hamzeh page at the "How to Type Persian" site, which addresses the solution using Nazanin font (download info also available at the site).

    VI. Irish

    stylesheet class: irish [font "Bunchló GC" specified], irishuncial

    I obtained Celtic fonts at Mary Jones' Free Downloads page (the "Celtic Fonts" collection, zipped format).
    The behavior of these fonts is not consistent.

    1) Fonts whose names use no accent or lenition mark, such as "Gaeilge 1 Normal", can be used normally in the .css stylesheet.

    2) "Bunchló GC" will not appear on my system when defined inside the stylesheet, but it can be used inline with
      <font face="Bunchló GC"> </font>

    3) However, I am unable to display "Bunchló", "Bunchló Dubh," "Cló Gaelach", "Glanchló".

    Here is a list of lenited consonants, modern reformed vs. unreformed spelling:

                        (unicode html)
      Bh - Ḃ &#x1E02;
      bh - ḃ &#x1E03;
      Ch - Ċ &#x010A;
      ch - ċ &#x010B;
      Dh - Ḋ &#x1E0A;
      dh - ḋ &#x1E0B;
      Fh - Ḟ &#x1E1E;
      fh - ḟ &#x1E1F;
      Gh - Ġ &#x0120;
      gh - ġ &#x0121;
      Mh - Ṁ &#x1E40;
      mh - ṁ &#x1E41;
      Nh - Ṅ &#x1E44;
      nh - ṅ &#x1E45;
      long-legged r - ɼ &#x027C;
      Sh - Ṡ &#x1E60;
      sh - ṡ &#x1E61;
      Long s - ſ &#x017F;
      Long sh - ẛ &#x1E9B;
      Th - Ṫ &#x1E6A;
      th - ṫ &#x1E6B;
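
The dotted (unreformed) consonants in this list can also be derived rather than looked up. The sketch below combines a plain letter with the combining dot above (Unicode hex 0307) and lets NFC normalization pick the precomposed character. The function names are invented for the example, and it only works for letters that have a precomposed dotted form.

```python
import unicodedata

DOT_ABOVE = "\u0307"  # COMBINING DOT ABOVE

def dotted(letter: str) -> str:
    """Compose a letter with a dot above into one precomposed character."""
    ch = unicodedata.normalize("NFC", letter + DOT_ABOVE)
    if len(ch) != 1:
        raise ValueError(f"no precomposed dotted form for {letter!r}")
    return ch

def dotted_entity(letter: str) -> str:
    """Hexadecimal HTML character reference for the dotted form of a letter."""
    return f"&#x{ord(dotted(letter)):04X};"

# Print rows in the same "Bh - glyph reference" layout as the list.
for upper in "BCDFGMST":
    for letter in (upper, upper.lower()):
        print(f"{letter}h - {dotted(letter)} {dotted_entity(letter)}")
```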

