Enabling Unicode and entering special characters and diacritics (Mac or PC)
While we have tested most of what follows, and we have no reason to think that any of it will cause problems, we assume no responsibility for any negative effects that might be caused to any software or hardware.
Authors writing for The Chicago Online Encyclopedia of Mamluk Studies are STRONGLY encouraged to compose their articles using Unicode-compliant software, and to submit them as such. This is now possible in all current operating systems and most current word-processing software. In what follows, we will attempt to provide instructions for enabling and using Unicode.
Unicode:
What is Unicode?
The explanation at Unicode.org is a good place to begin.
Put simply, Unicode is a system that assigns a unique code to every symbol in
every writing system (currently totalling something like 70,000 characters).
Thus, no matter what font is being used, it will always know exactly what symbol
is called for. In other font systems, each font has its own codes and the same
code in two different fonts can signify two different symbols. When a user lacks
the font a document was created with, some (or all) characters may be replaced
by different ones or become invisible. Most people have experienced this at
some point. Unicode eliminates this issue, but it only works with fonts that
are Unicode compliant. Also, not all Unicode fonts have all possible characters
(as this would make them too huge for convenient use). (Click here
for an indication of the scripts currently included, and here
for those not yet included.)
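The idea of one code per symbol, independent of font, can be seen with a few lines of Python. This is only a minimal sketch using the standard `unicodedata` module; the helper name `describe` is our own invention for illustration.

```python
import unicodedata


def describe(ch: str) -> str:
    """Return the code point (in hexadecimal) and the official Unicode name."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


# Escapes are used so the example does not depend on the editor's font.
for ch in ["a", "\u0101", "\u1e25"]:
    print(describe(ch))
```

Whatever font is used to display the output, the code and name reported for each character stay the same.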
Users of Windows XP or Mac OSX can use Unicode without a great deal of trouble,
though not all software supports it equally (or at all). Microsoft Word does
support Unicode under both platforms, but the degree of compatibility varies
from one version to the next. There are several methods for entering Unicode
in a document, some more complex than others. Older operating systems support
Unicode to lesser degrees. We have no experience of using Unicode in Unix or
Linux operating systems.
Finding Unicode Characters and their codes:
The Unicode.org website provides over 100 charts
(in the form of PDF files) showing what characters appear in what blocks, and
providing the hexadecimal code for each character. If you use the methods outlined
below, you will not need to refer to these charts. However, they may be useful
in the event that you need to find a character.
For the purposes of this project, the following will be most relevant:
- Most of what you type will come from the Basic Latin block: http://www.unicode.org/charts/PDF/U0000.pdf
- The Latin-1 Supplement block includes mostly vowels with accent marks, including umlaut: http://www.unicode.org/charts/PDF/U0080.pdf
- The Latin Extended-A block includes the letters A, I, and U (and a, i, u) with macron, among others: http://www.unicode.org/charts/PDF/U0100.pdf
- Latin Extended Additional: http://www.unicode.org/charts/PDF/U1E00.pdf
- Latin Extended-B: http://www.unicode.org/charts/PDF/U0180.pdf
- Hamza and 'Ayn are found in the Spacing Modifier Letters block: http://www.unicode.org/charts/PDF/U02B0.pdf
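To see how a hexadecimal code from the charts relates to the blocks listed above, a small Python sketch may help. The block boundaries below are those of the Unicode charts linked above; the names `BLOCKS`, `block_of`, and `char_from_hex` are hypothetical and exist only for this illustration.

```python
# Start and end code points of the blocks most relevant to this project.
BLOCKS = [
    (0x0000, 0x007F, "Basic Latin"),
    (0x0080, 0x00FF, "Latin-1 Supplement"),
    (0x0100, 0x017F, "Latin Extended-A"),
    (0x0180, 0x024F, "Latin Extended-B"),
    (0x02B0, 0x02FF, "Spacing Modifier Letters"),
    (0x1E00, 0x1EFF, "Latin Extended Additional"),
]


def char_from_hex(hex_code: str) -> str:
    """Turn a chart code such as '0101' into the character itself."""
    return chr(int(hex_code, 16))


def block_of(hex_code: str) -> str:
    """Name the block (from the short list above) that contains the code."""
    code_point = int(hex_code, 16)
    for start, end, name in BLOCKS:
        if start <= code_point <= end:
            return name
    return "another block (not listed here)"


for code in ["0101", "02BF", "1E25"]:
    print(code, char_from_hex(code), block_of(code))
```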
Unicode Fonts
- To use Unicode, you will need to install (or find already installed) Unicode fonts on your computer. This is neither difficult nor costly.
- For basic information and links to numerous Unicode fonts, see http://www.alanwood.net/unicode/fonts.html
- Gentium is a very readable, freely downloadable Unicode font, with all the characters necessary for transliterating Arabic, Turkish, Persian, and a host of other languages and scripts. It does not include Arabic characters. The information requested prior to downloading is not required. A PDF showing every character included can be viewed by clicking the "Glyphs" link on the Gentium homepage, then clicking the PDF link. At this time, Gentium has italic but does not have bold characters.
http://scripts.sil.org/cms/scripts/page.php?site_id=nrsi&item_id=Gentium
- Charis SIL is the best font we have found. It has all the characters needed for transliteration, and unlike Gentium it has bold, italic, and bold-italic styles. Like Gentium it is a free download. It will work on Mac or PC.
- Titus Cyberbit Basic
is also free to download. On the site, scroll down to "fonts" in the menu
on the right. The site is not easy to navigate, but the font is apparently
widely used and it does have all the necessary characters. In appearance it
is very similar to common fonts such as Times New Roman.
http://titus.uni-frankfurt.de/indexe.htm
- Microsoft includes some Unicode fonts in most versions of its software, but not all included fonts are Unicode. Arial Unicode MS is one of their Unicode fonts, and is present in most versions of Windows. The Microsoft website is not particularly clear in explaining the Unicode options they provide. See instructions below for some tips. Keep in mind that the degree to which Microsoft supports Unicode depends on both the operating system AND the software. For example, some Macintosh versions of their software are less able to use Unicode, even when the operating system is compliant. Likewise, older versions of Windows are less compliant. Under Windows XP, Unicode should be fully usable, depending on the software. Mac OS X is Unicode compliant, but not all software for OS X is. Again, see below for tips.
- Unicode compliant fonts in OS X are listed here: http://www.alanwood.net/unicode/fonts_macosx.html (scroll down or use the links at the top). Note that each font contains different ranges of characters.
- Unicode compliant fonts for Windows are listed here: http://www.alanwood.net/unicode/fonts.html (scroll down or use the links at the top). Again, coverage varies widely. To see whether a font is Unicode, and what ranges it supports, download the Font Properties Extension from Microsoft. Once it is installed, you can right-click on fonts in the Fonts folder (Start > Settings > Control Panel > Fonts) and choose Properties.
- Many (perhaps most) Unicode fonts work for both OS X and Windows (and, presumably, Unix/Linux).
Web browsers and Unicode
For information about setting up web browsers to display Unicode (not all browsers are equally able), see http://www.lib.uchicago.edu/e/su/mideast/encyclopedia/browsers.html or Alan Wood's pages, linked there.
Writing and reading Arabic script on the PC
While its focus is on using Arabic, al-Husein N. Madhany's "Multilingual
Computing with Arabic and Arabic Transliteration: Arabicizing Windows Applications
to Read and Write Arabic" is a wealth of information for understanding
how Windows deals with scripts and fonts, and contains much that is useful,
even for those who will never need to type Arabic. For those who will be typing
in Arabic, it may be the most useful document on the web. It includes information
on transliteration, as well as instructions for using onscreen keyboards. Some
information regarding Mac OS X has recently been added. The article is updated
frequently, so check back here often for the latest version.
He has also created a PowerPoint tutorial to guide users through the process.
It refers to the Windows 2000 operating system, Microsoft Office 2003 (11.0),
and Internet Explorer 6.0. Though it does not provide information specific to
other versions of Windows or Office, or to other Windows software, many programs
are similar in operation and this tutorial can still be useful. The article
has more current and detailed information than the PowerPoint tutorial.
Click on the file name to open it, or right click to save it to your computer:
multilingual_computing_arabic.ppt.
Entering Unicode across platforms (both Windows and Mac OS X)
In addition to the methods outlined below, it is possible to use
an online character picker. At this time, Richard Ishida's
Unicode character picker seems to be the best:
http://people.w3.org/rishida/scripts/pickers/latin/
Because they are small and difficult to see, please note that the 'ayn
and hamza characters are the last two in the upper section (before the
combining marks section).
Instructions are provided at http://people.w3.org/rishida/scripts/pickers/.
The same site provides an Arabic character picker as well.
Please note that some characters in the picker will not display in all word
processing software, though most should work if Unicode is supported.
Unicode and Macintosh OS X
Mac users (OS X) can use Unicode in Microsoft
Word, but not in all versions. You will have to experiment to determine
your version's compatibility. According to Alan Wood's Unicode site, the latest
version (Microsoft Office 2004) does support Unicode properly, and our tests
show that this is correct, though we have had some problems with right-to-left
scripts, such as Arabic (that may be due to shortcomings in the Mac OS). See
http://www.alanwood.net/unicode/utilities_editors_macosx.html#wordx
for further information.
The TextEdit program (included with OS X)
is Unicode compatible. If unable to save Unicode in Word, you might want to
save it in TextEdit as an RTF (rich text format) file. Since Encyclopedia
articles will contain minimal formatting and (usually) no footnotes, using a
simpler program like TextEdit should not present any problems.
To enter Unicode text on a Macintosh, you have several options.
First, you may use the Character Palette, which is found in the Input Menu (the flag menu in the upper right, near the clock).
- If the Character Palette option is not shown, enable it by doing the following:
  - Go to the Apple menu, select System Preferences.
  - In the Preferences window, choose International.
  - Select Input Menu.
  - Check Character Palette. You can also check Keyboard Viewer, Unicode Hex Input, and US Extended at this time.
  - Check Alt-Latin. If it is not there, see below for information on installing it.
  - Make sure the "Show Input Menu in menu bar" option is checked.
- To use the Character Palette to enter Unicode characters in a document, just keep it open in the background. When you need a character, you can enter it by double-clicking it in the Character Palette.
- A useful feature of the Character Palette is the ability to designate frequently used characters as favorites, saving you the trouble of finding the different letters each time you need them.
- For more information on the Character Palette, see Alan Wood's site: http://www.alanwood.net/unicode/utilities_fonts_macosx.html
Second, you may use the excellent and extremely simple Alt-Latin keyboard or LatinTL keyboard, both of which were created specifically for this purpose by Kino.
- To install either keyboard (or both of them), you must first download AltLatin.zip and/or LatinTL_X.dmg.sit from http://quinon.com/files/keylayouts/ (or from our Alt-Latin page).
- If your browser does not automatically expand the .zip or .sit file, tell it to save the file to your desktop (so it will be easy to find), then manually expand it. Usually this can be done simply by double-clicking the file, which will start the appropriate decompression program. LatinTL expands to a disk image, but for the purpose of installation you can treat it just like a folder.
- Follow the instructions in the "readme" file to install the keyboard(s).
- Make it visible in the Input Menu by following the instructions given above for the Character Palette.
- Because Alt-Latin and LatinTL work like any other keyboard, you will not have to change keyboards unless you need to type in a different alphabet, such as Arabic.
- Entering letters with diacritics using either keyboard is very simple:
- Make sure you are using a Unicode font. It may work with other fonts, but you should use Unicode (OS X comes with Lucida Grande and there are others available).
- To enter a vowel with a macron, simply hold down either option key and hit the letter 'a' simultaneously. Release them, then type the letter needing the macron (using the shift key if you need a capital).
- For letters with dots below, press option and period, release, then type the letter.
- Hamza is shift+option+P, and 'ayn is option+p. (This may not work in Microsoft Word with Alt-Latin; we do not yet know why. If it does not work, use the LatinTL keyboard instead, or use the Character Palette for these two characters. These keystrokes do work in TextEdit and other software with Alt-Latin.)
- The PDF file included with Alt-Latin shows maps of the keyboard, in case you need something not mentioned here, or you may use our maps.
- The layout of LatinTL is very similar, with only a few differences, and it also includes maps. (See the Alt-Latin page for a description of the differences.)
- Click here for
diagrams of the Alt-Latin keyboard (usable for LatinTL as well, with
a few differences) and for downloads.
- The diagrams are for the Windows version, but the layout is almost identical to the Mac version. The main difference is that where the Windows version uses only the Alt key to the right of the space bar, the Mac version uses either of the two Option keys. This makes the Mac version a little more comfortable to use, since you can use either hand. (There is no Windows version of LatinTL.)
- There are downloadable pdf files of the diagrams available on the same page, in case you would like to print them for easier reference while typing.
- Kino's site (linked above) also has numerous other Macintosh keyboard and font resources, such as some keyboards based on non-US layouts (notably a UK variant of Alt-Latin).
Third, you may want to use Knut S. Vikør's Jaghbub
keyboard layouts (and, perhaps, his Unicode fonts).
His Arabic Macintosh pages have long been one of the web's most useful sources
for Mac users who need to type Arabic or transliteration, and he has updated
both the pages and the downloadable resources he created.
- The page on transliteration, "Writing Arabic with Latin letters," explains the issues and provides a downloadable file containing the JaghbUni font package and the American Diacs. keyboard layout.
- The Jaghbub font package page gives more information about the three fonts included, as well as German, French, Italian, Danish, Swedish, Norwegian, US and UK keyboard layouts for typing diacritics in Unicode fonts.
- There are also separate keyboard layouts for typing IPA characters in Unicode fonts for the same national standards (that is, the non-option keys follow the regular national keyboard standard, but the IPA characters are all placed on option keys under no particular standard).
For any keyboard layout, you can always select Keyboard Viewer from the Input Menu to see what different keystrokes will do.
Unicode and Windows
Windows generally supports Unicode, but many programs (whether Microsoft's or others') do not. Microsoft Word is usually able to work with Unicode, though there may be some restrictions. As mentioned above, some of the included fonts are Unicode fonts, but not all. Even if a font is Unicode, it may not contain all Unicode ranges, and might lack characters you need.
There are numerous options for using Unicode in Windows:
In Microsoft Word, you may use the Insert
Symbol function (found in the Insert menu). This function allows you to choose
characters from a grid displayed in its own window. Double-clicking the desired
character inserts it at the cursor in the document. You can also use this window
to assign keystrokes to the characters you use most often. For example, you
might assign the keystroke alt+a to the lower case a with macron, and alt+shift+A
to capital A with macron.
HOWEVER, you might be better off using the Alt-Latin
keyboard, so you can enter special characters in any Unicode compliant
software, not just in Word.
To read more about Word's Unicode support, see Alan Wood's site: Word 97 (http://www.alanwood.net/unicode/utilities_editors.html#word97)
and Word 2000 and 2002 (http://www.alanwood.net/unicode/utilities_editors.html#word2002)
are covered.
We have not tested other major word processing software. This does not mean that other companies are not producing Unicode compliant software.
Microsoft's Notepad is Unicode compliant, but does not have a similar input method to Word's. However, if you install the Windows version of the Alt-Latin keyboard (see below), you can enter Unicode text in ANY Unicode compliant software, including Notepad.
Character Map is similar to the Mac's
Character Palette, and is included with Windows. Alan Wood's site gives a brief
overview: http://www.alanwood.net/unicode/utilities_fonts.html#charactermap.
It is found by clicking on Start > All Programs > Accessories > System Tools.
Once it is open, make sure Advanced View is checked, Character set
is Unicode, and Group by is Unicode Subrange. It might take some scrolling
to find the characters you need, but if you use the Unicode charts linked above,
or the simplified chart at the bottom of this page, you can get a sense of where
to look. Once you find a character, you simply double-click it, then click copy,
then paste it into your document.
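If scrolling through Character Map proves tedious, the code of a character can also be found by searching on part of its official name. The following Python sketch assumes only the standard `unicodedata` module; the function name `search` and its default range (everything up to the end of Latin Extended Additional) are our own choices for illustration.

```python
import unicodedata


def search(fragment: str, start: int = 0x0000, end: int = 0x1EFF):
    """List (hex code, name) pairs whose Unicode name contains the fragment."""
    wanted = fragment.upper()
    hits = []
    for code_point in range(start, end + 1):
        # Unassigned code points have no name; treat them as an empty string.
        name = unicodedata.name(chr(code_point), "")
        if wanted in name:
            hits.append((f"{code_point:04X}", name))
    return hits


for code, name in search("h with dot below"):
    print(code, name)
```

The hexadecimal code reported can then be used to locate the character in Character Map or in the Unicode charts.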
BabelMap is also similar to Character
Palette, but for Windows. It is free. We have not yet tested it, but it looks
promising. http://www.babelstone.co.uk/Software/BabelMap.html
The same company also provides BabelPad,
a Unicode compliant text editor for Windows. It is also free. http://www.babelstone.co.uk/Software/BabelPad.html
Information about these and other utilities can be found at http://www.alanwood.net/unicode/utilities.html.
Unicode.org provides a list of Unicode-enabled products at http://www.unicode.org/onlinedat/products.html.
Installing fonts on a Windows computer is fairly simple. (These instructions will work for any version of Windows, but refer to the "classic" view in XP, not the "cluster" view.)
- Download the font.
- If it is a compressed file (such as .zip), expand it.
- Open the fonts folder by clicking on Start, then Settings, then Control Panel, then Fonts.
- Drag the font file(s) into this folder. It should automatically install.
The Alt-Latin keyboard for Windows
Our preferred method for typing in Unicode is now the Windows
version of the Alt-Latin keyboard (mentioned in the Macintosh section,
above). This free keyboard layout allows the user to type a very wide range
of characters and is very simple to use. Installation is not as simple as on
a Mac, but not too difficult.
- Download the file http://quinon.com/files/keylayouts/windows/AltLatNT.zip (or from our Alt-Latin page) and decompress (unzip) it. (In XP, just right click and choose Extract All, then follow the prompts.)
- Double click on the AltLatin.msi file inside the AltLatin folder. It will install automatically, unless you do not have install privileges on the computer.
- Now you must enable the keyboard. Essentially this means telling Windows
that you want this keyboard layout to be available for use. At this step things
might vary from computer to computer.
- Click on Start, then Settings, then Control Panel.
- Double click Regional and Language Options. The window that opens should
have three tabs: Regional Options, Languages, and Advanced.
- Click on the Languages tab.
- In the upper section (Text Services and Input Languages) there should be
a Details button. Click it.
[The four steps above, from clicking Start through clicking the Details button, can also be accomplished by right-clicking on the keyboard icon in the lower right portion of the task bar (near the clock) and choosing Settings. This icon may not be visible on all computers.]
- In the window that opens, click the Add button.
- (If Keyboard layout/IME is grayed out, put a check in the box.) Find "English
(United States) - Alt-Latin" in the drop down list and choose it. Click
OK.
- Alt-Latin now appears in the list of keyboards. Click Apply.
- To make it your default keyboard (which, if you type primarily in English
or other languages using the Latin alphabet and the US keyboard layout, will
probably not affect your usual typing habits) you must choose it in the drop
down list under Default Input Language at the top of this window. Click OK.
- Now there should be a little keyboard icon in the task bar at the bottom of your screen (if there wasn't already). (It will be next to the blue square with EN in it, which signifies that the current input language is English. If you use no other languages, this icon might not be there.) When you click on the small keyboard icon, a list of keyboard choices pops up. If you made Alt-Latin the default, it should be in bold type. (However, it may not appear until the next time you restart your computer. Until then it might be a blank line in the list.)
Typing with the Alt-Latin Windows keyboard is simple. For most letters you will do things as you always have. When you need a special character or a character with a diacritic or accent, you will use key combinations with the Alt key to the right of the space bar (the one on the left side does not work for this in Windows, unfortunately). For example, to type the letter a with a macron you hold down the Alt key and press the letter a, release them both, then type the letter above which you want the macron. For letters with a dot below, hold down Alt, press the period key, release both, type the letter needing the dot. See the diagrams for clearer explanations.
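The press, release, then type sequence described above behaves much like attaching a diacritic to a base letter. The Python sketch below imitates that behavior with Unicode combining marks and normalization. It is an illustration only, not a description of how Alt-Latin is actually implemented; the `DEAD_KEYS` table and the `compose` function are hypothetical names.

```python
import unicodedata

# Combining marks standing in for the two key combinations described above.
DEAD_KEYS = {
    "macron": "\u0304",     # COMBINING MACRON
    "dot_below": "\u0323",  # COMBINING DOT BELOW
}


def compose(dead_key: str, letter: str) -> str:
    """Attach the diacritic to the letter and return the precomposed character."""
    combined = letter + DEAD_KEYS[dead_key]
    # NFC normalization replaces letter + mark with a single character
    # whenever Unicode defines one.
    return unicodedata.normalize("NFC", combined)


for key, letter in [("macron", "a"), ("macron", "U"), ("dot_below", "h")]:
    result = compose(key, letter)
    print(key, letter, result, f"U+{ord(result):04X}")
```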
Click here for
diagrams of the Alt-Latin keyboard.
There are downloadable pdf files of the diagrams available on the same page,
in case you would like to print them for easier reference while typing.
For those whose ordinary keyboard layout is not the U.S. standard, getting
accustomed to using Alt-Latin will take some effort. We are not aware of specific
Unicode/transliteration keyboard layouts based on the standard layouts of, for
example, German or French computers. However, it is likely that they exist.
Please contact us with any information.
To create your own, you may use Microsoft's Keyboard Layout Creator: http://www.alanwood.net/unicode/utilities_fonts.html#klc,
which is free but unsupported.
Tavultesoft's Keyman software is not free, but is widely used for keyboard layout
creation: http://www.alanwood.net/unicode/utilities_fonts.html#keyman.
Please use the questions and comments link below to contact MEDOC if your questions are not answered by this page or the links presented.
Characters used for Arabic transliteration in the encyclopedia:

| Character | Code | Name | Found in |
| --- | --- | --- | --- |
| Ā | 0100 | Latin capital letter A with macron | Latin Extended-A |
| ā | 0101 | Latin small letter A with macron | Latin Extended-A |
| á | 00E1 | Latin small letter A with acute (used for alif maqsurah) | Latin-1 Supplement |
| Ī | 012A | Latin capital letter I with macron | Latin Extended-A |
| ī | 012B | Latin small letter I with macron | Latin Extended-A |
| Ū | 016A | Latin capital letter U with macron | Latin Extended-A |
| ū | 016B | Latin small letter U with macron | Latin Extended-A |
| ʾ | 02BE | Modifier letter right half ring (transliteration of Arabic hamza) | Spacing Modifier Letters |
| ʿ | 02BF | Modifier letter left half ring (transliteration of Arabic 'ayn) | Spacing Modifier Letters |
| Ḍ | 1E0C | Latin capital letter D with dot below | Latin Extended Additional |
| ḍ | 1E0D | Latin small letter D with dot below | Latin Extended Additional |
| Ḥ | 1E24 | Latin capital letter H with dot below | Latin Extended Additional |
| ḥ | 1E25 | Latin small letter H with dot below | Latin Extended Additional |
| Ṣ | 1E62 | Latin capital letter S with dot below | Latin Extended Additional |
| ṣ | 1E63 | Latin small letter S with dot below | Latin Extended Additional |
| Ṭ | 1E6C | Latin capital letter T with dot below | Latin Extended Additional |
| ṭ | 1E6D | Latin small letter T with dot below | Latin Extended Additional |
| Ẓ | 1E92 | Latin capital letter Z with dot below | Latin Extended Additional |
| ẓ | 1E93 | Latin small letter Z with dot below | Latin Extended Additional |
The following characters are included here as they may be useful for authors who need to enter names or titles in modern Turkish or some European languages. The list is not exhaustive, and will probably grow as necessary. Any character which does not appear here can certainly be found by using the character pickers or other resources mentioned above.

| Character | Code | Name | Found in |
| --- | --- | --- | --- |
| Ç | 00C7 | Latin capital letter C with cedilla | Latin-1 Supplement |
| ç | 00E7 | Latin small letter C with cedilla | Latin-1 Supplement |
| Ğ | 011E | Latin capital letter G with breve | Latin Extended-A |
| ğ | 011F | Latin small letter G with breve | Latin Extended-A |
| İ | 0130 | Latin capital letter I with dot above | Latin Extended-A |
| ı | 0131 | Latin small letter dotless I | Latin Extended-A |
| Ö | 00D6 | Latin capital letter O with diaeresis | Latin-1 Supplement |
| ö | 00F6 | Latin small letter O with diaeresis | Latin-1 Supplement |
| Ü | 00DC | Latin capital letter U with diaeresis | Latin-1 Supplement |
| ü | 00FC | Latin small letter U with diaeresis | Latin-1 Supplement |
| Ş | 015E | Latin capital letter S with cedilla | Latin Extended-A |
| ş | 015F | Latin small letter S with cedilla | Latin Extended-A |

For a more complete explanation of the transliteration system used in the Encyclopedia, see the romanization table that appears on the last page of all issues of MSR and in the MSR editorial and style guidelines (a PDF file).
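Before submitting an article, it may be useful to check which special characters it actually contains. The following Python sketch flags any non-ASCII character whose code is not listed in the tables above. It is a rough aid under stated assumptions, not an official validation tool: the names `TABLE_CODES` and `unexpected_characters` are hypothetical, and because the tables are not exhaustive, a flagged character is not necessarily an error.

```python
import unicodedata

# Hexadecimal codes copied from the two tables above.
TABLE_CODES = [
    "0100", "0101", "00E1", "012A", "012B", "016A", "016B", "02BE", "02BF",
    "1E0C", "1E0D", "1E24", "1E25", "1E62", "1E63", "1E6C", "1E6D", "1E92",
    "1E93", "00C7", "00E7", "011E", "011F", "0130", "0131", "00D6", "00F6",
    "00DC", "00FC", "015E", "015F",
]
ALLOWED = {chr(int(code, 16)) for code in TABLE_CODES}


def unexpected_characters(text: str) -> list:
    """Return the non-ASCII characters in the text that are not in the tables."""
    # Normalize first so that letter + combining mark counts as one character.
    text = unicodedata.normalize("NFC", text)
    return sorted({ch for ch in text if ord(ch) > 0x7F and ch not in ALLOWED})


sample = "al-\u1e92\u0101hir Baybars \u2019"
for ch in unexpected_characters(sample):
    print(f"U+{ord(ch):04X}", unicodedata.name(ch, "unnamed"))
```

In the sample, the curly apostrophe is reported because the tables call for the half-ring modifier letters instead.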










