A beginner's introduction to typesetting with LaTeX
Appendix C — The ASCII character set
This edition of Formatting Information was prompted by the generous help I have received from TeX users too numerous to mention individually. Shortly after TUGboat published the November 2003 edition, I was reminded by a spate of email of the fragility of documentation for a system like LaTeX which is constantly under development. There have been revisions to packages; issues of new distributions, new tools, and new interfaces; new books and other new documents; corrections to my own errors; suggestions for rewording; and in one or two cases mild abuse for having omitted package X which the author felt to be indispensable to users.
I am grateful as always to the people who sent me corrections and suggestions for improvement. Please keep them coming: only this way can this book reflect what people want to learn. The same limitation still applies, however: no mathematics, as there are already a dozen or more excellent books on the market, as well as other online documents, dealing with mathematical typesetting in TeX and LaTeX in finer and better detail than I am capable of.
The structure remains the same, but I have revised and rephrased a lot of material, especially in the earlier chapters where a new user cannot be expected yet to have acquired any depth of knowledge. Many of the screenshots have been updated, and most of the examples and code fragments have been retested.
As I was finishing this edition, I was asked to review an article for The PracTeX Journal, which grew out of the Practical TeX Conference in 2004. The author specifically took the writers of documentation to task for failing to explain things more clearly, and as I read more, I found myself agreeing, and resolving to clear up some specific problem areas as far as possible. It is very difficult for people who write technical documentation to remember how they struggled to learn what has now become a familiar system. So much of what we do is second nature, and a lot of it actually has nothing to do with the software, but more with the way in which we view and approach information, and the general level of knowledge of computing. If I have obscured something by making unreasonable assumptions about your knowledge, please let me know so that I can correct it.
This document is Copyright © 1999–2005 by Silmaril Consultants under the terms of what is now the GNU Free Documentation License (copyleft).
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation; with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license is included in the section entitled The GNU Free Documentation License.
You are allowed to distribute, reproduce, and modify it without fee or further requirement for consent subject to the conditions in section D.5. The author has asserted his right to be identified as the author of this document. If you make useful modifications you are asked to inform the author so that the master copy can be updated. See the full text of the License in Appendix D.
The ASCII character set
The American Standard Code for Information Interchange was invented in 1963, and after some redevelopment settled down in 1986 as standard X3.4 of the American National Standards Institute (ANSI). It represents the 95 basic codes for the unaccented printable characters and punctuation of the Latin alphabet, plus 33 internal ‘control characters’ originally intended for the control of computers, programs, and external devices like printers and screens.
Many other character sets (strictly speaking, ‘character repertoires’) have been standardised for accented Latin characters and for all other non-Latin writing systems, but these are intended for representing the symbols people use when writing text on computers. Most programs and computers use ASCII internally for all their coding, the exceptions being XML-based languages like XSLT, which are inherently designed to be usable with any writing system, and a few specialist systems like APL.
Although the TeX and LaTeX file formats can easily be used with many other encoding systems (see the discussion of the inputenc package in section 2.7), they are based on ASCII. It is therefore important to know where to find all 95 of the printable characters, as some of them are not often used in other text-formatting systems. The following table shows all 128 characters, with their decimal, octal (base-8), and hexadecimal (base-16) code numbers.
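The visible printable characters can also be listed from within LaTeX itself. The following is a minimal sketch rather than part of the original text: it assumes the T1 font encoding, in which font positions 33 to 126 hold the corresponding ASCII characters, and it uses a hypothetical counter name, \asciicode, chosen only for this illustration. It prints each decimal code beside its character; the space (code 32) is left out because it has no mark of its own to show.

```latex
\documentclass{article}
\usepackage[T1]{fontenc}   % assumption: T1 positions 33-126 match ASCII
\begin{document}
\ttfamily
% \asciicode is a hypothetical counter name used only in this sketch
\newcount\asciicode
\asciicode=33
\loop
  % print the decimal code, then the character at that font position
  \number\asciicode:\ \char\asciicode\quad
  \advance\asciicode by 1
\ifnum\asciicode<127 \repeat
\end{document}
```

The loop stops after code 126 (the tilde), because 127 is the DEL control character and not a printable one.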
The index numbers in the first and last columns are for finding the octal (base-8) and hexadecimal (base-16) values respectively. Replace the arrow with the number or letter from the top of the column (if the arrow points up) or from the bottom of the column (if the arrow points down).
Example: The Escape character (ESC) is octal '033 (03 for the row, 3 for the number at the top of the column because the arrow points up) or hexadecimal "1B (1 for the row, B for the letter at the bottom of the column because the arrow points down).
For the decimal value, multiply the octal row number by eight and add the column number from the top line (for ESC that is 3 × 8 + 3, which makes 27).
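The same conversions can be checked by letting TeX do the arithmetic. This is a small sketch under two assumptions: that the quote characters have their normal meaning (some language packages redefine the double quote), and that the e-TeX \numexpr primitive is available, as it is in current distributions. TeX reads a number prefixed with ' as octal and one prefixed with " as hexadecimal, and \number prints the decimal value.

```latex
\documentclass{article}
\begin{document}
% ' introduces an octal number, " a hexadecimal one (uppercase A-F)
Octal 033 is decimal \number'033.

Hexadecimal 1B is decimal \number"1B.

% The manual method: octal row 03 times eight, plus column 3
By hand: \the\numexpr 3*8+3\relax.
\end{document}
```

All three lines should typeset the value 27, agreeing with the ESC example above.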