chrtbl converts a source description of a character classification table into a form that can be used by the character classification functions and multibyte functions (see ctype.3v and mblen.3). The source description is found in filename. If filename is not given, or is given as `-', chrtbl reads its source description from the standard input.
chrtbl creates one or two output files. By default, these files are created in the current working directory. The first file, named by the chrclass token, is always produced and contains the character classification information for all single-byte (7-bit and 8-bit) character code-sets described by one setting of the LC_CTYPE category of locale. The second file is created only if the model token is specified, and contains information relating to details of width and structure of the coded character set currently under definition. The second file is named by appending `.ci' to the value specified by the chrclass token.
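The naming rule above can be illustrated with a minimal Python sketch. The function name output_files and its has_model parameter are hypothetical and used only for illustration; they are not part of chrtbl.

```python
def output_files(chrclass, has_model):
    """Return the output file names described above.

    chrclass  -- the value given to the chrclass token
    has_model -- True if the model token appears in the source description
    (Both parameter names are assumptions made for this sketch.)
    """
    names = [chrclass]                  # the first file is always produced
    if has_model:
        names.append(chrclass + ".ci")  # the second file only with a model token
    return names
```

For the ASCII example later in this page, which has no model token, only a file named ascii would be expected.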
The first output file contains a binary form of the character classification information described in filename. It is structured in such a way that it can be used at run-time to replace the active version of the ctype[] array in the C library. For it to be understood at run-time, the output file must be moved to the /usr/share/lib/locale/LC_CTYPE or /etc/locale directory (see FILES below) by the super-user or a member of group bin. This file must be readable by user, group, and other; no other permission should be set.
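The permission requirement above (read for user, group, and other, and nothing else) corresponds to mode 0444. A minimal Python sketch follows, assuming a POSIX system; the helper name make_read_only_for_all is hypothetical, and the caller supplies the path of the installed file.

```python
import os
import stat

# Read permission for user, group, and other; no write or execute bits (0444).
READ_ONLY_FOR_ALL = stat.S_IRUSR | stat.S_IRGRP | stat.S_IROTH

def make_read_only_for_all(path):
    """Set the mode of path to 0444 and return the resulting permission bits."""
    os.chmod(path, READ_ONLY_FOR_ALL)
    return stat.S_IMODE(os.stat(path).st_mode)
```

Moving the file into the locale directory still has to be done by the super-user or a member of group bin, as stated above; this sketch only covers the mode bits.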
filename contains a sequence of tokens in any order after the chrclass token, each separated by one or more NEWLINE characters or comment lines. The tokens recognized by chrtbl are as follows:
If the model token is declared without arguments, then it is assumed that there is a set of user-defined rules for character code-set announcement. This is noted in the output file and is used later to fold user-defined code into the multibyte functions in the C library (see mblen.3).
Any lines with the number sign (#) in the first column are treated as comments and are ignored. Blank lines are also ignored.
A character can be represented as a hexadecimal or octal constant (for example, the letter a can be represented as 0x61 in hexadecimal or 0141 in octal). Hexadecimal and octal constants may be separated by one or more space and tab characters.
The dash (-) may be used to indicate a range of consecutive numbers. Zero or more space characters may be used for separating the dash character from the numbers.
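The constant and range notation described above can be sketched in Python as follows. This is an illustration of the input syntax only, not chrtbl's own parser; the names parse_constant and expand_codes are hypothetical, and rejecting anything that is neither hexadecimal nor octal is an assumption.

```python
import re

def parse_constant(token):
    """Convert a hexadecimal (0x61) or octal (0141) constant to an integer."""
    t = token.lower()
    if t.startswith("0x"):
        return int(t[2:], 16)
    if t.startswith("0"):
        return int(t, 8)
    # Assumption: only the two forms named in the text are accepted.
    raise ValueError("expected a hexadecimal or octal constant: " + token)

def expand_codes(spec):
    """Expand constants and dash ranges into a list of character codes."""
    codes = []
    # Zero or more spaces may surround the dash, so close them up first.
    normalized = re.sub(r"\s*-\s*", "-", spec)
    for item in normalized.split():
        first, _, last = item.partition("-")
        low = parse_constant(first)
        high = parse_constant(last) if last else low
        codes.extend(range(low, high + 1))  # ranges are inclusive
    return codes
```

With these definitions, expand_codes("0x20 0x9 - 0xd") yields the space character followed by the codes 0x9 through 0xd.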
The backslash character (\) is used for line continuation. Only a RETURN is permitted after the backslash character.
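The comment, blank-line, and continuation rules above can be sketched in Python as a preprocessing step. The function name logical_lines is hypothetical, and two behaviors are assumptions not stated in the text: a `#' in the first column is treated as a comment only when no continuation is pending, and a trailing unfinished continuation is kept rather than rejected.

```python
def logical_lines(text):
    """Join backslash-continued lines; drop comment lines and blank lines."""
    result = []
    pending = ""  # text carried over from lines that ended with a backslash
    for raw in text.split("\n"):
        # A number sign in the first column marks a comment line.
        if not pending and raw.startswith("#"):
            continue
        if raw.endswith("\\"):
            pending += raw[:-1] + " "
            continue
        # Collapse runs of spaces and tabs so joined lines read cleanly.
        line = " ".join((pending + raw).split())
        pending = ""
        if line:  # blank lines are ignored
            result.append(line)
    if pending.strip():
        # Assumption: keep an unfinished continuation at end of input.
        result.append(" ".join(pending.split()))
    return result
```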
The relationship between upper- and lower-case letters (ul) is expressed as ordered pairs of octal or hexadecimal constants:
<upper-case_character lower-case_character>
These two constants may be separated by one or more space characters. Zero or more space characters may be used for separating the angle brackets (<>) from the numbers.
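The ordered-pair notation can be sketched in Python as follows. This only illustrates the syntax described above; parse_ul and the local parse_constant helper are hypothetical names, not part of chrtbl.

```python
import re

# One pair: '<', optional spaces, a constant, spaces, a constant, optional spaces, '>'.
PAIR = re.compile(r"<\s*([^\s<>]+)\s+([^\s<>]+)\s*>")

def parse_constant(token):
    """Convert a hexadecimal (0x41) or octal (0101) constant to an integer."""
    t = token.lower()
    return int(t[2:], 16) if t.startswith("0x") else int(t, 8)

def parse_ul(spec):
    """Map each upper-case code to its lower-case code."""
    return {parse_constant(u): parse_constant(l) for u, l in PAIR.findall(spec)}
```

For example, parse_ul("<0x41 0x61> < 0102 0142 >") maps the codes for A and B to the codes for a and b.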
The following is an example of an input file used to create the ASCII code set definition table in a file named ascii:
    chrclass ascii
    isupper  0x41 - 0x5a
    islower  0x61 - 0x7a
    isdigit  0x30 - 0x39
    isspace  0x20 0x9 - 0xd
    ispunct  0x21 - 0x2f 0x3a - 0x40 \
             0x5b - 0x60 0x7b - 0x7e
    iscntrl  0x0 - 0x1f 0x7f
    isblank  0x20
    isxdigit 0x30 - 0x39 0x61 - 0x66 \
             0x41 - 0x46
    ul       <0x41 0x61> <0x42 0x62> <0x43 0x63> \
             <0x44 0x64> <0x45 0x65> <0x46 0x66> \
             <0x47 0x67> <0x48 0x68> <0x49 0x69> \
             <0x4a 0x6a> <0x4b 0x6b> <0x4c 0x6c> \
             <0x4d 0x6d> <0x4e 0x6e> <0x4f 0x6f> \
             <0x50 0x70> <0x51 0x71> <0x52 0x72> \
             <0x53 0x73> <0x54 0x74> <0x55 0x75> \
             <0x56 0x76> <0x57 0x77> <0x58 0x78> \
             <0x59 0x79> <0x5a 0x7a>
The error messages produced by chrtbl are intended to be self-explanatory. They indicate input errors in the command line or syntactic errors encountered within the input file.
Last modified 11/5/97