.\" Copyright (C) 2001 Information-technology Promotion Agency (IPA)
.\" Copyright (C) 2001-2003
.\" National Institute of Advanced Industrial Science and Technology (AIST)
.\" This file is part of the m17n library documentation.
.\" Permission is granted to copy, distribute and/or modify this document
.\" under the terms of the GNU Free Documentation License, Version 1.2 or
.\" any later version published by the Free Software Foundation; with no
.\" Invariant Section, Front-Cover Texts "The m17n library documentation",
.\" and no Back-Cover Texts. A copy of the license is included in the
.\" appendix entitled "GNU Free Documentation License".
.TH "Character" 3m17n "14 Jul 2007" "" "Version 1.4.0" "" "The m17n Library" \" -*- nroff -*-
.ad l
.nh
.SH NAME
Character \- Character objects and API for them.
.PP
.SS "Variables: Keys of character properties"
These symbols are used as keys of character properties.
.in +1c
.ti -1c
.RI "\fBMSymbol\fP \fBMscript\fP"
.br
.RI "\fIKey for script. \fP"
.ti -1c
.RI "\fBMSymbol\fP \fBMname\fP"
.br
.RI "\fIKey for character name. \fP"
.ti -1c
.RI "\fBMSymbol\fP \fBMcategory\fP"
.br
.RI "\fIKey for general category. \fP"
.ti -1c
.RI "\fBMSymbol\fP \fBMcombining_class\fP"
.br
.RI "\fIKey for canonical combining class. \fP"
.ti -1c
.RI "\fBMSymbol\fP \fBMbidi_category\fP"
.br
.RI "\fIKey for bidi category. \fP"
.ti -1c
.RI "\fBMSymbol\fP \fBMsimple_case_folding\fP"
.br
.RI "\fIKey for corresponding single lowercase character. \fP"
.ti -1c
.RI "\fBMSymbol\fP \fBMcomplicated_case_folding\fP"
.br
.RI "\fIKey for corresponding multiple lowercase characters. \fP"
.in -1c
.SS "Defines"
.in +1c
.ti -1c
.RI "#define \fBMCHAR_MAX\fP"
.br
.RI "\fIMaximum character code. \fP"
.in -1c
.SS "Functions"
.in +1c
.ti -1c
.RI "\fBMSymbol\fP \fBmchar_define_property\fP (const char *name, \fBMSymbol\fP type)"
.br
.RI "\fIDefine a character property. \fP"
.ti -1c
.RI "void * \fBmchar_get_prop\fP (int c, \fBMSymbol\fP key)"
.br
.RI "\fIGet the value of a character property. \fP"
.ti -1c
.RI "int \fBmchar_put_prop\fP (int c, \fBMSymbol\fP key, void *val)"
.br
.RI "\fISet the value of a character property. \fP"
.ti -1c
.RI "\fBMCharTable\fP * \fBmchar_get_prop_table\fP (\fBMSymbol\fP key, \fBMSymbol\fP *type)"
.br
.RI "\fIGet the char-table for a character property. \fP"
.in -1c
.SH "Detailed Description"
.PP
The m17n library represents a \fIcharacter\fP by a character code (an integer). The minimum character code is \fC0\fP. The maximum character code is defined by the macro \fBMCHAR_MAX\fP, which is guaranteed to be at least \fC0x3FFFFF\fP (22 bits).
.PP
Characters \fC0\fP to \fC0x10FFFF\fP are equivalent to the Unicode characters of the same code values.
.PP
A character can have zero or more properties called \fIcharacter properties\fP. A character property consists of a \fIkey\fP and a \fIvalue\fP, where the key is a symbol and the value is anything that can be cast to \fC(void *)\fP. 'The character property that belongs to character C and whose key is K' may be shortened to 'the K property of C'.
.SH "Define Documentation"
.PP
.SS "#define MCHAR_MAX"
.PP
The macro \fBMCHAR_MAX\fP gives the maximum character code.
.SH "Variable Documentation"
.PP
.SS "\fBMSymbol\fP \fBMscript\fP"
.PP
The symbol \fBMscript\fP has the name \fC'script'\fP and is used as the key of a character property. The value of such a property is a symbol representing the script to which the character belongs.
.PP
Each symbol that represents a script has one of the names listed in \fIUnicode Technical Report #24\fP.
.SS "\fBMSymbol\fP \fBMname\fP"
.PP
The symbol \fBMname\fP has the name \fC'name'\fP and is used as the key of a character property. The value of such a property is a C-string representing the name of the character.
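.PP
Because a property value is handed back as a \fC(void *)\fP, the caller has to interpret it according to the key: as a C-string for a name, or as an integer for a key whose value is an integer. The following is a minimal, self-contained sketch of that idea and does not call the m17n library. The function names are hypothetical, and storing an integer by casting it through \fCintptr_t\fP is an assumption made for illustration, not a statement about how the library stores values.
.PP
.nf
```c
#include <stdint.h>

/* Hypothetical helper: pack a small integer into a (void *) value slot.
   Assumes integer values are stored by casting, which is an assumption. */
void *example_int_to_value(int n)
{
    return (void *)(intptr_t)n;
}

/* Hypothetical helper: recover an integer packed by example_int_to_value. */
int example_value_to_int(void *val)
{
    return (int)(intptr_t)val;
}

/* Hypothetical helper: interpret a value as a C-string, as one would for
   a property whose value is a character name. */
const char *example_value_to_name(void *val)
{
    return (const char *)val;
}
```
.fi
.PP
A caller would pick the helper that matches the key it used for the lookup; applying the wrong one would misinterpret the stored value.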
.SS "\fBMSymbol\fP \fBMcategory\fP"
.PP
The symbol \fBMcategory\fP has the name \fC'category'\fP and is used as the key of a character property. The value of such a property is a symbol representing the \fIgeneral category\fP of the character.
.PP
Each symbol that represents a general category has one of the names listed as abbreviations for \fIGeneral Category\fP in Unicode.
.SS "\fBMSymbol\fP \fBMcombining_class\fP"
.PP
The symbol \fBMcombining_class\fP has the name \fC'combining-class'\fP and is used as the key of a character property. The value of such a property is an integer that represents the \fIcanonical combining class\fP of the character.
.PP
The meaning of each integer that represents a canonical combining class is identical to the one defined in Unicode.
.SS "\fBMSymbol\fP \fBMbidi_category\fP"
.PP
The symbol \fBMbidi_category\fP has the name \fC'bidi-category'\fP and is used as the key of a character property. The value of such a property is a symbol that represents the \fIbidirectional category\fP of the character.
.PP
Each symbol that represents a bidirectional category has one of the names listed as types of \fIBidirectional Category\fP in Unicode.
.SS "\fBMSymbol\fP \fBMsimple_case_folding\fP"
.PP
The symbol \fBMsimple_case_folding\fP has the name \fC'simple-case-folding'\fP and is used as the key of a character property. The value of such a property is the corresponding single lowercase character that is used when comparing M-texts ignoring case.
.PP
If a character requires a complicated comparison (i.e. cannot be compared by simply mapping to another single character), the value of such a property is \fC0xFFFF\fP. In this case, the character has another property whose key is \fBMcomplicated_case_folding\fP.
.SS "\fBMSymbol\fP \fBMcomplicated_case_folding\fP"
.PP
The symbol \fBMcomplicated_case_folding\fP has the name \fC'complicated-case-folding'\fP and is used as the key of a character property. The value of such a property is the corresponding M-text that contains a sequence of lowercase characters to be used for comparing M-texts ignoring case.
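.PP
The relationship between the two case-folding properties can be sketched in C. This is a minimal, self-contained illustration that does not call the m17n library: the type \fCexample_folding\fP, the macro \fCEXAMPLE_NEEDS_COMPLICATED\fP and the function \fCexample_fold\fP are hypothetical names, and the folded sequence is held in a plain \fCint\fP array rather than in an M-text.
.PP
.nf
```c
#include <stddef.h>

/* Sentinel value of the simple-case-folding property meaning that the
   complicated-case-folding property must be consulted instead. */
#define EXAMPLE_NEEDS_COMPLICATED 0xFFFF

/* Hypothetical stand-in for the two folding properties of one character. */
struct example_folding {
    int simple;              /* simple-case-folding value */
    const int *complicated;  /* folded sequence, or NULL when unused */
    size_t complicated_len;  /* number of characters in that sequence */
};

/* Store the folded form in out (capacity cap) and return the number of
   characters stored; return 0 if the data is missing or out is too small. */
size_t example_fold(const struct example_folding *f, int *out, size_t cap)
{
    size_t i;

    if (f->simple != EXAMPLE_NEEDS_COMPLICATED) {
        /* Simple case: exactly one replacement character. */
        if (cap < 1)
            return 0;
        out[0] = f->simple;
        return 1;
    }
    /* Complicated case: copy the whole folded sequence. */
    if (f->complicated == NULL || f->complicated_len == 0
        || f->complicated_len > cap)
        return 0;
    for (i = 0; i < f->complicated_len; i++)
        out[i] = f->complicated[i];
    return f->complicated_len;
}
```
.fi
.PP
For a character whose simple folding is a single character, the function stores that one character. For a character such as U+00DF, which Unicode full case folding maps to the two characters 0x73 0x73, the simple value would be the sentinel and the sequence would come from the complicated folding.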