2000-05-15  Miguel de Icaza

	* Makefile.am: Generate the unicodeConf.sh file, include unicodeConf.sh.in file in the distribution.
	* configure.in (UNICODE_INCLUDEDIR, UNICODE_LIBDIR, UNICODE_LIBS): New variables, used to substitute in the unicodeConf.sh file
	* unicodeConf.sh.in: New file, template for generating unicodeConf.sh file.

2000-04-26

	* configure.in: no need for c++ here, removing check.

2000-02-11  Owen Taylor

	* libunicode.spec.in: Add a .spec file
	* autogen.sh: Run libtoolize
	* ltmain.sh ltconfig config.guess config.sub: Removed from CVS

2000-02-10  Tom Tromey

	* unicode.h (EILSEQ): Don't use EBADMSG. Fixed comment.

2000-02-09  Tom Tromey

	* sjis.c (sjis_write): Use empty brace pair, not semicolon, to avoid compiler warning.
	* convert.c (EILSEQ): Don't define.
	* unicode.h: Include (EILSEQ): Define if not defined.

2000-02-09  Dan Winship

	* convert.c (unicode_iconv): Change return type to ssize_t since it can return -1.
	* utf8.h (UTF8_COMPUTE): 0xFE and 0xFF are invalid.
	(UTF8_GET): break after setting result to -1 on error.

2000-01-07  Owen Taylor

	* Makefile.am (AUTOMAKE_OPTIONS): Change AUTOMAKE_OPTIONS from gnits to gnu, since gnits prohibits the LGPL, at least in currently released versions of automake.

1999-11-07  Tom Tromey

	* prop.c (unicode_iswide): Reject some characters that aren't wide. From Markus Kuhn.

1999-11-06  Tor Lillqvist

	* gen-formata.pl
	* sjis.c
	* jis/shiftjis.h
	* msft/cp932.h: Slight cosmetic cleanup, no changes in functionality.

1999-10-31  Tom Tromey

	* unicode.h (unicode_iswide): Declare.
	* prop.c (unicode_iswide): New function.

1999-10-29  Tom Tromey

	* iso8859.c (iso8859_write): Special case old mapping of 8859-7.
	* iso/iso8859-7.h: Remapped 0xa0 and 0xa1 per latest Unicode tables.

1999-10-25  Tom Tromey

	* testsuite/Makefile.am (tinyiconv_SOURCES): New macro.
	(tinyiconv_LDADD): Likewise.
	* configure.in: Create msft/Makefile, jis/Makefile.

1999-10-24  Tor Lillqvist

	* gen-formata.pl: New file.
	Generates tables for converting CJKV encodings with "Format A" style table text files available from www.unicode.org.
	* sjis.c: New file. Handles Shift-JIS encoding of JIS X 0208 (and the closely related DBCS encoding of Microsoft code page 932).
	* Makefile.am: Add it.
	* convert.h
	* init.c: Add the two new encodings.
	* jis/shiftjis.h
	* msft/cp932.h: New files, generated by gen-formata.
	* jis/Makefile.am
	* msft/Makefile.am: New files.
	* testsuite/tinyiconv.c: New file, for generic tests of encodings.
	* testsuite/Makefile.am: Add it.

1999-10-16  Tor Lillqvist

	* configure.in: Check for langinfo.h. Add AC_LIBTOOL_WIN32_DLL to enable dll builds on Win32.
	Note: To actually build libunicode as a DLL on Win32 using the auto* and libtool mechanism, you will currently need some pretty hairy manual intervention, and still it probably won't work. I actually use hand-written makefiles anyway...
	* acconfig.h: Add HAVE_LANGINFO_H.
	* utf8.c: Guard inclusion of langinfo.h
	* testsuite/utf8.c: Don't include netinet/in.h just for one htonl, do it manually. (Good for systems without netinet/in.h.)

1999-09-13  Robert Brady

	* unicode.h, utf8.c: Add prototypes for and implement: unicode_offset_to_index, unicode_index_to_offset, unicode_strchr, unicode_strrchr, unicode_strncpy, unicode_pad_string
	* testsuite/utf8.c: Add more test cases for unicode_strncpy, unicode_offset_to_index, unicode_index_to_offset, unicode_strchr, unicode_strncpy, unicode_last_utf8, unicode_string_width.

1999-09-07  Tom Tromey

	* COPYING.LIB: Added.
	* COPYING: Removed.

1999-09-02  Tom Tromey

	* unicode.h: Document errno substitution.
	* convert.c (unicode_iconv): Use bcopy if memmove doesn't exist.
	(EILSEQ): Define if not defined.
	* configure.in: Check for memmove.
	* ucs2.c (get_one): Pull out correct byte according to endianness.
	(write_one): Likewise.
	(ucs2_native_init): Correctly initialize for big-endianness.
	* testsuite/utf8.c (test_split_utf8_results): Translate to UCS4-native.
	* ucs4.c (ucs4_native_init): Reverted previous change.
	(ubn): Use names UCS4 and UCS-4.
	(un): Use name UCS4-native only.
	* testsuite/utf8.c (test_split_utf8_results): Correctly check result of unicode_iconv().
	* testsuite/Makefile.am (ucs4_SOURCES): New macro.
	(ucs4_LDADD): Likewise.
	(std): Added ucs4.
	* testsuite/ucs4.c: New file.
	* ucs4.c (ucs4_read): Fixed big-endian code.
	(ucs4_write): Likewise.
	* unicode.h: Coding-standard cleanup.
	* testsuite/utf8.c (test_utf8_sequence): Use big-endian (network) byte order for UCS-4.
	* ucs4.c (ucs4_native_init): Always use big-endian.

1999-08-24  Tom Tromey

	* Makefile.am (cvs-dist): New target.

1999-08-20  Tom Tromey

	* utf8.c: Indentation and other pedantry.
	* unicode.h: Likewise.

1999-08-20  Robert Brady

	* unicode.h, init.c (unicode_init): ANSIfied function.
	* unicode.h (unicode_get_charset, unicode_string_width): Added prototype.
	* utf8.c (unicode_get_charset): New function. Returns TRUE if we are in a UTF-8 locale, or if CHARSET is set to "UTF-8". Puts the charset into the parameter if non-NULL.
	(unicode_string_width): Returns the probable visual width of a string.

1999-08-18  Tom Tromey

	* Makefile.am (BUILT_SOURCES): Removed.
	(unicode-config): Removed.

1999-08-18  Raja R Harinath

	Include everything to fall back on for systems with native iconv.
	* Makefile.am (libunicode_la_SOURCES): Include all files unconditionally.
	* init.c (unicode_init): Register all encodings unconditionally.

1999-08-18  Robert Brady

	* testsuite/utf8.c (test_utf8_strlen): New test, to check if unicode_strlen returns the correct value.
	(main): Call it.
	* utf8.c (unicode_strlen): Fix off-by-one error.

1999-08-18  James Blandy

	* testsuite/utf8.c (test_split_utf8_results): New test, verifying that we handle incomplete encodings at the end of the input correctly.
	(main): Call it.

	Use enums for selectors, not #defines.
	* convert.h (struct unicode_iconv_i): Make `type' an enum, with the values unicode_iconv_type_native and unicode_iconv_type_ours.
	* convert.c (NATIVE, OURS): Deleted.
	(unicode_iconv_open, unicode_iconv_close, unicode_iconv): Use the unicode_iconv_type_ values.

	* convert.c: (list_encodings): Remove function, added by accident.
	* testsuite/utf8.c (test_utf8_results): New test function, to catch the bugs below.
	(main): Call it.
	* testsuite/Makefile.am: Doc fix.
	* utf8conv.c (utf8_read): Remember to advance inbuf and inbytesleft by len, not by one.
	* utf8.h (UTF8_GET): Don't forget to wrap all the arguments in parens, to avoid operator associativity problems.
	* convert.h (unicode_encoding_t): Add `outbuf' and `outbytesleft' arguments to the `reset' function, so it can actually produce some text.
	* convert.c (unicode_iconv): Pass those arguments.

1999-08-17  Tom Tromey

	* testsuite/utf8.c: Use unicode_iconv_t.
	* testsuite/roundtrip.c: Use unicode_iconv_t.
	* unicode.h (unicode_iconv_t): New define; replaces old structure and include.
	* convert.h (unicode_iconv_i): Moved from unicode.h and renamed.
	(unicode_iconv_t): New structure.
	* convert.c (unicode_iconv_open): Use native iconv_open if available.
	(unicode_iconv_close): Use native iconv_close, if available. Free buffer in `our' case. Free descriptor.
	(unicode_iconv): Use native iconv, if available.
	(NATIVE): New macro.
	(OURS): Likewise.

1999-08-13  Raja R Harinath

	* convert.c (unicode_iconv_open): Don't dereference pointer after freeing it.

1999-08-13  Tom Tromey

	Efficiency change suggested by Jim Blandy:
	* utf8conv.c (utf8_read): New function.
	(utf8_write): Updated interface. Include "utf8.h".
	(unicode_utf8_encoding): Use utf8_read.
	(unicode_java_utf8_encoding): Likewise.
	* utf8.c (unicode_get_utf8): Rewrote.
	(unicode_utf8_read): Removed. Include "utf8.h".
	* Makefile.am (libunicode_la_SOURCES): Added utf8.h.
	* utf8.h: New file.
	* ucs4.c (ucs4_read): Updated interface.
	(ucs4_write): Likewise.
	* ucs2.c (ucs2_write): Updated interface.
	(ucs2_read): Likewise.
	* latin1.c (latin1_write): Updated interface.
	(latin1_read): Likewise.
	* iso8859.c (iso8859_read): Updated interface.
	(iso8859_write): Likewise.
	* convert.c (unicode_iconv): Rewrote to use new interface.
	(unicode_iconv_open): Allocate intermediate buffer.
	* unicode.h (iconv_s): Added `buffer', `valid', and `size' fields.
	* convert.h (unicode_read_result): New enum; removed READ_ defines.
	(unicode_write_result): Likewise.
	(unicode_encoding_t): Changed interface to read and write.

1999-08-10  Tom Tromey

	* ucs2.c (unicode_ucs2_native_encoding): Added aliases.
	* latin1.c (unicode_latin1_encoding): Added aliases.
	* iso8859.c (ISO_DEFINE): New macro. Use it to define the standard ISO tables.
	(unicode_windows_1252_encoding): Added aliases.
	(P3): New macro.
	* convert.c (find_encoding): Search all names for an encoder. Use case-insensitive compare.
	* convert.h (unicode_encoding_t): Replaced `name' with `names', an array of names.
	* Various files: Changed all users of unicode_encoding_t.

1999-08-10  Raja R Harinath

	* testsuite/utf8.c: Include config.h.
	* testsuite/roundtrip.c: Likewise.
	* Makefile.am (libunicode_la_SOURCES) [NATIVE_ICONV]: Add `init.c' for `unicode_init'.

1999-08-10  Tom Tromey

	* Makefile.am (SUBDIRS): Put `.' at beginning.
	* testsuite/roundtrip.c (main): Fixed names of charsets to conform to libunicode.
	* utf8conv.c (utf8_write): Fixed direction of increment from `++' to `--'. Only loop `len - 1' times.
	* testsuite/roundtrip.c (main): Call unicode_init().
	* testsuite/utf8.c (test_utf8_sequence): Test `i' against -1, not 0. Added \n to failure message. Don't use network byte order for UCS4 characters.
	(main): Call unicode_init().

1999-08-10  Robert Brady

	* testsuite/utf8.c: New test program.
	* testsuite/Makefile.am (std): added utf8.
	(utf8_SOURCES utf8_LDADD): New macros.

1999-08-09  Robert Brady

	* testsuite/roundtrip.c: New test program.
	* testsuite/Makefile.am (std): Added roundtrip.
	(roundtrip_SOURCES cxxsmoke_LDADD): New macros.
	* iso8859.c: (windows_1252_encoding): Renamed "WINDOWS-1252" to "CP1252", for compatibility with glibc.
	* testsuite/.cvsignore: Made CVS quieter.

1999-08-08  Tom Tromey

	* testsuite/cxxsmoke.cc: New file.
	* testsuite/Makefile.am (std): Added cxxsmoke.
	(cxxsmoke_SOURCES): New macro.
	(cxxsmoke_LDADD): New macro.
	* configure.in: Added AC_PROG_CXX.
	* unicode.h: Updated prototypes for const-ification.
	* utf8.c (unicode_previous_utf8): Made arguments `const'.
	(unicode_next_utf8): Made argument `const'.
	(unicode_get_utf8): Likewise.
	(unicode_strlen): Removed unneeded cast.

1999-08-06  Raja R Harinath

	* iso/Makefile.am (noinst_HEADERS): Add windows-1252.h.
	* Makefile.am (include_HEADERS): Move decomp.h and convert.h ...
	(noinst_HEADERS): ... here.

1999-08-06  Tom Tromey

	* unicode.h: Added missing `{'.
	* iso8859.c (unicode_georgian_academy_encoding): Fixed typo in init function name.
	* convert.h (unicode_georgian_ps_encoding): Fixed typo.

1999-08-06  Robert Brady

	* unicode.h, utf8.c: (unicode_strlen): Made const.
	* unicode.h: Made C++-safe.

1999-08-06  Pablo Saratxaga

	* added Georgian encodings (this time without typos :) )

1999-08-04  Raja R Harinath

	Fix typos.
	* iso/armscii-8.h (armscii_8_table): Was `koi8_u_table'.
	* iso/Makefile.am (noinst_HEADERS): Missing .h extns.
	* iso8859.c (unicode_iso8859_10_encoding): Was `unicode_iso8859_14_encoding'.

1999-08-04  Pablo Saratxaga

	* added encodings for iso-8859-10, koi8-u (used in Ukraine), armscii-8 (Armenian) and tis-620 (Thai)

1999-08-02  Owen Taylor

	* autogen.sh: Add an autogen.sh. (Simplified from the one in GTK+)

1999-08-02  Owen Taylor

	* testsuite/Makefile.am: Unconditionalize a conditional to avoid bugs in automake-1.4a.

1999-07-29  Owen Taylor

	* Makefile.am (libunicode_la_SOURCES): Add utf8.c for NATIVE_ICONV.
	* unicode.h: Added for size_t
	* configure.in: Use AM_PROG_LIBTOOL instead of AC_PROG_LIBTOOL, since AC_PROG_LIBTOOL doesn't appear in any released version of libtool.
	* unicode-config.in: Generate and install a GTK+-style unicode-config for CFLAGS.

1999-07-28  Robert Brady

	* init.c, iso8859.c, convert.h, iso/windows-1252.h, iso/koi8-r.h: Added support for important legacy encodings Windows-1252 and KOI8-R.

1999-07-28  Tom Tromey

	* Makefile.am (AM_CFLAGS): New macro.
	* configure.in (cflags): New subst
	* utf8conv.c (utf8_write): Make `len' a `size_t' to avoid warning. Removed unused variable.
	* utf8.c (unicode_utf8_read): Make `len' a `size_t' to avoid warning.
	* prop.c (unicode_istitle): Use unsigned to avoid warning.
	(unicode_toupper): Likewise.
	(unicode_tolower): Likewise.
	(unicode_totitle): Likewise.
	* latin1.c (latin1_write): Used `unsigned int' to avoid warning.
	* convert.c (unicode_iconv): Removed unused variable.
	* ucs2.c (unicode_ucs2_native_encoding): Reference ucs2_native_init.
	* utf8.c (unicode_previous_utf8): Use `char', not `unsigned char'.
	(unicode_next_utf8): Likewise.
	(unicode_strlen): Likewise.
	(unicode_get_utf8): Likewise.

1999-07-27  Tom Tromey

	* unicode.h (unicode_strlen): Updated.
	* utf8.c (unicode_strlen): Changed interface.

1999-07-26  Raja R Harinath

	* ucs2.c (write_one): Fix off-by-one error.
	(get_one): Likewise.
	* ucs4.c (ucs4_read): Likewise.
	(ucs4_write): Likewise.

Wed Jul 28 00:52:27 1999  Owen Taylor

	* utf8.c (unicode_next_utf8): Fixed reversed check.

1999-07-11  Tom Tromey

	* testsuite/ordering.c (check): New function.
	(check_decomp): Likewise.
	(main): Use them.
	(t2_src): New array.
	(t3_src): Likewise.
	(t4_src): Likewise.
	(t5_src): Likewise.
	(t6_src): Likewise.
	(t5_dst): Likewise.
	(t7_dst): Likewise.
	(t8_src): Likewise.
	(t7_src): Likewise.
	* decomp.h: Rebuilt.
	* gen-table.pl (process_one): Note decompositions.
	(fetch_cclass): Renamed.
	(expand_decomp): New sub.
	(print_decomp): Print decomposition table.
	* unicode.h (unicode_canonical_decomposition): Declare.
	* decomp.c (asize): New macro.
	(unicode_canonical_decomposition): Wrote.
	* testsuite/ordering.c: New file.
	* configure.in: Create iso/Makefile and testsuite/Makefile.
	* testsuite/Makefile.am: New file.
	* iso/Makefile.am: New file.
	* Makefile.am (SUBDIRS): New macro.
	* unicode.h (unicode_canonical_ordering): Declare.
	* Makefile.am (libunicode_la_SOURCES): Added decomp.c.
	* decomp.h: New file.
	* chartables.h: Rebuilt.
	* gen-table.pl (fetch_type): New sub.
	(fetch_attr): Likewise.
	(mappings): Added UNICODE_ prefix to each value.
	(print_row): Rewrote.
	(print_attr_row): Removed.
	(print_tables): Changed to use new print_row.
	(print_decomp): New sub.
	(fetch_cclas): New sub.
	(process_one): Note combining class of character.
	* decomp.c: New file.
	* iso8859.c (iso8859_write): Mask off high bits for clarity.
	* iso8859.c (iso8859_read): Special-case digits in 8859-6.
	(iso8859_write): Likewise. Also, initialize `xlate'.
	* Makefile.am (libunicode_la_SOURCES): Added iso8859.c.
	* init.c (unicode_init): Register 8859-* encoders.
	* convert.h: Declare 8859-* encoding structures.
	* iso/iso8859-14.h, iso/iso8859-15.h, iso/iso8859-2.h, iso/iso8859-3.h, iso/iso8859-4.h, iso/iso8859-5.h, iso/iso8859-6.h, iso/iso8859-7.h, iso/iso8859-8.h, iso/iso8859-9.h: New files.
	* iso8859.c: New file.
	* gen-iso.pl: New file.
	* init.c (unicode_init): Register unicode_ucs4_native_encoding.
	* convert.h (unicode_ucs4_native_encoding): Declare.
	* ucs4.c (ucs4_native_init): New function.
	(unicode_ucs4_native_encoding): New struct.
	* Makefile.am (libunicode_la_SOURCES): Removed ascii.c.
	* ascii.c: Deleted.
	* latin1.c (unicode_ascii_encoding): Moved from ascii.c. Set `init' element.
	(latin1_init): New function.
	(ascii_init): Likewise.
	(latin1_write): Use stored mask, not constant.
	(unicode_latin1_encoding): Set `init' element.
	* acconfig.h: New file.
	* utf8.c (unicode_get_utf8): Don't bother to compute length explicitly.
	* ucs2.c: Removed erroneous comment.
	* ucs2.c (ucs2_read): Handle surrogate pairs.
	(get_one): New function.
	(write_one): Likewise.
	(ucs2_write): Write surrogate pairs.
	* chartables.h: Rebuilt.
	* gen-table.pl (%gaps): New global.
	(CJK, HANGUL, SURROGATES, PRIVATE_USE): Removed.
	* ucs2.c (HIGH_SURROGATE): New macro.
	(LOW_SURROGATE): Likewise.
	(LAST_SURROGATE): Likewise.

1999-07-10  Tom Tromey

	* init.c (unicode_init): Initialize unicode_ucs2_native_encoding.
	* convert.h (unicode_ucs2_native_encoding): Declare.
	* ucs2.c (ucs2_native_init): New function.
	(unicode_ucs2_native_encoding): New structure.

1999-07-09  Tom Tromey

	* acconfig.h: New file.
	* configure.in: Check for existence of native iconv.
	* init.c (unicode_init): Conditionalize on UNICODE_USE_SYSTEM_ICONV.

1999-07-08  Tom Tromey

	* init.c (unicode_init): Register unicode_java_utf8_encoding.
	* convert.h (unicode_java_utf8_encoding): Declare.
	* utf8.c (unicode_utf8_read): Handle 4, 5, and 6 byte characters.
	* utf8conv.c (utf8_std_init): New function
	(unicode_utf8_encoding): Use it.
	(utf8_java_init): Likewise.
	(unicode_java_utf8_encoding): New global.
	(utf8_write): Handle Java and standard encodings. Handle 4, 5, and 6 byte characters.
	* configure.in: Version 0.2.
	* prop.c (TTYPE): New macro.
	(TYPE): Use it.
	All functions and macros: Use UNICODE_* name for types.
	(ATTTABLE): New macro.
	(unicode_toupper): Use it.
	(unicode_totitle): Likewise.
	(unicode_tolower): Likewise.
	(unicode_digit_value): Likewise.
	(unicode_xdigit_value): Likewise.
	* chartables.h: Rebuilt.
	* gen-table.pl (print_attr_row): New sub.
	(print_tables): Use it.
	(print_row): Start loop at $start, not 0.
	* utf8.c (unicode_utf8_read): Moved back from utf8conv.c. No longer static.
	* utf8conv.c (unicode_utf8_len): Removed.
	(unicode_utf8_encoding): Mention unicode_utf8_read.
	* unicode.h: Declare unicode_previous_utf8.
	* utf8.c (unicode_previous_utf8): New function.
	(unicode_get_utf8): Rewrote. Changed interface.
	(unicode_next_utf8): Rewrote.
	* chartables.h: Rebuilt.
	* gen-table.pl (print_tables): Don't print character classification constants.
	* unicode.h: Define UNICODE_* character classification constants.
	* Makefile.am (libunicode_la_SOURCES): Added utf8conv.c.
	* utf8.c: Moved converter functions to utf8conv.c.
	* utf8conv.c: New file.
	* prop.c (unicode_xdigit_value): New function.
	(unicode_type): Likewise
	* unicode.h (unicode_xdigit_value): Declare.
	(unicode_type): Likewise. Also, add comments describing functions.

1999-07-06  Tom Tromey

	* gen-table.pl (print_row): New sub.
	(print_tables): Use it.

1999-07-04  Tom Tromey

	* utf8.c (unicode_utf8_len): Removed.
	(unicode_to_utf8): Removed.
	(unicode_get_utf8_and_advance): Removed.
	(utf8_write): New function.
	(utf8_read): New function.
	(unicode_get_utf8): Rewrote.
	* init.c (unicode_init): Register UCS2 encoders.
	* convert.h (unicode_ucs2_big_encoding, unicode_ucs2_little_encoding): Declare.
	* Makefile.am (libunicode_la_SOURCES): Added ucs2.c.
	* ucs2.c: New file.
	* Makefile.am (libunicode_la_SOURCES): Added ucs4.c.
	* init.c (unicode_init): Register UCS4 encoders.
	* convert.h (unicode_ucs4_big_encoding, unicode_ucs4_little_encoding): Declare.
	* ucs4.c: New file.
	* Makefile.am (libunicode_la_SOURCES): Conditionalize on NATIVE_ICONV.
	* configure.in (NATIVE_ICONV): New conditional.
	* latin1.c: Rewrote.
	* unicode.h (iconv_s, unicode_iconv_open, unicode_iconv_close, unicode_iconv): Declare.
	* ascii.c: Rewrote.
	* convert.c: Include convert.h.
	(find_encoding): Renamed; now static.
	(unicode_buffer_to_encoding): Deleted.
	(unicode_buffer_from_encoding): Deleted.
	(unicode_iconv_open): New function.
	(unicode_iconv_close): Likewise.
	(unicode_iconv): Likewise.
	* init.c: Include convert.h.
	* convert.h: New file.

1999-06-30  Tom Tromey

	* chartables.h: Rebuilt.
	* gen-table.pl (process_one): Handle titlecase.
	(print_tables): Likewise.
	* prop.c (ISDIGIT, ISALPHA, asize): New macros.
	(unicode_isalnum, unicode_isalpha, unicode_iscntrl, unicode_isdigit, unicode_isgraph, unicode_isprint, unicode_ispunct, unicode_isxdigit): New functions.
	(unicode_toupper): Rewrote.
	(unicode_tolower): Likewise.
	(unicode_totitle): Likewise.
	(unicode_istitle): Likewise.

1999-06-27  Tom Tromey

	* Started.