| author | Ulrich Drepper <drepper@redhat.com> | 1999-08-27 19:52:08 +0000 |
|---|---|---|
| committer | Ulrich Drepper <drepper@redhat.com> | 1999-08-27 19:52:08 +0000 |
| commit | 6dd5b57e8bb397cd7bffdc4a88ccd720c7596734 (patch) | |
| tree | 09ac1b062db055227044d1451e0c49b18dc334eb /manual | |
| parent | 2e8a853b6c9e1034dbd0c2e6be34bbed8e0357b6 (diff) | |
Update.
* manual/ctype.texi: Likewise.
* manual/locale.texi: Likewise.
Diffstat (limited to 'manual')
| -rw-r--r-- | manual/ctype.texi | 89 |
| -rw-r--r-- | manual/locale.texi | 482 |
2 files changed, 282 insertions, 289 deletions
diff --git a/manual/ctype.texi b/manual/ctype.texi index b5ab6bae3d..0d3ab60aa2 100644 --- a/manual/ctype.texi +++ b/manual/ctype.texi @@ -266,34 +266,34 @@ with the SVID. @section Character class determination for wide characters The second amendment to @w{ISO C89} defines functions to classify wide -characters. The original @w{ISO C89} standard defined the type -@code{wchar_t} but failed to define any functions to operate on wide -characters. +characters. Although the original @w{ISO C89} standard already defined +the type @code{wchar_t}, no functions operating on them were defined. The general design of the classification functions for wide characters -is more general. It allows extending the set of available -classifications beyond the set which is always available. The POSIX -standard specifies how the extension can be done and this is already +is more general. It allows extensions to the set of available +classifications, beyond those which are always available. The POSIX +standard specifies how extensions can be made, and this is already implemented in the GNU C library implementation of the @code{localedef} program. -The character class functions are normally implemented using bitsets. -I.e., for the character in question the appropriate bitset is read from -a table and a test is performed to determine whether a certain bit is -set in this bitset. Which bit is tested for is determined by the class. +The character class functions are normally implemented with bitsets, +with a bitset per character. For a given character, the appropriate +bitset is read from a table and a test is performed as to whether a +certain bit is set. Which bit is tested for is determined by the +class. For the wide character classification functions this is made visible. -There is a type representing the classification, a function to retrieve -this value for a specific class, and a function to test using the -classification value whether a given character is in this class. 
On top -of this the normal character classification functions as used for +There is a type classification type defined, a function to retrieve this +value for a given class, and a function to test whether a given +character is in this class, using the classification value. On top of +this the normal character classification functions as used for @code{char} objects can be defined. @comment wctype.h @comment ISO @deftp {Data type} wctype_t The @code{wctype_t} can hold a value which represents a character class. -The ony defined way to generate such a value is by using the +The only defined way to generate such a value is by using the @code{wctype} function. @pindex wctype.h @@ -306,8 +306,8 @@ This type is defined in @file{wctype.h}. The @code{wctype} returns a value representing a class of wide characters which is identified by the string @var{property}. Beside some standard properties each locale can define its own ones. In case -no property with the given name is known for the current locale for the -@code{LC_CTYPE} category the function returns zero. +no property with the given name is known for the current locale +selected for the @code{LC_CTYPE} category, the function returns zero. @noindent The properties known in every locale are: @@ -339,11 +339,11 @@ by a successful call to @code{wctype}. This function is declared in @file{wctype.h}. @end deftypefun -This makes it easier to use the commonly-used classification functions -that are defined in the C library. There is no need to use +To make it easier to use the commonly-used classification functions, +they are defined in the C library. There is no need to use @code{wctype} if the property string is one of the known character classes. In some situations it is desirable to construct the property -string and then it becomes important that @code{wctype} can also handle the +strings, and then it is important that @code{wctype} can also handle the standard classes. 
@cindex alphanumeric character @@ -420,7 +420,7 @@ wide characters: @smallexample n = 0; -while (iswctype (*wc)) +while (iswdigit (*wc)) @{ n *= 10; n += *wc++ - L'0'; @@ -604,11 +604,11 @@ This function is a GNU extension. It is declared in @file{wchar.h}. @node Using Wide Char Classes, Wide Character Case Conversion, Classification of Wide Characters, Character Handling @section Notes on using the wide character classes -The first note is probably nothing astonishing but still occasionally a +The first note is probably not astonishing but still occasionally a cause of problems. The @code{isw@var{XXX}} functions can be implemented using macros and in fact, the GNU C library does this. They are still available as real functions but when the @file{wctype.h} header is -included the macros will be used. This is nothing new compared to the +included the macros will be used. This is the same as the @code{char} type versions of these functions. The second note covers something new. It can be best illustrated by a @@ -630,8 +630,8 @@ is_in_class (int c, const char *class) @} @end smallexample -Now with the @code{wctype} and @code{iswctype} one could avoid the -@code{if} cascades. But rewriting the code as follows is wrong: +Now, with the @code{wctype} and @code{iswctype} you can avoid the +@code{if} cascades, but rewriting the code as follows is wrong: @smallexample int @@ -644,7 +644,7 @@ is_in_class (int c, const char *class) The problem is that it is not guaranteed that the wide character representation of a single-byte character can be found using casting. -In fact, usually this fails miserably. The correct solution for this +In fact, usually this fails miserably. The correct solution to this problem is to write the code as follows: @smallexample @@ -657,10 +657,10 @@ is_in_class (int c, const char *class) @end smallexample @xref{Converting a Character}, for more information on @code{btowc}. 
-Please note that this change probably does not improve the performance +Note that this change probably does not improve the performance of the program a lot since the @code{wctype} function still has to make -the string comparisons. But it gets really interesting if the -@code{is_in_class} function would be called more than once using the +the string comparisons. It gets really interesting if the +@code{is_in_class} function is called more than once for the same class name. In this case the variable @var{desc} could be computed once and reused for all the calls. Therefore the above form of the function is probably not the final one. @@ -669,18 +669,17 @@ function is probably not the final one. @node Wide Character Case Conversion, , Using Wide Char Classes, Character Handling @section Mapping of wide characters. -As for the classification functions, the @w{ISO C} standard also -generalizes the mapping functions. Instead of only allowing the two -standard mappings, the locale can contain others. Again, the -@code{localedef} program already supports generating such locale data -files. +The classification functions are also generalized by the @w{ISO C} +standard. Instead of just allowing the two standard mappings, a +locale can contain others. Again, the @code{localedef} program +already supports generating such locale data files. @comment wctype.h @comment ISO @deftp {Data Type} wctrans_t This data type is defined as a scalar type which can hold a value representing the locale-dependent character mapping. There is no way to -construct such a value except using the return value of the +construct such a value apar from using the return value of the @code{wctrans} function. @pindex wctype.h @@ -693,8 +692,8 @@ This type is defined in @file{wctype.h}. @deftypefun wctrans_t wctrans (const char *@var{property}) The @code{wctrans} function has to be used to find out whether a named mapping is defined in the current locale selected for the -@code{LC_CTYPE} category. 
If the returned value is non-zero it can -afterwards be used in calls to @code{towctrans}. If the return value is +@code{LC_CTYPE} category. If the returned value is non-zero, you can use +it afterwards in calls to @code{towctrans}. If the return value is zero no such mapping is known in the current locale. Beside locale-specific mappings there are two mappings which are @@ -707,15 +706,15 @@ guaranteed to be available in every locale: @pindex wctype.h @noindent -This function is declared in @file{wctype.h}. +These functions are declared in @file{wctype.h}. @end deftypefun @comment wctype.h @comment ISO @deftypefun wint_t towctrans (wint_t @var{wc}, wctrans_t @var{desc}) -The @code{towctrans} function maps the input character @var{wc} -according to the rules of the mapping for which @var{desc} is an -descriptor and returns the value so found. The @var{desc} value must be +@code{towctrans} maps the input character @var{wc} +according to the rules of the mapping for which @var{desc} is a +descriptor, and returns the value it finds. @var{desc} must be obtained by a successful call to @code{wctrans}. @pindex wctype.h @@ -723,8 +722,8 @@ obtained by a successful call to @code{wctrans}. This function is declared in @file{wctype.h}. @end deftypefun -The @w{ISO C} standard also defines for the generally available mappings -convenient shortcuts so that it is not necesary to call @code{wctrans} +For the generally available mappings, the @w{ISO C} standard defines +convenient shortcuts so that it is not necessary to call @code{wctrans} for them. @comment wctype.h @@ -765,6 +764,6 @@ This function is declared in @file{wctype.h}. @end deftypefun The same warnings given in the last section for the use of the wide -character classification function applies here. It is not possible to +character classification functions apply here. It is not possible to simply cast a @code{char} type value to a @code{wint_t} and use it as an -argument for @code{towctrans} calls. 
+argument to @code{towctrans} calls. diff --git a/manual/locale.texi b/manual/locale.texi index 6cfacbdb8c..096ac48105 100644 --- a/manual/locale.texi +++ b/manual/locale.texi @@ -99,7 +99,7 @@ most of Spain. The set of locales supported depends on the operating system you are using, and so do their names. We can't make any promises about what locales will exist, except for one standard locale called @samp{C} or -@samp{POSIX}. Later we will describe how to construct locales XXX. +@samp{POSIX}. Later we will describe how to construct locales. @comment (@pxref{Building Locale Files}). @cindex combining locales @@ -183,12 +183,12 @@ to use for all purposes except as overridden by the variables above. @vindex LANGUAGE When developing the message translation functions it was felt that the -functionality provided by the variables above is not sufficient. E.g., it -should be possible to specify more than one locale name. For an example -take a Swedish user who better speaks German than English, the programs -messages by default are written in English. Then it should be possible -to specify that the first choice for the language is Swedish, the second -choice is German, and if this also fails English is used. This is +functionality provided by the variables above is not sufficient. For +example, it should be possible to specify more than one locale name. +Take a Swedish user who better speaks German than English, and a program +whose messages are output in English by default. It should be possible +to specify that the first choice of language is Swedish, the second +German, and if this also fails to use English. This is possible with the variable @code{LANGUAGE}. For further description of this GNU extension see @ref{Using gettextized software}. @@ -226,7 +226,7 @@ category @var{category} to @var{locale}. If @var{category} is @code{LC_ALL}, this specifies the locale for all purposes. 
The other possible values of @var{category} specify an -individual purpose (@pxref{Locale Categories}). +single purpose (@pxref{Locale Categories}). You can also use this function to find out the current locale by passing a null pointer as the @var{locale} argument. In this case, @@ -250,19 +250,19 @@ don't make any promises about what it looks like. But if you specify the same ``locale name'' with @code{LC_ALL} in a subsequent call to @code{setlocale}, it restores the same combination of locale selections. -To ensure to be able to use the string encoding the currently selected -locale at a later time one has to make a copy of the string. It is not -guaranteed that the return value stays valid all the time. +To be sure you can use the returned string encoding the currently selected +locale at a later time, you must make a copy of the string. It is not +guaranteed that the returned pointer remains valid over time. When the @var{locale} argument is not a null pointer, the string returned -by @code{setlocale} reflects the newly modified locale. +by @code{setlocale} reflects the newly-modified locale. If you specify an empty string for @var{locale}, this means to read the appropriate environment variable and use its value to select the locale for @var{category}. -If a nonempty string is given for @var{locale} the locale with this name -is used, if this is possible. +If a nonempty string is given for @var{locale}, then the locale of that +name is used if possible. If you specify an invalid locale name, @code{setlocale} returns a null pointer and leaves the current locale unchanged. @@ -303,7 +303,7 @@ with_other_locale (char *new_locale, @end smallexample @strong{Portability Note:} Some @w{ISO C} systems may define additional -locale categories and future versions of the library will do so. For +locale categories, and future versions of the library will do so. For portability, assume that any symbol beginning with @samp{LC_} might be defined in @file{locale.h}. 
@@ -332,7 +332,7 @@ Defining and installing named locales is normally a responsibility of the system administrator at your site (or the person who installed the GNU C library). It is also possible for the user to create private locales. All this will be discussed later when describing the tool to -do so XXX. +do so. @comment (@pxref{Building Locale Files}). If your program needs to use something other than the @samp{C} locale, @@ -342,27 +342,27 @@ locale explicitly by name. Remember, different machines might have different sets of locales installed. @node Locale Information, Formatting Numbers, Standard Locales, Locales -@section Accessing the Locale Information +@section Accessing Locale Information -There are several ways to access the locale information. The simplest +There are several ways to access locale information. The simplest way is to let the C library itself do the work. Several of the -functions in this library access implicitly the locale data and use -what information is available in the currently selected locale. This is +functions in this library implicitly access the locale data, and use +what information is provided by the currently selected locale. This is how the locale model is meant to work normally. -As an example take the @code{strftime} function which is meant to nicely +As an example take the @code{strftime} function, which is meant to nicely format date and time information (@pxref{Formatting Date and Time}). Part of the standard information contained in the @code{LC_TIME} -category are, e.g., the names of the months. Instead of requiring the +category is the names of the months. Instead of requiring the programmer to take care of providing the translations the -@code{strftime} function does this all by itself. When using @code{%A} -in the format string this will be replaced by the appropriate weekday -name of the locale currently selected for @code{LC_TIME}. 
This is the -easy part and wherever possible functions do things automatically as in -this case. - -But there are quite often situations when there is simply no functions -to perform the task or it is simply not possible to do the work +@code{strftime} function does this all by itself. @code{%A} +in the format string is replaced by the appropriate weekday +name of the locale currently selected by @code{LC_TIME}. This is an +easy example, and wherever possible functions do things automatically +in this way. + +But there are quite often situations when there is simply no function +to perform the task, or it is simply not possible to do the work automatically. For these cases it is necessary to access the information in the locale directly. To do this the C library provides two functions: @code{localeconv} and @code{nl_langinfo}. The former is @@ -379,14 +379,13 @@ as far as the system follows the Unix standards. @subsection @code{localeconv}: It is portable but @dots{} Together with the @code{setlocale} function the @w{ISO C} people -invented @code{localeconv} function. It is a masterpiece of misdesign. -It is expensive to use, it is not extendable, and is not generally -usable as it provides access only to the @code{LC_MONETARY} and -@code{LC_NUMERIC} related information. If it is applicable for a -certain situation it should nevertheless be used since it is very -portable. In general it is better to use the function @code{strfmon} -which can be used to format monetary amounts correctly according to the -selected locale by implicitly using this information. +invented the @code{localeconv} function. It is a masterpiece of poor +design. It is expensive to use, not extendable, and not generally +usable as it provides access to only @code{LC_MONETARY} and +@code{LC_NUMERIC} related information. Nevertheless, if it is +applicable to a given situation it should be used since it is very +portable. 
The function @code{strfmon} formats monetary amounts +according to the selected locale using this information. @pindex locale.h @cindex monetary value formatting @cindex numeric value formatting @@ -407,8 +406,8 @@ value. @comment locale.h @comment ISO @deftp {Data Type} {struct lconv} -This is the data type of the value returned by @code{localeconv}. Its -elements are described in the following subsections. +@code{localeconv}'s return value is of this data type. Its elements are +described in the following subsections. @end deftp If a member of the structure @code{struct lconv} has type @code{char}, @@ -487,7 +486,7 @@ members have the same value.) In the standard @samp{C} locale, both of these members have the value @code{CHAR_MAX}, meaning ``unspecified''. The ISO standard doesn't say -what to do when you find this the value; we recommend printing no +what to do when you find this value; we recommend printing no fractional digits. (This locale also specifies the empty string for @code{mon_decimal_point}, so printing any fractional digits would be confusing!) @@ -521,8 +520,8 @@ The local currency symbol for the selected locale. In the standard @samp{C} locale, this member has a value of @code{""} (the empty string), meaning ``unspecified''. The ISO standard doesn't say what to do when you find this value; we recommend you simply print -the empty string as you would print any other string found in the -appropriate member. +the empty string as you would print any other string pointed to by this +variable. @item char *int_curr_symbol The international currency symbol for the selected locale. @@ -533,9 +532,9 @@ three-letter abbreviation determined by the international standard followed by a one-character separator (often a space). In the standard @samp{C} locale, this member has a value of @code{""} -(the empty string), meaning ``unspecified''. We recommend you simply -print the empty string as you would print any other string found in the -appropriate member. 
+(the empty string), meaning ``unspecified''. We recommend you simply print +the empty string as you would print any other string pointed to by this +variable. @item char p_cs_precedes @itemx char n_cs_precedes @@ -547,8 +546,8 @@ negative amounts. In the standard @samp{C} locale, both of these members have a value of @code{CHAR_MAX}, meaning ``unspecified''. The ISO standard doesn't say -what to do when you find this value, but we recommend printing the -currency symbol before the amount. That's right for most countries. +what to do when you find this value. We recommend printing the +currency symbol before the amount, which is right for most countries. In other words, treat all nonzero values alike in these members. The POSIX standard says that these two members apply to the @@ -573,7 +572,7 @@ negative amounts. In the standard @samp{C} locale, both of these members have a value of @code{CHAR_MAX}, meaning ``unspecified''. The ISO standard doesn't say what you should do when you find this value; we suggest you treat it as -one (print a space). In other words, treat all nonzero values alike in +1 (print a space). In other words, treat all nonzero values alike in these members. These members apply only to @code{currency_symbol}. When you use @@ -581,7 +580,7 @@ These members apply only to @code{currency_symbol}. When you use @code{int_curr_symbol} itself contains the appropriate separator. The POSIX standard says that these two members apply to the -@code{int_curr_symbol} as well as the @code{currency_symbol}. But an +@code{int_curr_symbol} as well as the @code{currency_symbol}. However, an example in the @w{ISO C} standard clearly implies that they should apply only to the @code{currency_symbol}---that the @code{int_curr_symbol} contains any appropriate separator, so you should never print an @@ -592,16 +591,16 @@ printing international currency symbols, and print no extra space. 
@end table @node Sign of Money Amount, , Currency Symbol, The Lame Way to Locale Data -@subsubsection Printing the Sign of an Amount of Money +@subsubsection Printing the Sign of a Monetary Amount These members of the @code{struct lconv} structure specify how to print -the sign (if any) in a monetary value. +the sign (if any) of a monetary value. @table @code @item char *positive_sign @itemx char *negative_sign These are strings used to indicate positive (or zero) and negative -(respectively) monetary quantities. +monetary quantities, respectively. In the standard @samp{C} locale, both of these members have a value of @code{""} (the empty string), meaning ``unspecified''. @@ -615,7 +614,7 @@ unreasonable.) @item char p_sign_posn @itemx char n_sign_posn -These members have values that are small integers indicating how to +These members are small integers that indicate how to position the sign for nonnegative and negative monetary quantities, respectively. (The string used by the sign is what was specified with @code{positive_sign} or @code{negative_sign}.) The possible values are @@ -650,36 +649,35 @@ symbol. It is not clear whether you should let these members apply to the international currency format or not. POSIX says you should, but intuition plus the examples in the @w{ISO C} standard suggest you should -not. We hope that someone who knows well the conventions for formatting -monetary quantities will tell us what we should recommend. +not. We hope that someone who knows the conventions for formatting +monetary quantities well will tell us what we should recommend. @node The Elegant and Fast Way, , The Lame Way to Locale Data, Locale Information @subsection Pinpoint Access to Locale Data When writing the X/Open Portability Guide the authors realized that the @code{localeconv} function is not enough to provide reasonable access to -the locale information. The information which was meant to be available +locale information. 
The information which was meant to be available in the locale (as later specified in the POSIX.1 standard) requires more -possibilities to access it. Therefore the @code{nl_langinfo} function +ways to access it. Therefore the @code{nl_langinfo} function was introduced. @comment langinfo.h @comment XOPEN @deftypefun {char *} nl_langinfo (nl_item @var{item}) The @code{nl_langinfo} function can be used to access individual -elements of the locale categories. I.e., unlike the @code{localeconv} -function which always returns all the information @code{nl_langinfo} -lets the caller select what information is necessary. This is very -fast and it is no problem to call this function multiple times. +elements of the locale categories. Unlike the @code{localeconv} +function, which returns all the information, @code{nl_langinfo} +lets the caller select what information it requires. This is very +fast and it is not a problem to call this function multiple times. -The second advantage is that not only the numeric and monetary -formatting information is available. Also the information of the +A second advantage is that in addition to the numeric and monetary +formatting information, information from the @code{LC_TIME} and @code{LC_MESSAGES} categories is available. -The type @code{nl_type} is defined in @file{nl_types.h}. -The argument @var{item} is a numeric values which must be one of the -values defined in the header @file{langinfo.h}. The X/Open standard -defines the following values: +The type @code{nl_type} is defined in @file{nl_types.h}. The argument +@var{item} is a numeric value defined in the header @file{langinfo.h}. +The X/Open standard defines the following values: @vtable @code @item ABDAY_1 @@ -698,7 +696,7 @@ corresponds to Sunday. @itemx DAY_5 @itemx DAY_6 @itemx DAY_7 -Similar to @code{ABDAY_1} etc, but here the return value is the +Similar to @code{ABDAY_1} etc., but here the return value is the unabbreviated weekday name. 
@item ABMON_1 @itemx ABMON_2 @@ -712,7 +710,7 @@ unabbreviated weekday name. @itemx ABMON_10 @itemx ABMON_11 @itemx ABMON_12 -The return value is abbreviated name for the month names. @code{ABMON_1} +The return value is abbreviated name of the month. @code{ABMON_1} corresponds to January. @item MON_1 @itemx MON_2 @@ -726,129 +724,127 @@ corresponds to January. @itemx MON_10 @itemx MON_11 @itemx MON_12 -Similar to @code{ABMON_1} etc but here the month names are not abbreviated. +Similar to @code{ABMON_1} etc., but here the month names are not abbreviated. Here the first value @code{MON_1} also corresponds to January. @item AM_STR @itemx PM_STR -The return values are strings which can be used in the time representation |
