Plural Form(s) in Translation(s)

Qt Quarterly | Issue 19


by Jan-Arve Sæther

Ever found yourself writing tr("%1 object(s) found").arg(count) in one of your applications? Qt 4.2 will introduce a powerful mechanism to handle plurals in an elegant way that works for all languages and requires little extra work from the developer.


What's the Problem with Plurals?

You have probably seen programs that use one and the same string for both the singular and the plural, using parentheses to combine the two forms in a single string (e.g., "6 occurrence(s) replaced").

Naturally, it would be preferable to show "6 occurrences replaced" with an 's', and "1 occurrence replaced" without the 's'. Some developers solve this problem with code that looks like this:

tr("%1 item%2 replaced").arg(count)
                        .arg(count == 1 ? "" : "s");

This approach works for languages like English that form their plural using 's', but as soon as we try to translate the application to languages like Arabic, Chinese, German, Hebrew, or Japanese (to name just a few), this breaks in a horrible way.

Developers who are slightly more sympathetic might write code that looks more like this:

QString message;
if (count == 1) {
    message = tr("%1 item replaced").arg(count);
} else {
    message = tr("%1 items replaced").arg(count);
}

This code is definitely more internationalization-friendly, but it still makes two assumptions about the target language:

  • It assumes that the target language has two grammatical numbers (singular and plural).
  • It assumes that the plural form should be used in the "n = 0" case (e.g., "0 items").

These assumptions hold for many of the world's languages, including Dutch, English, Finnish, Greek, Hebrew, Hindi, Mongolian, Swahili, Turkish, and Zulu, but there are many languages out there for which they don't.

Case in point: In French and Brazilian Portuguese (but not international Portuguese, interestingly enough), the singular form is used in conjunction with 0 (e.g., "0 maison", not "0 maisons"), breaking assumption 2. In Polish, there are three grammatical numbers:

  • Singular: n = 1
  • Paucal: n = 2–4, 22–24, 32–34, 42–44, ...
  • Plural: n = 0, 5–21, 25–31, 35–41, ...

For example, the Polish word dom ("house") has the paucal form domy and the plural form domów. The table below shows the rendition of "n house(s)" in English, French, Polish, and Russian for different values of n.

English      French        Polish      Russian
0 houses     0 maison      0 domów     0 домов
1 house      1 maison      1 dom       1 дом
2 houses     2 maisons     2 domy      2 дома
3 houses     3 maisons     3 domy      3 дома
4 houses     4 maisons     4 domy      4 дома
5 houses     5 maisons     5 domów     5 домов
21 houses    21 maisons    21 domów    21 дом
22 houses    22 maisons    22 domy     22 дома
24 houses    24 maisons    24 domy     24 дома
30 houses    30 maisons    30 domów    30 домов
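
The Polish column can be reproduced with a few lines of arithmetic. The following is a minimal sketch, assuming non-negative counts; the names PolishNumber, polishNumber, and polishHouse are illustrative only and are not part of Qt.

```cpp
// Sketch only: classifies a non-negative count into the three Polish
// grammatical numbers listed above. All names here are hypothetical.
enum class PolishNumber { Singular, Paucal, Plural };

constexpr PolishNumber polishNumber(int n)
{
    // Singular is exactly 1. Paucal covers counts ending in 2, 3, or 4,
    // except 12, 13, and 14. Everything else (including 0) is plural.
    return n == 1 ? PolishNumber::Singular
         : (n % 10 >= 2 && n % 10 <= 4 && (n % 100 < 10 || n % 100 > 20))
               ? PolishNumber::Paucal
               : PolishNumber::Plural;
}

// Returns the form of "dom" ("house") that goes with n houses.
constexpr const char *polishHouse(int n)
{
    return polishNumber(n) == PolishNumber::Singular ? "dom"
         : polishNumber(n) == PolishNumber::Paucal   ? "domy"
                                                     : "domów";
}
```

With these definitions, polishNumber(22) is Paucal and polishNumber(21) is Plural, matching the "22 domy" and "21 domów" rows of the table.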

Other languages have other rules:

  • Latvian has a specific grammatical number, the nullar, for the "n = 0" case.
  • Dhivehi, Inuktitut, Irish, Maori, and a few other languages have a dual form for the "n = 2" case.
  • Czech, Slovak, Lithuanian, and Macedonian have a dual, but they use it according to more complex rules.
  • Slovenian has a trial in addition to the singular, dual, and plural forms.
  • Romanian handles the "n >= 20" case differently from the "n < 20" case.
  • Arabic has six different forms, depending on the value of n.
  • Chinese, Japanese, Korean, and many other languages don't distinguish between the singular and the plural.

This is just a partial list, but it clearly shows the complexity of the problem.

How Does Qt 4.2 Address This Problem?

Qt 4.2 includes a QObject::tr() overload that will make it very easy to write "plural-aware" internationalized applications. This new overload has the following signature:
QString tr(const char *text, const char *comment, int n);
Depending on the value of n, the tr() function will return a different translation, with the correct grammatical number for the target language. Also, any occurrence of "%n" is replaced with n's value. For example:
tr("%n item(s) replaced", "", count);

If a French translation is loaded, this will expand to "0 item remplacé", "1 item remplacé", "2 items remplacés", etc., depending on n's value. And if no translation is loaded, the original string is used, with "%n" replaced with count's value (e.g., "6 item(s) replaced").
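
The behavior described above can be imitated in plain C++ to make the two cases concrete. This is only a sketch under stated assumptions: replaceCount, frenchFormIndex, and expandPlural are hypothetical helpers invented for illustration, the French forms are passed in directly instead of being loaded from a translation file, and none of this reflects how Qt implements tr() internally.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Replaces every "%n" in text with the decimal value of n.
std::string replaceCount(std::string text, int n)
{
    const std::string marker = "%n";
    const std::string value = std::to_string(n);
    std::string::size_type pos = 0;
    while ((pos = text.find(marker, pos)) != std::string::npos) {
        text.replace(pos, marker.size(), value);
        pos += value.size();  // continue searching after the inserted digits
    }
    return text;
}

// French rule taken from the examples above: 0 and 1 share the first form.
int frenchFormIndex(int n) { return n < 2 ? 0 : 1; }

// Hypothetical stand-in for a plural-aware lookup. If both French forms are
// supplied, the form matching n is used; otherwise the source text is used.
std::string expandPlural(const std::string &source,
                         const std::vector<std::string> &frenchForms, int n)
{
    assert(n >= 0);  // assumption: counts are never negative
    if (frenchForms.size() < 2)
        return replaceCount(source, n);  // "no translation loaded" case
    return replaceCount(frenchForms[frenchFormIndex(n)], n);
}
```

Calling expandPlural with the forms "%n item remplacé" and "%n items remplacés" gives the French strings quoted above, while an empty list of forms falls back to the source text with only "%n" substituted.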

To obtain a more natural English text, you need to load an English translation. [1] An English translation offers other advantages, such as the possibility of editing the application's English user interface without touching the source code.

When the application is ready to be translated, the developers must run lupdate as usual to generate one or several .ts files that can be edited using Qt Linguist. In Qt Linguist, the translator can specify the target language by clicking Edit|Translation File Settings. Specifying a target language is necessary so that Qt Linguist knows how many translations are necessary for a source string that contains "%n".

[Screenshot: Qt Linguist]

The screenshot above shows how Qt Linguist lets the translator enter three different translations corresponding to the three grammatical numbers (singular, paucal, and plural) in the Polish language.

How Does It Work Under the Hood?

Qt Linguist and its helper tool lrelease know the specific plural rules for all the languages supported by QLocale. These rules are encoded in the binary .qm file that is generated from the .ts file, so that tr() uses the correct form based on n's value. The table below shows the specific rules that are produced by Qt Linguist and lrelease for a selection of languages.

Language    Form 1                          Form 2                                                          Form 3
English     n == 1                          otherwise                                                       N/A
French      n < 2                           otherwise                                                       N/A
Czech       n % 100 == 1                    n % 100 >= 2 && n % 100 <= 4                                    otherwise
Irish       n == 1                          n == 2                                                          otherwise
Latvian     n % 10 == 1 && n % 100 != 11    n != 0                                                          otherwise
Lithuanian  n % 10 == 1 && n % 100 != 11    n % 100 != 12 && n % 10 == 2                                    otherwise
Macedonian  n % 10 == 1                     n % 10 == 2                                                     otherwise
Polish      n == 1                          n % 10 >= 2 && n % 10 <= 4 && (n % 100 < 10 || n % 100 > 20)    otherwise
Romanian    n == 1                          n == 0 || (n % 100 >= 1 && n % 100 <= 20)                       otherwise
Russian     n % 10 == 1 && n % 100 != 11    n % 10 >= 2 && n % 10 <= 4 && (n % 100 < 10 || n % 100 > 20)    otherwise
Slovak      n == 1                          n >= 2 && n <= 4                                                otherwise
Japanese    otherwise                       N/A                                                             N/A
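
To make the table easier to read, a few of its rows can be transcribed directly as C++ functions. This is an illustrative sketch assuming non-negative n; the function names are invented here, the returned index is zero-based (Form 1 is 0), and the code is a hand translation of the table rows, not the encoding that lrelease writes into .qm files.

```cpp
// Hand transcription of selected rows of the table above.
// Each function returns the zero-based index of the form to use.
constexpr int englishForm(int n) { return n == 1 ? 0 : 1; }

constexpr int frenchForm(int n) { return n < 2 ? 0 : 1; }

constexpr int irishForm(int n) { return n == 1 ? 0 : n == 2 ? 1 : 2; }

constexpr int latvianForm(int n)
{
    return (n % 10 == 1 && n % 100 != 11) ? 0
         : n != 0                         ? 1
                                          : 2;  // only n == 0 reaches Form 3
}

constexpr int russianForm(int n)
{
    return (n % 10 == 1 && n % 100 != 11) ? 0
         : (n % 10 >= 2 && n % 10 <= 4 && (n % 100 < 10 || n % 100 > 20)) ? 1
                                                                          : 2;
}

// Japanese has a single form, so the count does not matter.
constexpr int japaneseForm(int) { return 0; }
```

For instance, russianForm(21) selects Form 1 while russianForm(11) selects Form 3, and latvianForm(0) is the only Latvian input that selects Form 3.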

These rules are hard-coded in Qt Linguist and lrelease and neither the application developers nor the translators need to understand them.

Considering how easy it is to use the new tr() overload, there should be no excuse(s) anymore for not handling plural forms correctly in Qt applications.


[1] For simplicity, we assume that the source language is English. It can be any language, even languages that cannot be expressed using the ISO 8859-1 (Latin-1) encoding. See the Release Manager chapter of the Qt Linguist manual for details.
Copyright © 2006 Trolltech