Establishment of a set of Soft Cultural Elements

General Note

This part 3 will specify:

Part 3 will be elaborated in close collaboration with the Unicode Consortium, notably with the TC on the Common Locale Data Repository, and in continuous discussions with LISA.

Definitions

Locale: an identifier (id) that refers to a set of user preferences that tend to be shared across significant swaths of the world (Unicode TR35)

Soft Cultural Element: TODO

W3.1 Establishment of set of Soft Cultural Elements

State of the Art

Today's IT systems are expected to adapt to their users' cultural expectations. Current operating systems and major desktop or web applications meet these expectations to a considerable degree. As a matter of course, they switch the way dates, numbers and currencies are displayed, adapt menu labels to the user's language, change the keyboard layout, and so on.

In doing so, systems build on so-called locale data that captures many frequently shared preferences ranging from date formats to translations for frequently used terms. Indeed, locale data currently largely consists of two different types of locale preferences:

These two types are largely orthogonal. On the one hand, many languages are used in several countries; on the other hand, many, if not most, countries use more than one language on their territory. A number of preferences, notably keyboards and typographic conventions, are influenced by both language and regional preferences.

Locales (MK)

Locale data can be expressed in a number of widely-used formats including the POSIX format (TODO: REFERENCE) and, more recently, the CLDR's Locale Data Markup Language (LDML) (cf. below). Strictly speaking, the term locale refers in this context only to the identifier associated with the data itself.
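As a minimal sketch of the distinction between the identifier and the data it names, the following Python splits a simple identifier into its language and territory parts. It assumes identifiers shaped like `de_CH` (POSIX style) or `de-CH` (BCP 47 style) and deliberately ignores scripts, variants and extensions; the names `LocaleId` and `split_locale_id` are hypothetical and not part of any standard.

```python
from typing import NamedTuple, Optional


class LocaleId(NamedTuple):
    language: str
    territory: Optional[str]


def split_locale_id(identifier: str) -> LocaleId:
    """Split a simple locale identifier into language and territory.

    Assumption: only 'll' or 'll_TT' / 'll-TT' shaped identifiers are
    handled; a POSIX codeset or modifier suffix ('.UTF-8', '@euro') is dropped.
    """
    # Drop POSIX codeset and modifier suffixes if present.
    base = identifier.split(".", 1)[0].split("@", 1)[0]
    parts = base.replace("_", "-").split("-")
    language = parts[0].lower()
    # Treat the second part as a territory only if it is a two-letter code.
    territory = parts[1].upper() if len(parts) > 1 and len(parts[1]) == 2 else None
    return LocaleId(language, territory)
```

For instance, `de_DE`, `de_AT` and `de_CH` share a language part, while `de_CH` and `fr_CH` share a territory part, which is the orthogonality of language and region described above.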

TODO: Describe the nature of the identifier / BCP 47

The Common Locale Data Repository (CLDR) (MK)

TODO

Describe both the CLDR process, typical categories, locale inheritance and the LDML

Limitations of the CLDR (MK)

As we have seen, the CLDR currently has a number of conscious limitations. This is largely related to the self-imposed concentration on linguistic data (cf. TR35, section 2) and the tendency to identify differences between locales primarily with language differences. This decision is not arbitrary, but reflects that language data is by far the best understood type of locale data and the one easiest and most precise to describe.

The Soft Cultural Registry takes the approach that the cultural diversity that transcends language and regional categories is just as important and needs to be captured accordingly.

We argue below that a third, again orthogonal, type of locale data needs to be taken into account: data that relates to users' non-language cultural expectations. These expectations do not necessarily correspond with the users' language or country. They may instead be either broader (e.g. expectations shared across Europe) or more specific (e.g. expectations typical only of a given region).

Situation in eGovernment (MK)

While locale support in operating systems and major applications is today rather elaborate, the same is unfortunately not true of many small applications that are developed without sufficient awareness of cultural diversity. This is true in particular of many eGovernment offerings on various levels:

In both cases, lack of awareness heavily impairs the sharing and reuse of existing services and user interface solutions in Europe, while at the same time reducing the usability of applications even within one country. Solutions are potentially more complex, though, because the eGovernment data on which services operate, and which they have to exchange, necessarily reflects local expectations.

Methodology (MK, MP)

TODO:

Initial Taxonomy

Note: This initial taxonomy starts out from work done by members of the CEN/ISSS CDFG (including the authors)

Note: This is a very initial taxonomy of the soft cultural elements. These elements now have to be analyzed to determine to what degree their description can be formalized in the context of the eGRN and the CLDR. The list then needs to be adapted (expanded and shortened) accordingly.

Typographic Conventions (MK)

Note: A couple of these may already be covered by the CLDR

"How-do-you-do-logy" (MP)

The title "How-do-you-do-logy" is an eye-catching label for a wide range of issues related to referring to and interacting with another person in a way that appears "correct" to the person being addressed. These forms of reference and interaction can vary significantly between cultures. Currently no resource provides a systematic collection of these rules, although many books describe certain aspects of them as part of a broader description of cross-cultural business "etiquette". Capturing this "How-do-you-do-logy" in a formally structured way would permit its use in the design of services that present information to, or interact with, an end-user in any conversational manner. Use of such a resource would help ensure that the service treats its end-users in ways that they see as appropriate.

Some of the important issues that form part of "How-do-you-do-logy" for a specific culture are:

  1. What are the customary greeting formulae?
  2. Are married and unmarried women addressed differently and, if so, how?
  3. What are the customary levels of formality in business communications?
    • Is there a distinction between informal and formal address ("tu" vs. "vous" or "Du" vs. "Sie")?
    • If so, is it customarily used in business communications? Is it easy to pass from the formal to the informal form or not?
    • How should typical applications address their users?
  4. Is a person usually addressed by first name, by family name, or by something else (e.g. by the father's name)?
  5. What is the importance of titles?
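The questions above could be captured per locale in a structured record. The following Python is only a sketch under stated assumptions: the class `AddressConventions`, its field names and the helper `pronoun_for` are hypothetical, and only the pronoun pairs are taken from the text. All other fields are left unset, since their values would have to be gathered from people within each culture.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class AddressConventions:
    """Hypothetical record for one locale; None means 'not yet gathered'."""
    locale: str
    # (informal, formal) pronoun pair, if the language makes the distinction.
    pronouns: Optional[Tuple[str, str]] = None
    married_unmarried_differ: Optional[bool] = None  # question 2
    formal_in_business: Optional[bool] = None        # question 3
    usual_name_for_address: Optional[str] = None     # question 4
    titles_important: Optional[bool] = None          # question 5

    def has_formality_distinction(self) -> bool:
        return self.pronouns is not None


# Only the pronoun pairs come from the text; every other field stays unset.
FRENCH = AddressConventions(locale="fr", pronouns=("tu", "vous"))
GERMAN = AddressConventions(locale="de", pronouns=("Du", "Sie"))


def pronoun_for(conv: AddressConventions, formal: bool) -> Optional[str]:
    """Return the pronoun an application would use, or None if undefined."""
    if conv.pronouns is None:
        return None
    informal_form, formal_form = conv.pronouns
    return formal_form if formal else informal_form
```

Keeping unknown values as `None` rather than guessing them makes it explicit which answers still need to come from a culture-specific source.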

One significant theme that underlies many of the above issues is the way that people perceive themselves, others and society, and the interactions between them. Several of the significant names in the study of cross-cultural communication have identified dimensions related to these perceptions and used them to categorize different cultures. Unfortunately, each of the major figures in the field has identified a different set of dimensions. This, together with the somewhat abstract concepts that the dimensions describe, makes precise categorization of the characteristics of different cultures an inexact science. Some of the work undertaken in developing these theories is based upon a large amount of careful data collection, while other work appears to be more anecdotal in nature. Nevertheless, these theories can help in the categorization of cultures according to some of the above "How-do-you-do-logy" factors. In particular, they can be helpful when considering issues of formality/informality and the importance in some cultures of precise forms of address that reflect a rigid hierarchical view of society in general, and of business relationships in particular.

It is suggested that for many of the "How-do-you-do-logy" factors listed above, and any others that are added, the only way of obtaining appropriate information will be to gather it from people within the specific culture. The most obvious approach would be to use the standard CLDR data collection process, ensuring that culture-specific data is submitted by a source that is part of that culture. Some of the items on the "How-do-you-do-logy" list appear to be closely related to the language being spoken (e.g. items 1 and 2) and some appear to relate more strongly to the geographic location, i.e. country or region (e.g. item 5). The basic mix of language and country that underlies the CLDR concept of locale would therefore appear to be an appropriate way to partition the cultures for which the data is to be gathered.

The more theoretical models referred to above may prove to be a useful source of information to help determine factors such as:

Hall [1] uses a dimension called "context" to differentiate the directness of communication, and this may be a factor worth adding to the "How-do-you-do-logy" list. In "low-context" societies (e.g. the USA) information is explicitly expressed in the text of a message. In contrast, in "high-context" societies (e.g. Japan) the message is inherent in the occasion, the physical setting, and the relationship between the participants. A person from within the Japanese culture would therefore understand a response of "Yes" in the complete context in which it was stated (it could actually mean "Maybe", "Don't know" or "No"), whereas a person from the USA who did not understand this "high-context" culture would take "Yes" to mean exactly what it says. Although this example describes a face-to-face interaction, coding cultures as "high-context", "low-context" or somewhere in between might help a person from a different culture to exercise particular care when interacting with a service that was designed by and for another culture.
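One way such a coding might be used is sketched below in Python. This is an illustration only: the 0.0 to 1.0 scale, the threshold and the function name `needs_context_warning` are assumptions, and only the two end-point examples (USA as low-context, Japan as high-context) come from the text.

```python
# Hypothetical scale from 0.0 (low-context) to 1.0 (high-context). Only the
# two end-point examples come from the text; the numbers are placeholders.
CONTEXT_LEVEL = {
    "US": 0.0,  # "low-context" example in the text
    "JP": 1.0,  # "high-context" example in the text
}


def needs_context_warning(designed_for: str, user_culture: str,
                          threshold: float = 0.5) -> bool:
    """Flag a service whose design culture differs strongly from the user's."""
    gap = abs(CONTEXT_LEVEL[designed_for] - CONTEXT_LEVEL[user_culture])
    return gap >= threshold
```

A service designed for Japan and used from the USA would be flagged, prompting the user to take particular care in interpreting its responses.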

Further work will be undertaken to isolate those aspects of "How-do-you-do-logy" that can be captured within the CLDR and to identify those aspects of the cultural theories that may assist in categorizing "How-do-you-do-logy" factors across cultures.

[1] Hall, E.T. (1989). Beyond Culture. New York: Anchor Books.

Personal names (MK)

Colour Conventions (MP)

In each culture there are specific meanings associated with colours, and these meanings frequently vary between cultures. A number of colours nevertheless have common meanings across a wide range of cultures. For example, bright red was associated with danger by 96% and with war by 88% of a diverse worldwide sample [1]. However, recording every possible association as part of the CLDR would be an enormous task, and one likely to be subject to dispute, as many of the possible meanings associated with colours may not be universally accepted. It would be more useful to record those colour associations that are strongly negative and, to a lesser extent, those that are strongly positive. It would also be highly valuable to indicate the sometimes large cross-cultural differences in associated meaning.

A CLDR resource that captured significant associations of negative and positive meanings with colours, and indicated how these associations differ across cultures, would provide valuable support for the design of visually presented services. It could be used to avoid using colours in ways that accidentally give offence and, to a lesser extent, to help create positive emotions in service users. At one level, the resource could be used to automatically identify potential cultural mismatches between the design of a localized version of a service and the cultural expectations of its planned audience. It could also be used by design support tools to provide real-time guidance to service designers during the design of a service.

The most significant issues in creating such a CLDR resource are the scope of the content of the resource and the way in which that resource is structured. At a simplistic level, the scope of the resource is to include "significant" positive and negative associations of meaning for any cultural grouping. The two major issues to be decided in delivering this scope are how an association could be judged to be "significant" and also the ways in which cultural groupings could be defined. The significance issue could potentially be resolved by use of the existing Unicode voting methods for agreeing that items should be included in the CLDR. The choice of how to define cultural groups is more complex. In some regions, quite a number of the colour associations are primarily related to a specific religion. Another set of associations may be specific to a racial grouping. In principle it might be possible to add religions and racial groups to the linguistic, national and regional groupings that are already identified in the CLDR. In practice adding such groups might be inadvisable for the following reasons:

It is probable that services will primarily be localized for the same language, national and regional groupings that are already identified in the CLDR. The only practical way in which the issues of racial and religious derived colour associations can be addressed in system design is to take account of the predominant religions and races that are prevalent in a particular country or region. Fortunately, the existing sources of information related to colour associations in regions and countries already reflect these cultural and religious influences.

Having decided on the scope of this new colour-related resource, the final issue is to decide on a way of representing this information in the CLDR. The table below uses colour as the primary organising dimension. Such a structure would allow a designer or a software application to use the table to generate two lists of countries, one with negative associations and one with positive ones, for each colour that is used in the service. These lists could then be compared with the countries for which the service has been localized, and warnings of negative associations (or highlights of positive associations) could be generated.

| Colour | Code | Association | Cultures | Meaning |
|---|---|---|---|---|
| Red | RDN | Negative | Madagascar | Burial |
| | | | South Africa | Mourning |
| | | | Ghana | Mourning |
| | | | Egypt | Death |
| | | | China | Bloodshed - war |
| Red | RDP | Positive | China | Most popular colour |
| | | | India | Life, action, gaiety |
| | | | Indonesia | Luck |
| Pink | PNN | Negative | --- | --- |
| Pink | PNP | Positive | India | Happy, hopeful |
| | | | Japan | Health, happiness |
| | | | Singapore | Happy, feminine |
| Purple | PRN | Negative | Peru | Not favoured - avoid |
| Purple | PRP | Positive | United Kingdom | Prestige |
| | | | United States | Creativity, exciting |

The above is only a small subset of the possible set of colour associations. However, some examples have been included to illustrate that the same colour can have both strong negative and strong positive associations for the same culture (e.g. see the colour red for China). A colour may also have only strong positive associations and no negative ones, which appears to be the case for pink.
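The comparison of a service's colours against the countries it is localized for can be sketched in Python as follows. The data is copied from the table; the structure `ASSOCIATIONS` and the function `check_colours` are hypothetical names chosen for illustration, not an existing CLDR format or API.

```python
from typing import Dict, Iterable, List, Set, Tuple

# (colour, association) -> list of (country, meaning), copied from the table.
ASSOCIATIONS: Dict[Tuple[str, str], List[Tuple[str, str]]] = {
    ("Red", "Negative"): [("Madagascar", "Burial"), ("South Africa", "Mourning"),
                          ("Ghana", "Mourning"), ("Egypt", "Death"),
                          ("China", "Bloodshed - war")],
    ("Red", "Positive"): [("China", "Most popular colour"),
                          ("India", "Life, action, gaiety"),
                          ("Indonesia", "Luck")],
    ("Pink", "Negative"): [],
    ("Pink", "Positive"): [("India", "Happy, hopeful"),
                           ("Japan", "Health, happiness"),
                           ("Singapore", "Happy, feminine")],
    ("Purple", "Negative"): [("Peru", "Not favoured - avoid")],
    ("Purple", "Positive"): [("United Kingdom", "Prestige"),
                             ("United States", "Creativity, exciting")],
}


def check_colours(colours_used: Iterable[str],
                  localized_for: Set[str]) -> List[Tuple[str, str, str, str]]:
    """Return (colour, association, country, meaning) for every match."""
    findings = []
    for colour in colours_used:
        for association in ("Negative", "Positive"):
            for country, meaning in ASSOCIATIONS.get((colour, association), []):
                if country in localized_for:
                    findings.append((colour, association, country, meaning))
    return findings
```

Calling `check_colours(["Red"], {"China"})` reports both a negative and a positive finding, reflecting the mixed associations of red for China noted above.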

An alternative way of using this new resource would be for the service designer, or software tool, to specify the regions or countries for which the service has been localized and receive a list of colours to avoid (those with negative associations) and a list of those with very positive associations. A table with the colour and culture columns reversed would meet this need. In the CLDR there is a precedent for representing the same resource in two alternative presentations, "Territory-Language Information" and "Language-Territory Information", and this application seems to be another situation where the approach is justified.
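The reversed presentation need not be maintained separately; it can be derived from the colour-first one. The Python below is a sketch of that re-indexing, using a small subset of the table rows. The names `ROWS` and `by_country` are hypothetical.

```python
from collections import defaultdict

# Colour-first rows as (colour, association, country); a subset of the table.
ROWS = [
    ("Red", "Negative", "China"),
    ("Red", "Positive", "China"),
    ("Red", "Positive", "India"),
    ("Pink", "Positive", "India"),
    ("Purple", "Negative", "Peru"),
]


def by_country(rows):
    """Re-index colour-first rows as country -> association -> [colours]."""
    index = defaultdict(lambda: defaultdict(list))
    for colour, association, country in rows:
        index[country][association].append(colour)
    return index
```

With this index, a tool can look up a target country once and obtain both the colours to avoid and the colours with positive associations.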

An issue that deserves significant debate is how important it is to reflect a few strong religious colour associations, given the risk that failing to consider them might offend potential users of a service in any geographic region. The introduction of religious groupings would be a new departure for the CLDR and might, as suggested above, raise concern and controversy.

Another important factor that needs to be taken into account in populating a table of colour associations such as the one above is whether it is anticipated that the colour association data will mainly be used in the context of leisure related services or business related services. The reason for considering this factor relates to a study of the use of factory machines in mainland China [2] which indicated that in the working context of a factory environment the workers were perfectly familiar with the international standard (IEC73) meaning of "danger" for the colour red but, outside that context, they associated it with the traditional Chinese meaning of "Luck".
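If the context of use matters, the lookup key would need a third component besides colour and country. The following Python is a minimal sketch of that idea; the context labels "work" and "non-work" and the names `CONTEXTUAL_MEANING` and `meaning_of` are assumptions, and the two entries merely restate the factory study described above.

```python
from typing import Optional

# (colour, country, usage context) -> meaning. The two entries restate the
# factory study described in the text; the context labels are hypothetical.
CONTEXTUAL_MEANING = {
    ("Red", "China", "work"): "Danger",
    ("Red", "China", "non-work"): "Luck",
}


def meaning_of(colour: str, country: str, context: str) -> Optional[str]:
    """Look up a colour meaning for a country in a given usage context."""
    return CONTEXTUAL_MEANING.get((colour, country, context))
```

A lookup that returns `None` signals that no association has been recorded for that combination, rather than that none exists.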

Use of Language (MK)

In multilingual areas, in which situations is a given language typically used (e.g. in Luxembourg)?

Legal requirements and semantic considerations (MP)

What are typical legal requirements and semantic expectations in a given country that impact applications?

In both cases, the description can point at pertinent resources in the rest of the eGRN. Work done in other domains can be examined to identify promising approaches. One such area is work on eHealth records, which can be examined to identify ways in which legal permissions have been coded so that they can be processed automatically when making decisions about the distribution of "extracts" from eHealth records. Work in these domains has taken place in national, European and international standards fora, and some of these solutions may be appropriate for non-eHealth applications such as eGovernment.

Co-ordination with ETSI work on eHealth user profiles may be one route to rapidly understanding the various (proposed) solutions from the eHealth domain. Certainly, substantially common solutions for eHealth and eGovernment could form the basis for widespread, and hence well-supported, solutions to the difficult issues associated with the coding and processing of legal requirements and semantic expectations.

Note: this data is closely related to SEMIC and must be coordinated accordingly

W3.2 Development of a formal structure

Note

In particular, a report of the results of W 3.1 (really the same as 3.1 --- how do we want to capture the results without reporting them?), an XML schema for the registration of Soft Cultural Elements (coordinated with the general ontology W1.1 and with the CLDR) and corresponding documentation. In addition, we need a metadata model for this to integrate with the general methodology of the registry.

References

SG: Interinstitutional style guide of the Publications office of the European Union, EU style guides

Unicode TR35: Unicode Technical Standard #35 Unicode Locale Data Markup Language (LDML)

egovpt_fg: WP3 (last edited 2008-06-08 22:08:21 by MikePluke)