<status value="active"/> <experimental value="false"/> <publisher value="HL7, Inc"/> <contact> <telecom> <system value="url"/> <value value="http://hl7.org"/> </telecom> </contact> <description value="FHIR Value set/code system definition for HL7 v2 table 0211 ( Alternate Character Sets)"/> <immutable value="true"/> <compose> <include> <system value="http://terminology.hl7.org/CodeSystem/v2-0211"/> </include> </compose> </ValueSet>

Alternate Character Sets

Code

Description

Comment

Version

8859/1

The printable characters from the ISO 8859/1 Character set

added v2.3

8859/15

The printable characters from the ISO 8859/15 (Latin-9) Character set

added v2.6

8859/2

The printable characters from the ISO 8859/2 Character set

added v2.3

8859/3

The printable characters from the ISO 8859/3 Character set

added v2.3

8859/4

The printable characters from the ISO 8859/4 Character set

added v2.3

8859/5

The printable characters from the ISO 8859/5 Character set

added v2.3

8859/6

The printable characters from the ISO 8859/6 Character set

added v2.3

8859/7

The printable characters from the ISO 8859/7 Character set

added v2.3

8859/8

The printable characters from the ISO 8859/8 Character set

added v2.3

8859/9

The printable characters from the ISO 8859/9 Character set

added v2.3

ASCII

The printable 7-bit ASCII character set.

(This is the default if this field is omitted)

added v2.3

BIG-5

Code for Taiwanese Character Set (BIG-5)

BIG-5 does not need an escape sequence. ASCII is a 7-bit character set, which means that the top bit of each byte is “0”. When the parser sees a byte whose top bit is “0”, it knows the character is ASCII. When the top bit is “1”, that byte and the bytes that follow should be handled as a 2-byte (or longer) character. No escape technique is needed. However, because some servers do not correctly interpret bytes whose top bit is “1”, Internet RFCs advise against using this kind of non-safe, non-escaped extension.

added v2.5

CNS 11643-1992

Code for Taiwanese Character Set (CNS 11643-1992)

Does not need an escape sequence.

added v2.5

GB 18030-2000

Code for Chinese Character Set (GB 18030-2000)

Does not need an escape sequence.

added v2.5

ISO IR14

Code for Information Exchange (one byte) (JIS X 0201-1976).

Note that the code contains a space, i.e., "ISO IR14".

added v2.3.1

ISO IR159

Code of the supplementary Japanese Graphic Character set for information interchange (JIS X 0212-1990).

Note that the code contains a space, i.e., "ISO IR159".

added v2.3.1

ISO IR6

ASCII graphic character set consisting of 94 characters.

http://www.itscj.ipsj.or.jp/ISO-IR/006.pdf

added v2.7

ISO IR87

Code for the Japanese Graphic Character set for information interchange (JIS X 0208-1990).

Note that the code contains a space, i.e., “ISO IR87”. JIS X 0208 needs an escape sequence. In Japan, the escape technique is ISO 2022. Starting from basic ASCII, the escape sequence ESC $ B (in hex, 1B 24 42) tells the parser that the following bytes should be handled two bytes at a time. The sequence 1B 28 42 switches back to ASCII.

added v2.3.1

JAS2020

A subset of ISO 2022 used for most Kanji transmissions

deprecated

added v2.3, removed after v2.3

JIS X 0202

ISO 2022 with escape sequences for Kanji

deprecated

added v2.3, removed after v2.3

KS X 1001

Code for Korean Character Set (KS X 1001)

added v2.5

UNICODE

The world wide character standard from ISO/IEC 10646-1-1993

Deprecated. Retained for backward compatibility only as of v2.5. Replaced by specific Unicode encoding codes.

added v2.3

UNICODE UTF-16

UCS Transformation Format, 16-bit form

UTF-16 is identical to ISO/IEC 10646 UCS-2. Note that the code contains a space before UTF but not before and after the hyphen.

added v2.5, removed after v2.7.1

UNICODE UTF-32

UCS Transformation Format, 32-bit form

UTF-32 is defined by Unicode Technical Report #19, and is an officially recognized encoding as of Unicode Version 3.1. UTF-32 is a proper subset of ISO/IEC 10646 UCS-4. Note that the code contains a space before UTF but not before and after the hyphen.

added v2.5, removed after v2.7.1

UNICODE UTF-8

UCS Transformation Format, 8-bit form

UTF-8 is a variable-length encoding: each code value is represented by 1 to 4 bytes, depending on the code value. 7-bit ASCII is a proper subset of UTF-8. Note that the code contains a space before UTF but not before and after the hyphen. Since UTF-8 represents the full UNICODE character set, the following restrictions apply to its use:

1. UTF-8 must be the default encoding of the message; UTF-8 cannot be specified as an additional character set in MSH-18.

2. No other character sets are allowed in a message where UTF-8 is the default encoding. In other words, UNICODE UTF-8 can only be specified as a single value in MSH-18.

3. A message encoded in UTF-8 must not use a Byte Order Mark (BOM).

added v2.5
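The three UTF-8 restrictions can be checked mechanically. The following Python sketch is one possible way to do so and is not defined by HL7: the function name `check_utf8_rules` is hypothetical, it assumes the MSH-18 value has already been extracted from the message, that repetitions use the customary "~" separator, and that a BOM, if present, sits at the very start of the raw message bytes.

```python
UTF8_CODE = "UNICODE UTF-8"
UTF8_BOM = b"\xef\xbb\xbf"


def check_utf8_rules(msh18: str, raw_message: bytes, rep_sep: str = "~") -> list[str]:
    """Return a list of rule violations; an empty list means none were found."""
    problems = []
    charsets = msh18.split(rep_sep) if msh18 else []
    if UTF8_CODE not in charsets:
        return problems  # The restrictions only concern UTF-8 messages.
    if charsets[0] != UTF8_CODE:
        # Rule 1: UTF-8 must be the default (first) character set.
        problems.append("UTF-8 listed as an additional character set")
    if len(charsets) > 1:
        # Rule 2: UTF-8 must be the only value in MSH-18.
        problems.append("other character sets listed alongside UTF-8")
    if raw_message.startswith(UTF8_BOM):
        # Rule 3: no Byte Order Mark.
        problems.append("message starts with a Byte Order Mark")
    return problems


# Usage with hypothetical inputs.
ok = check_utf8_rules("UNICODE UTF-8", b"MSH|^~\\&|")
mixed = check_utf8_rules("8859/1~UNICODE UTF-8", b"MSH|^~\\&|")
with_bom = check_utf8_rules("UNICODE UTF-8", UTF8_BOM + b"MSH|^~\\&|")
```

A message that lists UTF-8 after another character set violates the first two rules at once, which is why `mixed` reports two problems.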