Data Types - Abstract Specification

Chair/Editor Gunther Schadow
gunther@aurora.rg.iupui.edu
Regenstrief Institute for Health Care
Editor Paul Biron
paul.v.biron@kp.org
Kaiser Permanente, Southern California
Editor Lloyd McKenzie
lmckenzi@ca.ibm.com
IBM Global Services
Editor Grahame Grieve
grahame@kestral.com.au
Kestral Computing Pty. Ltd.
Editor Doug Pratt
Douglas.Pratt@siemens.com
Siemens

1

Preface

Note to Readers: This document passed final normative ballot in the last voting cycle but has not, as yet, been updated with all the editorial changes necessary to reconcile the negative votes from that cycle.

This document specifies the HL7 Version 3 Data Types on an abstract layer, independent of representation. By "independent of representation" we mean independent of both abstract syntax and implementation in any particular implementation technology.

This document is accompanied by Implementation Technology Specifications (ITS). The ITS documents can serve as a quick compendium to the data types that is more practically oriented toward the representation in that particular implementation technology.

Vocabulary tables within this specification list the current contents of vocabulary domains for ease of reference by the reader. However, at any given time the normative source for these domains is the vocabulary tables in the RIM database. For some large domains, only a sample of possible values is shown. The complete domains can be referenced in the vocabulary tables by looking up the domain name associated with the table in the RIM vocabulary tables.

2

Acknowledgements

This specification is the result of many years of intense work through e-mail, telephone conferences, meeting discussions, and ballot reconciliation. Thanks go to the many individuals who participated at various times in design, discussions and ballot review. Gunther Schadow (Regenstrief Institute for Health Care) chaired this task force, and is the main author of this document. Paul V. Biron (Kaiser Permanente), Doug Pratt (Siemens), Lloyd McKenzie (IBM), and Grahame Grieve (Kestral Computing Pty. Ltd.) have served as co-editors at various times. Major contributions of thoughts and support came from Mark Tucker (Regenstrief Institute), George Beeler, Stan Huff (Intermountain Health Care), as well as Mike Henderson (Kaiser Permanente), Anthony Julian (Mayo), Joann Larson (Kaiser Permanente), Mark Shafarman (Oacis Healthcare Systems), Wes Rishel (Gartner Group), and Robin Zimmerman (Kaiser Permanente). Acknowledgements for their critical review and infusion of ideas go to Bob Dolin (Kaiser Permanente), Clem McDonald (Regenstrief Institute), Kai Heitmann (HL7 Germany), Rob Seliger (Sentillion), and Harold Solbrig (Mayo Clinic). Vital support came from the members of the task force: Laticia Fitzpatrick (Kaiser Permanente), Matt Huges, Randy Marbach (Kaiser Permanente), Larry Reis (Wizdom Systems), Carlos Sanroman (Kaiser Permanente), and Greg Thomas (Kaiser Permanente). Thanks go to James Case (University of California, Davis), Norman Daoust (Partners HealthCare Systems), Irma Jongeneel (HL7 The Netherlands), Michio Kimura (HL7 Japan), John Molina (SMS), Richard Ohlmann (McKessonHBOC), David Rowed (HL7 Australia), and Klaus Veil (Macquarie Health Corp., HL7 Australia) for sharing their expertise in critical questions. This work was made possible by the Regenstrief Institute for Health Care.

3

Changes to the document since last ballot

  • Withdraw originalText in CS (technical correction - addition in previous ballot cycle was invalid)

  • Fixed Examples, minor syntax & typo fixes

  • Fix inheritance hierarchy of CS, CV, CE

4

Outstanding issues

  • Conformance framework for constraining data types

Table of Contents

1 Introduction
1.1 What is a Data Type?
1.2 Representation of Data Values
1.3 Properties of Data Values
1.4 Need for Abstraction
1.5 Need for an HL7 Data Type Standard
1.6 Requirements
1.7 Forms of Data Type Definitions
1.7.1 Formal Data Type Definition Language
1.7.2 Tables of Properties
1.7.3 Unified Modeling Language (UML) Diagrams
1.8 Overview of Data Types
1.9 Introduction to the Formal Data Type Definition Language (DTDL)
1.9.1 Declaration
1.9.2 Invariant Statements
1.9.3 Type Conversion
1.9.4 Literal Form
1.9.5 Generic Data Types
1.10 Conformance
1.11 DataValue (ANY)
1.11.1 Properties of DataValue (ANY)
1.12 DataType (TYPE) specializes ANY
1.12.1 Short Name (shortName : CS)
1.12.2 Long Name (longName : CS)
1.12.3 Implies (implies : BN)
2 Basic Types
2.1 Boolean (BL) specializes ANY
2.1.1 Properties of Boolean (BL)
2.2 BooleanNonNull (BN) specializes BL
2.2.1 Properties of BooleanNonNull (BN)
2.3 Binary Data (BIN) specializes LIST<BN>
2.4 Encapsulated Data (ED) specializes BIN
2.4.1 Properties of Encapsulated Data (ED)
2.5 Character String (ST) specializes ED
2.5.1 Properties of Character String (ST)
2.6 Concept Descriptor (CD) specializes ANY
2.6.1 Properties of Concept Descriptor (CD)
2.7 Concept Role (CR) specializes ANY
2.7.1 Name (name : CV, default NULL)
2.7.2 Value (value : CD, default NULL)
2.7.3 Inversion Indicator (inverted : BN, default false)
2.8 Coded Simple Value (CS) specializes CD
2.8.1 Properties of Coded Simple Value (CS)
2.9 Coded Value (CV) specializes CD
2.9.1 Properties of Coded Value (CV)
2.10 Coded Ordinal (CO) specializes CV
2.10.1 Properties of Coded Ordinal (CO)
2.11 Coded With Equivalents (CE) specializes CD
2.11.1 Properties of Coded With Equivalents (CE)
2.12 Character String with Code (SC) specializes ST
2.12.1 Properties of Character String with Code (SC)
2.13 Unique Identifier String (UID) specializes ST
2.14 ISO Object Identifier (OID) specializes UID
2.14.1 HL7-Assigned OIDs
2.14.2 Literal Form
2.15 DCE Universal Unique Identifier (UUID) specializes UID
2.15.1 Literal Form
2.16 HL7 Reserved Identifier Scheme (RUID) specializes UID
2.17 Instance Identifier (II) specializes ANY
2.17.1 Properties of Instance Identifier (II)
2.18 Universal Resource Locator (URL) specializes ANY
2.18.1 Scheme (scheme : CS)
2.18.2 Address (address : ST)
2.18.3 Literal Form
2.19 Telecommunication Address (TEL) specializes URL
2.19.1 Properties of Telecommunication Address (TEL)
2.20 Address Part (ADXP) specializes ST
2.20.1 Address Part Type (partType : CS)
2.21 Postal Address (AD) specializes LIST<ADXP>
2.21.1 Properties of Postal Address (AD)
2.22 Entity Name Part (ENXP) specializes ST
2.22.1 Name Part Type (partType : CS)
2.22.2 Qualifier (qualifier : SET<CS>)
2.23 Entity Name (EN) specializes LIST<ENXP>
2.23.1 Properties of Entity Name (EN)
2.23.2 Examples
2.24 Trivial Name (TN) specializes EN
2.25 Person Name (PN) specializes EN
2.26 Organization Name (ON) specializes EN
2.26.1 Examples
2.27 Abstract Type Quantity (QTY) specializes ANY
2.27.1 Properties of Abstract Type Quantity (QTY)
2.28 Integer Number (INT) specializes QTY
2.28.1 Properties of Integer Number (INT)
2.29 Real Number (REAL) specializes QTY
2.29.1 Properties of Real Number (REAL)
2.30 Ratio (RTO) specializes QTY
2.30.1 Properties of Ratio (RTO)
2.31 Physical Quantity (PQ) specializes QTY
2.31.1 Properties of Physical Quantity (PQ)
2.32 Physical Quantity Representation (PQR) specializes CV
2.32.1 Value (value : REAL)
2.32.2 Code (code : ST, default NULL, inherited from CV)
2.32.3 Code System (codeSystem : UID, inherited from CV)
2.32.4 Code System Name (codeSystemName : ST, default NULL, inherited from CV)
2.32.5 Code System Version (codeSystemVersion : ST, default NULL, inherited from CV)
2.32.6 Display Name (displayName : ST, default NULL, inherited from CV)
2.32.7 Original Text (originalText : ED, default NULL, inherited from CV)
2.33 Monetary Amount (MO) specializes QTY
2.33.1 Properties of Monetary Amount (MO)
2.34 Calendar (CAL) specializes SET<CLCY>
2.35 Calendar Cycle (CLCY) specializes ANY
2.36 Point in Time (TS) specializes QTY
2.36.1 Properties of Point in Time (TS)
3 Generic Collections
3.1 Set (SET) specializes ANY
3.1.1 Properties of Set (SET)
3.2 Sequence (LIST) specializes ANY
3.2.1 Properties of Sequence (LIST)
3.3 GeneratedSequence (GLIST) specializes LIST
3.3.1 Head Item (head : T, inherited from LIST)
3.3.2 Increment (increment : QTY)
3.3.3 Period Step Count (period : INT, default ∞)
3.3.4 Denominator (denominator : INT, default 1)
3.4 SampledSequence (SLIST) specializes LIST
3.4.1 Scale Origin (origin : T)
3.4.2 Scale Factor (scale : QTY)
3.4.3 Sampled Digits (digits : LIST<INT>)
3.5 Bag (BAG) specializes ANY
3.5.1 Properties of Bag (BAG)
3.6 Interval (IVL) specializes SET
3.6.1 Properties of Interval (IVL)
3.7 Interval of Physical Quantities (IVL<PQ>) specializes IVL
3.8 Interval of Point in Time (IVL<TS>) specializes IVL
3.8.1 Promotion of Points in Time Values to Intervals (promotion : IVL<TS>, inherited from IVL)
3.8.2 Literal Form
4 Generic Type Extensions
4.1 History Item (HXIT) specializes T
4.1.1 Valid Time (validTime : IVL<TS>)
4.2 History (HIST) specializes SET<HXIT>
4.2.1 Properties of History (HIST)
4.3 Uncertain Value - Probabilistic (UVP) specializes T
4.3.1 Properties of Uncertain Value - Probabilistic (UVP)
4.4 Non-Parametric Probability Distribution (NPPD) specializes SET<UVP>
4.4.1 Most Likely (mostLikely : UVP)
5 Timing Specification
5.1 Periodic Interval of Time (PIVL) specializes SET
5.1.1 Properties of Periodic Interval of Time (PIVL)
5.1.2 Periodic Intervals as Sets
5.2 Event-Related Periodic Interval of Time (EIVL) specializes SET
5.2.1 Properties of Event-Related Periodic Interval of Time (EIVL)
5.2.2 Resolving the Event-Relatedness
5.3 General Timing Specification (GTS) specializes SET<TS>
5.3.1 Convex Hull
5.3.2 GTS as a Sequence of Occurrence Intervals
5.3.3 Interleaving Schedules and Periodic Hull
5.3.4 Literal Form

Appendices

A Informative Types
A.1 Parametric Probability Distribution (PPD) specializes T
A.1.1 Properties of Parametric Probability Distribution (PPD)
A.2 Probability Distribution over Real Numbers (PPD<REAL>) specializes PPD
A.2.1 Converting a real number (REAL) to an uncertain real number (PPD<REAL>)
A.2.2 Concise Literal Form
A.3 Parametric Probability Distributions over Physical Quantities (PPD<PQ>) specializes PPD
A.3.1 Concise Literal Form
A.4 Probability Distribution over Time Points (PPD<TS>) specializes PPD
A.4.1 Converting a point in time (TS) to an uncertain point in time

Normative Standard

1

Introduction

1.1

What is a Data Type?

Every data element has a data type. Data types define the meaning (semantics) of data values that can be assigned to a data element. Meaningful exchange of data requires that we know the definition of values so exchanged. This is true for complex "values" such as business messages as well as for simpler values such as character strings or integer numbers.

According to ISO 11404, a data type is "a set of distinct values, characterized by properties of those values and by operations on those values." A data type has intension and extension. Intensionally, the data type defines the properties exposed by every data value of that type. Extensionally, data types have a set of data values that are of that type (the type's "value set").

Semantic properties of data types are what ISO 11404 calls "properties of those values and [...] operations on those values." A semantic property of a data type is referred to by a name and has a value for each data value. The value of a data value's property must itself be a value defined by a data type - no data value exists that would not be defined by a data type.

Data types are thus the basic building blocks used to construct any higher order meaning: messages, computerized patient record documents, or business objects and their transactions. What, then, is the difference between a data type and a message, document, or business object? Data type values stand for themselves: the value is all that counts, and neither identity nor state nor change of state is defined for a data value. Conversely, in business objects we track state and identity; the properties of an identical object might change between now and later. Not so with data values: a data value and its properties are constant. For example, number 5 is always number 5; there is no difference between this number 5 and that number 5 (no identity distinguished from value), and number 5 never changes to number 6 (no change of state). One can think of data values as immutable objects where identity does not matter (identity and equality are the same.)1
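The distinction between data values and business objects can be sketched in code. The following Python fragment is only an illustration and is not part of this specification; the class names IntegerValue and PatientRecord are hypothetical.

```python
from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=True)
class IntegerValue:
    """A data value: immutable and compared by value only."""
    number: int


class PatientRecord:
    """A business object: it has identity and its state may change."""

    def __init__(self, name: str) -> None:
        self.name = name


five_a = IntegerValue(5)
five_b = IntegerValue(5)
# "This number 5" and "that number 5" are indistinguishable.
values_equal = five_a == five_b

# A data value never changes state; an attempt to change it is rejected.
try:
    five_a.number = 6
    mutation_rejected = False
except FrozenInstanceError:
    mutation_rejected = True

rec_a = PatientRecord("Doe")
rec_b = PatientRecord("Doe")
# Two objects with identical properties are still two distinct objects
# (the default comparison of plain Python objects is by identity).
objects_equal = rec_a == rec_b
rec_a.name = "Smith"  # the state changes, yet rec_a remains the same object
```

Here the frozen dataclass plays the role of a data value, while the ordinary class behaves like a business object whose identity is tracked separately from its properties.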

1.2

Representation of Data Values

Data values can be represented through various symbols but the data value's meaning is not bound to any particular representation.

For example, cardinal numbers (non-negative integers) are defined - intensionally - as a data type where each value has a successor value, and where zero is the successor of no other cardinal value. Based on this definition we can define addition, multiplication, and other mathematical operations. Whatever representation reflects the rules we stated in the intensional definition of the cardinal data type is a valid representation of cardinal numbers. Examples of valid cardinal number representations are decimal digit strings, bags of glass marbles, or scratches on a wall. The number five is represented by the word "five", by the Arabic numeral "5", or by the Roman numeral "V". The representation does not matter as long as it conforms to the semantic definition of the data type.
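As a minimal sketch of this idea, the following Python fragment defines cardinal numbers only through zero and successor, defines addition on that basis, and then reads and writes the same values as decimal digit strings and as tally marks ("scratches on a wall"). All names (Cardinal, successor, add, and the conversion helpers) are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Cardinal:
    """A cardinal number: zero, or the successor of another cardinal."""
    predecessor: Optional["Cardinal"] = None  # None marks zero


ZERO = Cardinal()


def successor(n: Cardinal) -> Cardinal:
    return Cardinal(n)


def add(a: Cardinal, b: Cardinal) -> Cardinal:
    # a + 0 = a;  a + successor(b) = successor(a + b)
    result = a
    while b.predecessor is not None:
        result = successor(result)
        b = b.predecessor
    return result


def from_count(count: int) -> Cardinal:
    n = ZERO
    for _ in range(count):
        n = successor(n)
    return n


def to_count(n: Cardinal) -> int:
    count = 0
    while n.predecessor is not None:
        count += 1
        n = n.predecessor
    return count


# Two representations of the same abstract values.
def from_decimal(text: str) -> Cardinal:
    return from_count(int(text))


def from_tally(marks: str) -> Cardinal:
    return from_count(len(marks))


def to_decimal(n: Cardinal) -> str:
    return str(to_count(n))


def to_tally(n: Cardinal) -> str:
    return "|" * to_count(n)


# Values read from different representations can be combined freely.
five = add(from_decimal("2"), from_tally("|||"))
```

The addition is defined once, on the abstract value; neither representation takes part in it.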

As another example, the Boolean data type is defined by its extension, the two distinct values true and false, and by the rules of negation and of combining these values in conjunction and disjunction. The representation of Boolean values can be the words "true" and "false," "yes" and "no," the numbers 0 and 1, or any two signs that are distinct from each other. The representation of data types does not matter as long as it conforms to the semantic definition of the data type.
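The same point can be sketched for the Boolean type. In the following Python fragment the scheme names and the choice of "1" for true are assumptions made for illustration; the logical operations act on the abstract values, and a representation is only involved when reading or writing.

```python
# Hypothetical representation schemes: (symbol for true, symbol for false).
REPRESENTATIONS = {
    "words": ("true", "false"),
    "answers": ("yes", "no"),
    "digits": ("1", "0"),  # assumption: 1 stands for true
}


def decode(symbol: str, scheme: str) -> bool:
    true_symbol, false_symbol = REPRESENTATIONS[scheme]
    if symbol == true_symbol:
        return True
    if symbol == false_symbol:
        return False
    raise ValueError(f"{symbol!r} is not a valid symbol in scheme {scheme!r}")


def encode(value: bool, scheme: str) -> str:
    true_symbol, false_symbol = REPRESENTATIONS[scheme]
    return true_symbol if value else false_symbol


# Conjunction and negation of values that arrived in different representations,
# written out in a third one.
left = decode("yes", "answers")
right = decode("1", "digits")
result = encode(left and not right, "words")
```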

This specification defines the semantics, that is, the meaning, of the HL7 data types. This specification is about semantics only, independent from representational and operational concerns or specific implementation technologies. Additional standards for representing the data values defined here are being defined for various technological approaches. These standards are called "Implementable Technology Specifications" (ITS.) The ITS define how values are represented so that they conform to the semantic definitions of this specification; this may include syntaxes for character or binary representations, and computer procedures to act on the representation of data values. The meaning of these ITS representations, as communicated, generated, and processed in computer programs, is defined based on this standard, the semantic data type specification.

1.3

Properties of Data Values

Data values have properties defined by their data type. The "fields" of "composite data types" are the most common example of such properties. However, more generally one should think of a data value's properties as logical predicates or as mathematical functions; in simpler but still correct terms, properties are questions one can ask about a data value to receive another data value as an answer.

A property is referred to by its name. For example, the data type integer may have a property named "sign." A property has a domain, which is the set of possible "answer" values. The set of possible "answer" values is defined by the property's data type, but the domain of a property may be a subset of the data type's value set.

A property may also have arguments, additional information one must supply with a question to get an answer. For example, an important property of an integer number is that one integer plus another integer results in another integer, so the plus property of one integer needs an argument: the other integer.

Whether semantic properties have arguments is not a fundamentally relevant distinction. A data type's semantic property without arguments is not necessarily a "field" of a "composite" data type. For example, for integer values, we can define the property is-zero that has the Boolean value true when the number is zero and false when the number is not zero. This does not mean that is-zero must be an explicit component of any integer representation.
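The notions of property, domain, and argument can be sketched as follows. This Python fragment is illustrative only; the class name Integer and the property names sign, is_zero, and plus follow the examples in the text, while the domain chosen for sign (-1, 0, 1) is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Integer:
    value: int  # the only stored "field" in this particular representation

    @property
    def sign(self) -> int:
        """A property without arguments. Its data type is integer, but its
        domain is only a subset of the integers: -1, 0, or 1 (assumed here)."""
        return (self.value > 0) - (self.value < 0)

    @property
    def is_zero(self) -> bool:
        """A property without arguments that is not a stored component."""
        return self.value == 0

    def plus(self, other: "Integer") -> "Integer":
        """A property with an argument: the other integer."""
        return Integer(self.value + other.value)


answer_sign = Integer(-7).sign
answer_zero = Integer(0).is_zero
answer_sum = Integer(2).plus(Integer(3))
```

Each property is a question asked of a value and answered with another value; only one of the three answers depends on an argument, and none of them requires an additional stored component.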

A data type's semantic property with arguments has no specific operational notions such as "procedure call," "passing arguments," "return values," "throwing exceptions," etc. These are all concepts of computer systems implementation of data types - but these operational notions are irrelevant for the semantics of data types.

This specification is about semantics of data types only. Neither is it about value representation syntax (not even an abstract syntax), nor is it about an operational interface to the data values.

1.4

Need for Abstraction

Why does this specification make such a big issue about its being abstract from representation syntax as well as operational implementation?

HL7 needs this kind of abstract semantic data type specification for a very practical purpose. One important design feature of HL7 version 3 is its openness towards representation and implementation technologies. All HL7 version 3 specifications are supposed to be done in a form independent from specific representation and implementation technologies. HL7 acknowledges that, while at times some representation and implementation technologies may be more popular than others, technology is going to change - and with changing technology, representations of data values will change. HL7 standards are primarily targeted to healthcare domain information, independent from the technology supporting this information. HL7 expects that specifications defined independent from today's technology will continue to be useful, even after the next technological "paradigm shift".

The issue of data types is closer to implementation technology than most other HL7 information standards - and therein lies a certain danger that we define data types too dependent on current implementation technologies.

The majority of HL7 standards are about complex business objects. Complex business objects with many informational attributes can be specified as abstract syntax, where components are eventually defined in terms of data types. Conversely, defining data types in terms of abstract syntax is of little use because the components of such abstract syntax constructs would still have to have data types.2

Why is this specification so circular? Why is the data type "ANY" defined in terms of specializations of itself?

This specification needs to be independent of any particular implementation, and is therefore abstract, and not intended to be implementable. In this sense, the circularity is not a problem, since it does not introduce any uncertainty about what this specification says.

Why doesn't this specification define a set of primitive data types based on which composite data types could be defined simply as abstract syntax?

Any concrete implementation of the HL7 standards must ultimately use the built-in data types of its implementation technology. Therefore, we need a very flexible mapping between HL7 abstract data types and those data types built into any specific implementation technology. With a semantic specification, an Implementable Technology Specification (ITS) can conform simply by stating a mapping between the constructs of its technology and the HL7 version 3 data type semantics. Whether a data type is primitive or composite is irrelevant from a semantic perspective, and the answer may be different for different implementation technologies.

For example, this standard specifies a character string as a data type with many properties (e.g., charset, language, etc.) However, in many implementation technologies, character strings are primitive first class data types. We encourage that these native data types be used rather than a structure that slavishly represents all the semantic properties as "components." This specification only requires that the properties defined for data values can somehow be inferred from whatever representation is chosen; it does not matter how these values are represented. Whether "primitive" or "composite", with few or many "components", as "fields" or "methods" - this is all irrelevant.

For another example, a decimal representation, a floating-point register and a scaled integer are all possible native representations of real numbers for different implementation technologies. Some of these representations have properties that others do not have. Scaled integers, for instance, have a fixed precision and a relatively small range. Floating-point values have variable precision and a large range, but floating-point values lose any information about precision. Decimal representations are of variable precision and maintain the precision information (yet are slow to process.) The data type semantics must be independent from all these accidental properties of the various representations, and must define the essential properties that any technology should be able to represent.
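These differences can be observed with the number types of a general-purpose language. The following Python sketch holds the measured value "1.20" in three ways; the variable names and the scale of two decimal places for the scaled integer are assumptions made for illustration.

```python
from decimal import Decimal

text = "1.20"  # a measurement reported with three significant digits

# Decimal representation: the trailing zero, and thus the precision, is kept.
as_decimal = Decimal(text)
decimal_digits = len(as_decimal.as_tuple().digits)
decimal_text = str(as_decimal)

# Floating-point representation: the value is kept, the precision is lost.
as_float = float(text)
float_text = repr(as_float)

# Scaled integer: precision is fixed in advance by the chosen scale.
SCALE = 2  # assumed number of decimal places
as_scaled = int(Decimal(text).scaleb(SCALE))
restored = Decimal(as_scaled).scaleb(-SCALE)
```

All three hold the same real number; only the decimal form can still answer how many significant digits were reported, and the scaled integer can do so only because the scale is agreed outside the value.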

1.5

Need for an HL7 Data Type Standard

Why does HL7 need its own data type standard? Why can't HL7 simply adopt a standard defined by some other body?

As noted in the previous section, all HL7 implementation technologies have some data type system, but there are differences among the data type systems between implementation technologies. In addition, many implementation technologies' data type systems are not powerful enough to express the concepts that matter for the HL7 application layer.

For example, few implementation technologies provide the concepts of physical quantities, precision, ranges, missing information, and uncertainty that are so relevant in scientific and health care computing.

On the other hand, implementation technologies do make distinctions that are not relevant from the abstract semantics viewpoint, e.g., fixed point vs. floating-point real numbers; 8, 16, 32, or 64-bit integers; date vs. timestamp.

A number of data type systems have been used as input to this specification. These include the type systems of many major programming languages, including BASIC, Pascal, MODULA-2, C, C++, JAVA, ADA, LISP and SCHEME. This also includes type systems of language-independent implementation technologies, such as Abstract Syntax Notation One (ASN.1), Object Management Group's (OMG) Interface Definition Language (IDL) and Object Constraint Language (OCL), SQL 92 and SQL 99, the ISO 11404 language independent data types, and XML Schema Part 2 data types. Data types from health care standards have been considered as well, among them HL7 version 2.x, the types used by CEN TC 251 messages and the Electronic Health Care Record Architecture (EHCRA), and DICOM.

1.6

Requirements

The data types described in this specification are designed to meet a number of requirements. These include:

  • Modelling considerations
  • Implementation considerations
  • Compatibility with other data type standards
  • Functional requirements identified in other HL7 standards where the data types are used.

Of these, the last is the most important consideration. These data types are designed to deliver the functionality required throughout the HL7 standards. These requirements are not always compatible, and throughout this specification there are a number of places where particular design features are less than optimal for one of the four considerations listed above. In a number of these places, the requirements that led to the design feature are described in a requirements section. These requirements sections are informative only, not normative.

Requirement:
The Reference Information Model defines a number of reference classes on which all domain information models are based. Each of these reference classes has a series of attributes, each of which has an assigned type. Where the reference classes are used in (cloned into) domain models, the types in the reference classes may be replaced by other types to clarify and constrain the use of the attribute in the clone classes.

This data types specification must define the rules for which data types can be substituted in this fashion. This specification chooses to use the specialization metaphor as a basis for the substitution rules, since this is a widely understood and widely used method in theory and practice, and because these rules are more easily understood and managed than the alternatives. This use of specialization may lead to designs that may appear unfamiliar to some.
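A substitution rule based on specialization can be sketched as a simple check. The Python fragment below is an illustration under stated assumptions: the parent relations are copied from the "specializes" statements in the table of contents of this document, the function name specializes is hypothetical, and the rule shown (a type may be replaced by itself or by any of its specializations) is the simplest reading of the specialization metaphor, not a normative statement.

```python
from typing import Dict, Optional

# Parent of each type, taken from the "specializes" statements in the
# table of contents (only a few types are listed here).
PARENT: Dict[str, Optional[str]] = {
    "ANY": None,
    "BL": "ANY",
    "BN": "BL",
    "CD": "ANY",
    "CS": "CD",
    "CV": "CD",
    "CO": "CV",
    "CE": "CD",
}


def specializes(candidate: str, declared: str) -> bool:
    """True if `candidate` is `declared` or a direct or indirect specialization."""
    current: Optional[str] = candidate
    while current is not None:
        if current == declared:
            return True
        current = PARENT[current]
    return False


# An attribute declared as CD in a reference class could be constrained
# to CO in a clone class, but not the other way round.
co_for_cd = specializes("CO", "CD")
cd_for_co = specializes("CD", "CO")
```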

1.7

Forms of Data Type Definitions

This specification defines data types in several forms, using textual description, UML diagrams, tables, and a formal definition.

1.7.1

Formal Data Type Definition Language

A formal definition of data types is used in order to clarify the semantics of the proposed types as unambiguously as possible. This data type definition language is described in detail in (§ 1.9). Formal languages make crisp, essential statements and are therefore accessible to formal argument of proof or rebuttal. However, the terseness of such formal statements may also make them difficult for humans to understand. Therefore, all the important inferences from the formal statements are also included as plain English statements.

1.7.2

Tables of Properties

For a quick overview, at the beginning of many data types this specification contains tables listing "primary" properties. "Primary" properties are a somewhat fuzzy notion of those properties that are more likely to be thought of as "fields" if the data type were implemented as a record, or that are expected to be used more often. These tables are provided to facilitate an overview of the content and purpose of data types. There is no requirement that the properties listed in these tables be represented as fields, and these tables are not abstract syntax definitions.

Each row of the property tables describes one property with the following columns:

  1. Name - the name of the property as stated in the formal definition. For some data types, the name field of the first property may be empty. This may happen in those data types that are defined as extensions of other data types, when it is not useful for the summary of the child to show any properties of the parent.

  2. Type - the data type of that property.

  3. Definition - a short text describing the meaning of the property.

1.7.3

Unified Modeling Language (UML) Diagrams

The Unified Modeling Language (UML) is used for a graphical presentation of how data types relate to each other. Data types are shown as UML classes using the short name for the class. Properties of types are shown as UML operations. Generic types are shown as UML parameterized classes, with UML realization links relating their instantiations.

Much of the detail of the data type declarations cannot be represented in the UML representation. Therefore the formal definition of the data types in the Data Type Definition Language (DTDL) should be used for detailed specification of the data types.

Some of the constraints from the DTDL are represented as constraints on the operations. Where constraints are shown, they are statements that will be true and are taken from the DTDL specification.

The UML diagrams use a stereotype "mixin". The mixin stereotype applies to a parameterized class, and denotes that the class specializes the parameter type and expresses all the properties of the type T in addition to its own properties.
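The mixin idea can be sketched with a class factory. In the following Python fragment, the function history_item is a hypothetical stand-in for a generic type extension such as History Item (HXIT), which the table of contents lists as specializing its parameter type T and adding a valid time property; representing the valid time as a pair of date strings is an assumption made for illustration.

```python
def history_item(base: type) -> type:
    """Return a class that specializes `base` (the parameter type T) and adds
    one property of its own, in the manner of a mixin."""

    class HistoryItem(base):
        valid_time = None  # property added by the mixin

    HistoryItem.__name__ = f"HXIT_{base.__name__}"
    return HistoryItem


# Instantiate the parameterized class with T = int.
HistoricInt = history_item(int)

reading = HistoricInt(72)
reading.valid_time = ("2001-01-01", "2001-06-30")

# The value still expresses all properties of T ...
still_an_int = isinstance(reading, int)
next_value = reading + 1
# ... in addition to the property contributed by the mixin.
valid_from = reading.valid_time[0]
```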

1.8

Overview of Data Types

[Figure: UML Overview of Data Types]
Table 1: Overview of HL7 version 3 data types
Name Symbol Description
DataValue ANY Defines the basic properties of every data value. This is an abstract type, meaning that no value can be just a data value without belonging to any concrete type. Every concrete type is a specialization of this general abstract DataValue type.
Boolean BL The Boolean type stands for the values of two-valued logic. A Boolean value can be either true or false, or, as any other value may be NULL.
BooleanNonNull BN The BooleanNonNull constrains the boolean type so that the value may not be NULL. This type is created for use within the data types specification where it is not appropriate for a null value to be used
Encapsulated Data ED Data that is primarily intended for human interpretation or for further machine processing outside the scope of HL7. This includes unformatted or formatted written language, multimedia data, or structured information in as defined by a different standard (e.g., XML-signatures.) Instead of the data itself, an ED may contain only a reference (see TEL.) Note that the ST data type is a specialization of the ED data type when the ED media type is text/plain.
Character String ST The character string data type stands for text data, primarily intended for machine processing (e.g., sorting, querying, indexing, etc.) Used for names, symbols, and formal expressions.
Concept Descriptor CD A concept descriptor represents any kind of concept usually by giving a code defined in a code system. A concept descriptor can contain the original text or phrase that served as the basis of the coding and one or more translations into different coding systems. A concept descriptor can also contain qualifiers to describe, e.g., the concept of a "left foot" as a postcoordinated term built from the primary code "FOOT" and the qualifier "LEFT". In cases of an exceptional value, the concept descriptor need not contain a code but only the original text describing that concept.
Coded Simple Value CS Coded data in its simplest form, where only the code is not predetermined. The code system and code system version are fixed by the context in which the CS value occurs. CS is used for coded attributes that have a single HL7-defined value set.
Coded Ordinal CO Coded data, where the domain from which the codeset comes is ordered. The Coded Ordinal data type adds semantics related to ordering so that models that make use of such domains may introduce model elements that involve statements about the order of the terms in a domain.
Coded With Equivalents CE Coded data that consists of a coded value (CV) and, optionally, coded value(s) from other coding systems that identify the same concept. Used when alternative codes may exist.
Character String with Code SC A character string that optionally may have a code attached. The text must always be present if a code is present. The code is often a local code.
Instance Identifier II An identifier that uniquely identifies a thing or object. Examples are object identifier for HL7 RIM objects, medical record number, order id, service catalog item id, Vehicle Identification Number (VIN), etc. Instance identifiers are defined based on ISO object identifiers.
Telecommunication Address TEL A telephone number (voice or fax), e-mail address, or other locator for a resource mediated by telecommunication equipment. The address is specified as a Universal Resource Locator (URL) qualified by time specification and use codes that help in deciding which address to use for a given time and purpose.
Postal Address AD Mailing and home or office addresses. A sequence of address parts, such as street or post office Box, city, postal code, country, etc.
Entity Name EN A name for a person, organization, place or thing. A sequence of name parts, such as given name or family name, prefix, suffix, etc. Examples for entity name values are "Jim Bob Walton, Jr.", "Health Level Seven, Inc.", "Lake Tahoe", etc. An entity name may be as simple as a character string or may consist of several entity name parts, such as, "Jim", "Bob", "Walton", and "Jr.", "Health Level Seven" and "Inc.", "Lake" and "Tahoe".
Trivial Name TN A restriction of entity name that is effectively a simple string used for a simple name for things and places.
Person Name PN An Entity Name used when the named Entity is a Person. A sequence of name parts, such as given name or family name, prefix, suffix, etc. A name part is a restriction of entity name part that only allows those entity name parts qualifiers applicable to person names. Since the structure of entity name is mostly determined by the requirements of person name, the restriction is very minor.
Organization Name ON An Entity Name used when the named Entity is an Organization. A sequence of name parts.
Integer Number INT Integer numbers (-1,0,1,2, 100, 3398129, etc.) are precise numbers that are results of counting and enumerating. Integer numbers are discrete, the set of integers is infinite but countable. No arbitrary limit is imposed on the range of integer numbers. Two NULL flavors are defined for the positive and negative infinity.
Real Number REAL Fractional numbers. Typically used whenever quantities are measured, estimated, or computed from other real numbers. The typical representation is decimal, where the number of significant decimal digits is known as the precision.
Ratio RTO A quantity constructed as the quotient of a numerator quantity divided by a denominator quantity. Common factors in the numerator and denominator are not automatically cancelled out. The RTO data type supports titers (e.g., "1:128") and other quantities produced by laboratories that truly represent ratios. Ratios are not simply "structured numerics", particularly blood pressure measurements (e.g. "120/60") are not ratios. In many cases the REAL should be used instead of the RTO.
Physical Quantity PQ A dimensioned quantity expressing the result of measuring.
Monetary Amount MO A monetary amount is a quantity expressing the amount of money in some currency. Currencies are the units in which monetary amounts are denominated in different economic regions. While the monetary amount is a single kind of quantity (money) the exchange rates between the different units are variable. This is the principal difference between physical quantity and monetary amounts, and the reason why currency units are not physical units.
Point in Time TS A quantity specifying a point on the axis of natural time. A point in time is most often represented as a calendar expression.
Set SET A value that contains other distinct values in no particular order.
Sequence LIST A value that contains other discrete values in a defined sequence.
Bag BAG An unordered collection of values, where each value can be contained more than once in the collection.
Interval IVL A set of consecutive values of an ordered base data type.
History HIST A set of data values that have a valid-time property and thus conform to the history item (HXIT) type. The history information is not limited to the past; expected future values can also appear.
Uncertain Value - Probabilistic UVP A generic data type extension used to specify a probability expressing the information producer's belief that the given value holds.
Periodic Interval of Time PIVL An interval of time that recurs periodically. Periodic intervals have two properties, phase and period. The phase specifies the "interval prototype" that is repeated every period.
Event-Related Periodic Interval of Time EIVL Specifies a periodic interval of time where the recurrence is based on activities of daily living or other important events that are time-related but not fully determined by time.
General Timing Specification GTS A set of points in time, specifying the timing of events and actions and the cyclical validity-patterns that may exist for certain kinds of information, such as phone numbers (evening, daytime), addresses (so called "snowbirds," residing closer to the equator during winter and farther from the equator during summer) and office hours.
Parametric Probability Distribution PPD A generic data type extension specifying uncertainty of quantitative data using a distribution function and its parameters. Aside from the specific parameters of the distribution, a mean (expected value) and standard deviation are always given to help maintain a minimum layer of interoperability if receiving applications cannot deal with a certain probability distribution.

1.9

Introduction to the Formal Data Type Definition Language (DTDL)

NOTE: This is not an API specification. While this formal language might resemble some programming language or interface definition language, it is not intended to define the details of programs and other means of implementation. The formal definitions are a normative part of this specification, but this particular language need not be implemented or used in conformant systems; nor need all the semantic properties be implemented or used by conformant systems. The internal workings of systems, their way of implementing data types, and their functionality and services are entirely out of scope of this specification. The formal definition only specifies the meaning of the data values by stating how one would theoretically expect these values to relate and behave.

This formal data type definition language3 specifies:

  • type name and short name;

  • named values of a fully enumerated extension;

  • semantic properties, unary, binary, and higher order properties;

  • invariants, i.e. constraints over the properties;

  • allowable type conversions;

  • syntax of character string value literals (if any).

Definition of a data type occurs in two steps. First, the data type is declared. The declaration claims a name for a new data type and lists the names, types, and signatures of the new type's semantic properties. This declares the type but does not define it. Second, the type is defined through logic statements about what is always true about this type's values and their properties (invariant statements).

1.9.1

Declaration

Every data type is declared in a form that begins with the keyword type. For example, the following is the header of a declaration for the data type Boolean that has the short name alias BL and specializes the data type ANY.4

Definition 1:
type Boolean alias BL specializes ANY
    values(true, false)
{
    BL      not;
    BL      and(BL x);
};
      

The Boolean data type declaration also contains a values-clause that declares the Boolean's complete set of values (its extension) as named entities. These named values are also valid character string literals. None of the other data types defined in this specification has a finite value set, which is why the values-clause is unique to the Boolean. In the marked-up formal language, value names use italic font.

The block in curly braces following the header contains declarations of the semantic properties that hold for every value of the data type. A semicolon terminates each property declaration; and another semicolon after the closing curly brace terminates the data type declaration.

A property declaration mentions from left to right: (1) the data type of the property's value domain, (2) the property name, and (3) an optional argument list. The argument list of a property is enclosed in parentheses containing a sequence of argument declarations. Each argument is declared by the data type name and argument name. Semantic properties without arguments do not use an empty argument list.5

The specializes-clause means (a) inheritance of properties from the genus to the species, and (b) substitutability of values of the species type for variables of the genus type. Specialization can include the definition of additional properties and the specification of constraints on inherited properties for the specialized type.

An example for inheritance is: when CD has the property code and CS specializes CD, then CS also has this property code even though code is not listed explicitly in the property declaration of CS. An example for substitutability is: when a property is declared as of a data type CD, and CS specializes CD, then a value of such property may be of type CS. In other words, substitutability is the same as subsumption of all values of type CS being also values of type CD.6
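The two effects of the specializes-clause can be illustrated with ordinary class inheritance. The following is only a sketch in Python, not part of the formal language: the classes are hypothetical stand-ins that borrow the names ANY, CD, and CS, the constructor arguments and the describe function are invented for illustration, and the real types have many more properties.

```python
class ANY:
    """Hypothetical stand-in for the abstract DataValue type."""

    def __init__(self, null_flavor=None):
        self.null_flavor = null_flavor

    @property
    def is_null(self):
        # A value is NULL exactly when it carries a null flavor.
        return self.null_flavor is not None


class CD(ANY):
    """Hypothetical stand-in for a coded type declaring the property `code`."""

    def __init__(self, code=None, null_flavor=None):
        super().__init__(null_flavor)
        self.code = code


class CS(CD):
    """Specializes CD and declares no properties of its own."""


def describe(value: CD) -> str:
    # Declared for the genus CD; any species of CD may be passed in.
    return "NULL" if value.is_null else f"code={value.code}"


cs = CS(code="active")
print(cs.code)              # inheritance: `code` comes from CD
print(isinstance(cs, CD))   # substitutability: a CS value is also a CD value
print(describe(cs))
```

The function describe is written against CD only, yet it accepts a CS value unchanged, which is the substitutability described above.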

An example of substitution used throughout

The type-declaration may be qualified by the keyword abstract, protected, or private. An abstract type is a type where no value can be just of this type without belonging to a concrete specialization of the abstract type. A protected type is a type that is used inside this specification, but no property outside this specification should be declared of a protected type. A private type is an internal "helper" abstraction, defined only for the purpose of defining some aspect of the semantics of data types, but that is not used even as the type of another protected or public type's property.7 (We also use the qualifier private at one point. Private types are only specified for the sake of formal definition of other types and are not used in any form outside this specification.)

1.9.2

Invariant Statements

The declaration of semantic properties, their names, data types, and arguments provides only clues as to what the new data type might be about. The true definition lies in the invariant statements. Invariant statements are logical statements that are true at all times.

Throughout this specification, invariant statements are provided in a formal syntax but are also written in plain English. The advantage of the formal syntax is that it can be interpreted unambiguously, and that it is strongly typed. The advantage of plain English statements is that they are more understandable, especially to those untrained in reading formal languages.

The formal syntax does help to sharpen the decisiveness of this specification. In some cases, however, the full semantics of a type are beyond what can be fully expressed in such invariant statements. The combination of both plain and formal language helps to make this specification more clear.

Invariant statements are formed using the invariant keyword that declares one or more variables in the same form as an argument list of a property. The invariant statement can contain a where clause that constrains the arguments for the entire invariant body. The invariant body is enclosed in curly braces. It contains a list of assertions that must all be true.

Definition 2:
invariant(BL x) where x.nonNull {
    x.and(true).equal(x);
};
      

The semantics of the invariant statement is a logic predicate with a universal quantifier ("for all").

The above invariant statement can be read in English as "For all Boolean values x, where x is non-NULL it holds that x AND true equals x." All properties should be named such that one can read the assertions like English sentences.8
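To make the "for all ... where ..." reading concrete, the following Python sketch checks the invariant by enumerating a small value set. It is an illustration only: representing NULL as None, and the helper names bl_and and invariant_holds, are assumptions of this sketch and not part of the specification.

```python
# Hypothetical stand-in for the BL value set: true, false, and None for NULL.
BL_VALUES = (True, False, None)


def bl_and(x: bool, y: bool) -> bool:
    """Conjunction, used here for non-NULL operands only."""
    return x and y


def invariant_holds(values, where, body) -> bool:
    """Universal quantifier: body must hold for every value that passes where."""
    return all(body(x) for x in values if where(x))


# invariant(BL x) where x.nonNull { x.and(true).equal(x); };
holds = invariant_holds(
    BL_VALUES,
    where=lambda x: x is not None,
    body=lambda x: bl_and(x, True) == x,
)
print(holds)
```

The where clause restricts which values the body is checked against; values that fail it (here, NULL) neither satisfy nor violate the invariant.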

The argument list of an invariant statement need not be specified if no such argument is needed.

Definition 3:
invariant {
    true.not.equal(false);
    false.not.equal(true);
};
      
1.9.2.1
Assertion Expressions

Assertions in invariant statements are expressions built with the semantic properties of defined data types. Assertion expressions must have a Boolean value (true or false.)9 No primitive data types, or operations, pre-exist the definition of any data type. The only preexisting features of the assertion expression language are:10

  • character strings representing utterances in the data type definition language;

  • the notion of an assertion being successful (true) or failing (false);

  • the invariant statement: invariant(...) where ... {...};

  • the universal quantifier expression form forall (...) where ... {...}; synonymous to the invariant statement;

  • the existence quantifier expression form exists (...) where ... {...};

  • the implicit conjunction (logical AND) between the semicolon-separated assertions: assertion_1; assertion_2; ... ; assertion_n;

  • variables and declarations in the invariant argument list;

  • the property reference using the period: x.property;

  • implicit and explicit type conversion: (T)x;

  • parentheses to override the priorities of the conversion and property resolution operators: (T)x.property versus ((T)x).property.

1.9.2.2
Nested Quantifier Expressions

Within assertion expressions, nested quantifier statements can be formed similar to invariant statements. In fact, the universal quantifier built using the forall keyword is the same as the invariant statement. The universal quantifier can be used in a nested expression when the complexity of the problem requires it, such as in the following example:

Definition 4:
invariant(SET<T> x, y) where x.nonNull {
  x.subset(y).equal(
      forall(T element) where x.contains(element) {
        y.contains(element);
      });
};
        

The existence quantifier has the same meaning as in common predicate logic. For example, the following invariant means: "SET values x and y intersect if and only if there exists an element e that is contained in both sets x and y."

Definition 5:
invariant(SET<T> x, y) where x.nonNull {
  x.intersects(y).equal(
      exists(T e) {
        x.contains(e);
        y.contains(e);
      });
};
        

The existence quantifier may have a where-clause; however, there is no difference whether an assertion is made as a where-clause or in the body of the existence quantifier. Conversely, for universal quantifiers, the where-clause weakens the assertion since the body now only applies for values that meet the criterion in the where-clause.
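The two quantifiers and the role of the where-clause can be sketched in Python with all() and any(). This is an illustration under a simplifying assumption: the quantified variable ranges over a small finite domain passed in explicitly, whereas the formal language quantifies over all values of a type. The function names are hypothetical.

```python
def forall(domain, where, body) -> bool:
    """Universal quantifier: body is only required where the where-clause holds."""
    return all(body(e) for e in domain if where(e))


def exists(domain, where, body) -> bool:
    """Existence quantifier: where-clause and body are simply conjoined."""
    return any(where(e) and body(e) for e in domain)


def subset(x, y, domain) -> bool:
    return forall(domain, where=lambda e: e in x, body=lambda e: e in y)


def intersects(x, y, domain) -> bool:
    return exists(domain, where=lambda e: True, body=lambda e: e in x and e in y)


domain = range(10)
a, b, c = {1, 2}, {1, 2, 3}, {7, 8}

print(subset(a, b, domain), subset(b, a, domain))
print(intersects(a, b, domain), intersects(a, c, domain))

# For exists, moving a condition between where-clause and body changes nothing.
same = exists(domain, lambda e: e in a, lambda e: e in b) == intersects(a, b, domain)
print(same)
```

Because forall skips every element that fails the where-clause, an empty set is a subset of any set: there is nothing left for the body to fail on.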

1.9.3

Type Conversion

This specification defines certain allowable conversions between data types. For example, there is a pair of conversions between the Character String (ST) and Encapsulated Data (ED). This means that if one expects an ED value but actually has an ST value instead, one can turn the ST value into an ED.11

Three kinds of type conversions are defined: promotion, demotion, and character string literals. Type conversions can be implicit or explicit. Implicit type conversion occurs when a certain type is expected (e.g. as an argument to a statement) but a different type is actually provided. If the type provided has a conversion to the type expected the conversion should be done implicitly.

NOTE: An Implementation Technology Specification will have to specify how implicit type conversions are supported. Some technologies support them directly, others do not; in any case, processing rules can be set that specify how these conversions are realized.

An explicit conversion can be specified in an assertion expression using the converted-to type name in parentheses before the converted value. For example, the following is an explicit type conversion in the where clause of an invariant statement.

Definition 6:
invariant(ED x) where ((ST)x).nonNull { ... };
      

The type conversion has lower priority than the property resolution period. Thus "(T)a.b" converts the value of the property b of variable a to data type T, while "((T)a).b" converts the value of variable a to T and then references property b of that converted value.

Implicit type conversions in the assertion expressions are performed where possible. If a property's formal argument is declared of data type T, the expression used as an actual argument is of type U, U does not extend T, and U defines a conversion to T, then that conversion from U to T takes effect.

1.9.3.1
Demotion

A demotion is a conversion with a net loss of information. Generally, this means that a more complex type is converted into a simple type.

An example for a demotion is the conversion from Interval (IVL) to a simple Quantity (QTY), e.g. the center of the interval. In the data type definition language, a demotion is declared using the keyword demotion and the data type name to which to demote:

Definition 7:
type Interval alias IVL {
  ...
  demotion  QTY;
  ...
};
        

The specification of demotions shall indicate what information is lost and what the major consequences of losing this information are.

1.9.3.2
Promotion

A promotion is a conversion where new information is generated. Generally, this means that a simpler type is converted into a more complex type.

For example, we allow any Quantity (QTY) to be converted to an Interval (IVL). However, IVL has more semantic properties than QTY, namely the low and high boundary. Thus, the conversion of QTY to IVL is a promotion. The additional properties of IVL not present in QTY must assume new values, default values, or computed values. The specification of the promotion must indicate what these values are or how they can be generated.

A promoting conversion from type QTY to type IVL is defined as a semantic property of data type QTY using the keyword promotion and the data type name to which to promote:

Definition 8:
type Quantity alias QTY {
  ...
  promotion   IVL;
  ...
};
        

Typically, a promotion is defined from a simple type to a more complex type. Also typically, the simple type is declared earlier in this document than a more complex type. Declaring all promotions to complex types in the simple type would thus involve forward references and would be confusing to the reader. Therefore, an alternative syntax allows promotions to be defined in the more complex type. This is indicated by naming the type from which to promote in an argument list behind the type to which to promote.

Definition 9:
type Interval alias IVL {
  ...
  promotion   IVL (QTY x);
  ...
};
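The difference between promotion and demotion can be sketched in Python as a pair of conversions between a plain number and an interval. This is an illustration only. The demotion to the center of the interval follows the example given in the text; the promotion to a degenerate interval whose low and high both equal the quantity is an assumption of this sketch, since the rule for the generated values is not stated in this section. The class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """Hypothetical stand-in for IVL over plain numbers."""

    low: float
    high: float

    def demote(self) -> float:
        """Demotion: keeps the center of the interval, loses its width."""
        return (self.low + self.high) / 2


def promote(x: float) -> Interval:
    """Promotion: generates low and high (assumed here to both equal x)."""
    return Interval(low=x, high=x)


print(promote(5.0))
print(Interval(2.0, 6.0).demote())
# Information loss: two different intervals demote to the same quantity.
print(Interval(2.0, 6.0).demote() == Interval(3.0, 5.0).demote())
```

The last line shows why a demotion must document what is lost: the width of the interval cannot be recovered from the demoted value.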
        

1.9.4

Literal Form

A literal is a character string representation of a data value. Literals are defined for many types. A literal is a type conversion from and to a Character String (ST) with a specially defined syntax.

Not every conversion from and to an ST is a literal conversion, however. A literal for a data type should be able to represent the entire value set of a data type whereas any other conversion to and from ST may only map a smaller subset of the converted data type.

The purpose of having literals is so that one can write down values in a short human readable form. For example, literals for the types integer number (INT) and real number (REAL) are strings of sign, digits, possibly a decimal point, etc. The more important interval types (IVL<REAL>, IVL<PQ>, IVL<TS>) have literal representations that allow one to use, e.g., "<5" to mean "less than 5", which is much more readable than a fully structured form of the interval. For some of the more advanced data types such as intervals, general timing specification, and parametric probability distribution we expect that the literal form may be the only form seen for representing these values until users have become used to the underlying conceptualizations.

Each literal conversion has its own syntax (grammar), often aligned with what people find intuitive. This syntax may therefore not be completely straightforward from a computer's perspective.12

NOTE: Character string based Implementation Technology Specifications (ITS) of these abstract data types may or may not choose the literals defined here as their representations for these data types. We expect that the XML ITS will use some but not all of the literals defined here.
1.9.4.1
Declaration

In the data type definition language we declare a literal form as a property of a data type using the keyword literal followed by the data type name ST, since the literal is a conversion to and from the ST data type.

Definition 10:
type IntegerNumber alias INT {
  ...
  literal   ST;
  ...
};
        
1.9.4.2
Definition

The actual definition of the literal form occurs outside the data type declaration body using an attribute grammar. An attribute grammar is a grammar that specifies both syntax and semantics of language structures. The syntax is defined in essentially the Backus-Naur-Form (BNF).13

For example, consider the following simple definition of a data type for cardinal numbers (non-negative integers). This type definition depends only on the Boolean data type (BL) and has a character string literal declared:

Definition 11:
type CardinalNumber alias CARD {
  BL  isZero;
  BL  equal(ANY x);
  CARD  successor;
  CARD  plus(CARD x);
  CARD  timesTen;
  literal   ST;
};
        
  • Syntax Definition

The literal syntax and semantics are first presented completely and then described in detail.

Definition 12:
CARD.literal ST {
  CARD
  : CARD digit  { $.equal($1.timesTen.plus($2)); }
  | digit   { $.equal($1); };

  CARD digit
  : "0"   { $.isZero; }
  | "1"     { $.equal(0.successor); }
  | "2"     { $.equal(1.successor); }
  ...
  | "8"   { $.equal(7.successor); }
  | "9"     { $.equal(8.successor); }
};
        

Every syntactic rule consists of the name of a symbol, a colon and the definition (so called production) of the symbol. A production is a sequence of symbols. These other symbols are also defined in the grammar, or they are terminal symbols. Terminal symbols are character strings written in double quotes or string patterns (called regular expressions.) Thus the form:

Definition 13:
CARD : CARD digit | digit;
        

means that any cardinal number symbol is a cardinal number symbol followed by a digit or just a digit. The vertical bar stands for a disjunction (logical OR). A syntactic rule ends with a semicolon.

Every symbol has exactly one value of a defined data type. The data type of the symbol's value is declared where the symbol is defined:

Definition 14:
CARD digit : "0" | "1" | "2" | ... | "8" | "9";
        

means that the symbol digit has a value of type CARD. The start-symbol is the data type itself and does not need a separate name.

  • Semantics Definition

The semantics of the literal expression is specified in semantic rules enclosed in curly braces for each of the defined productions of a symbol:

symbol : production_1 { rule_1 } | production_2 { rule_2 } | ... | production_n { rule_n };

A semantic rule is simply a semicolon-separated list of Boolean assertion expressions of the same kind as those used in invariant statements. However, there are special variables defined in the semantic rule that all begin with a dollar character (e.g., $, $1, $2, $3, ...). The simple $ stands for the value of the currently defined symbol, while $1, $2, $3, etc. stand for the values of the parts of the semantic rule's associated production. For example, in

Definition 15:
CARD
: CARD digit  { $.equal($1.timesTen.plus($2)); }
| digit   { $.equal($1); };
        

the first production "CARD digit" has a semantic rule that says: the value $ of the defined symbol equals the value $1 of the first symbol CARD times ten plus the value $2 of the second symbol digit.14
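The effect of the two CARD productions and their semantic rules can be sketched as a small Python function that evaluates a digit string from left to right. This is only an illustration: Python integers stand in for CARD values, and the names successor, times_ten, DIGIT_VALUES, and parse_card are hypothetical helpers of this sketch.

```python
def successor(n: int) -> int:
    return n + 1


def times_ten(n: int) -> int:
    return n * 10


# digit production: "0" is zero, each further digit is the successor
# of the value of the digit before it.
DIGIT_VALUES = {"0": 0}
for previous, digit in zip("012345678", "123456789"):
    DIGIT_VALUES[digit] = successor(DIGIT_VALUES[previous])


def parse_card(literal: str) -> int:
    """Evaluate a CARD literal according to the two CARD productions."""
    if not literal or any(ch not in DIGIT_VALUES for ch in literal):
        raise ValueError(f"not a CARD literal: {literal!r}")
    # CARD : digit         { $.equal($1); }
    value = DIGIT_VALUES[literal[0]]
    # CARD : CARD digit    { $.equal($1.timesTen.plus($2)); }
    for ch in literal[1:]:
        value = times_ten(value) + DIGIT_VALUES[ch]
    return value


print(parse_card("128"))  # 128
```

The left-recursive rule "CARD : CARD digit" becomes a loop here: each additional digit multiplies the value read so far by ten and adds the value of the new digit.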

  • Terminal Symbols

A terminal symbol can be specified as a string pattern, so-called regular expression. The regular expression syntax used here is the classic syntax invented by Aho and used in AWK, LEX, GREP, and PERL. Regular expressions appear between two slashes /.../. In a regular expression pattern every character except [ ] ^ $ . / : ( ) \ | ? * + { } matches itself. The other characters that are actually used in this specification are defined in Table 2.

Table 2: Special Characters for Regular Expressions
Pattern Definition
[ ... ] Specifies a character class. For example, /[A-Za-z]/ matches the characters of the upper and lower case English alphabet.
[^ ...] Specifies a character class negatively. For example, /[^BCD]/ matches any character except B, C, and D.
...? The preceding pattern is optional. For example, /ab?c/ matches "ac" and "abc".
...* The preceding pattern may occur zero or many times. For example, /ab*c/ matches "ac", "abc", "abbc", "abbbc", etc.
...+ The preceding pattern may occur one or more times. For example, /ab+c/ matches "abc", "abbc", "abbbc", but not "ac".
... {n,m} The preceding pattern may occur n to m times where n and m are cardinal numbers 0 ≤ n ≤ m. For example, /ab{2,4}c/ matches "abbc", "abbbc", and "abbbbc".
... | ... The pattern on either side of the bar may match. For example, /a(b|c)d/ matches "abd" and "acd" but not "abcd".
( ... ) The pattern in parentheses is used as one pattern for the above operators. For example, /a(bc)*/ matches "a", "abc", "abcbc", "abcbcbc", etc.
... : ... The left pattern matches if followed by the right pattern, but the right pattern is not consumed by a match. For example, /ab:c/ matches "abc" but not "ab", however, the value of a symbol thus matched is "ab" and the "c" is left over for the next symbol. The colon is a slight deviation from the conventional slash / but the slash is also conventionally used to enclose the entire pattern and may occur as a character to match - three meanings is one too many.
... \ ... Matches the following character literally, i.e. escapes from any special meaning of that character. For example, /a\+b/ matches "a+b".
... \/ ... Matches the slash as a character. For example, /a\/bc/ matches "a/bc".
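Most of the operators in Table 2 behave the same way in common regular expression libraries, so the examples can be tried out directly. The following sketch uses Python's re module and is only an illustration. Two points are assumptions of this sketch: patterns are written without the enclosing slashes, and the colon operator of Table 2, which Python does not have, is approximated by the lookahead group (?=...).

```python
import re


def matches(pattern: str, text: str) -> bool:
    """True if the whole text is one match of the pattern."""
    return re.fullmatch(pattern, text) is not None


# pattern -> (strings that match, strings that do not match)
EXAMPLES = {
    "[A-Za-z]": (["q", "Q"], ["7"]),
    "[^BCD]": (["A", "x"], ["B"]),
    "ab?c": (["ac", "abc"], ["abbc"]),
    "ab*c": (["ac", "abc", "abbbc"], []),
    "ab+c": (["abc", "abbc"], ["ac"]),
    "ab{2,4}c": (["abbc", "abbbc", "abbbbc"], ["abc", "abbbbbc"]),
    "a(bc)*": (["a", "abc", "abcbc"], ["ab"]),
    r"a\+b": (["a+b"], ["aab"]),
}

for pattern, (accepted, rejected) in EXAMPLES.items():
    assert all(matches(pattern, s) for s in accepted), pattern
    assert not any(matches(pattern, s) for s in rejected), pattern

# Trailing context: "ab" matches only if followed by "c", and "c" is not consumed.
lookahead = re.match(r"ab(?=c)", "abc")
print(lookahead.group())  # ab
print(re.match(r"ab(?=c)", "ab"))  # None
```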

1.9.5

Generic Data Types

Generic data types are incomplete type definitions. This incompleteness is signified by one or more parameters to the type definition. Usually parameters stand for other types. Using parameters, a generic type might declare semantic properties of other not fully specified data types. For example, the generic data type Interval is declared with a parameter T that can stand for any Quantity data type (QTY). The components low and high are declared as being of type T.

Definition 16:
template<QTY T>
type Interval<T> alias IVL<T> {
    T  low;
    T  high;
};
      

Instantiating a generic type means completing its definition. For example, to instantiate an Interval, one must specify of what base data type the interval should be. This is done by binding the parameter T. To instantiate an Interval of Integer numbers, one would bind the parameter T to the type Integer. Thus, the incomplete data type Interval is completed to the data type Interval of Integer.

For example the following type definition for MyType declares a property named "multiplicity" that is an interval of the cardinal number data type used in the above examples.

Definition 17:
type MyType alias MT {
    IVL<CARD> multiplicity;
};
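Binding a type parameter has a close analogue in the generic types of many programming languages. The following Python sketch is an illustration only: the constrained type variable is a rough stand-in for "any Quantity data type", int stands in for CARD, and the class names simply mirror the examples above.

```python
from dataclasses import dataclass
from typing import Generic, TypeVar

# Stand-in for the template parameter <QTY T>: only numeric types are allowed.
T = TypeVar("T", int, float)


@dataclass(frozen=True)
class Interval(Generic[T]):
    """Generic (incomplete) type: low and high are of the parameter type T."""

    low: T
    high: T


@dataclass(frozen=True)
class MyType:
    # Instantiation: T is bound to int, which stands in for CARD here.
    multiplicity: Interval[int]


mt = MyType(multiplicity=Interval(0, 5))
print(mt.multiplicity.low, mt.multiplicity.high)
```

Interval on its own leaves T open; writing Interval[int] in the declaration of MyType completes the type in the same way that IVL<CARD> does.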
      
1.9.5.1
Generic Collections

Generic data types for collections are used throughout this specification. The most important of them are:

Set (SET<T>). A set contains elements in no particular order and without duplicate elements.

Sequence (LIST<T>). A sequence is a collection of values in an arbitrary but particular order. A sequence has a head and a tail, where the head is an element and the tail is the sequence without its head.

Interval (IVL<T>). An interval is a continuous subset of an ordered type.

These and other generic types are fully defined in (§ ). These generic data types and their properties are used early on in this specification. To understand this specification, knowledge of the set, sequence, and interval types is important, and the reader is advised to refer to (§ ) when coming across a generic type being used to define another type.

1.9.5.2
Generic Type Extensions

Generic data type extensions are generic types with one parameter type that the generic type specializes. In the formal data type definition language, generic type specializations follow the pattern:

Definition 18:
template<ANY T> type GenericTypeExtensionName
              specializes T { ... };
        

These generic type extensions inherit properties of their base type and add some specific feature to it. The generic type extension is a specialization of the base type, thus a value of the extension data type can be used instead of its base data type.15

NOTE: values of extended types can be substituted for their base type. However, an ITS may make some constraints as to what extensions to accommodate. Particularly, extensions need not be defined for those components carrying the values of data value properties. Thus, while any data value can be annotated outside the data type specification, an ITS may not provide for a way to annotate the value of a data value property.

Fundamental data types

1.10

Conformance

If an application receives or parses an instance that is not valid with regard to this specification, the receiver is permitted to reject the instance in whatever fashion it deems appropriate but it is not required to. Note that some other HL7 standard or artefact such as a conformance statement may make additional constraints on behaviour in such cases.

1.11

DataValue (ANY)

Definition:      Defines the basic properties of every data value. This is an abstract type, meaning that no value can be just a data value without belonging to any concrete type. Every concrete type is a specialization of this general abstract DataValue type.

Definition 19:
abstract type DataValue alias ANY {
    TYPE  dataType;
    BN  nonNull;
    CS  nullFlavor;
    BN  isNull;
    BL  notApplicable;
    BL  unknown;
    BL  other;
    BL  equal(ANY x);
};
      

1.11.1

Properties of DataValue (ANY)

1.11.1.1
Data Type (dataType : TYPE)

Definition:      Represents the fact that every data value implicitly carries information about its own data type. Thus, given a data value one can inquire about its data type.

Definition 20:
invariant(ANY x) {
  x.dataType.nonNull;
};
        
1.11.1.2
Proper Value (nonNull : BN)

Definition:      Indicates that a value is a non-exceptional value of the data type.

Definition 21:
invariant(ANY x) {
  x.isNull.equal(x.nonNull.not);
};
        

When a property, RIM attribute, or message field is called mandatory, this means that any non-NULL value of the type to which the property belongs has a non-NULL value for that property. In other words, a field may not be NULL, provided that its container (object, segment, etc.) is to have a non-NULL value.

1.11.1.3
Exceptional Value (isNull : BN)

Definition:      Indicates that a value is an exceptional value, or a NULL-value. A null value means that the information does not exist, is not available or cannot be expressed in the data type's normal value set.

Every data element has either a proper value or it is considered NULL. If (and only if) it is NULL, the null flavor provides more detail as to in what way or why no proper value is supplied.

Definition 22:
invariant(ANY x) {
  x.isNull.equal(x.nullFlavor.implies(NI));
};
        
1.11.1.4
Exceptional Value Detail (nullFlavor : CS)

Definition:      If a value is an exceptional value (NULL-value), this specifies in what way and why proper information is missing.

Definition 23:
invariant(ANY x) {
  x.nonNull.equal(x.nullFlavor.isNull);
};
        
Table 3: Domain NullFlavor:
code name definition
NI NoInformation No information whatsoever can be inferred from this exceptional value. This is the most general exceptional value. It is also the default exceptional value.
  NA not applicable No proper value is applicable in this context (e.g., last menstrual period for a male).
  UNK unknown A proper value is applicable, but not known.
    NASK not asked This information has not been sought (e.g., patient was not asked)
    ASKU asked but unknown Information was sought but not found (e.g., patient was asked but didn't know)
      NAV temporarily unavailable Information is not available at this time but it is expected that it will be available later.
  OTH other The actual value is not an element in the value domain of a variable. (e.g., concept not provided by required code system).
    PINF positive infinity Positive infinity of numbers.
    NINF negative infinity Negative infinity of numbers.
  MSK masked There is information on this item available but it has not been provided by the sender due to security, privacy or other reasons. There may be an alternate mechanism for gaining access to this information. Note: using this null flavor does provide information that may be a breach of confidentiality. Its primary purpose is for those circumstances where it is necessary to inform the receiver that the information does exist.
NP not present Value is not present in a message. This is only defined in messages, never in application data! All values not present in the message must be replaced by the applicable default, or no-information (NI) as the default of all defaults.

The null flavors are a general domain extension of all normal data types. Note the distinction between value domain of any data type and the vocabulary domain of coded data types. A vocabulary domain is a value domain for coded values, but not all value domains are vocabulary domains.

The null flavor "other" is used whenever the actual value is not in the required value domain. This may be the case, for example, when the value violates a constraint that is defined too restrictively (e.g., age less than 100 years).

NOTE: NULL-flavors are applicable to any property of a data value or a higher-level object attribute. Where the difference of null flavors is not significant, ITS are not required to represent them. If nothing else is noted in this specification, ITS need not represent general NULL-flavors for data-value properties.

Some of these null flavors are associated with named properties that can be used as simple predicates for all data values. This is done to simplify the formulation of invariants in the remainder of this specification.

Remember the difference between semantic properties and representational "components" of data values. An ITS needs to represent only those components that are needed to infer the semantic properties. The null-flavor predicates ANY.nonNull, ANY.isNull, ANY.notApplicable, ANY.unknown, and ANY.other can all be inferred from the nullFlavor property.

1.11.1.5
Inapplicable Proper Value (notApplicable : BL)

Definition:      A predicate indicating that this exceptional value is of ANY.nullFlavor not-applicable (NA), i.e., that a proper value is not meaningful in the given context.

Definition 24:
invariant(ANY x) {
  x.notApplicable.equal(x.nullFlavor.implies(NA));
};
        
1.11.1.6
Unknown (unknown : BL)

Definition:      A predicate indicating that this exceptional value is of ANY.nullFlavor unknown (UNK).

Definition 25:
invariant(ANY x) {
  x.unknown.equal(x.nullFlavor.implies(UNK));
};
        
1.11.1.7
Value Domain Exception (other : BL)

Definition:      A predicate indicating that this exceptional value is of ANY.nullFlavor other (OTH), i.e., that the required value domain does not contain the appropriate value.

Definition 26:
invariant(ANY x) {
  x.other.equal(x.nullFlavor.implies(OTH));
};
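
The three predicates above all reduce to an "implies" test on the null flavor hierarchy. The following Python sketch illustrates one way to read them. It assumes that the indentation of Table 3 expresses parent/child relationships, and the names PARENT, implies, not_applicable, unknown and other are hypothetical helpers, not part of this specification. NP is left out because it is defined only in messages, never in application data.

```python
# Parent of each null flavor, read off the indentation of Table 3 (assumption).
PARENT = {
    "NI": None,
    "NA": "NI", "UNK": "NI", "OTH": "NI", "MSK": "NI",
    "NASK": "UNK", "ASKU": "UNK",
    "NAV": "ASKU",
    "PINF": "OTH", "NINF": "OTH",
}

def implies(flavor, target):
    """True if `flavor` is `target` or one of its specializations."""
    while flavor is not None:
        if flavor == target:
            return True
        flavor = PARENT[flavor]
    return False

# A proper (non-NULL) value is modelled here by a null flavor of None.
def not_applicable(null_flavor):
    return implies(null_flavor, "NA")

def unknown(null_flavor):
    return implies(null_flavor, "UNK")

def other(null_flavor):
    return implies(null_flavor, "OTH")
```

Under this reading, a value with null flavor NAV satisfies the unknown predicate, because NAV specializes ASKU, which in turn specializes UNK.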
        
1.11.1.8
Equality (equal : BL)

Definition:      Equality is a reflexive, symmetric, and transitive relation between any two data values. Only proper values can be equal; null values are never equal (even if they have the same null flavor).

Definition 27:
invariant(ANY x, y, z)
  where x.nonNull.and(y.nonNull).and(z.nonNull)
{
  x.equal(x);                       /* reflexivity */
  x.equal(y).equal(y.equal(x));     /* symmetry */
  x.equal(y).and(y.equal(z)).
        implies(x.equal(z));        /* transitivity */
  x.equal(y).implies(x.dataType.equal(y.dataType));
};
        

How equality is determined must be defined for each data type. If nothing else is specified, two data values are equal if they are indistinguishable, that is, if they differ in none of their semantic properties. A data type can "override" this general definition of equality, by specifying its own equal relationship. This overriding of the equality relation can be used to exclude semantic properties from the equality test. If a data type excludes semantic properties from its definition of equality, this implies that certain properties (or aspects of properties) that are not part of the equality test are not essential to the meaning of the value.

For example the physical quantity has the two semantic properties (1) a real number and (2) a coded unit of measure. The equality test, however, must account for the fact that, e.g., 1 meter equals 100 centimeters; independent equality of the two semantic properties is too strong a criterion for the equality test. Therefore, physical quantity must override the equality definition.
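
As an illustration of such an override, the following Python sketch compares lengths after converting them to a common unit. The class name Length and the conversion table TO_METER are hypothetical and cover only a few units; they are not taken from this specification.

```python
from fractions import Fraction

# Hypothetical conversion factors to meters (illustration only).
TO_METER = {"m": Fraction(1), "cm": Fraction(1, 100), "mm": Fraction(1, 1000)}

class Length:
    """A physical quantity restricted to lengths: a number plus a unit code."""

    def __init__(self, value, unit):
        self.value = Fraction(value)
        self.unit = unit

    def equal(self, other):
        # Compare canonical magnitudes instead of comparing value and unit
        # independently, so that 1 m and 100 cm count as equal.
        return (self.value * TO_METER[self.unit]
                == other.value * TO_METER[other.unit])
```

With this sketch, Length(1, "m").equal(Length(100, "cm")) holds, although neither the numbers nor the units are equal on their own.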

1.12

DataType (TYPE) specializes ANY

Definition:      A meta-type declared in order to allow the formal definitions to speak about the data type of a value. Any data type defined in this specification is a value of the type DataType.

Definition 28:
private type DataType alias TYPE specializes DataValue {
    CS  shortName;
    CS  longName;

    BN	implies(TYPE that);
};

        

1.12.1

Short Name (shortName : CS)

Definition:      A CS specifying the alias of the data type.

Definition 29:
invariant(DataType x) where x.nonNull {
  x.shortName.nonNull;
};
           

1.12.2

Long Name (longName : CS)

Definition:      A CS specifying the full name of the data type.

1.12.3

Implies (implies : BN)

Definition:      A data type implies another data type if it has the same type or is a specialisation of it.

2

Basic Types

2.1

Boolean (BL) specializes ANY

Definition:      The Boolean type stands for the values of two-valued logic. A Boolean value can be either true or false, or, as any other value, may be NULL.

Definition 30:
type Boolean alias BL specializes ANY
    values(true, false)
{
            BL  and(BL x);
            BL  not;
  literal   ST;
            BL  or(BL x);
            BL  xor(BL x);
            BL  implies(BL x);
};
    

With any data value potentially being NULL, the two-valued logic is effectively extended to a three-valued logic as shown in the following truth tables:

Table 4: Truth tables for Boolean logic with NULL values
NOT            AND    true   false  NULL     OR     true   false  NULL
true   false   true   true   false  NULL     true   true   true   true
false  true    false  false  false  false    false  true   false  NULL
NULL   NULL    NULL   NULL   false  NULL     NULL   true   NULL   NULL

Where a Boolean operation is performed upon two data values with different nullFlavors, the nullFlavor of the result is the first common ancestor of the two nullFlavors, though conformant applications may also create a result that is any common ancestor.
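
The first common ancestor can be found by walking up the null flavor hierarchy. The Python sketch below assumes that the indentation of Table 3 expresses the hierarchy and that a null flavor counts as its own ancestor; PARENT, ancestors and first_common_ancestor are hypothetical names. NP is omitted because it never occurs in application data.

```python
# Parent of each null flavor, read off the indentation of Table 3 (assumption).
PARENT = {
    "NI": None,
    "NA": "NI", "UNK": "NI", "OTH": "NI", "MSK": "NI",
    "NASK": "UNK", "ASKU": "UNK",
    "NAV": "ASKU",
    "PINF": "OTH", "NINF": "OTH",
}

def ancestors(flavor):
    """Return the flavor itself followed by its ancestors up to the root."""
    chain = []
    while flavor is not None:
        chain.append(flavor)
        flavor = PARENT[flavor]
    return chain

def first_common_ancestor(a, b):
    """Return the most specific null flavor that both a and b imply."""
    seen = set(ancestors(b))
    for candidate in ancestors(a):
        if candidate in seen:
            return candidate
    return None
```

For example, combining NASK with NAV would yield UNK under these assumptions, while combining NAV with PINF would yield NI.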

2.1.1

Properties of Boolean (BL)

2.1.1.1
Negation (not : BL)

Definition:      Negation of a Boolean turns true into false and false into true and is NULL for NULL values.

Definition 31:
invariant(BL x) {
  true.not.equal(false);
  false.not.equal(true);
  x.isNull.equal(x.not.isNull);
};
      
2.1.1.2
Conjunction (and : BL)

Definition:      Conjunction (AND) is associative and commutative, with true as a neutral element. False AND any Boolean value is false. These rules hold even if one or both of the operands are NULL. If both operands for AND are NULL, the result is NULL.

Definition 32:
invariant(BL x, y) {
  x.and(true).equal(x);
  x.and(false).equal(false);
  x.isNull.and(y.isNull).implies(x.and(y).isNull);
};
      
2.1.1.3
Disjunction (or : BL)

Definition:      The disjunction x OR y is false if and only if x is false and y is false.

Definition 33:
invariant(BL x, y) {
  x.or(y).equal(x.not.and(y.not).not);
};
      
2.1.1.4
Exclusive Disjunction (xor : BL)

Definition:      The exclusive-OR constrains OR such that the two operands may not both be true.

Definition 34:
invariant(BL x, y) {
  x.xor(y).equal(x.or(y).and(x.and(y).not));
};
      
2.1.1.5
Implication (implies : BL)

Definition:      A rule of the form IF condition THEN conclusion. Logically the implication is defined as the disjunction of the negated condition and the conclusion, meaning that when the condition is true the conclusion must be true to make the overall statement true. The logical implication is important to make invariant statements.

Definition 35:
invariant(BL condition, conclusion) {
  condition.implies(conclusion).equal(
         condition.not.or(conclusion));
};
      

The implication is not reversible and does not specify what is true when the condition is false (ex falso quodlibet, Latin for "from false follows anything").
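
The three-valued logic of Table 4 and the derivations in Definitions 33 to 35 can be checked with a short Python sketch. It is an illustration only: NULL is modelled as Python's None without distinguishing null flavors, and the function names are hypothetical.

```python
def not_(x):
    """Negation: NULL stays NULL."""
    return None if x is None else (not x)

def and_(x, y):
    """Conjunction: false dominates, otherwise NULL dominates."""
    if x is False or y is False:
        return False
    if x is None or y is None:
        return None
    return True

def or_(x, y):
    """Disjunction, derived from NOT and AND as in Definition 33."""
    return not_(and_(not_(x), not_(y)))

def xor(x, y):
    """Exclusive disjunction, derived as in Definition 34."""
    return and_(or_(x, y), not_(and_(x, y)))

def implies(condition, conclusion):
    """Implication, derived as in Definition 35."""
    return or_(not_(condition), conclusion)
```

Note that implies(False, None) evaluates to True in this sketch: a false condition makes the implication true even when the conclusion is NULL.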

2.1.1.6
Literal Form

The literal form of the Boolean is determined by the named values specified in the values clause, i.e., true and false.

2.2

BooleanNonNull (BN) specializes BL

Definition:      The BooleanNonNull constrains the Boolean type so that the value may not be NULL. This type is created for use within the data types specification where it is not appropriate for a null value to be used.

Definition 36:
private type BooleanNonNull alias BN specializes BL;
    

2.2.1

Properties of BooleanNonNull (BN)

2.2.1.1
isNull (isNull : BN)

Definition 37:
invariant (BN x) {
  x.isNull.not;
};
    

Overview of Text and Multimedia Data Types

2.3

Binary Data (BIN) specializes LIST<BN>

Definition:      Binary data is a raw block of bits. Binary data is a protected type that should not be declared outside the data type specification.

A bit is semantically identical with a non-null Boolean value. Thus, all binary data is — semantically — a sequence of non-null Boolean values.

Definition 38:
protected type BinaryData alias BIN specializes LIST<BN>;
    
NOTE: The representation of arbitrary binary data is the responsibility of an ITS. How the ITS accomplishes this depends on the underlying Implementation Technology (whether it is character-based or binary) and on the represented data. Semantically, character data is represented as binary data. However, a character-based ITS should not convert character data into arbitrary binary data and then represent that binary data in a character encoding. Ultimately even character-based implementation technology will communicate binary data.

An empty sequence is not considered binary data but counts as a NULL-value. In other words, non-NULL binary data contains at least one bit. No bit in a non-NULL binary data value can be NULL.

Definition 39:
invariant(BIN x) where x.nonNull {
  x.notEmpty;
  x.length.greaterThan(0);
};
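
The reading of binary data as a non-empty sequence of non-null Boolean values can be sketched in Python as follows. The function name to_bits and the choice of most-significant-bit-first order are assumptions made for illustration; this specification does not prescribe a bit order.

```python
def to_bits(data: bytes):
    """Return the bits of `data` as a list of bools, or None (NULL) if empty.

    Bits are listed most significant bit first within each byte (assumption).
    """
    if not data:
        return None  # an empty sequence counts as a NULL value, not as BIN
    return [bool((byte >> shift) & 1)
            for byte in data
            for shift in range(7, -1, -1)]
```

Every element of the returned list is either True or False, never None, which mirrors the rule that no bit in a non-NULL binary data value can be NULL.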
      

2.4

Encapsulated Data (ED) specializes BIN

Definition:      Data that is primarily intended for human interpretation or for further machine processing outside the scope of HL7. This includes unformatted or formatted written language, multimedia data, or structured information as defined by a different standard (e.g., XML signatures). Instead of the data itself, an ED may contain only a reference (see TEL). Note that the ST data type is a specialization of the ED data type when the ED media type is text/plain.

Table 5: Property Summary of Encapsulated Data
Name Type Description
mediaType CS Identifies the type of the encapsulated data and identifies a method to interpret or render the data.
charset CS For character-based encoding types, this property specifies the character set and character encoding used. The charset shall be identified by an Internet Assigned Numbers Authority (IANA) Charset Registration in accordance with RFC 2978.
language CS For character based information the language property specifies the human language of the text.
compression CS Indicates whether the raw byte data is compressed, and what compression algorithm was used.
reference TEL A telecommunication address (TEL), such as a URL for HTTP or FTP, which will resolve to precisely the same binary data that could as well have been provided as inline data.
integrityCheck BIN The integrity check is a short binary value representing a cryptographically strong checksum that is calculated over the binary data. The purpose of this property, when communicated with a reference, is for anyone to validate later whether the reference still resolves to the same data that it resolved to when the encapsulated data value with reference was created.
integrityCheckAlgorithm CS Specifies the algorithm used to compute the integrityCheck value.
thumbnail ED An abbreviated rendition of the full data. A thumbnail requires significantly fewer resources than the full data, while still maintaining some distinctive similarity with the full data. A thumbnail is typically used with by-reference encapsulated data. It allows a user to select data more efficiently before actually downloading through the reference.

The cryptographically strong checksum algorithm Secure Hash Algorithm-1 (SHA-1) is currently the industry standard. It superseded the MD5 algorithm only a couple of years ago, when certain flaws in the security of MD5 were discovered. Currently the SHA-1 hash algorithm is the default choice for the integrity check algorithm. Note that SHA-256 is also entering widespread usage.

Definition 40:
type EncapsulatedData alias ED specializes  BIN {
  CS   mediaType;
  CS   charset;
  CS   language;
  CS   compression;
  TEL  reference;
  BIN  integrityCheck;
  CS   integrityCheckAlgorithm;
  ED   thumbnail;
  BL   equal(ANY x);
};
    

Encapsulated data can be present in two forms, inline or by reference. Inline data is communicated or moved as part of the encapsulated data value, whereas by-reference data may reside at a different (remote) location. The data is the same whether it is located inline or remote.

2.4.1

Properties of Encapsulated Data (ED)

2.4.1.1
Media Type (mediaType : CS, default text/plain)

Definition:      Identifies the type of the encapsulated data and identifies a method to interpret or render the data.

The mediaType is a mandatory property, i.e., every non-NULL instance of ED must have a non-NULL mediaType property.

Definition 41:
invariant(ED x) where x.nonNull {
  x.mediaType.nonNull;
};
        

The IANA defined domain of media types is established by the Internet standard RFC 2045 [http://www.ietf.org/rfc/rfc2045.txt] and 2046 [http://www.ietf.org/rfc/rfc2046.txt]. RFC 2046 defines the media type to consist of two parts:

  1. top level media type, and

  2. media subtype.

However, this specification treats the entire media type as one atomic code symbol in the form defined by IANA, i.e., top level type followed by a slash "/" followed by media subtype. Currently defined media types are registered in a database [http://www.iana.org/assignments/media-types/index.html] maintained by IANA. Currently more than 160 different MIME media types are defined, with the list growing rapidly. In general, all those types defined by the IANA may be used.

To promote interoperability, this specification prefers certain media types to others. This is to define a greatest common denominator on which interoperability is not only possible, but that is powerful enough to support even advanced multimedia communication needs.

Table 6 below assigns a status to certain MIME media types, where the status means one of the following:

  • required: Every HL7 application must support at least the required media types if it supports a given kind of media. One required media-type for each kind of media exists. Some media types are required for a specific purpose, which is then indicated as "required for ..."

  • recommended: Other media types are recommended for a particular purpose. For any given purpose there should be only very few additionally recommended media types and the rationale, conditions and assumptions of such recommendations must be made very clear.

  • indifferent: This status means that HL7 neither forbids nor endorses the use of this media type. All media types not mentioned in Table 6 have status indifferent by default. Since there is one required and several recommended media types for most practically relevant use cases, media types of this status should be used very conservatively.

  • deprecated: Deprecated media types should not be used, because these media types are flawed, because there are better alternatives, or because of certain risks. Such risks could be security risks, for example, the risk that such a media type could spread computer viruses. Not every flawed media type is marked as deprecated, though. A media type that is not mentioned in Table 6, and thus has status indifferent, may well be flawed.

Table 6: Domain MediaType:
code name status definition
text/plain  Plain Text  required  For any plain text. This is the default and is equivalent to a character string (ST) data type. 
text/x-hl7-ft  HL7 Text  recommended  For compatibility, this represents the HL7 v2.x FT data type. Its use is recommended only for backward compatibility with HL7 v2.x systems. 
text/html  HTML Text  recommended  For marked-up text according to the Hypertext Mark-up Language. HTML markup is sufficient for typographically marking-up most written-text documents. HTML is platform independent and widely deployed. 
application/pdf  PDF  recommended  The Portable Document Format is recommended for written text that is completely laid out and read-only. PDF is a platform independent, widely deployed, and open specification with freely available creation and rendering tools. 
text/xml  XML Text  indifferent  For structured character based data. There is a risk that general SGML/XML is too powerful to allow a sharing of general SGML/XML documents between different applications. 
text/rtf  RTF Text  indifferent  The Rich Text Format is widely used to share word-processor documents. However, RTF does have compatibility problems, as it is quite dependent on the word processor. May be useful if word processor edit-able text should be shared. 
application/msword  MSWORD  deprecated  This format is very prone to compatibility problems. If sharing of edit-able text is required, text/plain, text/html or text/rtf should be used instead. 
audio/basic  Basic Audio  required  This is a format for single channel audio, encoded using 8bit ISDN mu-law [PCM] at a sample rate of 8000 Hz. This format is standardized by: CCITT, Fascicle III.4 -Recommendation G.711. Pulse Code Modulation (PCM) of Voice Frequencies. Geneva, 1972. 
audio/mpeg  MPEG audio layer 3  required  MPEG-1 Audio layer-3 is an audio compression algorithm and file format defined in ISO 11172-3 and ISO 13818-3. MP3 has an adjustable sampling frequency for highly compressed telephone to CD quality audio. 
audio/k32adpcm  K32ADPCM Audio  indifferent  ADPCM allows compressing audio data. It is defined in the Internet specification RFC 2421 [ftp://ftp.isi.edu/in-notes/rfc2421.txt]. Its implementation base is unclear. 
image/png  PNG Image  required  Portable Network Graphics (PNG) [http://www.cdrom.com/pub/png] is a widely supported lossless image compression standard with open source code available. 
image/gif  GIF Image  indifferent  GIF is a popular format that is universally well supported. However GIF is patent encumbered and should therefore be used with caution. 
image/jpeg  JPEG Image  required  This format is required for high compression of high color photographs. It is a "lossy" compression, but the difference to lossless compression is almost unnoticeable to the human vision. 
application/dicom    recommended  Digital Imaging and Communications in Medicine (DICOM) MIME type defined in RFC3240 [http://ietf.org/rfc/rfc3240.txt].  
image/g3fax  G3Fax Image  recommended  This is recommended only for fax applications. 
image/tiff  TIFF Image  indifferent  Although TIFF (Tag Image File Format) is an international standard it has many interoperability problems in practice. There are too many different versions, and not all software handles them alike. 
video/mpeg  MPEG Video  required  MPEG is an international standard, widely deployed, highly efficient for high color video; open source code exists; highly interoperable. 
video/x-avi  X-AVI Video  deprecated  The AVI file format is just a wrapper for many different codecs; it is a source of many interoperability problems. 
model/vrml  VRML Model  recommended  This is an openly standardized format for 3D models that can be useful for virtual reality applications such as anatomy or biochemical research (visualization of the steric structure of macromolecules) 

The set of required media types is very small so that no undue requirements are forced on HL7 applications, especially legacy systems. In general, no HL7 application is forced to support any given kind of media other than written text. For example, many systems just do not want to receive audio data, because those systems can only show written text to their users. It is a matter of application conformance statements to say: "I will not handle audio". Only if a system claims to handle audio media must it support the required media type for audio.
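
A receiver might look up the status of a media type as in the following Python sketch. The dictionary lists only the entries of Table 6 whose status is not indifferent, since indifferent is the default for everything else. The names MEDIA_TYPE_STATUS and media_type_status are hypothetical, and lower-casing the code before the lookup is an assumption based on the case-insensitive matching of media types in RFC 2045.

```python
# Entries of Table 6 with a status other than "indifferent".
MEDIA_TYPE_STATUS = {
    "text/plain": "required",
    "text/x-hl7-ft": "recommended",
    "text/html": "recommended",
    "application/pdf": "recommended",
    "application/msword": "deprecated",
    "audio/basic": "required",
    "audio/mpeg": "required",
    "image/png": "required",
    "image/jpeg": "required",
    "application/dicom": "recommended",
    "image/g3fax": "recommended",
    "video/mpeg": "required",
    "video/x-avi": "deprecated",
    "model/vrml": "recommended",
}

def media_type_status(media_type):
    """Return the Table 6 status of a media type, defaulting to indifferent."""
    return MEDIA_TYPE_STATUS.get(media_type.lower(), "indifferent")
```

The whole media type is used as a single lookup key, in line with the treatment of the media type as one atomic code symbol.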

2.4.1.2
Charset (charset : CS)

Definition:      For character-based encoding types, this property specifies the character set and character encoding used. The charset shall be identified by an Internet Assigned Numbers Authority (IANA) Charset Registration [http://www.iana.org/assignments/character-sets] in accordance with RFC 2978 [http://www.ietf.org/rfc/rfc2978.txt].

The charset domain is maintained by the Internet Assigned Numbers Authority (IANA) [http://www.iana.org/assignments/character-sets]. The IANA source specifies names and multiple aliases for most character sets. For HL7's purposes, use of multiple alias names is not allowed. The standard name for HL7 is the one marked by IANA as "preferred for MIME." If IANA has not marked one of the aliases as "preferred for MIME" the main name shall be the one used for HL7.

Table 7 lists a few of the IANA defined character sets that are of interest to current HL7 members.

Table 7: Domain Charset:
code name definition
EBCDIC EBCDIC HL7 is indifferent to the use of this Charset.
ISO-10646-UCS-2 ISO-10646-UCS-2 Deprecated for HL7 use.
ISO-10646-UCS-4 ISO-10646-UCS-4 Deprecated for HL7 use.
ISO-8859-1 ISO-8859-1 HL7 is indifferent to the use of this Charset.
ISO-8859-2 ISO-8859-2 HL7 is indifferent to the use of this Charset.
ISO-8859-5 ISO-8859-5 HL7 is indifferent to the use of this Charset.
JIS-2022-JP JIS-2022-JP HL7 is indifferent to the use of this Charset.
US-ASCII US-ASCII Required for HL7 use.
UTF-7 UTF-7 HL7 is indifferent to the use of this Charset.
UTF-8 UTF-8 Required for Unicode support.

NOTE: The above list is not complete, let alone exclusive. In particular, international HL7 affiliates may make special recommendations about charsets to be used in their realm. These recommendations may add additional charsets and may reassign the recommendation status of a listed charset.

The charset property needs to be known where the data of the ED is character type data in any form. If the data is provided in-line, then the charset must be known. If the data is provided as a reference, and the access method does not provide the charset for the data, typically as a MIME header, then the charset must be conveyed as part of the ED.

Interested readers may also want to consult the "Character Model for the World Wide Web" [http://www.w3.org/TR/charmod] for a more complete discussion of character set and related issues.

2.4.1.3
Language (language : CS)

Definition:      For character based information the language property specifies the human language of the text.

The need for a language code for text data values is documented in RFC 2277, IETF Policy on Character Sets and Languages [http://www.ietf.org/rfc/rfc2277.txt]. Further background information can be found in Using International Characters in Internet Mail [http://www.imc.org/mail-i18n.html], a memo by the Internet Mail Consortium.

The principles of the code domain of this attribute are specified by the Internet standard RFC 3066. The RFC 3066 coding scheme is constructed from a primary subtag component encoded using the language codes of ISO 639, plus two codes for extensions for languages not represented in ISO 639. The code optionally includes a second subtag component encoded using the two letter country codes of ISO 3166, or a language code extension registered by the Internet Assigned Numbers Authority [http://www.iana.org/assignments/language-tags].17
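
The subtag structure described above can be checked syntactically with a short Python sketch. It follows the general RFC 3066 grammar (a primary subtag of one to eight letters, followed by optional subtags of one to eight letters or digits, separated by hyphens). It does not verify that the subtags are registered ISO 639 or ISO 3166 codes, and the function name parse_language_tag is hypothetical.

```python
import re

# Syntax only: primary subtag of 1-8 letters, then optional subtags of
# 1-8 letters or digits, each preceded by a hyphen.
_LANGUAGE_TAG = re.compile(r"[A-Za-z]{1,8}(-[A-Za-z0-9]{1,8})*")

def parse_language_tag(tag):
    """Split a language tag into (primary subtag, list of further subtags)."""
    if not _LANGUAGE_TAG.fullmatch(tag):
        raise ValueError(f"not a well-formed language tag: {tag!r}")
    primary, *rest = tag.split("-")
    return primary.lower(), rest
```

For instance, parse_language_tag("en-US") separates the primary subtag "en" from the second subtag "US".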

While language tags usually alter the meaning of the text, the language does not alter the meaning of the characters in the text. 18

NOTE: Representation of language tags to text is highly dependent on the ITS. An ITS may use the native way of language tagging provided by its target implementation technology. Some may have language information in a separate component, e.g., XML has the xml:lang tag for strings. Others may rely on language tags as part of the binary character string representation, e.g., ISO 10646 (Unicode) and its "plane-14" language tags.

The language tag should not be mandatory if it is not mandatory in the implementation technology. Semantically, language tagging of strings follows a default-logic. In circumstances where a realm may support multiple languages, it is up to the realm to define rules for handling text values for which no language is specified. If no other rule is specified, the local language of the reader is assumed. If a language is set for an entire message or document, that language is the default. If any information element or value that is superior in the syntax hierarchy specifies a language, that language is the default for all subordinate text values.
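
The default-logic can be read as a search from the most specific source of language information outward. The following Python sketch is one possible reading; the function name and its parameters are hypothetical, and realm-specific rules are not modelled.

```python
def effective_language(own=None, superiors=(), document_default=None,
                       reader_language=None):
    """Return the first language found, searching from most to least specific.

    own              -- language set on the text value itself, if any
    superiors        -- languages of enclosing elements, nearest first
    document_default -- language set for the entire message or document
    reader_language  -- local language of the reader (last resort)
    """
    if own is not None:
        return own
    for language in superiors:
        if language is not None:
            return language
    if document_default is not None:
        return document_default
    return reader_language
```

A text value without its own language inside an element tagged "fr" would resolve to "fr" in this sketch, even if the document as a whole is tagged "en".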

If language tags are present in the beginning of the encoded binary text (e.g., through Unicode's plane-14 tags) this is the source of the language property of the encapsulated data value.

2.4.1.4
Compression (compression : CS, default NULL)

Definition:      Indicates whether the raw byte data is compressed, and what compression algorithm was used.

Table 8: Domain CompressionAlgorithm:
code name definition
DF deflate The deflate compressed data format as specified in RFC 1951 [ftp://ftp.isi.edu/in-notes/rfc1951.txt].
GZ gzip A compressed data format that is compatible with the widely used GZIP utility as specified in RFC 1952 [ftp://ftp.isi.edu/in-notes/rfc1952.txt] (uses the deflate algorithm).
ZL zlib A compressed data format that also uses the deflate algorithm. Specified as RFC 1950 [ftp://ftp.isi.edu/in-notes/rfc1950.txt]
Z compress Original UNIX compress algorithm and file format using the LZC algorithm (a variant of LZW). Patent encumbered and less efficient than deflate.

Character strings may never be compressed.
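
Three of the four formats in Table 8 are based on the deflate algorithm and can be produced with Python's standard zlib module by choosing the wbits parameter. The sketch below is an illustration under that assumption; the names WBITS, compress and decompress are hypothetical, and the Z (compress) format is not covered because the Python standard library does not provide it.

```python
import zlib

# The wbits value selects the container format around the deflate stream.
WBITS = {
    "DF": -zlib.MAX_WBITS,      # raw deflate stream, RFC 1951
    "ZL": zlib.MAX_WBITS,       # zlib container, RFC 1950
    "GZ": zlib.MAX_WBITS | 16,  # gzip container, RFC 1952
}

def compress(data, code):
    """Compress bytes using the format named by a Table 8 code."""
    compressor = zlib.compressobj(wbits=WBITS[code])
    return compressor.compress(data) + compressor.flush()

def decompress(data, code):
    """Reverse compress() for the same Table 8 code."""
    return zlib.decompress(data, wbits=WBITS[code])
```

The compression code has to travel with the data, because the three formats are not interchangeable even though they share the same underlying algorithm.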

2.4.1.5
Reference (reference : TEL)

Definition:      A telecommunication address (TEL), such as a URL for HTTP or FTP, which will resolve to precisely the same binary data that could as well have been provided as inline data.

The semantic value of an encapsulated data value is the same, regardless of whether the data is present inline or only by reference. However, an encapsulated data value without inline data behaves differently, since any attempt to examine the data requires the data to be downloaded from the reference. An encapsulated data value may have both inline data and a reference.

The reference must point to the same data as provided inline. It is an error if the data resolved through the reference does not match either the integrity check, in-line data, or data that had earlier been retrieved through the reference and then cached.

The reference may contain a usablePeriod to indicate that the data may only be available for a limited period of time. Whether the reference is limited by a usablePeriod or not, the content of the reference is fixed for all time. Any application using the reference must always receive the same data. The reference cannot be reused to send a different version of the same data, or different data.

By-reference encapsulated data may not be allowed depending on the attribute or component that is declared encapsulated data. Character strings must always be inline.

2.4.1.6
Integrity Check (integrityCheck : BIN)

Definition:      The integrity check is a short binary value representing a cryptographically strong checksum that is calculated over the binary data. The purpose of this property, when communicated with a reference, is for anyone to validate later whether the reference still resolves to the same data that it resolved to when the encapsulated data value with reference was created.

It is an error if the data resolved through the reference does not match the integrity check.

The integrity check is calculated according to the ED.integrityCheckAlgorithm. By default, the Secure Hash Algorithm-1 (SHA-1) shall be used. The integrity check is binary encoded according to the rules of the integrity check algorithm.

The integrity check is calculated over the raw binary data that is contained in the data component, or that is accessible through the reference. No transformations are made before the integrity check is calculated. If the data is compressed, the Integrity Check is calculated over the compressed data.
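
A minimal Python sketch of this calculation, using the standard hashlib module, is shown below. The names ALGORITHMS, integrity_check and matches are hypothetical. The input is expected to be the raw bytes exactly as stored or retrieved, so for compressed data it is the compressed bytes.

```python
import hashlib

# Maps integrityCheckAlgorithm codes to hashlib constructors.
ALGORITHMS = {"SHA-1": hashlib.sha1, "SHA-256": hashlib.sha256}

def integrity_check(raw, algorithm="SHA-1"):
    """Return the binary digest of the raw (untransformed) data."""
    return ALGORITHMS[algorithm](raw).digest()

def matches(raw, expected, algorithm="SHA-1"):
    """True if data retrieved through a reference still has the expected digest."""
    return integrity_check(raw, algorithm) == expected
```

A receiver that later dereferences the data could call matches() with the stored integrityCheck value and treat a False result as an error.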

2.4.1.7
Integrity Check Algorithm (integrityCheckAlgorithm : CS, default SHA-1)

Definition:      Specifies the algorithm used to compute the integrityCheck value.19

Table 9: Domain IntegrityCheckAlgorithm:
code name definition
SHA-1 secure hash algorithm - 1 This algorithm is defined in FIPS PUB 180-1: Secure Hash Standard. As of April 17, 1995.
SHA-256 secure hash algorithm - 256 This algorithm is defined in FIPS PUB 180-2: Secure Hash Standard.

2.4.1.8
Thumbnail (thumbnail : ED, default NULL)

Definition:      An abbreviated rendition of the full data. A thumbnail requires significantly fewer resources than the full data, while still maintaining some distinctive similarity with the full data. A thumbnail is typically used with by-reference encapsulated data. It allows a user to select data more efficiently before actually downloading through the reference.

Originally, the term thumbnail refers to an image in a lower resolution (or smaller size) than another image. However, the thumbnail concept can be metaphorically used for media types other than images. For example, a movie may be represented by a shorter clip; an audio-clip may be represented by another audio-clip that is shorter, has a lower sampling rate, or a lossy compression.

Thumbnails may not be allowed depending on the attribute or component that is declared encapsulated data. Values of type ST never have thumbnails, and a thumbnail may not itself contain a thumbnail.

Definition 42:
invariant(ED x) where x.thumbnail.nonNull {
  x.thumbnail.thumbnail.isNull;
};
          
NOTE: The ITS should consider the case where the thumbnail and the original both have the same properties of type, charset and compression. In this case, these properties need not be represented explicitly for the thumbnail but might be "inherited" from the main encapsulated data value to its thumbnail.

2.4.1.9
Equality (equal : BL, inherited from ANY)

Two values of type ED are equal if and only if their mediaType and data are equal. For those ED values with compressed data or referenced data, only the de-referenced and uncompressed data counts for the equality test. The compression, thumbnail and reference properties themselves are excluded from the equality test. In addition, the language property is excluded from the test, because of the problems it would introduce for values of type ED where the language is not specified. If the ED.mediaType is character based and the charset properties are not equal, the difference must be resolved by mapping the data between the different character sets.

The integrity check algorithm and integrity check are excluded from the equality test. However, since equality of the integrity check value is a strong indication of equality of the data, the equality test can be practically based on the integrity check, given equal integrity check algorithm properties.
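
For character-based data, the charset mapping mentioned above can be sketched in Python by decoding both values to characters before comparing. The class TextED and the function ed_equal are hypothetical and model only inline, uncompressed, character-based data; the charset names are assumed to be ones that Python's codec registry recognizes.

```python
from dataclasses import dataclass

@dataclass
class TextED:
    """A simplified stand-in for a character-based ED with inline data."""
    media_type: str
    charset: str
    data: bytes

def ed_equal(a, b):
    """Compare media type and decoded characters; charset itself is not compared."""
    if a.media_type != b.media_type:
        return False
    return a.data.decode(a.charset) == b.data.decode(b.charset)
```

In this sketch, the ISO-8859-1 bytes b"caf\xe9" and the UTF-8 bytes b"caf\xc3\xa9" compare as equal, because both decode to the same four characters.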

2.5

Character String (ST) specializes ED

Definition:      The character string data type stands for text data, primarily intended for machine processing (e.g., sorting, querying, indexing, etc.) Used for names, symbols, and formal expressions.

The character string is a restricted encapsulated data type (ED), whose mediaType property is fixed to text/plain, and whose data must be inlined and not compressed. Thus, the properties compression, reference, integrity check, integrity check algorithm, and thumbnail are not applicable. The character string data type is used when the appearance of text does not bear meaning, which is true for formalized text and all kinds of names.

Table 10: Property Summary of Character String
Name Type Description
mediaType CS Identifies the type of the encapsulated data and identifies a method to interpret or render the data.
charset CS For character-based encoding types, this property specifies the character set and character encoding used. The charset shall be identified by an Internet Assigned Numbers Authority (IANA) Charset Registration in accordance with RFC 2978.

The character string (ST) data type interprets the encapsulated data as character data (as opposed to bits), depending on the charset property of the encapsulated data type.

Definition 43:
type CharacterString alias ST specializes ED {
    INT   length;
    ST    headCharacter;
    ST    tailString;
};
    
NOTE: Because many of the properties of the encapsulated data are bound to a default value, an ITS need not represent these properties at all. In fact, if the character encoding is also fixed, the ITS only represents the encoded character data.

The headCharacter and tailString properties define ST as a sequence of entities each of which uniquely identifies one character from the joint set of all characters known by any language of the world. 20 The length of a character string is the number of characters in the string.

The head of a string is a string of only one character. A character string must at least have one character or else it is NULL. A zero-length string is an exceptional value (NULL), not a proper character string value.

Definition 44:
invariant(ST x) where x.nonNull {
  x.headCharacter.notEmpty;
  x.headCharacter.length.equal(1);
  x.headCharacter.tailString.isEmpty;
  x.tailString.isEmpty.implies(x.length.equal(1));
  x.tailString.notEmpty.implies(x.length
           .equal(x.tailString.length.successor));
};
    

The length of a string is the number of characters, not the number of encoded bytes. Byte encoding is an ITS issue and is not relevant on the application layer.
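The head/tail definition and the distinction between character count and byte count can be sketched in Python as follows. The function names are hypothetical, None stands in for NULL, and Python code points are assumed to be an adequate approximation of "characters"; the recursion mirrors Definition 44 and is not meant for long strings.

```python
from typing import Optional


def head_character(s: str) -> str:
    """First character as a one-character string (headCharacter)."""
    if not s:
        raise ValueError("a zero-length string is NULL, not a proper ST value")
    return s[0]


def tail_string(s: str) -> Optional[str]:
    """Everything after the first character; None stands in for NULL."""
    rest = s[1:]
    return rest if rest else None


def st_length(s: str) -> int:
    """1 if the tail is empty, otherwise the successor of the tail's length."""
    head_character(s)  # rejects the empty string
    tail = tail_string(s)
    return 1 if tail is None else 1 + st_length(tail)


text = "Grüße"
character_count = st_length(text)          # counts characters
byte_count = len(text.encode("utf-8"))     # counts encoded bytes (an ITS concern)
```

For the sample value, character_count is 5 while byte_count is 7, because two of the characters occupy two bytes each in UTF-8.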

The following rules apply to whitespace contained within character strings:

  • TAB, space and end-of-line are all considered whitespace characters.

  • Both leading and trailing whitespace are significant.

  • Different whitespace characters are not interchangeable.

  • Different representations of end-of-line are normalised according to the method described in the XML specification [http://www.w3.org/TR/2000/REC-xml-20001006#sec-line-ends].

  • Sequences of whitespace cannot be compressed to shorter sequences.
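The whitespace rules above can be sketched in Python. The function name normalize_line_ends is hypothetical; the translation itself follows the cited XML rule, under which the two-character sequence CR LF and any CR not followed by LF are each replaced by a single LF. No other whitespace is touched.

```python
def normalize_line_ends(text: str) -> str:
    """Translate CR LF pairs and lone CR characters to a single LF."""
    return text.replace("\r\n", "\n").replace("\r", "\n")


# End-of-line variants become one representation ...
windows_style = normalize_line_ends("line 1\r\nline 2")
old_mac_style = normalize_line_ends("line 1\rline 2")

# ... but leading, trailing and repeated whitespace stays exactly as sent,
# and a TAB is never treated as equivalent to a space.
padded = normalize_line_ends("  a \t  b  ")
```

Note that the function deliberately performs no trimming and no collapsing of whitespace runs, since both would change the value of the string.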

Requirement:
ST is a specialization of ED so that any RIM attribute which has the type ED can be constrained to an ST. The most important case is Act.text, which is an ED to cater for the use of references and multimedia data, but is often constrained to plain text.

2.5.1

Properties of Character String (ST)

2.5.1.1
Media Type (mediaType : CS, default text/plain, inherited from ED)
Definition 45:
invariant(ST x) where x.nonNull {
  x.mediaType.equal("text/plain");
};
       

Fixed to be "text/plain".

2.5.1.2
Charset (charset : CS, inherited from ED)
Definition 46:
invariant(ST x) where x.nonNull {
  x.charset.nonNull;
};
       

Values of type ST must have a known charset.

2.5.1.3
Compression (compression : CS, default NULL, fixed)
Definition 47:
invariant(ST x) where x.nonNull {
  x.compression.notApplicable;
};
       

Values of type ST cannot be compressed.

2.5.1.4
Reference (reference : TEL, fixed)
Definition 48:
invariant(ST x) where x.nonNull {
  x.reference.notApplicable;
};
       

Values of type ST may not reference content from some other location.

2.5.1.5
Integrity Check (integrityCheck : BIN, fixed)
Definition 49:
invariant(ST x) where x.nonNull {
  x.integrityCheck.notApplicable;
};
       

Integrity check code is not used with values of type ST.

2.5.1.6
Integrity Check Algorithm (integrityCheckAlgorithm : CS, default SHA-1, fixed)
Definition 50:
invariant(ST x) where x.nonNull {
  x.integrityCheckAlgorithm.notApplicable;
};
       

Integrity check code is not used with values of type ST.

2.5.1.7
Thumbnail (thumbnail : ED, default NULL, fixed)
Definition 51:
invariant(ST x) where x.nonNull {
  x.thumbnail.notApplicable;
};
       

Values of type ST do not have thumbnails.

2.5.1.8
Literal Form

Two variations of character string literals are defined, a token form and a quoted string.21 The token form consists only of the lower case and upper case Latin alphabet, the ten decimal digits and the underscore. The quoted string can contain any character between double-quotes. The double quotes prevent a character string from being interpreted as some other literal. The token form allows keywords and names to be parsed from the data type specification language.

Definition 52:
ST.literal ST {
  ST : /"[^]+"/ { $.equal($1); }         /* quoted string */
     | /[a-zA-Z0-9_]+/ { $.equal($1); }; /* token form */
};
      
NOTE: Since character string literals are so fundamental to implementation technology, most ITS will specify some modified character string literal form. However, ITS designers must be aware of the interaction between the character string literal form and the literal forms defined for other data types. This is particularly critical if the other data type's literal form is structured with major components separated by break-characters (e.g., real number, physical quantity, set, and list literals, etc.)
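As an illustration of the two literal forms, the following Python sketch recognises a token and a quoted string. The function name parse_st_literal is hypothetical, and the quoted-string pattern is an assumption: it simply takes everything between the first and the last double quote and models no escape mechanism.

```python
import re

TOKEN = re.compile(r"[a-zA-Z0-9_]+")       # token form
QUOTED = re.compile(r'"(.+)"', re.DOTALL)  # quoted string (assumed pattern)


def parse_st_literal(literal: str) -> str:
    """Return the character string value denoted by an ST literal."""
    match = QUOTED.fullmatch(literal)
    if match:
        return match.group(1)
    if TOKEN.fullmatch(literal):
        return literal
    raise ValueError("not a valid ST literal: %r" % literal)


token_value = parse_st_literal("FOOT")
quoted_value = parse_st_literal('"left foot"')
```

An unquoted literal containing a space, or an empty pair of quotes, is rejected in this sketch; the latter is consistent with a zero-length string being NULL rather than a proper ST value.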

The Concept Descriptor information model.

2.6

Concept Descriptor (CD) specializes ANY

Definition:      A concept descriptor represents any kind of concept usually by giving a code defined in a code system. A concept descriptor can contain the original text or phrase that served as the basis of the coding and one or more translations into different coding systems. A concept descriptor can also contain qualifiers to describe, e.g., the concept of a "left foot" as a postcoordinated term built from the primary code "FOOT" and the qualifier "LEFT". In cases of an exceptional value, the concept descriptor need not contain a code but only the original text describing that concept.

Table 11: Property Summary of Concept Descriptor
Name Type Description
code ST The plain code symbol defined by the code system. For example, "784.0" is the code symbol of the ICD-9 code "784.0" for headache.
codeSystem UID Specifies the code system that defines the code.
codeSystemName ST The common name of the coding system.
codeSystemVersion ST If applicable, a version descriptor defined specifically for the given code system.
displayName ST A name or title for the code, under which the sending system shows the code value to its users.
originalText ED The text or phrase used as the basis for the coding.
translation SET<CD> A set of other concept descriptors that translate this concept descriptor into other code systems.
qualifier LIST<CR> Specifies additional codes that increase the specificity of the primary code.
Definition 53:
type ConceptDescriptor alias CD specializes ANY {

            ST    code;
            ST    displayName;
            UID   codeSystem;
            ST    codeSystemName;
            ST    codeSystemVersion;
            ED    originalText;
            LIST<CR>  qualifier;
            SET<CD>   translation;
            CS	  codingRationale;
            BL  equal(ANY x);
            BL  implies(CD x);
  demotion  ED;
};
    

The concept descriptor is mostly used in one of its restricted or “profiled” forms, CS, CE, CV.

2.6.1

Properties of Concept Descriptor (CD)

2.6.1.1
Code (code : ST, default NULL)

Definition:      The plain code symbol defined by the code system. For example, "784.0" is the code symbol of the ICD-9 code "784.0" for headache.

A non-exceptional CD value has a non-NULL code property whose value is a character string that is a symbol defined by the coding system identified by the codeSystem property. Conversely, a CD value without a value for the code property, or with a value that is not from the cited coding system is an exceptional value (NULL of flavor other).

Definition 54:
invariant(CD x) where x.nonNull {
  x.code.nonNull;
};
      
2.6.1.2
Code System (codeSystem : UID)

Definition:      Specifies the code system that defines the code.

Code systems shall be referred to by Unique Identifier (UID). The UID allows unambiguous reference to standard HL7 codes, other standard code systems, as well as local codes. HL7 shall assign a UID to each of its code tables as well as to external standard coding systems that are being used with HL7. Local sites must use their ISO Object Identifier (OID) to construct a globally unique local coding system identifier.

Under HL7's branch, 2.16.840.1.113883, the sub-branches 5 and 6 contain HL7 standard and external code system identifiers respectively. The HL7 Vocabulary Technical Committee maintains these two branches.

A non-exceptional CD value (i.e. a CD value that has a non-null code property) has a non-NULL code system specifying the system of concepts that defines the code. In other words whenever there is a code there is also a code system.

NOTE: Although every non-NULL CD value has a defined code system, in some circumstances the ITS representation for the CD value need not explicitly mention the code system. For example, when the context mandates one and only one code system to be used, specifying the code system explicitly would be redundant. However, in that case the code system property assumes that context-specific default value and is not NULL.
Definition 55:
invariant(CD x) where x.code.nonNull {
  x.codeSystem.nonNull;
};
      

An exceptional CD of NULL-flavor "other" indicates that a concept could not be coded in the coding system specified. Thus, for these coding exceptions, the code system that did not contain the appropriate concept must be provided in the code system property.

Some code domains are qualified such that they include the portion of any pertinent local coding system that does not simply paraphrase the standard coding system (coded with extensibility, CWE.) If a CWE qualified field actually contains such a local code, the coding system must specify the local coding system from which the local code was taken. However, for CWE domains the local code is a valid member of the domain, so that local codes in CWE domains constitute neither an error nor an exceptional (NULL/other) value in the sense of this specification.

Definition 56:
invariant(CD x) where x.other {
  x.code.other;
  x.codeSystem.nonNull;
};
      
2.6.1.3
Code System Name (codeSystemName : ST, default NULL)

Definition:      The common name of the coding system.

The code system name has no computational value. The purpose of a code system name is to assist an unaided human interpreter of a code value to interpret the code system UID. It is suggested — though not absolutely required — that ITS provide for code system name fields in order to annotate the UID for human comprehension.

HL7 systems must not functionally rely on the code system name. The code system name can never modify the meaning of the code system UID value and cannot exist without the UID value.

Definition 57:
invariant(CD x) {
  x.codeSystemName.nonNull.implies(x.codeSystem.nonNull);
};
      
2.6.1.4
Code System Version (codeSystemVersion : ST, default NULL)

Definition:      If applicable, a version descriptor defined specifically for the given code system.

HL7 shall specify how these version strings are formed for each external code system. If HL7 has not specified how version strings are formed for a particular coding system, version designations have no defined meaning for such coding system.

Different versions of one code system must be compatible. Whenever a code system changes in an incompatible way, it will constitute a new code system, not simply a different version, regardless of what the vocabulary publisher calls it.

For example, the publisher of ICD-9 and ICD-10 calls these code systems, "revision 9" and "revision 10" respectively. However, ICD-10 is a complete redesign of the ICD code, not a backward compatible version. Therefore, for the purpose of this data type specification, ICD-9 and ICD-10 are different code systems, not just different versions. By contrast, when LOINC updates from revision "1.0j" to "1.0k", HL7 would consider this to be just another version of LOINC, since LOINC revisions are backwards compatible.

Definition 58:
invariant(CD x) {
  x.codeSystemVersion.nonNull.implies(x.codeSystem.nonNull);
};
      
2.6.1.5
Display Name (displayName : ST, default NULL)

Definition:      A name or title for the code, under which the sending system shows the code value to its users.

The display name is included both as a courtesy to an unaided human interpreter of a code value and as a documentation of the name used to display the concept to the user. The display name has no functional meaning; it can never exist without a code; and it can never modify the meaning of the code.

NOTE: HL7 offers a "print name" in its predefined vocabulary domains. These values are suitable for use in the displayName.
NOTE: Display names may not alter the meaning of the code value. Therefore, display names should not be presented to the user on a receiving application system without ascertaining that the display name adequately represents the concept referred to by the code value. Communication must not simply rely on the display name. The display name's main purpose is to support debugging of HL7 protocol data units (e.g., messages.)
Definition 59:
invariant(CD x) {
  x.displayName.nonNull.implies(x.code.nonNull);
};
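The dependency rules of Definitions 55, 57, 58 and 59 can be checked mechanically. The following Python sketch is one possible reading; the class, the function invariant_violations, and the placeholder code system OID are assumptions for illustration, and None stands in for NULL.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical placeholder; not a registered code system identifier.
EXAMPLE_CODE_SYSTEM = "2.16.840.1.113883.6.999"


@dataclass
class ConceptDescriptor:
    code: Optional[str] = None
    code_system: Optional[str] = None
    code_system_name: Optional[str] = None
    code_system_version: Optional[str] = None
    display_name: Optional[str] = None


def invariant_violations(cd: ConceptDescriptor) -> List[str]:
    """List the dependency rules that the given value breaks."""
    problems = []
    if cd.code is not None and cd.code_system is None:
        problems.append("code requires codeSystem (Definition 55)")
    if cd.code_system_name is not None and cd.code_system is None:
        problems.append("codeSystemName requires codeSystem (Definition 57)")
    if cd.code_system_version is not None and cd.code_system is None:
        problems.append("codeSystemVersion requires codeSystem (Definition 58)")
    if cd.display_name is not None and cd.code is None:
        problems.append("displayName requires code (Definition 59)")
    return problems


headache = ConceptDescriptor(code="784.0", code_system=EXAMPLE_CODE_SYSTEM,
                             display_name="Headache")
orphaned = ConceptDescriptor(code_system_name="ICD-9", display_name="Headache")
```

Here headache satisfies all four rules, while orphaned breaks two of them: it carries a code system name without a code system UID and a display name without a code.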
      
2.6.1.6
Original Text (originalText : ED, default NULL)

Definition:      The text or phrase used as the basis for the coding.

The original text exists in a scenario where an originator of the information does not assign a code, but where the code is assigned later by a coder (post-coding.) In the production of a concept descriptor, original text may thus exist without a code.

NOTE: Although post-coding is often performed from free text information, such as documents, scanned images or dictation, multi-media data is explicitly not permitted as original text. Also, the original text property is not meant to be a link into the entire source document. The link between different artifacts of medical information (e.g., document and coded result) is outside the scope of this specification and is maintained elsewhere in the HL7 standards. The original text is an excerpt of the relevant information in the original sources, rather than a pointer or exact reproduction. Thus the original text is to be represented in plain text form.

Values of type CD may have a non-NULL original text property despite having a NULL code property. Any CD value with the code property of NULL signifies a coding exception. In this case, the originalText property is a name or description of the concept that was not coded. Such an exceptional CD may contain translations, which directly encode the concept described in the original text property.

A concept descriptor can be demoted into a character string (ST) value representing only the original text of the CD value.

Definition 60:
invariant(CD x) where x.originalText.nonNull {
  ((ST)x).equal(x.originalText);
};
      
2.6.1.7
Translation (translation : SET<CD>, default NULL)

Definition:      A set of other concept descriptors that translate this concept descriptor into other code systems.

The translation property is a set of other concept descriptors that each translate the first concept descriptor into different code systems. Each element of the translation set was translated from the first concept descriptor. Each translation may, however, also contain translations. Thus, when a code is translated multiple times the information about which code served as the input to which translation will be preserved.

NOTE: the translations are quasi-synonyms of one real-world concept. Every translation in the set is supposed to express the same meaning "in other words." However, exact synonymy rarely exists between two structurally different coding systems. For this reason, not all of the translations will be equally exact.
2.6.1.8
Qualifier (qualifier : LIST<CR>, default NULL)

Definition:      Specifies additional codes that increase the specificity of the primary code.

The primary code and all the qualifiers together make up one concept. A concept descriptor with qualifiers is also called a code phrase or postcoordinated expression.

Qualifiers constrain the meaning of the primary code, but cannot negate it or change its meaning to that of another value in the primary coding system.

Qualifiers can only be used according to well-defined rules of post-coordination. A value of type CD may only have qualifiers if its code system defines the use of such qualifiers or if there is a third code system that specifies how other code systems may be combined.

For example, SNOMED CT allows constructing concepts as a combination of multiple codes. SNOMED CT defines a concept "cellulitis (disorder)" (128045006), an attribute "finding site" (363698007), and another concept "foot structure (body structure)" (56459004). SNOMED CT allows one to combine these codes in a code phrase:

Example 1:

<observation>
...

    <value code="128045006" codeSystem="&amp;SNOMED-CT;" displayName="cellulitis (disorder)">
        <qualifier code="56459004" displayName="foot structure">
            <name code="363698007" displayName="finding site" />
        </qualifier>
    </value>
...

</observation>

In this example, there is one code system, SNOMED CT, that defines both the primary code and the qualifiers and how these are used, which is why in our example representation the codeSystem does not need to be mentioned for the qualifier name and value (the codeSystem is inherited from the primary code).

It is important to note that the allowable qualifiers are specified by the code system. For instance, in SNOMED CT, there is a defined set of qualifying attributes, and only Findings and Disorders can be qualified with the "finding site" attribute. Use of qualifiers outside the boundaries specified by the code system is a non-conformant use of the CD data type. Adherence to the rules specified by the code system enables post-coordinated expressions to be compared with pre-coordinated concepts (such as where one might compare the above code phrase to the pre-coordinated concept "cellulitis of foot (disorder)" (128276007), which is defined within SNOMED CT as having a finding site of foot structure). The CD datatype does not provide for normalization of compositional expressions; it is therefore possible to create ambiguous expressions. Users who plan to create compositional expressions using the CD datatype should understand that they must provide the additional constraints necessary to assure unambiguous data representation. Otherwise, they risk being unable to retrieve a complete set of all records corresponding to any given query.

Another common example is the U.S. Centers for Medicare and Medicaid Services (CMS) (previously known as the Health Care Financing Administration, HCFA) procedure codes. CMS procedure codes (HCPCS) are based on CPT-4 and add additional qualifiers to it. For example, the patient with above finding (plus peripheral arterial disease, diabetes mellitus, and a chronic skin lesion at the left great toe) may have an amputation of that toe. The CPT-4 concept is "Amputation, toe metatarsophalangeal joint" (28820) and a HCPCS qualifier needs to be added to indicate "left foot, great toe" (TA). Thus we code:

Example 2:

<procedure>
...

    <cd code="28820" codeSystem="&amp;CP4;" displayName="Amputation, toe metatarsophalangeal joint">
        <qualifier code="TA" codeSystem="&amp;HCP;" displayName="left foot, great toe" />
    </cd>
...

</procedure>

In this example, the code system of the qualifier (HCPCS) is different from the code system of the primary code (CPT-4). It is only because there are well-defined rules that define how these codes can be combined that the qualifier may be used. Note also that the role name is optional, and for HCPCS codes there are no distinguished role names.

The order of qualifiers is preserved, particularly for the case where the coding system allows post-coordination but defines no role names (e.g., some ICD-9CM codes, or the old SNOMED "multiaxial" coding).

2.6.1.9
Equality (equal : BL, inherited from ANY)

The main use of concept descriptors is for the purpose of indexing, querying and decision-making based on a coded value. A semantically unambiguous specification of coded values therefore requires a clear definition of what equality of concept descriptor values means and how CD values should be compared. (For more details on comparing pre- and post-coordinated expressions, see Dolin RH, Spackman KA, Markwell D. Selective Retrieval of Pre- and Post-coordinated SNOMED Concepts. Fall AMIA 2002; 210-14, or the July 2003 SNOMED CT Implementation Guide.)

The equality of two concept descriptor values is determined solely based upon the code and coding system. The code system version is excluded from the equality test.22 If qualifiers are present, the qualifiers are included in the equality test. Translations are not included in the equality test.23 Exceptional concept descriptor values are not equal even if they have the same NULL-flavor or the same original text.24

Definition 61:
invariant(CD x, y) where x.nonNull.and(y.nonNull) {
  x.equal(y).equal(x.code.equal(y.code)
                .and(x.codeSystem.equal(y.codeSystem))
                .and(x.qualifier.equal(y.qualifier)));
};
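The equality rule can be sketched in Python as follows. The class Coded and the function cd_equal are hypothetical; qualifiers are simplified to an ordered tuple of (role name code, value code) pairs, which is an assumption, since equality of CR values is not spelled out here. None stands in for NULL.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

SNOMED_CT = "2.16.840.1.113883.6.96"  # code system UID used for the sample values


@dataclass(frozen=True)
class Coded:
    code: Optional[str]
    code_system: Optional[str]
    code_system_version: Optional[str] = None                # excluded
    qualifiers: Tuple[Tuple[Optional[str], str], ...] = ()   # included, in order
    translations: Tuple["Coded", ...] = ()                   # excluded
    original_text: Optional[str] = None                      # excluded


def cd_equal(x: Coded, y: Coded) -> bool:
    """Compare code, code system and qualifiers only."""
    if x.code is None or y.code is None:
        return False  # exceptional values are never equal
    return (x.code == y.code
            and x.code_system == y.code_system
            and x.qualifiers == y.qualifiers)


cellulitis = Coded("128045006", SNOMED_CT, code_system_version="v1")
cellulitis_newer = Coded("128045006", SNOMED_CT, code_system_version="v2")
cellulitis_of_foot = Coded("128045006", SNOMED_CT,
                           qualifiers=(("363698007", "56459004"),))
uncoded_a = Coded(None, SNOMED_CT, original_text="sore foot")
uncoded_b = Coded(None, SNOMED_CT, original_text="sore foot")
```

In this sketch cellulitis and cellulitis_newer are equal despite the differing version strings, cellulitis_of_foot differs from both because of its qualifier, and the two uncoded values are not equal even though their original text matches.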
      

Some code systems define certain style options to their code values. For example, the U.S. National Drug Code (NDC) has a dash and a non-dash form. An example for the dash form may be 1234-5678-90 when the non-dash form is 01234567890. Another example for this problem is when certain ISO or ANSI code tables define optional alphanumeric and numeric forms of two or three character lengths all in one standard.

In the case where code systems provide for multiple representations, HL7 shall make a ruling about which is the preferred form. HL7 shall document that ruling where that respective external coding system is recognized. HL7 shall decide upon the preferred form based on criteria of practicality and common use. In absence of clear criteria of practicality and common use, the safest, most extensible, and least stylized (the least decorated) form shall be given preference.25

2.6.1.10
Implies (implies : BL)

Definition:      Specifies whether this concept descriptor is a specialization of the operand concept descriptor.

Naturally, concepts can be narrowed and widened to include or exclude other concepts. Many coding systems have an explicit notion of concept specialization and generalization. The HL7 vocabulary principles also provide for concept specialization for HL7 defined value sets. The implies-property is a predicate that compares whether one concept is a specialization of another concept, and therefore implies that other concept.

When writing predicates (e.g., conditional statements) that compare two codes, one should usually test for implication not equality of codes.

For example, in Table 20 the "telecommunication use" concepts work (W), home (H), primary home (HP), and vacation home (HV) are defined, where both HP and HV imply H. When selecting any home phone number, one should test whether the given use-code c implies H. Testing for c equal H would only find unspecified home phone numbers, but not the primary home phone number.

Operationally, implication can be evaluated in one of two ways. The code system literals may be designed such that one single hierarchy is reflected in the code literal itself (e.g., ICD-9.) Apart from such special cases, however, a terminological knowledge base and an appropriate subsumption algorithm will be required to evaluate implication statements. For post-coordinated coding systems, designing such a subsumption algorithm is a non-trivial task.26
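For a small, single-hierarchy value set, implication can be evaluated by walking from a code to its ancestors. The Python sketch below uses the telecommunication use codes mentioned above; the PARENT table, the function implies, and the sample phone numbers are hypothetical stand-ins for a terminological knowledge base.

```python
from typing import Dict, List, Optional, Tuple

# child -> parent; codes without an entry (H, W) are roots of the hierarchy
PARENT: Dict[str, str] = {"HP": "H", "HV": "H"}


def implies(code: str, other: str) -> bool:
    """True if code equals other or is a specialization of it."""
    current: Optional[str] = code
    while current is not None:
        if current == other:
            return True
        current = PARENT.get(current)
    return False


phones: List[Tuple[str, str]] = [
    ("W", "tel:+1-555-0100"),
    ("H", "tel:+1-555-0101"),
    ("HP", "tel:+1-555-0102"),
    ("HV", "tel:+1-555-0103"),
]

home_by_implication = [number for use, number in phones if implies(use, "H")]
home_by_equality = [number for use, number in phones if use == "H"]
```

Selecting by implication yields the three home numbers, whereas selecting by equality yields only the unspecified home number, which is the pitfall described above.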

Use of the full concept descriptor data type is not common. It requires a conscious decision and documented rationale. In all other cases, one of the CD restrictions shall be used.27

All CD restrictions constrain certain properties of the CD. Properties may be constrained to the extent that only one value may be allowed for that property, in which case mentioning the property becomes redundant. Constraining a property to one value is referred to as suppressing that property. Although, conceptually a suppressed property is still semantically applicable, it is safe for an HL7 interface to assume the implicit default value without testing.

NOTE: In general, this is true of many types in this data types specification; however, it is a frequently asked question concerning the CD descendants.

2.7

Concept Role (CR) specializes ANY

Definition:      A concept qualifier code with an optionally named role. Both qualifier role and value codes must be defined by the coding system of the CD containing the concept qualifier. For example, if SNOMED RT defines a concept "leg", a role relation "has-laterality", and another concept "left", the concept role relation allows adding the qualifier "has-laterality: left" to a primary code "leg" to construct the meaning "left leg".

Table 12: Property Summary of Concept Role
Name Type Description
name CV Specifies the manner in which the concept role value contributes to the meaning of a code phrase. For example, if SNOMED RT defines a concept "leg", a role relation "has-laterality", and another concept "left", the concept role relation allows adding the qualifier "has-laterality: left" to a primary code "leg" to construct the meaning "left leg". In this example "has-laterality" is the CR.name.
value CD The concept that modifies the primary code of a code phrase through the role relation. For example, if SNOMED RT defines a concept "leg", a role relation "has-laterality", and another concept "left", the concept role relation allows adding the qualifier "has-laterality: left" to a primary code "leg" to construct the meaning "left leg". In this example "left" is the CR.value.
inverted BN Indicates if the sense of the role name is inverted. This can be used in cases where the underlying code system defines inversion but does not provide reciprocal pairs of role names. By default, inverted is false.

The use of qualifiers is strictly governed by the code system used. The CD data type does not permit using code qualifiers with code systems that do not provide for qualifiers (e.g. pre-coordinated systems, such as LOINC, ICD-10 PCS.)

Definition 62:
protected type ConceptRole alias CR specializes ANY {
  CV  name;
  BN  inverted;
  CD  value;
};
    

2.7.1

Name (name : CV, default NULL)

Definition:      Specifies the manner in which the concept role value contributes to the meaning of a code phrase. For example, if SNOMED RT defines a concept "leg", a role relation "has-laterality", and another concept "left", the concept role relation allows adding the qualifier "has-laterality: left" to a primary code "leg" to construct the meaning "left leg". In this example "has-laterality" is the CR.name.

If the coding system of the CD containing the CR allows postcoordination but no role names (e.g. SNOMED) the name attribute can be NULL.

2.7.2

Value (value : CD, default NULL)

Definition:      The concept that modifies the primary code of a code phrase through the role relation. For example, if SNOMED RT defines a concept "leg", a role relation "has-laterality", and another concept "left", the concept role relation allows adding the qualifier "has-laterality: left" to a primary code "leg" to construct the meaning "left leg". In this example "left" is the CR.value.

This property is of type concept descriptor and thus can in turn have qualifiers. This allows qualifiers to nest. Qualifiers can only be used as far as the underlying code system defines them. It is not allowed to use any kind of qualifiers for code systems that do not explicitly allow and regulate such use of qualifiers.

Definition 63:
invariant(CR x) where x.nonNull {
  x.value.nonNull;
};
      

2.7.3

Inversion Indicator (inverted : BN, default false)

Definition:      Indicates if the sense of the role name is inverted. This can be used in cases where the underlying code system defines inversion but does not provide reciprocal pairs of role names. By default, inverted is false.

For example, a code system may define the role relation "causes" besides the concepts "Streptococcus pneumoniae" and "Pneumonia". If that code system allows its roles to be inverted, one can construct the post-coordinated concept "Pneumococcus pneumonia" through "Pneumonia - causes, inverted - Streptococcus pneumoniae."

Roles may only be inverted if the underlying coding system allows such inversion. Notably, if a coding system defines roles in inverse pairs or intentionally does not define certain inversions, the appropriate role code (e.g. "caused-by") must be used rather than inversion. It must be known whether the inverted property is true or false, since if it is NULL, the role cannot be interpreted.

NOTE: the property "inverted" should be conveyed in an indicator attribute, whose default value is false. That way the inverted indicator does not have to be sent when the role is not inverted.
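The concept role structure, including nesting and the inversion indicator, can be sketched in Python. The classes Concept and ConceptRole, the function describe, and the sample codes are hypothetical; they do not come from any real code system.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Concept:
    code: str
    display_name: str
    qualifiers: List["ConceptRole"] = field(default_factory=list)


@dataclass
class ConceptRole:
    value: Concept                  # must be non-NULL (Definition 63)
    name: Optional[Concept] = None  # may be NULL if there are no role names
    inverted: bool = False          # default false, so it can be omitted

    def __post_init__(self):
        if self.value is None:
            raise ValueError("CR.value must be non-NULL")
        if self.inverted is None:
            raise ValueError("inverted must be known to be true or false")


def describe(concept: Concept) -> str:
    """Render a code phrase as text; qualifier values may nest."""
    parts = [concept.display_name]
    for role in concept.qualifiers:
        label = role.name.display_name if role.name else "(unnamed role)"
        if role.inverted:
            label += ", inverted"
        parts.append(label + " - " + describe(role.value))
    return " - ".join(parts)


causes = Concept("CAUSES", "causes")
strep = Concept("STREP_PNEUMONIAE", "Streptococcus pneumoniae")
pneumonia = Concept("PNEUMONIA", "Pneumonia",
                    qualifiers=[ConceptRole(value=strep, name=causes, inverted=True)])
phrase = describe(pneumonia)
```

Because inverted defaults to false, a role that is not inverted can be constructed without mentioning the indicator at all, which mirrors the note above.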

2.8

Coded Simple Value (CS) specializes CD

Definition:      Coded data in its simplest form, where only the code is not predetermined. The code system and code system version are fixed by the context in which the CS value occurs. CS is used for coded attributes that have a single HL7-defined value set.

Table 13: Property Summary of Coded Simple Value
Name Type Description
code ST The plain code symbol defined by the code system. For example, "784.0" is the code symbol of the ICD-9 code "784.0" for headache.
originalText ED The text or phrase used as the basis for the coding.
Definition 64:
type CodedSimpleValue alias CS specializes CV {
  ST    code;
  literal   ST;
};
    

CS can only be used in either of the following cases:

  1. for a coded attribute which has a single HL7-defined code system, and where code additions to that value set require formal HL7 action (such as harmonization.) Such coded attributes must be assigned the CS restriction.

  2. for a property in this specification that is assigned to a single code system defined either in this specification or defined outside HL7 by a body that has authority over the concept and the maintenance of that code system.

For example, since the ED type subscribes to the MIME design, it trusts IETF to manage the media type. This means that this specification also subscribes to the extension mechanism built into the MIME media type code (e.g., "application/x-myapp").

For CS values, the designation of the domain qualifier will always be CNE (coded, non-extensible) and the context will determine which HL7 values to use. 28

2.8.1

Properties of Coded Simple Value (CS)

2.8.1.1
Code (code : ST, default NULL, inherited from CD)
Definition 65:
invariant(CS x) where x.nonNull {
  x.code.nonNull;
};
      
2.8.1.2
Code System (codeSystem : UID, fixed)

Every non-NULL CS value has a defined code system. The ITS representation of the CS need not explicitly mention the code system, because the context mandates one and only one code system to be used. Specifying the code system explicitly would be redundant. However, the code system property assumes that context-specific default value and is not NULL.

Definition 66:
invariant(CS x) where x.code.nonNull {
  x.codeSystem.nonNull;
  x.codeSystem.equal(CONTEXT.codeSystem);
};
      

An exceptional CS of NULL-flavor "other" indicates that a concept could not be coded in the coding system specified. In these cases, the code must be NULL.

Definition 67:
invariant(CS x) where x.other {
  x.code.isNull;
  x.codeSystem.nonNull;
};
      
2.8.1.3
Code System Name (codeSystemName : ST, default NULL, fixed)
Definition 68:
invariant(CS x) {
  x.codeSystemName.equal(CONTEXT.codeSystemName);
};
      
2.8.1.4
Code System Version (codeSystemVersion : ST, default NULL, fixed)
Definition 69:
invariant(CS x) {
  x.codeSystemVersion.equal(CONTEXT.codeSystemVersion);
};
          
2.8.1.5
Display Name (displayName : ST, default NULL, fixed)
Definition 70:
invariant(CS x) {
  x.displayName.notApplicable;
};
          
2.8.1.6
Original Text (originalText : ED, default NULL, inherited from CD)
Definition 71:
invariant(CS x) {
  x.originalText.notApplicable;
};
          
2.8.1.7
Translation (translation : SET<CD>, default NULL, fixed)
Definition 72:
invariant(CS x) {
  x.translation.notApplicable;
};
      
2.8.1.8
Qualifier (qualifier : LIST<CR>, default NULL, fixed)
Definition 73:
invariant(CS x) {
  x.qualifier.notApplicable;
};
      
2.8.1.9
Literal Form
Definition 74:
CS.literal ST {
  ST : /[a-zA-Z0-9_]+/  { $.equal($1); };
};
      

The string literal form of CS is primarily defined for the purposes of this specification. The literal form represents, as a string, the code within the code system that the context of the CS mandates. The code system and its version cannot be determined from the literal itself, so the literal is useful only where the context is known.

2.9

Coded Value (CV) specializes CD

Definition:      Coded data, specifying only a code, code system, and optionally display name and original text. Used only as the data type for other data types' properties.

Table 14: Property Summary of Coded Value
Name Type Description
code ST The plain code symbol defined by the code system. For example, "784.0" is the code symbol of the ICD-9 code "784.0" for headache.
codeSystem UID Specifies the code system that defines the code.
codeSystemName ST The common name of the coding system.
codeSystemVersion ST If applicable, a version descriptor defined specifically for the given code system.
displayName ST A name or title for the code, under which the sending system shows the code value to its users.
originalText ED The text or phrase used as the basis for the coding.
Definition 75:
type CodedValue alias CV specializes CE {
  ST    code;
  UID   codeSystem;
  ST    codeSystemName;
  ST    codeSystemVersion;
  ST    displayName;
  ED    originalText;
};
    

This type is used when any reasonable use case will require only a single code value to be sent. Thus, it should not be used in circumstances where multiple alternative codes for a given value are desired. This type may be used with both the CNE (coded, non-extensible) and the CWE (coded, with extensibility) domain qualifiers.

2.9.1

Properties of Coded Value (CV)

2.9.1.1
Code (code : ST, default NULL, inherited from CD)
2.9.1.2
Code System (codeSystem : UID, inherited from CD)
2.9.1.3
Code System Name (codeSystemName : ST, default NULL, inherited from CD)
2.9.1.4
Code System Version (codeSystemVersion : ST, default NULL, inherited from CD)
2.9.1.5
Display Name (displayName : ST, default NULL, inherited from CD)
2.9.1.6
Original Text (originalText : ED, default NULL, inherited from CD)
2.9.1.7
Translation (translation : SET<CD>, default NULL, fixed)
Definition 76:
invariant(CV x) {
  x.translation.notApplicable;
};
      
2.9.1.8
Qualifier (qualifier : LIST<CR>, default NULL, fixed)
Definition 77:
invariant(CV x) {
  x.qualifier.notApplicable;
};
      

2.10

Coded Ordinal (CO) specializes CV

Definition:      Coded data, where the domain from which the codeset comes is ordered. The Coded Ordinal data type adds semantics related to ordering so that models that make use of such domains may introduce model elements that involve statements about the order of the terms in a domain.

Definition 78:
type CodedOrdinal alias CO specializes CV {
  BL    lessOrEqual(CO o);
  BL    lessThan(CO o);
  BL    greaterThan(CO o);
  BL    greaterOrEqual(CO o);
};
    

The relative order of this type's values need not be independently obvious in their literal representation. It is expected that an application will look up the ordering of these values from some table.

2.10.1

Properties of Coded Ordinal (CO)

2.10.1.1
Less-or-equal (lessOrEqual : BL)

Definition:      The ordering relation is based on lessOrEqual which is taken as primitive in this specification.

All other order relations can be derived from this one. Taking lessOrEqual as primitive accommodates partial orderings.

Order relationships typically hold only within a single coding system.

2.10.1.2
Less-than (lessThan : BL)
Definition 79:
invariant(CO x, y) where x.nonNull.and(y.nonNull) {
  x.lessThan(y).equal(x.lessOrEqual(y).and(x.equal(y).not));
};
         
2.10.1.3
Greater-than (greaterThan : BL)
Definition 80:
invariant(CO x, y) where x.nonNull.and(y.nonNull) {
  x.greaterThan(y).equal(y.lessThan(x));
};
         
2.10.1.4
Greater-or-equal (greaterOrEqual : BL)
Definition 81:
invariant(CO x, y) where x.nonNull.and(y.nonNull) {
  x.greaterOrEqual(y).equal(y.lessOrEqual(x));
};
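
The derivation of the other order relations from lessOrEqual can be sketched in Python. This is a minimal illustration, not part of the specification: the class name, the explicit pair table, and the example codes are all assumptions, and the table is assumed to already list every ordered pair (no transitive closure is computed).

```python
from typing import Iterable, Set, Tuple


class OrderedCodeDomain:
    """Hypothetical lookup table for an ordered (possibly partially ordered) code domain."""

    def __init__(self, less_or_equal_pairs: Iterable[Tuple[str, str]]) -> None:
        # Each pair (a, b) states "a lessOrEqual b"; reflexive pairs are implied.
        self._le: Set[Tuple[str, str]] = set(less_or_equal_pairs)

    def less_or_equal(self, x: str, y: str) -> bool:
        # The primitive relation: looked up from the table.
        return x == y or (x, y) in self._le

    def less_than(self, x: str, y: str) -> bool:
        # x < y  iff  x <= y and x is not equal to y.
        return self.less_or_equal(x, y) and x != y

    def greater_than(self, x: str, y: str) -> bool:
        # x > y  iff  y < x.
        return self.less_than(y, x)

    def greater_or_equal(self, x: str, y: str) -> bool:
        # x >= y  iff  y <= x.
        return self.less_or_equal(y, x)


# Illustrative codes only; "other" is deliberately incomparable to the rest.
severity = OrderedCodeDomain(
    [("mild", "moderate"), ("moderate", "severe"), ("mild", "severe")]
)
```

Because only lessOrEqual is looked up, two codes that are not related in the table (here "mild" and "other") are neither less than nor greater than each other, which is how a partial ordering behaves.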
         

2.11

Coded With Equivalents (CE) specializes CD

Definition:      Coded data that consists of a coded value (CV) and, optionally, coded value(s) from other coding systems that identify the same concept. Used when alternative codes may exist.

Table 15: Property Summary of Coded With Equivalents
Name Type Description
code ST The plain code symbol defined by the code system. For example, "784.0" is the code symbol of the ICD-9 code "784.0" for headache.
codeSystem UID Specifies the code system that defines the code.
codeSystemName ST The common name of the coding system.
codeSystemVersion ST If applicable, a version descriptor defined specifically for the given code system.
displayName ST A name or title for the code, under which the sending system shows the code value to its users.
originalText ED The text or phrase used as the basis for the coding.
translation SET<CD> A set of other concept descriptors that translate this concept descriptor into other code systems.
Definition 82:
type CodedWithEquivalents alias CE specializes CD {
    ST       code;
    UID      codeSystem;
    ST       codeSystemName;
    ST       codeSystemVersion;
    ST       displayName;
    ED       originalText;
    SET<CV>  translation;
};
    

The CE type is used when the use case indicates that alternative codes may exist and where it is useful to communicate these. The CE type provides for a primary code value, plus a set of alternative or equivalent representations.

2.11.1

Properties of Coded With Equivalents (CE)

2.11.1.1
Code (code : ST, default NULL, inherited from CD)
2.11.1.2
Code System (codeSystem : UID, inherited from CD)
2.11.1.3
Code System Name (codeSystemName : ST, default NULL, inherited from CD)
2.11.1.4
Code System Version (codeSystemVersion : ST, default NULL, inherited from CD)
2.11.1.5
Display Name (displayName : ST, default NULL, inherited from CD)
2.11.1.6
Original Text (originalText : ED, default NULL, inherited from CD)
2.11.1.7
Translation (translation : SET<CD>, default NULL, inherited from CD)
2.11.1.8
Qualifier (qualifier : LIST<CR>, default NULL, fixed)
Definition 83:
invariant(CE x) {
  x.qualifier.notApplicable;
};
      

2.12

Character String with Code (SC) specializes ST

Definition:      A character string that optionally may have a code attached. The text must always be present if a code is present. The code is often a local code.

Table 16: Property Summary of Character String with Code
Name Type Description
code CE A code representing the string data. For example, the string data may be a user-message out of a message-catalog where the code represents the identifier of the message in the message catalog.
Definition 84:
type CharacterStringWithCode alias SC specializes ST {
  CE code;
};
    

This data type is used in cases where coding is the exception. For example, user text messages are essentially text, and the printable message is the important content; yet sometimes messages come from a catalog of canned messages, which the SC allows one to reference.

Any non-null SC value MAY have a code; however, a code MUST NOT be given without the text.

Definition 85:
invariant(SC x) where x.nonNull {
  x.code.nonNull.implies(x.notEmpty);
};
      

2.12.1

Properties of Character String with Code (SC)

2.12.1.1
Code (code : CE)

Definition:      A code representing the string data. For example, the string data may be a user-message out of a message-catalog where the code represents the identifier of the message in the message catalog.


Instance Identifier data types.

2.13

Unique Identifier String (UID) specializes ST

Definition:      A unique identifier string is a character string which identifies an object in a globally unique and timeless manner. The allowable formats and values and procedures of this data type are strictly controlled by HL7. At this time, user-assigned identifiers may be certain character representations of ISO Object Identifiers (OID) and DCE Universally Unique Identifiers (UUID). HL7 also reserves the right to assign other forms of UIDs, such as mnemonic identifiers for code systems.

The sole purpose of the UID is to be a globally and timelessly unique identifier. The form of the UID, whether it is an OID, a UUID or any other form, is entirely irrelevant. As far as HL7 is concerned, the only thing one can do with a UID is denote the object for which it stands. Comparison of UIDs is literal, i.e., if two UIDs are literally identical, they are assumed to denote the same object. If two UIDs are not literally identical, they may not denote the same object.

Definition 86:
type UniqueIdentifierString alias UID specializes ST { };
    

No difference in semantics is recognized between the different allowed forms of the UID. The different forms are not distinguished by a component within or aside from the identifier string itself.

Even though this specification recognizes no semantic difference between the different forms of the unique identifier, there are differences in how these identifiers are built and managed, which is the sole reason to define subtypes of the UID for each of the variants.

2.14

ISO Object Identifier (OID) specializes UID

Definition:      A globally unique string representing an ISO Object Identifier (OID) in a form that consists only of numbers and dots (e.g., "2.16.840.1.113883.3.1"). According to ISO, OIDs are paths in a tree structure, with the left-most number representing the root and the right-most number representing a leaf.

Each branch under the root corresponds to an assigning authority. Each of these assigning authorities may, in turn, designate its own set of assigning authorities that work under its auspices, and so on down the line. Eventually, one of these authorities assigns a unique (to it as an assigning authority) number that corresponds to a leaf node on the tree. The leaf may represent an assigning authority (in which case the root OID identifies the authority), or an instance of an object. An assigning authority owns a namespace, consisting of its sub-tree.

OIDs are the preferred scheme for unique identifiers. OIDs should always be used unless one of the inclusion criteria for other schemes applies.

ISO/IEC 8824:1990(E) clause 28 defines the Object Identifier as

28.9 The semantics of an object identifier value are defined by reference to an object identifier tree. An object identifier tree is a tree whose root corresponds to [the ISO/IEC 8824 standard] and whose vertices [i.e. nodes] correspond to administrative authorities responsible for allocating arcs [i.e. branches] from that vertex. Each arc from that tree is labeled by an object identifier component, which is [an integer number]. Each information object to be identified is allocated precisely one vertex (normally a leaf) and no other information object (of the same or a different type) is allocated to that same vertex. Thus an information object is uniquely and unambiguously identified by the sequence of [integer numbers] (object identifier components) labeling the arcs in a path from the root to the vertex allocated to the information object.

28.10 An object identifier value is semantically an ordered list of object identifier component values. Starting with the root of the object identifier tree, each object identifier component value identifies an arc in the object identifier tree. The last object identifier component value identifies an arc leading to a vertex to which an information object has been assigned. It is this information object, which is identified by the object identifier value. [...]

Definition 87:
type ObjectIdentifier alias OID specializes UID, LIST<INT> {
  INT   leaf;
  OID   butLeaf;
  OID   value(namespace OID);
  literal ST;
};
    

According to ISO/IEC 8824 an object identifier is a sequence of object identifier component values, which are integer numbers. These component values are ordered such that the root of the object identifier tree is the head of the list followed by all the arcs down to the leaf representing the information object identified by the OID. The fact that OID specializes LIST< INT> represents this path of object identifier component values from the root to the leaf.

The leaf and "butLeaf" properties take the opposite view. The leaf is the last object identifier component value in the list, and the "butLeaf" property is all of the OID but the leaf. In a sense, the leaf is the identifier value and all of the OID but the leaf refers to the namespace in which the leaf is unique and meaningful.

However, what part of the OID is considered value and what is namespace may be viewed differently. In general, any OID component sequence to the left can be considered the namespace in which the rest of the sequence to the right is defined as a meaningful and unique identifier value. The value-property with a namespace OID as its argument represents this point of view.29

Definition 88:
invariant(OID x) where x.nonNull {
  x.notEmpty;
  x.tail.isEmpty.implies(x.leaf.equal(x.head));
  x.tail.notEmpty.implies(x.leaf.equal(x.tail.leaf));
  x.tail.isEmpty.implies(x.butLeaf.isNull);
  x.tail.notEmpty.implies(x.butLeaf.head.equal(x.head)
            .and(x.butLeaf.tail.equal(x.butLeaf(x.tail))));
  forall(OID v; OID n) where v.equal(x.value(n)) {
    n.isEmpty.implies(v.equal(x));
    n.notEmpty.implies(v.equal(x.value(n.tail)));
  };
};
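
The leaf, "butLeaf", and value views of an OID can be illustrated with a small Python sketch. It is not normative: an OID is modeled here as a tuple of integers, the function names are hypothetical, and raising an error when the namespace is not a prefix of the OID is an assumption, since this specification does not say what happens in that case.

```python
from typing import Optional, Tuple

Oid = Tuple[int, ...]  # assumed model: the path of component values, root first


def leaf(oid: Oid) -> int:
    """The last object identifier component value in the list."""
    return oid[-1]


def but_leaf(oid: Oid) -> Optional[Oid]:
    """All of the OID but the leaf; None (standing in for NULL) for a single component."""
    return oid[:-1] if len(oid) > 1 else None


def value(oid: Oid, namespace: Oid) -> Oid:
    """The part of the OID to the right of the given namespace prefix."""
    if oid[: len(namespace)] != namespace:
        # Assumption: a namespace that is not a prefix is treated as an error.
        raise ValueError("namespace is not a prefix of the OID")
    return oid[len(namespace):]


example = (2, 16, 840, 1, 113883, 3, 1)
hl7_branch = (2, 16, 840, 1, 113883)
```

With these definitions, `value(example, hl7_branch)` yields the components `(3, 1)`, and an empty namespace returns the whole OID.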
    

2.14.1

HL7-Assigned OIDs

HL7 shall establish an OID registry and assign OIDs in its branch for HL7 users and vendors upon their request. HL7 shall also assign OIDs to public identifier-assigning authorities both U.S. nationally (e.g., the U.S. State driver license bureaus, U.S. Social Security Administration, HIPAA Provider ID registry, etc.) and internationally (e.g., other countries' Social Security Administrations, Citizen ID registries, etc.). The HL7 registered OIDs must be used for these organizations, regardless of whether these organizations have other OIDs assigned from other sources.

When assigning OIDs to third parties or entities, HL7 shall investigate whether an OID is already assigned for such entities through other sources. If this is the case, HL7 shall record such OID in a catalog, but HL7 shall not assign a duplicate OID in the HL7 branch. If possible, HL7 shall notify a third party when an OID is being assigned for that party in the HL7 branch.

Though HL7 shall exercise diligence before assigning an OID in the HL7 branch to third parties, given the lack of a global OID registry mechanism, one cannot make absolutely certain that there is no preexisting OID assignment for such third-party entity. Also, a duplicate assignment can happen in the future through another source. If such cases of duplicate assignment become known to HL7, HL7 shall make efforts to resolve this situation. For continued interoperability in the meantime, the HL7 assigned OID shall be the preferred OID used.

While most owners of an OID will "design" their namespace sub-tree in some meaningful way, there is no way to generally infer any meaning on the parts of an OID. HL7 does not standardize or require any namespace sub-structure. An OID owner, or anyone having knowledge about the logical structure of part of an OID, may still use that knowledge to infer information about the associated object; however, the techniques cannot be generalized.

Figure: Example for a tree of ISO object identifiers. HL7's OID is 2.16.840.1.113883.

An HL7 interface must not rely on any knowledge about the substructure of an OID for which it cannot control the assignment policies.

2.14.2

Literal Form

The structured definition of the OID is provided mostly to be faithful to the OID specification. Within HL7, OIDs are used as UID strings only, i.e., the literal string value is the only thing that is communicated and is the only thing that a receiver should have to consider when working with UIDs in the scope of the HL7 specification.

Definition 89:
OID.literal ST {
    OID : INT "." OID { $.head.equal($1);
      $.tail.equal($3); }
        | INT   { $.head.equal($1);
      $.tail.isEmpty; }
};
      

For compatibility with the DICOM standard, the literal form of the OID should not exceed 64 characters. (see DICOM part 5, section 9).
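
As an illustration of the literal form, the following Python sketch parses a dotted OID string into its component values and checks the recommended length limit. The function names are hypothetical, and the sketch assumes that each component is a non-negative decimal integer written with ASCII digits.

```python
import re
from typing import Tuple

# Assumption: components are unsigned decimal integers separated by single dots.
_OID_LITERAL = re.compile(r"[0-9]+(\.[0-9]+)*")

DICOM_MAX_LENGTH = 64  # the limit recommended above for DICOM compatibility


def parse_oid_literal(text: str) -> Tuple[int, ...]:
    """Return the component values of an OID literal, root first."""
    if _OID_LITERAL.fullmatch(text) is None:
        raise ValueError(f"not an OID literal: {text!r}")
    return tuple(int(part) for part in text.split("."))


def within_dicom_limit(text: str) -> bool:
    """True if the literal does not exceed the recommended 64 characters."""
    return len(text) <= DICOM_MAX_LENGTH
```

A receiver would normally compare OIDs as plain strings; parsing into components as above is only needed where the structure itself is of interest.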

2.15

DCE Universal Unique Identifier (UUID) specializes UID

Definition:      A globally unique string representing a DCE Universal Unique Identifier (UUID) in the common UUID format that consists of 5 hyphen-separated groups of hexadecimal digits having 8, 4, 4, 4, and 12 places respectively.

Both the UUID and its string representation are defined by the Open Group, DCE 1.1 Remote Procedure Call specification, Appendix A.

UUIDs are assigned based on Ethernet MAC addresses, the point in time of creation and some random component. This mix is believed to generate sufficiently unique identifiers without any organizational policy for identifier assignment (in fact, this piggybacks on the organization of MAC address assignment).

UUIDs are not the preferred identifier scheme for use as HL7 UIDs. UUIDs may be used when identifiers are issued to objects representing individuals (e.g., entity instance identifiers, act event identifiers, etc.). For objects describing classes of things or events (e.g., catalog items), OIDs are the preferred identifier scheme.

Definition 90:
type UniversalUniqueIdentifier alias UUID specializes UID {
  INT timeLow;
  INT timeMid;
  INT timeHighAndVersion;
  INT clockSequence;
  INT node;
};
    

2.15.1

Literal Form

The structured definition of the UUID is provided mostly to be faithful to the UUID specification. Within HL7, UUIDs are used as UID strings only, i.e., the literal string value is the only thing that is communicated and is the only thing that a receiver should have to consider when working with UIDs in the scope of the HL7 specification.

The literal form for the UUID is defined according to the original specification of the UUID. However, because the HL7 UIDs are case sensitive, for use with HL7, the hexadecimal digits A-F in UUIDs must be converted to upper case.

Definition 91:
UUID.literal ST {
  UUID : hex8 "-" hex4 "-" hex4 "-" hex4 "-" hex12 {
          $.timeLow.equal($1);
          $.timeMid.equal($3);
          $.timeHighAndVersion.equal($5);
          $.clockSequence.equal($7);
          $.node.equal($9);
  }

  INT hex4 :  hexDigit hexDigit hexDigit hexDigit {
          $.equal($1.times(16).plus($2)
                    .times(16).plus($3)
                    .times(16).plus($4));
  }

  INT hex8 :  hexDigit hexDigit hexDigit hexDigit
              hexDigit hexDigit hexDigit hexDigit {
          $.equal($1.times(16).plus($2)
                    .times(16).plus($3)
                    .times(16).plus($4)
                    .times(16).plus($5)
                    .times(16).plus($6)
                    .times(16).plus($7)
                    .times(16).plus($8));
  }

  INT hex12 : hexDigit hexDigit hexDigit hexDigit
              hexDigit hexDigit hexDigit hexDigit
              hexDigit hexDigit hexDigit hexDigit {
          $.equal($1.times(16).plus($2)
                    .times(16).plus($3)
                    .times(16).plus($4)
                    .times(16).plus($5)
                    .times(16).plus($6)
                    .times(16).plus($7)
                    .times(16).plus($8)
                    .times(16).plus($9)
                    .times(16).plus($10)
                    .times(16).plus($11)
                    .times(16).plus($12));
  }

  INT hexDigit
  : "0" { $.equal(0); }
  | "1" { $.equal(1); }
  | "2" { $.equal(2); }
  | "3" { $.equal(3); }
  | "4" { $.equal(4); }
  | "5" { $.equal(5); }
  | "6" { $.equal(6); }
  | "7" { $.equal(7); }
  | "8" { $.equal(8); }
  | "9" { $.equal(9); }
  | "A" { $.equal(10); }
  | "B" { $.equal(11); }
  | "C" { $.equal(12); }
  | "D" { $.equal(13); }
  | "E" { $.equal(14); }
  | "F" { $.equal(15); }
};
      
NOTE: The output of UUID related programs and functions may use all sorts of forms, upper case, lower case, and with or without the hyphens that group the digits. This varied output must be postprocessed to conform to the HL7 specification, i.e., the hyphens must be inserted for the 8-4-4-4-12 grouping and all hexadecimal digits must be converted to upper case.
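
The postprocessing described in this note can be sketched in Python with the standard `uuid` module. The function name is hypothetical; the sketch relies on `uuid.UUID` accepting hexadecimal input with or without hyphens and on its string form being the hyphenated 8-4-4-4-12 grouping in lower case.

```python
import uuid


def to_hl7_uuid_literal(raw: str) -> str:
    """Normalize a UUID string to the upper-case, hyphenated 8-4-4-4-12 form."""
    # uuid.UUID raises ValueError if the input does not contain 32 hex digits.
    parsed = uuid.UUID(raw)
    # str(parsed) is lower case with hyphens; HL7 requires upper-case digits.
    return str(parsed).upper()
```

For example, both `"6ba7b8109dad11d180b400c04fd430c8"` and `"6ba7b810-9dad-11d1-80b4-00c04fd430c8"` normalize to the same literal, so they compare as identical UID strings afterwards.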

2.16

HL7 Reserved Identifier Scheme (RUID) specializes UID

Definition:      A globally unique string defined exclusively by HL7. Identifiers in this scheme are only defined by balloted HL7 specifications. Local communities or systems must never use such reserved identifiers based on bilateral negotiations.

HL7 reserved identifiers are strings that consist only of (US-ASCII) letters, digits and hyphens, where the first character must be a letter. HL7 may assign these reserved identifiers as mnemonic identifiers for major concepts of interest to HL7.

2.17

Instance Identifier (II) specializes ANY

Definition:      An identifier that uniquely identifies a thing or object. Examples are object identifier for HL7 RIM objects, medical record number, order id, service catalog item id, Vehicle Identification Number (VIN), etc. Instance identifiers are defined based on ISO object identifiers.

Table 17: Property Summary of Instance Identifier
Name Type Description
root UID A unique identifier that guarantees the global uniqueness of the instance identifier. The root alone may be the entire instance identifier.
extension ST A character string as a unique identifier within the scope of the identifier root.
assigningAuthorityName ST A human readable name or mnemonic for the assigning authority. The Assigning Authority Name has no computational value. The purpose of an Assigning Authority Name is to assist an unaided human interpreter of an II value to interpret the authority. Note: automated processing must not depend on the assigning authority name being present in any form.
displayable BL Specifies if the identifier is intended for human display and data entry (displayable = true) as opposed to pure machine interoperation (displayable = false).
Definition 92:
type InstanceIdentifier alias II specializes ANY {
  ST      extension;
  UID     root;
  ST      assigningAuthorityName;
  BL      displayable;
  BL      equal(ANY x);
};
    

2.17.1

Properties of Instance Identifier (II)

2.17.1.1
Root (root : UID)

Definition:      A unique identifier that guarantees the global uniqueness of the instance identifier. The root alone may be the entire instance identifier.

In the presence of a non-null extension, the root is commonly interpreted as the "assigning authority", that is, it is supposed that the root somehow refers to an organization that assigns identifiers sent in the extension. However, the root does not have to be an organizational UID, it can also be a UID specifically registered for an identifier scheme.30

Definition 93:
invariant(II x) where x.nonNull {
  x.root.nonNull;
};
      
2.17.1.2
Extension (extension : ST, default NULL)

Definition:      A character string as a unique identifier within the scope of the identifier root.

The extension is a character string that is unique in the namespace designated by the root. If a non-NULL extension exists, the root specifies a namespace (sometimes called "assigning authority" or "identifier type"). The extension property may be NULL, in which case the root OID is the complete unique identifier.

The root and extension scheme effectively means that the concatenation of root and extension must be a globally unique identifier for the item that this II value identifies.

It is recommended that systems use the OID scheme for external identifiers of their communicated objects. The extension property is mainly provided to accommodate legacy alphanumeric identifier schemes.

Some identifier schemes define certain style options to their code values. For example, the U.S. Social Security Number (SSN) is normally written with dashes that group the digits into a pattern "123-12-1234". However, the dashes are not meaningful and a SSN can just as well be represented as "123121234" without the dashes.

In the case where identifier schemes provide for multiple representations, HL7 shall make a ruling about which is the preferred form. HL7 shall document that ruling where that respective external identifier scheme is recognized. HL7 shall decide upon the preferred form based on criteria of practicality and common use. In absence of clear criteria of practicality and common use, the safest, most extensible, and least stylized (the least decorated) form shall be given preference.31

HL7 may also decide to map common external identifiers to the value portion of the II.root OID. For example, the U.S. SSN could be represented as 2.16.840.1.113883.4.1.123121234. The criteria of practicality and common use will guide HL7's decision on each individual case.

2.17.1.3
Assigning Authority Name (assigningAuthorityName : ST)

Definition:      A human readable name or mnemonic for the assigning authority. The Assigning Authority Name has no computational value. The purpose of an Assigning Authority Name is to assist an unaided human interpreter of an II value to interpret the authority. Note: automated processing must not depend on the assigning authority name being present in any form.

2.17.1.4
Displayable (displayable : BL)

Definition:      Specifies if the identifier is intended for human display and data entry (displayable = true) as opposed to pure machine interoperation (displayable = false).

2.17.1.5
Equality (equal : BL, inherited from ANY)

Two instance identifiers are equal if and only if their root and extension properties are equal.

Definition 94:
invariant(II x, y) where x.nonNull.and(y.nonNull) {
  x.equal(y).equal(x.root.equal(y.root)
                .and(x.extension.equal(y.extension)));
};
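
The equality rule can be sketched as a Python data class in which only root and extension take part in comparison. This is an illustrative model, not a definitive implementation: the class and field names are assumptions, and `None` stands in for a NULL extension, which simplifies the NULL semantics of this specification (two absent extensions are treated as equal here).

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class InstanceIdentifier:
    """Hypothetical model of II; equality considers root and extension only."""

    root: str
    extension: Optional[str] = None
    # Excluded from comparison: these have no bearing on identity.
    assigning_authority_name: Optional[str] = field(default=None, compare=False)
    displayable: Optional[bool] = field(default=None, compare=False)


a = InstanceIdentifier("2.16.840.1.113883.4.1", "123121234", "SSA")
b = InstanceIdentifier("2.16.840.1.113883.4.1", "123121234")
c = InstanceIdentifier("2.16.840.1.113883.4.1", "999999999")
```

Here `a == b` holds although the assigning authority names differ, while `a == c` does not hold because the extensions differ.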
      

URL And TEL data types

2.18

Universal Resource Locator (URL) specializes ANY

Definition:      A telecommunications address specified according to Internet standard RFC 2396 [http://www.ietf.org/rfc/rfc2396.txt]. The URL specifies the protocol and the contact point defined by that protocol for the resource. Notable uses of the telecommunication address data type are for telephone and telefax numbers, e-mail addresses, Hypertext references, FTP references, etc.

The Internet standard RFC 2396 [http://www.ietf.org/rfc/rfc2396.txt] defines a URI as follows:

Just as there are many different methods of access to resources, there are several schemes for describing the location of such resources. The generic syntax for URLs provides a framework for new schemes to be established using protocols other than those defined in this document.

URLs are used to "locate" resources, by providing an abstract identification of the resource location. Having located a resource, a system may perform a variety of operations on the resource, as might be characterized by such words as "access", "update", "replace", "find attributes". In general, only the "access" method needs to be specified for any URL scheme.

By agreement, it is permissible to use a URI in place of a URL. In these cases, it is still expected that the resource identified is accessible by some agreed method. A common use of URIs is to refer to SOAP attachments.

Definition 95:
protected type UniversalResourceLocator
                 alias URL specializes ANY {
  CS  scheme;
  ST  address;
  literal ST;
};
    

2.18.1

Scheme (scheme : CS)

Definition:      Identifies the protocol used to interpret the address string and to access the resource so addressed.

Some URL schemes are registered by the Internet Assigned Numbers Authority (IANA) [http://www.iana.org]; however, IANA only registers URL schemes that are defined in Internet RFC documents. In fact, there are a number of URL schemes defined outside RFC documents, some of which are registered with the World Wide Web Consortium (W3C).32

Similar to the ED.mediaType, HL7 makes suggestions about values classifying them as required, recommended, other, and deprecated. Any scheme not mentioned has status other.

Table 18: Domain URLScheme:
code name definition
tel Telephone A voice telephone number [draft-antti-telephony-url-11.txt].
fax Fax A telephone number served by a fax device [draft-antti-telephony-url-11.txt].
mailto Mailto Electronic mail address [RFC 2368].
http HTTP Hypertext Transfer Protocol [RFC 2068].
ftp FTP The File Transfer Protocol (FTP) [RFC 1738].
mllp HL7 Minimal Lower Layer Protocol The traditional HL7 Minimal Lower Layer Protocol. The URL has the form of a common IP URL e.g., mllp://<host>:<port>/ with <host> being the IP address or DNS hostname and <port> being a port number on which the MLLP protocol is served.
file File Host-specific local file names [RFC 1738]. Note that the file scheme works only for local files. There is little use for exchanging local file names between systems, since the receiving system likely will not be able to access the file.
nfs NFS Network File System protocol [RFC 2224]. Some sites use NFS servers to share data files.
telnet Telnet Reference to interactive sessions [RFC 1738]. Some sites, (e.g., laboratories) have TTY based remote query sessions that can be accessed through telnet.
modem Modem A telephone number served by a modem device [draft-antti-telephony-url-11.txt].

Note that this specification explicitly limits itself to URLs. Universal Resource Names (URN) are not covered by this specification. URNs are a kind of identifier scheme for other than accessible resources. This specification, however, is only concerned with accessible resources, which belong in the URL category.

2.18.2

Address (address : ST)

Definition:      The address is a character string whose format is entirely defined by the URL.scheme.

2.18.3

Literal Form

While conceptually URL has the properties scheme and address, the common appearance of a URL is as a string literal formed according to the Internet standard. The general syntax of the URL literal is:

Definition 96:
URL.literal ST {
  URL : /[a-z0-9+.-]+/ ":" ST { $.scheme.equal($1);
                                $.address.equal($3); }
};
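
To illustrate the literal form, the following Python sketch splits a URL string into its scheme and address properties at the first colon, using the same character class for the scheme as the definition above. The function name is hypothetical, and the address is returned uninterpreted because its format is defined entirely by the scheme.

```python
import re
from typing import Tuple

# Scheme characters as in the literal definition, then ":", then the address.
_URL_LITERAL = re.compile(r"([a-z0-9+.-]+):(.*)", re.DOTALL)


def parse_url_literal(text: str) -> Tuple[str, str]:
    """Return (scheme, address) for a URL literal."""
    match = _URL_LITERAL.fullmatch(text)
    if match is None:
        raise ValueError(f"not a URL literal: {text!r}")
    return match.group(1), match.group(2)
```

For instance, `parse_url_literal("mailto:someone@example.org")` returns the scheme `"mailto"` and the address `"someone@example.org"`; the e-mail address shown is a made-up example.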
      

Telephone and FAX Numbers

Note that there is no special data type for telephone numbers; telephone numbers are telecommunication addresses and are specified as URL.

The telephone number URL is defined in Internet RFC 2806 [http://www.ietf.org/rfc/rfc2806.txt]. Its definition is summarized in this subsection. This summary does not override or change any of the Internet specification's rulings.

The voice telephone URLs begin with "tel:" and fax URLs begin with "fax:".

The URL.address is the telephone number in accordance with ITU-T E.123 Telephone Network and ISDN Operation, Numbering, Routing and Mobile Service: Notation for National and International Telephone Numbers (1993). While HL7 neither adds to nor takes away from the URL specification, the preferred subset of the URL.address syntax is given as follows:

Definition 97:
protected type TelephoneURL specializes URL {
  literal ST {
    URL : /(tel)|(fax)/ ":" address   { $.scheme.equal($1);
                  $.address.equal($3); };
    ST address : "+" phoneDigits;
    ST phoneDigits : digitOrSeparator
               phoneDigits | digitOrSeparator;
    ST digitOrSeparator : digit | separator;
    ST digit : /[0-9]/;
    ST separator : /[().-]/;
  };
};
      

The global absolute telephone numbers starting with the "+" and country code are preferred. Separator characters serve as decoration but have no bearing on the meaning of the telephone number. For example: "tel:+13176307960" and "tel:+1(317)630-7960" are both the same telephone number; "fax:+49308101724" and "fax:+49(30)8101-724" are both the same fax number.
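
Because separators are decoration only, two telephone URLs can be compared after the separators are stripped. The Python sketch below shows one way to do this for the preferred global form; the function name is hypothetical, and treating the stripped string as a canonical form for comparison is an assumption of this example rather than a rule of this specification.

```python
import re

# Preferred subset: "tel" or "fax", then ":", "+", and digits or separators.
_TEL_LITERAL = re.compile(r"(tel|fax):\+([0-9().-]+)")


def canonical_tel(url: str) -> str:
    """Strip the separator characters ( ) . - from a global tel: or fax: URL."""
    match = _TEL_LITERAL.fullmatch(url)
    if match is None:
        raise ValueError(f"not a preferred-form telephone URL: {url!r}")
    digits = re.sub(r"[().-]", "", match.group(2))
    if not digits:
        raise ValueError("telephone URL contains no digits")
    return f"{match.group(1)}:+{digits}"


def same_number(a: str, b: str) -> bool:
    """True if both URLs denote the same number once separators are ignored."""
    return canonical_tel(a) == canonical_tel(b)
```

With this sketch, `same_number("tel:+13176307960", "tel:+1(317)630-7960")` evaluates to true, matching the example above.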

2.19

Telecommunication Address (TEL) specializes URL

Definition:      A telephone number (voice or fax), e-mail address, or other locator for a resource mediated by telecommunication equipment. The address is specified as a Universal Resource Locator (URL) qualified by time specification and use codes that help in deciding which address to use for a given time and purpose.

Table 19: Property Summary of Telecommunication Address
Name Type Description
useablePeriod GTS Specifies the periods of time during which the telecommunication address can be used. For a telephone number, this can indicate the time of day in which the party can be reached on that telephone. For a web address, it may specify a time range in which the web content is promised to be available under the given address.
use SET<CS> One or more codes advising a system or user which telecommunication address in a set of like addresses to select for a given telecommunication need.

The semantics of a telecommunication address is that a communicating entity (the responder) listens and responds to that address, and therefore can be contacted by another communicating entity (the initiator).

The responder of a telecommunication address may be an automatic service that can respond with information (e.g., FTP or HTTP services). In such a case, a telecommunication address is a reference to the information accessible through that address. A telecommunication address value can thus be resolved to some information (in the form of encapsulated data, ED).

Definition 98:
type TelecommunicationAddress alias TEL specializes URL {
  GTS   useablePeriod;
  SET<CS>   use;
  BL  equal(ANY x);
};
    

The telecommunication address is an extension of the Universal Resource Locator (URL) specified according to Internet standard RFC 2396 [http://www.ietf.org/rfc/rfc2396.txt]. The URL specifies the protocol and the contact point defined by that protocol for the resource. Notable use cases for the telecommunication address data type are for telephone and fax numbers, e-mail addresses, Hypertext references, FTP references, etc.

2.19.1

Properties of Telecommunication Address (TEL)

2.19.1.1
Useable Period (useablePeriod : GTS)

Definition:      Specifies the periods of time during which the telecommunication address can be used. For a telephone number, this can indicate the time of day in which the party can be reached on that telephone. For a web address, it may specify a time range in which the web content is promised to be available under the given address.

2.19.1.2
Use Code (use : SET<CS>)

Definition:      One or more codes advising a system or user which telecommunication address in a set of like addresses to select for a given telecommunication need.

Table 20: Domain TelecommunicationAddressUse:
code name definition
H home A communication address at a home. Attempted contacts for business purposes might intrude on privacy, and chances are one will contact family or other household members instead of the person one wishes to call. Typically used in urgent cases, or if no other contacts are available.
  HP primary home The primary home, to reach a person after business hours.
  HV vacation home A vacation home, to reach a person while on vacation.
WP work place An office address. First choice for business related contacts during business hours.
AS answering service An automated answering machine used for less urgent cases and if the main purpose of contact is to leave a message or access an automated announcement.
EC emergency contact A contact specifically designated to be used for emergencies. This is the first choice in emergencies, independent of any other use codes.
PG pager A paging device suitable to solicit a callback or to leave a very short message.
MC mobile contact A telecommunication device that moves and stays with its owner. May have characteristics of all other use codes, suitable for urgent matters, not the first choice for routine business.

The telecommunication use code is not a complete classification for equipment types or locations. Its main purpose is to suggest or discourage the use of a particular telecommunication address. There are no easily defined rules that govern the selection of a telecommunication address.

2.19.1.3
Equality (equal : BL, inherited from ANY)

Two telecommunication address values are considered equal if both their URLs are equal. Use code and valid time are excluded from the equality test.

Definition 99:
invariant(TEL x, y) x.nonNull.and(y.nonNull) {
  x.equal(y).equal(((URL)x).equal((URL)y));
};
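The equality rule can be sketched in Python with a data class whose comparison considers only the URL. The class below is a hypothetical stand-in, not a defined implementation. It represents the URL as a plain string and the useable period as an optional string, both simplifying assumptions, since URL equality and GTS are defined elsewhere in this specification.

```python
from dataclasses import dataclass, field
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class TEL:
    """Illustrative telecommunication address; only `url` takes part in equality."""
    url: str
    # Use codes and useable period are excluded from the equality test.
    use: FrozenSet[str] = field(default=frozenset(), compare=False)
    useable_period: Optional[str] = field(default=None, compare=False)
```

Two values such as `TEL("tel:+13176307960", use=frozenset({"H"}))` and `TEL("tel:+13176307960", use=frozenset({"WP"}))` then compare as equal, because the fields marked `compare=False` are ignored by the generated comparison and hash.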
      

Data types for Postal Address and Entity Names (Person, Organization, and Trivial Names) are all based on extensions of a character string.

2.20

Address Part (ADXP) specializes ST

Definition:      A character string that may have a type-tag signifying its role in the address. Typical parts that exist in nearly every address are street, house number or post box, postal code, city, and country, but other roles may be defined regionally, nationally, or on an enterprise level (e.g., in military addresses). Addresses are usually broken up into lines, which are indicated by special line-breaking delimiter elements (e.g., DEL).

Table 21: Property Summary of Address Part
Name Type Description
partType CS Specifies whether an address part names the street, city, country, postal code, post box, etc. If the type is NULL the address part is unclassified and would simply appear on an address label as is.
Definition 100:
protected type AddressPart alias ADXP specializes ST {
  CS  partType;
};
    

2.20.1

Address Part Type (partType : CS)

Definition:      Specifies whether an address part names the street, city, country, postal code, post box, etc. If the type is NULL the address part is unclassified and would simply appear on an address label as is.

Table 22: Domain AddressPartType:
code name definition
DEL delimiter Delimiters are printed without framing white space. If no value component is provided, the delimiter appears as a line break.
CNT country Country
STA state or province A sub-unit of a country with limited sovereignty in a federally organized country.
CPA county or parish A sub-unit of a state or province. (49 of the United States of America use the term "county;" Louisiana uses the term "parish".)
CTY municipality The name of the city, town, village, or other community or delivery center.
ZIP postal code A postal code designating a region defined by the postal service.
SAL street address line
  BNR building number The number of a building, house or lot alongside the street. Also known as "primary street number". This does not number the street but rather the building.
    BNN building number numeric The numeric portion of a building number
  DIR direction Direction (e.g., N, S, W, E)
  STR street name
    STB street name base The base name of a roadway or artery recognized by a municipality (excluding street type and direction)
    STTYP street type The designation given to the street (e.g., Street, Avenue, Crescent).
ADL additional locator This can be a unit designator, such as apartment number, suite number, or floor. There may be several unit designators in an address (e.g., "3rd floor, Apt. 342"). This can also be a designator pointing away from the location, rather than specifying a smaller location within some larger one (e.g., Dutch "t.o." means "opposite to" for house boats located across the street facing houses).
  UNID unit identifier The number or name of a specific unit contained within a building or complex, as assigned by that building or complex.
  UNIT unit designator Indicates the type of specific unit contained within a building or complex (e.g., Apartment, Floor).
CAR care of The name of the party who will take receipt at the specified address and will take on responsibility for ensuring delivery to the target recipient.
CEN census tract A geographic sub-unit delineated for demographic purposes.

2.21

Postal Address (AD) specializes LIST<ADXP>

Definition:      Mailing and home or office addresses. A sequence of address parts, such as street or post office box, city, postal code, country, etc.

The AD is primarily used to communicate data that allows printing mail labels or that allows a person to physically visit the address. The postal address data type is not intended to be a container for additional information that might be useful for finding geographic locations (e.g., GPS coordinates) or for performing epidemiological studies. Such additional information is captured by other, more appropriate HL7 elements.

Table 23: Property Summary of Postal Address
Name Type Description
use SET<CS> A set of codes advising a system or user which address in a set of like addresses to select for a given purpose.
useablePeriod GTS A General Timing Specification (GTS) specifying the periods of time during which the address can be used. This is used to specify different addresses for different times of the week or year.
isNotOrdered BL A boolean value specifying whether the order of the address parts is known or not. While the address parts are always a Sequence, the order in which they are presented may or may not be known. Where this matters, the isNotOrdered property can be used to convey this information.
formatted ST A character string value with the address formatted in lines and with proper spacing. This is only a semantic property to define the function of some of the address part types.

Remember that semantic properties are bare of all control flow semantics. The AD.formatted could be implemented as a "procedure" that would "return" the formatted address, but it would not usually be a variable to which one could assign a formatted address. However, HL7 does not define applications but only the semantics of exchanged data values. Hence, the semantic model abstracts from concepts like "procedure", "return", and "assignment" but speaks only of property and value.

Addresses are conceptualized as text with added logical mark-up. The mark-up may break the address into lines and may describe in detail the role of each address part if it is known. Address parts occur in the address in the order in which they would be printed on a mailing label. The approach is similar to HTML or XML markup of text (but it is not technically limited to XML representations).

Addresses are essentially sequences of address parts, with an added "use" code and a valid time range that indicate if and when the address can be used for a given purpose.

Definition 101:
type PostalAddress alias AD specializes LIST<ADXP> {
  SET<CS>   use;
  GTS useablePeriod;
  BL  isNotOrdered;
  BL  equal(ANY x);
  ST  formatted;
};
    

2.21.1

Properties of Postal Address (AD)

2.21.1.1
Use Code (use : SET<CS>)

Definition:      A set of codes advising a system or user which address in a set of like addresses to select for a given purpose.

Table 24: Domain PostalAddressUse:
code name definition
PHYS visit address A physical address, used primarily to visit the addressee.
PST postal address Used to send mail.
TMP temporary address A temporary address, may be good for visit or mailing. Note that an address history can provide more detailed information.
BAD bad address A flag indicating that the address is bad, in fact, useless.
H home A communication address at a home. Attempted contacts for business purposes might intrude on privacy, and chances are one will contact family or other household members instead of the person one wishes to call. Typically used in urgent cases, or if no other contacts are available.
HP primary home The primary home, to reach a person after business hours.
HV vacation home A vacation home, to reach a person while on vacation.
WP work place An office address. First choice for business related contacts during business hours.
ABC Alphabetic Alphabetic transcription of name (Japanese: romaji)
SYL Syllabic Syllabic transcription of name (e.g., Japanese kana, Korean hangul)
IDE Ideographic Ideographic representation of name (e.g., Japanese kanji, Chinese characters)

An address without specific use code might be a default address useful for any purpose, but an address with a specific use code would be preferred for that respective purpose.

2.21.1.2
Useable Period (useablePeriod : GTS)

Definition:      A General Timing Specification (GTS) specifying the periods of time during which the address can be used. This is used to specify different addresses for different times of the week or year.

2.21.1.3
Is Not Ordered (isNotOrdered : BL)

Definition:      A boolean value specifying whether the order of the address parts is known or not. While the address parts are always a Sequence, the order in which they are presented may or may not be known. Where this matters, the isNotOrdered property can be used to convey this information.

2.21.1.4
Equality (equal : BL, inherited from ANY)

Two address values are considered equal if both contain the same address parts, independent of ordering. Use code and valid time are excluded from the equality test.

Definition 102:
invariant(AD x, y) x.nonNull.and(y.nonNull) {
  x.equal(y).equal((
        forall(ADXP p) where x.contains(p) {
          y.contains(p);
        }).and(
        forall(ADXP p) where y.contains(p) {
          x.contains(p);
        }));
};
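The order-independent comparison described in this section can be sketched in Python as mutual containment: every part of one address must be contained in the other, in both directions. The tuple representation of an address part (type code or None, plus text) and the function name are assumptions made for illustration only.

```python
from typing import List, Optional, Tuple

# Hypothetical representation of an address part: (part type code or None, text).
AddressPart = Tuple[Optional[str], str]


def addresses_equal(x: List[AddressPart], y: List[AddressPart]) -> bool:
    """True if x and y contain the same address parts, regardless of order."""
    every_x_in_y = all(part in y for part in x)
    every_y_in_x = all(part in x for part in y)
    return every_x_in_y and every_y_in_x
```

Because the test is based on containment only, repeated parts are not counted. Use codes and valid time are not part of the comparison and are therefore not represented here.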
      
2.21.1.5
Formatting Address (formatted : ST)

Definition:      A character string value with the address formatted in lines and with proper spacing. This is only a semantic property to define the function of some of the address part types. [34]

The AD data type's main purpose is to capture postal addresses, such that one can visit that address or send mail to it. Humans will look at addresses in printed form, such as on a mailing label. The AD data type defines precise rules of how its data is formatted. [35]

Addresses are ordered lists of address parts. Each address part is printed in the order of the list from left to right and top to bottom (or in any other language-specific reading direction; determining that direction is outside the scope of this specification). Every address part value is printed. Most address parts are framed by whitespace. The following six rules govern the setting of whitespace.

  1. Whitespace never accumulates, i.e. two subsequent spaces are the same as one. Subsequent line breaks can be reduced to one. Whitespace around a line break is not significant.

  2. Literals may contain explicit whitespace, subject to the same white space reduction rules. There is no notion of a literal line break within the text of a single address part.

  3. Leading and trailing explicit whitespace is insignificant in all address parts, except for delimiter (DEL) address parts.

  4. By default, an address part is surrounded by implicit whitespace.

  5. Delimiter (DEL) address parts are not surrounded by any implicit white space.

  6. Leading and trailing explicit whitespace is significant in delimiter (DEL) address parts.

This means that all address parts are generally surrounded by whitespace, but whitespace never accumulates. Delimiters are never surrounded by implicit whitespace, and any whitespace contributed by preceding or succeeding address parts is discarded, whether it was implicit or explicit.
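The six whitespace rules can be illustrated with the following Python sketch, which renders a list of address parts into a formatted string. It is one possible reading of the rules, not a normative algorithm. The tuple representation of a part (type code or None, plus text) and the function name `format_address` are assumptions. The treatment of a delimiter without a value as a line break follows the definition of DEL in Table 22.

```python
import re
from typing import List, Optional, Tuple

# Hypothetical representation of an address part: (part type code or None, text).
AddressPart = Tuple[Optional[str], str]


def format_address(parts: List[AddressPart]) -> str:
    out = []
    need_space = False  # True when the previous printed part was not a delimiter
    for part_type, text in parts:
        if part_type == "DEL":
            # Rules 5 and 6: no implicit whitespace, explicit whitespace is kept.
            # A delimiter without a value appears as a line break.
            out.append(text if text else "\n")
            need_space = False
        else:
            # Rules 1 to 3: collapse inner whitespace, drop leading and trailing.
            value = re.sub(r"\s+", " ", text).strip()
            if not value:
                continue
            # Rule 4: implicit whitespace, here only between two non-delimiters.
            if need_space:
                out.append(" ")
            out.append(value)
            need_space = True
    result = "".join(out)
    result = re.sub(r"[ \t]+", " ", result)     # spaces never accumulate
    result = re.sub(r"\s*\n\s*", "\n", result)  # whitespace around line breaks
    return result.strip()
```

For example, the parts STR "Windsteiner Weg", BNR "54a", DEL ",", an empty DEL, CNT "D", DEL "-", ZIP "14165", and CTY "Berlin" are rendered as "Windsteiner Weg 54a," on the first line and "D-14165 Berlin" on the second.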

The following shows examples of addresses in the XML ITS form. The address

1050 W Wishard Blvd,
RG 5th floor,
Indianapolis, IN 46240.

can be encoded in any of the following forms: [36]

The first form would result from a system that only stores addresses as free text or in a list of fields line1, line2, etc.:

Example 3:

<addr use="WP">
1050 W Wishard Blvd,
RG 5th floor,
Indianapolis, IN 46240
</addr>

The second form is more specific about the role of the address parts than the first one:

Example 4:

<addr use="WP">
    <streetAddressLine>1050 W Wishard Blvd</streetAddressLine>,

    <streetAddressLine>RG 5th floor</streetAddressLine>,

    <city>Indianapolis</city>,

    <state>IN</state>
    <postalCode>46240</postalCode>
</addr>

This form is the typical form seen in the U.S., where street address is sometimes separated, and city, state and ZIP code are always separated.

The third is even more specific:

Example 5:

<addr use="WP">
    <houseNumber>1050</houseNumber>
    <direction>W</direction>
    <streetName>Wishard Blvd</streetName>,

    <additionalLocator>RG 5th floor</additionalLocator>,

    <city>Indianapolis</city>,

    <state>IN</state>
    <postalCode>46240</postalCode>
</addr>

The third form is not used in the USA. However, it is useful in Germany, where many systems keep the house number as a distinct field. For example, the German address:

Windsteiner Weg 54a,
D-14165 Berlin

would most likely be encoded as follows: [37]

Example 6:

<addr use="HP">
    <streetName>Windsteiner Weg</streetName>
    <houseNumber>54a</houseNumber>,

    <country>D</country>-

    <postalCode>14165</postalCode>
    <city>Berlin</city>
</addr>
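To show how such an XML form relates to the abstract list of address parts, the following Python sketch reads an addr element and returns its use codes and parts. It assumes a well-formed element with a space before the use attribute. The mapping from element names to part type codes is inferred from the element names in the examples and the names in Table 22, and it is not quoted from the XML ITS. Text outside any child element is treated here as an unclassified part.

```python
import xml.etree.ElementTree as ET
from typing import List, Optional, Tuple

# Assumed mapping from example element names to Table 22 codes (illustrative).
ELEMENT_TO_PART_TYPE = {
    "streetAddressLine": "SAL", "houseNumber": "BNR", "direction": "DIR",
    "streetName": "STR", "additionalLocator": "ADL", "city": "CTY",
    "state": "STA", "postalCode": "ZIP", "country": "CNT",
}

AddressPart = Tuple[Optional[str], str]


def address_parts(xml_text: str) -> Tuple[List[str], List[AddressPart]]:
    """Return (use codes, address parts) for one addr element."""
    addr = ET.fromstring(xml_text)
    use = addr.get("use", "").split()
    parts: List[AddressPart] = []

    def add_untyped(text: Optional[str]) -> None:
        # Text between child elements becomes an unclassified part.
        if text and text.strip():
            parts.append((None, text.strip()))

    add_untyped(addr.text)
    for child in addr:
        part_type = ELEMENT_TO_PART_TYPE.get(child.tag)
        parts.append((part_type, (child.text or "").strip()))
        add_untyped(child.tail)
    return use, parts
```

Applied to the German example, the sketch yields the use code HP and the parts STR, BNR, an unclassified ",", CNT, an unclassified "-", ZIP, and CTY, in document order.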

2.22

Entity Name Part (ENXP) specializes ST

Definition:      A character string token representing a part of a name. May have a type code signifying the role of the part in the whole entity name, and a qualifier code for more detail about the name part type. Typical name parts for person names are given names, family names, titles, etc.