ANSI/HL7 V3 DT, R1-2004 HL7 Version 3 Standard: Data Types - Abstract Specification, Release 1 11/29/2004
| Chair/Editor | Gunther Schadow gunther@aurora.rg.iupui.edu Regenstrief Institute for Health Care |
| Editor | Paul Biron paul.v.biron@kp.org Kaiser Permanente, Southern California |
| Editor | Lloyd McKenzie lmckenzi@ca.ibm.com IBM Global Services |
| Editor | Grahame Grieve grahame@kestral.com.au Kestral Computing Pty. Ltd. |
| Editor | Doug Pratt Douglas.Pratt@siemens.com Siemens |
Last Published: 07/30/2007 6:20 PM
HL7® Version 3 Standard, © 2007 Health Level Seven®, Inc. All Rights Reserved.
HL7 and Health Level Seven are registered trademarks of Health Level Seven, Inc. Reg. U.S. Pat & TM Off
This document specifies the HL7 Version 3 Data Types on an abstract layer, independent of representation. By "independent of representation" we mean independent of both abstract syntax and implementation in any particular implementation technology.
This document is accompanied by Implementation Technology Specifications (ITS). The ITS documents can serve as a quick compendium to the data types that is more practically oriented toward the representation in that particular implementation technology.
Vocabulary tables within this specification list the current contents of vocabulary domains for ease of reference by the reader. However, at any given time the normative source for these domains is the vocabulary tables in the RIM database. For some large domains, only a sample of possible values is shown. The complete domains can be referenced in the vocabulary tables by looking up the domain name associated with the table in the RIM vocabulary tables.
This specification is the result of many years of intense work through e-mail, telephone conferences, meeting discussions, and ballot reconciliation. Thanks go to many individuals who participated at various times in design, discussions and ballot review. Gunther Schadow (Regenstrief Institute for Health Care) chaired this task force, and is the main author of this document. Paul V. Biron (Kaiser Permanente), Doug Pratt (Siemens), Lloyd McKenzie (IBM), and Grahame Grieve (Kestral Computing Pty. Ltd.) have served as co-editors at various times. Major contributions of thoughts and support came from Mark Tucker (Regenstrief Institute), George Beeler, Stan Huff (Intermountain Health Care), as well as Mike Henderson (Kaiser Permanente), Anthony Julian (Mayo), Joann Larson (Kaiser Permanente), Mark Shafarman (Oacis Healthcare Systems), Wes Rishel (Gartner Group), and Robin Zimmerman (Kaiser Permanente). Acknowledgements for their critical review and infusion of ideas go to Bob Dolin (Kaiser Permanente), Clem McDonald (Regenstrief Institute), Kai Heitmann (HL7 Germany), Rob Seliger (Sentillion), and Harold Solbrig (Mayo Clinic). Vital support came from the members of the task force: Laticia Fitzpatrick (Kaiser Permanente), Matt Huges, Randy Marbach (Kaiser Permanente), Larry Reis (Wizdom Systems), Carlos Sanroman (Kaiser Permanente), and Greg Thomas (Kaiser Permanente). Thanks to James Case (University of California, Davis), Norman Daoust (Partners HealthCare Systems), Irma Jongeneel (HL7 The Netherlands), Michio Kimura (HL7 Japan), John Molina (SMS), Richard Ohlmann (McKessonHBOC), David Rowed (HL7 Australia), and Klaus Veil (Macquarie Health Corp., HL7 Australia), for sharing their expertise on critical questions. This work was made possible by the Regenstrief Institute for Health Care.
Every data element has a data type. Data types define the meaning (semantics) of data values that can be assigned to a data element. Meaningful exchange of data requires that we know the definition of values so exchanged. This is true for complex "values" such as business messages as well as for simpler values such as character strings or integer numbers.
According to ISO 11404, a data type is "a set of distinct values, characterized by properties of those values and by operations on those values." A data type has intension and extension. Intensionally, the data type defines the properties exposed by every data value of that type. Extensionally, data types have a set of data values that are of that type (the type's "value set").
Semantic properties of data types are what ISO 11404 calls "properties of those values and [...] operations on those values." A semantic property of a data type is referred to by a name and has a value for each data value. The value of a data value's property must itself be a value defined by a data type - no data value exists that would not be defined by a data type.
Data types are thus the basic building blocks used to construct any higher order meaning: messages, computerized patient record documents, or business objects and their transactions. What, then, is the difference between a data type and a message, document, or business object? Data type values stand for themselves: the value is all that counts, and neither identity nor state nor change of state is defined for a data value. Conversely, in business objects we track state and identity; the properties of an identical object might change between now and later. Not so with data values: a data value and its properties are constant. For example, number 5 is always number 5; there is no difference between this number 5 and that number 5 (no identity distinguished from value), and number 5 never changes to number 6 (no change of state). One can think of data values as immutable objects where identity does not matter (identity and equality are the same).1
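The contrast between data values and business objects can be illustrated in code. The following Python fragment is only a sketch and not part of this specification; the names `Quantity` and `PatientRecord` are hypothetical. A frozen dataclass stands in for a data value whose equality is determined by its properties alone, while an ordinary class stands in for a business object that has identity and changeable state.

```python
from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=True)
class Quantity:
    """Hypothetical data value: equality depends on the properties only."""
    magnitude: int


class PatientRecord:
    """Hypothetical business object: has identity and changeable state."""

    def __init__(self, status: str) -> None:
        self.status = status


five_a = Quantity(5)
five_b = Quantity(5)
# Two occurrences of "number 5" are indistinguishable as values.
values_equal = five_a == five_b

# A value cannot change its state; the attempt is rejected.
try:
    five_a.magnitude = 6  # type: ignore[misc]
    mutation_rejected = False
except FrozenInstanceError:
    mutation_rejected = True

# Business objects with the same state are still distinct objects
# (default equality is identity), and their state may change over time.
rec_a = PatientRecord("active")
rec_b = PatientRecord("active")
objects_equal = rec_a == rec_b
rec_a.status = "completed"
```

Whether an implementation technology offers such immutable value objects is a matter for that technology; the sketch only mirrors the semantic distinction drawn above.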
Data values can be represented through various symbols but the data value's meaning is not bound to any particular representation.
For example, cardinal numbers (non-negative integers) are defined - intensionally - as a data type where each value has a successor value, and where zero is the successor of no other cardinal value. Based on this definition we can define addition, multiplication, and other mathematical operations. Whatever representation reflects the rules we stated in the intensional definition of the cardinal data type is a valid representation of cardinal numbers. Examples of valid cardinal number representations are decimal digit strings, bags of glass marbles, or scratches on a wall. The number five is represented by the word "five", by the Arabic number "5", or by the Roman number "V". The representation does not matter as long as it conforms to the semantic definition of the data type.
As another example, the Boolean data type is defined by its extension, the two distinct values true and false, and by the rules of negation and of combining these values in conjunction and disjunction. The representation of Boolean values can be the words "true" and "false," "yes" and "no," the numbers 0 and 1, or any two signs that are distinct from each other. The representation of data types does not matter as long as it conforms to the semantic definition of the data type.
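As an informal illustration of the cardinal example, the Python sketch below defines a value purely through "zero" and "successor", derives addition from that definition, and then reads the same value from two different representations (a decimal digit string and scratches on a wall). All names (`Cardinal`, `from_decimal`, `from_tally`, `to_decimal`) are assumptions made for this sketch; nothing here is prescribed by the specification.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Cardinal:
    """Hypothetical cardinal: zero, or the successor of another cardinal."""
    predecessor: Optional["Cardinal"] = None  # None means this value is zero

    def successor(self) -> "Cardinal":
        return Cardinal(self)

    def plus(self, other: "Cardinal") -> "Cardinal":
        # Addition defined only in terms of zero and successor.
        result, cursor = self, other
        while cursor.predecessor is not None:
            result, cursor = result.successor(), cursor.predecessor
        return result


ZERO = Cardinal()


def from_decimal(text: str) -> Cardinal:
    """Read a cardinal from a decimal digit string."""
    value = ZERO
    for _ in range(int(text)):
        value = value.successor()
    return value


def from_tally(marks: str) -> Cardinal:
    """Read a cardinal from scratches on a wall, one scratch per unit."""
    value = ZERO
    for _ in marks:
        value = value.successor()
    return value


def to_decimal(value: Cardinal) -> str:
    """Write a cardinal as a decimal digit string."""
    count = 0
    while value.predecessor is not None:
        count, value = count + 1, value.predecessor
    return str(count)


five_from_digits = from_decimal("5")
five_from_scratches = from_tally("|||||")
same_value = five_from_digits == five_from_scratches
seven = to_decimal(from_decimal("2").plus(five_from_scratches))
```

Both readers arrive at the same value, which is the point of the paragraph above: the representation is interchangeable as long as the successor rules are respected.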
This specification defines the semantics, the meaning, of the HL7 data types. This specification is about semantics only, independent from representational and operational concerns or specific implementation technologies. Additional standards for representing the data values defined here are being defined for various technological approaches. These standards are called "Implementable Technology Specifications" (ITS). The ITS define how values are represented so that they conform to the semantic definitions of this specification; this may include syntaxes for character or binary representations, and computer procedures to act on the representation of data values. The meaning of these ITS representations, as communicated, generated, and processed in computer programs, is defined based on this standard, the semantic data type specification.
Data values have properties defined by their data type. The "fields" of "composite data types" are the most common example of such properties. However, more generally one should think of a data value's property as logical predicates or as mathematical functions; in simpler but still correct terms, properties are questions one can ask about a data value to receive another data value as an answer.
A property is referred to by its name. For example, the data type integer may have a property named "sign." A property has a domain, which is the set of possible "answer" values. The set of possible "answer" values is defined by the property's data type, but the domain of a property may be a subset of the data type's value set.
A property may also have arguments, additional information one must supply with a question to get an answer. For example, an important property of an integer number is that one integer plus another integer results in another integer, so the plus property of one integer needs an argument: the other integer.
Whether semantic properties have arguments is not a fundamentally relevant distinction. A data type's semantic property without arguments is not necessarily a "field" of a "composite" data type. For example, for integer values, we can define the property is-zero that has the Boolean value true when the number is zero and false when the number is not zero. This does not mean that is-zero must be an explicit component of any integer representation.
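A small sketch may make the idea of properties as "questions" more concrete. In the hypothetical Python class below (the name `Int` and its internals are assumptions of this example), the representation stores only a sign flag and a digit string. The properties `is_zero` and `sign` are answered by inference rather than stored as fields, and `plus` is a property that needs one argument.

```python
class Int:
    """Hypothetical integer value; only a sign flag and digits are stored."""

    def __init__(self, literal: str) -> None:
        self._negative = literal.startswith("-")
        self._digits = literal.lstrip("+-").lstrip("0") or "0"

    @property
    def is_zero(self) -> bool:
        # Answered by inference; no "is_zero" field exists in the representation.
        return self._digits == "0"

    @property
    def sign(self) -> int:
        if self.is_zero:
            return 0
        return -1 if self._negative else 1

    def plus(self, other: "Int") -> "Int":
        # A property with one argument: the other integer.
        total = self.sign * int(self._digits) + other.sign * int(other._digits)
        return Int(str(total))

    def __repr__(self) -> str:
        return ("-" if self.sign < 0 else "") + self._digits
```

For instance, `Int("0").is_zero` answers true and `Int("2").plus(Int("40"))` answers the value 42, although neither answer is held as a component of the value.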
A data type's semantic property with arguments has no specific operational notions such as "procedure call," "passing arguments," "return values," "throwing exceptions," etc. These are all concepts of computer systems implementation of data types - but these operational notions are irrelevant for the semantics of data types.
This specification is about semantics of data types only. Neither is it about value representation syntax (not even an abstract syntax), nor is it about an operational interface to the data values.
Why does this specification make such a big issue about its being abstract from representation syntax as well as operational implementation?
HL7 needs this kind of abstract semantic data type specification for a very practical purpose. One important design feature of HL7 version 3 is its openness towards representation and implementation technologies. All HL7 version 3 specifications are supposed to be done in a form independent from specific representation and implementation technologies. HL7 acknowledges that, while at times some representation and implementation technologies may be more popular than others, technology is going to change - and with changing technology, representations of data values will change. HL7 standards are primarily targeted to healthcare domain information, independent from the technology supporting this information. HL7 expects that specifications defined independent from today's technology will continue to be useful, even after the next technological "paradigm shift".
The issue of data types is closer to implementation technology than most other HL7 information standards - and therein lies a certain danger that we define data types too dependent on current implementation technologies.
The majority of HL7 standards are about complex business objects. Complex business objects with many informational attributes can be specified as abstract syntax, where components are eventually defined in terms of data types. Conversely, defining data types in terms of abstract syntax is of little use because the components of such abstract syntax constructs would still have to have data types.2
Why is this specification so circular? Why is the data type "ANY" defined in terms of specializations of itself?
This specification needs to be independent of any particular implementation, and is therefore abstract, and not intended to be implementable. In this sense, the circularity is not a problem, since it does not introduce any uncertainty about what this specification says.
Why doesn't this specification define a set of primitive data types based on which composite data types could be defined simply as abstract syntax?
Any concrete implementation of the HL7 standards must ultimately use the built-in data types of its implementation technology. Therefore, we need a very flexible mapping between HL7 abstract data types and those data types built into any specific implementation technology. With a semantic specification, an Implementable Technology Specification (ITS) can conform simply by stating a mapping between the constructs of its technology and the HL7 version 3 data type semantics. Whether a data type is primitive or composite is irrelevant from a semantic perspective, and the answer may be different for different implementation technologies.
For example, this standard specifies a character string as a data type with many properties (e.g., charset, language, etc.). However, in many Implementation Technologies, character strings are primitive first class data types. We encourage that these native data types be used rather than a structure that slavishly represents all the semantic properties as "components." This specification only requires that the properties defined for data values can somehow be inferred from whatever representation is chosen; it does not matter how these values are represented. Whether "primitive" or "composite", with few or many "components", as "fields" or "methods" - this is all irrelevant.
For another example, a decimal representation, a floating-point register and a scaled integer are all possible native representations of real numbers for different implementation technologies. Some of these representations have properties that others do not have. Scaled integers, for instance, have a fixed precision and a relatively small range. Floating-point values have variable precision and a large range, but floating-point values lose any information about precision. Decimal representations are of variable precision and maintain the precision information (yet are slow to process). The data type semantics must be independent from all these accidental properties of the various representations, and must define the essential properties that any technology should be able to represent.
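The differences between these native representations can be observed directly. The Python sketch below is illustrative only: it uses the standard `decimal` module for the decimal representation, the built-in `float` for floating point, and a plain integer with an assumed scale of 100 (two decimal places) for the scaled integer.

```python
from decimal import Decimal

# Decimal representation: the precision written in the literal is retained.
d = Decimal("1.50")
decimal_text = str(d)
decimal_places = -d.as_tuple().exponent

# Floating point: the same numeric value, but the trailing zero
# (the stated precision) cannot be recovered from the value.
f = float("1.50")
float_text = str(f)

# Scaled integer: precision is fixed by the scale chosen in advance.
SCALE = 100          # assumption of this sketch: two decimal places
scaled = 150         # represents 1.50
scaled_text = f"{scaled // SCALE}.{scaled % SCALE:02d}"

# As numbers, the decimal values with and without trailing zero are equal.
numerically_equal = Decimal("1.50") == Decimal("1.5")
```

Here the decimal value still reports two decimal places, while the floating-point value prints as `1.5`; the numeric value is the same in all three cases.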
Why does HL7 need its own data type standard? Why can't HL7 simply adopt a standard defined by some other body?
As noted in the previous section, all HL7 implementation technologies have some data type system, but there are differences among the data type systems between implementation technologies. In addition, many implementation technologies' data type systems are not powerful enough to express the concepts that matter for the HL7 application layer.
For example, few implementation technologies provide the concepts of physical quantities, precision, ranges, missing information, and uncertainty that are so relevant in scientific and health care computing.
On the other hand, implementation technologies do make distinctions that are not relevant from the abstract semantics viewpoint, e.g., fixed point vs. floating-point real numbers; 8, 16, 32, or 64-bit integers; date vs. timestamp.
A number of data type systems have been used as input to this specification. These include the type systems of many major programming languages, including BASIC, Pascal, MODULA-2, C, C++, JAVA, ADA, LISP and SCHEME. This also includes type systems of language-independent implementation technologies, such as Abstract Syntax Notation One (ASN.1), Object Management Group's (OMG) Interface Definition Language (IDL) and Object Constraint Language (OCL), SQL 92 and SQL 99, the ISO 11404 language independent data types, and XML Schema Part 2 data types. Health care standards related data types have been considered as well, among these HL7 version 2.x, types used by CEN TC 251 messages and Electronic Health Record Architecture (EHCRA) and DICOM.
The data types described in this specification are designed to meet a number of requirements. These include
Of these, the last is the most important consideration. These data types are designed to deliver the functionality required throughout the HL7 standards. These requirements are not always compatible, and throughout this specification there are a number of places where particular design features are less than optimal for one of the 4 considerations listed above. In a number of these places, the requirements that led to this design feature are described in a requirements section. These requirements sections are only informative, not normative.
Requirement: The Reference Information Model defines a number of reference classes on which all domain information models are based. Each of these reference classes has a series of attributes, each of which has an assigned type. Where the reference classes are used in (cloned into) domain models, the types in the reference classes may be replaced by other types to clarify and constrain the use of the attribute in the clone classes. This data types specification must define the rules for which data types can be substituted in this fashion. This specification chooses to use the specialization metaphor as a basis for the substitution rules, since this is a widely understood and used method in theory and practice, and because these rules are more easily understood and managed than the alternatives. This use of specialization may lead to designs that may appear unfamiliar to some.
This specification defines data types in several forms, using textual description, UML diagrams, tables, and a formal definition.
A formal definition of data types is used in order to clarify the semantics of the proposed types as unambiguously as possible. This data type definition language is described in detail in Introduction to the Formal Data Type Definition Language (DTDL) (§ 1.9). Formal languages make crisp essential statements and are therefore accessible to some formal argument of proof or rebuttal. However, the terseness of such formal statements may also make them difficult for humans to understand. Therefore, all the important inferences from the formal statements are also included as plain English statements.
For a quick overview, at the beginning of many data types this specification contains tables listing "primary" properties. "Primary" properties are a somewhat fuzzy notion of those properties that are more likely to be thought of as "fields" if the data type were implemented as a record, or that are expected to be used more often. These tables are provided to facilitate an overview of the content and purpose of data types. There is no requirement that the properties listed in these tables be represented as fields, and these tables are not abstract syntax definitions.
Each row of the property tables describes one property with the following columns:
The Unified Modeling Language (UML) is used for a graphical presentation of how data types relate to each other. Data types are shown as UML classes using the shortname for the class. Properties of types are shown as UML operations. Generic types are shown as UML parameterized classes, with UML realization links relating their instantiations.
Much of the detail of the data type declarations cannot be represented in the UML representation. Therefore the formal definition of the data types in the Data Type Definition Language (DTDL) should be used for detailed specification of the data types.
Some of the constraints from the DTDL are represented as constraints on the operations. Where constraints are shown, they are statements that will be true and are taken from the DTDL specification.
The UML Diagrams use a stereotype "mixin". The mixin stereotype applies to a parameterized class, and denotes that the class specializes the parameter type T and expresses all the properties of the type T in addition to its own properties.
NOTE: This is not an API specification. While this formal language might resemble some programming language or interface definition language, it is not intended to define the details of programs and other means of implementation. The formal definitions are a normative part of this specification, but this particular language need not be implemented or used in conformant systems; nor need all the semantic properties be implemented or used by conformant systems. The internal working of systems, their way to implement data types, their functionality and services is entirely out of scope of this specification. The formal definition only specifies the meaning of the data values by making statements about how one would theoretically expect these values to relate and behave.
This formal data type definition language3 specifies:
Definition of a data type occurs in two steps. First, the data type is declared. The declaration claims a name for a new data type with a list of names, types, and signatures of the new type's semantic properties. This declares, but does not define, the type. Second, the type is defined through logic statements about what is always true about this type's values and their properties (invariant statements).
Every data type is declared in a form that begins with the keyword type. For example, the following is the header of a declaration for the data type Boolean that has the short name alias BL and specializes the data type ANY.4
|
The Boolean data type declaration also contains a values-clause that declares the Boolean's complete set of values (its extension) as named entities. These named values are also valid character string literals. None of the other data types defined in this specification has a finite value set, which is why the values-clause is unique to the Boolean. In the marked-up formal language, value names use Italics font.
The block in curly braces following the header contains declarations of the semantic properties that hold for every value of the data type. A semicolon terminates each property declaration; and another semicolon after the closing curly brace terminates the data type declaration.
A property declaration mentions from left to right: (1) the data type of the property's value domain, (2) the property name, and (3) an optional argument list. The argument list of a property is enclosed in parentheses containing a sequence of argument declarations. Each argument is declared by the data type name and argument name. Semantic properties without arguments do not use an empty argument list.5
The specializes-clause means (a) inheritance of properties from the genus to the species, and (b) substitutability of values of the species type for variables of the genus type. Specialization can include the definition of additional properties and the specification of constraints on inherited properties for the specialized type.
An example for inheritance is: when CD has the property code and CS specializes CD, then CS also has this property code even though code is not listed explicitly in the property declaration of CS. An example for substitutability is: when a property is declared as of a data type CD, and CS specializes CD, then a value of such a property may be of type CS. In other words, substitutability is the same as subsumption: all values of type CS are also values of type CD.6
The type-declaration may be qualified by the keyword abstract, protected, or private. An abstract type is a type where no value can be just of this type without belonging to a concrete specialization of the abstract type. A protected type is a type that is used inside this specification, but no property outside this specification should be declared of a protected type. A private type is an internal "helper" abstraction, defined only for the purpose of defining some aspect of the semantics of data types, that is not used even as the type of another protected or public type's property.7 (We also use the qualifier private at one point. Private types are only specified for the sake of formal definition of other types and are not used in any form outside this specification.)
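The two consequences of the specializes-clause, inheritance and substitutability, have close analogues in object-oriented languages. The Python sketch below borrows the type names CD and CS from the example above, but the class bodies and the function `describe` are hypothetical and greatly simplified; they are not the definitions of these types.

```python
class CD:
    """Hypothetical stand-in for the genus type, with one property."""

    def __init__(self, code: str) -> None:
        self.code = code


class CS(CD):
    """Hypothetical species type: declares nothing itself, inherits `code`."""


def describe(value: CD) -> str:
    # Declared for the genus type CD; any CS value may be substituted.
    return f"code={value.code}"


status = CS("active")
inherited = hasattr(status, "code")    # (a) inheritance of properties
substituted = describe(status)         # (b) substitutability of values
is_also_cd = isinstance(status, CD)    # every CS value is also a CD value
```

The last line corresponds to the subsumption reading: a value of the species type is at the same time a value of the genus type.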
The declaration of semantic properties, their names, data types, and arguments provide only clues as to what the new data type might be about. The true definition lies in the invariant statements. Invariant statements are logical statements that are true at all times.
Throughout this specification, invariant statements are provided in a formal syntax but are also written in plain English. The advantage of the formal syntax is that it can be interpreted unambiguously, and that it is strongly typed. The advantage of plain English statements is that they are more understandable, especially to those untrained in reading formal languages.
The formal syntax does help to sharpen the decisiveness of this specification. In some cases, however, the full semantics of a type are beyond what can be fully expressed in such invariant statements. The combination of both plain and formal language helps to make this specification more clear.
Invariant statements are formed using the invariant keyword that declares one or more variables in the same form as an argument list of a property. The invariant statement can contain a where clause that constrains the arguments for the entire invariant body. The invariant body is enclosed in curly braces. It contains a list of assertions that must all be true.
|
The semantics of the invariant statement is a logic predicate with a universal quantifier ("for all").
The above invariant statement can be read in English as "For all Boolean values x, where x is non-NULL it holds that x AND true equals x." All properties should be named such that one can read the assertions like English sentences.8
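Because the Boolean type has a finite value set, the universally quantified reading of this invariant can be checked exhaustively. The Python sketch below is an informal illustration, not a rendering of the formal language: `None` is assumed to stand in for a NULL value, and the helper names `invariant` and `bl_and` are hypothetical.

```python
from typing import Callable, Iterable

# Assumption of this sketch: None stands in for a NULL Boolean value.
ALL_BL = (True, False, None)


def invariant(values: Iterable[object],
              where: Callable[[object], bool],
              body: Callable[[object], bool]) -> bool:
    """Universal quantifier: the body must hold for every value passing `where`."""
    return all(body(x) for x in values if where(x))


def bl_and(x: bool, y: bool) -> bool:
    """Hypothetical 'and' property of a non-NULL Boolean value."""
    return x and y


# "For all Boolean values x, where x is non-NULL, x AND true equals x."
holds = invariant(
    ALL_BL,
    where=lambda x: x is not None,
    body=lambda x: bl_and(x, True) == x,
)

# A statement that is not an invariant: "x AND false equals x".
fails = invariant(
    ALL_BL,
    where=lambda x: x is not None,
    body=lambda x: bl_and(x, False) == x,
)
```

For types with infinite value sets no such enumeration is possible, which is why invariants are stated as logic predicates rather than as tests.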
The argument list of an invariant statement need not be specified if no such argument is needed.
|
Assertions in invariant statements are expressions built with the semantic properties of defined data types. Assertion expressions must have a Boolean value (true or false.)9 No primitive data types, or operations, pre-exist the definition of any data type. The only preexisting features of the assertion expression language are:10
Within assertion expressions, nested quantifier statements can be formed similar to invariant statements. In fact, the universal quantifier built using the forall keyword is the same as the invariant statement. The universal quantifier can be used in a nested expression when the complexity of the problem requires it, such as in the following example:
|
The existence quantifier has the same meaning as in common predicate logic. For example, the following invariant means: "SET values x and y intersect if and only if there exists an element e that is contained in both sets x and y."
|
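The English reading of this invariant can be mirrored in a few lines of Python, where the built-in `any` plays the role of the existence quantifier. This is only a sketch over a handful of sample sets; the function names and the sample values are assumptions of the example.

```python
from itertools import product


def intersects(x: frozenset, y: frozenset) -> bool:
    """Hypothetical 'intersects' property, computed with the built-in set test."""
    return not x.isdisjoint(y)


def exists_common_element(x: frozenset, y: frozenset) -> bool:
    """Existence quantifier: some element e is contained in both x and y."""
    return any(e in x and e in y for e in x | y)


samples = [frozenset(), frozenset({1}), frozenset({1, 2}), frozenset({3})]

# "x and y intersect if and only if there exists an element e in both."
equivalence_holds = all(
    intersects(x, y) == exists_common_element(x, y)
    for x, y in product(samples, repeat=2)
)
```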
The existence quantifier may have a where-clause; however, there is no difference whether an assertion is made as a where-clause or in the body of the existence quantifier. Conversely, for universal quantifiers, the where-clause weakens the assertion since the body now only applies for values that meet the criterion in the where-clause.
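The different effect of a where-clause on the two quantifiers can be seen in a small experiment. In the Python sketch below, the value range, the where-criterion, and the body are arbitrary choices made for illustration: for the existence quantifier the where-clause behaves like a conjunction with the body, while for the universal quantifier it behaves like the premise of an implication.

```python
values = range(-3, 4)


def where(x: int) -> bool:
    return x > 0


def body(x: int) -> bool:
    return 2 * x > x


# Existence quantifier: where-clause and body assertion are interchangeable.
exists_with_where = any(body(x) for x in values if where(x))
exists_in_body = any(where(x) and body(x) for x in values)

# Universal quantifier: the where-clause weakens the assertion.
forall_with_where = all(body(x) for x in values if where(x))
forall_as_implication = all((not where(x)) or body(x) for x in values)
forall_unrestricted = all(body(x) for x in values)
```

With these choices the restricted universal statement holds, whereas the unrestricted one does not, since the body fails for zero and for negative values.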
This specification defines certain allowable conversions between data types. For example, there is a pair of conversions between the Character String (ST) and Encapsulated Data (ED). This means that if one expects an ED value but actually has an ST value instead, one can turn the ST value into an ED.11
Three kinds of type conversions are defined: promotion, demotion, and character string literals. Type conversions can be implicit or explicit. Implicit type conversion occurs when a certain type is expected (e.g. as an argument to a statement) but a different type is actually provided. If the type provided has a conversion to the type expected the conversion should be done implicitly.
NOTE: An Implementation Technology Specification will have to specify how implicit type conversions are supported. Some technologies support them directly, others do not; in any case, processing rules can be set that specify how these conversions are realized.
An explicit conversion can be specified in an assertion expression using the converted-to type name in parentheses before the converted value. For example, the following is an explicit type conversion in the where clause of an invariant statement.
|
The type conversion has lower priority than the property resolution period. Thus "(T)a.b " converts the value of the property b of variable a to data type T while "((T)a).b " converts the value of variable a to T and then references property b of that converted value.
Implicit type conversions in the assertion expressions are performed where possible. If a property's formal argument is declared of data type T, the expression used as an actual argument is of type U, U does not extend T, and U defines a conversion to T, then that conversion from U to T takes effect.
A demotion is a conversion with a net loss of information. Generally, this means that a more complex type is converted into a simple type.
An example for a demotion is the conversion from Interval (IVL) to a simple Quantity (QTY), e.g. the center of the interval. In the data type definition language, a demotion is declared using the keyword demotion and the data type name to which to demote:
|
The specification of demotions shall indicate what information is lost and what the major consequences of losing this information are.
A promotion is a conversion where new information is generated. Generally, this means that a simpler type is converted into a more complex type.
For example, we allow any Quantity (QTY) to be converted to an Interval (IVL). However, IVL has more semantic properties than QTY, namely the low and high boundaries. Thus, the conversion of QTY to IVL is a promotion. The additional properties of IVL not present in QTY must assume new values, default values, or computed values. The specification of the promotion must indicate what these values are or how they can be generated.
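The information gained in a promotion and lost in a demotion can be made visible with a short sketch. The Python code below is hypothetical throughout: `Interval` is a stand-in for IVL reduced to its two boundaries, a quantity is modeled as a plain `Fraction`, the demotion takes the center of the interval as mentioned earlier, and the promotion rule (both boundaries set to the quantity itself) is an assumption of this example rather than a rule stated in this section.

```python
from dataclasses import dataclass
from fractions import Fraction


@dataclass(frozen=True)
class Interval:
    """Hypothetical stand-in for IVL with its low and high boundaries."""
    low: Fraction
    high: Fraction

    def demote(self) -> Fraction:
        # Demotion: keep only the center; the width of the interval is lost.
        return (self.low + self.high) / 2


def promote(quantity: Fraction) -> Interval:
    # Promotion: the new properties need values. Assumption of this sketch:
    # both boundaries are set to the quantity itself.
    return Interval(low=quantity, high=quantity)


wide = Interval(Fraction(2), Fraction(6))
center = wide.demote()
round_trip = promote(center)
information_lost = round_trip != wide
```

Promoting the demoted value does not restore the original interval, which shows why a demotion must document what information is lost and a promotion must document how the new values are generated.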
A promoting conversion from type QTY to type IVL is defined as a semantic property of data type QTY using the keyword promotion and the data type name to which to promote:
|
Typically, a promotion is defined from a simple type to a more complex type. Also typically, the simple type is declared earlier in this document than a more complex type. Declaring all promotions to complex types in the simple type would thus involve forward references and would be confusing to the reader. Therefore, an alternative syntax allows promotions to be defined in the more complex type. This is indicated by naming the type from which to promote in an argument list behind the type to which to promote.
|
A literal is a character string representation of a data value. Literals are defined for many types. A literal is a type conversion from and to a Character String (ST) with a specially defined syntax.
Not every conversion from and to an ST is a literal conversion, however. A literal for a data type should be able to represent the entire value set of a data type whereas any other conversion to and from ST may only map a smaller subset of the converted data type.
The purpose of having literals is so that one can write down values in a short human readable form. For example, literals for the types integer number (INT) and real number (REAL) are strings of sign, digits, possibly a decimal point, etc. The more important interval types (IVL<REAL>, IVL<PQ>, IVL<TS>) have literal representations that allow one to use, e.g., "<5" to mean "less than 5", which is much more readable than a fully structured form of the interval. For some of the more advanced data types such as intervals, general timing specification, and parametric probability distribution we expect that the literal form may be the only form seen for representing these values until users have become used to the underlying conceptualizations.
Each literal conversion has its own syntax (grammar), often aligned with what people find intuitive. This syntax may therefore not be completely straightforward from a computer's perspective.12
NOTE: Character string based Implementable Technology Specifications (ITS) of these abstract data types may or may not choose the literals defined here as their representations for these data types. We expect that the XML ITS will use some but not all of the literals defined here.
The actual definition of the literal form occurs outside the data type declaration body using an attribute grammar. An attribute grammar is a grammar that specifies both syntax and semantics of language structures. The syntax is defined in essentially the Backus-Naur-Form (BNF).13
For example, consider the following simple definition of a data type for cardinal numbers (positive integers). This type definition depends only on the Boolean data type (BL) and has a character string literal declared:
|
The literal syntax and semantics are first shown in full and then described in detail.
|
Every syntactic rule consists of the name of a symbol, a colon, and the definition (the so-called production) of the symbol. A production is a sequence of symbols. These other symbols are also defined in the grammar, or they are terminal symbols. Terminal symbols are character strings written in double quotes or string patterns (called regular expressions). Thus the form:
|
means that any cardinal number symbol is a cardinal number symbol followed by a digit, or just a digit. The vertical bar stands for a disjunction (logical OR). A syntactic rule ends with a semicolon.
Every symbol has exactly one value of a defined data type. The data type of the symbol's value is declared where the symbol is defined:
|
means that the symbol digit has a value of type CARD. The start-symbol is the data type itself and does not need a separate name.
The semantics of the literal expression is specified in semantic rules enclosed in curly braces for each of the defined productions of a symbol:
symbol : production_1 { rule_1 } | production_2 { rule_2 } | ... | production_n { rule_n };
A semantic rule is simply a semicolon-separated list of Boolean assertion expressions of the same kind as those used in invariant statements. However, there are special variables defined in the semantic rule that all begin with a dollar character (e.g., $, $1, $2, $3, ...). The simple $ stands for the value of the currently defined symbol, while $1, $2, $3, etc. stand for the values of the parts of the semantic rule's associated production. For example, in
|
the first production "CARD digit" has a semantic rule that says: the value $ of the defined symbol equals the value $1 of the first symbol CARD times ten plus the value $2 of the second symbol digit.14
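The two semantic rules can be read as an ordinary left-to-right evaluation of the digits. The following Python sketch is an illustration only and is not part of this specification; the function name `parse_card` is hypothetical, and Python's built-in `int` stands in for CARD.

```python
def parse_card(literal: str) -> int:
    """Evaluate a cardinal-number literal one digit at a time."""
    if not literal:
        raise ValueError("a CARD literal needs at least one digit")
    value = 0
    for ch in literal:
        if ch not in "0123456789":
            raise ValueError(f"not a digit: {ch!r}")
        # Production "CARD digit": $ = $1 * 10 + $2.
        # For the very first digit this reduces to the production
        # "digit": $ = $1, because value is still 0.
        value = value * 10 + (ord(ch) - ord("0"))
    return value
```

For instance, `parse_card("407")` applies the rule three times: 4, then 4 * 10 + 0, then 40 * 10 + 7.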
A terminal symbol can be specified as a string pattern, a so-called regular expression. The regular expression syntax used here is the classic syntax invented by Aho and used in AWK, LEX, GREP, and PERL. Regular expressions appear between two slashes /.../. In a regular expression pattern every character except [ ] ^ $ . / : ( ) \ | ? * + { } matches itself. The other characters that are actually used in this specification are defined in Table 2.
Generic data types are incomplete type definitions. This incompleteness is signified by one or more parameters to the type definition. Usually parameters stand for other types. Using parameters, a generic type might declare semantic properties of other not fully specified data types. For example, the generic data type Interval is declared with a parameter T that can stand for any Quantity data type (QTY). The components low and high are declared as being of type T.
|
Instantiating a generic type means completing its definition. For example, to instantiate an Interval, one must specify of what base data type the interval should be. This is done by binding the parameter T. To instantiate an Interval of Integer numbers, one would bind the parameter T to the type Integer. Thus, the incomplete data type Interval is completed to the data type Interval of Integer.
For example, the following type definition for MyType declares a property named "multiplicity" that is an interval of the cardinal number data type used in the above examples.
|
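For readers more familiar with programming languages than with the formal definition language, parameter binding can be compared with generic classes. The Python sketch below is an analogy under stated assumptions, not a rendering of the normative definitions: the class names, the `contains` method, the closed boundaries, and the use of `int` and `float` as stand-ins for Quantity types are all hypothetical.

```python
from dataclasses import dataclass
from typing import Generic, TypeVar

# Stand-in for "any Quantity data type (QTY)"; an assumption of this sketch.
T = TypeVar("T", int, float)


@dataclass(frozen=True)
class Interval(Generic[T]):
    """Incomplete type: low and high are of the unbound parameter type T."""
    low: T
    high: T

    def contains(self, x: T) -> bool:
        # Hypothetical helper; both boundaries are treated as closed here.
        return self.low <= x <= self.high


@dataclass(frozen=True)
class MyType:
    # Binding T to int completes Interval to "Interval of Integer".
    multiplicity: Interval[int]
```

A value such as `MyType(Interval(0, 5))` then carries a multiplicity whose low and high are integers.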
Generic data types for collections are used throughout this specification. The most important of them are:
Set (SET<T>) A set contains elements in no particular order and without duplicate elements.
Sequence (LIST<T>) A sequence is a collection of values in an arbitrary but particular order. A sequence has a head and a tail, where the head is an element and the tail is the sequence without its head.
Interval (IVL<T>) An interval is a continuous subset of an ordered type.
These and other generic types are fully defined in Generic Data Types (§ 1.9.5). These generic data types and their properties are used early on in this specification. Knowledge of the set, sequence, and interval types is important for understanding this specification, and the reader is advised to refer to Generic Data Types (§ 1.9.5) when coming across a generic type being used to define another type.
Generic data type extensions are generic types with one parameter type that the generic type specializes. In the formal data type definition language, generic type specializations follow the pattern:
|
These generic type extensions inherit properties of their base type and add some specific feature to it. The generic type extension is a specialization of the base type, thus a value of the extension data type can be used instead of its base data type.15
NOTE: Values of extended types can be substituted for their base type. However, an ITS may make some constraints as to what extensions to accommodate. Particularly, extensions need not be defined for those components carrying the values of data value properties. Thus, while any data value can be annotated outside the data type specification, an ITS may not provide for a way to annotate the value of a data value property.

If an application receives or parses an instance that is not valid with regard to this specification, the receiver is permitted to reject the instance in whatever fashion it deems appropriate but it is not required to. Note that some other HL7 standard or artefact such as a conformance statement may make additional constraints on behaviour in such cases.
Definition: Defines the basic properties of every data value. This is an abstract type, meaning that no value can be just a data value without belonging to any concrete type. Every concrete type is a specialization of this general abstract DataValue type.
|
Definition: Represents the fact that every data value implicitly carries information about its own data type. Thus, given a data value one can inquire about its data type.
|
Definition: Indicates that a value is a non-exceptional value of the data type.
|
When a property, RIM attribute, or message field is called mandatory this means that any non-NULL value of the type to which the property belongs has a non-NULL value for that property, in other words, a field may not be NULL, providing that its container (object, segment, etc.) is to have a non-NULL value.
Definition: Indicates that a value is an exceptional value, or a NULL-value. A null value means that the information does not exist, is not available or cannot be expressed in the data type's normal value set.
Every data element has either a proper value or it is considered NULL. If (and only if) it is NULL, the nullFlavor provides more detail as to in what way or why no proper value is supplied.
|
Definition: If a value is an exceptional value (NULL-value), this specifies in what way and why proper information is missing.
|
The null flavors are a general domain extension of all normal data types. Note the distinction between value domain of any data type and the vocabulary domain of coded data types. A vocabulary domain is a value domain for coded values, but not all value domains are vocabulary domains.
The null flavor "other" is used whenever the actual value is not in the required value domain. This may be the case, for example, when the value exceeds some constraints that are defined too restrictively (e.g., age less than 100 years).
NOTE: NULL-flavors are applicable to any property of a data value or a higher-level object attribute. Where the difference of null flavors is not significant, ITS are not required to represent them. If nothing else is noted in this specification, ITS need not represent general NULL-flavors for data-value properties.
Some of these null flavors are associated with named properties that can be used as simple predicates for all data values. This is done to simplify the formulation of invariants in the remainder of this specification.
Remember the difference between semantic properties and representational "components" of data values. An ITS must only represent those components that are needed to infer the semantic properties. The null-flavor predicates nonNull, isNull, notApplicable, unknown, and other can all be inferred from the nullFlavor property.
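The following Python sketch illustrates how the predicates can be derived from a single represented component. It is a simplified illustration, not an ITS: the class and attribute names are hypothetical, `None` models a proper value, and any hierarchy among null flavors is ignored (each predicate tests for an exact code).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DataValue:
    # None models a proper value; otherwise a null flavor code such as "NA".
    null_flavor: Optional[str] = None

    @property
    def is_null(self) -> bool:
        return self.null_flavor is not None

    @property
    def non_null(self) -> bool:
        return not self.is_null

    @property
    def not_applicable(self) -> bool:
        return self.null_flavor == "NA"

    @property
    def unknown(self) -> bool:
        return self.null_flavor == "UNK"

    @property
    def other(self) -> bool:
        return self.null_flavor == "OTH"
```

Only `null_flavor` is stored; the five predicates are computed on demand.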
Definition: A predicate indicating that this exceptional value is of nullFlavor not-applicable (NA), i.e., that a proper value is not meaningful in the given context.
|
Definition: A predicate indicating that this exceptional value is of nullFlavor unknown (UNK).
|
Definition: A predicate indicating that this exceptional value is of nullFlavor other (OTH), i.e., that the required value domain does not contain the appropriate value.
|
Definition: Equality is a reflexive, symmetric, and transitive relation between any two data values. Only proper values can be equal; null values are never equal (even if they have the same null flavor).
|
How equality is determined must be defined for each data type. If nothing else is specified, two data values are equal if they are indistinguishable, that is, if they differ in none of their semantic properties. A data type can "override" this general definition of equality, by specifying its own equal relationship. This overriding of the equality relation can be used to exclude semantic properties from the equality test. If a data type excludes semantic properties from its definition of equality, this implies that certain properties (or aspects of properties) that are not part of the equality test are not essential to the meaning of the value.
For example the physical quantity has the two semantic properties (1) a real number and (2) a coded unit of measure. The equality test, however, must account for the fact that, e.g., 1 meter equals 100 centimeters; independent equality of the two semantic properties is too strong a criterion for the equality test. Therefore, physical quantity must override the equality definition.
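A minimal Python sketch of such an overridden equality test follows. It is not the definition of the physical quantity type: the class name `PQ`, the method name `equal`, and the small conversion table are assumptions made for illustration, and only length units convertible to the metre are covered.

```python
from dataclasses import dataclass
from fractions import Fraction

# Hypothetical conversion factors to a common base unit (metre).
TO_METRE = {
    "m": Fraction(1),
    "cm": Fraction(1, 100),
    "mm": Fraction(1, 1000),
}


@dataclass(frozen=True)
class PQ:
    value: Fraction
    unit: str

    def equal(self, other: "PQ") -> bool:
        # Compare the quantities after conversion to the base unit,
        # rather than comparing value and unit independently.
        mine = self.value * TO_METRE[self.unit]
        theirs = other.value * TO_METRE[other.unit]
        return mine == theirs
```

With these assumptions, `PQ(Fraction(1), "m").equal(PQ(Fraction(100), "cm"))` holds even though neither the numbers nor the units match.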
Definition: A meta-type declared in order to allow the formal definitions to speak about the data type of a value. Any data type defined in this specification is a value of the type DataType.
|
Definition: A CS specifying the alias of the data type.
|
Definition: A data type implies another data type if it has the same type or is a specialisation of it.
Definition: BL stands for the values of two-valued logic. A BL value can be either true or false or, like any other value, may be NULL.
|
With any data value potentially being NULL, the two-valued logic is effectively extended to a three-valued logic as shown in the following truth tables:
Where a Boolean operation is performed upon two data values with different nullFlavors, the nullFlavor of the result is the first common ancestor of the two nullFlavors, though conformant applications may also create a result that is any common ancestor.
Definition: Conjunction (AND) is associative and commutative, with true as a neutral element. False AND any Boolean value is false. These rules hold even if one or both of the operands are NULL. If both operands for AND are NULL, the result is NULL.
|
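The rules for AND stated in the definition can be sketched in a few lines of Python. This is an illustration only: the function name `and3` is hypothetical, `None` models a NULL value, and null flavors are not tracked.

```python
from typing import Optional


def and3(a: Optional[bool], b: Optional[bool]) -> Optional[bool]:
    """Three-valued conjunction; None models NULL."""
    # False AND any Boolean value is false, even if the other operand is NULL.
    if a is False or b is False:
        return False
    # Otherwise a NULL operand makes the result NULL.
    if a is None or b is None:
        return None
    # Both operands are true.
    return True
```

Note that `and3(True, None)` yields NULL, consistent with true being the neutral element, while `and3(False, None)` yields false.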
Definition: A rule of the form IF condition THEN conclusion. Logically the implication is defined as the disjunction of the negated condition and the conclusion, meaning that when the condition is true the conclusion must be true to make the overall statement true. The logical implication is important to make invariant statements.
|
The implication is not reversible and does not specify what is true when the condition is false (ex falso quodlibet, Latin for "from false follows anything").
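The definition of implication as the disjunction of the negated condition and the conclusion can be sketched as follows. The function names are hypothetical, `None` models NULL, and the NULL handling of negation and disjunction is an assumption of this sketch that mirrors the rules given for conjunction (true OR anything is true; otherwise a NULL operand yields NULL).

```python
from typing import Optional


def not3(a: Optional[bool]) -> Optional[bool]:
    # Assumption: the negation of NULL is NULL.
    return None if a is None else (not a)


def or3(a: Optional[bool], b: Optional[bool]) -> Optional[bool]:
    # Assumption: true OR anything is true, even if the other operand is NULL.
    if a is True or b is True:
        return True
    if a is None or b is None:
        return None
    return False


def implies3(condition: Optional[bool], conclusion: Optional[bool]) -> Optional[bool]:
    # IF condition THEN conclusion  ==  (NOT condition) OR conclusion
    return or3(not3(condition), conclusion)
```

A false condition makes the implication true whatever the conclusion is, which is the "ex falso quodlibet" behavior described above.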
Definition: BIN is a raw block of bits. BIN is a protected type that should not be declared outside the data type specification.
A bit is semantically identical with a non-null BL value. Thus, all binary data is — semantically — a sequence of non-null BL values.
protected type BinaryData alias BIN specializes LIST<BN>;
NOTE: The representation of arbitrary binary data is the responsibility of an ITS. How the ITS accomplishes this depends on the underlying Implementation Technology (whether it is character-based or binary) and on the represented data. Semantically, character data is represented as binary data; however, a character-based ITS should not convert character data into arbitrary binary data and then represent binary data in a character encoding. Ultimately even character-based implementation technology will communicate binary data.
An empty sequence is not considered binary data but counts as a NULL-value. In other words, non-NULL binary data contains at least one bit. No bit in a non-NULL binary data value can be NULL.
|
Definition: Data that is primarily intended for human interpretation or for further machine processing outside the scope of HL7. This includes unformatted or formatted written language, multimedia data, or structured information as defined by a different standard (e.g., XML-signatures). Instead of the data itself, an ED may contain only a reference (see TEL). Note that ST is a specialization of the ED where the mediaType is fixed to text/plain.
| Name | Type | Description |
|---|---|---|
| mediaType | CS | Identifies the type of the encapsulated data and identifies a method to interpret or render the data. |
| charset | CS | For character-based encoding types, this property specifies the character set and character encoding used. The charset shall be identified by an Internet Assigned Numbers Authority (IANA) Charset Registration [http://www.iana.org/assignments/character-sets] in accordance with RFC 2978 [http://www.ietf.org/rfc/rfc2978.txt]. |
| language | CS | For character based information the language property specifies the human language of the text. |
| compression | CS | Indicates whether the raw byte data is compressed, and what compression algorithm was used. |
| reference | TEL | A telecommunication address (TEL), such as a URL for HTTP or FTP, which will resolve to precisely the same binary data that could as well have been provided as inline data. |
| integrityCheck | BIN | The integrity check is a short binary value representing a cryptographically strong checksum that is calculated over the binary data. The purpose of this property, when communicated with a reference is for anyone to validate later whether the reference still resolved to the same data that the reference resolved to when the encapsulated data value with reference was created. |
| integrityCheckAlgorithm | CS | Specifies the algorithm used to compute the integrityCheck value. The cryptographically strong checksum algorithm Secure Hash Algorithm-1 (SHA-1) is currently the industry standard. It has superseded the MD5 algorithm only a couple of years ago, when certain flaws in the security of MD5 were discovered. Currently the SHA-1 hash algorithm is the default choice for the integrity check algorithm. Note that SHA-256 is also entering widespread usage. |
| thumbnail | ED | An abbreviated rendition of the full data. A thumbnail requires significantly fewer resources than the full data, while still maintaining some distinctive similarity with the full data. A thumbnail is typically used with by-reference encapsulated data. It allows a user to select data more efficiently before actually downloading through the reference. |
Encapsulated data can be present in two forms, inline or by reference. Inline data is communicated or moved as part of the encapsulated data value, whereas by-reference data may reside at a different (remote) location. The data is the same whether it is located inline or remote.
Definition: Identifies the type of the encapsulated data and identifies a method to interpret or render the data.
mediaType is a mandatory property, i.e., every non-NULL instance of ED must have a non-NULL mediaType property.
|
The IANA defined domain of media types is established by the Internet standard RFC 2045 [http://www.ietf.org/rfc/rfc2045.txt] and 2046 [http://www.ietf.org/rfc/rfc2046.txt]. RFC 2046 defines the media type to consist of two parts:
However, this specification treats the entire media type as one atomic code symbol in the form defined by IANA, i.e., top level type followed by a slash "/" followed by media subtype. Currently defined media types are registered in a database [http://www.iana.org/assignments/media-types/index.html] maintained by IANA. Currently more than 160 different MIME media types are defined, with the list growing rapidly. In general, all those types defined by the IANA may be used.
To promote interoperability, this specification prefers certain media types to others. This is to define a greatest common denominator on which interoperability is not only possible, but that is powerful enough to support even advanced multimedia communication needs.
Table 6 below assigns a status to certain MIME media types, where the status means one of the following:
The set of required media types is very small so that no undue requirements are forced on HL7 applications, especially legacy systems. In general, no HL7 application is forced to support any given kind of media other than written text. For example, many systems just do not want to receive audio data, because those systems can only show written text to their users. It is a matter of application conformance statements to say: "I will not handle audio". Only if a system claims to handle audio media must it support the required media type for audio.
Definition: For character-based encoding types, this property specifies the character set and character encoding used. The charset shall be identified by an Internet Assigned Numbers Authority (IANA) Charset Registration [http://www.iana.org/assignments/character-sets] in accordance with RFC 2978 [http://www.ietf.org/rfc/rfc2978.txt].
The charset domain is maintained by the Internet Assigned Numbers Authority (IANA) [http://www.iana.org/assignments/character-sets]. The IANA source specifies names and multiple aliases for most character sets. For HL7's purposes, use of multiple alias names is not allowed. The standard name for HL7 is the one marked by IANA as "preferred for MIME." If IANA has not marked one of the aliases as "preferred for MIME" the main name shall be the one used for HL7.
Table 7 lists a few of the IANA defined character sets that are of interest to current HL7 members.
NOTE: The above list is not complete, let alone exclusive. In particular, international HL7 affiliates may make special recommendations about charsets to be used in their realm. These recommendations may add additional charsets and may reassign the recommendation status of a listed charset.
The charset property needs to be known where the data of the ED is character type data in any form. If the data is provided inline, then the charset must be known. If the data is provided as a reference, and the access method does not provide the charset for the data, typically as a MIME header, then the charset must be conveyed as part of the ED.
Interested readers may also want to consult the "Character Model for the World Wide Web" [http://www.w3.org/TR/charmod] for a more complete discussion of character sets and related issues.
Definition: For character based information the language property specifies the human language of the text.
The need for a language code for text data values is documented in RFC 2277, IETF Policy on Character Sets and Languages [http://www.ietf.org/rfc/rfc2277.txt]. Further background information can be found in Using International Characters in Internet Mail [http://www.imc.org/mail-i18n.html], a memo by the Internet Mail Consortium.
The principles of the code domain of this attribute are specified by the Internet standard RFC 3066 [http://www.ietf.org/rfc/rfc3066.txt]. The RFC 3066 coding scheme is constructed from a primary subtag component encoded using the language codes of ISO 639, plus two codes for extensions for languages not represented in ISO 639. The code optionally includes a second subtag component encoded using the two letter country codes of ISO 3166, or a language code extension registered by the Internet Assigned Numbers Authority [http://www.iana.org/assignments/language-tags].17
While language tags usually alter the meaning of the text, the language does not alter the meaning of the characters in the text.18
NOTE: Representation of language tags to text is highly dependent on the ITS. An ITS may use the native way of language tagging provided by its target implementation technology. Some may have language information in a separate component, e.g., XML has the xml:lang tag for strings. Others may rely on language tags as part of the binary character string representation, e.g., ISO 10646 (Unicode) and its "plane-14" language tags.
The language tag should not be mandatory if it is not mandatory in the implementation technology. Semantically, language tagging of strings follows a default logic. In circumstances where a realm may support multiple languages, it is up to the realm to define rules for handling text for which no language is specified. If no other rule is specified, the local language of the reader is assumed. If a language is set for an entire message or document, that language is the default. If any information element or value that is superior in the syntax hierarchy specifies a language, that language is the default for all subordinate text values.
If language tags are present in the beginning of the encoded binary text (e.g., through Unicode's plane-14 tags) this is the source of the language property of the encapsulated data value.
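The default logic for language tagging can be sketched as a simple lookup through the enclosing elements. This Python fragment is an illustration under stated assumptions: the function and parameter names are hypothetical, `None` means that no language is specified at that level, the list of enclosing elements is ordered from nearest to outermost (ending with the message or document), and realm-specific rules are not modeled.

```python
from typing import Optional, Sequence


def effective_language(
    own: Optional[str],
    enclosing: Sequence[Optional[str]],
    reader_language: str,
) -> str:
    """Resolve the language of a text value by default logic."""
    # A language specified on the value itself always applies.
    if own is not None:
        return own
    # Otherwise the nearest superior element that specifies one is the default.
    for language in enclosing:
        if language is not None:
            return language
    # If nothing else is specified, assume the reader's local language.
    return reader_language
```

For example, `effective_language(None, [None, "de", "en"], "fr")` resolves to "de", the language of the nearest enclosing element that specifies one.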
Definition: Indicates whether the raw byte data is compressed, and what compression algorithm was used.
Values of type ST may never be compressed.
Definition: A telecommunication address (TEL), such as a URL for HTTP or FTP, which will resolve to precisely the same binary data that could as well have been provided as inline data.
The semantic value of an encapsulated data value is the same, regardless of whether the data is present inline or only by reference. However, an encapsulated data value without inline data behaves differently, since any attempt to examine the data requires the data to be downloaded from the reference. An encapsulated data value may have both inline data and a reference.
The reference must point to the same data as provided inline. It is an error if the data resolved through the reference does not match either the integrity check, in-line data, or data that had earlier been retrieved through the reference and then cached.
The reference may contain a usablePeriod to indicate that the data may only be available for a limited period of time. Whether the reference is limited by a usablePeriod or not, the content of the reference is fixed for all time. Any application using the reference must always receive the same data. The reference cannot be reused to send a different version of the same data, or different data.
By-reference encapsulated data may not be allowed depending on the attribute or component that is declared encapsulated data. Values of type ST must always be inline.
Definition: The integrity check is a short binary value representing a cryptographically strong checksum that is calculated over the binary data. The purpose of this property, when communicated with a reference is for anyone to validate later whether the reference still resolved to the same data that the reference resolved to when the encapsulated data value with reference was created.
It is an error if the data resolved through the reference does not match the integrity check.
The integrity check is calculated according to the integrityCheckAlgorithm. By default, the Secure Hash Algorithm-1 (SHA-1) shall be used. The integrity check is binary encoded according to the rules of the integrity check algorithm.
The integrity check is calculated over the raw binary data that is contained in the data component, or that is accessible through the reference. No transformations are made before the integrity check is calculated. If the data is compressed, the Integrity Check is calculated over the compressed data.
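As an illustration of these rules, the following Python sketch computes a SHA-1 integrity check over the raw bytes exactly as they would be communicated. The function names are hypothetical, and the use of `zlib` as the compression algorithm is an assumption made only to show that the check covers the compressed form.

```python
import hashlib
import zlib


def integrity_check(raw: bytes) -> bytes:
    """Return the binary SHA-1 digest of the raw data, with no transformation."""
    return hashlib.sha1(raw).digest()


def reference_still_valid(downloaded: bytes, expected_check: bytes) -> bool:
    """Compare data resolved through a reference against a stored check."""
    return integrity_check(downloaded) == expected_check


original = b"Some encapsulated data"
compressed = zlib.compress(original)

# For compressed data the check is calculated over the compressed bytes,
# not over the original uncompressed content.
stored_check = integrity_check(compressed)
```

A receiver that later resolves the reference would pass the bytes it downloaded, still compressed, to `reference_still_valid`.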
Definition: Specifies the algorithm used to compute the integrityCheck value.19
Definition: An abbreviated rendition of the full data. A thumbnail requires significantly fewer resources than the full data, while still maintaining some distinctive similarity with the full data. A thumbnail is typically used with by-reference encapsulated data. It allows a user to select data more efficiently before actually downloading through the reference.
Originally, the term thumbnail refers to an image in a lower resolution (or smaller size) than another image. However, the thumbnail concept can be metaphorically used for media types other than images. For example, a movie may be represented by a shorter clip; an audio-clip may be represented by another audio-clip that is shorter, has a lower sampling rate, or a lossy compression.
Thumbnails may not be allowed depending on the attribute or component that is declared encapsulated data. Values of type ST never have thumbnails, and a thumbnail may not itself contain a thumbnail.
|
NOTE: ITSs should consider the case where the thumbnail and the original both have the same properties of type, charset and compression. In this case, these properties need not be represented explicitly for the thumbnail but might be "inherited" from the main encapsulated data value to its thumbnail.
Two values of type ED are equal if and only if their mediaType and data are equal. For those ED values with compressed data or referenced data, only the de-referenced and uncompressed data counts for the equality test. The compression, thumbnail and reference properties themselves are excluded from the equality test. In addition, the language property is excluded from the test, due to the problems this would introduce for values of type ED where the language is not specified. If the mediaType is character based and the charset property is not equal, the charset property must be resolved through mapping of the data between the different character sets.
The integrity check algorithm and integrity check are excluded from the equality test. However, since equality of the integrity check value is a strong indication of equality of the data, the equality test can be practically based on the integrity check, given equal integrity check algorithm properties.
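The equality rules for ED can be sketched in Python as follows. This is a simplified illustration, not a conformant implementation: the class and attribute names are hypothetical, `zlib` stands in for whatever compression algorithm is declared, references are assumed to have been resolved already, and charset differences are resolved by decoding the bytes to characters before comparison.

```python
import zlib
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class ED:
    media_type: str
    data: bytes
    charset: Optional[str] = None   # set only for character-based data
    compressed: bool = False        # hypothetical flag; zlib is assumed
    language: Optional[str] = None  # deliberately ignored by equal()

    def content(self) -> Union[str, bytes]:
        # Only the uncompressed data counts for the equality test.
        raw = zlib.decompress(self.data) if self.compressed else self.data
        # Character data is mapped to characters so that different
        # charsets carrying the same text compare as equal.
        return raw.decode(self.charset) if self.charset else raw

    def equal(self, other: "ED") -> bool:
        return (
            self.media_type == other.media_type
            and self.content() == other.content()
        )
```

Under these assumptions, the same text encoded once in UTF-8 and once in ISO-8859-1 compares as equal, as does compressed data and its uncompressed counterpart.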
Definition: The character string data type stands for text data, primarily intended for machine processing (e.g., sorting, querying, indexing, etc.). It is used for names, symbols, and formal expressions.
ST is a restricted ED, whose ED.mediaType property is fixed to text/plain, and whose data must be inlined and not compressed. Thus, the properties compression, reference, integrity check, integrity check algorithm, and thumbnail are not applicable. The character string data type is used when the appearance of text does not bear meaning, which is true for formalized text and all kinds of names.
| Name | Type | Description |
|---|---|---|
| mediaType | CS | Identifies the type of the encapsulated data and identifies a method to interpret or render the data. |
| charset | CS | For character-based encoding types, this property specifies the character set and character encoding used. The charset shall be identified by an Internet Assigned Numbers Authority (IANA) Charset Registration [http://www.iana.org/assignments/character-sets] in accordance with RFC 2978 [http://www.ietf.org/rfc/rfc2978.txt]. |
| language | CS | For character based information the language property specifies the human language of the text. |
The ST data type interprets the encapsulated data as character data (as opposed to bits), depending on the charset property of the encapsulated data type.
|
NOTE: Because many of the properties of the encapsulated data are bound to a default value, an ITS need not represent these properties at all. In fact, if the character encoding is also fixed, the ITS only represents the encoded character data.
The headCharacter and tailString properties define ST as a sequence of entities each of which uniquely identifies one character from the joint set of all characters known by any language of the world.20
The head of an ST is a string of only one character. An ST must have at least one character or else it is NULL. A zero-length ST is an exceptional value (NULL), not a proper value.
|
The length of an ST is the number of characters, not the number of encoded bytes, in the string. Byte encoding is an ITS issue and is not relevant on the application layer.
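The distinction between characters and encoded bytes can be illustrated in Python, whose `str` type counts Unicode code points. The function name `st_length` is hypothetical, `None` models the NULL value of a zero-length string, and treating one code point as one character is a simplifying assumption of this sketch.

```python
from typing import Optional


def st_length(text: str) -> Optional[int]:
    """Length of an ST in characters; None models NULL for an empty string."""
    if text == "":
        # A zero-length ST is an exceptional value, not a proper value.
        return None
    # len() on str counts code points, independent of any byte encoding.
    return len(text)


name = "Müller"
utf8_bytes = len(name.encode("utf-8"))       # byte count in one encoding
utf16_bytes = len(name.encode("utf-16-le"))  # byte count in another encoding
```

The character count of `name` is the same whichever encoding an ITS chooses, whereas the two byte counts differ.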
The following rules apply to whitespace contained within values of type ST:
Requirement: ST is a specialization of ED so that any RIM attribute which has the type ED can be constrained to a ST. The most important case is Act.text, which is an ED to cater for the use of references and multimedia data, but is often constrained to plain text.
|
Fixed to be "text/plain".
|
Values of type ST must have a known charset.
Definition: For character based information the language property specifies the human language of the text.
The need for a language code for text data values is documented in RFC 2277, IETF Policy on Character Sets and Languages [http://www.ietf.org/rfc/rfc2277.txt]. Further background information can be found in Using International Characters in Internet Mail [http://www.imc.org/mail-i18n.html], a memo by the Internet Mail Consortium.
The principles of the code domain of this attribute are specified by the Internet standard RFC 3066 [http://www.ietf.org/rfc/rfc3066.txt]. The RFC 3066 coding scheme is constructed from a primary subtag component encoded using the language codes of ISO 639, plus two codes for extensions for languages not represented in ISO 639. The code optionally includes a second subtag component encoded using the two letter country codes of ISO 3166, or a language code extension registered by the Internet Assigned Numbers Authority [http://www.iana.org/assignments/language-tags].21
While language tags usually alter the meaning of the text, the language does not alter the meaning of the characters in the text.22
NOTE: Representation of language tags to text is highly dependent on the ITS. An ITS may use the native way of language tagging provided by its target implementation technology. Some may have language information in a separate component, e.g., XML has the xml:lang tag for strings. Others may rely on language tags as part of the binary character string representation, e.g., ISO 10646 (Unicode) and its "plane-14" language tags.
The language tag should not be mandatory if it is not mandatory in the implementation technology. Semantically, language tagging of strings follows a default logic. In circumstances where a realm may support multiple languages, it is up to the realm to define rules for handling text for which no language is specified. If no other rule is specified, the local language of the reader is assumed. If a language is set for an entire message or document, that language is the default. If any information element or value that is superior in the syntax hierarchy specifies a language, that language is the default for all subordinate text values.
If language tags are present in the beginning of the encoded binary text (e.g., through Unicode's plane-14 tags) this is the source of the language property of the encapsulated data value.
Values of type ST cannot be compressed.
Values of type ST may not reference content from some other location.
Integrity check code is not used with values of type ST.
Integrity check algorithm is not used with values of type ST.
Values of type ST do not have thumbnails.
Two variations of ST literals are defined, a token form and a quoted string.23 The token form consists only of the lower case and upper case Latin alphabet, the ten decimal digits and the underscore. The quoted string can contain any character between double-quotes. The double quotes prevent a character string from being interpreted as some other literal. The token form allows keywords and names to be parsed from the data type specification language.
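As an illustration of the two literal forms, the following sketch emits the token form when the string consists only of Latin letters, decimal digits and the underscore, and the quoted form otherwise. The function name `st_literal` is hypothetical, and because the text above does not say how a double quote inside a quoted string is written, the sketch refuses such input rather than guessing an escape rule.

```python
import re

# Token form: upper and lower case Latin letters, decimal digits, underscore.
TOKEN = re.compile(r"[A-Za-z0-9_]+")


def st_literal(text: str) -> str:
    """Render a character string as an ST literal (token or quoted form)."""
    if TOKEN.fullmatch(text):
        return text
    if '"' in text:
        # Assumption: escaping is not covered here, so this case is rejected.
        raise ValueError("embedded double quotes are not handled by this sketch")
    return '"' + text + '"'


print(st_literal("left_foot"))
print(st_literal("left foot"))
```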
NOTE: Since ST literals are so fundamental to implementation technology, most ITS will specify some modified character string literal form. However, ITS designers must be aware of the interaction between the ST literal form and the literal forms defined for other data types. This is particularly critical if the other data type's literal form is structured with major components separated by break-characters (e.g., real number, physical quantity, set, and list literals, etc.)

Definition: A CD represents any kind of concept usually by giving a code defined in a code system. A CD can contain the original text or phrase that served as the basis of the coding and one or more translations into different coding systems. A CD can also contain qualifiers to describe, e.g., the concept of a "left foot" as a postcoordinated term built from the primary code "FOOT" and the qualifier "LEFT". In cases of an exceptional value, the CD need not contain a code but only the original text describing that concept.
CD is mostly used in one of its restricted or "profiled" forms, CS, CE, CV. Use of the full concept descriptor data type is not common. It requires a conscious decision and documented rationale. In all other cases, one of the CD restrictions shall be used.24
All CD restrictions constrain certain properties. Properties may be constrained to the extent that only one value may be allowed for that property, in which case mentioning the property becomes redundant. Constraining a property to one value is referred to as suppressing that property. Although a suppressed property is conceptually still semantically applicable, it is safe for an HL7 interface to assume the implicit default value without testing.
NOTE: In general, this is true of many types in this data types specification; however, it is a frequently asked question concerning the CD descendants.
Definition: The plain code symbol defined by the code system. For example, "784.0" is the code symbol of the ICD-9 code "784.0" for headache.
A non-exceptional CD value has a non-NULL code property whose value is a character string that is a symbol defined by the coding system identified by codeSystem. Conversely, a CD value without a value for the code property, or with a value that is not from the cited coding system is an exceptional value (NULL of flavor other).
Definition: Specifies the code system that defines the code.
Code systems shall be referred to by a UID, which allows unambiguous reference to standard HL7 codes, other standard code systems, as well as local codes. HL7 shall assign a UID to each of its code tables as well as to external standard coding systems that are being used with HL7. Local sites must use their ISO Object Identifier (OID) to construct a globally unique local coding system identifier.
Under HL7's branch, 2.16.840.1.113883, the sub-branches 5 and 6 contain HL7 standard and external code system identifiers respectively. The HL7 Vocabulary Technical Committee maintains these two branches.
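A receiver that wants to tell HL7 standard code systems from external ones can inspect the arc that follows HL7's root. The sketch below is only an illustration: the function name and the returned labels are hypothetical, and the comparison is done arc by arc so that an OID that merely shares a textual prefix with the HL7 root is not misclassified.

```python
HL7_ROOT = (2, 16, 840, 1, 113883)


def classify_code_system_oid(oid: str) -> str:
    """Classify a code system OID relative to HL7's branch."""
    arcs = tuple(int(arc) for arc in oid.split("."))
    if arcs[:len(HL7_ROOT)] != HL7_ROOT or len(arcs) == len(HL7_ROOT):
        return "not under an HL7 sub-branch"
    sub_branch = arcs[len(HL7_ROOT)]
    if sub_branch == 5:
        return "HL7 standard code system"
    if sub_branch == 6:
        return "external code system"
    return "other HL7 sub-branch"


print(classify_code_system_oid("2.16.840.1.113883.5.1"))
print(classify_code_system_oid("2.16.840.1.113883.6.1"))
```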
A non-exceptional CD value (i.e., a CD value that has a non-NULL code property) has a non-NULL codeSystem specifying the system of concepts that defines the code. In other words, whenever there is a code there is also a code system.
NOTE: Although every non-NULL CD value has a defined code system, in some circumstances the ITS representation for the CD value need not explicitly mention the code system. For example, when the context mandates one and only one code system to be used, specifying the code system explicitly would be redundant. However, in that case the codeSystem takes on that context-specific default value and is not NULL.
An exceptional CD of NULL-flavor other indicates that a concept could not be coded in the coding system specified. Thus, for these coding exceptions, the code system that did not contain the appropriate concept must be provided in codeSystem.
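The distinction between exceptional and non-exceptional values can be sketched as a membership test against the cited code system. Everything named here is an assumption made for illustration: `REGISTRY` stands in for whatever terminology service a real system would consult, the OID key is a placeholder rather than a real assignment, and the single symbol listed is the one used as an example in this section.

```python
from typing import Dict, Optional, Set

# Hypothetical lookup table: code system UID -> symbols defined by that system.
REGISTRY: Dict[str, Set[str]] = {
    "1.2.3.4.5": {"784.0"},  # placeholder UID, not a real OID assignment
}


def is_exceptional(code: Optional[str], code_system: str) -> bool:
    """True if the value is a coding exception (NULL of flavor other)."""
    if code is None:
        return True
    known = REGISTRY.get(code_system)
    if known is None:
        raise LookupError("code system not available in this sketch: " + code_system)
    return code not in known


print(is_exceptional("784.0", "1.2.3.4.5"))
print(is_exceptional(None, "1.2.3.4.5"))
```

Note that the code system is passed even when the code is absent, mirroring the rule that the code system which did not contain the concept must still be provided.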
Some code domains are qualified such that they include the portion of any pertinent local coding system that does not simply paraphrase the standard coding system (coded with extensibility, CWE.) If a CWE qualified field actually contains such a local code, the coding system must specify the local coding system from which the local code was taken. However, for CWE domains the local code is a valid member of the domain, so that local codes in CWE domains constitute neither an error nor an exceptional (NULL/other) value in the sense of this specification.
Definition: The common name of the coding system.
The code system name has no computational value. The purpose of a code system name is to assist an unaided human interpreter of a code value to interpret codeSystem. It is suggested — though not absolutely required — that ITS provide for codeSystemName in order to annotate the UID for human comprehension.
HL7 systems must not functionally rely on codeSystemName. codeSystemName can never modify the meaning of codeSystem and cannot exist without codeSystem.
Definition: If applicable, a version descriptor defined specifically for the given code system.
HL7 shall specify how these version strings are formed for each external code system. If HL7 has not specified how version strings are formed for a particular coding system, version designations have no defined meaning for such coding system.
Different versions of one code system must be compatible. Whenever a code system changes in an incompatible way, it will constitute a new code system, not simply a different version, regardless of what the vocabulary publisher calls it.
For example, the publisher of ICD-9 and ICD-10 calls these code systems "revision 9" and "revision 10", respectively. However, ICD-10 is a complete redesign of the ICD code, not a backward compatible version. Therefore, for the purpose of this data type specification, ICD-9 and ICD-10 are different code systems, not just different versions. By contrast, when LOINC updates from revision "1.0j" to "1.0k", HL7 would consider this to be just another version of LOINC, since LOINC revisions are backwards compatible.
Definition: A name or title for the code, under which the sending system shows the code value to its users.
displayName is included both as a courtesy to an unaided human interpreter of a code value and as a documentation of the name used to display the concept to the user. The display name has no functional meaning; it can never exist without a code; and it can never modify the meaning of code.
NOTE: HL7 offers a "print name" in its predefined vocabulary domains. These values are suitable for use in the displayName.
NOTE: Display names may not alter the meaning of the code value. Therefore, display names should not be presented to the user on a receiving application system without ascertaining that the display name adequately represents the concept referred to by the code value. Communication must not simply rely on the display name. The display name's main purpose is to support debugging of HL7 protocol data units (e.g., messages).
Definition: The text or phrase used as the basis for the coding.
The original text exists in a scenario where an originator of the information does not assign a code, but where the code is assigned later by a coder (post-coding.) In the production of a concept descriptor, original text may thus exist without a code.
NOTE: Although post-coding is often performed from free text information, such as documents, scanned images or dictation, multi-media data is explicitly not permitted as original text. Also, the original text property is not meant to be a link into the entire source document. The link between different artifacts of medical information (e.g., document and coded result) is outside the scope of this specification and is maintained elsewhere in the HL7 standards. The original text is an excerpt of the relevant information in the original sources, rather than a pointer or exact reproduction. Thus the original text is to be represented in plain text form.
Values of type CD may have a non-NULL original text property despite having a NULL code. Any CD value with code of NULL signifies a coding exception. In this case, originalText is a name or description of the concept that was not coded. Such exceptional CD values may also contain translations. Such translations directly encode the concept described in originalText.
A CD can be demoted into an ST value representing only the originalText of the CD value.
Definition: A set of other concept descriptors that translate this concept descriptor into other code systems.
translation is a set of other CDs that each translate the first CD into different code systems. Each element of the translation set was translated from the first CD. Each translation may, however, also contain translations. Thus, when a code is translated multiple times the information about which code served as the input to which translation will be preserved.
NOTE: The translations are quasi-synonyms of one real-world concept. Every translation in the set is supposed to express the same meaning "in other words." However, exact synonymy rarely exists between two structurally different coding systems. For this reason, not all of the translations will be equally exact.
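Because each translation may itself carry translations, a receiver looking for a code in a particular code system has to search the nested structure. The sketch below is one way to do that under stated assumptions: the `CD` class is a reduced stand-in with only the properties needed here, the function names are hypothetical, and the codes and UIDs in the usage lines are placeholders.

```python
from dataclasses import dataclass
from typing import Iterator, Optional, Tuple


@dataclass(frozen=True)
class CD:
    """Reduced concept descriptor: code, code system, nested translations."""
    code: Optional[str]
    code_system: Optional[str]
    translations: Tuple["CD", ...] = ()


def walk(cd: CD, source: Optional[CD] = None) -> Iterator[Tuple[CD, Optional[CD]]]:
    """Yield (value, value it was translated from), depth first."""
    yield cd, source
    for translation in cd.translations:
        yield from walk(translation, cd)


def find_in_system(cd: CD, code_system: str) -> Optional[CD]:
    """Return the first coded value from the requested code system, if any."""
    for candidate, _source in walk(cd):
        if candidate.code is not None and candidate.code_system == code_system:
            return candidate
    return None


second = CD("B2", "1.2.3.2")
first = CD("A1", "1.2.3.1", translations=(second,))
original = CD("X0", "1.2.3.0", translations=(first,))
print(find_in_system(original, "1.2.3.2"))
```

The `walk` generator also reports which value served as the input to each translation, which is the provenance information the nesting is meant to preserve.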
Definition: Specifies additional codes that increase the specificity of the primary code.
The primary code and all the qualifiers together make up one concept. A CD with qualifiers is also called a code phrase or postcoordinated expression.
Qualifiers constrain the meaning of the primary code, but cannot negate it or change its meaning to that of another value in the primary coding system.
Qualifiers can only be used according to well-defined rules of post-coordination. A value of type CD may only have qualifiers if its code system defines the use of such qualifiers or if there is a third code system that specifies how other code systems may be combined.
For example, SNOMED CT allows constructing concepts as a combination of multiple codes. SNOMED CT defines a concept "cellulitis (disorder)" (128045006), an attribute "finding site" (363698007), and another concept "foot structure (body structure)" (56459004). SNOMED CT allows one to combine these codes in a code phrase:
<observation>
  ...
  <value code="128045006" codeSystem="&SNOMED-CT;" displayName="cellulitis (disorder)">
    <qualifier code="56459004" displayName="foot structure">
      <name code="363698007" displayName="finding site"/>
    </qualifier>
  </value>
  ...
</observation>
In this example, there is one code system, SNOMED CT, that defines the primary code and all the qualifiers and how these are used, which is why in our example representation the codeSystem does not need to be mentioned for the qualifier name and value (the codeSystem is inherited from the primary code).
It is important to note that the allowable qualifiers are specified by the code system. For instance, in SNOMED CT, there is a defined set of qualifying attributes, and only Findings and Disorders can be qualified with the "finding site" attribute. Use of qualifiers outside the boundaries specified by the code system is a non-conformant use of the CD data type. Adherence to the rules specified by the code system enables post-coordinated expressions to be compared with pre-coordinated concepts (such as where one might compare the above code phrase to the pre-coordinated concept "cellulitis of foot (disorder)" (128276007), which is defined within SNOMED CT as having a finding site of foot structure). CD does not provide for normalization of compositional expressions, therefore it is possible to create ambiguous expressions. Users should understand that they must provide the additional constraints necessary to assure unambiguous data representation, if they are planning to create compositional expressions using CD. Otherwise, they risk the inability to retrieve a complete set of all records corresponding to any given query.
Another common example is the U.S. Centers for Medicare and Medicaid Services (CMS) (previously known as the Health Care Financing Administration, HCFA) procedure codes. CMS procedure codes (HCPCS) are based on CPT-4 and add additional qualifiers to it. For example, the patient with the above finding (plus peripheral arterial disease, diabetes mellitus, and a chronic skin lesion at the left great toe) may have an amputation of that toe. The CPT-4 concept is "Amputation, toe metatarsophalangeal joint" (28820) and a HCPCS qualifier needs to be added to indicate "left foot, great toe" (TA). Thus we code:
<procedure>
  ...
  <cd code="28820" codeSystem="&CP4;" displayName="Amputation, toe metatarsophalangeal joint">
    <qualifier code="TA" codeSystem="&HCP;" displayName="left foot, great toe"/>
  </cd>
  ...
</procedure>
In this example, the code system of the qualifier (HCPCS) is different from the code system of the primary code (CPT-4). It is only because there are well-defined rules that define how these codes can be combined that the qualifier may be used. Note also that the role name is optional, and for HCPCS codes there are no distinguished role names.
The order of qualifiers is preserved, particularly for the case where the coding system allows post-coordination but defines no role names (e.g., some ICD-9-CM codes, or the old SNOMED "multiaxial" coding).
The main use of concept descriptors is for the purpose of indexing, querying and decision-making based on a coded value. A semantically unambiguous specification of coded values therefore requires a clear definition of what equality of concept descriptor values means and how CD values should be compared. (For more details on comparing pre- and post-coordinated expressions, see Dolin RH, Spackman KA, Markwell D. Selective Retrieval of Pre- and Post-coordinated SNOMED Concepts. Fall AMIA 2002; 210-14, or the July 2003 SNOMED CT Implementation Guide.)
The equality of two CD values is determined solely based upon code and codeSystem. codeSystemVersion is excluded from the equality test.25 If qualifiers are present, the qualifiers are included in the equality test. Translations are not included in the equality test.26 Exceptional CD values are not equal even if they have the same NULL-flavor or the same original text.27
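The equality rule can be sketched as follows. The `CD` and `Qualifier` classes are simplified stand-ins and the function names are hypothetical. Two points are assumptions of this sketch rather than statements of the specification: qualifiers are compared pairwise in their stored order, and only a NULL code is treated as exceptional (a code that is not in the cited code system would need a terminology lookup to detect). The version strings in the usage lines are placeholders.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Qualifier:
    value: "CD"
    name: Optional["CD"] = None   # role name; may be NULL
    inverted: bool = False


@dataclass(frozen=True)
class CD:
    code: Optional[str] = None
    code_system: Optional[str] = None
    code_system_version: Optional[str] = None
    original_text: Optional[str] = None
    qualifiers: Tuple[Qualifier, ...] = ()
    translations: Tuple["CD", ...] = ()


def qualifier_equal(a: Qualifier, b: Qualifier) -> bool:
    if a.inverted != b.inverted or (a.name is None) != (b.name is None):
        return False
    if a.name is not None and b.name is not None and not cd_equal(a.name, b.name):
        return False
    return cd_equal(a.value, b.value)


def cd_equal(a: CD, b: CD) -> bool:
    """Equality on code, codeSystem and qualifiers only."""
    if a.code is None or b.code is None:
        return False  # exceptional values are never equal
    if (a.code, a.code_system) != (b.code, b.code_system):
        return False
    if len(a.qualifiers) != len(b.qualifiers):
        return False
    # Assumption: qualifiers are compared in order.
    return all(qualifier_equal(x, y) for x, y in zip(a.qualifiers, b.qualifiers))


SCT = "2.16.840.1.113883.6.96"
older = CD("128045006", SCT, code_system_version="v1")
newer = CD("128045006", SCT, code_system_version="v2")
print(cd_equal(older, newer))
```

The generated dataclass `==` is deliberately not used for this purpose, because it would also compare version, original text and translations.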
Some code systems define certain style options for their code values. For example, the U.S. National Drug Code (NDC) has a dash and a non-dash form: the dash form may be 1234-5678-90 where the non-dash form is 01234567890. Another example of this problem is when certain ISO or ANSI code tables define optional alphanumeric and numeric forms of two or three character lengths all in one standard.
In the case where code systems provide for multiple representations, HL7 shall make a ruling about which is the preferred form. HL7 shall document that ruling where that respective external coding system is recognized. HL7 shall decide upon the preferred form based on criteria of practicality and common use. In absence of clear criteria of practicality and common use, the safest, most extensible, and least stylized (the least decorated) form shall be given preference.28
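To make the NDC example concrete, the following sketch converts a dash form into an 11-digit non-dash form by zero-padding the three segments to lengths 5, 4 and 2. That padding convention is an assumption of this sketch, as is the function name; it reproduces the example pair given above but says nothing about which form HL7 rules to be preferred.

```python
def ndc_non_dash(dashed: str) -> str:
    """Convert a dash-form NDC into an 11-digit non-dash form (assumed 5-4-2 padding)."""
    # Unpacking raises ValueError unless there are exactly three segments.
    labeler, product, package = dashed.split("-")
    return labeler.zfill(5) + product.zfill(4) + package.zfill(2)


print(ndc_non_dash("1234-5678-90"))
```

Comparing codes only after bringing them into one agreed form avoids treating two spellings of the same code as different concepts.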
Definition: Specifies whether this CD is a specialization of the operand CD.
Naturally, concepts can be narrowed and widened to include or exclude other concepts. Many coding systems have an explicit notion of concept specialization and generalization. The HL7 vocabulary principles also provide for concept specialization for HL7 defined value sets. implies is a predicate that compares whether one concept is a specialization of another concept, and therefore implies that other concept.
When writing predicates (e.g., conditional statements) that compare two codes, one should usually test for implication not equality of codes.
For example, in Table 20 the "telecommunication use" concepts: work (W), home (H), primary home (HP), and vacation home (HV) are defined, where both HP and HV imply H. When selecting any home phone number, one should test whether the given use-code c implies H. Testing for c equal H would only find unspecified home phone numbers, but not the primary home phone number.
Operationally, implication can be evaluated in one of two ways. The code system literals may be designed such that one single hierarchy is reflected in the code literal itself (e.g., ICD-9.) Apart from such special cases, however, a terminological knowledge base and an appropriate subsumption algorithm will be required to evaluate implication statements. For post-coordinated coding systems, designing such a subsumption algorithm is a non-trivial task.29
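For a small, single-hierarchy value set such as the telecommunication use codes mentioned above, implication can be evaluated with a simple parent table. The sketch below is a minimal illustration, not a general subsumption algorithm: the `PARENT` table and the function name are hypothetical, and only the four codes named in the text are included.

```python
from typing import Dict, List, Optional, Tuple

# Parent table: each code maps to the concept it directly specializes.
PARENT: Dict[str, Optional[str]] = {"H": None, "HP": "H", "HV": "H", "W": None}


def implies(code: str, other: str) -> bool:
    """True if `code` equals `other` or is a specialization of it."""
    current: Optional[str] = code
    while current is not None:
        if current == other:
            return True
        current = PARENT.get(current)
    return False


# Hypothetical phone list as (use code, number) pairs.
phones: List[Tuple[str, str]] = [("W", "555-0100"), ("HP", "555-0101"), ("H", "555-0102")]
home_numbers = [number for use, number in phones if implies(use, "H")]
print(home_numbers)
```

Testing `use == "H"` instead would miss the primary home number, which is the pitfall described above.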
Definition: A concept qualifier code with optionally named role. Both qualifier role and value codes must be defined by the coding system of the CD containing the concept qualifier. For example, if SNOMED RT defines a concept "leg", a role relation "has-laterality", and another concept "left", the concept role relation allows adding the qualifier "has-laterality: left" to a primary code "leg" to construct the meaning "left leg".
The use of qualifiers is strictly governed by the code system used. CD does not permit using code qualifiers with code systems that do not provide for qualifiers (e.g., pre-coordinated systems such as LOINC or ICD-10 PCS).
Definition: Specifies the manner in which the concept role value contributes to the meaning of a code phrase. For example, if SNOMED RT defines a concept "leg", a role relation "has-laterality", and another concept "left", the concept role relation allows adding the qualifier "has-laterality: left" to a primary code "leg" to construct the meaning "left leg". In this example, name is "has-laterality".
If the parent CD.codeSystem allows postcoordination but no role names (e.g. SNOMED) then name can be NULL.
Definition: The concept that modifies the primary code of a code phrase through the role relation. For example, if SNOMED RT defines a concept "leg", a role relation "has-laterality", and another concept "left", the concept role relation allows adding the qualifier "has-laterality: left" to a primary code "leg" to construct the meaning "left leg". In this example, value is "left".
value is of type CD and thus can in turn have qualifiers. This allows qualifiers to nest. Qualifiers can only be used as far as the underlying code system defines them. It is not allowed to use any kind of qualifiers for code systems that do not explicitly allow and regulate such use of qualifiers.
Definition: Indicates if the sense of name is inverted. This can be used in cases where the underlying code system defines inversion but does not provide reciprocal pairs of role names. By default, inverted is false.
For example, a code system may define the role relation "causes" and the concepts "Streptococcus pneumoniae" and "Pneumonia". If that code system allows its roles to be inverted, one can construct the post-coordinated concept "Pneumococcus pneumonia" through "Pneumonia - causes, inverted - Streptococcus pneumoniae."
Roles may only be inverted if the underlying coding system allows such inversion. Notably, if a coding system defines roles in inverse pairs or intentionally does not define certain inversions, the appropriate role code (e.g. "caused-by") must be used rather than inversion. It must be known whether the inverted property is true or false, since if it is NULL, the role cannot be interpreted.
NOTE: inverted should be conveyed in an indicator attribute, whose default value is false. That way the inverted indicator does not have to be sent when the role is not inverted.
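The effect of inverted on how a role is read can be shown with a small sketch. The function name and the plain-text rendering are hypothetical; the only rules taken from the text are that the default is false, that inversion swaps the direction of the role, and that a NULL value makes the role uninterpretable.

```python
from typing import Optional


def describe_role(primary: str, role: str, value: str,
                  inverted: Optional[bool] = False) -> str:
    """Spell out a qualifier as a subject-role-object statement."""
    if inverted is None:
        raise ValueError("inverted is NULL; the role cannot be interpreted")
    if inverted:
        return value + " " + role + " " + primary
    return primary + " " + role + " " + value


print(describe_role("Pneumonia", "causes", "Streptococcus pneumoniae", inverted=True))
```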
Definition: Coded data in its simplest form, where only the code is not predetermined. The code system and code system version are fixed by the context in which the CS value occurs. CS is used for coded attributes that have a single HL7-defined value set.
CS can only be used in either of the following cases:
For example, since ED subscribes to the MIME design, it trusts IETF to manage the media type. This includes that this specification subscribes to the extension mechanism built into the MIME media type code (e.g., "application/x-myapp").
For CS values, the designation of the domain qualifier will always be CNE (coded, non-extensible) and the context will determine which HL7 values to use.30
Every non-NULL CS value has a defined codeSystem. The ITS representation of CS need not explicitly mention the code system, because the context mandates one and only one code system to be used. Specifying the code system explicitly would be redundant. However, codeSystem assumes the context-specific default value and is not NULL.
An exceptional CS of NULL-flavor other indicates that a concept could not be coded in the coding system specified. In these cases, code must be NULL.
The string literal form of CS is primarily defined for the purposes of this specification. The literal form is a string representation of the code for the codeSystem for the context of the CS. You cannot determine codeSystem or codeSystemVersion from the literal itself, so the literal only has use where the context is known.
Definition: Coded data, specifying only a code, code system, and optionally display name and original text. Used only as the type of properties of other data types.
CV is used when any reasonable use case will require only a single code value to be sent. Thus, it should not be used in circumstances where multiple alternative codes for a given value are desired. This type may be used with both the CNE (coded, non-extensible) and the CWE (coded, with extensibility) domain qualifiers.
Definition: The plain code symbol defined by the code system. For example, "784.0" is the code symbol of the ICD-9 code "784.0" for headache.
A non-exceptional value has a non-NULL code property whose value is a character string that is a symbol defined by the coding system identified by codeSystem. Conversely, a value without a value for the code property, or with a value that is not from the cited coding system is an exceptional value (NULL of flavor other).
Definition: Specifies the code system that defines the code.
Code systems shall be referred to by a UID, which allows unambiguous reference to standard HL7 codes, other standard code systems, as well as local codes. HL7 shall assign a UID to each of its code tables as well as to external standard coding systems that are being used with HL7. Local sites must use their ISO Object Identifier (OID) to construct a globally unique local coding system identifier.
Under HL7's branch, 2.16.840.1.113883, the sub-branches 5 and 6 contain HL7 standard and external code system identifiers respectively. The HL7 Vocabulary Technical Committee maintains these two branches.
A non-exceptional value (i.e., a value that has a non-NULL code property) has a non-NULL codeSystem specifying the system of concepts that defines the code. In other words, whenever there is a code there is also a code system.
NOTE: Although every non-NULL value has a defined code system, in some circumstances the ITS representation for the value need not explicitly mention the code system. For example, when the context mandates one and only one code system to be used, specifying the code system explicitly would be redundant. However, in that case the codeSystem takes on that context-specific default value and is not NULL.
An exceptional CV of NULL-flavor other indicates that a concept could not be coded in the coding system specified. Thus, for these coding exceptions, the code system that did not contain the appropriate concept must be provided in codeSystem.
Some code domains are qualified such that they include the portion of any pertinent local coding system that does not simply paraphrase the standard coding system (coded with extensibility, CWE.) If a CWE qualified field actually contains such a local code, the coding system must specify the local coding system from which the local code was taken. However, for CWE domains the local code is a valid member of the domain, so that local codes in CWE domains constitute neither an error nor an exceptional (NULL/other) value in the sense of this specification.
Definition: The common name of the coding system.
The code system name has no computational value. The purpose of a code system name is to assist an unaided human interpreter of a code value to interpret codeSystem. It is suggested — though not absolutely required — that ITS provide for codeSystemName in order to annotate the UID for human comprehension.
HL7 systems must not functionally rely on codeSystemName. codeSystemName can never modify the meaning of codeSystem and cannot exist without codeSystem.
Definition: If applicable, a version descriptor defined specifically for the given code system.
HL7 shall specify how these version strings are formed for each external code system. If HL7 has not specified how version strings are formed for a particular coding system, version designations have no defined meaning for such coding system.
Different versions of one code system must be compatible. Whenever a code system changes in an incompatible way, it will constitute a new code system, not simply a different version, regardless of what the vocabulary publisher calls it.
For example, the publisher of ICD-9 and ICD-10 calls these code systems "revision 9" and "revision 10", respectively. However, ICD-10 is a complete redesign of the ICD code, not a backward compatible version. Therefore, for the purpose of this data type specification, ICD-9 and ICD-10 are different code systems, not just different versions. By contrast, when LOINC updates from revision "1.0j" to "1.0k", HL7 would consider this to be just another version of LOINC, since LOINC revisions are backwards compatible.
Definition: A name or title for the code, under which the sending system shows the code value to its users.
displayName is included both as a courtesy to an unaided human interpreter of a code value and as a documentation of the name used to display the concept to the user. The display name has no functional meaning; it can never exist without a code; and it can never modify the meaning of code.
NOTE: HL7 offers a "print name" in its predefined vocabulary domains. These values are suitable for use in the displayName.
NOTE: Display names may not alter the meaning of the code value. Therefore, display names should not be presented to the user on a receiving application system without ascertaining that the display name adequately represents the concept referred to by the code value. Communication must not simply rely on the display name. The display name's main purpose is to support debugging of HL7 protocol data units (e.g., messages).
Definition: The text or phrase used as the basis for the coding.
The original text exists in a scenario where an originator of the information does not assign a code, but where the code is assigned later by a coder (post-coding.) In the production of a concept descriptor, original text may thus exist without a code.
NOTE: Although post-coding is often performed from free text information, such as documents, scanned images or dictation, multi-media data is explicitly not permitted as original text. Also, the original text property is not meant to be a link into the entire source document. The link between different artifacts of medical information (e.g., document and coded result) is outside the scope of this specification and is maintained elsewhere in the HL7 standards. The original text is an excerpt of the relevant information in the original sources, rather than a pointer or exact reproduction. Thus the original text is to be represented in plain text form.
Values of type CV may have a non-NULL original text property despite having a NULL code. Any value with code of NULL signifies a coding exception. In this case, originalText is a name or description of the concept that was not coded. Such exceptional values may also contain translations. Such translations directly encode the concept described in originalText.
A CV can be demoted into an ST value representing only the originalText of the value.
Definition: Coded data, where the coding system from which the code comes is ordered. CO adds semantics related to ordering so that models that make use of such domains may introduce model elements that involve statements about the order of the terms in a domain.
The relative order of CO values need not be independently obvious in their literal representation. It is expected that an application will look up the ordering of these values from some table.
Definition: The ordering relation is based on lessOrEqual which is taken as primitive in this specification.
All other order relations can be derived from this one. Since lessOrEqual is primitive, this accommodates partial orderings.
Order relationships typically hold only within a single coding system.
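How the other order relations follow from lessOrEqual, and why a partial ordering is accommodated, can be sketched as below. The codes and the `LEQ_PAIRS` table are hypothetical; they stand in for the ordering table an application would look up, and two of the codes are intentionally left incomparable.

```python
from typing import Callable, Set, Tuple

LessOrEqual = Callable[[str, str], bool]

# Hypothetical ordering table for one code system; HIGH_A and HIGH_B are unrelated.
LEQ_PAIRS: Set[Tuple[str, str]] = {
    ("LOW", "MID"), ("LOW", "HIGH_A"), ("LOW", "HIGH_B"),
    ("MID", "HIGH_A"), ("MID", "HIGH_B"),
}


def less_or_equal(a: str, b: str) -> bool:
    """The primitive relation: reflexive, plus the pairs listed in the table."""
    return a == b or (a, b) in LEQ_PAIRS


def less_than(leq: LessOrEqual, a: str, b: str) -> bool:
    return leq(a, b) and not leq(b, a)


def greater_or_equal(leq: LessOrEqual, a: str, b: str) -> bool:
    return leq(b, a)


def greater_than(leq: LessOrEqual, a: str, b: str) -> bool:
    return leq(b, a) and not leq(a, b)


def comparable(leq: LessOrEqual, a: str, b: str) -> bool:
    return leq(a, b) or leq(b, a)


print(less_than(less_or_equal, "LOW", "MID"))
print(comparable(less_or_equal, "HIGH_A", "HIGH_B"))
```

For an incomparable pair, every derived relation is false in both directions, which is what distinguishes a partial ordering from a total one.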
Definition: Coded data that consists of a coded value and, optionally, coded value(s) from other coding systems that identify the same concept. Used when alternative codes may exist.
CE is used when the use case indicates that alternative codes may exist and where it is useful to communicate these. CE provides for a primary code value, plus a set of alternative or equivalent representations.
Definition: The plain code symbol defined by the code system. For example, "784.0" is the code symbol of the ICD-9 code "784.0" for headache.
A non-exceptional value has a non-NULL code property whose value is a character string that is a symbol defined by the coding system identified by codeSystem. Conversely, a value without a value for the code property, or with a value that is not from the cited coding system is an exceptional value (NULL of flavor other).
Definition: Specifies the code system that defines the code.
Code systems shall be referred to by a UID, which allows unambiguous reference to standard HL7 codes, other standard code systems, as well as local codes. HL7 shall assign a UID to each of its code tables as well as to external standard coding systems that are being used with HL7. Local sites must use their ISO Object Identifier (OID) to construct a globally unique local coding system identifier.
Under HL7's branch, 2.16.840.1.113883, the sub-branches 5 and 6 contain HL7 standard and external code system identifiers respectively. The HL7 Vocabulary Technical Committee maintains these two branches.
A non-exceptional value (i.e., a value that has a non-NULL code property) has a non-NULL codeSystem specifying the system of concepts that defines the code. In other words, whenever there is a code there is also a code system.
NOTE: Although every non-NULL value has a defined code system, in some circumstances the ITS representation for the value need not explicitly mention the code system. For example, when the context mandates one and only one code system to be used, specifying the code system explicitly would be redundant. However, in that case the codeSystem takes on that context-specific default value and is not NULL.
An exceptional CE of NULL-flavor other indicates that a concept could not be coded in the coding system specified. Thus, for these coding exceptions, the code system that did not contain the appropriate concept must be provided in codeSystem.
Some code domains are qualified such that they include the portion of any pertinent local coding system that does not simply paraphrase the standard coding system (coded with extensibility, CWE.) If a CWE qualified field actually contains such a local code, the coding system must specify the local coding system from which the local code was taken. However, for CWE domains the local code is a valid member of the domain, so that local codes in CWE domains constitute neither an error nor an exceptional (NULL/other) value in the sense of this specification.
|
Definition: The common name of the coding system.
The code system name has no computational value. The purpose of a code system name is to assist an unaided human interpreter of a code value to interpret codeSystem. It is suggested — though not absolutely required — that ITS provide for codeSystemName in order to annotate the UID for human comprehension.
HL7 systems must not functionally rely on codeSystemName. codeSystemName can never modify the meaning of codeSystem and cannot exist without codeSystem.
Definition: If applicable, a version descriptor defined specifically for the given code system.
HL7 shall specify how these version strings are formed for each external code system. If HL7 has not specified how version strings are formed for a particular coding system, version designations have no defined meaning for such coding system.
Different versions of one code system must be compatible. Whenever a code system changes in an incompatible way, it will constitute a new code system, not simply a different version, regardless of what the vocabulary publisher calls it.
For example, the publisher of ICD-9 and ICD-10 calls these code systems "revision 9" and "revision 10" respectively. However, ICD-10 is a complete redesign of the ICD code, not a backward compatible version. Therefore, for the purpose of this data type specification, ICD-9 and ICD-10 are different code systems, not just different versions. By contrast, when LOINC updates from revision "1.0j" to "1.0k", HL7 would consider this to be just another version of LOINC, since LOINC revisions are backwards compatible.
Definition: A name or title for the code, under which the sending system shows the code value to its users.
displayName is included both as a courtesy to an unaided human interpreter of a code value and as a documentation of the name used to display the concept to the user. The display name has no functional meaning; it can never exist without a code; and it can never modify the meaning of code.
NOTE: HL7 offers a "print name" in its predefined vocabulary domains. These values are suitable for use in the displayName.
NOTE: Display names may not alter the meaning of the code value. Therefore, display names should not be presented to the user on a receiving application system without ascertaining that the display name adequately represents the concept referred to by the code value. Communication must not simply rely on the display name. The display name's main purpose is to support debugging of HL7 protocol data units (e.g., messages).
Definition: The text or phrase used as the basis for the coding.
The original text exists in a scenario where an originator of the information does not assign a code, but where the code is assigned later by a coder (post-coding.) In the production of a concept descriptor, original text may thus exist without a code.
NOTE: Although post-coding is often performed from free text information, such as documents, scanned images or dictation, multi-media data is explicitly not permitted as original text. Also, the original text property is not meant to be a link into the entire source document. The link between different artifacts of medical information (e.g., document and coded result) is outside the scope of this specification and is maintained elsewhere in the HL7 standards. The original text is an excerpt of the relevant information in the original sources, rather than a pointer or exact reproduction. Thus the original text is to be represented in plain text form.
Values of type CD may have a non-NULL original text property despite having a NULL code. Any CD value with a code of NULL signifies a coding exception. In this case, originalText is a name or description of the concept that was not coded. Such exceptional values may also contain translations. Such translations directly encode the concept described in originalText.
A CD can be demoted into an ST value representing only the originalText of the value.
Definition: A set of other concept descriptors that translate this concept descriptor into other code systems.
translation is a set of other CDs that each translate the first CD into different code systems. Each element of the translation set was translated from the first CD. Each translation may, however, also contain translations. Thus, when a code is translated multiple times, the information about which code served as the input to which translation will be preserved.
NOTE: The translations are quasi-synonyms of one real-world concept. Every translation in the set is supposed to express the same meaning "in other words." However, exact synonymy rarely exists between two structurally different coding systems. For this reason, not all of the translations will be equally exact.
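The dependencies between the properties described above (a code requires a code system, codeSystemName cannot exist without codeSystem, displayName cannot exist without a code, while originalText may exist without a code) can be illustrated with a minimal sketch. The class name, field names, the `violations` helper and the use of `None` for NULL are assumptions made for illustration only; they are not defined by this specification or by any ITS.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ConceptDescriptor:
    """Illustrative stand-in for a concept descriptor; None plays the role of NULL."""
    code: Optional[str] = None
    code_system: Optional[str] = None          # a UID string
    code_system_name: Optional[str] = None     # annotation only, no computational value
    code_system_version: Optional[str] = None
    display_name: Optional[str] = None         # annotation only, no functional meaning
    original_text: Optional[str] = None        # may exist without a code
    translation: List["ConceptDescriptor"] = field(default_factory=list)


def violations(cd: ConceptDescriptor) -> List[str]:
    """Return the property dependencies from the text that this value breaks."""
    problems = []
    if cd.code is not None and cd.code_system is None:
        problems.append("code without codeSystem")
    if cd.code_system_name is not None and cd.code_system is None:
        problems.append("codeSystemName without codeSystem")
    if cd.display_name is not None and cd.code is None:
        problems.append("displayName without code")
    return problems


# A coding exception: no code, but the uncoded concept is described in originalText.
uncoded = ConceptDescriptor(original_text="chest pain on exertion")
print(violations(uncoded))                      # no dependency is broken
print(violations(ConceptDescriptor(code="X1")))  # code given without a code system
```

The code "X1" is a placeholder, not a real code from any code system.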
Definition: A character string that optionally may have a code attached. The text must always be present if a code is present. The code is often a local code.
SC is used in cases where coding is exceptional (e.g., user text messages are essentially text, and the printable message is the important content). Yet sometimes messages come from a catalog of canned messages, which SC allows to be referenced.
Any non-null SC value MAY have a code; however, a code MUST NOT be given without the text.
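The rule that a code must not be given without the text can be sketched as a constructor check. The class name `CodedString`, the plain-string code and the message identifier "MSG-0042" are hypothetical; this is an illustration of the constraint, not a prescribed representation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CodedString:
    """Illustrative SC value: text with an optional code attached."""
    text: Optional[str]
    code: Optional[str] = None   # simplified to a plain string for this sketch

    def __post_init__(self):
        # A code MUST NOT be given without the text.
        if self.code is not None and not self.text:
            raise ValueError("SC: a code must not be given without the text")


canned = CodedString("Printer out of paper", code="MSG-0042")  # text plus catalog code
free = CodedString("Please call the ward before delivery")      # text only
```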
Definition: A code representing the string data. For example, the string data may be a user-message out of a message-catalog where the code represents the identifier of the message in the message catalog.

Definition: A unique identifier string is a character string which identifies an object in a globally unique and timeless manner. The allowable formats and values and procedures of this data type are strictly controlled by HL7. At this time, user-assigned identifiers may be certain character representations of ISO Object Identifiers (OID) and DCE Universally Unique Identifiers (UUID). HL7 also reserves the right to assign other forms of UIDs (RUID), such as mnemonic identifiers for code systems.
The sole purpose of UID is to be a globally and timelessly unique identifier. The form of UID, whether it is an OID, a UUID or a RUID, is entirely irrelevant. As far as HL7 is concerned, the only thing one can do with a UID is denote the object for which it stands. Comparison of UIDs is literal, i.e., if two UIDs are literally identical, they are assumed to denote the same object. If two UIDs are not literally identical, they are not assumed to denote the same object.
No difference in semantics is recognized between the different allowed forms of the UID. The different forms are not distinguished by a component within or aside from the identifier string itself.
Even though this specification recognizes no semantic difference between the different forms of the unique identifier, there are differences in how these identifiers are built and managed, which is the sole reason to define subtypes of UID for each of the variants.
Definition: A globally unique string representing an ISO Object Identifier (OID) in a form that consists only of numbers and dots (e.g., "2.16.840.1.113883.3.1"). According to ISO, OIDs are paths in a tree structure, with the left-most number representing the root and the right-most number representing a leaf.
Each branch under the root corresponds to an assigning authority. Each of these assigning authorities may, in turn, designate its own set of assigning authorities that work under its auspices, and so on down the line. Eventually, one of these authorities assigns a unique (to it as an assigning authority) number that corresponds to a leaf node on the tree. The leaf may represent an assigning authority (in which case the root OID identifies the authority), or an instance of an object. An assigning authority owns a namespace, consisting of its sub-tree.
OIDs are the preferred scheme for unique identifiers. OIDs should always be used unless one of the inclusion criteria for other schemes applies.
ISO/IEC 8824:1990(E) clause 28 defines the Object Identifier as follows:
28.9 The semantics of an object identifier value are defined by reference to an object identifier tree. An object identifier tree is a tree whose root corresponds to [the ISO/IEC 8824 standard] and whose vertices [i.e. nodes] correspond to administrative authorities responsible for allocating arcs [i.e. branches] from that vertex. Each arc from that tree is labeled by an object identifier component, which is [an integer number]. Each information object to be identified is allocated precisely one vertex (normally a leaf) and no other information object (of the same or a different type) is allocated to that same vertex. Thus an information object is uniquely and unambiguously identified by the sequence of [integer numbers] (object identifier components) labeling the arcs in a path from the root to the vertex allocated to the information object.
28.10 An object identifier value is semantically an ordered list of object identifier component values. Starting with the root of the object identifier tree, each object identifier component value identifies an arc in the object identifier tree. The last object identifier component value identifies an arc leading to a vertex to which an information object has been assigned. It is this information object, which is identified by the object identifier value. [...]
According to ISO/IEC 8824 an object identifier is a sequence of object identifier component values, which are integer numbers. These component values are ordered such that the root of the object identifier tree is the head of the list followed by all the arcs down to the leaf representing the information object identified by the OID. The fact that OID specializes LIST<INT> represents this path of object identifier component values from the root to the leaf.
The leaf and "butLeaf" properties take the opposite view. The leaf is the last object identifier component value in the list, and the "butLeaf" property is all of the OID but the leaf. In a sense, the leaf is the identifier value and all of the OID but the leaf refers to the namespace in which the leaf is unique and meaningful.
However, what part of the OID is considered value and what is namespace may be viewed differently. In general, any OID component sequence to the left can be considered the namespace in which the rest of the sequence to the right is defined as a meaningful and unique identifier value. The value-property with a namespace OID as its argument represents this point of view.31
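The three views of an OID described above (the component list, the leaf with "butLeaf", and the value relative to a namespace) can be sketched as follows. The function names are chosen for illustration and operate on the dotted literal form; they are not an API defined by this specification.

```python
from typing import List


def components(oid: str) -> List[int]:
    """The OID as a list of integer component values, root first."""
    return [int(part) for part in oid.split(".")]


def leaf(oid: str) -> int:
    """The last component value of the OID."""
    return components(oid)[-1]


def but_leaf(oid: str) -> str:
    """All of the OID except the leaf, i.e. the namespace of the leaf."""
    return ".".join(str(c) for c in components(oid)[:-1])


def value(oid: str, namespace: str) -> str:
    """The part of the OID to the right of the given namespace OID."""
    full, prefix = components(oid), components(namespace)
    if len(prefix) >= len(full) or full[:len(prefix)] != prefix:
        raise ValueError("namespace is not a proper prefix of the OID")
    return ".".join(str(c) for c in full[len(prefix):])


oid = "2.16.840.1.113883.3.1"
print(leaf(oid))                        # 1
print(but_leaf(oid))                    # 2.16.840.1.113883.3
print(value(oid, "2.16.840.1.113883"))  # 3.1
```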
HL7 shall establish an OID registry and assign OIDs in its branch for HL7 users and vendors upon their request. HL7 shall also assign OIDs to public identifier-assigning authorities both U.S. nationally (e.g., the U.S. State driver license bureaus, U.S. Social Security Administration, HIPAA Provider ID registry, etc.) and internationally (e.g., other countries' Social Security Administrations, Citizen ID registries, etc.). The HL7 registered OIDs must be used for these organizations, regardless of whether these organizations have other OIDs assigned from other sources.
When assigning OIDs to third parties or entities, HL7 shall investigate whether an OID is already assigned for such entities through other sources. If this is the case, HL7 shall record such OID in a catalog, but HL7 shall not assign a duplicate OID in the HL7 branch. If possible, HL7 shall notify a third party when an OID is being assigned for that party in the HL7 branch.
Though HL7 shall exercise diligence before assigning an OID in the HL7 branch to third parties, given the lack of a global OID registry mechanism, one cannot make absolutely certain that there is no preexisting OID assignment for such third-party entity. Also, a duplicate assignment can happen in the future through another source. If such cases of duplicate assignment become known to HL7, HL7 shall make efforts to resolve this situation. For continued interoperability in the meantime, the HL7 assigned OID shall be the preferred OID used.
While most owners of an OID will "design" their namespace sub-tree in some meaningful way, there is no way to generally infer any meaning on the parts of an OID. HL7 does not standardize or require any namespace sub-structure. An OID owner, or anyone having knowledge about the logical structure of part of an OID, may still use that knowledge to infer information about the associated object; however, the techniques cannot be generalized.
Figure: Example of a tree of ISO object identifiers. HL7's OID is 2.16.840.1.113883.
An HL7 interface must not rely on any knowledge about the substructure of an OID for which it cannot control the assignment policies.
The structured definition of the OID is provided mostly to be faithful to the OID specification. Within HL7, OIDs are used as UID strings only, i.e., the literal string value is the only thing that is communicated and is the only thing that a receiver should have to consider when working with UIDs in the scope of the HL7 specification.
For compatibility with the DICOM standard, the literal form of the OID should not exceed 64 characters (see DICOM Part 5, Section 9).
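A receiving system might check the literal form and the recommended length before accepting an OID string. The sketch below assumes only what the text states: the literal consists of numbers and dots, and should not exceed 64 characters. The function names are hypothetical, and stricter rules (for instance on leading zeros) are deliberately not modelled.

```python
import re

# Numbers separated by single dots, as in "2.16.840.1.113883.3.1".
_OID_LITERAL = re.compile(r"[0-9]+(\.[0-9]+)*")

DICOM_MAX_LENGTH = 64  # recommended upper bound for the literal form


def is_oid_literal(text: str) -> bool:
    """True if the whole string consists only of numbers separated by dots."""
    return _OID_LITERAL.fullmatch(text) is not None


def fits_dicom_limit(text: str) -> bool:
    """True if the literal does not exceed the recommended 64 characters."""
    return len(text) <= DICOM_MAX_LENGTH
```

Keeping the two checks separate reflects that the length limit is a recommendation ("should not exceed"), whereas the numbers-and-dots form is part of the definition.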
Definition: A globally unique string representing a DCE Universal Unique Identifier (UUID) in the common UUID format that consists of 5 hyphen-separated groups of hexadecimal digits having 8, 4, 4, 4, and 12 places respectively.
Both the UUID and its string representation are defined by the Open Group, DCE 1.1 Remote Procedure Call specification, Appendix A.
UUIDs are assigned based on Ethernet MAC addresses, the point in time of creation and some random component. This mix is believed to generate sufficiently unique identifiers without any organizational policy for identifier assignment (in fact this piggy-backs on the organization of MAC address assignment.)
UUIDs are not the preferred identifier scheme for use as HL7 UIDs. UUIDs may be used when identifiers are issued to objects representing individuals (e.g., entity instance identifiers, act event identifiers, etc.) For objects describing classes of things or events (e.g., catalog items), OIDs are the preferred identifier scheme.
The structured definition of the UUID is provided mostly to be faithful to the UUID specification. Within HL7, UUIDs are used as UID strings only, i.e., the literal string value is the only thing that is communicated and is the only thing that a receiver should have to consider when working with UIDs in the scope of the HL7 specification.
The literal form for the UUID is defined according to the original specification of the UUID. However, because the HL7 UIDs are case sensitive, for use with HL7, the hexadecimal digits A-F in UUIDs must be converted to upper case.
NOTE: The output of UUID related programs and functions may use all sorts of forms, upper case, lower case, and with or without the hyphens that group the digits. This variate output must be postprocessed to conform to the HL7 specification, i.e., the hyphens must be inserted for the 8-4-4-4-12 grouping and all hexadecimal digits must be converted to upper case.
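The postprocessing described in the note can be sketched as below: hyphens are dropped, the 32 hexadecimal digits are converted to upper case and then regrouped as 8-4-4-4-12. The function name `to_hl7_uuid` is hypothetical, and the sample value is an arbitrary UUID used only to show the transformation.

```python
import re

_HEX32 = re.compile(r"[0-9A-Fa-f]{32}")


def to_hl7_uuid(raw: str) -> str:
    """Normalize UUID text to upper case with 8-4-4-4-12 hyphen grouping."""
    digits = raw.strip().replace("-", "")
    if _HEX32.fullmatch(digits) is None:
        raise ValueError("not a UUID: expected 32 hexadecimal digits")
    digits = digits.upper()
    groups = [digits[0:8], digits[8:12], digits[12:16], digits[16:20], digits[20:32]]
    return "-".join(groups)


print(to_hl7_uuid("6ba7b8109dad11d180b400c04fd430c8"))
# 6BA7B810-9DAD-11D1-80B4-00C04FD430C8
```

Because UID comparison is literal, normalizing once at the point where a UUID is generated avoids two spellings of the same UUID being treated as different identifiers.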
Definition: A globally unique string defined exclusively by HL7. Identifiers in this scheme are only defined by balloted HL7 specifications. Local communities or systems must never use such reserved identifiers based on bilateral negotiations.
HL7 reserved identifiers are strings that consist only of (US-ASCII) letters, digits and hyphens, where the first character must be a letter. HL7 may assign these reserved identifiers as mnemonic identifiers for major concepts of interest to HL7.
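The lexical rule for reserved identifiers (letters, digits and hyphens, starting with a letter) can be expressed as a pattern. This is a sketch only: the function name is hypothetical, and matching the pattern is a necessary condition, not a sufficient one, since reserved identifiers are only those defined by balloted HL7 specifications.

```python
import re

# US-ASCII letters, digits and hyphens; the first character must be a letter.
_RUID_FORM = re.compile(r"[A-Za-z][A-Za-z0-9-]*")


def has_ruid_form(text: str) -> bool:
    """True if the string satisfies the lexical rule for reserved identifiers."""
    return _RUID_FORM.fullmatch(text) is not None
```

Note that a UUID whose first hexadecimal digit is a letter also satisfies this lexical rule, so a system that needs to tell the forms apart would test for the UUID grouping first.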
Definition: An identifier that uniquely identifies a thing or object. Examples are object identifier for HL7 RIM objects, medical record number, order id, service catalog item id, Vehicle Identification Number (VIN), etc. Instance identifiers are defined based on ISO object identifiers.
Definition: A unique identifier that guarantees the global uniqueness of the instance identifier. The root alone may be the entire instance identifier.
In the presence of a non-null extension, the root is commonly interpreted as the "assigning authority", that is, it is supposed that the root somehow refers to an organization that assigns identifiers sent in the extension. However, the root does not have to be an organizational UID, it can also be a UID specifically registered for an identifier scheme.32
Definition: A character string as a unique identifier within the scope of the identifier root.
The extension is a character string that is unique in the namespace designated by the root. If a non-NULL extension exists, the root specifies a namespace (sometimes called "assigning authority" or "identifier type"). The extension property may be NULL, in which case the root OID is the complete unique identifier.
The root and extension scheme effectively means that the concatenation of root and extension must be a globally unique identifier for the item that this II value identifies.
It is recommended that systems use the OID scheme for external identifiers of their communicated objects. The extension property is mainly provided to accommodate legacy alphanumeric identifier schemes.
Some identifier schemes define certain style options to their code values. For example, the U.S. Social Security Number (SSN) is normally written with dashes that group the digits into a pattern "123-12-1234". However, the dashes are not meaningful and an SSN can just as well be represented as "123121234" without the dashes.
In the case where identifier schemes provide for multiple representations, HL7 shall make a ruling about which is the preferred form. HL7 shall document that ruling where that respective external identifier scheme is recognized. HL7 shall decide upon the preferred form based on criteria of practicality and common use. In absence of clear criteria of practicality and common use, the safest, most extensible, and least stylized (the least decorated) form shall be given preference.33
HL7 may also decide to map common external identifiers to the value portion of the II.root OID. For example, the U.S. SSN could be represented as 2.16.840.1.113883.4.1.123121234. The criteria of practicality and common use will guide HL7's decision on each individual case.
Definition: A human readable name or mnemonic for the assigning authority. The Assigning Authority Name has no computational value. The purpose of an Assigning Authority Name is to assist an unaided human interpreter of an II value to interpret the authority. Note: automated processing must not depend on the assigning authority name being present in any form.
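The roles of root, extension and assigning authority name can be illustrated with a small value class in which identity is decided by root and extension only, while the name is carried along as an annotation. The class and field names are assumptions for illustration; the root and extension values are taken from the SSN example above, and "SSA" is a made-up authority name.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class InstanceIdentifier:
    """Illustrative II value; None plays the role of a NULL extension."""
    root: str
    extension: Optional[str] = None
    # Excluded from comparison and hashing: the name has no computational value.
    assigning_authority_name: Optional[str] = field(default=None, compare=False)


with_name = InstanceIdentifier("2.16.840.1.113883.4.1", "123121234", "SSA")
without_name = InstanceIdentifier("2.16.840.1.113883.4.1", "123121234")
print(with_name == without_name)   # True: the name does not affect identity
```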
Definition: Specifies if the identifier is intended for human display and data entry (displayable = true) as opposed to pure machine interoperation (displayable = false).

Definition: A telecommunications address specified according to Internet standard RFC 2396 [http://www.ietf.org/rfc/rfc2396.txt]. The URI specifies the protocol and the contact point defined by that protocol for the resource. Notable uses of the telecommunication address data type are for telephone and telefax numbers, e-mail addresses, Hypertext references, FTP references, etc.
The Internet standard RFC 2396 [http://www.ietf.org/rfc/rfc2396.txt] defines a URI as follows:
Just as there are many different methods of access to resources, there are several schemes for describing the location of such resources. The generic syntax for URLs provides a framework for new schemes to be established using protocols other than those defined in this document.
URLs are used to "locate" resources, by providing an abstract identification of the resource location. Having located a resource, a system may perform a variety of operations on the resource, as might be characterized by such words as "access", "update", "replace", "find attributes". In general, only the "access" method needs to be specified for any URL scheme.
By agreement, it is permissible to use a URI in place of a URL. In these cases, it is still expected that the resource identified is accessible by some agreed method. A common use of URIs is to refer to SOAP attachments.
Definition: Identifies the protocol used to interpret the address string and to access the resource so addressed.
Some URL schemes are registered by the Internet Assigned Numbers Authority (IANA) [http://www.iana.org]; however, IANA only registers URL schemes that are defined in Internet RFC documents. In fact, there are a number of URL schemes defined outside RFC documents, some of which are registered with the World Wide Web Consortium (W3C).34
Similar to the ED.mediaType, HL7 makes suggestions about scheme values classifying them as required, recommended, other, and deprecated. Any scheme not mentioned has status other.
Note that this specification explicitly limits itself to URLs. Uniform Resource Names (URN) are not covered by this specification. URNs are a kind of identifier scheme for resources other than accessible ones. This specification, however, is only concerned with accessible resources, which belong in the URL category.
Definition: The address is a character string whose format is entirely defined by the scheme.
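The two conceptual properties, scheme and address, can be recovered from a URL string by splitting at the first colon. The sketch below assumes exactly that and nothing more; the function name `split_url` is hypothetical and no scheme-specific validation of the address is attempted.

```python
from typing import Tuple


def split_url(literal: str) -> Tuple[str, str]:
    """Split a URL string into (scheme, address) at the first colon."""
    scheme, separator, address = literal.partition(":")
    if not separator or not scheme:
        raise ValueError("URL must have the form scheme:address")
    return scheme, address


print(split_url("mailto:someone@example.org"))   # ('mailto', 'someone@example.org')
print(split_url("http://example.org:8080/doc"))  # ('http', '//example.org:8080/doc')
```

Only the first colon separates the scheme; later colons, such as the port separator in the second example, remain part of the address.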
While conceptually URL has the properties scheme and address, the common appearance of a URL is as a string literal formed according to the Internet standard. The general syntax of the URL literal is:
|
Note that there is no special data type for telephone numbers; telephone numbers are TELs and are specified as URLs.
The telephone number URL is defined in Internet RFC 2806 [http://www.ietf.org/rfc/rfc2806.txt]. Its definition is summarized in this subsection. This summary does not override or change any of the Internet specification's rulings.
The voice telephone URLs begin with "tel:" and fax URLs begin with "fax:".
The address is the telephone number in accordance with ITU-T E.123 Telephone Network and ISDN Operation, Numbering, Routing and Mobile Service: Notation for National and International Telephone Numbers (1993). While HL7 does not add to or take away from the URL specification, the preferred subset of the address syntax is given as follows:
|
The global absolute telephone numbers starting with the "+" and country code are preferred. Separator characters serve as decoration but have no bearing on the meaning of the telephone number. For example: "tel:+13176307960" and "tel:+1(317)630-7960" are both the same telephone number; "fax:+49308101724" and "fax:+49(30)8101-724" are both the same fax number.
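Since separator characters are decoration only, two telephone URLs can be compared after stripping them. The sketch below assumes the separators are the hyphen, dot and parentheses seen in the examples above, and handles plain "tel:" and "fax:" numbers only; the function name `canonical_tel` is hypothetical.

```python
import re

# Decorative separators assumed for this sketch: hyphen, dot and parentheses.
_SEPARATORS = re.compile(r"[-.()]")


def canonical_tel(url: str) -> str:
    """Return the telephone or fax URL with decorative separators removed."""
    scheme, separator, number = url.partition(":")
    if separator != ":" or scheme not in ("tel", "fax"):
        raise ValueError("expected a tel: or fax: URL")
    return scheme + ":" + _SEPARATORS.sub("", number)


print(canonical_tel("tel:+1(317)630-7960"))   # tel:+13176307960
print(canonical_tel("fax:+49(30)8101-724") == canonical_tel("fax:+49308101724"))  # True
```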
Definition: A telephone number (voice or fax), e-mail address, or other locator for a resource mediated by telecommunication equipment. The address is specified as a Uniform Resource Locator (URL) qualified by time specification and use codes that help in deciding which address to use for a given time and purpose.
The semantics of a telecommunication address is that a communicating entity (the responder) listens and responds to that address, and therefore can be contacted by another communicating entity (the initiator).
The responder of a telecommunication address may be an automatic service that can respond with information (e.g., FTP or HTTP services). In such a case, a telecommunication address is a reference to that information accessible through that address. A telecommunication address value can thus be resolved to some information (in the form of encapsulated data, ED).
The telecommunication address is an extension of the Uniform Resource Locator (URL) specified according to Internet standard RFC 2396 [http://www.ietf.org/rfc/rfc2396.txt]. The URL specifies the protocol and the contact point defined by that protocol for the resource. Notable use cases for the telecommunication address data type are for telephone and fax numbers, e-mail addresses, Hypertext references, FTP references, etc.
Definition: Specifies the periods of time during which the telecommunication address can be used. For a telephone number, this can indicate the time of day in which the party can be reached on that telephone. For a web address, it may specify a time range in which the web content is promised to be available under the given address.
Definition: One or more codes advising a system or user which telecommunication address in a set of like addresses to select for a given telecommunication need.
The telecommunication use code is not a complete classification for equipment types or locations. Its main purpose is to suggest or discourage the use of a particular telecommunication address. There are no easily defined rules that govern the selection of a telecommunication address.
Definition: A character string that may have a type-tag signifying its role in the address. Typical parts that exist in about every address are street, house number, or post box, postal code, city, country but other roles may be defined regionally, nationally, or on an enterprise level (e.g. in military addresses). Addresses are usually broken up into lines, which are indicated by special line-breaking delimiter elements (e.g., DEL).
Definition: Specifies whether an address part names the street, city, country, postal code, post box, etc. If the type is NULL the address part is unclassified and would simply appear on an address label as is.