Storage - FHIR v3.3.0

R4 Ballot #1 (Mixed Normative/Trial use)

This page is part of the FHIR Specification (v3.3.0: R4 Ballot 2). The current version which supercedes this version is 5.0.0. For a full list of available versions, see the Directory of published versions . Page versions: R5 R4B R4

3.6 Using FHIR in persistent stores

FHIR Infrastructure

Work Group

Maturity Level: n/a

Ballot Status: Informative

Applications can use the resources defined by FHIR by storing them natively in a database or persistent store, where different applications or modules write and read the resources as part of their implementation. This page describes implementation issues encountered when storing resources natively in persistent stores.

Note that almost all applications store the information find in resources in some persistent store - usually a database. Most applications store the information found in the resources in some internal format, and only use resources natively when exchanging with other applications. This page concerns applications that store the FHIR resources natively.

3.6.1 Applicability of storing resources natively

In principle, resources are designed for exchange between systems, rather than as a database storage format. One practical consequence of this is that in some way, resources are highly denormalised, so that granular exchanges are fairly stand alone. (Note that in some ways, resources are also highly normalised; e.g. Patient demographic details are only found in the Patient resource).

To illustrate this, consider using an RxNorm code as a medication code. In a prescription resource (MedicationRequest), that will look like this:

<MedicationRequest xmlns="http://hl7.org/fhir">
  <medicationCodeableConcept>
   <coding>
     <system value="http://www.nlm.nih.gov/research/umls/rxnorm"/>
     <code value=""/>
   </coding>
  </medicationCodeableConcept>
</MedicationRequest>

If the resource was designed to prioritize storage efficiency, it would be normalized so that the code was just some primary key reference:

<MedicationRequest xmlns="http://hl7.org/fhir">
  <code key="123431232"/>
</MedicationRequest>

or something similar. Note: Whether to normalise or not is an engineering design question that has multiple different considerations, and very often product designers change their decision about this for engineering reasons with no associated change in functionality or requirements.

Resources, on the other hand, are designed for robustness and stability over efficiency. Using a key reference like the second example would require all the participants in an exchange to synchronize the key for the code, which is often not possible, and rarely stable.

In spite of this design criteria, it can still be useful and appropriate to store resources directly in a persistent store or database. Whether to do so depends on an application's requirements.

If an application's information requirements are nailed down - that is, well understood, clearly expressed, and not subject to uncontrolled change – then it is easy to design a data storage schema that is completely fit for purpose and highly efficient compared to storing FHIR resources. Classic Enterprise Information Systems (EHRs) tend to behave like this.

If, on the other hand, the application's information requirements are not at all nailed down, and implementers have to deal with whatever data comes in an ongoing basis, then resources are actually a very sound way to store an application's data. The the extensibility approach makes FHIR a particularly robust choice as the primary data store. Clinical Data Repositories tend to behave like this.

Most information systems fall somewhere between these 2 extremes, and determinming whether to store resources directly is not straight forward. Some applications take a hybrid approach – storing FHIR resources natively, and storing a well controlled subset of information in some expressly designed persistent scheme. The balance between these is driven by the stability of information requirements for different parts of the system.

3.6.2 Technology Choices

A wide variety of technologies exist for storing resources natively, including:

Classic SQL servers with JSON support (built in, or added on top by the application)
NoSQL servers (e.g. Mongodb , Couch , Hadoop , or Big Query )
Some RDF based store using the RDF in the turtle format, or in some native triple story (e.g. Jena )

This specification does not recommend any particular approach. The rest of this page describes general information management considerations that apply across all technologies.

3.6.3 Managing Joins

Most applications that store resources natively find the way references work in FHIR deserves particular focus. References can be absolute or relative URLs, they can be version specific or not, they may or many not follow FHIR’s well defined RESTful interface pattern, and they may or might not resolve in the local system. In addition, given the requirements consideration above, it's not always possible to enforce referential integrity when storing resources directly.

All the flexibility is required in various exchange scenarios, but it can present challenges for application design when building a coherent data store with resources.

Applications that store resources natively may want to extending the storage format with an element to capture the resolved link in addition to the existing reference/identifier in the Reference and canonical data types.

The RDF format does this explicitly for references and codings (fhir:reference and fhir:concept).

Applications that store JSON, will have to remove the extension for exchange, unless a native FHIR extension is used, which might not be the most efficient way to store the resolved link.

Note to balloters: Is there enough interest to standardize an extension? Ballot comments are welcome.