Syd Bauman
Dates and Times in DH
An annotated application profile
of ISO 8601:2019
for use with TEI
and other DH systems
version 1.0.0
Northeastern University Digital Scholarship Group
2026-01-21

Acknowledgements

Funding for this project was provided by the Northeastern University Library’s Digital Scholarship Group. The author is particularly indebted to Patrick Yott for his vision; to Sarah Connell, Caitlin Pollock, and Karin Bredenberg for review and copy editing; and to Ash Clark for making the result more visually appealing and more accessible.

1. Introduction

If you were to write today’s date, you might write “Sun 25 Dec 22”, or “12/25/22”, or “25/12/22”, or “Christmas, 2022”, or any one of dozens of other formats. But if you want a) unambiguous understanding of the date by future readers, and b) your current computer to understand it, you would write “2022-12-25”. In fact, if you were writing this date on the when attribute of a TEI element, you would be required to use that format. Besides being unambiguous1 and understandable by your computer, dates in this format are also easily sorted, since their components are in ‘big endian’ order.2
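As a small illustration of the sorting claim, the following Python sketch shows that plain string sorting of YYYY-MM-DD values yields chronological order. The sample dates are arbitrary, and the sketch assumes 4-digit, non-negative years.

```python
from datetime import date

# Arbitrary sample dates, chosen only for illustration.
stamps = ["2022-12-25", "1945-10-17", "1988-06-24", "1970-04-13"]

# Sorting the strings character by character ...
lexical = sorted(stamps)
# ... gives the same order as sorting by the calendar dates they denote.
chronological = sorted(stamps, key=date.fromisoformat)

print(lexical == chronological)
```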

The standard that defines this format of a 4-digit year followed by a hyphen (U+002D) followed by a 2-digit month (01–12) followed by a hyphen followed by a 2-digit day number (01–31) is XML Schema Part 2: Datatypes, section 3.2.9 (but see also Date and Time Formats, also published by the W3C). That standard, in turn, was based on the more comprehensive, and more complicated, ISO 8601:1988(E) Data elements and interchange formats - Information interchange - Representation of dates and times published by the Organisation internationale de normalisation or ISO. This standard has since been updated and revised into the far more comprehensive, and far more complicated, ISO 8601:2019 Date and time — Representations for information interchange, also published by the ISO.3

The purpose of this document is to give readers information and guidance on how to write dates, times, intervals, and durations, along with any inherent imprecision or uncertainty, such that the representation conforms to ISO 8601:2019, and thus can be used on the TEI when-iso attribute. This document is not intended to be an exhaustive tutorial on ISO 8601:2019, although in large part it could serve as such for many of the features of the standard.

1.1. Profile4

Whenever there is a choice between an “extended”, i.e. delimited, format (e.g. "2022-12-25") and a “basic”, i.e. undelimited, format (e.g. "20221225"), this document only considers the extended format. This document simply ignores the existence of the undelimited versions; they are not mentioned any further in the prose nor exemplified anywhere. (And they will not be represented in the regular expressions or iXML grammars that will hopefully someday accompany this document.)

Furthermore, this document presumes that in all cases users wish to express dates with 4-digit years. ISO 8601 permits an ‘expanded representation’ in which a year may be indicated by more than 4 digits. It also permits a year to be indicated exponentially. The ability to record dates prior to 10,000 BCE is clearly useful in some fields (e.g., geology); Homo sapiens has been around for a period that would require 6 digits to represent, and the dates of the earliest cave drawings would require 5 digits to represent. However, humanist endeavors are generally more restricted in scope. Since this profile of ISO 8601:2019 is intended in large part for use with the Text Encoding Initiative, and the earliest known writing of any kind (let alone that which could be called text) can easily be represented in 4 digits,5 we limit ourselves to 4-digit years in the implicit formats, and do not consider exponential years even in the explicit formats. I.e., the entirety of ISO 8601-2:2019 § 4.7 is ignored.

Similarly, this document does not entertain the possibilities of expressing large numbers in exponential form, or of specifying the number of significant digits of a numeric value. (§ 4.4.2 and § 4.4.3, respectively.)

In this profile, the existence of leap seconds is ignored. In reality the International Earth Rotation and Reference Systems Service (IERS) may declare that the last minute of a given month (in practice, June or December) has either 59 or 61 seconds, rather than the usual 60. Thus ISO 8601:2019 has the capability to represent a minute as having up to 61 seconds. While it would make sense to develop validators (e.g., regular expressions or iXML grammar fragments) that allow such a minute to have 59, 60, or 61 seconds, the assumption here is that this is an extraordinarily rare requirement in DH, and thus not worth the immediate effort or added complexity.

The expression of grouped time scale units (§ 5) and selection rules (§ 12) are also ignored by this profile. While such capabilities may be of occasional use in DH, including them in this document would delay its publication by months, if not years.

This profile does not take ISO 8601-1:2019/Amd 1:2022 into account, both because I do not have a copy of it, and because the major change it makes (per Wikipedia) is to permit ‘24:00:00’ as a representation of ‘midnight at the end of a calendar day’, which should not be, and should never have been, allowed.

1.2. Internal format

In this document references to ‘ISO 8601’ without qualification are references to any version of the standard or all versions. References to ‘ISO 8601:2019’ without a part number are references to the two parts combined.

When written in this document, the portions of a representation that are placeholders for the actual values in an expression appear in blue and portions that are static appear in red. E.g., the general format for a modern date & time (without a time shift representation) is YYYY-MM-DDThh:mm:ss. (As you might guess, the actual formatting is not determined by the author of this document, but rather by the stylesheets applied to it.)

glossary
⊖ (U+2296)
an optional minus sign (U+002D).6
± (U+00B1)
either a plus sign (U+002B) or a minus sign (U+002D)
YYYY
a four-digit year
MM
a two-digit month (01–12)
DD
a two-digit day of month (01–31)
hh
a two-digit hour (00–23)
mm
a two-digit minute (00–59)
ss
a two-digit second (00–59)7
OOO
a three-digit day of the year (001–366)
ⓠ (U+24E0)
a qualification character: one of ? (uncertain), ~ (approximate), or % (both uncertain and approximate).
ⓓ (U+24D3)
either a period (U+002E) or a comma (U+002C) used as a decimal sign to separate the whole part from the decimal fractional part of a number
a terminal character with an overbar or macron — see following entries
one or more digits used as a decimal fraction
ⓓs̄ (U+0073, U+0304)
a fractional number of seconds (commonly used in the sciences, rarely needed in DH)
ⓓm̄ (U+006D, U+0304)
a fractional number of minutes (rarely used)
ⓓh̄ (U+0068, U+0304)
a fractional number of hours (rarely used)8
ⓓD̄ (U+0044, U+0304)
a fractional number of days (rarely used)
ⓓM̄ (U+004D, U+0304)
a fractional number of months (imprecise and rarely used)
ⓓȲ (U+0059, U+0304)
a fractional number of years (imprecise and rarely used)
digit
one of the characters ‘X’, ‘0’, ‘1’, ‘2’, ‘3’, ‘4’, ‘5’, ‘6’, ‘7’, ‘8’, or ‘9’

1.3. Not covered

In addition to those topics mentioned in 1.1. Profile, above, this document does not cover the temporal representations already permitted as values of the TEI when attribute in any further detail.9 Those values are:

  • date (⊖YYYY-MM-DD, e.g. "1945-10-17")
  • gYear (⊖YYYY, e.g. "1988")
  • gMonth (--MM, e.g. "--03"; not part of ISO 8601:2019)
  • gDay (---DD, e.g. "---13"; not part of ISO 8601:2019)
  • gYearMonth (⊖YYYY-MM, e.g. "1962-10")
  • gMonthDay (--MM-DD, e.g. "--04-01"; not part of ISO 8601:2019)
  • time (hh:mm:ss, e.g. "15:10:00")
  • dateTime (⊖YYYY-MM-DDThh:mm:ss, e.g. "1970-04-13T22:08:19-05:00")

Note that the example of ‘dateTime’, above, includes a time shift representation (aka a time zone); ‘time’ could also have a time shift representation appended. A time shift representation is optional, and is either ±hh, ±hh:mm, or Z.

1.3.1. other calendars

ISO 8601 covers the Gregorian calendar, and can easily be used to cover proleptic Gregorian dates. It makes no attempt to address Hebrew, Julian, Mayan, ‘Old Style’, or other calendar systems.

  <ab style="text-transform: uppercase;">
    Born <date when="1743-04-13">April 2 1743 O.S.</date>
    <lb/>
    Died <date when="1826-07-04">July 4 1826</date>
  </ab>           
  <date calendar="#Hijri" when="2024-12-22">76 Jumadal Akhirah 242 AH</date>

1.4. Implied & explicit formats

ISO 8601:2019 describes two major formats for expression of temporal information, the ‘implied form’ (or ‘implicit format’) and the ‘explicit form’. All of the examples so far in this document (“2022-12-25”, “15:10:00”, etc.) are in the implied form. In the implied form the unit of a component can be determined from its number of digits and the separator characters. In the explicit form the unit of a time scale component is indicated by a character that immediately follows the numeric value. E.g., "2022Y12M25D", "0013D", "T15H10M0S", etc.

Because these two forms are quite different, they are initially addressed in separate chapters.

In either format, there are never any spaces in a temporal representation. Yes, it is common to see a usage like ‘She was born 2024-11-27 04:01 at Exeter Hospital’. But this is an example of a complete ISO 8601 date followed by a space followed by an ISO 8601 time (without the T time designator and with reduced precision to the minute), not an ISO 8601 date & time.

2. Implied formats

In the implied formats, the characters -, +, :, T, and W are used to separate the various timescale components from one another. These characters, along with the strict length of each component, allow for unambiguous expression of dates and times.

2.1. Common form dates & times, dates, and times

A date or date and time may be represented using a combination of the year, month, day, hour, minute, and second with a variety of precisions.

  • ⊖YYYY-MM-DDThh:mm:ssⓓs̄
  • ⊖YYYY-MM-DDThh:mm:ss
  • ⊖YYYY-MM-DDThh:mmⓓm̄
  • ⊖YYYY-MM-DDThh:mm
  • ⊖YYYY-MM-DDThhⓓh̄
  • ⊖YYYY-MM-DDThh
  • ⊖YYYY-MM-DDⓓD̄
  • ⊖YYYY-MM-DD
  • ⊖YYYY-MMⓓM̄
  • ⊖YYYY-MM
  • ⊖YYYYⓓȲ
  • ⊖YYYY

Similarly a time alone may be represented using a variety of methods with a variety of precisions.

  • Thh:mm:ssⓓs̄ (the leading T is optional)
  • Thh:mm:ss (the leading T is optional)
  • Thh:mmⓓm̄ (the leading T is optional)
  • Thh:mm (the leading T is optional)
  • Thhⓓh̄
  • Thh

Note that a time expression (in any of the above implied formats, whether a date is included or not) — that is, an expression that includes either a ‘T’ or a ‘:’ — may be followed (without a space) by a time shift representation of ±hh, ±hh:mm, or Z. A time shift representation of ‘Z’ is the same as a time shift representation of ‘+00’ or ‘+00:00’. (And, in case you are curious, time shift representations of ‘-00’ and ‘-00:00’ do not exist, but would be the same if they did.)
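The time forms and time shift representations described above can be sketched as a regular expression. The following Python is only an illustration under stated assumptions: the names TIME, TIME_SHIFT, and is_time are hypothetical, decimal fractions are left out, and the pattern does not reject the nonexistent ‘-00’ and ‘-00:00’ shifts.

```python
import re

# Time shift: Z, or a sign followed by hh or hh:mm.
TIME_SHIFT = r"(?:Z|[+-](?:[01]\d|2[0-3])(?::[0-5]\d)?)"

# Time of day: hh:mm or hh:mm:ss with an optional leading T,
# or hh alone, in which case the leading T is required.
TIME = (
    r"(?:T?(?:[01]\d|2[0-3]):[0-5]\d(?::[0-5]\d)?"
    r"|T(?:[01]\d|2[0-3]))"
)

TIME_WITH_SHIFT = re.compile(rf"{TIME}{TIME_SHIFT}?")


def is_time(text: str) -> bool:
    """Return True if text is a time, optionally followed by a time shift."""
    return TIME_WITH_SHIFT.fullmatch(text) is not None


print(is_time("22:08:19-05:00"), is_time("15:10 Z"))
```

Because the pattern is applied with fullmatch, any space (as in the second call) makes the expression invalid.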

Further note that any of the above expressions for date, time, or date & time (whether a complete representation or one with reduced precision, e.g. only a decade or only hours & minutes) can be considered a duration rather than a date or time. That is, we can think of "15:10" as ‘10 minutes past three in the afternoon’ or as ‘an interval of time lasting 15 hours plus 10 minutes’. Keep in mind, though, that a time shift indication cannot be appended to a representation of duration. (An interval of 15 hours and 10 minutes lasts 910 minutes whether you are in Victoria, BC or A Coruña, Spain.)

2.2. Ordinal dates

A date may instead be represented by a year and the ordinal day number within the year. The year is (as usual) expressed as a 4-digit number; the ordinal day is expressed as a 3-digit number (1–366, where 366 is only used for leap years); and the two are separated by a hyphen. For example, Valentine’s Day of 2024 was on the 45th day, thus "2024-045"; it is on the 45th day of every year. May Day, on the other hand, was "2024-122" that year, but was on "2023-121" the year before and on "2025-121" the year after, because 2024 was a leap year.
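The ordinal day numbers quoted above can be checked with a short Python sketch. The function name to_ordinal_date is hypothetical, and Python's datetime module only handles the years 0001 through 9999.

```python
from datetime import date


def to_ordinal_date(d: date) -> str:
    """Format a calendar date as YYYY-OOO (year and day of the year)."""
    return f"{d.year:04d}-{d.timetuple().tm_yday:03d}"


print(to_ordinal_date(date(2024, 2, 14)))  # Valentine's Day
print(to_ordinal_date(date(2024, 5, 1)))   # May Day in a leap year
print(to_ordinal_date(date(2023, 5, 1)))   # May Day in a common year
```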

An ordinal date may also be combined with a time. Thus the following formats may be used to express an ordinal date, each of which could have a time shift indication appended.

  • ⊖YYYY-OOOThh:mm:ssⓓs̄
  • ⊖YYYY-OOOThh:mm:ss
  • ⊖YYYY-OOOThh:mmⓓm̄
  • ⊖YYYY-OOOThh:mm
  • ⊖YYYY-OOOThhⓓh̄
  • ⊖YYYY-OOOThh
  • ⊖YYYY-OOOⓓD̄
  • ⊖YYYY-OOO

Although it is feasible to think of ‘6543-210’ as a period of 6,543 years plus 210 days, ISO 8601:2019 does not permit using ordinal dates as durations. (Instead you would have to use something like ‘6543.574948665’, ‘6543-06.9’, or ‘6543-06-27.375’, or the explicit format, ‘6543Y210D’.)

2.3. Week dates

A date may also be represented by its week within a given year, and the day within that week. For purposes of counting weeks for this representation, the first week of a year is the first week containing a Thursday.

In this representation, the year is (as usual) expressed as a 4-digit number; the week is expressed as a 2-digit number (01–53) immediately preceded by a W; and the day within the week is expressed by a 1-digit number (1–7, where "1" is Monday and "7" is Sunday). For example, Edie Brickell and Paul Simon ‘first laid eyes on each other’ on "1988-W44-6".
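Python's standard library follows the same week-numbering rule, so conversion between week dates and calendar dates can be sketched briefly. The function names are hypothetical, and the sketch assumes a complete week date with a positive 4-digit year.

```python
from datetime import date


def to_week_date(d: date) -> str:
    """Format a calendar date as YYYY-Www-D."""
    iso_year, iso_week, iso_day = d.isocalendar()
    return f"{iso_year:04d}-W{iso_week:02d}-{iso_day}"


def from_week_date(text: str) -> date:
    """Convert a complete week date such as '1988-W44-6' to a calendar date."""
    year, week, day = text.split("-")
    return date.fromisocalendar(int(year), int(week[1:]), int(day))


print(from_week_date("1988-W44-6"))
print(to_week_date(date(2021, 1, 1)))
```

Note that the year of a week date may differ from the calendar year: 1 January 2021 fell in the last week of 2020.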

A week may be specified without a day of the week to indicate an entire week. And, as with ordinal dates and common dates, the time, possibly with a time shift, may be appended (with a T separating). Thus the following are possible forms using the week date.

  • ⊖YYYY-WWW-DThh:mm:ssⓓs̄
  • ⊖YYYY-WWW-DThh:mm:ss
  • ⊖YYYY-WWW-DThh:mmⓓm̄
  • ⊖YYYY-WWW-DThh:mm
  • ⊖YYYY-WWW-DThhⓓh̄
  • ⊖YYYY-WWW-DThh
  • ⊖YYYY-WWW-D
  • ⊖YYYY-WWW

All but the last two might have a time shift representation of ±hh, ±hh:mm, or Z appended. None but the last can represent a duration.

2.4. Supra-year specifications

The three units of time longer than a year typically used in DH are the decade, the century, and the millennium. ISO 8601, and thus this profile, provides no mechanism for directly indicating a millennium in an implicit form.

2.4.1. centuries

A century is indicated by a 2-digit number that may be preceded by a minus sign; the two digits are the same as the first two digits for every year in that century.10 Thus "20" is the 21st century (the years 2000 through 2099, inclusive), and "-01" is the 2nd century BCE (the years -0199 through -0100, inclusive).11 For example:

  <sp who="#jones">
    <speaker>PROF. JONES:</speaker>
    <p>But of course there is considerable evidence of
      open-field villages as far back as the <date when-iso="09">tenth century</date>.</p>
  </sp>

ISO 8601:2019, like an astronomer, refers to the year before the year 1 AD as year 0000, as opposed to year -0001. Thus although the centuries "00" and "-00" are not the same century, they do overlap: they each contain the year "0000".

2.4.2. decades

A decade is indicated by a 3-digit number that may be preceded by a minus sign; the three digits are the same as the first three digits for every year in that decade. Thus "192" is the 1920s (the years 1920 through 1929, inclusive, in contrast to the year 192 AD, which is "0192"), and "-002" is the 3rd decade BCE (the years -0029 through -0020, inclusive). For example:

  <sp who="#admiral_kirk">
    <speaker>KIRK:</speaker>
    <p>Him? He's harmless. Back in <date when-iso="196">the
      sixties</date> he was part of the Free Speech movement
      at Berkeley. I think he had a little too much LDS.</p>
  </sp>

Note that the decades "000" and "-000" are not the same decade, although they overlap: each contains the year "0000".
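The mapping from a century or decade value to the years it contains, including the overlap at year 0000, can be sketched as follows. The function name year_span is hypothetical, and the sketch covers only the implicit 2-digit and 3-digit forms.

```python
def year_span(value: str) -> tuple[int, int]:
    """Return (first_year, last_year) for a 2-digit century or 3-digit decade."""
    negative = value.startswith("-")
    digits = value[1:] if negative else value
    if len(digits) not in (2, 3) or not digits.isdigit():
        raise ValueError(f"not a century or decade: {value!r}")
    width = 10 ** (4 - len(digits))  # 100 years per century, 10 per decade
    low = int(digits) * width
    high = low + width - 1
    # A negative value covers the same magnitudes, counted below zero.
    return (-high, -low) if negative else (low, high)


print(year_span("20"), year_span("-01"))
print(year_span("000"), year_span("-000"))
```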

2.5. Unspecified digits

Any number of the digits of an ungrouped timescale component may be expressed as an X to indicate it is unspecified. So, e.g., while "196" indicates the 1960s (an entire decade), "196X" indicates a single year within the 1960s, but does not specify which one. Similarly, "XX:15" indicates a quarter past the hour, but does not specify past which hour.
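One way to work with unspecified digits is to turn an expression into a pattern and test concrete values against it. This is a minimal sketch: the function names are hypothetical, and it checks only digit positions, not component ranges (so it would accept an impossible hour such as 99).

```python
import re


def mask_to_regex(mask: str) -> re.Pattern:
    """Turn an expression such as '196X' into a pattern; X matches any digit."""
    parts = [r"\d" if ch == "X" else re.escape(ch) for ch in mask]
    return re.compile("".join(parts))


def matches(mask: str, concrete: str) -> bool:
    """Return True if the fully specified value is consistent with the mask."""
    return mask_to_regex(mask).fullmatch(concrete) is not None


print(matches("196X", "1967"), matches("196X", "1970"))
```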

2.6. Subsets of Year (other than month)

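A year is commonly divided into 12 months. But there are other divisions of a year we may wish to express. ISO 8601-2:2019 provides for division of a year into seasons, semesters, quadrimesters, or trimesters (quarters). Each such division is written as a 2-digit code in the position where the month would otherwise appear (e.g., "1926-26"); the codes are: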

  • 21 Spring (independent of location)
  • 22 Summer (independent of location)
  • 23 Autumn (independent of location)
  • 24 Winter (independent of location)
  • 25 Spring — Northern Hemisphere
  • 26 Summer — Northern Hemisphere
  • 27 Autumn — Northern Hemisphere
  • 28 Winter — Northern Hemisphere
  • 29 Spring — Southern Hemisphere
  • 30 Summer — Southern Hemisphere
  • 31 Autumn — Southern Hemisphere
  • 32 Winter — Southern Hemisphere
  • 33 Quarter 1 (3 months in duration)
  • 34 Quarter 2 (3 months in duration)
  • 35 Quarter 3 (3 months in duration)
  • 36 Quarter 4 (3 months in duration)
  • 37 Quadrimester 1 (4 months in duration)
  • 38 Quadrimester 2 (4 months in duration)
  • 39 Quadrimester 3 (4 months in duration)
  • 40 Semestral 1 (6 months in duration)
  • 41 Semestral 2 (6 months in duration)

  <div type="article">
    <head>The White House Spokesman</head>
    <byline>By <persName>Lindsay Rogers</persName></byline>
    <dateline>ISSUE: <date when-iso="1926-26">Summer 1926</date></dateline>
    <p>Not the least of the distinctions of the American plan
      of government is that it has tried new devices and made
      important contributions to democratic theory. The success
      of …</p>
    <!-- ... -->
  </div>
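The sub-year codes listed above lend themselves to a simple lookup table. The following Python sketch builds that table and describes a year-plus-code expression; the names SUB_YEAR and describe are hypothetical, and the sketch assumes a positive 4-digit year.

```python
SEASONS = ["Spring", "Summer", "Autumn", "Winter"]

# Build the code table from the list above.
SUB_YEAR = {}
for i, name in enumerate(SEASONS):
    SUB_YEAR[21 + i] = f"{name} (independent of location)"
    SUB_YEAR[25 + i] = f"{name} (Northern Hemisphere)"
    SUB_YEAR[29 + i] = f"{name} (Southern Hemisphere)"
for i in range(4):
    SUB_YEAR[33 + i] = f"Quarter {i + 1}"
for i in range(3):
    SUB_YEAR[37 + i] = f"Quadrimester {i + 1}"
for i in range(2):
    SUB_YEAR[40 + i] = f"Semestral {i + 1}"


def describe(value: str) -> str:
    """Describe a YYYY-MM style value whose second part may be a sub-year code."""
    year, code = value.split("-")
    number = int(code)
    if 1 <= number <= 12:
        return f"month {number} of {year}"
    return f"{SUB_YEAR[number]} of {year}"


print(describe("1926-26"))
```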

2.7. Qualification: uncertainty and approximation

Using ISO 8601:2019 we can explicitly indicate components of temporal expressions (including the entire expression) which are considered uncertain (i.e., the source is unreliable) or approximate (i.e., an estimate that is possibly or even probably correct). The qualification character used to indicate uncertainty is ?; the character used to indicate approximation is ~; and to indicate both uncertainty and approximation is %.

To apply a qualification character to a single time scale component (e.g., just the day, but not the year or month), place the qualification character immediately to the left of the component. E.g., the expression "1692-06-~16" indicates a date in June of 1692, which was probably, or at least possibly, the 16th.

To apply a qualification character to a group of temporal expression components, including the entire expression, place the character to the right of the components it applies to or at the end of the expression (i.e., as the rightmost character of the expression). E.g. the expression "1869-10-02?" indicates an uncertain date of 02 October 1869; the expression "1869-10?-02" indicates a date of 02 October 1869 for which the encoder asserts that the source is dubious with respect to the year and month, but reliable about the day (the 2nd of the month).

To summarize, in implicit formats the qualification character (either ?, ~, or %) is placed either

  • at the end of an entire temporal expression (in which case it applies to the entire expression); or
  • immediately at the end of a subset of a temporal expression, i.e., before a ‘-’, ‘T’, or ‘:’ separator character (in which case it applies to the group of components to its left — note the above is just a special case of this); or
  • immediately before the value of a particular component of a temporal expression, i.e., after a ‘-’, ‘T’, or ‘:’ separator character (in which case it applies to the single component to its right).

It is possible to use more than one qualification character in a single expression, but two qualification characters should never occur immediately next to each other.

While it is technically allowable to use the same qualification character more than once to apply to the same component, an expression that uses fewer qualification characters is preferred. E.g., although "0123?-01-23?" is allowed, "0123-01-23?" is preferred, as the ? after the year adds no information.
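The placement rules can be made concrete with a small parser that reports which qualification characters apply to each component. This is only a sketch for complete calendar dates (YYYY-MM-DD); the names QUALIFIED_DATE and qualifications are hypothetical.

```python
import re

# Each component may have a qualifier to its left and one to its right.
QUALIFIED_DATE = re.compile(
    r"([?~%]?)(\d{4})([?~%]?)-"
    r"([?~%]?)(\d{2})([?~%]?)-"
    r"([?~%]?)(\d{2})([?~%]?)"
)
NAMES = ("year", "month", "day")


def qualifications(text: str) -> dict[str, set[str]]:
    """Map each component to the set of qualification characters applying to it."""
    m = QUALIFIED_DATE.fullmatch(text)
    if m is None:
        raise ValueError(f"not a qualified calendar date: {text!r}")
    groups = m.groups()  # (left, value, right) for year, month, day
    marks = {name: set() for name in NAMES}
    for i, name in enumerate(NAMES):
        left, right = groups[3 * i], groups[3 * i + 2]
        if left:
            # A qualifier on the left applies to this component only.
            marks[name].add(left)
        if right:
            # A qualifier on the right applies to this and all earlier components.
            for covered in NAMES[: i + 1]:
                marks[covered].add(right)
    return marks


print(qualifications("1869-10?-02"))
```

Under these rules "0123?-01-23?" and "0123-01-23?" produce the same result, which is why the shorter form is preferred.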

Keeping this in mind, and using the symbol ⓡ for a qualification character placed immediately to the right of a component (and thus applying to the entire portion of the expression that precedes it, i.e. is to its left), and the symbol ⓛ for a qualification character placed immediately to the left of a component (and thus applying only to the single component that immediately follows it, i.e. the single component to its right), the following are possible formats for expression of a complete date & time in the implied format with use of a single type of qualification (i.e., one of ?, ~, or %), but applied to any number of components (including none and all six, which are the first and last items in the list, respectively).

  • YYYY-MM-DDTHH:mm:ss
  • YYYY-MM-DDTHH:mm:ⓛss
  • YYYY-MM-DDTHH:ⓛmm:ss
  • YYYY-MM-DDTⓛHH:mm:ss
  • YYYY-MM-ⓛDDTHH:mm:ss
  • YYYY-ⓛMM-DDTHH:mm:ss
  • YYYYⓡ-MM-DDTHH:mm:ss
  • YYYY-MM-DDTHH:ⓛmm:ⓛss
  • YYYY-MM-DDTⓛHH:mm:ⓛss
  • YYYY-MM-ⓛDDTHH:mm:ⓛss
  • YYYY-ⓛMM-DDTHH:mm:ⓛss
  • YYYYⓡ-MM-DDTHH:mm:ⓛss
  • YYYY-MM-DDTⓛHH:ⓛmm:ss
  • YYYY-MM-ⓛDDTHH:ⓛmm:ss
  • YYYY-ⓛMM-DDTHH:ⓛmm:ss
  • YYYYⓡ-MM-DDTHH:ⓛmm:ss
  • YYYY-MM-ⓛDDTⓛHH:mm:ss
  • YYYY-ⓛMM-DDTⓛHH:mm:ss
  • YYYYⓡ-MM-DDTⓛHH:mm:ss
  • YYYY-ⓛMM-ⓛDDTHH:mm:ss
  • YYYYⓡ-MM-ⓛDDTHH:mm:ss
  • YYYY-MMⓡ-DDTHH:mm:ss
  • YYYY-MM-DDTⓛHH:ⓛmm:ⓛss
  • YYYY-MM-ⓛDDTHH:ⓛmm:ⓛss
  • YYYY-ⓛMM-DDTHH:ⓛmm:ⓛss
  • YYYYⓡ-MM-DDTHH:ⓛmm:ⓛss
  • YYYY-MM-ⓛDDTⓛHH:mm:ⓛss
  • YYYY-ⓛMM-DDTⓛHH:mm:ⓛss
  • YYYYⓡ-MM-DDTⓛHH:mm:ⓛss
  • YYYY-ⓛMM-ⓛDDTHH:mm:ⓛss
  • YYYYⓡ-MM-ⓛDDTHH:mm:ⓛss
  • YYYY-MMⓡ-DDTHH:mm:ⓛss
  • YYYY-MM-ⓛDDTⓛHH:ⓛmm:ss
  • YYYY-ⓛMM-DDTⓛHH:ⓛmm:ss
  • YYYYⓡ-MM-DDTⓛHH:ⓛmm:ss
  • YYYY-ⓛMM-ⓛDDTHH:ⓛmm:ss
  • YYYYⓡ-MM-ⓛDDTHH:ⓛmm:ss
  • YYYY-MMⓡ-DDTHH:ⓛmm:ss
  • YYYY-ⓛMM-ⓛDDTⓛHH:mm:ss
  • YYYYⓡ-MM-ⓛDDTⓛHH:mm:ss
  • YYYY-MMⓡ-DDTⓛHH:mm:ss
  • YYYY-MM-DDⓡTHH:mm:ss
  • YYYY-MM-ⓛDDTⓛHH:ⓛmm:ⓛss
  • YYYY-ⓛMM-DDTⓛHH:ⓛmm:ⓛss
  • YYYYⓡ-MM-DDTⓛHH:ⓛmm:ⓛss
  • YYYY-ⓛMM-ⓛDDTHH:ⓛmm:ⓛss
  • YYYYⓡ-MM-ⓛDDTHH:ⓛmm:ⓛss
  • YYYY-MMⓡ-DDTHH:ⓛmm:ⓛss
  • YYYY-ⓛMM-ⓛDDTⓛHH:mm:ⓛss
  • YYYYⓡ-MM-ⓛDDTⓛHH:mm:ⓛss
  • YYYY-MMⓡ-DDTⓛHH:mm:ⓛss
  • YYYY-MM-DDⓡTHH:mm:ⓛss
  • YYYY-ⓛMM-ⓛDDTⓛHH:ⓛmm:ss
  • YYYYⓡ-MM-ⓛDDTⓛHH:ⓛmm:ss
  • YYYY-MMⓡ-DDTⓛHH:ⓛmm:ss
  • YYYY-MM-DDⓡTHH:ⓛmm:ss
  • YYYY-MM-DDTHHⓡ:mm:ss
  • YYYY-ⓛMM-ⓛDDTⓛHH:ⓛmm:ⓛss
  • YYYYⓡ-MM-ⓛDDTⓛHH:ⓛmm:ⓛss
  • YYYY-MMⓡ-DDTⓛHH:ⓛmm:ⓛss
  • YYYY-MM-DDⓡTHH:ⓛmm:ⓛss
  • YYYY-MM-DDTHHⓡ:mm:ⓛss
  • YYYY-MM-DDTHH:mmⓡ:ss
  • YYYY-MM-DDTHH:mm:ssⓡ
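The list above has one entry for each of the 2⁶ = 64 ways of choosing which of the six components are qualified. The following Python sketch regenerates the entries as a set (not in the order shown above), on the assumption that the longest run of qualified components starting at the year is marked with a single ⓡ and every other qualified component with its own ⓛ. The names are hypothetical, and ‘HH’ is written as it appears in the list.

```python
from itertools import product

COMPONENTS = ["YYYY", "MM", "DD", "HH", "mm", "ss"]
SEPARATORS = ["", "-", "-", "T", ":", ":"]  # written before each component


def render(flags) -> str:
    """flags: six booleans, True where the component is qualified."""
    # Length of the run of qualified components that starts at the year.
    prefix = 0
    while prefix < 6 and flags[prefix]:
        prefix += 1
    out = []
    for i, (sep, comp) in enumerate(zip(SEPARATORS, COMPONENTS)):
        out.append(sep)
        if flags[i] and i >= prefix:
            out.append("ⓛ")  # qualified on its own, outside the leading run
        out.append(comp)
        if prefix and i == prefix - 1:
            out.append("ⓡ")  # closes the leading run of qualified components
    return "".join(out)


patterns = {render(flags) for flags in product([False, True], repeat=6)}
print(len(patterns))
```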

Qualification characters can (of course) be used in representations of dates (without time), times (without date), and reduced precision versions thereof, with or without a decimal fraction on the highest precision component. Furthermore it is possible to use two or three different qualification characters within a single expression. These possibilities are not covered by the above list. Qualification characters can also be used in ordinal dates (e.g., "1934?-016") or week dates (e.g., "2025-W04-~2").

Note that ISO 8601:2019 does not seem to say whether a qualification character applied to a time or date & time that has a time shift indicated is placed before the time shift representation or after. That is, it is inconclusive as to whether ‘around quarter to eight in the morning’ in Jerusalem should be represented with "07:45~+03" or with "07:45+03~". Logically the former seems more appropriate, as the approximation does not apply to the time zone; and there is no indication in ISO 8601:2019 that a qualification may be applied to a time shift. On the other hand, the only sentence that seems to apply says ‘a qualification symbol occurring at the rightmost end of the expression’. (Emphasis added.)

3. Explicit Formats

In the explicit form, each time scale component is expressed as a number followed by a letter that explicitly tells the reader (whether human or computer) what time component the number refers to. For example "1941Y8M15DT7H12M", which may be more easily visualized as 1941Y8M15DT7H12M, is the same as "1941-08-15T07:12". In the explicit formats there are no spaces or punctuation between the time scale components, which are always expressed in a particular order. Leading zeroes are not required. For example, in "2024Y1M26D" the number of months is represented by ‘1M’, not ‘01M’.

The basic symbols used in explicit expressions, in order
P
optionally precedes the entire expression to indicate it is a duration (or period)
Y
years
M
months
D
days
T
precedes the time components
H
hours
M
minutes
S
seconds
Z
precedes the time shift indication

As with the implicit format, a time, if present, is preceded by a T; unlike the implicit format, a component which has a zero value may be omitted. For example, "T30M" is the same as "0Y0M0DT0H30M0S". (It is also the same as "T0.5H" except it asserts half past midnight precise to the minute, whereas "T0.5H" asserts half past midnight with precision only to the tenth of an hour, or six minutes).

A time shift may be appended. It is preceded by a Z and expressed using an optional minus sign followed by an explicit time indication. (No time indication following the Z indicates the same time shift as ‘Z0H’ or ‘Z0H0M’, i.e. UTC.) For example, "2019Y5M1DT23H59MZ2H".
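To show how the explicit and implied forms correspond, here is a minimal Python sketch that converts an explicit date, or date & time, to the implied form. It is an illustration only: the names are hypothetical; it requires year, month, and day; it ignores time shifts, fractions, and qualification; and it assumes that trailing omitted time components mean reduced precision while interior omitted ones mean zero.

```python
import re

EXPLICIT = re.compile(
    r"(\d+)Y(\d+)M(\d+)D"
    r"(?:T(?:(\d+)H)?(?:(\d+)M)?(?:(\d+)S)?)?"
)


def explicit_to_implied(text: str) -> str:
    """Convert e.g. '1941Y8M15DT7H12M' to '1941-08-15T07:12'."""
    m = EXPLICIT.fullmatch(text)
    if m is None:
        raise ValueError(f"not handled by this sketch: {text!r}")
    year, month, day, hour, minute, second = m.groups()
    result = f"{int(year):04d}-{int(month):02d}-{int(day):02d}"
    times = [hour, minute, second]
    # Drop trailing absent components (reduced precision).
    while times and times[-1] is None:
        times.pop()
    if times:
        # Interior absent components are treated as zero.
        result += "T" + ":".join(f"{int(t or 0):02d}" for t in times)
    return result


print(explicit_to_implied("1941Y8M15DT7H12M"))
```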

3.1. Explicit week and ordinal dates

To indicate a week date using the explicit format the week number is followed by the symbol K, for example "1988Y44K6D".

An ordinal date, i.e. the date expressed as the ordinal day number within the year rather than with a month and the day number within the month, is indicated in the explicit format by the symbol O. For example, "2024Y45O".

Week dates and ordinal dates in the explicit form may be qualified and may have a time shift appended.

3.2. Explicit century and decade

In the explicit format a century is indicated with the symbol C, thus "18C" is the 19th century (the years 1800 through 1899) and "-1C" represents the 2nd century BCE, i.e. the years -199 through -100.

Similarly, a decade is indicated with the symbol J, thus "192J" is the decade of the 1920s.

Centuries and decades in the explicit form may be qualified. It is not clear whether ISO 8601:2019 permits a time shift with an explicit decade or century, but in any case this profile prohibits them.

4. Intervals

A time interval starts at one instant in time (which is necessarily described using a time component larger than an instant) and ends at another. Although we usually think of, use, and represent time intervals whose start time chronologically precedes their end time, ISO 8601:2019 allows expression of a negative duration, and thus an interval that goes backwards in time.

4.1. Single interval

A time interval is represented as one of

  • a start point and an end point
  • a start point and a duration
  • a duration and an end point

In all three cases, the two main components are separated by a slash (or solidus, U+002F). Start or end points may be represented using either the implicit or explicit form; a duration should be expressed in the explicit form with the preceding ‘P’.12 For example, "1994-04-07/1994-07-19" (which can be abbreviated to "1994-04-07/07-19"), "1994-04-07/P0Y3M13D", and "P104D/1994-07-19" all represent the same interval.
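The day count in the example can be checked with a short Python sketch. It assumes that both the start day and the end day are counted in full, which is the reading under which the 104-day duration agrees with the start and end dates; the function name is hypothetical.

```python
from datetime import date


def inclusive_days(interval: str) -> int:
    """Count the days in a start/end interval, including both endpoint days."""
    start_text, end_text = interval.split("/")
    start = date.fromisoformat(start_text)
    end = date.fromisoformat(end_text)
    return (end - start).days + 1


print(inclusive_days("1994-04-07/1994-07-19"))
```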

4.2. Recurring intervals

A recurring time interval is represented by a single time interval preceded by RN/, where N is a positive integer representing how many times the interval recurs13 or is absent, indicating that the interval repetition is unbounded. If the interval which is repeated starts with a date, time, or date & time, it represents the start of the first interval; if the interval which is repeated starts with a duration, then the end date represents the end of the last interval. For example, "R/1987-03-01/P1M".

5. Sets: conjunctions and disjunctions

5.1. Conjunction: multiple discrete dates

It is sometimes useful to express that a single event took place over multiple, discrete dates or times. Often expressing these dates as a list of <date> (or <time>) elements makes sense, but occasionally it is more reasonable to record them as a single element.

Take, for example, the entry for 1708-03-2114 from The ladies diary: or, the womens almanack, for the year of our lord 1708: ‘Sun riſes and ſets at 6. Equal Days and Nights.’. The author is relying on the ambiguity of the 12-hour clock to express two different times with the single character, 6. Or consider an advertisement for the Los Angeles appearance of Taylor Swift during her 1989 tour: ‘Appearing in Los Angeles, CA on the 22nd, 24th, 25th, and 26th of this August, 2015!’ Even if editors felt it appropriate to express tour dates using a range, in this case it would not work, because the range "2015-08-22..2015-08-26" (see the last paragraph of 6. Ranges) would incorrectly imply there was a performance on Sun 23 Aug 15. While other encodings are possible for these sorts of cases, ISO 8601:2019 has a method for expressing exactly what is needed here: the conjunction of a set of (possibly disjoint) dates, times, or both.

A set of comma-delimited expressions enclosed in curly braces (U+007B & U+007D) means “all members of the set”. Thus:

  Sun riſes and ſets at <time when-iso="{06:00,18:00}">6</time>. Equal Days and Nights.

And the expression for the Taylor Swift concert dates would be {2015-08-22,2015-08-24,2015-08-25,2015-08-26}.

5.2. Disjunction: one of a set

It is not at all uncommon to know some information about when something happened or to what date is being referred, but not enough to pin it down to one specific date. ISO 8601:2019 has a method for expressing exactly what is needed for this sort of case: the disjunction of a set of (possibly disjoint) dates, times, or both.

A set of comma-delimited expressions enclosed in square brackets means “one member of the set”. So, for example, I know that my grandfather won a copy of the Kelmscott Chaucer at an auction in NYC sometime between late 1945 and early 1947, and furthermore it was sometime between November and February. Thus the expression "[1945-11,1945-12,1946-01,1946-02,1946-11,1946-12,1947-01,1947-02]" represents the various possibilities, precise to the month.15

5.3. Set notes

Semantic note: The order of expressions within a set is irrelevant. It is a lot easier for humans to proofread if the expressions are in some useful order (e.g., chronologic), but the set has the same meaning no matter what the order.

Syntactic note: Whitespace may not be used to make the set more readable. Expressions are separated by a comma, not by a comma and a space. In fact, since the comma-separated expressions have no internal whitespace, there is no whitespace in the entire set.

Underspecification note: ISO 8601:2019 does not say what it means if the same expression occurs more than once in a set. Nor does it address the ambiguity that arises if a comma is used as the decimal sign in one of the components in a set. (In any case, this profile does not permit a comma to be used as a decimal sign when inside a set.)

Undercomprehension note: I have yet to figure out if a set may be an element of a set. Although it is not explicitly demonstrated anywhere in the standard, this nesting capability would be a useful feature. E.g., taking advantage of both it and the double-dot notation discussed in the next chapter, we could express the dates that Professor Remus Lupin took wolfsbane potion mixed by Professor Severus Snape as {1993-09-24..1993-09-30,1993-10-24..1993-10-30,1993-11-23..1993-11-29,1993-12-23..1993-12-28,1994-01-21..1994-01-27,1994-02-19..1994-02-25,1994-03-21..1994-03-27,1994-04-19..1994-04-25,1994-05-18..1994-05-24,[1994-06-17..1994-06-22,{1994-06-17..1994-06-21,1994-06-23}]}. That is, he took the potion for 7 days prior to each full moon during the Hogwarts 1993–1994 term, except before the final full moon of the term he missed either the last or second-to-last dose.
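A flat (non-nested) set can be taken apart with very little code, precisely because whitespace is not allowed and, in this profile, a comma inside a set is always a separator. The following Python sketch uses a hypothetical function name and simply rejects nested sets rather than guessing at their meaning.

```python
def parse_set(text: str) -> tuple[str, list[str]]:
    """Return ('all' or 'one', members) for a flat {...} or [...] set."""
    brackets = {"{": ("}", "all"), "[": ("]", "one")}
    if len(text) < 2 or text[0] not in brackets:
        raise ValueError(f"not a set: {text!r}")
    closer, meaning = brackets[text[0]]
    if text[-1] != closer:
        raise ValueError(f"mismatched brackets: {text!r}")
    if any(ch.isspace() for ch in text):
        raise ValueError("whitespace is not allowed in a set")
    body = text[1:-1]
    if any(ch in "{}[]" for ch in body):
        raise ValueError("nested sets are not handled by this sketch")
    return meaning, body.split(",")


print(parse_set("{06:00,18:00}"))
print(parse_set("[1945-11,1945-12,1946-01,1946-02]"))
```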

6. Ranges

It is often useful to express that something occurred sometime before a given date (or time), or sometime after a given date (or time), without being able to specify the actual date (or time) of occurrence. For example, we do not know when, exactly, the Doves Press began, but we do know it was before Emery Walker joined the business in 1900.

ISO 8601:2019 provides a mechanism precisely for cases such as these. A date or time expression preceded by a double dot (.., i.e., two U+002E characters in a row) means “on or before” the date or time expression. The entire double-dot expression indicates a single time or date of the same precision as the expression provided after the dots, either equivalent to it or some date or time prior to it. For example, "..2013-06" means “a calendar month that was before July of 2013”, whereas "..1914-06-28T11:30+01" means “some minute before 11:31 CET on 28 June 1914”.

Similarly, a date or time expression followed by a double dot means “on or after” the date or time expression. So the sentence “Every day since we learned of global warming that we have done nothing is a day of shame.” might be encoded

  <date when-iso="1988-06-24..">Every day since we learned
  of global warming</date> that we have done nothing is a
  day of shame.
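As a rough illustration of the “on or before” and “on or after” semantics, the following Python sketch tests whether a candidate date falls within an open-ended double-dot expression. It is a sketch under several assumptions, not a conforming parser: the function name `satisfies_open_range` is hypothetical; only calendar dates with 4-digit positive years are handled (no times, time zones, or qualifiers); month and day values are not range-checked; and the candidate must have the same precision as the bound. With those restrictions, plain string comparison suffices, because the components are in big-endian order.

```python
import re

# Year, year-month, or year-month-day; digits only, no range checking.
_DATE = re.compile(r"\d{4}(-\d{2}){0,2}")


def satisfies_open_range(open_range: str, candidate: str) -> bool:
    """Return True if `candidate` is permitted by '..X' or 'X..'."""
    if open_range.startswith("..") and not open_range.endswith(".."):
        bound, on_or_before = open_range[2:], True
    elif open_range.endswith("..") and not open_range.startswith(".."):
        bound, on_or_before = open_range[:-2], False
    else:
        raise ValueError("expected exactly one leading or trailing '..'")
    for value in (bound, candidate):
        if not _DATE.fullmatch(value):
            raise ValueError(f"unsupported date expression: {value!r}")
    # The double-dot expression has the same precision as its bound.
    if len(bound) != len(candidate):
        raise ValueError("candidate and bound must have the same precision")
    # Same-length big-endian strings sort in chronological order.
    return candidate <= bound if on_or_before else candidate >= bound
```

For example, under these assumptions "2013-06" satisfies "..2013-06" but "2013-07" does not, and every day from 1988-06-24 onward satisfies "1988-06-24..".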

The “on or before” and the “on or after” representations can be combined as a shorthand for a conjunction set. For example, 2021-05..2022-01 represents the same set of calendar months as {2021-05,2021-06,2021-07,2021-08,2021-09,2021-10,2021-11,2021-12,2022-01}.
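The equivalence between a closed double-dot range and a conjunction set can be sketched in Python. This is a minimal illustration that assumes both endpoints are year-month expressions with 4-digit positive years; the function name `expand_month_range` is hypothetical and not drawn from the standard or from any library.

```python
import re

_YEAR_MONTH = re.compile(r"(\d{4})-(\d{2})")


def expand_month_range(expr: str) -> str:
    """Expand 'YYYY-MM..YYYY-MM' into the equivalent '{...}' set."""
    start, sep, end = expr.partition("..")
    if not sep:
        raise ValueError("expected a double dot between two expressions")
    bounds = []
    for value in (start, end):
        match = _YEAR_MONTH.fullmatch(value)
        if not match:
            raise ValueError(f"not a year-month expression: {value!r}")
        year, month = int(match.group(1)), int(match.group(2))
        if not 1 <= month <= 12:
            raise ValueError(f"month out of range: {value!r}")
        bounds.append((year, month))
    (year, month), last = bounds
    if (year, month) > last:
        raise ValueError("start of range is after its end")
    months = []
    # Walk month by month, rolling over to January of the next year.
    while (year, month) <= last:
        months.append(f"{year:04d}-{month:02d}")
        month += 1
        if month > 12:
            year, month = year + 1, 1
    # No whitespace: members are separated by a bare comma.
    return "{" + ",".join(months) + "}"
```

Applied to "2021-05..2022-01", the sketch produces the nine-month set shown above.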

Notes
1
This format is unambiguous because we, modern human society, have agreed through our national standards bodies and the International Organization for Standardization that the order is year, month, day. Without that agreement, ‘1797-03-04’ is ambiguous: it might indicate 04 March 1797 or it might indicate 03 April 1797.
3
To be more precisely correct, the standard is published as two separate parts, ISO 8601-1:2019 Date and time — Representations for information interchange — Part 1: Basic rules and ISO 8601-2:2019 Date and time — Representations for information interchange — Part 2: Extensions.
4
This profile is very similar to the EDTF level 2 profile specified in Appendix A of ISO 8601-2:2019.
5
And it is basically inconceivable that human civilization as we know it, let alone computer text encoding, will exist 7,975 years from now. Heck, I would be surprised if it lasted 79.75 years.
6
Thus this is the ‘two dee or not two dee’ sign.☺
7
Reminder, in rare circumstances a ‘60’ may be necessary, but these cases are not covered by this profile.
8
Although I personally keep track of billable hours to the tenth of an hour, not the ¼ hour, which has made me an annoyance to my manager and very good at multiples of 6.
9
Except to point out that gMonth, gMonthDay, and gDay indicate recurrence, not one of a set. E.g., "--08-24" means ‘every August 24th’, not ‘some August 24th, I do not know which’.
10
There are two main systems for determining which years are part of each standard century. In the strict construction, the twentieth century comprises the years 1901–2000, inclusive. In the popular model, it is 1900–1999, inclusive. (See the discussion in the Wikipedia page for details.) ISO 8601:2019 uses the popular model.
11
Many dating systems instead consider the 2nd century BCE to be -0200 through -0101, inclusive.
12
It is not clear to me that this is a requirement of ISO 8601:2019, but since representing durations using the implicit form would be ambiguous …
13
I suppose ‘0’ is allowed, but would be kinda pointless.
14
Which entry is labeled ‘10’ for the 10th of March 1708, as it was written using ‘old style’ dates.
15
In truth I further know the auction occurred on a schoolday, not a weekend or holiday. But adding that extra precision would make the example very long, and pedagogically speaking, changes nothing.