Optimizing UTC → Unix Time Conversion For Size And Speed
How do you convert a UTC timestamp to Unix Time (seconds since the epoch)?
"2020-04-29 04:48:15" → 1588135695
Of course the right answer is “you use a standard library function.” But what if you don’t have one available? Or what if you’re the person implementing that library?
Converting the time portion is trivial. Unix Time pretends that leap seconds do not exist and makes every day exactly 86,400 seconds long. This is a fib on systems that implement UTC leap second insertion^{1}, but it makes the algorithm very simple:
time_t hms_to_time(int h, int m, int s) {
return (h * 3600) + (m * 60) + s;
}
But the calendar part is more challenging. Months have unequal lengths, and leap years complicate everything. Leap years insert an extra day at the end of February whenever:
 the year is divisible by four
 excluding years divisible by 100
 but including years divisible by 400
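Spelled out in C, the three rules above combine into a one-line predicate. This is a minimal sketch for illustration; the is_leap_year name is mine and is not part of any library discussed here:

```c
#include <stdbool.h>

/* True if `year` contains a February 29th, per the three rules above:
 * divisible by 4, excluding centuries, but including every 4th century. */
bool is_leap_year(int year) {
  return (year % 4 == 0 && year % 100 != 0) || (year % 400 == 0);
}
```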
Surprisingly, no version of C includes a UTC → Unix Time conversion function in the standard library. There is a nonstandard function timegm(), but its use is discouraged. The Linux manpage says:

These functions are nonstandard GNU extensions that are also present on the BSDs. Avoid their use.

And on BSD:

The timegm() function is not specified by any standard; its function cannot be completely emulated using the standard functions described above.
I needed an algorithm to perform this UTC→UnixTime conversion for the JSON parser in upb. The JSON mapping for Protocol Buffers says that timestamps are formatted using strings like:
1972-01-01T10:00:20.021Z
Once we have parsed the individual numbers out of such a timestamp string, we need a way of translating to seconds since the Unix Epoch, which is the internal representation of the google.protobuf.Timestamp type.
Since upb is written in C and intended to be portable, I needed to roll my own. Since upb aims to be as small and fast as possible, I became very interested in the problem of how far this algorithm could be pushed in size and speed.
A Fortran Solution from 1968
Several leads pointed me to a Letter to the Editor published in Communications of the ACM in October 1968 (Volume 11, Number 10). In this letter Henry F. Fliegel and Thomas C. Van Flandern offered the following Fortran function for performing this conversion. This function uses integer math (all divisions round down):
JD (I, J, K) = K - 32075 + 1461*(I + 4800 + (J - 14)/12)/4
             + 367*(J - 2 - (J - 14)/12*12)/12 - 3
             *((I + 4900 + (J - 14)/12)/100)/4
JD here refers to the Julian Date, the number of days that have passed since the Julian Date Epoch. This epoch is very long ago, in 4713 BC, but it's a simple matter to convert it to the Unix epoch of 1970-01-01 by subtracting 2440588, the Julian Day number of the Unix Epoch. The variables I, J, and K are the year, month, and day, respectively.
Once we have mapped a Y/M/D to a total number of days, it is easy to multiply this by 86,400 seconds and add hms_to_time() to get a full Unix Time result.
The function above is remarkable. It does not use any lookup table, and yet it somehow encodes the irregular pattern of month lengths and leap year rules. It is trivial to convert this Fortran function to C using the int data type, which also rounds down for positive numbers (the algorithm ensures that all numbers being divided are positive, as long as year >= -4800).
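For illustration, here is one such transcription into C, shifted to the Unix epoch. The function names are mine; the arithmetic is copied from the letter:

```c
/* A direct transcription of the Fliegel & Van Flandern statement function.
 * I = year, J = month (1-12), K = day.  Fortran integer division truncates
 * toward zero, just like C's, so the arithmetic carries over unchanged. */
long julian_day(long I, long J, long K) {
  return K - 32075 + 1461 * (I + 4800 + (J - 14) / 12) / 4 +
         367 * (J - 2 - (J - 14) / 12 * 12) / 12 -
         3 * ((I + 4900 + (J - 14) / 12) / 100) / 4;
}

/* Shift to the Unix epoch: 2440588 is the Julian Day of 1970-01-01. */
long epoch_days_fortran(long y, long m, long d) {
  return julian_day(y, m, d) - 2440588;
}
```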
Improving Readability With A Lookup Table
In the interest of having an algorithm that is easier to understand than the Fortran algorithm above, I wrote the following algorithm for upb’s old JSON parser. All of the general techniques used here are published in various places, I just condensed them into the smallest and simplest algorithm I could. This uses the Unix epoch instead of a Julian Day.
/* https://github.com/protocolbuffers/upb/blob/22182e6e/upb/json/parser.rl#L1697
 * epoch_days(1970, 1, 1) == 1970-01-01 == 0 */
int epoch_days_table(int year, int month, int day) {
  static const uint16_t month_yday[12] = {0,   31,  59,  90,  120, 151,
                                          181, 212, 243, 273, 304, 334};
  uint32_t year_adj = year + 4800;  /* Ensure positive year, multiple of 400. */
  uint32_t febs = year_adj - (month <= 2 ? 1 : 0);  /* Februaries since base. */
  uint32_t leap_days = 1 + (febs / 4) - (febs / 100) + (febs / 400);
  uint32_t days = 365 * year_adj + leap_days + month_yday[month - 1] + day - 1;
  return days - 2472692;  /* Adjust to Unix epoch. */
}
This algorithm uses a lookup table that gives the number of days preceding each month. Like the Fortran algorithm, it requires that division rounds down. But C rounds towards zero^{2}, so like the Fortran algorithm above, we start by adding 4800 so the year is always positive.
This algorithm is nice and readable. It is small and fast. There is only one noticeable downside if we are being performance-obsessed: it relies on a lookup table, which adds a bit of latency in the best case, or a lot of latency if the lookup table is not in cache.
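Putting the pieces together, the timestamp from the introduction can be converted by combining epoch_days_table() with hms_to_time(). A sketch (both functions repeated so the example stands alone; the ymd_hms_to_unix wrapper name is mine):

```c
#include <stdint.h>
#include <time.h>

/* Repeated from above so this example stands alone. */
time_t hms_to_time(int h, int m, int s) {
  return (h * 3600) + (m * 60) + s;
}

int epoch_days_table(int year, int month, int day) {
  static const uint16_t month_yday[12] = {0,   31,  59,  90,  120, 151,
                                          181, 212, 243, 273, 304, 334};
  uint32_t year_adj = year + 4800;
  uint32_t febs = year_adj - (month <= 2 ? 1 : 0);
  uint32_t leap_days = 1 + (febs / 4) - (febs / 100) + (febs / 400);
  uint32_t days = 365 * year_adj + leap_days + month_yday[month - 1] + day - 1;
  return days - 2472692;
}

/* Full conversion: days since the epoch, times 86,400 seconds per day,
 * plus the seconds within the day. */
time_t ymd_hms_to_unix(int y, int mo, int d, int h, int mi, int s) {
  return (time_t)epoch_days_table(y, mo, d) * 86400 + hms_to_time(h, mi, s);
}
```

For example, ymd_hms_to_unix(2020, 4, 29, 4, 48, 15) yields 1588135695, matching the conversion at the top of the article.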
Eliminating the Lookup Table
More recently I came across the following algorithm published by Howard Hinnant. Like the Fortran algorithm, it avoids the lookup table. But its logic is a bit easier to follow, and Howard includes an extensive explanation for how it works.
// Returns number of days since civil 1970-01-01.  Negative values indicate
//    days prior to 1970-01-01.
// Preconditions:  y-m-d represents a date in the civil (Gregorian) calendar
//                 m is in [1, 12]
//                 d is in [1, last_day_of_month(y, m)]
//                 y is "approximately" in
//                   [numeric_limits<Int>::min()/366, numeric_limits<Int>::max()/366]
//                 Exact range of validity is:
//                 [civil_from_days(numeric_limits<Int>::min()),
//                  civil_from_days(numeric_limits<Int>::max()-719468)]
template <class Int>
constexpr
Int
days_from_civil(Int y, unsigned m, unsigned d) noexcept
{
    static_assert(std::numeric_limits<unsigned>::digits >= 18,
             "This algorithm has not been ported to a 16 bit unsigned integer");
    static_assert(std::numeric_limits<Int>::digits >= 20,
             "This algorithm has not been ported to a 16 bit signed integer");
    y -= m <= 2;
    const Int era = (y >= 0 ? y : y-399) / 400;
    const unsigned yoe = static_cast<unsigned>(y - era * 400);      // [0, 399]
    const unsigned doy = (153*(m + (m > 2 ? -3 : 9)) + 2)/5 + d-1;  // [0, 365]
    const unsigned doe = yoe * 365 + yoe/4 - yoe/100 + doy;         // [0, 146096]
    return era * 146097 + static_cast<Int>(doe) - 719468;
}
This algorithm replaces the month lookup table with the clever expression:
DayOfYear(adjusted_month) = (153 * adjusted_month + 2) / 5
This formula describes a line with slope 153/5, or 30.6, which falls between the 30- and 31-day month lengths. But it is cleverly chosen to exploit truncating integer division to step in the same irregular pattern that the months do.
My colleague Gerben Stavenga realized that there are many such expressions of this form that will calculate the same value. If we use a variant where the divisor is a power of two, the compiler can optimize the division to a right shift. Specifically we can use:
DayOfYear(adjusted_month) = (62719 * adjusted_month + 769) / 2048
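The two constants are easy to check against each other, since the adjusted (March-based) month index only takes twelve values. A quick sketch (the helper names are mine, purely for illustration):

```c
#include <stdint.h>

/* Day-of-year offset of March-based month m (March = 0, ..., February = 11),
 * computed with both constants.  The divisor 2048 compiles to a shift. */
uint32_t doy_div5(uint32_t m)    { return (153 * m + 2) / 5; }
uint32_t doy_div2048(uint32_t m) { return (62719 * m + 769) / 2048; }

/* Returns 1 if the two formulas agree for all twelve months. */
int month_formulas_agree(void) {
  for (uint32_t m = 0; m < 12; m++) {
    if (doy_div5(m) != doy_div2048(m)) return 0;
  }
  return 1;
}
```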
With some help from Gerben I arrived at this formulation for the current version of upb:
/* https://github.com/protocolbuffers/upb/blob/22182e6e/upb/json_decode.c#L982
 * epoch_days_fast(1970, 1, 1) == 1970-01-01 == 0. */
int epoch_days_fast(int y, int m, int d) {
  const uint32_t year_base = 4800;   /* Before min year, multiple of 400. */
  const uint32_t m_adj = m - 3;      /* March-based month. */
  const uint32_t carry = m_adj > m ? 1 : 0;
  const uint32_t adjust = carry ? 12 : 0;
  const uint32_t y_adj = y + year_base - carry;
  const uint32_t month_days = ((m_adj + adjust) * 62719 + 769) / 2048;
  const uint32_t leap_days = y_adj / 4 - y_adj / 100 + y_adj / 400;
  return y_adj * 365 + leap_days + month_days + (d - 1) - 2472632;
}
Note that although this algorithm has some ternary operators, it compiles into fully branchless code with one cmov in it. In fact, all the algorithms given in this article compile to branchless code (on x86-64 using recent versions of Clang).
Semantics and Correctness
Before we compare these algorithms for size and speed, a few notes about their semantics and correctness.
Supported Input Range
I verified all the algorithms in this article across a wide range of dates (-4712-01-01 to 22666-12-20, or Julian Date 1 to 10,000,000). For every day in this range I checked that the outputs match each other, and that they match the results of timegm(). There was a snag on macOS, as its timegm() function fails for all years prior to 1900, so I disabled it for those years.
The interval [-4712, 22666] is a very wide range of years, and likely to be far beyond what is needed in practice. That said, these algorithms effectively support much wider bounds than this. All of them support to at least the date 1000000-01-01 (year 1 Million).
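A verification loop along those lines can be sketched as follows, cross-checking the two upb-style functions against each other and making sure consecutive dates map to consecutive day numbers. This is my own test harness, not the one used for the article; both functions are repeated so the sketch compiles on its own:

```c
#include <stdint.h>

/* Repeated from above. */
int epoch_days_table(int year, int month, int day) {
  static const uint16_t month_yday[12] = {0,   31,  59,  90,  120, 151,
                                          181, 212, 243, 273, 304, 334};
  uint32_t year_adj = year + 4800;
  uint32_t febs = year_adj - (month <= 2 ? 1 : 0);
  uint32_t leap_days = 1 + (febs / 4) - (febs / 100) + (febs / 400);
  uint32_t days = 365 * year_adj + leap_days + month_yday[month - 1] + day - 1;
  return days - 2472692;
}

/* Repeated from above. */
int epoch_days_fast(int y, int m, int d) {
  const uint32_t year_base = 4800;
  const uint32_t m_adj = m - 3;
  const uint32_t carry = m_adj > (uint32_t)m ? 1 : 0;
  const uint32_t adjust = carry ? 12 : 0;
  const uint32_t y_adj = y + year_base - carry;
  const uint32_t month_days = ((m_adj + adjust) * 62719 + 769) / 2048;
  const uint32_t leap_days = y_adj / 4 - y_adj / 100 + y_adj / 400;
  return y_adj * 365 + leap_days + month_days + (d - 1) - 2472632;
}

static int days_in_month(int y, int m) {
  static const int len[12] = {31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31};
  int leap = (y % 4 == 0 && y % 100 != 0) || (y % 400 == 0);
  return (m == 2 && leap) ? 29 : len[m - 1];
}

/* Returns 1 if both functions agree, and each day is exactly one more than
 * the previous day, over every date in [y_min-01-01, y_max-12-31]. */
int cross_check(int y_min, int y_max) {
  int prev = epoch_days_table(y_min, 1, 1) - 1;
  for (int y = y_min; y <= y_max; y++)
    for (int m = 1; m <= 12; m++)
      for (int d = 1; d <= days_in_month(y, m); d++) {
        int got = epoch_days_table(y, m, d);
        if (got != epoch_days_fast(y, m, d) || got != prev + 1) return 0;
        prev = got;
      }
  return 1;
}
```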
Most of the algorithms given here use the year -4800 as their base, and will return incorrect results for years before that. However the days_from_civil() algorithm is coded to support over 10 million years into the past (assuming 32-bit integers). I think any of these algorithms could likely be adjusted to support whatever window of years is desired. The year -4800 is frequently chosen as a base out of convention, as it is before the Julian Day epoch, which is before any recorded historical events.
Input Normalization
Unlike the algorithms in this article, the timegm()
function
normalizes its input. For example, if you pass a month
of 1
to
timegm()
, that indicates December of the previous year. The
algorithms given here can return incorrect results if the month is out
of range, or can even crash with outofbounds array access if you are
using the epoch_days_table()
function. So any normalization/checking
of the month value must happen prior to calling the function.
Similarly, timegm() will write the normalized values back to the input struct tm, including the tm_yday (day of year) and tm_wday (day of week) values that it calculated from the day, month, and year. The algorithms in this article do not calculate day of week or day of year.
I did not need these features for my use case, so for me these represent unnecessary overhead. But it is worth noting that this is not a completely apples-to-apples comparison. It could be an interesting exercise to add these features so that the algorithms given here could be a drop-in replacement for timegm().
Leap Seconds
When we use the hms_to_time() function above, these algorithms return correct results around leap seconds. The primary test case for this is to verify:

UnixTime(1998, 12, 31, 23, 59, 60) → 915148800
UnixTime(1999, 1, 1, 0, 0, 0) → 915148800
These values follow the table given in the Wikipedia Article about Unix Time around leap seconds. These algorithms handle leap seconds correctly without any special code. The correct answers for seconds=60 fall out of the algorithm “for free”, as we are simply summing seconds.
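This can be demonstrated concretely by wiring epoch_days_fast() to hms_to_time() (both repeated from above so the sketch stands alone; the unix_time wrapper name is mine):

```c
#include <stdint.h>
#include <time.h>

/* Repeated from above so the example stands alone. */
time_t hms_to_time(int h, int m, int s) {
  return (h * 3600) + (m * 60) + s;
}

int epoch_days_fast(int y, int m, int d) {
  const uint32_t year_base = 4800;
  const uint32_t m_adj = m - 3;
  const uint32_t carry = m_adj > (uint32_t)m ? 1 : 0;
  const uint32_t adjust = carry ? 12 : 0;
  const uint32_t y_adj = y + year_base - carry;
  const uint32_t month_days = ((m_adj + adjust) * 62719 + 769) / 2048;
  const uint32_t leap_days = y_adj / 4 - y_adj / 100 + y_adj / 400;
  return y_adj * 365 + leap_days + month_days + (d - 1) - 2472632;
}

/* The leap second 23:59:60 simply contributes one more second to the sum,
 * so it lands on the same Unix Time as 00:00:00 of the next day. */
time_t unix_time(int y, int mo, int d, int h, int mi, int s) {
  return (time_t)epoch_days_fast(y, mo, d) * 86400 + hms_to_time(h, mi, s);
}
```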
Size and Speed Benchmarks
Here are microbenchmarks of the algorithms above, using the google/benchmark framework. Code for these benchmarks is available here.
I added the hms_to_time() code to each algorithm so they would be more comparable to the timegm() function. I also put the functions in a separate .cc file from the benchmark loop so that they could not be inlined, to make the comparison with libc more fair. I also translated the 1968 algorithm from Fortran into C++.
On my Linux Desktop i7-8700K with glibc 2.30, I get:
Running bazel-bin/date-algorithms
Run on (12 X 4700 MHz CPU s)
CPU Caches:
  L1 Data 32K (x6)
  L1 Instruction 32K (x6)
  L2 Unified 256K (x6)
  L3 Unified 12288K (x1)
--------------------------------------------------------------------
Benchmark                         Time             CPU   Iterations
--------------------------------------------------------------------
BM_YMDToUnix_Fortran              5 ns            5 ns    148069867
BM_YMDToUnix_Table                2 ns            2 ns    288071123
BM_YMDToUnix_DaysFromCivil        4 ns            4 ns    176474060
BM_YMDToUnix_Fast                 3 ns            3 ns    269155478
BM_timegm_libc                   46 ns           46 ns     15360235
All of the algorithms from this article are in the same general ballpark. The algorithm in libc is noticeably slower.
On my Mac Laptop running macOS 10.15.4, the disparity is even more stark:
Running bazel-bin/date-algorithms
Run on (12 X 2600 MHz CPU s)
CPU Caches:
  L1 Data 32K (x6)
  L1 Instruction 32K (x6)
  L2 Unified 262K (x6)
  L3 Unified 9437K (x1)
--------------------------------------------------------------------
Benchmark                         Time             CPU   Iterations
--------------------------------------------------------------------
BM_YMDToUnix_Fortran              6 ns            6 ns    116857534
BM_YMDToUnix_Table                3 ns            3 ns    213395767
BM_YMDToUnix_DaysFromCivil        5 ns            5 ns    147238851
BM_YMDToUnix_Fast                 3 ns            3 ns    211537241
BM_timegm_libc                 5421 ns         5413 ns       122749
I looked at a profile and cross-referenced the Darwin code in localtime.c. It appears that almost all of the time is attributable to the gmtsub() function, of which about 70% is mutex lock/unlock to lazily load time zone data for the "GMT" time zone. The timesub() function appears to take most of the remainder.
It is surprising that a lock would add this much overhead, as Jeff Dean's Latency Numbers Every Programmer Should Know tells us to expect only 25 ns for a mutex lock/unlock. Note that the benchmarks above are single-threaded; under contention timegm() on macOS degrades even more, while the others are unaffected (as they do not access any global state):
Running bazel-bin/date-algorithms
Run on (12 X 2600 MHz CPU s)
CPU Caches:
  L1 Data 32K (x6)
  L1 Instruction 32K (x6)
  L2 Unified 262K (x6)
  L3 Unified 9437K (x1)
------------------------------------------------------------------------------
Benchmark                                   Time             CPU   Iterations
------------------------------------------------------------------------------
BM_YMDToUnix_Fortran/threads:6              1 ns            6 ns    119878410
BM_YMDToUnix_Table/threads:6                1 ns            3 ns    206599374
BM_YMDToUnix_DaysFromCivil/threads:6        1 ns            5 ns    139911858
BM_YMDToUnix_Fast/threads:6                 1 ns            3 ns    208116546
BM_timegm_libc/threads:6                14795 ns        86855 ns         8178
YMDToUnix_Table and YMDToUnix_Fast are basically tied, though as mentioned before the table-based algorithm can slow down drastically (on the order of 100 ns) if the table is not in cache, and is vulnerable to crashes if the month is out of range.
For size profiling I used my tool Bloaty McBloatface. I display VM size only, so we see the parts of the binary that will actually be loaded into memory:
$ bloaty -d symbols,sections --source-filter='^YMD|yday' --domain=vm \
    bazel-bin/date-algorithms
    VM SIZE
--------------
 31.1%     236    YMDToUnix_Fortran()
   86.4%     204    .text
   10.2%      24    .eh_frame
    3.4%       8    .eh_frame_hdr
 27.7%     210    YMDToUnix_DaysFromCivil()
   81.0%     170    .text
   15.2%      32    .eh_frame
    3.8%       8    .eh_frame_hdr
 20.0%     152    YMDToUnix_Fast()
   78.9%     120    .text
   15.8%      24    .eh_frame
    5.3%       8    .eh_frame_hdr
 18.1%     137    YMDToUnix_Table()
   76.6%     105    .text
   17.5%      24    .eh_frame
    5.8%       8    .eh_frame_hdr
  3.2%      24    epoch_days_table()::month_yday
  100.0%      24    .rodata
100.0%     759    TOTAL
Filtering enabled (source_filter); omitted 128Ki of entries
All these functions have nearly the same .eh_frame overhead (this is unwind information, used for debugging and stack traces), so we can set that aside. When we account for the table size, YMDToUnix_Fast ends up smallest, narrowly beating YMDToUnix_Table. But these differences are very minor. All of these algorithms are very small and fast.

1. UTC days tick 86,401 seconds whenever a leap second occurs. Maintaining the fib that every day is 86,400 seconds long requires Unix Time to be discontinuous around a leap second. To avoid this complication, some systems have started "smearing" seconds around the leap second instead, so that every day is still exactly 86,400 seconds long, and it's merely the length of each second that changes. This deviates from UTC, but solves lots of practical problems. ↩

2. Round-toward-zero was standardized in C99. Before that, the results of integer division for negative numbers were implementation-defined. ↩