Everyday relativity. Time contraction in your daily commute • Adapting's Blog

If you are not familiar with time relativity, there’s this typical story of two identical twins, one of whom is an astronaut who travels to the most distant places for 20 years. When she comes back she is only 5 years older, suddenly 15 years younger than her sibling!

This may sound like science fiction, and it kind of is, but it’s also quite real, and it happens around us. When cosmic radiation clashes with atoms at the outer part of the atmosphere, some kinds of unstable particles called muons are produced. They are called unstable because after a very short lifetime (not even a second) they disintegrate. Given their speed, this means very few should be able to reach the Earth’s surface, but actually our readings detect many more. Why?

The muons are travelling close to the speed of light and, like the astronaut, they experience time relativity. In the same way the astronaut travelled for 20 years but it only felt like 5 to her, the muon is able to travel for an amount of time that’s 30 times its lifetime, and that’s enough to come down here.

Now, if you can’t be an astronaut, your second best chance at experiencing time relativity is in your daily commute. It’s actually a different flavour; it won’t provide you the means for eternal youth, but it will allow you to create time seemingly out of the blue, which is not that bad.

It all starts with some basic tools from set theory.

1. Injection, surjection, bijection

Let’s say you want to compare the size of two sets in terms of how many elements they contain, also called their cardinality. How would you go about it?

Of course you can just count the elements in each set and compare the figures. But what if the sets are too big to count, possibly infinite?

One way to go about it is to explore mappings that connect elements in one set with elements in the other set. Let’s call the first set $X$ and the second set $Y$ to see how this works. We denote the cardinality of $X$ as $|X|$ and similarly for any other set.

1.1 Injection

An injective map $f$ from $X$ to $Y$ (written as: $f : X \rightarrow Y$ ) is one that maps each element in $X$ to a distinct element in $Y$ , meaning no two elements in $X$ map to the same element in $Y$ ¹. If we can find one of those maps, this means that $Y$ contains at least as many elements as $X$ : $|X| \leq |Y|$ .

				Injection example (Wikipedia)

1.2 Surjection

A surjective map $f$ from $X$ to $Y$ (again: $f : X \rightarrow Y$ ) is one that can reach every element in $Y$ ². Note that for a map to be well defined each element in $X$ can only be mapped to a single element in $Y$ , so if the map is surjective it means that there are at least as many elements in $X$ as there are in $Y$ : $|X| \geq |Y|$ .

				Surjection example (Wikipedia)

1.3 Bijection

A bijective map is one that is both injective and surjective. Notice that in terms of cardinality this implies: $|X| \leq |Y|$ and $|X| \geq |Y|$ . The only way for both conditions to hold is $|X| = |Y|$ , that is: both sets must have the same size.

				Bijection example (Wikipedia)
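For finite sets, the three definitions can be checked mechanically. Below is a minimal sketch, assuming a map is stored as a Python dictionary from elements of $X$ to elements of $Y$; the function names and the example sets are made up for illustration.

```python
def is_injective(mapping):
    """True if no two elements of X share the same image."""
    images = list(mapping.values())
    return len(images) == len(set(images))


def is_surjective(mapping, codomain):
    """True if every element of the codomain Y is reached."""
    return set(codomain) <= set(mapping.values())


def is_bijective(mapping, codomain):
    """True if the map is both injective and surjective."""
    return is_injective(mapping) and is_surjective(mapping, codomain)


# Hypothetical examples: the keys play the role of X, the second argument is Y.
injective_only = {1: "D", 2: "B", 3: "A"}           # |X| = 3 <= |Y| = 4
surjective_only = {1: "D", 2: "B", 3: "C", 4: "C"}  # |X| = 4 >= |Y| = 3
both = {1: "D", 2: "B", 3: "C", 4: "A"}             # |X| = |Y| = 4

print(is_injective(injective_only), is_surjective(injective_only, "ABCD"))
print(is_injective(surjective_only), is_surjective(surjective_only, "BCD"))
print(is_bijective(both, "ABCD"))
```

The first map never reaches "C", so it is injective but not surjective; the second sends two elements to "C", so it is surjective but not injective.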

Working with bijections allows us to prove some mind-blowing results. The natural numbers ( $\{0, 1, 2, 3, \ldots\}$ ) are infinite, and you may think that, since for every two natural numbers there’s a single even number ( $\{0, 2, 4, 6, \ldots\}$ ), there must be twice as many natural numbers as there are even numbers.

Now note that every even number is of the form $2 \cdot n$ , where $n$ is a natural number, and precisely $f(n) = 2 \cdot n$ is a bijection between the two sets, so they must have the same cardinality! In the words of Feynman:

there are twice as many numbers as numbers.

Yes that’s counterintuitive, but when you deal with infinite stuff odd things happen.
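A computer cannot check an infinite set, but it can illustrate the pairing on a finite prefix. The sketch below takes the first 1000 natural numbers (an arbitrary cut-off) and confirms that $f(n) = 2 \cdot n$ pairs each of them with a distinct even number, and that halving undoes the pairing. The helper names are made up for illustration.

```python
def to_even(n):
    """The map f(n) = 2 * n from natural numbers to even numbers."""
    return 2 * n


def from_even(m):
    """The inverse map: recover n from the even number m = 2 * n."""
    return m // 2


N = 1000  # arbitrary cut-off; the argument itself holds for every n
naturals = range(N)
evens = [to_even(n) for n in naturals]

# No two naturals share an even number, and the round trip is the identity.
all_distinct = len(set(evens)) == N
round_trip = all(from_even(to_even(n)) == n for n in naturals)
print(all_distinct, round_trip)
```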

Anyway, what’s relevant to us is the fact that when you have the same finite cardinality, injection implies surjection and vice versa. Think about it, if you are mapping each element distinctly, and both sets have the same size, you will cover the target set completely. Similarly, if you cover the target set completely, you must have mapped each element distinctly.
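If you prefer to see this rather than take it on faith, a brute-force check on a small case is easy. The sketch below enumerates every possible map between two 3-element sets (the sets themselves are arbitrary) and confirms that being injective and being surjective always go together.

```python
from itertools import product

X = [0, 1, 2]
Y = ["a", "b", "c"]

# Every map X -> Y is a choice of one image in Y for each element of X.
all_maps = [dict(zip(X, images)) for images in product(Y, repeat=len(X))]


def is_injective(mapping):
    return len(set(mapping.values())) == len(mapping)


def is_surjective(mapping):
    return set(mapping.values()) == set(Y)


# With |X| = |Y| finite, the two properties coincide for every single map.
always_agree = all(is_injective(m) == is_surjective(m) for m in all_maps)
n_bijections = sum(is_injective(m) for m in all_maps)

print(len(all_maps), n_bijections, always_agree)  # 27 6 True
```

Out of the 27 possible maps, 6 are bijections (the permutations of three elements), and the remaining 21 fail both properties at once.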

2. Breaking the bijection creates time dilation and contraction

For our analysis of arrival times under traffic we are interested in the function that maps departure times to arrival times. Let’s say we are measuring up to minutes, so there’s a finite number of possible times throughout the day. That number is the same for departures and arrivals, so both sets have the same cardinality.

What would our mapping look like? If the commute takes 30 min, at first it maps 7:30 to 8:00, 7:31 to 8:01 and so on. However, as traffic builds up, commutes take longer: 31 min, 32 min, 33 min. So perhaps leaving at 7:32 means you arrive at 8:03.

What does this mean? The mapping is not surjective: for instance, there’s no way to arrive at 8:02! Note that this has no implications in terms of cardinality. In our previous discussion we said two sets have equal size if you can find a bijection, not if every map is a bijection. Now note that since the map is not surjective it cannot be injective either (as injection implies surjection in the same-cardinality case).

What does it mean for the map not to be injective? That people departing at two different times arrive simultaneously! This is because the one leaving late experiences a “time contraction” that allows them to catch up with the other one.
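To make this concrete, here is a toy version of the mapping on a ten-minute window. The delay table is entirely hypothetical; it was picked so that it reproduces the example above (7:32 arrives at 8:03) and then lets traffic ease off again.

```python
from collections import defaultdict

BASE_COMMUTE = 30              # minutes, as in the example above
START = 7 * 60 + 30            # 7:30, expressed in minutes since midnight
# Hypothetical extra delay (minutes) for departures at 7:30, 7:31, ..., 7:39.
DELAYS = [0, 0, 1, 2, 3, 3, 2, 1, 0, 0]


def to_clock(minutes):
    return f"{minutes // 60}:{minutes % 60:02d}"


# The departure -> arrival map.
arrival_of = {}
for offset, delay in enumerate(DELAYS):
    departure = START + offset
    arrival_of[departure] = departure + BASE_COMMUTE + delay

# Arrival minutes that nobody reaches: the map is not surjective.
window = range(START + BASE_COMMUTE, START + BASE_COMMUTE + len(DELAYS))
missed = [to_clock(t) for t in window if t not in arrival_of.values()]

# Arrival minutes shared by several departures: the map is not injective.
departures_by_arrival = defaultdict(list)
for departure, arrival in arrival_of.items():
    departures_by_arrival[arrival].append(to_clock(departure))
shared = {to_clock(a): d for a, d in departures_by_arrival.items() if len(d) > 1}

print(missed)  # ['8:02', '8:04', '8:06']
print(shared)  # {'8:08': ['7:35', '7:36', '7:37', '7:38']}
```

Three arrival minutes are skipped while traffic builds up, and they are paid back later: everyone leaving between 7:35 and 7:38 arrives at 8:08.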

In other words, past the traffic peak, you can afford to delay your departure without getting punished for it. Conversely, before the peak small delays can have big consequences, which is something people are more aware of.

To sum up, given same cardinality, losing surjection implies losing injection, which in turn implies that you can leave later and arrive at the same time.

3. How relevant is this?

The bijection argument serves to prove that, under traffic, at some point two people departing at different times must arrive almost simultaneously. But it doesn’t tell us how far apart these times are. In other words, it doesn’t quantify time contraction or dilation.

In order to do that, we need to be more specific about the parameters of our problem:

What’s the trip duration? Let’s say 30 min.
How much can traffic add to it? Let’s say it can add 50%, so 15 more min.
For how long does traffic last? Let’s say for an hour.

Below you can see how the traffic-induced delay varies according to departure time.

Note the parameters above do not fully specify the plotted curve, and other options are perfectly valid. The reasons for opting for this particular configuration are:

1. The curve is symmetric, which makes everything twice as simple.
2. The curve is smooth, there are no abrupt changes, which feels kind of natural.
3. The curve’s slope is always above -1.

If you want to read more on the specifics of this choice and why point 3 matters, you can find a little more detail in the Appendix.

Now let’s look at how this influences the connection between departure time, $t_{\text{departure}}$ , and arrival time, $t_{\text{arrival}}$ . If there were no traffic, this connection would follow a perfectly straight line; you leave 5 min late, you arrive 5 min late. All points would lie on

$t_{\text{arrival}} = t_{\text{departure}} + 30 \, \text{min}.$

In the figure below, you can see that prior to the traffic peak, the slope of the curve increases above the “no traffic” condition, meaning a small delay in departing leads to a larger delay in arrival. At some point the curve has to return to the “no traffic” line (unless there’s traffic throughout the entire day, which we assumed is not the case). This requires a flattening of the slope, meaning a large delay in departing leads to a small delay in arrival, the ideal scenario!
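To put rough numbers on this, here is a minimal sketch that uses the bell-shaped delay from Appendix A with the parameters stated earlier (30 min base commute, 15 min peak delay). The peak time of 8:15 and the width $\sigma = 20$ min are assumptions made for this example, not values taken from the plots.

```python
import math

BASE_COMMUTE = 30.0    # minutes, from the text
TAU_PEAK = 15.0        # minutes of extra delay at the peak, from the text
T_PEAK = 8 * 60 + 15   # 8:15 in minutes since midnight (assumed)
SIGMA = 20.0           # width of the bell in minutes (assumed)


def traffic_delay(t):
    """Bell-shaped traffic delay (minutes) for a departure at minute t."""
    return TAU_PEAK * math.exp(-((t - T_PEAK) / SIGMA) ** 2)


def arrival(t_departure):
    """Arrival time = departure + base commute + traffic delay."""
    return t_departure + BASE_COMMUTE + traffic_delay(t_departure)


def arrival_cost(t_early, t_late):
    """How much later you arrive if you leave at t_late instead of t_early."""
    return arrival(t_late) - arrival(t_early)


# Leaving 10 minutes later, once before and once after the peak.
before_peak = arrival_cost(7 * 60 + 55, 8 * 60 + 5)   # 7:55 vs 8:05
after_peak = arrival_cost(8 * 60 + 25, 8 * 60 + 35)   # 8:25 vs 8:35

print(f"before the peak: {before_peak:.1f} min later at arrival")
print(f"after the peak:  {after_peak:.1f} min later at arrival")
```

Under these assumptions the same 10 min delay costs about 16 min at arrival before the peak and only about 4 min after it.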

Interestingly, if everyone knew about this and started leaving later, the peak traffic time would just shift, and there would be no benefit. It’s a little bit of a game theory scenario. This is actually quite unlikely to happen; in fact, the best choice is not to commute during time contraction, but to commute when there’s no traffic at all (in this case, before 7:30 or after 9:00). The reason people don’t do this is that they have some schedule constraint that forces them into a suboptimal decision: suboptimal in an ideal world, optimal in the life they actually live.

Still, the nice thing about time contraction is that you can reduce your commuting time with minimal impact on your arrival time, which is generally the constraining element, so whenever you have even small flexibility in that regard, it may be worth considering.

In case you want to dig deeper

Appendix A - Why the choice of a bell curve for modelling traffic?

Think of what we want. We want the traffic delay, $\tau$ , to be 100% at some peak time, $t_{\text{peak}}$ , and decay towards 0% as time, $t$ , moves away from that value. So we need a function that’s 1 when the distance, $d(t,t_{\text{peak}})$ , is 0, and decays as distance increases. Can you think of any candidates?

What about this one:

$\tau(t) = \tau_{\text{peak}} \cdot e^{-d(t,t_{\text{peak}})}.$

You can check that,

when $d=0$ : $\tau=\tau_{\text{peak}}$ ,
when $d \rightarrow \infty$ : $\tau \rightarrow 0$ . After all, any number greater than 1 raised to a negative exponent takes values between 0 and 1. $e$ has some nice properties, but it doesn’t have to be $e$ .

How would you calculate the distance? A distance has to be non-negative (what would a negative distance mean?) so you can’t just use the difference between peak time, $t_{\text{peak}}$ , and time, $t$ . There’s an easy fix, add absolute values, so $|t_{\text{peak}} - t|$ could work³; the thing is that this distance has a “V” shape, meaning it’s not smooth at the middle. A nicer alternative is $d(t, t_{\text{peak}})=(t_{\text{peak}} - t)^2$ , the square ensures the output is non-negative and yields a “U” shape that looks better.

So you have

$\tau(t) = \tau_{\text{peak}} \cdot e^{-(t_{\text{peak}} - t)^2},$

which is the bell curve basically. The only thing is that with the configuration above you cannot control the steepness of the curve. To do that you just have to add a parameter that modifies the steepness of the distance function by multiplying (makes it steeper) or dividing (makes it less steep) by a factor. If you decide to call this factor $\sigma^2$ and you’ve taken statistics at some point, this may ring a bell.

$\tau(t) = \tau_{\text{peak}} \cdot e^{-\left( \frac{t - t_{\text{peak}}}{\sigma} \right)^2}.$
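As a quick sanity check, the final formula translates almost literally into code. In the sketch below the 15 min peak delay comes from the main text, while the peak time (8:15) and $\sigma$ (20 min) are assumed values chosen only to have something to evaluate.

```python
import math


def traffic_delay(t, tau_peak=15.0, t_peak=8 * 60 + 15, sigma=20.0):
    """Traffic delay in minutes for a departure at minute t of the day.

    tau_peak comes from the main text; t_peak and sigma are assumptions.
    """
    return tau_peak * math.exp(-((t - t_peak) / sigma) ** 2)


def minutes(clock):
    """Convert a string like '8:15' to minutes since midnight."""
    hours, mins = map(int, clock.split(":"))
    return 60 * hours + mins


for clock in ("7:30", "7:55", "8:15", "8:35", "9:00"):
    print(clock, f"{traffic_delay(minutes(clock)):.1f} min")
```

The delay equals $\tau_{\text{peak}}$ exactly at the peak, is symmetric around it, and with these assumed values is already negligible by 7:30 and 9:00. A larger $\sigma$ widens the bell; a smaller one makes it steeper.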

Appendix B - Respecting the solidity of cars

Actually there’s one more detail (there always is!). Before, we mentioned that the slope of the departure-arrival curve could not be negative. A flat slope means you depart later and arrive at the same time, but a negative slope would mean you depart later and arrive earlier? This is not impossible, but it’s inconsistent with the following idealised scenario: imagine your neighbour, who takes exactly the same path as you do (same lanes at every point of the road), leaves some time later. There’s no way they can arrive earlier than you without traversing your car!

The easy way to go about this is to plot the graph and check visually the condition is met. In this case, the problem was simple enough for this to be good enough. In other cases you may need to follow a more rigorous approach and actually do the math. A very brief sketch of the process would be:

The derivative is a linear operator, and the slope of the no-traffic line is 1. This means that, for the departure-arrival curve to have a non-negative slope, the slope of the traffic curve has to be no less than -1. Let’s break this down a little bit.

The curve for connecting departures-arrival has the following form, where the commuting time, $\Delta t_{\text{commute}}$ , (30 min in our example) is a constant:
$t_{\text{arrival}} = t_{\text{departure}} + \Delta t_{\text{commute}} + \tau(t_{\text{departure}}).$
Denoting the derivative operator with respect to departure time (which gives us the slope) with $'$ , we have:
$t_{\text{arrival}}' = 1 + \tau'(t_\text{departure}), \quad t_{\text{arrival}}' > 0 \iff \tau'(t_\text{departure}) > -1.$
We know the slope of traffic, $\tau'$ , attains a minimum (it does not go off to negative infinity). Find that minimum by solving $\tau'' = 0$ ; it’s the second derivative because, to find the minimum of a derivative, we take a derivative on top of the existing one.
Impose the restriction: the slope must be greater than -1. This translates into a restriction on the valid values that the parameters that determine the bell curve can take. In this case that’s $\tau_{\text{peak}}$ and $\sigma$ , but you could add your own.
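
As a worked instance of the steps above, here is a sketch for the bell curve of Appendix A. The shorthand $u = t - t_{\text{peak}}$ is introduced here only to keep the expressions short. Differentiating once and twice gives:

$\tau'(t) = -\frac{2u}{\sigma^2} \, \tau_{\text{peak}} \, e^{-(u/\sigma)^2},$

$\tau''(t) = \left( \frac{4u^2}{\sigma^4} - \frac{2}{\sigma^2} \right) \tau_{\text{peak}} \, e^{-(u/\sigma)^2} = 0 \iff u = \pm \frac{\sigma}{\sqrt{2}}.$

The slope is most negative on the way down, at $u = +\sigma/\sqrt{2}$ , where

$\tau'_{\min} = -\sqrt{\frac{2}{e}} \, \frac{\tau_{\text{peak}}}{\sigma}, \quad \tau'_{\min} > -1 \iff \sigma > \sqrt{\frac{2}{e}} \, \tau_{\text{peak}} \approx 0.86 \, \tau_{\text{peak}}.$

With $\tau_{\text{peak}} = 15$ min, this would mean $\sigma$ has to stay above roughly 12.9 min; any narrower bell would let a later departure overtake an earlier one.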

Appendix C - How is smoothness meaningful?

For traffic we could have used a triangular shape (“/\”) instead of the bell curve. The main difference is that the triangle is not “smooth”, as the slope changes abruptly at the upper edge. This means the derivative (the slope) is not defined at that point (what’s the slope of an edge? it has two, depending on where you look from). Having undefined stuff is kind of annoying to work with, so it’s a sensible default to opt for smooth curves whenever possible. See below that choosing a non-smooth distance function (such as the absolute value) leads to a non-smooth curve.

The most important thing here is your understanding of the physical problem you are dealing with. If the slope is a physical variable, it’s unlikely to change abruptly: everything in the real world⁴ changes continuously. Still, some things change so fast that, unless you are looking at a scale of nanoseconds, they seem to change discretely⁵.

In the case of traffic, there may be cases, such as accidents, where the change in traffic is seemingly discontinuous. However, since here we are modelling average daily traffic, which does not take into account such events, going for the sensible default of smoothness seems more reasonable. This is in any case quite a shallow take, and I’d be curious to hear from anyone with a deep understanding of these topics.

Footnotes

¹ Mathematically this is often expressed as: given $x_1, x_2 \in X$ ,

$f(x_1) = f(x_2) \Rightarrow x_1 = x_2.$
That is, if two elements of $X$ map to the same element in $Y$ , then they must be the same element in $X$ .
² Mathematically: $\forall y \in Y, \exists x \in X \text{ such that } f(x) = y$ .
³ You may have noticed that the way absolute value $|\cdot|$ is represented looks pretty much like cardinality. We have limited symbols after all, and generally the overlap is okay because the meaning can be deduced from context, in the same way that we use orange for a fruit and a colour and we do just fine.
⁴ In the macroscopic world I should say. One of the foundations of quantum mechanics is that some variables are discrete: they cannot take just any value, so they jump from quantum to quantum.
⁵ Interestingly, it’s actually the other way around. As stated in the previous note, when you look at changes of fundamental particles they are in fact discrete. However, these changes take place at such small scales that everything looks continuous, in the same way that a movie looks smooth to us despite actually being a concatenation of still photographs.