CDMA needs precise sync for soft-handoff, but that's tangential to the rest. SONET explains it better, but even SONET is decades newer than the original requirements.
The T1 was introduced in 1962, having been developed in the 50s. T1 is self-clocking, in that the receiver recovers sync from the line (and there are mechanisms to guarantee enough ones-density to keep the clock recovery circuit working), so a point-to-point T1 circuit with analog on either end has no need for external timing -- one end can just free-run and the whole thing is fine.
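The ones-density requirement is easy to illustrate. A minimal sketch, assuming the classic simplified rule that the line must never carry more than 15 consecutive zeros (real T1s also enforce a minimum average ones density, and B8ZS line coding sidesteps the problem entirely by substituting a recognizable pattern for long zero runs):

```python
def max_zero_run(bits):
    """Length of the longest run of consecutive zeros in a bit sequence."""
    longest = run = 0
    for b in bits:
        run = run + 1 if b == 0 else 0
        longest = max(longest, run)
    return longest

def meets_ones_density(bits, max_zeros=15):
    """Simplified ones-density check: the classic T1 rule forbids more
    than 15 consecutive zeros, so the receiver's clock-recovery circuit
    always sees a line transition often enough to stay locked."""
    return max_zero_run(bits) <= max_zeros
```

Without enough ones, there are no pulses on the line, and the receiver's recovered clock drifts until framing is lost.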
But when you start connecting T1s (DS1s) together, moving DS0 signals between them ("time-slot interchange"), or bundling them into higher-rate signals (DS3/T3 and then up into SONET), it's utter chaos unless they're all synchronized.
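Time-slot interchange itself is conceptually just a permutation of slots, applied one frame at a time. A hypothetical sketch (function and parameter names are mine; a T1 frame carries 24 DS0 slots):

```python
def time_slot_interchange(frame, slot_map):
    """Move DS0 time slots between positions in a frame.

    frame: list of 24 one-byte DS0 samples (one T1 frame payload).
    slot_map: output slot i is filled from input slot slot_map[i].

    A real TSI writes the incoming frame into RAM and reads it back
    in connection order -- which only works cleanly if both sides
    are running from the same clock.
    """
    return [frame[src] for src in slot_map]
```

Cross-connecting two circuits is then just a `slot_map` that swaps their slot numbers.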
There's considerable detail in the BSTJ archives; I could dig up some links later.
Not in telecommunications, but I do know that there are synchronous comm networks that don’t require any kind of sentinel / delimiter, because the shared timing alone guarantees where each bit begins and ends.
My signals background mostly comes out of music, so somebody else probably has a better answer. But any time you need to convert between digital and analog signals relatively seamlessly, i.e., introducing minimal processing overhead, you’re going to run into these kinds of issues. In that regard, you can think of telecoms as a CPU, and just like in a computer, you need a system clock that the entire architecture agrees on.
First likely application that comes to my mind is VoIP.
It is one of those problems that is really easy to ignore, until you can't. I was using a cheapo 8051-based logic probe the other day and had a lot of fun dealing with not only the jitter of two different oscillators (probe + device under test), but also differences in response time between probes (within the same sample period). If you ever find yourself wondering how much influence your MCU's four-clock-cycles-per-instruction architecture has on the waveforms you're looking at... you're about to have a very bad time.
It's not about accuracy (nobody really cares if the whole network is running fast or slow compared to some other clock) so much as synchrony (the whole network needs to be running from the same clock).
Asynchronous networks can't guarantee latency or jitter performance, and they require buffers. Synchronous networks (remember, this is voice-grade stuff) can deliver bits out at the same rate as bits in, anywhere in the network, without dropped frames or stuffing or anything obnoxious like that.
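The buffering cost of running unsynchronized can be put in numbers. A minimal sketch, assuming the standard 125 µs T1 frame time (the 4.6 ppm figure used below is the commonly quoted Stratum 3 free-run accuracy), of how often an elastic buffer between two independently clocked ends must drop or repeat a frame:

```python
def seconds_between_slips(fractional_offset, frame_time=125e-6):
    """How often an elastic buffer must slip (drop or repeat) one frame
    when the two ends of a link differ by the given fractional frequency
    offset. The buffer gains or loses one frame's worth of bits every
    frame_time / offset seconds."""
    return frame_time / abs(fractional_offset)

# Two free-running Stratum 3 clocks at worst-case opposite extremes
# (hypothetical scenario): a slip roughly every 13-14 seconds.
stratum3_worst = seconds_between_slips(2 * 4.6e-6)
```

Voice tolerates an occasional slip; modem and fax traffic very much does not, which is another reason the network converged on a single timing hierarchy.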
Analogy time: If all you're familiar with is packet-switched networks, sure, let's call them trucks. A truck gets filled with payload, it leaves, and sometime later, it arrives at its destination. Packets get reordered, and it's someone else's problem to deal with that. A great many shipping-and-receiving docks contain a great deal of complexity, to check bills-of-lading against what was ordered, with warehouses to smooth over the difference. It works today because computers are powerful, and buffer bloat is still a plague.
Implementing VoIP with 1960s technology was... not practical.
So, synchronous networks don't require any of that complexity. Nothing's ever out of order, there's never a bit arriving too soon or too late. You don't even handle them as packets, they're literally just streams of bits, coming out of an ADC here, beating through the network in synch, and being shoved into a DAC over there. Rather than a truck, imagine a bunch of slow conveyor belts feeding a faster-moving belt with a spinning turnstile or little paddle controlling what goes onto it -- for this microsecond, input 1 can emit a bit onto the belt, then a microsecond later, it's input 2 and so on. So the belt contains an ordered stream of bits, and at the other end there's another paddle spinning in sync with the first, demultiplexing the bits back out to their sub-rate circuits.
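The paddle-and-belt scheme above is ordinary time-division multiplexing, and it's small enough to sketch (hypothetical helper names; a real T1 also inserts a framing bit per frame, omitted here):

```python
def tdm_mux(inputs):
    """Interleave bits from slow inputs onto one fast stream: for each
    'frame', the turnstile lets input 0 emit a bit, then input 1, etc."""
    frames = zip(*inputs)  # one bit from each input per frame
    return [bit for frame in frames for bit in frame]

def tdm_demux(stream, n_inputs):
    """The far-end paddle, spinning in sync with the first: bit k belongs
    to sub-circuit k mod n_inputs. No headers, no addresses -- position
    in time *is* the address, which is why everything must share a clock."""
    return [stream[i::n_inputs] for i in range(n_inputs)]
```

Note there is nothing in the stream itself identifying which bit belongs to which circuit; lose clock sync and every conversation turns to noise at once.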
This can be implemented with discrete transistors. There's no need for buffers or packet reordering or any such nonsense. There are no hiccups, no dropped or repeated frames, no reassembly or reordering. End-to-end latency is the speed-of-light propagation delay plus half a frame time, on average.
And if you can make the whole continent beat in sync, you can enjoy this level of service and predictability even on long-distance calls.
Initially (in the 1960s, as time-division multiplexing was deployed) this was done by having one master oscillator in Kansas City (the approximate geographic center of the continental US), and fanning out its timing through an enormous branching-tree structure of amplifiers called Synchronization Distribution Expanders. This performed beautifully, but it was a logistical nightmare: any network disruption could cause problems, you needed local oscillators for holdover anyway, and so on.
As GPS was made available for civilian applications, it made sense to convert to using Schriever AFB as the master clock, use the space segment as the distribution network, and simply give each office an antenna to tap into it. The implementation is a bit tricky because of Doppler shift and such, but GPS receivers abstract all that away and can hand you a timepulse synchronized to GPS system time. (Then the BITS rack converts that to the useful frequencies and distributes it within the office.)
Nothing off the top of my head that's succinct and readable.
You'd do well to paw through the Bell System Technical Journal -- there's some amazing stuff in there like the origins of Unix and C [1], or Shannon's information theorem [2] -- but some details of the network itself are either hard to search for, or might not be there at all.
Most of the old phreak philes are focused on the billing and routing aspects of the network, not so much on transport. I remember a few test equipment manuals had good introductions to the structure of a T1 frame, for instance, but perhaps not an exhaustive treatment of how that came to be.
This is an interesting question. It must be out there, mustn't it?