I do understand what lossless means. The point of my anecdote is a cautionary tale: when you go off and design a new network protocol, especially one as bare-bones as TTPoE, you need to consider what happens when someone has to deal with things going wrong. Diagnostics and maintenance matter in the real world for people running large systems with thousands or millions of moving parts. IPv4 and IPv6 bring along lots of tools that help in these scenarios, and IPv4/v6 headers don't actually have all that much overhead to parse and generate in hardware; they are also protocols that have been around long enough to have many widely available hardware and software implementations, in open source or for purchase from vendors. I'm certain there will be times when sysadmins curse the fact that the folks who implemented TTPoE didn't have a ping-like tool available from the start.
FC remains a lossless protocol; bugs in multipathd just mean we live in an imperfect world. Your initial sentence, "FC is not entirely lossless," conflates a specific networking term of art with a pedantic application of the denotative definition of the word. If your point was that immature network technologies do not have as many diagnostic tools as mature ones do, you should have made that point instead of misappropriating jargon.
Anyway, to your specific point, using IP at all is basically overkill in a cluster architecture. Very few IP stacks function properly without getting things like ARP involved; the more of that stack you can get rid of, the better your performance and the less there is to maintain. TTPoE reminds me most of ATA over Ethernet, a previous effort to shed the complexity of a protocol designed for global networking. It worked great until you hit scaling issues, which competing tech leveraged the aforementioned complexity to address.
At a high level, FC lost the write by failing to respond to it in a timely fashion while other I/Os went through. The write request never gets through from the host to the target, and the host ends up timing out the write and throwing an I/O error => that seems like a lost write to me at a higher level. Lossless only applies at the lowest layer of the stack; any holistic view of the system would call this scenario lossy.
I have implemented ARP and UDP on FPGAs for some toy projects, and it's really not that difficult. One of the use cases I played around with was getting debug data out of an FPGA at multigigabit rates -- things like PCIe TLPs and raw SERDES data from an EPON implementation to debug a burst-mode CDR. The fact that the protocol was IPv4/UDP was no impediment to pushing data through at line rate. Once you've implemented parallel CRC32 for Ethernet packets from scratch on a 256-512 bit wide data bus where packets can start and end on arbitrary 32-bit boundaries, the complexity of IPv4 and UDP checksums is dead simple in comparison.
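For a sense of how simple those checksums are: both the IPv4 header checksum and the UDP checksum are the same RFC 1071 ones'-complement sum over 16-bit words. A minimal Python sketch (the function name is mine; real hardware does this over a wide bus in parallel rather than one word at a time):

```python
def internet_checksum(data: bytes) -> int:
    """RFC 1071 ones'-complement checksum over 16-bit big-endian words."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    # fold any carries back into the low 16 bits (ones'-complement addition)
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

# Computing this over a header whose checksum field is zeroed yields the
# checksum to insert; computing it over a valid header (checksum included)
# yields 0, which is how receivers verify it.
```

The carry-fold loop is the whole trick; a hardware version just accumulates into a wider register and folds once at the end, which is why it's trivial next to a parallel CRC32.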
I understand and agree with throwing out TCP in TTPoE. I do not agree with throwing out IPv4/IPv6. Heck, for v6 you don't even need ARP-style address resolution: you could get away with link-local addresses derived from the Ethernet MAC address you already need to have anyway.
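That derivation is purely mechanical (modified EUI-64, per RFC 4291): flip the universal/local bit in the MAC's first octet, splice ff:fe into the middle, and prepend fe80::/64. A sketch using only the stdlib (the function name is mine):

```python
import ipaddress

def mac_to_link_local(mac: str) -> ipaddress.IPv6Address:
    """Derive an fe80::/64 link-local address from a 48-bit MAC (modified EUI-64)."""
    octets = bytes.fromhex(mac.replace(":", "").replace("-", ""))
    if len(octets) != 6:
        raise ValueError("expected a 48-bit MAC address")
    # Flip the universal/local bit, then splice ff:fe into the middle
    # to expand the 48-bit MAC into a 64-bit interface identifier.
    eui64 = bytes([octets[0] ^ 0x02]) + octets[1:3] + b"\xff\xfe" + octets[3:]
    # fe80::/64 prefix + interface identifier = 128-bit link-local address
    return ipaddress.IPv6Address(b"\xfe\x80" + b"\x00" * 6 + eui64)
```

Since the mapping is reversible, a node that sees a peer's link-local address can recover the destination MAC by inverting the same steps, which is exactly why no resolution protocol is needed on a flat cluster fabric.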