This has been debated over and over again over the past few decades. The problem is that Intel started with the wrong direction but ended up being the largest and the most advanced CPU manufacturer. I don’t have the official documents explaining why they decided to do it the opposite way, but I let bygones be bygones. Another contrarian decision they made is the CISC architecture, but it’s rather hard to tell whether it’s really bad or not, even by today’s knowledge. The very good decision they made is, however, to go binary on floating points. Who cares about decimal numbers anyway? At any rate, we may talk about the other two in the later posts.

For those good citizens of the Internet world, big endianness is the international standard. There’s no need to discuss further. For those always daring to challenge conventions and traditions, it should be more or less obvious that the international standard is the mathematically correct solution, period. If we examine carefully the available CPU instructions on data today, there are basically 4 categories only: (1) arithmetic, (2) comparison, (3) bitwise logic, and (4) movement. The rest have more to do with code than data. Code execution requires random access, as is the case with all graph-based automata or machines. No matter what you do, the ability to make multi-state decisions is very important, which requires jumps or branches on most modern computers.

- Arithmetic prefers little endianness. However, if we are processing two numbers at a time, either direction is fine. In fact, if arbitrary-precision mathematics is involved, big endianness is the only way to go as irrational numbers have only one end, that is, the starting end. Also, if we are to print the outputs, the most significant bits should always go first.
- Comparison prefers big endianness, as it stops at the first differing bit. In case length descriptors or any forms of headers are used, they are always compared before the actual data. This makes it trivial to perform universal or heterogeneous comparisons where data are associated with different types. When data can be dynamically associated with multiple types at a time, types can be ordered to guarantee consistent ordering as long as they belong to the data headers. Generally speaking, we want a cheaper way to encode the data, but the naive encoding suffices to satisfy the requirements needed to perform universal comparisons.
- Bitwise logic goes well in either direction. There’s not much to talk about.
- Movement also goes well in either direction. However, certain types of data don’t end, which must be big endian. For the sake of uniformity in data processing, everything is big endian, period. Exceptions are annoying, as always. Note that by movement, we mean copying memory cells to random locations, as is necessary to make it Turing-complete using one-dimensional memory addressing scheme alone.

The Internet machine is a graph at the physical layer, which defines the structure of our assembly language. When I say that the Internet machine is the superset of all machines online, I am counting the edges that connect the nodes as special-purposed computing devices, which naturally run in sequential mode. Mathematically, the probability of data transfer error is exponential to physical distance, assuming uniform signal interference, which is another strong reason why big endianness is critical in data processing, storage and transfer. Now, the big question is, how do we maximize the expressive power of data transfer with little extra hardware cost, other than just adding more memory?