07/28/2022

*************************************************

I thought it might be fun from time to time to explain a few obscure computer terms. Obscure, but easy to understand (and yet probably totally useless to most people :-) Just add them to your list of "Fun Facts About Microcontrollers". (I suspect that list is pretty short for most people :-)

*************************

Today, I thought I'd talk about "Big Endian" versus "Little Endian". If you heard someone talking about this, and I don't know where you would be to do that, you would think they said "Indian", not "Endian", but it has nothing to do with Indians - not at all. It describes how certain data is stored internally, at a very, very low level, down in the guts of the computer.

"Big Endian" means that certain data is stored with the "Most Significant" value, the "big end", first, descending to the "Least Significant" value, the "little end", last. "Little Endian" is the opposite: the "Least Significant" value, the "little end", is stored first, ascending to the "Most Significant" value, the "big end", last.

*************************

For example, think about dates. In many countries outside the U.S., dates are written as Day/Month/Year - which is "Little Endian", since the "day" is the "Least Significant" value and the "year" is the "Most Significant" value. "Computer Science Theory", whatever that is, tells us that dates should be Year/Month/Day, or "Big Endian", since the "year" is the "Most Significant" value, descending down to the "day", which is the "Least Significant" value. This way, lists of dates are automatically kept in historical order without any fancy date manipulation or testing. In the U.S. we normally use Month/Day/Year, which is called "Middle Endian" or "Mixed Endian". ("Bi-Endian", which sounds a little rude, is actually a computer that can run in either byte order :-)

*************************

In computers, the smallest piece of data you can have is a single "bit", basically an "on" or "off" switch which is used to represent the numbers 1 or 0. That's not much, so programmers found that using blocks of eight bits (called "bytes") is more useful. An eight-bit "byte" can take on 256 distinct combinations of "on" or "off" bits and so can represent the numbers 0 to 255. That's better, but still not very big. So multiple bytes are used to represent bigger integer numbers.

Two bytes (containing 16 bits) can represent numbers from 0 to 65,535.
Four bytes (containing 32 bits) can represent numbers from 0 to 4,294,967,295.

That's usually more than enough for most things.

*************************

Let's say we have an integer made up of the four bytes 12-34-56-78, where '12' is the "Most Significant" byte and '78' is the "Least Significant" byte. If this integer is stored in "Big Endian" order, we would see 12-34-56-78 in the computer's memory. If this integer is stored in "Little Endian" order, we would see 78-56-34-12 in the computer's memory. Notice that the value inside each byte is not affected, just the order of the four bytes.
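If you're curious which camp your own machine is in, here's a quick little C sketch that runs exactly the experiment above: it stores the four-byte integer 12-34-56-78, then peeks at the bytes one at a time to see what order they landed in. (Just a sketch, assuming you have a C compiler handy - nothing exotic in here.)

    #include <stdio.h>
    #include <stdint.h>

    int main(void)
    {
        /* Store the four-byte integer 12-34-56-78 ... */
        uint32_t value = 0x12345678;

        /* ... then look at the individual bytes as they
           actually sit in the computer's memory. */
        unsigned char *bytes = (unsigned char *)&value;

        printf("Bytes in memory: %02X-%02X-%02X-%02X\n",
               bytes[0], bytes[1], bytes[2], bytes[3]);

        /* Big Endian machines print    12-34-56-78,
           Little Endian machines print 78-56-34-12. */
        if (bytes[0] == 0x12)
            printf("This machine is Big Endian\n");
        else
            printf("This machine is Little Endian\n");

        return 0;
    }

On a typical Intel or AMD machine, you'll see 78-56-34-12 and "Little Endian".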
*************************

I could say that half of the computers in the world use "Big Endian" integers and half use "Little Endian" integers, but I don't know the actual breakdown. There are plenty in both camps. Since some of the biggest players, like Intel, use "Little Endian", I'm guessing "Little Endian" wins out as the most common.

*************************

I bet you're thinking, "why would anybody in their right mind ever bother to use Little Endian?" To be totally honest, I don't know, they just do. It was dictated by Electrical Engineers a long time ago, and nobody ever accused an EE of being in their right mind. (Just kidding :-) I do have a theory about this, though. When you do addition, you always start at the "little end": the ones place, then the tens place, then the hundreds, the thousands, etc. I suspect it may be easier in the hardware to have the bytes stored in that order, ready to go. There's a nice side effect, too: with "Little Endian", the "little end" always sits at the starting address, whether the number is one, two, or four bytes long.

*************************

I suspect most computer programmers aren't familiar with the terms "Big Endian" and "Little Endian". All the high-level computer languages hide this and just deliver integers as integers. I suspect most programmers do know that something funky goes on at the lowest levels regarding byte order, but they rarely need to goof with it, so who cares.

*************************

Here's the "cute" part of all of this nonsense - that is, where the names come from. They evidently come from the satirical book "Gulliver's Travels" by Jonathan Swift (1726). When Gulliver shipwrecks on the island of Lilliput (with all the tiny people), he finds they are bitterly, politically divided over how to crack open their eggs - whether to crack them open at the big end (the "Big Endians") or the little end (the "Little Endians").

*************************

Is this controversial? Well, all I can say is that, as a "Big Endian" (and please, no wisecracks about my "big end"), I can't stand those "Little Endian" people - who do they think they are? No wait, that's wrong. As a "Little Endian", I can't stand those "Big Endian" people - who do they think they are? No wait, that's wrong, too. Actually, I'm not aware anybody really cares about all this. And by now, I bet you don't either :-) But you do have to keep this straight if you're programming in Assembly Language.

*************************

Stay Jazzed!

--Tom Swezey

...
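*************************

P.S. For the programmers who do have to goof with byte order - say, pulling a number out of a file or a network message - here's one common trick (just a sketch of mine; the name "read_big_endian" is made up for it): build the integer with shifts instead of pointer games, and it comes out right no matter which camp your machine is in.

    #include <stdio.h>
    #include <stdint.h>

    /* Assemble a 32-bit integer from four bytes stored in
       Big Endian order (Most Significant byte first). Shifts
       work on values, not memory, so this gives the same
       answer on any machine. */
    uint32_t read_big_endian(const unsigned char b[4])
    {
        return ((uint32_t)b[0] << 24) |
               ((uint32_t)b[1] << 16) |
               ((uint32_t)b[2] << 8)  |
               ((uint32_t)b[3]);
    }

    int main(void)
    {
        unsigned char data[4] = { 0x12, 0x34, 0x56, 0x78 };

        /* Prints "Value: 12345678" whether this machine is
           Big Endian or Little Endian. */
        printf("Value: %08X\n", (unsigned)read_big_endian(data));
        return 0;
    }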