09/28/2023

****************************************

Artificial Intelligence (AI) Explained

I thought from time to time I might explain some computer stuff that people might find interesting even if they have no practical use for it.

I'm not an expert on AI and there is a lot of research going on I don't know anything about, but I have looked into AI, out of my own interest, and I think I can explain some of the basic ideas behind how it works.

****************************************
In Math, in general, there are a lot of formulas.

You put a value into a formula and you get an answer out the other end.

INPUT --> FORMULA --> OUTPUT

For example, we could have a formula that converts "feet" to "inches".

You put in "feet" --> crank it through the FORMULA --> and get an answer out in "inches".

Example: 2.5 feet --> FORMULA --> 30 inches

Usually, we KNOW the INPUT and the FORMULA but we DON'T KNOW the OUTPUT ahead of time - that's what the formula calculates for us.
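Here is a minimal sketch of that idea in Python (the language and the name feet_to_inches are just choices made for illustration):

```python
def feet_to_inches(feet):
    """The FORMULA: one foot is 12 inches."""
    return feet * 12.0

# INPUT --> FORMULA --> OUTPUT
inches = feet_to_inches(2.5)
print(inches)  # 30.0
```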

****************************************
AI is sort of "sideways" on that.

The basic idea in AI is that we DON'T KNOW what the FORMULA is.

THAT'S the UNKNOWN thing we need to solve for - NOT the OUTPUT.

If we have enough KNOWN INPUT/OUTPUT PAIRS, where we know the right OUTPUT value for a given INPUT value, AI can grind out a FORMULA that will get us from that INPUT to that OUTPUT.

Once we have that FORMULA - AI is done.

We can then use that FORMULA, like we would any other formula, with other INPUTS to get unknown OUTPUTS.
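To make the "sideways" idea concrete, here is a toy sketch in Python. It assumes we already suspect the FORMULA is just "multiply by some number", and it recovers that number from KNOWN PAIRS by averaging OUTPUT divided by INPUT. That shortcut is only an illustration, not how AI actually does it, and the names known_pairs and recover_multiplier are made up.

```python
# KNOWN INPUT/OUTPUT PAIRS: (feet, inches)
known_pairs = [(1.0, 12.0), (2.5, 30.0), (4.0, 48.0), (10.0, 120.0)]

def recover_multiplier(pairs):
    """Solve for the unknown number in "OUTPUT = number * INPUT"."""
    ratios = [output / feet for feet, output in pairs]
    return sum(ratios) / len(ratios)

multiplier = recover_multiplier(known_pairs)

def formula(feet):
    """The recovered FORMULA, usable on INPUTS that were not in the pairs."""
    return feet * multiplier

print(multiplier)    # 12.0
print(formula(7.0))  # 84.0
```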

****************************************
Now you might ask, "Really? Is that ALL AI is?"

Yes, actually that really IS all AI is.

We are solving for an UNKNOWN FORMULA using KNOWN INPUT/OUTPUT PAIRS.

Then we can use that FORMULA for other INPUTS.

What is SPECIAL about AI is that the FORMULAS it usually comes up with are so big and complicated that they CAN'T be developed by hand using traditional methods.

We can currently ONLY get them using AI.

And some of those FORMULAS are pretty useful.

****************************************
Currently, the method at the heart of most AI, to figure out the FORMULA, is the "Backpropagation Neural Network".

Don't let the name scare you - let me cut through the "buzz words" :-)

"Neural" just means that it "mimics the human brain" - that's all.

And it's probably WRONG here anyway.

Nowadays they don't think the brain actually uses "Backpropagation".

OOPS!

****************************************
I don't remember if they taught anything in High School Math about "VECTORS" and "MATRIX/MATRICES", but they really could - they're not that complicated.

It's a branch of Mathematics called Linear Algebra.

It's really just Arithmetic 2.0 - it just uses a lot of basic arithmetic in it.

There's nothing exotic, like integrals, derivatives, differential equations, stuff like that in it.

It really just uses a lot of multiplications and divisions, additions and subtractions.

****************************************
Input into AI is usually in the form of a VECTOR, the FORMULA is usually a MATRIX and the output is commonly another VECTOR.

If you know about VECTORS and MATRICES, great. If you don't, you won't need to here.

It's enough just to say INPUT --> FORMULA --> OUTPUT

Anyway, the MATRIX is the "NETWORK" part of AI.
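For anyone curious what VECTOR --> MATRIX --> VECTOR looks like in practice, here is a small Python sketch. The numbers in the MATRIX are arbitrary and the helper name apply_matrix is made up for illustration.

```python
def apply_matrix(matrix, vector):
    """Multiply a MATRIX (a list of rows) by a VECTOR (a list of numbers)."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

matrix = [[2.0, 0.0],
          [1.0, 3.0]]
input_vector = [5.0, 1.0]

# INPUT --> FORMULA --> OUTPUT
output_vector = apply_matrix(matrix, input_vector)
print(output_vector)  # [10.0, 8.0]
```

Each OUTPUT number is just multiplications and additions of the INPUT numbers.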

****************************************
I always like to work through an example to see what is really going on.

So I've contrived one.

Let's say we want a FORMULA such that we can put in the longitude and latitude of ANY point on earth and get out the longitude and latitude of the point on the exact OPPOSITE side of the earth from it.

For example, if we put in the North Pole - the FORMULA would give us the South Pole.

If we put in Minneapolis the OPPOSITE is a point in the Indian Ocean.

If we put in Tokyo, Japan the OPPOSITE is a point in the South Atlantic Ocean, east of Uruguay.

If we put in Cape Town, South Africa the OPPOSITE is a point near Hawaii.

If we put in Berlin, Germany the OPPOSITE is a point near New Zealand.

I think you get the idea of what we want (I hope :-)

But we don't know what that FORMULA is that calculates this.

So how do we get AI to give us that FORMULA?

****************************************
We start by creating a set of PAIRS where we ALREADY KNOW the OPPOSITE points for a bunch of given INPUT points. 

Maybe we just buy a cheap used globe at the nearest garage sale and figure out a few by hand. 

It doesn't matter how you get them - but we do NEED them :-)

ANY POINT (longitude, latitude) and its OPPOSITE POINT (longitude, latitude).

Minneapolis and its OPPOSITE.

Tokyo and its OPPOSITE.

Cape Town and its OPPOSITE.

Berlin and its OPPOSITE.

and so forth.

In AI this is called "THE TRAINING SET".
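Here is one way a small TRAINING SET could be written down in Python. The city coordinates are approximate, and the helper opposite_point is only a stand-in for reading the answers off a globe (in the story, we don't have the FORMULA yet). Longitudes are in degrees with east positive and west negative, which is an assumption of this sketch.

```python
def opposite_point(longitude, latitude):
    """Stand-in for the globe: flip the latitude and go halfway around."""
    if longitude <= 0:
        opposite_longitude = longitude + 180.0
    else:
        opposite_longitude = longitude - 180.0
    return (opposite_longitude, -latitude)

# Approximate (longitude, latitude) for each city.
cities = {
    "Minneapolis": (-93.27, 44.98),
    "Tokyo": (139.69, 35.68),
    "Cape Town": (18.42, -33.92),
    "Berlin": (13.40, 52.52),
}

# THE TRAINING SET: a list of (INPUT point, KNOWN OUTPUT point) PAIRS.
training_set = [(point, opposite_point(*point)) for point in cities.values()]

for name, (point, opposite) in zip(cities, training_set):
    print(name, point, "-->", opposite)
```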

****************************************
Next we estimate what the FORMULA should look like and set it up.

In our case, each INPUT is a two element VECTOR (longitude and latitude) and each OUTPUT is also a two element VECTOR (longitude and latitude).

This means the FORMULA should likely be a 2 by 2 MATRIX (plus, strictly speaking, a two element offset VECTOR to handle the 180 degree shift in longitude).

[Actually, in AI the FORMULA usually consists of a two stage process - with two MATRICES, the output from the first is fed into the second to get the final OUTPUT, but that's more than we need to know for our example].

You can use ANY values you want in the MATRIX to start with, even random numbers.

This is just to set up a starting point version of the FORMULA.
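A starting point like that might be set up as follows in Python. The range of -1 to 1 and the fixed seed are arbitrary choices for this sketch.

```python
import random

random.seed(42)  # fixed seed so the "random" start is repeatable

# A 2 by 2 MATRIX of random starting values between -1 and 1.
starting_matrix = [[random.uniform(-1.0, 1.0) for _ in range(2)]
                   for _ in range(2)]

for row in starting_matrix:
    print(row)
```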

****************************************
Next we create a TRAINING LOOP step that we will repeat over and over again that actually calculates the FORMULA.

We add an additional stage to our FORMULA.

The new stage just calculates the DIFFERENCE between the OUTPUT the FORMULA gives us and the KNOWN right answer OUTPUT from the TRAINING SET PAIR.

For example MINNEAPOLIS ---> FORMULA ---> FORMULA-RESULT,

Then (FORMULA-RESULT minus the KNOWN-RIGHT-OUTPUT-FROM-THE-TRAINING-SET) ---> ERROR-AMOUNT

The ERROR-AMOUNT indicates how BAD our FORMULA currently is from the FORMULA we want.

Notice that by adjusting the values in the FORMULA by hand, the ERROR-AMOUNT will either get bigger or smaller.

We want it to get smaller - actually we want the ERROR-AMOUNT to go to ZERO.

So how can we adjust the FORMULA to move it toward (ERROR-AMOUNT = ZERO)?
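Here is a rough Python sketch of that extra DIFFERENCE stage. Turning the differences into a single ERROR-AMOUNT by squaring and adding them is a common choice but an assumption here, and the names apply_matrix and error_amount are made up. The Minneapolis coordinates are approximate.

```python
def apply_matrix(matrix, vector):
    """Multiply a MATRIX (a list of rows) by a VECTOR (a list of numbers)."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

def error_amount(matrix, known_input, known_output):
    """FORMULA-RESULT minus KNOWN-RIGHT-OUTPUT, squared and added up."""
    formula_result = apply_matrix(matrix, known_input)
    differences = [r - k for r, k in zip(formula_result, known_output)]
    return sum(d * d for d in differences)

# Minneapolis and its OPPOSITE point, as (longitude, latitude).
minneapolis = [-93.27, 44.98]
opposite = [86.73, -44.98]

guess_matrix = [[1.0, 0.0],
                [0.0, 1.0]]    # an arbitrary starting guess
better_matrix = [[1.0, 0.0],
                 [0.0, -1.0]]  # adjusted by hand: gets the latitude right

print(error_amount(guess_matrix, minneapolis, opposite))
print(error_amount(better_matrix, minneapolis, opposite))
```

The second ERROR-AMOUNT is smaller but not ZERO. What is left over comes from the 180 degree shift in longitude, which a plain 2 by 2 MATRIX cannot add on its own.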

****************************************
This part gets a little "mathy" (is that a word?) and requires some Calculus.

(You can skip ahead to the next section if you want - the details here aren't that important.)

Our FORMULA, with the new extra DIFFERENCE stage in it, is a "differentiable function".

AI commonly uses a method called "GRADIENT DESCENT" to adjust the FORMULA and make it a little better each time we cycle through the TRAINING LOOP.

A technical definition of this method is: "Gradient descent is a first-order iterative optimization algorithm for finding a local minimum of a differentiable function."

That description doesn't help us very much to understand AI :-)

So we'll just say that the FORMULA gets "better" each time we run the TRAINING LOOP and adjusts the FORMULA toward being the final FORMULA we're seeking.

The "local minimum" we are seeking is when the ERROR-AMOUNT = ZERO.
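For the curious, here is a toy GRADIENT DESCENT in Python on the simplest possible FORMULA, "inches = weight * feet", where the single number weight is the unknown. The learning rate and the number of loops are arbitrary choices, and this is a sketch of the general idea, not of any particular AI system.

```python
# KNOWN INPUT/OUTPUT PAIRS: (feet, inches)
known_pairs = [(1.0, 12.0), (2.5, 30.0), (4.0, 48.0)]

weight = 0.0          # starting guess for the FORMULA "inches = weight * feet"
learning_rate = 0.05  # how big a nudge to make each time (an arbitrary choice)

for step in range(200):
    # Slope (derivative) of the average squared error with respect to weight.
    slope = sum(2.0 * (weight * feet - inches) * feet
                for feet, inches in known_pairs) / len(known_pairs)
    # Nudge the weight "downhill", against the slope.
    weight -= learning_rate * slope

print(round(weight, 6))  # 12.0
```

The Calculus is all in the one line that computes the slope. Everything else is arithmetic.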

****************************************
We now start looping through the TRAINING LOOP using the TRAINING SET.

We run each INPUT in the TRAINING SET through the FORMULA and see what it gives us for the OUTPUT.

The first time through, the OUTPUT from the FORMULA is likely going to be wildly off.

But since we KNOW ahead of time what the CORRECT OUTPUT values SHOULD BE - we can calculate the amount of ERROR there is for each TRAINING SET pair and adjust the FORMULA to be "better".

We run the TRAINING LOOP over, and over, and over, ... and over.

A hundred times, a thousand times, a million times, a billion times - whatever it takes - and it usually takes A LOT!

Each time through the TRAINING LOOP should reduce the ERROR-AMOUNT by at least a little.

The FORMULA keeps auto adjusting toward the final FORMULA we're seeking.

Applying adjustments BACK into the FORMULA based on the ERROR-AMOUNT using "GRADIENT DESCENT" is the "BACKPROPAGATION" part of AI.

The TRAINING LOOP step is the "MACHINE LEARNING" part of AI. 

AI is adjusting the FORMULA based on the TRAINING SET to produce the correct FORMULA for that TRAINING SET.
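Putting the pieces together, here is a sketch of a whole TRAINING LOOP in Python for the longitude/latitude example. It leans on several assumptions that are not in the text above: the FORMULA is a 2 by 2 MATRIX plus a two element offset (a plain MATRIX can't add the 180 degree shift), the TRAINING SET is a made-up grid of western-hemisphere points only, all values are divided by 180 to keep the numbers small, and opposite_point stands in for the globe. With only one MATRIX, "backpropagation" boils down to plain gradient descent. A real network with several stages passes the error back through each stage.

```python
import random

def opposite_point(longitude, latitude):
    """Stand-in for the globe; only valid here for western longitudes."""
    return (longitude + 180.0, -latitude)

SCALE = 180.0  # divide degrees by this so all values stay between -1 and 1

# TRAINING SET: a grid of western-hemisphere points and their OPPOSITES.
training_set = []
for longitude in (-170.0, -130.0, -90.0, -50.0, -10.0):
    for latitude in (-60.0, -30.0, 0.0, 30.0, 60.0):
        known = opposite_point(longitude, latitude)
        training_set.append(([longitude / SCALE, latitude / SCALE],
                             [known[0] / SCALE, known[1] / SCALE]))

# Random starting point for the FORMULA.
random.seed(0)
matrix = [[random.uniform(-1.0, 1.0) for _ in range(2)] for _ in range(2)]
offset = [random.uniform(-1.0, 1.0) for _ in range(2)]
learning_rate = 0.5  # an arbitrary choice that happens to work here

def formula(point):
    """INPUT --> MATRIX plus offset --> OUTPUT (in scaled units)."""
    return [sum(m * p for m, p in zip(row, point)) + b
            for row, b in zip(matrix, offset)]

def average_error():
    """Average squared ERROR-AMOUNT over the whole TRAINING SET."""
    total = 0.0
    for known_input, known_output in training_set:
        result = formula(known_input)
        total += sum((r - k) ** 2 for r, k in zip(result, known_output))
    return total / len(training_set)

error_before = average_error()
for loop in range(5000):
    matrix_nudge = [[0.0, 0.0], [0.0, 0.0]]
    offset_nudge = [0.0, 0.0]
    # Add up how much each number in the FORMULA contributed to the error.
    for known_input, known_output in training_set:
        result = formula(known_input)
        for i in range(2):
            difference = result[i] - known_output[i]
            offset_nudge[i] += difference
            for j in range(2):
                matrix_nudge[i][j] += difference * known_input[j]
    # Apply the adjustments BACK into the FORMULA.
    for i in range(2):
        offset[i] -= learning_rate * offset_nudge[i] / len(training_set)
        for j in range(2):
            matrix[i][j] -= learning_rate * matrix_nudge[i][j] / len(training_set)
error_after = average_error()

print(error_before, "-->", error_after)
print(matrix, offset)
```

Under these assumptions the MATRIX should settle very close to 1, 0 / 0, -1 and the offset close to 1, 0 (that is, 180 degrees once the scaling is undone).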

****************************************
Once the FORMULA consistently gives the ERROR-AMOUNT as ZERO (or close enough to ZERO) for each point in the TRAINING SET, WE'RE DONE!

Every point in the TRAINING SET now maps its INPUT value correctly THROUGH the FORMULA to its KNOWN OUTPUT value.

****************************************
Now that we have the finished FORMULA, we can use it like any other old formula.

We don't have to repeat all this AI stuff again.

We now know the INPUT and the FORMULA and we can calculate the UNKNOWN OUTPUT.

For our example, we can now put in the longitude and latitude of Lima, Peru - which was NOT in the TRAINING SET - and the FORMULA will give us Lima's correct, previously unknown, OPPOSITE point.

The FORMULA should now work right for ANY longitude and latitude.
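As a sketch of what "using the finished FORMULA" might look like in Python: the MATRIX and offset below are the values a trained FORMULA would ideally settle on for western-hemisphere points. They are assumed here for illustration, not produced by actual training, and Lima's coordinates are approximate.

```python
# Assumed "finished" FORMULA for western-hemisphere points (not actually trained).
matrix = [[1.0, 0.0],
          [0.0, -1.0]]
offset = [180.0, 0.0]

def formula(point):
    """INPUT (longitude, latitude) --> OUTPUT (longitude, latitude)."""
    return [sum(m * p for m, p in zip(row, point)) + b
            for row, b in zip(matrix, offset)]

lima = [-77.04, -12.05]  # approximate (longitude, latitude) of Lima, Peru
opposite = formula(lima)
print([round(value, 2) for value in opposite])  # [102.96, 12.05]
```

One caveat: for an eastern-hemisphere point such as Tokyo, the longitude shift has to be minus 180 instead of plus 180, so this single MATRIX-plus-offset sketch would need more work before it covered ANY point.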

****************************************
So that's all AI really is?

Yes, using KNOWN TRAINING SAMPLE PAIRS, AI calculates a FORMULA which can then be used generically for ANY similar INPUT to calculate similar but UNKNOWN OUTPUT.

Q.E.D. - (actually, I don't think anyone says Q.E.D. anymore :-)
****************************************


****************************************
Let's do a fun example - just for kicks :-)

Let's take "head shot" pictures of a bunch of dogs and pair them with "head shot" pictures of a bunch of cats as our TRAINING SET.

Then we have AI crank out a FORMULA that converts "head shot" pictures of dogs into "head shot" pictures of cats.

For fun, we then give that FORMULA a "head shot" picture of ME - (some people say I have a dog face :-)

What would we get out?

Obviously a picture of a really "hip cat" beatnik, with a goatee, sunglasses and a beret, right? :-)

Okay, probably NOT - probably a spooky "half Tom" / "half cat" ("tomcat"? :-) creature.  

You could use that to scare kids with on Halloween :-)

****************************************
Let's do a more serious example.

Let's create a TRAINING SET consisting of:

100 pictures of cancerous blood cells - paired to "Cancer" as the OUTPUT.

and 100 pictures of healthy blood cells - paired to "No Cancer" as the OUTPUT.

We have AI crank out the FORMULA and run a picture of YOUR blood cells through it.

Could save your life!

****************************************

****************************************
You might ask, "why are we only hearing about AI NOW?"

After all, Linear Algebra and Calculus have been around since the late 17th century.

Neural networks since the late 1940s.

Backpropagation was invented in 1970.

I first heard about all of this in a lecture at a computer conference in 1989.

So why are we only seeing AI NOW?

Well, the longitude/latitude problem is pretty small and could probably even be done by hand.

But the really interesting problems for AI are BIG - REALLY BIG.

Cancer/No Cancer may want megapixel, hi-def picture INPUT to be accurate.

That means gigantic MATRICES.

And maybe more than 100 SAMPLES of each are needed to work right. Maybe 1000 or 10,000.

It could take millions, billions (trillions?) of loops through the TRAINING LOOP to get the FORMULA.

It's all just basic arithmetic, but it's a LOT of basic arithmetic - a SUPER-DUPER LOT.
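A back-of-the-envelope count in Python gives a feel for the scale. Every number here is made up for illustration: a 1-megapixel picture, a first-stage MATRIX that produces 1,000 intermediate values, 10,000 SAMPLES of each kind, and only 1,000 trips through the TRAINING LOOP.

```python
pixels_per_picture = 1_000_000  # a 1-megapixel picture as one long INPUT VECTOR
intermediate_values = 1_000     # size of the first stage's output (made up)
samples = 2 * 10_000            # 10,000 "Cancer" plus 10,000 "No Cancer"
training_loops = 1_000          # made up, and likely far too low

# One MATRIX entry per (pixel, intermediate value) combination.
matrix_entries = pixels_per_picture * intermediate_values

# One multiplication per MATRIX entry, per SAMPLE, per loop.
multiplications = matrix_entries * samples * training_loops

print(f"{matrix_entries:,} numbers in the MATRIX")
print(f"{multiplications:,} multiplications, just for the first stage")
```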

You can't do that on a 1970s era PDP-11 computer.

And you can't do that on your cell phone or even your PC either.

But companies like GOOGLE and MICROSOFT have MASSIVE amounts of computer power available and they CAN do this.

That's why you're starting to see AI now.

It's just a matter of raw computer power - a LOT of raw computer power.

****************************************

****************************************
Okay, everything so far has been sunshine, puppy dogs and unicorns.

But there are a LOT of PROBLEMS with AI. 

****************************************
One is just simple round off errors.

Matrix formulas in general may need very exacting values with very long decimal fractions in them to work accurately.

Computers have to do a round-off at some point.

That round-off may seem microscopic but when you're talking millions, billions (trillions?) of calculations it could throw everything whacky.

And you might not even know it happened!
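Round-off is easy to see for yourself. This small Python sketch just adds 0.1 over and over. Python uses standard double-precision numbers, in which 0.1 cannot be stored exactly.

```python
total = 0.0
for _ in range(10):
    total += 0.1     # 0.1 has no exact binary representation

print(total)         # 0.9999999999999999
print(total == 1.0)  # False

# The same thing a million times: the tiny errors pile up.
big_total = 0.0
for _ in range(1_000_000):
    big_total += 0.1

print(big_total - 100_000.0)  # a small but nonzero drift
```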

****************************************
Another problem, more specific to AI, is that the "Gradient Descent" algorithm in the TRAINING LOOP can get stuck in a "local minimum" and stop making the FORMULA "better" each loop.

And that might not be anywhere close to a good version of the FORMULA.

Currently they try to spot when the ERROR-AMOUNT isn't shrinking any more.

Basically they just give the FORMULA a sort of whack upside the head and hope it knocks it loose and starts the ERROR-AMOUNT shrinking again.
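Here is a toy Python sketch of getting stuck. The bumpy "error" curve is made up, with a shallow dip near 1.13 and a deeper one near -1.30. The "whack" used here is simply restarting from several different places and keeping the best result, which is only one of several tricks and an assumption of this sketch.

```python
def error(x):
    """A made-up bumpy ERROR-AMOUNT curve with two dips."""
    return x**4 - 3.0 * x**2 + x

def slope(x):
    """Derivative of the error curve."""
    return 4.0 * x**3 - 6.0 * x + 1.0

def gradient_descent(start, learning_rate=0.01, steps=2000):
    x = start
    for _ in range(steps):
        x -= learning_rate * slope(x)  # always step downhill
    return x

# Starting at 2.0 rolls into the shallow dip and stays there.
stuck = gradient_descent(2.0)
print(round(stuck, 3), round(error(stuck), 3))

# The "whack": restart from several places and keep the best result.
candidates = [gradient_descent(start) for start in (2.0, 0.5, -0.5, -2.0)]
best = min(candidates, key=error)
print(round(best, 3), round(error(best), 3))
```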

****************************************
The major problem with AI is that WE DON'T REALLY KNOW WHAT IT'S DOING.

We assume the FORMULA is wrapping itself around patterns in the INPUT and translating them into patterns in the OUTPUT.

But we don't have a clue what those patterns are!

And what if there is NO pattern in the input? Or how does a random noise "pattern" in the input affect things? ("Gee, that cloud looks like Snoopy" isn't a real pattern :-).

What then? Who knows!

We don't know what weights it's giving to those patterns or how it affects decisions about what goes into the output.

In our dog-to-cat example, is it giving too much weight to the dog noses? Should it be using the dog eyes more? Or less? Or the ears more? Or less?

Or is something in the background on just one of the pictures goofing everything up? 

Who knows!

Maybe a minor bump on a single dog's nose might flip the cat from being black to white?

In our dog-to-cat example who cares - it's always cute anyway.

But in our Cancer/No Cancer example - something like that could be VERY important!

****************************************
Which brings us to the really BIGGEST problem of all in AI:

Since we don't really know what AI is doing - IS IT DOING IT CORRECTLY?

All we really know is that the FORMULA works perfectly for the TRAINING SET.

We expect it to, but we don't really know if it works at all for ANY other INPUT.

How can we ever know for sure?

Just double checking the output doesn't always help either.

It might be right 95% of the time but totally whacko the other 5%! 

That 5% might just never show up - or worse, show up at a VERY inopportune moment! ("Open the pod bay doors, HAL." - "I'm sorry, Dave. I'm afraid I can't do that.").

Presumably a bigger TRAINING SET might help - might make the FORMULA more accurate - but how many TRAINING PAIRS are enough? Who knows!

And all these problems probably depend on the particular problem at hand - no general solutions to them.
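One partial check that is commonly used, though it is not described above, is to hold back some of the KNOWN PAIRS, keep them out of the TRAINING SET, and try the finished FORMULA on them afterward. The Python sketch below uses a deliberately bad "FORMULA" that only memorizes its TRAINING SET, to show how a perfect score on the TRAINING SET can hide a total failure everywhere else. All names are made up for illustration.

```python
# KNOWN PAIRS: (feet, inches) for 1 through 10 feet.
known_pairs = [(float(feet), 12.0 * feet) for feet in range(1, 11)]
training_pairs = known_pairs[:8]   # used to build the FORMULA
held_back_pairs = known_pairs[8:]  # never shown to the FORMULA

def build_memorizing_formula(pairs):
    """A deliberately bad "FORMULA" that only memorizes its TRAINING SET."""
    table = dict(pairs)
    return lambda feet: table.get(feet, 0.0)

def fraction_correct(formula, pairs):
    """Fraction of PAIRS where the FORMULA gives the KNOWN OUTPUT."""
    hits = sum(1 for feet, inches in pairs
               if abs(formula(feet) - inches) < 1e-9)
    return hits / len(pairs)

formula = build_memorizing_formula(training_pairs)
print(fraction_correct(formula, training_pairs))   # 1.0
print(fraction_correct(formula, held_back_pairs))  # 0.0
```

A good score on the held-back PAIRS still isn't proof, but a bad one is a clear warning.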

****************************************

****************************************
So what's the bottom line on AI?

Okay, this is just my opinion :-)

AI is here to stay and it's going to be big - REALLY BIG!

There are a LOT of applications where "close enough" really is "close enough".

But NOT everything.

I think we should ALWAYS consider the output from AI to be more of a machine generated "opinion" rather than a "fact".

It reminds me of Ronald Reagan on the Russians - "Trust, but verify."

If there's a plane piloted by AI but no trained human pilots onboard - I'm NOT getting on that plane. Sorry, NOT now, NOT EVER! :-)

But if you just want to see what I look like as a cat - I'm okay with that - all day long :-)

****************************************


Stay Jazzed!
--Tom Swezey

...