ML Algorithms Explained #1

Hello everyone. I am starting a new series of blog posts where I write about machine learning algorithms in detail. I am not an expert, but the main goal of this series is both to teach myself and to produce content others can benefit from. I realized the best way for me to understand most of these algorithms is to write about them and try to explain as much as possible.

Writing this series will not only help me better understand machine learning but will also give me some content I can rely on when, in the future, I say, ‘I remember writing lots of articles on ML. I didn’t know much when I first started the series, but as you can see now, I am deep into it!’

Image generated by Adobe Firefly (Beta) with the following prompt: “a robot teaching machine learning”

That’s why it is crucial for me to start this series, as you can see! Also, who doesn’t love some freshly human-written content that isn’t a generative AI product? These days, tools like OpenAI’s GPT-4 and Google’s Bard have taken the world by storm. We all use them for something. Personally, I use them as private tutors. I ask and they explain, again and again. They even give Python code examples while explaining AI. I would need to pay a lot for a private tutor like that. To be honest, I also get scared when they write better content and better code than mine. Whenever I write code, I give it to one of these LLMs to debug and give recommendations. They sometimes fail and mess things up, but that mostly happens with mathematics or code that is highly mathematical.

Of course, I have many concerns as well. I currently have multiple websites on various topics with tons of unique content in them. This content is mostly AI-generated (each website states whether its content is human- or AI-generated), and it is so good that when I read it before publishing, I feel like a professional content creator has written it for me.

This blog, on the other hand, is of course purely Yunus-generated, as it should be. This website is my personal blog and therefore everything you find here will be 100% my unique and original work. Mentioning this makes me feel good, as I feel useful and productive, but it doesn’t change the fact that AI writes better than me, and if you are searching for a content creator, you had probably better hire a language model instead of me (I mean it).

See how I went off the rails? This makes me human! Machine learning is deep and infinite on its own. With this blog series, I cannot claim to give you everything in a professional manner, but I can promise honest work and good intentions as a computer programmer.

We also have a chicken-and-egg problem when it comes to generative AI. Tools like ChatGPT are written in programming languages, by humans. But these tools can also write similar code. As a programmer, am I obsolete? Will AI design and write all the new software? Well, not really. At least not for a while. Because you see, these tools are made of a bunch of code themselves. Even if they can generate new code from given prompts, this doesn’t change the fact that they are limited by design. Human programmers need to design and write new code for these tools to keep improving, and these tools are not yet able to create themselves. They will eventually achieve that level of intelligence, but even then, there will be things for humans to do. Maybe we won’t be writing code anymore (I will be, because I need to write code to calm down), but we will still be problem-solving. Problem-solving is the core of what we do and will always be the core of what a computer programmer does. I love writing code, that’s for sure (my GitHub is full of my small projects in various programming languages), but when things change and we need to adapt to the ‘new world’, I think I can adapt to my new routine in the metaverse full of AI agents.

As we will be discovering with this new series of ML algorithms, software is not only about writing code. You need to design it first. Machine learning is a mathematically intense field of study and before you write the ‘magical code’, you need to know what to write. That’s where algorithms come into play. What is going to be the algorithm that we will base the code on? What is the math behind this goal? What theories should we implement and know?

These are the questions I will be asking myself and you, during this journey of ML algorithms!

I will be using my good friend Mr. World Wide Web, as well as some paper books, to do most of my research as I write this series. Of course I cannot ignore our stars Mr. Bard and Mr. GPT, as we will be using their knowledge to understand many ML concepts. I love using these tools as they are very good at explaining concepts to me at different levels. This will also be useful as I write this series. I will be giving credit to each source when necessary, including any blog posts, newspapers, academic papers and books.

As this is the first blog post of our ML Algorithms series, I want to start our journey with what is commonly accepted as the most fundamental ML algorithm: “Linear regression”.

So what is linear regression?

Linear regression is a statistical method used for predictive analysis. With linear regression, you can make predictions on ‘continuous’ or ‘numeric’ variables. These variables represent a range of real numbers: they can take any value within a range, including decimal numbers. As we collect more values for them, patterns start to emerge. Here is an example data set generated by Google’s Bard that can be used to do linear regression:

Height (inches)    Weight (pounds)
60                 120
62                 130
64                 140
66                 150
68                 160

Table generated by Google’s Bard on June 10th, 2023

This table is a standard dataset example when introducing the concept. The relationship between these two columns lets us predict how new values are likely to behave and what they might be. Linear regression shows the relationship between the continuous variables. The relationship is always ‘linear’ in this case. We have two different variables. One is a ‘dependent’ and the other is an ‘independent’ variable. Modeling the linear relationship between the variable on the X-axis and the variable on the Y-axis is called linear regression. The X-axis holds the ‘independent’ variable and the Y-axis holds the ‘dependent’ variable.

In the table above, the ‘height’ column is our independent variable. The ‘weight’ column is the dependent variable. Weight depends on the height of the person. We can predict a person’s weight from the value of their height.

We need a linear equation to do this:

Weight = 5 * Height - 180

This linear equation tells us that for every 1-inch increase in height, there is a corresponding 5-pound increase in weight.

5 is the coefficient of ‘height’ and indicates the increase mentioned above.

The y-intercept of the equation, which is ‘-180’, is the weight the line predicts for someone who is 0 inches tall. A negative weight is obviously not realistic. The intercept only anchors the line, and the equation should be trusted only near the range of heights in the table.

For example, if you are 65 inches tall, we would predict your weight as 145 pounds.

We came to this conclusion because we applied our value of ‘65’ to the equation like this:

Weight = 5 * 65 - 180 -> Weight = 145 pounds.

Great! But how did we come up with this magical ‘linear equation’ just by looking at the example dataset?

We had to find the slope and the y-intercept. We found the slope by taking the change in weight and dividing it by the change in height. We can see by looking at the table that from one row to the next the change in height is ‘2’ and the change in weight is ‘10’. So (change in weight) / (change in height) = 10 / 2 = 5. Bingo! The slope (m) = 5.

Then we needed the y-intercept. Y-intercept means the point where the line crosses the y-axis. In our case, it represents the predicted weight of a person who is 0 inches tall. To find it, we plugged a known point from the table, for example (60, 120), into Weight = m * Height + b and solved for b:

120 = 5 * 60 + b -> b = 120 - 300 = -180

This tells us the weight = -180 if height = 0.
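Working directly from the table, the slope and intercept calculation can be sketched in a few lines of plain Python. This is only a minimal illustration: the helper names `fit_two_points` and `predict_weight` are my own and do not come from any library, and the approach assumes the data is perfectly linear, so any two rows give the same line.

```python
# Height/weight pairs copied from the table in this post.
data = [(60, 120), (62, 130), (64, 140), (66, 150), (68, 160)]


def fit_two_points(p1, p2):
    """Return (slope, intercept) of the line through two (x, y) points."""
    (x1, y1), (x2, y2) = p1, p2
    slope = (y2 - y1) / (x2 - x1)   # change in y divided by change in x
    intercept = y1 - slope * x1     # solve y1 = slope * x1 + intercept
    return slope, intercept


def predict_weight(height, slope, intercept):
    """Apply the line equation: weight = slope * height + intercept."""
    return slope * height + intercept


slope, intercept = fit_two_points(data[0], data[1])
print(slope, intercept)                      # 5.0 -180.0
print(predict_weight(65, slope, intercept))  # 145.0
```

Using the first two rows is an arbitrary choice here; with this table, any pair of rows produces the same slope and intercept.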

Knowing the slope formula and the y-intercept will be useful:

Slope = (y2 - y1) / (x2 - x1)

This means the change in y divided by the change in x is equal to the slope.

On the other hand, y-intercept means the point where the line crosses the y-axis on a graph. Y-axis is the vertical axis, also called the ‘ordinate’.

The y-intercept is the value of the dependent variable when the independent variable is equal to 0.
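The two-point slope formula works here because every row of the table lies exactly on one line. With real, noisy data, the slope and intercept are usually estimated with ordinary least squares instead, which is a different technique from the two-point method used in this post. Here is a hedged sketch of the standard closed-form formulas in plain Python; the function name `least_squares_line` is hypothetical and chosen only for this example.

```python
def least_squares_line(xs, ys):
    """Return (slope, intercept) of the ordinary least-squares line.

    slope     = sum((x - mean_x) * (y - mean_y)) / sum((x - mean_x) ** 2)
    intercept = mean_y - slope * mean_x
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Same values as the table in this post.
heights = [60, 62, 64, 66, 68]
weights = [120, 130, 140, 150, 160]

ls_slope, ls_intercept = least_squares_line(heights, weights)
print(ls_slope, ls_intercept)  # 5.0 -180.0
```

On this perfectly linear table, least squares gives the same line as the two-point formula. The two methods only start to disagree once the points no longer sit exactly on a line.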

Let’s make an example graph for us to understand better. For this, we will use the Python programming language and use the NumPy module alongside the matplotlib module.

Notice that because our values are a perfect fit, all the dots (y-values) will be right on the line.

This graph will visualize the regression of our dataset. Here is what it will look like:

In order to achieve this graph of our dataset, we wrote the Python code step by step like this:

Step #1: Import the matplotlib.pyplot and numpy modules with the following import statements:

import matplotlib.pyplot as plt

import numpy as np

Step #2: Create an array for height-weight:

HW = np.array([
    [],
    [],
    [],
    [],
    []
])

Step #3: Fill the array with values:

HW = np.array([
    [60, 120],
    [62, 130],
    [64, 140],
    [66, 150],
    [68, 160]
])

Step #4: Define the columns:

H = HW[:, 0]  # first column: heights in inches
W = HW[:, 1]  # second column: weights in pounds

Step #5: Create the scatter plot with height vs. weight columns as axes:

plt.scatter(H, W)

Step #6: Add the labels to the axes:

plt.xlabel("Height (in inches)")
plt.ylabel("Weight (in pounds)")

Step #7: Limit the axes:

plt.xlim(0, max(H)+10)
plt.ylim(0, max(W)+10)

Step #8: Add the axis lines and the tick positions and labels:

plt.axhline(0, color='black', linewidth=0.8)
plt.axvline(0, color='black', linewidth=0.8)

plt.xticks(np.arange(0, max(H)+10, 10))
plt.yticks(np.arange(0, max(W)+10, 10))

Step #9: Add the regression line to the graph:

m = (W[-1] - W[0]) / (H[-1] - H[0])
intercept = W[0] - m * H[0]
x_values = np.linspace(0, max(H)+10, 100)
y_values = m * x_values + intercept
plt.plot(x_values, y_values, color='red')

For the above step, I would like to explain a little bit to clarify things. In the first line, the slope is W[-1] - W[0] divided by H[-1] - H[0], because the formula for the slope of a line is m = (y2 - y1) / (x2 - x1). Index -1 refers to the last element of the array and index 0 refers to the first, so we are measuring the change across the whole data set. Because our data is perfectly linear, any two points would give the same slope, which is 5 here.

In the second line where we declare the intercept variable, we use the equation for finding the intercept. The equation is intercept = y - m * x. With the first row of the table, that is 120 - 5 * 60 = -180.

The third line of code creates an array of 100 evenly spaced numbers between 0 and the maximum value of the H array plus 10. Why add 10? So that the line extends a little past the tallest data point and reaches the right edge of the x-axis we set in Step #7.

The fourth line creates the second variable, ‘y_values’. It multiplies every number in x_values by the slope and adds the intercept, which gives us 100 y-values, one for each x-value, all lying on the regression line.

Now we have two different arrays of values (x_values and y_values). We need to put them on the graph as a red line so that we can see it visually. For that, we have the last line, which calls the plt.plot() function with three arguments: x_values, y_values and color.
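To double-check the claim that all the dots sit right on the line, we can compare the weights the line predicts with the weights in the table. The snippet below is a small sketch that reuses the same variable names as the steps above (HW, H, W, m, intercept). The names `predicted` and `residuals` are my own additions for this check.

```python
import numpy as np

# Same data as in Step #3.
HW = np.array([
    [60, 120],
    [62, 130],
    [64, 140],
    [66, 150],
    [68, 160]
])
H = HW[:, 0]
W = HW[:, 1]

# Same slope and intercept calculation as in Step #9.
m = (W[-1] - W[0]) / (H[-1] - H[0])
intercept = W[0] - m * H[0]

# Weight predicted by the line for each height in the table.
predicted = m * H + intercept

# Residual = actual weight minus predicted weight.
residuals = W - predicted

print(m, intercept)               # 5.0 -180.0
print(np.allclose(residuals, 0))  # True: every point lies on the line
```

If the residuals were not all zero, the two-point slope would no longer describe the whole data set and a proper fitting method would be needed.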

Step #10: Display slope value:

m_text = f"Slope: {m:.2f}"
plt.text(10, 170, m_text, fontsize=12)

Step #11: Add grid lines to the graph:

plt.grid(color='gray', linestyle='--', linewidth=0.5)

Step #12: Show the plot:

plt.show()

I hope it was helpful!

Bibliography:

https://www.javatpoint.com/linear-regression-in-machine-learning
https://chat.openai.com/?model=text-davinci-002-render-sha
https://bard.google.com/
https://firefly.adobe.com/generate/images
https://matplotlib.org/3.5.3/api/_as_gen/matplotlib.pyplot.html
June 10, 2023