Linear Regression – The next step

In the ML pipeline, after classification the next step is regression. We will discuss a specific type of regression called linear regression. Linear regression is an approach for building a model of the relationship between numerical input and output variables. It helps us understand how the typical value of the dependent variable changes when any one of the independent variables is varied. It is used when we have independent variables (the features that are input to the model, for example the number of rooms, the location of the house, the year the house was built, etc.) and another variable that depends on them (for example, the price of the house). As the values of the independent variables (features) change, the value of the dependent variable changes too. Using regression, we can estimate the value of the dependent variable for given values of the independent variables. How regression works and how its error is measured will be discussed in the conclusion.
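To make the idea concrete, here is a minimal sketch of fitting a straight line with ordinary least squares for a single feature. The data (number of rooms against price in thousands of dollars) is made up for illustration and deliberately lies exactly on a line, and the function name `fit_line` is hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both unnormalised).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up data: number of rooms (independent) and price in thousands (dependent).
rooms = [3, 4, 5, 6, 7]
prices = [15.0, 18.0, 21.0, 24.0, 27.0]

slope, intercept = fit_line(rooms, prices)

# Use the fitted line to estimate the price of an 8-room house.
predicted = slope * 8 + intercept
print(slope, intercept, predicted)  # 3.0 6.0 30.0
```

Here each extra room adds 3 (thousand dollars) to the estimated price, which is exactly what the slope expresses. Real data will not lie perfectly on a line, so the fitted line only approximates it.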

Let us take the example of predicting the cost of a new house. We cannot use classification here, as it simply does not make sense: we are predicting the cost, not the category the house belongs to. First, we need to input the features, i.e. all the factors that affect the final price of the house.

For that, we need a dataset of house prices. An easily available one can be loaded from the Python module sklearn. We need to use this Python code to load the dataset:

After this step we can proceed, but first we must know all the features that we are feeding into the regression algorithm.

Feature Description: all the features that will be used, along with a description of each.

These are all the features that we will input into the model. The code for this is very simple:

Python Code
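As a rough sketch of what such a script could look like, the following trains scikit-learn's `LinearRegression` and predicts the price for one sample house. The two feature columns, all the numbers, and every name other than `test_array` are assumptions for illustration, not the dataset used in this article. The made-up prices follow price = 2 + 4 * rooms - 0.1 * age, so the fit can be checked by hand.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical training data: each row is [number of rooms, age of house in years].
X = np.array([
    [3, 40],
    [4, 30],
    [5, 20],
    [6, 10],
    [7, 5],
], dtype=float)

# Hypothetical prices in thousands of dollars.
y = np.array([10.0, 15.0, 20.0, 25.0, 29.5])

# Fit the linear model to the features and prices.
model = LinearRegression()
model.fit(X, y)

# Sample feature values for one new house; predict() expects a 2-D array,
# one row per house.
test_array = np.array([[5, 25]], dtype=float)
prediction = model.predict(test_array)

# The raw prediction is in the same unit as y (thousands of dollars here).
print(prediction[0])
```

After fitting, `model.coef_` holds one weight per feature and `model.intercept_` holds the constant term, which is a convenient way to see how each feature moves the predicted price.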

Notice that we have a variable called ‘test_array’. This array contains the sample values that I have provided as test input. The code is relatively small and easy to understand. You may also have noticed that in the last line I multiply the prediction by 1000. This is because I want the output expressed in dollars rather than in thousands of dollars. I convert the prediction into a float and apply ‘math.ceil’ to round it up to a whole number, which makes it more readable. Now let us see the output, or prediction, that the model makes based on the features that we give.
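The conversion step on its own can be sketched as follows. The helper name `to_dollars` and the sample values are hypothetical, and the sketch assumes the model's raw output is in thousands of dollars.

```python
import math


def to_dollars(prediction_in_thousands):
    """Convert a prediction in thousands of dollars to whole dollars, rounding up."""
    return math.ceil(float(prediction_in_thousands) * 1000)


# Hypothetical raw model output, in thousands of dollars.
raw_prediction = 20.75
print(to_dollars(raw_prediction))  # 20750
```

Note that `math.ceil` always rounds up, so a value such as 20.7501 becomes 20751 rather than 20750. If rounding to the nearest dollar is preferred, the built-in `round` could be used instead.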

Prediction or Output: this is the prediction that the model makes. The model predicts that a house in a city with the listed features has an average price of about 21,000 dollars.