Wednesday May 1st, 2019

Decision Tree

By Ebubekir Sezer

Decision Tree is a machine learning algorithm which you can make the classification and raegression. Implementation of this algorithm very easy. In decision tree algorithm, Decision were made by looking at the value which given at the start. The given root separated to the two branches and process goes with these branches.

I want to make prediction using by the decision tree algorithm. I have a dataset which contains experience and salary of the employees. Using by this dataset, I want to determine the salary of the employees who has fraactional experience.

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

data=pd.read_csv("salary.csv",sep=";")
print(data.head())

x=data.iloc[:,0].values.reshape(-1,1)
y=data.iloc[:,1].values.reshape(-1,1)

You can see the libraries which i can use in the prediction in the above. By using Pandas, I added the my csv file. This data has two columns and i would like to initilaze x and y value for that I define these variables using the iloc.

from sklearn.tree import DecisionTreeRegressor
tree_regression=DecisionTreeRegressor()
tree_regression.fit(x,y)

x2=np.arange(min(x),max(x),0.01).reshape(-1,1)
y_head=tree_regression.predict(x2)

By using the Sklearn library, I added the Decision Tree Regressor to the my project and then i fitted the my data using by this regression model. To predict year values, i should make the interval between the 1 and 10 years and then i added a variable and then i increase the this variable by 0.01 between the 1 and 10 to predict salary. Lastly i made the graphical output of this prediction.

plt.scatter(x,y,color="red")
plt.plot(x2,y_head,color="green")
plt.xlabel("Years Experience")
plt.ylabel("Salary")

We should get the graphic like above. You can ask your questions via e-mail or comments.