Starting Machine Learning with Python!
Machine learning is not a new term, but it is highly popular. Why is it so popular? One reason is that an ML algorithm can learn from data to make predictions and decisions on its own.
This article will give you a basic idea of how to write ML code of your own using Python and its modules.
We are going to use a sample music dataset and make some predictions. Big shoutout to Programming with Mosh. Go check out his video on YouTube.
Prerequisites
- Music dataset (music.csv)
- Jupyter notebook (Anaconda)
- Basic Python knowledge
Basic steps to follow in an ML analysis:
- Import the Data
- Check for null data and Clean the Data
- Convert the data into numeric form where possible (ML algorithms work with numbers)
- Split the dataset into Training and Test
- Create a Model
- Make Predictions
- Evaluate and Improve
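The "convert into numeric data" step is not needed for the music dataset used below, so here is a minimal sketch of what it can look like. The toy DataFrame and its column values are made up for illustration; the mapping follows this article's 1 = Male, 0 = Female convention.

```python
import pandas as pd

# Hypothetical toy data (not the music dataset): gender is stored as text
toy = pd.DataFrame({
    "age": [20, 25, 31],
    "gender": ["male", "female", "male"],
})

# Map each text category to a number (1 = Male, 0 = Female)
toy["gender"] = toy["gender"].map({"male": 1, "female": 0})

print(toy)
```

For columns with many categories, pandas also offers `pd.get_dummies`, which may be a better fit than a hand-written mapping.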
Let's Begin Our ML Journey!
Open the Anaconda application and launch Jupyter Notebook
Create a new Python notebook
Import the Python modules you need
import pandas as pd
Use pandas to read the music CSV file in Jupyter Notebook
df = pd.read_csv("musical file path")
#Check whether null data is present
df.isnull().sum() #0 for a column means it has no null values
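If the null check does report missing values, they need to be cleaned before training. The sketch below shows two common options on a made-up DataFrame (the data and the choice of the median are assumptions for illustration, not part of the music dataset).

```python
import numpy as np
import pandas as pd

# Hypothetical toy data with one missing age (not the music dataset)
toy = pd.DataFrame({
    "age": [20, np.nan, 31],
    "gender": [1, 0, 1],
})

# Count missing values per column
null_counts = toy.isnull().sum()
print(null_counts)

# Option 1: drop every row that contains a missing value
dropped = toy.dropna()

# Option 2: fill the missing age with the median of the known ages
filled = toy.fillna({"age": toy["age"].median()})
print(filled)
```

Dropping rows is simplest but loses data; filling keeps every row at the cost of an estimated value. Which one fits depends on the dataset.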
#Split the dataset for testing and training
x = df.drop(columns=["genre"])
y = df["genre"]
from sklearn.model_selection import train_test_split
x_train,x_test,y_train,y_test = train_test_split(x,y,test_size = 0.2)
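Because `train_test_split` shuffles the rows randomly, each run produces a different split and therefore a slightly different model. A minimal sketch, on made-up data with made-up genre labels, of how the `random_state` parameter makes the split repeatable:

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical toy data with 10 rows (not the music dataset)
toy = pd.DataFrame({
    "age": [20, 23, 25, 26, 29, 30, 31, 33, 35, 37],
    "gender": [1, 1, 1, 0, 0, 1, 0, 1, 0, 1],
    "genre": ["rock", "rock", "rock", "pop", "pop",
              "jazz", "pop", "jazz", "pop", "jazz"],
})

x = toy.drop(columns=["genre"])
y = toy["genre"]

# test_size=0.2 keeps 20% of the rows for testing;
# a fixed random_state gives the same split on every run
x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.2, random_state=42
)

# 8 rows go to training and 2 rows to testing
print(x_train.shape, x_test.shape)
```

The value 42 is arbitrary; any fixed integer works.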
#Okay, so we are good so far. We have checked for null values and split the dataset for training and testing.
#Note: in general, the larger the training dataset, the more accurate the predictions tend to be
#Choose a Machine Learning Algorithm
from sklearn.tree import DecisionTreeClassifier
model = DecisionTreeClassifier()
model.fit(x_train,y_train)
#Now, to predict values, use the command below.
predictions = model.predict([[21,1]])
#[21,1] represents a 21-year-old male (gender: 1 = Male, 0 = Female)
predictions
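`model.predict` accepts several rows at once and returns one prediction per row. The sketch below uses made-up training data and genre labels. It passes the new rows as a DataFrame with the same column names as the training data, since recent scikit-learn versions warn about missing feature names when a model fitted on a DataFrame is given a plain list.

```python
import pandas as pd
from sklearn.tree import DecisionTreeClassifier

# Hypothetical toy data (not the music dataset)
toy = pd.DataFrame({
    "age": [20, 22, 24, 30, 32, 34],
    "gender": [1, 1, 1, 0, 0, 0],
    "genre": ["rock", "rock", "rock", "jazz", "jazz", "jazz"],
})

x = toy[["age", "gender"]]
y = toy["genre"]

model = DecisionTreeClassifier(random_state=0)
model.fit(x, y)

# Two new people: a 21-year-old male and a 33-year-old female
new_people = pd.DataFrame({"age": [21, 33], "gender": [1, 0]})

# One predicted genre per row, in the same order as the input
predictions = model.predict(new_people)
print(predictions)
```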
#To check the accuracy of your ML model:
from sklearn.metrics import accuracy_score
#Predict on the test set, then compare the result with the true labels
predictions = model.predict(x_test)
score = accuracy_score(y_test,predictions)
score
#A score of 1 means the model is 100% accurate on the test set, 0.75 means 75%, and so on.
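To see where a number like 0.75 comes from, here is a small worked example with made-up labels. `accuracy_score` returns the fraction of predictions that match the true labels.

```python
from sklearn.metrics import accuracy_score

# Hypothetical true labels and model predictions for four people
actual = ["rock", "jazz", "rock", "jazz"]
predicted = ["rock", "jazz", "jazz", "jazz"]

# 3 of the 4 predictions match, so the score is 3 / 4
score = accuracy_score(actual, predicted)
print(score)  # 0.75
```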
#Actual script for reference
import pandas as pd
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
df = pd.read_csv("\\Users\\MYPCDELLPC\\Desktop\\Python\\Project\\music\\music.csv")
x = df.drop(columns=["genre"])
y = df["genre"]
x_train,x_test,y_train,y_test = train_test_split(x,y,test_size = 0.2)
model = DecisionTreeClassifier()
model.fit(x_train,y_train)
predictions = model.predict(x_test)
predictions
#calculating accuracy
score = accuracy_score(y_test,predictions)
score
#Note: to predict for one person, pass [[Age,Gender]] to model.predict instead of x_test (and skip the accuracy step)