Hello everyone, my name is Akshay. This is my first Medium post, and I would like to start with one of my favourite machine learning algorithms.
I will try to keep the language as simple as possible so that it helps you better understand how SVM works.
Support Vector Machines (SVMs) are supervised machine learning models capable of performing linear or nonlinear classification, regression, and even outlier detection. SVMs work best for classifying complex but small- or medium-sized datasets.
In Figure 1.1, we can see that 3 lines fit the data (much like Linear Regression). But which of these lines is the best fit?
The solid line is the best fit, because it stays as far as possible from the closest data points of both classes.
The distance between the closest data points on either side of the line (or hyperplane, in higher dimensions) is known as the Margin.
A line can be declared the best fit only when this margin is at its maximum.
The points in orange (the ones closest to the margin lines) are called Support Vectors.
Margin Violation — when any instance (data point) lies inside the margin, or even on the wrong side of the decision boundary.
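To make these ideas concrete, here is a minimal sketch (the toy data points are my own, invented for illustration) that fits a linear SVM with scikit-learn and inspects which points become the support vectors:

```python
import numpy as np
from sklearn.svm import SVC

# Toy 2-D data: two linearly separable clusters (made up for illustration)
X = np.array([[1.0, 2.0], [2.0, 3.0], [3.0, 3.0],
              [6.0, 5.0], [7.0, 8.0], [8.0, 8.0]])
y = np.array([0, 0, 0, 1, 1, 1])

clf = SVC(kernel="linear", C=1.0)
clf.fit(X, y)

# The handful of points closest to the decision boundary
print(clf.support_vectors_)
```

Only the support vectors determine where the boundary sits; moving any other point (without crossing the margin) leaves the model unchanged.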
For efficient SVM models, we need to keep the balance between:
- Keeping the margin as wide as possible.
- Limiting margin violations (as few instances as possible should lie inside the margin).
Now, the question that arises is: how do we keep this balance? We can control it using the C hyperparameter.
Smaller C value -> wider margin, more margin violations.
Larger C value -> narrower margin, fewer margin violations.
Note- If the SVM model is overfitting the data, we can try regularization by reducing the C value.
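A sketch of this trade-off in code (the dataset and the specific C values of 0.01 and 100 are my own choices for illustration):

```python
from sklearn.datasets import make_blobs
from sklearn.svm import LinearSVC

# Toy two-class dataset (placeholder; any 2-D data works here)
X, y = make_blobs(n_samples=100, centers=2, random_state=42)

# Smaller C -> wider margin, more margin violations tolerated
soft_margin = LinearSVC(C=0.01, max_iter=10_000).fit(X, y)

# Larger C -> narrower margin, fewer margin violations tolerated
hard_margin = LinearSVC(C=100, max_iter=10_000).fit(X, y)
```

Plotting the two decision boundaries side by side is a good way to see the margin widening as C shrinks.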
It is not always the case that you will get linearly separable data (real-world data rarely is).
One approach is to add more features, such as POLYNOMIAL features (creating new features by raising existing features to higher powers).
We can use a Pipeline containing a PolynomialFeatures transformer, followed by a StandardScaler and a LinearSVC.
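A sketch of that pipeline (I use sklearn's make_moons as a stand-in dataset, since it is not linearly separable; the degree and C values are illustrative choices):

```python
from sklearn.datasets import make_moons
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler
from sklearn.svm import LinearSVC

# A 2-D dataset that a straight line cannot separate
X, y = make_moons(n_samples=100, noise=0.15, random_state=42)

polynomial_svm_clf = Pipeline([
    ("poly_features", PolynomialFeatures(degree=3)),  # add polynomial features
    ("scaler", StandardScaler()),                     # scale them
    ("svm_clf", LinearSVC(C=10, max_iter=10_000)),    # then fit a linear SVM
])
polynomial_svm_clf.fit(X, y)
```

After the polynomial expansion, the data becomes linearly separable in the new higher-dimensional feature space, so a plain LinearSVC can handle it.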
Note — Adding polynomial features is simple and can work with all sorts of Machine Learning algorithms. However,
- A low polynomial degree cannot deal with complex data.
- A high polynomial degree creates a huge number of features and makes the model slow.
What if I told you that we can add many polynomial features to the dataset without actually adding them?
Yes, this is possible with the help of the KERNEL Trick.
Here is the code for that-
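(The original code embed is not shown here, so this is a sketch of what it likely contains: an SVC with a 3rd-degree polynomial kernel, as described below. The coef0 and C values are my own illustrative choices, and make_moons is a stand-in dataset.)

```python
from sklearn.datasets import make_moons
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = make_moons(n_samples=100, noise=0.15, random_state=42)

# kernel="poly" applies the kernel trick: the model behaves as if
# 3rd-degree polynomial features were added, without creating them
poly_kernel_svm_clf = Pipeline([
    ("scaler", StandardScaler()),
    ("svm_clf", SVC(kernel="poly", degree=3, coef0=1, C=5)),
])
poly_kernel_svm_clf.fit(X, y)
```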
The above code trains an SVM with a 3rd-degree polynomial kernel. If the SVM overfits, reduce the degree, and vice versa.
In this article, we have discussed the basics of Support Vector Machines. All of these models are available in the sklearn package, which is a powerful tool for Machine Learning.
I hope this article helps you learn the basics of SVMs. If you have any questions, please let me know in the comments. All contributions are welcome. ^.^
Thank you for reading!