Building a spam classifier using the NLTK library in Python

Photo by Hannah Wright on Unsplash

Human beings have come a long way when it comes to communication.

There are thousands of languages spoken every single day with many people having a command over several languages.

The ability to communicate with another human being using a different language goes to show how powerful our brains are when it comes to picking up a language.

But can computers be trained to understand our languages as well? Yes, that has already been done, and extensive research is still going on in the field of Natural Language Processing.

Natural Language Processing, or NLP for short, is a sub-field of Artificial Intelligence that deals with how computers can be trained and programmed to understand multiple languages in both written and spoken form. …

A simple guide on how to get started with KNN in Python.

Photo by Florian Schmid on Unsplash

Easy to understand and easy to implement, the K-nearest-neighbors algorithm is one of the most widely used classification algorithms out there.

KNN is a non-parametric algorithm, and the math behind it is simple, which makes it easy to interpret and explain. But that’s not all.

KNN is also famous for:

  • Being robust; for instance, classes don’t have to be linearly separable.
  • Having few parameters to tune to find the best model.
  • Making no assumptions about the underlying distribution of the data.

All of the above points, combined with KNN’s simplicity, make it a good choice when working with classification problems. …
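To make the idea concrete, here is a minimal sketch of KNN classification in plain Python. It is an illustration only, not the code from the article: the function name `knn_predict`, the toy points, and the choice of Euclidean distance with a majority vote are all assumptions made for this example.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Pair each training point's Euclidean distance to the query with its label.
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Keep only the k closest neighbours.
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    # The most common label among those neighbours is the prediction.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Toy, made-up data: two small clusters in 2-D.
points = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.5), (6.0, 6.0), (6.5, 7.0), (7.0, 6.5)]
labels = ["a", "a", "a", "b", "b", "b"]

prediction_near_a = knn_predict(points, labels, (1.8, 1.6), k=3)
prediction_near_b = knn_predict(points, labels, (6.4, 6.2), k=3)
```

The only value to tune in this sketch is `k`, which reflects the "few parameters" point above.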

Careers, Data Science

How to enjoy your work without losing your sanity.

Photo by Clint McKoy on Unsplash

It’s been close to three months since my Data Science internship started.

From getting the chance to work on real-life Data Science projects to dealing with clients, the process has been quite the learning experience, and I firmly believe it’s just getting started.

But most recently, I have been facing what most people would call a “burnout.”

Even though it has been rich in learning, the past month has been grueling, to say the least, and it’s not because Data Science is “hard” or because there is too much to learn.

It’s because understanding a client’s requirements is the single most difficult task out there. …


Introduction to the technology of the “future.”

Photo by Nick Chong on Unsplash

“Blockchain will change the world.” Oh, if I received a dollar for every time I heard that statement.

Everyone is talking about the implications of cryptocurrency and how it will “revolutionize” finance as we know it, but very few people understand HOW it will do it.

The term “Blockchain” sounds a bit “cool,” I must admit, and that’s why everyone and their brother uses it repeatedly whenever talk of the “future” comes up.

Yeah, I know you want to sound like an intellectual in a conversation, and terms like “Blockchain” or “Machine Learning” do make you sound like one. But just knowing these terms isn’t enough. …

Careers, Data Science

The ability to listen clearly and to convey clearly is a must-have.

Photo by Headway on Unsplash

“What is this?”
That’s one of the many remarks I heard when I showed my boss the progress I had made on a project.
“I don’t understand what you have done here.” “This isn’t what I asked of you.”

Oh, how these one-line sentences can fill you up with anxiety.

For a second, you begin to think that all your hard work has been put aside, and instead of appreciation, you’re going to be served a whole platter of criticism.
The project you had been working on tirelessly has failed to live up to the expectations of your higher-ups, and that not only labels you as an “underperformer” but also lets all your time and effort go to waste. …

A detailed introduction to one of the most powerful classification algorithms out there.

Photo by Jeremy Bishop on Unsplash

One of the most popular machine learning algorithms out there, Decision Trees are by far the easiest to understand in how they function, and that’s why they are my go-to choice when dealing with any sort of classification problem.

Unlike Logistic Regression or Support Vector Machines, which require a solid mathematical foundation to be understood, Decision Trees literally mimic the way we humans operate on a daily basis. …
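As a rough illustration of that "human-like" reasoning, a decision tree can be read as a chain of simple questions. The sketch below hard-codes such a chain by hand; the function name, the features, and the thresholds are hypothetical, and a real decision tree algorithm would learn its questions from data rather than have them written out like this.

```python
def should_go_outside(weather, temperature_c):
    """A hand-written decision tree: each `if` plays the role of one node."""
    # Root node: ask about the weather first.
    if weather == "rainy":
        return "stay in"
    # Not rainy: the next node asks about the temperature.
    if temperature_c < 5:
        return "stay in"
    # Leaf reached when it is neither rainy nor very cold.
    return "go out"


decision_rainy = should_go_outside("rainy", 20)
decision_cold = should_go_outside("sunny", 2)
decision_mild = should_go_outside("sunny", 18)
```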

Careers, Data Science, Opinion

Why making a career transition is not the end of the world.

Photo by NASA on Unsplash

I graduated in 2017, worked multiple jobs, visited a foreign country, started my freelancing business, learned a new language, and most recently, dove deep into the world of Data Science.

Six months ago, at the age of 26, I made the transition towards Data Science. I got a lot of criticism for lacking “stability” in my career and was told numerous times that it was a bad decision because:

  1. Companies need “fresh” graduates for junior-level roles.
  2. Companies need experienced individuals for senior-level roles.
  3. You don’t have the right degree.
  4. You have 3 years of experience on your resume which has NOTHING to do with Data Science.

Basic Pandas Functions to help you in your data preprocessing

Photo by Sid Balachandran on Unsplash

If you’re working within Data Science, you must be familiar with Pandas.

Specifically designed to carry out data preprocessing tasks, Pandas has a ton of functionality that can make managing, cleaning, visualizing, and retrieving data extremely easy. And as anyone would know, a large chunk of a data scientist’s time goes into getting the data into a clean and understandable format for machine learning.

In this article, I will be going over the basic Pandas functions that have made my life as a Data Science intern a whole lot easier.
For this article, I will be using the Titanic dataset from Kaggle. …
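As a small taste of what such preprocessing can look like, here is a hedged sketch using a few common Pandas calls. The rows below are made up for illustration and are not real Titanic records; the column names `Pclass`, `Age`, and `Survived` are assumed to follow the Kaggle dataset, and these are not necessarily the functions the article goes on to cover.

```python
import pandas as pd

# Tiny, made-up frame that borrows a few Titanic-style column names.
df = pd.DataFrame(
    {
        "Pclass": [1, 3, 3, 2],
        "Age": [38.0, None, 26.0, 35.0],
        "Survived": [1, 0, 1, 0],
    }
)

# Peek at the first rows.
first_rows = df.head(2)

# Count missing values per column.
missing_per_column = df.isnull().sum()

# Fill missing ages with the median age (one common, simple choice).
df["Age"] = df["Age"].fillna(df["Age"].median())

# Average survival per passenger class.
survival_rate_by_class = df.groupby("Pclass")["Survived"].mean()
```

With the real dataset, the frame would typically come from `pd.read_csv` on the downloaded file instead of being typed in by hand.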

Data Science

Significance of Price Elasticity and how it is used to find the optimum price point.

Photo by Markus Spiske on Unsplash

As a data science intern, I have come to the realization that the amount of value you provide is directly proportional to the price tag you get.

The number of skills you possess only matters if those skills can translate to added value for your customer in the form of increased sales or decreased costs. Simple.
The more value you provide, the more valuable you are as a Data Scientist.

In today’s article, I will be going over a significant topic that I have come across repeatedly during my time as an intern, and that topic is Price Elasticity.

What is “Price Elasticity”?

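Price Elasticity tells us how sensitive sales of a particular product are to a change in its price. It is commonly measured as the percentage change in quantity sold divided by the percentage change in price.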
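A quick worked example may help here. The sketch below assumes the common percentage-change definition of elasticity (percentage change in quantity sold divided by percentage change in price); the function name and the numbers are made up for illustration and do not come from the article.

```python
def price_elasticity(old_price, new_price, old_quantity, new_quantity):
    """Percentage change in quantity divided by percentage change in price."""
    pct_change_quantity = (new_quantity - old_quantity) / old_quantity
    pct_change_price = (new_price - old_price) / old_price
    if pct_change_price == 0:
        raise ValueError("price did not change, so elasticity is undefined")
    return pct_change_quantity / pct_change_price


# Hypothetical numbers: the price rises from 10 to 11 (+10%)
# and units sold fall from 100 to 80 (-20%).
elasticity = price_elasticity(10, 11, 100, 80)
```

Under these assumed numbers the result is about -2, meaning sales fall by roughly 2% for every 1% increase in price.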

Data Science

Time Series data is one of the fastest-growing types of data out there, which is why it is imperative to have a good understanding of it.

Photo by Markus Spiske on Unsplash

Ever since I got my first internship at a Data Science startup, the learning curve has been steep and the set of problems I have faced, diverse. From preprocessing retail data to understanding forecasting, it has been quite the journey, and it is just getting started.
Recently, I have had the chance to work on “Time Series” analysis, which, to put it simply, captures trends and studies the behavior of data over time. I would like to share some important concepts and terminology I have learned. For this article, I will be using a dummy sales dataset covering two years. …
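One simple way to see a trend in such data is a moving average, which smooths out short-term ups and downs. The sketch below is only an illustration under that assumption: the function name and the monthly figures are made up, and the article may well use a different technique or library.

```python
def moving_average(values, window):
    """Return the mean of each consecutive `window`-sized slice of `values`."""
    if window <= 0 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    return [
        sum(values[i : i + window]) / window
        for i in range(len(values) - window + 1)
    ]


# Hypothetical monthly sales figures, not taken from the article's dataset.
monthly_sales = [10, 12, 14, 16, 18, 20]
trend = moving_average(monthly_sales, window=3)
```

A larger `window` gives a smoother line but reacts more slowly to recent changes.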



Machine Learning enthusiast | Student of the field | Follow me on:
