The one with the generators!

Hey! Welcome back! I wanted to show you something very cool. It’s called a generator. Basically, a generator function allows you to declare a function that behaves like an iterator, i.e. it can be used in a for loop.

I have introduced you to generators before in the article about yield functions. Today we will look at another way to improve our code: using generators in place of our list comprehensions.

The performance improvement from the use of generators is the result of the lazy (on demand) generation of values, which translates to lower memory usage. Furthermore, we do not…
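To make the memory point concrete, here is a minimal sketch comparing a list comprehension with the equivalent generator expression. The variable names and the size of the range are my own choices for illustration, and sys.getsizeof only measures the container object itself, so treat the numbers as a rough indication.

```python
import sys

numbers = range(100_000)

# List comprehension: every square is computed and stored right away.
squares_list = [n * n for n in numbers]

# Generator expression: nothing is computed until a value is requested.
squares_gen = (n * n for n in numbers)

list_size = sys.getsizeof(squares_list)
gen_size = sys.getsizeof(squares_gen)
print(f"list: {list_size} bytes, generator: {gen_size} bytes")

# The generator can still feed anything that consumes an iterable,
# but it can only be consumed once.
total = sum(squares_gen)
```

The generator stays small no matter how long the range is, because it only keeps track of where it is in the sequence.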

The one with cash

Welcome back! Today we’re gonna talk about a decorator that can help you speed up processes or functions in Python.

Usually, when we create a function, we rely on raw computational power to complete the process we ask of it. Whether it’s a mathematical process or a classification process, we normally just call the function as many times as needed, even though it might output the same result every time.

Here I will show you an easier way to speed up those processes and also help your computer out! We will use the functools library…
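As a rough sketch of the idea, and assuming the decorator in question is functools.lru_cache (the excerpt above only names the functools library), caching a recursive function could look like this. The fib function is a hypothetical example of mine.

```python
from functools import lru_cache


@lru_cache(maxsize=None)  # remember every result; no eviction
def fib(n):
    """Return the n-th Fibonacci number, reusing cached results."""
    return n if n < 2 else fib(n - 1) + fib(n - 2)


result = fib(80)
info = fib.cache_info()  # hits, misses, maxsize, currsize
print(result, info)
```

Without the cache, the naive recursion would repeat the same calls over and over. With it, each value of n is computed only once and later calls are answered from memory.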

The one about the lazy formatting!

Hey! Welcome back! Here I will show you a cool Jupyter Notebook and JupyterLab extension that will save you loads of time when you want to make your code look presentable.

Ok, here’s the setup. You are writing some code for a job application or for your job, people are going to look at it, and you want to make sure it looks presentable! Well, there’s an awesome extension for it!

I know it can take a lot of time and effort to make your code look presentable, but not anymore. nb_black is an easy-to-use extension that works…

The one about the try and except!

Hey! Welcome back! So, today we will be looking at a really cool thing that you may not know Python can do, and I believe it can be super helpful in your next coding project.

Today, we will check out the try: and except: statements. If you have done some web scraping in Python you will probably be familiar with them. Here we will explore other possible applications.

When it comes to try: and except: you can think of them as logic gates, similar to an if but in this case, what you…
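Here is a minimal sketch of that "gate" idea: the code under try: runs first, and the except: branch only takes over if a specific error is raised. The to_float helper and the sample data are hypothetical, made up for this illustration.

```python
def to_float(text, default=None):
    """Convert text to a float, falling back to default if it cannot be parsed."""
    try:
        return float(text)
    except ValueError:
        # Only parsing errors are handled here; any other error still surfaces.
        return default


raw_prices = ["19.99", "n/a", "42", ""]
prices = [to_float(p) for p in raw_prices]
print(prices)
```

Catching a specific exception such as ValueError, rather than using a bare except:, keeps unrelated bugs from being silently swallowed.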

Helping with the curse of dimensionality part 2.


So, in my previous article we saw a couple of dimension reduction techniques. In this part we will look at some other, more complicated forms of feature selection. The techniques that we will use here will be:

  • Random Forest
  • Backward Feature Elimination
  • Forward Feature Selection

I will be working with the same dataset as before. Let’s get started!

Random Forest

Random Forest is a very popular algorithm for feature selection. The algorithm comes with a built-in feature importance measure, so there is no need to rack your brain coding your own.

Random Forest can…

Helping with the curse of dimensionality.


So, we looked at what the curse of dimensionality is; now let’s see some techniques to mitigate it. There are different methods to deal with it, and in this blog we’ll take a look at a few.

In this part we will look at feature selection, which helps us by keeping only the most relevant variables from the original dataset. The techniques that we will use here will be:

  • High Correlation filter
  • Low Variance Filter
  • Missing Value Ratio

The dataset that I will be using is the Titanic dataset. Ok, let's get started.

Missing Value Ratio
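As a rough sketch of the idea in plain Python: compute the fraction of missing values per column and drop the columns that go over a chosen threshold. The rows below are made-up stand-ins (not the actual Titanic data), and the 0.5 threshold and the missing_ratio helper are assumptions of mine.

```python
# Toy records; None marks a missing value.
rows = [
    {"age": 22, "cabin": None, "fare": 7.25},
    {"age": None, "cabin": None, "fare": 71.28},
    {"age": 26, "cabin": "C85", "fare": 7.92},
    {"age": 35, "cabin": None, "fare": 53.10},
]


def missing_ratio(rows, column):
    """Return the fraction of rows where the column is missing."""
    missing = sum(1 for row in rows if row[column] is None)
    return missing / len(rows)


THRESHOLD = 0.5  # assumed cut-off; tune it for your own data

ratios = {col: missing_ratio(rows, col) for col in rows[0]}
kept = [col for col, ratio in ratios.items() if ratio <= THRESHOLD]
print(ratios, kept)
```

With a real dataset you would most likely do the same thing with a dataframe library, but the logic stays the same.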


What’s the big deal?


So, you’re an emerging Data Scientist or you’re dabbling in data analytics and you hear the “Curse of Dimensionality” mentioned a lot. Well, maybe I can help to clear it up!

The curse of dimensionality refers to various phenomena that arise when analyzing and organizing data in high-dimensional spaces that do not occur in low-dimensional settings such as the three-dimensional physical space of everyday experience. The expression was coined by Richard E. Bellman when considering problems in dynamic programming.

The thing about the curse of dimensionality is that when the features or dimensions increase, the volume…
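One commonly cited way to see how volume behaves as dimensions grow (my own choice of illustration, not necessarily the one this article goes on to use) is to compare a unit hypercube with the largest ball that fits inside it. The function name below is hypothetical; the formula is the standard volume of a d-dimensional ball.

```python
import math


def inscribed_ball_fraction(d):
    """Fraction of a unit hypercube's volume taken up by its inscribed ball."""
    radius = 0.5  # the ball touches every face of the unit cube
    # Volume of a d-ball: pi^(d/2) / Gamma(d/2 + 1) * r^d; the cube's volume is 1.
    return math.pi ** (d / 2) / math.gamma(d / 2 + 1) * radius ** d


for d in (2, 3, 10, 20):
    print(d, inscribed_ball_fraction(d))
```

The fraction shrinks quickly as d grows, which means most of the cube's volume ends up in its corners, far from the center.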

A way to create generators or to use it for infinite numbers.

A useful trick when writing functions, or when you just don’t want to build a computationally heavy object up front, is to use the yield keyword to produce a generator that returns each item on demand.

A perfect example to illustrate how yield works is the Sieve of Eratosthenes. The sieve works by taking a number and removing all the numbers ahead of it that are divisible by said number. …
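A minimal sketch of that description, written as a recursive generator over an endless stream of integers. This is my own filter-based version that follows the description above; it is not tuned for speed, and the nesting gets deeper with every prime, so it is only meant for small demonstrations.

```python
from itertools import count, islice


def sieve(numbers):
    """Lazily yield primes from an increasing stream of integers starting at 2."""
    prime = next(numbers)  # the first number that survived all filters is prime
    yield prime
    # Remove every later number divisible by this prime, then repeat on the rest.
    yield from sieve(n for n in numbers if n % prime != 0)


# count(2) never ends, so take only as many primes as needed.
first_primes = list(islice(sieve(count(2)), 10))
print(first_primes)  # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```

Because everything is generated on demand, the infinite input is never a problem: only the values that are actually requested get computed.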

A quick guide to plot your latitude and longitudes.

Today I want to bring you a quick tutorial for when you have some geodata and don’t know what to do with it or what it looks like! You can check out Folium’s GitHub here.

Let’s start by installing folium.

pip install folium

In this case, I will use a dataset that contains the Latitude and Longitude of houses in the Seattle, WA area.

Helping to bring a little clarity to black box models.

Hey again! This time I want to bring you a useful tool that helps bring some clarity to uninterpretable models, aka black box models.

The library we’re talking about is called Lime (you can find Lime’s GitHub here). It is able to explain any black-box classifier with two or more classes. All we require is that the classifier implements a function that takes in raw text or a NumPy array and outputs a probability for each class. Support for scikit-learn classifiers is built-in.

Lime is great for different kinds of classifications…

Ignacio Ruiz

A Data Scientist in the making!
