Posts by Tag (grid view)

Python

Day 6 Algorit.ma : Unsupervised Learning

44 minute read

Day 6: here I share my notes from the in-class notebook. For further examples, check out https://github.com/Saltfarmer/Algoritma-BFLP-DS-Audit/tree/ma...

Day 5 Algorit.ma : Classification Model

27 minute read

Day 5: here I share my notes from the in-class notebook. For further examples, check out https://github.com/Saltfarmer/Algoritma-BFLP-DS-Audit/tree/ma...

Day 4 Algorit.ma : Regression Model

32 minute read

Day 4: here I share my notes from the in-class notebook. For further examples, check out https://github.com/Saltfarmer/Algoritma-BFLP-DS-Audit/tree/ma...

Day 3 Algorit.ma : Practical Statistics

23 minute read

Day 3: here I share my notes from the in-class notebook. For further examples, check out https://github.com/Saltfarmer/Algoritma-BFLP-DS-Audit/tree/ma...

Day 1 Algorit.ma : Python For Data Analysis

38 minute read

Day 1: here I share my notes from the in-class notebook. For further examples, check out https://github.com/Saltfarmer/Algoritma-BFLP-DS-Audit/tree/ma...

Preclass Bootcamp Algorit.ma BFLP Audit

3 minute read

This post (which I hope to turn into a series) will be my personal notes and documentation of the data science bootcamp sessions from Algorit.ma. Please note ...

How to get your personal Dota2 Data

3 minute read

This is going to be a short post. This is really interesting for me personally. As a Data Scientist and avid Dota 2 player, what could be better than doing d...

Pandas Exercise 7 : Visualization Bonus

1 minute read

A continuation of my practice with the Pandas exercises from guisapmora. This one is interesting because it covers basic visualization exercises in Matplotlib.

Pandas Alternatives : Modin

3 minute read

The Pandas library has become the “must-install” library for data manipulation in Python and is widely used by data scientists and analysts. Pandas provides a...

Pandas Exercise 7 : Visualization

3 minute read

A continuation of my practice with the Pandas exercises from guisapmora. This one is interesting because it covers basic visualization exercises in Matplotlib.

Pandas Exercise 1 : Knowing your data

10 minute read

In this exercise we are going to use a dataset from the internet to make things easier. You can download the exercise from here. I was just bored and keep tryi...

Python 3.11

3 minute read

The latest version of Python was released last week, on 24 October 2022. The 3.11 changelog consists of a lot of bug fixes, improvements, and additional...

Web Scraping with BeautifulSoup4

8 minute read

The surge of available data we can find on the internet is insane. With this surge, data analytics has become a hugely important part of the way organization...

Handling Imbalance Data

3 minute read

Data imbalance usually reflects an unequal distribution of classes within a dataset. In class imbalance, one trains on a dataset that contains a large number...

Python Data Visualization Guide

4 minute read

Creating a visualization may not be as easy as it looks. Some visualizations may look cool but fail to convey what they mean. Imagine after a hard and l...

Automated EDA Library for Python

2 minute read

After reviewing my knowledge of exploratory data analysis (EDA) here, I wondered whether there is a new way to understand your dataset more easil...

K-Means Exercise

7 minute read

Exercise from Jose Portilla's Python for Data Science Bootcamp.

Categorical Encoding 2

11 minute read

Another reference and shared post from https://www.mygreatlearning.com/blog/label-encoding-in-python/

EDA Regression Exercise

5 minute read

Today I will try exploratory data analysis and regression with insurance data from Kaggle. Let’s take a look.

Linear Regression Exercise

7 minute read

This week I will dedicate my time to solving all the exercises from Jose Portilla's Python for Data Science Bootcamp.

Feature Selection

15 minute read

Most data nowadays is massive. Datasets often come with many irrelevant features that do not contribute much to the accuracy of your predictive mode...

Categorical Encoding

5 minute read

In practice, real datasets contain categorical values. So what is the difference between an ordinary string value and a categorical value? Well, some...

Word Vector

7 minute read

Machine learning models can’t process non-numeric values. So how do we process image or text data? Before you train on your image or text data, you need to transform th...

Numerical Transformer

1 minute read

After rescaling or normalizing the data, there is another way to change its distribution: transformation. There are 3 different ways to transform...

Feature Scaling

5 minute read

Numerical data is already digestible by machine learning models and mathematical formulas. But that doesn’t mean it no longer needs feature engineering or preproces...

Handling Missing Values

11 minute read

Missing values in your data are pretty common in real life. In fact, the chance that at least one data point is missing increases as the data set size increase...

Stocks Finance Exercise

7 minute read

In this data project we will focus on exploratory data analysis of stock prices. Keep in mind, this project is just meant to practice your visualization and ...

Data Capstone Exercise

8 minute read

For this capstone project we will be analyzing some 911 call data from Kaggle. The data contains the following fields:

Python Crash Course Exercise 6

2 minute read

Today I will complete the data visualization with Pandas exercise. If you want to solve it all by yourself, you can download the notebook file here and sample ...

Python Crash Course Exercise 5

2 minute read

Today I will complete the Matplotlib exercise. If you want to solve it all by yourself, you can download the notebook file here.

Python Crash Course Exercise 4

3 minute read

Today I will complete the Pandas exercise using SF Salaries. If you want to solve it all by yourself, you can download the notebook file here and the dataset here.

Python Crash Course Exercise 3

5 minute read

Today I will complete the Pandas exercise using Ecommerce Purchase. If you want to solve it all by yourself, you can download the notebook file here and the dataset h...

Python Crash Course Exercise 2

4 minute read

Today I will complete the NumPy exercise. If you want to solve it all by yourself, you can download the notebook file here.

Python Crash Course Exercise

2 minute read

This week I will dedicate my time to solving all the exercises from Jose Portilla's Python for Data Science Bootcamp. Today I will complete some exercises from Pytho...

Interactive Visualization Plotly

1 minute read

The plotly Python library is an interactive, open-source plotting library that supports over 40 unique chart types covering a wide range of statistical, fina...

Better Visualization : Seaborn 2

2 minute read

Multi-plot grids are general types of plots that allow you to map plot types to rows and columns of a grid; this helps you create similar plots separated by fe...

Better Visualization : Seaborn

2 minute read

After discussing basic visualization with Matplotlib, let’s now try another, more attractive visualization library called Seaborn. Seaborn is a Python dat...

Python Matplotlib

3 minute read

Matplotlib is the most basic data visualization library for Python. It was created to try to replicate MATLAB’s (another programming language) plotting capab...

Python Pandas 2

5 minute read

Continuing from the last post, let’s look at more of the features in pandas.

Python Pandas

6 minute read

Next, let’s discuss Pandas. Before the Pandas library was introduced, Python was used mainly for preparing and munging data. After ...

Python Numpy

4 minute read

Let’s continue with NumPy. NumPy is a Python library used for working with arrays. It also has functions for working in the domain of linear algebra and matrices...

Python Basic

11 minute read

Before we dive into Pandas, NumPy and Matplotlib, let’s first remind ourselves of the basics of Python. I won’t cover all the Python stuff because it would take too much t...

Libraries in Python for Data Science

4 minute read

First of all, why use Python for Data Science? According to recent surveys by KDnuggets, Python is the preferred programming language for data scientists. ...

0010 Learning Machine Learning : Matplotlib

1 minute read

Another midnight post -_-. What should I talk about??? Since ML would be no fun without illustrations ( ͡° ͜ʖ ͡°), I might as well talk about illustrations that can...

0001 Learning Machine Learning : Pandas

3 minute read

A midnight post, folks, while I have nothing better to do. On reflection, I might as well continue with ML like last time ( ͡° ͜ʖ ͡°). Pandas is a kind of Python library that is usually...

Machine Learning

Python 3.11

3 minute read

The latest version of Python was released last week, on 24 October 2022. The 3.11 changelog consists of a lot of bug fixes, improvements, and additional...

Natural Language Processing with NLTK

4 minute read

Natural Language Processing (NLP) is broadly defined as the automatic manipulation of natural language, like speech and text. Natural language is primarily ...

Dimensionality Reduction

2 minute read

The performance of machine learning algorithms can degrade with too many input variables. Having a large number of dimensions in the feature space can mean t...

K Means

2 minute read

Clustering is a technique widely used to find groups of observations (clusters) that share similar characteristics. This process is not driven by a specific ...

K-Means Exercise

7 minute read

Exercise from Jose Portilla's Python for Data Science Bootcamp.

Support Vector Machine

5 minute read

The support vector machine is a generalization of a classifier called maximal margin classifier. The maximal margin classifier is simple, but it cannot be ap...

Random Forest

2 minute read

Bagging is an ensemble algorithm that fits multiple models on different subsets of a training dataset, then combines the predictions from all models. Random ...

Decision Tree

6 minute read

Decision trees are a very popular machine learning algorithm. They are popular for a variety of reasons, their interpretability probably being their most i...

K Nearest Neighbour

6 minute read

K Nearest Neighbour (KNN) works by choosing the best number $k$ of neighbours. A neighbour, by definition, is a person living near or next door to the speaker or person ...

Logistic Regression

4 minute read

Logistic regression is a classification algorithm used to assign observations to a discrete set of classes. It is a statistical machine learning algorithm th...

EDA Regression Exercise

5 minute read

Today I will try exploratory data analysis and regression with insurance data from Kaggle. Let’s take a look.

Linear Regression

5 minute read

Linear regression is useful for finding the relationship between two continuous variables. Linear regression is a linear model, a model that creates a linear rel...

Interactive Visualization Plotly

1 minute read

The plotly Python library is an interactive, open-source plotting library that supports over 40 unique chart types covering a wide range of statistical, fina...

Better Visualization : Seaborn 2

2 minute read

Multi-plot grids are general types of plots that allow you to map plot types to rows and columns of a grid; this helps you create similar plots separated by fe...

Better Visualization : Seaborn

2 minute read

After discussing basic visualization with Matplotlib, let’s now try another, more attractive visualization library called Seaborn. Seaborn is a Python dat...

Python Matplotlib

3 minute read

Matplotlib is the most basic data visualization library for Python. It was created to try to replicate MATLAB’s (another programming language) plotting capab...

Python Pandas 2

5 minute read

Continuing from the last post, let’s look at more of the features in pandas.

Python Pandas

6 minute read

Next, let’s discuss Pandas. Before the Pandas library was introduced, Python was used mainly for preparing and munging data. After ...

Python Numpy

4 minute read

Let’s continue with NumPy. NumPy is a Python library used for working with arrays. It also has functions for working in the domain of linear algebra and matrices...

Python Basic

11 minute read

Before we dive into Pandas, NumPy and Matplotlib, let’s first remind ourselves of the basics of Python. I won’t cover all the Python stuff because it would take too much t...

Starting Material

3 minute read

OK, now after a one-day break and dilly-dallying over the theory of Machine Learning and Evaluation Metrics (I kind of regret covering the theory first because it t...

Evaluation Metric Special ROC-AUC

1 minute read

A receiver operating characteristic curve, or ROC curve, is a graphical plot that illustrates the diagnostic ability of a binary classifier system as its dis...

Evaluation Metric Clustering

9 minute read

Clustering is an important part of the machine learning pipeline for business or scientific enterprises utilizing data science. As the name suggests, it help...

Evaluation Metric Regression

3 minute read

A regression task is the prediction of the state of an outcome variable at a particular timepoint with the help of other correlated independent variables. The ...

Evaluation Metric Binary Classification

6 minute read

Classifier performance depends greatly on the characteristics of the data to be classified. There is no single classifier that works best on all given proble...

Testing and Validation

2 minute read

The way to know how good your model is, is to actually try it on new or different cases. But if your model is not doing as well as expected, surely you will hesitate ...

Challenges in Machine Learning

7 minute read

At some level, running machine learning systems at scale is challenging for several reasons. The systems issues are often misunderstood. Although best prac...

Types of Machine Learning

6 minute read

There are many different types of machine learning systems. At a high level, machine learning is simply the study of teaching a computer program or algorithm ...

Libraries in Python for Data Science

4 minute read

First of all, why use Python for Data Science? According to recent surveys by KDnuggets, Python is the preferred programming language for data scientists. ...

Interesting and Open Public Data Aggregator

1 minute read

There are hundreds of zettabytes of data available on the internet, but most of it is not publicly accessible. Today I will share where to find open public data...

0010 Learning Machine Learning : Matplotlib

1 minute read

Another midnight post -_-. What should I talk about??? Since ML would be no fun without illustrations ( ͡° ͜ʖ ͡°), I might as well talk about illustrations that can...

0001 Learning Machine Learning : Pandas

3 minute read

A midnight post, folks, while I have nothing better to do. On reflection, I might as well continue with ML like last time ( ͡° ͜ʖ ͡°). Pandas is a kind of Python library that is usually...

Pandas

Day 5 Algorit.ma : Classification Model

27 minute read

Day 5: here I share my notes from the in-class notebook. For further examples, check out https://github.com/Saltfarmer/Algoritma-BFLP-DS-Audit/tree/ma...

Day 4 Algorit.ma : Regression Model

32 minute read

Day 4: here I share my notes from the in-class notebook. For further examples, check out https://github.com/Saltfarmer/Algoritma-BFLP-DS-Audit/tree/ma...

Day 3 Algorit.ma : Practical Statistics

23 minute read

Day 3: here I share my notes from the in-class notebook. For further examples, check out https://github.com/Saltfarmer/Algoritma-BFLP-DS-Audit/tree/ma...

Day 1 Algorit.ma : Python For Data Analysis

38 minute read

Day 1: here I share my notes from the in-class notebook. For further examples, check out https://github.com/Saltfarmer/Algoritma-BFLP-DS-Audit/tree/ma...

Pandas Exercise 7 : Visualization Bonus

1 minute read

A continuation of my practice with the Pandas exercises from guisapmora. This one is interesting because it covers basic visualization exercises in Matplotlib.

Pandas Alternatives : Modin

3 minute read

The Pandas library has become the “must-install” library for data manipulation in Python and is widely used by data scientists and analysts. Pandas provides a...

Pandas Exercise 7 : Visualization

3 minute read

A continuation of my practice with the Pandas exercises from guisapmora. This one is interesting because it covers basic visualization exercises in Matplotlib.

Pandas Exercise 1 : Knowing your data

10 minute read

In this exercise we are going to use a dataset from the internet to make things easier. You can download the exercise from here. I was just bored and keep tryi...

Categorical Encoding 2

11 minute read

Another reference and shared post from https://www.mygreatlearning.com/blog/label-encoding-in-python/

Feature Scaling

5 minute read

Numerical data is already digestible by machine learning models and mathematical formulas. But that doesn’t mean it no longer needs feature engineering or preproces...

Handling Missing Values

11 minute read

Missing values in your data are pretty common in real life. In fact, the chance that at least one data point is missing increases as the data set size increase...

Stocks Finance Exercise

7 minute read

In this data project we will focus on exploratory data analysis of stock prices. Keep in mind, this project is just meant to practice your visualization and ...

Data Capstone Exercise

8 minute read

For this capstone project we will be analyzing some 911 call data from Kaggle. The data contains the following fields:

Python Crash Course Exercise 6

2 minute read

Today I will complete the data visualization with Pandas exercise. If you want to solve it all by yourself, you can download the notebook file here and sample ...

Python Crash Course Exercise 4

3 minute read

Today I will complete the Pandas exercise using SF Salaries. If you want to solve it all by yourself, you can download the notebook file here and the dataset here.

Python Crash Course Exercise 3

5 minute read

Today I will complete the Pandas exercise using Ecommerce Purchase. If you want to solve it all by yourself, you can download the notebook file here and the dataset h...

Python Pandas 2

5 minute read

Continuing from the last post, let’s look at more of the features in pandas.

Python Pandas

6 minute read

Next, let’s discuss Pandas. Before the Pandas library was introduced, Python was used mainly for preparing and munging data. After ...

Data Types

3 minute read

There are many different kinds of data types. In this blog, I will explain these data types based on the most common understanding in Data Science. Specifically i...

0001 Learning Machine Learning : Pandas

3 minute read

A midnight post, folks, while I have nothing better to do. On reflection, I might as well continue with ML like last time ( ͡° ͜ʖ ͡°). Pandas is a kind of Python library that is usually...

Exercise

Pandas Exercise 7 : Visualization Bonus

1 minute read

A continuation of my practice with the Pandas exercises from guisapmora. This one is interesting because it covers basic visualization exercises in Matplotlib.

Pandas Exercise 7 : Visualization

3 minute read

A continuation of my practice with the Pandas exercises from guisapmora. This one is interesting because it covers basic visualization exercises in Matplotlib.

Pandas Exercise 1 : Knowing your data

10 minute read

In this exercise we are going to use a dataset from the internet to make things easier. You can download the exercise from here. I was just bored and keep tryi...

MNIST with Multilayer Perceptron

10 minute read

In this post we will build a Multi Layer Perceptron model to try to classify handwritten digits using TensorFlow (a very famous example).

K-Means Exercise

7 minute read

Exercise from Jose Portilla's Python for Data Science Bootcamp.

EDA Regression Exercise

5 minute read

Today I will try exploratory data analysis and regression with insurance data from Kaggle. Let’s take a look.

Linear Regression Exercise

7 minute read

This week I will dedicate my time to solving all the exercises from Jose Portilla's Python for Data Science Bootcamp.

Stocks Finance Exercise

7 minute read

In this data project we will focus on exploratory data analysis of stock prices. Keep in mind, this project is just meant to practice your visualization and ...

Data Capstone Exercise

8 minute read

For this capstone project we will be analyzing some 911 call data from Kaggle. The data contains the following fields:

Python Crash Course Exercise 6

2 minute read

Today I will complete the data visualization with Pandas exercise. If you want to solve it all by yourself, you can download the notebook file here and sample ...

Python Crash Course Exercise 5

2 minute read

Today I will complete the Matplotlib exercise. If you want to solve it all by yourself, you can download the notebook file here.

Python Crash Course Exercise 4

3 minute read

Today I will complete the Pandas exercise using SF Salaries. If you want to solve it all by yourself, you can download the notebook file here and the dataset here.

Python Crash Course Exercise 3

5 minute read

Today I will complete the Pandas exercise using Ecommerce Purchase. If you want to solve it all by yourself, you can download the notebook file here and the dataset h...

Python Crash Course Exercise 2

4 minute read

Today I will complete the NumPy exercise. If you want to solve it all by yourself, you can download the notebook file here.

Python Crash Course Exercise

2 minute read

This week I will dedicate my time to solving all the exercises from Jose Portilla's Python for Data Science Bootcamp. Today I will complete some exercises from Pytho...

Sklearn

Natural Language Processing with NLTK

4 minute read

Natural Language Processing (NLP) is broadly defined as the automatic manipulation of natural language, like speech and text. Natural language is primarily ...

Dimensionality Reduction

2 minute read

The performance of machine learning algorithms can degrade with too many input variables. Having a large number of dimensions in the feature space can mean t...

K Means

2 minute read

Clustering is a technique widely used to find groups of observations (clusters) that share similar characteristics. This process is not driven by a specific ...

Support Vector Machine

5 minute read

The support vector machine is a generalization of a classifier called maximal margin classifier. The maximal margin classifier is simple, but it cannot be ap...

Random Forest

2 minute read

Bagging is an ensemble algorithm that fits multiple models on different subsets of a training dataset, then combines the predictions from all models. Random ...

Decision Tree

6 minute read

Decision trees are a very popular machine learning algorithm. They are popular for a variety of reasons, their interpretability probably being their most i...

Categorical Encoding 2

11 minute read

Another reference and shared post from https://www.mygreatlearning.com/blog/label-encoding-in-python/

K Nearest Neighbour

6 minute read

K Nearest Neighbour (KNN) works by choosing the best number $k$ of neighbours. A neighbour, by definition, is a person living near or next door to the speaker or person ...

Logistic Regression

4 minute read

Logistic regression is a classification algorithm used to assign observations to a discrete set of classes. It is a statistical machine learning algorithm th...

Linear Regression

5 minute read

Linear regression is useful for finding the relationship between two continuous variables. Linear regression is a linear model, a model that creates a linear rel...

Feature Selection

15 minute read

Most data nowadays is massive. Datasets often come with many irrelevant features that do not contribute much to the accuracy of your predictive mode...

Categorical Encoding

5 minute read

In practice, real datasets contain categorical values. So what is the difference between an ordinary string value and a categorical value? Well, some...

Word Vector

7 minute read

Machine learning models can’t process non-numeric values. So how do we process image or text data? Before you train on your image or text data, you need to transform th...

Numerical Transformer

1 minute read

After rescaling or normalizing the data, there is another way to change its distribution: transformation. There are 3 different ways to transform...

Feature Scaling

5 minute read

Numerical data is already digestible by machine learning models and mathematical formulas. But that doesn’t mean it no longer needs feature engineering or preproces...

Handling Missing Values

11 minute read

Missing values in your data are pretty common in real life. In fact, the chance that at least one data point is missing increases as the data set size increase...

Classification

Day 5 Algorit.ma : Classification Model

27 minute read

Day 5: here I share my notes from the in-class notebook. For further examples, check out https://github.com/Saltfarmer/Algoritma-BFLP-DS-Audit/tree/ma...

Natural Language Processing with NLTK

4 minute read

Natural Language Processing (NLP) is broadly defined as the automatic manipulation of natural language, like speech and text. Natural language is primarily ...

NLP Project

7 minute read

Welcome to the NLP Project for this section of the course. In this NLP project you will be attempting to classify Yelp Reviews into 1 star or 5 star categori...

Decision Tree

6 minute read

Decision trees are a very popular machine learning algorithm. They are popular for a variety of reasons, their interpretability probably being their most i...

K Nearest Neighbour

6 minute read

K Nearest Neighbour (KNN) works by choosing the best number $k$ of neighbours. A neighbour, by definition, is a person living near or next door to the speaker or person ...

Logistic Regression

4 minute read

Logistic regression is a classification algorithm used to assign observations to a discrete set of classes. It is a statistical machine learning algorithm th...

Evaluation Metric Special ROC-AUC

1 minute read

A receiver operating characteristic curve, or ROC curve, is a graphical plot that illustrates the diagnostic ability of a binary classifier system as its dis...

Evaluation Metric Binary Classification

6 minute read

Classifier performance depends greatly on the characteristics of the data to be classified. There is no single classifier that works best on all given proble...

EDA

Python Data Visualization Guide

4 minute read

Creating a visualization may not be as easy as it looks. Some visualizations may look cool but fail to convey what they mean. Imagine after a hard and l...

Automated EDA Library for Python

2 minute read

After reviewing my knowledge of exploratory data analysis (EDA) here, I wondered whether there is a new way to understand your dataset more easil...

NLP Project

7 minute read

Welcome to the NLP Project for this section of the course. In this NLP project you will be attempting to classify Yelp Reviews into 1 star or 5 star categori...

Movie Recommender System Analysis

7 minute read

Welcome to the code notebook for Recommender Systems with Python. In this lecture we will develop basic recommendation systems using Python and pandas. There...

K-Means Exercise

7 minute read

Exercise from Jose Portilla's Python for Data Science Bootcamp.

EDA Regression Exercise

5 minute read

Today I will try exploratory data analysis and regression with insurance data from Kaggle. Let’s take a look.

Stocks Finance Exercise

7 minute read

In this data project we will focus on exploratory data analysis of stock prices. Keep in mind, this project is just meant to practice your visualization and ...

Data Capstone Exercise

8 minute read

For this capstone project we will be analyzing some 911 call data from Kaggle. The data contains the following fields:

Random

What is ChatGPT

3 minute read

GPT (Generative Pre-trained Transformer) is a type of artificial intelligence model developed by OpenAI that can be used for tasks such as language translat...

How to get your personal Dota2 Data

3 minute read

This is going to be a short post. This is really interesting for me personally. As a Data Scientist and avid Dota 2 player, what could be better than doing d...

LeTAD #3: Next Season DPC

3 minute read

Now let’s talk about the next DPC season and the drama around it. Of course, a lot of things are happening, so to start with, congratulations to Tundra Es...

LeTAD #2: Dota 2 Configuration

7 minute read

For this second post, I am going to show you the configuration of my personal Dota 2 settings. It is going to be a somewhat long explanation ...

What is Data Science

5 minute read

Today I will start my journey by gathering from the internet what Data Science is. So what is Data Science? Why is Data Science so popular be...

Starting New Journey

less than 1 minute read

Short blog post. I hope in the future I can try to be productive. Recently, I have felt bored with the Covid-19 situation. So I hope in the next few days I can ma...

Hello World

5 minute read

Syntax highlighting is a feature that displays source code in different colors and fonts according to the category of terms. This feature facilitates writin...

Matplotlib

Python Crash Course Exercise 6

2 minute read

Today I will complete the data visualization with Pandas exercise. If you want to solve it all by yourself, you can download the notebook file here and sample ...

Python Crash Course Exercise 5

2 minute read

Today I will complete the Matplotlib exercise. If you want to solve it all by yourself, you can download the notebook file here.

Interactive Visualization Plotly

1 minute read

The plotly Python library is an interactive, open-source plotting library that supports over 40 unique chart types covering a wide range of statistical, fina...

Better Visualization : Seaborn 2

2 minute read

Multi-plot grids are general types of plots that allow you to map plot types to rows and columns of a grid; this helps you create similar plots separated by fe...

Better Visualization : Seaborn

2 minute read

After discussing basic visualization with Matplotlib, let’s now try another, more attractive visualization library called Seaborn. Seaborn is a Python dat...

Python Matplotlib

3 minute read

Matplotlib is the most basic data visualization library for Python. It was created to try to replicate MATLAB’s (another programming language) plotting capab...

0010 Learning Machine Learning : Matplotlib

1 minute read

Another midnight post -_-. What should I talk about??? Since ML would be no fun without illustrations ( ͡° ͜ʖ ͡°), I might as well talk about illustrations that can...

Algoritma

Day 6 Algorit.ma : Unsupervised Learning

44 minute read

Day 6: here I share my notes from the in-class notebook. For further examples, check out https://github.com/Saltfarmer/Algoritma-BFLP-DS-Audit/tree/ma...

Day 5 Algorit.ma : Classification Model

27 minute read

Day 5: here I share my notes from the in-class notebook. For further examples, check out https://github.com/Saltfarmer/Algoritma-BFLP-DS-Audit/tree/ma...

Day 4 Algorit.ma : Regression Model

32 minute read

Day 4: here I share my notes from the in-class notebook. For further examples, check out https://github.com/Saltfarmer/Algoritma-BFLP-DS-Audit/tree/ma...

Day 3 Algorit.ma : Practical Statistics

23 minute read

Day 3: here I share my notes from the in-class notebook. For further examples, check out https://github.com/Saltfarmer/Algoritma-BFLP-DS-Audit/tree/ma...

Day 1 Algorit.ma : Python For Data Analysis

38 minute read

Day 1: here I share my notes from the in-class notebook. For further examples, check out https://github.com/Saltfarmer/Algoritma-BFLP-DS-Audit/tree/ma...

Preclass Bootcamp Algorit.ma BFLP Audit

3 minute read

Well, this post (I hope I can make it a series) will be my personal notes and documentation of the data science bootcamp sessions from Algorit.ma. Please note ...

Back to top ↑

Preprocessing

Categorical Encoding 2

11 minute read

Another reference and shared post from https://www.mygreatlearning.com/blog/label-encoding-in-python/

Feature Selection

15 minute read

Most data nowadays is massive. Datasets often come with many irrelevant features that do not contribute much to the accuracy of your predictive mode...

Categorical Encoding

5 minute read

In practice, real datasets contain categorical values. So what is the difference between an ordinary string value and a categorical value? Well, some...

Word Vector

7 minute read

Machine learning can’t process non-numeric values. So how do you process image or text data? Before you train on your image or text data, you need to transform th...

Numerical Transformer

1 minute read

After rescaling or normalizing the data, there is another way to change the distribution of the data: transformation. There are 3 different ways to transform...

Feature Scaling

5 minute read

Numerical data is already digestible by machine learning or mathematical formulas. But that doesn’t mean it no longer needs feature engineering or preproces...

Handling Missing Values

11 minute read

Missing value in your data is pretty common in real life. In fact, the chance that at least one data point is missing increases as the data set size increase...

Back to top ↑

Data Science

Libraries in Python for Data Science

4 minute read

First of all, why use Python for Data Science? According to recent surveys by KDnuggets, Python is the preferred programming language for data scientists. ...

Interesting and Open Public Data Aggregator

1 minute read

There are hundreds of zettabytes of data available on the internet, but most of it is not publicly accessible. Today, I will share where to find open public data...

Data Types

3 minute read

There are many different kinds of data types. In this blog, I will explain these data types based on the most common understanding in Data Science. Specifically i...

Structured and Unstructured Data

3 minute read

One of the simple ways to think about data is whether it is structured or not. Well, first of all, not all data is created equal. Some data is s...

Data Science Tools

6 minute read

After finding a reason and motivation to start learning about Data Science, let’s talk about the tools for practicing Data Science. Machine learning tools m...

What is Data Science

5 minute read

So today I will start my journey by gathering from the internet what Data Science is. So what is Data Science? Why is Data Science so popular be...

Back to top ↑

Clustering

Day 6 Algorit.ma : Unsupervised Learning

44 minute read

Day 6: here I will share my notes from the in-class notebook. For further examples, check out https://github.com/Saltfarmer/Algoritma-BFLP-DS-Audit/tree/ma...

Dimensionality Reduction

2 minute read

The performance of machine learning algorithms can degrade with too many input variables. Having a large number of dimensions in the feature space can mean t...

K Means

2 minute read

Clustering is a technique widely used to find groups of observations (clusters) that share similar characteristics. This process is not driven by a specific ...

K-Means Exercise

7 minute read

Exercise from Jose Portilla Python for Data Science Bootcamp.

Evaluation Metric Clustering

9 minute read

Clustering is an important part of the machine learning pipeline for business or scientific enterprises utilizing data science. As the name suggests, it help...

Back to top ↑

Personal

Last Year Challenge

2 minute read

Well, this post is going to be about my final year in the Netherlands. A bit of drama here and there, but let’s see what happens.

2nd year at the Netherlands

2 minute read

A long break since the previous post. I’ve been busy with my new job (more like training) recently so yeah I might update this blog a little bit next year ...

Farewell The Netherlands

4 minute read

It’s been a while since I created a blog post. Apparently my last post was almost 8 months ago. But now I have created a farewell post to The Netherlands because I am g...

LeTAD #2: Dota 2 Configuration

7 minute read

So for the second post, I am going to show you the configuration of my personal Dota 2 settings. It is going to be a little bit of a long explanation ...

Back to top ↑

Scoring

Evaluation Metric Special ROC-AUC

1 minute read

A receiver operating characteristic curve, or ROC curve, is a graphical plot that illustrates the diagnostic ability of a binary classifier system as its dis...

Evaluation Metric Clustering

9 minute read

Clustering is an important part of the machine learning pipeline for business or scientific enterprises utilizing data science. As the name suggests, it help...

Evaluation Metric Regression

3 minute read

A regression task is the prediction of the state of an outcome variable at a particular timepoint with the help of other correlated independent variables. The ...

Evaluation Metric Binary Classification

6 minute read

Classifier performance depends greatly on the characteristics of the data to be classified. There is no single classifier that works best on all given proble...

Back to top ↑

Regression

Day 4 Algorit.ma : Regression Model

32 minute read

Day 4: here I will share my notes from the in-class notebook. For further examples, check out https://github.com/Saltfarmer/Algoritma-BFLP-DS-Audit/tree/ma...

EDA Regression Exercise

5 minute read

Today, I will try Exploratory Data Analysis and regression with insurance data from Kaggle. Let’s take a look.

Linear Regression

5 minute read

Linear regression is useful for finding the relationship between two continuous variables. Linear regression is a linear model, a model that creates a linear rel...

Evaluation Metric Regression

3 minute read

A regression task is the prediction of the state of an outcome variable at a particular timepoint with the help of other correlated independent variables. The ...

Back to top ↑

Numpy

Numerical Transformer

1 minute read

After rescaling or normalizing the data, there is another way to change the distribution of the data: transformation. There are 3 different ways to transform...

Data Capstone Exercise

8 minute read

For this capstone project we will be analyzing some 911 call data from Kaggle. The data contains the following fields:

Python Crash Course Exercise 2

4 minute read

Today I will be completing the NumPy Exercise. If you want to solve it all by yourself, you can download the notebook files here

Python Numpy

4 minute read

Let’s continue with NumPy. NumPy is a Python library used for working with arrays. It also has functions for working in the domain of linear algebra and matrices...

Back to top ↑

Visualization

Pandas Exercise 7 : Visualization Bonus

1 minute read

A continuation of my practice with the Pandas exercises from guisapmora. This one is interesting because it covers basic visualization exercises in Matplotlib.

Pandas Exercise 7 : Visualization

3 minute read

A continuation of my practice with the Pandas exercises from guisapmora. This one is interesting because it covers basic visualization exercises in Matplotlib.

Python Data Visualization Guide

4 minute read

Creating a visualization may not be as easy as it looks. Some visualizations may look cool but fail to convey what they mean. Imagine after a hard and l...

Back to top ↑

Dota

How to get your personal Dota2 Data

3 minute read

This is going to be a short post. This is really interesting for me personally. As a Data Scientist and avid Dota 2 player, what could be better than doing d...

LeTAD #3: Next Season DPC

3 minute read

Well, now let’s talk about the next season DPC and the drama all around it. Of course, a lot of things are happening, so to start with, congratulations to Tundra Es...

LeTAD #2: Dota 2 Configuration

7 minute read

So for the second post, I am going to show you the configuration of my personal Dota 2 settings. It is going to be a little bit of a long explanation ...

Back to top ↑

Anaconda

Day 1 Algorit.ma : Python For Data Analysis

38 minute read

Day 1: here I will share my notes from the in-class notebook. For further examples, check out https://github.com/Saltfarmer/Algoritma-BFLP-DS-Audit/tree/ma...

Creating a shortcut for Jupyter Notebook

less than 1 minute read

This is a quick post on how to create a shortcut for Jupyter Notebook. In this case, you need to connect your PATH of your Python Conda. Here’s how:

Preclass Bootcamp Algorit.ma BFLP Audit

3 minute read

Well, this post (I hope I can make it a series) will be my personal notes and documentation of the data science bootcamp sessions from Algorit.ma. Please note ...

Back to top ↑

Seaborn

Stocks Finance Exercise

7 minute read

In this data project we will focus on exploratory data analysis of stock prices. Keep in mind, this project is just meant to practice your visualization and ...

Better Visualization : Seaborn 2

2 minute read

Multi-plot grids are general types of plots that allow you to map plot types to rows and columns of a grid; this helps you create similar plots separated by fe...

Better Visualization : Seaborn

2 minute read

After discussing basic visualization with Matplotlib, let’s now try another, more attractive visualization library called Seaborn. Seaborn is a Python dat...

Back to top ↑

Tensorflow

MNIST with Multilayer Perceptron

10 minute read

In this post we will build a Multi Layer Perceptron model to try to classify handwritten digits using TensorFlow (a very famous example).

Introduction Neural Network

7 minute read

Artificial Neural Network (ANN) is a computational model that is inspired by the way biological neural networks in the human brain process information. Artif...

Back to top ↑

Social Informatics

Review of Startups in Indonesia (Mamikos)

5 minute read

Finding information on boarding houses (kost) is a must for anyone living away from home. Continuing your studies or working in another city is certainly no longer ...

Back to top ↑

Data Types

Data Types

3 minute read

There are many different kinds of data types. In this blog, I will explain these data types based on the most common understanding in Data Science. Specifically i...

Structured and Unstructured Data

3 minute read

One of the simple ways to think about data is whether it is structured or not. Well, first of all, not all data is created equal. Some data is s...

Back to top ↑

English

TOEFL Writing Template

6 minute read

Broadly speaking, you’ll get a TOEFL independent writing question based on one of the following styles:

Back to top ↑

iBT

TOEFL Writing Template

6 minute read

Broadly speaking, you’ll get a TOEFL independent writing question based on one of the following styles:

Back to top ↑

Decision Tree

Decision Tree

6 minute read

Decision trees are a very popular machine learning algorithm. They are popular for a variety of reasons, their interpretability probably being their most i...

Back to top ↑

Random Forest

Random Forest

2 minute read

Bagging is an ensemble algorithm that fits multiple models on different subsets of a training dataset, then combines the predictions from all models. Random ...

Back to top ↑

Data Visualization

NLP Project

7 minute read

Welcome to the NLP Project for this section of the course. In this NLP project you will be attempting to classify Yelp Reviews into 1 star or 5 star categori...

Movie Recommender System Analysis

7 minute read

Welcome to the code notebook for Recommender Systems with Python. In this lecture we will develop basic recommendation systems using Python and pandas. There...

Back to top ↑

Neural Network

Introduction Neural Network

7 minute read

Artificial Neural Network (ANN) is a computational model that is inspired by the way biological neural networks in the human brain process information. Artif...

Back to top ↑

Statistics

Day 4 Algorit.ma : Regression Model

32 minute read

Day 4: here I will share my notes from the in-class notebook. For further examples, check out https://github.com/Saltfarmer/Algoritma-BFLP-DS-Audit/tree/ma...

Day 3 Algorit.ma : Practical Statistics

23 minute read

Day 3: here I will share my notes from the in-class notebook. For further examples, check out https://github.com/Saltfarmer/Algoritma-BFLP-DS-Audit/tree/ma...

Back to top ↑

Algorithms and Data Structures

Back to top ↑

Youtube

Back to top ↑

Startup

Review of Startups in Indonesia (Mamikos)

5 minute read

Finding information on boarding houses (kost) is a must for anyone living away from home. Continuing your studies or working in another city is certainly no longer ...

Back to top ↑

Tools

Data Science Tools

6 minute read

After finding a reason and motivation to start learning about Data Science, let’s talk about the tools for practicing Data Science. Machine learning tools m...

Back to top ↑

Data Aggregator

Interesting and Open Public Data Aggregator

1 minute read

There are hundreds of zettabytes of data available on the internet, but most of it is not publicly accessible. Today, I will share where to find open public data...

Back to top ↑

Machine Vision

Interesting and Open Public Data Aggregator

1 minute read

There are hundreds of zettabytes of data available on the internet, but most of it is not publicly accessible. Today, I will share where to find open public data...

Back to top ↑

Libraries

Libraries in Python for Data Science

4 minute read

First of all, why use Python for Data Science? According to recent surveys by KDnuggets, Python is the preferred programming language for data scientists. ...

Back to top ↑

Validation

Testing and Validation

2 minute read

The way to know how good your model is, is to actually try it on new or different cases. But if your model is not doing as well as expected, surely you will hesitate ...

Back to top ↑

Speaking

Back to top ↑

Writing

TOEFL Writing Template

6 minute read

Broadly speaking, you’ll get a TOEFL independent writing question based on one of the following styles:

Back to top ↑

IDE

Starting Material

3 minute read

Ok, now after a 1 day break and dilly-dallying over learning theory about Machine Learning and Evaluation Metrics (I kind of regret covering the theory first because it t...

Back to top ↑

Plotly

Interactive Visualization Plotly

1 minute read

The plotly Python library is an interactive, open-source plotting library that supports over 40 unique chart types covering a wide range of statistical, fina...

Back to top ↑

Linear Regression

Linear Regression

5 minute read

Linear regression is useful for finding the relationship between two continuous variables. Linear regression is a linear model, a model that creates a linear rel...

Back to top ↑

Logistic Regression

Logistic Regression

4 minute read

Logistic regression is a classification algorithm used to assign observations to a discrete set of classes. It is a statistical machine learning algorithm th...

Back to top ↑

KNN

K Nearest Neighbour

6 minute read

K Nearest Neighbour (KNN) works by choosing the best number $k$ of neighbours. A neighbour, by definition, is a person living near or next door to the speaker or person ...

Back to top ↑

Ensemble

Random Forest

2 minute read

Bagging is an ensemble algorithm that fits multiple models on different subsets of a training dataset, then combines the predictions from all models. Random ...

Back to top ↑

SVM

Back to top ↑

Support Vector Machine

Support Vector Machine

5 minute read

The support vector machine is a generalization of a classifier called maximal margin classifier. The maximal margin classifier is simple, but it cannot be ap...

Back to top ↑

K-Means

K-Means Exercise

7 minute read

Exercise from Jose Portilla Python for Data Science Bootcamp.

Back to top ↑

K Means

K Means

2 minute read

Clustering is a technique widely used to find groups of observations (clusters) that share similar characteristics. This process is not driven by a specific ...

Back to top ↑

NLP

NLP Project

7 minute read

Welcome to the NLP Project for this section of the course. In this NLP project you will be attempting to classify Yelp Reviews into 1 star or 5 star categori...

Back to top ↑

NLTK

Natural Language Processing with NLTK

4 minute read

Natural Language Processing (NLP) is broadly defined as the automatic manipulation of natural language, like speech and text. Natural language is primarily ...

Back to top ↑

Spark

Introduction Spark

7 minute read

Let’s learn how to use Spark with Python by using the pyspark library! Make sure to view the video lecture explaining Spark and RDDs before continuing on wit...

Back to top ↑

RDD

Introduction Spark

7 minute read

Let’s learn how to use Spark with Python by using the pyspark library! Make sure to view the video lecture explaining Spark and RDDs before continuing on wit...

Back to top ↑

Keras

Introduction Neural Network

7 minute read

Artificial Neural Network (ANN) is a computational model that is inspired by the way biological neural networks in the human brain process information. Artif...

Back to top ↑

Image

MNIST with Multilayer Perceptron

10 minute read

In this post we will build a Multi Layer Perceptron model to try to classify handwritten digits using TensorFlow (a very famous example).

Back to top ↑

Unsupervised Learning

Principal Component Analysis

5 minute read

Hello guys, it’s been 3 months since my last post in Machine Learning. I’ll admit that I am a little bit rusty nowadays. Because of my interviews in some com...

Back to top ↑

Principal Component Analysis

Principal Component Analysis

5 minute read

Hello guys, it’s been 3 months since my last post in Machine Learning. I’ll admit that I am a little bit rusty nowadays. Because of my interviews in some com...

Back to top ↑

BeautifulSoup4

Web Scraping with BeautifulSoup4

8 minute read

The surge of available data we can find on the internet is insane. With this surge, data analytics has become a hugely important part of the way organization...

Back to top ↑

News

LeTAD #3: Next Season DPC

3 minute read

Well, now let’s talk about the next season DPC and the drama all around it. Of course, a lot of things are happening, so to start with, congratulations to Tundra Es...

Back to top ↑

OpenAI

What is ChatGPT

3 minute read

GPT (Generative Pre-trained Transformer) is a type of artificial intelligence model developed by OpenAI that can be used for tasks such as language translat...

Back to top ↑

Jupyter Notebook

Creating a shortcut for Jupyter Notebook

less than 1 minute read

This is a quick post on how to create a shortcut for Jupyter Notebook. In this case, you need to connect your PATH of your Python Conda. Here’s how:

Back to top ↑

PCA

Day 6 Algorit.ma : Unsupervised Learning

44 minute read

Day 6: here I will share my notes from the in-class notebook. For further examples, check out https://github.com/Saltfarmer/Algoritma-BFLP-DS-Audit/tree/ma...

Back to top ↑